uima-as-docbooks/src/docbook/ref.async.deployment.xml

<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [ <!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" > %uimaents; ]>  <chapter id="ugr.ref.async.deploy"> <title>Asynchronous Scaleout Deployment Descriptor</title>  <section id="ugr.ref.async.deploy.descriptor_organization"> <title>Descriptor Organization</title> <para> Each deployment descriptor describes one service, associated with a single UIMA descriptor (aggregate or primitive), and describes the deployment of those UIMA components that are co-located, together with specifications of connections to those subcomponents that are remote. </para> <para> The deployment descriptor is used to augment information contained in an analysis engine descriptor. It adds information concerning <itemizedlist spacing="compact"> <listitem><para>which components are managed using AS</para></listitem> <listitem><para>queue names for connecting components</para></listitem> <listitem><para>error thresholds and recovery / terminate action specifications</para></listitem> <listitem><para>error handling routine specifications</para></listitem>   </itemizedlist> </para> <para>The application can include both Java and non-Java components; the deployment descriptors are slightly different for non-Java components.</para> <para>Since the UIMA system property <code>uima.framework_impl</code> can be used to provide a custom implementation of the UIMAFramework class which may include an XML parser for pre-processing descriptors, it may be necessary to provide a custom parser for Saxon to use when processing the UIMA-AS descriptors. This can be accomplished by including in the classpath a class whose name is formed by adding the suffix <code>_SAXParser</code> to the name of the class specified in the <code>uima.framework_impl</code> system property. 
If such a class exists it will be passed to the Saxon Transform class via the -x option. The class should implement the <code>org.xml.sax.XMLReader</code> or the <code>javax.xml.parsers.SAXParserFactory</code> interface, or extend <code>org.xml.sax.helpers.XMLFilterImpl</code>. </para> </section>

<section id="ugr.ref.async.deploy.descriptor"> <title>Deployment Descriptor</title>
<para>Each deployment descriptor describes components associated with one UIMA descriptor. The basic structure of a Deployment Descriptor is as follows:
<programlisting><![CDATA[<analysisEngineDeploymentDescription
    xmlns="http://uima.apache.org/resourceSpecifier">

  <name>[String]</name>
  <description>[String]</description>
  <version>[String]</version>
  <vendor>[String]</vendor>

  <deployment protocol="jms" provider="activemq">
    <casPool numberOfCASes="xxx"
             initialFsHeapSize="nnn"
             disableJCasCache="[true/false]"/>
    <service>
      <custom name="..." value="..."/>
      <inputQueue .../>
      <topDescriptor .../>
      <environmentVariables .../>
      <analysisEngine key="key name" async="[true/false]"
                      internalReplyQueueScaleout="nn1"
                      inputQueueScaleout="nn2">
        <scaleout numberOfInstances="1"/>
        <casMultiplier poolSize="5"
                       initialFsHeapSize="nnn"
                       processParentLast="[true/false]"
                       disableJCasCache="[true/false]"/>
        <asyncPrimitiveErrorConfiguration .../>
        <delegates>
          <analysisEngine key="key name" async="[true/false]"
                          internalReplyQueueScaleout="nn1"
                          inputQueueScaleout="nn2">
            ...
          </analysisEngine>
            . . .
          <remoteAnalysisEngine key="key name"
                                remoteReplyQueueScaleout="nn1">
            <casMultiplier poolSize="5"
                           initialFsHeapSize="nnn"
                           processParentLast="[true/false]"
                           disableJCasCache="[true/false]"/>
            <inputQueue ... />
            <serializer method="xmi"/>
            <asyncAggregateErrorConfiguration ... />
          </remoteAnalysisEngine>
            . . .
        </delegates>
      </analysisEngine>
    </service>
  </deployment>
</analysisEngineDeploymentDescription>]]></programlisting></para> </section>

<section id="ugr.ref.async.deploy.descriptor.caspool"> <title>CAS Pool</title>
<para>This element specifies information for managing CAS pools. Having more CASes in the pools enables more AS components to run at the same time. For instance, if your application had four components, but one was slow, you might deploy 10 instances of the slow component. To get all 10 instances working on CASes simultaneously, your CAS pool should hold at least 10 CASes. The casPool size should be small enough to avoid paging.</para>
<para>This element and all its attributes are optional; if not specified, the values take their defaults (see below).</para>
<para>If <code>numberOfCASes</code> is not specified, it is set to either 1 or, for top level asynchronous deployments, the <code>scaleout numberOfInstances</code>.</para>
<para>The initialFsHeapSize attribute allows setting the size of the initial CAS Feature Structure heap. This number is specified in bytes, and the default is approximately 2 megabytes (<code>2,000,000</code> bytes) for Java top-level services, and 40 kilobytes for C++ top-level services. The heap grows as needed; this parameter is useful for those cases where the expected heap size is much smaller than the default.</para>
<para>The disableJCasCache attribute on the &lt;casPool&gt; element is optional, and allows disabling the JCas cache. The JCas cache is an internal data structure that caches any JCas object created by the CAS. This may result in better performance for applications that make extensive use of the JCas, but also incurs a steep memory overhead. If you're processing large documents and have memory issues, you should disable the cache. In general, try running a few experiments to see which setting works better for your application.
The JCas cache is enabled by default.</para>
</section>

<section id="ugr.ref.async.deploy.descriptor.service"> <title>Service</title>
<para>This section is required and specifies the deployment information for the service.</para>
</section>

<section id="ugr.ref.async.deploy.descriptor.custom"> <title>Customizing the deployment</title>
<para>The &lt;custom&gt; element(s) are optional. Each one, if specified, requires a name attribute, and can have an optional value attribute. They are intended to provide additional information needed for particular kinds of deployment.</para>
<para>The following lists the things that can be specified here.</para>
<itemizedlist>
<listitem><para>name="run_top_level_CPP_service_as_separate_process"</para>
<para>(no value used)</para>
<para>Causes the top level component, which must be a component specified as using &lt;frameworkImplementation&gt;org.apache.uima.cpp&lt;/frameworkImplementation&gt; and which must be specified as async="false" (the default), to be run in a separate process, rather than via the JNI.</para>
</listitem>
</itemizedlist>
</section>

<section id="ugr.ref.async.deploy.descriptor.input_queue"> <title>Input Queue</title>
<para>The inputQueue element is required. It identifies the input queue for the service.</para>
<programlisting><![CDATA[<inputQueue brokerURL="tcp://x.y.z:portnumber"
    endpoint="queue_name" prefetch="1"/>]]></programlisting>
<para>The brokerURL attribute specifies the queue broker URL, typically its network address and port. It is optional; when omitted, a default value of tcp://localhost:61616 is used. A different brokerURL can be provided as an override when launching a service; the README provides an example of such an override. The queue broker address includes a protocol specification, which should be set to either "tcp" or "http".
</para>
<para>The http protocol is similar to the tcp protocol, but is preferred for wide-area-network connections where there may be firewall issues, as it supports http tunnelling.</para>
<warning><para>When remote delegates are being used, the brokerURL value used for a remote delegate is also used for the remote reply queue, and must be valid both for the client to send requests and for the remote service to send replies. The URL to use for the reply is resolved on the remote system when sending a reply. Using "localhost" will not work, nor will partially specified URLs unless they resolve to the same URL on all nodes where services are running. The recommended best practice is to use fully qualified URL names.</para></warning>
<para>The queue name is used to uniquely identify a queue belonging to a particular broker.</para>
<para>The <literal>prefetch</literal> attribute controls prefetching of messages for an instance of the service. A value of 0 disables prefetching, which is useful in some realtime applications for reducing latency: when a new request arrives, any available instance will take the request, whereas with prefetching set above 0 the request might be prefetched by a busy service. The default value if not specified is 0.</para>
<note><para>The <literal>prefetch</literal> attribute is only used with the top inputQueue element for the service.</para></note>
</section>

<section id="ugr.ref.async.deploy.descriptor.top_descriptor"> <title>Top level Analysis Engine descriptor</title> <titleabbrev>Top Level AE Descriptor</titleabbrev>
<para>Each service must indicate some analysis engine to run, using this element.</para>
<programlisting><![CDATA[<topDescriptor>
  <import location="..." />
</topDescriptor>]]></programlisting>
<para>The &lt;import&gt; element is the standard UIMA import element. Imports can be by name or by location; see <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.xml.component_descriptor.imports"/>.
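For comparison with the by-location form shown above, a by-name import might look like the following sketch. The descriptor name <code>org.example.MyAggregate</code> is hypothetical; it is assumed to identify a descriptor that can be found through the classpath or datapath, as for any UIMA import by name.
<programlisting><![CDATA[<topDescriptor>
  <!-- hypothetical descriptor name, used only for illustration -->
  <import name="org.example.MyAggregate"/>
</topDescriptor>]]></programlisting>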
</para> </section>

<section id="ugr.ref.async.deploy.descriptor.environment_variables"> <title>Setting Environment Variables</title>
<para>This element is optional, and provides a way to set environment variables.</para>
<note><para>This element is only allowed and used for top level Analysis Engines specifying &lt;frameworkImplementation&gt;org.apache.uima.cpp&lt;/frameworkImplementation&gt; and running using the &lt;custom name="run_top_level_CPP_service_as_separate_process"&gt;; it is not supported for Java Analysis Engines.</para></note>
<para>Components written in C++ can be run as a top level service. These components are launched in a separate process, and by default, all the environment variables of the launching process are passed to the new process. This element allows the environment variables of the new process to be augmented.</para>
<programlisting><![CDATA[<environmentVariables>
  <environmentVariable name="xxx">value goes here</environmentVariable>
</environmentVariables>]]></programlisting>
<para>Usually, the value will replace any existing value. As a special exception, for the environment variables used as the PATH (for Windows) or LD_LIBRARY_PATH (for Linux) or DYLD_LIBRARY_PATH (for MacOS), the value will be prepended to any existing value, using the path separator character appropriate for the platform.</para>
</section>

<section id="ugr.ref.async.deploy.descriptor.ae"> <title>Analysis Engine</title>
<para>This is used to describe an element which is an analysis engine. It is optional and only needed if the defaults are being overridden. The <literal>async</literal> attribute is only used for aggregates, and specifies whether this aggregate will be run asynchronously (with input queues in front of all of its delegates) or not. If not specified, the async property defaults to "false" except in the case where the deployment descriptor includes the &lt;delegates&gt; element, when it defaults to "true".
If you specify async="false", then it is an error to specify any &lt;delegates&gt; in the deployment descriptor.</para>
<para>The <literal>key</literal> attribute must have as its value the key name used in the containing aggregate descriptor to uniquely identify this delegate. Since the top level aggregate is not contained in another aggregate, this can be omitted for that element. Deployment information is matched to delegates using the key name specified in the aggregate descriptor to identify the delegate.</para>
<programlisting><![CDATA[<analysisEngine key="key name" async="true"
                internalReplyQueueScaleout="nn1"
                inputQueueScaleout="nn2">
  <scaleout numberOfInstances="1"/>
  <casMultiplier poolSize="5"
                 initialFsHeapSize="nn"
                 processParentLast="[true/false]"
                 disableJCasCache="[true/false]"/>
  <asyncAggregateErrorConfiguration .../>
  <asyncPrimitiveErrorConfiguration .../>
  <delegates>
    <analysisEngine key="key name" ...>
      ...
    </analysisEngine>
      . . .
    <remoteAnalysisEngine key="key name"
                          remoteReplyQueueScaleout="nn1">
      <casMultiplier poolSize="5"
                     initialFsHeapSize="nnn"
                     processParentLast="[true/false]"
                     disableJCasCache="[true/false]"/>
      <inputQueue ... />
      <serializer method="[xmi|binary]"/>
      <asyncAggregateErrorConfiguration .../>
    </remoteAnalysisEngine>
      . . .
  </delegates>
    . . .
</analysisEngine>]]></programlisting>
<para>&lt;analysisEngine&gt; is used to specify deployment details for an analysis engine. It is optional, and if omitted, defaults will be used: the analysis engine will be run synchronously (processing only one CAS at a time), with a scaleout of 1, using the default error configuration.</para>
<para>The attributes <code>internalReplyQueueScaleout</code> and <code>inputQueueScaleout</code> only have meaning and are allowed when async="true" is specified (which in turn can only be set true for aggregates) or is the default (which happens when the aggregate has delegate deployment options specified in the deployment descriptor).
These attributes default to 1. For asynchronous aggregates, they control the number of threads used to do the work of the aggregate outside of running the delegates. This work can include one or more of the following:
<itemizedlist>
<listitem> <para>deserializing an input CAS (only on the input Queue), or serializing the resulting CAS back to a remote requester (only if the requester is remote).</para> </listitem>
<listitem> <para>running the flow controller</para> </listitem>
<listitem> <para>serializing CASes being sent to remote delegates (only useful if one or more of the delegates is remote).</para> </listitem>
</itemizedlist> </para>
<para>These attributes provide a way to scale out this work on multi-core machines, if these tasks become a bottleneck.</para>
<para>Note that if an aggregate's flow controller specifies that the first delegate the CAS should flow to is a remote, the work of serializing the CAS to that remote is done using the inputQueue thread, and the scaleout parameter that would apply would be the inputQueueScaleout. For subsequent delegates, the work is done on the internalReplyQueueScaleout threads.</para>
<para>The &lt;scaleout ...&gt; element specifies, for co-located primitive or non-AS aggregates (async="false") at the bottom of an aggregate tree, how many replicated instances are created.</para>
<para>The &lt;casMultiplier&gt; element inside an &lt;analysisEngine&gt; element is required if the analysis engine component is a CAS multiplier, and is an error if specified for other components. It specifies for CAS multipliers the size of the pool of CASes used by that CAS multiplier for generating extra CASes.</para>
<note><para>The actual CAS pool size can be bigger than the size specified here.
The custom CAS multiplier code specifies how many CASes it needs access to at the same time; the actual CAS pool size is the value in the deployment descriptor, plus the value in custom CM code, minus 1.</para></note>
<para>The initialFsHeapSize attribute on the &lt;casMultiplier&gt; element is optional, and allows setting the size of the initial CAS Feature Structure heap for CASes in this pool. This number is specified in bytes, and the default is approximately 2 megabytes for Java top-level services, and 40 kilobytes for C++ top-level services. The heap grows as needed; this parameter is useful for those cases where the expected heap size is much smaller than the default.</para>
<para>The disableJCasCache attribute on the &lt;casMultiplier&gt; element is optional, and allows disabling the JCas cache. The JCas cache is an internal data structure that caches any JCas object created by the CAS. This may result in better performance for applications that make extensive use of the JCas, but also incurs a steep memory overhead. If you're processing large documents and have memory issues, you should disable the cache. In general, try running a few experiments to see which setting works better for your application. The JCas cache is enabled by default.</para>
<para>The processParentLast attribute on the &lt;casMultiplier&gt; element is optional, and specifies the processing order of an input CAS relative to its children. If true, the flow of an input CAS will be suspended after it is returned from a CAS Multiplier delegate until all its child CASes have finished processing. If false, an input CAS can be processed in parallel with its children.</para>
<para>The &lt;remoteAnalysisEngine&gt; elements are used to specify that the delegate is not co-located, and how to connect to it. The <code>remoteReplyQueueScaleout</code> is optional; if not specified it defaults to 1.
This scaleout is the number of threads that will be used to do the work of the containing aggregate when replies are returned from this remote delegate. This work is described above. It may be useful to set this to a value greater than 1 if, for instance, there are many CASes coming back from a remote delegate (perhaps the remote is a CAS Multiplier), and each one has to be deserialized.</para>
<para>The &lt;serializer&gt; element describes what method of serialization to use. This element is optional; its <code>method</code> attribute may be set to either <code>binary</code> or <code>xmi</code>. If omitted, <code>xmi</code> serialization will be used by default. <code>Xmi</code> serialization can be quite verbose and produce large output for CASes containing many annotations; on the plus side, it supports serialization between components where the type systems may not be exactly identical (for instance, they could be different subsets of larger, common type systems). <code>Binary</code> serialization produces a smaller output size and is more efficient; on the minus side, it requires that the type systems for both components have exactly the same type and feature codes, which in practice means that the type systems have to be identical. Also, the binary serialization format is new with the 2.3.0 release, and is not always available. For example, C++ services do not (currently) support this format.</para>
<para>The &lt;inputQueue&gt; element specifies the remote's input queue. The casMultiplier element inside a remoteAnalysisEngine element is only specified if the remote component is a CAS Multiplier, and it specifies the size of a pool of CASes kept to receive the new CASes from the remote component, and the initial size of those CASes.
Its poolSize must be equal to or larger than the casMultiplier poolSize specified for that remote component.</para>
<note><para>As of release 2.3.1, the previous restrictions limiting remote CAS Multipliers to just one have been lifted; you can have any number, and they can be scaled out as well.</para></note>
<note><para>The brokerURL value used for this remote delegate must be valid for both the client to send requests and the remote service to send replies.</para></note>
<para>Services may be running on nodes with firewalls, where the only port open is the one for http. In this case, you can use the http protocol.</para>
<para>The &lt;asyncPrimitiveErrorConfiguration&gt; element is only allowed within a top-level analysis engine specification (that is, one that is not a delegate of another, containing analysis engine).</para>
</section>

<section id="ugr.ref.async.deploy.descriptor.errorconfig"> <title>Error Configuration descriptors</title>
<para>Error Configuration descriptors can be included directly in the deployment descriptors, or they may use the &lt;import&gt; mechanism to import another file having the specification.</para>
<para>For AS Aggregates, the configuration applicable to delegates goes in &lt;asyncAggregateErrorConfiguration&gt; elements for the delegate.</para>
<para>For AS Primitives, there is one &lt;asyncPrimitiveErrorConfiguration&gt; element that configures threshold-based termination. The other kinds of error configuration are not applicable for AS Primitives.</para>
<para>See <olink targetdoc="uima_async_scaleout" targetptr="ugr.async.eh"></olink> for a complete overview of error handling.</para>
<para>The Error Configuration descriptor for AS Aggregates is as follows; note that all the elements are optional:
<programlisting><![CDATA[<asyncAggregateErrorConfiguration
    xmlns="http://uima.apache.org/resourceSpecifier">
  <name>[String]</name>
  <description>[String]</description>
  <version>[String]</version>
  <vendor>[String]</vendor>
  <import ... 
/>

  <getMetadataErrors maxRetries="n" timeout="xxx_milliseconds"
      errorAction="disable|terminate"/>
  <processCasErrors maxRetries="n" timeout="xxx_milliseconds"
      continueOnRetryFailure="true|false"
      thresholdCount="xxx" thresholdWindow="yyy"
      thresholdAction="disable|terminate"/>
  <collectionProcessCompleteErrors timeout="xxx_milliseconds"
      additionalErrorAction="disable|terminate"/>
</asyncAggregateErrorConfiguration>]]></programlisting></para>
<para>For an AS Primitive, the &lt;asyncPrimitiveErrorConfiguration&gt; element appears at the top level, and has this form:
<programlisting><![CDATA[<asyncPrimitiveErrorConfiguration
    xmlns="http://uima.apache.org/resourceSpecifier">
  <name>[String]</name>
  <description>[String]</description>
  <version>[String]</version>
  <vendor>[String]</vendor>
  <import ... />

  <processCasErrors thresholdCount="xxx" thresholdWindow="yyy"
      thresholdAction="terminate"/>
  <collectionProcessCompleteErrors additionalErrorAction="terminate"/>
</asyncPrimitiveErrorConfiguration>]]></programlisting> </para>
<para>The maxRetries attribute specifies the maximum number of retries to do. If this is set to 0 (the default), no retries are done.</para>
<para>The continueOnRetryFailure attribute, if set to 'true', causes the framework to ask the aggregate's flow controller if the processing for the CAS can continue. If this attribute is 'false' or if the flow controller indicates it cannot continue, further processing on the CAS is stopped and an error is returned from the aggregate. Warning: there are some conditions in the current implementation where this is not yet being done; this is a known issue.</para>
<warning><para>If maxRetries is greater than 0 or the continueOnRetryFailure attribute is 'true', the CAS will be saved before sending it to remote delegates, to enable these actions. For co-located delegates, the CAS is <emphasis>not</emphasis> copied and a process failure may cause it to become corrupt. Even so, the continue option is supported.
It is the Flow Controller's responsibility to determine what to do with a CAS that failed during processing.</para> </warning>
<para>The timeout attribute specifies the timeout values used when sending commands to remote delegates. The units are milliseconds and a value of 0 has the special meaning of no timeout.</para>
<para>The thresholdCount and thresholdWindow attributes specify the threshold at which the thresholdAction is taken. If xxx errors occur within a window of size yyy, the framework takes the specified action of either disabling this delegate, or terminating the containing AS Aggregate (or if not an AS Aggregate, terminating the AS Primitive). A thresholdCount of 0 (the default) has the special meaning of no threshold, i.e. errors are ignored, and a thresholdWindow of 0 (the default) means no window, i.e. all errors are counted.</para>
<para>An action of 'disable' applies to the specified delegate, removing it from the flow so the containing aggregate will no longer send it commands. The 'terminate' action applies to the entire service containing this component, disconnecting it from its input queue and shutting it down. Note that when disabling, the framework asks the flow controller to remove the delegate from the flow, but if the flow controller cannot reasonably operate without this component it can convert the action to 'terminate' by throwing an AnalysisEngineProcessException.FLOW_CANNOT_CONTINUE_AFTER_REMOVE exception.</para>
<para>Note that the only action for an AS Primitive on getMetadata failure is to terminate, and this is always the case, so it is not listed as a configuration option. This is also the default action for an AS Aggregate getMetadata failure.
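As an illustration of how the attributes described in this section combine, the following sketch of an aggregate error configuration retries a failed process call once, applies a 30 second timeout to process calls, and disables the delegate if 5 errors occur within a window of size 100. All values are arbitrary assumptions chosen for the example, not recommendations.
<programlisting><![CDATA[<asyncAggregateErrorConfiguration>
  <!-- illustrative values only -->
  <getMetadataErrors maxRetries="0" timeout="60000"
      errorAction="terminate"/>
  <processCasErrors maxRetries="1" timeout="30000"
      continueOnRetryFailure="false"
      thresholdCount="5" thresholdWindow="100"
      thresholdAction="disable"/>
  <collectionProcessCompleteErrors timeout="0"
      additionalErrorAction="terminate"/>
</asyncAggregateErrorConfiguration>]]></programlisting>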
</para> </section>

<section id="ugr.ref.async.deploy.descriptor.errorconfig.defaults"> <title>Error Configuration defaults</title>
<para>If the &lt;errorConfiguration&gt; element is omitted, or if some sub elements of this are omitted, the following defaults are used:
<itemizedlist>
<listitem><para>The maxRetries parameter is set to 0.</para></listitem>
<listitem><para>Timeout defaults are set to 0, meaning no timeout, except for the getMetadata command for remote delegates; here the default is 60000 (1 minute).</para></listitem>
<listitem><para>The continueOnRetryFailure action is set to "false".</para></listitem>
<listitem><para>The thresholdCount value is set to 0, meaning no threshold, errors are ignored.</para> </listitem>
<listitem><para>The thresholdWindow value is set to 0, meaning no window, all errors are counted.</para> </listitem>
<listitem><para>No disable or terminate action will be done (i.e. errors ignored), except for the getMetadata command where the default is to terminate.</para></listitem>
</itemizedlist> </para>
</section> </chapter>
