<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
<!ENTITY imgroot "images/tools/tools.ruta/" >
<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >
%uimaents;
]>
<!-- Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements. See the NOTICE file
  distributed with this work for additional information regarding
  copyright ownership. The ASF licenses this file to you under the
  Apache License, Version 2.0 (the "License"); you may not use this
  file except in compliance with the License. You may obtain a copy
  of the License at http://www.apache.org/licenses/LICENSE-2.0
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
  implied. See the License for the specific language governing
  permissions and limitations under the License. -->

<section id="ugr.tools.ruta.language.syntax">
  <title>Syntax</title>
  <para>
    UIMA Ruta defines its own language for writing rules and rule scripts.
    This section gives a formal overview of its syntax.
  </para>
  <para>
    Structure: The overall structure of a UIMA Ruta script is defined by
    the following syntax.
    <programlisting><![CDATA[Script              -> PackageDeclaration? GlobalStatements Statements
PackageDeclaration  -> "PACKAGE" DottedIdentifier ";"
GlobalStatements    -> GlobalStatement*
GlobalStatement     -> ("SCRIPT" | "ENGINE") DottedIdentifier2 ";"
                       | UimafitImport | ImportStatement
UimafitImport       -> "UIMAFIT" DottedIdentifier2
                       ("(" DottedIdentifier2 (COMMA DottedIdentifier2)+ ")")? ";"
Statements          -> Statement*
Statement           -> Declaration | VariableDeclaration
                       | BlockDeclaration | SimpleStatement ";"]]></programlisting>
    Comments are excluded from the syntax definition. A comment starts
    with "//" and extends to the end of the line.
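    As a minimal sketch of how these productions combine, the following
    script contains a package declaration, two global statements and one
    statement of each kind. The names <code>Capitalized</code>,
    <code>count</code> and <code>CountCapitalized</code> are made up for
    illustration, and the imported type system and the referenced script
    are assumed to exist in the project.
    <programlisting><![CDATA[PACKAGE uima.ruta.example;

// global statements: import a type system, reference another script
// (both are assumed to exist; they are not defined here)
IMPORT * FROM types.BibtexTypeSystem;
SCRIPT uima.ruta.example.Author;

// declaration of a new annotation type (hypothetical name)
DECLARE Capitalized;

// variable declaration with an initial value (hypothetical name)
INT count = 0;

// simple rule: mark every capitalized word
CW{-> MARK(Capitalized)};

// block declaration: the nested rule counts the marked words
BLOCK(CountCapitalized) Document{} {
    Capitalized{-> ASSIGN(count, count + 1)};
}
]]></programlisting>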
  </para>
  <para>
    Syntax of import statements:
    <programlisting><![CDATA[ImportStatement  -> (ImportType | ImportPackage | ImportTypeSystem) ";"
ImportType       -> "IMPORT" Type ("FROM" TypeSystem)? ("AS" Alias)?
ImportPackage    -> "IMPORT" "PACKAGE" Package ("FROM" TypeSystem)? ("AS" Alias)?
ImportTypeSystem -> "IMPORT" "PACKAGE" "*" "FROM" TypeSystem ("AS" Alias)?
                    | "IMPORT" "*" "FROM" TypeSystem
                    | "TYPESYSTEM" TypeSystem
Type             -> DottedIdentifier
Package          -> DottedIdentifier
TypeSystem       -> DottedIdentifier2
Alias            -> Identifier]]></programlisting>
    Example beginning of a UIMA Ruta file:
    <programlisting><![CDATA[PACKAGE uima.ruta.example;

// import the types of this type system
// (located in the descriptor folder -> types folder)
IMPORT * FROM types.BibtexTypeSystem;

SCRIPT uima.ruta.example.Author;
SCRIPT uima.ruta.example.Title;
SCRIPT uima.ruta.example.Year;
]]></programlisting>
    Syntax of declarations:
    <programlisting><![CDATA[Declaration -> "DECLARE" (AnnotationType)? Identifier ("," Identifier)*
               | "DECLARE" AnnotationType? Identifier ( "(" FeatureDeclaration ")" )?
FeatureDeclaration  -> ( (AnnotationType | "STRING" | "INT" | "FLOAT"
                       | "DOUBLE" | "BOOLEAN") Identifier )+
VariableDeclaration -> (("TYPE" Identifier ("," Identifier)*
                         ("=" AnnotationType)?)
                     | ("STRING" Identifier ("," Identifier)*
                         ("=" StringExpression)?)
                     | (("INT" | "DOUBLE" | "FLOAT") Identifier
                         ("," Identifier)* ("=" NumberExpression)?)
                     | ("BOOLEAN" Identifier ("," Identifier)*
                         ("=" BooleanExpression)?)
                     | ("ANNOTATION" Identifier ("=" AnnotationExpression)?)
                     | ("WORDLIST" Identifier
                         ("=" (WordListExpression | StringExpression))?)
                     | ("WORDTABLE" Identifier
                         ("=" (WordTableExpression | StringExpression))?)
                     | ("TYPELIST" Identifier ("=" TypeListExpression)?)
                     | ("STRINGLIST" Identifier
                         ("=" StringListExpression)?)
                     | (("INTLIST" | "DOUBLELIST" | "FLOATLIST")
                         Identifier ("=" NumberListExpression)?)
                     | ("BOOLEANLIST" Identifier
                         ("=" BooleanListExpression)?)
                     | ("ANNOTATIONLIST" Identifier
                         ("=" AnnotationListExpression)?))
AnnotationType      -> BasicAnnotationType | declaredAnnotationType
BasicAnnotationType -> ('COLON' | 'SW' | 'MARKUP' | 'PERIOD' | 'CW' | 'NUM'
                      | 'QUESTION' | 'SPECIAL' | 'CAP' | 'COMMA'
                      | 'EXCLAMATION' | 'SEMICOLON' | 'NBSP' | 'AMP' | '_'
                      | 'SENTENCEEND' | 'W' | 'PM' | 'ANY' | 'ALL'
                      | 'SPACE' | 'BREAK')
BlockDeclaration    -> "BLOCK" "(" Identifier ")" RuleElementWithCA
                       "{" Statements "}"
actionDeclaration   -> "ACTION" Identifier "(" ("VAR"? VarType Identifier)?
                       ("," "VAR"? VarType Identifier)* ")"
                       "=" Action ("," Action)* ";"
conditionDeclaration -> "CONDITION" Identifier "(" ("VAR"? VarType Identifier)?
                       ("," "VAR"? VarType Identifier)* ")"
                       "=" Condition ("," Condition)* ";"]]></programlisting>
    Syntax of statements and rule elements:
    <programlisting><![CDATA[SimpleStatement        -> SimpleRule | RegExpRule | ConjunctRules
                          | DocumentActionRule
SimpleRule             -> RuleElements ";"
RegExpRule             -> StringExpression "->" GroupAssignment
                          ("," GroupAssignment)* ";"
ConjunctRules          -> RuleElements ("%" RuleElements)+ ";"
DocumentActionRule     -> Actions ";"
GroupAssignment        -> TypeExpression
                          | NumberExpression "=" TypeExpression
RuleElements           -> RuleElement+
RuleElement            -> (Identifier ":")? "@"?
                          RuleElementType | RuleElementLiteral
                          | RuleElementComposed | RuleElementWildCard
                          | RuleElementOptional
RuleElementType        -> AnnotationTypeExpr OptionalRuleElementPart
RuleElementWithCA      -> AnnotationTypeExpr ("{" Conditions? Actions? "}")?
AnnotationTypeExpr     -> (TypeExpression | AnnotationExpression
                          | TypeListExpression | AnnotationListExpression)
                          (Operator)? Expression ("{" Conditions "}")?
FeatureMatchExpression -> TypeExpression ("." Feature)+
                          (Operator (Expression | "null"))?
RuleElementLiteral     -> SimpleStringExpression OptionalRuleElementPart
RuleElementComposed    -> "(" RuleElement ("&" RuleElement)+ ")"
                          | "(" RuleElement ("|" RuleElement)+ ")"
                          | "(" RuleElements ")"
                          OptionalRuleElementPart
OptionalRuleElementPart -> QuantifierPart? ("{" Conditions? Actions? "}")?
                          InlinedRules?
InlinedRules           -> ("<-" "{" SimpleStatement+ "}")*
                          ("->" "{" SimpleStatement+ "}")*
RuleElementWildCard    -> "#" ("{" Conditions? Actions? "}")? InlinedRules?
RuleElementOptional    -> "_" ("{" Conditions? Actions? "}")? InlinedRules?
QuantifierPart         -> "*" | "*?" | "+" | "+?" | "?" | "??"
                          | "[" NumberExpression "," NumberExpression "]"
                          | "[" NumberExpression "," NumberExpression "]?"
Conditions             -> Condition ("," Condition)*
Actions                -> "->" (Identifier ":")? Action
                          ("," (Identifier ":")? Action)*
]]></programlisting>
    Since each condition and each action has its own syntax, conditions
    and actions are described in their own sections. For conditions, see
    <xref linkend='ugr.tools.ruta.language.conditions' />; for actions, see
    <xref linkend='ugr.tools.ruta.language.actions' />.
    The syntax of expressions is explained in
    <xref linkend='ugr.tools.ruta.language.expressions' />.
  </para>
  <para>
    In addition to the set of available conditions and actions, certain
    expressions can be used as implicit conditions or actions.
    <programlisting><![CDATA[Condition -> BooleanExpression | FeatureMatchExpression
Action    -> TypeExpression | FeatureAssignmentExpression
             | VariableAssignmentExpression
]]></programlisting>
  </para>
  <para>
    Identifier:
    <programlisting><![CDATA[DottedIdentifier  -> Identifier ("." Identifier)*
DottedIdentifier2 -> Identifier (("."|"-") Identifier)*
Identifier        -> letter (letter|digit)*
]]></programlisting>
  </para>
</section>