Teneo Linguistic Modeling Language
Reference Manual
This manual targets conversational AI developers and users of the Teneo Platform.
The Teneo Engine uses syntaxes and operators written in the Teneo Linguistic Modeling Language (TLML) to evaluate and process sentences; this manual describes the syntax and the different operators available.
Basic Concepts
TLML Syntax
Very briefly, the Teneo Linguistic Modeling Language syntax describes the word patterns the system needs to recognize in the sentence in order to produce a match which results in the system triggering a Flow, selecting a transition, etc. The TLML syntax can either be drafted using the NLU Generator or be written by the Platform user.
There are two types of TLML syntaxes:
- Atomic syntax, which can be used on its own, and
- Compound syntax, which is formed by a combination of one or more syntaxes and an operator, which defines the relationship between said syntaxes.
In some of the examples used here, atomic syntax is expressed using letters (A B C) while operators are represented by different symbols (like + & >>). A compound syntax will look like A & B, where A and B will be referred to as the operands.
The Teneo Linguistic Modeling Language is initially entered in the solution as text strings, which are parsed in the Engine according to the TLML syntax rules; the result of the parsing is an Abstract Syntax Tree (AST) structure that determines how the syntax is evaluated (see the section Evaluating TLML Syntax). The syntax functionality resides in the elements of the AST.
Parsing splits the syntax text into parts, according to the syntax semantics.
In general terms, the TLML syntax is not case-sensitive. See section Exact option for further details on the exceptions.
Sentence & Word, Sentence Range and Range Start
Sentence & Word
Before the evaluation of the TLML syntax begins, the system pre-processes the user input and splits it into sentences and words in the Input Processors. Sentences are formed by one or more words (unless the user input is empty; then only one sentence exists that contains no words).
Words are made available in multiple forms although only two are relevant for the TLML syntax evaluation:
- The raw form, which is the spelling of the word exactly as found in the evaluated sentence
- The final form, which is the spelling of the word after applying normalization and spelling corrections.
The TLML syntax rules test the final form of a word. Exception: if Exact Option is being applied, then the raw form is tested.
In some of the examples here, words are expressed using letters; if a sentence contains the same word more than once it will be identified by a number (A B C1 C2).
Sentence Range
The TLML syntax is matched against a whole sentence or, depending on many factors, a certain range of words from a sentence. The sentence words that are tested against a TLML syntax rule are known as the sentence range.
For instance, given the sentence A B1 C B2 E, if the evaluation of the sentence starts at C, then the sentence range is C B2 E.
See the example below for further details.
Range Start
The index of the first word of the sentence range is the range start. That is, the range start (RS) is a number that indicates the position of the word in the sentence where testing of the TLML syntax needs to start (or continue).
The range start works as follows:
- At the beginning of the evaluation of a sentence the range start is 1, so it points at the first word of the sentence. This means that, at this stage, the TLML syntax may match words in the entire sentence.
- When the TLML syntax is matched, the range start is updated to the index of the word following the one that has matched the TLML syntax being tested (note that this rule isn't followed by the Maybe Operator). With each increase of the range start, the range of sentence words available for matching decreases.
- The evaluated range of words begins at range start and always extends until the end of the sentence.
Some compound syntaxes reset the range start to the beginning of the sentence when they evaluate an operand.
Example
TLML Syntax: B
Sentence: A B1 C B2 E
The syntax evaluation will do the following:
- At the beginning of the evaluation, the sentence range is the whole sentence A B1 C B2 E, and the range start is 1.
- When the B in the second position of the input sentence is matched, the range start is set to 3 and thus reduces the sentence range to C B2 E.
- When the B in the fourth position of the input sentence is matched, the range start is set to 5 and thus reduces the sentence range to E.
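The relationship between range start and sentence range in the example above can be sketched in a few lines of Python. This is only an illustration: the function name `sentence_range` is hypothetical and not part of the Teneo Platform, and the sketch assumes nothing beyond the 1-based word positions described in this section.

```python
def sentence_range(words, range_start):
    """Return the words from range_start (1-based) to the end of the sentence."""
    # The evaluated range always extends to the end of the sentence.
    return words[range_start - 1:]


sentence = ["A", "B1", "C", "B2", "E"]
print(sentence_range(sentence, 1))  # range at the beginning of the evaluation
print(sentence_range(sentence, 3))  # range after the B in position 2 matched
print(sentence_range(sentence, 5))  # range after the B in position 4 matched
```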
Dependencies between sentences
Sentences are evaluated independently against the TLML syntaxes; therefore, one sentence's test results are not affected by other sentences.
There is, however, an exception: scripts may set variables (via side effects) in one sentence and test these variables on the next sentence. This creates a dependency between sentences that is affected by the sentence testing order.
Syntax Match Instance and Used Words
If a TLML syntax matches a sentence, then the Boolean condition result is "true", and for each possible match a match data instance is generated.
A match data instance, or match instance (MI), provides the following information:
- The used words (UW), which are the words in the sentence that matched the TLML syntax.
- The range start to be used for the ongoing TLML syntax evaluation.
- The Language Object (LOB) variables, which are defined by LOBs. During TLML syntax evaluation, LOB variables are maintained in a separate hidden variable space and become available to general scripting after the syntax evaluation.
- The Natural Language Understanding (NLU) data; see NLU variables and attached scripts for more details.
If a TLML syntax does not match a sentence, then the Boolean result is "false", and match data instances are not generated.
Example
TLML Syntax: B
Sentence: A B1 C B2 E
This combination of sentence and syntax will produce 2 match instances:
- When the B in the second position of the input sentence matches, the match instance generated contains the value UW = B (at word position 2) and RS = 3.
- When the B in the fourth position matches, the match instance generated contains the value UW = B (at word position 4) and RS = 5.
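As a minimal sketch of the example above, the following Python code produces one match instance per matching word, each carrying its used word position and the range start for the ongoing evaluation. The names `MatchInstance` and `match_word` are hypothetical, and the sketch ignores LOB variables and NLU data; it is not how the Teneo Engine is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchInstance:
    used_words: tuple  # 1-based positions of the sentence words that matched
    range_start: int   # position at which the evaluation continues


def match_word(sentence, word, range_start=1):
    """Return one match instance per occurrence of word at or after range_start."""
    instances = []
    for position, candidate in enumerate(sentence, start=1):
        if position >= range_start and candidate.lower() == word.lower():
            instances.append(MatchInstance((position,), position + 1))
    return instances


# The labels B1 and B2 used in the example both stand for the same word "B".
sentence = ["A", "B", "C", "B", "E"]
for instance in match_word(sentence, "B"):
    print(instance)
```

Calling `match_word(sentence, "A", range_start=2)` returns an empty list, because A lies outside the sentence range when the range start is 2.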
Syntax Context
When using the Teneo Platform, the TLML syntax is usually complex, composed of many atomic syntax rules and operators; these elements affect how the TLML syntax is evaluated.
Example
TLML Syntax: A
Sentence: A B C
In this example, under ordinary circumstances (range start = 1), the syntax would match, since A is in the sentence range.
However, if a match were to modify the value of the range start to 2 or 3, then the sentence range would no longer contain A and therefore the syntax would not be fulfilled, even though the word A is in the sentence.
Condition Context
As seen in the example, the operations happening before a TLML syntax is evaluated may have a large impact on its result.
The entirety of syntaxes evaluated before a particular syntax is named the condition context, which provides the following data:
- One or more match data instances that have already been generated by the evaluation of the syntax. If the evaluation of the syntax hasn't started yet, there is a default match instance which contains the initial values of the condition context, such as a range start equal to 1.
- The flag DIRECTLY_FOLLOWED_BY (DFB), which indicates whether the current syntax is being evaluated in the context of the right operand of a compound syntax of the type Directly Followed By (>>) - Operator.
Usually the value of the DFB flag is propagated through compound syntaxes, with some exceptions that are described in the section Compound syntaxes.
Evaluating TLML Syntax
Operators are described in full detail in the section Compound syntaxes.
Evaluation of TLML Syntax according to AST
As mentioned, TLML syntax can be quite complex and require several operations to complete its testing. TLML syntax matching follows the Abstract Syntax Tree (AST) structure, which determines which steps need to be followed (and in which order) to test the TLML syntax. The system evaluates the operators (like + & >>) with a top-to-bottom approach, and operands (like A B C) left-to-right.
As an example, take the TLML syntax A >> B, which would correspond to the following AST and evaluation steps:
- Create default MI and condition context. Pass it to the operator
- Pass the received condition context to the left operand
- Test left operand A. Update and/or create new MIs if matches occur
- If the operator requires it, modify the condition context; the resulting condition context is passed to the right operand
- Test right operand B. Update and/or create new MIs if matches occur
- Continue with the next TLML syntax to be evaluated. If it's the last TLML syntax, select the final result.
Passing the condition context
It is important to highlight that operators have different behaviors regarding the condition context.
Some operators pass the condition context (including the existing MI) to the first operand, which modifies the existing MIs (or creates new ones) if matches occur. Once the first operand is evaluated, the updated condition context is returned to the operator, which forwards it to the second operand to continue the evaluation. The second operand is evaluated, and the context is updated if necessary, generating the result of the TLML syntax.
Other operators (i.e. And and Extended And operators), however, duplicate the MI and pass one copy to each operand. Operands evaluate the sentence independently and return their updated condition context to the operator. These two contexts are then merged by the operator by joining left operand MIs with right operand MIs, resulting in the final context.
Obtaining the final result
Firstly, for each syntax part to which Longest Match is applied, the highest used words count of all matches is determined. Those matches with fewer words are discarded.
Afterwards, if several possible matches remain, the system selects the final result according to these rules:
- Rule 1: Left Most of First
  First, the system determines the leftmost first used word of all MIs and selects the MIs which contain that used word.
- Rule 2: Left Most of Last
  If there are two or more MIs that contain the Left Most First used word, then out of them the system selects the MIs that contain the Left Most Last used word.
- Rule 3: Longest Match
  If rules 1 and 2 do not narrow it down to a single MI, then out of them the system selects those with the highest count of used words.
- Rule 4: Internal Order
  If after applying these rules there are still two or more MIs left, then out of them the system picks the first MI, according to the order in which the MIs were generated by the TLML syntax.
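The four selection rules can be sketched as successive filters over the remaining match instances. The Python code below is one plausible reading of the rules, not the Engine's actual implementation: the function name `select_final` is hypothetical, each match instance is reduced to a non-empty, ascending list of 1-based used word positions, and the preceding Longest Match pruning step is assumed to have already happened.

```python
def select_final(match_instances):
    """Pick one match instance from a list given in generation order.

    Each instance is a non-empty, ascending list of 1-based used word positions.
    """
    candidates = list(match_instances)
    if not candidates:
        return None
    # Rule 1: keep the instances whose first used word is leftmost.
    first = min(mi[0] for mi in candidates)
    candidates = [mi for mi in candidates if mi[0] == first]
    # Rule 2: of those, keep the instances whose last used word is leftmost.
    last = min(mi[-1] for mi in candidates)
    candidates = [mi for mi in candidates if mi[-1] == last]
    # Rule 3: of those, keep the instances with the highest used word count.
    longest = max(len(mi) for mi in candidates)
    candidates = [mi for mi in candidates if len(mi) == longest]
    # Rule 4: fall back to the order in which the instances were generated.
    return candidates[0]


# Two candidate matches using words (2, 3) and (1, 3): Rule 1 prefers (1, 3).
print(select_final([[2, 3], [1, 3]]))
```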
Unused Word Limitation
It is possible to limit the number of unused words taken into account when evaluating the TLML syntax (this is useful, for instance, to keep conversations on specific topics). If a limit is applied, the evaluation process described in the previous section is followed by a check that the count of words not used by the TLML syntax term of the selected match does not exceed said limit. That is, if the limit is set to e.g. 3, the TLML syntax is fulfilled only if at most 3 words in the sentence are not used by the matching TLML syntax term.
To do so, set the desired number of unused words in the "Limit unused words to" setting under Advanced Options in the TLML Syntax Match.
The following examples illustrate how the unused word limit is applied.
Unused Words Limit | TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|---|
[left blank] | who+are+you | Who are you? | You are who? |
[left blank] | who+are+you | Who are you going to marry? | |
0 | who+are+you | Who are you? | [anything else] |
1 | who+are+you | Who are you? | Who are you talking to? |
1 | who+are+you | Just who are you? | Who are the people you met? |
5 | who+are+you | Who are you? | Who are you going to marry next Sunday, Fred? |
5 | who+are+you | Who are you going to marry? | |
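The check itself amounts to a simple comparison, sketched below in Python. The function name `within_unused_limit` is hypothetical, and the sketch assumes that punctuation is not counted as a word, which is what the table rows with a limit of 0 suggest.

```python
def within_unused_limit(sentence_word_count, used_word_count, limit=None):
    """Return True if the selected match respects the unused words limit."""
    if limit is None:  # the setting was left blank: no restriction applies
        return True
    unused = sentence_word_count - used_word_count
    return unused <= limit


# "Who are you going to marry?" has 6 words; who+are+you uses 3 of them.
print(within_unused_limit(6, 3, limit=1))
print(within_unused_limit(6, 3, limit=5))
```

With a limit of 1 the three unused words exceed the limit, so the first call reports that the input is rejected, while the second call with a limit of 5 accepts it.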
Examples
Please note that the following examples are meant to explain the TLML Syntax AST, the workflow of the TLML syntax evaluation and how condition context is built, and not to explain how operators work.
A+B and A>>B
TLML Syntax: A + B and A >> B
Sentence: A1 A2 B
For explanatory purposes, equal sentence words will be identified with a number as well, e.g. A1.
This example illustrates the case when:
- The condition context is updated by the operands and then forwarded between the operator and the operands
- The condition context is updated by the operands and while being forwarded between the operator and the operands
* See the section Obtaining the final result.
(A+B)>>C
TLML Syntax: (A + B) >> C
Sentence: A B1 B2 C
For explanatory purposes, equal sentence words will be identified with a number as well, e.g. A1.
This example illustrates a more complex case in which the condition context is updated by multiple operands and while being forwarded between operators and operands.
TLML Syntax
In this document, the following meta syntax (based on the Backus-Naur form) is used for the formal description of the TLML syntax. However, it is important to highlight that the TLML syntax itself is not based on the Backus-Naur form, only the meta syntax.
The syntax is specified as a set of one or more derivation rules, written as:
symbol = expression
Where symbol is a non-terminal entity defined by the expression, which consists of one or more sequences of symbols or literal characters (literals).
Multiple sequences of symbols and literals are separated by the vertical bar "|", which indicates a choice. The symbol may then be substituted by any of the expressions on the right:
color = red | green | blue
If the expression contains sequences of literal characters, they must be enclosed into single quotes:
digit = '0' | '1' | '2' | ... | '9'
An optional part of an expression is enclosed into square brackets:
number = [ '-' ] digits
A plus "+" sign is used to denote one or more repetitions, and an asterisk "*" represents zero or more repetitions:
number = [ '-' ] digit+ [ '.' digit* ]
Round brackets "()" are used for grouping:
number = ( '0' octal-digits ) | ( '0x' hex-digits )
Minus "-" is used for exclusion:
valid_character = all_characters - invalid_characters
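For readers more familiar with regular expressions, most of the meta syntax maps directly onto regex notation: square brackets correspond to `?`, the plus and asterisk keep their meaning, the vertical bar is alternation, and round brackets are grouping. As an illustrative sketch only (the names `NUMBER` and `is_number` are made up for this example), the rule `number = [ '-' ] digit+ [ '.' digit* ]` could be written in Python as:

```python
import re

# number = [ '-' ] digit+ [ '.' digit* ]
NUMBER = re.compile(r"-?[0-9]+(?:\.[0-9]*)?")


def is_number(text):
    """Return True if the whole text can be derived from the number rule."""
    return NUMBER.fullmatch(text) is not None


for candidate in ["42", "-3.14", "7.", ".5", "-"]:
    print(candidate, is_number(candidate))
```

The exclusion operator has no general regex counterpart; for single characters it roughly corresponds to a negated character class.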
Common syntax elements
As explained before, TLML syntax tests the words of the sentence. In the syntax a word is a sequence of one or more non-reserved, non-whitespace Unicode characters (the Unicode characters covered by the Unicode standard).
If reserved characters are included in the word, then the entire word must be delimited by double quotes. Consequently, if a word includes a literal double quote, then this quote must be duplicated and the word must be written between quotes as well. The resulting word is the concatenation of the non_quote_character sequences, with each duplicated double quote reduced to a single double quote character.
Formally, the common syntax elements are the following:
- unicode_chars = all possible Unicode characters
- word = (unicode_chars - delimiter)+ | ( '"' non_quote_character* ( '""' non_quote_character* )* '"' )
- delimiter = reserved_character | comment_delimiter | whitespace
- reserved_character = '&' | '+' | '>' | '~' | '/' | '(' | ')' | '!' | '§' | '|' | '*' | ':' | '#' | '%' | '@' | '{' | '}' | '^' | '"' | '='
- comment_delimiter = '=='
- whitespace = blank | tab | end_of_line | java_whitespace
- non_quote_character = unicode_chars - '"'
For details on whitespace see Javadoc of Character.isWhitespace and Character.isSpaceChar.
For details on the comment_delimiter, see section Comments.
The unicode_chars are all characters valid in the Teneo Platform (as defined by the Unicode standard).
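The quoting rule for words containing reserved characters can be illustrated with a short Python sketch. The helper names `quote_word` and `unquote_word` are hypothetical and exist only for this example; the set of reserved characters is taken from the reserved_character rule.

```python
# Reserved characters as listed in the reserved_character rule
# ('§' is the section sign).
RESERVED = set('&+>~/()!§|*:#%@{}^"=')


def quote_word(word):
    """Return the word as it would be written in a TLML syntax."""
    if not any(ch in RESERVED for ch in word):
        return word
    # A literal double quote is written as two double quotes.
    return '"' + word.replace('"', '""') + '"'


def unquote_word(text):
    """Recover the effective word from its quoted form; other text is unchanged."""
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return text[1:-1].replace('""', '"')
    return text


for word in ["hello", "C++", 'say"hi']:
    print(word, "->", quote_word(word))
```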
Atomic syntaxes
Atomic syntaxes get their name from the fact that they are the smallest possible unit of a syntax, and all TLML syntaxes must be built using them.
The atomic syntaxes are:
- Word syntax
- Word Part syntax
- Language Object syntax
- Entity syntax
- Input Annotation syntax
- Dynamic syntax
- Topic syntax
- Script syntax
The following additional options are available for word, word part and dynamic syntaxes:
- Position Option
- Exact Options
When writing TLML syntax, it is important to be aware of the reserved characters. For more information, please see the Reserved Characters section.
Word syntax
Official name: Word syntax
Alternative name: Raw word condition, Word condition
Syntax
word
General description
For a word syntax to be fulfilled, the given word must be recognized in the sentence. Here, the position and the frequency of the word within the sentence are irrelevant. This syntax is not case-sensitive.
The word syntax supports the Position Option and the Exact Option.
Logical behavior
The word syntax is true if and only if at least one match instance of the condition context exists so that
- if the DFB flag is NOT activated, the word matches one or more of the sentence words from range start of the match instance to sentence end
- if the DFB flag is activated, the word matches the sentence word at and only at range start of the match instance.
For more information on DFB, please see Directly Followed By operator.
Generated match data
The matching word is added to each MI in the condition context with a range start lower than or equal to (exactly equal to if the DFB flag is activated) the position of the matching word. All MIs with a range start higher than the positions of all matching words are dropped.
If the sentence contains multiple words which match the TLML syntax, a copy of all the MIs will be created for each match (i.e. each match will have its own set of MIs); then each matching word will be added to the used words of its set of MIs.
- Range start: next word after the matching sentence word
- Used words: used words of the condition context match plus the matching sentence word
- NLU script data: not modified
- LOB variables: not modified
- Longest match data: not modified
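The logical behavior and the generated match data of the word syntax can be sketched together as follows. This is a simplified model under stated assumptions, not the Engine's implementation: the function name `match_word_syntax` is hypothetical, a match instance is reduced to a tuple of used word positions plus a range start, and NLU, LOB and longest match data are left out.

```python
def match_word_syntax(sentence, word, context, dfb=False):
    """Apply a word syntax to every match instance of the condition context.

    sentence: list of final word forms
    context:  list of (used_positions, range_start) tuples, positions 1-based
    dfb:      True if the DIRECTLY_FOLLOWED_BY flag is activated
    """
    hits = [i for i, w in enumerate(sentence, start=1) if w.lower() == word.lower()]
    result = []
    for position in hits:  # each matching word gets its own copies of the MIs
        for used, range_start in context:
            in_range = position == range_start if dfb else position >= range_start
            if in_range:
                result.append((used + (position,), position + 1))
    return result  # an empty list means the syntax is false


sentence = ["A", "B", "C", "B"]
default_context = [((), 1)]  # default match instance with range start 1
after_a = match_word_syntax(sentence, "A", default_context)
print(match_word_syntax(sentence, "B", after_a))            # DFB not activated
print(match_word_syntax(sentence, "B", after_a, dfb=True))  # DFB activated
```

Without the DFB flag both occurrences of B extend the match instance created by A; with the flag activated only the B at the range start (position 2) does.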
Examples
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
Congress | Congress | Congressional |
Congress | Congress person | Congressperson |
Word Part syntax
Official name: Word Part syntax
Alternative name: Word Part condition
Syntax
There are 4 possible ways to use the word part syntax:
word '*' | '*' word | '*' word '*' | '*'
If reserved characters are included in the word part, then the entire word part (but not including the '*' character) must be delimited by double quotes.
General description
The word part syntax uses wildcard ( * ) at the beginning, the end, or the beginning and end of the word part to be recognized. It may also be used by itself to represent a single word.
A wildcard at the beginning allows any characters in front of the word part to be recognized; a wildcard at the end allows any characters after the word part to be recognized; the wildcards at both beginning and end allow any characters before and after the word part to be recognized.
A wildcard by itself allows any whole word to be recognized.
For instance, when using *cat, at least one word ending with "cat" must be in the sentence for the TLML syntax to match. However, the word "cat" by itself is also a match.
Please note that words that are captured by the word part syntax are fully recognized, not just the "syntax part". Besides, they will not be counted as part of the unrecognized word count. See the examples for more details.
The word part syntax supports the Position Option and the Exact Option.
Logical behavior
The word part syntax is true if and only if at least one match instance of the condition context exists so that
- if the DFB flag is NOT activated, the word part matches one or more of the sentence words from range start of the match instance to sentence end
- if the DFB flag is activated, the word part matches the sentence word at and only at range start of the match instance.
Generated match data
The matching word is added to each MI in the condition context with a range start lower than or equal to (exactly equal to if the DFB flag is activated) the position of the matching word. All MIs with a range start higher than the positions of all matching words are dropped.
If the sentence contains multiple words which match the TLML syntax, a copy of all the MIs will be created for each match (i.e. each match will have its own set of MIs); then each matching word will be added to the used words of its set of MIs.
- Range start: next word after the matching sentence word
- Used words: used words of the condition context match plus the matching sentence word
- NLU script data: not modified
- LOB variables: not modified
- Longest match data: not modified
Examples
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
Love* | Lover | Beloved |
Love* | Love | Glove |
*love | Glove | Lover |
*love | Clove | Clover |
*love | Love | Beloved |
*love* | Beloved | Luv |
*love* | Clover | |
*love* | Love | |
I+love+* | I love tofu | I love |
I+love+* | I love cats | It's you who I love |
I+love+* | I love martial arts | |
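The wildcard behavior shown in the table can be approximated by translating a word part into a regular expression that must match the whole sentence word. The Python sketch below is an illustration only; the function names are hypothetical, and unlike the word part syntax it would also accept a wildcard in the middle of the word part.

```python
import re


def word_part_to_regex(word_part):
    """Translate a word part such as 'love*' into a compiled regex."""
    # Escape the literal pieces, then let each wildcard stand for any characters.
    pattern = ".*".join(re.escape(piece) for piece in word_part.split("*"))
    return re.compile(pattern, re.IGNORECASE)


def matches_word_part(word_part, sentence_word):
    """Return True if the entire sentence word is covered by the word part."""
    return word_part_to_regex(word_part).fullmatch(sentence_word) is not None


for word_part, word in [("love*", "Lover"), ("love*", "Glove"), ("*love", "Glove")]:
    print(word_part, word, matches_word_part(word_part, word))
```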
Language Object syntax
Official name: Language Object syntax
Alternative name: Language Object condition
Syntax
'%' language_object_name

language_object_name = word
General description
Language Objects (LOBs) can contain a TLML syntax of any type (atomic or compound), LOB variables and NLU variables. The LOB is, in itself, a named TLML syntax so using a LOB effectively acts as a shortcut to the TLML syntax contained in it, making it easier to reuse.
Please note that even though the syntax of a Language Object name is a word (thus allows all possible Unicode characters), by convention it is written in UPPERCASE and cannot contain whitespace (spaces, tabs, newlines, etc.) or any of the reserved_characters (see chapter Reserved Characters for a listing). This constraint is enforced by the Teneo Frontend application.
For the TLML syntax of a LOB to be fulfilled, the sentence must contain text fulfilling the syntax contained in the LOB itself.
The TLML syntax of the referenced Language Object may set NLU variables, which are provided to optional Predicate or Propagation scripts attached to the syntax of a Language Object. For more details, see section NLU variables and attached scripts.
The referenced Language Object may also set LOB variables on the generated matches. LOB variables have the value type String; they are attached to the matches and passed on during the syntax evaluation. On the syntax top level, the LOB variables are accessible with the scripting API methods EngineAcess.getLangObjVariable(String _sName) and EngineAcess.getLangObjVariables().
Script expressions embedded into a LOB variable value are replaced with their evaluation result at the time of evaluation of the syntax referencing this LOB. The special LOB variable value _USED_WORDS will be replaced with the used words of the particular match of the TLML syntax of the Language Object. The resulting value is the concatenation of the matching sentence words (with the exact spelling given in the original user input), separated by single whitespace characters.
Logical behavior
The TLML syntax of the Language Object is true if and only if the syntax of the referenced Language Object itself is true. If the referenced Language Object doesn't exist, then the Language Object syntax is false.
Generated match data
The matches of a Language Object's syntax are generated by the TLML syntax of the referenced Language Object itself. If NLU scripts are attached to the Language Object syntax, they will be added to these matches.
- Range start: set by the TLML syntax of the Language Object
- Used words: set by the TLML syntax of the Language Object
- NLU variables: all NLU variables defined by the Language Object and set by propagation scripts in the syntax of the Language Object. These variables are available in predicate and propagation scripts attached to the Language Object syntax.
- NLU script data: the NLU scripts attached to the Language Object syntax are appended
- LOB variables: the LOB variables defined on the referenced Language Object are added to the set of LOB variables of the match. Existing LOB variables with the same name are overwritten.
- Longest match data: set by the TLML syntax of the Language Object
Examples
In the following example, we'll assume there is a Language Object named US_PRESIDENTS_LIVING.LIST with the TLML syntax:
Biden/Trump/Obama/Clinton/Bush/Carter
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%US_PRESIDENTS_LIVING.LIST | Carter | Reagan |
%US_PRESIDENTS_LIVING.LIST | Clinton | Nixon |
%US_PRESIDENTS_LIVING.LIST | Obama | |
%US_PRESIDENTS_LIVING.LIST | Biden | |
Entity syntax
Official name: Entity syntax
Alternative name: Entity condition
Syntax
'%' entity_name '.ENTITY'

entity_name = word
General description
An Entity can hold one atomic syntax per entry. Entries are limited to the following types: Word syntax (word or multi-words), Language Object syntax, Entity syntax or Annotation syntax. Each entry value must be unique within the Entity. Zero, one or more NLU variables of type string or script may be attached to the entries.
The Longest Match (:L) option is implicitly applied to the entire Entity syntax.
Please note that even though the syntax of an Entity name is a word (thus allows all possible Unicode characters), by convention it is written in UPPERCASE and cannot contain whitespace (spaces, tabs, newlines, etc.) or any of the reserved_characters (see chapter Reserved Characters for a listing). Furthermore, an Entity name always ends in .ENTITY. This constraint is enforced by the Teneo Frontend application.
For an Entity syntax to be fulfilled, the sentence must contain text fulfilling the syntax contained in an entry of the Entity itself.
The referenced Entity may set NLU variables. For more details, see section NLU variables and attached scripts.
Logical behavior
The Entity syntax is true if and only if the syntax of the referenced Entity itself is true. If the referenced Entity doesn't exist, then the Entity syntax is false.
Generated match data
The matches of the syntax of an Entity are generated by the syntax of the referenced Entity itself. If variables are defined in the Entity, they will be added to these matches.
- Range start: set by the TLML syntax of the Entity
- Used words: set by the TLML syntax of the Entity
- NLU variables: all NLU variables defined by the Entity
- NLU script data: not modified
- LOB variables: not modified
- Longest match data: set by the TLML syntax of the Entity
Examples
In the following example, we'll assume there is an Entity named COLOR.ENTITY, with the entries:
Blue
Green
Red
Yellow
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%COLOR.ENTITY | Green | Orange |
%COLOR.ENTITY | I like red | I don't like black |
%COLOR.ENTITY | Yellow stones | Yellowstone |
Annotation syntax
Official name: Annotation syntax
Alternative name: Annotation condition, Input Annotation condition/syntax
Syntax
'%$' annotation_name

annotation_name = word
General description
Annotations follow the same naming conventions as Language Objects and Entities, thus names must be written in UPPERCASE and cannot contain whitespace (spaces, tabs, newlines etc.) or any of the reserved_characters (see chapter Reserved characters for a listing).
For an Annotation syntax to be fulfilled, an annotation with the name given in the Annotation syntax must exist on the sentence itself or one or more sentence words.
The variables of the matching annotation are provided to optional predicate and propagation scripts attached to the annotation syntax. For more details, see section NLU variables and attached scripts.
The platform generates two default collections of annotations:
- System Annotations: Built-in; these two annotations depend on the session state and are set by the Teneo Engine itself:
  - _INIT if a dialogue has just begun
  - _TIMEOUT if the current dialogue was restarted after being timed out
- Standard Annotations: These annotations are set by the Input Processors of the Teneo Engine, according to the active language configuration. See the Annotating inputs page for a listing of these annotations.
According to the active language configuration, input processors may set additional annotations, for example, part-of-speech annotations, or number/digit annotations.
Annotations may also be created by solution scripting.
Logical behavior
The annotation syntax is true if and only if any of the following occurs:
- if the current sentence itself has an annotation with the name given in the annotation syntax (that is, the annotation is not assigned to any sentence word)
- if the current sentence has a word with an annotation with the name given in the annotation syntax, then if the DFB flag is NOT activated, the word with the annotation is in the sentence words from range start of a condition context match instance to sentence end
- if the current sentence has a word with an annotation with the name given in the annotation syntax, then if the DFB flag is activated, the word with the annotation is exactly at the sentence range start of a condition context match instance.
Note that an annotation instance can be assigned to multiple words. In this case the above logical rules are applied only to the left most of the words the annotation is assigned to. That is, such an annotation is not fulfilling the annotation syntax if the left most word is to the left of the range start of a match instance, even if one or more of the other words are in the sentence words from range start to sentence end.
Generated match data
The matching word is added to each MI in the condition context with a range start lower than or equal to (exactly equal to if the DFB flag is activated) the position of the matching word. All MIs with a range start higher than the positions of all matching words are dropped.
If the sentence contains multiple words which match the TLML syntax, a copy of all the MIs will be created for each match (i.e. each match will have its own set of MIs); then each matching word will be added to the used words of its set of MIs.
- Range start: next word after the matching sentence word
- Used words: used words of the condition context match plus the used words of the annotation
- NLU variables: set to the variables attached to the annotation
- NLU script data: the NLU scripts attached to the annotation syntax are appended
- LOB variables: not modified
- Longest match data: not modified
Examples
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%$_QUESTION | What? | Question |
%$_QUESTION | ? huh ?? | |
%$NN.POS | A book | Book this |
%$NN.POS | A nice play | Play louder |
Dynamic syntax
Official name: Dynamic syntax
Alternative name: Dynamic condition, Dynamic word condition
Syntax
'#' variable_name

variable_name = word
General description
For a dynamic syntax to be fulfilled, the sentence must contain the content of the specified variable; case, position and frequency of that content within the sentence are irrelevant. The dynamic syntax matches a sentence word if the simplified form of the variable content is exactly the same as the final word form of the sentence word.
The "variable content" is the result of the Java method toString() called on the variable content object.
Remember that variable names are case sensitive.
The dynamic syntax supports the Position Option and the Exact Option.
Logical behavior
The dynamic syntax is true if and only if at least one match instance of the condition context exists so that:
- if the DFB flag is NOT activated, the dynamic syntax matches one or more of the sentence words from range start of the match instance to sentence end
- if the DFB flag is activated, the dynamic syntax matches the sentence word at and only at range start of the match instance
The dynamic syntax is false if:
- the variable doesn't exist, or
- the variable content is null, or
- the call to the method toString() on the variable content object caused an exception
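The conditions above can be sketched in Python as follows. This is only an approximation under stated assumptions: `dynamic_word` and `dynamic_matches` are hypothetical names, a plain dictionary stands in for the scripting variables, `str()` stands in for Java's toString(), and lowercasing stands in for the simplified form of the variable content.

```python
def dynamic_word(variables, name):
    """Return the text a dynamic syntax would test, or None if it is false."""
    if name not in variables:  # the variable doesn't exist
        return None
    value = variables[name]
    if value is None:          # the variable content is null
        return None
    try:
        return str(value)      # stands in for Java's toString()
    except Exception:          # the conversion caused an exception
        return None


def dynamic_matches(variables, name, sentence):
    """Return True if any sentence word equals the variable content."""
    text = dynamic_word(variables, name)
    if text is None:
        return False
    # Case is irrelevant for the comparison; variable names stay case-sensitive.
    return any(text.lower() == word.lower() for word in sentence)


variables = {"User_City": "Memphis", "Unset": None}
print(dynamic_matches(variables, "User_City", ["I", "live", "in", "MEMPHIS"]))
print(dynamic_matches(variables, "user_city", ["Memphis"]))
```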
Generated match data
The matching word is added to each MI in the condition context with a range start lower than or equal to (exactly equal to if the DFB flag is activated) the position of the matching word. All MIs with a range start higher than the positions of all matching words are dropped.
If the sentence contains multiple words which match the TLML syntax, a copy of all the MIs will be created for each match (i.e. each match will have its own set of MIs); then each matching word will be added to the used words of its set of MIs.
- Range start: next word after the matching sentence word
- Used words: used words of the condition context match plus the matching sentence word
- NLU script data: not modified
- LOB variables: not modified
- Longest match data: not modified
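As a rough illustration of the logical behavior described above, the following Python sketch models how a dynamic syntax could be checked against a sentence. It is not Engine code: `simplify` is a hypothetical stand-in (plain lowercasing) for the Engine's real simplification, `str()` stands in for Java's `toString()`, and all function and parameter names are invented for this example.

```python
def simplify(text):
    # Assumption: lowercasing stands in for the Engine's real simplification.
    return text.lower()


def dynamic_match_positions(variables, name, words, range_start=0, dfb=False):
    """Return the positions of the sentence words matched by #name."""
    if name not in variables:          # the variable doesn't exist
        return []
    value = variables[name]
    if value is None:                  # the variable content is null
        return []
    try:
        content = str(value)           # stands in for Java's toString()
    except Exception:                  # the conversion raised an exception
        return []
    target = simplify(content)
    if dfb:
        # DFB flag active: only the word exactly at range start may match.
        candidates = range(range_start, min(range_start + 1, len(words)))
    else:
        # Otherwise: any word from range start to the end of the sentence.
        candidates = range(range_start, len(words))
    return [i for i in candidates if simplify(words[i]) == target]
```

An empty result corresponds to the dynamic syntax being false; each returned position would give rise to its own set of MIs.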
Examples
In the following example, we'll assume there is a variable called `User_City` with the value "Memphis".
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
#User_City | Memphis | MemphisGrizzlies |
| | MEMPHIS | Tennessee |
Topic syntax
Official name: Topic syntax
Alternative name: Topic condition
Syntax
```tlml
'@' variable_name

variable_name = word
```
General description
For a topic syntax to be fulfilled, a scripting variable with the given name must be set to a value which is true (remember that variable names are case-sensitive). The topic syntax neither takes sentence words into account nor adds to the used words of the resulting matches.
The set of available scripting variables depends on the location of the topic syntax. In the TLML syntax of global listeners, only session variables are accessible; otherwise session and Flow variables are accessible. Session variables can be assigned a certain value for a specific number of turns, after which they revert to their default values. The life span of a session variable assignment can be set by solution scripting but is, by default, the entire session.
See section Glossary – Concept reference for the definition of Studio concepts such as Flows, listeners and variables.
Logical behavior
The topic syntax is true if and only if the variable content:
- has a Boolean type and the value is true, or
- has a number type and the value is neither zero nor NaN ("not a number"), or
- is a character and the character code is not zero, or
- is a character sequence (e.g. a string) and the sequence is not empty
If the variable value doesn't have any of the types listed above or is null, then the topic syntax is false.
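The truth rules listed above can be sketched as a small Python function. This is only an illustrative model with an invented name (`topic_is_true`); the Engine evaluates Java/Groovy values, and because Python has no separate character type, the character rule is left out here.

```python
import math


def topic_is_true(value):
    """Model of the topic syntax truth rules for a variable's content."""
    if value is None:
        return False
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # Numbers are true unless they are zero or NaN.
        return value != 0 and not (isinstance(value, float) and math.isnan(value))
    if isinstance(value, str):
        # Character sequences are true unless they are empty.
        # The separate character rule (code not zero) is not modeled.
        return len(value) > 0
    return False  # any other type makes the topic syntax false
```

Note that, under these rules, a non-empty list or map does not make the topic syntax true, which differs from ordinary Python truthiness.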
Generated match data
The matches of the condition context are not changed and are passed on as is.
- Range start: not modified
- Used words: not modified
- NLU script data: not modified
- LOB variables: not modified
- Longest match data: not modified
Examples
In the following example, we'll assume there is a global Boolean variable called `fish` in the solution.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
@fish | All inputs are recognized as long as the variable fish has a value that returns true. | No input is recognized if the variable fish has a value that returns false. |
Script syntax
Official name: Script syntax
Alternative name: Script condition
Syntax
```tlml
'{' script_expression '}'

script_expression = ( unicode_chars - '}' | '\}' )*
```
A script condition is formed by a script expression enclosed in braces. A right brace within the script expression text must be escaped by prefixing it with a backslash '\}'. All other reserved characters of the syntax can be used in the script expression text without escaping.
In summary, the effective script expression text results from stripping off the braces and removing the backslash in front of escaped right braces; backslashes appearing elsewhere in the text are kept.
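The stripping and unescaping described above can be sketched in a few lines of Python. The function name is invented for this illustration, and the sketch assumes it is handed a complete, well-formed script condition; it does not attempt to parse surrounding TLML.

```python
def effective_script_expression(condition):
    """Strip the enclosing braces and unescape right braces."""
    if len(condition) < 2 or condition[0] != "{" or condition[-1] != "}":
        raise ValueError("a script condition must be enclosed in braces")
    body = condition[1:-1]
    # Only a backslash directly in front of a right brace is removed;
    # backslashes elsewhere in the text are kept.
    return body.replace("\\}", "}")
```

For example, the condition `{User_Age > 21}` yields the expression text `User_Age > 21`, which would then be handed to the scripting engine.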
General description
The script expression is evaluated with the Groovy scripting engine.
For a script condition to be fulfilled, the result of the expression evaluation has to have a Boolean value of true. If the result type is not a Boolean, then it is converted to Boolean according to Groovy's type conversion rules.
The script condition does not take sentence words into account, nor does it add to the used words of the resulting matches.
Generated match data
The matches of the condition context are not changed and are passed on as is.
- Range start: not modified
- Used words: not modified
- NLU script data: not modified
- LOB variables: not modified
- Longest match data: not modified
Examples
In the following example, we'll assume that a `User_Age` variable has been set, with a value equal to the user's age.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
{User_Age > 21} | All inputs are recognized as long as User_Age is greater than 21. | No input is recognized if the User_Age is less than or equal to 21. |
Compound syntaxes
Compound syntaxes consist of two or more atomic syntaxes put together in formulas with the use of operators. There are several operators and they can be organized by the use of parentheses. Note that parentheses are only necessary when more than one operator is being used in the TLML syntax.
The following operators are possible for compound syntaxes:
- And (&) - operator
- Same Match (&=) - operator
- Bigger Match (&>) - operator
- Smaller Match (&<) - operator
- Overlap Match (&~) - operator
- Different Match (&^) - operator
- Not Same Match (!&=) - operator
- Not Bigger Match (!&>) - operator
- Not Smaller Match (!&<) - operator
- Not Overlap Match (!&~) - operator
- Not Different Match (!&^) - operator
- Followed By (+) - operator
- Directly Followed By (>>) - operator
- Not Directly Followed By (!>>) - operator
- Or (/) - operator
- Not (!) - operator
- Maybe (~) - operator
- Compound (|) - operator
- Parentheses
The following binding groups are defined for the operators:
Binding groups | Operators |
---|---|
Strong binding | NOT (!), COMPOUND (\|) |
Weaker binding | OR (/), AND (&), SAME MATCH (&=), BIGGER MATCH (&>), SMALLER MATCH (&<), OVERLAP MATCH (&~), DIFFERENT MATCH (&^), NOT SAME MATCH (!&=), NOT BIGGER MATCH (!&>), NOT SMALLER MATCH (!&<), NOT OVERLAP MATCH (!&~), NOT DIFFERENT MATCH (!&^), FOLLOWED BY (+), DIRECTLY FOLLOWED BY (>>), NOT DIRECTLY FOLLOWED BY (!>>), MAYBE (~) |
The evaluation order of compound syntaxes containing different operators of equal binding strength is unspecified, therefore it needs to be clarified by using parentheses.
And operators
The And operators section refers to a set of compound operators (&, &=, &>, &<, &~, &^) which have a very similar behavior.
Syntax
```tlml
left_syntax whitespace* and_operator whitespace* right_syntax

and_operator = '&' | '&=' | '&>' | '&<' | '&~' | '&^'
left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
```
The And operators are binary operators, that is, they take two operands with the and_operator symbol separating them. The left and the right operands can be atomic or compound syntaxes.
General description
The And operators define a Boolean AND syntax on the two operand conditions; for the AND syntax to be fulfilled, both operand conditions must be met.
Logical behavior
The And syntax is true if and only if both operand syntaxes are true.
If the DFB flag is active in the passed-in condition context, then the operand syntaxes are evaluated twice:
- Once with the DFB flag passed only to the left syntax (evaluation 1)
- And again, with the DFB flag passed only to the right syntax (evaluation 2)
This is done to ensure that the And syntax is also fulfilled if only one of the operand syntaxes matches at the range start position of the passed-in MIs.
Generated match data
Both operand syntaxes are evaluated independently of each other, by using a copy of the condition context passed in to the And syntax. The evaluation and MI updates of one operand condition do not affect the evaluation and MI updates of the other operand condition.
After the evaluation of the operand syntaxes, left and right output MIs are "multiplied" to generate the output of the And syntax: each possible pair of a left and right output MI, both originating from the same input MI and fulfilling the operator's used word constraint, is merged into a final MI. This multiplication is "left-heavy", that is, it begins with the first left-side MI to generate merges with right-side MIs in their order, followed by mergers of the second left-side MI with the right-side MIs, and so forth. If two or more generated MIs have the same used words, then only the first one is kept and the others are dropped.
If the DFB flag is activated in the passed-in condition context, and both operand syntaxes are fulfilled in both evaluations (see above), then the MIs generated by multiplication of the MIs of the first evaluation are followed by the MIs generated by multiplication of the MIs of the second evaluation.
The MI merging is carried out by inserting data of the right-side MI into the left-side MI. As a result, if both MIs have NLU scripts attached, then the left-side scripts are executed before the right-side scripts. Furthermore, if both MIs have LOB variables with the same name, then for each of these conflicting variables the value from the right-side MI takes precedence.
- Range start: the higher of the range starts of the left-side and right-side MI
- Used words: aggregation of used words from left-side and right-side MI
- NLU script data: left-side NLU scripts, followed by the right-side NLU scripts
- LOB variables: aggregation of LOB variables from left-side and right-side MI (right-side one taking precedence in case of variables with the same name)
- Longest match data: left-side longest match data, followed by the right-side longest match data
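To make the "left-heavy" multiplication and the merge rules more concrete, here is a minimal Python sketch. The `MI` class, its fields and the `origin` identifier are assumptions made for this illustration and do not reflect the Engine's internal data structures; longest match data is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class MI:
    origin: int                 # hypothetical id of the input MI it stems from
    range_start: int
    used_words: frozenset       # positions of the used sentence words
    scripts: tuple = ()
    lob: dict = field(default_factory=dict)


def merge(left, right):
    """Insert the data of the right-side MI into the left-side MI."""
    return MI(
        origin=left.origin,
        range_start=max(left.range_start, right.range_start),
        used_words=left.used_words | right.used_words,
        scripts=left.scripts + right.scripts,   # left scripts run first
        lob={**left.lob, **right.lob},          # right side wins on conflicts
    )


def multiply(left_mis, right_mis, constraint=lambda l, r: True):
    """Left-heavy pairing of left and right output MIs."""
    result, seen = [], set()
    for left in left_mis:                       # outer loop: left side
        for right in right_mis:                 # inner loop: right side
            if left.origin != right.origin:
                continue                        # must stem from the same input MI
            if not constraint(left.used_words, right.used_words):
                continue                        # operator's used word constraint
            merged = merge(left, right)
            if merged.used_words in seen:
                continue                        # same used words: keep the first
            seen.add(merged.used_words)
            result.append(merged)
    return result
```

With the default `constraint`, the sketch corresponds to the plain & operator, which imposes no restriction on the used words.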
And operator
Official name: And operator
Representation: &
Syntax
```tlml
left_syntax whitespace* '&' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
```
General description
The & operator is the simplest of the And operators family. It does not impose a constraint on the used words of the matches from each operand.
For more information on this operator's logical behavior and generated match data, please refer to the beginning of the section And operators.
Examples
The first example illustrates how the syntax only requires that the words are used, regardless of their order.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
I&love&you | I love you | Everyone loves you |
| | Your parents love you but I don't | |
This other example shows that the syntax applies not only to unique words but also to LIST Language Objects or other constructions.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%CITIES.LIST & %CITIES.LIST | Barcelona | Andorra |
| | London | Portugal |
| | Tokyo and Kyoto | |
Extended And operators
General description
Like the simple And operator, all extended And operators define a Boolean And syntax, but with an additional requirement: the used words of at least one pair of MIs, consisting of one left operand MI and one right operand MI, must fulfill an extra constraint in order to meet the overall And syntax.
Logical behavior
The extended And syntax is true if and only if:
- both operand conditions are true, and
- the used word constraint is fulfilled. The used word constraint varies depending on the operator and is met when at least one pair of MIs, formed by a left-side operand match and a right-side operand match which originate from the same MI in the condition context, complies with the required used word restriction.
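The per-operator constraints, which are detailed in the operator sections that follow, can be summarized as set comparisons on used word positions. The Python sketch below is an illustrative model only: the function names and the `CONSTRAINTS` table are invented here, and each match is reduced to the set of its used word positions, assuming all matches stem from the same MI in the condition context.

```python
def same(left, right):
    return left == right            # identical used words and positions


def bigger(left, right):
    return left >= right            # left is a superset (true if right is empty)


def smaller(left, right):
    return left <= right            # left is a subset (true if left is empty)


def overlap(left, right):
    return bool(left & right)       # at least one shared used word


def different(left, right):
    return not (left & right)       # no shared used word


CONSTRAINTS = {"&=": same, "&>": bigger, "&<": smaller, "&~": overlap, "&^": different}


def extended_and_is_true(operator, left_matches, right_matches):
    """True if at least one left/right pair fulfills the operator's constraint."""
    check = CONSTRAINTS[operator]
    return any(check(l, r) for l in left_matches for r in right_matches)
```

If either operand produced no matches, no pair exists and the function returns `False`, mirroring the requirement that both operand conditions are true.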
Same Match operator
Official name: Same Match operator
Alternative name: Same Match extended And operator
Representation: &=
Syntax
```tlml
left_syntax whitespace* '&=' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
```
General description
The Same Match operator is an enhanced member of the And operators family; therefore, for the syntax to be fulfilled, both operand syntaxes must be met. Since this is an extended operator, it also requires an additional used word constraint. In this case, the syntax is fulfilled if and only if at least one pair of a left-side operand match and a right-side operand match exists in which both matches have the same used words, at the same positions.
This operator is particularly useful when working with annotations, although it may be applied with any type of TLML syntax writing.
For more information on this operator's logical behavior and generated match data, please refer to the beginning of the section And operators.
Examples
The first example only matches synonyms of the verb "run" in past tense.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%RUN.VB.SYN &= %MST_PAST.ANNOT | I ran home | I am jogging |
| | He bolted out | I race |
This other example only matches if "hello" is the first word in a sentence.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%HELLO.FW.LEX &= %FIRST_WORD.SCRIPT | Hello to you | I said hello |
| | Hello and good morning | |
Bigger Match operator
Official name: Bigger Match operator
Alternative name: Bigger Match extended And operator
Representation: &>
Syntax
```tlml
left_syntax whitespace* '&>' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
```
General description
The Bigger Match operator is an enhanced member of the And operators family. This operator defines an And syntax that imposes an additional constraint on the used words of the operand matches; this syntax is fulfilled if and only if at least one pair of a left-side operand match and a right-side operand match exists so that the left-side match contains the same or additional used words, at the same sentence position, as the right-side match, or the right-side match has no used words. In other words, the left-side match must be a superset - same or bigger - of the right-side match.
This operator is particularly useful when working with annotations, although it may be applied with any type of TLML syntax writing.
For more information on this operator's logical behavior and generated match data please refer to the beginning of the section And operators.
Examples
The operator &> matches if the operand on the left matches on all words in the input that the operand on the right matches on, but the operand on the left can optionally match on more words as well.
This is useful, for example, when you only want a subset of a list to match.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%SUPPORTED_PRINTERS.LIST &> Epson | I need the user documentation for Epson 202XP | Where do I download a driver for the HP202XXL |
| | Where do I download a driver for Epson SSLX? | |
%CAR_BRANDS.LIST &> hybrid | Honda Accord hybrid | Subaru Forester |
| | Ford Fusion Hybrid | |
Smaller Match operator
Official name: Smaller Match operator
Alternative name: Smaller Match extended And operator
Representation: &<
Syntax
```tlml
left_syntax whitespace* '&<' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
```
General description
The Smaller Match operator is an enhanced member of the And operators family. This operator defines an And syntax that imposes an additional constraint on the used words of the operand matches; the syntax is fulfilled if and only if at least one pair of a left-side operand match and a right-side operand match exists so that the right-side match contains the same or additional used words, at the same sentence position, as the left-side match, or the left-side match has no used words. In other words, the left-side match must be a subset - same or smaller - of the right-side match.
This operator is particularly useful when working with annotations, although it may be applied with any type of TLML syntax writing.
For more information on this operator's logical behavior and generated match data, please refer to the beginning of section And operators.
Examples
For the syntax to evaluate to true, both operands must match in the input, and the right operand must match on at least all the words matched by the left operand.
For instance, we may want to match US cities only when they appear as part of a movie title:
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%CITIES_UNITED_STATES.LIST &< %MOVIE_TITLES.LIST | The movie Fear and Loathing in Las Vegas | I like going around loathing in Las Vegas |
| | I went to see Crocodile Dundee in Los Angeles | I went to the zoo and saw a crocodile in Los Angeles |
| | Have you ever seen Escape from New York? | The Lord of the Rings is a great film |
| | Chicago is a great musical | I don't like Vicky Cristina Barcelona |
Overlap Match operator
Official name: Overlap Match operator
Alternative name: Overlap Match extended And operator
Representation: &~
Syntax
```tlml
left_syntax whitespace* '&~' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
```
General description
The Overlap Match operator is an enhanced member of the And operators family. This operator defines an And syntax that imposes an additional constraint on the used words of the operand matches; this syntax is fulfilled if and only if at least one pair of a left-side operand match and a right-side operand match exists so that the matches share at least one used word at the same sentence position. Thus, the syntax is not fulfilled if a match has no used words.
This operator is particularly useful when working with annotations, although it may be applied with any type of TLML syntax writing.
For more information on this operator's logical behavior and generated match data, please refer to the beginning of section And operators.
Examples
The operator matches when both operands overlap, i.e. they share at least one used word in the input.
It is useful when the left and right operand share some or all used words.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%COUNTRIES.LIST &~ (the >> united >> *) | I am a citizen of the United States of America | I shop at the United Colors of Benetton |
| | I travel to the United Kingdom quite often | |
Different Match operator
Official name: Different Match operator
Alternative name: Different Match extended And operator
Representation: &^
Syntax
```tlml
left_syntax whitespace* '&^' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
```
General description
The Different Match operator is an enhanced member of the And operators family. This operator defines an And syntax that imposes an additional constraint on the used words of the operand matches; this syntax is fulfilled if and only if at least one pair of a left-side operand match and a right-side operand match exists so that the matches share no used word at the same sentence position.
This operator is particularly useful when working with annotations, although it may be applied with any type of TLML syntax writing.
For information on this operator's logical behavior and generated match data, please refer to the beginning of the section And operators.
Examples
The Different Match operator matches if both operands do not overlap, i.e. they do not share any word in the input.
For instance, the examples below match if the input contains two different cities/nationalities.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%CITIES.LIST &^ %CITIES.LIST | Paris and London | London |
| | Barcelona and New York City | |
(%MY.FW.LEX >> %NATIONALITY.NN.LEX) &^ %NATIONALITIES.LIST &^ %NATIONALITIES.LIST | My nationality is French and Canadian | My nationality is Norwegian |
| | My nationalities are Chinese and Italian | |
Negated And operators
The Negated And operators section refers to the set of compound operators (!&=, !&>, !&<, !&~, !&^) which have a very similar behavior.
Syntax
```tlml
left_syntax whitespace* negated_and_operator whitespace* right_syntax

negated_and_operator = '!&=' | '!&>' | '!&<' | '!&~' | '!&^'
left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
```
The Negated And operators are binary operators, that is, they take two operands with the negated_and_operator symbol separating them. The left and right operands can be atomic or compound conditions.
General description
The Negated And operators define a logical interconnection syntax on the two operand conditions. For a Negated And operator to be fulfilled, at least the left-side operand must be fulfilled. If the right-side operand is fulfilled as well, then the total operator syntax is met only if none of the matches of the right-side syntax comply with the used word constraint, which depends on the specific operator being used.
Note that the right-side syntax is always evaluated on the entire sentence, with the DFB flag deactivated, regardless of the condition context.
The Negated And operators are particularly useful when working with annotations, although they may be applied with any type of TLML syntax writing.
Logical behavior
The Negated And syntax is true if and only if:
- the left-side operand is true and the right-side operand is false, or
- both operand syntaxes are true and no pair formed by a left-side match and a right-side match exists so that the operator's used word constraint is fulfilled.
Generated match data
The condition context is passed only to the left-side syntax. The right-side syntax is always evaluated with an initial condition context, that is, it is tested on the entire sentence, with the DFB flag deactivated.
The resulting MIs are all the MIs generated by the left-side operand that do not match any MIs of the right-side operand, in an operator specific way. That is, the right-side operand MIs are used to drop certain MIs of the left-side operand; otherwise they do not contribute in any way to the final MIs.
- Range start: as set by the left-side syntax
- Used words: as set by the left-side syntax
- NLU script data: as set by the left-side syntax
- LOB variables: as set by the left-side syntax
- Longest match data: as set by the left-side syntax
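The filtering role of the right-side operand can be sketched as follows. This Python model is an illustration with invented names; matches are reduced to sets of used word positions. It also assumes that right-side matches without used words are ignored for every operator, which the manual states explicitly only for some of the Negated And operators.

```python
def same(left, right):
    return left == right


def overlap(left, right):
    return bool(left & right)


def negated_and(left_matches, right_matches, constraint):
    """Keep the left-side matches that no right-side match rules out."""
    # Assumption: right-side matches with no used words are not considered.
    relevant = [r for r in right_matches if r]
    return [
        left for left in left_matches
        if not any(constraint(left, r) for r in relevant)
    ]
```

The Negated And syntax is true when the returned list is not empty. For example, for `%PLAY !&= %NN.POS` and the input "I want to play some music", the left side uses word 3 ("play") and the noun annotation uses word 5 ("music"), so the left-side match is kept.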
Not Same Match operator
Official name: Not Same Match operator
Alternative name: Not Same Match extended And operator
Representation: !&=
Syntax
```tlml
left_syntax whitespace* '!&=' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
```
General description
The Not Same Match operator is a member of the Negated And operators family. For the syntax to be fulfilled, at least the left-side syntax must be fulfilled. If the right-side syntax is also fulfilled, then the entire syntax is fulfilled if and only if at least one match of the left-side syntax does not have the exact same used words as any match of the right-side syntax. Here matches of the right-side syntax with no used words are not considered.
For more information on this operator's logical behavior and generated match data, please refer to the beginning of the section Negated And operators.
Examples
The following examples illustrate the behavior of the operator, which is very useful to exclude an exact, specific pattern from a syntax.
For instance, the first examples will only match if the user input sentence contains a synonym from the `WRITE.VB.SYN` or `CHANGE.VB.SYN` Language Objects that does not end in "ing".
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%WRITE.VB.SYN !&= *ing | I like to write | I like writing |
| | He writes | She is into scribbling |
| | She should write a book | I digress in writings |
%CHANGE.VB.SYN !&= *ing | I want to change my ticket | I am changing tickets |
| | I need to swap seats | I was swapping seats |
In the following example, the user input "play" matches as long as it doesn't appear as the first word in the matched user input.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
PLAY !&= *:1 | I want to play guitar | Play me a song! |
Next, the user input "play" matches as long as it isn't annotated as a noun.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%PLAY !&= %NN.POS | I want to play some music | I am going to see a play in the theatre |
Finally, in this example, any user input simplified to "meta" matches, except the exact user input "Metà".
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
Meta !&= Metà:E | It is a meta question | We attended the Metà conference |
| | Facebook is called Meta now | |
Not Bigger Match operator
Official name: Not Bigger Match operator
Alternative name: Not Bigger Match extended And operator
Representation: !&>
Syntax
```tlml
left_syntax whitespace* '!&>' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
```
General description
The Not Bigger Match operator is a member of the Negated And operators family. For its syntax to be fulfilled, at least the left-side syntax must be fulfilled. If the right-side syntax is also fulfilled, then the entire syntax is fulfilled if and only if at least one match of the left-side syntax does not have the same or more used words as/than any match of the right-side syntax. Here matches of the right-side syntax with no used words are not considered.
For more information on this operator's logical behavior and generated match data, please refer to the beginning of the section Negated And operators.
Examples
This syntax is useful when you want to exclude a smaller part of a syntax or pattern from an existing syntax.
The below example only matches when the user has a single issue and no more.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%I_HAVE_A_PROBLEM.PHR !&> *s | I have a problem | I have several problems |
| | I have one issue with this | I have two issues |
| | I have a problem seeing the page | |
The following example only matches on synonyms of "buy" that are not in the past tense; the Not Bigger Match operator is used as some of the synonyms may be longer than one word.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%BUY.VB.SYN !&> %$PAST.POS | I am buying a ticket | I bought a new car |
| | I am shopping for shoes | I just purchased a new laptop |
The next example will match as long as what is matched by * (asterisk) isn't also matched by the Language Object `COLORS.LIST`.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
(%I_NEED.PHR >> %NUMBERS.LIST >> %MY_PRODUCTS.LIST) !&> %COLORS.LIST | I need 5 new brackets | I need five green brackets |
| | I really need 7 new shelves | |
This example will match on all synonyms of "turn down", "bring down", etc. except for "lower".
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%TURN_DOWN.VB.SYN !&> %LOWER.VB.LEX | Turn down the music | May you lower the music? |
| | Bring down the volume | |
Finally, this example will match as long as none of the words matched by the PHR Language Object is the first word in the user input.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%HOW_DO_I_CANCEL.PHR !&> %FIRST_WORD.SCRIPT | Hello, can you help me cancel... | How do I cancel... |
| | Please help me to cancel this! | |
Not Smaller Match operator
Official name: Not Smaller Match operator
Alternative name: Not Smaller Match extended And operator
Representation: !&<
Syntax
```tlml
left_syntax whitespace* '!&<' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
```
General description
The Not Smaller Match operator is an enhanced member of the Negated And operators family. For its syntax to be fulfilled, the left-side syntax must match on the user input. The right-side syntax does not have to match, but if it does, it cannot consume all of the used words that the left-side syntax consumed. In other words, a left-side match is excluded when it is smaller than or equal to a right-side match (that is, all of its used words are also used by the right-side match).
For more information on this operator's logical behavior and generated match data, please refer to the beginning of the section Negated And operators.
Examples
The first example matches user inputs with "worry"/"worried", etc. (i.e. conjugations of the verb "worry"), as long as it doesn't also match on "don't worry" (i.e. the Language Object `DO_NOT.VB.LEX` and any included conjugations).
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%WORRY.VB.LEX !&< (%DO_NOT.VB.MUL >> %WORRY.VB.LEX) | I worry often | Don't worry! |
| | I am worried | Please do not worry |
The following example matches user inputs with the word "great" as long as the input doesn't also match on "Great Britain".
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
(%GREAT.ADJ.LEX !&< %GREAT_BRITAIN.NN.MUL) | Great day! | Great Britain |
| | Great to see you | |
The next example matches on user inputs with the word "nice" as long as it doesn't also match on "to Nice", "via Nice", etc.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
(%NICE.ADJ.LEX !&< ((%TO.FW.LEX / %FROM.FW.LEX / %VIA.FW.LEX / %BY.FW.LEX / %OVER.FW.LEX) >> %NICE.NN.LEX)) | A nice day! | I am flying from Nice |
| | Nice to see you. | I go via Nice |
Lastly, the example below matches on user inputs with the word "love" as long as it doesn't also match on the idiom "for the love of..."
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
(%LOVE.VB.LEX !&< (%FOR.FW.LEX >> %THE.FW.LEX >> %LOVE.VB.LEX >> %OF.FW.LEX)) | I love it! | For the love of God! |
| | Love is in the air | |
Not Overlap Match operator
Official name: Not Overlap Match operator
Alternative names: Not Overlap Match extended And operator
Representation: !&~
Syntax
```tlml
left_syntax whitespace* '!&~' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
```
General description
The Not Overlap Match operator is a member of the Negated And operators family. For its syntax to be fulfilled, at least the left-side syntax must be fulfilled. If the right-side syntax is also fulfilled, then the entire syntax is fulfilled if and only if at least one match of the left-side syntax does not share any used words with any match of the right-side syntax.
For more information on this operator's logical behavior and generated match data, please refer to the beginning of the section Negated And operators.
Examples
This operator is useful when you want a TLML syntax to match in general, except in a specific context which may contain either more or fewer used words than the syntax.
For example, you may want to match countries whose names do not contain any cardinal points.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%COUNTRIES.LIST !&~ %COMPASS_POINTS.LIST | I've been east of Papua New Guinea | I'm visiting the South Sandwich Islands |
| | Spain is south of France | The capital of North Korea is Pyongyang |
| | I live in Portugal | The West Bank is a territory, not a state |
In the following example, the input "smart phone" and synonyms matches as long as none of the used words are part of the objects annotated as "device".
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%SMART_PHONE.NN.SYN !&~ %$DEVICE | My smartphone received a notification | John's iPhone |
| | A smartphone just rang | The phone is in the kitchen |
Next, we may want to match requests to make a phone call while keeping out constructions like "call me X".
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
(%CAN_YOU_PHONE.PHR !&~ %MY_NAME_IS.PHR) + %NAMES.LIST | Please call Anna | Please call me Paul |
| | Can you call Pete? | Can you call me Paul from now on? |
Finally, this example does not match the question in the exact sequence "Did I open".
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
%I_OPEN.PHR !&~ %DID_I.VB.MUL | I open | Did I open? (due to the overlapping "i") |
| | I open, don't I? | |
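The rule of the Not Overlap Match operator can be illustrated with a toy model. The sketch below is not the Engine's algorithm: it assumes that every match is reduced to a set of 1-based used word positions, and the function name `not_overlap_match` as well as the word positions are hypothetical.

```python
def not_overlap_match(left_matches, right_matches):
    """Toy model of !&~ where every match is a set of used word positions."""
    if not left_matches:
        # The left-side syntax must be fulfilled in any case.
        return False
    # Fulfilled if some left match shares no used word with any right match.
    # Without right matches this is trivially true for every left match.
    return any(
        all(left.isdisjoint(right) for right in right_matches)
        for left in left_matches
    )


# "Did I open?" with assumed positions did=1, i=2, open=3:
# the left match {2, 3} and the right match {1, 2} share the word "i".
print(not_overlap_match([{2, 3}], [{1, 2}]))  # False
# "I open": the right-side syntax has no match at all.
print(not_overlap_match([{1, 2}], []))  # True
```

Working on sets of positions keeps the sketch independent of how the operand syntaxes themselves are evaluated.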
Not Different Match operator
Official name: Not Different Match operator
Alternative name: Not Different Match extended And operator
Representation: !&^
Syntax
tlml
left_syntax whitespace* '!&^' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
General description
The Not Different Match operator is a member of the Negated And operators family. For its syntax to be fulfilled, at least the left-side syntax must be fulfilled. If the right-side syntax is also fulfilled, then the entire syntax is fulfilled if and only if at least one match of the left-side syntax shares used words with every match of the right-side syntax. Matches of the right-side syntax with no used words are not considered here.
For more information on this operator's logical behavior and generated match data, please refer to the beginning of the section Negated And operators.
Examples
An obvious use case is aiming for a match unless the user input contains a specific word; another is allowing a word or word pattern to match if it occurs once in the sentence, but not if it appears twice.
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
(%FLOWERS.LIST &^ %POSITIVE_ADJECTIVES.LIST) !&^ (*flower & super*) | The windflower is supernice! | When a superhero marries they beflower his wife with wonderful rose petals |
| | The mayflower is superb | Dandelions are beautiful when they reflower but I find them superficial |
| | The sunflower is beautiful but too large for a jar | I heard a supermodel say that tulips are excellent and they taste like cauliflower |
| | I heard that tulips are tasty, just like cauliflower | |
| | Roses are pretty | |
(%HOW_DO_I_OPEN.PHR & %ACCOUNT.NN.LEX) !&^ %SAVING.NN.LEX | How do I open an account? | How do I open a savings account? |
| | | How do I open an account for savings? |
(%I_LIKE.PHR &^ %FRUIT.LIST) !&^ %FRUIT.LIST | I like apples | I like apples and pears |
| | I like pears | |
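The last example can be traced with a toy model of the Not Different Match rule. This is only a sketch under stated assumptions, not the Engine's implementation: matches are reduced to sets of 1-based used word positions, and the function name `not_different_match` and the positions are hypothetical.

```python
def not_different_match(left_matches, right_matches):
    """Toy model of !&^ where every match is a set of used word positions."""
    if not left_matches:
        # The left-side syntax must be fulfilled in any case.
        return False
    # Right-side matches without used words are not considered.
    right = [r for r in right_matches if r]
    # Some left match must share a used word with every remaining right match.
    return any(all(left & r for r in right) for left in left_matches)


# "I like apples": i=1, like=2, apples=3
print(not_different_match([{1, 2, 3}], [{3}]))  # True

# "I like apples and pears": apples=3, pears=5
# Neither left match shares words with both fruit matches.
print(not_different_match([{1, 2, 3}, {1, 2, 5}], [{3}, {5}]))  # False
```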
Other Compound operators
Followed By operator
Official name: Followed By operator
Representation: +
Syntax
tlml
left_syntax whitespace* '+' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
The Followed By operator is a binary operator, that is, it takes two operands with the + operator symbol separating them. The left and right operands can be atomic or compound syntaxes.
General description
The Followed By operator defines a Boolean syntax on the two operand syntaxes; for this syntax to be fulfilled, both operand syntaxes must be met so that the right operand syntax is fulfilled somewhere within the sentence range of a left operand MI. That is, the used words of a right operand match must follow the used words of a left operand match, directly or with a gap.
NOTE: this does not take into account the used words generated by the right operand of a Maybe syntax within the operands of the Followed By syntax. See section Maybe operator for details.
Logical behavior
The Followed By syntax is true if and only if
- the left-side syntax is true, and
- the right-side syntax is true on one or more of the MIs of the left-side syntax.
Generated match data
The condition context is passed as it is to the left-side syntax, then the DFB flag is deactivated and the condition context (with the MIs of the left-side syntax) is passed to the right-side syntax. Thus, the right-side syntax either drops MIs or adds data to them. The MIs returned by the right-side syntax are the final MIs generated by the Followed By syntax.
- Range start: range start of the right-side MI
- Used words: aggregation of used words from left-side and right-side MI
- NLU script data: left-side NLU scripts, followed by the right-side NLU scripts
- LOB variables: aggregation of LOB variables from left-side and right-side MI (right-side one taking precedence in case of variables with the same name)
- Longest match data: left-side longest match data, followed by right-side longest match data
Examples
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
i+love+you | I love you | Your parents love you, but I don't |
| | I don't love you | You love the same person I do |
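The table above can be reproduced with a small word-level sketch. It is a simplification, not the Engine's evaluation: it only handles plain word operands, uses a naive tokenizer, and the names `tokenize` and `followed_by` are hypothetical.

```python
import re


def tokenize(sentence):
    # Naive tokenizer (assumption): lower-case words, apostrophes kept.
    return re.findall(r"[a-z0-9']+", sentence.lower())


def followed_by(sentence, *terms):
    """True if the terms occur in this order, directly or with gaps."""
    words = tokenize(sentence)
    start = 0  # plays the role of the range start passed to the next operand
    for term in terms:
        try:
            start = words.index(term, start) + 1
        except ValueError:
            return False
    return True


print(followed_by("I don't love you", "i", "love", "you"))  # True
print(followed_by("You love the same person I do", "i", "love", "you"))  # False
```

Moving `start` behind each matched word mirrors how the right-side syntax is only tested on the sentence range left over by the left-side match.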
Directly Followed By operator
Official name: Directly Followed By operator
Representation: >>
Syntax
tlml
left_syntax whitespace* '>>' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
The Directly Followed By operator is a binary operator, that is, it takes two operands with the >> operator symbol separating them. The left and right operands can be atomic or compound syntaxes.
General description
The Directly Followed By operator defines a Boolean syntax on the two operand syntaxes; for this syntax to be fulfilled, both operand syntaxes must be met so that the right operand syntax is fulfilled at the beginning of the sentence range of a left operand MI. That is, the first used word of a right operand match must directly follow the last used word of a left operand match, with no gap.
NOTE: this does not take into account the used words generated by the right operand of a Maybe syntax within the operands of the Directly Followed By syntax. See section Maybe operator for details.
Logical behavior
The Directly Followed By syntax is true if and only if
- the left-side syntax is true, and
- the right-side syntax is true on one or more of the MIs of the left-side syntax, with DFB flag activated
Generated match data
The condition context is passed as is to the left-side syntax, then the DFB flag is activated and the condition context (with the MIs of the left-side syntax) is passed to the right-side syntax. Thus, the right-side syntax either drops MIs or adds data to them. The MIs returned by the right-side syntax are the final MIs generated by the Directly Followed By syntax.
- Range start: range start of the right-side MI
- Used words: aggregation of used words from left-side and right-side MI
- NLU script data: left-side NLU scripts, followed by the right-side NLU scripts
- LOB variables: aggregation of LOB variables from left-side and right-side MI (right-side one taking precedence in case of variables with same name)
- Longest match data: left-side longest match data, followed by the right-side longest match data.
Examples
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
Big>>man | That's a big man | That's a big fat man |
| | That's a big big man | |
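As a rough illustration of the "no gap" requirement, the following sketch checks two plain words for direct adjacency. It is a toy model with a naive tokenizer, not the Engine's evaluation, and the names `tokenize` and `directly_followed_by` are hypothetical.

```python
import re


def tokenize(sentence):
    # Naive tokenizer (assumption): lower-case words, apostrophes kept.
    return re.findall(r"[a-z0-9']+", sentence.lower())


def directly_followed_by(sentence, left, right):
    """True if some occurrence of `left` is immediately followed by `right`."""
    words = tokenize(sentence)
    return any(a == left and b == right for a, b in zip(words, words[1:]))


print(directly_followed_by("That's a big big man", "big", "man"))  # True
print(directly_followed_by("That's a big fat man", "big", "man"))  # False
```

In "That's a big big man" the first "big" fails, but the second one is directly followed by "man", which is enough for the syntax to be fulfilled.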
Not Directly Followed By operator
Official name: Not Directly Followed By operator
Representation: !>>
Syntax
tlml
left_syntax whitespace* '!>>' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
The Not Directly Followed By operator is a binary operator, that is, it takes two operands with the !>> operator symbol separating them. The left and right operands can be atomic or compound syntaxes.
General description
The Not Directly Followed By operator defines a logical interconnection syntax on the two operand syntaxes; for this syntax to be fulfilled, at least the left-side operand syntax must be met. If the right-side operand syntax is fulfilled too, then the entire syntax is fulfilled if a left-side syntax match exists so that its used words are not directly followed by the used words of a right-side syntax match.
NOTE: this does not take into account the used words generated by the right operand of a Maybe syntax within the operands of the Not Directly Followed By syntax. See section Maybe operator for details.
Logical behavior
The Not Directly Followed By syntax is true if and only if
- the left-side syntax is true and the right-side syntax is false, or
- both syntaxes are true and at least one MI of the left-side syntax exists so that the right-side syntax on this MI, with DFB flag activated, is false.
Generated match data
The condition context is passed only to the left-side syntax. The right-side syntax is always evaluated with a condition context with exactly one of the left-side MIs and the DFB flag activated.
The resulting MIs are all the MIs generated by the left-side operand that do not allow a match of the right-side operand. That is, the right-side syntax is used to drop certain MIs of the left-side operand; otherwise it does not contribute in any way to the final MIs.
- Range start: as set by the left-side syntax
- Used words: as set by the left-side syntax
- NLU script data: as set by the left-side syntax
- LOB variables: as set by the left-side syntax
- Longest match data: as set by the left-side syntax
Examples
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
Big!>>man | That's a big woman | That's a big man |
| | That's a big big man | |
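The way the right-side syntax only drops left-side MIs can be sketched as follows. This is a toy model for two plain words, not the Engine's evaluation; the tokenizer and the name `not_directly_followed_by` are hypothetical, and an MI is reduced to the position of the left-side word.

```python
import re


def tokenize(sentence):
    # Naive tokenizer (assumption): lower-case words, apostrophes kept.
    return re.findall(r"[a-z0-9']+", sentence.lower())


def not_directly_followed_by(sentence, left, right):
    """Return the positions of `left` that are NOT directly followed by `right`."""
    words = tokenize(sentence)
    surviving = []
    for i, word in enumerate(words):
        if word != left:
            continue
        next_word = words[i + 1] if i + 1 < len(words) else None
        if next_word != right:
            surviving.append(i)  # this left-side MI is kept
    return surviving


# The first "big" survives because it is followed by "big", not by "man".
print(not_directly_followed_by("That's a big big man", "big", "man"))  # [2]
print(not_directly_followed_by("That's a big man", "big", "man"))  # []
```

The syntax is fulfilled whenever the returned list is non-empty.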
Or operator
Official name: Or operator
Representation: /
Syntax
tlml
left_syntax whitespace* '/' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
The Or operator is a binary operator, that is, it takes two operands with the / operator symbol separating them. The left and right operands can be atomic or compound syntaxes.
General description
The Or operator defines a Boolean OR syntax on the two operand syntaxes; for this OR syntax to be fulfilled, at least one of the operand syntaxes must be met.
Logical behavior
The Or syntax is true if and only if at least one of the operand syntaxes is true.
Generated match data
Both operand syntaxes are evaluated independently of each other, each using a copy of the condition context passed in to the Or syntax. The evaluation and MI updates of one operand syntax do not affect the evaluation and MI updates of the other operand syntax.
The result of the Or syntax is the collection of the MIs of both operand syntaxes, in which the left-side MIs are followed by the right-side MIs (that is, in "syntax order"). A right-side operand MI is dropped if it has the same used words as a left-side operand MI.
- Range start: as generated by the particular operand syntax
- Used words: as generated by the particular operand syntax
- NLU script data: as generated by the particular operand syntax
- LOB variables: as generated by the particular operand syntax
- Longest match data: as generated by the particular operand syntax
Examples
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
dog/cat/hamster/parakeet | I have a dog | I don't like cats or dogs |
| | I used to have a cat | I love riding my horse |
| | I don't have a parakeet | |
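The merging of operand MIs described under "Generated match data" can be sketched like this. It is a minimal model, assuming an MI is represented only by its used words (a frozenset of word positions); the function name `or_match_instances` is hypothetical.

```python
def or_match_instances(left_mis, right_mis):
    """Collect MIs in syntax order: left-side MIs first, then right-side MIs.

    A right-side MI is dropped if a left-side MI has the same used words.
    """
    merged = list(left_mis)
    for mi in right_mis:
        if mi not in left_mis:
            merged.append(mi)
    return merged


left = [frozenset({1})]
right = [frozenset({1}), frozenset({1, 2})]
# The right-side MI with used words {1} duplicates a left-side MI and is dropped.
print(or_match_instances(left, right))
```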
Not operator
Official name: Not operator
Representation: !
Syntax
tlml
'!' whitespace* syntax

syntax = atomic_syntax | compound_syntax
The Not operator is a unary operator, that is, it takes the one operand that follows after the ! operator symbol.
General description
The Not operator defines a Boolean NOT condition. For this syntax to be fulfilled, the operand syntax must not be met.
Syntaxes can be strengthened by using the Not operator to screen out sense-reversing words, such as "not" and "don't".
Logical behavior
The Not syntax is true if and only if at least one MI of the condition context exists for which the operand syntax is false.
Generated match data
The operand syntax is re-evaluated separately for each MI of the condition context. The resulting MIs are those for which the operand syntax does not match. That is, the operand syntax is used to drop certain MIs of the condition context; otherwise it does not contribute in any way to the final MIs.
- Range start: not modified
- Used words: not modified
- NLU script data: not modified
- LOB variables: not modified
- Longest match data: not modified
Examples
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
!cat | I have a dog | I don't have a cat |
| | I love cats | Are you a cat? |
Bird!dog | I have a bird | I don't have a dog |
| | I have a bird and hate dogs | I have a bird, but no dog |
See additional examples of using the Not operator in the Bracketed syntax section.
Maybe operator
Official name: Maybe operator
Representation: ~
Syntax
tlml
left_syntax whitespace* '~' whitespace* right_syntax

left_syntax = atomic_syntax | compound_syntax
right_syntax = atomic_syntax | compound_syntax
The Maybe operator is a binary operator, that is, it takes two operands with the ~ operator symbol separating them. The left and right operands can be atomic or compound syntaxes.
General description
The Maybe operator defines a syntax that is fulfilled if the left operand syntax is fulfilled. The right operand syntax may also be fulfilled, but it does not have to be. The left operand syntax is evaluated in the condition context, while the right operand syntax is always evaluated on the entire sentence.
The data (notably the used words) of all matches of the right operand syntax is accumulated and then added to all matches of the left operand syntax.
The most appropriate use of the Maybe syntax is in syntaxes that have limited the count of unused words to a low number. This limit specifies how many words, besides those explicitly specified in the TLML syntax, may be contained within the sentence while still satisfying the total syntax.
Logical behavior
The Maybe syntax is true if the left-side syntax is true. The logical result of the right-side syntax is not considered.
Generated match data
The condition context is passed only to the left-side syntax. The right-side syntax is always evaluated with an initial condition context, that is, it is tested on the entire sentence, with the DFB flag deactivated.
The resulting MIs are all the MIs generated by the left-side operand, merged with the accumulated data of all right-side syntax MIs (if any).
NOTE: the range start of the final MIs is defined only by the left-side syntax, even if the MI data merged in from right-side MIs contain used words after that range start. In other words, the MIs of the right-side syntax do not change the sentence test range.
- Range start: as set by the left-side syntax
- Used words: used words of the left-side MI plus used words of all right-side MIs
- NLU script data: NLU scripts of left-side MI, followed by NLU scripts of all right-side MIs
- LOB variables: aggregation of LOB variables from left-side MI and all right-side MIs (left-side one taking precedence in case of variables with same name)
- Longest match data: left-side MI longest match data, followed by longest match data of all right-side MIs
Examples
This example shows how the Maybe operator can trigger the right syntax. In this case, the syntax is matched if the phrase Language Object %WHAT_IS_THE_PRICE_OF.PHR occurs and no other word remains unused, that is, every other word in the sentence is not annotated as a noun.
Unused Words Limit | TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|---|
0 | %WHAT_IS_THE_PRICE_OF.PHR ~ (* !&= %POS_NOUN.ANNOT) | What is the price of it? | What is the price of the car? |
| | | What is the cost then? | What is the cost of the ticket? |
| | | What is the cost? | The iPhone what is the price of it? |
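The interplay between the Maybe operator and the unused words limit can be approximated with the sketch below. It is a toy model, not the Engine's implementation: the noun annotation is a hard-coded hypothetical set, the left-side match is given as a set of 0-based word positions, and the function names are made up for illustration.

```python
def maybe_used_words(left_mi, right_mis):
    """Used words of a left-side MI plus the used words of all right-side MIs."""
    used = set(left_mi)
    for mi in right_mis:
        used |= mi
    return used


def matches_with_limit(words, left_mi, nouns, unused_limit):
    # Right side "* !&= %POS_NOUN.ANNOT": one MI per word that is not a noun.
    right_mis = [{i} for i, word in enumerate(words) if word not in nouns]
    used = maybe_used_words(left_mi, right_mis)
    return len(words) - len(used) <= unused_limit


NOUNS = {"price", "car"}  # hypothetical noun annotations
PHRASE = {0, 1, 2, 3, 4}  # "what is the price of" as matched by the phrase

print(matches_with_limit("what is the price of it".split(), PHRASE, NOUNS, 0))  # True
print(matches_with_limit("what is the price of the car".split(), PHRASE, NOUNS, 0))  # False
```

In the second sentence the noun "car" is used by neither operand, so one word stays unused and the limit of 0 is exceeded.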
Compound operator
Official name: Compound operator
Representation: | and §
Syntax
tlml
left_syntax whitespace* compound_operator whitespace* right_syntax

compound_operator = '|' | '§'
left_syntax = word_syntax
right_syntax = word_syntax
The Compound operator is a binary operator, that is, it takes two operands with the | (or §) operator symbol separating them. The left and the right operands can only be word syntax, without any syntax options.
The two different operator symbols have no functional difference; they solely exist for convenient usage on most western language keyboards.
General description
The Compound operator defines a syntax that recognizes compound terms in a variety of forms. It may only be used to separate two words; it will not work with word parts or parenthetical expressions. The Compound operator will recognize the two words if they appear as a compound word, if separated by a hyphen, or if used as two distinct words. In the latter case, however, the first word must appear directly before the second in the sentence, as if the Directly Followed By operator had been used. No words may appear between the two words.
More formally, the Compound syntax word1|word2 is functionally equivalent to this TLML syntax:
tlml
word1word2 / ( word1 >> word2 ) / ( word1 >> "-" >> word2 )
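Logical behavior
The logical behavior is that of the above functionally equivalent TLML syntax.
Generated match data
The match data is generated as by evaluation of the above functionally equivalent TLML syntax.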
- Range start: as set by functionally equivalent syntax
- Used words: as set by functionally equivalent syntax
- NLU script data: as set by functionally equivalent syntax
- LOB variables: as set by functionally equivalent syntax
- Longest match data: as set by functionally equivalent syntax
Examples
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
dog§house | We have a doghouse | He's a house-dog |
| | We have a dog-house | We have a house and a dog |
| | | We have a dog and a house |
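The functional equivalence stated above can be made concrete with a short sketch. It is only an illustration under assumptions: the tokenizer treats a hyphen as a separate token, normalization is reduced to lower-casing, and the names `expand_compound` and `compound_matches` are hypothetical.

```python
import re


def expand_compound(word1, word2):
    """Build the functionally equivalent TLML syntax for word1|word2."""
    return f'{word1}{word2} / ( {word1} >> {word2} ) / ( {word1} >> "-" >> {word2} )'


def tokenize(sentence):
    # Assumption: a hyphen between words is a token of its own.
    return re.findall(r"[a-z0-9']+|-", sentence.lower())


def compound_matches(sentence, word1, word2):
    """Toy check of the three alternatives on a tokenized sentence."""
    words = tokenize(sentence)
    for i, word in enumerate(words):
        if word == word1 + word2:
            return True  # written as one compound word
        if words[i:i + 2] == [word1, word2]:
            return True  # two distinct words, directly adjacent
        if words[i:i + 3] == [word1, "-", word2]:
            return True  # separated by a hyphen
    return False


print(expand_compound("dog", "house"))
print(compound_matches("We have a dog-house", "dog", "house"))  # True
print(compound_matches("We have a dog and a house", "dog", "house"))  # False
```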
Bracketed syntax
Official name: Bracketed syntax
Alternative name: Parentheses
Representation: ( )
Syntax
tlml
'(' whitespace* syntax whitespace* ')'

syntax = atomic_syntax | compound_syntax
General description
Parentheses do not define a syntax on their own but are used to group and organize syntax parts. They separate sets of individual operators as terms in their own right.
A Bracketed syntax also makes it possible to attach optional NLU scripts and to apply Syntax options to syntax parts as a whole. It supports the Longest Match Option and the Optional Match Option.
The result of a Bracketed syntax is the result of the entire syntax wrapped by the parentheses.
Logical behavior
The logical behavior of a Bracketed syntax is the same as the logical result of the syntax within the parentheses.
Generated match data
The match data is generated as by the wrapped syntax.
- Range start: as set by the wrapped syntax
- Used words: as set by the wrapped syntax
- NLU variables: as set by the wrapped syntax
- NLU script data: NLU scripts of the wrapped syntax followed by the NLU scripts attached to the brackets
- LOB variables: as set by the wrapped syntax
- Longest match data: as set by the wrapped syntax
Examples
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
I+like+my+ (dog/cat/hamster/goldfish) | I like my dog | I like cats |
| | I like my cat | I like my dogs |
The long list of pet names separated by the Or operator becomes just one more term in the string of Followed By terms, i.e.
The term I
Followed By LIKE
Followed By MY
Followed By DOG Or CAT Or HAMSTER Or GOLDFISH
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
((dogs/cats) + make + (good/great) + pets) &! (don't /dont / (do+not)) | Dogs make really good pets | Dogs do not make great pets |
| | Cats make great pets when they're happy | Cats don't make good pets |
| | | Great pets, dogs make |
| | | Cats make great pets when they don't scratch the couch |
| | | Cats do make great pets but not great guardians |
This is:
The term DOGS Or CATS
Followed by MAKE
Followed by GOOD Or GREAT
Followed by PETS
And
Not including DON'T
Or DONT
Or DO Followed By NOT
When the And, Followed By, Or, and Maybe operators are used by themselves, parentheses are not necessary. Acceptable TLML syntax could be:
tlml
i & like & my & dog

my + dog + is + english + sheepdog

dog / cat / bird / mouse / gerbil / hamster / frog

sheepdog ~english
The Not and Compound operators bind more strongly than the other operators. They may be combined with one of the other operators without the use of parentheses. Acceptable TLML syntax could be:
tlml
i & like & my &! cat

my + dog + is + english + sheep|dog
Syntax options
Syntax options may be attached to TLML syntaxes and serve to modify a certain aspect of the syntax behavior. Syntax options are not TLML syntaxes in themselves, thus they cannot stand alone.
Syntax
tlml
syntax ( ':' option )*

option = position_option | exact_option | longest_match_option | candidate_search_option
An option is separated from the syntax (or the previous option) by a colon, without any whitespace. If the syntax itself (e.g. a word syntax) contains a colon, then the syntax must be enclosed in double quotes.
Multiple options may be applied to the same syntax, in arbitrary order, but they may not be repeated. Depending on the particular condition only certain options may be applied; this is detailed in the following sections.
Position option
Official name: Position option
Alternative name: Position modifier/flag
Syntax
tlml
word_testing_syntax other_option* position_option other_option*

position_option = ':' [ '-' ] word_position
word_position = ( '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' )+
word_testing_syntax = word_syntax | word_part_syntax | dynamic_syntax
other_option = exact_option
The Position option can be applied to a word syntax, word part syntax or dynamic syntax.
General description
In order for the constrained syntax to be fulfilled, the sentence has to contain the word matched by the syntax at the specified word_position. The first word in the sentence has position 1. A position relative to the sentence end may be specified by giving a negative number; the last word in the sentence has position -1.
Position value 0 is invalid.
The Position option may be combined with the Exact option.
Logical behavior
The syntax with an attached Position option is true if and only if the sentence contains the syntax at the specified word_position.
Generated match data
The Position option does not modify the match data generated by the constrained syntax.
- Range start: not modified
- Used words: not modified
- NLU script data: not modified
- LOB variables: not modified
- Longest match data: not modified
Examples
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
name:3 | What's my name? | What is my name? |
name:-3 | Her name is Smith | Her name used to be... |
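The position arithmetic can be checked with a short sketch. It is a simplified model, not the Engine's evaluation: it assumes a naive tokenizer in which a contraction such as "What's" counts as a single word (consistent with the first example above), and the names `tokenize` and `word_at_position` are hypothetical.

```python
import re


def tokenize(sentence):
    # Assumption: a contraction such as "What's" counts as a single word.
    return re.findall(r"[A-Za-z0-9']+", sentence)


def word_at_position(sentence, word, position):
    """Check `word:position`, with 1 = first word and -1 = last word."""
    if position == 0:
        raise ValueError("position value 0 is invalid")
    words = tokenize(sentence)
    # Convert the 1-based (or negative, end-relative) position to a list index.
    index = position - 1 if position > 0 else len(words) + position
    return 0 <= index < len(words) and words[index].lower() == word.lower()


print(word_at_position("What's my name?", "name", 3))  # True
print(word_at_position("What is my name?", "name", 3))  # False
print(word_at_position("Her name is Smith", "name", -3))  # True
```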
Exact option
Official name: Exact option
Alternative name: Exact modifier/flag
Syntax
tlml
word_testing_syntax other_option* exact_option other_option*

exact_option = ':' ('E' | 'e')
word_testing_syntax = word_syntax | word_part_syntax | dynamic_syntax
other_option = position_option
The Exact option can be applied to word syntax, word part syntax or dynamic syntax.
General description
In order for the constrained syntax to be fulfilled, the sentence has to contain the word exactly as spelled in the syntax or scripting variable; thus, no normalization, spelling tolerance, auto-correction or simplification is applied.
Please note:
- The Exact option cannot be used directly on Language Objects nor Entities; this means that syntax like %OBJECT:E is syntactically wrong; however, the Exact option is applicable to words within a Language Object.
- Furthermore, the Exact option is not applicable to bracketed syntax: e.g. a syntax like (one/two):E is invalid syntax and should be written as (one:E/two:E).
The Exact option may be combined with the Position option.
Logical behavior
The syntax with an attached Exact option is true if and only if the sentence contains the specified word exactly as spelled in the TLML syntax or scripting variable.
Generated match data
The Exact option does not modify the match data generated by the constrained syntax.
- Range start: not modified
- Used words: not modified
- NLU script data: not modified
- LOB variables: not modified
- Longest match data: not modified
Examples
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
horse:E | horse | Horse |
| | | hoarse |
*house:E | Boathouse | BOATHOUSE |
| | | DogHouse |
Love*:E | Loved | loved |
| | | beloved |
And, if the variable Pet has a current value of Sheepdog:
TLML Syntax | Matched User Input | Unmatched User Input |
---|---|---|
#Pet:E | Sheepdog | sheep-dog |
| | | sheep-dog |
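The effect of the Exact option on the examples above can be imitated with Python's standard `fnmatch` module. This is a rough sketch only: it models the non-exact case as a mere case-insensitive comparison, whereas the Engine also applies normalization, spelling tolerance, auto-correction and simplification, and the function name `word_matches` is hypothetical.

```python
from fnmatch import fnmatchcase


def word_matches(pattern, word, exact=False):
    """Compare a word (or word part pattern with *) against an input word."""
    if exact:
        # Exact option: the spelling, including letter case, must be identical.
        return fnmatchcase(word, pattern)
    # Simplified stand-in for the default, case-insensitive behavior.
    return fnmatchcase(word.lower(), pattern.lower())


print(word_matches("*house", "Boathouse", exact=True))  # True
print(word_matches("*house", "DogHouse", exact=True))  # False
print(word_matches("*house", "DogHouse"))  # True
```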
Longest Match option
Official name: Longest Match option
Alternative name: Longest Match modifier/flag
Syntax
tlml
bracketed_syntax other_option* longest_match_option other_option*

longest_match_option = ':' ( 'L' | 'l' )
other_option = optional_match_option
The Longest Match option can only be applied to bracketed syntax.
General description
The Longest Match option modifies the way the Engine selects the final matching instance when returning the final result (see the section Obtaining the final result).
When using the Longest Match option, the Engine will first evaluate the full syntax, and will then select the match instance with the maximum amount of used words among the MIs generated by the wrapped syntax. This is called deferred Longest Match selection.
The Longest Match option is mostly useful for the Or operator, in case more than one of its alternatives matches.
The Longest Match option may be combined with the Optional Match option.
Logical behavior
The syntax with an attached Longest Match option is true if and only if the wrapped syntax is true. The Longest Match option only enforces the selection of the match with the maximum amount of used words among all matching instances of the wrapped syntax.
Generated match data
The Longest Match option does not modify the match data generated by the wrapped syntax.
- Range start: not modified
- Used words: not modified
- NLU script data: not modified
- LOB variables: not modified
- Longest match data: Longest match data is added when evaluating, and it is used for selecting the final match instance
Examples
TLML Syntax | User Input | Used Words |
---|---|---|
joe / (joe+doe) | Joe Doe. | Joe |
(joe / (joe+doe)):L | Joe Doe. | Joe Doe |
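The difference between the two rows can be sketched as a selection step over the match instances produced by the Or operator. This is a toy model: each MI is reduced to a tuple of used words, the name `select_match` is hypothetical, and picking the first MI when the option is absent is an assumption made for this illustration (the actual rule is described in the section Obtaining the final result).

```python
def select_match(match_instances, longest=False):
    """Pick the final match instance from a list of MIs in syntax order."""
    if not match_instances:
        return None
    if longest:
        # Longest Match: the MI with the maximum number of used words.
        return max(match_instances, key=len)
    # ASSUMPTION for this sketch: without the option the first MI is taken.
    return match_instances[0]


# MIs of joe / (joe+doe) on the input "Joe Doe."
mis = [("Joe",), ("Joe", "Doe")]
print(select_match(mis))  # ('Joe',)
print(select_match(mis, longest=True))  # ('Joe', 'Doe')
```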
Candidate Search option
Official name: Candidate Search option
Alternative name: Candidate Search modifier/flag, Complete Evaluation option/modifier/flag
Syntax
tlml
bracketed_syntax other_option* candidate_search_option other_option*

candidate_search_option = ':' ( 'C' | 'c' )
The Candidate Search option can only be applied to bracketed syntax.
General description
Since Teneo version 4.1, this syntax option has no effect and is deprecated. The Candidate Search option should not be used in TLML syntax conditions and users are encouraged to remove it from any syntax.
Optional Match option
Official name: Optional Match option
Alternative name: Optional modifier/flag
Syntax
tlml
bracketed_syntax other_option* optional_match_option other_option*

optional_match_option = ':' ( 'O' | 'o' )
other_option = longest_match_option
The Optional Match option can only be applied to bracketed syntax.
General description
The Optional Match option marks the wrapped syntax as optional in the surrounding syntax. Please refer to the Optionality section (next) for a full description of the effects of this option.
Examples
TLML Syntax | User Input | Used words |
---|---|---|
%I_AM.PHR >> (%REALLY.ADV.SYN):O >> %HAPPY.ADJV.SYN | I am happy | I, am, happy |
| | I am really happy | I, am, really, happy |
(%I_WOULD_LIKE_TO.PHR + %TRAVEL.VB.SYN) &^ (%TO.FW.LEX >> %CITIES.LIST):O &^ (%FROM.FW.LEX >> %CITIES.LIST):O | I want to travel | I, want, to, travel |
| | I want to travel to York. | I, want, to, travel, to, York |
| | I want to travel to New York. | I, want, to, travel, to, New, York |
| | I want to travel from Barcelona | I, want, to, travel, from, Barcelona |
| | I want to travel to Barcelona from Madrid | I, want, to, travel, to, Barcelona, from, Madrid |
%I_WANT_TO_ORDER.PHR + (%SOFT_DRINKS.LIST):O | I want to order | I, want, to, order |
| | I want to order a Pepsi | I, want, to, order, Pepsi |
| | I want to order a Pepsi max | I, want, to, order, Pepsi, max |
Optionality
By use of the Optional Match option, a TLML syntax part can be marked as optional. An optional syntax part may match the sentence, but it doesn't have to.
Optionality does not change the behavior of a syntax part that is marked as optional, but rather changes the behavior of a syntax operator that has an optional syntax as operand. This is expressed as a syntax expansion rule that depends on the particular syntax operator.
In order to ensure that the final syntax match includes sentence words that are matched by an optional syntax part, the Longest Match option is implicitly applied by the expansion rule.
Syntax expansion rules
A syntax expansion rule expresses the behavior of a syntax operator with optional operand(s). This expansion is applied during syntax evaluation, and the condition AST is not changed.
Certain expansion rules propagate the optionality from an operand of the unexpanded syntax to the total syntax resulting from the expansion. Therefore, the expansion rules are applied recursively until further expansion is not possible anymore.
Whole syntax
If optionality is applied to the entire syntax, then the result either is the longest match of the non-optional syntax, or a true but empty (no used words) match. That is, the total syntax result is always true.
Note: This does not apply for the syntax of a Language Object; see next rule.
Original | Expanded |
---|---|
(optional_syntax):O | (optional_syntax / {true}):L |
Language Object syntax
If optionality is applied to the entire syntax of a Language Object, then the optionality is just propagated to all Language Object syntaxes that refer to this Language Object.
Original | Expanded |
---|---|
%LOB | (%LOB):O |
LOB syntax = (optional_syntax):O | LOB syntax = optional_syntax |
Or operator
If optionality is applied to only one operand of the Or operator, then the optionality is redundant. If optionality is applied to both operands then it is propagated to the syntax, and the Longest Match option is applied too.
Original | Expanded |
---|---|
obligatory_syntax / (optional_syntax):O | obligatory_syntax / optional_syntax |
(optional_syntax1):O / (optional_syntax2):O | ((optional_syntax1 / optional_syntax2):L):O |
Operators &, +, >>, &=, &<, &>, &~, &^
If optionality is applied to only one operand of these operators, then the result is the matches of the non-optional syntax plus the matches of the non-optional operand. If optionality is applied to both operands, then it is propagated to the syntax, and the syntax result is the matches of the non-optional syntax plus the matches of each operand alone. In all cases the Longest Match option is applied too.
Original | Expanded |
---|---|
(optional_syntax):O OP obligatory_syntax | ((optional_syntax OP obligatory_syntax) / obligatory_syntax):L |
obligatory_syntax OP (optional_syntax):O | ((obligatory_syntax OP optional_syntax) / obligatory_syntax):L |
(optional_syntax1):O OP (optional_syntax2):O | (( (optional_syntax1 OP optional_syntax2) / optional_syntax1 / optional_syntax2 ):L ):O |
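The three rows of this table can be expressed as a small string-rewriting helper. This is only an illustration of the expansion rule on syntax text; as noted above, the Engine applies the expansion during evaluation and does not rewrite the condition. The function name `expand_binary` and its parameters are hypothetical.

```python
def expand_binary(op, left, right, left_optional=False, right_optional=False):
    """Expand `left OP right` for the operators &, +, >>, &=, &<, &>, &~, &^."""
    full = f"({left} {op} {right})"
    if left_optional and right_optional:
        # Both optional: either operand alone may match; optionality propagates.
        return f"(({full} / {left} / {right}):L):O"
    if left_optional:
        return f"({full} / {right}):L"
    if right_optional:
        return f"({full} / {left}):L"
    return f"{left} {op} {right}"


print(expand_binary("+", "a", "b", right_optional=True))  # ((a + b) / a):L
print(expand_binary("&", "a", "b", left_optional=True))  # ((a & b) / b):L
print(expand_binary("&", "a", "b", left_optional=True, right_optional=True))
```

The first two calls correspond to the Followed By and And expansions used in the examples at the end of this section.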
Operators ~, !>>, !&=, !&<, !&>, !&~, !&^
Optionality applied to the right operand is redundant and therefore ignored. Optionality applied to the left operand is propagated to the syntax, and the Longest Match option is applied too.
Original | Expanded |
---|---|
(optional_syntax):O OP obligatory_syntax | ((optional_syntax OP obligatory_syntax):L):O |
obligatory_syntax OP (optional_syntax):O | obligatory_syntax OP optional_syntax |
(optional_syntax1):O OP (optional_syntax2):O | ((optional_syntax1 OP optional_syntax2):L):O |
Operator !
Optionality applied to the operand of the Not operator is redundant and therefore ignored.
Original | Expanded |
---|---|
!(optional_syntax):O | !optional_syntax |
Bracketed syntax
Optionality applied to a bracketed syntax is just propagated outside of the parentheses.
Original | Expanded |
---|---|
... ( (optional_syntax):O ) ... | ... ( (optional_syntax) ):O ... |
Examples
The following examples try to demonstrate how the optionality expansion rules are applied.
Original syntax | Expanded syntax (applied) | Notes |
---|---|---|
(a/b/(c):O) !>> (d):O | (a/b/(c):O) !>> (d):O | Initial TLML syntax |
| | (a/b/c) !>> (d):O | Redundant optionality in the last operand of the Or operator removed |
| | (a/b/c) !>> d | Redundant optionality in the Not Directly Followed By operator removed |
((a):O & b) >> c | ((a):O & b) >> c | Initial TLML syntax |
| | ((a & b) / b):L >> c | Expanding the optionality following the rules for the And operator |
( (a):O / (b):O ) !&> c | ( (a):O / (b):O ) !&> c | Initial TLML syntax |
| | ((a/b):L):O !&> c | Optionality propagated to operator level and Longest Match added as both operands of the Or are optional |
| | (((a/b):L !&> c):L):O | Optionality propagated to operator level and Longest Match added according to the expansion rules of the Not Bigger Match operator |
| | (((a/b):L !&> c):L / {true}):L | Optionality expanded according to the whole syntax rule |
(a + (b):O ) &= c | (a + (b):O ) &= c | Initial TLML syntax |
| | ((a + b) / a):L &= c | Optionality expanded for the Followed By operator |
(a):O & (b):O | (a):O & (b):O | Initial TLML syntax |
| | (((a & b) /a/b):L ):O | Optionality expanded according to the expansion rules for the And operator with optional operands |
| | (((a & b) /a/b):L / {true}):L | Optionality expanded using the whole syntax rule |
NLU variables and attached scripts
Natural Language Understanding (NLU) variables and attached scripts are a way of capturing information from inputs and of writing TLML syntax that depends on that information.
There are two types of attached scripts:
- Propagation scripts, which are regular scripts with a dynamic script context that depends on where the script is defined. They allow users to assign values to NLU variables and to propagate NLU variables up through the Language Object and Entity hierarchy.
- Predicate scripts, which allow users to attach constraints that depend on NLU/annotation variable values to TLML syntax.
NLU variables are defined in Language Objects (LOB), Entities and propagation scripts, and are explicitly propagated up through the LOB/Entity hierarchy by propagation scripts. From top-level syntaxes (which are those located in listeners, triggers and transitions) the NLU variables can be assigned to session and Flow variables through propagation scripts. NLU variables can also be used by predicate scripts as variable value dependent constraints. Variable assignment, testing and propagation happen only after a successful full match of a TLML syntax.
Using NLU variables and attached scripts has several benefits. First, it eliminates the need to use listeners for capturing information in many common use cases. With NLU variables and propagation scripts, it is also possible to assign different values to a variable from within the same Language Object or Entity. In addition, each LOB/Entity syntax instance has its own private NLU variable space, making it possible to collect information from several occurrences of the same Language Object/Entity in the same input in a predictable and efficient way. Finally, predicate scripts enrich the syntax language with yet another fine-tuning level, making it possible to only match a syntax if a variable fulfills a specified criterion.
NLU variable representation
Each Language Object or Entity can have a list of NLU variables (in addition to LOB variables for Language Objects). This list holds the NLU variables defined for the Language Object or Entity, together with a default value for each variable.
Unlike LOB variables, which can only hold a string, an NLU variable can have any type of object as a value. Just like with Flow variables or session variables, the default value is a script expression which is evaluated by the system; if the default value is empty, the script will evaluate to null.
An NLU variable cannot be named lob or _USED_WORDS, since it would conflict with special variables which hold those names (please see the following sections).
Propagation scripts
Syntax
tlml
propagation_script = '^{' propagation_script_text '}'

propagation_script_text = ( any_char_but_right_brace | '\}' )*
Description
Unlike Language Object variables, NLU variables need to be explicitly propagated to the top level syntax. This means that, even if a Language Object or an Entity is matched during syntax testing, its NLU variables are not automatically available in the syntax matching result.
The variable propagation is handled by propagation scripts. Propagation scripts are not mandatory and can be attached to bracketed syntax parts, Language Object references, Entity references, input annotation references or directly after an attached predicate script.
They are attached to TLML syntax with the caret symbol and embedded inside braces, like this:
tlml
(New>>York)^{city="new_york"}

%CITY^{city="new_york"}
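To see how the propagation script grammar behaves on such syntax text, the following Python sketch locates `^{...}` parts with a regular expression that treats `\}` as an escaped brace. It is a rough illustration, not the Engine's parser: the names `PROPAGATION_SCRIPT` and `propagation_scripts` are hypothetical, and the script text is returned as written, with any escape sequences left in place.

```python
import re

# '^{' followed by any characters except '}', where '\}' counts as an
# escaped brace, followed by the closing '}'.
PROPAGATION_SCRIPT = re.compile(r"\^\{((?:\\\}|[^}])*)\}")


def propagation_scripts(syntax):
    """Return the script text of every '^{...}' found in `syntax`."""
    return PROPAGATION_SCRIPT.findall(syntax)


print(propagation_scripts('(New>>York)^{city="new_york"}'))
print(propagation_scripts(r'%CITY^{note="a \} b"}'))
```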
Script context
Propagation scripts are regular scripts, but the script context is dynamic and depends on where the script is defined.
There are two different types of context:
- Top level syntax, which are those syntaxes that are not referred to by another syntax, i.e. those attached to triggers, listeners and transitions
- LOB/Entity syntaxes and Annotation conditions
Examples
In top-level syntax, NLU variable values are propagated to Flow/session variables.
TLML syntax | result |
---|---|
%I_WANT_TO_GO.PHR + (to >> %HAMBURG.NN.LEX)^{destination="Hamburg"} | "Hamburg" would be propagated to the Flow/session variable destination |
(%I_WANT_TO_GO.PHR + to) >> (%CITIES.LIST^{destination=lob.city}):L | The CITIES.LIST NLU variable city would be propagated to the Flow/session variable destination |
%MY_NAME_IS.PHR >> (Tom/Dick/Harry)^{user_name=_USED_WORDS} | "Tom", "Dick" or "Harry" would be propagated to the Flow/session variable user_name |
In Language Object syntaxes, values are propagated to NLU variables of that Language Object instead of Flow/session variables.
For instance, given the Language Object named CITIES.LIST, with the NLU variable city, we could write the following TLML syntax and script to propagate the city name:
tlml
%HAMBURG.NN.LEX^{city="Hamburg"} /
%LOS_ANGELES.NN.LEX^{city="Los Angeles"} /
%ROME.NN.LEX^{city="Rome"}
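The two levels of propagation described here can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions, not Engine code: `cities_list` and `top_level` are hypothetical names, plain dictionaries stand in for the NLU and Flow variable spaces, and the matched alternative is passed in directly instead of being found by syntax matching.

```python
def cities_list(matched_alternative):
    """Model the CITIES.LIST syntax: each alternative sets the NLU variable city."""
    nlu_vars = {"city": None}  # private NLU variable space, empty default value
    assignments = {
        "%HAMBURG.NN.LEX": "Hamburg",
        "%LOS_ANGELES.NN.LEX": "Los Angeles",
        "%ROME.NN.LEX": "Rome",
    }
    # Only the propagation script of the matched alternative is executed.
    nlu_vars["city"] = assignments[matched_alternative]
    return nlu_vars


def top_level(matched_alternative, flow_vars):
    """Model a top-level script that copies lob.city into destination."""
    lob = cities_list(matched_alternative)  # stands in for the read-only lob map
    flow_vars["destination"] = lob["city"]
    return flow_vars


print(top_level("%ROME.NN.LEX", {}))
```

The value set inside the Language Object only reaches the Flow variable because the top-level script reads it from `lob`, which mirrors the explicit propagation requirement.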
Access rules for propagation scripts
The access rules for propagation scripts are described below.
Flow and session variables
- Propagation scripts which are part of the syntax of a Language Object/Entity do not have access to Flow or session variables. In order to assign the value of an NLU variable in a Language Object/Entity to a Flow or session variable, the value needs to be propagated up from the Language Object/Entity via top-level syntax.
- Propagation scripts which are part of a top-level syntax have read and write access to the Flow variables which are in scope. All top-level syntaxes, except for those of Global listeners, belong to a certain Flow and have access to that Flow's variables.
- Propagation scripts which are part of a top-level syntax have read and write access to session variables.
Language Object variables
- Propagation scripts don't have access to Language Object variables.
NLU variables
- Propagation scripts attached directly to a Language Object/Entity reference in any TLML syntax have access to a special read-only variable called lob. This lob variable is a read-only map containing the NLU variable values for that Language Object/Entity instance.
- Propagation scripts which are part of a Language Object/Entity syntax have read and write access to the NLU variables of that Language Object/Entity.
Input annotation variables
- Propagation scripts attached directly to an input annotation reference in TLML syntax have access to the input annotation variables via the lob variable, just like they would access NLU variables if attached to a Language Object/Entity.
- The special input annotation variable value _USED_WORDS is mapped to a string containing the original form of the sentence words specified by the word indices attached to the input annotation, concatenated with a single space character.
Used words
- All propagation scripts have access to the special variable _USED_WORDS, which holds a string representation of the original form of the words that make up the match, separated by whitespace.
- The variable _USED_WORDS contains the used words of the sub-syntax (e.g. bracketed syntax part, LOB syntax, Entity syntax or annotation syntax) to which the script is attached, not the used words of the entire syntax. To get the used words of the entire match, _USED_WORDS can be used as an NLU variable default value (if it is a Language Object syntax) or the entire syntax needs to be embedded into brackets.
- When using the used-words-related engine scripting API methods from propagation scripts, they return the used words that make up the match of the syntax the propagation script is attached to.
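As a rough illustration of how a used-words string is assembled from word indices, consider the Python sketch below. The function name `used_words` is hypothetical, and both the zero-based indices and the sentence-order sorting are assumptions made for the example, not documented Engine behavior.

```python
def used_words(sentence_words, word_indices):
    """Join the original word forms at `word_indices` with single spaces."""
    return " ".join(sentence_words[i] for i in sorted(word_indices))


words = ["I", "want", "to", "go", "to", "New", "York"]
print(used_words(words, [5, 6]))
```

Applied to a bracketed part such as `(New>>York)`, only the two words of that part are returned, not the words matched by the rest of the syntax.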
Static methods
- Just like any other script, all propagation scripts have access to globally defined classes (e.g. defined in the "solution loaded" script).
Engine scripting API
- All propagation scripts have read-only access to the engine scripting API.
Predicate scripts
Syntax
tlml
predicate_script = ':{' predicate_script_text '}'
Description
Predicate scripts enrich the TLML syntax language with yet another fine-tuning level, making it possible to attach NLU/annotation variable value dependent constraints to syntaxes, for instance, only match on an annotation if its variable value lies within a given range or only match on an Entity if a given NLU variable contains a specific value.
Predicate scripts are syntactically attached to the syntax with the colon symbol and embedded inside braces, like this:
tlml
%CITY.ENTITY:{lob.sCity=="New York"}
Just like propagation scripts, predicate scripts can be attached to bracketed syntax parts, Language Object references, Entity references or input annotation references.
Script context
A predicate script runs with its own local variable space, which does not have access to session or Flow variables. Furthermore, a predicate script cannot propagate values, although propagation still happens inside the LOB/Entity to which the predicate script is attached. A propagation script can, however, be attached together with the predicate script, but it must follow the predicate script, like this:
tlml
%$NUMBER:{lob.numericValue>100}^{number=lob.numericValue}
The predicate script is executed first. If it evaluates to false, the match is disregarded and the propagation script is not executed. Because the predicate script is executed first, it does not have access to the NLU variables of the current condition context.
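The ordering of the two attached scripts can be sketched in Python as follows. This is an illustrative model only: `run_attached_scripts`, `is_large` and `copy_number` are hypothetical names, a dictionary stands in for the `lob` map, and Python truthiness is used as a stand-in for the way the Engine evaluates the predicate result.

```python
def run_attached_scripts(lob, predicate, propagation, target_vars):
    """Run the predicate first; run the propagation script only if it passes."""
    if not predicate(lob):
        return False  # the match is disregarded and nothing is propagated
    propagation(lob, target_vars)
    return True


def is_large(lob):
    # Stands in for the predicate script :{lob.numericValue>100}
    return lob["numericValue"] > 100


def copy_number(lob, target_vars):
    # Stands in for the propagation script ^{number=lob.numericValue}
    target_vars["number"] = lob["numericValue"]


flow_vars = {}
print(run_attached_scripts({"numericValue": 131}, is_large, copy_number, flow_vars))
print(flow_vars)
print(run_attached_scripts({"numericValue": 99}, is_large, copy_number, {}))
```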
Examples
TLML syntax | Matched User Input | Unmatched User Input |
---|---|---|
(*):{_USED_WORDS!="Hello"} | Hi, Welcome, Thanks | Hello |
%ANIMALS.LIST:{lob.sAnimalType=="Pet"} | Cat, Dog, Kitten | Tiger, Bull, Wasp |
%CITY.ENTITY:{lob.sCity=="New York"} | New York | Los Angeles, Paris, Barcelona |
%$NUMBER:{lob.numericValue>100} | 131, 150, 1300 | 99, 4, 0.4 |
Access rules for predicate scripts
Flow and session variables
- Predicate scripts don't have access to Flow or session variables.
Language Object variables
- Predicate scripts don't have access to Language Object variables.
NLU variables
- Predicate scripts attached directly to a Language Object/Entity reference in any syntax have access to a special read-only variable called lob. This lob variable is a read-only map containing the NLU variable values for that Language Object/Entity instance.
Input Annotation variables
- Predicate scripts attached directly to an input annotation reference in a syntax have access to the input annotation variables via the lob variable, just like they would access NLU variables if attached to a Language Object/Entity.
Used words
- All predicate scripts have read access to the special variable _USED_WORDS, much like propagation scripts.
Attached script execution
Evaluation
The variable propagation occurs while the TLML syntax is being evaluated if required for the evaluation of a predicate script. Otherwise it occurs after the syntax is fully matched.
Variables of the script context
For the top-level syntax as well as for the syntax of each Language Object/Entity, a separate script context is generated that provides a certain set of variables.
- In the top-level context, session and Flow variables are visible to propagation scripts. Predicate scripts do not have access to session or Flow variables.
- In a Language Object/Entity context, the NLU variables defined for the Language Object/Entity are visible to propagation scripts, but session and Flow variables are not. Predicate scripts do not have access to the variables in this context.
- In both types of context, the special variable _USED_WORDS is available and provides the used words for a Language Object reference, an Entity reference, an input annotation reference, or a bracketed expression.
- Also, in both contexts, the special variable lob is available, but only for the propagation/predicate scripts attached directly to Language Object/Entity references or input annotation references. This variable refers to the context of the Language Object, Entity or input annotation to which the script is attached, and it contains the NLU variables generated in that context (note: a reference to an unknown NLU variable evaluates to null). Using the lob variable, the propagation script forwards NLU variables and used words from the Language Object, Entity or input annotation context to the context of the syntax that contains the Language Object/Entity/annotation reference. The predicate script doesn't forward the NLU variables but makes them available for evaluation within the script.
Attached script execution
Only the parts of the syntax which had a match in the sentence are used for initialization of NLU variables and execution of attached scripts. This recursively applies to all Language Object/Entity syntaxes that were referred from the matching syntax parts.
A script context is shared for all attached scripts within the same top-level, Language Object or Entity syntax instance. A predicate script will be executed first if present. If it is evaluated to false/null, the match is disregarded, and the propagation script won't be executed. Propagation scripts are executed in an order that ensures that default values of the NLU variables of a referenced Language Object/Entity are evaluated first, followed by execution of all propagation scripts in the Language Object/Entity syntax. This applies recursively, thus default values and propagation scripts of the top-level syntax are executed in order to forward NLU variables into session and Flow variables. Note: the top-level syntax does not have default values for NLU variables.
The default values of the NLU variables of a Language Object/Entity are evaluated first, but in an unspecified order. Next, the propagation scripts of the Language Object/Entity syntax are executed, mostly in an unspecified order. It is, however, guaranteed that all propagation scripts within a bracketed expression are executed so that they are not interleaved with the execution of propagation scripts within other bracketed expressions or in the syntax itself. In general, users should not implement scripts that rely on the execution order within the same syntax.
If a Language Object/Entity reference has neither a propagation script nor a predicate script, then propagation processing for this Language Object/Entity is skipped. Thus, neither the default values of this Language Object's/Entity's NLU variables nor any attached scripts that may exist in the Language Object/Entity syntax are evaluated. This applies recursively to any Language Object/Entity reference in the Language Object/Entity syntax.
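The execution order and the skipping rule can be summarized with a small recursive model in Python. Everything in it is a hypothetical simplification: the classes `Reference` and `LanguageObject`, the functions `evaluate` and `process_reference`, and the use of callables in place of scripts are assumptions for illustration. The relative order of scripts within one syntax is chosen arbitrarily here, since it is mostly unspecified.

```python
from dataclasses import dataclass, field


@dataclass
class Reference:
    """A matched Language Object/Entity reference inside a syntax."""
    target: "LanguageObject"
    propagation: object = None  # callable(lob, context) or None
    predicate: object = None    # callable(lob) or None


@dataclass
class LanguageObject:
    defaults: dict = field(default_factory=dict)    # name -> callable()
    scripts: list = field(default_factory=list)     # callables(context)
    references: list = field(default_factory=list)  # matched references


def evaluate(lob_def):
    """Return the NLU variables of one matched Language Object instance."""
    # Default values are evaluated before any propagation script runs.
    context = {name: default() for name, default in lob_def.defaults.items()}
    for ref in lob_def.references:
        process_reference(ref, context)
    for script in lob_def.scripts:
        script(context)
    return context


def process_reference(ref, context):
    """Handle one reference; return False if its match is disregarded."""
    if ref.propagation is None and ref.predicate is None:
        return True  # no attached script: the referenced object is skipped
    lob = evaluate(ref.target)
    if ref.predicate is not None and not ref.predicate(lob):
        return False
    if ref.propagation is not None:
        ref.propagation(lob, context)
    return True


city = LanguageObject(defaults={"city": lambda: None},
                      scripts=[lambda ctx: ctx.update(city="Rome")])
flow_vars = {}
process_reference(
    Reference(city, propagation=lambda lob, ctx: ctx.update(destination=lob["city"])),
    flow_vars,
)
print(flow_vars)
```

A `Reference(city)` without any attached script returns immediately, so neither the defaults nor the scripts of `city` are evaluated.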
Comments
Syntax
tlml
'==' (any_text - '==' )* '=='
A comment is any text between a pair of comment delimiter symbols: two equals (=) characters; note that the text cannot contain the comment delimiter itself. Comments are ignored during parsing of the TLML syntax and behave like whitespace, thus they may be placed anywhere between syntax operators and their operands, but not within an atomic syntax or between a syntax and its syntax options or attached scripts.
General description
Comments make it possible to place any text, such as editing hints, within TLML syntax. Comments are ignored by the Teneo Engine and have no effect on the syntax behavior.
Example
tlml
'(i&like&my&cat == to be revised ==) / (my+dog+is+english+sheepdog)'
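The comment rule can be approximated with a regular expression, as in the Python sketch below. It is a deliberately naive illustration and not the Engine's parser: the names `COMMENT` and `strip_comments` are hypothetical, and the sketch does not distinguish comment delimiters from `==` occurring inside attached scripts or quoted text, which a real parser would have to do.

```python
import re

# '==' ... '==' where the enclosed text does not itself contain '=='.
COMMENT = re.compile(r"==(?:(?!==).)*==", re.DOTALL)


def strip_comments(syntax):
    """Replace every comment with a single space (comments act like whitespace)."""
    return COMMENT.sub(" ", syntax)


print(strip_comments("(i&like&my&cat == to be revised ==) / (my+dog+is+english+sheepdog)"))
```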
Reserved characters
The following characters are reserved in Teneo Linguistic Modeling Language:
+ | / | ~ | ( | : | § | \| | " | { | ^ |
& | ! | \ | ) | # | % | @ | > | } | = |
If reserved characters need to be used within the syntax as part of the sentence, the syntax text containing them must be enclosed in quotation marks (") so that they are recognized as text.
Example
An & within the syntax text that should not be recognized as an operator:
tlml
Roberts+"&"+Son
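When syntax text is generated from data, the quoting rule can be applied automatically. The Python sketch below shows one possible helper; `RESERVED` is transcribed from the table of reserved characters (with the section sign § as one of the entries), `quote_if_reserved` is a hypothetical name, and words that contain a quotation mark themselves are rejected because their handling is not covered in this section.

```python
RESERVED = set('+/~(:§|"{^&!\\)#%@>}=')


def quote_if_reserved(word):
    """Wrap `word` in quotation marks if it contains a reserved character."""
    if '"' in word:
        # How to write a literal quotation mark is not covered here.
        raise ValueError("literal quotation marks are not handled by this sketch")
    if any(ch in RESERVED for ch in word):
        return f'"{word}"'
    return word


print("+".join(quote_if_reserved(w) for w in ["Roberts", "&", "Son"]))
```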
Glossary - Concept reference
Concept name | Concept description |
---|---|
Syntax (condition) | Syntaxes (conditions) describe the word patterns the system needs to recognize in sentences in order to produce a match, which results in the system triggering a Flow, selecting a transition, etc. |
Used words | The words in the sentence that matched the syntax |
Sentence range | The part of the sentence that will be tested against the syntax |
Range start | Pointer to the first word of the sentence range |
Operator | It defines the relationship between syntaxes |
Longest match data | Data used by Engine if Longest Match option is applied, for selecting the MI with the maximum amount of used words |
Flow | A simple Flow consists of a TLML syntax that matches the sentence and the answer to give to that input. It contains at least one transition and one node in sequence. A complex Flow consists of more than one transition. |
Transition | A transition is the representation of the move from one node to another and may contain Matches and/or After Matches which determine whether or not it is possible to move along that edge in the Flow |
Listener | Listeners are a special construct that allows Teneo to listen to input in a transition, Flow or elsewhere and execute a script when a certain TLML syntax is fulfilled |
Global listener | Global listeners are always active and are defined for use within one or more Flows |
Flow listener | Flow listeners are used in Flows |
Local listener | Local listeners are used in specific transitions |
Variable | Variables are normal Groovy variables that store data or information that can be entered or computed, and that can be referenced from scripts |
Session variable | Session variables are accessible from all Flows |
Flow variables | Flow variables are defined as belonging to one Flow and can only be accessed when the Flow is active |
Standard annotations | With all language configurations, standard annotations are set by the input processors of the Teneo Engine; see the Language Capabilities section for more information |
NLU | Natural Language Understanding |
LOB | Language Object; a Language Object is a named TLML syntax that usually represents a semantic meaning of a word or combination of words in a sentence |
Entity | An Entity is a named TLML syntax that contains a collection of entries, small or large, used to identify relevant data and optionally extract it via variables |
AST | Abstract Syntax Tree |
MI | Match Instance |