This article describes the detailed syntax for text rules. Note that text rules should only be used by technical staff who understand regular expression syntax.


Introduction

Items in a Codeit codeframe can have text matching rules attached to them.  For example, you might specify a rule that any verbatims containing the text "Coke" or "Pepsi" should be automatically mapped to  "Code (2) - Colas".  

You can allow the AI system to use these rules in the autocoding process.  So, in the example above, the verbatim "I like Coke" would be automatically coded as code 2.  Text matching rules are also useful for filtering items. So, in the example above, filtering on the text match rules for "Code (2)" would display any items containing the text "Coke" or "Pepsi".


Detailed Syntax Guide

As well as plain text strings for matching complete words, Codeit text match rules can use Regular Expression syntax. Regular Expressions are a common technique for defining text matching patterns. Although Regular Expressions are very powerful and can get quite elaborate, you can achieve a lot with very simple expressions.


Taking the example above, "Coke" or "Pepsi" can be defined by the regular expression: "Coke|Pepsi" - where the "|" symbol represents OR. In this example, we are searching for the string of characters "coke" OR "pepsi" anywhere in the verbatim text.
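As an illustration only, the "Coke|Pepsi" rule can be tried out with Python's standard "re" module. This is a minimal sketch, not Codeit's implementation: the function name matches_rule is hypothetical, and case-insensitive matching is an assumption made here because the example above treats "Coke" and "coke" as the same text.

```python
import re

# Assumption: matching ignores case, as the "Coke" / "coke" example suggests.
COLA_RULE = re.compile(r"Coke|Pepsi", re.IGNORECASE)


def matches_rule(verbatim: str) -> bool:
    """Return True if "coke" OR "pepsi" occurs anywhere in the verbatim."""
    return COLA_RULE.search(verbatim) is not None


print(matches_rule("I like Coke"))            # True
print(matches_rule("pepsi is my favourite"))  # True
print(matches_rule("I drink water"))          # False
```

Because the pattern has no boundaries, it also matches inside longer words such as "cokes".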


Warning: only use this syntax if you are a technical user, as an incomplete understanding of it could result in incorrect coding.


Syntax / Meaning (each expression below is followed by its explanation)
\bcoke|\bpepsi
To match only words that begin with "coke" or "pepsi", use the "\b" syntax, which stands for a "boundary": the position between a word character (a letter, digit or underscore) and anything else, such as a space, punctuation or the start of the verbatim. So, the word must begin with the letters "coke" OR "pepsi". This regular expression would match "coke", "cokes", "cokey" OR "pepsi", "pepsilicious". The string must begin with "coke" OR "pepsi", but any number of trailing characters are accepted. Another way to write this regular expression is - \b(coke|pepsi)
\bcoke\b|\bpepsi\b
This is a strict match for exactly the word "coke" OR "pepsi", with no leading or trailing letters, digits or underscores. There must be a boundary at the beginning and end of each string, so "cokes" does not match. Another way to write this regular expression is - \b(coke|pepsi)\b
^none$
The entire verbatim is just "none". The "^" character represents the beginning of the entire verbatim and the "$" character represents the end of the entire verbatim.
good.flavour
To represent "any" character in your string, use the "." character. This expression would match "good" followed by exactly one character of any kind, then "flavour". Examples: "good flavour", "good-flavour", "good#flavour"
\blike.{1,10}\bflavour
This regular expression introduces the "near" concept. We know that the "." character matches any single character, so by adding a range, "{1,10}", we are now allowing between 1 and 10 "any characters" to occur between our matches for "like" and "flavour". This expression would match: "I like the flavour", "I like the rich flavour". In each case, "like" is followed by "flavour" with no more than 10 characters (including spaces) in between.
\b(like|love|prefer).{1,20}\b(flavour|taste|smell)
Complex expressions can make use of parentheses. So, we are searching for the strings "like", "love" or "prefer" within 20 characters of "flavour", "taste" or "smell".
\bflavo[ur]
To accept any one of a set of characters at a position in your string, use the "[" and "]" symbols to enclose the set. In this example, "[ur]" matches a single "u" or "r", so we will match "flavou" and "flavor". We are assuming that if we match "flavou", it is likely that the word in the verbatim is actually flavour, flavours, flavouring, etc.
\brec[ei][ei]ve
To accommodate common misspellings, you might allow for the commonly misused characters in your expression. This example matches "receive", "recieve", "receeve" and "reciive", which allows for transposition of the "e" and "i" characters.
[0-9]
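Matches any single digit, 0 through 9. If the data contains a numeric value for each respondent's age, you may wish to define age ranges such as "18-24" using 1[8-9]|2[0-4], or "35-44" using 3[5-9]|4[0-4]. To stop these from matching digits inside a longer number, wrap the expression in boundaries, for example \b(1[8-9]|2[0-4])\b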
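The expressions in the table above can be checked against sample verbatims before they are used in a rule. The following is a rough sketch using Python's standard "re" module, which may differ in detail from the engine Codeit uses. The helper name rule_matches, the sample verbatims and the case-insensitive flag are all assumptions made for illustration, and the age-range pattern is wrapped in "\b(...)\b" here so that it only matches a whole number.

```python
import re


def rule_matches(pattern: str, verbatim: str) -> bool:
    """Return True if the pattern is found anywhere in the verbatim.

    Assumption: matching ignores case.
    """
    return re.search(pattern, verbatim, re.IGNORECASE) is not None


# (pattern, sample verbatim, expected result)
EXAMPLES = [
    (r"\bcoke|\bpepsi", "two cokes please", True),        # word begins with "coke"
    (r"\bcoke\b|\bpepsi\b", "two cokes please", False),   # strict whole-word match
    (r"^none$", "none", True),                            # entire verbatim
    (r"^none$", "none really", False),
    (r"good.flavour", "good-flavour", True),              # any single character
    (r"\blike.{1,10}\bflavour", "I like the rich flavour", True),
    (r"\blike.{1,10}\bflavour", "I like the very rich flavour", False),  # too far apart
    (r"\bflavo[ur]", "great flavor", True),               # "u" or "r"
    (r"\brec[ei][ei]ve", "did not recieve it", True),     # common misspelling
    (r"\b(1[8-9]|2[0-4])\b", "24", True),                 # age range 18-24
    (r"\b(1[8-9]|2[0-4])\b", "25", False),
]

# Collect any example whose actual result differs from the expected one.
mismatches = [
    (pattern, verbatim)
    for pattern, verbatim, expected in EXAMPLES
    if rule_matches(pattern, verbatim) != expected
]

for pattern, verbatim, expected in EXAMPLES:
    print(pattern, "|", verbatim, "->", rule_matches(pattern, verbatim))
```

Keeping both matching and non-matching samples for each expression makes it easier to spot a rule that is too broad.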

Text Match Rule Exceptions

Each Text Match Rule may also include an "Exception Rule". This rule specifies a negative match string where any match of the exception rule will be excluded from the results. Exception rules use the same syntax as the positive match rules.
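The interaction between a match rule and its exception rule can be sketched as follows. This is an illustration under stated assumptions rather than a description of how Codeit evaluates rules internally: the function name rule_applies and the "diet" exception are hypothetical, Python's "re" module stands in for the real engine, and matching is assumed to ignore case.

```python
import re
from typing import Optional


def rule_applies(verbatim: str, match_rule: str,
                 exception_rule: Optional[str] = None) -> bool:
    """Return True if the verbatim matches the rule and not its exception."""
    flags = re.IGNORECASE  # assumption: matching ignores case
    if re.search(match_rule, verbatim, flags) is None:
        return False  # the positive rule did not match
    if exception_rule and re.search(exception_rule, verbatim, flags):
        return False  # the exception rule matched, so exclude the verbatim
    return True


COLA = r"\bcoke\b|\bpepsi\b"
NOT_DIET = r"\bdiet\b"  # hypothetical exception rule

print(rule_applies("I like Coke", COLA, NOT_DIET))       # True
print(rule_applies("I like Diet Coke", COLA, NOT_DIET))  # False
print(rule_applies("I like water", COLA, NOT_DIET))      # False
```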


Find out more about regular expressions here.