NCModel (NLPCraft 0.7.2)

All Known Implementing Classes:

NCModelAdapter, NCModelFileAdapter
```
public interface NCModel
```
Main interface for user-defined data model.
A model generally defines:
- How to interpret the input from the user.
- How to map it to and query a particular data source.
- How to format the data source result back to the user.
At a minimum the following methods need to be implemented on this interface:
- getId()
- getName()
- getVersion()
- query(NCQueryContext) - the main method that user implements to provide result.
All other methods have reasonable defaults. In most cases, however, method getElements() should provide at least one user-defined element.
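As a rough illustration of the minimal contract listed above, the sketch below uses hypothetical stand-in types (`SketchModel`, `SketchRejection`, `EchoModel`) instead of the real NLPCraft classes, so that it compiles without the library. Only the method names getId, getName, getVersion and query come from this page; the plain-string query signature is an assumption made for brevity.

```java
// Hypothetical stand-in for a rejection exception; NOT the real NCRejection class.
class SketchRejection extends RuntimeException {
    SketchRejection(String msg) { super(msg); }
}

// Hypothetical stand-in mirroring the four methods listed as the minimum.
// The real query method takes an NCQueryContext and returns an NCQueryResult;
// plain strings are used here to keep the sketch self-contained.
interface SketchModel {
    String getId();
    String getName();
    String getVersion();
    String query(String userInput);

    // Optional getter with a default, analogous to the defaulted methods.
    default String getDescription() { return ""; }
}

class EchoModel implements SketchModel {
    @Override public String getId() { return "sketch.echo.model"; }
    @Override public String getName() { return "Echo Model"; }
    @Override public String getVersion() { return "1.0.0"; }

    @Override public String query(String userInput) {
        // Reject input that cannot be processed instead of returning null.
        if (userInput == null || userInput.trim().isEmpty())
            throw new SketchRejection("Empty input.");
        return "You said: " + userInput.trim();
    }
}
```

In a real project the same shape would normally be obtained by extending one of the supplied adapters rather than implementing the interface from scratch.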
Lifecycle
There are two lifecycle callbacks that you can optionally override:
- initialize(NCProbeContext) - called only once during the model deployment to initialize the model.
- discard() - called only once to discard the model during the orderly shutdown of the probe. This method may not be called if the probe process was killed.
While users can implement this interface directly, in most cases it is highly recommended to use one of the supplied adapters: NCModelAdapter or NCModelFileAdapter.
External JSON/YAML Configuration
Most of the model static configuration can be (and usually is) declared via external JSON or YAML files (see NCModelFileAdapter for loading JSON or YAML models). All JSON properties correspond to their counterparts in this interface. For example:
```
 {
      "id": "user.defined.id",
      "name": "User Defined Name",
      "version": "1.0",
      "description": "Short model description.",
      "enabledTokens": ["google:person", "google:location"],
      "examples": [],
      "macros": [],
      "metadata": {
          "myConfig": "myProperty"
      },
      "elements": [
          {
              "id": "x:id", 
              "group": "default",
              "parentId": null,
              "excludedSynonyms": [],
              "synonyms": [],
              "relations": {},
              "metadata": {},
              "values": []
          }
      ],
      "additionalStopwords": [],
      "excludedStopwords": [],
      "suspiciousWords": []
 }
 
```
Note that many examples shipped with NLPCraft use external JSON or YAML model configuration.
See Also:

NCModelAdapter, NCModelFileAdapter

Field Summary

Fields
Modifier and Type	Field and Description
`static Set<String>`	`DFLT_ENABLED_TOKENS` Default set of enabled built-in tokens.
`static boolean`	`DFLT_IS_DUP_SYNONYMS_ALLOWED` Default value returned from `isDupSynonymsAllowed()` method.
`static boolean`	`DFLT_IS_NO_NOUNS_ALLOWED` Default value returned from `isNoNounsAllowed()` method.
`static boolean`	`DFLT_IS_NO_USER_TOKENS_ALLOWED` Default value returned from `isNoUserTokensAllowed()` method.
`static boolean`	`DFLT_IS_NON_ENGLISH_ALLOWED` Default value returned from `isNonEnglishAllowed()` method.
`static boolean`	`DFLT_IS_NOT_LATIN_CHARSET_ALLOWED` Default value returned from `isNotLatinCharsetAllowed()` method.
`static boolean`	`DFLT_IS_PERMUTATE_SYNONYMS` Default value returned from `isPermutateSynonyms()` method.
`static boolean`	`DFLT_IS_SWEAR_WORDS_ALLOWED` Default value returned from `isSwearWordsAllowed()` method.
`static int`	`DFLT_JIGGLE_FACTOR` Default value returned from `getJiggleFactor()` method.
`static int`	`DFLT_MAX_FREE_WORDS` Default value returned from `getMaxFreeWords()` method.
`static int`	`DFLT_MAX_SUSPICIOUS_WORDS` Default value returned from `getMaxSuspiciousWords()` method.
`static int`	`DFLT_MAX_TOKENS` Default value returned from `getMaxTokens()` method.
`static int`	`DFLT_MAX_TOTAL_SYNONYMS` Default value returned from `getMaxTotalSynonyms()` method.
`static int`	`DFLT_MAX_UNKNOWN_WORDS` Default value returned from `getMaxUnknownWords()` method.
`static int`	`DFLT_MAX_WORDS` Default value returned from `getMaxWords()` method.
`static NCMetadata`	`DFLT_METADATA` Default value returned from `getMetadata()` method.
`static int`	`DFLT_MIN_NON_STOPWORDS` Default value returned from `getMinNonStopwords()` method.
`static int`	`DFLT_MIN_TOKENS` Default value returned from `getMinTokens()` method.
`static int`	`DFLT_MIN_WORDS` Default value returned from `getMinWords()` method.
`static Function<NCQueryContext,NCQueryResult>`	`DFLT_QRY_FUNCTION` Default query method implementation that throws an exception.

Method Summary

Modifier and Type	Method and Description
`default void`	`discard()` A callback before this model instance gets discarded.
`default Set<String>`	`getAdditionalStopWords()` Gets an optional list of stopwords to add to the built-in ones.
`default String`	`getDescription()` Gets optional short model description.
`default Set<NCElement>`	`getElements()` Gets a set of model elements.
`default Set<String>`	`getEnabledTokens()` Gets set of IDs for built-in tokens that should be enabled and detected for this model.
`default Set<String>`	`getExamples()` Gets an optional list of example sentences demonstrating what can be asked with this model.
`default Set<String>`	`getExcludedStopWords()` Gets an optional list of stopwords to exclude from the built-in list of stopwords.
`String`	`getId()` Gets unique, immutable ID of this model.
`default int`	`getJiggleFactor()` Measure of how much sparsity is allowed when user input words are reordered in attempt to match the multi-word synonyms.
`default Map<String,String>`	`getMacros()` Gets an optional map of macros to be used in this model.
`default int`	`getMaxFreeWords()` Gets maximum number of free words until automatic rejection.
`default int`	`getMaxSuspiciousWords()` Gets maximum number of suspicious words until automatic rejection.
`default int`	`getMaxTokens()` Gets maximum number of all tokens (system and user defined) above which user input will be automatically rejected as too long.
`default int`	`getMaxTotalSynonyms()` Total number of synonyms allowed per model.
`default int`	`getMaxUnknownWords()` Gets maximum number of unknown words until automatic rejection.
`default int`	`getMaxWords()` Gets maximum word count (including stopwords) above which user input will be automatically rejected as too long.
`default NCMetadata`	`getMetadata()` Gets optional user-specific model metadata that can be set by the developer and accessed later.
`default int`	`getMinNonStopwords()` Gets minimum word count (excluding stopwords) below which user input will be automatically rejected as ambiguous sentence.
`default int`	`getMinTokens()` Gets minimum number of all tokens (system and user defined) below which user input will be automatically rejected as too short.
`default int`	`getMinWords()` Gets minimum word count (including stopwords) below which user input will be automatically rejected as too short.
`String`	`getName()` Gets descriptive name of this model.
`default NCCustomParser`	`getParser()` Gets optional custom user parser for model elements.
`default Set<String>`	`getSuspiciousWords()` Gets an optional list of suspicious words.
`String`	`getVersion()` Gets the version of this model using semantic versioning.
`default void`	`initialize(NCProbeContext probeCtx)` Probe calls this method to initialize the model when it gets deployed in the probe.
`default boolean`	`isDupSynonymsAllowed()` Whether or not duplicate synonyms are allowed.
`default boolean`	`isNonEnglishAllowed()` Whether or not to allow non-English language in user input.
`default boolean`	`isNoNounsAllowed()` Whether or not to allow user input without a single noun.
`default boolean`	`isNotLatinCharsetAllowed()` Whether or not to allow non-Latin charset in user input.
`default boolean`	`isNoUserTokensAllowed()` Whether or not to allow the user input with no user token detected.
`default boolean`	`isPermutateSynonyms()` Whether or not to permutate multi-word synonyms.
`default boolean`	`isSwearWordsAllowed()` Whether or not to allow known English swear words in user input.
`default NCQueryResult`	`query(NCQueryContext ctx)` Processes user input provided in the given query context and either returns the query result or throws an exception.

- Field Detail
  - DFLT_JIGGLE_FACTOR
```
static final int DFLT_JIGGLE_FACTOR
```
    Default value returned from getJiggleFactor() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_METADATA
```
static final NCMetadata DFLT_METADATA
```
    Default value returned from getMetadata() method.
  - DFLT_MAX_UNKNOWN_WORDS
```
static final int DFLT_MAX_UNKNOWN_WORDS
```
    Default value returned from getMaxUnknownWords() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_MAX_FREE_WORDS
```
static final int DFLT_MAX_FREE_WORDS
```
    Default value returned from getMaxFreeWords() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_MAX_SUSPICIOUS_WORDS
```
static final int DFLT_MAX_SUSPICIOUS_WORDS
```
    Default value returned from getMaxSuspiciousWords() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_MIN_WORDS
```
static final int DFLT_MIN_WORDS
```
    Default value returned from getMinWords() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_MAX_WORDS
```
static final int DFLT_MAX_WORDS
```
    Default value returned from getMaxWords() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_MIN_TOKENS
```
static final int DFLT_MIN_TOKENS
```
    Default value returned from getMinTokens() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_MAX_TOKENS
```
static final int DFLT_MAX_TOKENS
```
    Default value returned from getMaxTokens() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_MIN_NON_STOPWORDS
```
static final int DFLT_MIN_NON_STOPWORDS
```
    Default value returned from getMinNonStopwords() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_IS_NON_ENGLISH_ALLOWED
```
static final boolean DFLT_IS_NON_ENGLISH_ALLOWED
```
    Default value returned from isNonEnglishAllowed() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_IS_NOT_LATIN_CHARSET_ALLOWED
```
static final boolean DFLT_IS_NOT_LATIN_CHARSET_ALLOWED
```
    Default value returned from isNotLatinCharsetAllowed() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_IS_SWEAR_WORDS_ALLOWED
```
static final boolean DFLT_IS_SWEAR_WORDS_ALLOWED
```
    Default value returned from isSwearWordsAllowed() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_IS_NO_NOUNS_ALLOWED
```
static final boolean DFLT_IS_NO_NOUNS_ALLOWED
```
    Default value returned from isNoNounsAllowed() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_IS_PERMUTATE_SYNONYMS
```
static final boolean DFLT_IS_PERMUTATE_SYNONYMS
```
    Default value returned from isPermutateSynonyms() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_IS_DUP_SYNONYMS_ALLOWED
```
static final boolean DFLT_IS_DUP_SYNONYMS_ALLOWED
```
    Default value returned from isDupSynonymsAllowed() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_MAX_TOTAL_SYNONYMS
```
static final int DFLT_MAX_TOTAL_SYNONYMS
```
    Default value returned from getMaxTotalSynonyms() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_IS_NO_USER_TOKENS_ALLOWED
```
static final boolean DFLT_IS_NO_USER_TOKENS_ALLOWED
```
    Default value returned from isNoUserTokensAllowed() method.
    
    See Also:
    
    Constant Field Values
  - DFLT_QRY_FUNCTION
```
static final Function<NCQueryContext,NCQueryResult> DFLT_QRY_FUNCTION
```
    Default query method implementation that throws an exception.
  - DFLT_ENABLED_TOKENS
```
static final Set<String> DFLT_ENABLED_TOKENS
```
    Default set of enabled built-in tokens. The following built-in tokens are enabled by default:
    - nlpcraft:date
    - nlpcraft:geo
    - nlpcraft:num
    - nlpcraft:coordinate
    - nlpcraft:function
- Method Detail
  - getId
```
String getId()
```
    Gets unique, immutable ID of this model.
    Note that model IDs are immutable while name and version can be changed freely. Changing the model ID is equivalent to creating a completely new model. Model IDs (unlike name and version) are not exposed to the end user and only serve a technical purpose. The ID's maximum length is 32 characters.
    JSON
    If using JSON/YAML model presentation this is set by id property:
```
 {
      "id": "my.model.id"
 }
 
```
    Returns:
    
    Unique, immutable ID of this model.
  - getName
```
String getName()
```
    Gets descriptive name of this model. Name's max length is 64 characters.
    JSON
    If using JSON/YAML model presentation this is set by name property:
```
 {
      "name": "My Model"
 }
 
```
    Returns:
    
    Descriptive name for this model.
  - getVersion
```
String getVersion()
```
    Gets the version of this model using semantic versioning. Version's max length is 16 characters.
    JSON
    If using JSON/YAML model presentation this is set by version property:
```
 {
      "version": "1.0.0"
 }
 
```
    Returns:
    
    A version compatible with the semantic versioning specification (www.semver.org).
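    The length limits stated for getId(), getName() and getVersion() can be checked before deployment. The following is a minimal sketch, not part of NLPCraft: the class `ModelIdentityCheck` and both method names are hypothetical, and the version pattern is a simplified approximation of semantic versioning rather than the full grammar.

```java
import java.util.regex.Pattern;

final class ModelIdentityCheck {
    // Limits stated in the getId(), getName() and getVersion() descriptions.
    static final int MAX_ID_LEN = 32;
    static final int MAX_NAME_LEN = 64;
    static final int MAX_VERSION_LEN = 16;

    // Assumption: simplified MAJOR.MINOR.PATCH with optional pre-release and
    // build suffixes; this is not the complete semver grammar.
    private static final Pattern SEMVER =
        Pattern.compile("\\d+\\.\\d+\\.\\d+(-[0-9A-Za-z.-]+)?(\\+[0-9A-Za-z.-]+)?");

    // True if all three values are non-empty and within the documented limits.
    static boolean withinLimits(String id, String name, String version) {
        return fits(id, MAX_ID_LEN) && fits(name, MAX_NAME_LEN) && fits(version, MAX_VERSION_LEN);
    }

    // True if the version matches the simplified three-part pattern above.
    static boolean looksLikeSemver(String version) {
        return version != null && SEMVER.matcher(version).matches();
    }

    private static boolean fits(String s, int max) {
        return s != null && !s.isEmpty() && s.length() <= max;
    }
}
```

    The two checks are kept separate because a short version string such as "1.0" fits the length limit but does not match the three-part pattern.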
  - getDescription
```
default String getDescription()
```
    Gets optional short model description. This can be displayed by the management tools.
    JSON
    If using JSON/YAML model presentation this is set by description property:
```
 {
      "description": "Model description..."
 }
 
```
    Returns:
    
    Optional short model description.
  - getMaxUnknownWords
```
default int getMaxUnknownWords()
```
    Gets maximum number of unknown words until automatic rejection. An unknown word is a word that is not part of the Princeton WordNet database. If you expect very formalized and well-defined input without uncommon slang and abbreviations, you can set this to a small number like one or two. However, in most cases we recommend leaving it at the default or setting it to a larger number like five or more.
    Default
    If not provided by the model the default value DFLT_MAX_UNKNOWN_WORDS will be used.
    JSON
    If using JSON/YAML model presentation this is set by maxUnknownWords property:
```
 {
      "maxUnknownWords": 2
 }
 
```
    Returns:
    
    Maximum number of unknown words until automatic rejection.
  - getMaxFreeWords
```
default int getMaxFreeWords()
```
    Gets maximum number of free words until automatic rejection. A free word is a known word that is not part of any recognized token. In other words, a word that is present in the user input but won't be used to understand its meaning. Setting it to a non-zero value risks misunderstanding the user input, while setting it to zero often makes the understanding logic too rigid. In most cases we recommend setting it to between one and three. If you expect the user input to contain many noisy idioms, slang or colloquialisms, you can set it to a larger number.
    Default
    If not provided by the model the default value DFLT_MAX_FREE_WORDS will be used.
    JSON
    If using JSON/YAML model presentation this is set by maxFreeWords property:
```
 {
      "maxFreeWords": 2
 }
 
```
    Returns:
    
    Maximum number of free words until automatic rejection.
  - getMaxSuspiciousWords
```
default int getMaxSuspiciousWords()
```
    Gets maximum number of suspicious words until automatic rejection. A suspicious word is a word defined by the model that should not appear in valid user input under any circumstances. A typical example of suspicious words would be the words "sex" or "porn" when processing queries about children's books. In most cases this should be set to zero (default) to automatically reject any such suspicious words in the user input.
    Default
    If not provided by the model the default value DFLT_MAX_SUSPICIOUS_WORDS will be used.
    JSON
    If using JSON/YAML model presentation this is set by maxSuspiciousWords property:
```
 {
      "maxSuspiciousWords": 2
 }
 
```
    Returns:
    
    Maximum number of suspicious words until automatic rejection.
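    To make the threshold concrete, here is a small sketch of how a suspicious-word count could be compared against the configured maximum. It is not NLPCraft's implementation: the class `SuspiciousWordCheck` is hypothetical, tokenization is plain whitespace splitting, and the rule "reject once the count exceeds the maximum" is an assumed reading of the description above.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;

final class SuspiciousWordCheck {
    // Counts input words that appear in the model's suspicious-word set.
    // Assumption: whitespace tokenization and lowercase matching only;
    // no lemmatization or punctuation handling is attempted here.
    static long countSuspicious(String input, Set<String> suspicious) {
        return Arrays.stream(input.toLowerCase(Locale.ROOT).trim().split("\\s+"))
            .filter(suspicious::contains)
            .count();
    }

    // Assumption: input is rejected once the count exceeds the configured maximum,
    // so a maximum of zero rejects any input containing a suspicious word.
    static boolean isRejected(String input, Set<String> suspicious, int maxSuspiciousWords) {
        return countSuspicious(input, suspicious) > maxSuspiciousWords;
    }
}
```

    With the default maximum of zero, a single match is enough for rejection; raising the maximum makes the check more tolerant.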
  - getMinWords
```
default int getMinWords()
```
    Gets minimum word count (including stopwords) below which user input will be automatically rejected as too short. In almost all cases this value should be greater than or equal to one.
    Default
    If not provided by the model the default value DFLT_MIN_WORDS will be used.
    JSON
    If using JSON/YAML model presentation this is set by minWords property:
```
 {
      "minWords": 2
 }
 
```
    Returns:
    
    Minimum word count (including stopwords) below which user input will be automatically rejected as too short.
  - getMaxWords
```
default int getMaxWords()
```
    Gets maximum word count (including stopwords) above which user input will be automatically rejected as too long. In almost all cases this value should be greater than or equal to one.
    Default
    If not provided by the model the default value DFLT_MAX_WORDS will be used.
    JSON
    If using JSON/YAML model presentation this is set by maxWords property:
```
 {
      "maxWords": 50
 }
 
```
    Returns:
    
    Maximum word count (including stopwords) above which user input will be automatically rejected as too long.
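    The minimum and maximum word counts together define an accepted range. The sketch below shows that range check in isolation; `WordCountCheck` is a hypothetical helper, and counting whitespace-separated words is an assumption about how words are delimited.

```java
final class WordCountCheck {
    // Counts whitespace-separated words, stopwords included.
    static int countWords(String input) {
        String trimmed = input.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Rejects input with fewer than minWords or more than maxWords words.
    static boolean isRejected(String input, int minWords, int maxWords) {
        int n = countWords(input);
        return n < minWords || n > maxWords;
    }
}
```

    For example, with minWords set to 2 and maxWords set to 50 (the values used in the JSON snippets above), a one-word input would be rejected as too short.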
  - getMinTokens
```
default int getMinTokens()
```
    Gets minimum number of all tokens (system and user defined) below which user input will be automatically rejected as too short. In almost all cases this value should be greater than or equal to one.
    Default
    If not provided by the model the default value DFLT_MIN_TOKENS will be used.
    JSON
    If using JSON/YAML model presentation this is set by minTokens property:
```
 {
      "minTokens": 1
 }
 
```
    Returns:
    
    Minimum number of all tokens.
  - getMaxTokens
```
default int getMaxTokens()
```
    Gets maximum number of all tokens (system and user defined) above which user input will be automatically rejected as too long. Note that sentences with a large number of tokens can result in significant processing delay and substantial memory consumption.
    Default
    If not provided by the model the default value DFLT_MAX_TOKENS will be used.
    JSON
    If using JSON/YAML model presentation this is set by maxTokens property:
```
 {
      "maxTokens": 100
 }
 
```
    Returns:
    
    Maximum number of all tokens.
  - getMinNonStopwords
```
default int getMinNonStopwords()
```
    Gets minimum word count (excluding stopwords) below which user input will be automatically rejected as an ambiguous sentence.
    Default
    If not provided by the model the default value DFLT_MIN_NON_STOPWORDS will be used.
    JSON
    If using JSON/YAML model presentation this is set by minNonStopwords property:
```
 {
      "minNonStopwords": 2
 }
 
```
    Returns:
    
    Minimum word count (excluding stopwords) below which user input will be automatically rejected as an ambiguous sentence.
  - isNonEnglishAllowed
```
default boolean isNonEnglishAllowed()
```
    Whether or not to allow non-English language in user input. Currently, only English language is supported. However, the model can choose whether or not to automatically reject user input that is detected to be non-English. Note that the current algorithm only works reliably on longer user input (10+ words). On short sentences it will often produce an incorrect result.
    Default
    If not provided by the model the default value DFLT_IS_NON_ENGLISH_ALLOWED will be used.
    JSON
    If using JSON/YAML model presentation this is set by nonEnglishAllowed property:
```
 {
      "nonEnglishAllowed": false
 }
 
```
    Returns:
    
    Whether or not to allow non-English language in user input.
  - isNotLatinCharsetAllowed
```
default boolean isNotLatinCharsetAllowed()
```
    Whether or not to allow non-Latin charset in user input. Currently, only Latin charset is supported. However, model can choose whether or not to automatically reject user input with characters outside of Latin charset. If false such user input will be automatically rejected.
    Default
    If not provided by the model the default value DFLT_IS_NOT_LATIN_CHARSET_ALLOWED will be used.
    JSON
    If using JSON/YAML model presentation this is set by nonLatinCharsetAllowed property:
```
 {
      "nonLatinCharsetAllowed": false
 }
 
```
    Returns:
    
    Whether or not to allow non-Latin charset in user input.
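    One way to picture a Latin-charset check is with the standard `Character.UnicodeScript` API. The sketch below is only an illustration under that assumption; how NLPCraft itself detects non-Latin characters is not described on this page, and `LatinCharsetCheck` is a hypothetical name.

```java
final class LatinCharsetCheck {
    // Returns true if every letter in the input belongs to the Latin Unicode script.
    // Digits, punctuation and whitespace are not letters and are therefore ignored.
    static boolean isLatinOnly(String input) {
        return input.codePoints()
            .filter(Character::isLetter)
            .allMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.LATIN);
    }
}
```

    Under this definition accented Latin letters still count as Latin, while Cyrillic or CJK letters do not.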
  - isSwearWordsAllowed
```
default boolean isSwearWordsAllowed()
```
    Whether or not to allow known English swear words in user input. If false - user input with detected known English swear words will be automatically rejected.
    Default
    If not provided by the model the default value DFLT_IS_SWEAR_WORDS_ALLOWED will be used.
    JSON
    If using JSON/YAML model presentation this is set by swearWordsAllowed property:
```
 {
      "swearWordsAllowed": false
 }
 
```
    Returns:
    
    Whether or not to allow known swear words in user input.
  - isNoNounsAllowed
```
default boolean isNoNounsAllowed()
```
    Whether or not to allow user input without a single noun. If false such user input will be automatically rejected. Typically for command or query-oriented models this should be set to false as any command or query should have at least one noun subject. However, for conversational models this can be set to true to allow for smalltalk and one-liners.
    Default
    If not provided by the model the default value DFLT_IS_NO_NOUNS_ALLOWED will be used.
    JSON
    If using JSON/YAML model presentation this is set by noNounsAllowed property:
```
 {
      "noNounsAllowed": false
 }
 
```
    Returns:
    
    Whether or not to allow user input without a single noun.
  - isPermutateSynonyms
```
default boolean isPermutateSynonyms()
```
    Whether or not to permutate multi-word synonyms. Automatic multi-word synonym permutations greatly increase the total number of synonyms in the system and allow for better multi-word synonym detection. For example, if permutation is allowed the synonym "a b c" will be automatically converted into a sequence of synonyms such as "a b c", "b a c", "a c b".
    Default
    If not provided by the model the default value DFLT_IS_PERMUTATE_SYNONYMS will be used.
    JSON
    If using JSON/YAML model presentation this is set by permutateSynonyms property:
```
 {
      "permutateSynonyms": true
 }
 
```
    Returns:
    
    Whether or not to permutate multi-word synonyms.
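    The growth in synonym count is easy to see with a small sketch that enumerates every ordering of a multi-word synonym. This is an assumption about what "permutation" covers: the description above lists only three orderings for "a b c", so the library may generate fewer than the full set. `SynonymPermutations` is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class SynonymPermutations {
    // Returns every distinct ordering of the words in a multi-word synonym.
    static Set<String> permute(String synonym) {
        List<String> words = new ArrayList<>(Arrays.asList(synonym.trim().split("\\s+")));
        Set<String> out = new LinkedHashSet<>();
        permute(words, 0, out);
        return out;
    }

    // Swap-based recursion: fix position k, then permute the remaining positions.
    private static void permute(List<String> words, int k, Set<String> out) {
        if (k == words.size()) {
            out.add(String.join(" ", words));
            return;
        }
        for (int i = k; i < words.size(); i++) {
            Collections.swap(words, k, i);
            permute(words, k + 1, out);
            Collections.swap(words, k, i); // Restore order before the next choice.
        }
    }
}
```

    A synonym of n distinct words yields n! orderings, which is worth keeping in mind alongside the limit returned by getMaxTotalSynonyms().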
  - isDupSynonymsAllowed
```
default boolean isDupSynonymsAllowed()
```
    Whether or not duplicate synonyms are allowed. If true, the model will pick a random model element when multiple elements are found due to duplicate synonyms. If false, the model will print an error message and will not deploy.
    Default
    If not provided by the model the default value DFLT_IS_DUP_SYNONYMS_ALLOWED will be used.
    JSON
    If using JSON/YAML model presentation this is set by dupSynonymsAllowed property:
```
 {
      "dupSynonymsAllowed": true
 }
 
```
    Returns:
    
    Whether or not to allow duplicate synonyms.
  - getMaxTotalSynonyms
```
default int getMaxTotalSynonyms()
```
    Total number of synonyms allowed per model. Model won't deploy if total number of synonyms exceeds this number.
    Default
    If not provided by the model the default value DFLT_MAX_TOTAL_SYNONYMS will be used.
    JSON
    If using JSON/YAML model presentation this is set by maxTotalSynonyms property:
```
 {
      "maxTotalSynonyms": 10000
 }
 
```
    Returns:
    
    Total number of synonyms allowed per model.
  - isNoUserTokensAllowed
```
default boolean isNoUserTokensAllowed()
```
    Whether or not to allow the user input with no user token detected. If false such user input will be automatically rejected. Note that this property only applies to user-defined tokens (i.e. model elements). Even if there are no user-defined tokens, the user input may still contain system tokens like nlpcraft:geo or nlpcraft:date. In many cases models should be built to allow user input without user tokens. However, set it to false if the presence of at least one user token is mandatory.
    Default
    If not provided by the model the default value DFLT_IS_NO_USER_TOKENS_ALLOWED will be used.
    JSON
    If using JSON/YAML model presentation this is set by noUserTokensAllowed property:
```
 {
      "noUserTokensAllowed": false
 }
 
```
    Returns:
    
    Whether or not to allow the user input with no user token detected.
  - getJiggleFactor
```
default int getJiggleFactor()
```
    Measure of how much sparsity is allowed when user input words are reordered in attempt to match the multi-word synonyms. Zero means no reordering is allowed. One means that only one word in a synonym can move one position left or right, and so on. Empirically the value of 2 proved to be a good default value in most cases. Note that larger values mean that synonym words can be almost in any random place in the user input which makes synonym matching practically meaningless. Maximum value is 4.
    Default
    If not provided by the model the default value DFLT_JIGGLE_FACTOR will be used.
    JSON
    If using JSON/YAML model presentation this is set by jiggleFactor property:
```
 {
      "jiggleFactor": 2
 }
 
```
    Returns:
    
    Word jiggle factor (sparsity measure).
  - getMetadata
```
default NCMetadata getMetadata()
```
    Gets optional user-specific model metadata that can be set by the developer and accessed later.
    
    Returns:
    
    Optional user defined model metadata.
  - getAdditionalStopWords
```
default Set<String> getAdditionalStopWords()
```
    Gets an optional list of stopwords to add to the built-in ones.
    A stopword is an individual word (i.e. a sequence of characters excluding whitespace) that contributes no semantic meaning to the sentence. For example, 'the', 'wow', or 'hm' provide no semantic meaning to the sentence and can be safely excluded from semantic analysis.
    NLPCraft comes with a carefully selected list of English stopwords which should be sufficient for a majority of use cases. However, you can add additional stopwords to this list. The typical use for user-defined stopwords is jargon filler words that are specific to the model's domain.
    JSON
    If using JSON/YAML model presentation this is set by additionalStopwords property:
```
 {
      "additionalStopwords": [
          "stopword1",
          "stopword2"
      ]
 }
 
```
    Returns:
    
    Potentially empty list of additional stopwords.
  - getExcludedStopWords
```
default Set<String> getExcludedStopWords()
```
    Gets an optional list of stopwords to exclude from the built-in list of stopwords.
    Just like you can add additional stopwords via getAdditionalStopWords() you can exclude certain words from the list of stopwords. This can be useful in rare cases when a default built-in stopword has a specific meaning in your model. In order to process such words you need to exclude them from the list of stopwords.
    JSON
    If using JSON/YAML model presentation this is set by excludedStopwords property:
```
 {
      "excludedStopwords": [
          "excludedStopword1",
          "excludedStopword2"
      ]
 }
 
```
    Returns:
    
    Potentially empty list of excluded stopwords.
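    Taken together, getAdditionalStopWords() and getExcludedStopWords() adjust the built-in list. A minimal sketch of the resulting set, assuming simple set arithmetic, follows; `EffectiveStopwords` is hypothetical, and applying the exclusion last (so that it also removes user-added words) is an assumption not confirmed by this page.

```java
import java.util.Set;
import java.util.TreeSet;

final class EffectiveStopwords {
    // Computes (built-in + additional) - excluded.
    // Assumption: exclusion is applied last, so a word listed in both
    // the additional and the excluded sets ends up excluded.
    static Set<String> resolve(Set<String> builtIn, Set<String> additional, Set<String> excluded) {
        Set<String> result = new TreeSet<>(builtIn);
        result.addAll(additional);
        result.removeAll(excluded);
        return result;
    }
}
```

    The inputs are left unmodified; the method returns a new sorted set.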
  - getExamples
```
default Set<String> getExamples()
```
    Gets an optional list of example sentences demonstrating what can be asked with this model. These examples may be displayed by the management tools. It is highly recommended to supply a good list of examples for the model as this provides perhaps the best description to the end user on how a particular model can be used.
    JSON
    If using JSON/YAML model presentation this is set by examples property:
```
 {
      "examples": [
          "Example questions one",
          "Another sample sentence"
      ]
 }
 
```
    Returns:
    
    Potentially empty list of model request examples.
  - getSuspiciousWords
```
default Set<String> getSuspiciousWords()
```
    Gets an optional list of suspicious words. A suspicious word is a word that generally should not appear in a user sentence when used with this model. For example, if a particular model is for children-oriented book search, the words "sex" and "porn" should probably NOT appear in the user input and can be automatically rejected when added here and the maximum returned by getMaxSuspiciousWords() is zero.
    Note that by having getMaxSuspiciousWords() return a non-zero value you can adjust the sensitivity of the suspicious words auto-rejection logic.
    JSON
    If using JSON/YAML model presentation this is set by suspiciousWords property:
```
 {
      "suspiciousWords": [
          "sex",
          "porn"
      ]
 }
 
```
    Returns:
    
    Potentially empty list of suspicious words in their lemma form.
  - getMacros
```
default Map<String,String> getMacros()
```
    Gets an optional map of macros to be used in this model. Macros and option groups are instrumental in defining model's elements. See NCElement for documentation on macros.
    JSON
    If using JSON/YAML model presentation this is set by macros property:
```
 {
      "macros": [
          {
              "name": "<OF>",
              "macro": "{of|for|per}"
          },
          {
              "name": "<CUR>",
              "macro": "{current|present|moment|now}"
          }
      ]
 }
 
```
    Returns:
    
    Potentially empty map of macros.
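    To show how a name-to-body macro map could be applied to a synonym definition, here is a sketch of plain textual substitution. The authoritative macro syntax is documented in NCElement; this sketch does not expand option groups such as {of|for|per}, and support for macros that reference other macros is an assumption. `MacroSubstitution` is a hypothetical helper.

```java
import java.util.Map;

final class MacroSubstitution {
    // Upper bound on substitution passes; guarantees termination even if
    // macro definitions are self-referential.
    private static final int MAX_PASSES = 16;

    // Replaces each macro name (e.g. "<OF>") with its body until nothing changes.
    static String expand(String text, Map<String, String> macros) {
        String current = text;
        for (int pass = 0; pass < MAX_PASSES; pass++) {
            String next = current;
            for (Map.Entry<String, String> e : macros.entrySet())
                next = next.replace(e.getKey(), e.getValue());
            if (next.equals(current))
                return next; // Fixed point reached: no macro names left to replace.
            current = next;
        }
        throw new IllegalStateException("Too many substitution passes for: " + text);
    }
}
```

    Using the two macros from the JSON snippet above, the text "<CUR> weather <OF> today" would become "{current|present|moment|now} weather {of|for|per} today", leaving the option groups for a later expansion step.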
  - getElements
```
default Set<NCElement> getElements()
```
    Gets a set of model elements. Model can have zero or more user defined elements.
    An element is the main building block of the semantic model. User data model element defines an entity that will be automatically recognized in the user input either by one of its synonyms or values, or directly by its ID.
    Note that unless model elements are loaded dynamically it is highly recommended to declare model elements in the external JSON/YAML model configuration (under elements property):
```
 {
      "elements": [
         {
             "id": "wt:hist",
             "synonyms": [
                 "{<WEATHER>|*} <HISTORY>",
                 "<HISTORY> {<OF>|*} <WEATHER>"
             ],
             "description": "Past weather conditions."
         }
      ]
 }
 
```
    Returns:
    
    Set of model elements, potentially empty.
  - discard
```
default void discard()
```
    A callback before this model instance gets discarded. It gives the model a chance to do a state cleanup, if necessary. There's no guarantee that this method will ever be invoked: it is only invoked if the data probe process exits in an orderly manner. However, it is guaranteed that this method will be called at most once. If the model implementation has state that must be persisted, it needs to persist that state through its own mechanisms without relying on this method.
    Note that if the model has important state, it is highly recommended to store it periodically instead of relying on this method.
    Default
    Default implementation is a no-op.
    JSON
    If using JSON/YAML model presentation this method will have no-op implementation.
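    The recommendation to persist state periodically, rather than relying on discard(), could look roughly like the sketch below. It uses only the standard `ScheduledExecutorService`; the class `PeriodicStateSaver`, its counter, and the `LongConsumer` persistence hook are all hypothetical and stand in for whatever state and storage a real model has.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

final class PeriodicStateSaver {
    private final AtomicLong handledQueries = new AtomicLong();
    private final LongConsumer sink; // Hypothetical persistence hook (file, database, ...).
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

    PeriodicStateSaver(LongConsumer sink) { this.sink = sink; }

    // Example of state worth keeping: a count of processed queries.
    void recordQuery() { handledQueries.incrementAndGet(); }

    // Pushes the current state to the sink; synchronized so that the timer
    // thread and a direct caller never persist at the same time.
    synchronized void save() { sink.accept(handledQueries.get()); }

    // Would be started from the model's initialize(...) callback.
    void start(long periodSeconds) {
        timer.scheduleAtFixedRate(this::save, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    // Would be called from discard(); the final save is a bonus, not a guarantee,
    // because discard() may never run if the probe process is killed.
    void stop() {
        timer.shutdownNow();
        save();
    }
}
```

    With this arrangement, at most one period's worth of state is lost when the process is killed without an orderly shutdown.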
  - initialize
```
default void initialize(NCProbeContext probeCtx)
```
    Probe calls this method to initialize the model when it gets deployed in the probe. This method is guaranteed to be called and it will be called only once.
    Default
    Default implementation stores provided probe context in the metadata under __NC_PROBE_CTX name:
```
 default void initialize(NCProbeContext probeCtx) {
     getMetadata().put("__NC_PROBE_CTX", probeCtx);
 }
 
```
    Parameters:
    
    probeCtx - Probe context.
  - query
```
default NCQueryResult query(NCQueryContext ctx)
                     throws NCRejection
```
    Processes user input provided in the given query context and either returns the query result or throws an exception. This is the main method that user should implement when developing a semantic model. See NCIntentSolver for intent-based user input processing for a simplified way to encode that processing logic.
    
    Parameters:
    
    ctx - Query context containing parsed user input and all associated data.
    
    Returns:
    
    Query result. This result cannot be null. In case of any errors this method should throw NCRejection exception.
    
    Throws:
    
    NCRejection - Thrown when user input cannot be processed as is and should be rejected.
    
    See Also:
    
    NCIntentSolver
  - getEnabledTokens
```
default Set<String> getEnabledTokens()
```
    Gets set of IDs for built-in tokens that should be enabled and detected for this model. Unless model requests (i.e. enables) the built-in tokens in this method the NLP subsystem will not attempt to detect them. Note that you don't have to specify your own user elements (tokens) here.
    Default
    The following built-in tokens are enabled by default implementation of this method:
    - nlpcraft:date
    - nlpcraft:geo
    - nlpcraft:num
    - nlpcraft:coordinate
    - nlpcraft:function
    Note that this method can return an empty set if the data model doesn't need any built-in tokens for its logic. See NCToken for the list of all supported built-in tokens.
    JSON
    If using JSON/YAML model presentation this is set by enabledTokens property:
```
 {
      "enabledTokens": [
          "google:person",
          "google:location",
          "stanford:money"
      ]
 }
 
```
    Returns:
    
    Set of built-in tokens, potentially empty, that should be enabled and detected for this model.
  - getParser
```
default NCCustomParser getParser()
```
    Gets optional custom user parser for model elements.
    By default the semantic data model detects its elements by their synonyms declared in the model. However, in some cases the synonyms (or the regular expressions) are simply not expressive enough. In such cases, a custom parser can be defined for the model, allowing the user to programmatically define their own logic for detecting the model elements in the user input. Note that there can be only one custom parser per model and it can detect any number of model elements.
    
    Returns:
    
    Custom user parser for model elements or null if not used (default).
