Vivekn-inspired sentiment analysis model
Content feature limit, used to boost performance on very dirty text. Disabled by default (-1).
Gets the content feature limit, used to boost performance on very dirty text. Disabled by default (-1).
Gets the proportion of feature content to be considered relevant. Defaults to 0.5.
Input annotation columns currently in use.
Gets the name of the annotation column that will be generated.
Gets the proportion to look ahead in unimportant features. Defaults to 0.025.
Proportion of feature content to be considered relevant. Defaults to 0.5.
Input annotator type : TOKEN, DOCUMENT
Columns that contain the annotations necessary to run this annotator. If not specified, the AnnotatorType is used for both input and output columns.
Detects negations and transforms them into not_ form.
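The not_ transformation can be illustrated with a minimal Python sketch. It is not the library's implementation: the negation word list, the rule that punctuation ends the negated span, and the function name `negate_sequence` are all assumptions made for illustration.

```python
import re

# Hypothetical negation cues and clause boundaries, chosen for illustration only.
NEGATION_WORDS = {"not", "no", "never"}
CLAUSE_END = re.compile(r"^[.,;:!?]$")


def negate_sequence(tokens):
    """Prefix tokens that follow a negation cue with 'not_' until the clause ends."""
    result = []
    negated = False
    for token in tokens:
        lowered = token.lower()
        if CLAUSE_END.match(token):
            # Assumption: punctuation closes the negated span.
            negated = False
            result.append(token)
        elif lowered in NEGATION_WORDS or lowered.endswith("n't"):
            # A negation cue toggles the state, so a double negation cancels out.
            negated = not negated
            result.append(token)
        else:
            result.append("not_" + token if negated else token)
    return result


print(negate_sequence(["I", "do", "not", "like", "this", "movie", ",", "but", "it", "ends", "well"]))
```

Under these assumptions, "like", "this" and "movie" become `not_like`, `not_this` and `not_movie`, while the tokens after the comma are left unchanged, so a classifier can treat negated and plain words as separate features.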
Output annotator type : SENTIMENT
Removes infrequent scenarios from scope. Higher values give better performance. Defaults to 1.
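As a rough illustration of frequency-based pruning, the Python sketch below counts tokens and drops the rare ones. The function name `prune_counts`, the strict "greater than" comparison, and the use of 0 to keep everything are assumptions, not behaviour confirmed by this documentation.

```python
from collections import Counter


def prune_counts(tokens, prune_corpus=1):
    """Count tokens and keep only those seen more than `prune_corpus` times.

    Assumption: the threshold is exclusive, so the default of 1 drops
    tokens that occur exactly once and a value of 0 keeps every token.
    """
    counts = Counter(tokens)
    return {word: n for word, n in counts.items() if n > prune_corpus}


corpus = ["good", "good", "great", "bad", "bad", "bad"]
print(prune_counts(corpus))                  # rare tokens removed
print(prune_counts(corpus, prune_corpus=0))  # nothing removed
```

A smaller vocabulary is cheaper to score, which is the performance gain described above, but on a small training set most tokens are rare and pruning may discard useful signal.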
Column with the sentiment result of every row. Values must be 'positive' or 'negative'.
When training on small data, you may want to disable this pruning so that infrequent words are not cut off.
Sets the content feature limit, used to boost performance on very dirty text. Disabled by default (-1).
Sets the proportion of feature content to be considered relevant. Defaults to 0.5.
Overrides the required annotator columns if they differ from the default.
Overrides the annotation column name used when transforming.
Column with the sentiment result of every row, used for training. Values must be 'positive' or 'negative'. If not set, external sources need to be set instead.
Sets the proportion to look ahead in unimportant features. Defaults to 0.025.
Requirement for pipeline transformation validation. It is called on fit().
Proportion to look ahead in unimportant features. Defaults to 0.025.
Takes a Dataset and checks whether all the required annotation types are present.
Input: the Dataset to be validated.
Returns: true if all the required types are present, otherwise false.
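The check described above amounts to a containment test. The Python sketch below assumes, purely for illustration, that each column's annotator type is available as a plain string in a dictionary; the function name `has_required_annotations` and that representation are hypothetical and are not how the library stores the information.

```python
def has_required_annotations(required_types, column_types):
    """Return True when every required annotator type is provided by some column.

    `column_types` is a hypothetical mapping of column name to annotator type.
    """
    present = set(column_types.values())
    return all(required in present for required in required_types)


# Column names and types used here are examples only.
columns = {"sentence": "DOCUMENT", "token": "TOKEN"}
print(has_required_annotations(["TOKEN", "DOCUMENT"], columns))
print(has_required_annotations(["TOKEN", "DOCUMENT"], {"token": "TOKEN"}))
```

The first call succeeds because both required types are covered; the second fails because no column provides DOCUMENT annotations.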
Required input and expected output annotator types
Inspired by the vivekn sentiment analysis algorithm https://github.com/vivekn/sentiment/.
Requires sentence boundaries to give a score in context, and tokenization to make sure tokens are within bounds. Transitive requirements must also be satisfied.
See https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/test/scala/com/johnsnowlabs/nlp/annotators/sda/vivekn for further reference on how to use this API.