characters used to explicitly mark sentence bounds
whether to take lists into consideration during sentence detection
whether to explode each sentence into a different row, for better parallelization.
whether to explode each sentence into a different row, for better parallelization. Defaults to false.
Custom sentence separator text
Whether to take lists into consideration at sentence detection.
Whether to take lists into consideration at sentence detection. Defaults to true.
Whether to split sentences into different Dataset rows.
Whether to split sentences into different Dataset rows. Useful for higher parallelism in fat rows. Defaults to false.
Get the maximum allowed length for each sentence
Get the minimum allowed length for each sentence
Length at which sentences will be forcibly split
Whether to apply abbreviation strategies for better accuracy, at the cost of slower performance.
Whether to apply abbreviation strategies for better accuracy, at the cost of slower performance. Defaults to true.
Use only custom bounds without considering those of Pragmatic Segmenter.
Use only custom bounds without considering those of Pragmatic Segmenter. Defaults to false. Requires customBounds to be set.
Set the maximum allowed length for each sentence
Set the minimum allowed length for each sentence
Custom sentence separator text
Whether to take lists into consideration at sentence detection.
Whether to take lists into consideration at sentence detection. Defaults to true.
Whether to split sentences into different Dataset rows.
Whether to split sentences into different Dataset rows. Useful for higher parallelism in fat rows. Defaults to false.
Set the maximum allowed length for each sentence
Set the minimum allowed length for each sentence
Length at which sentences will be forcibly split
Whether to apply abbreviation strategies for better accuracy, at the cost of slower performance.
Whether to apply abbreviation strategies for better accuracy, at the cost of slower performance. Defaults to true.
Use only custom bounds without considering those of Pragmatic Segmenter.
Use only custom bounds without considering those of Pragmatic Segmenter. Defaults to false. Requires customBounds to be set.
length at which sentences will be forcibly split.
whether to apply abbreviation strategies during sentence detection
whether to use only custom bounds for sentence detection
See https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/test/scala/com/johnsnowlabs/nlp/annotators/sbd/pragmatic for further reference on how to use this API
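The parameters documented above can be set through the annotator's fluent setters. A minimal sketch, assuming Spark NLP's `SentenceDetector` from the `com.johnsnowlabs.nlp.annotators.sbd.pragmatic` package and a preceding `DocumentAssembler` stage that produces a `document` column; the specific values shown (e.g. the `"\n\n"` custom bound) are illustrative, not defaults:

```scala
import com.johnsnowlabs.nlp.annotators.sbd.pragmatic.SentenceDetector

val sentenceDetector = new SentenceDetector()
  .setInputCols("document")          // column produced by a DocumentAssembler
  .setOutputCol("sentence")
  .setCustomBounds(Array("\n\n"))    // characters that explicitly mark sentence bounds
  .setUseCustomBoundsOnly(false)     // still consider Pragmatic Segmenter's bounds
  .setDetectLists(true)              // take lists into consideration at detection
  .setUseAbbreviations(true)         // abbreviation strategies: more accurate, slower
  .setExplodeSentences(false)        // keep all sentences in a single Dataset row
  .setMinLength(0)                   // minimum allowed sentence length
  .setMaxLength(99999)               // maximum allowed sentence length
  .setSplitLength(20000)             // force-split sentences longer than this
```

Setting `setExplodeSentences(true)` instead would emit one Dataset row per detected sentence, which can improve parallelism when individual rows hold very long texts.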