Package com.yahoo.schema.document
Class NormalizeLevel
java.lang.Object
com.yahoo.schema.document.NormalizeLevel
Class representing the character normalization
we want to do on query and indexed text.
The levels are nested: each level includes everything
the levels below it do, so accent removal also implies
codepoint normalization and case normalization.
Nested Class Summary
Modifier and Type | Class | Description
static enum | NormalizeLevel.Level | The current levels are as follows:
    NONE: no changes to input text
    CODEPOINT: convert text into Unicode Normalization Form Compatibility Composition
    LOWERCASE: also convert text into lowercase letters
    ACCENT: do both above and remove accents on characters
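To make the effect of each level concrete, the following is a minimal sketch that imitates the four levels using only the JDK. It is not Vespa's implementation: `NormalizeLevelDemo`, its `normalize` helper, and the sample string are hypothetical names chosen for illustration. The sketch assumes that "Compatibility Composition" corresponds to NFKC and that accent removal can be approximated by stripping combining marks after decomposition.

```java
import java.text.Normalizer;
import java.util.Locale;

/** Hypothetical illustration of the four levels using only the JDK; not Vespa's implementation. */
final class NormalizeLevelDemo {

    /** Mirrors the documented level names, ordered from least to most normalization. */
    enum Level { NONE, CODEPOINT, LOWERCASE, ACCENT }

    /** Applies every step up to and including the requested level. */
    static String normalize(String text, Level level) {
        String result = text;
        if (level.compareTo(Level.CODEPOINT) >= 0) {
            // NFKC: compatibility decomposition followed by canonical composition
            result = Normalizer.normalize(result, Normalizer.Form.NFKC);
        }
        if (level.compareTo(Level.LOWERCASE) >= 0) {
            result = result.toLowerCase(Locale.ROOT);
        }
        if (level.compareTo(Level.ACCENT) >= 0) {
            // Assumption: accent removal is approximated by decomposing (NFD),
            // dropping combining marks, then recomposing (NFC)
            String decomposed = Normalizer.normalize(result, Normalizer.Form.NFD);
            result = Normalizer.normalize(decomposed.replaceAll("\\p{M}+", ""), Normalizer.Form.NFC);
        }
        return result;
    }

    public static void main(String[] args) {
        // "fiance CAFE" written with the fi ligature (U+FB01) and accented E characters
        String input = "\uFB01anc\u00E9 CAF\u00C9";
        for (Level level : Level.values()) {
            System.out.println(level + ": " + normalize(input, level));
        }
    }
}
```

In this sketch each level builds on the previous one: CODEPOINT expands the ligature into the two letters "fi", LOWERCASE additionally lowercases the text, and ACCENT additionally strips the accents, yielding "fiance cafe" for the sample input.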
Constructor Summary
Constructor | Description
NormalizeLevel() | Construct a default (full) normalize level.
NormalizeLevel(NormalizeLevel.Level level, boolean fromUser) | Construct for a specific level, possibly user specified.
Method Summary
Modifier and Type | Method | Description
boolean | doRemoveAccents() | Returns whether accents should be removed from text.
 | getLevel() |
void | inferCodepoint() | Change the current level to CODEPOINT as inferred by other features' needs.
void | inferLowercase() | Change the current level to LOWERCASE as inferred by other features' needs.
Constructor Details
NormalizeLevel
public NormalizeLevel()
Construct a default (full) normalize level.
NormalizeLevel
public NormalizeLevel(NormalizeLevel.Level level, boolean fromUser)
Construct for a specific level, possibly user specified.
Parameters:
level - which level to use
fromUser - whether this was specified by the user
Method Details
doRemoveAccents
public boolean doRemoveAccents()
Returns whether accents should be removed from text.
inferCodepoint
public void inferCodepoint()
Change the current level to CODEPOINT as inferred by other features' needs. If the current level was user specified, it will not change; also, this will not increase the level.
inferLowercase
public void inferLowercase()
Change the current level to LOWERCASE as inferred by other features' needs. If the current level was user specified, it will not change; also, this will not increase the level.
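The two rules stated for inferCodepoint() and inferLowercase() (a user-specified level is never changed, and inference never raises the level) can be sketched as follows. `NormalizeLevelSketch` is a hypothetical stand-in written only to illustrate the documented behavior; it is not the source of NormalizeLevel. It assumes that the default "(full)" level is ACCENT and that doRemoveAccents() is true only at ACCENT.

```java
/** Hypothetical stand-in for NormalizeLevel, written only to illustrate the documented rules. */
final class NormalizeLevelSketch {

    /** Ordered from least to most normalization, so compareTo reflects the nesting. */
    enum Level { NONE, CODEPOINT, LOWERCASE, ACCENT }

    private Level level;
    private final boolean userSpecified;

    /** Assumption: the default "(full)" level is ACCENT and is not user specified. */
    NormalizeLevelSketch() {
        this(Level.ACCENT, false);
    }

    NormalizeLevelSketch(Level level, boolean fromUser) {
        this.level = level;
        this.userSpecified = fromUser;
    }

    Level getLevel() {
        return level;
    }

    /** Assumption: accents are removed only at the ACCENT level. */
    boolean doRemoveAccents() {
        return level == Level.ACCENT;
    }

    void inferCodepoint() {
        lowerTo(Level.CODEPOINT);
    }

    void inferLowercase() {
        lowerTo(Level.LOWERCASE);
    }

    /** Lowers the level to target, unless the user chose the level or it is already at or below target. */
    private void lowerTo(Level target) {
        if (userSpecified) return;
        if (level.compareTo(target) > 0) level = target;
    }

    public static void main(String[] args) {
        NormalizeLevelSketch inferred = new NormalizeLevelSketch();
        inferred.inferCodepoint();
        inferred.inferLowercase(); // would be an increase from CODEPOINT, so it is ignored
        System.out.println("inferred: " + inferred.getLevel());

        NormalizeLevelSketch fixed = new NormalizeLevelSketch(Level.ACCENT, true);
        fixed.inferCodepoint(); // ignored because the level was user specified
        System.out.println("user specified: " + fixed.getLevel());
    }
}
```

Under these assumptions the first object ends at CODEPOINT, because the later inferLowercase() call would have raised the level, while the second stays at ACCENT because its level came from the user.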
getLevel