Class PslDataParser
This class provides functionality to parse public suffix list (PSL) data files. The PSL data is used to determine the domain suffixes that are recognized as public or private, including handling of wildcard and exception rules. The parser reads the PSL data file line by line, categorizing each line into the appropriate trie structure based on whether it is a public or private suffix, and whether it is an exact match, wildcard, or exception rule. The parsed data is then stored in a `PslData` object for further use in domain validation processes.
-
Constructor Summary
Constructors

| Modifier | Constructor | Description |
| --- | --- | --- |
| private | `PslDataParser()` | Private constructor to prevent instantiation of this class. |
-
Method Summary
-
Constructor Details
-
PslDataParser
`private PslDataParser()`
Private constructor to prevent instantiation of this class. The class is designed to be used in a static context only, so instances are never needed; the private constructor guarantees that none can be created.
-
-
Method Details
-
parsePslData
Parse the given public suffix list data file and return the parsed data. This method takes a `Reader` that is expected to provide the contents of a valid public suffix list data file. It processes the file line by line and adds each rule to the appropriate trie structure within a `PslData` object. The file format is defined by the public suffix list project and includes exact-match, wildcard, and exception rules. Both public and private suffixes are parsed and stored.
- Parameters:
reader
- Reader for the public suffix list data file
- Returns:
- `PslData` object containing the parsed data
- Throws:
IllegalStateException
- if an I/O error occurs while reading the file
-
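The line-by-line categorization described above can be illustrated with a minimal sketch. It is not the actual implementation: the class `PslParseSketch`, its `Rules` holder, and the use of plain sets in place of the tries of `PslData` are all assumptions made for illustration. It relies only on the published PSL file format: `//` starts a comment, `// ===BEGIN PRIVATE DOMAINS===` opens the private section, a leading `*.` marks a wildcard rule, a leading `!` marks an exception rule, and each rule ends at the first whitespace.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.LinkedHashSet;
import java.util.Set;

public class PslParseSketch {

    /** Hypothetical stand-in for PslData: plain sets instead of tries. */
    public static final class Rules {
        public final Set<String> publicExact = new LinkedHashSet<>();
        public final Set<String> publicWildcard = new LinkedHashSet<>();
        public final Set<String> publicException = new LinkedHashSet<>();
        public final Set<String> privateExact = new LinkedHashSet<>();
        public final Set<String> privateWildcard = new LinkedHashSet<>();
        public final Set<String> privateException = new LinkedHashSet<>();
    }

    public static Rules parse(Reader reader) {
        Rules rules = new Rules();
        boolean inPrivate = false;
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                line = line.trim();
                if (line.startsWith("//")) {
                    // Section markers are ordinary comments in the PSL file.
                    if (line.contains("===BEGIN PRIVATE DOMAINS===")) {
                        inPrivate = true;
                    } else if (line.contains("===END PRIVATE DOMAINS===")) {
                        inPrivate = false;
                    }
                    continue;
                }
                if (line.isEmpty()) {
                    continue;
                }
                // A rule extends only up to the first whitespace.
                String rule = line.split("\\s+", 2)[0];
                if (rule.startsWith("!")) {
                    // Exception rule: store the name without the '!'.
                    (inPrivate ? rules.privateException : rules.publicException)
                            .add(rule.substring(1));
                } else if (rule.startsWith("*.")) {
                    // Wildcard rule: store the parent suffix without "*.".
                    (inPrivate ? rules.privateWildcard : rules.publicWildcard)
                            .add(rule.substring(2));
                } else {
                    (inPrivate ? rules.privateExact : rules.publicExact).add(rule);
                }
            }
        } catch (IOException e) {
            // Mirrors the documented behavior: I/O failures surface unchecked.
            throw new IllegalStateException("Failed to read PSL data", e);
        }
        return rules;
    }
}
```

A caller would pass any `Reader`, for example a `StringReader` over a few test rules or a `BufferedReader` over the downloaded list. Wrapping the `IOException` keeps the signature free of checked exceptions, which matches the `IllegalStateException` documented above.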
addToTrie
Add the given substring to the given trie. This helper method inserts a substring into the specified `Trie` object. It also converts Unicode domain names to their ASCII-compatible encoding (punycode) and inserts the punycode representation into the trie as well, so that both the Unicode and punycode forms of a domain name are recognized during validation.
- Parameters:
substring
- Substring to add to the trie
trie
- Trie to add the substring to
-
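The dual insertion of Unicode and punycode forms can be sketched as follows. This is an illustration under stated assumptions, not the actual helper: `AddToTrieSketch` and `SimpleTrie` are hypothetical names, the trie is a bare character trie rather than the `Trie` class referred to above, and the conversion uses the standard `java.net.IDN.toASCII`, which may or may not be what the real method calls.

```java
import java.net.IDN;
import java.util.HashMap;
import java.util.Map;

public class AddToTrieSketch {

    /** Hypothetical minimal character trie, standing in for the real Trie. */
    public static final class SimpleTrie {
        private final Map<Character, SimpleTrie> children = new HashMap<>();
        private boolean terminal;

        public void insert(String key) {
            SimpleTrie node = this;
            for (char c : key.toCharArray()) {
                node = node.children.computeIfAbsent(c, k -> new SimpleTrie());
            }
            node.terminal = true;
        }

        public boolean contains(String key) {
            SimpleTrie node = this;
            for (char c : key.toCharArray()) {
                node = node.children.get(c);
                if (node == null) {
                    return false;
                }
            }
            return node.terminal;
        }
    }

    /** Inserts the substring and, if it differs, its punycode form. */
    public static void addToTrie(String substring, SimpleTrie trie) {
        trie.insert(substring);
        // IDN.toASCII returns the input unchanged when it is already ASCII
        // and throws IllegalArgumentException for input it cannot convert.
        String ascii = IDN.toASCII(substring);
        if (!ascii.equals(substring)) {
            trie.insert(ascii);
        }
    }
}
```

With this sketch, adding a suffix that contains non-ASCII characters makes both spellings look up successfully, while a plain ASCII suffix is stored once.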