- termFrequency: use the number of times the term occurs in the document (x = freq_i).
- binary: use 1 if the term occurs in the document or 0 if it doesn't (x = χ(freq_i)).
- logarithmic: take the logarithm (base 10) of 1 + the number of times the term occurs in the document
(x = log10(1 + freq_i)).
- augmentedNormalizedTermFrequency: this formula adds to the binary frequency a "normalized" component expressing the
frequency of a term relative to the highest frequency of any term observed in that document
(x = 0.5 * (χ(freq_i) + freq_i / max_k(freq_k))).
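The four formulas above can be sketched in plain Scala as follows. This is only an illustration under stated assumptions, not the library's implementation: the object name `TermFrequencySketch` and its method names are hypothetical, `freq` stands for the raw count of term i in a document, and `maxFreq` for the highest count of any term in that same document.

```scala
// Hypothetical helper for illustration; not part of the documented API.
object TermFrequencySketch {
  // χ(freq): indicator that is 1 when the term occurs at least once, else 0.
  def chi(freq: Int): Double = if (freq > 0) 1.0 else 0.0

  // termFrequency: the raw count.
  def termFrequency(freq: Int): Double = freq.toDouble

  // binary: presence or absence only.
  def binary(freq: Int): Double = chi(freq)

  // logarithmic: base-10 logarithm of 1 + the raw count.
  def logarithmic(freq: Int): Double = math.log10(1.0 + freq)

  // augmentedNormalizedTermFrequency: average of the binary value and the
  // count relative to the most frequent term in the document.
  def augmentedNormalized(freq: Int, maxFreq: Int): Double = {
    require(maxFreq > 0, "the document must contain at least one term")
    0.5 * (chi(freq) + freq.toDouble / maxFreq)
  }
}
```

For a document in which one term occurs 3 times and that is the highest count, `TermFrequencySketch.augmentedNormalized(3, 3)` evaluates to 1.0, while a term that does not occur at all gets 0.0 under every one of the four formulas.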
Linear Supertypes
Enumeration, Serializable, Serializable, AnyRef, Any