org.opencms.search
Class CmsSearchSimilarity
java.lang.Object
org.apache.lucene.search.Similarity
org.apache.lucene.search.DefaultSimilarity
org.opencms.search.CmsSearchSimilarity
- All Implemented Interfaces:
- Serializable
public class CmsSearchSimilarity
- extends org.apache.lucene.search.DefaultSimilarity
Reduces the importance of the computeNorm(String, FieldInvertState) factor for the CmsSearchField.FIELD_CONTENT field, while keeping the Lucene default for all other fields.
This implementation was added because the default length norm is heavily biased toward small documents. With the default, even if a term occurs the same number of times in two documents, the smaller document (the one containing fewer terms) can easily score 3x as high as the longer document. This implementation reduces the influence of a document's term count on its score.
Inspired by Chuck Williams' WikipediaSimilarity.
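The "3x" figure can be reproduced with a little arithmetic. The sketch below assumes the classic Lucene default length norm of 1 / sqrt(numTerms), with the field boost left out for simplicity. The class name LengthNormDemo and the document sizes (100 and 900 terms) are made up for illustration and are not part of OpenCms or Lucene.

```java
import java.util.Locale;

public class LengthNormDemo {

    /**
     * Classic default length norm, assumed here to be 1 / sqrt(numTerms).
     * The field boost that Lucene multiplies in is omitted.
     */
    public static float defaultNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        // Two hypothetical documents that contain the search term equally often.
        float shortDoc = defaultNorm(100); // short document: 100 terms
        float longDoc = defaultNorm(900);  // long document: 900 terms

        // The norm alone favors the short document by sqrt(900 / 100) = 3.
        System.out.println(String.format(Locale.ROOT,
            "short=%.4f long=%.4f ratio=%.2f", shortDoc, longDoc, shortDoc / longDoc));
    }
}
```

Because the norm is multiplied into the score, a ninefold difference in length alone yields a threefold difference in this factor, even when term frequency is identical.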
- Since:
- 6.0.0
- See Also:
- Serialized Form
Fields inherited from class org.apache.lucene.search.DefaultSimilarity
discountOverlaps
Fields inherited from class org.apache.lucene.search.Similarity
NO_DOC_ID_PROVIDED
Constructor Summary
CmsSearchSimilarity()
Creates a new instance of the OpenCms search similarity.
Method Summary
float computeNorm(String fieldName, org.apache.lucene.index.FieldInvertState state)
Special implementation for "compute norm" to reduce the significance of this factor for the CmsSearchField.FIELD_CONTENT field, while keeping the Lucene default for all other fields.
Methods inherited from class org.apache.lucene.search.DefaultSimilarity
coord, getDiscountOverlaps, idf, queryNorm, setDiscountOverlaps, sloppyFreq, tf
Methods inherited from class org.apache.lucene.search.Similarity
decodeNorm, decodeNormValue, encodeNorm, encodeNormValue, getDefault, getNormDecoder, idfExplain, idfExplain, idfExplain, lengthNorm, scorePayload, setDefault, tf
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
CmsSearchSimilarity
public CmsSearchSimilarity()
- Creates a new instance of the OpenCms search similarity.
computeNorm
public float computeNorm(String fieldName,
org.apache.lucene.index.FieldInvertState state)
- Special implementation for "compute norm" to reduce the significance of this factor for the CmsSearchField.FIELD_CONTENT field, while keeping the Lucene default for all other fields.
- Overrides:
computeNorm in class org.apache.lucene.search.DefaultSimilarity
- See Also:
DefaultSimilarity.computeNorm(java.lang.String, org.apache.lucene.index.FieldInvertState)
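The field-dependent behavior described for computeNorm can be sketched without a Lucene dependency. Everything in the block below is an assumption made for illustration: the class FieldNormSketch, the field name "content" standing in for CmsSearchField.FIELD_CONTENT, and above all the dampening formula 3 / log10(1000 + numTerms), which is one plausible logarithmic choice and is not confirmed by this page as the formula OpenCms uses. The boost and numTerms parameters stand in for the values a FieldInvertState would supply.

```java
import java.util.Locale;

public class FieldNormSketch {

    /** Hypothetical stand-in for CmsSearchField.FIELD_CONTENT (real value not shown here). */
    public static final String FIELD_CONTENT = "content";

    /** Default-style norm: boost / sqrt(numTerms). */
    public static float defaultNorm(float boost, int numTerms) {
        return boost * (float) (1.0 / Math.sqrt(numTerms));
    }

    /** Assumed dampened norm: grows with log10 of the length instead of its square root. */
    public static float dampenedNorm(float boost, int numTerms) {
        return boost * (float) (3.0 / Math.log10(1000.0 + numTerms));
    }

    /** Dampens the norm for the content field only; all other fields keep the default. */
    public static float computeNorm(String fieldName, float boost, int numTerms) {
        if (FIELD_CONTENT.equals(fieldName)) {
            return dampenedNorm(boost, numTerms);
        }
        return defaultNorm(boost, numTerms);
    }

    public static void main(String[] args) {
        // Compare a 100-term and a 900-term document for two fields.
        for (String field : new String[] {FIELD_CONTENT, "title"}) {
            float shortDoc = computeNorm(field, 1.0f, 100);
            float longDoc = computeNorm(field, 1.0f, 900);
            System.out.println(String.format(Locale.ROOT,
                "%s: short/long norm ratio = %.3f", field, shortDoc / longDoc));
        }
    }
}
```

Under these assumptions the short-to-long ratio for the content field stays close to 1, while any other field keeps the default ratio of 3 for the same two document lengths.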