Class NameRewriter

  • All Implemented Interfaces:
    com.yahoo.component.Component, java.lang.Comparable<com.yahoo.component.Component>

    public class NameRewriter
    extends QueryRewriteSearcher
    This rewriter adds rewrites for name entities to the query to boost precision
    - FSA dict: [normalized original query]\t[rewrite 1]\t[rewrite 2]\t[etc]
    - Features:
    OriginalAsUnit flag: add proximity boosting to original query
    RewritesAsUnitEquiv flag: add proximity boosted rewrites to original query
    RewritesAsEquiv flag: add rewrites to original query
    Author:
    Karen Sze Wing Lee
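The dictionary layout described above (a normalized query followed by tab-separated rewrites) can be illustrated with a minimal sketch. It only shows the logical shape of one entry as plain text. It does not read a compiled FSA file, and the class name, helper names, and sample entry are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

class DictEntryExample {

    // Hypothetical helper: the first tab-separated field is the normalized original query.
    static String keyOf(String line) {
        int tab = line.indexOf('\t');
        return tab < 0 ? line : line.substring(0, tab);
    }

    // Hypothetical helper: every field after the first is one rewrite.
    static List<String> rewritesOf(String line) {
        String[] fields = line.split("\t");
        return Arrays.asList(fields).subList(1, fields.length);
    }

    public static void main(String[] args) {
        // Invented sample entry, not taken from any real dictionary.
        String line = "will smith\twill smith actor\twillard smith";
        System.out.println("key      = " + keyOf(line));
        System.out.println("rewrites = " + rewritesOf(line));
    }
}
```

An entry with no rewrites (no tab in the line) yields the whole line as the key and an empty rewrite list.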
    • Constructor Summary

      Constructors 
      Constructor Description
      NameRewriter​(com.yahoo.component.ComponentId id, com.yahoo.filedistribution.fileacquirer.FileAcquirer fileAcquirer, RewritesConfig config)
      Constructor for NameRewriter
      Load configs using default format
      NameRewriter​(RewritesConfig config, java.util.HashMap<java.lang.String,​java.io.File> fileList)
      Constructor for NameRewriter unit tests
      Load configs using default format
    • Method Summary

      Modifier and Type Method Description
      boolean configure​(com.yahoo.filedistribution.fileacquirer.FileAcquirer fileAcquirer, RewritesConfig config, java.util.HashMap<java.lang.String,​java.io.File> fileList)
      Loads configuration other than the FSA dictionaries at instance creation time
      Does nothing for this rewriter
      java.util.HashMap<java.lang.String,​java.lang.String> getDefaultFSAs()
      Get default FSA dictionary names
      java.lang.String getRewriterName()
      Get the name of the rewriter
      boolean getSkipRewriterIfRewritten()
      Get the flag which specifies whether this rewriter should be skipped if the query has already been rewritten
      java.util.HashMap<java.lang.String,​java.lang.Object> rewrite​(Query query, java.lang.String dictKey)
      Main logic of rewriter
      - Retrieve rewrites from FSA dict
      - Rewrite the query using the features enabled by the user
      • Methods inherited from class com.yahoo.component.chain.ChainedComponent

        getAnnotatedDependencies, getDefaultAnnotatedDependencies, getDependencies, initDependencies
      • Methods inherited from class com.yahoo.component.AbstractComponent

        clone, compareTo, deconstruct, getClassName, getId, getIdString, hasInitializedId, initId, isDeconstructable, setIsDeconstructable
      • Methods inherited from class java.lang.Object

        equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Field Detail

      • NAME_ENTITY_EXPAND_DICT

        public static final java.lang.String NAME_ENTITY_EXPAND_DICT
        See Also:
        Constant Field Values
      • NAME_ENTITY_EXPAND_DICT_FILENAME

        public static final java.lang.String NAME_ENTITY_EXPAND_DICT_FILENAME
        See Also:
        Constant Field Values
    • Constructor Detail

      • NameRewriter

        @Inject
        public NameRewriter​(com.yahoo.component.ComponentId id,
                            com.yahoo.filedistribution.fileacquirer.FileAcquirer fileAcquirer,
                            RewritesConfig config)
        Constructor for NameRewriter
        Load configs using default format
      • NameRewriter

        public NameRewriter​(RewritesConfig config,
                            java.util.HashMap<java.lang.String,​java.io.File> fileList)
        Constructor for NameRewriter unit tests
        Load configs using default format
    • Method Detail

      • configure

        public boolean configure​(com.yahoo.filedistribution.fileacquirer.FileAcquirer fileAcquirer,
                                 RewritesConfig config,
                                 java.util.HashMap<java.lang.String,​java.io.File> fileList)
        Loads configuration other than the FSA dictionaries at instance creation time
        Does nothing for this rewriter
        Specified by:
        configure in class QueryRewriteSearcher
        Parameters:
        fileAcquirer - Required parameter for retrieving file-type config (see Vespa's search container documentation for more detail)
        config - Config from vespa-services.xml (see Vespa's search container documentation for more detail)
        fileList - pairs of file name and file handle for unit tests
        Returns:
        true if loaded successfully, false otherwise
      • rewrite

        public java.util.HashMap<java.lang.String,​java.lang.Object> rewrite​(Query query,
                                                                                  java.lang.String dictKey)
                                                                           throws java.lang.RuntimeException
        Main logic of rewriter
        - Retrieve rewrites from FSA dict
        - Rewrite the query using the features enabled by the user
        Specified by:
        rewrite in class QueryRewriteSearcher
        Parameters:
        query - Query object from searcher
        dictKey - the key passed from the previous rewriter, to be treated as the "original query from user". For example, a misspell rewriter would pass the corrected query as the "original query from user". Rewriters that add variants, abbreviations, etc. to the query should pass the original query as the key. This rewriter may still choose to ignore the key. The key is not the rewritten query itself: if the original query is (willl smith) and the rewritten query is (willl smith) OR (will smith), the key passed could be (will smith)
        Returns:
        HashMap which contains the key value pairs:
        - whether this query has been rewritten by this rewriter
        key: rewritten
        value: true or false
        - the key to be treated as the "original query from user" by the next rewriter downstream. For example, a misspell rewriter would pass the corrected query, while rewriters that add variants, abbreviations, etc. to the query should pass the original query. The next rewriter does not have to consume this key and may choose to ignore it.
        key: newDictKey
        value: new dict key
        Throws:
        java.lang.RuntimeException
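A caller might read the returned map along the following lines. This is a sketch under stated assumptions, not the actual calling code: it uses only the two documented keys, assumes the "rewritten" value is stored as a Boolean and the "newDictKey" value as a String, and builds a stand-in map by hand because a real Query and FSA dictionary are not available here. The class and helper names are hypothetical.

```java
import java.util.HashMap;

class RewriteResultExample {

    // Keys as documented for the map returned by rewrite().
    static final String REWRITTEN = "rewritten";
    static final String NEW_DICT_KEY = "newDictKey";

    // Hypothetical stand-in for the map a rewriter would return.
    static HashMap<String, Object> fakeResult(boolean rewritten, String newDictKey) {
        HashMap<String, Object> result = new HashMap<>();
        result.put(REWRITTEN, rewritten);
        result.put(NEW_DICT_KEY, newDictKey);
        return result;
    }

    // Treats a missing or non-Boolean value as "not rewritten" (an assumption).
    static boolean wasRewritten(HashMap<String, Object> result) {
        return Boolean.TRUE.equals(result.get(REWRITTEN));
    }

    // Falls back to the current key when no usable new key is present (an assumption).
    static String nextDictKey(HashMap<String, Object> result, String currentKey) {
        Object key = result.get(NEW_DICT_KEY);
        return key instanceof String ? (String) key : currentKey;
    }

    public static void main(String[] args) {
        HashMap<String, Object> result = fakeResult(true, "will smith");
        System.out.println("rewritten   = " + wasRewritten(result));
        System.out.println("next dictKey = " + nextDictKey(result, "willl smith"));
    }
}
```

Reading the flag with Boolean.TRUE.equals avoids a cast failure if the value turns out to be absent or of another type.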
      • getSkipRewriterIfRewritten

        public boolean getSkipRewriterIfRewritten()
        Get the flag which specifies whether this rewriter should be skipped if the query has already been rewritten
        Specified by:
        getSkipRewriterIfRewritten in class QueryRewriteSearcher
        Returns:
        true if rewriter should be skipped, false otherwise
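One way a chain of rewriters could honor this flag is sketched below. How QueryRewriteSearcher actually applies it is not shown on this page, so the Rewriter interface, the factory method, and the loop are all hypothetical and use plain strings instead of Query objects.

```java
import java.util.ArrayList;
import java.util.List;

class SkipFlagExample {

    // Hypothetical minimal rewriter contract, not the QueryRewriteSearcher API.
    interface Rewriter {
        String name();
        boolean skipIfRewritten();     // plays the role of getSkipRewriterIfRewritten()
        boolean rewrite(String query); // returns true if this rewriter changed the query
    }

    // Hypothetical factory for a rewriter with fixed behavior.
    static Rewriter of(String name, boolean skipIfRewritten, boolean rewrites) {
        return new Rewriter() {
            public String name() { return name; }
            public boolean skipIfRewritten() { return skipIfRewritten; }
            public boolean rewrite(String query) { return rewrites; }
        };
    }

    // Runs the chain in order and returns the names of the rewriters that executed.
    static List<String> run(List<Rewriter> chain, String query) {
        List<String> executed = new ArrayList<>();
        boolean rewritten = false;
        for (Rewriter r : chain) {
            if (rewritten && r.skipIfRewritten()) {
                continue; // an earlier rewriter already rewrote the query
            }
            executed.add(r.name());
            rewritten |= r.rewrite(query);
        }
        return executed;
    }

    public static void main(String[] args) {
        List<Rewriter> chain = List.of(
                of("misspell", false, true),
                of("name", true, true));
        System.out.println(run(chain, "willl smith"));
    }
}
```

In this sketch the second rewriter runs only when the first one leaves the query unchanged.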
      • getRewriterName

        public java.lang.String getRewriterName()
        Get the name of the rewriter
        Specified by:
        getRewriterName in class QueryRewriteSearcher
        Returns:
        Name of the rewriter
      • getDefaultFSAs

        public java.util.HashMap<java.lang.String,​java.lang.String> getDefaultFSAs()
        Get default FSA dictionary names
        Specified by:
        getDefaultFSAs in class QueryRewriteSearcher
        Returns:
        Pair of FSA dictionary name and filename