Constructor with default index analyzer, query analyzer, and Lucene similarity
Input DataFrame
Instantiate a LuceneRDD from a DataFrame
Spark DataFrame
Instantiate a LuceneRDD with an iterable
Input type
Elements to index
Index analyzer name
Query analyzer name
Lucene scoring similarity: BM25 or TF-IDF
Lucene Analyzer per field (indexing time), default empty
Lucene Analyzer per field (query time), default empty
Spark Context
Instantiate a LuceneRDD given an RDD[T]
Generic type
RDD of type T
Index analyzer name
Query analyzer name
Lucene scoring similarity: BM25 or TF-IDF
Lucene Analyzer per field (indexing time), default empty
Lucene Analyzer per field (query time), default empty
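The per-field analyzer maps can be read as overrides on top of the default analyzer name. A minimal sketch of that lookup, assuming a field missing from the map uses the default analyzer. `AnalyzerChoice`, `resolve`, and the analyzer names in the usage note are hypothetical and not part of the LuceneRDD API:

```scala
// Hypothetical helper, not part of LuceneRDD: picks an analyzer name per field.
object AnalyzerChoice {
  // Assumption: fields absent from the per-field map use the default analyzer.
  def resolve(field: String,
              defaultAnalyzer: String,
              perField: Map[String, String] = Map.empty): String =
    perField.getOrElse(field, defaultAnalyzer)
}
```

With an empty map (the default), `AnalyzerChoice.resolve("name", "en")` returns the default name; with `Map("name" -> "whitespace")` the override wins for the `name` field only.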
Deduplication via blocking
Entities DataFrame to deduplicate
Function that maps Row to Lucene Query
Columns on which exact match is required
Number of top-K query results
Parameters for index-time and query-time analysis
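The idea behind blocking can be sketched with plain Scala collections standing in for the DataFrame: records are grouped by the exact-match columns, and candidate duplicates are only formed inside a group. This is an illustration of the technique under those assumptions, not the library's implementation; `BlockingSketch`, `Record`, `blocks`, and `candidatePairs` are hypothetical names:

```scala
// Illustrative only: plain Scala collections stand in for the entities DataFrame.
object BlockingSketch {
  type Record = Map[String, String]

  // Group records by the values of the blocking columns (exact match required).
  // Assumption: a missing column is treated as the empty string.
  def blocks(records: Seq[Record],
             blockingCols: Seq[String]): Map[Seq[String], Seq[Record]] =
    records.groupBy(r => blockingCols.map(c => r.getOrElse(c, "")))

  // Candidate pairs are formed only inside a block, never across blocks.
  def candidatePairs(records: Seq[Record],
                     blockingCols: Seq[String]): Seq[(Record, Record)] =
    blocks(records, blockingCols).values.toSeq.flatMap { block =>
      for {
        i <- block.indices
        j <- (i + 1) until block.size
      } yield (block(i), block(j))
    }
}
```

Three records with two distinct values in the blocking column yield a single candidate pair instead of the three pairs an all-against-all comparison would need, which is the point of blocking.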
Entity linkage between two DataFrames by blocking / filtering on one or more columns.
DataFrame of queries (entities) to be linked against the entities DataFrame
DataFrame of entities to be linked with the queries DataFrame
Function[Row, Query] that converts Row to a Lucene Query
List of query columns for HashPartitioner
List of entity columns for HashPartitioner
Number of linked results
Parameters for index-time and query-time analysis
Returns an RDD of Tuple2, where _1 is the query and _2 is its top-k linked results as SparkScoreDoc instances.
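The shape of that result, one (query, top-k hits) pair per query, can be sketched on local collections. This is a rough illustration, not the library's code: `LinkageSketch`, `Scored` (a stand-in for SparkScoreDoc), and the token-overlap scorer are all hypothetical, and a real Lucene query with its configured similarity would replace the scorer.

```scala
// Illustrative only: local Seq values stand in for the two DataFrames and the RDD.
object LinkageSketch {
  type Record = Map[String, String]

  // Stand-in for SparkScoreDoc: a matched record plus its score.
  final case class Scored(doc: Record, score: Double)

  // Placeholder scorer: number of shared lowercase tokens in one field.
  // Assumption: a Lucene query built from the row would be used instead.
  def tokenOverlap(field: String): (Record, Record) => Double = (q, e) => {
    def tokens(r: Record): Set[String] =
      r.getOrElse(field, "").toLowerCase.split("\\s+").filter(_.nonEmpty).toSet
    (tokens(q) intersect tokens(e)).size.toDouble
  }

  def link(queries: Seq[Record],
           entities: Seq[Record],
           queryCols: Seq[String],
           entityCols: Seq[String],
           topK: Int,
           score: (Record, Record) => Double): Seq[(Record, Seq[Scored])] = {
    // Index entities by their blocking key once.
    val byKey = entities.groupBy(e => entityCols.map(c => e.getOrElse(c, "")))
    queries.map { q =>
      // Only entities whose blocking key equals the query's key are scored.
      val key = queryCols.map(c => q.getOrElse(c, ""))
      val hits = byKey.getOrElse(key, Seq.empty)
        .map(e => Scored(e, score(q, e)))
        .sortWith(_.score > _.score) // highest score first
        .take(topK)
      (q, hits)
    }
  }
}
```

The query and entity column lists are separate because the two inputs may name the blocking column differently (for example `city` on one side and `town` on the other); they are compared by position.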
Get the configured analyzers, or fall back to English
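A fallback of this kind might look like the sketch below. It assumes settings arrive as a plain `Map[String, String]` and that `"en"` names the English analyzer; the object name and the keys `index.analyzer` and `query.analyzer` are hypothetical and do not reflect the library's actual configuration keys:

```scala
// Hypothetical sketch of "configured value or English" for both analyzers.
object AnalyzerFallback {
  // Assumption: "en" denotes the English analyzer; the real identifier may differ.
  val DefaultAnalyzer = "en"

  // Returns (indexAnalyzer, queryAnalyzer), using the default when a key
  // is missing or blank.
  def configured(settings: Map[String, String]): (String, String) = {
    def pick(key: String): String =
      settings.get(key).map(_.trim).filter(_.nonEmpty).getOrElse(DefaultAnalyzer)
    (pick("index.analyzer"), pick("query.analyzer"))
  }
}
```

Treating a blank value the same as a missing one is a design choice made for this sketch, so that an empty setting cannot silently select no analyzer.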
Return project information, e.g., version number and build time