Package com.github.jelmerk.knn
Interface Index<TId,TVector,TItem extends Item<TId,TVector>,TDistance>
- Type Parameters:
TId
- Type of the external identifier of an item
TVector
- Type of the vector to perform distance calculation on
TItem
- Type of items stored in the index
TDistance
- Type of distance between items (expect any numeric type: float, double, int, ..)
- All Superinterfaces:
Serializable
- All Known Implementing Classes:
BruteForceIndex, HnswIndex, StatisticsDecorator
public interface Index<TId,TVector,TItem extends Item<TId,TVector>,TDistance> extends Serializable
Read/write k-nearest neighbors search index.
- See Also:
- k-nearest neighbors algorithm
-
Field Summary
Fields (Modifier and Type, Field, Description)
static int DEFAULT_PROGRESS_UPDATE_INTERVAL
  By default after indexing this many items progress will be reported to registered progress listeners.
-
Method Summary
Methods (Modifier and Type, Method, Description)
void add(TItem item)
  Add a new item to the index.
default void addAll(Collection<TItem> items)
  Add multiple items to the index.
default void addAll(Collection<TItem> items, int numThreads, ProgressListener listener, int progressUpdateInterval)
  Add multiple items to the index.
default void addAll(Collection<TItem> items, ProgressListener listener)
  Add multiple items to the index.
List<SearchResult<TItem,TDistance>> findNearest(TVector vector, int k)
  Find the items closest to the passed in vector.
default List<SearchResult<TItem,TDistance>> findNeighbors(TId id, int k)
  Find the items closest to the item identified by the passed in id.
Optional<TItem> get(TId id)
  Returns an item by its identifier.
boolean remove(TId id)
  Removes an item from the index.
default void save(File file)
  Saves the index to a file.
void save(OutputStream out)
  Saves the index to an OutputStream.
default void save(Path path)
  Saves the index to a path.
int size()
  Returns the size of the index.
-
Field Detail
-
DEFAULT_PROGRESS_UPDATE_INTERVAL
static final int DEFAULT_PROGRESS_UPDATE_INTERVAL
By default after indexing this many items progress will be reported to registered progress listeners.
- See Also:
- Constant Field Values
-
Method Detail
-
add
void add(TItem item)
Add a new item to the index. If the item already exists in the index, the old item will first be removed from the index. For this to work, removes need to be enabled for the index.
- Parameters:
item
- the item to add to the index
-
remove
boolean remove(TId id)
Removes an item from the index.
- Parameters:
id
- unique identifier of the item to remove
- Returns:
true if an item was removed from the index. If the index does not support removals, this will always be false.
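The add and remove contracts described above can be illustrated with a minimal sketch. It is not the library's implementation: AddRemoveSketch, its String ids and float[] vectors, and the removeEnabled flag are hypothetical stand-ins. The text above does not say what happens when a duplicate is added while removes are disabled, so throwing an exception in that case is purely an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, NOT a library class: a map-backed store that mimics
// the documented add/remove behaviour.
class AddRemoveSketch {
    private final Map<String, float[]> items = new HashMap<>();
    private final boolean removeEnabled;

    AddRemoveSketch(boolean removeEnabled) {
        this.removeEnabled = removeEnabled;
    }

    // If an item with the same id exists, the old one is removed first.
    void add(String id, float[] vector) {
        if (items.containsKey(id)) {
            if (!removeEnabled) {
                // Assumption of this sketch only; the documentation does not specify this case.
                throw new IllegalStateException("removes are not enabled");
            }
            remove(id);
        }
        items.put(id, vector);
    }

    // Returns true only if something was removed; always false when removals are unsupported.
    boolean remove(String id) {
        if (!removeEnabled) {
            return false;
        }
        return items.remove(id) != null;
    }

    float[] get(String id) {
        return items.get(id);
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        AddRemoveSketch index = new AddRemoveSketch(true);
        index.add("a", new float[] {1f});
        index.add("a", new float[] {2f}); // replaces the first "a"
        System.out.println("size after re-adding: " + index.size());
        System.out.println("first remove: " + index.remove("a"));
        System.out.println("second remove: " + index.remove("a"));
    }
}
```

Callers that rely on add replacing an existing item should therefore check that the concrete index was built with removes enabled.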
-
addAll
default void addAll(Collection<TItem> items) throws InterruptedException
Add multiple items to the index.
- Parameters:
items
- the items to add to the index
- Throws:
InterruptedException
- thrown when the thread doing the indexing is interrupted
-
addAll
default void addAll(Collection<TItem> items, ProgressListener listener) throws InterruptedException
Add multiple items to the index. Reports progress to the passed in implementation of ProgressListener every DEFAULT_PROGRESS_UPDATE_INTERVAL elements indexed.
- Parameters:
items
- the items to add to the index
listener
- listener to report progress to
- Throws:
InterruptedException
- thrown when the thread doing the indexing is interrupted
-
addAll
default void addAll(Collection<TItem> items, int numThreads, ProgressListener listener, int progressUpdateInterval) throws InterruptedException
Add multiple items to the index. Reports progress to the passed in implementation of ProgressListener every progressUpdateInterval elements indexed.
- Parameters:
items
- the items to add to the index
numThreads
- number of threads to use for parallel indexing
listener
- listener to report progress to
progressUpdateInterval
- after indexing this many items progress will be reported
- Throws:
InterruptedException
- thrown when the thread doing the indexing is interrupted
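One way parallel indexing with periodic progress reports could work is sketched below. This is an illustration under assumptions, not the library's code: AddAllSketch, the Progress callback (the real ProgressListener may be shaped differently), and the addOne consumer standing in for add(TItem) are all hypothetical. Reporting once more when the last item finishes is also a choice made by this sketch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical stand-in, NOT a library class.
class AddAllSketch {

    // Hypothetical callback; the library's ProgressListener may look different.
    interface Progress {
        void update(int workDone, int max);
    }

    // Runs addOne for every item on numThreads threads and reports progress every
    // progressUpdateInterval completed items, plus once when the last item is done.
    static <T> void addAll(Collection<T> items, Consumer<T> addOne, int numThreads,
                           Progress listener, int progressUpdateInterval)
            throws InterruptedException {
        if (progressUpdateInterval <= 0) {
            throw new IllegalArgumentException("progressUpdateInterval must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        AtomicInteger done = new AtomicInteger();
        int max = items.size();
        try {
            for (T item : items) {
                pool.submit(() -> {
                    addOne.accept(item);
                    int n = done.incrementAndGet();
                    // The listener may be called from any worker thread.
                    if (n % progressUpdateInterval == 0 || n == max) {
                        listener.update(n, max);
                    }
                });
            }
        } finally {
            pool.shutdown();
        }
        // Propagates InterruptedException if the calling thread is interrupted while waiting.
        pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            items.add(i);
        }
        addAll(items, item -> { /* index the item here */ }, 2,
                (workDone, max) -> System.out.println(workDone + "/" + max), 4);
    }
}
```

Because several threads report progress, a listener used this way has to be thread-safe, and reports may arrive out of order.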
-
size
int size()
Returns the size of the index.
- Returns:
- size of the index
-
get
Optional<TItem> get(TId id)
Returns an item by its identifier.
- Parameters:
id
- unique identifier of the item to return
- Returns:
- an Optional holding the item
-
findNearest
List<SearchResult<TItem,TDistance>> findNearest(TVector vector, int k)
Find the items closest to the passed in vector.
- Parameters:
vector
- the vector
k
- number of items to return
- Returns:
- the items closest to the passed in vector
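What "the items closest to the passed in vector" means can be shown with an exhaustive search: score every item against the query and keep the k smallest distances. The sketch below is illustrative only. FindNearestSketch, Point, and Hit are hypothetical stand-ins for the index, TItem, and SearchResult, and Euclidean distance is an arbitrary choice. An approximate implementation would not scan every item.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in, NOT a library class: a tiny exhaustive k-NN search.
class FindNearestSketch {

    // Hypothetical item type: an id plus a float vector.
    static final class Point {
        final String id;
        final float[] vector;

        Point(String id, float[] vector) {
            this.id = id;
            this.vector = vector;
        }
    }

    // Hypothetical result type pairing an item with its distance to the query.
    static final class Hit {
        final Point item;
        final double distance;

        Hit(Point item, double distance) {
            this.item = item;
            this.distance = distance;
        }
    }

    // Euclidean distance, chosen here only for illustration.
    static double euclidean(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Scores every item against the query and keeps the k closest, nearest first.
    static List<Hit> findNearest(List<Point> items, float[] query, int k) {
        List<Hit> hits = new ArrayList<>();
        for (Point p : items) {
            hits.add(new Hit(p, euclidean(p.vector, query)));
        }
        hits.sort(Comparator.comparingDouble((Hit h) -> h.distance));
        return new ArrayList<>(hits.subList(0, Math.min(k, hits.size())));
    }

    public static void main(String[] args) {
        List<Point> items = new ArrayList<>();
        items.add(new Point("a", new float[] {0f, 0f}));
        items.add(new Point("b", new float[] {1f, 0f}));
        items.add(new Point("c", new float[] {5f, 5f}));
        for (Hit h : findNearest(items, new float[] {0.9f, 0f}, 2)) {
            System.out.println(h.item.id + " " + h.distance);
        }
    }
}
```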
-
findNeighbors
default List<SearchResult<TItem,TDistance>> findNeighbors(TId id, int k)
Find the items closest to the item identified by the passed in id. If the id does not match an item, an empty list is returned. The element itself is not included in the response.
- Parameters:
id
- id of the item to find the neighbors of
k
- number of items to return
- Returns:
- the items closest to the item
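The behaviour described above can be expressed in terms of get and findNearest: look up the item, search for k + 1 results around its vector, and drop the item itself. The following is a sketch of that idea, not necessarily how the default method is written. FindNeighborsSketch and its String ids and float[] vectors are hypothetical, and the exhaustive findNearest is only a placeholder.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in, NOT a library class.
class FindNeighborsSketch {
    private final Map<String, float[]> vectors = new LinkedHashMap<>();

    void add(String id, float[] vector) {
        vectors.put(id, vector);
    }

    Optional<float[]> get(String id) {
        return Optional.ofNullable(vectors.get(id));
    }

    static double squaredDistance(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Placeholder exhaustive search: ids of the k items closest to the query, nearest first.
    List<String> findNearest(float[] query, int k) {
        List<String> ids = new ArrayList<>(vectors.keySet());
        ids.sort(Comparator.comparingDouble((String id) -> squaredDistance(vectors.get(id), query)));
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }

    // Empty list for an unknown id; otherwise up to k neighbors, excluding the item itself.
    List<String> findNeighbors(String id, int k) {
        Optional<float[]> vector = get(id);
        if (!vector.isPresent()) {
            return Collections.emptyList();
        }
        List<String> neighbors = new ArrayList<>();
        // Ask for k + 1 because the item itself is expected among the results.
        for (String candidate : findNearest(vector.get(), k + 1)) {
            if (!candidate.equals(id) && neighbors.size() < k) {
                neighbors.add(candidate);
            }
        }
        return neighbors;
    }

    public static void main(String[] args) {
        FindNeighborsSketch index = new FindNeighborsSketch();
        index.add("a", new float[] {0f, 0f});
        index.add("b", new float[] {1f, 0f});
        index.add("c", new float[] {3f, 0f});
        System.out.println(index.findNeighbors("a", 2));
        System.out.println(index.findNeighbors("missing", 2));
    }
}
```

Filtering by id rather than by distance keeps other items with an identical vector in the result.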
-
save
void save(OutputStream out) throws IOException
Saves the index to an OutputStream. Saving may lock the index for updates.
- Parameters:
out
- the output stream to write the index to
- Throws:
IOException
- in case of I/O exception
-
save
default void save(File file) throws IOException
Saves the index to a file. Saving may lock the index for updates.
- Parameters:
file
- file to write the index to
- Throws:
IOException
- in case of I/O exception
-
save
default void save(Path path) throws IOException
Saves the index to a path. Saving may lock the index for updates.
- Parameters:
path
- path of the file to write the index to
- Throws:
IOException
- in case of I/O exception
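Since the interface extends Serializable and only save(OutputStream) is abstract, the File and Path variants can plausibly be thin wrappers around it. The sketch below shows that layering using standard Java serialization. It is an assumption-laden illustration, not the library's code: SaveSketch and its load helper are hypothetical, and the synchronized methods merely stand in for "saving may lock the index for updates".

```java
import java.io.File;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

// Hypothetical stand-in, NOT a library class.
class SaveSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final HashMap<String, float[]> items = new HashMap<>();

    // Shares a lock with save, so updates wait while a save is in progress.
    synchronized void add(String id, float[] vector) {
        items.put(id, vector);
    }

    synchronized int size() {
        return items.size();
    }

    // Core variant: writes the whole object to the stream and leaves the stream open.
    synchronized void save(OutputStream out) throws IOException {
        ObjectOutputStream oos = new ObjectOutputStream(out);
        oos.writeObject(this);
        oos.flush();
    }

    // Convenience variant delegating to the Path version.
    void save(File file) throws IOException {
        save(file.toPath());
    }

    // Convenience variant: opens the file, delegates, and closes the stream.
    void save(Path path) throws IOException {
        try (OutputStream out = Files.newOutputStream(path)) {
            save(out);
        }
    }

    // Hypothetical counterpart used only to show the round trip.
    static SaveSketch load(Path path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(path))) {
            return (SaveSketch) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("index", ".bin");
        SaveSketch index = new SaveSketch();
        index.add("a", new float[] {1f, 2f});
        index.save(tmp);
        System.out.println("restored size: " + load(tmp).size());
        Files.delete(tmp);
    }
}
```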