Vector Search
GridGain can index vectors stored in a field and then search the cache based on the provided vector.
Requirements
- GridGain must be running on Java 11 or later.
- The GridGain license must provide access to the vector search feature.
- Vector search is supported only for REPLICATED caches.
- Vectors for the field must be generated with a separate embedding model, as GridGain does not provide one.
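As a rough illustration of the REPLICATED requirement, the sketch below builds a cache configuration with the standard `CacheConfiguration` API. The class name `VectorCacheConfig`, the method name `replicated`, and the `Long` key type are hypothetical. Registering the value class through `setIndexedTypes` is an assumption carried over from how annotated text fields are usually registered; this page does not state that it is required for vector fields.

```java
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.CacheConfiguration;

/** Hypothetical helper; not part of GridGain. */
final class VectorCacheConfig {
    /** Builds a REPLICATED cache configuration for the given value type. */
    static <V> CacheConfiguration<Long, V> replicated(String cacheName, Class<V> valueType) {
        CacheConfiguration<Long, V> cfg = new CacheConfiguration<>(cacheName);

        // Vector search is only supported for REPLICATED caches.
        cfg.setCacheMode(CacheMode.REPLICATED);

        // Assumption: annotated fields are registered via indexed types,
        // as is done for text-indexed fields.
        cfg.setIndexedTypes(Long.class, valueType);

        return cfg;
    }
}
```

The returned configuration could then be passed to `ignite.getOrCreateCache(...)` in the usual way.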
Installation
To start using the vector store, enable the optional gridgain-vector-query module.
Vector Fields
When defining the class, mark the field that will hold the vector with the QueryVectorField annotation. This field must have the float[] type. GridGain will build a vector index from the embeddings stored in this field.
The example below shows a class that uses a text field and a vector field:
public class Article {
    /**
     * Content (indexed).
     */
    @QueryTextField
    private String content;

    /**
     * Embedding of the content (vector-indexed).
     */
    @QueryVectorField
    private float[] contentVector;

    /**
     * Required for binary deserialization.
     */
    public Article() {
        // No-op.
    }

    public Article(String content, float[] contentVector) {
        this.content = content;
        this.contentVector = contentVector;
    }
}
Objects with vector fields can be stored as normal. GridGain will build an additional index for the vector column that can be queried.
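Because the vector field must be a float[], output from an embedding model may need converting before it is stored. The sketch below assumes, purely for illustration, that your model's client returns a double[] or a List of numbers; the `EmbeddingArrays` class and its methods are hypothetical helpers, not part of GridGain.

```java
import java.util.List;

/** Hypothetical helper for preparing embeddings; not part of GridGain. */
final class EmbeddingArrays {
    /** Narrows a double[] embedding to the float[] type required by the vector field. */
    static float[] toFloatArray(double[] values) {
        float[] result = new float[values.length];
        for (int i = 0; i < values.length; i++)
            result[i] = (float) values[i];
        return result;
    }

    /** Converts a list of numbers (for example, List<Double>) to float[]. */
    static float[] toFloatArray(List<? extends Number> values) {
        float[] result = new float[values.size()];
        int i = 0;
        for (Number value : values)
            result[i++] = value.floatValue();
        return result;
    }
}
```

Narrowing from double to float loses some precision, which is normally acceptable for similarity search.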
Performing a Vector Query
To perform a vector query, you need a search vector produced by the same model that generated the vectors stored in the cache. This example assumes that you have already obtained that vector. Once the vector is available, create a VectorQuery object and send it to the cluster with the query method:
float[] searchVector = ...; // get from model
VectorQuery myQuery = new VectorQuery(Article.class, "contentVector", searchVector, 5);
cache.query(myQuery).getAll();
The VectorQuery constructor accepts the following parameters:
- The first parameter specifies the class of the cache value type (Article.class in this example).
- The second parameter specifies the name of the vector field to search.
- The third parameter specifies the previously obtained search vector.
- The fourth parameter specifies the maximum number of results to return. This parameter, often referred to as k in nearest neighbor searches, determines how many nearest neighbors the query will retrieve.
You can also specify an optional fifth threshold parameter to control the quality of results returned, for example:
float[] searchVector = ...; // get from model
// Using the overloaded constructor with the threshold parameter
VectorQuery myQuery = new VectorQuery(Article.class, "contentVector", searchVector, 5, 0.75f);
cache.query(myQuery).getAll();
The threshold must be a float value between 0.0 and 1.0, where higher values mean the results must be more similar to the search vector. This example returns up to 5 nearest neighbors, but only those that have a similarity score of at least 0.75. If fewer than 5 neighbors meet this threshold, fewer results will be returned. Using a threshold can help ensure that your search only returns relevant results and filters out vectors that are too dissimilar from the search vector.
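To make the interaction between k and the threshold concrete, here is a minimal brute-force sketch in plain Java. It is a conceptual illustration only, not how GridGain executes the query: the `ThresholdTopK` class is hypothetical, and cosine similarity is an assumed metric, since this page does not state which similarity measure the index uses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Conceptual illustration of "up to k results above a threshold"; not GridGain code. */
final class ThresholdTopK {
    /** Position of a stored vector together with its similarity to the search vector. */
    static final class Scored {
        final int index;
        final double score;

        Scored(int index, double score) {
            this.index = index;
            this.score = score;
        }
    }

    /** Cosine similarity (an assumed metric) between two vectors of equal length. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length)
            throw new IllegalArgumentException("Vector dimensions differ");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }

        // A zero vector has no direction; treat it as not similar to anything.
        if (normA == 0 || normB == 0)
            return 0.0;

        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns at most k entries whose similarity is at least the threshold, best first. */
    static List<Scored> search(List<float[]> stored, float[] query, int k, double threshold) {
        List<Scored> hits = new ArrayList<>();
        for (int i = 0; i < stored.size(); i++) {
            double score = cosine(stored.get(i), query);
            if (score >= threshold)
                hits.add(new Scored(i, score));
        }

        // Highest similarity first, then cut the list down to k entries.
        hits.sort(Comparator.comparingDouble((Scored h) -> h.score).reversed());
        return hits.size() > k ? new ArrayList<>(hits.subList(0, k)) : hits;
    }
}
```

With k = 5 and a threshold of 0.75, a data set in which only one vector scores 0.75 or higher yields a single result, matching the behavior described above.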