Unfortunately, directly indexing pre-computed vectors into Marqo is not supported at the moment.
Running inference as part of the add_documents pipeline (which creates the vectors for you) is an intrinsic part of Marqo.
I can offer some alternative approaches that may or may not be useful in your case:
If your vectors come from a supported model architecture, you could load your model into Marqo during index creation; you can refer to the documentation for loading generic CLIP and SBERT models. Marqo will then create the vectors for you.
You can also search directly with vectors once your data is indexed via the context parameter in the search API.
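As a rough sketch of the second option, a context search might look like the following. The index name, endpoint URL, and the 384-dimensional placeholder vector are assumptions, and the exact shape of the context payload can vary between Marqo versions, so check the search API docs for your release:

```python
import marqo

mq = marqo.Client(url="http://localhost:8882")  # assumed local endpoint

results = mq.index("my-index").search(
    # A weighted query with weight 0 so that only the context vector
    # drives the search; the vector dimension must match the index model.
    q={"placeholder query": 0.0},
    context={"tensor": [{"vector": [0.1] * 384, "weight": 1.0}]},
)
```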
Thank you for your insightful answer. Using the CLIP and SBERT models is not an option as I want to index documents with audio features extracted using essentia.
But maybe I’m just overlooking something and the features from Marqo are already supporting what I want to achieve.
I see. At this stage we don't support any models that extract features directly from audio.
If you are just looking to work with vectors that you already have, then a vector database that doesn't include inference might be a better fit for your use case. There are some good open-source options: HNSWLib is a more minimal implementation of vector search, while OpenSearch is more complex to use but also more feature-rich.
Marqo is targeted towards working directly with text and images where the vectors are created internally using a model of your choice.
For example, an index created as
mq.create_index("my-index", model="hf/e5-base")
would then use the e5-base model to create the vectors. Adding documents like
mq.index("my-index").add_documents(
    [
        {
            "text": "The EMU is a spacesuit that provides environmental protection.",
        },
    ],
    tensor_fields=["text"],
)
would take the "text" field and turn it into a vector using the hf/e5-base model. The same process is applied at search time, where the query text is converted into a vector to perform the vector search.
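So a search against the index above would be as simple as the following sketch (the query string here is just an example):

```python
results = mq.index("my-index").search("what does the EMU provide?")
```

The query is encoded with the same hf/e5-base model that was used at indexing time, which is exactly the inference step that makes bring-your-own-vector workflows awkward in Marqo today.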