I need some help with marqo

Hey,
I’ve been trying to understand how marqo works but the documentation isn’t all that clear.

Context

First off, I've got no experience with vector databases or context embedding.

I’m attempting to create some context embedding for my model. I’d like to use marqo for the vector database part, where I’ll be storing past conversations, documents and maybe some other stuff like social media posts, emails and whatnot.

Questions

Q1:
Would it be possible to set up a Docker Compose file for marqo? I’ve got no idea what to do with the volume part (I want it mapped to a folder in my home directory).
Could an example Docker Compose file be added to the documentation?

Q2:
When I specify the embedding model to use when creating the index and adding documents, will it keep the model loaded in memory for as long as the Docker container is running?
If that is the case and I add stuff to the database with a different script at a different time, I would not have to specify the same model again (so long as the index is the same), right?

Q3:
Could you add support for the heavy intfloat/e5-mistral-7b-instruct model?
(This model is right up there at rank 1, with 4096 dimensions.)

Q4:
So if I’ve understood things correctly, I could use one script with CUDA for creating the index and adding documents in bulk, and then run the context embedding at query time on the CPU, right?

Oh, and about the custom model properties part of the documentation:
this is what I could piece together from looking at the examples and the API:

"modelProperties": {
        "name": "e5-mistral-7b-instruct",
        "dimensions": 4096,
        "tokens": 32768,
        "type": "hf",
        "modelLocation": "./models/e5-mistral-7b-instruct.zip"
    },

I’m unsure if this is correct

Hi @DuckY-Y, I’ve put some answers to each of your questions below, let me know if you have any follow-ups!

A1:
What data do you want to map to your home folder?

A simple compose file would be something like:

version: '3'

services:
  marqo:
    image: marqoai/marqo:latest
    container_name: marqo
    ports:
      - "8882:8882"
    restart: unless-stopped

If you want to map the model cache to a folder on the host machine, you could add the following under the marqo service (at the same indentation as ports):

    volumes:
      - /home/path/to/dir:/app/src/marqo/cache

A2:
Embedding models are tied to an index so once you have created it, the model is fixed. You can continue to add documents to or search an existing index without needing to specify the model. If you want to check which model you have used for an index you can use the /settings endpoint.

Additionally, when you restart the container the model will still be cached, and searching or adding documents to the existing index will load the correct model back into memory (as long as you don’t delete the container).
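As a rough sketch of what that looks like from a second script: only the index name is needed, not the model. This assumes the marqo Python client, an index that already exists, and documents with a "text" field; the helper name add_later, the index name and the document are placeholders I made up for illustration.

```python
def add_later(mq, index_name, docs):
    """Add documents to an existing index without naming a model."""
    index = mq.index(index_name)
    # The model is fixed on the index, so only the documents are sent.
    # "text" is an assumed field name; tensor_fields lists the fields to embed.
    result = index.add_documents(docs, tensor_fields=["text"])
    # The index settings report which model the index was created with.
    settings = index.get_settings()
    return result, settings


def main():
    # Imported here so the helper above has no hard dependency on the client.
    import marqo  # pip install marqo

    mq = marqo.Client(url="http://localhost:8882")
    result, settings = add_later(
        mq,
        "my-index-name",  # placeholder: an index created earlier
        [{"_id": "conv-1", "text": "An earlier conversation to remember."}],
    )
    print(settings)
```

Calling main() with Marqo running should print the index settings, which is a quick way to confirm the model before adding more data.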

A3:
Unfortunately intfloat/e5-mistral-7b-instruct isn’t supported at the moment. This model requires extra code for the last token pooling which is not part of the sentence transformers API.

While this model is at the top of MTEB at the moment, it is also very large and occupies about 25-30GB in memory in my experience. (It took the authors 3 days on 8 A100 GPUs to evaluate the model on MTEB with truncation to 512 tokens.)

You might have an easier time getting started with a model like hf/e5-base-v2, for example:

mq.create_index("my-index-name", model="hf/e5-base-v2")

There are also small and large variants.

You can load in models that use the sentence transformers API with index settings like:

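import marqo

# Connects to a local Marqo instance by default
mq = marqo.Client()

settings = {
    "modelProperties": {
        "name": "BAAI/bge-large-en-v1.5",  # Hugging Face model to load
        "type": "hf",
        "dimensions": 1024,  # must match the model's embedding size
    },
    "model": "bge-large-1.5",  # any unique name you want
    "normalizeEmbeddings": True,
}

# Create the index with the custom model, then run a test search
mq.create_index("test-index-bge", settings_dict=settings)

mq.index("test-index-bge").search("hello world")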

A4:
Yep, by default Marqo will select the ‘best’ available device. If you have the NVIDIA Container Toolkit and a CUDA GPU working, it will use the GPU by default. If you want to use a specific device, there is also a device query parameter for search and for adding documents, where you can specify “cuda” or “cpu”.
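With the Python client, that split could look roughly like the sketch below: bulk indexing pinned to the GPU and searching pinned to the CPU. It assumes the client's device argument on add_documents and search, plus documents with a "text" field; the helper names are placeholders for illustration, and mq is a marqo.Client as in the earlier example.

```python
def bulk_add_on_gpu(mq, index_name, docs):
    """Embed and index documents on the GPU."""
    return mq.index(index_name).add_documents(
        docs,
        tensor_fields=["text"],  # assumed field name to embed
        device="cuda",           # needs a CUDA GPU visible to the container
    )


def search_on_cpu(mq, index_name, query):
    """Embed the query and search on the CPU."""
    return mq.index(index_name).search(query, device="cpu")
```

Leaving device out in either call falls back to whichever device Marqo picks by default.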
