Where are embedding models stored?

We would like to avoid downloading models (e.g., multilingual-clip/XLM-Roberta-Large-Vit-L-14) more than once, even if the Marqo container is shut down. This presumably means mapping an internal path to a persistent Docker volume. But where are these models stored?

As far as I can tell, the cache directory is:

/app/src/marqo/cache

Is that correct?

Hello @rcking. Yes, that is correct. You can mount a volume to that folder to cache models.

Some more details from the code base:

Please let me know if you have further questions. Thanks.

Thanks @papa99do. Sorry in advance for the long reply…

I do not think that marqo behaves as expected. Here is what I tried:

On a local machine, with internet connection:

$ docker run -d --name marqo -it -p 8882:8882 -v /vespa:/opt/vespa/var -v /marqo:/app/src/marqo/cache marqoai/marqo:2.11

$ curl -X POST -H 'Content-type: application/json' http://localhost:8882/indexes/test -d '{"model": "hf/e5-base-v2"}'

$ curl localhost:8882/models
{"models":[{"model_name":"hf/e5-base-v2","model_device":"cpu"},{"model_name":"open_clip/ViT-B-32/laion2b_s34b_b79k","model_device":"cpu"}]}
$ curl localhost:8882/indexes
{"results":[{"indexName":"test"}]}

Then I added some documents and carried out a query successfully.

Then I stopped the container, disconnected from the internet, and restarted the container. The query still worked.

Then I stopped and removed the container. Then I tried to run Marqo again (this time with no internet connection).

$ docker logs marqo

loading for: model_name=hf/e5-base-v2 and properties={'name': 'intfloat/e5-base-v2', 'dimensions': 768, 'tokens': 512, 'type': 'hf', 'model_size': 0.438, 'text_query_prefix': 'query: ', 'text_chunk_prefix': 'passage: ', 'notes': ''}
INFO:marqo.tensor_search.index_meta_cache:Last index cache refresh at 1724658687.0970526
/usr/local/lib/python3.8/site-packages/huggingface_hub/file_download.py:1150: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
ERROR:marqo.s2_inference.s2_inference:Error loading model hf/e5-base-v2 on device cpu with normalization=True.
Error message is Marqo encountered an error loading the Hugging Face model = `intfloat/e5-base-v2` using AutoTokenizer Please ensure that the model is a valid Hugging Face model and retry.
 Original error message = We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like intfloat/e5-base-v2 is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/urllib3/connection.py", line 168, in _new_conn
    conn = connection.create_connection(
  File "/usr/local/lib/python3.8/site-packages/urllib3/util/connection.py", line 73, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib64/python3.8/socket.py", line 918, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

+ many more exceptions

In other words, there is something ephemeral about the ModelCache object (I suspect); Marqo has “forgotten” that the model is cached, attempts to load it from the internet, and fails due to the missing connection.

Addendum: Interestingly, if I restore the internet connection and re-run Marqo (having removed the container previously), it starts fine and apparently loads the models from the cache! But it seems there is a check against Hugging Face before this happens…?

Not good behaviour for an offline deployment…
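For what it's worth, the Hugging Face libraries document a forced offline mode via environment variables (this is the "offline mode" the error message above links to). I haven't verified that Marqo's loading path honours them, but passing them into the container (e.g. with `docker run -e`) might be worth testing:

```python
# Hedged sketch: these environment variables are documented by Hugging Face
# for offline operation; whether Marqo's model-loading code respects them
# is an open question.
import os

os.environ["HF_HUB_OFFLINE"] = "1"        # hf_hub_download serves cached files only
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # transformers skips network lookups
```

With these set, the libraries should resolve models from the local cache without attempting any HTTP requests.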

Found a workaround. First, I downloaded e5-base-v2 from Hugging Face and put it in a local cache directory. Then I turned off my network interface.

Then:

docker run -d --name marqo -it -p 8882:8882 -e MARQO_MODELS_TO_PRELOAD='[]' -v /vespa:/opt/vespa/var -v /marqo/.cache/huggingface/hub:/hfcache marqoai/marqo:2.11

curl -X POST -H 'Content-type: application/json' http://localhost:8882/indexes/test -d '{"model": "local/e5-base-v2", "modelProperties": {"dimensions": 768, "tokens": 512,"type": "hf","model_size": 0.438,"text_query_prefix": "query: ","text_chunk_prefix": "passage: ","notes": "","localpath": "/hfcache/models--intfloat--e5-base-v2/snapshots/1c644c92ad3ba1efdad3f1451a637716616a20e8"}}'

Then I can add documents to the test index, and query them.

But most importantly, I can also shut down and remove the Marqo container, then restart it with the command above, and everything works, offline.

@rcking, thanks for sharing your findings with us. Glad to hear that you’ve got Marqo working offline. We will do some more investigation to see why you need to place the model in a local folder to start Marqo. We will keep you updated on the results of our investigation. Thanks again.

I think the problem might be in from_hf.py:

def download_model_from_hf(
        location: HfModelLocation,
        auth: Optional[HfAuth] = None,
        download_dir: Optional[str] = None):
    """Downloads a pretrained model from HF, if it doesn't exist locally. The basename of the
    location's filename is used as the local filename.

    hf_hub_download downloads the model if it does not yet exist in the cache.

    Args:
        location: repo_id and filename to be downloaded.
        auth: contains HF API token for model access
        download_dir: The location where the model
            should be stored

    Returns:
        Path to the downloaded model
    """
    download_kwargs = location.dict(exclude_unset=True) # Ignore unset values to avoid adding None to params
    if auth is not None:
        download_kwargs = {**download_kwargs, **auth.dict()}
    try:
        return hf_hub_download(**download_kwargs, cache_dir=download_dir)
    except RepositoryNotFoundError:
        # TODO: add link to HF model auth/loc
        raise ModelDownloadError(
            "Could not find the specified Hugging Face model repository. Please ensure that the request's model_auth's "
            "`hf` credentials and the index's model_location are correct. "
            "If the index's model_location is not correct, please create a new index with the corrected model_location"
        )

It appears that hf_hub_download is used to resolve the file path of the cached model; but that function raises an exception if Hugging Face cannot be reached, even when nothing actually needs to be downloaded.
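One possible fix (a sketch of my suggestion, not Marqo's actual code): attempt the download with `local_files_only=True` first, and only fall back to a network download on a genuine cache miss. `hf_hub_download` accepts a `local_files_only` keyword, and its `LocalEntryNotFoundError` is a `FileNotFoundError` subclass in recent huggingface_hub releases; the downloader is injected here so the logic can be shown without the library:

```python
# Sketch of an offline-first download wrapper (a suggestion, not Marqo's code).
# `download_fn` stands in for huggingface_hub.hf_hub_download, which accepts
# a `local_files_only` keyword and raises LocalEntryNotFoundError (a
# FileNotFoundError subclass in recent huggingface_hub releases) when the
# requested file is absent from the local cache.

def download_with_cache_fallback(download_fn, **download_kwargs):
    """Prefer the local cache; hit the network only on a cache miss."""
    try:
        # Offline attempt: returns the cached path without any HTTP request.
        return download_fn(local_files_only=True, **download_kwargs)
    except FileNotFoundError:
        # Cache miss: fall back to a normal (online) download.
        return download_fn(local_files_only=False, **download_kwargs)
```

With this shape, a lost connection only matters when the file truly isn't cached, which would restore the offline restart behaviour described above.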
