Marqo does not find my GPU

Hello,
I'm trying to set up Marqo with a GPU, but even though GPU passthrough appears to be configured correctly, Marqo doesn't seem to find it.

version: "3.7"
services:
  marqo:
    image: marqoai/marqo:latest
    restart: unless-stopped

    ports:
      - "8882:8882"

    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

I tried both docker-compose and the docker command line copied/pasted from the documentation. Same issue.
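
For reference, the docker run variant looked roughly like this (taken from the docs as far as I remember; exact flags and image tag may differ):

docker run --name marqo --gpus all -p 8882:8882 marqoai/marqo:latest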

My GPU is a Quadro P2000. It works fine on the host as well as inside other containers such as nvidia/cuda:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P2000        On   | 00000000:01:00.0 Off |                  N/A |
| 48%   37C    P8     4W /  75W |      1MiB /  5120MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Marqo startup log:

External vector store not configured. Using local vector store
Waiting for vector store to start
Waiting... 1 second
Waiting... 2 seconds
Waiting... 3 seconds
Waiting... 4 seconds
Waiting... 5 seconds
Marqo did not find an existing vector store. Setting up vector store...
  Vector store is available. Vector store setup complete
Starting Marqo throttling
Called Marqo throttling start command
Marqo throttling is now running
/usr/local/lib/python3.8/site-packages/transformers/generation_utils.py:24: FutureWarning: Importing `GenerationMixin` from `src/transformers/generation_utils.py` is deprecated and will be removed in Transformers v5. Import as `from transformers import GenerationMixin` instead.
  warnings.warn(
INFO:ModelsForStartup:pre-loading ['hf/e5-base-v2', 'open_clip/ViT-B-32/laion2b_s34b_b79k'] onto devices=['cpu']
INFO:marqo.tensor_search.index_meta_cache:Starting index cache refresh thread
###########################################################
###########################################################
###### STARTING DOWNLOAD OF MARQO ARTEFACTS################
###########################################################
###########################################################
INFO:CUDA device summary:Found devices [{'id': -1, 'name': ['cpu']}]
INFO:SetBestAvailableDevice:Best available device set to: cpu
INFO:marqo.tensor_search.index_meta_cache:Last index cache refresh at 1713271831.8404377
INFO:marqo.s2_inference.s2_inference:loaded hf/e5-base-v2 on device cpu with normalization=True at time=2024-04-16 12:50:31.717759.
INFO:ModelsForStartup:hf/e5-base-v2 cpu run succesfully!
INFO:marqo.s2_inference.s2_inference:loaded open_clip/ViT-B-32/laion2b_s34b_b79k on device cpu with normalization=True at time=2024-04-16 12:50:41.097007.
INFO:ModelsForStartup:open_clip/ViT-B-32/laion2b_s34b_b79k cpu run succesfully!
INFO:ModelsForStartup:0.027492618560791014 for hf/e5-base-v2 and cpu
INFO:ModelsForStartup:0.06336462497711182 for open_clip/ViT-B-32/laion2b_s34b_b79k and cpu
INFO:ModelsForStartup:completed loading models
INFO:marqo.connections:Took 1.361ms to connect to redis and load scripts.
loading for: model_name=hf/e5-base-v2 and properties={'name': 'intfloat/e5-base-v2', 'dimensions': 768, 'tokens': 512, 'type': 'hf', 'model_size': 0.438, 'notes': ''}
loading for: model_name=open_clip/ViT-B-32/laion2b_s34b_b79k and properties={'name': 'open_clip/ViT-B-32/laion2b_s34b_b79k', 'dimensions': 512, 'note': 'open_clip models', 'type': 'open_clip', 'pretrained': 'laion2b_s34b_b79k'}
###########################################################
###########################################################
###### !!COMPLETED SUCCESSFULLY!!!         ################
###########################################################
###########################################################
Version: 2.4.1

Could you help me? What am I missing to get Marqo working with my NVIDIA GPU?

Thank you

Hi @LeBleu, I think you just need to add a command to the Marqo service to enable the GPU.

Running Marqo with a GPU requires the --gpus all flag to be added (docs).

version: "3.7"
services:
  marqo:
    image: marqoai/marqo:latest
    restart: unless-stopped
    command: --gpus all

    ports:
      - "8882:8882"

    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

This seems to work on my machine. Let me know if you run into any issues with it.
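
Once it restarts, a quick sanity check is to look for the CUDA device summary line in the startup log (the container name depends on how compose names it; "marqo" below is just a placeholder):

docker logs marqo 2>&1 | grep "CUDA device summary"

If the GPU has been picked up, it should list something other than just the cpu entry shown in your log.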

Hi @owen-elliott
Thank you for your answer.
I thought the command-line flag was replaced by the deploy: instruction in compose.

Now the Marqo container seems to have GPU support enabled, but it still does not use the GPU.

The strange behaviour is that running nvidia-smi inside the Marqo container reports a missing library:

[root@62e78f022f6e app]# nvidia-smi 
NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
Please also try adding directory that contains libnvidia-ml.so to your system PATH.
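
In case it helps with debugging, these are the kind of checks that should show what is going on inside the container (standard locations, paths may vary): the first shows whether the library file is mounted at all, the second whether the linker cache knows about it.

find / -name 'libnvidia-ml.so*' 2>/dev/null
ldconfig -p | grep -i nvidia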

Full log at startup:

External vector store not configured. Using local vector store
Waiting for vector store to start
Waiting... 1 second
Waiting... 2 seconds
Waiting... 3 seconds
Waiting... 4 seconds
Waiting... 5 seconds
Marqo did not find an existing vector store. Setting up vector store...
  Vector store is available. Vector store setup complete
Starting Marqo throttling
Called Marqo throttling start command
Marqo throttling is now running
/usr/local/lib/python3.8/site-packages/transformers/generation_utils.py:24: FutureWarning: Importing `GenerationMixin` from `src/transformers/generation_utils.py` is deprecated and will be removed in Transformers v5. Import as `from transformers import GenerationMixin` instead.
  warnings.warn(
INFO:ModelsForStartup:pre-loading ['hf/e5-base-v2', 'open_clip/ViT-B-32/laion2b_s34b_b79k'] onto devices=['cpu']
INFO:marqo.tensor_search.index_meta_cache:Starting index cache refresh thread
###########################################################
###########################################################
###### STARTING DOWNLOAD OF MARQO ARTEFACTS################
###########################################################
###########################################################
INFO:CUDA device summary:Found devices [{'id': -1, 'name': ['cpu']}]
INFO:SetBestAvailableDevice:Best available device set to: cpu
INFO:marqo.tensor_search.index_meta_cache:Last index cache refresh at 1713339198.597228
INFO:marqo.s2_inference.s2_inference:loaded hf/e5-base-v2 on device cpu with normalization=True at time=2024-04-17 07:33:18.484142.
INFO:ModelsForStartup:hf/e5-base-v2 cpu run succesfully!
INFO:marqo.s2_inference.s2_inference:loaded open_clip/ViT-B-32/laion2b_s34b_b79k on device cpu with normalization=True at time=2024-04-17 07:33:28.488329.
INFO:ModelsForStartup:open_clip/ViT-B-32/laion2b_s34b_b79k cpu run succesfully!
INFO:ModelsForStartup:0.024351859092712404 for hf/e5-base-v2 and cpu
INFO:ModelsForStartup:0.08258800506591797 for open_clip/ViT-B-32/laion2b_s34b_b79k and cpu
INFO:ModelsForStartup:completed loading models
INFO:marqo.connections:Took 0.991ms to connect to redis and load scripts.
loading for: model_name=hf/e5-base-v2 and properties={'name': 'intfloat/e5-base-v2', 'dimensions': 768, 'tokens': 512, 'type': 'hf', 'model_size': 0.438, 'notes': ''}
loading for: model_name=open_clip/ViT-B-32/laion2b_s34b_b79k and properties={'name': 'open_clip/ViT-B-32/laion2b_s34b_b79k', 'dimensions': 512, 'note': 'open_clip models', 'type': 'open_clip', 'pretrained': 'laion2b_s34b_b79k'}
###########################################################
###########################################################
###### !!COMPLETED SUCCESSFULLY!!!         ################
###########################################################
###########################################################
Version: 2.4.1
   
     __    __    ___  _        __   ___   ___ ___    ___      ______   ___       ___ ___   ____  ____   ___    ___   __ 
    |  |__|  |  /  _]| |      /  ] /   \ |   |   |  /  _]    |      | /   \     |   |   | /    ||    \ /   \  /   \ |  |
    |  |  |  | /  [_ | |     /  / |     || _   _ | /  [_     |      ||     |    | _   _ ||  o  ||  D  )     ||     ||  |
    |  |  |  ||    _]| |___ /  /  |  O  ||  \_/  ||    _]    |_|  |_||  O  |    |  \_/  ||     ||    /|  Q  ||  O  ||__|
    |  `  '  ||   [_ |     /   \_ |     ||   |   ||   [_       |  |  |     |    |   |   ||  _  ||    \|     ||     | __ 
     \      / |     ||     \     ||     ||   |   ||     |      |  |  |     |    |   |   ||  |  ||  .  \     ||     ||  |
      \_/\_/  |_____||_____|\____| \___/ |___|___||_____|      |__|   \___/     |___|___||__|__||__|\_|\__,_| \___/ |__|
                                                                                                                        
        
     _____                                                   _        __              _                                     
    |_   _|__ _ __  ___  ___  _ __   ___  ___  __ _ _ __ ___| |__    / _| ___  _ __  | |__  _   _ _ __ ___   __ _ _ __  ___ 
      | |/ _ \ '_ \/ __|/ _ \| '__| / __|/ _ \/ _` | '__/ __| '_ \  | |_ / _ \| '__| | '_ \| | | | '_ ` _ \ / _` | '_ \/ __|
      | |  __/ | | \__ \ (_) | |    \__ \  __/ (_| | | | (__| | | | |  _| (_) | |    | | | | |_| | | | | | | (_| | | | \__ \
      |_|\___|_| |_|___/\___/|_|    |___/\___|\__,_|_|  \___|_| |_| |_|  \___/|_|    |_| |_|\__,_|_| |_| |_|\__,_|_| |_|___/
                                                                                                                                                                                                                                                     
        
INFO:     Started server process [1549]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8882 (Press CTRL+C to quit)

Interesting, this is not an issue I have come across before. What OS are you using? You said it worked in other containers; are those also using --gpus all?

I did some searching and this might be an issue outside of Marqo (maybe tied to a driver version); perhaps something in this issue helps. Let me know how it goes.

My host is Debian 12.
My Plex container, for example, doesn't use --gpus all, and I also tried docker run --rm --gpus=all nvidia/cuda:11.4.3-base-ubuntu20.04 nvidia-smi. In both cases the GPU works fine.

The fix mentioned in your issue is already implemented on my system: ldconfig = "/sbin/ldconfig"
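
For anyone else reading, that's the setting from the NVIDIA container runtime config; on my install it looks something like this in /etc/nvidia-container-runtime/config.toml (path from memory, may vary by distribution):

[nvidia-container-cli]
ldconfig = "/sbin/ldconfig"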

I found a workaround for Marqo: right at startup of the container I connect to it and run ldconfig inside the container. After that, nvidia-smi works inside the container and Marqo manages to use my GPU.
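
Concretely, once the container is up I just run something like this from the host (container name depends on how compose names it; "marqo" here is a placeholder):

docker exec marqo ldconfig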

I'm not sure whether the Marqo container should run it automatically or whether it should be handled by the NVIDIA container runtime.