I tried with docker-compose and with the docker command line copied/pasted from the documentation. Same issue.
My setup is a P2000; it works fine on the host as well as inside other containers like nvidia/cuda:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05 Driver Version: 525.147.05 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Quadro P2000 On | 00000000:01:00.0 Off | N/A |
| 48% 37C P8 4W / 75W | 1MiB / 5120MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Marqo startup log:
External vector store not configured. Using local vector store
Waiting for vector store to start
Waiting... 1 second
Waiting... 2 seconds
Waiting... 3 seconds
Waiting... 4 seconds
Waiting... 5 seconds
Marqo did not find an existing vector store. Setting up vector store...
Vector store is available. Vector store setup complete
Starting Marqo throttling
Called Marqo throttling start command
Marqo throttling is now running
/usr/local/lib/python3.8/site-packages/transformers/generation_utils.py:24: FutureWarning: Importing `GenerationMixin` from `src/transformers/generation_utils.py` is deprecated and will be removed in Transformers v5. Import as `from transformers import GenerationMixin` instead.
warnings.warn(
INFO:ModelsForStartup:pre-loading ['hf/e5-base-v2', 'open_clip/ViT-B-32/laion2b_s34b_b79k'] onto devices=['cpu']
INFO:marqo.tensor_search.index_meta_cache:Starting index cache refresh thread
###########################################################
###########################################################
###### STARTING DOWNLOAD OF MARQO ARTEFACTS################
###########################################################
###########################################################
INFO:CUDA device summary:Found devices [{'id': -1, 'name': ['cpu']}]
INFO:SetBestAvailableDevice:Best available device set to: cpu
INFO:marqo.tensor_search.index_meta_cache:Last index cache refresh at 1713271831.8404377
INFO:marqo.s2_inference.s2_inference:loaded hf/e5-base-v2 on device cpu with normalization=True at time=2024-04-16 12:50:31.717759.
INFO:ModelsForStartup:hf/e5-base-v2 cpu run succesfully!
INFO:marqo.s2_inference.s2_inference:loaded open_clip/ViT-B-32/laion2b_s34b_b79k on device cpu with normalization=True at time=2024-04-16 12:50:41.097007.
INFO:ModelsForStartup:open_clip/ViT-B-32/laion2b_s34b_b79k cpu run succesfully!
INFO:ModelsForStartup:0.027492618560791014 for hf/e5-base-v2 and cpu
INFO:ModelsForStartup:0.06336462497711182 for open_clip/ViT-B-32/laion2b_s34b_b79k and cpu
INFO:ModelsForStartup:completed loading models
INFO:marqo.connections:Took 1.361ms to connect to redis and load scripts.
loading for: model_name=hf/e5-base-v2 and properties={'name': 'intfloat/e5-base-v2', 'dimensions': 768, 'tokens': 512, 'type': 'hf', 'model_size': 0.438, 'notes': ''}
loading for: model_name=open_clip/ViT-B-32/laion2b_s34b_b79k and properties={'name': 'open_clip/ViT-B-32/laion2b_s34b_b79k', 'dimensions': 512, 'note': 'open_clip models', 'type': 'open_clip', 'pretrained': 'laion2b_s34b_b79k'}
###########################################################
###########################################################
###### !!COMPLETED SUCCESSFULLY!!! ################
###########################################################
###########################################################
Version: 2.4.1
Could you help me? What am I missing to get Marqo working with my NVIDIA GPU?
Hi @owen-elliott
Thank you for your answer.
I thought the `--gpus` command-line flag was replaced by the `deploy:` instruction in Compose.
Now the Marqo container seems to have GPU support enabled, but it still does not use the GPU.
The strange behavior is that running `nvidia-smi` inside the Marqo container reports a missing library:
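For reference, this is the standard Compose equivalent of `--gpus all` that I mean (image tag and port taken from my setup; a sketch, your service definition may differ):

```yaml
# docker-compose.yml — GPU reservation via the deploy: instruction
services:
  marqo:
    image: marqoai/marqo:2.4.1
    ports:
      - "8882:8882"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

Note that with `docker compose` (not Swarm), the `deploy.resources.reservations.devices` section is honored and should be equivalent to passing `--gpus all` on the command line.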
[root@62e78f022f6e app]# nvidia-smi
NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
Please also try adding directory that contains libnvidia-ml.so to your system PATH.
Full log at startup:
External vector store not configured. Using local vector store
Waiting for vector store to start
Waiting... 1 second
Waiting... 2 seconds
Waiting... 3 seconds
Waiting... 4 seconds
Waiting... 5 seconds
Marqo did not find an existing vector store. Setting up vector store...
Vector store is available. Vector store setup complete
Starting Marqo throttling
Called Marqo throttling start command
Marqo throttling is now running
/usr/local/lib/python3.8/site-packages/transformers/generation_utils.py:24: FutureWarning: Importing `GenerationMixin` from `src/transformers/generation_utils.py` is deprecated and will be removed in Transformers v5. Import as `from transformers import GenerationMixin` instead.
warnings.warn(
INFO:ModelsForStartup:pre-loading ['hf/e5-base-v2', 'open_clip/ViT-B-32/laion2b_s34b_b79k'] onto devices=['cpu']
INFO:marqo.tensor_search.index_meta_cache:Starting index cache refresh thread
###########################################################
###########################################################
###### STARTING DOWNLOAD OF MARQO ARTEFACTS################
###########################################################
###########################################################
INFO:CUDA device summary:Found devices [{'id': -1, 'name': ['cpu']}]
INFO:SetBestAvailableDevice:Best available device set to: cpu
INFO:marqo.tensor_search.index_meta_cache:Last index cache refresh at 1713339198.597228
INFO:marqo.s2_inference.s2_inference:loaded hf/e5-base-v2 on device cpu with normalization=True at time=2024-04-17 07:33:18.484142.
INFO:ModelsForStartup:hf/e5-base-v2 cpu run succesfully!
INFO:marqo.s2_inference.s2_inference:loaded open_clip/ViT-B-32/laion2b_s34b_b79k on device cpu with normalization=True at time=2024-04-17 07:33:28.488329.
INFO:ModelsForStartup:open_clip/ViT-B-32/laion2b_s34b_b79k cpu run succesfully!
INFO:ModelsForStartup:0.024351859092712404 for hf/e5-base-v2 and cpu
INFO:ModelsForStartup:0.08258800506591797 for open_clip/ViT-B-32/laion2b_s34b_b79k and cpu
INFO:ModelsForStartup:completed loading models
INFO:marqo.connections:Took 0.991ms to connect to redis and load scripts.
loading for: model_name=hf/e5-base-v2 and properties={'name': 'intfloat/e5-base-v2', 'dimensions': 768, 'tokens': 512, 'type': 'hf', 'model_size': 0.438, 'notes': ''}
loading for: model_name=open_clip/ViT-B-32/laion2b_s34b_b79k and properties={'name': 'open_clip/ViT-B-32/laion2b_s34b_b79k', 'dimensions': 512, 'note': 'open_clip models', 'type': 'open_clip', 'pretrained': 'laion2b_s34b_b79k'}
###########################################################
###########################################################
###### !!COMPLETED SUCCESSFULLY!!! ################
###########################################################
###########################################################
Version: 2.4.1
[Marqo ASCII art banner: "WELCOME TO MARQO / Tensor search for humans"]
INFO: Started server process [1549]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8882 (Press CTRL+C to quit)
Interesting, this is not an issue that I have come across before. What OS are you using? You said it worked in other containers, is that also using --gpus all?
I did some searching and this might be an issue outside of Marqo (maybe tied to a driver version), perhaps something in this issue might help? Let me know how it goes.
My host is Debian 12.
The Plex container, for example, doesn't use `--gpus all`, and I also tried `docker run --rm --gpus=all nvidia/cuda:11.4.3-base-ubuntu20.04 nvidia-smi`. In both cases the GPU works fine.
The fix mentioned in the issue you linked is already implemented on my system: `ldconfig = "/sbin/ldconfig"`.
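For anyone checking the same thing, the setting in question lives in the NVIDIA container runtime config (the path below is the usual default location; verify it on your distribution):

```shell
# Show the ldconfig setting used by the NVIDIA container runtime.
# The known fix is removing the leading '@' so the host ldconfig is run:
#   ldconfig = "@/sbin/ldconfig"   ->   ldconfig = "/sbin/ldconfig"
grep ldconfig /etc/nvidia-container-runtime/config.toml
```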
I found a workaround with Marqo: right at container startup I connect to it and run `ldconfig` inside the container. After that, `nvidia-smi` inside the container works and Marqo succeeds in using my GPU.
I'm not sure if the Marqo container should run it automatically or if it should be managed by the NVIDIA container runtime.
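For reference, the manual workaround can be scripted from the host so it doesn't require an interactive shell (assuming the container is named `marqo`):

```shell
# Refresh the dynamic linker cache inside the running container so that
# libnvidia-ml.so (injected by the NVIDIA runtime) becomes resolvable.
docker exec marqo ldconfig

# nvidia-smi inside the container should now find the library and list the GPU.
docker exec marqo nvidia-smi
```

Running this once shortly after `docker compose up` was enough for Marqo to pick the GPU as the best available device.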