Describe the bug
(codellama) amardeep.yadav@fintricity.com@codellamamachine:~$ pip install "openllm[vllm]"
Requirement already satisfied: openllm[vllm] in ./miniconda3/envs/codellama/lib/python3.12/site-packages (0.4.44)
Requirement already satisfied: accelerate in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from openllm[vllm]) (0.29.3)
Requirement already satisfied: bentoml<1.2,>=1.1.11 in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from bentoml[io]<1.2,>=1.1.11->openllm[vllm]) (1.1.11)
Requirement already satisfied: bitsandbytes<0.42 in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from openllm[vllm]) (0.41.3.post2)
Requirement already satisfied: build<1 in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from build[virtualenv]<1->openllm[vllm]) (0.10.0)
Requirement already satisfied: click>=8.1.3 in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from openllm[vllm]) (8.1.7)
Requirement already satisfied: cuda-python in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from openllm[vllm]) (12.4.0)
Requirement already satisfied: einops in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from openllm[vllm]) (0.7.0)
Requirement already satisfied: ghapi in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from openllm[vllm]) (1.0.5)
Requirement already satisfied: openllm-client>=0.4.44 in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from openllm[vllm]) (0.4.44)
Requirement already satisfied: openllm-core>=0.4.44 in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from openllm[vllm]) (0.4.44)
Requirement already satisfied: optimum>=1.12.0 in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from openllm[vllm]) (1.19.1)
Requirement already satisfied: safetensors in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from openllm[vllm]) (0.4.3)
Requirement already satisfied: scipy in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from openllm[vllm]) (1.13.0)
Requirement already satisfied: sentencepiece in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from openllm[vllm]) (0.2.0)
Requirement already satisfied: transformers>=4.36.0 in ./miniconda3/envs/codellama/lib/python3.12/site-packages (from transformers[tokenizers,torch]>=4.36.0->openllm[vllm]) (4.40.1)
INFO: pip is looking at multiple versions of openllm[vllm] to determine which version is compatible with other requirements. This could take a while.
Collecting openllm[vllm]
Using cached openllm-0.4.43-py3-none-any.whl.metadata (62 kB)
Using cached openllm-0.4.42-py3-none-any.whl.metadata (62 kB)
Using cached openllm-0.4.41-py3-none-any.whl.metadata (62 kB)
Using cached openllm-0.4.40-py3-none-any.whl.metadata (62 kB)
Using cached openllm-0.4.39-py3-none-any.whl.metadata (62 kB)
Collecting megablocks (from openllm[vllm])
Using cached megablocks-0.5.1.tar.gz (49 kB)
Preparing metadata (setup.py) ... done
Collecting openllm[vllm]
Using cached openllm-0.4.38-py3-none-any.whl.metadata (62 kB)
Using cached openllm-0.4.37-py3-none-any.whl.metadata (62 kB)
INFO: pip is still looking at multiple versions of openllm[vllm] to determine which version is compatible with other requirements. This could take a while.
Using cached openllm-0.4.36-py3-none-any.whl.metadata (60 kB)
Using cached openllm-0.4.35-py3-none-any.whl.metadata (60 kB)
Collecting vllm>=0.2.2 (from openllm[vllm])
Using cached vllm-0.3.3.tar.gz (315 kB)
Installing build dependencies ... error
error: subprocess-exited-with-error
× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> [8 lines of output]
Collecting ninja
Using cached ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl.metadata (5.3 kB)
Collecting packaging
Using cached packaging-24.0-py3-none-any.whl.metadata (3.2 kB)
Collecting setuptools>=49.4.0
Using cached setuptools-69.5.1-py3-none-any.whl.metadata (6.2 kB)
ERROR: Could not find a version that satisfies the requirement torch==2.1.2 (from versions: 2.2.0, 2.2.1, 2.2.2, 2.3.0)
ERROR: No matching distribution found for torch==2.1.2
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
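For context on the failure above: pip falls back to building vLLM 0.3.3 from its source distribution, whose build requirements pin torch==2.1.2, and that torch release publishes no wheels for Python 3.12, which is why the resolver only offers 2.2.0 and newer here. A quick way to confirm this in the same environment might be the following (a sketch, not part of the original log):
# Show the interpreter and the torch versions pip can actually resolve in this env
python --version              # Python 3.12.2 in the codellama env
pip index versions torch      # on Python 3.12 this should list only 2.2.0 and newer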
To reproduce
Step 1: Create a normal OpenLLM setup in a conda environment.
Step 2: Run TRUST_REMOTE_CODE=True openllm start codellama/CodeLlama-34b-Instruct-hf --backend vllm (a concrete sketch of both steps is given below).
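A concrete version of the two steps, as I read "a normal setup" (assuming a fresh conda environment; the exact commands used in the original setup may differ):
# Step 1: fresh conda environment with OpenLLM plus the vLLM extra
conda create -n codellama python=3.12 -y
conda activate codellama
pip install "openllm[vllm]"
# Step 2: start the model with the vLLM backend
TRUST_REMOTE_CODE=True openllm start codellama/CodeLlama-34b-Instruct-hf --backend vllm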
The following error should then be visible:
(codellama) amardeep.yadav@fintricity.com@codellamamachine:~$ TRUST_REMOTE_CODE=True openllm start codellama/CodeLlama-34b-Instruct-hf --backend vllm
config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 588/588 [00:00<00:00, 7.36MB/s]
tokenizer_config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.59k/1.59k [00:00<00:00, 20.3MB/s]
tokenizer.model: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 500k/500k [00:00<00:00, 91.4MB/s]
tokenizer.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.84M/1.84M [00:00<00:00, 60.7MB/s]
special_tokens_map.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 411/411 [00:00<00:00, 5.58MB/s]
generation_config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 116/116 [00:00<00:00, 1.54MB/s]
model.safetensors.index.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 37.6k/37.6k [00:00<00:00, 116MB/s]
pytorch_model.bin.index.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 35.8k/35.8k [00:00<00:00, 207MB/s]
model-00007-of-00007.safetensors: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 9.19G/9.19G [00:50<00:00, 180MB/s]
model-00001-of-00007.safetensors: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 9.85G/9.85G [00:52<00:00, 188MB/s]
model-00002-of-00007.safetensors: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 9.69G/9.69G [00:52<00:00, 183MB/s]
model-00003-of-00007.safetensors: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 9.69G/9.69G [00:52<00:00, 183MB/s]
model-00006-of-00007.safetensors: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 9.69G/9.69G [00:53<00:00, 180MB/s]
model-00005-of-00007.safetensors: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 9.69G/9.69G [00:54<00:00, 179MB/s]
model-00004-of-00007.safetensors: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 9.69G/9.69G [00:54<00:00, 179MB/s]
Fetching 15 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:54<00:00, 3.63s/it]
🚀Tip: run 'openllm build codellama/CodeLlama-34b-Instruct-hf --backend vllm --serialization safetensors' to create a BentoLLM for 'codellama/CodeLlama-34b-Instruct-hf'
2024-04-25T18:34:00+0000 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "_service:svc" can be accessed at http://localhost:3000/metrics.
2024-04-25T18:34:01+0000 [INFO] [cli] Starting production HTTP BentoServer from "_service:svc" listening on http://0.0.0.0:3000 (Press CTRL+C to quit)
2024-04-25T18:34:04+0000 [ERROR] [runner:llm-llama-runner:1] An exception occurred while instantiating runner 'llm-llama-runner', see details below:
2024-04-25T18:34:04+0000 [ERROR] [runner:llm-llama-runner:1] Traceback (most recent call last):
File "/home/amardeep.yadav/miniconda3/envs/codellama/lib/python3.12/site-packages/bentoml/_internal/runner/runner.py", line 307, in init_local
self._set_handle(LocalRunnerRef)
File "/home/amardeep.yadav/miniconda3/envs/codellama/lib/python3.12/site-packages/bentoml/_internal/runner/runner.py", line 150, in _set_handle
runner_handle = handle_class(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/amardeep.yadav/miniconda3/envs/codellama/lib/python3.12/site-packages/bentoml/_internal/runner/runner_handle/local.py", line 27, in init
self._runnable = runner.runnable_class(**runner.runnable_init_params) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/amardeep.yadav/miniconda3/envs/codellama/lib/python3.12/site-packages/openllm/_runners.py", line 121, in init
raise openllm.exceptions.OpenLLMException('vLLM is not installed. Do
pip install "openllm[vllm]"
.')openllm_core.exceptions.OpenLLMException: vLLM is not installed. Do
pip install "openllm[vllm]"
.2024-04-25T18:34:04+0000 [ERROR] [runner:llm-llama-runner:1] Traceback (most recent call last):
File "/home/amardeep.yadav/miniconda3/envs/codellama/lib/python3.12/site-packages/starlette/routing.py", line 732, in lifespan
async with self.lifespan_context(app) as maybe_state:
File "/home/amardeep.yadav/miniconda3/envs/codellama/lib/python3.12/contextlib.py", line 210, in aenter
return await anext(self.gen)
^^^^^^^^^^^^^^^^^^^^^
File "/home/amardeep.yadav/miniconda3/envs/codellama/lib/python3.12/site-packages/bentoml/_internal/server/base_app.py", line 75, in lifespan
on_startup()
File "/home/amardeep.yadav/miniconda3/envs/codellama/lib/python3.12/site-packages/bentoml/_internal/runner/runner.py", line 317, in init_local
raise e
File "/home/amardeep.yadav/miniconda3/envs/codellama/lib/python3.12/site-packages/bentoml/_internal/runner/runner.py", line 307, in init_local
self._set_handle(LocalRunnerRef)
File "/home/amardeep.yadav/miniconda3/envs/codellama/lib/python3.12/site-packages/bentoml/_internal/runner/runner.py", line 150, in _set_handle
runner_handle = handle_class(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/amardeep.yadav/miniconda3/envs/codellama/lib/python3.12/site-packages/bentoml/_internal/runner/runner_handle/local.py", line 27, in init
self._runnable = runner.runnable_class(**runner.runnable_init_params) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/amardeep.yadav/miniconda3/envs/codellama/lib/python3.12/site-packages/openllm/_runners.py", line 121, in init
raise openllm.exceptions.OpenLLMException('vLLM is not installed. Do
pip install "openllm[vllm]"
.')openllm_core.exceptions.OpenLLMException: vLLM is not installed. Do
pip install "openllm[vllm]"
.2024-04-25T18:34:04+0000 [ERROR] [runner:llm-llama-runner:1] Application startup failed. Exiting.
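The runtime error above is only the downstream symptom of the failed build: OpenLLM reports vLLM as missing because pip never managed to install it. One possible workaround, purely as an untested sketch and assuming the only blocker is the torch==2.1.2 build pin versus Python 3.12, is to recreate the environment on Python 3.11, for which torch 2.1.2 wheels do exist (the environment name below is arbitrary):
# Rebuild the environment on Python 3.11 so vllm's torch==2.1.2 build pin can resolve
conda create -n codellama-py311 python=3.11 -y
conda activate codellama-py311
pip install "openllm[vllm]"
TRUST_REMOTE_CODE=True openllm start codellama/CodeLlama-34b-Instruct-hf --backend vllm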
Logs
Environment
Environment variable
System information
bentoml: 1.1.11
python: 3.12.2
platform: Linux-5.15.0-1050-azure-x86_64-with-glibc2.31
uid_gid: 14830125:14830125
conda: 24.3.0
in_conda_env: True
conda_packages
pip_packages
System information (Optional)
No response