
Are embeddings not supported with the mistral-7b-instruct-v0.2 model? #414

Closed
norteo opened this issue May 12, 2024 · 4 comments

norteo commented May 12, 2024

I run llamafile with the Mistral model as follows:

./mistral-7b-instruct-v0.2.Q5_K_M.llamafile -ngl 9999 --port 8080 --host 0.0.0.0 --embedding --threads 16

I don't have a GPU.

If I run

curl http://localhost:8080/embedding \
        -H "Authorization: Bearer no-key" \
        -H "Content-Type: application/json" \
        -d '{ "content": "The food was delicious and the waiter..." }'

llamafile "crashes" with the message:

{"function":"launch_slot_with_data","level":"INFO","line":875,"msg":"slot is processing task","slot_id":0,"task_id":0,"tid":"9434528","timestamp":1715505285}
{"function":"update_slots","level":"INFO","line":1890,"msg":"kv cache rm [p0, end)","p0":0,"slot_id":0,"task_id":0,"tid":"9434528","timestamp":1715505285}
llama_get_embeddings_ith: invalid embeddings id 0, reason: batch.logits[0] != true
GGML_ASSERT: llama.cpp/llama.cpp:16631: false

mofosyne added the bug label May 21, 2024
lovenemesis commented

If you don't have a GPU, you probably don't need to pass -ngl 9999.
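
For context, -ngl sets how many model layers are offloaded to the GPU, so on a CPU-only machine it can simply be dropped. A CPU-only variant of the original invocation (a sketch, otherwise the same flags):

./mistral-7b-instruct-v0.2.Q5_K_M.llamafile --port 8080 --host 0.0.0.0 --embedding --threads 16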

k8si (Collaborator) commented May 28, 2024

This should only be an issue in older versions of llamafile. Which version of llamafile was this binary built with? To find out, you can run

./mistral-7b-instruct-v0.2.Q5_K_M.llamafile --version
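
A version-aware check can then gate on the first release this thread confirms working (a shell sketch: v0.8.5 is the version norteo reports working below, the true minimum fixed version isn't confirmed here, and sort -V needs GNU coreutils):

ver=$(./mistral-7b-instruct-v0.2.Q5_K_M.llamafile --version | grep -o 'v[0-9][0-9.]*' | tr -d v)
if [ "$(printf '%s\n' "$ver" 0.8.5 | sort -V | head -n1)" != "0.8.5" ]; then
    echo "llamafile $ver may be affected by this bug; consider updating"
fi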

norteo (Author) commented May 28, 2024

Thank you for the reply.

user@fe9e8ccdc306:~$ ./mistral-7b-instruct-v0.2.Q5_K_M.llamafile --version
llamafile v0.8.0

It seems I was not using the latest version.
I redownloaded the file, re-ran the curl command, and it now works fine.
The version I am using now is 0.8.5.
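
A quick way to confirm the endpoint now returns a vector (a sketch; it assumes the response is JSON with an "embedding" array, which is the usual llama.cpp server shape and not something confirmed in this thread):

curl -s http://localhost:8080/embedding \
        -H "Authorization: Bearer no-key" \
        -H "Content-Type: application/json" \
        -d '{ "content": "The food was delicious and the waiter..." }' \
    | python3 -c 'import json, sys; print(len(json.load(sys.stdin)["embedding"]), "dimensions")'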
I just downloaded it from https://huggingface.co/Mozilla/Mistral-7B-Instruct-v0.2-llamafile/resolve/main/mistral-7b-instruct-v0.2.Q5_K_M.llamafile?download=true
The Hugging Face repository does not seem to have the latest version, and there appears to be no way to tell which llamafile version you are downloading.
Maybe something should be done about that?
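
As a workaround until that is addressed, the embedded version can be verified right after downloading (a sketch using the URL above):

curl -L -o mistral-7b-instruct-v0.2.Q5_K_M.llamafile \
    'https://huggingface.co/Mozilla/Mistral-7B-Instruct-v0.2-llamafile/resolve/main/mistral-7b-instruct-v0.2.Q5_K_M.llamafile?download=true'
chmod +x mistral-7b-instruct-v0.2.Q5_K_M.llamafile
./mistral-7b-instruct-v0.2.Q5_K_M.llamafile --version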

k8si (Collaborator) commented May 28, 2024

> The Hugging Face repository does not seem to have the latest version, and there appears to be no way to tell which llamafile version you are downloading.
> Maybe something should be done about that?

Would you be willing to post this as a separate issue?

k8si closed this as completed May 28, 2024