
Using llama-cpp-python #485

Open
knc6 opened this issue May 17, 2024 · 7 comments

Comments


knc6 commented May 17, 2024

Hi,

Thanks for creating this wonderful package!
The save_to_gguf step currently fails because the llama.cpp installation seems to be broken.
Could something like llama-cpp-python be used instead?

danielhanchen (Contributor) commented

Is this via Colab, Kaggle, or a local machine?

knc6 (Author) commented May 18, 2024

I am using a local machine. Actually, I was able to save the model using save_pretrained_gguf; it produced a .Q8_0.gguf file. Could you tell me how to load it for inference on CPU/GPU?


erwe324 commented May 25, 2024

@knc6 Can you please elaborate on the steps you took to save the model after getting the broken llama.cpp error? I constantly face this error on my local machine but never on Colab.

As for the GGUF file, you can use any of the many programs available. The simplest is to use llama.cpp directly; most of the other CLI/GUI front ends also use it as the backend.
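
Since the issue title asks about llama-cpp-python: it can also load the .Q8_0.gguf file directly. A minimal sketch, assuming llama-cpp-python is installed (and built with GPU support if offloading is wanted); the model path here is hypothetical:

from llama_cpp import Llama

# Hypothetical path to the file produced by save_pretrained_gguf
llm = Llama(
    model_path="model.Q8_0.gguf",
    n_ctx=2048,       # context window
    n_gpu_layers=-1,  # -1 offloads all layers to the GPU; 0 keeps everything on the CPU
)

out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])

With the default wheel, n_gpu_layers is ignored and inference runs on the CPU; GPU offload requires installing llama-cpp-python with the CUDA (or Metal) backend enabled.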

danielhanchen (Contributor) commented

@erwe324 Oh no :( Still GGUF issues? :(


erwe324 commented May 26, 2024

@danielhanchen Please note that in my case this behaviour has persisted for the last two months, and there is some mystery to it: no matter what I do on my local machine, I am unable to run llama.cpp to create a GGUF.

However, the same code runs without any issues or modifications on Google Colab. I am preoccupied at the moment, but I will soon try to identify the root cause. Most probably it is due to some dependency of llama.cpp.

knc6 (Author) commented May 28, 2024

@erwe324 Same issue: the script works on Colab but not on a local machine. Perhaps a conda package/environment, Nix derivation, or Docker container for unsloth would be useful.

danielhanchen (Contributor) commented

Yes, a Docker env would be useful. In the meantime, how about doing a manual build?

# Clone llama.cpp with its submodules and build it from a clean tree
git clone --recursive https://github.com/ggerganov/llama.cpp
make clean -C llama.cpp
make all -j -C llama.cpp

# Python dependencies needed by the conversion script
pip install gguf protobuf

# Convert the Hugging Face model folder to a GGUF file at f16 precision
python llama.cpp/convert-hf-to-gguf.py FOLDER --outfile OUTPUT --outtype f16
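
If the conversion succeeds, a quick way to sanity-check the resulting file from Python (a sketch assuming llama-cpp-python is installed; OUTPUT stands for whatever path was passed to --outfile above):

from llama_cpp import Llama

# OUTPUT is the GGUF path from the convert step above
llm = Llama(model_path="OUTPUT", n_ctx=512)
print(llm("Hello", max_tokens=8)["choices"][0]["text"])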
