
Using llama-cpp-python #485

Open
knc6 opened this issue May 17, 2024 · 7 comments

Comments


knc6 commented May 17, 2024

Hi,

Thanks for creating this wonderful package!
The save_to_gguf step currently fails because the llama.cpp installation seems to be broken.
Could something like llama-cpp-python be used instead?

danielhanchen (Contributor) commented

Is this via Colab, Kaggle, or a local machine?

knc6 (Author) commented May 18, 2024

I am using a local machine. Actually, I was able to save the model using save_pretrained_gguf; it produced a .Q8_0.gguf file. Could you tell me how to load it for inference on CPU/GPU?


erwe324 commented May 25, 2024

@knc6 Can you please elaborate on the steps you took to save the model after getting the broken llama.cpp error? I constantly face this error on my local machine but never on Colab.

As for the GGUF file, you can use any of the many programs available. The simplest is to use llama.cpp directly; most of the other CLI/GUI front ends also use it as the backend.
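
Since the issue title asks about llama-cpp-python: it can also load the .Q8_0.gguf file directly. A minimal sketch, assuming llama-cpp-python is installed (and built with GPU support if offloading is wanted); the model path here is hypothetical:

from llama_cpp import Llama

# Hypothetical path to the file produced by save_pretrained_gguf
llm = Llama(
    model_path="model.Q8_0.gguf",
    n_ctx=2048,       # context window
    n_gpu_layers=-1,  # -1 offloads all layers to the GPU; 0 keeps everything on the CPU
)

out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])

With the default wheel, n_gpu_layers is ignored and inference runs on the CPU; GPU offload requires installing llama-cpp-python with the CUDA (or Metal) backend enabled.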

danielhanchen (Contributor) commented

@erwe324 Oh no :( Still GGUF issues? :(


erwe324 commented May 26, 2024

@danielhanchen Please note that in my case this behaviour has persisted for the last two months, and there is some mystery to it: no matter what I do on my local machine, I am unable to run llama.cpp to create a GGUF.

However, the same code runs without any issues or modifications on Google Colab. I am preoccupied at the moment, but I will soon try to identify the root cause. Most probably it is due to some dependency of llama.cpp.

knc6 (Author) commented May 28, 2024

@erwe324 Same issue: the script works on Colab but not on a local machine. Perhaps a conda package/environment, Nix derivation, or Docker container for unsloth would be useful.

danielhanchen (Contributor) commented

Yes, a Docker env would be useful. In the meantime, how about doing a manual build?

# Clone llama.cpp with its submodules and build it from a clean tree
git clone --recursive https://github.com/ggerganov/llama.cpp
make clean -C llama.cpp
make all -j -C llama.cpp

# Python dependencies needed by the conversion script
pip install gguf protobuf

# Convert the Hugging Face model folder to a GGUF file at f16 precision
python llama.cpp/convert-hf-to-gguf.py FOLDER --outfile OUTPUT --outtype f16
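
If the conversion succeeds, a quick way to sanity-check the resulting file from Python (a sketch assuming llama-cpp-python is installed; OUTPUT stands for whatever path was passed to --outfile above):

from llama_cpp import Llama

# OUTPUT is the GGUF path from the convert step above
llm = Llama(model_path="OUTPUT", n_ctx=512)
print(llm("Hello", max_tokens=8)["choices"][0]["text"])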
