
ValueError: Unknown quantization method: bitsandbytes. Must be one of ['awq', 'gptq', 'squeezellm', 'marlin']. #482

Open
manliu1225 opened this issue May 17, 2024 · 1 comment


@manliu1225

When I try to run inference on the fine-tuned model with vLLM, I get this error.
I have already saved the Unsloth fine-tuned model in Hugging Face format.
vLLM==0.4.0+cu118
unsloth==2024.5
transformers==4.40.2
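
For context, a minimal sketch of the kind of vLLM call that raises this error (the checkpoint path below is a placeholder, not my actual repo):

```python
# Minimal sketch of the failing load step; the model path is a placeholder.
from vllm import LLM

# The saved 4-bit Unsloth checkpoint carries a bitsandbytes
# quantization_config, which vLLM 0.4.0 does not support.
llm = LLM(model="my-username/unsloth-finetuned-4bit")
# ValueError: Unknown quantization method: bitsandbytes.
#   Must be one of ['awq', 'gptq', 'squeezellm', 'marlin'].
```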

@danielhanchen
Contributor

Oh you cannot use 4-bit (bitsandbytes) models with vLLM - vLLM 0.4.0 only supports the awq, gptq, squeezellm, and marlin quantization methods, so it rejects the bitsandbytes quantization_config in the saved checkpoint. You must merge and save the model to 16-bit with model.save_pretrained_merged, then load the merged checkpoint with vLLM.
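
A minimal sketch of that flow, assuming `model` and `tokenizer` are the objects from your Unsloth finetuning run; the output directory and prompt are placeholders:

```python
# 1) Merge the LoRA adapters into the base weights and save as 16-bit
#    using Unsloth's save_pretrained_merged with save_method="merged_16bit".
model.save_pretrained_merged(
    "merged_16bit_model",        # output directory (placeholder)
    tokenizer,
    save_method="merged_16bit",  # write full 16-bit weights, not 4-bit
)

# 2) Load the merged checkpoint with vLLM; no quantization flag is needed.
from vllm import LLM, SamplingParams

llm = LLM(model="merged_16bit_model")
outputs = llm.generate(
    ["Hello, how are you?"],     # placeholder prompt
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```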
