Multi-GPU fine-tuning error #695
Comments
Are you using zero3? If so, please update transformers and bitsandbytes to the latest versions, with xtuner==0.1.19. If you don't want to update, use zero1 or zero2 instead.
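The two remedies above could be applied roughly as follows (a sketch: the `--deepspeed` flag and `NPROC_PER_NODE` variable follow xtuner's documented CLI, and `CONFIG` is a placeholder for your fine-tuning config):

```shell
# Option 1: update to versions compatible with zero3
pip install -U xtuner==0.1.19 transformers bitsandbytes

# Option 2: keep current versions and switch to zero2
# (CONFIG is a placeholder for your fine-tuning config)
NPROC_PER_NODE=8 xtuner train CONFIG --deepspeed deepspeed_zero2
```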
Thanks for the reply! zero1 and zero2 work now, but 8x V100 hits OOM, while the same fine-tuning config runs fine on a single 4090. What could be the reason?
Probably because flash attention is not available on V100, so the longer the sequence, the larger the memory gap versus the 4090. You can try zero3 + qlora to reduce memory usage; otherwise the LLM itself is not sharded, and every GPU holds its own copy of the 4-bit LLM weights.
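The zero3 + qlora combination suggested here could be launched roughly like this (a sketch under the same assumptions as above; `CONFIG` stands for a qlora fine-tuning config such as those shipped with xtuner):

```shell
# zero3 shards the model states across all 8 GPUs,
# instead of replicating the 4-bit LLM on each card as zero1/zero2 do
NPROC_PER_NODE=8 xtuner train CONFIG --deepspeed deepspeed_zero3
```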
zero3 is running successfully now, thank you very much!
Is qlora compatible with zero3 now? I got mine running with lora + zero3.
Single-GPU qlora fine-tuning starts normally; multi-GPU launch errors out: