
Convert directly from llama3 #4268

Open — wants to merge 8 commits into mxyng/fix-quantize from pdevine/llama3
Conversation

@pdevine (Contributor) commented May 8, 2024

This change allows you to convert directly from a llama3-derived safetensors model into Ollama.

It is currently missing:

  • PyTorch support: conversion almost works, however the embeddings layer size is off by the eos/bos tokens.

This will work with some llama3 derivatives if they are using safetensors, including dolphin-2.9-llama3.
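For context, the intended workflow looks roughly like this (a sketch only; the directory path and model name below are placeholders, not taken from this PR):

```
# Modelfile pointing at a local directory of safetensors weights
FROM /path/to/dolphin-2.9-llama3

# build and run the converted model
ollama create dolphin-llama3 -f Modelfile
ollama run dolphin-llama3
```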

@mxyng force-pushed the pdevine/llama3 branch 8 times, most recently from 9b83ecb to 27588a7 on May 16, 2024 at 23:53
@mxyng changed the base branch from main to mxyng/cache-intermediate-layers on May 17, 2024 at 18:38
@mxyng force-pushed the mxyng/cache-intermediate-layers branch from 39efb30 to 8d807d7 on May 17, 2024 at 18:38
@mxyng force-pushed the mxyng/cache-intermediate-layers branch from 8d807d7 to 0aba2d5 on May 17, 2024 at 18:40
@mxyng changed the base branch from mxyng/cache-intermediate-layers to mxyng/fix-quantize on May 17, 2024 at 18:48
@mxyng marked this pull request as ready for review on May 18, 2024 at 07:13
@mxyng (Contributor) commented May 18, 2024

Updated the safetensors and pytorch conversion interfaces to take F32, F16, and BF16 inputs. This lets the change convert llama3 derivatives such as NVIDIA's ChatQA and NousResearch's Hermes 2 Pro.
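Not the code from this PR, but as an illustration of what accepting mixed-precision inputs involves, here is a minimal Go sketch that normalizes raw tensor bytes to float32 based on the dtype string in a safetensors/pytorch header. Function names are hypothetical and the F16 path is omitted for brevity; the key detail is that BF16 is just the upper 16 bits of an IEEE-754 float32.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// bfloat16ToFloat32 widens a bfloat16 value to float32; bfloat16 is the
// upper 16 bits of an IEEE-754 float32, so widening is a single shift.
func bfloat16ToFloat32(b uint16) float32 {
	return math.Float32frombits(uint32(b) << 16)
}

// tensorToFloat32 decodes a raw little-endian tensor buffer into float32
// values based on the dtype reported by the model header. F16 would need a
// full half-precision decode and is left out of this sketch.
func tensorToFloat32(dtype string, raw []byte) ([]float32, error) {
	switch dtype {
	case "F32":
		out := make([]float32, len(raw)/4)
		for i := range out {
			out[i] = math.Float32frombits(binary.LittleEndian.Uint32(raw[i*4:]))
		}
		return out, nil
	case "BF16":
		out := make([]float32, len(raw)/2)
		for i := range out {
			out[i] = bfloat16ToFloat32(binary.LittleEndian.Uint16(raw[i*2:]))
		}
		return out, nil
	default:
		return nil, fmt.Errorf("unsupported tensor dtype %q", dtype)
	}
}

func main() {
	// One BF16 value: 0x3F80 widens to 0x3F800000, i.e. 1.0.
	vals, _ := tensorToFloat32("BF16", []byte{0x80, 0x3F})
	fmt.Println(vals) // [1]
}
```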

Labels: none yet
Projects: none yet
Linked issues that merging may close: none yet
2 participants