
Possible bug in the 'deepseek-coder' chat template's system message #7385

Open

jukofyork opened this issue May 19, 2024 · 1 comment

@jukofyork
Contributor

I noticed this when adding the 'alpaca' chat template (which is very similar to the 'deepseek-coder' chat template):

#7383 (comment)

----- deepseek-ai/deepseek-coder-33b-instruct -----
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
<|begin▁of▁sentence|>You are a helpful assistant### Instruction:
Hello
### Response:
Hi there
<|EOT|>
### Instruction:
Who are you
### Response:
   I am an assistant   
<|EOT|>
### Instruction:
Another question
### Response:

This doesn't look correct to me for deepseek-coder, as the default system message ends with a newline, like so:

----- deepseek-ai/deepseek-coder-33b-instruct -----
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
<|begin▁of▁sentence|>You are a helpful assistant
### Instruction:
Hello
### Response:
Hi there
<|EOT|>
### Instruction:
Who are you
### Response:
   I am an assistant   
<|EOT|>
### Instruction:
Another question
### Response:

Perhaps the Python script used to generate these reference outputs should detect this and move the newline from the default system message into the template itself?
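
For reference, a minimal sketch of how the reference output above can be reproduced with transformers (the conversation history is inferred from the printed text, so treat it as an assumption rather than the exact test script):

from transformers import AutoTokenizer

# Conversation inferred from the output above (an assumption, not the exact test script).
history = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there"},
    {"role": "user", "content": "Who are you"},
    {"role": "assistant", "content": "   I am an assistant   "},
    {"role": "user", "content": "Another question"},
]

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-33b-instruct")

# The HF chat template keeps the newline after the system message,
# which is the formatting the llama.cpp template would need to match.
print(tokenizer.apply_chat_template(history, tokenize=False, add_generation_prompt=True))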

@jukofyork
Contributor Author

There may be another bug in this:

output = AutoTokenizer.from_pretrained(VARIANTS_TO_TEST[0]).apply_chat_template(history, tokenize=False, add_generation_prompt=True, chat_template=GEMMA_TMLP)

Should this really be VARIANTS_TO_TEST[0], which now picks up the 'teknium/OpenHermes-2.5-Mistral-7B' data, or has the vector been rearranged at some point so that 'google/gemma-7b-it' was originally at the start?
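
If the gemma data is what was intended, a sketch of a less position-dependent version (assuming 'google/gemma-7b-it' is still present in VARIANTS_TO_TEST) would be to look the model up by name rather than by index:

# Sketch only: assumes 'google/gemma-7b-it' is still listed in VARIANTS_TO_TEST.
gemma_id = "google/gemma-7b-it"
assert gemma_id in VARIANTS_TO_TEST
output = AutoTokenizer.from_pretrained(gemma_id).apply_chat_template(
    history, tokenize=False, add_generation_prompt=True, chat_template=GEMMA_TMLP)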
