I took Meta's generation.py and modified it to print the raw prompt text just before it is fed to the tokenizer, in order to recover the current prompt template.
The result is:
[INST] <<SYS>> {your_system_message} <</SYS>> {user_message_1} [/INST] {model_reply_1} [INST] {user_message_2} [/INST]
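For reference, here is a minimal sketch (not Meta's actual code) of how a multi-turn dialog could be flattened into that template; the function and variable names are my own, and note that the real generation.py additionally inserts BOS/EOS special tokens around each turn at tokenization time, which do not show up in the raw text above.

```python
def build_prompt(system_message, turns):
    """Assemble a raw Llama 2 chat prompt following the template above.

    turns: list of (user_message, model_reply) tuples; model_reply is
    None for the final turn the model is expected to answer.
    Illustrative only -- not taken from generation.py.
    """
    pieces = []
    for i, (user_message, model_reply) in enumerate(turns):
        if i == 0:
            # The system message is folded into the first user turn.
            user_message = f"<<SYS>> {system_message} <</SYS>> {user_message}"
        pieces.append(f"[INST] {user_message} [/INST]")
        if model_reply is not None:
            pieces.append(model_reply)
    return " ".join(pieces)


print(build_prompt(
    "You are a helpful assistant.",
    [("Hi!", "Hello! How can I help?"), ("Tell me a joke.", None)],
))
```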