I’m trying to learn more about LLMs, but I haven’t found any explanation for what determines which prompt template format a model requires.

For example, meta-llama’s Llama-2 requires this format:

…INST and <<SYS>> tags, BOS and EOS tokens…
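If I’ve read the model card right, that works out to something like this (a rough Python sketch of my understanding, not anything official):

```python
# My understanding of the Llama-2 chat format from the model card;
# the BOS (<s>) and EOS (</s>) tokens are added by the tokenizer,
# so only the INST / SYS tags appear in the prompt string itself.
def llama2_prompt(system: str, user: str) -> str:
    return (
        f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
        f"{user} [/INST]"
    )

print(llama2_prompt("You are a helpful assistant.", "Hello!"))
```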

But if I instead download TheBloke’s version of Llama-2, the prompt template is supposed to be:

SYSTEM: …

USER: {prompt}

ASSISTANT:
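In other words, if I’m reading TheBloke’s model card right, I’d be building the prompt like this (a sketch of that Vicuna-style template, no special tags or tokens at all):

```python
# The Vicuna-style template shown on TheBloke's model cards, as I read them:
# plain-text role labels instead of the INST / SYS tags above.
def vicuna_prompt(system: str, user: str) -> str:
    return f"SYSTEM: {system}\n\nUSER: {user}\n\nASSISTANT:"

print(vicuna_prompt("You are a helpful assistant.", "Hello!"))
```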

I thought the template would be determined by how the original fine-tuning data was formatted, but as far as I know TheBloke only converted the llama-2 models from one file format to another. Looking at the documentation for the GGML format, I don’t see anything about a prompt template being embedded in the model file either.

Could anyone who understands this stuff point me in the right direction?