
Yeah, that's basically it. Many models these days are specifically trained for tool calling though so the system prompt doesn't need to spend much effort reminding them how to do it.

You can see the prompts that make this work for gpt-oss in the chat template in their Hugging Face repo: https://huggingface.co/openai/gpt-oss-120b/blob/main/chat_te... - including this bit:

    {%- macro render_tool_namespace(namespace_name, tools) -%}
        {{- "## " + namespace_name + "\n\n" }}
        {{- "namespace " + namespace_name + " {\n\n" }}
        {%- for tool in tools %}
            {%- set tool = tool.function %}
            {{- "// " + tool.description + "\n" }}
            {{- "type "+ tool.name + " = " }}
            {%- if tool.parameters and tool.parameters.properties %}
                {{- "(_: {\n" }}
                {%- for param_name, param_spec in tool.parameters.properties.items() %}
                    {%- if param_spec.description %}
                        {{- "// " + param_spec.description + "\n" }}
                    {%- endif %}
                    {{- param_name }}
    ...
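If you want to see what that macro actually produces, here's a rough, untested sketch of rendering the template yourself with transformers' apply_chat_template - the get_weather tool definition is just made up for illustration:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-120b")

    # An example tool in the JSON-schema style the template loops over
    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }]

    messages = [{"role": "user", "content": "What's the weather in Paris?"}]

    # tokenize=False returns the rendered prompt text, so you can inspect
    # the "namespace functions { ... }" block the macro above generates
    prompt = tokenizer.apply_chat_template(
        messages, tools=tools, tokenize=False, add_generation_prompt=True
    )
    print(prompt)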
As for how LLMs know when to stop... they have special tokens for that. "eos_token_id" stands for End of Sequence - here's the gpt-oss config for that: https://huggingface.co/openai/gpt-oss-120b/blob/main/generat...

    {
      "bos_token_id": 199998,
      "do_sample": true,
      "eos_token_id": [
        200002,
        199999,
        200012
      ],
      "pad_token_id": 199999,
      "transformers_version": "4.55.0.dev0"
    }
The model is trained to output one of those three tokens when it's "done".
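To make that concrete, here's a toy sketch of the decoding loop idea - sample_next_token is a hypothetical helper, the point is just the stop check against those IDs:

    # Illustrative only: how a decoding loop can use eos_token_id to stop
    EOS_TOKEN_IDS = {200002, 199999, 200012}  # from the generation config above

    def generate(model, input_ids, max_new_tokens=512):
        output = list(input_ids)
        for _ in range(max_new_tokens):
            next_token = sample_next_token(model, output)  # hypothetical helper
            output.append(next_token)
            if next_token in EOS_TOKEN_IDS:
                break  # the model emitted one of its "done" tokens
        return output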

https://cookbook.openai.com/articles/openai-harmony#special-... defines some of those tokens:

200002 = <|return|> - you should stop inference

200012 = <|call|> - "Indicates the model wants to call a tool."

I think that 199999 is a legacy EOS token ID that's included for backwards compatibility? Not sure.
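In an agent harness the interesting part is which of those tokens you actually hit. Here's a hand-wavy sketch of how a loop might branch on it - parse_tool_call, execute_tool and append_tool_result are hypothetical helpers, and generate is the toy loop above, not OpenAI's actual code:

    RETURN_TOKEN = 200002  # <|return|>: the reply is complete, stop inference
    CALL_TOKEN = 200012    # <|call|>: the model wants to call a tool

    def run_turn(model, conversation):
        while True:
            tokens = generate(model, conversation)        # toy loop from above
            if tokens[-1] == CALL_TOKEN:
                tool_call = parse_tool_call(tokens)       # hypothetical parser
                result = execute_tool(tool_call)          # run the tool locally
                conversation = append_tool_result(conversation, result)
                continue                                  # let the model keep going
            return tokens                                 # <|return|> (or other EOS): done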



Thank you Simon! This information about how language models handle tools under the hood is invaluable. Glad we have gpt-oss as a clear example of how a model understands and performs tool calls.



