Output after I exit the llama-server command:
llama_memory_breakdown_print: | memory breakdown [MiB] | total  free  self  model  context  compute  unaccounted |
llama_memory_breakdown_print: | - MTL0 (Apple M3 Pro)  | 28753 = 14607 + (14145 = 6262 + 4553 + 3329) + 0 |
llama_memory_breakdown_print: | - Host                 | 2779 = 666 + 0 + 2112 |
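Reading the device row, the layout appears to be: total = free + (self = model + context + compute) + unaccounted, all in MiB. A minimal sketch of pulling those fields out of a log line, assuming this format (the exact column layout may differ across llama.cpp versions, and the regex here is my own, not part of llama.cpp):

```python
import re

# Hypothetical parser for a device row of llama.cpp's memory-breakdown log.
# Device rows read: total = free + (self = model + context + compute) + unaccounted.
LINE_RE = re.compile(
    r"-\s+(?P<name>[^|]+?)\s*\|\s*"
    r"(?P<total>\d+)\s*=\s*(?P<free>\d+)\s*\+\s*"
    r"\((?P<self>\d+)\s*=\s*(?P<model>\d+)\s*\+\s*(?P<context>\d+)\s*\+\s*(?P<compute>\d+)\)\s*"
    r"\+\s*(?P<unaccounted>\d+)"
)

def parse_breakdown(line: str):
    """Return a dict of MiB fields for a device row, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    name = fields.pop("name")
    return {"name": name, **{k: int(v) for k, v in fields.items()}}

row = parse_breakdown(
    "llama_memory_breakdown_print: | - MTL0 (Apple M3 Pro) "
    "| 28753 = 14607 + (14145 = 6262 + 4553 + 3329) + 0 |"
)
```

Note that 6262 + 4553 + 3329 sums to 14144 rather than the printed 14145, so the columns are presumably rounded independently; a parser should not assert exact arithmetic across fields.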