https://github.com/PrismML-Eng/llama.cpp
After failures with Ollama and mainline llama.cpp, the fork worked on my M5 MBA.
Edit: Typos
I think most of them are available via nimble.