Each keypress is appended to an 80-line prompt (key name, timestamp of the keypress, and the current text shown on the screen) and fed to a frontier LLM. Some of the office staff banged on the keypad for a few hours to generate training data to fine-tune the LLM on the task of debouncing key presses.
Thanks to some optimizations with Triton and running multi-GPU instances, latency is down to just a few seconds per digit entered.
You see, we needed to hit our genAI onboarding KPIs this quarter…