GPT-OSS-120B runs like hell on my DGX Spark | Hacker News


vkaufmann 3 months ago | on: Show HN: I taught GPT-OSS-120B to see using Google...

GPT-OSS-120B runs like hell on my DGX Spark

embedding-shape 3 months ago

The MXFP4 variant, I suppose? My setup (RTX Pro 6000) does around 140 tok/s with llama.cpp and around 160 tok/s with vLLM.

vkaufmann 3 months ago

Yep, MXFP4. Really fast :D
