Yes. How do they do it? Literally they must have PagerDuty set up to alert the t...

beernet · 2026-04-16T14:43:44 1776350624

They obviously collaborate with some of the labs prior to the official release date.

sigbottle · 2026-04-16T14:46:35 1776350795

That... is a more plausible explanation I didn't think of.

danielhanchen · 2026-04-16T15:07:24 1776352044

Yes we collab with them!

qskousen · 2026-04-16T18:13:04 1776363184

Sorry, this is a bit of a tangent, but I noticed you also released UD quants of ERNIE-Image the same day it was released, which as I understand it requires generating a bunch of images. I've been working on something similar with my CLI program ggufy, and was curious if you had any info you could share on the kind of compute you put into that, and whether you generate full images or look at latents?

danielhanchen · 2026-04-17T08:34:41 1776414881

Yes, we have started doing diffusion GGUFs, but it's in its infancy :) But yes, we do generate images to test quants out!

sigbottle · 2026-04-16T14:46:03 1776350763

Is quantization a mostly solved pipeline at this point? I thought that architectures were varied and weird enough where you can't just click a button, say "go optimize these weights", and go. I mean new models have new code that they want to operate on, right, so you'd have to analyze the code and insert the quantization at the right places, automatically, then make sure that doesn't degrade perf?

Maybe I just don't understand how quantization works, but I thought quantization was a very nasty problem involving a lot of plumbing.
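
For reference, the numeric core of weight quantization is small, and the plumbing described above is everything around it. Here is a minimal sketch, assuming plain symmetric absmax quantization over fixed-size blocks. This is an illustrative scheme only, not the method Unsloth or GGUF actually uses, and every function name below is hypothetical.

```python
from typing import List, Tuple


def quantize_block(block: List[float], bits: int = 8) -> Tuple[List[int], float]:
    """Symmetric absmax quantization of one block (hypothetical helper)."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit
    absmax = max(abs(w) for w in block)
    # One scale per block; fall back to 1.0 for an all-zero block.
    scale = absmax / qmax if absmax > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in block]
    return q, scale


def dequantize_block(q: List[int], scale: float) -> List[float]:
    """Map the stored integers back to approximate float weights."""
    return [v * scale for v in q]


def quantize_tensor(
    weights: List[float], block_size: int = 32, bits: int = 8
) -> List[Tuple[List[int], float]]:
    """Split a flat weight list into blocks and quantize each one."""
    return [
        quantize_block(weights[i:i + block_size], bits)
        for i in range(0, len(weights), block_size)
    ]


def max_abs_error(
    weights: List[float], blocks: List[Tuple[List[int], float]]
) -> float:
    """Largest round-trip error, a crude stand-in for a quality check."""
    restored: List[float] = []
    for q, scale in blocks:
        restored.extend(dequantize_block(q, scale))
    return max(abs(a - b) for a, b in zip(weights, restored))
```

The hard part in practice is what this sketch leaves out: mapping each new architecture's tensors into the file format, picking per-layer precision, and checking that model output quality holds up.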

Readerium · 2026-04-17T01:26:16 1776389176

That is true. GGUF does not support just any architecture.

For the most recent example: as of April 16, 2026 (today), TurboQuant still hasn't been added to GGUF.