
zihotki · 2026-04-29T17:50:44 1777485044

I wonder if this benchmark brings any value. Models are already quite capable and reach high scores in it.

khurdula · 2026-04-29T18:16:14 1777486574

Check out the "The JSON-pass vs Value-Accuracy gap" section in the blog. That was an eye opener.

While most models were great at producing schema-valid JSON, they were pretty bad at producing accurate values.

In the graph you'll see there is almost a 20-30% drop between the JSON schema pass rate and the value accuracy.
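
To make the gap concrete, here is a minimal sketch of how the two numbers can diverge. The expected record, the field names, and the three model outputs are made up for illustration, and the scoring (exact match per field) is an assumption, not necessarily how the benchmark computes it.

```python
import json

# Hypothetical expected record and required field types (illustration only).
EXPECTED = {"invoice_id": "A-1001", "total": 249.90, "currency": "EUR"}
REQUIRED_TYPES = {"invoice_id": str, "total": (int, float), "currency": str}


def passes_schema(raw: str) -> bool:
    """True if raw parses as a JSON object with exactly the required keys and types."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(obj, dict) or set(obj) != set(REQUIRED_TYPES):
        return False
    return all(isinstance(obj[key], typ) for key, typ in REQUIRED_TYPES.items())


def value_accuracy(raw: str) -> float:
    """Fraction of expected fields matched exactly; 0.0 if the schema check fails."""
    if not passes_schema(raw):
        return 0.0
    obj = json.loads(raw)
    hits = sum(obj[key] == value for key, value in EXPECTED.items())
    return hits / len(EXPECTED)


# Made-up model outputs.
outputs = [
    '{"invoice_id": "A-1001", "total": 249.90, "currency": "EUR"}',  # valid shape, correct values
    '{"invoice_id": "A-1001", "total": 294.90, "currency": "USD"}',  # valid shape, wrong values
    '{"invoice_id": "A-1001", "total": "249.90"}',                   # fails the schema check
]

schema_pass_rate = sum(passes_schema(o) for o in outputs) / len(outputs)
mean_value_accuracy = sum(value_accuracy(o) for o in outputs) / len(outputs)
print(f"schema pass: {schema_pass_rate:.0%}, value accuracy: {mean_value_accuracy:.0%}")
```

The second output is the interesting one: it counts fully toward the schema pass rate while getting two of three values wrong.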

zihotki · 2026-04-27T20:23:23 1777321403

Looks like Microsoft has run out of compute and can't scale it fast enough to serve copilot users and Azure AI Foundry needs, given that the customer base is growing there as well.

zihotki · 2026-04-25T12:30:18 1777120218

No worries, chats soon will catch up with the ads!

zihotki · 2026-04-20T09:13:44 1776676424

No numbers/measurements/benchmarks and you dare call it "a working" one? Any real proof that this 'works'?

zihotki · 2026-04-20T07:21:43 1776669703

Productivity per dollar doesn't increase because at maturity levels 1 and 2 the costs of inference and the extra team load (PR quantity and size) eat up all the gains. Only at level 3 can one see an actual productivity impact. Most companies are between levels 1 and 2, and that's where only the costs are rising.

Levels: 0 - no AI, 1 - AI enabled (copilots), 2 - AI assisted (autonomous agent pipelines not on your PC), 3 - AI measured.

zihotki · 2026-04-16T14:17:55 1776349075

Did they take into account aging and depreciation of the vehicle battery, which is crazy expensive? It makes negative sense to use V2H with the limited-cycle batteries current cars have. These batteries are optimized for charging speed and power density.

There are much cheaper and better-suited batteries for houses, built using other chemistries. They are bigger and heavier, and that's fine for a house as long as they last 10+ years.

dyauspitr · 2026-04-16T14:24:36 1776349476

Most of these batteries are on full warranty for 8-10 years. You should definitely make full use of it during that period.

zihotki · 2026-04-16T15:01:41 1776351701

Read the fine print - there is usually a limit on charging cycles. So a battery can be out of warranty even if it's only 3 years old, once it has reached that limit.

dyauspitr · 2026-04-16T23:21:58 1776381718

It’s not. For my Ford it is 8 years or 100,000 miles, whichever comes first. It’s not about cycles.

lostmsu · 2026-04-16T16:13:37 1776356017

How are cycles counted if the battery is not drained fully?

ragebol · 2026-04-16T14:37:24 1776350244

Does that warranty still apply if the battery is used for other applications besides its core function of powering the car?

zihotki · 2026-04-16T15:00:25 1776351625

The car battery warranty is often for X years or Y cycles, whichever comes first.

ragebol · 2026-04-16T15:05:11 1776351911

Yeah, I guessed so. Using it as a home battery will incur a lot more cycles, I suppose. Although if the battery is large enough that a day of powering a home only drains it by e.g. 10%, how does that factor into the cycle count? Is that somewhere in the small print maybe?
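
For a rough feel of the numbers: one common convention is to count "equivalent full cycles", i.e. total energy throughput divided by usable capacity. That convention is an assumption here; nothing in this thread says any particular warranty counts cycles this way. Under it, the 10%-per-day case works out like this:

```python
def equivalent_full_cycles(daily_depth_of_discharge: float, days: int) -> float:
    """Equivalent full cycles, assuming partial discharges simply add up.

    daily_depth_of_discharge is the fraction of usable capacity drained per day
    (0.10 for the 10% example). This is a simplification, not a warranty rule.
    """
    return daily_depth_of_discharge * days


cycles_per_year = equivalent_full_cycles(0.10, 365)
print(f"about {cycles_per_year:.1f} equivalent full cycles per year")
```

So a shallow daily drain adds up slowly under this convention, though whether the small print counts it the same way is exactly the open question.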

dyauspitr · 2026-04-16T23:22:51 1776381771

I would look at your warranty, mine is 8 years or 100,000 miles. It doesn’t have a cycles stipulation.

zihotki · 2026-04-13T08:09:57 1776067797

For coding it makes no sense to use any quantization worse than Q6_K, in my experience. More heavily quantized models make more mistakes, and while that can still be fine for text processing, for coding it's not.

segmondy · 2026-04-13T13:28:03 1776086883

I don't think most people realize that. Quality of tokens beats quantity of tokens. I always tell folks to go as high a quant as you can, and only go lower if you just don't have the memory capacity.

hmokiguess · 2026-04-13T13:46:37 1776087997

what do you mean with that, I’m not sure I understood what you said

m348e912 · 2026-04-13T14:28:39 1776090519

AI models like gemma4 are available in different quant "sizes", think about it as an image available in various compression levels.

The best image is the largest, takes up the most memory when loading, and while it is large and looks the best, it uses up much of your system resources.

On the other end of the spectrum there is a smaller, much more compressed version of that same image. It loads quickly and uses fewer resources, but lacks the detail and clarity of the original image.

AI models are similar in that fashion, and the parent poster is suggesting you use the largest version of the AI model your system can support, even if it runs a little slower than you like.
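
As a back-of-the-envelope sketch of that advice: weight memory is roughly parameter count times bits per weight. The bits-per-weight figures below are approximate values for llama.cpp-style quants, the 8-billion-parameter model is a made-up example, and the estimate ignores KV cache and runtime overhead, so treat the results as a lower bound.

```python
# Approximate bits per weight for some llama.cpp-style quant types (rough figures).
APPROX_BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q6_K": 6.56, "Q8_0": 8.5, "F16": 16.0}


def weights_gib(n_params: float, quant: str) -> float:
    """Estimated size of the weights alone, in GiB."""
    return n_params * APPROX_BITS_PER_WEIGHT[quant] / 8 / 2**30


def largest_quant_that_fits(n_params: float, budget_gib: float):
    """Highest-precision quant whose weights fit in the budget, or None if none fit."""
    fitting = [q for q in APPROX_BITS_PER_WEIGHT if weights_gib(n_params, q) <= budget_gib]
    return max(fitting, key=APPROX_BITS_PER_WEIGHT.get) if fitting else None


n_params = 8e9  # hypothetical 8B-parameter model
for quant in APPROX_BITS_PER_WEIGHT:
    print(f"{quant}: ~{weights_gib(n_params, quant):.1f} GiB")
print("best fit in 7 GiB:", largest_quant_that_fits(n_params, 7.0))
```

In practice you'd leave some headroom for context, so the real pick may be one step lower than this suggests.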

hmokiguess · 2026-04-13T15:36:16 1776094576

Thank you!

stavros · 2026-04-13T14:01:21 1776088881

Better to go for a less-quantized model, even if it's slower, than for a faster, more heavily quantized one.

hmokiguess · 2026-04-13T15:36:21 1776094581

Thank you!

shaz0x · 2026-04-15T22:28:41 1776292121

On mobile the Q4 vs Q6 tradeoff flips. Gemma 4 E2B at Q4_K_M barely fits in RAM on a 6GB Android, so Q6 isn't on the table. In practice the Q4 hit shows up in tool-call reliability more than general reasoning, which is usually fine for a constrained skill surface.

zihotki · 2026-04-13T07:49:07 1776066547

For those who are looking for good homelab servers - better look at refurbished/used mini-PCs based on 5th gen of Intel, like i5 11500T (HP ProDesk 400 G5 Mini for example), or Ryzen. You'll get better thermals, a better CPU, and more expansion slots for less than you'd pay for a NUC.

On top of that, resellers also often have upgrades for RAM and NVMe available. A WD Red OEM 1TB for less than 100 dollars sounds like a bargain.

theandrewbailey · 2026-04-13T11:02:44 1776078164

> 5th gen of Intel, like i5 11500T

That's an 11th gen Intel Core CPU, not 5th.

zihotki · 2026-03-31T08:12:07 1774944727

Misconception? That's a playbook, not a misconception.

zihotki · 2026-03-22T23:37:04 1774222624

This is false, you can make many plastics without fossil sources (PLA, bio-PET, bio-ABS, etc). The only challenge is cost and scale - it's cheaper and easier to use existing processes.

cpursley · 2026-03-28T11:10:54 1774696254

And how exactly do you think all of those agricultural products are produced? They require an insane amount of diesel fuel and nitrogen fertilizers…

paganel · 2026-03-23T07:13:03 1774249983

> challenge is cos

So you can’t.