More

jameshart · 2026-04-18T13:56:59 1776520619

Considerably more in many cases. The point of floating point is to have as many distinct values in the range 2-4 as are in the range 1-2 as are between 1/2 and 1, 1/4 and 1/2, 1/8 and 1/4, etc. the smallest representable difference between consecutive floating point numbers down around the size of 1/64 is on the order of epsilon/64

Multiplying epsilon by the largest number you are dealing with is a strategy that makes using epsilons at least somewhat logical.

jameshart · 2026-04-17T02:52:07 1776394327

I’m not sure ‘patched’ is the right word here. Are you suggesting they edited the LLM weights to fix cabbage transportation and car wash question answering?

gf000 · 2026-04-17T05:45:46 1776404746

Absolutely not my area of expertise but giving it a few examples of what should be the expected answer in a fine-tuning step seems like a reasonable thing and I would expect it would "fix" it as in less likely to fall into the trap.

At the same time, I wouldn't be surprised if some of these would be "patched" via simply prompt rewrite, e.g. for the strawberry one they might just recognize the question and add some clarifying sentence to your prompt (or the system prompt) before letting it go to the inference step?

But I'm just thinking out loud, don't take it too seriously.

TheLNL · 2026-04-17T05:13:29 1776402809

They might have further trained the model with these edgecases in the dataset

jameshart · 2026-04-17T02:50:34 1776394234

I was given to understand that attention is all you need.

layer8 · 2026-04-17T11:12:30 1776424350

That’s why we’re testing for it.

jameshart · 2026-04-15T01:16:59 1776215819

The puns are all top class.

Gith abuser content?

jameshart · 2026-04-09T22:41:26 1775774486

I used to do it. Have stopped because AI made it seem uncouth.

jameshart · 2026-04-09T00:09:05 1775693345

Innovationwashing, maybe?

jameshart · 2026-04-08T13:19:59 1775654399

Noise cancelling headphones while riding a motorbike is… a choice. Do you also wear a blindfold?

jameshart · 2026-04-04T15:50:11 1775317811

Give coding agents access to intellisense and syntax highlighting.

Making coding agents spit out syntactically correct code token by token is like asking a human to code on a whiteboard.

vova_hn2 · 2026-04-04T16:11:07 1775319067

Yeah, I was also thinking about it A LOT.

We kinda have a little bit of it with some coding harnesses giving model access to LSP, but I think that we can insert this knowledge on a lower level if we find a clever way to somehow utilize it during sampling.

I think that there is a lot of low hanging fruit in this area.

And in general, I think that people try to use LLMs too much to solve problems that can be easily solved by cheaper (computationally), and, more importantly deterministic tools.

For example, back in the day when LLM-assisted coding just became a thing people very often complained about models generating syntactically incorrect code and inventing non-existent library methods.

Well, I, an experienced human programmer, probably would also be making syntax mistakes and inventing non-existent methods if you stripped me of my tools and made me write code in a bare text editor without syntax highlighting.

Thankfully, my IDE would autocomplete real syntax and actually existing library methods for me and immediately give me feedback if I make a mistake anyway. And all of it is achieved using reliable deterministic code without the inherent issues of statistical models.

I think that it is really inefficient to reach for an expensive and unreliable tool when a cheap and reliable tool will do.

jwolfe · 2026-04-04T15:59:21 1775318361

In general these agents support LSPs, which is often as much information as your IDE will give you. They are also not required to output syntactically correct code token by token when running agentically, because the loop is:

1. code

2. syntax check / build / format / lint (details language dependent)

3. test

and they can hop between 1 and 2 however many times they want.

tadfisher · 2026-04-04T17:16:31 1775322991

Doing a tool call for autocomplete is not going to make coding agents faster.

I do think there is some merit in a tool that dumps all namespaces and reachable symbols so the agent can do its own autocomplete without a round-trip.

jameshart · 2026-04-04T20:59:36 1775336376

Doesn’t need to be a tool call.

As a human coder you don’t summon intellisense. It’s just popped up into your visual field as extra input - contextual cues.

You could force intellisense state into the context vector the LLM receives.

foota · 2026-04-05T00:51:58 1775350318

Not really, because the LLM loop doesn't have the ability to get updates from the agent live. It would have to somehow be integrated all the way down the stack.

jameshart · 2026-04-05T01:21:46 1775352106

LLMs can have whatever abilities we build for them. The fact we currently start their context out with a static prompt which we keep feeding in on every iteration of the token prediction loop is a choice. We don’t have to keep doing that if there are other options available.

foota · 2026-04-10T20:01:16 1775851276

Late to reply, but.. yes, hence my reply that it would need to be integrated all the way down the stack.

But also, LLMs (or their current implementation) rely heavily on print caching for efficiency, without this costs are much higher. You can do neat tricks with it, but generally you're limited to playing with the end of the context to avoid breaking things.

I think some agents do add small context snippets to the end of the conversation that get used by the agent. You can do things like: conversation messages + context snippets + new message and then once the agent replies make the next turn conversation + new message + reply + ... This breaks the cache only for the latest message (not too bad) and let's you give the model current up to date information. This is how stuff like the "mode" or "what time is it now" are handled I believe.

orbital-decay · 2026-04-05T07:42:53 1775374973

You're describing structured outputs.

sgbeal · 2026-04-04T16:02:00 1775318520

> Give coding agents access to intellisense and syntax highlighting.

i once asked an LLM if it could ingest code from an interactive session more easily if it were in appropriately-typed markdown fences and it said absolutely yes, and that the syntax highlighting fed to it that way helps it immensely. i was downright shocked that syntax highlighting was anything more than noise for them.

Tarq0n · 2026-04-06T06:43:11 1775457791

You can't trust what a model says about itself. It has no ability to introspect.

devmor · 2026-04-04T19:56:28 1775332588

Why would this be surprising? That’s exactly how much of the code they were trained on is presented in PRs, Forums, etc.

astrange · 2026-04-04T21:38:15 1775338695

Is that true? That depends on how their web scraping works, like whether it runs client-side highlighting, strips out HTML tags, etc.

devmor · 2026-04-05T02:05:01 1775354701

The highlighting isn't what matters, its the pretext. E.g. An LLM seeing "```python" before a code block is going to better recall python codeblocks by people that prefixed them that way.

jameshart · 2026-04-04T14:55:58 1775314558

Quite the opposite.

jameshart · 2026-04-04T14:55:36 1775314536

It’s a BBC journalistic standards thing; the BBC doesn’t want to express an opinion about the image, they are relaying that as a quote from someone about the image. The word “spectacular” is attributed to NASA in the article.