Hacker News | nextaccountic's comments

> It’s absolutely possible to be that good.

Sure, but not if this is your first painting. Humans can't one-shot art like this


The title is sensationalised. They mean the earliest painting of his that we have. It's also a copy of an existing engraving.

Then that's very believable (well, depending on what age he started).

As a comparison, Mozart's compositions at 15 years old were unbelievable, unless you put them in the context that he had already been composing music since he was 5 years old.


America, Afro-Eurasia, Australia, Antarctica

I can count 4


Based on what?

Based on being large landmasses

Potentially any island is a continent if your cutoff is low enough


Maybe the base model is just a compression of the training data?

There is also an RLHF training step on top of that


Yep, the base model is the compression, but RLHF (and other types of post-training) doesn't really change this picture; it's still working within that same compressed knowledge.

Nathan Lambert (who wrote the RLHF book at https://rlhfbook.com/) describes this as the "elicitation theory of post-training": the idea is that RLHF is extracting and reshaping what's already latent in the base model, not adding new knowledge. As he puts it, when you use preferences to change model behavior, "it doesn't mean that the model believes these things. It's just trained to prioritize these things."

So when you RLHF a model to not give virus-production info, you're not necessarily erasing those weights; the theory is that you're just making it harder for that information to surface. The knowledge is still in the compression, and RLHF just changes what gets prioritized during decompression.
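
To make that concrete, here's a toy sketch of the elicitation framing (everything below is invented for illustration; real post-training updates the weights rather than adding a literal bias term). The point is just that reweighting a distribution doesn't zero anything out:

  import math

  def softmax(logits):
      m = max(logits.values())
      exps = {k: math.exp(v - m) for k, v in logits.items()}
      total = sum(exps.values())
      return {k: v / total for k, v in exps.items()}

  # pretend base-model preferences over continuations of a risky prompt
  base_logits = {"refusal": 0.0, "harmless_info": 1.0, "harmful_info": 2.0}

  # post-training modeled as a preference bias layered on top, not an erasure
  preference_bias = {"refusal": 6.0, "harmless_info": 2.0, "harmful_info": -6.0}
  tuned_logits = {k: base_logits[k] + preference_bias[k] for k in base_logits}

  print(softmax(base_logits))   # harmful_info dominates in the "base model"
  print(softmax(tuned_logits))  # refusal dominates, but harmful_info is still > 0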


Then you apply LLMs in domains where things can be checked

Indeed, I expect to see a huge push into formally verified software, just because sound mathematical proofs provide an excellent verifier to put into an LLM harness. Just look at how successful Aristotle has been at math; the same could be applied to coding too.

Maybe Lean will become the new Python

https://harmonic.fun/news#blog-post-verina-bench-sota
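
As a tiny illustration of why proofs make such a good verifier, here's a toy Lean 4 snippet (my own made-up example, not from the linked post, assuming a recent toolchain where the omega tactic is built in). Either the kernel accepts it, or the LLM's output is rejected outright:

  -- toy definition plus a proof obligation
  def double (n : Nat) : Nat := n + n

  -- the kernel checks this; a wrong proof simply fails to compile
  theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
    unfold double
    omega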


  "LLMs reliably fail at abstraction."
  "This limitation will go away soon."
  "Hallucinations haven't."
  "I found a workaround for that."
  "That doesn't work for most things."
  "Then don't use LLMs for most things."

Um, yes? Except ‘most things’ are not much at all by volume.

There must be some interface for LLMs to deal directly with the tree structure of programming and computer languages. Like, something similar to the Emacs paredit interface, but for arbitrary languages.
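
Something like paredit for LLMs could be as simple as exposing structural edit operations over a parse tree instead of raw text, so the model can never produce unbalanced syntax. A toy sketch (the mini s-expression parser and the wrap operation are invented; a real tool would presumably use tree-sitter or similar for arbitrary languages):

  def parse(tokens):
      """Parse a flat token list into nested Python lists (an s-expression tree)."""
      tok = tokens.pop(0)
      if tok == "(":
          node = []
          while tokens[0] != ")":
              node.append(parse(tokens))
          tokens.pop(0)  # consume ")"
          return node
      return tok

  def unparse(node):
      if isinstance(node, list):
          return "(" + " ".join(unparse(child) for child in node) + ")"
      return node

  def wrap_in_call(tree, path, fn_name):
      """Structural edit: wrap the subtree at `path` in a call to `fn_name`."""
      parent = tree
      for index in path[:-1]:
          parent = parent[index]
      parent[path[-1]] = [fn_name, parent[path[-1]]]
      return tree

  tree = parse("( + 1 ( * 2 3 ) )".split())
  wrap_in_call(tree, [2], "abs")   # wrap the (* 2 3) subexpression, index 2 at top level
  print(unparse(tree))             # prints: (+ 1 (abs (* 2 3)))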

What about TPUs? They are more efficient than Nvidia GPUs, a huge amount of inference is done with them, and while they are not literally being sold to the public, the whole technology should be influencing the next steps of Nvidia just like AMD influenced Intel.

TPUs can be more efficient, but are quite difficult to program for efficiently (difficult to saturate). That is why Google tends to sell TPU services rather than raw access to TPUs, so they can control the stack and get good utilization. GPUs are easier to work with.

I think the software side of the story is underestimated. Nvidia has a big moat there and huge community support.


My understanding is that all of Google's AI is trained and run on quite old but well-designed TPUs. For a while the issue was that developing these AI models still needed flexibility, and customised hardware like TPUs couldn't accommodate that.

Now that the model architecture has settled into something a bit more predictable, I wouldn't be surprised if we saw a little more specialisation in the hardware.


How does this compare with container-use?

https://container-use.com/introduction


This is exactly what I want, but I don't really want to run Docker all the time. Nicer git worktrees and isolation of code so I can run multiple agents. It even has the setup command stuff so "npm install" runs automatically.

I'll check this out for sure! I just wish it used bubblewrap or the macOS equivalent instead of reaching for containers.

I have also been enjoying having an IDE open so I can interact with the agents as they're working, and not just "fire and forget" and check back in a while. I've only been experimenting with this for a couple of days though, so maybe I'm just not trusting enough of it yet.
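
For what it's worth, the bubblewrap version doesn't seem hard to sketch on Linux. This is a made-up toy, not how container-use actually works; the paths, branch name, and agent command are placeholders, and a real sandbox needs more mounts than this minimal set:

  import subprocess
  from pathlib import Path

  def run_agent_in_sandbox(repo: Path, branch: str, agent_cmd: list[str]) -> None:
      worktree = repo.parent / f"{repo.name}-{branch}"
      # each agent gets its own checkout, so parallel agents don't clobber each other
      subprocess.run(
          ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(worktree)],
          check=True,
      )
      bwrap = [
          "bwrap",
          "--ro-bind", "/usr", "/usr",        # read-only view of system binaries
          "--symlink", "usr/bin", "/bin",
          "--symlink", "usr/lib", "/lib",
          "--ro-bind", "/etc", "/etc",
          "--dev", "/dev",
          "--proc", "/proc",
          "--bind", str(worktree), "/work",   # only the worktree is writable
          "--chdir", "/work",
          "--unshare-net",                    # drop this if the agent needs the network
      ]
      subprocess.run(bwrap + agent_cmd, check=True)

  # e.g. run_agent_in_sandbox(Path.home() / "myproject", "agent-1", ["npm", "install"])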


> Non-verbal cues are invisible to text: Transcription-based models discard sighs, throat-clearing, hesitation sounds, and other non-verbal vocalizations that carry critical conversational-flow information. Sparrow-1 hears what ASR ignores.

Could Sparrow instead be used to produce high-quality transcriptions that incorporate non-verbal cues?

Or even, use Sparrow AND another existing transcription/ASR system to augment the transcription with non-verbal cues.
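
The merge itself seems mechanically simple if both streams are time-stamped, something like this (data shapes and labels invented for illustration):

  def merge_transcript(words, events):
      """words: [(start_sec, text)], events: [(start_sec, label)] -> annotated text."""
      merged = sorted(
          [(t, text) for t, text in words] +
          [(t, f"[{label}]") for t, label in events],
          key=lambda item: item[0],
      )
      return " ".join(token for _, token in merged)

  words = [(0.0, "I"), (0.4, "think"), (2.1, "we"), (2.4, "should"), (2.9, "wait")]
  events = [(0.9, "hesitation"), (1.6, "sigh")]
  print(merge_transcript(words, events))
  # prints: I think [hesitation] [sigh] we should wait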


This is a very good idea. We currently have a model in our perception system (Raven-1) that performs this partially. It uses audio to understand tone and augment the transcription we send to the conversational LLM. That seems to have an impact on the conversational style of the replica's output, in a good way. We're still evaluating that model and will post updates when we have better insights.

I would like to just have a storage engine that can be very aggressive at deduplicating stuff. If some data is redundant, why am I storing it twice?
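
I'm imagining something like content-addressed chunk storage, where identical chunks are stored once and everything else is just a reference. Roughly (a toy in-memory sketch, names invented):

  import hashlib

  class DedupStore:
      def __init__(self):
          self.chunks = {}   # sha256 hex digest -> bytes (each unique chunk stored once)
          self.objects = {}  # object name -> list of chunk digests

      def put(self, name: str, data: bytes, chunk_size: int = 4096) -> None:
          digests = []
          for offset in range(0, len(data), chunk_size):
              chunk = data[offset:offset + chunk_size]
              digest = hashlib.sha256(chunk).hexdigest()
              self.chunks.setdefault(digest, chunk)  # duplicate chunks cost nothing extra
              digests.append(digest)
          self.objects[name] = digests

      def get(self, name: str) -> bytes:
          return b"".join(self.chunks[d] for d in self.objects[name])

  store = DedupStore()
  logs = b'{"level":"info","msg":"request ok"}\n' * 1000
  store.put("service-a.log", logs)
  store.put("service-b.log", logs)   # the second copy is essentially free
  print(len(store.chunks), "unique chunks for", len(store.objects), "objects")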

That's already pretty common, but the goal isn't storing less data for its own sake.

> the goal isn't storing less data for its own sake.

Isn't it? I was under the impression that the problem is the cost of storing all this stuff.


Nope, you can't just look at cost of storage and try to minimize it. There are a lot of other things that matter.

What I am asking is: what are the other concerns, other than literally the cost? I have an interest in this area, and I keep seeing everyone say that observability companies are overcharging their consumers.

We're currently discussing the cost of _storage_, and you can bet the providers already are deduplicating it. You just don't get those savings - they get increased margins.

I'm not going to quote the article or other threads here to you about why reducing storage just for the sake of cost isn't the answer.


Well, that's a weirdly confrontational reply. But thanks

But if you don't do anomaly detection, how can you possibly know which data is useful for anomaly detection? And thus, which data is valuable to keep?
