Then that's very believable (well, depending on what age he started).
As a comparison, Mozart's compositions at 15 were unbelievable, unless you put them in the context that he was already composing music at 5 years old
yep, the base model is the compression, but RLHF (and other types of post-training) doesn't really change this picture; it's still working within that same compressed knowledge.
nathan lambert (who wrote the RLHF book @ https://rlhfbook.com/ ) describes this as the "elicitation theory of post training", the idea is that RLHF is extracting and reshaping what's already latent in the base model, not adding new knowledge. as he puts it: when you use preferences to change model behavior "it doesn't mean that the model believes these things. it's just trained to prioritize these things."
so like when you RLHF a model to not give virus production info, you're not necessarily erasing that knowledge from the weights. the theory is that you're just making it harder for that information to surface: the knowledge is still in the compression, RLHF just changes what gets prioritized during decompression.
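a toy way to picture the "prioritization, not erasure" idea (this is a cartoon, not how RLHF is actually implemented; the completions and numbers are made up):

    # Toy illustration (not real RLHF): a base "model" assigns logits to a few
    # canned completions. Preference tuning is modeled as adding a penalty to
    # the dispreferred one. The entry is still stored -- only its priority at
    # sampling time changes.
    import math

    def softmax(logits):
        m = max(logits.values())
        exps = {k: math.exp(v - m) for k, v in logits.items()}
        z = sum(exps.values())
        return {k: v / z for k, v in exps.items()}

    base_logits = {
        "harmless answer": 2.0,
        "refusal": 1.0,
        "harmful answer": 2.5,   # latent in the base model
    }
    print(softmax(base_logits))

    # "post-training" here = shift logits toward preferred behavior
    tuned_logits = dict(base_logits)
    tuned_logits["harmful answer"] -= 6.0   # penalized, not erased
    tuned_logits["refusal"] += 3.0
    print(softmax(tuned_logits))
    # The harmful completion's probability collapses toward ~0, but the string
    # (the "knowledge") is still sitting in the table -- elicitation, not erasure.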
Then you apply LLMs in domains where things can be checked
Indeed, I expect to see a huge push into formally verified software, simply because sound mathematical proofs provide an excellent verifier to put into an LLM harness. Just look at how successful Aristotle has been at math; the same approach could be applied to coding too
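What makes proofs such a good fit is that checking is fully mechanical. A trivial Lean example (nothing specific to Aristotle, just the general shape of the verifier signal):

    -- A proof either type-checks or it doesn't, which is exactly the kind of
    -- pass/fail signal you can hand to an LLM training or search harness.
    theorem add_comm_example (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b
    -- Replace the proof term with anything bogus and the kernel rejects it;
    -- no human grading needed.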
"LLMs reliably fail at abstraction."
"This limitation will go away soon."
"Hallucinations haven't."
"I found a workaround for that."
"That doesn't work for most things."
"Then don't use LLMs for most things."
There must be some interface for LLMs to deal directly with the tree structure of programming and computer languages. Something similar to Emacs' paredit interface, but for arbitrary languages
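For Python specifically, the standard library already gets you part of the way there. A minimal sketch of an AST-level edit rather than a text-level one (`checked()` is just a made-up name here; for arbitrary languages you'd reach for something like tree-sitter instead):

    # Sketch of "structural editing": the edit is a tree operation, so it can
    # never produce unbalanced or syntactically invalid output.
    import ast

    source = """
    def area(w, h):
        return w * h
    """

    class WrapReturns(ast.NodeTransformer):
        """Wrap every returned expression in a (hypothetical) checked() call."""
        def visit_Return(self, node):
            if node.value is not None:
                node.value = ast.Call(
                    func=ast.Name(id="checked", ctx=ast.Load()),
                    args=[node.value],
                    keywords=[],
                )
            return node

    tree = ast.parse(source)
    tree = WrapReturns().visit(tree)
    ast.fix_missing_locations(tree)
    print(ast.unparse(tree))
    # def area(w, h):
    #     return checked(w * h)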
What about TPUs? They are more efficient than Nvidia GPUs, a huge amount of inference is done on them, and while they are not literally sold to the public, the technology should be influencing Nvidia's next steps, just like AMD influenced Intel
TPUs can be more efficient, but they are quite difficult to program for efficiently (difficult to saturate). That is why Google tends to sell TPU services rather than raw access to TPUs: they control the stack and can get good utilization. GPUs are easier to work with.
I think the software side of the story is underestimated. Nvidia has a big moat there and huge community support.
My understanding is that all of Google's AI is trained and run on quite old but well-designed TPUs. For a while the issue was that developing these AI models still needed flexibility, and customised hardware like TPUs couldn't accommodate that.
Now that the model architecture has settled into something a bit more predictable, I wouldn't be surprised if we saw a little more specialisation in the hardware.
This is exactly what I want, but I don't really want to run Docker all the time. Nicer git worktrees and isolation of code so I can run multiple agents. It even has the setup command stuff, so "npm install" runs automatically.
I'll check this out for sure! I just wish it used bubblewrap or the macOS equivalent instead of reaching for containers.
I have also been enjoying having an IDE open so I can interact with the agents as they're working, rather than just "fire and forget" and check back in a while. I've only been experimenting with this for a couple of days though, so maybe I'm just not trusting it enough yet.
> Non-verbal cues are invisible to text: Transcription-based models discard sighs, throat-clearing, hesitation sounds, and other non-verbal vocalizations that carry critical conversational-flow information. Sparrow-1 hears what ASR ignores.
Could Sparrow instead be used to produce high-quality transcriptions that incorporate non-verbal cues?
Or even, use Sparrow AND another existing transcription/ASR system together, augmenting the transcription with non-verbal cues.
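Roughly what I have in mind, as a toy sketch (the timestamped word list and event list are made-up stand-ins for ASR output and Sparrow-style detections):

    # Merge two hypothetical timestamped streams -- ASR words and non-verbal
    # events -- into a single annotated transcript.
    def annotate(words, events):
        """words: [(start_sec, text)], events: [(start_sec, label)]"""
        merged = sorted(
            [(t, w) for t, w in words] + [(t, f"[{e}]") for t, e in events]
        )
        return " ".join(token for _, token in merged)

    words = [(0.0, "I"), (0.3, "think"), (1.9, "we"), (2.2, "should"), (2.6, "wait")]
    events = [(0.9, "sigh"), (1.2, "hesitation")]

    print(annotate(words, events))
    # I think [sigh] [hesitation] we should wait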
This is a very good idea. We currently have a model in our perception system (Raven-1) that does this partially. It uses audio to understand tone and augment the transcription we send to the conversational LLM. That seems to have a positive impact on the conversational style of the replica's output. We're still evaluating that model and will post updates when we have better insights.
What I am asking is: what are the concerns other than literally the cost? I'm interested in this area, and I keep seeing everyone say that observability companies are overcharging their customers.
We're currently discussing the cost of _storage_, and you can bet the providers are already deduplicating it. You just don't get those savings - they become increased margins.
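To make the dedup point concrete, a toy content-addressing sketch (the log lines are invented; real backends do this at the block/column level with far more sophistication):

    # Content-address each log line by hash and store each unique line once;
    # the index only keeps cheap references.
    import hashlib

    logs = [
        '{"level":"info","msg":"GET /health 200"}',
        '{"level":"info","msg":"GET /health 200"}',
        '{"level":"info","msg":"GET /health 200"}',
        '{"level":"error","msg":"db timeout"}',
    ]

    store = {}  # hash -> line, stored once
    refs = []   # what the index keeps per log line

    for line in logs:
        digest = hashlib.sha256(line.encode()).hexdigest()
        store.setdefault(digest, line)
        refs.append(digest)

    raw_bytes = sum(len(l) for l in logs)
    stored_bytes = sum(len(c) for c in store.values())
    print(f"raw: {raw_bytes} B, deduped: {stored_bytes} B, refs: {len(refs)}")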
I'm not going to quote the article or other threads here to you about why reducing storage just for the sake of cost isn't the answer.
Sure, but not if this is your first painting. Humans can't one-shot art like this