
> it's not a matter of having a 100x more powerful LLM,

I think we can all agree that even the best current LLM is not AGI. That's not what's being disputed here, I think.

However, a 100x more powerful LLM is not just 100x better at recall. A 100x more powerful LLM is not just 100x better at being a stupid hallucinatory parrot. And a model that is just 100x bigger is not necessarily 100x more powerful, if you define power as the ability to achieve goals.

However, pure language models will always lack something else: the ability to ground things in reality.

I recently had a dream where I solved some problems, and when I woke up I realized that those solutions were bullshit. But I also realized that the whole approach of my dreaming self was very similar to what an LLM would have done.



> I think we all can agree that even the best LLM currently is not AGI.

Disagree, for the record. If I’d described the capabilities of contemporary AI to 100 AI scientists 5 years ago, I bet more than half would agree to call that AGI. Further, more than 90% would assume that these capabilities were decades and decades away.


“Oh those goalposts? We moved them over there because they were getting uncomfortably close.”


Cool, so we're at AGI, we just need ASI, maybe something a bit smarter, and we can shut the fuck up about it and get back to blowing billions on real living creatures' problems? ;)


>If I’d described the capabilities of contemporary AI to 100 AI scientists 5 years ago

This is hard to believe. The "Attention Is All You Need" paper was 6 years ago, GPT-1 is 5 years old, and GPT-3 is 3 years old. The current crop of LLMs wasn't something that happened overnight.


No one thought GPT-1 would have these scaling effects. Really.


Define grounding things in reality.

We only have our 5 senses to go off of. Meta has already put out one multimodal model incorporating multiple data types, and OpenAI is undoubtedly working on it too.


Grounding in reality can be something as simple as what OpenAI is experimenting with in plugins, or something much more integrated.

It's not a matter of which senses you have, but about being able to "continuously" use them.

The current LLMs are basically unfiltered raw thoughts that must be continuously refined. A similar thing happens in our brains and only a little bit of that is accessible to our consciousness


> The current LLMs are basically unfiltered raw thoughts that must be continuously refined. A similar thing happens in our brains and only a little bit of that is accessible to our consciousness

Exactly. But, AFAIK, it's also the part that does the bulk of actual thinking and decision-making for us. In that sense, LLMs may be closer to AGI than people expect, because they seem to be capturing the actual core of intelligence and reasoning - and the missing bits (like long-term memory and higher-level thought stream filter/censor) may be much easier to bolt on to them.


This is why we typically see better performance out of GPT when plugins are bolted into a chain or tree of thought with reflection.

The output of LLMs is kind of like our stream of consciousness: there are a lot of things I think and then discount after internally reflecting on the thought, which then often leads to a more correct solution. Having an LLM 'think' like this natively would massively increase the amount of compute needed, hence the expense, so at least in any public products it's not being done at this time.
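To make the "think, reflect, discount" idea concrete, here is a rough sketch of a reflection loop in Python. Everything in it is an assumption for illustration: `draft_fn`, `critique_fn`, and `revise_fn` are hypothetical callables standing in for model calls, not any real API. It also shows where the extra compute goes, since every pass costs up to two more calls.

```python
from typing import Callable, Optional


def reflect(
    prompt: str,
    draft_fn: Callable[[str], str],                      # hypothetical: first raw "thought"
    critique_fn: Callable[[str, str], Optional[str]],    # hypothetical: objection, or None if satisfied
    revise_fn: Callable[[str, str, str], str],           # hypothetical: rewrite given the objection
    max_passes: int = 3,
) -> str:
    """Draft an answer, then repeatedly critique and revise it."""
    answer = draft_fn(prompt)
    for _ in range(max_passes):
        objection = critique_fn(prompt, answer)
        if objection is None:
            # The critic has nothing to discount, so stop early.
            break
        answer = revise_fn(prompt, answer, objection)
    return answer
```

With stub functions in place of real model calls, a wrong first draft gets discounted and replaced after one pass; with a real model each of those steps would be a separate, billable invocation.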


Yup. I totally expect you'll be able to eke out a significant performance boost if you chain up the LLMs, so that e.g. you feed the initial query to a first-stage GPT-4 several times (likely in parallel), feed those to some kind of filter models that pass or reject the output, looping until you have, say, 3 passing outputs, then feed that to a summarizer, etc. Maybe play with generating system prompts so that you have multiple entirely different takes on the same query, or stage it. Or, you know, have GPT-4 look at the query and propose a graph of subsequent invocations for you.
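A minimal sketch of that staging, assuming the model calls are wrapped as plain Python callables. The names `generate`, `accept`, and `summarize` are hypothetical placeholders (no real API is used here), and `fanout` and `max_rounds` are made-up knobs added so the loop always terminates.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def staged_query(
    query: str,
    generate: Callable[[str], str],               # hypothetical first-stage model call
    accept: Callable[[str, str], bool],           # hypothetical filter model: pass or reject
    summarize: Callable[[str, List[str]], str],   # hypothetical summarizer stage
    needed: int = 3,
    fanout: int = 4,
    max_rounds: int = 5,
) -> str:
    """Fan out drafts in parallel, filter them, loop until enough pass, then summarize."""
    passing: List[str] = []
    with ThreadPoolExecutor(max_workers=fanout) as pool:
        for _ in range(max_rounds):
            # Send the same query to the first stage several times in parallel.
            drafts = list(pool.map(generate, [query] * fanout))
            passing.extend(d for d in drafts if accept(query, d))
            if len(passing) >= needed:
                break
    if len(passing) < needed:
        raise RuntimeError("not enough drafts passed the filter")
    return summarize(query, passing[:needed])
```

Varying the system prompt per draft, or letting a model propose the graph of invocations, would slot in by changing what `generate` does; the loop itself stays the same.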

I wish I had time to play with it some more right now. The pace of progress in the field is giving me a serious case of FOMO.


We have a lot more than 5.

Balance, proprioception, hunger, …



