Yes. Let’s define AGI as the ability of a single model to pass most human professional tests (no cheating) and to provide genuine, human-level, flexible cognitive benefit to specialized professionals in diverse fields. Reasonable?
No, because most tests designed for humans test memory and pattern recognition, which computers can already do better than humans, so it's not a useful comparison. I'd rather call it superhuman AGI when it not only outperforms humans who are allowed to use computers during the test, but can also perform everyday tasks that are not 'hard' for us humans. Just because we have the hardware in our brains to do these things doesn't mean they're easy in the least for a computer.
Have some examples? What would be tests that, if passed, you’d say “oh yeah, that’s AGI.”
For instance, if it could make a peanut butter and jelly sandwich? Most of the things that are easy for us but challenging for computers are in the motor domain. While that's important, I think “intellectual AGI” is a meaningful milestone and closest to what most people think of when they think of AGI.
>Have some examples? What would be tests that, if passed, you’d say “oh yeah, that’s AGI.”
The "G" in AGI is general. A computer program or system that could both, let's say, write code and learn to drive a car would be something closer to an AGI. Written tests are remarkably brittle at showing how intelligent someone is; we already know the limitations of tests in the real world! Einstein famously flunked his entrance exam, but then came up with special relativity at 26; yet other posters would have you believe an LLM is more intelligent than Einstein because it could pass a test. An LLM defending a dissertation in arguably any field would be way more impressive than an LLM passing a test that humans already designed and know the answers to.
One problem I have with saying a 100x more powerful LLM could become AGI is that nothing leads me to believe that LLMs, as they currently exist, are capable of synthesizing new knowledge, and I'm not sure what breakthroughs you would need to get there. Once you start to think about that, you start to run up against the limitations of the LLM. If I were to invent a completely new programming language, I could probably teach a junior engineer how to use it in a week, but the jury is out on whether I would first need to generate 50,000 sample programs and spend $1,000,000 in GPU compute to get an LLM to output the same thing. It's hard to consider such a system AGI. Further still, sure, you have Google spending millions on FSD, but it's hard to consider the system they have as general. Could I take Waymo and have it pilot a forklift? Or a submarine? How much would that cost? A """below average intelligence""" human could learn to use a forklift in an afternoon.
All in all, there's more to intelligence than written tests.
The problem with defining AGI isn't only in defining intelligence, but also in defining general. Also, why do we treat AGI as a yes/no question, when it probably makes more sense to think of it as a matter of degree... I don't have a definition of either.
And that’s why I argue that AGI is already here. It is a spectrum. And we are well along the way to further acceleration. And if you want to say “no, we are not at AGI yet”, we need to define a clear test of what would be beyond that point.
This definition fails badly because it doesn't test anything outside of language. At a bare minimum, the tests involved should have pictures and descriptions in them and require the AI to use the same model to synthesize information from both.
This is a very problematic word. For example, if you were a civil engineer and went "throw me any old design for a bridge, I have a river I need to cross", you'd have your license revoked.
Intelligence is too massively loaded a term, and too much of a gradient even across humans, to use it to sum up human or AI abilities. It is a multitude of different capabilities that don't necessarily have to be bundled together for something to be 'smart', 'useful', 'capable', and/or 'dangerous'.