Google’s Sergey Brin sees a path to AGI, but not beyond


In a recent AGI House interview, Sergey Brin described Gemini as a system whose capabilities are not only evolving but also integrating the world’s knowledge across languages and modalities. He said the transformer architecture behind the AI has evolved beyond what it was originally designed for. Brin could imagine Gemini getting to AGI, but he could not say what comes after it.

AGI: Artificial General Intelligence

AGI is a level of AI capable of learning, understanding, and applying knowledge across tasks in a manner similar to humans. Today’s AI can produce useful answers, write code, analyze images, and solve many specific problems, but it does not yet understand the world or independently apply knowledge across domains like a human would.

OpenAI, Google DeepMind, and Anthropic are all developing AGI, but each emphasizes a different goal. OpenAI focuses on economic benefits, Google DeepMind emphasizes scientific discovery, and Anthropic prioritizes human progress.

The next big thing: AI capabilities are converging

Brin said Google’s previous advances in AI relied on specialized models designed for specific tasks. But he added that Gemini is increasingly reaching state-of-the-art performance in areas such as mathematics and scientific reasoning. Capabilities that previously required models trained for one specific job are now appearing in general-purpose model families that cover many of them: convergence.

He also said that convergence was not something he would have predicted when Google started developing AI.

He was answering a question about the next big thing in AI.

Brin replied:

“I think what’s exciting is that all of these things are converging into the same general patterns.

In the past, specialized models were required. And in the case of protein folding, this is obviously still the case.

But increasingly, our main Gemini LLMs can be state-of-the-art for mathematics, for example, and for other types of scientific questions. So this convergence, I don’t know, I guess it’s not something that I really would have predicted initially. But it was pretty incredible to see.

And I guess that involves this concept of transfer, just the idea that when you practice for a certain class of problems, let’s say you practice coding, that can actually help your mathematical reasoning and vice versa.

And that’s really exciting to see… the multimodal capability is also an example of that. For example, can you actually go from being able to process images to being able to think about geometric text problems as well.”

Transfer learning is one of the reasons convergence occurs. In transfer learning, training a model on one kind of task turns out to improve its performance on another, seemingly unrelated one. Google is now finding that combining things like visual training, math, and reasoning improves several abilities at once.

Transformers are ‘strangely flexible’

Brin was asked if transformers would play a role in AGI. The transformer is the neural network architecture behind today’s large language models and the breakthrough that enabled things like ChatGPT. Brin’s answer mentions MOE, which stands for Mixture of Experts. MOE is a technique that routes each input to a small number of specialized internal “experts” instead of running the whole network, which increases efficiency.
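
To make the routing idea concrete, here is a minimal sketch of a mixture-of-experts layer in plain Python. It is an illustration under stated assumptions, not how Gemini or any production model implements it: the class name `ToyMoE`, the random weights, and the use of simple per-dimension scaling as a stand-in for each expert are all hypothetical. In real models the router and experts are learned, and routing is typically done per token.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class ToyMoE:
    """Toy mixture-of-experts layer (hypothetical, for illustration only)."""

    def __init__(self, dim, num_experts, top_k=2, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Router: one scoring vector per expert (random here, learned in practice).
        self.router = [[rng.uniform(-1, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Each "expert" is a per-dimension scaling, standing in for a real sub-network.
        self.experts = [[rng.uniform(0.5, 1.5) for _ in range(dim)]
                        for _ in range(num_experts)]

    def route(self, x):
        """Score every expert, keep the top_k, and normalize their weights."""
        scores = [dot(w, x) for w in self.router]
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        chosen = ranked[:self.top_k]
        weights = softmax([scores[i] for i in chosen])
        return chosen, weights

    def forward(self, x):
        """Run only the chosen experts and blend their outputs."""
        chosen, weights = self.route(x)
        out = [0.0] * len(x)
        for idx, weight in zip(chosen, weights):
            expert_out = [scale * value for scale, value in zip(self.experts[idx], x)]
            out = [o + weight * e for o, e in zip(out, expert_out)]
        return out, chosen


moe = ToyMoE(dim=4, num_experts=8, top_k=2)
output, used = moe.forward([1.0, 0.5, -0.3, 2.0])
print("experts used:", used)
```

Only two of the eight experts do any work for a given input, which is where the efficiency gain comes from: the layer can hold many experts while paying the compute cost of a few.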

Asked if AGI will work on transformers, Brin responded:

“Transformers are strangely flexible. We use them for image and video in addition to text, so they’ve outgrown their original capacity.

Now, to be honest, along the way they also changed. I mean, we have any type of MOE, transformers. I mean, there’s a lot of little details that changed along the way, so it’s not exactly the same as transformer paper.

If I could guess, could something close be AGI? I would say yes.

That’s just my guess, just because they were able to evolve so much.

But like I said, they change. It’s not exactly the same as the original transformer paper.”

World models converge with Gemini

Brin was asked whether world models would help AI achieve AGI. A world model is an AI’s internal simulation of reality that helps it anticipate what might happen next. By foreseeing the consequences of different actions, the AI can make better decisions and plan ahead.
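
The plan-by-prediction idea can be sketched in a few lines of Python. This is a toy under loud assumptions, not a description of Gemini: the "world" is a line of positions from 0 to 10, and `predict_next` and `plan` are hypothetical names. The point is only that the planner never acts for real while deciding; it imagines each action sequence with the model and keeps the one with the best predicted outcome.

```python
from itertools import product

ACTIONS = (-1, 0, 1)  # step left, stay, step right


def predict_next(state, action):
    """Hypothetical world model: predicts the next position on a line
    with walls at 0 and 10. A real world model would be learned from data."""
    return max(0, min(10, state + action))


def plan(state, goal, horizon=3):
    """Imagine every action sequence of length `horizon` using the model,
    and return the one whose predicted end state is closest to the goal."""
    best_seq, best_cost = None, float("inf")
    for seq in product(ACTIONS, repeat=horizon):
        imagined = state
        for action in seq:
            imagined = predict_next(imagined, action)
        cost = abs(goal - imagined)
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq, best_cost


sequence, distance = plan(state=2, goal=5)
print(sequence, distance)  # (1, 1, 1) 0
```

Exhaustive search only works because this toy has three actions and a short horizon. The design choice it illustrates is the separation between a model that predicts consequences and a planner that uses those predictions to choose.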

Brin cited Google’s Gemini Omni as an example of this direction. Gemini Omni was presented in mid-May during Google I/O. Google describes it as its new family of “all-input-to-output” multimodal AI models. It combines Gemini’s reasoning abilities with generative media capabilities, starting with video creation and editing. Google describes it as a model that can potentially “create anything from any input.”

The question asked was:

“What is your view on how world models can help achieve AGI?”

Brin replied:

“Yeah, I mean, world models are like video, basically, models. And I guess there are a few – people talk about AGI quite broadly.

I think of it as, I think of AGI as the idea that AI can actually get better.

But other people, and I think these people are probably more right, sort of think that AGI means, well, AI should be able to do everything that a person can do.

And they are two different things.

So to do anything a person can do, you absolutely must be able to understand and interact with the physical world.

So for that, being able to know, to dream, to imagine what will happen in the world if you do something and to understand it is obviously important.

So, I think world models, yeah, if you want to do everything and that, you know, extends to robotics and things like that, world models are essential.

And yes, you’ve probably had more time to play with our Gemini Omni model than I have, because I’m deeply passionate about the self-improvement game.

But yes, we have been working on this for a long time, Omni is the latest version.

Omni is also pretty cool because it’s the exact same thing, you know, Gemini, as we also trained with all the text and all the other stuff, trains the exact same way.

The fact that these converge is quite astonishing. But yes, you need that ability to be able to interact physically.”

The takeaway is that Gemini is moving in a new direction by folding world models into the same converging system. This is the next stage of growth.

What comes after AGI?

Someone asked Brin what comes after AGI. What was interesting about his response was that he didn’t have one: he said he couldn’t really see beyond it. He compared AI to previous waves of technology like the web and mobile computing, but he didn’t identify a paradigm for what comes next.

The implication is that determining what comes after AGI would itself be a major opportunity.

He said:

“Wow, that’s a great question.

What’s the next step after achieving AGI?

I mean, I think everyone right now is focused on accelerating the growth of AI. What comes next?

We obviously started with web and internet search. We kind of went through the mobile generation, which was another pretty big explosion.

I guess now people are – AI is now a major new industry trend. And what comes next?

Boy… I mean, I think if you can answer that question, you’ll have a fantastic business on your hands.”

What it all means

  • Brin sees AI moving toward AGI through convergence.
  • Capabilities formerly managed by separate models are merging into larger model families.
  • Transfer learning helps one type of expertise improve the performance of another.
  • Transformers continue to evolve.
  • World models could be Gemini’s next stage of growth.
  • It may be that no one knows what comes after AGI until it arrives.

OpenAI, Google DeepMind, and Anthropic are all working to create AGI, prioritizing different goals.

Brin’s description of Gemini offers insight into how Google believes AGI can be achieved. He described a process of convergence in which capabilities that once required separate systems increasingly appear within the same model family. One reason this happens is transfer learning, in which training a model in one area improves its abilities in another.

This same convergence now extends to world models. Rather than treating understanding the physical world as a separate discipline, Google is integrating these capabilities into Gemini itself. Brin cited Gemini Omni as an example of how reasoning, multimodal understanding and world model capabilities are increasingly part of the same system.

What comes after AGI remains an open question. Brin said he could imagine current AI architectures continuing to evolve toward AGI, but when asked what comes next, he didn’t have an answer. If AGI is the next frontier, whatever follows it could be the foundation of a whole new generation of businesses and technologies.

And that’s where we’re heading with AI.



