What are the current limitations hindering progress in AI today? And how close are we to building AI agents with human-like memory or reasoning?
We’re starting to build AI agents that can do some things, but they’re still in the early stages of research compared to more established non-agent models. There's tremendous potential, especially in making agents useful in virtual environments or even physical settings like robotics.
One major limitation is memory. For a model to feel truly personalised, it needs to remember past interactions and retrieve them when needed—like when you ask it to find hiking pictures with a friend from last year to make a birthday card. That kind of context isn’t in the training data but needs to live in memory the model can call upon. We have some ideas on how to build such systems, but it’s still an open area of research.
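As a purely illustrative sketch (not a description of any Google system), one simple way to give a model such a memory is to store past interactions as vectors and retrieve the closest matches when a new request comes in. The `embed` function and `InteractionMemory` class below are hypothetical placeholders:

```python
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Placeholder embedding: hash character trigrams into a fixed-size vector.
    A real system would use a learned text or image embedding model instead."""
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

class InteractionMemory:
    """Stores past interactions and retrieves the ones most similar to a query."""
    def __init__(self):
        self.entries = []   # raw records of past interactions
        self.vectors = []   # their embeddings

    def remember(self, record: str) -> None:
        self.entries.append(record)
        self.vectors.append(embed(record))

    def recall(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        scores = np.array([v @ q for v in self.vectors])  # cosine similarity (unit vectors)
        top = np.argsort(scores)[::-1][:k]
        return [self.entries[i] for i in top]

memory = InteractionMemory()
memory.remember("hiking photos with a friend, July last year")
memory.remember("dinner reservation confirmation, March")
print(memory.recall("find hiking pictures with my friend from last year"))
```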
Today’s AI agents can handle relatively simple tasks, but as their capabilities improve, they’ll manage increasingly complex workflows. In a task with 50 steps, for instance, an agent might handle a portion on its own while still requiring human intervention for other parts. The question then becomes: how much autonomy are you comfortable giving it? There’s a trade-off between convenience and control.
Even as agents get smarter, we’ll still want safeguards. You might let an agent buy a movie ticket with your credit card, but also set limits so it doesn’t overspend. So while we’re progressing toward more intelligent and useful agents, human-in-the-loop systems remain essential today—and likely for some time.
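A minimal sketch of how such a safeguard might be wired up, with hypothetical names and limits chosen only for illustration: purchases under a small threshold go through automatically, larger ones are routed back to the user, and anything above a hard cap is refused outright.

```python
from dataclasses import dataclass

@dataclass
class SpendingPolicy:
    auto_approve_limit: float = 25.0   # agent may spend up to this amount on its own
    hard_limit: float = 200.0          # never allowed, even with user confirmation

def execute_purchase(description: str, amount: float, policy: SpendingPolicy, ask_user) -> str:
    """Human-in-the-loop gate: small purchases go through, larger ones need approval."""
    if amount > policy.hard_limit:
        return f"Blocked: {description} (${amount:.2f}) exceeds the hard limit."
    if amount <= policy.auto_approve_limit:
        return f"Purchased: {description} (${amount:.2f})."
    if ask_user(f"Approve {description} for ${amount:.2f}?"):
        return f"Purchased after approval: {description} (${amount:.2f})."
    return f"Declined by user: {description}."

# Example: the movie ticket is under the auto-approval limit, the concert tickets are not.
policy = SpendingPolicy()
print(execute_purchase("movie ticket", 18.0, policy, ask_user=lambda q: True))
print(execute_purchase("concert tickets", 120.0, policy, ask_user=lambda q: False))
```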
What’s the role of hardware innovation like Tensor Processing Units (TPUs) in pushing AI advancements and adoption?
It's very, very important. We've been building custom TPUs (specialised chips designed to accelerate AI and machine learning workloads) for almost 10 years now, and there are a few reasons for that.
We saw a big need for being able to deploy AI and machine learning models to lots of users. Many of our AI-powered products have a large number of users, and the inference demands in those products were quite significant, even when we wanted to roll out much smaller models than the ones we're training today. That was the origin of the first version of TPU.
Since then, we’ve focused on making those systems more capable, such as being able to handle larger scales, having more compute, and being more energy efficient. Part of the reason we do that is to accelerate our research process.
When you have a machine learning idea, you want to try and implement it and get results as fast as you can. So if you can run that on 2,000 chips instead of 20 chips, you'll get your answer maybe 80 times faster, which is very good because the more ideas you can try, the more likely you are to land on the ones that make big improvements in the underlying algorithms, or you can experiment with different kinds of data to improve the quality of the models.
Once you've landed on what you think is going to be a successful training recipe for a very large-scale model, it’s time to scale up. You’d want to use as much compute as you can in a set period to train the most capable model while being energy-efficient. We can support that as our current Ironwood TPU pods are 30 times more energy-efficient and 3,600 times faster than our TPU version 2. This means that a task that previously took an hour can now be completed in one second.
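The arithmetic behind those figures, written out as a short sketch (the 80% scaling efficiency is an assumed number used only to reproduce the rough "80 times faster" estimate above):

```python
# Experiment turnaround: more chips shorten each run, so more ideas fit in the same time.
chips_small, chips_large = 20, 2_000
scaling_efficiency = 0.8            # assumed: parallel scaling is rarely perfectly linear
speedup = (chips_large / chips_small) * scaling_efficiency
print(f"~{speedup:.0f}x faster per experiment")      # ~80x, as in the example above

# Generation-over-generation throughput: a 3,600x speedup maps an hour onto a second.
old_runtime_s = 3_600               # one hour on the older TPU generation
print(f"{old_runtime_s / 3_600:.0f} second(s) on the newer generation")
```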
Do you think that future breakthroughs in AI will come more from better algorithms or data, or because of hardware improvements?
Maybe a bit more on the algorithm side, but improvements in all those areas will be important.
We’re seeing this multiplicative effect of combining better hardware, data and algorithms in our Gemini model. The Gemini 2.0 Flash model, introduced just months after the 1.5 Pro, surpassed its quality. That’s our aspirational goal for every generation of Gemini: the powerful Pro model should be better than the previous generation, while the fast, efficient Flash model should match the quality of the prior Pro model.
There are a lot of concerns about ethics, so how do you ensure the AI models and innovations Google develops are inclusive?
We give a lot of consideration to inclusivity in the many different kinds of models we develop. For instance, we work very hard to make our Gemini model multilingual because we think it's really important that all people have access to high-quality AI systems regardless of the language they speak. That's one form of inclusivity.
Another example is training an image model: if the photos labelled as "weddings" only show the traditional North American white-gown ceremony, they don't represent the full diversity of weddings globally. So you want to make sure you've collected data representative of weddings around the world. We pay quite a lot of attention to that, and we do safety testing of various kinds on our models before we release them.
Beyond that, we’re enabling inclusivity through the release of Gemma, our family of open models that are smaller than Gemini. Gemma comes in several sizes and can run on a single GPU or TPU. This is important because it makes AI accessible to many more people who don't have access to lots of specialised hardware.
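As one illustration of that accessibility, a smaller Gemma checkpoint can be loaded through the open-source Hugging Face transformers library; this sketch assumes the library (and PyTorch) is installed and that access to the gated weights has been granted.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b"                           # one of the smaller Gemma sizes
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # small enough for a single accelerator

prompt = "Explain what a TPU is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```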
Google DeepMind has consistently explored biologically inspired models and brain-like architectures as part of its mission to build artificial general intelligence (AGI). How much of today’s machine learning and AI is truly biologically inspired?
It's important to realise that even the machine learning models we use today are loosely biologically inspired. They’re artificial neural networks. Each neural network is made up of a bunch of artificial neurons, and each artificial neuron is designed to behave kind of like how we think real neurons behave.
They take in a bunch of inputs, apply weights to those inputs, and then decide how strong an output to produce based on the collective signals they receive. That’s basically how deep learning models work—they have layers of these neurons, and the lower layers tend to learn very primitive features.
In the context of visual models, at the lowest level, we might have a neuron that gets excited when it sees a little blotch of red. Another neuron might get excited by a line in a specific orientation. Then, in the next layer up, the model starts learning combinations of these features—maybe red next to a line, or a certain texture.
As you go higher up the layers, more complex patterns emerge—like a wheel detector, a car detector, or even a nose-and-ears detector. This whole layered approach is biologically inspired.
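A toy numerical sketch of that picture, with arbitrary random weights purely for illustration: each neuron weights its inputs and passes the sum through a non-linearity, and stacking layers lets higher levels respond to combinations of lower-level features.

```python
import numpy as np

rng = np.random.default_rng(0)

def neuron(inputs, weights, bias):
    """One artificial neuron: weight the inputs, add them up, and emit a stronger
    output the more the combined signal exceeds zero (ReLU non-linearity)."""
    return np.maximum(0.0, inputs @ weights + bias)

def layer(inputs, weight_matrix, biases):
    """A layer is many such neurons looking at the same inputs in parallel."""
    return np.maximum(0.0, inputs @ weight_matrix + biases)

pixels = rng.random(16)                                              # a toy image patch
low_level = layer(pixels, rng.normal(size=(16, 8)), np.zeros(8))     # e.g. edges, colour blobs
high_level = layer(low_level, rng.normal(size=(8, 4)), np.zeros(4))  # combinations of those
print(neuron(pixels, rng.normal(size=16), 0.0))   # a single neuron's response
print(high_level)                                 # responses of the top layer
```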
As neuroscientists learn more about how real brains work, we definitely want to adapt some of those ideas and see if they help us build better artificial models.
Now, neuroscience itself is still a field where there’s a lot more we don’t know than we do. For example, within Google Research, we have a project in connectomics—the study of brain connectivity. The goal is to take a piece of brain tissue and map out how every neuron is connected, whether it’s in a small cube of tissue or an entire brain (in small organisms).
We’ve been working on this for maybe eight to ten years. We started with a fly brain, then moved to a piece of a mouse brain, then a whole mouse brain, and most recently, we’ve done a significant chunk of a human brain—about a cubic centimetre.
We’re basically using AI to reconstruct the brain. We take a tiny piece of tissue and slice it into ultra-thin layers using a very fine knife. Then we image each layer with an electron microscope. The challenge is that when you see a neuron in one slice, how do you figure out where that same neuron appears in the next slice?
We’ve turned this into a game called Eyewire, where people could trace neurons through slices and label them—like, “This is neuron 7,” and then find the next piece of neuron 7 in the layer below. That helped us build training data. With enough of that data, we could train models to predict neuron connectivity automatically across slices. You can search “Google Connectomics Visualizer” to learn more about it.
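The core matching step can be illustrated with a toy sketch; this is not the production connectomics pipeline, just an overlap-based assignment of a segment in one slice to a labelled neuron in the previous slice.

```python
import numpy as np

def best_match(candidate_mask: np.ndarray, labelled_slice: np.ndarray) -> int:
    """Assign a segment in the next slice to the neuron it overlaps most in the previous one.
    `labelled_slice` holds neuron IDs per pixel (0 = background); `candidate_mask` is boolean."""
    overlapping = labelled_slice[candidate_mask]
    overlapping = overlapping[overlapping > 0]
    if overlapping.size == 0:
        return 0                                   # no overlap: possibly a new neuron
    ids, counts = np.unique(overlapping, return_counts=True)
    return int(ids[np.argmax(counts)])

# Toy 5x5 slices: neuron 7 occupies the top-left corner of the previous slice.
previous = np.zeros((5, 5), dtype=int)
previous[:2, :2] = 7
candidate = np.zeros((5, 5), dtype=bool)
candidate[1:3, 1:3] = True                         # shifted slightly in the next slice
print(best_match(candidate, previous))             # -> 7
```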
(This interview has been edited for clarity and length.)