The Millennium Prize Problems Are My Benchmark for AGI
Artificial General Intelligence (AGI) is the hypothetical ability of an intelligent agent to understand or learn any intellectual task that a human being can. It is true human-level (if not necessarily human-like) intelligence, and I admit that I am skeptical that current AI technology like LLMs, however fascinating and useful they are, can ever achieve it.
A lot of people today think that some advanced LLMs like Anthropic’s Claude 3 are already AGI because they have indeed gotten much better at whatever the hell it is they do. I think they are getting better at providing answers based on synthesizing and summarizing their training data. But I can see the argument that this is possibly a distinction without a difference. If you train a system to be really, really, really good at just predicting the next token, maybe you end up with AGI, since the best way to predict the next token is indeed some form of true “understanding.” Maybe really advanced “lossy compression” is just…knowledge. I don’t think so but what do I know?
To avoid being accused of moving the goalposts in the future, I'll state my benchmark now: I believe that the most compelling test for AGI would be an AI system solving one or more of the Millennium Prize Problems, a set of seven mathematical problems (six of them still unsolved) that have been identified by the Clay Mathematics Institute as some of the most challenging and important in the field.
The Millennium Prize Problems are:
- The Riemann Hypothesis, which relates to the distribution of prime numbers
- The Poincaré Conjecture (solved in 2003 by Grigori Perelman), which deals with the characterization of the three-dimensional sphere
- The P vs NP Problem, which asks whether every problem whose solution can be quickly verified can also be quickly solved
- The Hodge Conjecture, which deals with the characterization of geometric structures called algebraic cycles
- The Birch and Swinnerton-Dyer Conjecture, which relates to the properties of elliptic curves
- The Navier-Stokes Existence and Smoothness problem, which deals with the behavior of fluid flows
- The Yang-Mills Existence and Mass Gap problem, which relates to quantum field theory
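To make the P vs NP item above a bit more concrete, here is a minimal Python sketch of the gap between checking an answer and finding one. It uses subset sum (pick numbers from a list that add up to a target), a textbook NP-complete problem. The function names `verify` and `solve` and the sample numbers are my own illustrative choices, and the brute-force search is just the most naive solver, not a claim about the best known algorithm.

```python
from itertools import combinations


def verify(numbers, target, candidate):
    """Check a proposed answer in polynomial time.

    The candidate must only use values available in `numbers`
    (respecting duplicates) and must sum to `target`.
    """
    pool = list(numbers)
    for x in candidate:
        if x not in pool:
            return False
        pool.remove(x)
    return sum(candidate) == target


def solve(numbers, target):
    """Find an answer by brute force.

    In the worst case this tries every non-empty subset,
    which is 2**len(numbers) - 1 candidates.
    """
    for size in range(1, len(numbers) + 1):
        for subset in combinations(numbers, size):
            if sum(subset) == target:
                return list(subset)
    return None


numbers = [3, 34, 4, 12, 5, 2]
answer = solve(numbers, 9)
print(answer, verify(numbers, 9, answer))
```

Checking stays cheap as the list grows, while this naive search doubles its worst-case work with every extra number. P vs NP asks, roughly, whether that gap can always be closed for problems of this kind.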
These problems provide a compelling benchmark for AGI because they require not just computational power, but also true creativity and intelligence. Mathematics (as opposed to computation, or arithmetic) is not just about crunching numbers; it requires the ability to see patterns, make connections, and develop new ideas. An AI system that could solve one of these problems would be demonstrating a level of insight and originality that goes beyond simply processing large amounts of data.
Also: the Millennium Prize Problems are a good benchmark for AGI in that they are well-defined and can be independently verified. An AGI may also be able to write a truly original Great American Novel or complete Schubert’s Unfinished Symphony, but judging those would be inherently subjective. A mathematical proof, on the other hand, might be “beautiful” or “ugly,” but it’s either valid or not.