A suggested solution to AGI safety is to ensure intelligent systems only work on abstract mathematical problems that have, as far as we know, little to no connection with the “real world”. I’ll sidestep some obvious problems (limited usefulness, the temptation to formulate real-world problems in mathematical terms) to explore some implications of this idea, and then tie it back to modelling physical reality efficiently.

Ideally, these powerful “math oracles” would be handed datasets related to purely theoretical problems, model the data, and send back some form of domain-relevant answer. These AIs would live solely in a universe of esoteric mathematical objects and their relations. They would neither know nor care about humans and all their machinations, the births and deaths of stars, the decay of bromine atoms, the combining of simple cells into wondrously complex structures, or the rise and fall of galactic empires. Indeed, they wouldn’t even know or care about their own existence. There’s no concept of “math oracle running on a planet named Earth” that can be derived from the data. The whole idea of physical reality gets thrown out here.

Or such is the hope.

Let’s imagine one of these superintelligent math oracles that’s tasked with modelling a single (carefully selected) integer sequence. An oracle that learns an efficient, predictive model of the primes (specifically, their distribution among the natural numbers) can be said to fully “understand” that sequence. But is that all it understands? Due to the fundamental nature of primes in mathematics, an AI that finds their underlying pattern would arguably “know” number theory (and related mathematical fields) on a far deeper level than any human. A task that seemed constrained in its implications (“Predict the next n primes”) with a simple, associated dataset (e.g. the first million primes) gave the AI a far wider insight into hidden mathematical reality than was perhaps imagined at the outset.
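To make the setup concrete, here is a minimal sketch of what such a dataset and a deliberately weak "model" of it might look like. Everything here is an illustrative assumption rather than part of the original thought experiment: the function names `primes_up_to` and `pnt_estimate` are hypothetical, the dataset is cut off at 100,000 instead of the first million primes to keep it quick, and the predictor is just the textbook prime number theorem approximation (the n-th prime is roughly n ln n). The gap between this crude estimate and the true values is the structure a real prime-predicting oracle would have to capture.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: return every prime <= limit, in order."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out every multiple of p, starting at p*p.
            count = len(range(p * p, limit + 1, p))
            is_prime[p * p :: p] = bytearray(count)
    return [n for n, flag in enumerate(is_prime) if flag]

def pnt_estimate(n):
    """Crude prime number theorem estimate of the n-th prime: n * ln(n)."""
    return n * math.log(n)

# The "dataset": all primes below an arbitrary, illustrative cutoff.
primes = primes_up_to(100_000)

# Compare the crude estimate with the true value of the last prime in the list.
n = len(primes)
relative_error = abs(pnt_estimate(n) - primes[-1]) / primes[-1]
print(f"{n} primes, last = {primes[-1]}, baseline relative error = {relative_error:.3f}")
```

The baseline only tracks the average density of the primes. An oracle of the kind imagined here would need to predict each term exactly, which is a far stronger requirement.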

Note that this doesn’t necessarily have to be viewed negatively.

On top of that, this prime-predicting oracle might know more than the secret workings of numbers. The primes seem to bleed into our physical reality. For instance, patterns in primes are mirrored in certain forms of matter interactions. Periodical cicadas emerge on 13- or 17-year cycles, both of which are prime. Whether these are coincidences or hints of some deeper connection between the primes and the real world, they raise the possibility that modelling abstract objects can still give an entity insights into the physical world. Insights, needless to say, that humans don’t currently possess.

Of course, the fact that the alien abstractions the AI has learned from the data overlap with some other structures we humans call “the real world” will probably be of zero interest to it. And as I previously mentioned, this potential phenomenon isn’t necessarily a negative. In fact, I think it could be advantageous to humans, because it hints at the possibility of radical data efficiency when the task is modelling a world. That is, the real world.

What kind of data would we feed a superintelligent oracle whose task was modelling known reality, and not just abstract mathematical objects? Well, there’s Wikipedia. Marcus Hutter uses an excerpt of it as the basis of his lossless compression contest, in the hope that better compression will lead to AGI. But Wikipedia is huge. Modelling the entire thing is grossly inefficient. Perhaps we could just use Vital Level V, a collection of the ~50,000 most important Wikipedia articles? That’s still a huge amount of data though. Point a webcam at a busy town square? Way too much data.
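The compression contest mentioned above rests on the idea that how well you can losslessly compress data is a proxy for how well you have modelled it. As a rough sketch of that idea only (not the contest’s actual scoring method), the snippet below uses Python’s standard `lzma` module as a stand-in "model" and compares a highly repetitive byte string against random bytes of the same length. The helper name `compressed_ratio` and both inputs are made up for illustration.

```python
import lzma
import os

def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size; lower means more structure was found."""
    return len(lzma.compress(data)) / len(data)

# Synthetic, illustrative inputs: one with obvious structure, one with none.
structured = b"the quick brown fox jumps over the lazy dog. " * 2000
noise = os.urandom(len(structured))

ratio_structured = compressed_ratio(structured)
ratio_noise = compressed_ratio(noise)

print(f"structured: {ratio_structured:.4f}")
print(f"noise:      {ratio_noise:.4f}")
```

A general-purpose compressor only finds shallow regularities such as repetition. The hope behind the contest is that squeezing out the remaining bits requires something closer to understanding the content.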

What would be ideal is a small and simple (but probably very hard to model) dataset that gives an AI profound insights into our physical world, similar to how the prime-predictor uses its own simple dataset (a list of primes) to glean insights into mathematical reality.

Perhaps such an example is the VIX, a market volatility index that tracks the expected volatility of the S&P 500 over the next 30 days, as implied by option prices. It could be said to capture people’s sense of foreboding about the near future at any given time. How much information is encoded in its price changes? It would seem to me that humans are in there, with all their fears, hopes and ambitions. The current state of the world? The hidden workings of the world? Could reality as we know it be derived from a sequence of VIX price changes?

That’s just one idea for an ultra-basic world-modelling dataset. I’d be happy to hear more suggestions.