The last post introduced the first part of the syllabus: the basic categories, each of which introduces a different element of the environment. This week Murray Shanahan introduces our first complex category: Internal Models/Lights Out.
“What do you see when you turn out the light?” John Lennon asks Ringo Starr on Sgt. Pepper’s Lonely Hearts Club Band. “I can't tell you but I know it's mine” is Ringo’s reply. If they’d chosen to pursue this line of philosophical enquiry, Lennon and McCartney might have gone on to ask “What do you know when you turn out the light?”. Quite a lot, would have to be the answer. Even when my eyes are closed (and the rest of my senses are dulled), I know there is still an enduring world out there, external to me, a spatially extended world containing numerous spatially extended persistent objects, some of which are other people and animals who perceive the very same world I do (when our eyes are open). I know that everyday objects typically don’t wink in and out of existence. I know they typically don’t vanish from one place and instantly appear elsewhere.
Of course, things change. Things move and grow and decay. I know that too. But I know that if I spot an object in location X today and then spot the same object in location Y tomorrow then the object must have passed through a contiguous series of locations between X and Y in the meantime. And I know that one solid object cannot pass through another. So the object must have gone round or over any barriers between X and Y. All this knowledge lies at the foundation of our common-sense understanding of the everyday world. It’s part of what developmental psychologists call “core knowledge”, and we rely on it to make sense of what we see, hear, and touch, and thereby to act intelligently.
It’s probably just as well none of this came up in the lyrics of Sgt. Pepper. But if AI researchers want to endow computers with common sense, they need to think about these issues. For the most part, reinforcement learning agents today cannot be said to know any of these things, even when they are trained to carry out tasks in 3D virtual environments. They don’t know that there is an enduring (virtual) world external to them that contains persistent three-dimensional (virtual) objects. I suspect this fundamentally limits their ability to generalise and transfer expertise to unfamiliar tasks. This is why we’ve incorporated a “lights-out” condition in the Animal-AI Olympics. Our hope is that this will encourage researchers to design agents that can adapt to such a contingency, because they “know” that the objects they previously encountered probably still exist somewhere.
This category tests the agent's ability to store internal models of the environment. In these tests, the lights may turn off after a while and the agent must remember the layout of the environment to navigate it in the dark. Many animals are capable of this behaviour, but they have access to more sensory input than our agents do. Hence, the tests here are fairly simple in nature, designed for agents that must rely on visual input alone.
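To make the idea concrete, here is a minimal sketch of a lights-out episode in a toy grid world. It is not the Animal-AI environment or its API: `LightsOutGrid`, `MemoryAgent`, and `run_episode` are hypothetical names invented for this illustration, and the "internal model" is nothing more than a remembered goal position plus dead reckoning of the agent's own moves.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Pos = Tuple[int, int]


@dataclass
class LightsOutGrid:
    """Toy stand-in for a lights-out test (hypothetical, not the Animal-AI API)."""
    agent: Pos
    goal: Pos
    lights_off_at: int  # step index from which observations go dark
    step_count: int = 0

    def observe(self) -> Optional[Pos]:
        # While the lights are on the agent sees the goal; afterwards, nothing.
        return self.goal if self.step_count < self.lights_off_at else None

    def step(self, move: Pos) -> bool:
        # Apply the move and report whether the agent has reached the goal.
        self.agent = (self.agent[0] + move[0], self.agent[1] + move[1])
        self.step_count += 1
        return self.agent == self.goal


class MemoryAgent:
    """Remembers the last goal position it saw and tracks its own position."""

    def __init__(self, start: Pos) -> None:
        self.position = start
        self.remembered_goal: Optional[Pos] = None

    def act(self, observation: Optional[Pos]) -> Pos:
        if observation is not None:
            # Lights on: refresh the internal model.
            self.remembered_goal = observation
        if self.remembered_goal is None:
            # Never saw the goal, so there is nothing to navigate towards.
            return (0, 0)
        dx = self.remembered_goal[0] - self.position[0]
        dy = self.remembered_goal[1] - self.position[1]
        if dx != 0:
            move = (1 if dx > 0 else -1, 0)
        elif dy != 0:
            move = (0, 1 if dy > 0 else -1)
        else:
            move = (0, 0)
        # Dead reckoning: update the believed position without looking.
        self.position = (self.position[0] + move[0], self.position[1] + move[1])
        return move


def run_episode(lights_off_at: int, max_steps: int = 20) -> bool:
    env = LightsOutGrid(agent=(0, 0), goal=(3, 2), lights_off_at=lights_off_at)
    agent = MemoryAgent(start=env.agent)
    for _ in range(max_steps):
        if env.step(agent.act(env.observe())):
            return True
    return False
```

With `run_episode(1)` the agent gets a single lit glimpse of the goal and still reaches it in the dark, because it assumes the goal persists where it was last seen. With `run_episode(0)` the lights are off from the start, so the agent has nothing to remember and never moves. A purely reactive agent, one that acts only on the current observation, would fail in both cases as soon as the lights go out.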
Suggested Basic Training: