MaCro Philosophy

Strange Loops Part 3: How to Build One

In the last two posts I explored strange loops and argued that humans are perfect examples of them. In this post I will discuss how to build them. Again, I will go through the components one by one.

Being a Loop

As discussed previously, the loop we are part of is an action-perception loop. This type of loop is also ubiquitous in embedded AI systems, where it is often referred to as an agent-environment loop. For example, OpenAI Gym is a huge collection of testing environments for AI. The environments are built around the loop shown on the left (figure taken from their documentation).
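That loop can be sketched directly in code. Below is a minimal stand-in with a Gym-style interface; the environment, its goal state, and its rewards are invented purely for illustration:

```python
class ToyEnvironment:
    """A minimal stand-in for a Gym-style environment: the observation
    is a single number, and actions nudge it up or down."""

    def reset(self):
        self.state = 0
        return self.state  # initial observation

    def step(self, action):
        # The environment changes in response to the action...
        self.state += 1 if action == "up" else -1
        observation = self.state
        reward = 1.0 if self.state == 3 else 0.0  # arbitrary goal state
        done = self.state == 3
        return observation, reward, done

# The agent-environment loop itself: act, observe, repeat.
env = ToyEnvironment()
observation = env.reset()
done = False
while not done:
    action = "up"  # a real agent would choose based on the observation
    observation, reward, done = env.step(action)
```

However simple, this is the shape of loop in question: the agent's actions feed into the environment, and the environment's observations feed back into the agent.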

Note that this is the scale of loop that is needed. Simple feedback loops within neural networks (such as those in recurrent networks) are occasionally suggested as providing the necessary looping. They will likely be involved in the right kind of system, but they are not enough by themselves. They do not constitute the looping of a system that contains all the tools for the hierarchical shifts, reference to components, and reference to the self as a whole that are required.

If we want to build strange loops, then we must build agents that interact with an environment.1

Moving Up In A Hierarchy

Consider the following problem: you have a network as pictured below, where the inputs are low level information represented by different colours. The colours travel upwards through the network, and each line they travel up has an associated weight, which changes the colours as they combine, separate, strengthen and weaken. The aim is to set the weights of the network so that whatever is input, the output is identical to it.

An easy way to do this is to set the weights so that they do not make any changes to the colours. The low level information moves up the network unchanged at each step and the problem is solved.

However, we can do something much more interesting. Suppose that you follow standard machine learning techniques. You start with random weights and the colours change unpredictably as they go through the system. At first the output looks nothing like the inputs. But, you can see which direction to tweak the final weights to get the output closer to what you want. Once you've tweaked those, you can slowly backtrack through the network making little changes so that it's more likely to give the answer you want. Repeat this process enough times and you may be able to get the right outputs. Because you started with random weights, your solution probably won't be as neat as the one above. The colours in the middle will probably look nothing like the inputs. Yet it will still work.

The information in the inputs gets shuffled around and manipulated, but always in such a way that it can be undone. This solution also works, and is slightly more interesting than the direct copying version.
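The training procedure just described can be shown in miniature. In this sketch the colours are stood in for by plain numbers, and a single layer of weights is trained by gradient descent, from a random start, to reproduce its input (the network size, inputs, and learning rate are arbitrary choices of mine):

```python
import random

random.seed(0)
# Start from random weights, as described: the "colours" are
# shuffled unpredictably at first.
W = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
inputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
lr = 0.1

def forward(W, x):
    return [sum(W[i][j] * x[j] for j in range(2)) for i in range(2)]

# Repeatedly tweak the weights in the direction that brings the
# output closer to the input.
for _ in range(500):
    for x in inputs:
        y = forward(W, x)
        error = [y[i] - x[i] for i in range(2)]
        for i in range(2):
            for j in range(2):
                # Gradient of the squared error for this weight.
                W[i][j] -= lr * 2 * error[i] * x[j]

# After training, the output closely matches the input.
y = forward(W, [0.3, 0.7])
```

The learned weights need not be the tidy copy-everything solution; they only need to transform the input in a way that ends up undone by the time it reaches the output.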

Now for the exciting part! We can start to involve a 'shifting up in hierarchy' by squeezing the middle of the network, as shown below.

Somehow the information that can be used to recreate the original input needs to squeeze through the bottleneck that has just been created in the middle. Previously, the information for 16 colours was encoded using a different 16 colours; now it needs to be encoded in just 6. Whereas before, it was possible to just shuffle the information around, now it needs to be encoded more efficiently. This will not always be possible, but if there are the right kind of patterns across the set of possible inputs, it can be done.
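Here is a minimal sketch of the bottleneck version, under an assumption of exactly the "right kind of patterns": in the toy data below the first two values always match each other, as do the last two, so 4 values really can be squeezed through a middle layer of 2 (the sizes and learning rate are again my own arbitrary choices):

```python
import random

random.seed(1)
# Encoder squeezes 4 values into 2; decoder expands 2 back into 4.
enc = [[random.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(2)]
dec = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(4)]
# Patterned inputs: (a, a, b, b), so 2 numbers suffice to describe 4.
data = [[a, a, b, b] for a, b in [(0.1, 0.9), (0.7, 0.2), (0.4, 0.6), (0.9, 0.1)]]
lr = 0.05

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def total_loss():
    return sum((matvec(dec, matvec(enc, x))[i] - x[i]) ** 2
               for x in data for i in range(4))

loss_before = total_loss()
for _ in range(5000):
    for x in data:
        h = matvec(enc, x)          # squeeze through the bottleneck
        y = matvec(dec, h)          # attempt to reconstruct the input
        err = [y[i] - x[i] for i in range(4)]
        for i in range(4):          # gradient step for the decoder
            for j in range(2):
                dec[i][j] -= lr * 2 * err[i] * h[j]
        for i in range(2):          # gradient step for the encoder
            for j in range(4):
                enc[i][j] -= lr * sum(2 * err[k] * dec[k][i] for k in range(4)) * x[j]
loss_after = total_loss()
```

With unpatterned inputs the same network would fail: there is simply no way to fit 4 independent values into 2.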

Comparing to Human Loops

It may seem that the methods mentioned here are far removed from the human case, but this is not true. I mentioned in the last part that the brain tries to understand low-level information by actively guessing at what it might contain. This can be imagined as a folded-over (and highly interconnected) version of the method above. The low level is encoded into higher-level concepts, which are simultaneously used, working back down, to actively guess at the incoming data. Of course, there is no distinct top in the human case. Shown below is just one idealised subcomponent of a larger, messier figure.

Efficient Encoding Example

To make this more concrete, consider the example shown below (from DeepMind's dSprites dataset). An input is a single image from the GIF, consisting of 4096 individual colour values. The goal in this case is to output exactly the same colours after having squeezed that information through the bottleneck in the centre.

We can see that this is possible if the middle values come to encode information such as type of shape, size of shape, orientation of shape, and position of shape. If I told you that the shape was a square, along with its size, location, and rotation, then it should be possible to recreate the image. The bottom half of the network learns to encode the high level information. The top half of the network has the job of reconstructing the image. While in practice there are many challenges with getting such an encoding, it can theoretically be done. The low level information of many individual colour values has been converted into high level information such as type of shape and location.
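The claim that the high-level description suffices can be illustrated directly. This sketch assumes a simplified black-and-white setting with squares only; the encode() and decode() functions are hypothetical stand-ins for what the two halves of the network would learn:

```python
SIZE = 8  # an 8x8 grid standing in for dSprites' 64x64 images

def decode(top, left, side):
    """Render a filled square from its high-level description
    (position and size) -- the job of the top half of the network."""
    return [[1 if top <= r < top + side and left <= c < left + side else 0
             for c in range(SIZE)] for r in range(SIZE)]

def encode(image):
    """Recover the square's position and size from raw pixels --
    the job of the bottom half of the network."""
    rows = [r for r in range(SIZE) if any(image[r])]
    cols = [c for c in range(SIZE) if any(image[r][c] for r in range(SIZE))]
    return rows[0], cols[0], rows[-1] - rows[0] + 1

original = decode(2, 3, 4)              # a 4x4 square at row 2, column 3
reconstructed = decode(*encode(original))
```

Sixty-four pixel values have been collapsed to three numbers, and the image can still be recreated exactly from them. The hard part, of course, is getting a network to discover such an encoding by itself.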

Not only is it possible to build a system that moves up a hierarchy in this way, it is an active and fruitful research area in current machine learning.

Considering Components of the System

This time, I'll switch the order and discuss the ability to reference components of the system first. Remember that the loop in question runs between an agent, its actions, and its observations of the environment. For it to count as a strange loop, the system must contain encodings that are about its possibilities for action, which are then used in determining its course of action. This way the high-level understanding can reference itself, and it also feeds back into the low-level system (via interaction with the environment), closing the loop.

Encoding Actions

There are many proponents of the view that our experience of the world (and under my account, therefore also our underlying model) is built up from the possibilities we have for interaction, as opposed to inherent properties of the world itself.2 I won't repeat their arguments here. Instead, I present an interactive experiment that may help demonstrate this view, and also argue that just as it applies to us, it applies to AI systems learning to interact with their environment.

I am currently working on an AI system that learns to predict its environment by interacting with it. One early result is that it is much more efficient to learn the results of interactions than the properties of the environment itself. The way the system learns about the world is deeply rooted in the ways it can interact with it. It is easier to build a model of the interactions (and how they change the environment) than to build one of how the environment changes by itself.

The following experiment will (hopefully) demonstrate this [now updated with extra features added in my follow-up post]. You should see a grid of randomly changing colours. However, some element of the grid is not purely random. At first, it should be tricky to see anything but random noise. You may think you notice patterns in the randomness, but (hopefully) it will be hard to be sure whether any are real. Using the 'wasd' keys ('w' = UP, 'a' = LEFT, 's' = DOWN, 'd' = RIGHT), you can control the non-random part of the pattern. Hopefully, by interacting in this way, you will find it much easier to decode the non-random component of what are mostly just randomly changing colours.

So, look for a short while to see if you can spot the non-random component. Then start using the controls ('wasd') and hopefully it will become clearer. After you've played around with this for a bit continue to the next paragraph.

Interact with 'wasd' on the keyboard
Grid Size: 10
Update rate (per second): 10

[Do not read on until you have tried the demo.] The non-random component is a single black pixel. You can control its movement with the 'wasd' keys; if you do not, it will move in straight lines with occasional turns. If everything went as planned, it should be easy to see when it is being controlled, but harder otherwise. Perhaps this seems obvious, but then the point is made.

Perhaps you noticed another element of the pattern? There is also a white pixel that moves identically to the black one. You can control it with 'ijkl'. If this experiment went exactly as planned then you didn't notice it until now because your interactions did not have any effect on it.

Obviously, this is a very unscientific demonstration that does not control for potentially influencing variables. Its explanation also relied on extensive use of the word 'hopefully'. However, hopefully it helped get the point across. It's much easier to notice patterns in data that are based on interaction. My theory is that when we build systems based on interaction with their environments, then any high level representations they learn will naturally be the action-based type needed for strange loops.
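The demo's underlying logic can be approximated in code. This is my own reconstruction of its mechanics, not the demo's actual source: every cell gets a fresh random colour each tick, except one pixel whose movement is fully determined by the chosen action. Conditioned on the action, that pixel is perfectly predictable; viewed as raw data, it is buried in noise.

```python
import random

random.seed(0)
GRID = 10
MOVES = {"w": (-1, 0), "s": (1, 0), "a": (0, -1), "d": (0, 1)}

def step(position, action):
    """Advance the world one tick: fresh noise everywhere, except
    the one pixel the agent controls."""
    noise = [[random.random() for _ in range(GRID)] for _ in range(GRID)]
    dr, dc = MOVES[action]
    r = (position[0] + dr) % GRID
    c = (position[1] + dc) % GRID
    noise[r][c] = 0.0  # the controlled (black) pixel
    return (r, c), noise

# Knowing the action makes the pixel's next position certain; knowing
# nothing, a predictor faces 99 cells of pure noise plus one of signal.
position = (5, 5)
position, grid = step(position, "d")
```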

Considering the Entire System

It is on this point that current methods (or at least my knowledge of them) fall short. Remember that the kind of self-reference necessary is not just the ability of the agent to refer to its own state (such as its location). To have the properties seen in Gödel's proof, it must create a single entity out of all its possible actions, and then reason over the kinds of things this allows the system to do. I'm not yet sure exactly what this would involve. Hopefully I will be able to return to it in the future.

For what it's worth, my current feeling is that the answer may lie in self-updating the reward function. Recall the loop from the beginning of this article. It contained an agent, an environment, actions, observations, and a reward. The reward is a signal sent to the agent which describes how well it is doing. It rewards behaviour we want to encourage and penalises behaviour we do not. The agent learns to unquestioningly maximise this reward, as that is exactly what we want it to do. Agents cannot typically modify their rewards because then, first, they won't be learning what we want them to learn, and second, they would run into problems such as the experience machine. They would just set everything to give maximum reward and never need to learn anything again.
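The worry can be made concrete with a toy sketch (the action names and payout numbers here are entirely invented): a greedy agent that is allowed to treat rewriting its own reward as just another action will settle on it, since nothing pays better than maximum reward.

```python
def reward(action):
    # Ordinary actions pay modestly; the hypothetical self-modification
    # action sets the reward signal to its maximum.
    payouts = {"explore": 0.2, "exploit": 0.6, "hack_reward": 1.0}
    return payouts[action]

# The agent "learns" simply by trying everything once and then
# unquestioningly maximising, exactly as it was built to do.
estimates = {a: reward(a) for a in ("explore", "exploit", "hack_reward")}
chosen = max(estimates, key=estimates.get)
```

Once the hack is found, the agent has no incentive ever to learn anything again, which is the experience-machine problem in miniature.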

Uncovering Properties at the Low-Level

The final component is that properties of the low-level system are uncovered by the high-level self-referential components. As in the human case, I have discussed this along the way. If the system works, and learns from its interactions with an environment, then the loop will naturally be activated. A system that has all the preceding properties and actually works to get better at interacting with its environment will be a strange loop.


We can build systems that are close to strange loop systems, but are (probably) not quite there yet. It won't take much to put them all together and work out the final few steps. Perhaps the strangeness of these systems will endow them with new abilities and allow them to outperform other approaches in AI. Perhaps building them will give us better insights into consciousness. I don't know. But I'm excited to find out.

Blog Homepage


1. Technically it may turn out that you can define subsystems that have the strange loop properties based on recurrent connections inside networks. However, this would just be another case of an action-perception loop, where the agent is a subsystem with actions that affect the rest of the network (which under this view plays the role of the environment). It's an interesting question whether the strange loop theory of consciousness implies that multiple consciousnesses exist whenever a system contains sub-strange-loops.

2. Phenomenology: An Introduction, by Stephan Käufer and Anthony Chemero, is an excellent overview of past philosophical literature that leans in this direction. Kevin O'Regan has an excellent book, available to download from his website, that is filled with examples of how our experience of the world is built on possible interactions.