Saturday, July 12, 2025

Suffering from Success: Best Case Crisis of AI

Artificial Intelligence (AI) seems to be making its way from science fiction into the real world. Companies like OpenAI and DeepMind are making great progress in training and deploying Large Language Models (LLMs) and other AI agents, which can handle tasks ranging from writing letters to writing code. Having tried them myself, I find them quite helpful. On top of this, AI is getting noticeably better month by month. In fact, some predict that AI will reach AGI (intelligence of human-level breadth and superhuman-level depth) by 2027.

Of course, a technology this revolutionary attracts doom-oriented thinkers. Even the Industrial Revolution, whose tools were nowhere near replacing humanity, had its Luddites who wanted out. Many decades after it, we still remember radicals like Ted Kaczynski, who were deeply suspicious of technological progress. Now, with the advent of AI, we have a new wave of naysayers like Eliezer Yudkowsky, who has been ringing alarm bells about the apocalyptic agency of AI for years. The problems that AI poses are slowly dawning even on its godfathers, who have only relatively recently begun to seriously consider, or perhaps even fear, these risks. It has become a famous trend to state one's estimated probability of doom from AI, written P(doom); the public P(doom) of some thinkers in this field are given here. The core problem is that AIs will eventually have wills of their own, and that these wills will not be aligned with what is good, or even endurable, for humanity.

In this article, I am going to be optimistic and look only at the risks we will face when the best-case scenario of AI materializes. That is, I will assume we solve the famous AI alignment problem and ask what kind of society we will have with benevolent and powerful AI at hand. A few dangers come to mind.

One imminent threat of AI is to the job market. Disembodied AI is already replacing many white-collar jobs (reference). This will only intensify once we manage to put robotic AI in various workplaces. There is, of course, the immediate threat that income inequality will be exacerbated, along with the class consciousness this will foment. Again, I am going to be an optimist and assume there will be some solution to this via some form of welfare state. Moving beyond people's financial needs, we come to the need for meaning. We now get the opposite of Viktor Frankl's problem in his famous Man's Search for Meaning: no suffering and no meaning, instead of facing suffering with meaning. Not struggling to make a living, as ordained in Genesis, will have negative mental repercussions. Pulling from various articles and papers, Google's AI Gemini puts it this way:

Joblessness is significantly linked to mental health problems. Individuals experiencing unemployment are more likely to report lower self-perceived mental health, increased rates of depression, and higher levels of anxiety and stress. Specifically, research indicates that unemployed individuals are twice as likely to experience depression symptoms and major depressive disorder compared to those with employment.

Put simply, people will likely have a crisis of meaning if they are not needed as participants in the world. Sure, people can try to convince themselves that they can mean something to each other in families and friendships, but will such relationships be healthy? There is a deeper interpersonal issue at play here that deserves separate attention.

But it's not all doom and gloom. Even if AI can automate many, or perhaps all, of the things we need for a functioning civilizational infrastructure, the notion of human uniqueness still remains. We are, so far, the only verifiably sentient entities in the universe. This, in my opinion, is at the heart of both our peril and our defense against benign but overwhelming AI.

Some think that human uniqueness is safe for the foreseeable future because AI is bounded in its capacity to think. People who hold this kind of view include the famous mathematician Roger Penrose and the computer scientist Yann LeCun. Dr. LeCun believes that we are not close to AGI and superhuman-level synthetic entities, in large part because of his low esteem for LLMs when it comes to intelligence. Dr. Penrose asserts that consciousness simply cannot arise out of computation, a case he makes in his book The Emperor's New Mind: Concerning Computers, Minds, and the Laws of Physics. Although I have only skimmed the book, here is the gist of its message. Dr. Penrose speaks of microtubules (the skeleton, transport lanes, and scaffolding of cells) giving rise to consciousness through quantum effects. There is a lot to be read, understood, and said here, but for this essay we will just note that Dr. Penrose argues this effect is beyond computation. Being grounded in materialism, Dr. Penrose believes there might be a natural explanation for this, but that we have not reached it yet. Whether this bottleneck for consciousness is surmountable by mere humans or not, the conclusion is that, if Penrose is right, we do not have an immediate human-obsolescence problem.

Dr. Penrose's arguments are, by his own admission, speculative and not yet complete, and in general this question of human uniqueness/specialness in the face of AGI and Artificial Super Intelligence (ASI) remains open. For now, it seems to me that we simply have to have faith in ourselves. For Christians, this means continuing to believe that we are fearfully and wonderfully made in the image of God. For the non-religious, this means holding on to the notion of the Übermensch, to the precarious belief of being the only sentient beings in the universe (quite a leap of faith granted materialist assumptions), or just forgetting this problem in some Epicurean sense.



Thursday, June 19, 2025

Pursuit Curves and Reinforcement Learning

The news these days is filled with the tit-for-tat conflict between Israel and Iran. Both countries are volleying missiles at each other, while the defending nation uses some means to intercept and destroy said missiles. Israel in particular is known for its Iron Dome, a mobile air defense system; during a 2014 war, the Iron Dome was said to be nearly 90% effective. The idea of such an air defense system is currently being contemplated by the US and Canada.

Obviously, any nation would love to simply zap all missiles coming their way. But how does one build an Iron Dome? Well, there is a lot that goes into building it (primary among which might be money) but let’s focus on one aspect: how does the interceptor catch up with the missile? To answer this question, we delve into the math of pursuit curves.

Pursuit curves describe the path taken by a pursuer/chaser that is always moving directly toward a moving target. Let the pursuer's position be given by \( \vec{P}\) and the target's by \( \vec{T}\). We make the idealized assumption that the pursuer always moves directly toward the target's current position. Hence,

\[ \frac{d\vec{P}}{dt}=v\,\frac{\vec{T}-\vec{P}}{|\vec{T}-\vec{P}|} \] where \(v\) is the speed of the pursuer, which we assume to be constant. Visually, we see that the vector \(\vec{T}-\vec{P}\) points from \(\vec{P}\) toward \(\vec{T}\).

Now, if the target moves along a simple, known path, we can often solve for the trajectory of the pursuer using analytical methods for Ordinary Differential Equations. However, this tractability quickly breaks down. Even for a target moving on a simple circular path at a speed equal to the pursuer's, we get quite intricate pursuit curves. This is demonstrated in the GIF below.
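To make this concrete, here is a minimal numerical sketch of the pursuit ODE using a forward-Euler step and a circular target. The parameter values and starting position are illustrative choices of mine, not the exact values behind the GIF.

```python
import numpy as np

# Illustrative parameters; the GIF's exact values are not specified.
v = 1.0                       # pursuer speed (equal to the target's here)
dt = 0.01                     # time step
steps = 5000

P = np.array([2.0, 0.0])      # pursuer's starting position (a choice)
path = [P.copy()]

for k in range(steps):
    t = k * dt
    # Target moving on the unit circle at unit speed
    T = np.array([np.cos(t), np.sin(t)])
    gap = T - P
    dist = np.linalg.norm(gap)
    if dist < 1e-9:
        break                 # effectively on top of the target
    # Forward-Euler step of dP/dt = v (T - P) / |T - P|
    P = P + v * dt * gap / dist
    path.append(P.copy())

path = np.array(path)         # trajectory, ready for plotting
```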

Due to environmental interactions (such as wind) or internal imperfections, the target might move somewhat randomly. Its equations of motion would then be given by the Stochastic Differential Equation:

\[ d\vec{T}=\vec{\mu}(t)\,dt+\vec{\sigma}(t)\,d\vec{W}(t) \] where \(\vec{\mu}\) is the deterministic (intended) velocity and the term \(\vec{\sigma}(t)\,d\vec{W}(t)\) captures the noise. \(\vec{\mu}\) can describe circular motion like before, or some other chosen shape. Going forward, we will set \(\vec{\mu}\) to zero to emphasize the challenge posed by a random target, and we will take \(\vec{\sigma}\) to be a single constant. We will see that, even then, we face interesting challenges when we add in some realism.

We will now see a noisy chase. For better visibility, we make the pursuer's effective speed slower than that of the target. By effective speed we mean the root-mean-squared displacement over a time step \(\Delta t\), i.e. \(\frac{\sqrt{\langle \Delta x^2\rangle}}{\Delta t}\). (On a side note, the GIF reminds me of a Key and Peele skit.)
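Here is a sketch of how such a noisy chase can be simulated, using an Euler–Maruyama step for the target. The speed calibration follows the effective-speed definition above; the specific constants and starting points are my own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

dt = 0.01
steps = 2000
sigma = 1.0                           # constant noise amplitude

# In 2D the target's RMS displacement per step is sigma*sqrt(2*dt), so its
# effective speed is sigma*sqrt(2/dt); we make the pursuer half as fast.
v = 0.5 * sigma * np.sqrt(2.0 / dt)

P = np.array([2.0, 2.0])              # pursuer (starting point is a choice)
T = np.array([0.0, 0.0])              # target

for _ in range(steps):
    # Euler-Maruyama step of dT = sigma dW (mu set to zero)
    T = T + sigma * np.sqrt(dt) * rng.standard_normal(2)
    gap = T - P
    dist = np.linalg.norm(gap)
    if dist < 1e-9:
        break
    P = P + v * dt * gap / dist       # pursuer still sees the target every step
```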

We will now introduce a challenge that brings us closer to the real world. So far, we have been giving the pursuer as many observations of the target as there are time steps in the simulation. But this is unrealistic: the interceptor can only afford to watch the target a few times during the chase. Hence, with a fixed budget of observations, catching the target becomes a more distant goal.

There are many ways to schedule the observation times with a fixed budget of observations (M). One is to set an absolute final time T for the chase and split it evenly into M observations. We will call this a uniform budget. A second and third approach are to spread the observations evenly across the first and fourth quarters of the interval between \( t=0\) and \(t=T\); we will call these the front-loaded and back-loaded budgets, respectively. A fourth approach is to sample the observation times randomly from a uniform distribution across the full time interval. We will call this a random budget.
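In code, these four budgets are just small generators of observation-time arrays. A sketch (the function names are mine):

```python
import numpy as np

def uniform_budget(M, T):
    """M observation times spread evenly over [0, T]."""
    return np.linspace(0.0, T, M)

def front_loaded_budget(M, T):
    """All M observations spread evenly over the first quarter, [0, T/4]."""
    return np.linspace(0.0, T / 4.0, M)

def back_loaded_budget(M, T):
    """All M observations spread evenly over the last quarter, [3T/4, T]."""
    return np.linspace(3.0 * T / 4.0, T, M)

def random_budget(M, T, rng=None):
    """M observation times sampled uniformly at random over [0, T]."""
    rng = rng or np.random.default_rng()
    return np.sort(rng.uniform(0.0, T, M))
```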

With M=10, T=20 sec (I am skipping non-dimensionalization for clarity), and with pure Brownian noise like before, we get the following demonstrations.

In these runs of the simulation, only the front-loaded approach seems to be doing very badly. But we only had one run for each approach, and with fairly simple target behavior. Moreover, the other three methods do not seem to be doing too well either: with a modestly small catch radius, all of them would still be in pursuit (the catch radius is the radius of a circle centered on the target; getting within it signifies successful interception).

One cannot help but wonder whether the observation times can be optimized to maximize the likelihood of getting within the catch radius before the final time T, or at least of ending up close to it when the clock stops. One clever approach is to use a geometric sequence of observation times, which waits longer and longer between later observations. This is clever for Brownian Motion (BM) because BM's uncertainty grows over time, so later observations would be increasingly misleading, and it is best to hunker down and wait out longer intervals later on. But this is predicated on a Brownian target. What if we have other motion patterns, like a Lévy walk or a sand walk from Dune? What is a good foundational method for optimizing our observation times?
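For reference, here is one way to realize such a geometric schedule; the ratio r and the normalization so that the last observation lands exactly at T are my choices.

```python
import numpy as np

def geometric_budget(M, T, r=1.5):
    """M observation times in (0, T] whose gaps grow geometrically by r.

    With r > 1 the early observations come quickly and the later waits
    stretch out; the times are scaled so the final one lands at T.
    """
    k = np.arange(1, M + 1)
    return T * (r**k - 1.0) / (r**M - 1.0)
```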

The answer is AI or, more modestly, Reinforcement Learning (RL). RL is a type of machine learning in which an agent (AI) develops a policy for its actions based on positive or negative reinforcement. It looks at the state of things it is allowed to see, makes decisions, and faces the consequences of those decisions. It does this many, many times, living and learning until the training stops. For our RL agent we have the following setup:

  • RL agent decision/action: waiting time until next observation
  • RL agent observable: current pursuer and last known target positions, current time, and remaining observation budget
  • RL agent reward: +1 if target is intercepted (within catch radius) otherwise negative of final distance between pursuer and target
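Here is a minimal environment sketch implementing the setup above. The class name, starting positions, catch radius, and the rule that the pursuer heads for the last known target position between observations are all my assumptions.

```python
import numpy as np

class ChaseEnv:
    """Minimal episodic environment matching the setup above (a sketch)."""

    def __init__(self, M=10, T=20.0, sigma=1.0, v=1.0, catch_radius=0.5, seed=None):
        self.M, self.T = M, T
        self.sigma, self.v, self.catch_radius = sigma, v, catch_radius
        self.rng = np.random.default_rng(seed)

    def reset(self):
        self.t, self.budget = 0.0, self.M
        self.P = np.array([2.0, 2.0])        # pursuer start (illustrative)
        self.target = np.array([0.0, 0.0])   # true target, hidden between observations
        self.last_seen = self.target.copy()
        return self._obs()

    def _obs(self):
        # Agent sees: pursuer position, last known target position, time, budget
        return np.concatenate([self.P, self.last_seen, [self.t, self.budget]])

    def step(self, wait):
        """Action: how long to wait before the next observation."""
        wait = float(np.clip(wait, 1e-3, self.T - self.t))
        # Target diffuses (mu = 0) while the pursuer heads toward the last
        # known position, stopping there if it arrives early.
        self.target = self.target + self.sigma * np.sqrt(wait) * self.rng.standard_normal(2)
        d = self.last_seen - self.P
        gap = np.linalg.norm(d)
        if gap > 1e-9:
            self.P = self.P + min(self.v * wait, gap) * d / gap
        self.t += wait
        self.budget -= 1
        self.last_seen = self.target.copy()  # the new observation
        dist = np.linalg.norm(self.target - self.P)
        done = dist < self.catch_radius or self.budget == 0 or self.t >= self.T - 1e-9
        reward = 0.0
        if done:
            reward = 1.0 if dist < self.catch_radius else -dist
        return self._obs(), reward, done
```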

Using a standard RL algorithm (a policy gradient method) and training over a modest number of episodes, we get an average total reward of -0.69 when we test the RL agent 100 times (i.e., simulate 100 chases with the agent picking the observation times). Under the same circumstances, the other methods get the following average total rewards:

  • Uniform budget: -1.65
  • Front-loaded budget: -1.81
  • Geometric budget: -1.07

Clearly, the RL agent wins.
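For the curious, here is a bare-bones sketch of the kind of policy-gradient (REINFORCE-style) loop that could train such an agent against the environment sketched earlier. The post does not specify the exact algorithm or hyperparameters, so everything here is illustrative.

```python
import numpy as np

# Bare-bones REINFORCE with a Gaussian policy over the log of the waiting
# time. Linear-in-observation mean; all hyperparameters are guesses.
env = ChaseEnv(seed=0)
rng = np.random.default_rng(1)
w = np.zeros(6)          # policy weights (the observation is 6-dimensional)
std, lr = 0.5, 1e-3      # exploration noise and learning rate

for episode in range(5000):
    obs, done, traj = env.reset(), False, []
    while not done:
        mean = w @ obs
        a = mean + std * rng.standard_normal()   # sample a log-wait
        next_obs, r, done = env.step(np.exp(a))  # act with the actual wait
        traj.append((obs, a, mean, r))
        obs = next_obs
    G = sum(r for *_, r in traj)                 # episode return
    for o, a, mean, _ in traj:
        # Ascend the gradient of log N(a; mean, std^2) weighted by the return
        w += lr * G * (a - mean) / std**2 * o
```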

This is just scratching the surface of the power of RL. Even with this scenario there is a lot more we can do to make the interception more likely and more efficient by giving the RL agent more authority. God willing, this will be a topic for another day.

To end on an eerie note, here is a GIF of a pursuit with the RL agent picking the observation times, under circumstances matching those of the previous GIF (for comparison). Note that we only plot positions when the pursuer makes an observation/decision.