
What Happens When AI Systems Forget? — Finally Understanding Machine Unlearning



AI systems continue to grow in size and sophistication, and an interesting question has emerged: can AI systems forget the way humans do? Just as our memories fade over the years as new experiences accumulate, researchers are now asking whether models trained on vast amounts of data and examples can be made to let go of some of what they have learned.

After many years of machine learning research, it is now clear that these systems (such as deep learning or reinforcement learning models) need the ability to "forget", not only to stay accountable on privacy and fairness, but also as part of their training approach and ongoing maintenance strategy. This need has produced a growing field of research called "machine unlearning," which has the potential to transform the way we design, develop, and maintain intelligent AI systems.

We will discuss how machine unlearning works, why it is necessary, and the scientific principles behind it.

Why Is Forgetting Necessary for AI?

The large datasets that AI models, such as image or language models, learn from typically contain a lot of information about people. This can include sensitive details a person has shared, copyrighted material, private messages, medical records, and even outright mistakes. If the model learns something harmful or sensitive from this data, it retains that information in its memory and may behave accordingly.

For these reasons, forgetting is important.

1. Privacy Legislation Requires It

Laws such as the GDPR's Right to Be Forgotten are based on an individual's right to request that their data be removed. However, simply removing the data from a database is not enough to erase all traces of it; once a model has learned from that data, it retains certain aspects of it and can still act on them. Machine unlearning allows a model to also forget what it has learned from the removed data.

2. Errors In Training Data Can Be Fixed

In many cases, the training data contains incorrectly labeled, biased, or inappropriate examples. If the model has learned from these mistakes, it is likely to produce harmful predictions based on them. Unlearning helps erase the harmful effects of those mistakes.

3. Remove Unwanted Biases

When the training data includes information about race, gender, or other sensitive characteristics, hidden bias can creep in and cause the model to produce unfair outcomes. Targeted unlearning makes it possible to remove these unwanted biases without retraining the entire model.

How can an AI forget something?

Human beings have memories and emotional attachments and can forget certain events, but an AI does not think the way humans do, and therefore it forgets differently.

An AI model learns by adjusting numerical "weights". For a model, "forgetting" means adjusting those weights so that the previously learned data no longer impacts or influences its future predictions.

To visualize the comparison, imagine cooking a dish and realising afterwards that it has too much garlic. Once everything has been blended, it is hard to physically remove the garlic, but you can adjust the recipe until the balance is acceptable again. Machine unlearning works in a very similar fashion.
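To make the weight picture concrete, here is a tiny illustrative sketch (not from the article) using an ordinary least-squares model in Python: after training, the only place the data "lives" is in the weight vector, so fitting with and without one sample shows exactly the influence that unlearning would need to remove.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

def fit_weights(X, y):
    # Ordinary least squares: the weights are all the model "remembers" of the data.
    return np.linalg.lstsq(X, y, rcond=None)[0]

w_full = fit_weights(X, y)             # trained on every sample
w_without = fit_weights(X[1:], y[1:])  # trained as if sample 0 never existed

print("weights with sample 0:   ", np.round(w_full, 4))
print("weights without sample 0:", np.round(w_without, 4))
# The gap between the two weight vectors is the "influence" that unlearning
# sample 0 would have to remove.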

Two Types of Forgetting:

"Exact Unlearning"
With this method of unlearning an AI Model would act as if it had never been trained with that dataset. "Exact Unlearning" is the ideal method of unlearning a dataset, however this method normally is very difficult to accomplish.

"Approximate Unlearning"
This method of unlearning adjusts the model to remove as much influence from any unwanted dataset as possible. The great thing about "approximate Unlearning" is that it can be done quickly, cheaply and is normally adequate for every day real-world applications.

What are the Methods for Machine Unlearning?

Although the machine unlearning field is still developing, a number of approaches are already available. Below is a simplified overview of the main ones:

1. Retraining From Scratch

The most straightforward way is to delete the records we don’t want and then train the model from the beginning.

Advantages
This means complete unlearning.

Disadvantages
The time, cost and resources needed to retrain a complex machine learning model are enormous. This is equivalent to a person having to demolish their entire home just because they want to remove one brick.
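A minimal sketch of this approach, assuming a scikit-learn-style classifier and a hypothetical forget_idx array marking the rows a user asked us to delete:

import numpy as np
from sklearn.linear_model import LogisticRegression

def retrain_from_scratch(X, y, forget_idx):
    # Drop the unwanted records entirely, then train a brand-new model on the rest.
    keep = np.setdiff1d(np.arange(len(X)), forget_idx)
    model = LogisticRegression(max_iter=1000)
    model.fit(X[keep], y[keep])
    return model  # this model provably never saw the forgotten records

This guarantees exact unlearning by construction, but the full training cost must be paid again for every deletion request.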

2. Selective Retraining

With this method, only the parts of the model that were influenced by the deleted data are retrained.

Think of it as repainting one room instead of repainting an entire house.

Methods used
Influence Functions
Gradient Tracking
Layer-Specific Fine-Tuning

Objective
To adjust only the parts of the model that were affected by the deleted data (a sketch of layer-specific fine-tuning follows below).
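As one hedged illustration of the last item, layer-specific fine-tuning, here is a sketch assuming a PyTorch classifier and a retain_loader over the data we keep (both hypothetical names):

import torch
import torch.nn as nn

def finetune_last_layer(model, retain_loader, epochs=1, lr=1e-3):
    # Freeze every parameter, then unfreeze only the final layer.
    for param in model.parameters():
        param.requires_grad = False
    last_layer = list(model.children())[-1]
    for param in last_layer.parameters():
        param.requires_grad = True

    optimizer = torch.optim.Adam(last_layer.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for inputs, targets in retain_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            optimizer.step()
    return model

Freezing most of the network keeps the cost low; the trade-off is that influence buried in the frozen layers may not be fully removed.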

3. Gradient Reversal

As a model is trained, gradients tell it which direction to adjust its weights in order to reduce error. To unlearn, you reverse or subtract the gradients associated with the data being removed.

It’s like walking backwards to correct a step you took earlier.

Pros
This is a very efficient and potentially very effective method for unlearning.

Cons
It is often challenging to accurately isolate which gradients are associated with which records.
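A minimal sketch of the gradient-reversal step, assuming a PyTorch classifier and a hypothetical forget_loader over the data to be erased; in practice this is usually combined with fine-tuning on retained data so overall accuracy does not collapse:

import torch
import torch.nn as nn

def gradient_reversal_pass(model, forget_loader, lr=1e-4):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for inputs, targets in forget_loader:
        optimizer.zero_grad()
        # Negating the loss turns gradient descent into gradient ascent on the
        # forgotten data, pushing the weights away from what that data taught.
        loss = -loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
    return model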

4. SISA Training

SISA Training organizes the training data into many independent "shards" and trains a separate sub-model on each. If a user requests that their data be deleted, only the shard containing that user's data needs to be retrained.
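Below is a rough sketch of the sharding-and-aggregation part of SISA, using hypothetical scikit-learn sub-models; the "slicing" of training checkpoints inside each shard is omitted for brevity:

import numpy as np
from sklearn.linear_model import LogisticRegression

class SISAEnsemble:
    def __init__(self, n_shards=5):
        self.n_shards = n_shards
        self.shards = []   # per-shard copies of (X, y)
        self.models = []   # one independently trained model per shard

    def fit(self, X, y):
        # Split the data into disjoint shards and train a sub-model on each.
        parts = np.array_split(np.arange(len(X)), self.n_shards)
        self.shards = [(X[idx], y[idx]) for idx in parts]
        self.models = [LogisticRegression(max_iter=1000).fit(Xs, ys)
                       for Xs, ys in self.shards]
        return self

    def forget(self, shard_id, row_ids):
        # Only the shard that held the deleted rows is retrained.
        Xs, ys = self.shards[shard_id]
        keep = np.setdiff1d(np.arange(len(Xs)), row_ids)
        self.shards[shard_id] = (Xs[keep], ys[keep])
        self.models[shard_id] = LogisticRegression(max_iter=1000).fit(Xs[keep], ys[keep])

    def predict(self, X):
        # Aggregate the shard models' predictions by majority vote.
        votes = np.stack([m.predict(X) for m in self.models]).astype(int)
        return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)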

5. Knowledge Editing

Knowledge Editing is a method for removing specific facts from an AI model without retraining the entire model.

How to Evaluate If an AI Model Really Forgot Information

1. Membership Inference Testing
Checks whether an attacker can still tell that a data point was part of the model's training set (a simple sketch of this test follows the list).

2. Measurement of Influence
Checks how much influence removed data still has on predictions.

3. Attack Simulation
Attempts to extract “forgotten” info; if no leaks remain, the model passed.
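As a simple sketch of the membership-inference idea referenced above, assuming a PyTorch classifier plus one batch of supposedly forgotten examples and one batch of genuinely unseen examples (all hypothetical names): if the model's losses on the two groups look alike, the forgotten data is no longer recognizable as former training data.

import torch
import torch.nn as nn

def loss_gap(model, forgotten_batch, unseen_batch):
    loss_fn = nn.CrossEntropyLoss(reduction="none")
    model.eval()
    with torch.no_grad():
        x_f, y_f = forgotten_batch
        x_u, y_u = unseen_batch
        forgotten_losses = loss_fn(model(x_f), y_f)
        unseen_losses = loss_fn(model(x_u), y_u)
    # A gap near zero suggests the model treats the "forgotten" examples like any
    # other unseen data; a large positive gap (forgotten examples noticeably easier
    # for the model) hints at leftover memorization.
    return (unseen_losses.mean() - forgotten_losses.mean()).item()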

Challenges Associated with Machine Unlearning

1. Blending of Information During Training
Once information is mixed deeply into the model's weights during training, separating it back out is extremely difficult.

2. Excessive Unlearning
Sometimes a model may forget too much, harming accuracy.

3. Cost and Time
Unlearning in large models can take hours even for small data removals.

The Future of Machine Unlearning

Users' Rights to Privacy
People will increasingly be able to request that their data, and its influence on a model, be deleted.

Increased Safety of AI Models
Removing biases and harmful data makes models safer.

Ethics of AI Development Will Strengthen
Unlearned models adapt better and avoid carrying mistakes forever.

Faster Model Repairs
Unlearning reduces the need for full retraining, lowering cost and energy usage.

An Evolutionary Perspective
Humans learn and forget naturally; AI will develop similar “forgetting ability.”

Machine unlearning is still new but will soon become essential for building trustworthy AI. The future of AI will not only depend on what machines learn — but also on what they deliberately forget.

