
In artificial intelligence, and particularly in the ancient Chinese game of Go, top-level AI systems have been able to defeat the strongest human players for years. However, recent research has revealed vulnerabilities in these systems that allow even amateur humans to win against them using simple, repeatable strategies.

Researchers at MIT and FAR AI conducted a study to enhance the resilience of the KataGo algorithm against adversarial attacks. Despite their efforts, the results showed that creating truly robust and unexploitable AIs is challenging, even in a controlled environment like board games.

The researchers tested three methods to strengthen the KataGo algorithm’s defenses. First, they fine-tuned the algorithm on examples of the cyclic strategies that had previously defeated it. While this approach showed early promise, the attacker managed to adapt with minimal computing power and reduce KataGo’s win rate significantly.
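
This first defense amounts to adversarial fine-tuning: continuing to train the network on positions drawn from games the cyclic attack won, with labels corrected to reflect the right move and outcome. The sketch below illustrates the idea in PyTorch; the network interface and the data it consumes are assumptions for illustration, not the actual KataGo training pipeline.

```python
# Hedged sketch of adversarial fine-tuning, assuming a PyTorch policy/value
# network and a list of positions taken from games lost to the cyclic attack.
# The network interface (returning policy logits and a value) is a hypothetical
# placeholder, not the KataGo codebase.
import torch
import torch.nn.functional as F

def finetune_on_attacks(net, positions, move_targets, outcome_targets,
                        epochs=3, lr=1e-4):
    """Fine-tune the network on positions where the cyclic attack succeeded."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        for board, move, outcome in zip(positions, move_targets, outcome_targets):
            policy_logits, value = net(board.unsqueeze(0))
            # Cross-entropy against the corrected move, plus MSE against the
            # corrected game outcome for the same position.
            loss = F.cross_entropy(policy_logits, move.unsqueeze(0)) \
                 + F.mse_loss(value.squeeze(), outcome)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return net
```

As the study found, patching the model on a fixed set of such examples was not enough: a modest amount of further attacker training was sufficient to rediscover working variants of the exploit.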

In a second attempt, the researchers ran an iterative “arms race” in which new adversarial models discovered exploits and defensive models were trained to counter them. Despite multiple rounds of training, the defending algorithm still struggled against new variations of the attack.
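
Structurally, the arms race is an alternation between attacker discovery and defender patching. The loop below is a hedged sketch of that structure only; the three callables it takes are hypothetical stand-ins for the compute-heavy self-play training and game-collection pipelines used in practice.

```python
# Hedged sketch of the iterative attacker/defender loop; the callables passed
# in are hypothetical placeholders for the real training pipelines.
def arms_race(defender, train_attacker, collect_wins, finetune_defender, rounds=10):
    for _ in range(rounds):
        # Freeze the current defender and search for a fresh exploit against it.
        attacker = train_attacker(defender)
        # Gather games the new attacker wins against the defender.
        exploit_games = collect_wins(attacker, defender)
        # Patch the defender on exactly those games before the next round.
        defender = finetune_defender(defender, exploit_games)
    return defender
```

Each round of this kind closes the loophole found in that round, but, as the study observed, the next attacker run kept finding new variants of the cyclic attack.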

Finally, the researchers trained a new system using vision transformers in place of the convolutional neural networks behind the earlier models, to rule out biases specific to that architecture. However, this strategy also failed to provide robust defense against the attacks.
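
In this approach, the board is treated as a sequence of point-level tokens fed to a standard transformer encoder rather than a stack of convolutions. The minimal PyTorch sketch below shows what such a backbone can look like; the layer sizes, feature-plane count, and pooling are illustrative assumptions, not the architecture used in the study.

```python
# Illustrative vision-transformer backbone for a 19x19 Go board, where each
# board point is one token; sizes here are assumptions, not the study's model.
import torch
import torch.nn as nn

class GoViT(nn.Module):
    def __init__(self, in_channels=17, dim=256, depth=6, heads=8, moves=362):
        super().__init__()
        self.embed = nn.Linear(in_channels, dim)                # per-point embedding
        self.pos = nn.Parameter(torch.zeros(1, 19 * 19, dim))   # learned positions
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.policy_head = nn.Linear(dim, moves)                # 361 points + pass
        self.value_head = nn.Linear(dim, 1)

    def forward(self, board_planes):                 # (B, 17, 19, 19) input planes
        x = board_planes.flatten(2).transpose(1, 2)  # (B, 361, 17) point tokens
        x = self.embed(x) + self.pos
        x = self.encoder(x)
        pooled = x.mean(dim=1)                       # average over board points
        return self.policy_head(pooled), torch.tanh(self.value_head(pooled))
```

Because attention layers lack the local, translation-invariant bias of convolutions, a transformer backbone is a natural way to test whether that bias is the source of the weakness; as noted above, the resulting agent remained exploitable.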

The study highlights the importance of evaluating worst-case scenarios in AI systems, even if their average performance is exceptional. Vulnerabilities can be exploited by adversaries, exposing weaknesses in seemingly strong AI algorithms.

While the defenses did not hold up against newly discovered attacks in Go, they did succeed against the fixed exploits that had previously been identified. This suggests that training against a sufficiently diverse range of attacks could make AI systems more resilient in the future.

Overall, the study demonstrates the challenges of making AI systems impervious to worst-case scenarios, emphasizing the need for robust defense mechanisms. Despite the difficulties encountered, researchers remain optimistic about the potential for future advancements in AI security.