Human Rights Watch (HRW) has raised concerns about the use of children’s photos posted online to train AI models, even when privacy settings or platform protections were in place. The organization found that photos of children, including Indigenous children, from various regions were linked in an AI training dataset without the knowledge or consent of the children or their families.

The dataset, known as LAION-5B, contains links to images that can be used to train AI image generators capable of producing realistic deepfakes of children. This poses serious privacy and safety risks, as some photos included identifying information such as names and locations. Even photos from unlisted YouTube videos, which are meant to be private, were found in the dataset, highlighting the challenges posed by unauthorized scraping of online content.

Despite efforts by platforms like YouTube to prevent AI scraping, the images have already been used to train AI models, making it difficult to undo the damage. HRW researcher Hye Jung Han emphasized the need for regulators to intervene and prevent such training from happening in the future to protect children’s privacy.

Han’s findings come at a critical time, as Australia is set to release a reformed Privacy Act, including a draft of the Children’s Online Privacy Code. However, the extent of protections for children in the new legislation remains uncertain. Han stressed the importance of implementing strong legal protections to prevent the misuse of children’s photos online.

The AI scraping of children’s images is a widespread issue, with potentially far-reaching consequences. Han’s research, which examined only a small fraction of the dataset, uncovered a significant number of photos of Australian children, indicating a larger problem. While efforts are being made to remove the links to these images, the process is slow and does not address the broader issue of AI models that have already trained on the data.

In addition to privacy risks, the use of children’s images in AI training poses other dangers, such as the creation of harmful deepfakes. Han cited a case in which girls from Melbourne had their photos manipulated with AI to create explicit deepfakes, highlighting the real-world impact of these practices.

For First Nations children, the inclusion of photos in AI datasets raises unique concerns, as it may violate cultural restrictions on the reproduction of images of deceased individuals. The training of AI models on these images could perpetuate harms by making it harder to control the reproduction of sensitive photos.

To address these risks, Han recommended legal protections for children’s photos to prevent their misuse. While removing images from online platforms is one way to mitigate these risks, Han argued that the burden should not fall on children and parents to protect their privacy. Instead, she called for robust legal safeguards to ensure that children’s photos are not misused in AI training.

Overall, Han’s research underscores the urgent need for regulatory action to protect children’s privacy in the age of AI technology. By implementing strong legal protections and enforcing regulations on AI training, policymakers can help safeguard children from the risks posed by the unauthorized use of their images online.