Meta pauses plans to train AI using European users’ data, bowing to regulatory pressure
Meta has confirmed that it will pause plans to start training its AI systems using data from its users in the European Union and U.K.
The move follows pushback from the Irish Data Protection Commission (DPC), Meta's lead regulator in the EU, which is acting on behalf of several data protection authorities across the bloc. The U.K.'s Information Commissioner's Office (ICO) also requested that Meta pause its plans until it could satisfy concerns it had raised.
"The DPC welcomes the decision by Meta to pause its plans to train its large language model using public content shared by adults on Facebook and Instagram across the EU/EEA," the DPC said in a statement Friday. "This decision followed intensive engagement between the DPC and Meta. The DPC, in cooperation with its fellow EU data protection authorities, will continue to engage with Meta on this issue."
While Meta is already tapping user-generated content to train its AI in markets such as the U.S., Europe's stringent GDPR rules have created obstacles for Meta, and other companies, looking to improve their AI systems, including large language models, with user-generated training material.
However, Meta began notifying users of an upcoming change to its privacy policy last month, one that it said will give it the right to use public content on Facebook and Instagram to train its AI, including content from comments, interactions with companies, status updates, photos and their associated captions. The company argued that it needed to do this to reflect "the diverse languages, geography and cultural references of the people in Europe."
These changes were due to come into effect on June 26, 2024, just 12 days from now. But the plans spurred not-for-profit privacy activist organization NOYB ("none of your business") to file 11 complaints with constituent EU countries, arguing that Meta is contravening various facets of GDPR. One of those complaints relates to the issue of opt-in versus opt-out: where personal data processing takes place, users should be asked for their permission first, rather than being required to take action to refuse.
Meta, for its part, was relying on a GDPR provision called "legitimate interests" to contend that its actions were compliant with the regulations. This isn't the first time Meta has used this legal basis in defence, having previously done so to justify processing European users' data for targeted advertising.
It always seemed likely that regulators would at least put a stay of execution on Meta's planned changes, particularly given how difficult the company had made it for users to "opt out" of having their data used. The company said that it sent out more than 2 billion notifications informing users of the upcoming changes, but unlike other important public messaging that is plastered to the top of users' feeds, such as prompts to go out and vote, these notifications appeared alongside users' standard notifications: friends' birthdays, photo tag alerts, group announcements and more. So if someone didn't regularly check their notifications, it was all too easy to miss this.
And those who did see the notification wouldn't automatically know that there was a way to object or opt out, as it simply invited users to click through to find out how Meta will use their information. There was nothing to suggest that there was a choice here.
Moreover, users technically weren't able to "opt out" of having their data used. Instead, they had to complete an objection form where they put forward their arguments for why they didn't want their data to be processed; it was entirely at Meta's discretion as to whether this request was honored, though the company said it would honor each request.
Although the objection form was linked from the notification itself, anyone proactively looking for the objection form in their account settings had their work cut out.
On Facebook's website, they had to first click their profile photo at the top right; hit "Settings & privacy"; tap "Privacy center"; scroll down and click on the "Generative AI at Meta" section; then scroll down again, past a bunch of links, to a section titled "More resources." The first link under this section is called "How Meta uses information for Generative AI models," and they needed to read through some 1,100 words before getting to a discreet link to the company's "right to object" form. It was a similar story in the Facebook mobile app, too.
Earlier this week, when asked why this process required the user to file an objection rather than opt in, Meta's policy communications manager Matt Pollard pointed TechCrunch to its existing blog post, which says: "We believe this legal basis ['legitimate interests'] is the most appropriate balance for processing public data at the scale necessary to train AI models, while respecting people's rights."
To translate this: making this opt-in likely wouldn't generate enough "scale" in terms of people willing to offer their data. So the best way around this was to issue a solitary notification in amongst users' other notifications; hide the objection form behind half a dozen clicks for those seeking the "opt-out" independently; and then make them justify their objection, rather than giving them a straight opt-out.
In an updated blog post today, Meta's global engagement director for privacy policy Stefano Fratta said that it was "disappointed" by the request it has received from the DPC.
"This is a step backwards for European innovation, competition in AI development and further delays bringing the benefits of AI to people in Europe," Fratta wrote. "We remain highly confident that our approach complies with European laws and regulations. AI training is not unique to our services, and we're more transparent than many of our industry counterparts."
AI arms race
None of this is new, of course, and Meta is in an AI arms race that has shone a giant spotlight on the vast arsenal of data Big Tech holds on all of us.
Earlier this year, Reddit revealed that it's contracted to make north of $200 million in the coming years from licensing its data to companies such as ChatGPT-maker OpenAI and Google. And the latter of those companies is already facing huge fines for leaning on copyrighted news content to train its generative AI models.
But these efforts also highlight the lengths to which companies will go to ensure that they can leverage this data within the constraints of existing legislation: "opting in" is rarely on the agenda, and the process of opting out is often needlessly arduous. Just last month, someone spotted dubious wording in an existing Slack privacy policy that suggested it would be able to leverage user data for training its AI systems, with users able to opt out only by emailing the company.
And last year, Google finally gave online publishers a way to opt their websites out of training its models, by enabling them to inject a piece of code into their sites. OpenAI, for its part, is building a dedicated tool to allow content creators to opt out of training its generative AI smarts; this should be ready by 2025.
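For the curious, Google's publisher opt-out works through a site's robots.txt file rather than any visible page change. A minimal sketch, assuming the publicly documented "Google-Extended" crawler token Google introduced for this purpose, might look like:

```text
# robots.txt at the site root
# Blocks Google's AI-training crawler token (Google-Extended)
# without affecting normal Google Search indexing.
User-agent: Google-Extended
Disallow: /

# Regular search crawling remains allowed.
User-agent: Googlebot
Allow: /
```

Because Google-Extended is a separate token from Googlebot, publishers can decline AI-training use while keeping their pages in search results.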
While Meta's attempt to train its AI on users' public content in Europe is on ice for now, it will likely rear its head again in another form after consultation with the DPC and ICO, hopefully with a different user-permission process in tow.
"In order to get the most out of generative AI and the opportunities it brings, it is crucial that the public can trust that their privacy rights will be respected from the outset," Stephen Almond, the ICO's executive director for regulatory risk, said in a statement today. "We will continue to monitor major developers of generative AI, including Meta, to review the safeguards they have put in place and ensure the information rights of UK users are protected."