# Tech news

## Google admits its AI Overviews need work, but we’re all helping it beta test


Google is embarrassed about its AI Overviews, too. After a deluge of dunks and memes over the past week that cracked on the poor quality and outright misinformation arising from the tech giant’s underbaked new AI-powered search feature, the company on Thursday issued a mea culpa of sorts. Google, a company whose name is synonymous with searching the web and whose brand focuses on “organizing the world’s information” and putting it at users’ fingertips, actually wrote in a blog post that “some odd, inaccurate or unhelpful AI Overviews certainly did show up.”

That’s putting it mildly.

The admission of failure, penned by Google VP and Head of Search Liz Reid, reads as testimony to how the drive to mash AI technology into everything has somehow made Google Search worse.

In the post, titled “About last week” (this got past PR?), Reid spells out the many ways its AI Overviews make mistakes. While they don’t “hallucinate” or make things up the way that other large language models (LLMs) may, she says, they can get things wrong for “other reasons,” like “misinterpreting queries, misinterpreting a nuance of language on the web, or not having a lot of great information available.”

Reid also noted that some of the screenshots shared on social media over the past week were faked, while others were for nonsensical queries, like “How many rocks should I eat?” (something no one ever really searched for before). Since there’s little factual information on this topic, Google’s AI guided a user to satirical content. (In the case of the rocks, the satirical content had been published on a geological software provider’s website.)

It’s worth pointing out that if you had Googled “How many rocks should I eat?” and were presented with a set of unhelpful links, or even a jokey article, you wouldn’t be surprised. What people are reacting to is the confidence with which the AI spouted back that “geologists recommend eating at least one small rock per day” as if it were a factual answer. It may not be a “hallucination,” in technical terms, but the end user doesn’t care. It’s insane.

What’s unsettling, too, is that Reid claims Google “tested the feature extensively before launch,” including with “robust red-teaming efforts.”

Does no one at Google have a sense of humor, then? Did no one think of prompts that would generate poor results?

In addition, Google downplayed the AI feature’s reliance on Reddit user data as a source of knowledge and truth. Although people have appended “Reddit” to their searches for so long that Google finally made it a built-in search filter, Reddit is not a body of factual knowledge. And yet the AI would point to Reddit forum posts to answer questions, without an understanding of when first-hand Reddit knowledge is helpful and when it is not, or, worse, when it is a troll.

Reddit today is making bank by offering its data to companies like Google, OpenAI and others to train their models, but that doesn’t mean users want Google’s AI deciding when to search Reddit for an answer, or suggesting that someone’s opinion is a fact. There’s nuance to learning when to search Reddit, and Google’s AI doesn’t understand that yet.

Reid admits as much: “forums are often a great source of authentic, first-hand information, but in some cases can lead to less-than-helpful advice, like using glue to get cheese to stick to pizza,” she said, referencing one of the AI feature’s more spectacular failures over the past week.

Google AI overview suggests adding glue to get cheese to stick to pizza, and it turns out the source is an 11 year old Reddit comment from user F*cksmith 😂 pic.twitter.com/uDPAbsAKeO

- Peter Yang (@petergyang) May 23, 2024

If last week was a disaster, though, at least Google is iterating quickly as a result, or so it says.

The company says it has looked at examples from AI Overviews and identified patterns where it could do better, including:

- building better detection mechanisms for nonsensical queries;
- limiting the use of user-generated content in responses that could offer misleading advice;
- adding triggering restrictions for queries where AI Overviews were not proving helpful;
- not showing AI Overviews for hard news topics, “where freshness and factuality are important”; and
- adding additional triggering refinements to its protections for health searches.

With AI companies building ever-improving chatbots every day, the question is not whether they will ever outperform Google Search at helping us understand the world’s information, but whether Google Search will ever get up to speed on AI to challenge them in return.

As ridiculous as Google’s mistakes may be, it’s too soon to count the company out of the race, especially given the massive scale of its beta-testing crew, which is essentially anybody who uses Search.

“There’s nothing quite like having millions of people using the feature with many novel searches,” says Reid.
