MoneyWatch

Google strikes $60 million deal with Reddit, allowing search giant to train AI models on human posts

February 23, 2024 / 12:35 PM EST / CBS/AP

Google rolls out Gemini AI chatbot and assistant

Google has struck a deal with Reddit that allows the search giant to use posts from the online discussion site for training its artificial intelligence models and to improve services such as Google Search.

The arrangement, announced Thursday and valued at roughly $60 million, will also give Reddit access to Google AI models for improving its internal site search and other features. Reddit declined to comment or answer questions beyond its written statement about the deal.

Separately, the San Francisco-based company announced plans for its initial public offering Wednesday. In documents filed with the Securities and Exchange Commission, Reddit said it reported net income of $18.5 million — its first profit in two years — in the October-December quarter on revenue of $249.8 million. The company said it aims to list its shares on the New York Stock Exchange under the ticker symbol RDDT.

The Google deal is a big step for Reddit, which relies on volunteer moderators to run its sprawling array of freewheeling topic-based discussions. Those moderators have publicly protested earlier Reddit decisions, most recently blacking out much of the site for days when Reddit announced plans to start charging many third-party apps for access to its content.

Google unveils never-before-seen text-to-video AI tool in this week's "60 Minutes"

The data-sharing arrangement is also highly significant for Google which is hungry for access to human-written material it can use to train its AI models to improve their "understanding" of the world and thus their ability to provide relevant answers to questions in a conversational format.

It also sets a precedent for how tech companies go about obtaining data for their growing AI-models to feast on, even when it comes to content that is publicly available. A barrage of high-profile lawsuits were filed in a New York federal court in January testing the future of large language models like ChatGPT and other artificial intelligence products that ingest huge troves of copyrighted human works available online.

The arrangement with Google doesn't presage any sort of data-driven changes to how Reddit functions, according to an individual familiar with the matter. This person requested anonymity in order to speak freely during the SEC-enforced "quiet period" that precedes an IPO. Unlike social media sites such as TikTok, Facebook and YouTube, Reddit does not use algorithmic processes that try to guess what users will be most interested in seeing next. Instead, users simply search for the discussion forums they're interested in and can then dive into ongoing conversations or start new ones.

The individual also noted that the agreement requires Google to comply with Reddit's user terms and privacy policy, which also differ in some ways from other social media. For instance, when Reddit users delete their posts or other content, the site deletes it everywhere, with no ghostly remnants lingering in unexpected locations. Reddit partners such as Google are required to do likewise in order "to respect the choices that users make on Reddit," the individual said.

Google praised Reddit in a news release, calling it a repository for "an incredible breadth of authentic, human conversations and experiences" and stressing that the search giant primarily aims "to make it even easier for people to benefit from that useful information."

Google played down its interest in using Reddit data to train its AI systems, instead emphasizing how it will make it "even easier" for users to access Reddit information, such as product recommendations and travel advice, by funneling it through Google products.

It described this process as "more content-forward displays of Reddit information" that aim to benefit both Google's tools and to make it easier for people to participate on Reddit.

Halt on Gemini AI-image tool

Among those tools is Google's artificial intelligence chatbot, Gemini.

Google on Thursday said it is suspending its Gemini chatbot from generating images of people a day after apologizing for "inaccuracies" in historical depictions it was creating.

Gemini users this week posted screenshots on social media of historically White-dominated scenes with racially diverse characters they say the chatbot generated, leading critics to raise questions about whether the company is overcorrecting for the risk of racial bias in its AI model.

"We're already working to address recent issues with Gemini's image generation feature," Google said in a post on the social media platform X. "While we do this, we're going to pause the image generation of people and will re-release an improved version soon."