Bard reportedly used responses from ChatGPT for training, Google denies

Ben Schoon | Mar 30 2023 - 7:21 am PT

The AI wars are very much ongoing, with Google a bit late to the party. In a new report, Google’s Bard is accused of using ChatGPT responses shared online as training data, but Google denies the claim.

Google Bard is an LLM (large language model) that can generate content based on prompts. This can include explaining topics, answering questions, or creating paragraphs of text based on a simple user request. The system works a whole lot like ChatGPT, the generative AI that took the world by storm last year and whose technology is also powering the “new Bing” chatbot.

In its first week in the public eye, Bard has worked quite similarly to ChatGPT, meaning it’s quite rough around the edges in a lot of ways. Bard often gets factual details wrong, occasionally “hallucinates” and creates nonsense, and doesn’t cite its sources in any capacity.


But a bigger problem behind the scenes might be how Bard was trained. According to a report from The Information, a now-former AI engineer at Google, Jacob Devlin, took issue with Google using data from ChatGPT to train Bard.

Devlin reportedly believed that the Bard team was “relying heavily” on responses from ChatGPT that were made public on the ShareGPT website, where users often share responses they’ve received from OpenAI’s chatbot. Devlin also felt that such training could result in Bard’s responses being more similar to those from ChatGPT.

After sharing his concerns with Sundar Pichai, Devlin resigned from his position and now works at OpenAI. Google also stopped using such data to train Bard, the report goes on to say.

Devlin quit after sharing concerns with Pichai, Dean and other senior managers that the Bard team, which received assistance from Brain employees, was training its machine-learning model using data from OpenAI’s ChatGPT. Specifically, Devlin believed the Bard team appeared to be relying heavily on information from ShareGPT, a website where people publish conversations they’ve had with ChatGPT.
The Information

Other Googlers aware of the situation apparently felt that such usage violated OpenAI’s terms of service, which barred the use of ChatGPT’s output “to develop models that compete with OpenAI.”

Since that report surfaced, Google has issued a brief statement to The Verge saying that Bard is not trained with data originating from ChatGPT.

Bard is not trained on any data from ShareGPT or ChatGPT.
Google Spokesperson

Google’s statement doesn’t firmly rule out that data from ChatGPT was used for Bard training at some point, but it does seem that this is at least no longer the case.

The Information report goes on to explain that Google’s Brain AI group and DeepMind, a company owned by Google’s parent company Alphabet, are joining forces to better compete with OpenAI. The project is known as “Gemini” and is an attempt to “try to match the capabilities of OpenAI’s GPT-4,” the report says. This would include reaching 1 trillion parameters, the learned values that serve as a rough measure of a machine-learning model’s size, which according to the report would match GPT-4.