OpenAI and Google reportedly used YouTube transcripts to train their AI models

OpenAI and Google have reportedly turned to transcripts of YouTube videos to train their LLMs.
Training artificial intelligence models requires a lot of data to help them better understand the context of queries and ultimately provide better responses. In the constant search for more data, both OpenAI and Google have turned to using YouTube videos, created by others, to train their large language models (LLMs), The New York Times reported over the weekend, citing people who claim to have knowledge of the companies’ activities.
In late 2021, OpenAI developed Whisper, a speech recognition tool that let the company transcribe the audio from more than 1 million hours of YouTube videos and use that text to help train GPT-4, according to the Times’ sources.
Google also transcribed YouTube videos to train its models, according to the report. What’s more, the search giant broadened its terms of service in 2023, a change the Times says made it easier to use public Google Docs, Google Maps restaurant reviews, and other publicly available content in its AI models.
It’s no secret that AI models require vast troves of data to train effectively.
