OpenAI: Give Us Your Content or Get Left Behind

Publishers are handing their content over to OpenAI in exchange for a prominent spot in ChatGPT’s future.
The Financial Times announced a deal with OpenAI on Monday to license its world-class journalism for training and informing ChatGPT’s models. It joins Axel Springer and the Associated Press, which struck similar deals in which OpenAI reportedly offers millions for the right to use content. However, ChatGPT was trained on plenty of other web-scraped content that OpenAI did not pay for. So why is OpenAI paying for some datasets and not others?
OpenAI’s licensing deals seem to send a clear message: we’re going to use your content anyway, so sign a deal with us or get left behind. The main perk of a licensing deal seems to be a prominent spot in ChatGPT’s answers. Some publishers may also want to solidify a relationship with the next big information distribution channel before it takes over. However, it seems OpenAI is using a lot of publishers’ content anyway.
OpenAI already trains its AI models in part on “publicly available data,” according to CTO Mira Murati, a description that seems purposefully vague. What is publicly available data anyway? The phrase assumes anything free to read on the internet is also free to build into ChatGPT. For instance, Gizmodo is part of OpenAI’s “publicly available data.” Our website was cached over 34,000 times in GPT-2’s WebText dataset, the last dataset OpenAI disclosed using to train an AI model.
Gizmodo is free for readers largely due to the ads on this webpage.
