OpenAI and journalism


Our discussions with The New York Times had appeared to be progressing constructively through our last communication on December 19. The negotiations focused on a high-value partnership around real-time display with attribution in ChatGPT, in which The New York Times would gain a new way to connect with its existing and new readers, and our users would gain access to their reporting. We had explained to The New York Times that, like any single source, their content did not contribute meaningfully to the training of our existing models and would not be sufficiently impactful for future training either. Their lawsuit on December 27, which we learned about by reading The New York Times, came as a surprise and a disappointment to us.

Along the way, they had mentioned seeing some regurgitation of their content, but they repeatedly refused to share any examples, despite our commitment to investigate and fix any issues. We have demonstrated how seriously we treat this as a priority, such as in July, when we took down a ChatGPT feature immediately after we learned it could reproduce real-time content in unintended ways.

Interestingly, the regurgitations The New York Times induced appear to come from years-old articles that have proliferated across multiple third-party websites. It seems they intentionally manipulated prompts, often including lengthy excerpts of articles, in order to get our model to regurgitate. Even when using such prompts, our models do not typically behave the way The New York Times insinuates, which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts.

Despite their claims, this misuse is neither typical nor permitted user activity, and it is not a substitute for The New York Times. Regardless, we are continually making our systems more resistant to adversarial attempts to extract training data, and we have already made significant progress in our recent models.