(Bloomberg) -- Microsoft Corp. reached a deal with News Corp.’s HarperCollins that will allow the software company to use nonfiction titles from the book publisher to train its artificial intelligence models, according to a person familiar with the matter.
Microsoft wants the HarperCollins books for a model that it hasn’t yet announced, according to the person, who asked not to be identified discussing plans that aren’t public. The company isn’t planning to use the content to generate new books without human authors, the person said. Microsoft declined to comment.
In a statement to Bloomberg News, HarperCollins confirmed it reached an agreement with an unidentified AI technology company that would “allow limited use of select nonfiction backlist titles for training AI models to improve model quality and performance.”
HarperCollins authors will have the option to participate or not, the company said.
“Part of our role is to present authors with opportunities for their consideration while simultaneously protecting the underlying value of their works and our shared revenue and royalty streams,” HarperCollins said. “This agreement, with its limited scope and clear guardrails around model output that respects author’s rights, does that.”
Technology companies train AI models on a wide array of data, from content on social-media sites to news articles. Companies like Microsoft are also hunting for additional sources of high-quality text that they can license to make their programs more accurate and better able to answer questions or provide expertise on specific subjects.
News Corp. signed an agreement in May with OpenAI to let the company use content from more than a dozen of its publications, including the Wall Street Journal, Barron’s and MarketWatch. OpenAI has also signed licensing deals with publishers including Axel Springer SE, the Atlantic, Vox Media Inc., Dotdash Meredith Inc., Hearst Communications Inc. and Time magazine. Microsoft has worked on AI initiatives with Reuters, Hearst and Axel Springer, which publishes Business Insider and Politico.
Some publishers have taken issue with AI companies pulling in content without permission. The New York Times is suing OpenAI and Microsoft, alleging copyright infringement. Perplexity AI Inc., another AI startup, has faced similar lawsuits.
©2024 Bloomberg L.P.