Business

Meta employees discussed purchasing Simon & Schuster to train AI models, report shows

Published

April 10, 2024

Recent revelations from internal meetings at Meta, the parent company of Facebook and Instagram, have shed light on discussions among managers, lawyers, and engineers regarding the potential acquisition of Simon & Schuster to procure books for training the company's artificial intelligence (AI) tools.

The recordings, shared with the New York Times by a Meta contributor, provide insight into deliberations about using the publisher’s extensive catalog to improve AI training, and they raise ethical and legal questions.

According to the recordings, made from March to April 2023, Meta staff met almost daily to explore ways to collect additional data for training AI models. Discussions included the possibility of purchasing Simon & Schuster, with some participants considering paying $10 per book for licensing rights to new titles.

Simon & Schuster, a prominent player in the English-language publishing landscape and part of the esteemed ‘Big Five’ alongside Penguin Random House, HarperCollins, Hachette and Macmillan, boasts a selection of leading authors such as Stephen King, Colleen Hoover and Bob Woodward.

The prospect of Meta acquiring Simon & Schuster arose after Paramount Global’s March 2020 announcement that it intended to divest the publishing business. After an aborted merger attempt with Penguin Random House, Simon & Schuster was eventually sold to private equity firm KKR in August 2023.

Ahmad Al-Dahle, Meta’s vice president of generative AI, reportedly informed executives that the company had exhausted almost all available English-language literary content on the Internet for AI training purposes, prompting the search for new data sources.

Employees acknowledged that the company had used text without permission and considered expanding the practice despite potential legal consequences. An attorney’s concerns about the ethical implications of using copyrighted intellectual property were met with silence.

In addition, discussions revealed that Meta used contractors in Africa to aggregate summaries of copyrighted fiction and non-fiction texts, raising further ethical and legal questions regarding its data collection practices.

Maria A. Pallante, president of the Association of American Publishers, expressed skepticism that Simon & Schuster would have agreed to such a sale, questioning Meta’s intentions and the potential impact on authors and their contractual agreements.

In a related development, California federal judge Vince Chhabria has dismissed part of a copyright lawsuit brought by comedian Sarah Silverman and other authors against Meta over its use of copyrighted books to train its AI system LLaMA. Chhabria cast doubt on claims that the outputs of the AI models were substantially similar to the authors’ works, underscoring the ongoing debate over AI and intellectual property rights.