{"id":14741,"date":"2025-01-12T19:03:15","date_gmt":"2025-01-12T18:03:15","guid":{"rendered":"http:\/\/plus.maciejpiasecki.info\/index.php\/2025\/01\/12\/meta-sued-for-allegedly-training-ai-with-content-from-pirated-books\/"},"modified":"2025-01-12T21:18:45","modified_gmt":"2025-01-12T20:18:45","slug":"meta-sued-for-allegedly-training-ai-with-content-from-pirated-books","status":"publish","type":"post","link":"https:\/\/plus.maciejpiasecki.info\/index.php\/2025\/01\/12\/meta-sued-for-allegedly-training-ai-with-content-from-pirated-books\/","title":{"rendered":"Meta sued for allegedly training AI with content from pirated books"},"content":{"rendered":"<p>Meta is one of the companies that has decided to bet heavily on artificial intelligence to stay among the top companies in the tech industry. The firm has its own series of AI models, Llama. Like other companies, Meta trained Llama using datasets with large amounts of information available on the internet. However, a group of authors is suing Meta for allegedly using pirated books to train its AI models.<br \/>\nAuthors like Ta-Nehisi Coates and comedian Sarah Silverman (among others) are part of the group that says Meta used a dataset with content from stolen books. Not only that, the company\u2019s CEO, Mark Zuckerberg, was allegedly aware that the dataset contained pirated books before approving its use in training Llama.<br \/>\nMeta deliberately used pirated books to train AI, lawsuit claims<br \/>\nDocuments related to the lawsuit were made public in the middle of this week. The case, filed in a California federal court, stems from another filed in 2023 and dismissed last year by U.S. District Judge Vince Chhabria. At the time, the authors claimed that Meta AI was able to generate text that infringed their copyrights. 
The original suit also alleged that Meta AI removed the copyright management information (CMI) from the content of their books.<br \/>\nThe plaintiff group wants the case reopened<br \/>\nHowever, the plaintiff group claims that new findings warrant reopening the case. They say that they had access to internal Meta communications in which Zuckerberg \u201capproved Meta\u2019s use of the LibGen dataset notwithstanding concerns within Meta\u2019s AI executive team (and others at Meta) that LibGen is \u2018a dataset we know to be pirated.&#8217;\u201d LibGen, short for Library Genesis, is an online shadow library whose collection was used as a dataset for AI training. The dataset contained around 32 TB of content focused on books of all kinds, including scientific works.<br \/>\nThe plaintiffs told Judge Chhabria that the new findings not only bolster their previous claims but may also support a new computer fraud claim. The judge will allow the plaintiffs to present their new evidence in an amended complaint. However, he also expressed skepticism that the lawsuit could be successful for the authors.<br \/>\nThe post Meta sued for allegedly training AI with content from pirated books appeared first on Android Headlines.<br \/>\n<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/plus.maciejpiasecki.info\/wp-content\/uploads\/2025\/01\/Meta-AI-AH.jpg\" width=\"1920\" height=\"1080\"><br \/>\nSource: androidheadlines.com<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Meta is one of the companies that has decided to bet heavily on artificial intelligence to stay among the top 
[&hellip;]<\/p>\n","protected":false},"author":67,"featured_media":14742,"comment_status":"false","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-14741","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bez-kategorii"],"_links":{"self":[{"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/posts\/14741","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/users\/67"}],"replies":[{"embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/comments?post=14741"}],"version-history":[{"count":1,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/posts\/14741\/revisions"}],"predecessor-version":[{"id":14743,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/posts\/14741\/revisions\/14743"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/media\/14742"}],"wp:attachment":[{"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/media?parent=14741"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/categories?post=14741"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/tags?post=14741"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}