Apple has been accused of unlawfully using copyrighted books to train its artificial intelligence systems, after two authors filed a lawsuit in the United States. The complaint alleges that the company relied on pirated works without consent or compensation, raising fresh concerns about how tech firms are sourcing material for AI development.
The authors claim the works were taken without consent
The lawsuit was filed by authors Grady Hendrix and Jennifer Roberson, who claim their books were among those copied to train Apple’s AI. According to the complaint, the company’s web crawler, known as Applebot, can access so-called “shadow libraries”. These online repositories contain vast numbers of pirated, copyrighted books that have not been licensed for use.
The authors argue that their creative works were used without permission as part of Apple’s efforts to strengthen its artificial intelligence system, branded Apple Intelligence. In their filing, they wrote that Apple had “copied the copyrighted works” to develop AI models that produce outputs which directly compete with and weaken the value of the original books.
They added: “This conduct has deprived Plaintiffs and the Class of control over their work, undermined the economic value of their labour, and positioned Apple to achieve massive commercial success through unlawful means.”
The lawsuit seeks class action status, pointing to the large number of books and authors whose works are believed to be stored in shadow libraries. Hendrix and Roberson emphasised that Apple, one of the most profitable companies in the world, did not offer to pay for their intellectual property despite using it for what they described as “a potentially lucrative venture”.
Growing legal challenges over AI training
This latest case adds to a growing list of lawsuits targeting companies involved in generative AI. Several major firms, including OpenAI, have been accused of building their systems on copyrighted material without proper authorisation.
OpenAI, the company behind ChatGPT, is already facing legal action from The New York Times and other media organisations. The lawsuits centre on claims that the publications' journalism has been used to train AI models, which can then generate outputs resembling original reporting.
Meanwhile, Anthropic, the developer of the Claude chatbot, recently agreed to settle a copyright lawsuit brought by authors. The company will reportedly pay US$1.5 billion to resolve claims that it also relied on pirated books sourced from online libraries. The agreement is expected to cover around 500,000 works, with payments of approximately US$3,000 per work.
Spotlight on AI development practices
The case against Apple reflects a wider debate over the methods used to train artificial intelligence systems and whether they comply with existing copyright law. Many authors and publishers argue that the unauthorised use of their work not only undermines their earnings but also threatens the integrity of creative industries.
On the other hand, technology companies often claim that large-scale data collection is necessary to improve AI performance. They argue that the models depend on vast datasets to generate accurate and useful outputs. However, as lawsuits continue to mount, firms may be forced to rethink how they source training material and consider agreements with rights holders.
The outcome of the Apple case remains uncertain, but it could prove significant in shaping the future of AI regulation and copyright enforcement. If the plaintiffs succeed in achieving class action status, it may open the door for many more authors to pursue claims against the company.