By Tom West
As scrutiny around the use of unlicensed, copyright-protected content as AI training data grows, collective licensing is emerging as a practical way to balance innovation, transparency and fair remuneration.
Generative AI has quickly become a strategic priority for businesses across Europe, with much of the conversation focused on productivity, efficiency and new creative possibilities. But as AI adoption moves from experimentation into large-scale commercial deployment, attention is beginning to shift towards a more fundamental question: how these systems access and use the content they depend on.
For businesses deploying AI technologies, this is no longer simply a technical issue. Questions around transparency, provenance and lawful access to content are becoming increasingly important, particularly in regulated and reputation-sensitive environments where trust matters as much as capability.
Trust and transparency
At the centre of the debate is copyright-protected content. Generative AI systems rely heavily on journalism, books, research, images and other trusted, professionally created material to generate reliable and useful outputs. For publishers, authors and creators, the issue is becoming increasingly straightforward: if creative work contributes value to AI systems, its use should be authorised, transparent and properly recognised.
For a period, parts of the AI sector operated on the assumption that publicly accessible content could simply be absorbed into model training without permission or accountability. That position ignores the basic principles of copyright and is becoming more difficult to sustain as legal scrutiny increases and governments begin developing clearer regulatory frameworks around AI and intellectual property. The European Union’s AI Act, alongside wider debates around copyright reform in the UK and elsewhere, reflects a broader shift towards transparency and accountability in AI development.
Businesses themselves are also becoming more cautious. Organisations investing in AI technologies increasingly want clarity around where training data comes from, whether permission has been granted for its use and whether those systems could withstand scrutiny from regulators, customers or partners. This is not simply about compliance. It is about trust.
The quality of AI outputs is directly shaped by the quality of the material used to train them. Systems built on trusted and professionally created content are more likely to produce accurate, reliable and commercially useful outputs. Equally, uncertainty around data sourcing creates legal and reputational risks that many businesses will be reluctant to absorb over the long term.
This is why provenance is becoming a much more central issue within AI adoption. Rather than being treated as a barrier to innovation, lawful access to high-quality content is increasingly being recognised as part of the infrastructure required for AI to scale responsibly.
A scalable solution
The challenge, however, is scale. Generative AI large language models require access to enormous volumes of material, often covering millions of individual works across multiple sectors and territories. Negotiating permissions on a one-to-one basis is unrealistic for both rights holders and AI developers, particularly as demand for high-quality training data continues to grow.
This is where collective licensing is becoming an increasingly important part of the conversation.
Collective licensing allows rights from multiple rights holders to be aggregated into a single framework, creating a more practical route to lawful access at scale. It is a model with a long history within publishing, media and music licensing, where it has enabled organisations to access large repertoires of content efficiently while ensuring creators are compensated for the use of their work.
That same approach is now being adapted for generative AI.
In the UK, Publishers’ Licensing Services (PLS), working alongside the Copyright Licensing Agency (CLA) and the Authors’ Licensing and Collecting Society (ALCS), has developed one of the first collective licensing models specifically designed for generative AI. The framework allows publishers to opt in and make content available under clear and transparent licensing terms, creating a more practical route for AI developers seeking lawful access to high-quality material at scale.
It also reflects a broader recognition across both the publishing and AI sectors that long-term adoption will depend on clearer frameworks around permission, transparency and remuneration. As AI technologies become more commercially embedded, the expectation that content use should be authorised and accountable is likely to become increasingly difficult to avoid.
Shared interests
The significance of collective licensing is therefore not simply legal; it is also practical and economic. AI companies need scalable access and permission to use trusted content, while publishers and creators need mechanisms that allow them to participate fairly in the value AI generates. Collective licensing creates infrastructure capable of supporting both objectives simultaneously.
This is sometimes framed as a tension between innovation and rights protection, but that increasingly feels like a false divide. The long-term success of generative AI depends on access to reliable, professionally created and diverse content. At the same time, the sustainability of the creative economy depends on ensuring that creators and publishers are not excluded from emerging AI markets built, in part, on their work.
That balance matters beyond the largest technology companies and media organisations. Without scalable licensing frameworks, participation in AI-related opportunities risks becoming concentrated among a relatively small number of businesses capable of negotiating direct agreements at scale. Smaller publishers, independent creators and specialist media organisations could easily find themselves excluded despite producing content that is valuable precisely because of its quality, expertise and diversity.
That would not only create economic imbalance within the creative industries; it could also weaken AI systems themselves. Diversity of content is an important factor in producing reliable and representative AI outputs. Narrower datasets inevitably produce narrower perspectives and weaker results.
Collective licensing offers a way to avoid that outcome by creating a broader and more inclusive market for participation. It allows organisations of different sizes to contribute to and benefit from the AI ecosystem without requiring every negotiation to happen individually. For AI developers, that means access to richer and more diverse datasets. For rights holders, it creates a clearer route to recognition and remuneration in a rapidly changing technological environment.
A more sustainable model
More broadly, what is beginning to emerge is a shift away from extraction and towards collaboration. Early generative AI development was often characterised by limited transparency around how content was sourced and used. That approach is becoming increasingly difficult to defend legally, commercially and reputationally as expectations around accountability continue to evolve.
A more sustainable model is beginning to take shape, one in which AI developers, AI users, publishers and policymakers recognise that their interests are not necessarily in conflict. AI systems depend on trusted, high-quality content. Creative industries need workable ways to license and monetise that content. Policymakers want frameworks that support innovation while also protecting long-term economic and cultural sustainability.
Collective licensing will not resolve every issue surrounding AI and copyright, nor is it the only model that will emerge as the market develops. But it does offer one of the clearest and most practical frameworks currently available for balancing innovation with transparency and fair remuneration.
As generative AI continues to evolve, the organisations most likely to succeed over the long term will be those able to demonstrate not only technical capability, but also trustworthiness. Increasingly, that means showing that AI systems are built on foundations that are lawful, transparent and sustainable.
Conclusion
The debate around generative AI is moving beyond what the technology can do and towards how it should operate responsibly at scale. As scrutiny around training data, copyright and transparency increases, lawful access to high-quality content is becoming an increasingly important part of sustainable AI development.
Collective licensing offers a practical way to support that shift, creating a framework that enables innovation while recognising the value of the creative work AI systems rely on.
About the Author
Tom West is Chief Executive of Publishers’ Licensing Services (PLS), where he leads the development of collective licensing solutions for the publishing and creative sectors. With over two decades at PLS, he has deep expertise in rights management, licensing innovation and content access, playing a key role in expanding services and advancing sustainable models for the use of content in emerging technologies.








