Navigating Copyright, Data Governance, and AI Training: Multijurisdictional Policy Assessment
- DOI
- 10.2991/978-2-38476-547-8_6
- Keywords
- Artificial Intelligence; Machine Learning; Generative AI; Copyright; Data Governance
- Abstract
The fast-paced development of generative “Artificial Intelligence” (AI) has put copyright in a critical position. Training these AI models requires massive and systematic copying of creative works, which leaves creators, authors, and artists uncertain about how their works are being used, and the models themselves make this process opaque. The lack of a global framework on whether AI training infringes copyright, combined with this lack of transparency, leads to legal and ethical ambiguity. On the one hand, creators risk losing control, recognition, and financial benefits; on the other hand, companies keep benefiting from unrestricted access to data, thus intensifying the power imbalance. In India, these policy gaps are more pronounced, for instance, in the scope of Section 52 of the Copyright Act, the absence of text and data mining exceptions, the lack of disclosure requirements, and the absence of AI-centric regulation and guidelines.
Training generative AI requires copyrighted works on a large scale. The Copyright Act, 1957, does not contain any AI-related provision, which creates uncertainty around text and data mining. In certain cases, AI models use copyrighted content without the necessary permissions, and raising transparency concerns alone cannot resolve this issue. Developers depend on “Fair Use,” which was not intended to cover large-scale automated copying, and it is difficult to draw the line between copyright infringement and fair use. On February 11, 2025, a U.S. district court in “Thomson Reuters v. Ross Intelligence” took a limited view of “Fair Use” in the training of AI models.
The objective of this study is to critically analyze the legal and ethical concerns arising from AI platforms using copyrighted materials to train their models. To achieve this, the research adopts a doctrinal and comparative approach to analyze copyright, data governance, and AI training across select global jurisdictions. It examines statutory provisions, regulatory compliance, and policies to understand how the current copyright framework addresses, or fails to address, the challenges posed by AI models. It focuses on transparency requirements, fair use, and the unauthorized use of copyrighted materials in AI training datasets. This helps identify the gaps in the current legal framework and assess and recommend legal reforms that can balance technological advancement, innovation, and creators’ rights.
- Copyright
- © 2026 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Maitreyee Bapat
AU - Isha Kul
PY - 2026
DA - 2026/03/05
TI - Navigating Copyright, Data Governance, and AI Training: Multijurisdictional Policy Assessment
BT - Proceedings of the International Conference on Socio Legal Intricacies of Artificial Intelligence (ICSLIAI 2026)
PB - Atlantis Press
SP - 41
EP - 46
SN - 2352-5398
UR - https://doi.org/10.2991/978-2-38476-547-8_6
DO - 10.2991/978-2-38476-547-8_6
ID - Bapat2026
ER -