{"id":18003,"date":"2025-09-10T15:39:54","date_gmt":"2025-09-10T13:39:54","guid":{"rendered":"https:\/\/plus.maciejpiasecki.info\/index.php\/2025\/09\/10\/report-from-oss-eu-2025-and-ai_dev-whats-next-for-osaid\/"},"modified":"2025-09-10T22:49:15","modified_gmt":"2025-09-10T20:49:15","slug":"report-from-oss-eu-2025-and-ai_dev-whats-next-for-osaid","status":"publish","type":"post","link":"https:\/\/plus.maciejpiasecki.info\/index.php\/2025\/09\/10\/report-from-oss-eu-2025-and-ai_dev-whats-next-for-osaid\/","title":{"rendered":"Report from OSS EU 2025 and AI_dev: What\u2019s next for OSAID"},"content":{"rendered":"<p>At this year\u2019s Open Source Summit EU (OSS EU 2025) and AI_dev EU 2025, Stefano Maffulli, Executive Director of OSI, and Jordan Maris, OSI\u2019s EU Policy Analyst, summarized the 10 months since the release of the Open Source AI Definition v1.0. The two talks, at events hosted by the Linux Foundation in Amsterdam, drew audience questions that reflected the maturity of the debate.<\/p>\n<p>The AI dilemma<\/p>\n<p>The presentations quickly summarized why AI had to be treated differently from software. The emergence of AI systems capable of generating code, images, and text presented a fundamental challenge to traditional Open Source concepts. While it was relatively straightforward to determine that Open Source AI should provide the same four freedoms as Open Source software (use, study, modify, and share), defining the equivalent of \u201csource code\u201d for AI systems proved complex.<\/p>\n<p>Unlike traditional software, where humans write code, AI systems\u2014particularly modern machine learning models\u2014operate as black boxes whose behavior emerges from training processes rather than explicit programming. 
This raised the critical question: what constitutes the \u201cpreferred form\u201d for users and developers to study and modify an AI system?<\/p>\n<p>Enter the Open Source AI Definition<\/p>\n<p>In recent years, OSI convened over 100 participants from 27 countries\u2014many from the global south\u2014with the support of the Sloan Foundation and other partners. Our goal was to define what \u201cOpen Source AI\u201d should mean in practice. The process led to the Open Source AI Definition (OSAID), approved by the OSI Board in October 2024. According to this definition, an Open Source AI must provide unrestricted access to:<\/p>\n<ul>\n<li>Model weights and parameters<\/li>\n<li>Code used to build and train the system<\/li>\n<li>Code for dataset creation<\/li>\n<li>The complete training data (or detailed information enabling its reproduction, when distribution isn\u2019t possible)<\/li>\n<\/ul>\n<p>This ensures that the four essential freedoms of Open Source\u2014use, study, modify, and share\u2014apply meaningfully to AI.<\/p>\n<p>Maffulli emphasized that while some people describe openness as a \u201cspectrum,\u201d OSI sees Open Source as a binary gate: a threshold a system either meets or does not. Just as Linux and BSD represent different licensing models but both qualify as Open Source, AI must meet a minimum set of criteria to be considered genuinely Open Source.<\/p>\n<p>Why policymakers should care<\/p>\n<p>Maris highlighted why this clarity is crucial for lawmakers worldwide. Governments are drafting AI legislation in the EU, Canada, the U.S., and China, often with special provisions for Open Source. Without a clear definition, however, \u201cOpen Source\u201d risks being used as a marketing buzzword, undermining trust, competition, and safety.<\/p>\n<p>The definition\u2019s development was significantly influenced by regulatory needs, particularly in the European Union. The EU\u2019s AI Act includes exemptions for Open Source AI, but lawmakers struggled with how to define such systems. 
This challenge became apparent during the Act\u2019s negotiation process, where well-intentioned attempts to consider Open Source created complexity without clear definitional boundaries.<\/p>\n<p>The Open Source AI Definition matters because it ensures:<\/p>\n<p>1. True freedom to use<\/p>\n<p>The OSAID ensures genuine freedom to use AI systems without hidden restrictions. Many models claiming to be \u201cOpen Source\u201d actually impose usage limitations based on user numbers, commercial applications, or other criteria. Such restrictions contradict both the spirit of Open Source and the policy rationale for regulatory exemptions, which assume that Open Source AI can contribute to \u201cresearch and innovation in the market\u201d and \u201cprovide significant growth opportunities.\u201d<\/p>\n<p>2. Legal flexibility through data information<\/p>\n<p>The definition introduces the concept of \u201cdata information\u201d as an alternative to requiring complete dataset publication. This approach addresses several critical legal challenges:<\/p>\n<ul>\n<li>Data protection compliance: Particularly relevant for medical AI, where patients may consent to their data being used for beneficial AI development but not for public distribution<\/li>\n<li>Copyright law variations: International differences in copyright law (exemplified by Italy\u2019s state copyright on the statue of David) make universal dataset sharing legally problematic<\/li>\n<li>Text and data mining exceptions: EU copyright law allows AI training on copyrighted material but doesn\u2019t extend to redistribution rights<\/li>\n<li>Accidental copyright inclusion: Large datasets inevitably contain some copyrighted material; discovering this shouldn\u2019t invalidate a model\u2019s Open Source status<\/li>\n<\/ul>\n<p>3. Downstream risk analysis and compliance<\/p>\n<p>The definition enables critical downstream compliance verification. 
Developers building derivative AI systems need sufficient information to ensure their creations comply with legal requirements, including:<\/p>\n<ul>\n<li>Data accuracy and quality standards (required under the EU AI Act)<\/li>\n<li>Data protection law compliance<\/li>\n<li>Bias and safety risk assessment<\/li>\n<li>Copyright clearance verification<\/li>\n<li>Security validation against data poisoning attacks<\/li>\n<\/ul>\n<p>Without transparency, derivative AI systems risk legal liability and potential harm to users\u2014slowing innovation and adoption.<\/p>\n<p>What we\u2019ve seen in 2025<\/p>\n<p>Several encouraging developments suggest movement toward true Open Source AI:<\/p>\n<ul>\n<li>Increasing openness: Projects like Granite and DeepSeek R1 are progressively releasing more training code and infrastructure<\/li>\n<li>Open dataset growth: More truly open training datasets are becoming available, such as EleutherAI\u2019s Common Pile<\/li>\n<li>Reduced barriers: Government and institutional initiatives are providing compute resources for AI development<\/li>\n<li>State funding: Various governments are funding the development of Open Source AI systems<\/li>\n<li>Open Source AI models: Organizations like AI2 with their OLMo project are developing models trained entirely on open datasets<\/li>\n<\/ul>\n<p>But challenges remain. Many still misuse the term \u201cOpen Source AI\u201d to mean \u201copen weights only,\u201d which OSI continues to push back against. Copyright and data provenance remain complex issues, especially given global legal variation.<\/p>\n<p>The next frontier: data governance<\/p>\n<p>Looking ahead, OSI is doubling down on data governance and interoperability. 
We will host Deep Dive: Data Governance (October 1\u20133, 2025), a free online event bringing together experts to explore data standards, legal frameworks, and best practices for building trustworthy AI systems.<\/p>\n<p>Get involved<\/p>\n<p>OSI is committed to monitoring the field and evolving the definition as technology and practices mature. We continue to seek broader community engagement, particularly from developers working with non-generative AI systems in fields like biotechnology, medical applications, and computer vision. These diverse use cases help inform the definition\u2019s evolution and ensure broad applicability. Together, we can ensure that AI development remains open, fair, and empowering for everyone\u2014developers, researchers, and society at large.<\/p>\n<p>OSI is calling on the community to:<\/p>\n<ul>\n<li>Share your use cases \u2013 whether in LLMs, robotics, biotech, or beyond.<\/li>\n<li>Attend the Deep Dive: Data Governance conference \u2013 registration is free, and it\u2019s your chance to shape the conversation about public, Open Source AI.<\/li>\n<\/ul>\n<p>Resources<\/p>\n<p>Watch the full recordings here:<\/p>\n<ul>\n<li>OSS EU 2025 video recording<\/li>\n<li>AI_dev EU 2025 video recording<\/li>\n<\/ul>\n<p>Source: opensource.org<\/p>\n","protected":false},"excerpt":{"rendered":"<p>At this year\u2019s Open Source Summit EU (OSS EU 2025) and AI_dev EU 2025, Stefano Maffulli, Executive Director of OSI, 
[&hellip;]<\/p>\n","protected":false},"author":72,"featured_media":0,"comment_status":"false","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-18003","post","type-post","status-publish","format-standard","hentry","category-mp"],"_links":{"self":[{"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/posts\/18003","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/users\/72"}],"replies":[{"embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/comments?post=18003"}],"version-history":[{"count":1,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/posts\/18003\/revisions"}],"predecessor-version":[{"id":18004,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/posts\/18003\/revisions\/18004"}],"wp:attachment":[{"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/media?parent=18003"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/categories?post=18003"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/tags?post=18003"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}