{"id":13227,"date":"2023-07-13T15:00:00","date_gmt":"2023-07-13T13:00:00","guid":{"rendered":"http:\/\/plus.maciejpiasecki.info\/index.php\/2023\/07\/13\/towards-a-definition-of-open-artificial-intelligence-first-meeting-recap\/"},"modified":"2023-07-13T22:24:23","modified_gmt":"2023-07-13T20:24:23","slug":"towards-a-definition-of-open-artificial-intelligence-first-meeting-recap","status":"publish","type":"post","link":"https:\/\/plus.maciejpiasecki.info\/index.php\/2023\/07\/13\/towards-a-definition-of-open-artificial-intelligence-first-meeting-recap\/","title":{"rendered":"Towards a definition of \u201cOpen Artificial Intelligence\u201d: First meeting recap"},"content":{"rendered":"<p>The Open Source Initiative recently kicked off a multi-stakeholder process to define machine learning systems that can be characterized as \u201cOpen Source.\u201d A long list of non-profit organizations, corporations and research groups have joined our call to find a common understanding of \u201copen\u201d principles applied to artificial intelligence (AI).\u00a0<\/p>\n<p>A group of people who work at Mozilla Foundation, Creative Commons, Wikimedia Foundation, Internet Archive, Linux Foundation Europe, OSS Capital and OSI board members met recently in San Francisco to start framing the conversation.\u00a0<\/p>\n<p>Participants, who were not representing their employers, included: Lila Bailey, Adam Bouhenguel, Gabriele Columbro, Heather Meeker, Daniel Nazer, Jacob Rogers, Derek Slater and Luis Villa. The OSI\u2019s Executive Director Stefano Maffulli and board members Pam Chestek, Aeva Black, and Justin Colannino also weighed in during the four-hour afternoon meeting at Mozilla\u2019s San Francisco headquarters.<\/p>\n<p>As the legislators accelerate and the doomsayers chant, one thing is clear: It\u2019s time to define what \u201copen\u201d means in this context before it\u2019s defined for us. 
AI is a controversial term and, for right now, the conversation about what to call this \u201copen\u201d definition is ongoing.<\/p>\n<p>We want you to get involved: Send a proposal to speak at the online webinar series before August 4, 2023 and check out the timeline for upcoming in-person workshops. Up next is the first community review in Portland at FOSSY.\u00a0<\/p>\n<p>Why we\u2019re in this together<\/p>\n<p>This first small gathering aimed to set ground rules and create the first working document of a \u201cDefinition of AI systems\u201d that reflects Open Source values.\u00a0<\/p>\n<p>The group brainstormed over 20 reasons for dedicating time to this milestone project. These included reducing confusion for policymakers, helping developers understand data sharing and transparency, reducing confusion for re-users and modifiers, creating a permission structure and fighting open washing. A few in detail:<\/p>\n<p>Good for business, good for the world\u00a0<\/p>\n<p>Participants agreed there\u2019s value in understanding which startups and technologies to invest in, based on their \u201copen practices\u201d and contributions to the community.<\/p>\n<p>One participant commented, \u201cThe point is not that we need a definition [of open AI] for business. The point is we need a definition to identify people who are doing technology in a way that shares it with the world, and that is what is important. Even if companies fail, they\u2019ve still given something to the world.\u201d<\/p>\n<p>Cracking the black box<\/p>\n<p>The group was sharply divided on the tensions and tradeoffs around transparency in ML training data. There\u2019s a huge question when it comes to the sausage making that is today\u2019s AI systems \u2013 what goes in and what comes out? Who gets to see the ingredients? What data should be transparent \u2013 zip codes, for example \u2013 and what information should not be \u2013 single patient tumor scans?\u00a0<\/p>\n<p>\u201cWhen a private company creates private machine learning models, we have no idea what is forming or shaping those models, to the detriment of society as a whole,\u201d one person commented. Another person added, \u201cI\u2019m very concerned about people blocking access to [their own personal financial or health care] data [that could be] used to train models because we\u2019re going to get inherently biased\u2026I hope that those designing the models are thinking long and hard about what data is important and valuable, especially if there are people saying \u2018you shouldn\u2019t use my medical data to train your model.\u2019 That\u2019s a very harmful road to go down.\u201d<\/p>\n<p>The value of openness<\/p>\n<p>Open Source is about giving users self-sovereignty over their software. Presumably an \u201cOpen AI\u201d would be aimed at delivering self-sovereignty when it comes to use of and input into AI systems. 
Self-sovereignty is the reason field-of-use restrictions are forbidden in Open Source: Such restrictions imply needing permission from a gatekeeper to proceed.<\/p>\n<p>\u201cPart of this work involves reflecting on the past 20-to-30 years of learning about what has gone well and what hasn\u2019t in terms of the open community and the progress it has made,\u201d one participant said, adding that \u201cIt\u2019s important to understand that openness does not automatically mean ethical, right or just.\u201d Other factors such as privacy concerns and safety when developing open systems come into play \u2013 there\u2019s an ongoing tension between something being open and being safe, or potentially harmful.\u00a0<\/p>\n<p>\u201cIt is crucial to establish a document that not only offers a definition of openness but that also provides the necessary context to support it.\u201d<\/p>\n<p>Key debates<\/p>\n<p>Participants generally agreed that the Definition of Open Source, drafted 25 years ago and maintained by the OSI, does not cover this new era. \u201cThis is not a software-only issue. It\u2019s not something that can be solved by using the same exact terms as before,\u201d noted one participant.<\/p>\n<p>\u201cTensions\u201d may have been the word to pop up most frequently in the course of the afternoon. The push-and-pull between best practices and formal requirements, what\u2019s desirable in a definition versus what\u2019s legally possible, and the value of private data (e.g. healthcare) versus reproducibility and transparency were just a few of those tensions.\u00a0<\/p>\n<p>Field-of-Use restrictions<\/p>\n<p>Most participants felt that the new definition should not limit the scope of the user\u2019s right to adopt the technology for a specific purpose. 
A number of AI creators have left projects over ethical concerns, and there has been a push for \u201cresponsible\u201d licenses that restrict usage.\u00a0<\/p>\n<p>\u201cPeople are shortsighted in all the ways that matter,\u201d one participant said, citing the example of Stable Diffusion\u2019s ban on using the deep-learning text-to-image model for medical applications. \u201cThere are researchers who have figured out how to read the minds of people with locked-in syndrome, people who have figured out how to see mental imagery. And yet they can\u2019t help these people and make their lives better because, technically, it\u2019d be violating a license.\u201d These researchers, for context, do not have the millions of dollars necessary to create a Stable Diffusion-type model from scratch, so the innovation is stalled.\u00a0<\/p>\n<p>\u201cWith field-of-use restrictions, we\u2019re depriving creators of these tools a way to affect positive outcomes in society,\u201d another participant noted.\u00a0<\/p>\n<p>While several participants noted their support for the intent behind ethical constraints, the consensus was that licenses are the wrong vehicle for enforcement.<\/p>\n<p>Attribution requirements\u00a0<\/p>\n<p>There was much talk about a \u201clandscape of tradeoffs\u201d around attribution requirements, too. In a discussion about data used to train models, participants said that requiring attribution may not be meaningful because there is no single author. Even though communities like Wikipedia care about acknowledging who wrote what, attribution doesn\u2019t hold up in this context, and the creators of automated AI tools already have ways of being recognized. The length and breadth of these supporting documents are also a factor in skipping such requirements. One group member pointed out that \u201cattribution\u201d for a dataset might result in a 300-million page PDF. \u201cCompletely useless. It would compress well, because most of it would be redundant.\u201d\u00a0<\/p>\n<p>This conversation dovetails with the tension between transparency and observability on the one hand and requirements imposed by other regulations, like privacy and safety, on the other.<\/p>\n<p>Get involved<\/p>\n<p>This half-day discussion is only the beginning. Participants were well aware that the community will need more conversations and more collective thinking before finding common ground. Send a proposal to speak at the online webinar series before August 4, 2023 and check out the timeline for upcoming in-person workshops. OSI members can also book time to chat with Executive Director Stefano Maffulli during office hours.<br \/>\nThe post Towards a definition of \u201cOpen Artificial Intelligence\u201d: First meeting recap appeared first on Voices of Open Source.<br \/>\n<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/plus.maciejpiasecki.info\/wp-content\/uploads\/2023\/07\/meeting-kickoff-mozilla.png\" width=\"925\" height=\"670\"><br \/>\nSource: opensource.org<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Open Source Initiative recently kicked off a multi-stakeholder process to define machine learning systems that can be characterized as 
[&hellip;]<\/p>\n","protected":false},"author":53,"featured_media":13228,"comment_status":"false","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-13227","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-mp"],"_links":{"self":[{"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/posts\/13227","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/users\/53"}],"replies":[{"embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/comments?post=13227"}],"version-history":[{"count":1,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/posts\/13227\/revisions"}],"predecessor-version":[{"id":13229,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/posts\/13227\/revisions\/13229"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/media\/13228"}],"wp:attachment":[{"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/media?parent=13227"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/categories?post=13227"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/plus.maciejpiasecki.info\/index.php\/wp-json\/wp\/v2\/tags?post=13227"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}