Back in the 2000s, we talked about open source a lot—perhaps too much. We fought about whether code freedom (GPL) or developer freedom (Apache/BSD) mattered more. We wondered when the year of the Linux desktop might finally arrive. (TL;DR never. Or maybe it already happened. Or…whatever.) We chastised companies for “open washing” (anticipating the years of cloud- and AI-washing to come). We debated “open core” business models.
By the 2010s, open source faded into the background as it became essential infrastructure for every developer and company on the planet, whether they knew it or not. Sure, we had sporadic eruptions of fist-shaking at cloud giants for strip-mining open source, and people made earnest pleas for sustainable open source (even as it showed no signs of ever running out), but mostly we pushed open source to the back of our minds, even as it became critical to most everything we do.
Until now. Open source is again top of mind, given its seeming centrality to ensuring AI isn’t commandeered by a few companies, not to mention Redis’ recent decision to change its licensing. The problem is that open source hasn’t kept up with technology trends. There is no such thing as “open source AI,” for example, no matter how much some pretend otherwise. And there’s still no good open source licensing for the cloud. We need to use this open source moment to ensure it’s fit for purpose going forward, but how can we do so with fairness?
Falling behind
I’ve written quite a bit recently about these issues, prompted initially by the difficulty of applying licenses that meet the Open Source Definition (OSD) to artificial intelligence. As Mike Linksvayer, head of developer policy at GitHub, says: “There is no settled definition of what open source AI is.” Every time you hear someone confidently proclaim a large language model is or is not open source, it’s worth wondering how they can be so certain when even the executive director of the Open Source Initiative (OSI), Stefano Maffulli, acknowledges that open source for AI is by no means settled: “We definitely have to rethink licenses in a way that addresses the real limitations of copyright and permissions in AI models while keeping many of the tenets of the open source community.”
The OSI hopes to have guidance by October, but until then, anyone pretending to an absolute certainty about what is or isn’t open source in AI is doing just that: pretending.
To be clear, I don’t think the OSI will radically change the OSD for AI (or cloud). We won’t suddenly see a green light given to discrimination against fields of endeavor, for example. I expect the essential character of open source software to remain, even as we gain clarity on how to apply the OSD to things like floating point numbers, training data, and weights.
I hope we’ll also see the OSI revisit cloud, since I believe its failure to apply the OSD to cloud distribution of software is the primary reason we’ve seen so many companies turn to source-available licenses. Let’s look at why this is the case.
The strange irony of copyleft
When Richard Stallman created the GNU General Public License (GPL), he did so to protect the freedom of code and ensure that code remained free for software users. You could make changes to the code, but if you distributed your modified version, you had to make those changes available under the same license. You couldn’t lock up the code behind proprietary licenses. Later, to make free software more palatable to corporations, a group coined “open source” and a new breed of license was born that said, essentially, “Do whatever you want with this code.”
But there was a strange irony in all this. Back in 2004, I wrote: “We are sitting on the most exciting IT business model capitalism has ever seen, all thanks to the GPL.” A few years later I doubled down on that sentiment, writing, “Most of the successful open source companies … use the GPL.” As I concluded, “The GPL, contrary to popular belief, facilitates a commercial software business.”
Did Stallman intend this? Nope. But that doesn’t matter. What matters is the text of the license, and its power to protect code and user freedom—oh, and to generate cash.
Interestingly, the very license that worked hardest to protect code and user freedom also happened to be the license that most enabled companies to build successful businesses, from Red Hat to MySQL. Once the cloud arrived, however, the GPL lost all potency: its copyleft obligations are triggered by distributing software, and cloud providers run the code as a service without ever distributing it. The Affero GPL hack did little to restore that potency. The AGPL was a poor compromise that failed to protect code/user freedom and failed to give corporations confidence that they could use it. (It did, however, allow cloud companies to make billions by monetizing a steady supply of free and open source software to which they contribute little or nothing. What a bargain!)
At this point, some readers are screaming, “But those so-called open source companies don’t care about code freedom! They only care about money!!”
More free code, more good
Does that matter? Does it matter even a tiny bit? It does not. If the code is free, the downstream user can take it, use it, modify it, and distribute it, so long as they keep the code free and open. As the license says, “You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License.” Why would anyone care about the motivations behind using the license, so long as the end result—code freedom—remains?
I’m convinced that if a company today wrote the GPL exactly as Stallman penned it decades ago, word for word, it would be rejected by the OSI. Why? Because people would claim (probably correctly) that the intent behind the words was different. But should that matter? Shouldn’t the text of the license (which persists long after intentions may have changed) be the standard? Shouldn’t we be grateful for free software, no matter why it was licensed as such?
In 2007, Charlie Babcock wrote, “One of the great ironies of the GPL, 17 years after its creation, is that it has become an unabashed creator of companies that compete effectively.” We can wring our hands over whether companies like MySQL (in its day) used the GPL for freedom-loving purposes or whether it served a corporate end, but in 2024 let’s just be grateful that 30 years after MySQL development began, we still get to use it as free software, thanks to the GPL.
We need to make copyleft real again, updating or creating a new OSI-approved license for the cloud. This will serve the needs of those in Stallman’s camp who want code freedom, as well as the corporate types who want a strong business (which, in turn, fuels the development of more code). Oh, it also serves the needs of those people who like code freedom and also like paying their rent. Either way, we all win because we’ll have more free and open source software.
Copyright © 2024 IDG Communications, Inc.