The Tragedy of Open Source
There’s no denying that open source has become the de facto methodology of software development, especially for new and emerging fields. Playing on Marc Andreessen’s quote, Dr. Ibrahim Haddad of Samsung has said ‘open source software is eating the world.’
Open source has been great at lowering the barriers to participating in software development, and that software is everywhere: from cameras to refrigerators to the applications that every business builds and depends on. However, from a project perspective, there is nothing that requires you to continue to pay attention to a project you previously dedicated time to. During my time as a Director of the Apache Software Foundation (ASF), my peers and I spent a considerable amount of time each month looking at the hundreds of projects that call the ASF home. We were evaluating them for a number of things, but one of the primary concerns was whether enough people still cared for each project. That care manifests in regular releases and in the rapid resolution of security issues. We weren’t as concerned with development velocity as with whether someone was ensuring that the code already released to the public was getting adequate attention, and that users of the software weren’t being left in the lurch.
Often we get lost in looking at how many users a given piece of software has to determine its success, but user count is only a piece of the equation. Far too often we have software with millions of users to which developers are no longer paying attention, or are paying attention to new features alone. Linus’s law states that with enough eyeballs, all bugs are shallow. But all too often, as Stephen Bellovin, a computer science professor at Columbia University, wrote, they can be ‘eyeballs more consumed with new features than quality’.
Over the past few years, we have seen a number of widely used and adopted projects that weren’t receiving enough ongoing maintenance attention. In many cases they were considered the reference implementation. Occasionally they even had folks who were actively paid to develop new functionality and features. For better or worse, the software that provides the underlying foundation of everything we do often falls prey to the assumption that it is simply there and working, with no thought as to whether anyone is paying attention.
As a result of a number of discussions, Daniel Gruno at the ASF developed a formula whose result was, in jest, called the ‘Pony Factor’: the smallest number of committers responsible for 50 percent of the commits to a codebase over the past two years. Of course, we ran this against all of the repos at the ASF and looked at the results. Some were surprising, like the Pony Factor for Subversion being 7, and httpd’s being 9. That is, nine people were responsible for approximately 50 percent of recent commits to the Apache web server.
Of course, this metric isn’t perfect. The Pony Factor gives the best indication of whether software is well maintained when it is applied to relatively mature codebases. It can’t distinguish a flurry of new-feature activity from steady maintenance work; what it does give us is a sense of how diversified a project’s development is. Is the project reliant on one or two people, or on many?
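As a minimal sketch, the Pony Factor described above can be computed from per-author commit counts. The function name and input format here are illustrative, not the ASF’s actual tooling; in practice the author list would come from something like `git log` filtered to the past two years.

```python
from collections import Counter

def pony_factor(commit_authors, threshold=0.5):
    """Smallest number of committers who together account for at
    least `threshold` (default 50 percent) of the given commits.

    `commit_authors` is an iterable of author names, one entry per
    commit, already filtered to the window of interest (e.g. the
    past two years).
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0
    cumulative = 0
    # Walk authors from most to least prolific until their combined
    # share of commits crosses the threshold.
    for n, (_author, c) in enumerate(counts.most_common(), start=1):
        cumulative += c
        if cumulative / total >= threshold:
            return n
    return len(counts)

# One dominant committer: alice alone has 60% of commits.
commits = ["alice"] * 6 + ["bob"] * 3 + ["carol"]
print(pony_factor(commits))  # -> 1
```

A low result like this is exactly the concentration risk the metric is meant to surface: the project’s recent history rests on very few shoulders.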
Why did we go to these lengths? Because there is very much a risk of what, in economic terms, would be called the Tragedy of the Commons. This theory states that when there is a shared resource (in this case, open source software), individual users act in their own short-term best interest, contrary to the common long-term good of all users.
Of course, we aren’t supposed to be applying 19th-century economic theories born of an agrarian society to today’s post-scarcity world of software. The cost to consume open source software today is minuscule. However, much like communal grazing lands, rail and road infrastructure, office buildings, or even the environment, unfettered consumption without any thought to the long-term well-being of the resource being consumed is a recipe for disaster.
The reality is that today we rely on scores of ‘critical’ pieces of software, software that generally doesn’t have VC-funded startups or massive software corporations looking after it. I tend to think of this as the plumbing of our current technology platforms. We likely don’t even consider the software; it’s just part of the environment we’re using, not necessarily a conscious decision on our part to use it; software like OpenSSL, bash, and ntpd. When we do use it, we incur an odd kind of technical debt: the software was essentially free for us to download and install, but it still must be maintained by someone.
This reminds me of Robert Heinlein’s TANSTAAFL in The Moon Is a Harsh Mistress, which expands to ‘There Ain’t No Such Thing As A Free Lunch’. The idea that all open source is zero cost to consume and that we can leverage it for free is the very thing that I worry sets us up for a tragedy of the commons in software. When a piece of software can have hundreds of millions of users but can’t garner half a dozen developers, there is a problem, or soon will be one. I don’t say that to denigrate free and open source software; on the contrary, I think free and open source software gives us tremendous leverage by allowing us to stand on the shoulders of giants. What it doesn’t do is give us a pass to ignore the maintenance and security needs of the software we are using; at a minimum, we need to recognize the technical debt we are assuming when we use open source software. A better plan would be to do our own risk evaluations for the pieces of the software stack we make use of, and make an informed decision about how much risk we are willing to accept, and where we need to mitigate that risk by making strategic investments.