Around two months ago, thousands of applications ran into errors, all because of the actions of one man.
Over the course of this piece, I will be aiming to do three things:
- Help you understand how a single coder could wreak so much havoc.
- Help you understand how copyright works in relation to open source software.
- And finally, help you understand how software development actually works. I have dealt with this in the final section because it is a bit technical, but I strongly urge you to go through it as that will help you understand some of the finer points that I make in the earlier parts of this piece.
How it all started…
The software in question, npm, does two things:
- It provides you with a library of pre-made code.
- It provides you with a platform to add new code that makes use of the pre-made library.
It is pertinent to note that the library is not static: users continually add modules (pieces of code) to the repository.
One fine day, a disgruntled npm user, instead of adding a module, deleted his own module (called “left-pad”) from the library. It was such a basic piece of code that even very simple software used it. As a result, all the more complex software that depended on “left-pad”, directly or indirectly, ran into errors.
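To give a sense of how small such a module can be, here is a minimal sketch of left-padding in Python. The original module was written in JavaScript, so the function name, parameters and behaviour below are illustrative assumptions rather than the author's actual code.

```python
def left_pad(text, length, fill=" "):
    """Pad `text` on the left with `fill` until it is `length` characters long.

    Assumption for this sketch: `fill` is a single character.
    """
    text = str(text)
    missing = length - len(text)
    if missing <= 0:
        # Already long enough: return the text unchanged.
        return text
    return fill * missing + text


print(left_pad("5", 3, "0"))
print(left_pad("hello", 3))
```

A few lines like these are easy to rewrite, which is exactly why so much software pulled them in as a ready-made module instead.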
Here is an illustrated version of the problem:
In our case, Code A is “left-pad”.
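The same chain of dependence can be sketched in a few lines of Python. This is only a toy model: the registry is a plain dictionary, and the package names other than “left-pad” (code-b, code-c, code-d) are hypothetical.

```python
def find_broken(registry):
    """Return the sorted names of packages that can no longer be installed.

    `registry` maps each package name to the list of packages it depends on.
    A package is broken if any direct or indirect dependency is missing.
    """
    def installable(name, seen):
        if name not in registry:
            return False          # the package was removed from the library
        if name in seen:
            return True           # already being checked (guards against cycles)
        seen = seen | {name}
        return all(installable(dep, seen) for dep in registry[name])

    return sorted(name for name in registry if not installable(name, frozenset()))


registry = {
    "left-pad": [],               # Code A: depends on nothing
    "code-b": ["left-pad"],       # depends on Code A directly
    "code-c": ["code-b"],         # depends on Code A indirectly
    "code-d": [],                 # unrelated to Code A
}

print(find_broken(registry))      # nothing is broken yet

del registry["left-pad"]          # the author unpublishes Code A
print(find_broken(registry))      # code-b and code-c now fail
```

Note that code-c breaks even though its author never referred to “left-pad” at all.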
The chaos did not subside until npm “un-unpublished” the code (re-established the left-pad module). The issue is that npm did not obtain the disgruntled user’s consent before re-establishing the module, so we need to consider the legality of such an act.
The Legality
Before we look into npm’s policies and licensing agreements, let me briefly explain how copyright in code works. The general rule is that the author of the code (the coder) owns the copyright. He can choose how he wants others to use it through licensing. In other words, the extent and manner in which you can use somebody else’s code depends on the kind of license agreement that you have with him.
The current case is slightly more complicated because the npm library that we are concerned with is an open source library. On an open source platform, the moment you feed your code to the library, you have effectively agreed to license it to anybody under the open source license adopted by the platform. In this particular case, the adopted license was the Artistic License 2.0 (a standard license) with a few additions.
Let us have a quick look at the relevant part of AL 2.0:
“(2) You may Distribute verbatim copies of the Source form of the Standard Version of this Package in any medium without restriction, either gratis or for a Distributor Fee, provided that you duplicate all of the original copyright notices and associated disclaimers.”
What one needs to note is that, in this particular case, the license came into existence the moment the user fed the code to the library and continued to exist even after the code was deleted. So, according to a joint reading of the license and the npm policy, npm’s act of “un-unpublishing” code that has been fed into the open source library is perfectly legal, provided that npm complies with the other conditions of AL 2.0 (such as duplication of notices).
Change in Policy
Furthermore, npm has now changed its policy so that users can no longer wreak havoc, even temporarily, by deleting their modules. Once 24 hours have passed since a module was published, the author can no longer delete it at will. He has to contact the npm support staff and request deletion. Only after the support staff decide that the consequences are minimal will they take the code down.
Despite taking a lot of control away from authors, I believe that npm’s change in policy is quite sound. There are strong reasons for maintaining a “modular” environment. And in order to sustain a system wherein thousands of coders depend on interdependent code, it makes sense to restrict an author’s ability to unpublish and wreak havoc.
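As a rough sketch, the 24-hour rule described above could be expressed as follows. This is not npm’s actual implementation: the function name, the constant and the treatment of the exact 24-hour boundary are all assumptions made for illustration, and the dates are arbitrary.

```python
from datetime import datetime, timedelta

# Assumed length of the window in which an author may delete without help.
SELF_SERVICE_WINDOW = timedelta(hours=24)


def can_unpublish_without_support(published_at, now):
    """Return True if the author may still delete the module on his own.

    Assumption: exactly 24 hours after publishing already counts as too late.
    """
    return now - published_at < SELF_SERVICE_WINDOW


published = datetime(2020, 1, 1, 12, 0)                                       # arbitrary example date
print(can_unpublish_without_support(published, datetime(2020, 1, 1, 18, 0)))  # 6 hours later
print(can_unpublish_without_support(published, datetime(2020, 1, 3, 12, 0)))  # 2 days later
```

Outside the window, the decision moves from the author to the support staff, which is the point of the new policy.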
In order to understand the significance of a “modular” environment, I would urge the enthusiastic reader to go through the primer below (last section). It is slightly complicated, but I have tried to do my best to keep it simple.
A New Factor
This entire drama brings us to a rather interesting juncture. Please note that in this section I will be making a more general point about Intellectual Property, which this particular fact scenario has uncovered.
In reductionist speak: I.P. law aims to promote innovation by incentivizing creators, while keeping the fundamentals of creation outside the scope of protection. But the issue under discussion made me consider the possibility of factoring in another consideration.
I wondered:
Normatively, shouldn’t the manner in which creation takes place in a particular field also determine the manner in which rights should be distributed?
I have not done enough research to talk about the extent of significance that needs to be given to the “field specific creation process”, but it definitely needs to be a factor. I.P. law should not only look at what incentivizes a creator and what is available to a creator, but must also give importance to the manner in which creation takes place.
For instance, in the current scenario, consider a hypothetical wherein I.P. law overrules licenses and gives coders the right to delete their modules of code. This would incentivize creators (through greater control over their code) and would not effectively reduce access to content (since the code can be republished). Yet the mere deletion of a module wreaks havoc, at least until it is republished. Therefore, while deciding the distribution of rights through I.P. law, legislators should take this into account as well.
Software Development Primer
Software development is nothing but the production of instructions that computers can understand. Now, imagine a robot at the green dot. You need to give it instructions to chauffeur your mother to the red dot. “Chauffeur mother to red dot” might seem like a simple instruction, but robots and computers need excruciatingly detailed instructions. A friend tried his hand at it and his instructions were:
- Get Keys.
- Get Mom.
- Get in Red Car.
- Get to Red Dot.
Needless to say, he didn’t exactly share the same enthusiasm as I did. Anyway, what we need to understand here is that each of the above instructions has a great deal of complexity within it. A plethora of intricacies arise: how would the robot locate the keys, how would it start the car, how would it get to the red dot, et cetera. Explicit instructions need to be provided for every movement. Since it is incredibly painstaking to write instructions for every small step, developers work in what people call a “modular” environment.
Developers of the past will already have written instructions for each of those small steps. Today’s developers only have to put those “modules” or “packages” of instructions together to suit their specific need and context. You don’t produce each piece, but rather just put them together to create whatever takes your fancy. Of course, you might have to write your own instructions as well, but a lot of it can be borrowed.
There is another advantage of working in a modular environment. Consider the fourth step of the robot example (Get to Red Dot). Note that there are multiple ways of doing so. One could take any number of longer paths to get to the same spot, but there are only a limited number of efficient ways of doing so. More often than not, coders (software developers) do not arrive at the most efficient code (instructions) the first time around. Coders continually strive for greater efficiency and, as a result, the code for the same function is constantly updated. If you use modules, your complex software also improves as a whole as the bits and pieces get better.
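The robot example can be sketched in Python to show what “putting modules together” looks like. Each function below stands in for a module written by some earlier developer; the detailed sub-steps inside them are invented purely for illustration.

```python
# Each function stands in for a "module" written by an earlier developer.
# The detailed sub-steps are hypothetical and only illustrate the idea.

def get_keys():
    return ["walk to the key rack", "pick up the car keys"]


def get_mom():
    return ["walk to the living room", "escort Mom to the garage"]


def get_in_red_car():
    return ["unlock the red car", "seat Mom", "sit in the driver's seat"]


def get_to_red_dot():
    return ["start the engine", "drive to the red dot", "park"]


def chauffeur_mother():
    """Today's developer only assembles the existing modules in order."""
    steps = []
    for module in (get_keys, get_mom, get_in_red_car, get_to_red_dot):
        steps.extend(module())
    return steps


for number, step in enumerate(chauffeur_mother(), start=1):
    print(f"{number}. {step}")
```

The only code written for this specific task is chauffeur_mother; everything else is borrowed.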
Before moving any further, let me clarify a small concept. There are two ways earlier code (instructions) can be used. Either you can copy the earlier code and paste it into your new software, or you could directly insert a module into the new software. Copy-Pasting won’t reap the benefits of the modular environment to the fullest extent, because the copy-pasted code will stay as it is and not modify itself with updates. By inserting a module (a different technique), you become eternally dependent on other coders (unlike in the case of copy-pasting), but you reap the benefits of continual improvements upon it.
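The trade-off between the two techniques can be illustrated with a toy Python sketch. The shared library is modelled as a dictionary, and the route functions and their contents are hypothetical. One application keeps a frozen copy of the code, while the other looks the module up every time it runs.

```python
def route_v1():
    # First attempt: works, but takes a detour.
    return ["main road", "detour", "side street", "red dot"]


def route_v2():
    # Later improvement by the module's author: a shorter path.
    return ["main road", "red dot"]


# Toy stand-in for the shared library of modules.
shared_library = {"get_to_red_dot": route_v1}

# "Copy-paste": take a snapshot of the code as it exists today.
copied_route = shared_library["get_to_red_dot"]


def app_with_copy():
    return copied_route()


def app_with_module():
    # "Inserting a module": look the code up in the library on every run.
    return shared_library["get_to_red_dot"]()


# The module's author publishes an improvement.
shared_library["get_to_red_dot"] = route_v2
copy_after_update = app_with_copy()       # still the old, longer route
module_after_update = app_with_module()   # picks up the shorter route

# The module's author deletes the module.
del shared_library["get_to_red_dot"]
copy_after_delete = app_with_copy()       # unaffected by the deletion
try:
    app_with_module()
    module_broke = False
except KeyError:
    module_broke = True                   # the dependent application fails

print(len(copy_after_update), len(module_after_update), module_broke)
```

The application that depends on the module gets the improvement for free, and is also the one that fails when the module disappears.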
So in the robot example, the robot would improve its efficiency as other software developers (independent of your software) work on improving the efficiency of every minute step, while you only bother about the larger task at hand. An analogy could be drawn to constructing a building. An architect bothers only about the design of the building, while engineers concern themselves with improving the strength of the iron bars. The architect need not be too concerned about the engineer, but no one would argue that the engineer is useless to the architect.
Therefore, sustaining a “modular” environment is in the best interests of the coding community and npm has taken a step in the right direction by changing its policy to better sustain such a system of software development.
Chief References:
Software and Intellectual Property, Suitability of Modular Systems, Who owns the Code