Fragility of modern source code or why semver is evil

TL;DR Dynamic “semver” versioning is evil, because it makes it too easy to break things on a global scale. With semver versioning the builds are not-repeatable: you may get one version of a module now and another version 5 seconds later. Combined with deep dependency hierarchies, this creates too many single points of failure, and periodically leads to disaster.TL;DR

Back in October 2016 I found that our software won’t compile, since a developer from Virginia just published a faulty version of a module I never heard of. But not to worry: he promised to fix it once he gets back from the gym.

fix-after-gym

Now, think about it: some guy we don’t know publishes a new version of a module we never heard of, and a couple of hours ago it breaks our compilation. If this is not scary, I don’t know what is.

There are three reasons for this debacle:

  1. The dependencies are in an external repository like npm.
  2. Versioning of the dependencies is dynamic (see the semver specification).
  3. The dependency tree is deep.

It is convenient to store dependencies in a remote repository: it greatly simplifies management of the code. However, this means that to build your code you must be connected to the remote repository, and the modules you are looking for must not be deleted. This alone has serious consequences: in 2016 npm prevented a developer from deleting his own code,  because it would cause too much trouble.

Dynamic versioning makes the problem worse: it makes the builds not repeatable. Even if you don’t touch anything in your code, what you get from the external repository may change from one moment to the next.

Deep dependency tree means that you indirectly depend on hundreds of modules. As a developer, you can tightly control your immediate dependencies by specifying their exact versions, but you don’t have a say on whether their dependencies are exact.

To get back to our story, the faulty module that was to be fixed after gym is named gulp-sourcemaps. It was a 8th order dependency, so we did not even know it existed. The chain of dependencies went something like this:

{our code} => grunt-contrib-imagemin => imagemin => imagemin-optipng => optipng-bin => bin-build => decompress => vinyl-fs => gulp-sourcemaps.

Its immediate user, vinyl-fs version 2.4.3 had the following line in their dependency list:

gulp-sourcemaps: ^1.5.2

This will pull in the latest version that is greater or equal to 1.5.2, but less than 2.0.0. On October 13, 2016 a bunch of faulty versions of gulp-sourcemaps were published, which broke vinyl-fs and everything that depended on it. The author of vinyl-fs reacted quickly (even before Mr. Nmccready got back from the gym), and made this commit: Fix: Pin gulp-sourcemaps because they are breaking things like crazy. The dependency was now pinned to version gulp-sourcemaps v1.6.0, no variation allowed.

In theory dynamic versioning is good for quick bug fixes: it is possible to fix a problem, publish a “compatible” version, and all upstream packages will automatically pick up the fix. Unfortunately, if your version is faulty, all upstream packages will also automatically pick up the fault.

In the Microsoft world, NuGet dependencies were always exact, but since Microsoft recently copy-cats everything JavaScript, NuGet dependencies now also support semver. One can argue that this is optional, and indeed it is, but if the author of your dependency chooses to use semver for his depndencies, there is nothing you can do against it: the moment it happened you lost the repeatable build.

All you can do now is hope that the guy from Virginia comes back from the gym in time before you next release.

3 Comments


  1. Fun times indeed!
    As for NuGet semver dependencies – as long as you pin versions of your dependencies, you are going to be fine, even if somebody down the line chooses not to pin his dependencies

    Reply

    1. Oh, that’s good to know. At least they learnt something from those disasters.

      Reply

      1. That’s not how I meant it, but it is all the same.
        NuGet does not silently pull dependency of dependencies – it tells you what it takes and you are in control of what is taken and what version. If I choose to use package A of v1.1 and it depends on package B of v2.1, it will always be the case on the build server and NuGet packages are immutable, so if in the future package A of v1.2 depends on spoilt package B v2.2, my project will still be taking A v1.1 and B v2.1.
        i.e. packages.config file describes all the nuget packages and their versions and nothing else is installed. I even seen situations where I could remove unneeded (for me) second-level dependencies and everything worked.

        Reply

Leave a Reply

Your email address will not be published. Required fields are marked *