Interim: ASML manual builds and continuous integration

❝Manual builds and continuous integration evolving.❞

Within the first months of arriving at ASML, possibly even in the first month, I was asked to assist with building the application and preparing it for release. This was a manual process at the time, albeit assisted by Apache Maven for the builds themselves.

There were surprisingly few attacks surrounding these topics specifically, though a connection with Proguard is discussed in another topic. This post is essentially an interim that further describes my contributions. There are references to circumstances involving harassment that are described in detail in future posts. There is, admittedly, one matter concerning a different department “facilitating” the release process that caused significant friction; it is better discussed in a separate topic.

Early releases

Every few weeks, we would compile a release build with some teams available for testing. This would also occasionally include supporting teams with (last-minute) integration of work into the main branch. We all know how these things go, especially before everything has been fully integrated and smoothed out. I assisted several people and teams with merge problems using git. In the process, I instructed many in the command-line tools so that they could gain insight into confusing conflicts. The visual (UI-based) merge tooling would more than once cause confusion because it could not properly represent the state of the merge and its file conflicts. I helped several people this way, and we essentially established and documented the basic steps then. One of the people I helped would go on to become the scrum master of my third team later, and proceed to backstab me on several occasions.

During the early releases, we would make sure the work was integrated and build the several tools that were all part of one huge build process, at the time spanning 1500+ modules. When I got there, the state of the build was such that a full build took over an hour. We couldn’t complete a full build without also building the tests, even if they wouldn’t be run: they were intertwined in such a way that the build stopped on test-compilation failures even when the tests themselves were excluded. The releases at the time didn’t even include an installer at first; regardless, the product started out as a bundle of separate applications.

As we smoothed out the established process, I stopped being involved in the releases themselves, in part because I had become better read into the functional domain and was able to take on larger tasks, having been there for more than a few weeks. At that point there were still some manual steps involved, and the build was pretty much just that: building the packages.

Trouble running a release-build

After I became less involved in the releases and was fully engaged in development work with the team, we ran into a week-long stumbling block called “dependency injection”. This was also around the time that the number of applications started to grow, the applications got integrated into a single framework known among devs as the “Main Application”, and desktop tooling started to play a non-negligible role. The exact timing of this particular incident isn’t really of any concern, because the root cause is quite isolated and self-explanatory, but I think the chronology is reasonably accurate.

I think it was on a Friday, the usual day on which we’d integrate the last remaining work and build the release. The release had been fully built, and only very late was it found out that the application couldn’t start. That is, it would start and immediately exit. Now, for a bit of context: the server component of the application had existed for a while longer. Many of the modules in use were shared between the server and the desktop application, sometimes as a basic client and other times as a stand-alone tool with its own functionality.

The server software was fine (to my knowledge). The problem manifested itself specifically in the way the desktop application was started: it didn’t start at all. Logging indicated a very early exit because the CDI framework either could not resolve the dependencies, or could resolve them but not into a sound, coherent configuration. CDI is present because, as inherited from the server component, we needed a way to wire up the modules, and it was this initial wiring that failed. On a code-base of 1500+ modules, with a large number of dependencies reaching into the desktop application, this was far from a transparent issue. Given that everyone integrated their work late, everything came together on that day, and only afterwards was it discovered that resolution failed somewhere.
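
For readers less familiar with CDI, here is a minimal sketch of this failure mode, assuming the standard javax CDI annotations; the class names are invented purely for illustration. The code compiles fine on its own, but if the module contributing the implementation never makes it into the assembled desktop application, the container refuses to bootstrap at all.

    import javax.enterprise.context.ApplicationScoped;
    import javax.inject.Inject;

    // Defined in a shared API module; an implementation is expected to be
    // contributed by some other module when the application is assembled.
    interface MeasurementRepository {
        double latestValue(String channel);
    }

    // Part of the desktop application.
    @ApplicationScoped
    public class MeasurementView {

        // The CDI container must find exactly one enabled bean implementing
        // MeasurementRepository. If the module providing it is missing from
        // the packaged application (or two modules provide one), bootstrapping
        // aborts with an unsatisfied or ambiguous dependency error before any
        // window is shown: the application starts and immediately exits.
        @Inject
        private MeasurementRepository repository;

        double currentReading() {
            return repository.latestValue("stage-position");
        }
    }

With 1500+ modules wired this way, the error names the missing type, but not which integration or packaging change made it disappear, which is why tracking it down took days.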

Several devs made various attempts to get matters resolved. I taught a few people how to use git-bisect in an attempt to narrow down the offending commit faster. However valuable that may have been, the 1+ hour build times were a real stumbling block. Getting the whole mess resolved took several working days. In the meantime, most developers would be reading up on or preparing project information, trying out development experiments, or doing some much-needed refactoring.

The first time, it took 5 or so days to get matters resolved. The second time, which I remember less prominently, probably took less time and was less impactful. I asked a colleague and they said that such incidents would happen a few times each year.

Now it’s interesting to consider having 50+ developers (early on), all blocked or constrained in their development work for 5 or so days, 3 or so times a year, because dependencies aren’t under control. These numbers add up quite significantly: roughly 50 developers × 5 days × 3 incidents comes to on the order of 750 developer-days a year.

The topic of Contexts and Dependency Injection (CDI) is in itself worth discussing, and will be left for another story.

Automated build infrastructure

If I remember correctly, there was always build infrastructure available. However, we started to make much more extensive use of it as the number of teams grew. Builds became more comprehensive to cover additional checks and verification, with CI build jobs performing the builds and thereby saving developer time under certain circumstances.

As builds got more advanced and more focus went to making team builds reliable enough to smooth integration into the shared development branch, my scrum master had played around with an installer system. We had the basics set up easily enough. I later integrated the installer as a final step of the release builds, allowing us to (eventually) have a completely automated release build producing the final release artifact: a built, verified, obfuscated and installer-packaged, integrated software application.

Build verification, guards

Another colleague and I would later investigate several possibilities for automating a larger part of the build, carefully select which operations should be part of team builds to provide additional guard checks qualifying work for integration, and set up and fine-tune static analysis of various kinds for more elaborate verification of the code-base. These included dependency analyses and various static-analysis tools for common errors, dubious patterns and style checking. (Note that I wasn’t involved in all of these; plug-ins for Checkstyle and PMD, among others, were already present. We did, however, evaluate and fine-tune their configuration.) For a lot of these matters, many people made small contributions which added up to significant advances on the whole.
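
To give an impression of what such tooling flags, here is a generic illustration with invented names; the rule names in the comments are standard PMD rules, not necessarily the exact configuration we ran.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;

    // Typical patterns reported by PMD/Checkstyle-style static analysis.
    public class RecipeLoader {

        // Reference comparison on strings: works by accident for interned
        // literals, breaks for values read from files or user input.
        // (PMD: UseEqualsToCompareStrings)
        boolean isDefault(String recipeName) {
            return recipeName == "default";   // should be "default".equals(recipeName)
        }

        // Swallowed exception: the failure disappears without a trace.
        // (PMD: EmptyCatchBlock)
        byte[] tryLoad(Path path) {
            try {
                return Files.readAllBytes(path);
            } catch (IOException e) {
                // intentionally empty? the analyser forces you to say so
            }
            return new byte[0];
        }
    }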

Given the nature of CDI, I tested and applied Proguard in the automated build to perform the obfuscation. By its nature, Proguard performs additional checks to determine whether the various modules, classes and interfaces fit together syntactically, for example that referenced APIs actually exist and match. This had the added benefit that it could detect a larger number of dependency mismatches, such as the ones possible with CDI. So I also set up Proguard on team builds, to benefit from it as a verification step before deliveries to the shared development branch. This allowed us to catch more complications early, including a number of the issues that had previously caused the long down-times.
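
To make the kind of mismatch concrete, here is a hypothetical sketch with invented module and type names. Each module compiles cleanly on its own against the dependency versions it was built with; it is Proguard, resolving every class and member reference across the merged inputs, that notices the pieces no longer fit.

    // The shared API as this desktop module was compiled against it:
    interface AlignmentService {
        AlignmentResult align(String recipeName);
    }

    class AlignmentResult { }   // placeholder so the sketch is self-contained

    // Caller inside a desktop module. It compiles cleanly against the
    // interface above. If the integrated build meanwhile ships a newer shared
    // module in which align(String) was replaced by, say, align(Recipe),
    // per-module compilation cannot notice. Proguard, however, reports the
    // dangling reference while processing the merged class files (a warning
    // along the lines of "can't find referenced method"), so the mismatch
    // shows up in the team build rather than as a release that won't start.
    public final class AlignmentAction {

        private final AlignmentService service;

        AlignmentAction(AlignmentService service) {
            this.service = service;
        }

        AlignmentResult run() {
            return service.align("coarse-wafer-alignment");
        }
    }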

In a cooperative effort, we also looked into eradicating certain types of compiler warnings and converting them to errors after manually cleaning up the code-base. This made it possible for me to set a hard lower bar for the (syntactic) quality of the code-base: once a category of warning had been eradicated, it was promoted to an error, allowing us to enforce that certain wholly unnecessary mistakes be corrected. These are sometimes trivial matters, but they may also catch, for example, bad use of generics in combination with type erasure.
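
As a sketch of what this catches, consider the invented example below. By default javac merely warns about the raw types; with, for example, -Xlint:rawtypes,unchecked combined with -Werror, the same code fails the build instead of failing much later at run time. (Which warning categories we actually promoted is beside the point here.)

    import java.util.ArrayList;
    import java.util.List;

    public class LegacyLabels {

        // Raw type: the element type is unchecked, so anything can be added.
        // javac: "unchecked call to add(E) as a member of the raw type List".
        static List readLabels() {
            List labels = new ArrayList();
            labels.add(42);               // an Integer sneaks into a "list of strings"
            return labels;
        }

        public static void main(String[] args) {
            // Unchecked conversion: after erasure nothing is actually verified here.
            List<String> labels = readLabels();

            // Only this line blows up, far away from the real mistake:
            // ClassCastException: class Integer cannot be cast to class String.
            String first = labels.get(0);
            System.out.println(first);
        }
    }

The same flags can be passed through the Maven compiler plug-in, so the bar is enforced in the CI builds rather than depending on individual IDE settings.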

All in all, these checks add up and make it possible to detect many inaccuracies, mistakes, dubious patterns, misleading notation, and so forth, early: ideally before integration into the shared development branch, preventing blockages such as the ones we had in the years before.

Conclusions

Note that for this particular subject there hasn’t been much in the way of attacks. I’m including it because these contributions to the build process helped prevent the 5-day down-times we had before and smoothed the integration process for the increasing number of teams.

Changelog

This article will receive updates, if necessary.


This post is part of the Coordinated harassment series.
Other posts in this series: