Interim: ASML, Proguard
Sat, Feb 22, 2025 ❝Obfuscation done wrong in 7 little disconnected bits.❞
This story is not so much about harassment itself, but it is likely a lead-in to other attacks. Even if just for context, I’d like to share it. At the very least, it illustrates my own attitude towards solving real problems, improving circumstances and addressing ASML’s concerns.
Context and chronology
This story also happened quite early in my first year, in my first team. I’d guess a couple of months in. I started hearing something about the Marketing department being frantic about Java applications being fairly easy to reverse-engineer, and probably some of the business as well. It didn’t seem like anyone else cared much.
There were rumors that several customers likely had a department dedicated to reverse-engineering the code as soon as a new version was released. I don’t know anything for sure, but that was the story I heard from various people in various roles. There were a couple of sprints with presentations; it took them at least 2 sprints, probably more, to get this obfuscation to a “deliverable” quality.
Delivered but wrong
Now, when the Proguard configuration got delivered, it was included as a late-stage build module for Maven. Maven was the build tooling for the software when I was there.
So, my team wants to deliver their work, which is essentially completely independent of this Proguard thing. Except, this Proguard thing is now in, and we have a new module to add, which therefore must be processed by Proguard, and that requires us to specify whether or not to obfuscate it. We cannot complete our delivery, because we cannot fully integrate our work with the main line that is shared among development teams.
We need to figure out how this thing works. I start looking into it, but it soon turns out there is this rather arbitrary split between the various modules. There are 7 configurations, I believe even just numbered 1-7, that are all available and somehow all relevant. It is interesting to note that I had never even heard of Proguard before, but once you understand the concept and what it is trying to achieve, the tool isn’t that hard to figure out.
Note: maybe it was 6 or 8 files. I’m not sure. 7 gives the right impression, even if I’m possibly one off.
Fixing it to make this sustainable
So, I start looking into it and it is immediately clear that 7 configuration files is not how this is supposed to work. I start completely rearranging the contents of the 7 files into one to make it do what it is supposed to, and run into a rather interesting problem: the command to execute proguard for obfuscation fails to run. This is simply due to ASML’s elaborate naming scheme for modules and the 1800+ modules present at the time, so it vastly surpasses the maximum length of the command-line for Windows. (Possibly for Linux, but I never bothered to look it up.)
This isn’t actually a problem. I am amazed that likely 3+ sprints were spent on this, because the Maven Proguard-plugin has a setting that says: “read the modules from a text-file instead of taking them from the command-line arguments” in case the command-line becomes too large. You can set it to true and point to a file that lists all the modules by name.
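The underlying idea, moving a long argument list out of the command line and into a file, can be sketched in a few lines of Java. This is only an illustration under assumptions, not the plugin's implementation: the class ArgumentFileSketch, its methods and the jar names are made up. The 32,767-character figure is the documented maximum command-line length for process creation on Windows, -injars is a standard Proguard option, and Proguard accepts a configuration file on its command line as @file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class ArgumentFileSketch {
    // Documented upper bound for a command line handed to process creation on Windows.
    static final int WINDOWS_MAX_COMMAND_LINE = 32_767;

    // Length of the command line if every jar were passed as its own "-injars" argument.
    static int commandLineLength(String executable, List<String> jars) {
        int length = executable.length();
        for (String jar : jars) {
            length += " -injars ".length() + jar.length();
        }
        return length;
    }

    // One "-injars <jar>" line per module; this is what ends up in the file.
    static List<String> argumentLines(List<String> jars) {
        List<String> lines = new ArrayList<>();
        for (String jar : jars) {
            lines.add("-injars " + jar);
        }
        return lines;
    }

    // Writes the argument lines to a file, so the command line only needs "@<file>".
    static Path writeArgumentFile(Path file, List<String> jars) {
        try {
            return Files.write(file, argumentLines(jars));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical jar names, only to get a realistic volume of input.
    static List<String> sampleJars(int count) {
        List<String> jars = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            jars.add("com.example.some.elaborate.module.name.part" + i + ".jar");
        }
        return jars;
    }

    public static void main(String[] args) throws IOException {
        List<String> jars = sampleJars(1800);
        int length = commandLineLength("java -jar proguard.jar", jars);
        if (length > WINDOWS_MAX_COMMAND_LINE) {
            Path file = writeArgumentFile(Files.createTempFile("proguard-injars", ".pro"), jars);
            System.out.println(length + " characters is too long; run instead: java -jar proguard.jar @" + file);
        }
    }
}
```

With 1800 modules even modest names overflow the limit, while the file-based variant keeps the command line at a fixed, short length no matter how many modules are added.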
So I’ve merged all the configuration files, essentially with the same configuration already present minus the insane amount of duplication and package-boundaries-juggling that was necessary to manually make these things “connect” or “match up” in the obfuscated hell-scape. Essentially, the originally delivered configuration was extremely conservative because they couldn’t really make it work. I produced a text-file listing all packages and divided them into the “obfuscated set of packages”, primarily (almost) everything from ASML, and the “untouched set of packages”, mostly public open-source dependencies.
At this point, I’ve got essentially the minimally necessary changes to make obfuscation work in the proper way, with 1 configuration file and a partitioning of packages to ensure ASML packages get obfuscated. A lot of ASML packages that were previously not obfuscated, because they served as artificial boundaries between the 7 configuration files, could now be obfuscated. Because I fixed the configuration, Proguard could finally do what it was designed for. And my team could deliver this as the actual working configuration together with our team’s delivery.
Note that this will then require some explaining, because at this point any team would need to understand that it needs to update the list of modules. There is a benefit though: Proguard warns in case of unknown modules. A team would be informed of “missing modules”, i.e. modules not yet present in the list. Of course, the explanation for how Proguard works is now a lot easier. Most developers can assume the configuration is acceptable for them and simply list any new modules, and that’s it.
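The “missing modules” situation boils down to a set difference between what is built and what is listed. The sketch below only illustrates that idea; it is not how Proguard detects it, and the class ModuleListCheck and the module names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class ModuleListCheck {
    // Returns the modules that are part of the build but absent from the list, sorted by name.
    static Set<String> missingModules(List<String> builtModules, Set<String> listedModules) {
        Set<String> missing = new TreeSet<>(builtModules);
        missing.removeAll(listedModules);
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical module names; "new-team-module" was added without updating the list.
        List<String> built = Arrays.asList("core-model", "core-ui", "new-team-module");
        Set<String> listed = new HashSet<>(Arrays.asList("core-model", "core-ui"));

        Set<String> missing = missingModules(built, listed);
        if (!missing.isEmpty()) {
            System.out.println("Modules not yet present in the list: " + missing);
        }
    }
}
```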
The original, broken delivery would have been completely unsustainable.
Enhancing protection/obfuscation
It is important to consider that this configuration, as delivered, was still the same extremely conservative set-up. I was not present while it was defined, and although I knew they had had unnecessary obstacles to tackle, I couldn’t just crank up all the protection, because I might end up accidentally breaking or blocking applications. So, I knew this configuration needed to be thoroughly checked and fine-tuned. At this point, I do not have any priority from anyone, so I’m tackling this in the spare time I have available, most often during full builds of the applications.
I investigate what the various Proguard options mean, the defaults, the things I can tweak that have less impact on Java’s dynamic behavior, i.e. reflection and such. Essentially, exploring which opportunities I have available and which opportunities are low- or high-risk, so as to avoid breaking things from other teams. The problem is that Proguard runs over all modules, therefore also all ASML modules, so it is possible to break seemingly arbitrary parts of the software if I crank up the protection carelessly.
Over several weeks, I experiment with a few options. In particular the options that immediately break things when cranked up too high are convenient, because of the hard errors. As I never got any priority on this, it slowly fizzled out as the number of obvious opportunities decreased. I’d like to think that I managed to produce quite a reasonable protective configuration in the end, even if it wasn’t optimal. This was all of my own accord, because business didn’t really care and most likely wasn’t aware. I have no doubt that, as far as they knew, the “delivered obfuscation project protected the application so the risk is averted.” And I know Proguard wouldn’t have been considered “insignificant bullshit”, so I know this time wasn’t wasted.
Reducing chances for down-time
In Manual builds and continuous integration, I described how many development teams had a not insignificant amount of down-time due to the application building but not running. As I was finishing up the first proper configuration for a correctly working Proguard set-up, I realized that this could also be instrumental in reducing the chances for unexpected down-times due to run-time dependency conflicts. (These could actually occur as a result of deliveries from multiple teams that would combine into a conflicting situation.) So, instead of making Proguard run only on release-builds executed only at final release, I now configured it to run for all team-builds, making Proguard a guard-condition before allowing delivery into common development main-line. This would protect us from a number of risks that could cause such dynamic dependency resolution issues.
“Aftercare”
Afterwards I’ve sat down with several teams to explain in a bit more detail how Proguard should work, what it tries to accomplish, i.e. that it builds up this internal model of “obfuscated” and “untouched” modules and that dependencies should always only go either: obfuscated -> obfuscated, obfuscated -> untouched, untouched -> untouched. (So, untouched -> obfuscated is illegal.) That way, Proguard is free to obfuscate the code in whatever way it deems suitable, while always able to ensure a sound, consistent set of internal relations. The concept is trivial if you boil it down to the bare minimum of what it does. I’d also explain that dynamic behavior, i.e. reflection, is a bit more complicated. However, in many cases the use of reflection was in rather harmless locations, such that it wasn’t a big problem. For example, there was some use of reflection in the ADEL-file handling, but then those files are readable to the public eye, so it wasn’t a big risk to reduce the level of protection on those packages.
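The direction rule can be written down as a tiny check. This is a sketch of the mental model I explained, not of Proguard’s internals; the class DependencyDirectionCheck and the module names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DependencyDirectionCheck {
    enum Kind { OBFUSCATED, UNTOUCHED }

    // The only forbidden direction: untouched code depending on obfuscated code.
    // Untouched code keeps referring to the original names, which no longer exist after renaming.
    static boolean isLegal(Kind from, Kind to) {
        return !(from == Kind.UNTOUCHED && to == Kind.OBFUSCATED);
    }

    // Each dependency is a {from, to} pair of module names; returns the illegal ones as "from -> to".
    static List<String> illegalEdges(Map<String, Kind> kinds, List<String[]> dependencies) {
        List<String> illegal = new ArrayList<>();
        for (String[] edge : dependencies) {
            if (!isLegal(kinds.get(edge[0]), kinds.get(edge[1]))) {
                illegal.add(edge[0] + " -> " + edge[1]);
            }
        }
        return illegal;
    }

    public static void main(String[] args) {
        // Hypothetical modules: two in-house ones and one open-source dependency.
        Map<String, Kind> kinds = new HashMap<>();
        kinds.put("inhouse-app", Kind.OBFUSCATED);
        kinds.put("inhouse-core", Kind.OBFUSCATED);
        kinds.put("open-source-lib", Kind.UNTOUCHED);

        List<String[]> dependencies = Arrays.asList(
                new String[] {"inhouse-app", "inhouse-core"},     // obfuscated -> obfuscated
                new String[] {"inhouse-core", "open-source-lib"}, // obfuscated -> untouched
                new String[] {"open-source-lib", "inhouse-core"}  // untouched -> obfuscated
        );

        System.out.println("Illegal dependencies: " + illegalEdges(kinds, dependencies));
    }
}
```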
The configuration file (usually untouched by teams) and the file-list (necessarily updated by teams as per the failing builds when new modules are introduced) were easy enough to grasp when explained and made for an almost trivial obstacle for other development teams.
Epilogue
The Romanian guy who did the original bad delivery with 7 configuration files later moved to the team I describe in group lead team composition fuck-up, and some effort was made to “rub it in my face” that he was going to do something fun while the rest of the team was being a PITA.
Changelog
This article will receive updates, if necessary.
- 2025-02-22 Initial version.