Backdoors in computing: how to prevent attacks on open source software

In the world of computer security, a recent event highlighted hidden vulnerabilities in the software we use every day. Andres Freund, a developer at Microsoft, discovered an insidious backdoor in the code of liblzma, a key element of the xz package, raising urgent questions about the security of open-source projects. The discovery, which began with anomalies in system performance, uncovered a sophisticated compromise attempt that could have had large-scale repercussions, and it shows how backdoors in computing pose a constant threat.

But let’s start at the beginning, to get a good understanding of how the events unfolded and what this story taught us.

The discovery of the backdoor

On March 29, 2024, Andres Freund (a Microsoft developer) started a thread on oss-security, the well-known open-source security mailing list hosted by Openwall, that begins like this:

Hi, After observing some strange symptoms related to liblzma (part of the xz package) on Debian sid installations in the last weeks, (CPU-intensive ssh accesses, valgrind errors) I found the answer:
The upstream xz repository and xz tarballs have been “backdoored”.
At first I thought it was a debian package compromise, but it turned out to be upstream.
Basically, Andres noticed that the ssh login process was taking about 500 ms longer than usual. He then began to analyze the root cause of the problem and discovered that xz (a popular lossless data compression library widely used in unix-like systems) had been compromised by an attacker who had introduced a backdoor into it.

 

Backdoors in computing: how the attack developed
What is really interesting (and scary at the same time) is how the attacker perpetrated the attack.

The (now disabled) xz code was hosted on GitHub and is completely open source (so anyone can contribute and the maintainer can accept or reject changes). The attacker (who went by the name Jia Tan) had been committing to this repository since February 2022 (you can see his activity on GitHub, as the account is still there). He slowly gained the trust and credibility of the project maintainer until March 9, 2024, when he used a clever technique to hide a backdoor using two test files and several payload obfuscation steps. In addition, part of the backdoor (which is executed in multiple stages) was present only in the release tarballs of the affected versions (5.6.0 and 5.6.1) and was not included in the source control version, which made it harder to find.
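Because part of the backdoor shipped only in the release tarballs, one way to spot this kind of discrepancy is to compare an extracted tarball against a checkout of the matching tag. The following is a minimal sketch, not a complete audit procedure: the function name `tarball_vs_repo` is hypothetical, and it assumes GNU diff (for the `-x` option). Release tarballs often legitimately contain generated files that are absent from the repository, so every reported difference still needs manual review.

```shell
# Hypothetical helper: report files that exist in only one tree or whose
# contents differ.
#   $1 = directory with the extracted release tarball
#   $2 = directory with a checkout of the matching source-control tag
tarball_vs_repo() {
    # -r  recurse into subdirectories
    # -q  print only which files differ, not the differences themselves
    # -x  skip the repository metadata, which is never part of a tarball
    diff -rq -x .git "$1" "$2"
}
```

Calling `tarball_vs_repo ./xz-tarball ./xz-git` (both paths are placeholders) prints one line per discrepancy and exits with a non-zero status when the trees differ.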
The following image summarizes what happened from 2021 (GitHub account creation) until the final packaging of the backdoor inside xz/liblzma (credits go to Thomas Roccia for this fantastic infographic):

Impact

The attacker can essentially execute any arbitrary command (RCE – Remote Code Execution) on affected systems that run an exposed sshd daemon. Only the attacker can execute the commands, because the payload must be signed with an Ed448 key known only to the attacker. More details here.
Fortunately, only a few Linux distributions were affected by default (mainly unstable/testing/development branches and rolling-release distributions), including Fedora (40, 41), Debian (testing, unstable, experimental) and Arch Linux.
We currently know that the payload works under some specific conditions:

  • the system must be running an sshd daemon
  • sshd must be patched to support systemd-notify (the patch links sshd against libsystemd, which in turn depends on xz/liblzma)
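The second condition can be checked locally by looking at the shared libraries that sshd loads. This is only a rough sketch: the function names `links_liblzma` and `sshd_links_liblzma` are hypothetical, and it assumes a Linux system where `ldd` is available and `sshd` is on the `PATH` (on many systems it lives in `/usr/sbin`, which may require running the check as root).

```shell
# Reads `ldd` output on stdin and succeeds if liblzma is among the
# linked libraries.
links_liblzma() {
    grep -q 'liblzma\.so'
}

# Succeeds if the local sshd binary is dynamically linked against liblzma.
# Returns 2 when sshd cannot be found on the PATH.
sshd_links_liblzma() {
    sshd_path=$(command -v sshd) || return 2
    ldd "$sshd_path" | links_liblzma
}
```

If `sshd_links_liblzma` succeeds, sshd loads liblzma and the version check described in this article becomes relevant. If it fails with status 1, this particular attack path does not apply to that binary.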

Other attack scenarios are possible, and reverse engineers are still working to provide more information.
If you want to check whether you are using the backdoored xz library, run the following command:

strings "$(which xz)" | grep '5\.6\.[01]'

A vulnerable system would show something like:

xz (XZ Utils) 5.6.1

and you should immediately downgrade to version 5.4.6.
If the command prints nothing, the installed xz is not one of the affected versions.
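For use in scripts, the same check can be written against the output of `xz --version` instead of scanning the binary with `strings`. The sketch below is illustrative only: both function names are hypothetical, and it assumes the version banner has the form shown above (`xz (XZ Utils) 5.6.1`). It matches the two affected versions exactly, so a version such as 5.6.10 would not be flagged by mistake.

```shell
# Extracts the version number from the first line of `xz --version`
# output read on stdin, e.g. "xz (XZ Utils) 5.6.1" -> "5.6.1".
xz_version_from_banner() {
    sed -n '1s/^xz (XZ Utils) \([0-9][0-9.]*\).*$/\1/p'
}

# Succeeds only for the two releases named in this article as backdoored.
is_backdoored_xz_version() {
    case "$1" in
        5.6.0|5.6.1) return 0 ;;
        *) return 1 ;;
    esac
}
```

A typical invocation would be `is_backdoored_xz_version "$(xz --version | xz_version_from_banner)" && echo "downgrade xz now"`.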

More information here.


Final reflections

This security incident gives us an opportunity to reflect on some important things.

  1. Attacks are becoming increasingly sophisticated. The way the malicious actor managed to gain reputation and credibility was crazy. He acted patiently and used social engineering techniques to push the maintainer into giving him full control of the repository. He also created fake accounts to push for the infected xz-utils package to be included in Debian (here). In addition, he used advanced and rather clever techniques to hide and obfuscate the payload. I personally believe that although software build mechanisms are essential, their complexity may inadvertently create opportunities to exploit previously unknown or less viable attack vectors.
  2. What does it mean to maintain an open-source project? The original maintainer of the project (Lasse Collin, who also created this page to provide updates on the incident) was completely overwhelmed by requests from people using the xz library and complaining about the slow release of new features. In this thread Lasse also admitted that he was suffering from “long-term mental health issues” and that he was thinking about giving Jia Tan (the guy who introduced the backdoor) a bigger role in the xz project. We probably all need to understand what kind of effort is required to maintain an open-source project, especially considering that most maintainers receive no financial compensation even though their projects are used daily by thousands of multimillion-dollar companies.
  3. Can we blindly trust open source? We all know that software has bugs and that bugs introduced by developers’ mistakes can create security problems. This is normal, and for the most part we know how to deal with the problem (e.g., by adopting software composition analysis tools and blocking the import of vulnerable third-party dependencies into our application ecosystem as soon as a vulnerability is discovered). However, this time is different: how can we trust contributors to an open-source project? While many contributors sincerely aim to improve the software and benefit the community, others may have ulterior motives, such as exploiting vulnerabilities for personal (?) gain. Considering that this security incident was identified by accident, how many similar backdoors might still be out there, not yet identified?