Reviewed-by: Peter Müller > - Malicious filenames can make xzgrep to write to arbitrary files > or (with a GNU sed extension) lead to arbitrary code execution. > - xzgrep from XZ Utils versions up to and including 5.2.5 are > affected. 5.3.1alpha and 5.3.2alpha are affected as well. > - This bug was inherited from gzip's zgrep. gzip 1.12 includes > a fix for zgrep. > - CU167 has gzip-1.12 with the fix already merged. > > Signed-off-by: Adolf Belka > --- > lfs/xz | 1 + > src/patches/xzgrep-ZDI-CAN-16587.patch | 94 ++++++++++++++++++++++++++ > 2 files changed, 95 insertions(+) > create mode 100644 src/patches/xzgrep-ZDI-CAN-16587.patch > > diff --git a/lfs/xz b/lfs/xz > index 586fbc90f..9345df954 100644 > --- a/lfs/xz > +++ b/lfs/xz > @@ -75,6 +75,7 @@ $(subst %,%_BLAKE2,$(objects)) : > $(TARGET) : $(patsubst %,$(DIR_DL)/%,$(objects)) > @$(PREBUILD) > @rm -rf $(DIR_APP) && cd $(DIR_SRC) && tar axf $(DIR_DL)/$(DL_FILE) > + cd $(DIR_APP) && patch -p1 -i $(DIR_SRC)/src/patches/xzgrep-ZDI-CAN-16587.patch > cd $(DIR_APP) && ./configure --prefix=$(PREFIX) > cd $(DIR_APP) && make $(MAKETUNING) > cd $(DIR_APP) && make install > diff --git a/src/patches/xzgrep-ZDI-CAN-16587.patch b/src/patches/xzgrep-ZDI-CAN-16587.patch > new file mode 100644 > index 000000000..406ded590 > --- /dev/null > +++ b/src/patches/xzgrep-ZDI-CAN-16587.patch > @@ -0,0 +1,94 @@ > +From 69d1b3fc29677af8ade8dc15dba83f0589cb63d6 Mon Sep 17 00:00:00 2001 > +From: Lasse Collin > +Date: Tue, 29 Mar 2022 19:19:12 +0300 > +Subject: [PATCH] xzgrep: Fix escaping of malicious filenames (ZDI-CAN-16587). > + > +Malicious filenames can make xzgrep to write to arbitrary files > +or (with a GNU sed extension) lead to arbitrary code execution. > + > +xzgrep from XZ Utils versions up to and including 5.2.5 are > +affected. 5.3.1alpha and 5.3.2alpha are affected as well. > +This patch works for all of them. > + > +This bug was inherited from gzip's zgrep. gzip 1.12 includes > +a fix for zgrep. > + > +The issue with the old sed script is that with multiple newlines, > +the N-command will read the second line of input, then the > +s-commands will be skipped because it's not the end of the > +file yet, then a new sed cycle starts and the pattern space > +is printed and emptied. So only the last line or two get escaped. > + > +One way to fix this would be to read all lines into the pattern > +space first. However, the included fix is even simpler: All lines > +except the last line get a backslash appended at the end. To ensure > +that shell command substitution doesn't eat a possible trailing > +newline, a colon is appended to the filename before escaping. > +The colon is later used to separate the filename from the grep > +output so it is fine to add it here instead of a few lines later. > + > +The old code also wasn't POSIX compliant as it used \n in the > +replacement section of the s-command. Using \ is the > +POSIX compatible method. > + > +LC_ALL=C was added to the two critical sed commands. POSIX sed > +manual recommends it when using sed to manipulate pathnames > +because in other locales invalid multibyte sequences might > +cause issues with some sed implementations. In case of GNU sed, > +these particular sed scripts wouldn't have such problems but some > +other scripts could have, see: > + > + info '(sed)Locale Considerations' > + > +This vulnerability was discovered by: > +cleemy desu wayo working with Trend Micro Zero Day Initiative > + > +Thanks to Jim Meyering and Paul Eggert discussing the different > +ways to fix this and for coordinating the patch release schedule > +with gzip. > +--- > + src/scripts/xzgrep.in | 20 ++++++++++++-------- > + 1 file changed, 12 insertions(+), 8 deletions(-) > + > +diff --git a/src/scripts/xzgrep.in b/src/scripts/xzgrep.in > +index b180936..e5186ba 100644 > +--- a/src/scripts/xzgrep.in > ++++ b/src/scripts/xzgrep.in > +@@ -180,22 +180,26 @@ for i; do > + { test $# -eq 1 || test $no_filename -eq 1; }; then > + eval "$grep" > + else > ++ # Append a colon so that the last character will never be a newline > ++ # which would otherwise get lost in shell command substitution. > ++ i="$i:" > ++ > ++ # Escape & \ | and newlines only if such characters are present > ++ # (speed optimization). > + case $i in > + (*' > + '* | *'&'* | *'\'* | *'|'*) > +- i=$(printf '%s\n' "$i" | > +- sed ' > +- $!N > +- $s/[&\|]/\\&/g > +- $s/\n/\\n/g > +- ');; > ++ i=$(printf '%s\n' "$i" | LC_ALL=C sed 's/[&\|]/\\&/g; $!s/$/\\/');; > + esac > +- sed_script="s|^|$i:|" > ++ > ++ # $i already ends with a colon so don't add it here. > ++ sed_script="s|^|$i|" > + > + # Fail if grep or sed fails. > + r=$( > + exec 4>&1 > +- (eval "$grep" 4>&-; echo $? >&4) 3>&- | sed "$sed_script" >&3 4>&- > ++ (eval "$grep" 4>&-; echo $? >&4) 3>&- | > ++ LC_ALL=C sed "$sed_script" >&3 4>&- > + ) || r=2 > + exit $r > + fi >&3 5>&- > +-- > +2.35.1 > +