From: Justin Luth <jluth@mail.com>
To: development@lists.ipfire.org
Subject: [PATCH] Fix bug 12252: updxlrator - handle \ in filename
Date: Thu, 05 Dec 2019 16:07:06 +0300 [thread overview]
Message-ID: <9fdfa42e-cb04-be00-3605-1284d294f5c5@mail.com> (raw)
In-Reply-To: <a345f1bf-67d7-9ec6-bba2-ac84ea02a272@mail.com>
[-- Attachment #1: Type: text/plain, Size: 4654 bytes --]
Recently some Microsoft patches are alternating between using
\ and / as directory separators, leaving files behind in
/var/updatecache/download that can only be removed in a terminal.
What was happening was that the lock file was created using the
URL name containing %5c while wget was actually writing to a filename
with a real backslash. The zero-sized lockfile was moved into the uuid
folder, and the actual .cab file was orphaned in the download folder,
indicated in the GUI as a stranded file that couldn't be deleted.
An alternate way to fix the problem could be to force wget to
use --restrict-file-names=windows - in which case \ | : ? " * < >
would all be saved as escaped codes. That would seem to better
match the URL mangling that apparently is happening.
Cons:
-removing sourceurl mangling means re-downloading existing caches.
-less human-readable URLs / filesnames
-conceptual alternative: I didn't prove/test this theory since it
means re-downloading all oddly-named files.
P.S. I didn't change perl's utime command to use updatefile_shellEscaped
since the filepath is in quotes - the space didn't actually need to be
escaped.
---
config/updxlrator/download | 20 ++++++++++++++++----
config/updxlrator/updxlrator | 2 ++
2 files changed, 18 insertions(+), 4 deletions(-)
diff --git a/config/updxlrator/download b/config/updxlrator/download
index afa6e6cb9..df4a58725 100644
--- a/config/updxlrator/download
+++ b/config/updxlrator/download
@@ -26,6 +26,7 @@ my @http_header=();
my $remote_size=0;
my $remote_mtime=0;
my $updatefile='';
+my $updatefile_shellEscaped=''; #double escaped for re-processing by
shell/system calls
my $unique=0;
my $mirror=1;
@@ -38,11 +39,24 @@ my $restartdl = defined($ARGV[3]) ? $ARGV[3] : 0;
umask(0002);
+# Hopefully correct documentation: based on debugging a "\" problem...
+# WGET doesn't use the filename from the URL, but gets the filename
from the server info.
+# -mangles that name according to --restrict-file-names.
+# -for unix, only mangles "/", so +, ~, \ etc. are not escaped/mangled.
+# -(which suggests the entry for %2f is incorrect BTW)
+# The URL passed to this module is mangled (correct? - perhaps by
http://search.cpan.org/dist/URI/lib/URI/Escape.pm)
+# -but $updatefile needs to match wget's download name, regardless of
the URL's encoding.
+# -achievable either by mangling the URL and/or the filename.
+# -*make sure "updxlrator" has the same adjustments*
$sourceurl =~ s@\%2b(a)+@ig;
$sourceurl =~ s@\%2f@/@ig;
$sourceurl =~ s@\%7e@~@ig;
$updatefile = substr($sourceurl,rindex($sourceurl,"/")+1);
+$updatefile_shellEscaped = $updatefile;
+$updatefile_shellEscaped =~ s@\%20@\\ @ig;
+$updatefile_shellEscaped =~ s@\%5c@\\\\@ig;
$updatefile =~ s@\%20@ @ig;
+$updatefile =~ s@\%5c@\\@ig;
$vendorid =~ tr/A-Z/a-z/;
unless (-d "$repository/download/$vendorid")
@@ -58,7 +72,7 @@ if($restartdl == 0)
# hinder multiple downloads from starting simultaneously. Create
empty "lock" file.
# TODO: Another thread may sneak in between these two commands -
so not fool-proof, but good enough?
- system("touch $repository/download/$vendorid/$updatefile");
+ system("touch
$repository/download/$vendorid/$updatefile_shellEscaped");
}
else
@@ -169,11 +183,9 @@ if ($_ == 0)
}
&writelog("Moving file to the cache directory: $vendorid/$uuid");
- $updatefile =~ s@ @\\ @ig;
- system("mv $repository/download/$vendorid/$updatefile
$repository/$vendorid/$uuid");
+ system("mv $repository/download/$vendorid/$updatefile_shellEscaped
$repository/$vendorid/$uuid");
# Workaround for IPCop's mv bug:
utime time,$remote_mtime,"$repository/$vendorid/$uuid/$updatefile";
- $updatefile =~ s@\\ @ @ig;
&UPDXLT::setcachestatus("$repository/$vendorid/$uuid/source.url",$sourceurl);
&UPDXLT::setcachestatus("$repository/$vendorid/$uuid/status",$UPDXLT::sfOk);
diff --git a/config/updxlrator/updxlrator b/config/updxlrator/updxlrator
index cdc7eeb50..1febae51e 100644
--- a/config/updxlrator/updxlrator
+++ b/config/updxlrator/updxlrator
@@ -338,11 +338,13 @@ sub check_cache
my $sourceurl=$_[0];
my $cfmirror=$_[4];
+ # Any changes made here must be mirrored in "download"
$sourceurl =~ s@\%2b(a)+@ig;
$sourceurl =~ s@\%2f@/@ig;
$sourceurl =~ s@\%7e@~@ig;
$updfile = substr($sourceurl,rindex($sourceurl,"/")+1);
$updfile =~ s@\%20@ @ig;
+ $updfile =~ s@\%5c@\\@ig;
if ($cfmirror)
{
--
2.17.1
next parent reply other threads:[~2019-12-05 13:07 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <a345f1bf-67d7-9ec6-bba2-ac84ea02a272@mail.com>
2019-12-05 13:07 ` Justin Luth [this message]
2019-12-06 6:55 ` Michael Tremer
2019-12-09 17:57 ` Justin Luth
2019-12-13 22:45 ` Michael Tremer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9fdfa42e-cb04-be00-3605-1284d294f5c5@mail.com \
--to=jluth@mail.com \
--cc=development@lists.ipfire.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox