Hi,
Thank you very much for looking into this.
This seems to be a tricky one.
On 5 Dec 2019, at 13:07, Justin Luth jluth@mail.com wrote:
Recently some Microsoft patches are alternating between using \ and / as directory separators, leaving files behind in /var/updatecache/download that can only be removed in a terminal.
What was happening was that the lock file was created using the URL name containing %5c while wget was actually writing to a filename with a real backslash. The zero-sized lockfile was moved into the uuid folder, and the actual .cab file was orphaned in the download folder, indicated in the GUI as a stranded file that couldn't be deleted.
An alternate way to fix the problem could be to force wget to use --restrict-file-names=windows - in which case \ | : ? " * < > would all be saved as escaped codes. That would seem to better match the URL mangling that apparently is happening. Cons: -removing sourceurl mangling means re-downloading existing caches. -less human-readable URLs / filesnames -conceptual alternative: I didn't prove/test this theory since it means re-downloading all oddly-named files.
P.S. I didn't change perl's utime command to use updatefile_shellEscaped since the filepath is in quotes - the space didn't actually need to be escaped.
config/updxlrator/download | 20 ++++++++++++++++---- config/updxlrator/updxlrator | 2 ++ 2 files changed, 18 insertions(+), 4 deletions(-)
diff --git a/config/updxlrator/download b/config/updxlrator/download index afa6e6cb9..df4a58725 100644 --- a/config/updxlrator/download +++ b/config/updxlrator/download @@ -26,6 +26,7 @@ my @http_header=(); my $remote_size=0; my $remote_mtime=0; my $updatefile=''; +my $updatefile_shellEscaped=''; #double escaped for re-processing by shell/system calls my $unique=0; my $mirror=1;
@@ -38,11 +39,24 @@ my $restartdl = defined($ARGV[3]) ? $ARGV[3] : 0;
umask(0002);
+# Hopefully correct documentation: based on debugging a "" problem... +# WGET doesn't use the filename from the URL, but gets the filename from the server info. +# -mangles that name according to --restrict-file-names. +# -for unix, only mangles "/", so +, ~, \ etc. are not escaped/mangled. +# -(which suggests the entry for %2f is incorrect BTW) +# The URL passed to this module is mangled (correct? - perhaps by http://search.cpan.org/dist/URI/lib/URI/Escape.pm) +# -but $updatefile needs to match wget's download name, regardless of the URL's encoding. +# -achievable either by mangling the URL and/or the filename. +# -*make sure "updxlrator" has the same adjustments* $sourceurl =~ s@%2b@+@ig; $sourceurl =~ s@%2f@/@ig; $sourceurl =~ s@%7e@~@ig; $updatefile = substr($sourceurl,rindex($sourceurl,"/")+1); +$updatefile_shellEscaped = $updatefile; +$updatefile_shellEscaped =~ s@%20@\ @ig; +$updatefile_shellEscaped =~ s@%5c@\\@ig; $updatefile =~ s@%20@ @ig; +$updatefile =~ s@%5c@\@ig; $vendorid =~ tr/A-Z/a-z/;
This is really messy code. Not yours. What has been here before.
It is quite hard to figure out what was actually supposed to happen here. The replacements look more or less like fixes instead of thinking ahead and decoding any special characters.
unless (-d "$repository/download/$vendorid") @@ -58,7 +72,7 @@ if($restartdl == 0)
# hinder multiple downloads from starting simultaneously. Create
empty "lock" file. # TODO: Another thread may sneak in between these two commands - so not fool-proof, but good enough?
- system("touch $repository/download/$vendorid/$updatefile");
- system("touch
$repository/download/$vendorid/$updatefile_shellEscaped");
However, this is all needed because we are calling shell commands here to move a file.
I suppose that that is quite dangerous because any forged filenames could be used to run any shell commands.
If we would use perl’s move() function, then we could avoid escaping the characters here. They should all be allowed to be used in the file system:
https://perldoc.perl.org/File/Copy.html#move
Would you like to try that option?
} else @@ -169,11 +183,9 @@ if ($_ == 0) }
&writelog("Moving file to the cache directory: $vendorid/$uuid");
- $updatefile =~ s@ @\ @ig;
- system("mv $repository/download/$vendorid/$updatefile
$repository/$vendorid/$uuid");
- system("mv $repository/download/$vendorid/$updatefile_shellEscaped
$repository/$vendorid/$uuid");
Same here.
# Workaround for IPCop's mv bug: utime time,$remote_mtime,"$repository/$vendorid/$uuid/$updatefile";
- $updatefile =~ s@\ @ @ig;
&UPDXLT::setcachestatus("$repository/$vendorid/$uuid/source.url",$sourceurl); &UPDXLT::setcachestatus("$repository/$vendorid/$uuid/status",$UPDXLT::sfOk); diff --git a/config/updxlrator/updxlrator b/config/updxlrator/updxlrator index cdc7eeb50..1febae51e 100644 --- a/config/updxlrator/updxlrator +++ b/config/updxlrator/updxlrator @@ -338,11 +338,13 @@ sub check_cache my $sourceurl=$_[0]; my $cfmirror=$_[4];
- # Any changes made here must be mirrored in "download" $sourceurl =~ s@%2b@+@ig; $sourceurl =~ s@%2f@/@ig; $sourceurl =~ s@%7e@~@ig; $updfile = substr($sourceurl,rindex($sourceurl,"/")+1); $updfile =~ s@%20@ @ig;
- $updfile =~ s@%5c@\@ig;
I think we will still need this one then, don’t we?
if ($cfmirror) {
— 2.17.1
-Michael