With Microsoft's new style of downloading updates, where portions of a patch are requested multiple times per second, it has become extremely common for downloads to reach > 100%. Due to an early unlinking of the "lock" file, there is a big window of opportunity (between the unlink and wget actually saving some data) for multiple download/wget threads to start, adding to the same file. So not only is bandwidth wasted by duplicate downloads running simultaneously, but the resulting file is corrupt anyway.
The problem is noticed more often by low bandwidth users (who need the benefits of updxlrator the most) because then wget's latency is even longer, creating a very wide window of opportunity.
Ultimately, this needs something like "flock", where the lock is tested and set in one atomic operation. But for now, settle for the current test / create lock solution, and just stop unnecessarily releasing the lock.
Since the file already exists as a lock when wget starts, wget now must ALWAYS run with --continue, which works fine on a zero-sized file.
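For reference, the atomic test-and-set alluded to above can be sketched as follows. This is an illustration only, not updxlrator code (it is in Python rather than Perl, and the lock path is hypothetical): opening the lock file with O_CREAT|O_EXCL makes the "test" and "create" a single operation, closing the race window described above.

```python
import os
import tempfile

def try_acquire(path):
    """Atomically create a lock file; return False if another thread beat us."""
    try:
        # O_CREAT | O_EXCL fails with EEXIST if the file already exists,
        # so testing and creating the lock happen in one atomic step.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return True
    except FileExistsError:
        return False

# Hypothetical lock path, stands in for $repository/download/$vendorid/$updatefile
lock_path = os.path.join(tempfile.mkdtemp(), "update.exe")
first = try_acquire(lock_path)    # first downloader wins the lock
second = try_acquire(lock_path)   # a concurrent downloader loses and can exit
```

With this pattern, the window between testing for the lock and creating it disappears entirely, whereas the touch-based lock in the patch below only narrows it.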
Signed-off-by: Justin Luth <jluth@mail.com>
---
 config/updxlrator/download | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/config/updxlrator/download b/config/updxlrator/download
index dbc722c23..afa6e6cb9 100644
--- a/config/updxlrator/download
+++ b/config/updxlrator/download
@@ -30,7 +30,6 @@
 my $unique=0;
 my $mirror=1;
 my %dlinfo=();
-my $wgetContinueFlag="";

 my $vendorid = $ARGV[0]; if (!defined($vendorid) || $vendorid eq '') { exit; }
 my $sourceurl = $ARGV[1]; if (!defined($sourceurl) || $sourceurl eq '') { exit; }
@@ -57,16 +56,15 @@
 if($restartdl == 0)
 {
     # this is a new download
     exit if (-e "$repository/download/$vendorid/$updatefile");

-    # dotzball: Why is this necessary?
+    # hinder multiple downloads from starting simultaneously. Create empty "lock" file.
+    # TODO: Another thread may sneak in between these two commands - so not fool-proof, but good enough?
     system("touch $repository/download/$vendorid/$updatefile");
-    $wgetContinueFlag = "-nc";
 }
 else
 {
     # this is a restart of a previous (unfinished) download
     # -> continue download
-    $wgetContinueFlag = "-c";
     &writelog("Continue download: $updatefile");
 }
@@ -133,7 +131,9 @@
 unless($restartdl)
 {
     # this is a new download
     # -> download from scratch
-    unlink "$repository/download/$vendorid/$updatefile";
+
+    # already exited earlier if the file existed, and afterwards created this empty "lock",
+    # so if not empty now, another thread is already downloading it.
+    exit if ( -s "$repository/download/$vendorid/$updatefile" );
     unlink "$repository/download/$vendorid/$updatefile.info";
 }
@@ -147,7 +147,7 @@
 $dlinfo{'REMOTESIZE'} = $remote_size;
 $dlinfo{'STATUS'} = "1";
 &UPDXLT::writehash("$repository/download/$vendorid/$updatefile.info", %dlinfo);

-my $cmd = "$UPDXLT::wget $login $dlrate --user-agent=\"$UPDXLT::useragent\" -q -P $repository/download/$vendorid $wgetContinueFlag $sourceurl";
+my $cmd = "$UPDXLT::wget $login $dlrate --user-agent=\"$UPDXLT::useragent\" -q -P $repository/download/$vendorid --continue $sourceurl";
 $_ = system("$cmd");
 $ENV{'http_proxy'} = '';
Hello,
good patch and okay with me.
(Has the same line wrapping issue as the previous one.)
Best,
-Michael
On Sat, 2017-12-30 at 10:44 +0300, Justin Luth wrote: