From mboxrd@z Thu Jan 1 00:00:00 1970 From: Michael Tremer To: development@lists.ipfire.org Subject: Re: [PATCH 4/5] ids-functions.pl: Use If-Modified-Since header to reduce file downloads. Date: Wed, 23 Mar 2022 09:34:26 +0000 Message-ID: <2EB13745-23E5-4F5C-9D07-86AA84BC866E@ipfire.org> In-Reply-To: <20220323040452.2609-4-stefan.schantl@ipfire.org> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="===============7848921190903645079==" List-Id: --===============7848921190903645079== Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Hello, I like this mechanism. It is quite powerful since it will save a lot of unnec= essary downloads and CPU cycles. However, I am not sure if all providers will= support this because some use crappy CDNs which do not give anything about c= onforming to any RFC. > On 23 Mar 2022, at 04:04, Stefan Schantl wrot= e: >=20 > When using the "If-Modified-Since" header, the server can be requested > if a modified version of the file can be served. >=20 > In case that is true, the file will be sent and stored by the downloader > function. If the file has not been touched since the last time, the > server will respond with the code "304" (Not modified). >=20 > This tells us, that the current stored file is the latest one (still up-to-= date) > and we safely can skip the download attempt for this provider. >=20 > Signed-off-by: Stefan Schantl > --- > config/cfgroot/ids-functions.pl | 38 ++++++++++++++++++++++++++++++--- > 1 file changed, 35 insertions(+), 3 deletions(-) >=20 > diff --git a/config/cfgroot/ids-functions.pl b/config/cfgroot/ids-functions= .pl > index dfbeb1a7d..d7df41dd1 100644 > --- a/config/cfgroot/ids-functions.pl > +++ b/config/cfgroot/ids-functions.pl > @@ -365,9 +365,25 @@ sub downloadruleset ($) { > my $tmp =3D File::Temp->new( SUFFIX =3D> ".tmp", DIR =3D> "/var/tmp/", UN= LINK =3D> 0 ); > my $tmpfile =3D $tmp->filename(); >=20 > + # Genarate and assign file name and path to store the downloaded rules f= ile. > + my $dl_rulesfile =3D &_get_dl_rulesfile($provider); > + > + # Load perl module to deal with file atributes. > + use File::stat; I don=E2=80=99t like modules in the middle of a script. This makes testing a = new perl update more difficult. If I can load the script, I expect it to have= all dependencies it needs. This does not work when modules are being loaded = later on. I would say that we need to cover those things by a module test the= n. I would prefer reliability over performance (I assume that this your argument= why to load it only when you need it). > + > + # Get the mtime of the rulesfile if it exists. > + my $mtime =3D (stat($dl_rulesfile)->mtime) if (-f $dl_rulesfile); I suggest making this a little bit simpler: Create the request first, then stat the file (you don=E2=80=99t need to check= if it exists first because stat will just tell you). Then add the header if you have a timestamp; if not, then don=E2=80=99t add a= useless header. This is a lot more straight forward :) > + > + # Convert the mtime into gmtime format. > + my $gmtime =3D gmtime($mtime || 0); > + > # Pass the requested url to the downloader. > my $request =3D HTTP::Request->new(GET =3D> $url); >=20 > + # Add the If-Modified-Since header to the request, containing the omited= and converted > + # mtime of the downloaded rules file, if one is present. > + $request->header( 'If-Modified-Since' =3D> "$gmtime" ); Will this survive any redirects? I know that some redirect to AWS S3 and you = will need to make sure that the header is being sent when you follow the redi= rect - assuming S3 supports this which I think they would. > + > my $dl_attempt =3D 1; > my $response; >=20 > @@ -381,6 +397,14 @@ sub downloadruleset ($) { > # Break loop. > last; >=20 > + # Check if the server responds with 304 (Not Modified). > + } elsif ($response->code =3D=3D 304) { > + # Log to syslog. > + &_log_to_syslog("Ruleset is up-to-date, no update required."); > + > + # Nothing to be done, the ruleset is up-to-date. > + return; > + > # Check if we ran out of download re-tries. > } elsif ($dl_attempt eq $max_dl_attempts) { > # Obtain error. > @@ -406,6 +430,10 @@ sub downloadruleset ($) { > # Get the remote size of the downloaded file. > my $remote_filesize =3D $headers->content_length; >=20 > + # Get the timestamp from header, when the file has been modified the > + # last time. > + my $last_modified =3D $headers->last_modified; > + > # Load perl stat module. > use File::stat; You are loading the module again. >=20 > @@ -428,9 +456,6 @@ sub downloadruleset ($) { > return 1; > } >=20 > - # Genarate and assign file name and path to store the downloaded rules f= ile. > - my $dl_rulesfile =3D &_get_dl_rulesfile($provider); > - > # Check if a file name could be obtained. > unless ($dl_rulesfile) { > # Log error message. > @@ -449,6 +474,13 @@ sub downloadruleset ($) { > # Overwrite the may existing rulefile or tarball with the downloaded one. > move("$tmpfile", "$dl_rulesfile"); >=20 > + # Check if the server respond contained a last_modified value. > + if ($last_modified) { > + # Assign the last modified timestamp from server as mtime to the > + # rules file. > + utime(time(), "$last_modified", "$dl_rulesfile"); > + } > + > # Delete temporary file. > unlink("$tmpfile"); >=20 > --=20 > 2.30.2 >=20 -Michael --===============7848921190903645079==--