From: Michael Tremer <michael.tremer@ipfire.org>
To: development@lists.ipfire.org
Subject: Re: [PATCH 4/5] ids-functions.pl: Use If-Modified-Since header to reduce file downloads.
Date: Wed, 23 Mar 2022 09:34:26 +0000 [thread overview]
Message-ID: <2EB13745-23E5-4F5C-9D07-86AA84BC866E@ipfire.org> (raw)
In-Reply-To: <20220323040452.2609-4-stefan.schantl@ipfire.org>
[-- Attachment #1: Type: text/plain, Size: 4941 bytes --]
Hello,
I like this mechanism. It is quite powerful since it will save a lot of unnecessary downloads and CPU cycles. However, I am not sure if all providers will support this because some use crappy CDNs which do not give anything about conforming to any RFC.
> On 23 Mar 2022, at 04:04, Stefan Schantl <stefan.schantl(a)ipfire.org> wrote:
>
> When using the "If-Modified-Since" header, the server can be requested
> if a modified version of the file can be served.
>
> In case that is true, the file will be sent and stored by the downloader
> function. If the file has not been touched since the last time, the
> server will respond with the code "304" (Not modified).
>
> This tells us, that the current stored file is the latest one (still up-to-date)
> and we safely can skip the download attempt for this provider.
>
> Signed-off-by: Stefan Schantl <stefan.schantl(a)ipfire.org>
> ---
> config/cfgroot/ids-functions.pl | 38 ++++++++++++++++++++++++++++++---
> 1 file changed, 35 insertions(+), 3 deletions(-)
>
> diff --git a/config/cfgroot/ids-functions.pl b/config/cfgroot/ids-functions.pl
> index dfbeb1a7d..d7df41dd1 100644
> --- a/config/cfgroot/ids-functions.pl
> +++ b/config/cfgroot/ids-functions.pl
> @@ -365,9 +365,25 @@ sub downloadruleset ($) {
> my $tmp = File::Temp->new( SUFFIX => ".tmp", DIR => "/var/tmp/", UNLINK => 0 );
> my $tmpfile = $tmp->filename();
>
> + # Genarate and assign file name and path to store the downloaded rules file.
> + my $dl_rulesfile = &_get_dl_rulesfile($provider);
> +
> + # Load perl module to deal with file atributes.
> + use File::stat;
I don’t like modules in the middle of a script. This makes testing a new perl update more difficult. If I can load the script, I expect it to have all dependencies it needs. This does not work when modules are being loaded later on. I would say that we need to cover those things by a module test then.
I would prefer reliability over performance (I assume that this your argument why to load it only when you need it).
> +
> + # Get the mtime of the rulesfile if it exists.
> + my $mtime = (stat($dl_rulesfile)->mtime) if (-f $dl_rulesfile);
I suggest making this a little bit simpler:
Create the request first, then stat the file (you don’t need to check if it exists first because stat will just tell you).
Then add the header if you have a timestamp; if not, then don’t add a useless header. This is a lot more straight forward :)
> +
> + # Convert the mtime into gmtime format.
> + my $gmtime = gmtime($mtime || 0);
> +
> # Pass the requested url to the downloader.
> my $request = HTTP::Request->new(GET => $url);
>
> + # Add the If-Modified-Since header to the request, containing the omited and converted
> + # mtime of the downloaded rules file, if one is present.
> + $request->header( 'If-Modified-Since' => "$gmtime" );
Will this survive any redirects? I know that some redirect to AWS S3 and you will need to make sure that the header is being sent when you follow the redirect - assuming S3 supports this which I think they would.
> +
> my $dl_attempt = 1;
> my $response;
>
> @@ -381,6 +397,14 @@ sub downloadruleset ($) {
> # Break loop.
> last;
>
> + # Check if the server responds with 304 (Not Modified).
> + } elsif ($response->code == 304) {
> + # Log to syslog.
> + &_log_to_syslog("Ruleset is up-to-date, no update required.");
> +
> + # Nothing to be done, the ruleset is up-to-date.
> + return;
> +
> # Check if we ran out of download re-tries.
> } elsif ($dl_attempt eq $max_dl_attempts) {
> # Obtain error.
> @@ -406,6 +430,10 @@ sub downloadruleset ($) {
> # Get the remote size of the downloaded file.
> my $remote_filesize = $headers->content_length;
>
> + # Get the timestamp from header, when the file has been modified the
> + # last time.
> + my $last_modified = $headers->last_modified;
> +
> # Load perl stat module.
> use File::stat;
You are loading the module again.
>
> @@ -428,9 +456,6 @@ sub downloadruleset ($) {
> return 1;
> }
>
> - # Genarate and assign file name and path to store the downloaded rules file.
> - my $dl_rulesfile = &_get_dl_rulesfile($provider);
> -
> # Check if a file name could be obtained.
> unless ($dl_rulesfile) {
> # Log error message.
> @@ -449,6 +474,13 @@ sub downloadruleset ($) {
> # Overwrite the may existing rulefile or tarball with the downloaded one.
> move("$tmpfile", "$dl_rulesfile");
>
> + # Check if the server respond contained a last_modified value.
> + if ($last_modified) {
> + # Assign the last modified timestamp from server as mtime to the
> + # rules file.
> + utime(time(), "$last_modified", "$dl_rulesfile");
> + }
> +
> # Delete temporary file.
> unlink("$tmpfile");
>
> --
> 2.30.2
>
-Michael
next prev parent reply other threads:[~2022-03-23 9:34 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-23 4:04 [PATCH 1/5] ids-functions.pl: Drop downloader code for sourcefire based ruleset Stefan Schantl
2022-03-23 4:04 ` [PATCH 2/5] ids-functions.pl: Allow "5" download attempts for each provider before fail Stefan Schantl
2022-03-23 9:28 ` Michael Tremer
2022-03-24 18:23 ` Stefan Schantl
2022-03-28 15:16 ` Michael Tremer
2022-03-23 4:04 ` [PATCH 3/5] ids-functions.pl: Remove temporary file, if the download failed Stefan Schantl
2022-03-23 4:04 ` [PATCH 4/5] ids-functions.pl: Use If-Modified-Since header to reduce file downloads Stefan Schantl
2022-03-23 9:34 ` Michael Tremer [this message]
2022-03-24 18:50 ` Stefan Schantl
2022-03-28 15:15 ` Michael Tremer
2022-03-23 4:04 ` [PATCH 5/5] ids-functions.pl: Do not longer call any log message as "ERROR" Stefan Schantl
2022-03-23 9:37 ` [PATCH 1/5] ids-functions.pl: Drop downloader code for sourcefire based ruleset Michael Tremer
-- strict thread matches above, loose matches on Subject: below --
2022-03-22 19:40 Stefan Schantl
2022-03-22 19:40 ` [PATCH 4/5] ids-functions.pl: Use If-Modified-Since header to reduce file downloads Stefan Schantl
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2EB13745-23E5-4F5C-9D07-86AA84BC866E@ipfire.org \
--to=michael.tremer@ipfire.org \
--cc=development@lists.ipfire.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox