public inbox for development@lists.ipfire.org
 help / color / mirror / Atom feed
From: Michael Tremer <michael.tremer@ipfire.org>
To: development@lists.ipfire.org
Subject: Re: [PATCH 4/5] ids-functions.pl: Use If-Modified-Since header to reduce file downloads.
Date: Wed, 23 Mar 2022 09:34:26 +0000	[thread overview]
Message-ID: <2EB13745-23E5-4F5C-9D07-86AA84BC866E@ipfire.org> (raw)
In-Reply-To: <20220323040452.2609-4-stefan.schantl@ipfire.org>

[-- Attachment #1: Type: text/plain, Size: 4941 bytes --]

Hello,

I like this mechanism. It is quite powerful since it will save a lot of unnecessary downloads and CPU cycles. However, I am not sure if all providers will support this because some use crappy CDNs which do not give anything about conforming to any RFC.

> On 23 Mar 2022, at 04:04, Stefan Schantl <stefan.schantl(a)ipfire.org> wrote:
> 
> When using the "If-Modified-Since" header, the server can be requested
> if a modified version of the file can be served.
> 
> In case that is true, the file will be sent and stored by the downloader
> function. If the file has not been touched since the last time, the
> server will respond with the code "304" (Not modified).
> 
> This tells us, that the current stored file is the latest one (still up-to-date)
> and we safely can skip the download attempt for this provider.
> 
> Signed-off-by: Stefan Schantl <stefan.schantl(a)ipfire.org>
> ---
> config/cfgroot/ids-functions.pl | 38 ++++++++++++++++++++++++++++++---
> 1 file changed, 35 insertions(+), 3 deletions(-)
> 
> diff --git a/config/cfgroot/ids-functions.pl b/config/cfgroot/ids-functions.pl
> index dfbeb1a7d..d7df41dd1 100644
> --- a/config/cfgroot/ids-functions.pl
> +++ b/config/cfgroot/ids-functions.pl
> @@ -365,9 +365,25 @@ sub downloadruleset ($) {
> 		my $tmp = File::Temp->new( SUFFIX => ".tmp", DIR => "/var/tmp/", UNLINK => 0 );
> 		my $tmpfile = $tmp->filename();
> 
> +		# Genarate and assign file name and path to store the downloaded rules file.
> +		my $dl_rulesfile = &_get_dl_rulesfile($provider);
> +
> +		# Load perl module to deal with file atributes.
> +		use File::stat;

I don’t like modules in the middle of a script. This makes testing a new perl update more difficult. If I can load the script, I expect it to have all dependencies it needs. This does not work when modules are being loaded later on. I would say that we need to cover those things by a module test then.

I would prefer reliability over performance (I assume that this your argument why to load it only when you need it).

> +
> +		# Get the mtime of the rulesfile if it exists.
> +		my $mtime = (stat($dl_rulesfile)->mtime) if (-f $dl_rulesfile);

I suggest making this a little bit simpler:

Create the request first, then stat the file (you don’t need to check if it exists first because stat will just tell you).

Then add the header if you have a timestamp; if not, then don’t add a useless header. This is a lot more straight forward :)

> +
> +		# Convert the mtime into gmtime format.
> +		my $gmtime = gmtime($mtime || 0);
> +
> 		# Pass the requested url to the downloader.
> 		my $request = HTTP::Request->new(GET => $url);
> 
> +		# Add the If-Modified-Since header to the request, containing the omited and converted
> +		# mtime of the downloaded rules file, if one is present.
> +		$request->header( 'If-Modified-Since' => "$gmtime" );

Will this survive any redirects? I know that some redirect to AWS S3 and you will need to make sure that the header is being sent when you follow the redirect - assuming S3 supports this which I think they would.

> +
> 		my $dl_attempt = 1;
> 		my $response;
> 
> @@ -381,6 +397,14 @@ sub downloadruleset ($) {
> 				# Break loop.
> 				last;
> 
> +			# Check if the server responds with 304 (Not Modified).
> +			} elsif ($response->code == 304) {
> +				# Log to syslog.
> +				&_log_to_syslog("Ruleset is up-to-date, no update required.");
> +
> +				# Nothing to be done, the ruleset is up-to-date.
> +				return;
> +
> 			# Check if we ran out of download re-tries.
> 			} elsif ($dl_attempt eq $max_dl_attempts) {
> 				# Obtain error.
> @@ -406,6 +430,10 @@ sub downloadruleset ($) {
> 		# Get the remote size of the downloaded file.
> 		my $remote_filesize = $headers->content_length;
> 
> +		# Get the timestamp from header, when the file has been modified the
> +		# last time.
> +		my $last_modified = $headers->last_modified;
> +
> 		# Load perl stat module.
> 		use File::stat;

You are loading the module again.

> 
> @@ -428,9 +456,6 @@ sub downloadruleset ($) {
> 			return 1;
> 		}
> 
> -		# Genarate and assign file name and path to store the downloaded rules file.
> -		my $dl_rulesfile = &_get_dl_rulesfile($provider);
> -
> 		# Check if a file name could be obtained.
> 		unless ($dl_rulesfile) {
> 			# Log error message.
> @@ -449,6 +474,13 @@ sub downloadruleset ($) {
> 		# Overwrite the may existing rulefile or tarball with the downloaded one.
> 		move("$tmpfile", "$dl_rulesfile");
> 
> +		# Check if the server respond contained a last_modified value.
> +		if ($last_modified) {
> +			# Assign the last modified timestamp from server as mtime to the
> +			# rules file.
> +			utime(time(), "$last_modified", "$dl_rulesfile");
> +		}
> +
> 		# Delete temporary file.
> 		unlink("$tmpfile");
> 
> -- 
> 2.30.2
> 

-Michael

  reply	other threads:[~2022-03-23  9:34 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-23  4:04 [PATCH 1/5] ids-functions.pl: Drop downloader code for sourcefire based ruleset Stefan Schantl
2022-03-23  4:04 ` [PATCH 2/5] ids-functions.pl: Allow "5" download attempts for each provider before fail Stefan Schantl
2022-03-23  9:28   ` Michael Tremer
2022-03-24 18:23     ` Stefan Schantl
2022-03-28 15:16       ` Michael Tremer
2022-03-23  4:04 ` [PATCH 3/5] ids-functions.pl: Remove temporary file, if the download failed Stefan Schantl
2022-03-23  4:04 ` [PATCH 4/5] ids-functions.pl: Use If-Modified-Since header to reduce file downloads Stefan Schantl
2022-03-23  9:34   ` Michael Tremer [this message]
2022-03-24 18:50     ` Stefan Schantl
2022-03-28 15:15       ` Michael Tremer
2022-03-23  4:04 ` [PATCH 5/5] ids-functions.pl: Do not longer call any log message as "ERROR" Stefan Schantl
2022-03-23  9:37 ` [PATCH 1/5] ids-functions.pl: Drop downloader code for sourcefire based ruleset Michael Tremer
  -- strict thread matches above, loose matches on Subject: below --
2022-03-22 19:40 Stefan Schantl
2022-03-22 19:40 ` [PATCH 4/5] ids-functions.pl: Use If-Modified-Since header to reduce file downloads Stefan Schantl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2EB13745-23E5-4F5C-9D07-86AA84BC866E@ipfire.org \
    --to=michael.tremer@ipfire.org \
    --cc=development@lists.ipfire.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox