From: Michael Tremer <michael.tremer@ipfire.org>
To: location@lists.ipfire.org
Subject: Re: [PATCH v2] location-importer.in: Conduct sanity checks per DROP list
Date: Tue, 27 Sep 2022 10:17:11 +0100
Message-ID: <71AEB78A-D75A-4306-BCB1-9B8E4F56B063@ipfire.org>
In-Reply-To: <a4378944-5729-231c-2428-529855ea2479@ipfire.org>
Hello,
This looks a lot more Pythonic and okay to me.
I will merge this shortly.
-Michael
> On 26 Sep 2022, at 19:26, Peter Müller <peter.mueller(a)ipfire.org> wrote:
>
> Previously, the lack of distinction between different DROP lists caused
> only the last one to be persisted. The second version of this patch
> incorporates suggestions from Michael on the first version.
>
> Tested-by: Peter Müller <peter.mueller(a)ipfire.org>
> Signed-off-by: Peter Müller <peter.mueller(a)ipfire.org>
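
To illustrate the problem described in the commit message, here is a minimal sketch (not the importer's code). It assumes an in-memory SQLite table as a stand-in for the real network_overrides table, and two hypothetical miniature lists made of documentation prefixes; refresh() and imported_networks() are made-up helper names. With one shared source label, each pass deletes what the previous list inserted; with a per-list label, each pass only clears its own rows.

```python
import sqlite3

# Hypothetical stand-in data: two tiny "lists" instead of the real downloads.
LISTS = [
    ("SPAMHAUS-DROP", ["192.0.2.0/24", "198.51.100.0/24"]),
    ("SPAMHAUS-EDROP", ["203.0.113.0/24"]),
]

def refresh(db, lists, per_list_source):
    """Re-import every list, deleting old rows before each import."""
    for name, networks in lists:
        # Old behaviour: one shared label. New behaviour: the list's own name.
        source = name if per_list_source else "Spamhaus DROP lists"
        with db:  # one transaction per list
            db.execute("DELETE FROM network_overrides WHERE source = ?", (source,))
            db.executemany(
                "INSERT INTO network_overrides(network, source, is_drop) VALUES (?, ?, 1)",
                [(network, source) for network in networks],
            )

def imported_networks(per_list_source):
    """Run a full refresh against a fresh database and return what survived."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE network_overrides(network TEXT PRIMARY KEY, source TEXT, is_drop INTEGER)"
    )
    refresh(db, LISTS, per_list_source)
    rows = db.execute("SELECT network FROM network_overrides ORDER BY network").fetchall()
    db.close()
    return [row[0] for row in rows]

shared = imported_networks(per_list_source=False)
separate = imported_networks(per_list_source=True)
```

In this sketch, shared ends up holding only 203.0.113.0/24 (the last list), while separate keeps all three networks.
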
> ---
> src/scripts/location-importer.in | 74 +++++++++++++++++++-------------
> 1 file changed, 44 insertions(+), 30 deletions(-)
>
> diff --git a/src/scripts/location-importer.in b/src/scripts/location-importer.in
> index 8d47497..d405eb2 100644
> --- a/src/scripts/location-importer.in
> +++ b/src/scripts/location-importer.in
> @@ -1427,37 +1427,37 @@ class CLI(object):
>  	def _update_overrides_for_spamhaus_drop(self):
>  		downloader = location.importer.Downloader()
>
> -		ip_urls = [
> -			"https://www.spamhaus.org/drop/drop.txt",
> -			"https://www.spamhaus.org/drop/edrop.txt",
> -			"https://www.spamhaus.org/drop/dropv6.txt"
> +		ip_lists = [
> +			("SPAMHAUS-DROP", "https://www.spamhaus.org/drop/drop.txt"),
> +			("SPAMHAUS-EDROP", "https://www.spamhaus.org/drop/edrop.txt"),
> +			("SPAMHAUS-DROPV6", "https://www.spamhaus.org/drop/dropv6.txt")
>  		]
>
> -		asn_urls = [
> -			"https://www.spamhaus.org/drop/asndrop.txt"
> +		asn_lists = [
> +			("SPAMHAUS-ASNDROP", "https://www.spamhaus.org/drop/asndrop.txt")
>  		]
>
> -		for url in ip_urls:
> -			# Fetch IP list
> +		for name, url in ip_lists:
> +			# Fetch IP list from given URL
>  			f = downloader.retrieve(url)
>
>  			# Split into lines
>  			fcontent = f.readlines()
>
> -			# Conduct a very basic sanity check to rule out CDN issues causing bogus DROP
> -			# downloads.
> -			if len(fcontent) > 10:
> -				self.db.execute("""
> -					DELETE FROM autnum_overrides WHERE source = 'Spamhaus ASN-DROP list';
> -					DELETE FROM network_overrides WHERE source = 'Spamhaus DROP lists';
> -				""")
> -			else:
> -				log.error("Spamhaus DROP URL %s returned likely bogus file, ignored" % url)
> -				continue
> -
> -			# Iterate through every line, filter comments and add remaining networks to
> -			# the override table in case they are valid...
>  			with self.db.transaction():
> +				# Conduct a very basic sanity check to rule out CDN issues causing bogus DROP
> +				# downloads.
> +				if len(fcontent) > 10:
> +					self.db.execute("""
> +						DELETE FROM network_overrides WHERE source = '%s';
> +					""" % name,
> +					)
> +				else:
> +					log.error("%s (%s) returned likely bogus file, ignored" % (name, url))
> +					continue
> +
> +				# Iterate through every line, filter comments and add remaining networks to
> +				# the override table in case they are valid...
>  				for sline in fcontent:
>  					# The response is assumed to be encoded in UTF-8...
>  					sline = sline.decode("utf-8")
> @@ -1475,8 +1475,8 @@ class CLI(object):
>
>  					# Sanitize parsed networks...
>  					if not self._check_parsed_network(network):
> -						log.warning("Skipping bogus network found in Spamhaus DROP URL %s: %s" % \
> -							(url, network))
> +						log.warning("Skipping bogus network found in %s (%s): %s" % \
> +							(name, url, network))
>  						continue
>
>  					# Conduct SQL statement...
> @@ -1488,17 +1488,31 @@ class CLI(object):
>  						) VALUES (%s, %s, %s)
>  						ON CONFLICT (network) DO UPDATE SET is_drop = True""",
>  						"%s" % network,
> -						"Spamhaus DROP lists",
> +						name,
>  						True
>  					)
>
> -		for url in asn_urls:
> +		for name, url in asn_lists:
>  			# Fetch URL
>  			f = downloader.retrieve(url)
>
> -			# Iterate through every line, filter comments and add remaining ASNs to
> -			# the override table in case they are valid...
> +			# Split into lines
> +			fcontent = f.readlines()
> +
>  			with self.db.transaction():
> +				# Conduct a very basic sanity check to rule out CDN issues causing bogus DROP
> +				# downloads.
> +				if len(fcontent) > 10:
> +					self.db.execute("""
> +						DELETE FROM autnum_overrides WHERE source = '%s';
> +					""" % name,
> +					)
> +				else:
> +					log.error("%s (%s) returned likely bogus file, ignored" % (name, url))
> +					continue
> +
> +				# Iterate through every line, filter comments and add remaining ASNs to
> +				# the override table in case they are valid...
>  				for sline in f.readlines():
>  					# The response is assumed to be encoded in UTF-8...
>  					sline = sline.decode("utf-8")
> @@ -1518,8 +1532,8 @@ class CLI(object):
>
>  					# Filter invalid ASNs...
>  					if not self._check_parsed_asn(asn):
> -						log.warning("Skipping bogus ASN found in Spamhaus DROP URL %s: %s" % \
> -							(url, asn))
> +						log.warning("Skipping bogus ASN found in %s (%s): %s" % \
> +							(name, url, asn))
>  						continue
>
>  					# Conduct SQL statement...
> @@ -1531,7 +1545,7 @@ class CLI(object):
>  						) VALUES (%s, %s, %s)
>  						ON CONFLICT (number) DO UPDATE SET is_drop = True""",
>  						"%s" % asn,
> -						"Spamhaus ASN-DROP list",
> +						name,
>  						True
>  					)
>
> --
> 2.35.3
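
Two details in the ASN hunk may be worth a second look. It reads the download into fcontent but then iterates over f.readlines() again; if retrieve() returns an ordinary file-like object (an assumption here), the second call would return an empty list because the stream is already exhausted. The DELETE statements also build the SQL with string formatting rather than placeholders. The following is a minimal sketch of the same per-list refresh that reads the stream once and uses a placeholder. It is not the importer's code: SQLite and io.BytesIO stand in for the real database and downloader, refresh_asn_list() is a hypothetical helper, and the line parsing is deliberately simplified.

```python
import io
import sqlite3

MIN_LINES = 10  # same threshold as the "len(fcontent) > 10" check in the patch

def refresh_asn_list(db, name, f):
    """Replace the rows of one list, but only if the download looks sane."""
    # Read the stream exactly once and reuse the result below
    fcontent = f.readlines()

    if len(fcontent) <= MIN_LINES:
        return False  # likely bogus download: keep the existing rows

    with db:  # delete and re-insert in a single transaction
        # Placeholder instead of string formatting for the list name
        db.execute("DELETE FROM autnum_overrides WHERE source = ?", (name,))
        for sline in fcontent:
            line = sline.decode("utf-8").strip()
            # Skip blank lines and comments
            if not line or line.startswith(";"):
                continue
            # Hypothetical simplified parsing of lines like "AS64496 ; comment"
            asn = int(line.split(";")[0].strip()[2:])
            db.execute(
                "INSERT OR REPLACE INTO autnum_overrides(number, source, is_drop) VALUES (?, ?, 1)",
                (asn, name),
            )
    return True

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE autnum_overrides(number INTEGER PRIMARY KEY, source TEXT, is_drop INTEGER)")
db.execute("INSERT INTO autnum_overrides VALUES (64511, 'SPAMHAUS-ASNDROP', 1)")
db.commit()

# A plausible download (1 comment + 12 ASNs) and an obviously truncated one
good = io.BytesIO(b"; header\n" + b"".join(b"AS%d ; example\n" % n for n in range(64496, 64508)))
bogus = io.BytesIO(b"<html>error</html>\n")

bogus_result = refresh_asn_list(db, "SPAMHAUS-ASNDROP", bogus)
rows_after_bogus = db.execute("SELECT COUNT(*) FROM autnum_overrides").fetchone()[0]

good_result = refresh_asn_list(db, "SPAMHAUS-ASNDROP", good)
rows_after_good = db.execute("SELECT COUNT(*) FROM autnum_overrides").fetchone()[0]

# The stream was consumed by the first readlines() call
leftover = good.readlines()
```

Here the bogus download leaves the existing row untouched, the good one replaces it with the twelve parsed ASNs, and leftover is empty, which is what a second f.readlines() would see.
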