public inbox for location@lists.ipfire.org
From: Michael Tremer <michael.tremer@ipfire.org>
To: location@lists.ipfire.org
Subject: Re: [PATCH v2] location-importer.in: Conduct sanity checks per DROP list
Date: Tue, 27 Sep 2022 10:17:11 +0100
Message-ID: <71AEB78A-D75A-4306-BCB1-9B8E4F56B063@ipfire.org>
In-Reply-To: <a4378944-5729-231c-2428-529855ea2479@ipfire.org>

Hello,

This looks a lot more Pythonic and okay to me.

I will merge this shortly.

-Michael

> On 26 Sep 2022, at 19:26, Peter Müller <peter.mueller(a)ipfire.org> wrote:
> 
> Previously, the lack of distinction between different DROP lists caused
> only the last one to be persisted. The second version of this patch
> incorporates suggestions from Michael on the first version.
> 
> Tested-by: Peter Müller <peter.mueller(a)ipfire.org>
> Signed-off-by: Peter Müller <peter.mueller(a)ipfire.org>
> ---
> src/scripts/location-importer.in | 74 +++++++++++++++++++-------------
> 1 file changed, 44 insertions(+), 30 deletions(-)
> 
> diff --git a/src/scripts/location-importer.in b/src/scripts/location-importer.in
> index 8d47497..d405eb2 100644
> --- a/src/scripts/location-importer.in
> +++ b/src/scripts/location-importer.in
> @@ -1427,37 +1427,37 @@ class CLI(object):
> 	def _update_overrides_for_spamhaus_drop(self):
> 		downloader = location.importer.Downloader()
> 
> -		ip_urls = [
> -					"https://www.spamhaus.org/drop/drop.txt",
> -					"https://www.spamhaus.org/drop/edrop.txt",
> -					"https://www.spamhaus.org/drop/dropv6.txt"
> +		ip_lists = [
> +					("SPAMHAUS-DROP", "https://www.spamhaus.org/drop/drop.txt"),
> +					("SPAMHAUS-EDROP", "https://www.spamhaus.org/drop/edrop.txt"),
> +					("SPAMHAUS-DROPV6", "https://www.spamhaus.org/drop/dropv6.txt")
> 				]
> 
> -		asn_urls = [
> -					"https://www.spamhaus.org/drop/asndrop.txt"
> +		asn_lists = [
> +					("SPAMHAUS-ASNDROP", "https://www.spamhaus.org/drop/asndrop.txt")
> 				]
> 
> -		for url in ip_urls:
> -			# Fetch IP list
> +		for name, url in ip_lists:
> +			# Fetch IP list from given URL
> 			f = downloader.retrieve(url)
> 
> 			# Split into lines
> 			fcontent = f.readlines()
> 
> -			# Conduct a very basic sanity check to rule out CDN issues causing bogus DROP
> -			# downloads.
> -			if len(fcontent) > 10:
> -				self.db.execute("""
> -					DELETE FROM autnum_overrides WHERE source = 'Spamhaus ASN-DROP list';
> -					DELETE FROM network_overrides WHERE source = 'Spamhaus DROP lists';
> -				""")
> -			else:
> -				log.error("Spamhaus DROP URL %s returned likely bogus file, ignored" % url)
> -				continue
> -
> -			# Iterate through every line, filter comments and add remaining networks to
> -			# the override table in case they are valid...
> 			with self.db.transaction():
> +				# Conduct a very basic sanity check to rule out CDN issues causing bogus DROP
> +				# downloads.
> +				if len(fcontent) > 10:
> +					self.db.execute("""
> +						DELETE FROM network_overrides WHERE source = '%s';
> +					""" % name,
> +					)
> +				else:
> +					log.error("%s (%s) returned likely bogus file, ignored" % (name, url))
> +					continue
> +
> +				# Iterate through every line, filter comments and add remaining networks to
> +				# the override table in case they are valid...
> 				for sline in fcontent:
> 					# The response is assumed to be encoded in UTF-8...
> 					sline = sline.decode("utf-8")
> @@ -1475,8 +1475,8 @@ class CLI(object):
> 
> 					# Sanitize parsed networks...
> 					if not self._check_parsed_network(network):
> -						log.warning("Skipping bogus network found in Spamhaus DROP URL %s: %s" % \
> -							(url, network))
> +						log.warning("Skipping bogus network found in %s (%s): %s" % \
> +							(name, url, network))
> 						continue
> 
> 					# Conduct SQL statement...
> @@ -1488,17 +1488,31 @@ class CLI(object):
> 						) VALUES (%s, %s, %s)
> 						ON CONFLICT (network) DO UPDATE SET is_drop = True""",
> 						"%s" % network,
> -						"Spamhaus DROP lists",
> +						name,
> 						True
> 					)
> 
> -		for url in asn_urls:
> +		for name, url in asn_lists:
> 			# Fetch URL
> 			f = downloader.retrieve(url)
> 
> -			# Iterate through every line, filter comments and add remaining ASNs to
> -			# the override table in case they are valid...
> +			# Split into lines
> +			fcontent = f.readlines()
> +
> 			with self.db.transaction():
> +				# Conduct a very basic sanity check to rule out CDN issues causing bogus DROP
> +				# downloads.
> +				if len(fcontent) > 10:
> +					self.db.execute("""
> +						DELETE FROM autnum_overrides WHERE source = '%s';
> +					""" % name,
> +					)
> +				else:
> +					log.error("%s (%s) returned likely bogus file, ignored" % (name, url))
> +					continue
> +
> +				# Iterate through every line, filter comments and add remaining ASNs to
> +				# the override table in case they are valid...
> 				for sline in f.readlines():
> 					# The response is assumed to be encoded in UTF-8...
> 					sline = sline.decode("utf-8")
> @@ -1518,8 +1532,8 @@ class CLI(object):
> 
> 					# Filter invalid ASNs...
> 					if not self._check_parsed_asn(asn):
> -						log.warning("Skipping bogus ASN found in Spamhaus DROP URL %s: %s" % \
> -							(url, asn))
> +						log.warning("Skipping bogus ASN found in %s (%s): %s" % \
> +							(name, url, asn))
> 						continue
> 
> 					# Conduct SQL statement...
> @@ -1531,7 +1545,7 @@ class CLI(object):
> 						) VALUES (%s, %s, %s)
> 						ON CONFLICT (number) DO UPDATE SET is_drop = True""",
> 						"%s" % asn,
> -						"Spamhaus ASN-DROP list",
> +						name,
> 						True
> 					)
> 
> -- 
> 2.35.3
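
For readers skimming the diff, the effect of the change can be sketched in isolation. Before the patch, the DELETE statements used fixed source labels and ran once per downloaded URL, so each iteration wiped what the previous one had inserted. The sketch below is not the importer's code: it uses an in-memory SQLite database, a hypothetical `refresh_overrides()` helper, documentation-range networks as sample data, and bound parameters in place of the string interpolation used in the patch. Only the table name, the list names and the "more than 10 lines" check are taken from the patch.

```python
import sqlite3

MIN_LINES = 10  # same threshold as the sanity check in the patch


def parse_networks(lines):
    """Yield the network part of each non-comment line ("<network> ; <ref>")."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line or line.startswith(";"):
            continue
        yield line.split(";")[0].strip()


def refresh_overrides(db, name, lines):
    """Replace the rows of one list only; other lists stay untouched."""
    if len(lines) <= MIN_LINES:
        # Likely a bogus download: keep whatever is already stored.
        return False
    with db:  # one transaction per list
        db.execute("DELETE FROM network_overrides WHERE source = ?", (name,))
        for network in parse_networks(lines):
            db.execute(
                "INSERT INTO network_overrides (network, source, is_drop) "
                "VALUES (?, ?, 1) "
                "ON CONFLICT (network) DO UPDATE SET is_drop = 1",
                (network, name),
            )
    return True


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE network_overrides "
    "(network TEXT PRIMARY KEY, source TEXT, is_drop INTEGER)"
)

# Made-up sample data standing in for two downloaded lists
drop = [b"; header\n"] + [b"192.0.2.%d/32 ; SBL0\n" % i for i in range(12)]
edrop = [b"; header\n"] + [b"198.51.100.%d/32 ; SBL0\n" % i for i in range(12)]

refresh_overrides(db, "SPAMHAUS-DROP", drop)
refresh_overrides(db, "SPAMHAUS-EDROP", edrop)

# A short (bogus) download is rejected and leaves existing rows alone
rejected = refresh_overrides(db, "SPAMHAUS-DROP", [b"error\n"])

counts = dict(
    db.execute("SELECT source, COUNT(*) FROM network_overrides GROUP BY source")
)
```

Because the DELETE is scoped to `name`, refreshing SPAMHAUS-EDROP does not remove the SPAMHAUS-DROP rows, and a rejected download leaves the stored rows of that list in place.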

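One detail in the ASN hunk may deserve a second look: `fcontent = f.readlines()` is added, but the unchanged context line further down still iterates over `f.readlines()`. The thread does not show what `downloader.retrieve()` returns. Assuming it behaves like an ordinary file object, the second call would find the stream already consumed. A minimal sketch of that behaviour, using `io.BytesIO` as a stand-in and made-up sample lines:

```python
import io

# Stand-in for the downloaded file (assumption: an ordinary file-like object)
f = io.BytesIO(b"; comment\nAS64496 ; example\nAS64497 ; example\n")

fcontent = f.readlines()  # reads everything up to the end of the stream
again = f.readlines()     # the stream is exhausted, so nothing is returned

# Iterating over the cached list still sees every line
asn_lines = [line for line in fcontent if not line.startswith(b";")]
```

If that assumption holds, looping over `fcontent`, as the IP list loop already does, would avoid the problem.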
