public inbox for location@lists.ipfire.org
 help / color / mirror / Atom feed
From: "Peter Müller" <peter.mueller@ipfire.org>
To: location@lists.ipfire.org
Subject: Re: [PATCH 2/2] location-importer.in: Add Spamhaus DROP lists
Date: Wed, 13 Oct 2021 18:37:08 +0200	[thread overview]
Message-ID: <a9034d43-3973-3a9b-cfab-e0dc5ab50336@ipfire.org> (raw)
In-Reply-To: <D30B265E-898F-470E-980E-1DAF296549D5@ipfire.org>

[-- Attachment #1: Type: text/plain, Size: 7065 bytes --]

Hello Michael,

thanks for your reply.

> Hello,
> 
>> On 10 Oct 2021, at 17:16, Peter Müller <peter.mueller(a)ipfire.org> wrote:
>>
>> A while ago, it was discussed whether or not libloc should become an
>> "opinionated database", i. e. including any information on a network's
>> reputation.
>>
>> In general, this idea was dismissed as libloc is neither intended nor
>> suitable for such tasks, and we do not want to make (political?)
>> decisions like these for various reasons. All we do is to provide a
>> useful location database in a neutral way, and leave it up to our users
>> on how to react on certain results.
>>
>> However, there is a problematic area. Take AS55303 as an example: We
>> _know_ this is to be a dirty network, tampering with RIR data and
>> hijacking IP space, and strongly recommend against processing any
>> connection originating from or directed to it.
>>
>> Since it appears to be loaded with proxies used by miscreants for
>> abusive purposes, all we can do at the time of writing is to flag it
>> as "anonymous proxy", but we lack possibility of telling our users
>> something like "this is not a safe area". The very same goes for known
>> bulletproof ISPs, IP hijackers, and so forth.
>>
>> This patch therefore suggests to populate the "is_drop" flag introduced
>> in libloc 0.9.8 (albeit currently unused in production) with the
>> contents of Spamhaus' DROP lists (https://www.spamhaus.org/drop/), to
>> have at least the baddest of the bad covered. The very same lists are,
>> in fact, included in popular IPS rulesets as well - a decent amount of
>> IPFire users is therefore likely to have them already enabled, but in a
>> very costly way.
>>
>> It is not planned to go further, partly because there is no other feed
>> publicly available, which would come with the same intention,
>> volatility, and FP rate.
>>
>> The second version of this patch makes use of an auxiliary function to
>> sanitise ASNs, hence avoiding boilerplate code, and treats any line
>> starting with a semicolon as a comment, which should be sufficient.
>>
>> Signed-off-by: Peter Müller <peter.mueller(a)ipfire.org>
>> ---
>> src/python/location-importer.in | 104 ++++++++++++++++++++++++++++++++
>> 1 file changed, 104 insertions(+)
>>
>> diff --git a/src/python/location-importer.in b/src/python/location-importer.in
>> index c2b3e41..1d84c15 100644
>> --- a/src/python/location-importer.in
>> +++ b/src/python/location-importer.in
>> @@ -1075,6 +1075,9 @@ class CLI(object):
>> 			# network allocation lists in a machine-readable format...
>> 			self._update_overrides_for_aws()
>>
>> +			# Update overrides for Spamhaus DROP feeds...
>> +			self._update_overrides_for_spamhaus_drop()
>> +
>> 			for file in ns.files:
>> 				log.info("Reading %s..." % file)
>>
>> @@ -1259,6 +1262,107 @@ class CLI(object):
>> 				)
>>
>>
>> +	def _update_overrides_for_spamhaus_drop(self):
>> +		downloader = location.importer.Downloader()
>> +
>> +		ip_urls = [
>> +					"https://www.spamhaus.org/drop/drop.txt",
>> +					"https://www.spamhaus.org/drop/edrop.txt",
>> +					"https://www.spamhaus.org/drop/dropv6.txt"
>> +				]
>> +
>> +		asn_urls = [
>> +					"https://www.spamhaus.org/drop/asndrop.txt"
>> +				]
>> +
>> +		for url in ip_urls:
>> +			try:
>> +				with downloader.request(url, return_blocks=False) as f:
>> +					fcontent = f.body.readlines()
>> +			except Exception as e:
>> +				log.error("Unable to download Spamhaus DROP URL %s: %s" % (url, e))
>> +				return
>> +
>> +			# Iterate through every line, filter comments and add remaining networks to
>> +			# the override table in case they are valid...
>> +			with self.db.transaction():
>> +				for sline in fcontent:
>> +
>> +					# The response is assumed to be encoded in UTF-8...
>> +					sline = sline.decode("utf-8")
>> +
>> +					# Comments start with a semicolon...
>> +					if sline.startswith(";"):
>> +						continue
>> +
>> +					# Extract network and ignore anything afterwards...
>> +					try:
>> +						network = ipaddress.ip_network(sline.split()[0], strict=False)
>> +					except ValueError:
>> +						log.error("Unable to parse line: %s" % sline)
>> +						continue
>> +
>> +					# Sanitize parsed networks...
>> +					if not self._check_parsed_network(network):
>> +						log.warning("Skipping bogus network found in Spamhaus DROP URL %s: %s" % \
>> +							(url, network))
>> +						continue
>> +
>> +					# Conduct SQL statement...
>> +					self.db.execute("""
>> +						INSERT INTO network_overrides(
>> +							network,
>> +							source,
>> +							is_drop
>> +						) VALUES (%s, %s, %s)
>> +						ON CONFLICT (network) DO NOTHING""",
>> +						"%s" % network,
>> +						"Spamhaus DROP lists",
>> +						True
>> +					)
>> +
>> +		for url in asn_urls:
>> +			try:
>> +				with downloader.request(url, return_blocks=False) as f:
>> +					fcontent = f.body.readlines()
>> +			except Exception as e:
>> +				log.error("Unable to download Spamhaus DROP URL %s: %s" % (url, e))
>> +				return
>> +
>> +			# Iterate through every line, filter comments and add remaining ASNs to
>> +			# the override table in case they are valid...
>> +			with self.db.transaction():
>> +				for sline in fcontent:
>> +
>> +					# The response is assumed to be encoded in UTF-8...
>> +					sline = sline.decode("utf-8")
>> +
>> +					# Comments start with a semicolon...
>> +					if sline.startswith(";"):
>> +						continue
>> +
>> +					# Extract ASN and ignore anything afterwards...
>> +					asn = int(sline.split()[0].strip("AS"))
> 
> I kind of like stuff like this being split over multiple lines. Performance is the same, but it is easier to understand and debugging lines can be added easier.

I see.

> 
>> +
>> +					# Filter invalid ASNs...
>> +					if not self._check_parsed_asn(asn):
>> +						log.warning("Skipping bogus ASN found in Spamhaus DROP URL %s: %s" % \
>> +							(url, asn))
>> +						continue
> 
> This answers my question from my other email whether it is possible that a non-integer is being passed to the check function. It can’t happen  here. I would prefer to drop the isinstance() check for performance reasons.

True, it can't happen, but I just noticed I would not catch a ValueError in case something
else than an integer would have left after all these split() and strip(). In the
_import_as_names_from_arin() function, this is covered, hence I will add this here as well.

Thanks, and best regards,
Peter Müller

> 
>> +
>> +					# Conduct SQL statement...
>> +					self.db.execute("""
>> +						INSERT INTO autnum_overrides(
>> +							number,
>> +							source,
>> +							is_drop
>> +						) VALUES (%s, %s, %s)
>> +						ON CONFLICT (number) DO NOTHING""",
>> +						"%s" % asn,
>> +						"Spamhaus ASN-DROP list",
>> +						True
>> +					)
>> +
>> 	@staticmethod
>> 	def _parse_bool(block, key):
>> 		val = block.get(key)
>> -- 
>> 2.26.2
> 

  reply	other threads:[~2021-10-13 16:37 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-10 16:16 [PATCH 1/2] location-importer: Introduce auxiliary function to sanitise ASNs Peter Müller
2021-10-10 16:16 ` [PATCH 2/2] location-importer.in: Add Spamhaus DROP lists Peter Müller
2021-10-12 11:26   ` Michael Tremer
2021-10-13 16:37     ` Peter Müller [this message]
2021-10-14 18:12       ` Michael Tremer
2021-10-12 11:23 ` [PATCH 1/2] location-importer: Introduce auxiliary function to sanitise ASNs Michael Tremer
2021-10-13 16:33   ` Peter Müller
2021-10-14 18:19     ` Michael Tremer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a9034d43-3973-3a9b-cfab-e0dc5ab50336@ipfire.org \
    --to=peter.mueller@ipfire.org \
    --cc=location@lists.ipfire.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox