From: "Peter Müller" <peter.mueller@ipfire.org>
To: location@lists.ipfire.org
Subject: Re: [PATCH 2/2] location-importer.in: Add Spamhaus DROP lists
Date: Wed, 13 Oct 2021 18:37:08 +0200 [thread overview]
Message-ID: <a9034d43-3973-3a9b-cfab-e0dc5ab50336@ipfire.org> (raw)
In-Reply-To: <D30B265E-898F-470E-980E-1DAF296549D5@ipfire.org>
[-- Attachment #1: Type: text/plain, Size: 7065 bytes --]
Hello Michael,
thanks for your reply.
> Hello,
>
>> On 10 Oct 2021, at 17:16, Peter Müller <peter.mueller(a)ipfire.org> wrote:
>>
>> A while ago, it was discussed whether or not libloc should become an
>> "opinionated database", i. e. including any information on a network's
>> reputation.
>>
>> In general, this idea was dismissed as libloc is neither intended nor
>> suitable for such tasks, and we do not want to make (political?)
>> decisions like these for various reasons. All we do is to provide a
>> useful location database in a neutral way, and leave it up to our users
>> on how to react on certain results.
>>
>> However, there is a problematic area. Take AS55303 as an example: We
>> _know_ this is to be a dirty network, tampering with RIR data and
>> hijacking IP space, and strongly recommend against processing any
>> connection originating from or directed to it.
>>
>> Since it appears to be loaded with proxies used by miscreants for
>> abusive purposes, all we can do at the time of writing is to flag it
>> as "anonymous proxy", but we lack possibility of telling our users
>> something like "this is not a safe area". The very same goes for known
>> bulletproof ISPs, IP hijackers, and so forth.
>>
>> This patch therefore suggests to populate the "is_drop" flag introduced
>> in libloc 0.9.8 (albeit currently unused in production) with the
>> contents of Spamhaus' DROP lists (https://www.spamhaus.org/drop/), to
>> have at least the baddest of the bad covered. The very same lists are,
>> in fact, included in popular IPS rulesets as well - a decent amount of
>> IPFire users is therefore likely to have them already enabled, but in a
>> very costly way.
>>
>> It is not planned to go further, partly because there is no other feed
>> publicly available, which would come with the same intention,
>> volatility, and FP rate.
>>
>> The second version of this patch makes use of an auxiliary function to
>> sanitise ASNs, hence avoiding boilerplate code, and treats any line
>> starting with a semicolon as a comment, which should be sufficient.
>>
>> Signed-off-by: Peter Müller <peter.mueller(a)ipfire.org>
>> ---
>> src/python/location-importer.in | 104 ++++++++++++++++++++++++++++++++
>> 1 file changed, 104 insertions(+)
>>
>> diff --git a/src/python/location-importer.in b/src/python/location-importer.in
>> index c2b3e41..1d84c15 100644
>> --- a/src/python/location-importer.in
>> +++ b/src/python/location-importer.in
>> @@ -1075,6 +1075,9 @@ class CLI(object):
>> # network allocation lists in a machine-readable format...
>> self._update_overrides_for_aws()
>>
>> + # Update overrides for Spamhaus DROP feeds...
>> + self._update_overrides_for_spamhaus_drop()
>> +
>> for file in ns.files:
>> log.info("Reading %s..." % file)
>>
>> @@ -1259,6 +1262,107 @@ class CLI(object):
>> )
>>
>>
>> + def _update_overrides_for_spamhaus_drop(self):
>> + downloader = location.importer.Downloader()
>> +
>> + ip_urls = [
>> + "https://www.spamhaus.org/drop/drop.txt",
>> + "https://www.spamhaus.org/drop/edrop.txt",
>> + "https://www.spamhaus.org/drop/dropv6.txt"
>> + ]
>> +
>> + asn_urls = [
>> + "https://www.spamhaus.org/drop/asndrop.txt"
>> + ]
>> +
>> + for url in ip_urls:
>> + try:
>> + with downloader.request(url, return_blocks=False) as f:
>> + fcontent = f.body.readlines()
>> + except Exception as e:
>> + log.error("Unable to download Spamhaus DROP URL %s: %s" % (url, e))
>> + return
>> +
>> + # Iterate through every line, filter comments and add remaining networks to
>> + # the override table in case they are valid...
>> + with self.db.transaction():
>> + for sline in fcontent:
>> +
>> + # The response is assumed to be encoded in UTF-8...
>> + sline = sline.decode("utf-8")
>> +
>> + # Comments start with a semicolon...
>> + if sline.startswith(";"):
>> + continue
>> +
>> + # Extract network and ignore anything afterwards...
>> + try:
>> + network = ipaddress.ip_network(sline.split()[0], strict=False)
>> + except ValueError:
>> + log.error("Unable to parse line: %s" % sline)
>> + continue
>> +
>> + # Sanitize parsed networks...
>> + if not self._check_parsed_network(network):
>> + log.warning("Skipping bogus network found in Spamhaus DROP URL %s: %s" % \
>> + (url, network))
>> + continue
>> +
>> + # Conduct SQL statement...
>> + self.db.execute("""
>> + INSERT INTO network_overrides(
>> + network,
>> + source,
>> + is_drop
>> + ) VALUES (%s, %s, %s)
>> + ON CONFLICT (network) DO NOTHING""",
>> + "%s" % network,
>> + "Spamhaus DROP lists",
>> + True
>> + )
>> +
>> + for url in asn_urls:
>> + try:
>> + with downloader.request(url, return_blocks=False) as f:
>> + fcontent = f.body.readlines()
>> + except Exception as e:
>> + log.error("Unable to download Spamhaus DROP URL %s: %s" % (url, e))
>> + return
>> +
>> + # Iterate through every line, filter comments and add remaining ASNs to
>> + # the override table in case they are valid...
>> + with self.db.transaction():
>> + for sline in fcontent:
>> +
>> + # The response is assumed to be encoded in UTF-8...
>> + sline = sline.decode("utf-8")
>> +
>> + # Comments start with a semicolon...
>> + if sline.startswith(";"):
>> + continue
>> +
>> + # Extract ASN and ignore anything afterwards...
>> + asn = int(sline.split()[0].strip("AS"))
>
> I kind of like stuff like this being split over multiple lines. Performance is the same, but it is easier to understand and debugging lines can be added easier.
I see.
>
>> +
>> + # Filter invalid ASNs...
>> + if not self._check_parsed_asn(asn):
>> + log.warning("Skipping bogus ASN found in Spamhaus DROP URL %s: %s" % \
>> + (url, asn))
>> + continue
>
> This answers my question from my other email whether it is possible that a non-integer is being passed to the check function. It can’t happen here. I would prefer to drop the isinstance() check for performance reasons.
True, it can't happen, but I just noticed I would not catch a ValueError in case something
else than an integer would have left after all these split() and strip(). In the
_import_as_names_from_arin() function, this is covered, hence I will add this here as well.
Thanks, and best regards,
Peter Müller
>
>> +
>> + # Conduct SQL statement...
>> + self.db.execute("""
>> + INSERT INTO autnum_overrides(
>> + number,
>> + source,
>> + is_drop
>> + ) VALUES (%s, %s, %s)
>> + ON CONFLICT (number) DO NOTHING""",
>> + "%s" % asn,
>> + "Spamhaus ASN-DROP list",
>> + True
>> + )
>> +
>> @staticmethod
>> def _parse_bool(block, key):
>> val = block.get(key)
>> --
>> 2.26.2
>
next prev parent reply other threads:[~2021-10-13 16:37 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-10 16:16 [PATCH 1/2] location-importer: Introduce auxiliary function to sanitise ASNs Peter Müller
2021-10-10 16:16 ` [PATCH 2/2] location-importer.in: Add Spamhaus DROP lists Peter Müller
2021-10-12 11:26 ` Michael Tremer
2021-10-13 16:37 ` Peter Müller [this message]
2021-10-14 18:12 ` Michael Tremer
2021-10-12 11:23 ` [PATCH 1/2] location-importer: Introduce auxiliary function to sanitise ASNs Michael Tremer
2021-10-13 16:33 ` Peter Müller
2021-10-14 18:19 ` Michael Tremer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a9034d43-3973-3a9b-cfab-e0dc5ab50336@ipfire.org \
--to=peter.mueller@ipfire.org \
--cc=location@lists.ipfire.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox