From: Michael Tremer <michael.tremer@ipfire.org>
To: location@lists.ipfire.org
Subject: Re: [PATCH 2/2] location-importer.in: Add Spamhaus DROP lists
Date: Thu, 14 Oct 2021 19:12:36 +0100
Message-ID: <CEE4DD51-855F-495C-AF23-8683162C56CD@ipfire.org>
In-Reply-To: <a9034d43-3973-3a9b-cfab-e0dc5ab50336@ipfire.org>
Hello,
> On 13 Oct 2021, at 17:37, Peter Müller <peter.mueller@ipfire.org> wrote:
>
> Hello Michael,
>
> thanks for your reply.
>
>> Hello,
>>
>>> On 10 Oct 2021, at 17:16, Peter Müller <peter.mueller@ipfire.org> wrote:
>>>
>>> A while ago, it was discussed whether or not libloc should become an
>>> "opinionated database", i.e. one that includes information on a network's
>>> reputation.
>>>
>>> In general, this idea was dismissed as libloc is neither intended nor
>>> suitable for such tasks, and we do not want to make (political?)
>>> decisions like these for various reasons. All we do is provide a
>>> useful location database in a neutral way, and leave it up to our users
>>> how to react to certain results.
>>>
>>> However, there is a problematic area. Take AS55303 as an example: We
>>> _know_ this to be a dirty network, tampering with RIR data and
>>> hijacking IP space, and strongly recommend against processing any
>>> connection originating from or directed to it.
>>>
>>> Since it appears to be loaded with proxies used by miscreants for
>>> abusive purposes, all we can do at the time of writing is flag it
>>> as "anonymous proxy", but we have no way of telling our users
>>> something like "this is not a safe area". The very same goes for known
>>> bulletproof ISPs, IP hijackers, and so forth.
>>>
>>> This patch therefore suggests populating the "is_drop" flag introduced
>>> in libloc 0.9.8 (albeit currently unused in production) with the
>>> contents of Spamhaus' DROP lists (https://www.spamhaus.org/drop/), to
>>> have at least the worst of the worst covered. The very same lists are,
>>> in fact, included in popular IPS rulesets as well - a decent number of
>>> IPFire users are therefore likely to have them enabled already, but in a
>>> very costly way.
>>>
>>> There are no plans to go further, partly because no other publicly
>>> available feed comes with the same intention, volatility, and FP
>>> rate.
>>>
>>> The second version of this patch makes use of an auxiliary function to
>>> sanitise ASNs, hence avoiding boilerplate code, and treats any line
>>> starting with a semicolon as a comment, which should be sufficient.
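For anyone following along: from memory, the feeds are plain text files with semicolon comments, roughly like this (all values below are placeholders, not real entries):

    ; Spamhaus DROP List
    ; Last-Modified: ...
    192.0.2.0/24 ; SBL000000

and for the ASN feed:

    ; Spamhaus ASN-DROP List
    AS65551 ; ZZ | Example Operator

So skipping everything that starts with a semicolon and taking the first whitespace-separated token of the remaining lines should indeed be sufficient.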
>>>
>>> Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
>>> ---
>>> src/python/location-importer.in | 104 ++++++++++++++++++++++++++++++++
>>> 1 file changed, 104 insertions(+)
>>>
>>> diff --git a/src/python/location-importer.in b/src/python/location-importer.in
>>> index c2b3e41..1d84c15 100644
>>> --- a/src/python/location-importer.in
>>> +++ b/src/python/location-importer.in
>>> @@ -1075,6 +1075,9 @@ class CLI(object):
>>> # network allocation lists in a machine-readable format...
>>> self._update_overrides_for_aws()
>>>
>>> + # Update overrides for Spamhaus DROP feeds...
>>> + self._update_overrides_for_spamhaus_drop()
>>> +
>>> for file in ns.files:
>>> log.info("Reading %s..." % file)
>>>
>>> @@ -1259,6 +1262,107 @@ class CLI(object):
>>> )
>>>
>>>
>>> + def _update_overrides_for_spamhaus_drop(self):
>>> + downloader = location.importer.Downloader()
>>> +
>>> + ip_urls = [
>>> + "https://www.spamhaus.org/drop/drop.txt",
>>> + "https://www.spamhaus.org/drop/edrop.txt",
>>> + "https://www.spamhaus.org/drop/dropv6.txt"
>>> + ]
>>> +
>>> + asn_urls = [
>>> + "https://www.spamhaus.org/drop/asndrop.txt"
>>> + ]
>>> +
>>> + for url in ip_urls:
>>> + try:
>>> + with downloader.request(url, return_blocks=False) as f:
>>> + fcontent = f.body.readlines()
>>> + except Exception as e:
>>> + log.error("Unable to download Spamhaus DROP URL %s: %s" % (url, e))
>>> + return
>>> +
>>> + # Iterate through every line, filter comments and add remaining networks to
>>> + # the override table in case they are valid...
>>> + with self.db.transaction():
>>> + for sline in fcontent:
>>> +
>>> + # The response is assumed to be encoded in UTF-8...
>>> + sline = sline.decode("utf-8")
>>> +
>>> + # Comments start with a semicolon...
>>> + if sline.startswith(";"):
>>> + continue
>>> +
>>> + # Extract network and ignore anything afterwards...
>>> + try:
>>> + network = ipaddress.ip_network(sline.split()[0], strict=False)
>>> + except ValueError:
>>> + log.error("Unable to parse line: %s" % sline)
>>> + continue
>>> +
>>> + # Sanitize parsed networks...
>>> + if not self._check_parsed_network(network):
>>> + log.warning("Skipping bogus network found in Spamhaus DROP URL %s: %s" % \
>>> + (url, network))
>>> + continue
>>> +
>>> + # Conduct SQL statement...
>>> + self.db.execute("""
>>> + INSERT INTO network_overrides(
>>> + network,
>>> + source,
>>> + is_drop
>>> + ) VALUES (%s, %s, %s)
>>> + ON CONFLICT (network) DO NOTHING""",
>>> + "%s" % network,
>>> + "Spamhaus DROP lists",
>>> + True
>>> + )
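Just to make the parsing path concrete, a single data line goes through something like this (a minimal sketch; the prefix is a documentation network, not an actual DROP entry):

    import ipaddress

    # One raw line as returned by readlines(), decoded from UTF-8...
    sline = b"192.0.2.0/24 ; SBL000000\n".decode("utf-8")

    # Take the network before the comment and parse it...
    network = ipaddress.ip_network(sline.split()[0], strict=False)
    # -> IPv4Network('192.0.2.0/24')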
>>> +
>>> + for url in asn_urls:
>>> + try:
>>> + with downloader.request(url, return_blocks=False) as f:
>>> + fcontent = f.body.readlines()
>>> + except Exception as e:
>>> + log.error("Unable to download Spamhaus DROP URL %s: %s" % (url, e))
>>> + return
>>> +
>>> + # Iterate through every line, filter comments and add remaining ASNs to
>>> + # the override table in case they are valid...
>>> + with self.db.transaction():
>>> + for sline in fcontent:
>>> +
>>> + # The response is assumed to be encoded in UTF-8...
>>> + sline = sline.decode("utf-8")
>>> +
>>> + # Comments start with a semicolon...
>>> + if sline.startswith(";"):
>>> + continue
>>> +
>>> + # Extract ASN and ignore anything afterwards...
>>> + asn = int(sline.split()[0].strip("AS"))
>>
>> I kind of like stuff like this being split over multiple lines. Performance is the same, but it is easier to understand, and debugging lines can be added more easily.
>
> I see.
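For the record, this untested sketch is roughly what I have in mind:

    # Extract ASN and ignore anything afterwards...
    asn = sline.split()[0]

    # Drop the "AS" prefix (strip() removes the characters A and S from both ends)...
    asn = asn.strip("AS")

    # ... and convert to an integer
    asn = int(asn)

Each step can then be logged or guarded on its own if something ever goes wrong.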
>
>>
>>> +
>>> + # Filter invalid ASNs...
>>> + if not self._check_parsed_asn(asn):
>>> + log.warning("Skipping bogus ASN found in Spamhaus DROP URL %s: %s" % \
>>> + (url, asn))
>>> + continue
>>
>> This answers my question from my other email about whether a non-integer could be passed to the check function. It can't happen here. I would prefer to drop the isinstance() check for performance reasons.
>
> True, it can't happen, but I just noticed that I would not catch a ValueError in case something
> other than an integer is left after all these split() and strip() calls. This is covered in the
> _import_as_names_from_arin() function, so I will add it here as well.
Yeah, that might be true, but you don't want to catch all exceptions. They are called exceptions for a reason.
When you parse stuff, you want to check it, but I would normally expect the caller of a function to have sufficiently typed their arguments. Otherwise you want to use a typed language like C :)
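If you do add a guard there, I would keep it narrow: only catch the ValueError from the int() conversion and skip that line, instead of a broad except. An untested sketch of what I mean:

    # Extract ASN and ignore anything afterwards...
    try:
        asn = int(sline.split()[0].strip("AS"))
    except ValueError:
        log.error("Unable to parse ASN from line: %s" % sline)
        continue

With that in place, _check_parsed_asn() is always handed an int, so its isinstance() check really only costs time.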
-Michael
>
> Thanks, and best regards,
> Peter Müller
>
>>
>>> +
>>> + # Conduct SQL statement...
>>> + self.db.execute("""
>>> + INSERT INTO autnum_overrides(
>>> + number,
>>> + source,
>>> + is_drop
>>> + ) VALUES (%s, %s, %s)
>>> + ON CONFLICT (number) DO NOTHING""",
>>> + "%s" % asn,
>>> + "Spamhaus ASN-DROP list",
>>> + True
>>> + )
>>> +
>>> @staticmethod
>>> def _parse_bool(block, key):
>>> val = block.get(key)
>>> --
>>> 2.26.2