From: Michael Tremer <michael.tremer@ipfire.org>
To: location@lists.ipfire.org
Subject: Re: [PATCH 2/2] location-importer.in: Add Spamhaus DROP lists
Date: Tue, 12 Oct 2021 12:26:52 +0100 [thread overview]
Message-ID: <D30B265E-898F-470E-980E-1DAF296549D5@ipfire.org> (raw)
In-Reply-To: <640523c0-63c2-e671-a013-20d09c03d231@ipfire.org>
[-- Attachment #1: Type: text/plain, Size: 6521 bytes --]
Hello,
> On 10 Oct 2021, at 17:16, Peter Müller <peter.mueller(a)ipfire.org> wrote:
>
> A while ago, it was discussed whether or not libloc should become an
> "opinionated database", i. e. including any information on a network's
> reputation.
>
> In general, this idea was dismissed as libloc is neither intended nor
> suitable for such tasks, and we do not want to make (political?)
> decisions like these for various reasons. All we do is to provide a
> useful location database in a neutral way, and leave it up to our users
> on how to react on certain results.
>
> However, there is a problematic area. Take AS55303 as an example: We
> _know_ this is to be a dirty network, tampering with RIR data and
> hijacking IP space, and strongly recommend against processing any
> connection originating from or directed to it.
>
> Since it appears to be loaded with proxies used by miscreants for
> abusive purposes, all we can do at the time of writing is to flag it
> as "anonymous proxy", but we lack possibility of telling our users
> something like "this is not a safe area". The very same goes for known
> bulletproof ISPs, IP hijackers, and so forth.
>
> This patch therefore suggests to populate the "is_drop" flag introduced
> in libloc 0.9.8 (albeit currently unused in production) with the
> contents of Spamhaus' DROP lists (https://www.spamhaus.org/drop/), to
> have at least the baddest of the bad covered. The very same lists are,
> in fact, included in popular IPS rulesets as well - a decent amount of
> IPFire users is therefore likely to have them already enabled, but in a
> very costly way.
>
> It is not planned to go further, partly because there is no other feed
> publicly available, which would come with the same intention,
> volatility, and FP rate.
>
> The second version of this patch makes use of an auxiliary function to
> sanitise ASNs, hence avoiding boilerplate code, and treats any line
> starting with a semicolon as a comment, which should be sufficient.
>
> Signed-off-by: Peter Müller <peter.mueller(a)ipfire.org>
> ---
> src/python/location-importer.in | 104 ++++++++++++++++++++++++++++++++
> 1 file changed, 104 insertions(+)
>
> diff --git a/src/python/location-importer.in b/src/python/location-importer.in
> index c2b3e41..1d84c15 100644
> --- a/src/python/location-importer.in
> +++ b/src/python/location-importer.in
> @@ -1075,6 +1075,9 @@ class CLI(object):
> # network allocation lists in a machine-readable format...
> self._update_overrides_for_aws()
>
> + # Update overrides for Spamhaus DROP feeds...
> + self._update_overrides_for_spamhaus_drop()
> +
> for file in ns.files:
> log.info("Reading %s..." % file)
>
> @@ -1259,6 +1262,107 @@ class CLI(object):
> )
>
>
> + def _update_overrides_for_spamhaus_drop(self):
> + downloader = location.importer.Downloader()
> +
> + ip_urls = [
> + "https://www.spamhaus.org/drop/drop.txt",
> + "https://www.spamhaus.org/drop/edrop.txt",
> + "https://www.spamhaus.org/drop/dropv6.txt"
> + ]
> +
> + asn_urls = [
> + "https://www.spamhaus.org/drop/asndrop.txt"
> + ]
> +
> + for url in ip_urls:
> + try:
> + with downloader.request(url, return_blocks=False) as f:
> + fcontent = f.body.readlines()
> + except Exception as e:
> + log.error("Unable to download Spamhaus DROP URL %s: %s" % (url, e))
> + return
> +
> + # Iterate through every line, filter comments and add remaining networks to
> + # the override table in case they are valid...
> + with self.db.transaction():
> + for sline in fcontent:
> +
> + # The response is assumed to be encoded in UTF-8...
> + sline = sline.decode("utf-8")
> +
> + # Comments start with a semicolon...
> + if sline.startswith(";"):
> + continue
> +
> + # Extract network and ignore anything afterwards...
> + try:
> + network = ipaddress.ip_network(sline.split()[0], strict=False)
> + except ValueError:
> + log.error("Unable to parse line: %s" % sline)
> + continue
> +
> + # Sanitize parsed networks...
> + if not self._check_parsed_network(network):
> + log.warning("Skipping bogus network found in Spamhaus DROP URL %s: %s" % \
> + (url, network))
> + continue
> +
> + # Conduct SQL statement...
> + self.db.execute("""
> + INSERT INTO network_overrides(
> + network,
> + source,
> + is_drop
> + ) VALUES (%s, %s, %s)
> + ON CONFLICT (network) DO NOTHING""",
> + "%s" % network,
> + "Spamhaus DROP lists",
> + True
> + )
> +
> + for url in asn_urls:
> + try:
> + with downloader.request(url, return_blocks=False) as f:
> + fcontent = f.body.readlines()
> + except Exception as e:
> + log.error("Unable to download Spamhaus DROP URL %s: %s" % (url, e))
> + return
> +
> + # Iterate through every line, filter comments and add remaining ASNs to
> + # the override table in case they are valid...
> + with self.db.transaction():
> + for sline in fcontent:
> +
> + # The response is assumed to be encoded in UTF-8...
> + sline = sline.decode("utf-8")
> +
> + # Comments start with a semicolon...
> + if sline.startswith(";"):
> + continue
> +
> + # Extract ASN and ignore anything afterwards...
> + asn = int(sline.split()[0].strip("AS"))
I kind of like stuff like this being split over multiple lines. Performance is the same, but it is easier to understand and debugging lines can be added easier.
> +
> + # Filter invalid ASNs...
> + if not self._check_parsed_asn(asn):
> + log.warning("Skipping bogus ASN found in Spamhaus DROP URL %s: %s" % \
> + (url, asn))
> + continue
This answers my question from my other email whether it is possible that a non-integer is being passed to the check function. It can’t happen here. I would prefer to drop the isinstance() check for performance reasons.
> +
> + # Conduct SQL statement...
> + self.db.execute("""
> + INSERT INTO autnum_overrides(
> + number,
> + source,
> + is_drop
> + ) VALUES (%s, %s, %s)
> + ON CONFLICT (number) DO NOTHING""",
> + "%s" % asn,
> + "Spamhaus ASN-DROP list",
> + True
> + )
> +
> @staticmethod
> def _parse_bool(block, key):
> val = block.get(key)
> --
> 2.26.2
next prev parent reply other threads:[~2021-10-12 11:26 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-10 16:16 [PATCH 1/2] location-importer: Introduce auxiliary function to sanitise ASNs Peter Müller
2021-10-10 16:16 ` [PATCH 2/2] location-importer.in: Add Spamhaus DROP lists Peter Müller
2021-10-12 11:26 ` Michael Tremer [this message]
2021-10-13 16:37 ` Peter Müller
2021-10-14 18:12 ` Michael Tremer
2021-10-12 11:23 ` [PATCH 1/2] location-importer: Introduce auxiliary function to sanitise ASNs Michael Tremer
2021-10-13 16:33 ` Peter Müller
2021-10-14 18:19 ` Michael Tremer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=D30B265E-898F-470E-980E-1DAF296549D5@ipfire.org \
--to=michael.tremer@ipfire.org \
--cc=location@lists.ipfire.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox