From mboxrd@z Thu Jan 1 00:00:00 1970 From: Michael Tremer To: location@lists.ipfire.org Subject: Re: [PATCH 2/2] location-importer.in: Add Spamhaus DROP lists Date: Tue, 12 Oct 2021 12:26:52 +0100 Message-ID: In-Reply-To: <640523c0-63c2-e671-a013-20d09c03d231@ipfire.org> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="===============3383065798077570127==" List-Id: --===============3383065798077570127== Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Hello, > On 10 Oct 2021, at 17:16, Peter M=C3=BCller wr= ote: >=20 > A while ago, it was discussed whether or not libloc should become an > "opinionated database", i. e. including any information on a network's > reputation. >=20 > In general, this idea was dismissed as libloc is neither intended nor > suitable for such tasks, and we do not want to make (political?) > decisions like these for various reasons. All we do is to provide a > useful location database in a neutral way, and leave it up to our users > on how to react on certain results. >=20 > However, there is a problematic area. Take AS55303 as an example: We > _know_ this is to be a dirty network, tampering with RIR data and > hijacking IP space, and strongly recommend against processing any > connection originating from or directed to it. >=20 > Since it appears to be loaded with proxies used by miscreants for > abusive purposes, all we can do at the time of writing is to flag it > as "anonymous proxy", but we lack possibility of telling our users > something like "this is not a safe area". The very same goes for known > bulletproof ISPs, IP hijackers, and so forth. >=20 > This patch therefore suggests to populate the "is_drop" flag introduced > in libloc 0.9.8 (albeit currently unused in production) with the > contents of Spamhaus' DROP lists (https://www.spamhaus.org/drop/), to > have at least the baddest of the bad covered. The very same lists are, > in fact, included in popular IPS rulesets as well - a decent amount of > IPFire users is therefore likely to have them already enabled, but in a > very costly way. >=20 > It is not planned to go further, partly because there is no other feed > publicly available, which would come with the same intention, > volatility, and FP rate. >=20 > The second version of this patch makes use of an auxiliary function to > sanitise ASNs, hence avoiding boilerplate code, and treats any line > starting with a semicolon as a comment, which should be sufficient. >=20 > Signed-off-by: Peter M=C3=BCller > --- > src/python/location-importer.in | 104 ++++++++++++++++++++++++++++++++ > 1 file changed, 104 insertions(+) >=20 > diff --git a/src/python/location-importer.in b/src/python/location-importer= .in > index c2b3e41..1d84c15 100644 > --- a/src/python/location-importer.in > +++ b/src/python/location-importer.in > @@ -1075,6 +1075,9 @@ class CLI(object): > # network allocation lists in a machine-readable format... > self._update_overrides_for_aws() >=20 > + # Update overrides for Spamhaus DROP feeds... > + self._update_overrides_for_spamhaus_drop() > + > for file in ns.files: > log.info("Reading %s..." % file) >=20 > @@ -1259,6 +1262,107 @@ class CLI(object): > ) >=20 >=20 > + def _update_overrides_for_spamhaus_drop(self): > + downloader =3D location.importer.Downloader() > + > + ip_urls =3D [ > + "https://www.spamhaus.org/drop/drop.txt", > + "https://www.spamhaus.org/drop/edrop.txt", > + "https://www.spamhaus.org/drop/dropv6.txt" > + ] > + > + asn_urls =3D [ > + "https://www.spamhaus.org/drop/asndrop.txt" > + ] > + > + for url in ip_urls: > + try: > + with downloader.request(url, return_blocks=3DFalse) as f: > + fcontent =3D f.body.readlines() > + except Exception as e: > + log.error("Unable to download Spamhaus DROP URL %s: %s" % (url, e)) > + return > + > + # Iterate through every line, filter comments and add remaining network= s to > + # the override table in case they are valid... > + with self.db.transaction(): > + for sline in fcontent: > + > + # The response is assumed to be encoded in UTF-8... > + sline =3D sline.decode("utf-8") > + > + # Comments start with a semicolon... > + if sline.startswith(";"): > + continue > + > + # Extract network and ignore anything afterwards... > + try: > + network =3D ipaddress.ip_network(sline.split()[0], strict=3DFalse) > + except ValueError: > + log.error("Unable to parse line: %s" % sline) > + continue > + > + # Sanitize parsed networks... > + if not self._check_parsed_network(network): > + log.warning("Skipping bogus network found in Spamhaus DROP URL %s: %= s" % \ > + (url, network)) > + continue > + > + # Conduct SQL statement... > + self.db.execute(""" > + INSERT INTO network_overrides( > + network, > + source, > + is_drop > + ) VALUES (%s, %s, %s) > + ON CONFLICT (network) DO NOTHING""", > + "%s" % network, > + "Spamhaus DROP lists", > + True > + ) > + > + for url in asn_urls: > + try: > + with downloader.request(url, return_blocks=3DFalse) as f: > + fcontent =3D f.body.readlines() > + except Exception as e: > + log.error("Unable to download Spamhaus DROP URL %s: %s" % (url, e)) > + return > + > + # Iterate through every line, filter comments and add remaining ASNs to > + # the override table in case they are valid... > + with self.db.transaction(): > + for sline in fcontent: > + > + # The response is assumed to be encoded in UTF-8... > + sline =3D sline.decode("utf-8") > + > + # Comments start with a semicolon... > + if sline.startswith(";"): > + continue > + > + # Extract ASN and ignore anything afterwards... > + asn =3D int(sline.split()[0].strip("AS")) I kind of like stuff like this being split over multiple lines. Performance i= s the same, but it is easier to understand and debugging lines can be added e= asier. > + > + # Filter invalid ASNs... > + if not self._check_parsed_asn(asn): > + log.warning("Skipping bogus ASN found in Spamhaus DROP URL %s: %s" %= \ > + (url, asn)) > + continue This answers my question from my other email whether it is possible that a no= n-integer is being passed to the check function. It can=E2=80=99t happen her= e. I would prefer to drop the isinstance() check for performance reasons. > + > + # Conduct SQL statement... > + self.db.execute(""" > + INSERT INTO autnum_overrides( > + number, > + source, > + is_drop > + ) VALUES (%s, %s, %s) > + ON CONFLICT (number) DO NOTHING""", > + "%s" % asn, > + "Spamhaus ASN-DROP list", > + True > + ) > + > @staticmethod > def _parse_bool(block, key): > val =3D block.get(key) > --=20 > 2.26.2 --===============3383065798077570127==--