From: "Peter Müller" <peter.mueller@ipfire.org>
To: location@lists.ipfire.org
Subject: [PATCH 4/8] location-importer.in: filter bogus IP networks for both Whois and extended sources
Date: Wed, 21 Oct 2020 14:47:39 +0000
Message-ID: <20201021144743.18083-4-peter.mueller@ipfire.org>
In-Reply-To: <20201021144743.18083-1-peter.mueller@ipfire.org>
Sanity checks for parsed networks have been put into a separate function
to avoid boilerplate code for extended sources. This makes the location
database less vulnerable to garbage written into RIR databases on
purpose or by chance.
Fixes: #12500
Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
---
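For reviewers who want to try the rules outside the importer, here is a
minimal standalone sketch of the same checks. The function name
is_importable and the two lookup tables are hypothetical and not part of
this patch; the prefix limits are copied from _check_parsed_network() below,
and logging is left out.

```python
import ipaddress

# Hypothetical lookup tables; the limits mirror those used in the patch.
MIN_PREFIX = {4: 7, 6: 10}    # anything shorter covers too much address space
MAX_PREFIX = {4: 24, 6: 48}   # anything longer is too small to be announced


def is_importable(network):
    """Return True if the network passes the sanity checks, False otherwise."""
    if not isinstance(network, (ipaddress.IPv4Network, ipaddress.IPv6Network)):
        return False

    # Reject RFC 1918 space and other non-globally routable ranges
    if not network.is_global:
        return False

    # Reject networks that are too large or too small
    if network.prefixlen < MIN_PREFIX[network.version]:
        return False
    if network.prefixlen > MAX_PREFIX[network.version]:
        return False

    # Reject networks based on 0.0.0.0 or ::
    if int(network.network_address) == 0:
        return False

    return True


if __name__ == "__main__":
    for cidr in ("8.8.8.0/24", "8.8.8.128/25", "10.0.0.0/8", "2001:4860::/32", "::/0"):
        print(cidr, is_importable(ipaddress.ip_network(cidr, strict=False)))
```

Of the sample networks, only 8.8.8.0/24 and 2001:4860::/32 should pass.
8.8.8.128/25 is too small, 10.0.0.0/8 is RFC 1918 space and ::/0 is too
large.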
src/python/location-importer.in | 83 ++++++++++++++++++++++++++-------
1 file changed, 67 insertions(+), 16 deletions(-)
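Note that the hunk at -549 replaces the old "network.is_private" test with
"not network.is_global". As far as I can tell from the Python ipaddress
module, the two are not exact opposites: shared address space
(100.64.0.0/10, RFC 6598) is reported as neither private nor global, so the
old check would have let it through. A quick way to see the difference,
with sample networks picked only for illustration:

```python
import ipaddress

public = ipaddress.ip_network("8.8.8.0/24")
rfc1918 = ipaddress.ip_network("10.0.0.0/8")
shared = ipaddress.ip_network("100.64.0.0/10")

# Print both properties side by side for each sample network
for net in (public, rfc1918, shared):
    print(net, "is_private=%s" % net.is_private, "is_global=%s" % net.is_global)
```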
diff --git a/src/python/location-importer.in b/src/python/location-importer.in
index d249a35..20eb052 100644
--- a/src/python/location-importer.in
+++ b/src/python/location-importer.in
@@ -459,6 +459,69 @@ class CLI(object):
for line in f:
self._parse_line(line)
+	def _check_parsed_network(self, network):
+		"""
+			Assistive function to detect and subsequently sort out parsed
+			networks from RIR data (both Whois and so-called "extended sources"),
+			which are or have...
+
+			(a) not globally routable (RFC 1918 space, et al.)
+			(b) covering a too large chunk of the IP address space (prefix length
+				is < 7 for IPv4 networks, and < 10 for IPv6)
+			(c) "0.0.0.0" or "::" as a network address
+			(d) are too small for being publicly announced (we have decided not to
+				process them at the moment, as they significantly enlarge our
+				database without providing very helpful additional information)
+
+			This unfortunately is necessary due to brain-dead clutter across
+			various RIR databases, causing mismatches and eventually disruptions.
+
+			We will return False in case a network is not suitable for adding
+			it to our database, and True otherwise.
+		"""
+
+		if not network or not (isinstance(network, ipaddress.IPv4Network) or isinstance(network, ipaddress.IPv6Network)):
+			return False
+
+		if not network.is_global:
+			logging.warning("Skipping non-globally routable network: %s" % network)
+			return False
+
+		if network.version == 4:
+			if network.prefixlen < 7:
+				logging.warning("Skipping too big IP chunk: %s" % network)
+				return False
+
+			if network.prefixlen > 24:
+				logging.info("Skipping network too small to be publicly announced: %s" % network)
+				return False
+
+			if str(network.network_address) == "0.0.0.0":
+				logging.warning("Skipping network based on 0.0.0.0: %s" % network)
+				return False
+
+		elif network.version == 6:
+			if network.prefixlen < 10:
+				logging.warning("Skipping too big IP chunk: %s" % network)
+				return False
+
+			if network.prefixlen > 48:
+				logging.info("Skipping network too small to be publicly announced: %s" % network)
+				return False
+
+			if str(network.network_address) == "::":
+				logging.warning("Skipping network based on '::': %s" % network)
+				return False
+
+		else:
+			# This should not happen...
+			logging.warning("Skipping network of unknown family, this should not happen: %s" % network)
+			return False
+
+		# In case we have made it here, the network is considered to
+		# be suitable for libloc consumption...
+		return True
+
def _parse_block(self, block):
# Get first line to find out what type of block this is
line = block[0]
@@ -549,22 +612,7 @@ class CLI(object):
network = ipaddress.ip_network(inetnum.get("inet6num") or inetnum.get("inetnum"), strict=False)
- # Bail out in case we have processed a network covering the entire IP range, which
- # is necessary to work around faulty (?) IPv6 network processing
- if network.prefixlen == 0:
- logging.warning("Skipping network covering the entire IP adress range: %s" % network)
- return
-
- # Bail out in case we have processed a network whose prefix length indicates it is
- # not globally routable (we have decided not to process them at the moment, as they
- # significantly enlarge our database without providing very helpful additional information)
- if (network.prefixlen > 24 and network.version == 4) or (network.prefixlen > 48 and network.version == 6):
- logging.info("Skipping network too small to be publicly announced: %s" % network)
- return
-
- # Bail out in case we have processed a non-public IP network
- if network.is_private:
- logging.warning("Skipping non-globally routable network: %s" % network)
+ if not self._check_parsed_network(network):
return
self.db.execute("INSERT INTO _rirdata(network, country) \
@@ -648,6 +696,9 @@ class CLI(object):
log.warning("Invalid IP address: %s" % address)
return
+ if not self._check_parsed_network(network):
+ return
+
self.db.execute("INSERT INTO networks(network, country) \
VALUES(%s, %s) ON CONFLICT (network) DO \
UPDATE SET country = excluded.country",
--
2.20.1