public inbox for location@lists.ipfire.org
* [PATCH v2 1/2] location-importer.in: add source column for overrides as well
@ 2021-06-08  9:55 Peter Müller
  2021-06-08  9:55 ` [PATCH v2 2/2] location-importer.in: import additional IP information for Amazon AWS IP networks Peter Müller
  0 siblings, 1 reply; 2+ messages in thread
From: Peter Müller @ 2021-06-08  9:55 UTC (permalink / raw)
  To: location

[-- Attachment #1: Type: text/plain, Size: 2564 bytes --]

This allows us to track changes introduced by IP feeds from 3rd parties,
such as Amazon AWS, on the SQL server side.

In order not to break existing tables (which would require TRUNCATE),
there currently is no constraint set for the new column, but "NOT NULL"
is planned in the future.
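
To illustrate why the column is added without a constraint, here is a
minimal sketch. It assumes SQLite (via Python's sqlite3 module) as a
stand-in for the PostgreSQL database the importer actually uses, and a
simplified two-column table; neither is part of this patch.

```python
import sqlite3

# Assumption: in-memory SQLite replaces PostgreSQL, and the table is
# reduced to two columns for the sake of the example.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE network_overrides(network TEXT PRIMARY KEY, country TEXT)")

# A row that existed before the schema change.
db.execute(
    "INSERT INTO network_overrides(network, country) VALUES (?, ?)",
    ("192.0.2.0/24", "DE"),
)

# Adding the column without a constraint keeps the old row valid;
# its source is simply NULL.
db.execute("ALTER TABLE network_overrides ADD COLUMN source TEXT")

# Rows written after the change record where they came from.
db.execute(
    "INSERT INTO network_overrides(network, country, source) VALUES (?, ?, ?)",
    ("198.51.100.0/24", "FR", "manual"),
)

rows = db.execute(
    "SELECT network, source FROM network_overrides ORDER BY network"
).fetchall()

# Rows that would have to be refilled before NOT NULL could be enforced.
pending = db.execute(
    "SELECT COUNT(*) FROM network_overrides WHERE source IS NULL"
).fetchone()[0]
```

Rows still counted in "pending" are the ones a later "NOT NULL"
constraint would reject unless the table is refilled first.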

Signed-off-by: Peter Müller <peter.mueller(a)ipfire.org>
---
 src/python/location-importer.in | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/python/location-importer.in b/src/python/location-importer.in
index aa3b8f7..78bfd55 100644
--- a/src/python/location-importer.in
+++ b/src/python/location-importer.in
@@ -183,6 +183,7 @@ class CLI(object):
 				);
 				CREATE UNIQUE INDEX IF NOT EXISTS autnum_overrides_number
 					ON autnum_overrides(number);
+				ALTER TABLE autnum_overrides ADD COLUMN IF NOT EXISTS source text;
 				ALTER TABLE autnum_overrides ADD COLUMN IF NOT EXISTS is_drop boolean;
 
 				CREATE TABLE IF NOT EXISTS network_overrides(
@@ -196,6 +197,7 @@ class CLI(object):
 					ON network_overrides(network);
 				CREATE INDEX IF NOT EXISTS network_overrides_search
 					ON network_overrides USING GIST(network inet_ops);
+				ALTER TABLE network_overrides ADD COLUMN IF NOT EXISTS source text;
 				ALTER TABLE network_overrides ADD COLUMN IF NOT EXISTS is_drop boolean;
 			""")
 
@@ -997,14 +999,16 @@ class CLI(object):
 								INSERT INTO network_overrides(
 									network,
 									country,
+									source,
 									is_anonymous_proxy,
 									is_satellite_provider,
 									is_anycast,
 									is_drop
-								) VALUES (%s, %s, %s, %s, %s, %s)
+								) VALUES (%s, %s, %s, %s, %s, %s, %s)
 								ON CONFLICT (network) DO NOTHING""",
 								"%s" % network,
 								block.get("country"),
+								"manual",
 								self._parse_bool(block, "is-anonymous-proxy"),
 								self._parse_bool(block, "is-satellite-provider"),
 								self._parse_bool(block, "is-anycast"),
@@ -1027,15 +1031,17 @@ class CLI(object):
 									number,
 									name,
 									country,
+									source,
 									is_anonymous_proxy,
 									is_satellite_provider,
 									is_anycast,
 									is_drop
-								) VALUES(%s, %s, %s, %s, %s, %s, %s)
+								) VALUES(%s, %s, %s, %s, %s, %s, %s, %s)
 								ON CONFLICT DO NOTHING""",
 								autnum,
 								block.get("name"),
 								block.get("country"),
+								"manual",
 								self._parse_bool(block, "is-anonymous-proxy"),
 								self._parse_bool(block, "is-satellite-provider"),
 								self._parse_bool(block, "is-anycast"),
-- 
2.20.1



* [PATCH v2 2/2] location-importer.in: import additional IP information for Amazon AWS IP networks
  2021-06-08  9:55 [PATCH v2 1/2] location-importer.in: add source column for overrides as well Peter Müller
@ 2021-06-08  9:55 ` Peter Müller
  0 siblings, 0 replies; 2+ messages in thread
From: Peter Müller @ 2021-06-08  9:55 UTC (permalink / raw)
  To: location

[-- Attachment #1: Type: text/plain, Size: 5263 bytes --]

Amazon publishes information regarding some of their IP networks
primarily used for AWS cloud services in a machine-readable format. To
improve libloc lookup results for these, we have little choice other
than importing and parsing them.
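
As a rough sketch of what parsing such a feed involves: the sample
below mimics the shape of ip-ranges.json ("prefixes" and
"ipv6_prefixes" lists) but uses documentation address ranges and made-up
entries. The helper name iter_networks() is hypothetical and not part
of the importer.

```python
import ipaddress
import json

# Assumption: abbreviated, illustrative payload in the shape of the AWS feed.
SAMPLE = """
{
  "prefixes": [
    {"ip_prefix": "192.0.2.0/24", "region": "eu-central-1", "service": "EC2"},
    {"ip_prefix": "not-a-network", "region": "us-east-1", "service": "EC2"}
  ],
  "ipv6_prefixes": [
    {"ipv6_prefix": "2001:db8::/32", "region": "GLOBAL", "service": "CLOUDFRONT"}
  ]
}
"""

def iter_networks(dump):
    """Yield (network, region) for every parseable prefix in the dump."""
    for entry in dump.get("prefixes", []) + dump.get("ipv6_prefixes", []):
        # IPv4 and IPv6 entries use different key names.
        prefix = entry.get("ip_prefix") or entry.get("ipv6_prefix")
        try:
            network = ipaddress.ip_network(prefix, strict=False)
        except ValueError:
            # Skip entries that do not parse as a network.
            continue
        yield network, entry["region"]

networks = list(iter_networks(json.loads(SAMPLE)))
```

The malformed entry is dropped, so "networks" ends up holding one IPv4
and one IPv6 network together with their region names.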

Unfortunately, there seems to be no machine-readable list of the
locations of their data centers or availability zones available. If
there _is_ any, please let the author know.
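
Without such a list, the region-to-country mapping has to be maintained
by hand. The sketch below condenses the approach into a standalone
function; resolve_region() and the shortened REGION_COUNTRY_MAP are
hypothetical names used for illustration only.

```python
# Assumption: a small excerpt of a hand-maintained mapping.
REGION_COUNTRY_MAP = {
    "eu-central-1": "DE",
    "eu-west-1": "IE",
    "sa-east-1": "BR",
}

def resolve_region(region):
    """Return (country_code, is_anycast), or None if the region is unknown."""
    # Region names carrying a country prefix need no table lookup.
    if region.startswith("us-"):
        return "US", False
    if region.startswith("cn-"):
        return "CN", False

    # "GLOBAL" marks anycast-like networks without a single country.
    if region == "GLOBAL":
        return None, True

    cc = REGION_COUNTRY_MAP.get(region)
    if cc is None:
        # Unknown region: the caller should log a warning and skip it.
        return None
    return cc, False
```

Returning None for unmapped regions lets the caller warn and move on
instead of guessing a country.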

The second version of this patch adds a meaningful description for the
"source" column in the overrides tables, to make the introduced changes
more transparent.

Fixes: #12594

Signed-off-by: Peter Müller <peter.mueller(a)ipfire.org>
---
 src/python/location-importer.in | 114 ++++++++++++++++++++++++++++++++
 1 file changed, 114 insertions(+)

diff --git a/src/python/location-importer.in b/src/python/location-importer.in
index 78bfd55..4acd972 100644
--- a/src/python/location-importer.in
+++ b/src/python/location-importer.in
@@ -19,6 +19,7 @@
 
 import argparse
 import ipaddress
+import json
 import logging
 import math
 import re
@@ -976,6 +977,10 @@ class CLI(object):
 				TRUNCATE TABLE network_overrides;
 			""")
 
+			# Update overrides for various cloud providers big enough to publish their own IP
+			# network allocation lists in a machine-readable format...
+			self._update_overrides_for_aws()
+
 			for file in ns.files:
 				log.info("Reading %s..." % file)
 
@@ -1051,6 +1056,115 @@ class CLI(object):
 						else:
 							log.warning("Unsupported type: %s" % type)
 
+	def _update_overrides_for_aws(self):
+		# Download Amazon AWS IP allocation file to create overrides...
+		downloader = location.importer.Downloader()
+
+		try:
+			with downloader.request("https://ip-ranges.amazonaws.com/ip-ranges.json", return_blocks=False) as f:
+				aws_ip_dump = json.load(f.body)
+		except Exception as e:
+			log.error("unable to preprocess Amazon AWS IP ranges: %s" % e)
+			return
+
+		# XXX: Set up a dictionary for mapping a region name to a country. Unfortunately,
+		# there seems to be no machine-readable version available of this other than
+		# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html
+		# (worse, it seems to be incomplete :-/ ); https://www.cloudping.cloud/endpoints
+		# was helpful here as well.
+		aws_region_country_map = {
+				"af-south-1": "ZA",
+				"ap-east-1": "HK",
+				"ap-south-1": "IN",
+				"ap-south-2": "IN",
+				"ap-northeast-3": "JP",
+				"ap-northeast-2": "KR",
+				"ap-southeast-1": "SG",
+				"ap-southeast-2": "AU",
+				"ap-southeast-3": "ID",
+				"ap-southeast-4": "AU",
+				"ap-northeast-1": "JP",
+				"ca-central-1": "CA",
+				"eu-central-1": "DE",
+				"eu-central-2": "CH",
+				"eu-west-1": "IE",
+				"eu-west-2": "GB",
+				"eu-south-1": "IT",
+				"eu-south-2": "ES",
+				"eu-west-3": "FR",
+				"eu-north-1": "SE",
+				"me-central-1": "AE",
+				"me-south-1": "BH",
+				"sa-east-1": "BR"
+				}
+
+		# Fetch all valid country codes to check parsed networks against...
+		rows = self.db.query("SELECT * FROM countries ORDER BY country_code")
+		validcountries = []
+
+		for row in rows:
+			validcountries.append(row.country_code)
+
+		with self.db.transaction():
+			for snetwork in aws_ip_dump["prefixes"] + aws_ip_dump["ipv6_prefixes"]:
+				try:
+					network = ipaddress.ip_network(snetwork.get("ip_prefix") or snetwork.get("ipv6_prefix"), strict=False)
+				except ValueError:
+					log.warning("Unable to parse line: %s" % snetwork)
+					continue
+
+				# Sanitize parsed networks...
+				if not self._check_parsed_network(network):
+					continue
+
+				# Determine region of this network...
+				region = snetwork["region"]
+				cc = None
+				is_anycast = False
+
+				# Any region name starting with "us-" will get "US" country code assigned straight away...
+				if region.startswith("us-"):
+					cc = "US"
+				elif region.startswith("cn-"):
+					# ... same goes for China ...
+					cc = "CN"
+				elif region == "GLOBAL":
+					# ... funny region name for anycast-like networks ...
+					is_anycast = True
+				elif region in aws_region_country_map:
+					# ... assign looked up country code otherwise ...
+					cc = aws_region_country_map[region]
+				else:
+					# ... and bail out if we are missing something here
+					log.warning("Unable to determine country code for line: %s" % snetwork)
+					continue
+
+				# Skip networks with unknown country codes
+				if not is_anycast and validcountries and cc not in validcountries:
+					log.warning("Skipping Amazon AWS network with bogus country '%s': %s" % \
+						(cc, network))
+					continue
+
+				# Conduct SQL statement...
+				self.db.execute("""
+					INSERT INTO network_overrides(
+						network,
+						country,
+						source,
+						is_anonymous_proxy,
+						is_satellite_provider,
+						is_anycast
+					) VALUES (%s, %s, %s, %s, %s, %s)
+					ON CONFLICT (network) DO NOTHING""",
+					"%s" % network,
+					cc,
+					"Amazon AWS IP feed",
+					None,
+					None,
+					is_anycast,
+				)
+
+
 	@staticmethod
 	def _parse_bool(block, key):
 		val = block.get(key)
-- 
2.20.1

