Hello Peter,
I merged this patch, but I wanted to give you some guidance on how I would have written the Python code.
I am not saying my way is the better way. I just find it a lot easier to read.
On 11 Dec 2021, at 21:59, Peter Müller peter.mueller@ipfire.org wrote:
This improves country code accurarcy for suballocations within IP space managed by LACNIC, as the delegated-extended-latest file only provides country code information at the top level of an allocated network.
Sadly, lacnic.db.gz does not contain descriptions or names of Autonomous Systems within the space maintained by LACNIC.
Signed-off-by: Peter Müller peter.mueller@ipfire.org
src/python/importer.py | 4 ++- src/python/location-importer.in | 43 +++++++++++++++++++++++++++------ 2 files changed, 39 insertions(+), 8 deletions(-)
diff --git a/src/python/importer.py b/src/python/importer.py index de340b4..dee36ed 100644 --- a/src/python/importer.py +++ b/src/python/importer.py @@ -53,7 +53,9 @@ WHOIS_SOURCES = { ],
# Latin America and Caribbean Network Information Centre
- # XXX ???
"LACNIC": [
"https://ftp.lacnic.net/lacnic/dbase/lacnic.db.gz"
],
# Réseaux IP Européens "RIPE": [
diff --git a/src/python/location-importer.in b/src/python/location-importer.in index b791b4d..78bb846 100644 --- a/src/python/location-importer.in +++ b/src/python/location-importer.in @@ -681,13 +681,42 @@ class CLI(object): # Strip any excess space start_address, end_address = start_address.rstrip(), end_address.strip()
# Convert to IP address
try:
start_address = ipaddress.ip_address(start_address)
end_address = ipaddress.ip_address(end_address)
except ValueError:
log.warning("Could not parse line: %s" % line)
return
# Handle "inetnum" formatting in LACNIC DB (e.g. "24.152.8/22" instead of "24.152.8.0/22")
I would have checked in the beginning whether the string contains “/“ and then branched off to somewhere that handles this.
That would have been more intuitive to read than the following if statement:
if start_address and not (delim or end_address):
try:
start_address = ipaddress.ip_network(start_address, strict=False)
except ValueError:
start_address = start_address.split("/")
ldigits = len(start_address[0].split("."))
You can use start_address[0].count(“.”) here which is shorter and easier to read. Hopefully it is implemented faster, too.
# How many octets do we need to add?
# (LACNIC does not seem to have a /8 or greater assigned, so the following should suffice.)
if ldigits == 2:
start_address = start_address[0] + ".0.0/" + start_address[1]
elif ldigits == 3:
start_address = start_address[0] + ".0/" + start_address[1]
else:
log.warning("Could not recover IPv4 address from line in LACNIC DB format: %s" % line)
return
I would consider these inputs (e.g. 24.152.8/22) valid IP addresses. They are not very popular, confusing and very easy to get wrong, but inet_ntop should parse this correctly.
However, since we are in this mess already, maybe a regular expression or substitution would have been more simple to write.
In the end, this solution seems to work. The reason why I am going on about this is that we run through this millions of times. Even adding an extra if statement is becoming a performance penalty which I would like to avoid as much as possible. The other factor is that code should be easy to audit and that is why I would like to pass on my patterns because I believe they are good :)
-Michael
try:
start_address = ipaddress.ip_network(start_address, strict=False)
except ValueError:
log.warning("Could not parse line in LACNIC DB format: %s" % line)
return
# Enumerate first and last IP address of this network
end_address = start_address[-1]
start_address = start_address[0]
else:
# Convert to IP address
try:
start_address = ipaddress.ip_address(start_address)
end_address = ipaddress.ip_address(end_address)
except ValueError:
log.warning("Could not parse line: %s" % line)
return inetnum["inetnum"] = list(ipaddress.summarize_address_range(start_address, end_address))
-- 2.26.2