From mboxrd@z Thu Jan 1 00:00:00 1970 From: Michael Tremer To: location@lists.ipfire.org Subject: Re: [PATCH] Process LACNIC geofeed as well Date: Mon, 13 Dec 2021 18:54:07 +0000 Message-ID: In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0268454290635461924==" List-Id: --===============0268454290635461924== Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Hello Peter, I merged this patch, but I wanted to give you some guidance on how I would ha= ve written the Python code. I am not saying my way is the better way. I just find it a lot easier to read. > On 11 Dec 2021, at 21:59, Peter M=C3=BCller wr= ote: >=20 > This improves country code accurarcy for suballocations within IP space > managed by LACNIC, as the delegated-extended-latest file only provides > country code information at the top level of an allocated network. >=20 > Sadly, lacnic.db.gz does not contain descriptions or names of Autonomous > Systems within the space maintained by LACNIC. >=20 > Signed-off-by: Peter M=C3=BCller > --- > src/python/importer.py | 4 ++- > src/python/location-importer.in | 43 +++++++++++++++++++++++++++------ > 2 files changed, 39 insertions(+), 8 deletions(-) >=20 > diff --git a/src/python/importer.py b/src/python/importer.py > index de340b4..dee36ed 100644 > --- a/src/python/importer.py > +++ b/src/python/importer.py > @@ -53,7 +53,9 @@ WHOIS_SOURCES =3D { > ], >=20 > # Latin America and Caribbean Network Information Centre > - # XXX ??? > + "LACNIC": [ > + "https://ftp.lacnic.net/lacnic/dbase/lacnic.db.gz" > + ], >=20 > # R=C3=A9seaux IP Europ=C3=A9ens > "RIPE": [ > diff --git a/src/python/location-importer.in b/src/python/location-importer= .in > index b791b4d..78bb846 100644 > --- a/src/python/location-importer.in > +++ b/src/python/location-importer.in > @@ -681,13 +681,42 @@ class CLI(object): > # Strip any excess space > start_address, end_address =3D start_address.rstrip(), end_address.stri= p() >=20 > - # Convert to IP address > - try: > - start_address =3D ipaddress.ip_address(start_address) > - end_address =3D ipaddress.ip_address(end_address) > - except ValueError: > - log.warning("Could not parse line: %s" % line) > - return > + # Handle "inetnum" formatting in LACNIC DB (e.g. "24.152.8/22" instead= of "24.152.8.0/22") I would have checked in the beginning whether the string contains =E2=80=9C/= =E2=80=9C and then branched off to somewhere that handles this. That would have been more intuitive to read than the following if statement: > + if start_address and not (delim or end_address): > + try: > + start_address =3D ipaddress.ip_network(start_address, strict=3DFalse) > + except ValueError: > + start_address =3D start_address.split("/") > + ldigits =3D len(start_address[0].split(".")) You can use start_address[0].count(=E2=80=9C.=E2=80=9D) here which is shorter= and easier to read. Hopefully it is implemented faster, too. > + # How many octets do we need to add? > + # (LACNIC does not seem to have a /8 or greater assigned, so the fol= lowing should suffice.) > + if ldigits =3D=3D 2: > + start_address =3D start_address[0] + ".0.0/" + start_address[1] > + elif ldigits =3D=3D 3: > + start_address =3D start_address[0] + ".0/" + start_address[1] > + else: > + log.warning("Could not recover IPv4 address from line in LACNIC DB = format: %s" % line) > + return I would consider these inputs (e.g. 24.152.8/22) valid IP addresses. They are= not very popular, confusing and very easy to get wrong, but inet_ntop should= parse this correctly. However, since we are in this mess already, maybe a regular expression or sub= stitution would have been more simple to write. In the end, this solution seems to work. The reason why I am going on about t= his is that we run through this millions of times. Even adding an extra if st= atement is becoming a performance penalty which I would like to avoid as much= as possible. The other factor is that code should be easy to audit and that = is why I would like to pass on my patterns because I believe they are good :) -Michael > + > + try: > + start_address =3D ipaddress.ip_network(start_address, strict=3DFals= e) > + except ValueError: > + log.warning("Could not parse line in LACNIC DB format: %s" % line) > + return > + > + # Enumerate first and last IP address of this network > + end_address =3D start_address[-1] > + start_address =3D start_address[0] > + > + else: > + # Convert to IP address > + try: > + start_address =3D ipaddress.ip_address(start_address) > + end_address =3D ipaddress.ip_address(end_address) > + except ValueError: > + log.warning("Could not parse line: %s" % line) > + return >=20 > inetnum["inetnum"] =3D list(ipaddress.summarize_address_range(start_add= ress, end_address)) >=20 > --=20 > 2.26.2 --===============0268454290635461924==--