From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Müller
To: location@lists.ipfire.org
Subject: Re: [PATCH] location-importer: Fix Spamhaus ASN-DROP parsing
Date: Sun, 05 Nov 2023 16:59:00 +0000
Message-ID: <44885c47-4ca1-4e04-807e-aca00b6f6fc1@ipfire.org>
In-Reply-To: <2ABB0AF4-CC9C-4CEA-A389-2452CC3DE685@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============3389787531626040150=="
List-Id:

--===============3389787531626040150==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hello Michael,

thank you for your reply.

> Hello Peter,
>
>> On 4 Nov 2023, at 14:04, Peter Müller wrote:
>>
>> The format of this list has changed, from a plain text file with a
>> custom schema to JSON. Adjust our routines accordingly to make use of
>> this list again.
>>
>> Signed-off-by: Peter Müller
>> Tested-by: Peter Müller
>> ---
>>  src/scripts/location-importer.in | 21 +++++++--------------
>>  1 file changed, 7 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/scripts/location-importer.in b/src/scripts/location-importer.in
>> index 28a4f6c..8b0c676 100644
>> --- a/src/scripts/location-importer.in
>> +++ b/src/scripts/location-importer.in
>> @@ -3,7 +3,7 @@
>>  #                                                                         #
>>  # libloc - A library to determine the location of someone on the Internet #
>>  #                                                                         #
>> -# Copyright (C) 2020-2022 IPFire Development Team                         #
>> +# Copyright (C) 2020-2023 IPFire Development Team                         #
>>  #                                                                         #
>>  # This library is free software; you can redistribute it and/or           #
>>  # modify it under the terms of the GNU Lesser General Public              #
>> @@ -1686,7 +1686,7 @@ class CLI(object):
>>  		]
>>
>>  		asn_lists = [
>> -			("SPAMHAUS-ASNDROP", "https://www.spamhaus.org/drop/asndrop.txt")
>> +			("SPAMHAUS-ASNDROP", "https://www.spamhaus.org/drop/asndrop.json")
>>  		]
>>
>>  		for name, url in ip_lists:
>> @@ -1759,23 +1759,16 @@ class CLI(object):
>>
>>  			# Iterate through every line, filter comments and add remaining ASNs to
>>  			# the override table in case they are valid...
>> -			for sline in f.readlines():
>> +			for sline in fcontent:
>>  				# The response is assumed to be encoded in UTF-8...
>>  				sline = sline.decode("utf-8")
>>
>> -				# Comments start with a semicolon...
>> -				if sline.startswith(";"):
>> +				# Load every line as a JSON object and try to obtain an ASN from it...
>> +				try:
>> +					asn = json.loads(sline)["asn"]
>> +				except KeyError:
>>  					continue
>
> It would be nicer if you didn't do so many things in one line, because it will get difficult to catch the correct exception.
>
> I believe that this should be split into the json.loads() operation, where you should catch any JSON decoding issues. This would make it clear that some kind of download issue or similar has happened.
>
> Fetching the "asn" field should be a second step, because an error here means that you have received a valid JSON object, but the format has changed.

Yes, that makes sense. I'll submit a second version of this patch.

Separating the json.loads() operation from the extraction of listed ASNs has the additional benefit of making the AS's name available as well. I was thinking that we might want to update the autnums table accordingly, if it does not contain the AS already.

This would allow us to have names for ASN-DROP'ed ASNs in the LACNIC space. However, there are only two listed at the moment, so I'm not sure if this is worth the effort. But then, having something is better than having nothing at all. :-)

What do you think?

Thanks, and best regards,
Peter Müller

>
>> -				# Throw away anything after the first space...
>> -				sline = sline.split()[0]
>> -
>> -				# ... strip the "AS" prefix from it ...
>> -				sline = sline.strip("AS")
>> -
>> -				# ... and convert it into an integer. Voila.
>> -				asn = int(sline)
>> -
>>  				# Filter invalid ASNs...
>>  				if not self._check_parsed_asn(asn):
>>  					log.warning("Skipping bogus ASN found in %s (%s): %s" % \
>> --
>> 2.35.3
>

--===============3389787531626040150==--
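[Editor's note: the two-step parsing discussed in the thread above could be sketched roughly as follows. This is only an illustration of the suggested split, not the actual v2 patch; the "asname" field and the sample input lines are assumptions, and error handling details may differ in the real importer.]

```python
import json


def parse_asndrop_lines(lines):
	"""Parse Spamhaus ASN-DROP JSON lines into (asn, name) tuples.

	As suggested in the review, decoding failures and missing fields
	are handled separately: a JSON decoding error points at a download
	problem, while a missing "asn" key means the format has changed.
	"""
	for sline in lines:
		# The response is assumed to be encoded in UTF-8...
		sline = sline.decode("utf-8")

		# Step 1: parse the line as JSON. A failure here suggests a
		# download issue or similar.
		try:
			obj = json.loads(sline)
		except json.JSONDecodeError:
			continue

		# Step 2: fetch the "asn" field. A failure here means we got
		# valid JSON whose format has changed.
		try:
			asn = obj["asn"]
		except KeyError:
			continue

		# Splitting the steps also makes the AS name available, e.g.
		# for updating the autnums table ("asname" is a hypothetical
		# field name here).
		yield asn, obj.get("asname")


# Hypothetical sample input: one valid entry, one garbled line, one
# JSON object without an "asn" field.
lines = [
	b'{"asn": 12345, "asname": "EXAMPLE-AS"}',
	b'not json at all',
	b'{"type": "metadata"}',
]
print(list(parse_asndrop_lines(lines)))
# prints [(12345, 'EXAMPLE-AS')]
```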