From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Müller
To: location@lists.ipfire.org
Subject: Re: [PATCH] location-importer: Fix Spamhaus ASN-DROP parsing
Date: Sun, 05 Nov 2023 16:59:00 +0000
Message-ID: <44885c47-4ca1-4e04-807e-aca00b6f6fc1@ipfire.org>
In-Reply-To: <2ABB0AF4-CC9C-4CEA-A389-2452CC3DE685@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============3389787531626040150=="
List-Id:

--===============3389787531626040150==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hello Michael,

thank you for your reply.

> Hello Peter,
>
>> On 4 Nov 2023, at 14:04, Peter Müller wrote:
>>
>> The format of this list has changed, from a plain text file with a
>> custom schema to JSON. Adjust our routines accordingly to make use of
>> this list again.
>>
>> Signed-off-by: Peter Müller
>> Tested-by: Peter Müller
>> ---
>>  src/scripts/location-importer.in | 21 +++++++--------------
>>  1 file changed, 7 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/scripts/location-importer.in b/src/scripts/location-importer.in
>> index 28a4f6c..8b0c676 100644
>> --- a/src/scripts/location-importer.in
>> +++ b/src/scripts/location-importer.in
>> @@ -3,7 +3,7 @@
>>  #                                                                         #
>>  # libloc - A library to determine the location of someone on the Internet #
>>  #                                                                         #
>> -# Copyright (C) 2020-2022 IPFire Development Team                         #
>> +# Copyright (C) 2020-2023 IPFire Development Team                         #
>>  #                                                                         #
>>  # This library is free software; you can redistribute it and/or           #
>>  # modify it under the terms of the GNU Lesser General Public              #
>> @@ -1686,7 +1686,7 @@ class CLI(object):
>>  		]
>>
>>  		asn_lists = [
>> -			("SPAMHAUS-ASNDROP", "https://www.spamhaus.org/drop/asndrop.txt")
>> +			("SPAMHAUS-ASNDROP", "https://www.spamhaus.org/drop/asndrop.json")
>>  		]
>>
>>  		for name, url in ip_lists:
>> @@ -1759,23 +1759,16 @@ class CLI(object):
>>
>>  			# Iterate through every line, filter comments and add remaining ASNs to
>>  			# the override table in case they are valid...
>> -			for sline in f.readlines():
>> +			for sline in fcontent:
>>  				# The response is assumed to be encoded in UTF-8...
>>  				sline = sline.decode("utf-8")
>>
>> -				# Comments start with a semicolon...
>> -				if sline.startswith(";"):
>> +				# Load every line as a JSON object and try to obtain an ASN from it...
>> +				try:
>> +					asn = json.loads(sline)["asn"]
>> +				except KeyError:
>>  					continue
>
> It would be nicer if you didn't do so many things in one line, because it will get difficult to catch the correct exception.
>
> I believe that this should be split into the json.loads() operation, where you should catch any JSON decoding issues. This would make it clear that some kind of download issue or similar has happened.
>
> Fetching the "asn" field should be a second step, because an error here means that you have received a valid JSON object, but the format has changed.

Yes, that makes sense. I'll submit a second version of this patch.

Separating the json.loads() operation from the extraction of listed ASNs has the additional benefit of making the AS's name available as well. I was thinking that we might want to update the autnums table accordingly, if it does not contain the AS already.

This would allow us to have names for ASN-DROP'ed ASNs in the LACNIC space. However, there are only two listed at the moment, so I'm not sure if this is worth the effort. But then, having something is better than having nothing at all. :-)

What do you think?

Thanks, and best regards,
Peter Müller

>
>> -				# Throw away anything after the first space...
>> -				sline = sline.split()[0]
>> -
>> -				# ... strip the "AS" prefix from it ...
>> -				sline = sline.strip("AS")
>> -
>> -				# ... and convert it into an integer. Voila.
>> -				asn = int(sline)
>> -
>>  				# Filter invalid ASNs...
>>  				if not self._check_parsed_asn(asn):
>>  					log.warning("Skipping bogus ASN found in %s (%s): %s" % \
>> --
>> 2.35.3
>

--===============3389787531626040150==--
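[Editor's note: the two-step parsing discussed in the thread above could be sketched roughly as follows. This is only an illustration of the suggested split, not the actual v2 patch; the "asname" field and the sample input lines are assumptions, and error handling details may differ in the real importer.]

```python
import json


def parse_asndrop_lines(lines):
	"""Parse Spamhaus ASN-DROP JSON lines into (asn, name) tuples.

	As suggested in the review, decoding failures and missing fields
	are handled separately: a JSON decoding error points at a download
	problem, while a missing "asn" key means the format has changed.
	"""
	for sline in lines:
		# The response is assumed to be encoded in UTF-8...
		sline = sline.decode("utf-8")

		# Step 1: parse the line as JSON. A failure here suggests a
		# download issue or similar.
		try:
			obj = json.loads(sline)
		except json.JSONDecodeError:
			continue

		# Step 2: fetch the "asn" field. A failure here means we got
		# valid JSON whose format has changed.
		try:
			asn = obj["asn"]
		except KeyError:
			continue

		# Splitting the steps also makes the AS name available, e.g.
		# for updating the autnums table ("asname" is a hypothetical
		# field name here).
		yield asn, obj.get("asname")


# Hypothetical sample input: one valid entry, one garbled line, one
# JSON object without an "asn" field.
lines = [
	b'{"asn": 12345, "asname": "EXAMPLE-AS"}',
	b'not json at all',
	b'{"type": "metadata"}',
]
print(list(parse_asndrop_lines(lines)))
# prints [(12345, 'EXAMPLE-AS')]
```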