Hello,

> On 8 Jun 2021, at 18:03, Peter Müller wrote:
>
> ARIN and LACNIC, unfortunately, do not seem to publish data containing
> human readable AS names. For the former, we at least have a list of
> technical names, which this patch fetches and inserts into the autnums
> table.
>
> While some of them do not seem to be suitable for human consumption
> (i. e. being very cryptic), providing these data might be helpful
> nevertheless.
>
> The second version of this patch contains some additional remarks on
> efficient Python coding style from Michael, doing things more "pythonic".
>
> Signed-off-by: Peter Müller
> ---
> src/python/location-importer.in | 55 +++++++++++++++++++++++++++++++++
> 1 file changed, 55 insertions(+)
>
> diff --git a/src/python/location-importer.in b/src/python/location-importer.in
> index aa3b8f7..6ccee3b 100644
> --- a/src/python/location-importer.in
> +++ b/src/python/location-importer.in
> @@ -505,6 +505,9 @@ class CLI(object):
> 				for line in f:
> 					self._parse_line(line, source_key, validcountries)
>
> +		# Download and import (technical) AS names from ARIN
> +		self._import_as_names_from_arin()
> +
> 	def _check_parsed_network(self, network):
> 		"""
> 			Assistive function to detect and subsequently sort out parsed
> @@ -775,6 +778,58 @@ class CLI(object):
> 			"%s" % network, country, [country], source_key,
> 		)
>
> +	def _import_as_names_from_arin(self):
> +		downloader = location.importer.Downloader()
> +
> +		# XXX: Download AS names file from ARIN (note that these names appear to be quite
> +		# technical, not intended for human consumption, as description fields in
> +		# organisation handles for other RIRs are - however, this is what we have got,
> +		# and in some cases, it might be still better than nothing)
> +		with downloader.request("https://ftp.arin.net/info/asn.txt", return_blocks=False) as f:
> +			for line in f:
> +				# Convert binary line to string...
> +				line = str(line)
> +
> +				# ... valid lines start with a space, followed by the number of the Autonomous System ...
> +				if not line.startswith(" "):
> +					continue
> +
> +				# Split line and check if there is a valid ASN in it...
> +				asn, name = line.split()[0:2]
> +
> +				try:
> +					asn = int(asn)
> +				except ValueError:
> +					log.debug("Skipping ARIN AS names line not containing an integer for ASN")
> +					continue
> +
> +				if not ((1 <= asn and asn <= 23455) or (23457 <= asn and asn <= 64495) or (131072 <= asn and asn <= 4199999999)):
> +					log.debug("Skipping ARIN AS names line not containing a valid ASN: %s" % asn)
> +					continue
> +
> +				# Skip any AS name that appears to be a placeholder for a different RIR or entity...
> +				if re.match(r"^(ASN-BLK|)(AFCONC|AFRINIC|APNIC|ASNBLK|DNIC|LACNIC|RIPE|IANA)(\d?$|\-.*)", name):
> +					continue

This is still not entirely optimal. It doesn’t matter too much, so I will merge it, but…

* You added a capturing group which you do not need, so you could have written (?:…) instead of (…).

* \-.* matches a literal dash and then anything after it. You do not care about what comes after the dash, so \- alone would have been enough. It would have saved a couple of CPU cycles because the regex engine would not have to read the entire rest of the string.

> +
> +				# Bail out in case the AS name contains anything we do not expect here...
> +				if re.search(r"[^a-zA-Z0-9-_]", name):
> +					log.debug("Skipping ARIN AS name for %s containing invalid characters: %s" % \
> +						(asn, name))
> +
> +				# Things look good here, run INSERT statement and skip this one if we already have
> +				# a (better?) name for this Autonomous System...
> +				self.db.execute("""
> +					INSERT INTO autnums(
> +						number,
> +						name,
> +						source
> +					) VALUES (%s, %s, %s)
> +					ON CONFLICT (number) DO NOTHING""",
> +					asn,
> +					name,
> +					"ARIN",
> +				)
> +
> 	def handle_update_announcements(self, ns):
> 		server = ns.server[0]
>
> --
> 2.20.1
>

-Michael
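
For illustration, the two regex remarks above could be checked with a small sketch like the following. It compares the pattern from the patch with one possible simplified form that uses non-capturing groups and stops right after the dash. The sample names and the helper is_placeholder() are made up for this example and are not taken from the ARIN file or from the importer.

```python
import re

# Pattern as submitted in the patch: three capturing groups and a trailing ".*".
ORIGINAL = re.compile(
    r"^(ASN-BLK|)(AFCONC|AFRINIC|APNIC|ASNBLK|DNIC|LACNIC|RIPE|IANA)(\d?$|\-.*)"
)

# One possible simplified form (an assumption, not the merged code):
# non-capturing groups, and the match stops right after the dash.
SIMPLIFIED = re.compile(
    r"^(?:ASN-BLK|)(?:AFCONC|AFRINIC|APNIC|ASNBLK|DNIC|LACNIC|RIPE|IANA)(?:\d?$|\-)"
)

# Hypothetical AS names, invented for this comparison only.
SAMPLES = [
    "RIPE",
    "APNIC1",
    "LACNIC-1234",
    "ASN-BLKIANA-RSVD",
    "RIPENCC",
    "EXAMPLE-AS",
    "IANA12",
]


def is_placeholder(pattern, name):
    """Return True if the name looks like a placeholder for another RIR."""
    return pattern.match(name) is not None


if __name__ == "__main__":
    for name in SAMPLES:
        old = is_placeholder(ORIGINAL, name)
        new = is_placeholder(SIMPLIFIED, name)
        print("%-20s original=%s simplified=%s" % (name, old, new))
```

Since re.match() only anchors at the start of the string, dropping the trailing .* does not change which names are skipped; it only spares the engine from consuming the rest of the name.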