Hi,
I'm an indirect user of ipfire's IP -> ASN and ASN -> AS name data, since the torproject switched from an outdated maxmind db to ipfire's database on 2021-06-18.
When the switch from maxmind to ipfire's db occurred, I observed a significant increase in ASes without an AS name.
I did read the recent threads on this mailing list about "Import (technical) AS names from ARIN", which I assume should improve or entirely solve the situation.
Is my understanding correct that your recent work [1] on adding ARIN's AS name [2] data has been merged?
[1] https://lists.ipfire.org/pipermail/location/2021-June/000394.html
[2] https://ftp.arin.net/info/asn.txt
As a test I manually verified AS14061
14061 DIGITALOCEAN-ASN
and was unable to find the AS name when using ipfire's db.
example lookup: https://location.ipfire.org/lookup/134.209.159.74
Should I already see ARIN's asn.txt AS name information or will this be "visible" in ipfire's db in the near future?
kind regards, nusenu
Hi nusenu,
thank you for reaching out.
I did read the recent threads on this mailing list about "Import (technical) AS names from ARIN", which I assume should improve or entirely solve the situation.
Yes, this more or less addresses the issue of AS names missing completely for ARIN space.
Given the fact that I have no idea on what to do with LACNIC, and the AS names file provided by ARIN does not contain human-readable AS names (although DigitalOcean is one of the better examples in there), I rather see it as an improvement than a solution. :-)
It would be interesting to learn where MaxMind got these data from, as well as more precise location information for ARIN and LACNIC space. Apart from scraping (PITA deluxe) and/or some contract-based stuff, I have no clue...
If you or anybody else has any idea of getting more precise location or AS data for space maintained by these two RIRs in a reliable way, we will be most interested in hearing about it.
Is my understanding correct that your recent work [1] on adding ARIN's AS name [2] data has been merged?
Yes, you are correct, that patch has been merged. It can be retrieved here: https://git.ipfire.org/?p=location/libloc.git;a=commit;h=92403f3910c7a1aa576...
The reason why you do not see those AS names in production is because there is no new libloc release, yet. There is one bug (#12553) left to fix, but afterwards, I guess we are fine to release libloc 0.9.7; a day later, the generated database dump should contain the AS information.
Should I already see ARINs asn.txt AS name information or will this be "visible" in ipfire's db in the near future?
No and yes. :-)
For the sake of completeness, I take the liberty to note that the database.txt file you are referring to, which the Tor maintainers retrieve and bake into their releases, is _not_ the actual location database. So, technically, Tor is using the data compiled by libloc (which we are happy about), but not libloc itself. As far as I am aware, they decided against shipping libloc with Tor for the moment, primarily due to compatibility issues with the 0.3.5.x LTS branch.
Thanks, and best regards, Peter Müller
Hi Peter,
thanks for your comprehensive reply.
Peter Müller wrote:
the AS names file provided by ARIN does not contain human-readable AS names (although DigitalOcean is one of the better examples in there), I rather see it as an improvement than a solution. :-)
out of curiosity: I haven't understood your differentiation between "human-readable" and "technical" AS names. Would you mind elaborating on your definitions of human-readable AS names?
I understand AS name as the name attribute of the AutNum WHOIS object.
to use an example again: "DIGITALOCEAN-ASN" is the AS name of AS14061 according to [1], WHOIS [2] and well-established tools like RIPEstat show the very same string [3].
[1] https://ftp.arin.net/info/asn.txt
[2] https://whois.arin.net/rest/asn/AS14061
[3] https://stat.ripe.net/widget/as-overview#w.resource=AS14061
If you or anybody else has any idea of getting more precise location or AS data for space maintained by these two RIRs in a reliable way, we will be most interested in hearing about it.
with regards to LACNIC: Maybe their bulk whois data is something you find useful? (I didn't use it myself)
https://www.lacnic.net/en/web/lacnic/manual-8
https://github.com/microsoft/WhoisParsers/blob/fb54e3cf576e5caa4d2f8e0c15768...
RIPEstat is also an option for AS names, but it is not bulk data and requires one request per ASN to get the well-formatted JSON response.
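To illustrate the one-request-per-ASN pattern mentioned above, here is a minimal sketch. It assumes the RIPEstat Data API exposes an "as-overview" call at the URL below and that the AS name is returned in a `data.holder` field; neither detail is stated in this thread, so both should be checked against the RIPEstat documentation. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: endpoint of the RIPEstat "as-overview" data call.
RIPESTAT_AS_OVERVIEW = "https://stat.ripe.net/data/as-overview/data.json"


def as_overview_url(asn):
    """Build the per-ASN request URL (one request per ASN, no bulk mode)."""
    query = urllib.parse.urlencode({"resource": f"AS{asn}"})
    return f"{RIPESTAT_AS_OVERVIEW}?{query}"


def holder_from_response(body):
    """Extract the AS name from a JSON response body.

    Assumption: the name lives in data.holder. Returns None if absent.
    """
    return json.loads(body).get("data", {}).get("holder")


def fetch_as_name(asn, timeout=10):
    """Perform the actual HTTP request for a single ASN."""
    with urllib.request.urlopen(as_overview_url(asn), timeout=timeout) as resp:
        return holder_from_response(resp.read().decode("utf-8"))
```

Looking up many ASNs this way means one call to `fetch_as_name()` per ASN, which is why it is not a substitute for a bulk file such as asn.txt.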
The reason why you do not see those AS names in production is because there is no new libloc release, yet. There is one bug (#12553) left to fix, but afterwards, I guess we are fine to release libloc 0.9.7; a day later, the generated database dump should contain the AS information.
Do you have a rough estimate on when this next version will be released?
For the sake of completeness, I take the liberty to note that the database.txt file you are referring to, and the Tor maintainers retrieve for baking it into their releases is _not_ the actual location database. So, technically, Tor is using the data conglomerated together by libloc - which we are happy about -, but not libloc itself. As far as I am aware, they decided against shipping libloc with Tor for the moment, primarily due to compatibility issues with the 0.3.5.x LTS branch.
I'm aware of the differences here. To be precise, I'm not referring to the db file that the torproject ships with tor releases. I'm referring to their onionoo service (https://metrics.torproject.org/onionoo.html), which has recently started using ipfire's db as a data source as well.
kind regards, nusenu
Hi nusenu,
thanks for your response.
out of curiosity: I haven't understood your differentiation between "human-readable" and "technical" AS names. Would you mind elaborating on your definitions of human-readable AS names?
I certainly don't mind, and it's not an accurate definition either. If I may pick an arbitrary example from the ARIN asn.txt file:
7843 TWC-7843-BB IPADD1-ARIN (Abuse), IPADD1-ARIN (Admin), IPADD1-ARIN (Tech)
So, for AS7843, "TWC-7843-BB" is the name this file supplies to us - which is not very telling. The description field of the corresponding organisation handle is "Charter Communications Inc", but we lack that information for ARIN and LACNIC space.
This is why I consider asn.txt to contain "technical" names, which are not necessarily suitable for end-user consumption - although some companies, such as DigitalOcean, do a better job of putting meaningful strings in there.
On second thought, however, this again boils down to a missing full dump of the ARIN database...
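For readers who want to look at asn.txt themselves, the following is a minimal sketch of how a line such as the AS7843 example above could be split into its parts. The layout (AS number, name, then comma-separated point-of-contact handles with their role in parentheses) is inferred only from the two example lines quoted in this thread, and `parse_asn_line` is a hypothetical helper, not the importer libloc uses.

```python
import re

# Matches "HANDLE (Role)" pairs such as "IPADD1-ARIN (Abuse)".
POC_RE = re.compile(r"(\S+)\s+\((\w+)\)")


def parse_asn_line(line):
    """Split one asn.txt line into (number, name, [(handle, role), ...]).

    Returns None for headers, blank lines, or anything that does not
    start with a numeric AS number followed by a name.
    """
    fields = line.split(None, 2)
    if len(fields) < 2 or not fields[0].isdigit():
        return None
    number = int(fields[0])
    name = fields[1]  # the "technical" name, e.g. TWC-7843-BB
    pocs = POC_RE.findall(fields[2]) if len(fields) > 2 else []
    return number, name, pocs
```

Note that nothing in such a line carries the organisation description ("Charter Communications Inc" in the example), which is exactly the gap described above.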
with regards to LACNIC: Maybe their bulk whois data is something you find useful? (I didn't use it myself)
Thanks for the hint, I was unaware of that possibility. Skimming through the terms and conditions, however, I somehow doubt this would be suitable, as we
(a) are going to redistribute the data in some way, which the LACNIC ToS forbid (based on my humble understanding)
(b) are probably not going to snail mail anything from Europe to Uruguay :-)
@Michael: That is, of course, unless your opinion differs. ;-)
ARIN offers a similar channel (https://www.arin.net/reference/research/bulkwhois/). We would get organisation handle names from there as well, but it seems to miss country information for suballocations, too (say an AOL dial-up block assigned to LATAM or similar).
So, at the moment, I am kind of ambivalent about this...
Do you have a rough estimate on when this next version will be released?
We are not making any guarantees on that front. I am currently short on spare time, but would expect this to be released within the next couple of weeks.
Thanks, and best regards, Peter Müller
Hello nusenu,
Do you have a rough estimate on when this next version will be released?
We are not making any guarantees on that front. I am currently short on spare time, but would expect this to be released within the next couple of weeks.
just a quick follow-up on this.
Yesterday, we released libloc 0.9.7 (see https://source.ipfire.org/releases/libloc/ for its source .tar.gz), which contains the ARIN AS names change you mentioned initially. Also, generated databases are now significantly more accurate for IP space that is part of Amazon AWS, for which we previously mostly returned "US".
All location databases and their corresponding database.txt dumps generated after Fri, 9 Jul 2021 23:10:48 UTC should now include AS names for ARIN space. Just thought you might want to know. :-)
Please drop us a line here in case of any questions, bugs, or comments.
Thanks, and best regards, Peter Müller
Hello Peter,
All location databases and their corresponding database.txt dumps generated after Fri, 9 Jul 2021 23:10:48 UTC should now include AS names for ARIN space. Just thought you might want to know. :-)
I do indeed, thank you for the update and the notification!
Please drop us a line here in case of any questions, bugs, or comments.
I'll be able to check the improved data coverage once the torproject has updated to a version containing the data, and will report back here. Maybe we can even tackle the LACNIC region as well.
kind regards, nusenu
Hello nusenu,
On 27 Jun 2021, at 11:14, nusenu <nusenu-lists@riseup.net> wrote:
I'm aware of the differences here. To be precise I'm not referring to the db file that the torproject ships with tor releases. I'm referring to their onionoo service (https://metrics.torproject.org/onionoo.html) which uses ipfire's db as data source as well since recently.
I just wanted to add something (potentially rather obvious here):
We originally designed this project around the library, which is probably where the biggest value lies. It comes with many features like:
* Super fast lookup of any data because we store it in a quite smart way.
* A simple C API so it can be integrated into many projects (and we have bindings for Python 3 and Perl, too).
* One of the most important ones: It cryptographically verifies the database when it is being updated. That way, it can be trusted, as opposed to something simply downloaded from a random HTTP server.
As far as I can see (couldn’t find any up-to-date source code) onionoo is written in Java. Is that an obstacle, and are there any blockers preventing the use of libloc, apart from the lack of Java bindings?
Best, -Michael
kind regards, nusenu
As far as I can see (couldn’t find any up to date source code) onionoo is written in Java. Is that an obstacle and are there any blockers why libloc cannot be used apart from that there are no bindings for Java available?
to avoid misunderstandings: I'm not an onionoo developer, I'm just an onionoo user that is affected by the reduced ASN coverage after their geoip DB switch from maxmind. Since they are really thin on resources I figured getting this fixed upstream would help them, others and me.
your question reminded me of this past discussion on this gitlab issue: https://gitlab.torproject.org/tpo/network-health/metrics/relay-search/-/issu...
https://gitlab.torproject.org/tpo/network-health/metrics/onionoo/-/commit/49...
onionoo source: https://gitlab.torproject.org/tpo/network-health/metrics/onionoo/-/tree/mast...
kind regards, nusenu
Hello,
On 14 Jul 2021, at 22:03, nusenu <nusenu-lists@riseup.net> wrote:
As far as I can see (couldn’t find any up to date source code) onionoo is written in Java. Is that an obstacle and are there any blockers why libloc cannot be used apart from that there are no bindings for Java available?
to avoid misunderstandings: I'm not an onionoo developer, I'm just an onionoo user that is affected by the reduced ASN coverage after their geoip DB switch from maxmind. Since they are really thin on resources I figured getting this fixed upstream would help them, others and me.
Ah okay, I think I did indeed assume that.
And yes, fixing things upstream is always the best way!
your question reminded me of this past discussion on this gitlab issue: https://gitlab.torproject.org/tpo/network-health/metrics/relay-search/-/issu...
https://gitlab.torproject.org/tpo/network-health/metrics/onionoo/-/commit/49...
Yeah, that is a little bit sad to see. The graph search would probably be substantially faster than parsing a text file.
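As a rough illustration of why a tree lookup tends to beat scanning a text dump, here is a minimal sketch of longest-prefix matching in a binary trie. It is not libloc's actual on-disk format or API; the class and its methods are hypothetical, and the prefixes and AS numbers in the usage note are reserved documentation values rather than real data.

```python
import ipaddress


class PrefixTrie:
    """Binary trie keyed on address bits, with one root per IP version."""

    def __init__(self):
        self._roots = {4: self._new_node(), 6: self._new_node()}

    @staticmethod
    def _new_node():
        return {"children": [None, None], "value": None}

    def insert(self, cidr, value):
        """Store value at the node reached by walking the prefix bits."""
        net = ipaddress.ip_network(cidr)
        bits = int(net.network_address)
        width = net.max_prefixlen
        node = self._roots[net.version]
        for i in range(net.prefixlen):
            bit = (bits >> (width - 1 - i)) & 1
            if node["children"][bit] is None:
                node["children"][bit] = self._new_node()
            node = node["children"][bit]
        node["value"] = value

    def lookup(self, address):
        """Return the value of the longest matching prefix, or None."""
        addr = ipaddress.ip_address(address)
        bits = int(addr)
        width = addr.max_prefixlen
        node = self._roots[addr.version]
        best = node["value"]
        for i in range(width):
            node = node["children"][(bits >> (width - 1 - i)) & 1]
            if node is None:
                break
            if node["value"] is not None:
                best = node["value"]
        return best
```

After `trie.insert("192.0.2.0/24", 64496)`, a call to `trie.lookup("192.0.2.1")` walks at most 32 nodes for IPv4 (128 for IPv6), regardless of how many networks are stored. A lookup against a parsed text file either scans every entry or first has to build a comparable index in memory.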
-Michael
onionoo source: https://gitlab.torproject.org/tpo/network-health/metrics/onionoo/-/tree/mast...
kind regards, nusenu