Hello Michael, hello location folks (CC'ed),
unfortunately, another problem surfaces when processing the inetnum and inet6num feeds from RIRs that provide this more precise kind of data: a fair number of network objects have multiple distinct "country" fields.
Here is an example:
inetnum: 178.79.192.0 - 178.79.255.255
netname: EU-LLNW-20100512
country: EU
country: SE
country: DE
country: NL
country: GB
country: ES
country: FR
country: IT
org: ORG-LNI1-RIPE
admin-c: GU2143-RIPE
tech-c: GU2143-RIPE
status: ALLOCATED PA
remarks: ****************** ABUSE COMPLAINTS TO: abuse@limelightnetworks.com
mnt-by: RIPE-NCC-HM-MNT
mnt-by: LLNW-MNT
mnt-domains: LLNW-MNT
mnt-routes: LLNW-MNT
created: 2010-05-12T16:20:38Z
last-modified: 2017-09-01T17:39:08Z
source: RIPE # Filtered
Currently, the last country item is made persistent via the SQL INSERT statement. Since these do not appear to be sorted in any way, this makes things completely nondeterministic.
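To illustrate the point, here is a minimal sketch of collecting _all_ country lines of one object instead of keeping only the last one. The function name parse_countries and the shortened sample block are made up for this mail; this is not the code currently used by the importer.

```python
def parse_countries(block: str) -> list[str]:
    """Return the distinct country codes of one RPSL object, in file order."""
    countries = []
    for line in block.splitlines():
        # Continuation lines (leading whitespace or "+") carry no new attribute.
        if line[:1] in (" ", "\t", "+"):
            continue
        key, sep, value = line.partition(":")
        if not sep or key.strip().lower() != "country":
            continue
        # Drop trailing comments such as "# Filtered" and normalise the case.
        code = value.split("#", 1)[0].strip().upper()
        if code and code not in countries:
            countries.append(code)
    return countries


# Shortened version of the object quoted above (hypothetical test input).
example = """\
inetnum: 178.79.192.0 - 178.79.255.255
netname: EU-LLNW-20100512
country: EU
country: SE
country: DE
source: RIPE # Filtered
"""

print(parse_countries(example))
```

With the full list at hand, the decision which code to store (if any) becomes an explicit step rather than a side effect of the INSERT order.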
The network above would be recoverable, however: if we interpret "EU" not as the European Union but as the European continent, all other country codes given here would be covered by it.
Alas, this is not helping in cases such as these two:
Country of network [IPv4Network('77.74.172.0/23')] already set to 'CH', omitting 'FI' (multiple country lines in RIR data?)
Country of network [IPv4Network('185.253.140.0/24')] already set to 'GB', omitting 'NL' (multiple country lines in RIR data?)
Country of network [IPv4Network('185.253.140.0/24')] already set to 'GB', omitting 'US' (multiple country lines in RIR data?)
Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'US' (multiple country lines in RIR data?)
Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'JP' (multiple country lines in RIR data?)
Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'SG' (multiple country lines in RIR data?)
Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'AU' (multiple country lines in RIR data?)
Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'NL' (multiple country lines in RIR data?)
Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'FR' (multiple country lines in RIR data?)
Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'DE' (multiple country lines in RIR data?)
Country of network [IPv4Network('193.109.168.0/22')] already set to 'GB', omitting 'US' (multiple country lines in RIR data?)
There are _plenty_ of such networks; I believe the RIPE IPv4 data alone fills several screen pages. Nothing in life is ever easy, and parsing RIR data definitely isn't... :-/
Delegating the handling of such situations to the application using libloc does not make sense to me: people _expect_ precise answers from it (if we can speak of precision here at all); otherwise, they could simply parse the RIR data on their own. Therefore, we have to deal with this somehow. Possible options would be as follows:
(a) We do not process such networks at all. If network operators want their networks covered by libloc, they should kindly fix their RIR data.
That would not prevent us from obtaining announcements for such networks, but we would not label them with any country anymore.
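A rough sketch of what (a) could look like, assuming the country codes of a network have already been collected into a list. The function name label_network and the returned (network, country) tuple are hypothetical and only serve as illustration:

```python
import ipaddress


def label_network(network, countries):
    """Option (a): keep a country only if the RIR data is unambiguous.

    Returns a (network, country) tuple; country is None when the object
    carries no country or several distinct ones.
    """
    distinct = set(countries)
    if len(distinct) == 1:
        return network, distinct.pop()
    return network, None


net = ipaddress.ip_network("77.74.172.0/23")
print(label_network(net, ["CH", "FI"]))  # ambiguous, stays unlabelled
print(label_network(net, ["CH", "CH"]))  # repeated identical code is fine
```

The result no longer depends on the order of the country lines, which is the whole point of this option.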
(b) We try to automatically determine meaningful codes in each case.
This is tricky and not very deterministic. What about a network having "CY" and "TR" set? Would that be covered by "EU"?
213.230.255.0/24 seems to be used worldwide, but from my point of view, this is not sufficient to classify it as an anycast network. Worse, we have to (or at least should) assign a country code to anycast networks as well.
(c) We try to determine the jurisdiction of a network's organisation handle.
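For the sake of discussion, one possible heuristic for (b) might look like the sketch below. The set EUROPEAN_CODES is an assumption and deliberately contains only the codes from the example object above; whether codes such as "CY" or "TR" belong in it is exactly the open question. The name collapse_countries is hypothetical as well.

```python
# Assumption: codes we are willing to treat as covered by "EU".
# Deliberately limited to the codes seen in the example object.
EUROPEAN_CODES = frozenset({"SE", "DE", "NL", "GB", "ES", "FR", "IT"})


def collapse_countries(countries, european=EUROPEAN_CODES):
    """Option (b): try to reduce several country codes to a single one.

    Returns one code, or None if no meaningful code can be determined.
    """
    distinct = set(countries)
    if len(distinct) == 1:
        return next(iter(distinct))
    # Only collapse to "EU" if the object itself lists "EU" and every
    # other code is one we consider covered by it.
    if "EU" in distinct and distinct - {"EU"} <= european:
        return "EU"
    return None


print(collapse_countries(["EU", "SE", "DE", "NL", "GB", "ES", "FR", "IT"]))
print(collapse_countries(["GB", "US", "JP", "SG", "AU", "NL", "FR", "DE"]))
```

The first call yields "EU", the second yields None, so networks like 213.230.255.0/24 would still end up without a country.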
Frankly, I have no idea what problems would arise in this case. If an organisation fails to provide accurate and meaningful RIR data, what will their organisation handle possibly look like?
Trying to keep things deterministic, (a) is my current favorite - it is the most brutal, though.
Do you see a better way of dealing with such networks?
@All: Thoughts? Comments? Opinions?
Thanks, and best regards,
Peter Müller