From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter =?utf-8?q?M=C3=BCller?=
To: location@lists.ipfire.org
Subject: How should location-importer.in deal with RIR objects having multiple distinct "country" fields?
Date: Mon, 03 May 2021 22:56:10 +0200
Message-ID: <642234e4-c993-5c2d-199c-a1afed0d255b@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============3007474098856733206=="
List-Id:

--===============3007474098856733206==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hello Michael, hello location folks (CC'ed),

unfortunately, another problem surfaces when processing inetnum and inet6num feeds from RIRs
that provide this more precise kind of data: A decent number of network objects have multiple
distinct "country" fields. Here is an example:

> inetnum:        178.79.192.0 - 178.79.255.255
> netname:        EU-LLNW-20100512
> country:        EU
> country:        SE
> country:        DE
> country:        NL
> country:        GB
> country:        ES
> country:        FR
> country:        IT
> org:            ORG-LNI1-RIPE
> admin-c:        GU2143-RIPE
> tech-c:         GU2143-RIPE
> status:         ALLOCATED PA
> remarks:        ****************** ABUSE COMPLAINTS TO: abuse(a)limelightnetworks.com
> mnt-by:         RIPE-NCC-HM-MNT
> mnt-by:         LLNW-MNT
> mnt-domains:    LLNW-MNT
> mnt-routes:     LLNW-MNT
> created:        2010-05-12T16:20:38Z
> last-modified:  2017-09-01T17:39:08Z
> source:         RIPE # Filtered

Currently, the last country item is made persistent via the SQL INSERT statement. Since these
items do not appear to be sorted in any way, the result is completely nondeterministic.

The network above would be recoverable, however: If we do not interpret "EU" as the European
Union, but rather as Europe as a whole, all other country codes given here would be covered by it.
Alas, this does not help in cases such as these:

> Country of network [IPv4Network('77.74.172.0/23')] already set to 'CH', omitting 'FI' (multiple country lines in RIR data?)
> Country of network [IPv4Network('185.253.140.0/24')] already set to 'GB', omitting 'NL' (multiple country lines in RIR data?)
> Country of network [IPv4Network('185.253.140.0/24')] already set to 'GB', omitting 'US' (multiple country lines in RIR data?)
> Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'US' (multiple country lines in RIR data?)
> Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'JP' (multiple country lines in RIR data?)
> Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'SG' (multiple country lines in RIR data?)
> Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'AU' (multiple country lines in RIR data?)
> Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'NL' (multiple country lines in RIR data?)
> Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'FR' (multiple country lines in RIR data?)
> Country of network [IPv4Network('213.230.255.0/24')] already set to 'GB', omitting 'DE' (multiple country lines in RIR data?)
> Country of network [IPv4Network('193.109.168.0/22')] already set to 'GB', omitting 'US' (multiple country lines in RIR data?)

There are _plenty_ of such networks; I believe the ones from RIPE's IPv4 data alone fill several
screen pages. Nothing in life is ever easy, and parsing RIR data definitely isn't... :-/

Delegating the task of handling such situations to the application using libloc does not make
sense to me, as people are _expecting_ precise answers from it (if we can speak of precision
here at all); otherwise, they could simply parse RIR data on their own. Therefore, we have to
deal with this somehow.

Possible options would be as follows:

(a) We do not process such networks at all.
    If a network operator wants to have their network covered by libloc, they should kindly
    fix their RIR data.

    That would not prevent us from obtaining announcements for such networks, but we would
    no longer label them with any country.

(b) We try to automatically determine meaningful codes in each case.
    This is tricky and not very deterministic. What about a network having "CY" and "TR" set?
    Would that be covered by "EU"?

    213.230.255.0/24 seems to be used worldwide, but in my view, this is not sufficient to
    classify it as an anycast network. Worse, we have to, or at least should, assign a country
    code to anycast networks as well.

(c) We try to determine the jurisdiction of a network's organisation handle.
    Frankly, I have no idea what problems would arise in this case. If an organisation fails to
    provide accurate and meaningful RIR data, what will its organisation handle possibly
    look like?

Trying to keep things deterministic, (a) is my current favorite - it is the most brutal, though.
Do you see a better way of dealing with such networks?

@All: Thoughts? Comments? Opinions?

Thanks, and best regards,
Peter Müller

--===============3007474098856733206==--
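
For illustration, option (a) could look roughly like the following minimal Python sketch. The
function names distinct_countries() and resolve_country() are hypothetical and not taken from
location-importer.in. The sketch assumes that an object arrives as a block of "key: value"
lines like the inetnum example above, and it returns None for any object carrying more than one
distinct country code, so that no country label would be stored for it.

```python
import re

# Matches a "country:" attribute line, tolerating a trailing "# ..." comment.
COUNTRY_LINE = re.compile(r"^country:\s*([A-Za-z]{2})\s*(?:#.*)?$")


def distinct_countries(block):
    """Return the distinct country codes of one RIR object, in order of appearance."""
    seen = []
    for line in block.splitlines():
        match = COUNTRY_LINE.match(line.strip())
        if match:
            code = match.group(1).upper()
            if code not in seen:
                seen.append(code)
    return seen


def resolve_country(block):
    """Option (a): return the only country code, or None if there is none or more than one."""
    countries = distinct_countries(block)
    return countries[0] if len(countries) == 1 else None
```

Because the decision depends only on the set of distinct codes, the outcome no longer depends on
the order in which the "country" lines happen to appear. Repeated lines with the same code are
still treated as unambiguous.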