From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Tremer
To: location@lists.ipfire.org
Subject: Re: [PATCH v2] location-importer.in: skip networks with unknown country codes
Date: Sun, 04 Apr 2021 13:37:31 +0100
Message-ID:
In-Reply-To:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============4720149888539599029=="
List-Id:

--===============4720149888539599029==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hello,

Very good analysis. Thank you very much for investing your time.

Can you do the same for Serbia and Montenegro, please?

And I would like to silence the warning then (at least for special country codes like ZZ, YU and whatever else we find).

-Michael

> On 2 Apr 2021, at 20:58, Peter Müller wrote:
>
> Hello Michael,
>
> thank you for your reply and merging this.
>
> On location02, the networks being ignored because of a "YU" country code are
> (please excuse the crappy "sort" output - it is really useless when it comes to IP addresses):
>
>> 194.106.185.0/26
>> 194.106.185.144/28
>> 194.106.185.96/28
>> 194.194.158.0/25
>> 194.194.158.128/26
>> 194.247.200.160/28
>> 194.247.207.224/27
>> 194.247.223.12/30
>> 194.247.223.16/28
>> 194.247.223.32/28
>> 194.247.223.48/28
>> 194.247.223.72/29
>> 194.247.223.80/28
>> 194.247.223.96/28
>> 195.178.61.128/26
>> 195.178.62.192/27
>> 195.178.62.64/28
>> 195.178.63.0/29
>> 195.178.63.128/28
>> 195.178.63.144/28
>> 195.178.63.32/27
>> 195.178.63.8/29
>> 195.250.104.135/32
>> 195.250.104.140/32
>> 195.250.104.145/32
>> 195.250.104.224/28
>> 195.250.104.240/28
>> 195.250.113.144/28
>> 195.250.113.192/27
>> 195.250.114.224/27
>> 195.250.116.0/26
>> 195.250.116.64/26
>> 195.252.107.128/29
>> 195.252.110.192/26
>> 195.252.111.128/26
>> 195.252.111.192/26
>> 195.252.115.0/24
>> 195.252.118.0/26
>> 195.252.118.64/26
>> 195.252.120.0/24
>> 195.66.165.0/24
>> 217.26.77.64/26
>> 217.26.79.0/27
>> 62.108.115.0/28
>> 62.108.117.16/28
>> 62.108.117.96/28
>> 62.193.141.192/28
>> 62.193.141.224/28
>> 62.193.141.240/28
>> 62.193.141.56/29
>
> Since we currently ignore anything more specific than a /24, only these networks are actually
> relevant in this discussion, as we would have discarded the others anyway:
>
>> 195.66.165.0/24
>> 195.252.115.0/24
>> 195.252.120.0/24
>
> Digging deeper into them, the first one dead-ends somewhere in the vicinity of Croatia, having
> a RIPE database entry dated before 2003:
>
>> inetnum:        195.66.165.0 - 195.66.165.255
>> netname:        Posta_Crne_Gore
>> descr:          Posta Crne Gore
>> country:        YU
>> admin-c:        MM609-RIPE
>> tech-c:         MM609-RIPE
>> status:         ASSIGNED PA
>> mnt-by:         AS8585-MNT
>> created:        2001-10-04T08:34:51Z
>> last-modified:  2002-10-30T09:36:47Z
>> source:         RIPE
>>
>> person:         Martinovic Milan
>> address:        Posta Crne Gore
>> address:        Slobode 1
>> address:        81000 Podgorica
>> address:        Montenegro, Yugoslavia
>> phone:          +381 81 225 181
>> nic-hdl:        MM609-RIPE
>> created:        1970-01-01T00:00:00Z
>> last-modified:  2020-06-03T10:52:16Z
>> source:         RIPE # Filtered
>> mnt-by:         AS8585-MNT
>>
>> route:          195.66.160.0/19
>> descr:          Internet Crna Gora
>> origin:         AS8585
>> mnt-by:         AS8585-MNT
>> created:        1970-01-01T00:00:00Z
>> last-modified:  2001-09-22T09:33:48Z
>> source:         RIPE # Filtered
>
> The second is routed by AS6700 into a residential dial-up network pool somewhere in Serbia,
> while its RIPE DB entry shows:
>
>> inetnum:        195.252.115.0 - 195.252.115.255
>> netname:        DRENIK
>> descr:          Drenik ISP
>> descr:          Beograd, Deligradska 19
>> country:        YU
>> admin-c:        DR47-RIPE
>> tech-c:         DR47-RIPE
>> status:         ASSIGNED PA
>> mnt-by:         AS6700-MNT
>> created:        2002-04-11T08:21:26Z
>> last-modified:  2002-04-11T08:21:26Z
>> source:         RIPE
>>
>> person:         Nenad Repac
>> address:        D.D. TELEFONIJA
>> address:        Marsala Tolbuhina 56
>> address:        11000 Beograd
>> address:        Yugoslavia
>> phone:          +381 11 444 11 44 Ext. 381
>> fax-no:         +381 11 3248 953
>> nic-hdl:        DR47-RIPE
>> mnt-by:         AS6700-MNT
>> created:        1970-01-01T00:00:00Z
>> last-modified:  2001-09-21T23:28:31Z
>> source:         RIPE # Filtered
>>
>> route:          195.252.96.0/19
>> descr:          BeotelNet ISP, Belgrade, RS
>> origin:         AS6700
>> mnt-by:         AS6700-MNT
>> created:        1970-01-01T00:00:00Z
>> last-modified:  2019-07-15T09:12:36Z
>> source:         RIPE
>
> Same goes for the third network, having a RIPE DB entry maintained by the same organisation:
>
>> inetnum:        195.252.120.0 - 195.252.120.255
>> netname:        ABSOFT
>> descr:          AB SOFT
>> descr:          Kneza Milosa 82, Beograd
>> country:        YU
>> admin-c:        DR47-RIPE
>> tech-c:         DR47-RIPE
>> status:         ASSIGNED PA
>> mnt-by:         AS6700-MNT
>> created:        2002-04-10T16:54:48Z
>> last-modified:  2002-04-10T16:54:48Z
>> source:         RIPE
>>
>> person:         Nenad Repac
>> address:        D.D. TELEFONIJA
>> address:        Marsala Tolbuhina 56
>> address:        11000 Beograd
>> address:        Yugoslavia
>> phone:          +381 11 444 11 44 Ext. 381
>> fax-no:         +381 11 3248 953
>> nic-hdl:        DR47-RIPE
>> mnt-by:         AS6700-MNT
>> created:        1970-01-01T00:00:00Z
>> last-modified:  2001-09-21T23:28:31Z
>> source:         RIPE # Filtered
>>
>> route:          195.252.96.0/19
>> descr:          BeotelNet ISP, Belgrade, RS
>> origin:         AS6700
>> mnt-by:         AS6700-MNT
>> created:        1970-01-01T00:00:00Z
>> last-modified:  2019-07-15T09:12:36Z
>> source:         RIPE
>
> Since we are only dealing with three networks here and their actual location seems to be pretty
> clear to me, I suggest _not_ to add YU as a legitimate country. Instead, I would just write overrides
> for these networks.
>
> Would you be fine with that?
>
> Thanks, and best regards,
> Peter Müller
>
>
>> Hello,
>>
>> I merged this patch, but it has some unwanted side-effects:
>>
>> Technically it works as designed, as we are successfully dropping any networks whose countries are not part of the imported list. I changed our scripts so that the countries will always be imported first now.
>>
>> I ran a manual import which dropped CS, which is Serbia and Montenegro. This used to be a valid country code, but Serbia and Montenegro is not a single country any more. I decided to add it because we would have dropped too many networks without it. Now we are dropping a few networks with country code YU - Yugoslavia.
>>
>> Montenegro became independent from Serbia in 2006, Yugoslavia became the State Union of Serbia and Montenegro in 2003. For some reason (probably because I didn’t do research) I thought these events were closer together and therefore thought that all networks with country code CS simply “forgot” to update this, but there never were any that actually existed during the time of Yugoslavia.
>>
>> Long story short: Would anybody object to adding YU to the database although it doesn’t exist as a country any more? I guess we cannot just “rewrite” it because the situation is way too complicated. However, we wanted to give people an idea where some IP address is located and that kind of does not work if the country does not exist any more. Returning nothing instead is not a great solution either because we are then simply hiding networks that exist.
>>
>> Or did I overlook an even better option?
>>
>> -Michael
>>
>>> On 30 Mar 2021, at 16:47, Peter Müller wrote:
>>>
>>> There is no sense in parsing and storing networks whose country codes
>>> cannot be found in the ISO-3166-x country code table. This avoids side
>>> effects in applications using the location database, and introduces
>>> another sanity check to compensate for bogus RIR data.
>>>
>>> On location02, this affects some networks from APNIC (country code: ZZ)
>>> as well as a bunch of smaller allocations within the RIPE region still
>>> tagged to CS or YU (Yugoslavia).
>>> To my surprise, no network tagged as SU
>>> (Soviet Union) was found - while the NIC for .su TLD is still
>>> operational. :-)
>>>
>>> Applying this patch causes the countries to be processed before
>>> update_whois() is called. In case no countries are present in the SQL
>>> table, this check is silently omitted.
>>>
>>> Fixes: #12510
>>>
>>> Signed-off-by: Peter Müller
>>> ---
>>>  src/python/location-importer.in | 38 ++++++++++++++++++++++----------
>>>  1 file changed, 26 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/src/python/location-importer.in b/src/python/location-importer.in
>>> index e2f201b..1e08458 100644
>>> --- a/src/python/location-importer.in
>>> +++ b/src/python/location-importer.in
>>> @@ -388,10 +388,17 @@ class CLI(object):
>>>  				TRUNCATE TABLE networks;
>>>  			""")
>>>
>>> +			# Fetch all valid country codes to check parsed networks against...
>>> +			rows = self.db.query("SELECT * FROM countries ORDER BY country_code")
>>> +			validcountries = []
>>> +
>>> +			for row in rows:
>>> +				validcountries.append(row.country_code)
>>> +
>>>  			for source in location.importer.WHOIS_SOURCES:
>>>  				with downloader.request(source, return_blocks=True) as f:
>>>  					for block in f:
>>> -						self._parse_block(block)
>>> +						self._parse_block(block, validcountries)
>>>
>>>  			# Process all parsed networks from every RIR we happen to have access to,
>>>  			# insert the largest network chunks into the networks table immediately...
>>> @@ -467,7 +474,7 @@ class CLI(object):
>>>  					# Download data
>>>  					with downloader.request(source) as f:
>>>  						for line in f:
>>> -							self._parse_line(line)
>>> +							self._parse_line(line, validcountries)
>>>
>>>  	def _check_parsed_network(self, network):
>>>  		"""
>>> @@ -532,7 +539,7 @@ class CLI(object):
>>>  		# be suitable for libloc consumption...
>>>  		return True
>>>
>>> -	def _parse_block(self, block):
>>> +	def _parse_block(self, block, validcountries = None):
>>>  		# Get first line to find out what type of block this is
>>>  		line = block[0]
>>>
>>> @@ -542,7 +549,7 @@ class CLI(object):
>>>
>>>  		# inetnum
>>>  		if line.startswith("inet6num:") or line.startswith("inetnum:"):
>>> -			return self._parse_inetnum_block(block)
>>> +			return self._parse_inetnum_block(block, validcountries)
>>>
>>>  		# organisation
>>>  		elif line.startswith("organisation:"):
>>> @@ -573,7 +580,7 @@ class CLI(object):
>>>  			autnum.get("asn"), autnum.get("org"),
>>>  		)
>>>
>>> -	def _parse_inetnum_block(self, block):
>>> +	def _parse_inetnum_block(self, block, validcountries = None):
>>>  		log.debug("Parsing inetnum block:")
>>>
>>>  		inetnum = {}
>>> @@ -616,10 +623,10 @@ class CLI(object):
>>>  		if not inetnum or not "country" in inetnum:
>>>  			return
>>>
>>> -		# Skip objects with bogus country code 'ZZ'
>>> -		if inetnum.get("country") == "ZZ":
>>> -			log.warning("Skipping network with bogus country 'ZZ': %s" % \
>>> -				(inetnum.get("inet6num") or inetnum.get("inetnum")))
>>> +		# Skip objects with unknown country codes
>>> +		if validcountries and inetnum.get("country") not in validcountries:
>>> +			log.warning("Skipping network with bogus country '%s': %s" % \
>>> +				(inetnum.get("country"), inetnum.get("inet6num") or inetnum.get("inetnum")))
>>>  			return
>>>
>>>  		# Iterate through all networks enumerated from above, check them for plausibility and insert
>>> @@ -652,7 +659,7 @@ class CLI(object):
>>>  			org.get("organisation"), org.get("org-name"),
>>>  		)
>>>
>>> -	def _parse_line(self, line):
>>> +	def _parse_line(self, line, validcountries = None):
>>>  		# Skip version line
>>>  		if line.startswith("2"):
>>>  			return
>>> @@ -667,8 +674,15 @@ class CLI(object):
>>>  			log.warning("Could not parse line: %s" % line)
>>>  			return
>>>
>>> -		# Skip any lines that are for stats only
>>> -		if country_code == "*":
>>> +		# Skip any lines that are for stats only or do not have a country
>>> +		# code at all (avoids log spam below)
>>> +		if not country_code or country_code == '*':
>>> +			return
>>> +
>>> +		# Skip objects with unknown country codes
>>> +		if validcountries and country_code not in validcountries:
>>> +			log.warning("Skipping line with bogus country '%s': %s" % \
>>> +				(country_code, line))
>>>  			return
>>>
>>>  		if type in ("ipv6", "ipv4"):
>>> -- 
>>> 2.26.2
>>

--===============4720149888539599029==--
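
The thread makes two points about the YU list: plain "sort" orders CIDR strings lexically (so 195.252.x lands before 195.66.x), and only networks no more specific than a /24 are kept anyway. Both steps can be sketched in a few lines of Python using only the standard ipaddress module. This is a minimal illustration, not code from location-importer.in; the function name relevant_networks, its max_prefixlen parameter and the sample list are assumptions made for the example.

```python
import ipaddress

def relevant_networks(cidrs, max_prefixlen=24):
    """Return the networks that are no more specific than max_prefixlen,
    ordered numerically instead of lexically.

    Hypothetical helper for illustration only; it is not part of libloc.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]

    # A larger prefix length means a smaller (more specific) network.
    kept = [net for net in networks if net.prefixlen <= max_prefixlen]

    # Sort by address family first so that IPv4 and IPv6 networks are
    # never compared directly, then by numeric address and prefix length.
    return sorted(
        kept,
        key=lambda net: (net.version, int(net.network_address), net.prefixlen),
    )

# A few of the networks quoted in the thread, in arbitrary order
sample = [
    "195.252.115.0/24",
    "194.106.185.0/26",
    "62.108.115.0/28",
    "195.66.165.0/24",
    "195.252.120.0/24",
    "195.250.104.135/32",
]

if __name__ == "__main__":
    for net in relevant_networks(sample):
        print(net)
```

With the sample above, only the three /24 networks remain, and 195.66.165.0/24 is listed before the two 195.252.x networks.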