From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Tremer
To: location@lists.ipfire.org
Subject: Re: Thoughts on importing IP feeds from Amazon, second attempt
Date: Thu, 03 Jun 2021 11:12:31 +0100
Message-ID: <8A99C94A-15B3-4BCD-9EA7-BAA099C34C94@ipfire.org>
In-Reply-To: <39ae49e1-db28-277b-35b5-c710612bd4b5@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============7578882877967394640=="
List-Id:

--===============7578882877967394640==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hello,

> On 2 Jun 2021, at 22:12, Peter Müller wrote:
>
> Hello Michael,
>
> thanks for your reply.
>
>> Hello,
>>
>> First, is there a need to constantly rename subjects? I find this more confusing than helpful to keep track of a conversation on this list.
>
> Personally, I like the idea of changing the subject as soon as the discussion leaves the proposed patch as such,
> shifting towards a more general issue. That way, I thought it might be easier to distinguish between remarks targeting
> the _actual_ patch, and general discussions.
>
> If you object, I will stop doing that. You are the boss around here... :-)

Not really :)

I am just trying to find a strategy to deal with the massive amount of emails, and having conversations split is not helpful.

>
>>
>>> On 30 May 2021, at 10:15, Peter Müller wrote:
>>>
>>> Hello Michael,
>>> hello *,
>>>
>>> before I start coding, I just wanted to share my current idea of importing IP feeds from Amazon AWS
>>> in a less insecure way. Comments, etc. are appreciated. :-)
>>
>> You already submitted some code before. What happened to that?
>
> It is still available, although I would not consider it safe for production anymore.
>
>>
>>> (a) Run "location-importer update-whois" and "location-importer update-announcements", as we did before.
>>> (b) Introduce something like "location-importer update-3rd-party-feeds", which is a blanket function for
>>>     updating all the 3rd party feeds we will have some day, as Amazon for sure won't be the only one.
>>
>> Does this need a third command? Why can this not be part of “update-whois”?
>
> Because we do not necessarily have the BGP data available at this step. If we want to build in AS-based
> safeguards, we will have to parse 3rd party feeds after running "location-importer update-announcements".

I would say that this is a difficult dependency. We should have the “owner” of that subnet somewhere in the WHOIS data, which is deterministic, while BGP isn’t.

There are “route” objects which should allow you to do what you want to do.

>>
>>> (c) In case of Amazon, download their feed, parse it and put the results in a temporary table.
>>> (d) Process a list of Autonomous Systems owned or controlled by Amazon.
>>
>> Where is this list coming from?
>
> Something similar to "countries.txt", I guess. It would definitely be something we will have to maintain
> on our own. A simple .txt file per 3rd party source, containing one ASN per line, would do in my
> view.

Hmm, okay.

>>
>>> (d) Delete every IP network from this temporary table which is not announced by one of the Autonomous
>>>     Systems. That way, we limit potential damage by a broken or manipulated Amazon IP feed to their ASNs.
>>
>> This is your second step (d).
>
> ?

You mis-numbered them. Never mind.

>
>>
>> When you say you are comparing this, what is the authority for this? The BGP feed? Whois?
>
> The BGP feed. We cannot rely on RIR data for this job, as they do not reflect reality and we don't have them
> for ARIN- and LACNIC-maintained space.

Well, interesting. I would have said the opposite.
>>> (e) Anything left in the temporary table is safe to go, and will be merged into the overrides table.
>>>
>>> Sounds a bit more complicated than my first patch looked, but is more versatile and robust. :-)
>>
>> I kind of liked the first patch. It was simple and it worked.
>
> Indeed. But it allowed Amazon to inject arbitrary data. This is bad enough for RIRs already, I do not want
> to extend the list of entities being able to do this to some profit-oriented companies...

I consider that a different problem. Not necessarily less important, but probably we should not try to fix all problems in only one patch.

>>
>>> Speaking of robustness, do we want a "source" column for the overrides table as well? Although it won't
>>> appear in the generated database or its .txt dump, it might be worth having, so we still have transparency
>>> on 3rd party feeds at this point.
>>
>> I do not think it is worth it, because it is easy to check. If you want it, I wouldn’t object either.
>
> Hm, it might not be that easy in production, since we do not store the raw contents of our IP feeds. Especially
> if there is a delta, finding out which entry in the overrides table came from which source could be tricky,
> eventually.
>
> Thanks, and best regards,
> Peter Müller

-Michael

--===============7578882877967394640==--
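
For illustration only, the filtering proposed in steps (c) to (e) of the quoted text could look roughly like the sketch below. It is not taken from location-importer: the function names parse_asn_list and filter_feed_by_asn are hypothetical, plain Python lists stand in for the temporary table, and the BGP data is assumed to be available as (prefix, ASN) pairs. The prefixes and AS numbers are documentation values, not Amazon's.

```python
import ipaddress


def parse_asn_list(text):
    """Parse a hypothetical per-source file with one ASN per line.

    Accepts "AS64496" or "64496"; blank lines and "#" comments are skipped.
    """
    asns = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        if line.upper().startswith("AS"):
            line = line[2:]
        asns.add(int(line))
    return asns


def filter_feed_by_asn(feed_networks, announcements, allowed_asns):
    """Keep only feed networks covered by a prefix announced by an allowed ASN.

    feed_networks: iterable of CIDR strings from the 3rd party feed
    announcements: iterable of (CIDR string, ASN) pairs (assumed BGP data)
    allowed_asns:  set of ASNs that are trusted for this feed
    """
    allowed_prefixes = [
        ipaddress.ip_network(prefix)
        for prefix, asn in announcements
        if asn in allowed_asns
    ]

    kept = []
    for raw in feed_networks:
        network = ipaddress.ip_network(raw)
        # subnet_of() raises TypeError across IP versions, so compare those first
        if any(
            network.version == prefix.version and network.subnet_of(prefix)
            for prefix in allowed_prefixes
        ):
            kept.append(str(network))
    return kept


# Example with documentation prefixes and documentation AS numbers
allowed = parse_asn_list("AS64496\n# maintained by hand\n\n64497\n")
announcements = [
    ("192.0.2.0/24", 64496),
    ("198.51.100.0/24", 64511),
    ("2001:db8::/32", 64496),
]
feed = ["192.0.2.0/25", "198.51.100.0/24", "2001:db8:1::/48"]

print(filter_feed_by_asn(feed, announcements, allowed))
# prints ['192.0.2.0/25', '2001:db8:1::/48']
```

Whatever survives the filter would then be merged into the overrides table. The same function would work with WHOIS "route" objects as the authority instead of BGP announcements, as long as they can be reduced to (prefix, ASN) pairs.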