* Re: Thoughts on importing IP feeds from Amazon, second attempt
[not found] <0105552B-5866-44B9-BFEF-4470E92C8BCD@ipfire.org>
@ 2021-06-02 21:12 ` Peter Müller
2021-06-03 10:12 ` Michael Tremer
0 siblings, 1 reply; 4+ messages in thread
From: Peter Müller @ 2021-06-02 21:12 UTC (permalink / raw)
To: location
Hello Michael,
thanks for your reply.
> Hello,
>
> First, is there a need to constantly rename subjects? I find this more confusing than helpful to keep track of a conversation on this list.
Personally, I like the idea of changing the subject as soon as the discussion leaves the proposed patch as such,
shifting towards a more general issue. That way, I thought it might be easier to differentiate between remarks targeting
the _actual_ patch, and general discussions.
If you object, I will stop doing that. You are the boss around here... :-)
>
>> On 30 May 2021, at 10:15, Peter Müller <peter.mueller(a)ipfire.org> wrote:
>>
>> Hello Michael,
>> hello *,
>>
>> before I start coding, I just wanted to share my current idea of importing IP feeds from Amazon AWS
>> in a less insecure way. Comments, etc. are appreciated. :-)
>
> You already submitted some code before. What happened to that?
It is still available, although I would no longer consider it safe for production.
>
>> (a) Run "location-importer update-whois" and "location-importer update-announcements", as we did before.
>> (b) Introduce something like "location-importer update-3rd-party-feeds", which is a blanket function for
>> updating all the 3rd party feeds we will have at some day, as Amazon for sure won't be the only one.
>
> Does this need a third command? Why can this not be part of “update-whois”?
Because we do not necessarily have the BGP data available at this step. If we want to build in AS-based
safeguards, we will have to parse 3rd party feeds after running "location-importer update-announcements".
>
>> (c) In case of Amazon, download their feed, parse it and put the results in a temporary table.
>> (d) Process a list of Autonomous Systems owned or controlled by Amazon.
>
> Where is this list coming from?
Something similar to "countries.txt", I guess. It would definitely be something we will have to maintain
on our own. A simple .txt file per 3rd party source, containing one ASN per line, would do, from my point
of view.
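A minimal sketch of how such a per-source ASN file could be read. The function name, the optional "AS" prefix and the "#" comment syntax are assumptions made for illustration; none of them exist in location-importer.

```python
def parse_asn_list(text):
    """Return the set of ASNs found in a 'one ASN per line' text.

    Assumed conventions (hypothetical, not an existing file format):
    blank lines are skipped, '#' starts a comment, and an optional
    'AS' prefix is accepted in front of the number.
    """
    asns = set()
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Accept both "AS16509" and "16509"
        if line.upper().startswith("AS"):
            line = line[2:]
        asns.add(int(line))
    return asns
```

A malformed line raises ValueError here, which seems preferable to silently skipping an entry in a file that acts as a safeguard.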
>
>> (d) Delete every IP network from this temporary table which is not announced by one of the Autonomous
>> Systems. That way, we limit potential damage by a broken or manipulated Amazon IP feed to their ASNs.
>
> This is your second step (d).
?
>
> When you say you are comparing this, what is the authority for this? The BGP feed? Whois?
The BGP feed. We cannot rely on RIR data for this job, as they do not reflect reality and we don't have them
for ARIN- and LACNIC-maintained space.
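The filtering step described above could be sketched roughly as follows. This assumes the BGP data is available as a mapping of announced prefix to origin ASN and the feed as a list of CIDR strings; both shapes, and the function name, are illustrative only and do not reflect the actual database schema.

```python
import ipaddress

def filter_feed(feed, announcements, allowed_asns):
    """Keep only feed networks covered by a prefix announced by an allowed ASN.

    feed          -- iterable of CIDR strings from the 3rd party feed
    announcements -- dict mapping announced prefix (CIDR string) to origin ASN
    allowed_asns  -- set of ASNs considered to belong to the feed's publisher
    """
    # Only announcements originated by one of the allowed ASNs are relevant
    announced = [
        ipaddress.ip_network(prefix)
        for prefix, asn in announcements.items()
        if asn in allowed_asns
    ]

    kept = []
    for entry in feed:
        network = ipaddress.ip_network(entry)
        # subnet_of() raises TypeError across IP versions, so compare versions first
        if any(network.version == a.version and network.subnet_of(a) for a in announced):
            kept.append(entry)
    return kept
```

With documentation prefixes and ASNs, a feed entry inside a prefix announced by another AS, or not announced at all, is dropped:

```python
announcements = {"192.0.2.0/24": 64500, "198.51.100.0/24": 64501}
feed = ["192.0.2.128/25", "198.51.100.0/25", "2001:db8::/48"]
print(filter_feed(feed, announcements, {64500}))
```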
>
>> (e) Anything left in the temporary table is safe to go, and will be merged into the overrides table.
>>
>> Sounds a bit more complicated than my first patch looked, but is more versatile and robust. :-)
>
> I kind of liked the first patch. It was simple and it worked.
Indeed. But it allowed Amazon to inject arbitrary data. This is bad enough for RIRs already, I do not want
to extend the list of entities being able to do this to some profit-oriented companies...
>
>> Speaking of robustness, do we want a "source" column for the overrides table as well? Although it won't
>> appear in the generated database or its .txt dump, it might be worth having, so we still have transparency
>> on 3rd party feeds at this point.
>
> I do not think it is worth it, because it is easy to check. If you want it, I wouldn’t object either.
Hm, it might not be that easy in production, since we do not store the raw contents of our IP feeds. Especially
if there is a delta, finding out which entry in the overrides table came from which source could turn out to be
tricky.
Thanks, and best regards,
Peter Müller
* Re: Thoughts on importing IP feeds from Amazon, second attempt
2021-06-02 21:12 ` Thoughts on importing IP feeds from Amazon, second attempt Peter Müller
@ 2021-06-03 10:12 ` Michael Tremer
2021-06-05 12:40 ` Peter Müller
0 siblings, 1 reply; 4+ messages in thread
From: Michael Tremer @ 2021-06-03 10:12 UTC (permalink / raw)
To: location
Hello,
> On 2 Jun 2021, at 22:12, Peter Müller <peter.mueller(a)ipfire.org> wrote:
>
> Hello Michael,
>
> thanks for your reply.
>
>> Hello,
>>
>> First, is there a need to constantly rename subjects? I find this more confusing than helpful to keep track of a conversation on this list.
>
> Personally, I like the idea of changing the subject as soon as the discussion leaves the proposed patch as such,
> shifting towards a more general issue. That way, I thought it might be easier to differentiate between remarks targeting
> the _actual_ patch, and general discussions.
>
> If you object, I will stop doing that. You are the boss around here... :-)
Not really :)
I am just trying to find a strategy to deal with the massive amount of emails, and having conversations split is not helpful.
>
>>
>>> On 30 May 2021, at 10:15, Peter Müller <peter.mueller(a)ipfire.org> wrote:
>>>
>>> Hello Michael,
>>> hello *,
>>>
>>> before I start coding, I just wanted to share my current idea of importing IP feeds from Amazon AWS
>>> in a less insecure way. Comments, etc. are appreciated. :-)
>>
>> You already submitted some code before. What happened to that?
>
> It is still available, although I would no longer consider it safe for production.
>
>>
>>> (a) Run "location-importer update-whois" and "location-importer update-announcements", as we did before.
>>> (b) Introduce something like "location-importer update-3rd-party-feeds", which is a blanket function for
>>> updating all the 3rd party feeds we will have at some day, as Amazon for sure won't be the only one.
>>
>> Does this need a third command? Why can this not be part of “update-whois”?
>
> Because we do not necessarily have the BGP data available at this step. If we want to build in AS-based
> safeguards, we will have to parse 3rd party feeds after running "location-importer update-announcements".
I would say that this is a difficult dependency. We should have the “owner” of that subnet somewhere in the WHOIS data which is deterministic while the BGP isn’t.
There are “route” objects which should allow you to do what you want to do.
>>
>>> (c) In case of Amazon, download their feed, parse it and put the results in a temporary table.
>>> (d) Process a list of Autonomous Systems owned or controlled by Amazon.
>>
>> Where is this list coming from?
>
> Something similar to "countries.txt", I guess. It would definitely be something we will have to maintain
> on our own. A simple .txt file per 3rd party source, containing one ASN per line, would do, from my point
> of view.
Hmm, okay.
>>
>>> (d) Delete every IP network from this temporary table which is not announced by one of the Autonomous
>>> Systems. That way, we limit potential damage by a broken or manipulated Amazon IP feed to their ASNs.
>>
>> This is your second step (d).
>
> ?
You mis-numbered them. Never mind.
>
>>
>> When you say you are comparing this, what is the authority for this? The BGP feed? Whois?
>
> The BGP feed. We cannot rely on RIR data for this job, as they do not reflect reality and we don't have them
> for ARIN- and LACNIC-maintained space.
Well, interesting. I would have said the opposite.
>>> (e) Anything left in the temporary table is safe to go, and will be merged into the overrides table.
>>>
>>> Sounds a bit more complicated than my first patch looked, but is more versatile and robust. :-)
>>
>> I kind of liked the first patch. It was simple and it worked.
>
> Indeed. But it allowed Amazon to inject arbitrary data. This is bad enough for RIRs already, I do not want
> to extend the list of entities being able to do this to some profit-oriented companies...
I consider that a different problem. Not necessarily less important, but probably we should not try to fix all problems in only one patch.
>>
>>> Speaking of robustness, do we want a "source" column for the overrides table as well? Although it won't
>>> appear in the generated database or its .txt dump, it might be worth having, so we still have transparency
>>> on 3rd party feeds at this point.
>>
>> I do not think it is worth it, because it is easy to check. If you want it, I wouldn’t object either.
>
> Hm, it might not be that easy in production, since we do not store the raw contents of our IP feeds. Especially
> if there is a delta, finding out which entry in the overrides table came from which source could turn out to be
> tricky.
>
> Thanks, and best regards,
> Peter Müller
-Michael
* Re: Thoughts on importing IP feeds from Amazon, second attempt
2021-06-03 10:12 ` Michael Tremer
@ 2021-06-05 12:40 ` Peter Müller
2021-06-10 9:25 ` Michael Tremer
0 siblings, 1 reply; 4+ messages in thread
From: Peter Müller @ 2021-06-05 12:40 UTC (permalink / raw)
To: location
Hello Michael,
thanks for your reply.
>>>> (b) Introduce something like "location-importer update-3rd-party-feeds", which is a blanket function for
>>>> updating all the 3rd party feeds we will have at some day, as Amazon for sure won't be the only one.
>>>
>>> Does this need a third command? Why can this not be part of “update-whois”?
>>
>> Because we do not necessarily have the BGP data available at this step. If we want to build in AS-based
>> safeguards, we will have to parse 3rd party feeds after running "location-importer update-announcements".
>
> I would say that this is a difficult dependency. We should have the “owner” of that subnet somewhere in the WHOIS data which is deterministic while the BGP isn’t.
>
> There are “route” objects which should allow you to do what you want to do.
Unfortunately, we do not have these data for ARIN and LACNIC space - at least not in a manner we can properly
use in an automated way. For the rest of the RIRs, I agree.
>
>>>
>>>> (c) In case of Amazon, download their feed, parse it and put the results in a temporary table.
>>>> (d) Process a list of Autonomous Systems owned or controlled by Amazon.
>>>
>>> Where is this list coming from?
>>
>> Something similar to "countries.txt", I guess. It would definitely be something we will have to maintain
>> on our own. A simple .txt file per 3rd party source, containing one ASN per line, would do, from my point
>> of view.
>
> Hmm, okay.
>
>>>
>>>> (d) Delete every IP network from this temporary table which is not announced by one of the Autonomous
>>>> Systems. That way, we limit potential damage by a broken or manipulated Amazon IP feed to their ASNs.
>>>
>>> This is your second step (d).
>>
>> ?
>
> You mis-numbered them. Never mind.
>
>>
>>>
>>> When you say you are comparing this, what is the authority for this? The BGP feed? Whois?
>>
>> The BGP feed. We cannot rely on RIR data for this job, as they do not reflect reality and we don't have them
>> for ARIN- and LACNIC-maintained space.
>
> Well, interesting. I would have said the opposite.
Normative power of the factual (a bit phony, I guess): If something is announced via BGP, traffic to it will
be routed towards the announced ASN - things like RPKI not taken into account here -, no matter what a RIR database
contains for this network.
I don't like saying that, but I guess this is what we have.
>
>>>> (e) Anything left in the temporary table is safe to go, and will be merged into the overrides table.
>>>>
>>>> Sounds a bit more complicated than my first patch looked, but is more versatile and robust. :-)
>>>
>>> I kind of liked the first patch. It was simple and it worked.
>>
>> Indeed. But it allowed Amazon to inject arbitrary data. This is bad enough for RIRs already, I do not want
>> to extend the list of entities being able to do this to some profit-oriented companies...
>
> I consider that a different problem. Not necessarily less important, but probably we should not try to fix all problems in only one patch.
All right, I agree. Given the state of discussion, I propose to work on and submit a two-part patchset: The first
one adds a source column to the network_overrides table, so we can debug things better on our systems at least, while
the second one imports the Amazon AWS feed in the same way as I did initially.
We would then take care of additional safeguards later... *fingers crossed* :-]
Would you agree on that?
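To illustrate the first part, here is a rough sketch of what a source column would make possible. It uses an in-memory SQLite database only to stay self-contained; the table layout shown (just "network" and "source"), the helper names and the source labels are assumptions for this example and not the real network_overrides schema.

```python
import sqlite3

def import_overrides(conn, networks, source):
    """Store networks in a simplified overrides table, tagged with their source."""
    # Simplified, hypothetical layout: the real table has more columns
    conn.execute(
        "CREATE TABLE IF NOT EXISTS network_overrides ("
        "network TEXT PRIMARY KEY, source TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO network_overrides (network, source) VALUES (?, ?)",
        [(network, source) for network in networks],
    )
    conn.commit()

def networks_from(conn, source):
    """Return all override networks that were imported from the given source."""
    rows = conn.execute(
        "SELECT network FROM network_overrides WHERE source = ? ORDER BY network",
        (source,),
    )
    return [row[0] for row in rows]
```

With this in place, finding out which entries came from a particular feed becomes a single query, even when the raw feed contents are no longer available:

```python
conn = sqlite3.connect(":memory:")
import_overrides(conn, ["192.0.2.0/24"], "Amazon AWS IP feed")
import_overrides(conn, ["198.51.100.0/24"], "manual")
print(networks_from(conn, "Amazon AWS IP feed"))
```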
Thanks, and best regards,
Peter Müller
* Re: Thoughts on importing IP feeds from Amazon, second attempt
2021-06-05 12:40 ` Peter Müller
@ 2021-06-10 9:25 ` Michael Tremer
0 siblings, 0 replies; 4+ messages in thread
From: Michael Tremer @ 2021-06-10 9:25 UTC (permalink / raw)
To: location
Hello,
> On 5 Jun 2021, at 13:40, Peter Müller <peter.mueller(a)ipfire.org> wrote:
>
> Hello Michael,
>
> thanks for your reply.
>
>>>>> (b) Introduce something like "location-importer update-3rd-party-feeds", which is a blanket function for
>>>>> updating all the 3rd party feeds we will have at some day, as Amazon for sure won't be the only one.
>>>>
>>>> Does this need a third command? Why can this not be part of “update-whois”?
>>>
>>> Because we do not necessarily have the BGP data available at this step. If we want to build in AS-based
>>> safeguards, we will have to parse 3rd party feeds after running "location-importer update-announcements".
>>
>> I would say that this is a difficult dependency. We should have the “owner” of that subnet somewhere in the WHOIS data which is deterministic while the BGP isn’t.
>>
>> There are “route” objects which should allow you to do what you want to do.
>
> Unfortunately, we do not have these data for ARIN and LACNIC space - at least not in a manner we can properly
> use in an automated way. For the rest of the RIRs, I agree.
I hate that we have to have a lot of duplicate code to deal with two RIRs.
>>
>>>>
>>>>> (c) In case of Amazon, download their feed, parse it and put the results in a temporary table.
>>>>> (d) Process a list of Autonomous Systems owned or controlled by Amazon.
>>>>
>>>> Where is this list coming from?
>>>
>>> Something similar to "countries.txt", I guess. It would definitely be something we will have to maintain
>>> on our own. A simple .txt file per 3rd party source, containing one ASN per line, would do, from my point
>>> of view.
>>
>> Hmm, okay.
>>
>>>>
>>>>> (d) Delete every IP network from this temporary table which is not announced by one of the Autonomous
>>>>> Systems. That way, we limit potential damage by a broken or manipulated Amazon IP feed to their ASNs.
>>>>
>>>> This is your second step (d).
>>>
>>> ?
>>
>> You mis-numbered them. Never mind.
>>
>>>
>>>>
>>>> When you say you are comparing this, what is the authority for this? The BGP feed? Whois?
>>>
>>> The BGP feed. We cannot rely on RIR data for this job, as they do not reflect reality and we don't have them
>>> for ARIN- and LACNIC-maintained space.
>>
>> Well, interesting. I would have said the opposite.
>
> Normative power of the factual (a bit phony, I guess): If something is announced via BGP, traffic to it will
> be routed towards the announced ASN - things like RPKI not taken into account here -, no matter what a RIR database
> contains for this network.
>
> I don't like saying that, but I guess this is what we have.
RPKI uses the RIR database. So we would just enforce RPKI for the entire database. I guess it is pretty much the same.
>
>>
>>>>> (e) Anything left in the temporary table is safe to go, and will be merged into the overrides table.
>>>>>
>>>>> Sounds a bit more complicated than my first patch looked, but is more versatile and robust. :-)
>>>>
>>>> I kind of liked the first patch. It was simple and it worked.
>>>
>>> Indeed. But it allowed Amazon to inject arbitrary data. This is bad enough for RIRs already, I do not want
>>> to extend the list of entities being able to do this to some profit-oriented companies...
>>
>> I consider that a different problem. Not necessarily less important, but probably we should not try to fix all problems in only one patch.
>
> All right, I agree. Given the state of discussion, I propose to work on and submit a two-part patchset: The first
> one adds a source column to the network_overrides table, so we can debug things better on our systems at least, while
> the second one imports the Amazon AWS feed in the same way as I did initially.
>
> We would then take care of additional safeguards later... *fingers crossed* :-]
>
> Would you agree on that?
Okay. Merged.
>
> Thanks, and best regards,
> Peter Müller