public inbox for development@lists.ipfire.org
* [Discussion] Privacy and security for IPFire updates
@ 2018-04-10 17:15 Peter Müller
  2018-04-14  6:35 ` Matthias Fischer
  2018-04-16 11:23 ` Michael Tremer
From: Peter Müller @ 2018-04-10 17:15 UTC
  To: development


Hello,

a few days ago, I had a discussion with Michael about
privacy and security for IPFire updates (we mainly
focused on IPFire 3.x, but some points might be
applicable to 2.x, too).

In order to get some more opinions about this, I would
like to mention the main points here and ask for comments.
Forgive me if you receive this mail twice; I'll post
it on both development and mirror list.

(a) Security
Whenever you are updating an application or an entire
operating system, security is the most important
aspect: An attacker must not be able to manipulate update
packages, or fake information about the current patchlevel,
or something similar.

In the past, we saw several incidents here - perhaps
the most famous one was "Flame" (also known as "Flamer" or
"Skywiper"), a sophisticated malware which was detected
in Q1/2012 and spread via Windows Update with a valid
signature - and it turned out that strong cryptography
is a very good way to be more robust here.

In IPFire 2.x, we recently switched from SHA1 to SHA512
for the Pakfire signatures, and plan to change the signing
key (which is currently a 1024-bit-DSA one) with Core
Update 121 to a more secure one.

So far, so good.

Speaking about IPFire 3.x, we plan to download updates
over HTTPS only (see below why). However, there are still
a few questions left:
(i) Should we sign the mirror list?
In 3.x, using a custom mirror (e.g. in a company network) will
be possible. If not specified, we use the public mirror
infrastructure; a list of all servers and paths will
be published as it already is today.

In my opinion, we should sign that list, too, to prevent
an attacker from inserting his/her mirror silently. On
the other hand, packages are still signed, so a manipulation
here would not be possible (and we do not have to trust our
mirrors), but an attacker might still gather some metadata.

[The mirror list can be viewed at https://mirrors.ipfire.org/,
if anyone is interested.]
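
To make the verification step concrete, here is a minimal sketch in
Python (file names and key handling are hypothetical, not the actual
Pakfire code), checking a detached RSA/SHA512 signature with the
python-cryptography library:

    # Minimal sketch: verify a detached RSA/SHA512 signature over the
    # downloaded mirror list before parsing it. File names and key
    # handling are hypothetical.
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import padding

    def mirrorlist_is_authentic(list_path, sig_path, pubkey_path):
        with open(pubkey_path, "rb") as f:
            pubkey = serialization.load_pem_public_key(f.read())
        with open(list_path, "rb") as f:
            data = f.read()
        with open(sig_path, "rb") as f:
            signature = f.read()
        try:
            pubkey.verify(signature, data,
                          padding.PKCS1v15(), hashes.SHA512())
            return True   # authentic, safe to parse
        except InvalidSignature:
            return False  # discard and fall back to the base mirror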

(ii) Should we introduce signers?
A package built for IPFire 3.x will be signed at the builder
using a custom key for each machine. Since malicious activity
might take place during the build, the key might become
compromised.

Some Linux distributions are using dedicated signers, which
only sign data but never unpack or execute it. That
way, we could also move the signing keys to an HSM (example:
https://www.nitrokey.com/) and run the server at a secure
location (not in a public data centre).
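
As a rough sketch of that separation (the protocol, port and key
handling are invented for illustration, this is not an actual design):

    # Rough sketch of a dedicated signer: it accepts opaque bytes on a
    # socket and returns a signature, but never unpacks or executes
    # anything it receives. Protocol, port and HSM call are made up.
    import socketserver

    class SignHandler(socketserver.StreamRequestHandler):
        def handle(self):
            data = self.rfile.read()    # opaque payload (or its hash)
            self.wfile.write(hsm_sign(data))

    def hsm_sign(data):
        # placeholder: the actual signing would be delegated to the HSM
        raise NotImplementedError

    if __name__ == "__main__":
        with socketserver.TCPServer(("127.0.0.1", 7700), SignHandler) as srv:
            srv.serve_forever()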


(b) Privacy
Fetching updates typically leaks a lot of information (such
as your current patch level, or system architecture, or
IP address). By using HTTPS only, we avoid information leaks
to eavesdroppers, which I consider a security benefit, too.

However, a mirror operator still has access to that information.
Perhaps the IP address is the most critical one, since it
allows tracing a system back to a city/country, or even to
an organisation.

Because of that, I do consider mirrors to be somewhat critical,
and would like to see the list signed in 3.x, too.

(i) Should we introduce mirror servers in the Tor network?
One way to solve this problem is to download updates via a
proxy, or an anonymisation network. In most cases, Tor fits
the bill.

For best privacy, some mirror servers could be operated as
so-called "hidden services", so traffic won't even leave the
Tor network and pass some exit nodes. (Debian runs several
services that way, including package mirrors: https://onion.debian.org/ .)

Since Tor is considered bad traffic in some corporate networks
(or even some states), this technique should be disabled by
default.

What are your opinions here?

(ii) Reducing update connections to third parties
Some resources (GeoIP database, IDS rulesets, proxy blacklists)
are currently not fetched via the IPFire mirrors, causing
some of the problems mentioned above.

For example, to fetch the GeoIP database, all systems sooner
or later connect to "geolite.maxmind.com", so we can assume
they see a lot of IP addresses IPFire systems are located behind. :-\
Michael and I are currently working on a replacement for
this, called "libloc", but that is a different topic.

Pushing all these resources into packages (if they are
free, of course) and delivering them over our own mirrors would
reduce some traffic to third party servers here. For libloc,
we plan to do so.

Should we do this for other resources such as rulesets and
blacklists, too?


Looking forward to reading your comments.

Thanks, and best regards,
Peter Müller


* Re: [Discussion] Privacy and security for IPFire updates
  2018-04-10 17:15 [Discussion] Privacy and security for IPFire updates Peter Müller
@ 2018-04-14  6:35 ` Matthias Fischer
  2018-04-16 11:23 ` Michael Tremer
From: Matthias Fischer @ 2018-04-14  6:35 UTC
  To: development


Hi,

I'm more hardware-oriented, so I'm not so familiar with this kind of
security structure, but...

On 10.04.2018 19:15, Peter Müller wrote:
> Hello,
> 
> a few days ago, I had a discussion with Michael about
> privacy and security for IPFire updates (we mainly
> focused on IPFire 3.x, but some points might be
> applicable to 2.x, too).
> 
> In order to get some more opinions about this, I would
> like to mention the main points here and ask for comments.
> Forgive me if you receive this mail twice; I'll post
> it on both development and mirror list.
> 
> (a) Security
> Whenever you are updating an application or an entire
> operating system, security is the most important
> aspect: An attacker must not be able to manipulate update
> packages, or fake information about the current patchlevel,
> or something similar.
> 
> In the past, we saw several incidents here - perhaps
> the most famous one was "Flame" (also known as "Flamer" or
> "Skywiper"), a sophisticated malware which was detected
> in Q1/2012 and spread via Windows Update with a valid
> signature - and it turned out that strong cryptography
> is a very good way to be more robust here.
> 
> In IPFire 2.x, we recently switched from SHA1 to SHA512
> for the Pakfire signatures, and plan to change the signing
> key (which is currently a 1024-bit-DSA one) with Core
> Update 121 to a more secure one.
> 
> So far, so good.
> 
> Speaking about IPFire 3.x, we plan to download updates
> over HTTPS only (see below why). However, there are still
> a few questions left:
> (i) Should we sign the mirror list?
> In 3.x, using a custom mirror (e.g. in a company network) will
> be possible. If not specified, we use the public mirror
> infrastructure; a list of all servers and paths will
> be published as it already is today.
> 
> In my opinion, we should sign that list, too, to prevent
> an attacker from inserting his/her mirror silently. On
> the other hand, packages are still signed, so a manipulation
> here would not be possible (and we do not have to trust our
> mirrors), but an attacker might still gather some metadata.
> 
> [The mirror list can be viewed at https://mirrors.ipfire.org/,
> if anyone is interested.]

Jm2c:
Sounds reasonable. Agreed. Mirrors should be as secure as possible. Yes.

> (ii) Should we introduce signers?
> A package built for IPFire 3.x will be signed at the builder
> using a custom key for each machine. Since malicious activity
> might take place during the build, the key might become
> compromised.
> 
> Some Linux distributions are using dedicated signers, which
> only sign data but never unpack or execute it. That
> way, we could also move the signing keys to an HSM (example:
> https://www.nitrokey.com/) and run the server at a secure
> location (not in a public data centre).

Being more on the hardware side, I must confess, I don't know what the
main consequences would be. No firm opinion...

> 
> (b) Privacy
> Fetching updates typically leaks a lot of information (such
> as your current patch level, or system architecture, or
> IP address). By using HTTPS only, we avoid information leaks
> to eavesdroppers, which I consider a security benefit, too.

ACK.

> However, a mirror operator still has access to that information.
> Perhaps the IP address is the most critical one, since it
> allows tracing a system back to a city/country, or even to
> an organisation.
> 
> Because of that, I do consider mirrors to be somewhat critical,
> and would like to see the list signed in 3.x, too.

ACK.

> (i) Should we introduce mirror servers in the Tor network?
> One way to solve this problem is to download updates via a
> proxy, or an anonymisation network. In most cases, Tor fits
> the bill.
> 
> For best privacy, some mirror servers could be operated as
> so-called "hidden services", so traffic won't even leave the
> Tor network and pass some exit nodes. (Debian runs several
> services that way, including package mirrors: https://onion.debian.org/ .)
> 
> Since Tor is considered bad traffic in some corporate networks
> (or even some states), this technique should be disabled by
> default.
> 
> What are your opinions here?

I never used Tor and don't plan to use it in the future, so I can't say
much about this.

> (ii) Reducing update connections to third parties
> Some resources (GeoIP database, IDS rulesets, proxy blacklists)
> are currently not fetched via the IPFire mirrors, causing
> some of the problems mentioned above.
> 
> For example, to fetch the GeoIP database, all systems sooner
> or later connect to "geolite.maxmind.com", so we can assume
> they see a lot of IP addresses IPFire systems are located behind. :-\
> Michael and I are currently working on a replacement for
> this, called "libloc", but that is a different topic.
> 
> Pushing all these resources into packages (if they are
> free, of course) and delivering them over our own mirrors would
> reduce some traffic to third party servers here. For libloc,
> we plan to do so.
> 
> Should we do this for other resources such as rulesets and
> blacklists, too?

Same as above: I don't know what the main consequences would be and how
'libloc' really works. But it sounds reasonable. Yes.

> Looking forward to reading your comments.

Since you didn't get many answers until now, I hope this one helps a
bit... ;-)

> Thanks, and best regards,
> Peter Müller
> 

Best,
Matthias


* Re: [Discussion] Privacy and security for IPFire updates
  2018-04-10 17:15 [Discussion] Privacy and security for IPFire updates Peter Müller
  2018-04-14  6:35 ` Matthias Fischer
@ 2018-04-16 11:23 ` Michael Tremer
  2018-04-16 15:25   ` Peter Müller
From: Michael Tremer @ 2018-04-16 11:23 UTC
  To: development


Hello,

On Tue, 2018-04-10 at 19:15 +0200, Peter Müller wrote:
> Hello,
> 
> a few days ago, I had a discussion with Michael about
> privacy and security for IPFire updates (we mainly
> focused on IPFire 3.x, but some points might be
> applicable to 2.x, too).

It is good that we have a discussion about this in public, too. We
unfortunately have too many discussions in private, which isn't good.

But please, other people, contribute!

> In order to get some more opinions about this, I would
> like to mention the main points here and ask for comments.
> Forgive me if you receive this mail twice; I'll post
> it on both development and mirror list.

You can CC the same email to multiple lists at the same time. I did this now, so
people subscribed to both will only receive one.

We should also include the Pakfire list.

> (a) Security
> Whenever you are updating an application or an entire
> operating system, security is the most important
> aspect: An attacker must not be able to manipulate update
> packages, or fake information about the current patchlevel,
> or something similar.
> 
> In the past, we saw several incidents here - perhaps
> the most famous one was "Flame" (also known as "Flamer" or
> "Skywiper"), a sophisticated malware which was detected
> in Q1/2012 and spread via Windows Update with a valid
> signature - and it turned out that strong cryptography
> is a very good way to be more robust here.
> 
> In IPFire 2.x, we recently switched from SHA1 to SHA512
> for the Pakfire signatures, and plan to change the signing
> key (which is currently a 1024-bit-DSA one) with Core
> Update 121 to a more secure one.
> 
> So far, so good.

Keys in Pakfire 3 are usually 4k RSA keys, and we sign and hash with SHA512
only.

That is practically the best we can do right now.

> Speaking about IPFire 3.x, we plan to download updates
> over HTTPS only (see below why). However, there are still
> a few questions left:

We probably need to encourage some more people to move their mirrors to HTTPS
too before we can go large. But I would prefer to have HTTPS only and fewer
mirrors than having more mirrors with HTTP only.

However, this is only for privacy and not for security.

> (i) Should we sign the mirror list?
> In 3.x, using a custom mirror (e.g. in a company network) will
> be possible. If not specified, we use the public mirror
> infrastructure; a list of all servers and paths will
> be published as it already is today.

The list is a bit more complex than that, but essentially serves the same
purpose:

  https://pakfire.ipfire.org/distro/ipfire3/repo/stable/mirrorlist?arch=x86_64

This is what it looks like right now.

It is not required at all to sign this list for the integrity of the entire
package management system. The packages have a signature and it does not matter
if the package was downloaded from a source that was not trustworthy since the
signature is validated and either matches or it does not.

However, for the privacy argument, I can understand that there are some
arguments for signing it so that no man in the middle can add other mirrors and
gather information from any downloading clients.

The mirror list however is being downloaded over HTTPS and therefore we have
transport security. TLS can be man-in-the-middle-ed of course.

Generally I would like to allow users to download a package from a source
that we do not know or verify. Debian is making a far stronger point towards
that: they even ban HTTPS. They want bigger organizations to use proxy
servers that cache data and they want to give them the opportunity to redirect
them back to any self-hosted mirrors. That I completely regard out of scope for
us since we don't create anywhere near the traffic that Debian creates (because
both: size and number of users of our distribution). I would also like to
emphasize that we consider security first and then bandwidth use.

I consider the likelihood that an attacker is inserting malicious mirrors in
here very small. A signature on the list would also only show that we have seen
the same list that a client has downloaded.

When we add a mirror to our list, we do not conduct any form of audit, so that
it is even possible that some of the mirrors are compromised or configured in a
fashion that we would not prefer. That - by design - is not a problem for the
security of Pakfire. But it is possible that people just swap the files on the
servers. That is an attack vector that we cannot remove unless we host all
mirrors ourselves and never make any mistakes. Not going to happen.

Pakfire 2 has the mirror list being distributed over the mirrors. Therefore it
*is* signed.

Pakfire 3 has a different approach. A central service is creating that list on
demand and tries to *optimise* it for each client. That means putting mirrors
that are closer or have a bigger pipe to the top of the list. Not sure how good
our algorithm is right now, but we can change it on the server-side at any time
and changes on the list will propagate quicker than with Pakfire 2.
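
As an illustration only, the ordering could be as simple as the
following sketch; lookup_country() stands in for whatever GeoIP lookup
the server actually performs, and the real algorithm may differ:

    # Rough sketch of the server-side ordering described above:
    # mirrors in the client's country first, everything else after.
    # The mirror record layout and lookup_country() are assumptions.
    def build_mirrorlist(mirrors, client_ip, lookup_country):
        client_cc = lookup_country(client_ip)  # e.g. "DE"
        def rank(mirror):
            # 0 sorts before 1, so same-country mirrors come first
            return 0 if mirror["country"] == client_cc else 1
        return sorted(mirrors, key=rank)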

Pakfire 2 also only has one key that is used to sign everything. I do not intend
to explain here why that is a bad idea, but Pakfire 3 is not doing this any
more. In fact, packages can have multiple signatures.

That leads me to the question with what key the list should be signed. We would
need to sign maybe up to one-hundred lists per second since we generate them
live. We could now simplify the proximity algorithm so that each country only
gets one list or something similar and then deliver that list from cache.

I do not think that the main key of the repository is a good idea. Potentially
we should have an extra key just for the mirror lists on the server.

We would also need to let the signature expire so that mirrors that are found
out to be compromised are *actually* removed. At the moment the client keeps
using the mirror list until it can download a new one. What would happen if the
download is not possible but the signature on a previous list has expired?

We would also make the entire package management system very prone to clock
issues. If you are five minutes off, or an hour, the list could have expired and
you cannot download any packages any more or you would always fall back to the
main mirror.
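
One conceivable middle ground, sketched below with made-up numbers:
tolerate a list slightly past its expiry to absorb clock skew, and
only then fall back to the main mirror. This is an assumption about
policy, not existing Pakfire behaviour:

    # Sketch only: allow some clock skew on the signature expiry
    # before falling back. Grace period and field names are made up.
    import time

    MAIN_MIRROR = "https://pakfire.ipfire.org/"  # hypothetical fallback
    GRACE = 6 * 3600  # accept lists up to six hours past expiry

    def usable_mirrors(mirrorlist):
        if time.time() <= mirrorlist["expires"] + GRACE:
            return mirrorlist["mirrors"]  # valid or tolerably stale
        return [MAIN_MIRROR]  # well past expiry: base mirror only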

> In my opinion, we should sign that list, too, to prevent
> an attacker from inserting his/her mirror silently. On
> the other hand, packages are still signed, so a manipulation
> here would not be possible (and we do not have to trust our
> mirrors), but an attacker might still gather some metadata.

So to bring this to a conclusion: what I want to say here is that I do not have
a problem with it being signed. I just have a problem with all the new problems
being created. If you can give me answers to the questions above and we can come
up with an approach that improves security and privacy and also does not make
bootstrapping a new system a pain in the rear end, then I am up for it.

But it will by design be a weak signature. We could probably not put the key
into an HSM, etc.

> [The mirror list can be viewed at https://mirrors.ipfire.org/,
> if anyone is interested.]

Pakfire 3 has its mirrors here: https://pakfire.ipfire.org/mirrors

> (ii) Should we introduce signers?
> A package built for IPFire 3.x will be signed at the builder
> using a custom key for each machine. Since malicious activity
> might take place during the build, the key might become
> compromised.
> 
> Some Linux distributions are using dedicated signers, which
> only sign data but never unpack or execute it. That
> way, we could also move the signing keys to an HSM (example:
> https://www.nitrokey.com/) and run the server at a secure
> location (not in a public data centre).

I am in favour of this.

This is just very hard for us to do. Can we bring the entire build service back
to work again and then add this?

It is not very straightforward, and since we won't have builders and the signers
in the same DC, we would need to have a way to either transfer the package
securely or do some remote signing. Neither sounds like a good idea.

> (b) Privacy
> Fetching updates typically leaks a lot of information (such
> as your current patch level, or system architecture, or
> IP address). By using HTTPS only, we avoid information leaks
> to eavesdroppers, which I consider a security benefit, too.
> 
> However, a mirror operator still has access to that information.
> Perhaps the IP address is the most critical one, since it
> allows tracing a system back to a city/country, or even to
> an organisation.

We hosted a ClamAV mirror once and it was very interesting to see
this.

Also, many mirrors seem to open up the usage statistics through webalizer. So
this will indeed be a public record.

> Because of that, I do consider mirrors to be somewhat critical,
> and would like to see the list signed in 3.x, too.

As stated above, I do not think that this gets rid of the problem that you are
describing here.

> (i) Should we introduce mirror servers in the Tor network?
> One way to solve this problem is to download updates via a
> proxy, or an anonymisation network. In most cases, Tor fits
> the bill.
> 
> For best privacy, some mirror servers could be operated as
> so-called "hidden services", so traffic won't even leave the
> Tor network and pass some exit nodes. (Debian runs several
> services that way, including package mirrors: https://onion.debian.org/ .)
> 
> Since Tor is considered bad traffic in some corporate networks
> (or even some states), this technique should be disabled by
> default.
> 
> What are your opinions here?

I have never hosted a hidden service on Tor. I do not see a problem with that.
It might only be that a very tiny number of people are going to use this, and
therefore it is a lot of work with only a few people benefiting from
it.

What does it need so that Pakfire would be able to connect to the Tor network?
How would this look like from a user's perspective? Where is this being
configured? How do we send mirror lists or repository information?

> (ii) Reducing update connections to third parties
> Some resources (GeoIP database, IDS rulesets, proxy blacklists)
> are currently not fetched via the IPFire mirrors, causing
> some of the problems mentioned above.
> 
> For example, to fetch the GeoIP database, all systems sooner
> or later connect to "geolite.maxmind.com", so we can assume
> they see a lot of IP addresses IPFire systems are located behind. :-\
> Michael and I are currently working on a replacement for
> this, called "libloc", but that is a different topic.

This is a huge problem for me. We cannot rely on any third parties any more. I
guess the reports in the media over the last days and weeks have proven that
there is too much of a conflict of interest. There are no free services from an
organization that is trying to make billions of dollars.

Since it is very hard to get consent from the IPFire users on every one of those, we
should just get everything from one entity only.

> Pushing all these resources into packages (if they are
> free, of course) and delivering them over our own mirrors would
> reduce some traffic to third party servers here. For libloc,
> we plan to do so.

If by package, you are NOT referring to a package in the sense of a pakfire
package, then I agree.

> Should we do this for other resources such as rulesets and
> blacklists, too?

Ideally yes, but realistically we cannot reinvent everything ourselves. I am
personally involved in too many of these side projects, so there is only
little time for the main thing. So I would rather consider that we work together
with the blacklist people, or just leave it for now. I guess that is a thing for
the blacklists because they are opt-in. People have to pick one and it is
obvious that something is being downloaded. However, it is not obvious what the
dangers are. The geo IP database however is not opt-in. And it isn't opt-out
either.

> Looking forward to reading your comments.

Sorry this took a little while.

> 
> Thanks, and best regards,
> Peter Müller

-Michael


* Re: [Discussion] Privacy and security for IPFire updates
  2018-04-16 11:23 ` Michael Tremer
@ 2018-04-16 15:25   ` Peter Müller
  2018-04-16 21:12     ` Michael Tremer
From: Peter Müller @ 2018-04-16 15:25 UTC
  To: development


Hello,
> [...]
>> (a) Security
>> Whenever you are updating an application or an entire
>> operating system, security is the most important
>> aspect: An attacker must not be able to manipulate update
>> packages, or fake information about the current patchlevel,
>> or something similar.
>>
>> In the past, we saw several incidents here - perhaps
>> the most famous one was "Flame" (also known as "Flamer" or
>> "Skywiper"), a sophisticated malware which was detected
>> in Q1/2012 and spread via Windows Update with a valid
>> signature - and it turned out that strong cryptography
>> is a very good way to be more robust here.
>>
>> In IPFire 2.x, we recently switched from SHA1 to SHA512
>> for the Pakfire signatures, and plan to change the signing
>> key (which is currently a 1024-bit-DSA one) with Core
>> Update 121 to a more secure one.
>>
>> So far, so good.
> 
> Keys in Pakfire 3 are usually 4k RSA keys, and we sign and hash with SHA512
> only.
> 
> That is practically the best we can do right now.
I agree. ECC cryptography might become relevant some day here,
too (RSA does not scale well), but first things first.
> 
>> Speaking about IPFire 3.x, we plan to download updates
>> over HTTPS only (see below why). However, there are still
>> a few questions left:
> 
> We probably need to encourage some more people to move their mirrors to HTTPS
> too before we can go large. But I would prefer to have HTTPS only and fewer
> mirrors than having more mirrors with HTTP only.
>
> However, this is only for privacy and not for security.
I agree.
> 
>> (i) Should we sign the mirror list?
>> In 3.x, using a custom mirror (e.g. in a company network) will
>> be possible. If not specified, we use the public mirror
>> infrastructure; a list of all servers and paths will
>> be published as it already is today.
> 
> The list is a bit more complex than that, but essentially serves the same
> purpose:
> 
>   https://pakfire.ipfire.org/distro/ipfire3/repo/stable/mirrorlist?arch=x86_64
> 
> This is what it looks like right now.
> 
> It is not required at all to sign this list for the integrity of the entire
> package management system. The packages have a signature and it does not matter
> if the package was downloaded from a source that was not trustworthy since the
> signature is validated and either matches or it does not.
> 
> However, for the privacy argument, I can understand that there are some
> arguments for signing it so that no man in the middle can add other mirrors and
> gather information from any downloading clients.
> 
> The mirror list however is being downloaded over HTTPS and therefore we have
> transport security. TLS can be man-in-the-middle-ed of course.
> 
> Generally I would like to allow users to download a package from a source
> that we do not know or verify. Debian is making a far stronger point towards
> that: they even ban HTTPS. They want bigger organizations to use proxy
> servers that cache data and they want to give them the opportunity to redirect
> them back to any self-hosted mirrors. That I completely regard out of scope for
> us since we don't create anywhere near the traffic that Debian creates (because
> both: size and number of users of our distribution). I would also like to
> emphasize that we consider security first and then bandwidth use.
> 
> I consider the likelihood that an attacker is inserting malicious mirrors in
> here very small. A signature on the list would also only show that we have seen
> the same list that a client has downloaded.
> 
> When we add a mirror to our list, we do not conduct any form of audit, so that
> it is even possible that some of the mirrors are compromised or configured in a
> fashion that we would not prefer. That - by design - is not a problem for the
> security of Pakfire. But it is possible that people just swap the files on the
> servers. That is an attack vector that we cannot remove unless we host all
> mirrors ourselves and never make any mistakes. Not going to happen.
Another point I see here is that an attacker running an evil mirror
might deny the existence of new updates by simply not publishing them.

Of course, we might detect that sooner or later via a monitoring tool,
but in combination with an unsigned mirror list this point becomes
more relevant.

Should we publish the current update state (called "Core Update" in 2.x,
not sure if it exists in 3.x) via DNS, too? That way, we could avoid
pings to the mirrors, so installations only need to connect in case an
update has been announced.
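
A minimal sketch of that idea with dnspython; the record name and the
"core=..." format are invented for illustration:

    # Sketch of announcing the patch level via DNS (dnspython >= 2.0).
    # Record name and "core=121" format are made up for illustration.
    import dns.resolver

    def announced_core_update():
        for rdata in dns.resolver.resolve("_update.ipfire.org", "TXT"):
            txt = b"".join(rdata.strings).decode()
            if txt.startswith("core="):
                return int(txt.split("=", 1)[1])
        return None

    # A client would only contact a mirror if the announced level
    # is newer than its own.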
> 
> Pakfire 2 has the mirror list being distributed over the mirrors. Therefore it
> *is* signed.
> 
> Pakfire 3 has a different approach. A central service is creating that list on
> demand and tries to *optimise* it for each client. That means putting mirrors
> that are closer or have a bigger pipe to the top of the list. Not sure how good
> our algorithm is right now, but we can change it on the server-side at any time
> and changes on the list will propagate quicker than with Pakfire 2.
There are two points where I have a different opinion:
(a) If I got it right, every client needs to connect to this central
server sometimes, which I consider quite bad for various reasons
(privacy, missing redundancy, etc.). If we distributed the mirror list,
we would only need to connect once, at the start, to learn which mirrors are out
there.

After that, a client can use a cached list, and fetch updates from any
mirror. In case we have a system at the other end of the world, we also
avoid connectivity issues, as we currently observe them in connection
with mirrors in Ecuador.

(b) It might be a postmaster's disease, but I was never a fan of moving
knowledge from client to server (my favorite example here is MX records,
which work much better than implementing fail-over and load balancing on
the server side).

An individual list for every client is very hard to debug, since it
becomes difficult to reproduce a connectivity scenario if you do not
know which servers the client saw. Second, we have a server side
bottleneck here (signing!) and need an always-online key if we decide to
sign that list, anyway.

I have not taken a look at the algorithm yet, but the idea is to prioritise
mirror servers located near the client, assuming that geographic
distance correlates with network distance today (not sure if that is
correct anyway, but it is definitely better than in the 90s).

The only problem here is to determine which public IP a client has. But
there are ways to work around this, and in the end, we'll probably solve
most of the issues (especially dealing with signature expiry times) you
mentioned.

Any thoughts? :-)
> 
> Pakfire 2 also only has one key that is used to sign everything. I do not intend
> to explain here why that is a bad idea, but Pakfire 3 is not doing this any
> more. In fact, packages can have multiple signatures.
> 
> That leads me to the question with what key the list should be signed. We would
> need to sign maybe up to one-hundred lists per second since we generate them
> live. We could now simplify the proximity algorithm so that each country only
> gets one list or something similar and then deliver that list from cache.
See above, I do not consider this necessary.
> 
> I do not think that the main key of the repository is a good idea. Potentially
> we should have an extra key just for the mirror lists on the server.
Either way, I agree here.
> 
> We would also need to let the signature expire so that mirrors that are found
> out to be compromised are *actually* removed. At the moment the client keeps
> using the mirror list until it can download a new one. What would happen if the
> download is not possible but the signature on a previous list has expired?
Since this is a scenario which might happen any time, I'd consider
falling back to the main mirror the best alternative.
> 
> We would also make the entire package management system very prone to clock
> issues. If you are five minutes off, or an hour, the list could have expired and
> you cannot download any packages any more or you would always fall back to the
> main mirror.
Another problem solved by a more intelligent client. :-) :-) :-)
> 
>> In my opinion, we should sign that list, too, to prevent
>> an attacker from inserting his/her mirror silently. On
>> the other hand, packages are still signed, so a manipulation
>> here would not be possible (and we do not have to trust our
>> mirrors), but an attacker might still gather some metadata.
> 
> So to bring this to a conclusion: what I want to say here is that I do not have
> a problem with it being signed. I just have a problem with all the new problems
> being created. If you can give me answers to the questions above and we can come
> up with an approach that improves security and privacy and also does not make
> bootstrapping a new system a pain in the rear end, then I am up for it.
> 
> But it will by design be a weak signature. We could probably not put the key
> into an HSM, etc.
In case we do not use individual mirror lists, using a key stored in an
HSM would be possible here.
> 
>> [The mirror list can be viewed at https://mirrors.ipfire.org/,
>> if anyone is interested.]
> 
> Pakfire 3 has its mirrors here: https://pakfire.ipfire.org/mirrors
> 
>> (ii) Should we introduce signers?
>> A package built for IPFire 3.x will be signed at the builder
>> using a custom key for each machine. Since malicious activity
>> might take place during the build, the key might become
>> compromised.
>>
>> Some Linux distributions are using dedicated signers, which
>> only sign data but never unpack or execute it. That
>> way, we could also move the signing keys to an HSM (example:
>> https://www.nitrokey.com/) and run the server at a secure
>> location (not in a public data centre).
> 
> I am in favour of this.
> 
> This is just very hard for us to do. Can we bring the entire build service back
> to work again and then add this?
> 
> It is not very straightforward, and since we won't have builders and the signers
> in the same DC, we would need to have a way to either transfer the package
> securely or do some remote signing. Neither sounds like a good idea.
Assuming both builder and signer have good connectivity, transferring a
package securely sounds good. To avoid MITM attacks, a sort of "builder
signature" might be useful - in the end, a package has two or three
signatures then:

First, it is signed by the builder, to prove that it was built on that
machine (in case a package turns out to be compromised, this makes
tracing much easier) and that it was transferred correctly to the signer.
Second, the signer adds its signature, which is assumed to be trusted by
Pakfire clients here. If not, we need a sort of "master key", too, but I
thought that's what we wanted to avoid here.
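
In Python-flavoured pseudocode, the trust decision might look like
this; verify() is a placeholder for the real cryptographic check, and
the package layout is assumed:

    # Sketch of the two-signature idea: installation requires a valid
    # signature from a trusted *signer* key; the builder signature is
    # only recorded to help tracing.
    def package_trusted(package, signer_keys, builder_keys, verify):
        signer_ok = any(verify(sig, key, package.payload)
                        for sig in package.signatures
                        for key in signer_keys)
        builder_ok = any(verify(sig, key, package.payload)
                         for sig in package.signatures
                         for key in builder_keys)
        if builder_ok:
            pass  # log the builder identity for later tracing
        return signer_ok  # a builder key alone must never grant trust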
> 
>> (b) Privacy
>> Fetching updates typically leaks a lot of information (such
>> as your current patch level, or system architecture, or
>> IP address). By using HTTPS only, we avoid information leaks
>> to eavesdroppers, which I consider a security benefit, too.
>>
>> However, a mirror operator still has access to that information.
>> Perhaps the IP address is the most critical one, since it
>> allows tracing a system back to a city/country, or even to
>> an organisation.
> 
> We hosted a ClamAV mirror once and it was very interesting to see
> this.
> 
> Also, many mirrors seem to open up the usage statistics through webalizer. So
> this will indeed be a public record.
> 
>> Because of that, I do consider mirrors to be somewhat critical,
>> and would like to see the list signed in 3.x, too.
> 
> As stated above, I do not think that this gets rid of the problem that you are
> describing here.
> 
>> (i) Should we introduce mirror servers in the Tor network?
>> One way to solve this problem is to download updates via a
>> proxy, or an anonymisation network. In most cases, Tor fits
>> the bill.
>>
>> For best privacy, some mirror servers could be operated as
>> so-called "hidden services", so traffic won't even leave the
>> Tor network and pass some exit nodes. (Debian runs several
>> services that way, including package mirrors: https://onion.debian.org/ .)
>>
>> Since Tor is considered bad traffic in some corporate networks
>> (or even some states), this technique should be disabled by
>> default.
>>
>> What are your opinions here?
> 
> I have never hosted a hidden service on Tor. I do not see a problem with that.
> It might only be that a very tiny number of people are going to use this, and
> therefore it is a lot of work with only a few people benefiting from
> it.
Well, setting up a Tor mirror server is not very hard (_securing_ it is
the hard task here :-) ), but I am not sure how much development effort
that will require.
> 
> What does it need so that Pakfire would be able to connect to the Tor network?
> How would this look like from a user's perspective? Where is this being
> configured? How do we send mirror lists or repository information?
(i) You can connect to a locally running Tor daemon (which is probably
what we have on IPFire systems) via SOCKS (see the sketch after this
list). To provide an HTTP proxy, some additional software is needed
(polipo, see here for a configuration example:
https://www.marcus-povey.co.uk/2016/03/24/using-tor-as-a-http-proxy/).

(ii) What does "user's perspective" mean here? Of course, transferring
files over Tor is slower, but that does not really matter since updates
are not that time critical.

(iii) /etc/tor/torrc (and the Pakfire configuration, which I do not know yet).
(iv) As usual, it does not make any difference whether a mirror is
accessed via Tor or in plaintext.

A good example might be apt-transport-tor
(https://packages.debian.org/stretch/apt-transport-tor), not sure how
well it fits on IPFire.
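
To make (i) concrete, a minimal sketch of a download through the
local Tor daemon, using Python requests with the SOCKS extra
(pip install requests[socks]); the onion address is made up:

    # Sketch: fetch over a local Tor daemon via SOCKS. "socks5h" lets
    # Tor resolve DNS as well, so no DNS queries leak past the tunnel.
    import requests

    TOR_PROXY = {
        "http": "socks5h://127.0.0.1:9050",
        "https": "socks5h://127.0.0.1:9050",
    }

    response = requests.get("http://examplemirror.onion/mirrorlist",
                            proxies=TOR_PROXY, timeout=60)
    response.raise_for_status()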
> 
>> (ii) Reducing update connections to third parties
>> Some resources (GeoIP database, IDS rulesets, proxy blacklists)
>> are currently not fetched via the IPFire mirrors, causing
>> some of the problems mentioned above.
>>
>> For example, to fetch the GeoIP database, all systems sooner
>> or later connect to "geolite.maxmind.com", so we can assume
>> they see a lot of IP addresses IPFire systems are located behind. :-\
>> Michael and I are currently working on a replacement for
>> this, called "libloc", but that is a different topic.
> 
> This is a huge problem for me. We cannot rely on any third parties any more. I
> guess the reports in the media over the last days and weeks have proven that
> there is too much of a conflict of interest. There are no free services from an
> organization that is trying to make billions of dollars.
> 
> Since it is very hard to get consent from the IPFire users on every one of those, we
> should just get everything from one entity only.
ACK.
> 
>> Pushing all these resources into packages (if they are
>> free, of course) and delivering them over our own mirrors would
>> reduce some traffic to third party servers here. For libloc,
>> we plan to do so.
> 
> If by package, you are NOT referring to a package in the sense of a pakfire
> package, then I agree.
> 
>> Should we do this for other resources such as rulesets and
>> blacklists, too?
> 
> Ideally yes, but realistically we cannot reinvent everything ourselves. I am
> personally involved in too many of these side projects, so there is only
> little time for the main thing. So I would rather consider that we work together
> with the blacklist people, or just leave it for now. I guess that is a thing for
> the blacklists because they are opt-in. People have to pick one and it is
> obvious that something is being downloaded. However, it is not obvious what the
> dangers are. The geo IP database however is not opt-in. And it isn't opt-out
> either.
Since blocklists do not eat up much disk space, I'd say we host
everything ourselves that we can (Emerging Threats IDS signatures, or
Spamhaus DROP if we want to implement that some time, ...).

But we probably need to get in touch with the maintainers first.
> 
>> Looking forward to reading your comments.
> 
> Sorry this took a little while.
No problem. :-)

Best regards,
Peter Müller


* Re: [Discussion] Privacy and security for IPFire updates
  2018-04-16 15:25   ` Peter Müller
@ 2018-04-16 21:12     ` Michael Tremer
  2018-04-21 17:55       ` Peter Müller
From: Michael Tremer @ 2018-04-16 21:12 UTC
  To: development


On Mon, 2018-04-16 at 17:25 +0200, Peter Müller wrote:
> Hello,
> > [...]
> > > (a) Security
> > > Whenever you are updating an application or an entire
> > > operating systems, security is the most important
> > > aspect: An attacker must not be able to manipulate update
> > > packages, or fake information about the current patchlevel,
> > > or something similar.
> > > 
> > > In the past, we saw several incidents here - perhaps
> > > the most famous one was "Flame" (also known as "Flamer" or
> > > "Skywiper"), a sophisticated malware which was detected
> > > in Q1/2012 and spread via Windows Update with a valid
> > > signature - and it turned out that strong cryptography
> > > is a very good way to be more robust here.
> > > 
> > > In IPFire 2.x, we recently switched from SHA1 to SHA512
> > > for the Pakfire signatures, and plan to change the signing
> > > key (which is currently a 1024-bit-DSA one) with Core
> > > Update 121 to a more secure one.
> > > 
> > > So far, so good.
> > 
> > Keys in Pakfire 3 are usually 4k RSA keys, and we sign and hash with SHA512
> > only.
> > 
> > That is practically the best we can do right now.
> 
> I agree. ECC cryptography might become relevant some day here,
> too (RSA does not scale well), but first things first.

I think we could technically use ECC here since we control all of the
stack, but I think RSA could still have better compatibility.

> > > Speaking about IPFire 3.x, we plan to download updates
> > > over HTTPS only (see below why). However, there are still
> > > a few questions left:
> > 
> > We probably need to encourage some more people to move their mirrors to HTTPS
> > too before we can go large. But I would prefer to have HTTPS only and fewer
> > mirrors than having more mirrors with HTTP only.
> >
> > However, this is only for privacy and not for security.
> 
> I agree.
> > > (i) Should we sign the mirror list?
> > > In 3.x, using a custom mirror (e.g. in a company network) will
> > > be possible. If not specified, we use the public mirror
> > > infrastructure; a list of all servers and paths will
> > > be published as it already is today.
> > 
> > The list is a bit more complex than that, but essentially serves the same
> > purpose:
> > 
> >   https://pakfire.ipfire.org/distro/ipfire3/repo/stable/mirrorlist?arch=x86_64
> > 
> > This is what it looks like right now.
> > 
> > It is not required at all to sign this list for the integrity of the entire
> > package management system. The packages have a signature and it does not matter
> > if the package was downloaded from a source that was not trustworthy since the
> > signature is validated and either matches or it does not.
> > 
> > However, for the privacy argument, I can understand that there are some
> > arguments for signing it so that no man in the middle can add other mirrors and
> > gather information from any downloading clients.
> > 
> > The mirror list however is being downloaded over HTTPS and therefore we have
> > transport security. TLS can be man-in-the-middle-ed of course.
> > 
> > Generally I would like to allow users to download a package from a source
> > that we do not know or verify. Debian is making a far stronger point towards
> > that: they even ban HTTPS. They want bigger organizations to use proxy
> > servers that cache data and they want to give them the opportunity to redirect
> > them back to any self-hosted mirrors. That I completely regard out of scope for
> > us since we don't create anywhere near the traffic that Debian creates (because
> > both: size and number of users of our distribution). I would also like to
> > emphasize that we consider security first and then bandwidth use.
> > 
> > I consider the likelihood that an attacker is inserting malicious mirrors in
> > here very small. A signature on the list would also only show that we have seen
> > the same list that a client has downloaded.
> > 
> > When we add a mirror to our list, we do not conduct any form of audit, so that
> > it is even possible that some of the mirrors are compromised or configured in a
> > fashion that we would not prefer. That - by design - is not a problem for the
> > security of Pakfire. But it is possible that people just swap the files on the
> > servers. That is an attack vector that we cannot remove unless we host all
> > mirrors ourselves and never make any mistakes. Not going to happen.
> 
> Another point I see here is that an attacker running an evil mirror
> might deny the existence of new updates by simply not publishing them.

Yes, this is an attack vector and an easy one.

We have a timestamp in the repository metadata that is downloaded
first. It also has a hash of the latest version of the package
database. The client will walk along all mirrors until it can
download it. The last place will be the base mirror that will have it.

  https://pakfire.ipfire.org/repositories/ipfire3/stable/x86_64/repodata/repomd.json

However, the repository metadata is not signed (as it would be in DNS),
but I would argue that it should be.

It is kind of undefined what will happen when no repository data can
be downloaded at all, or none within an interval of about a week.
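
A sketch of what a defined behaviour could look like, assuming a
Unix-time "timestamp" field in repomd.json (the real field names may
differ):

    # Sketch: treat the repository as stale if its metadata is older
    # than a week. Field name "timestamp" is an assumption.
    import time
    import requests

    MAX_AGE = 7 * 24 * 3600  # one week

    def repo_is_fresh(repomd_url):
        metadata = requests.get(repomd_url, timeout=30).json()
        return time.time() - metadata["timestamp"] <= MAX_AGE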

> Of course, we might detect that sooner or later via a monitoring tool,
> but in combination with an unsigned mirror list this point becomes
> more relevant.

Monitoring is good. It ensures the quality of the mirroring. But the
system itself needs to be resilient against this sort of attack.

> Should we publish the current update state (called "Core Update" in 2.x,
> not sure if it exists in 3.x) via DNS, too? That way, we could avoid
> pings to the mirrors, so installations only need to connect in case an
> update has been announced.

They would only download the metadata from the main service and there
would be no need to download the large database again. We
have to assume that people have a slow connection and bandwidth is
expensive.

> > Pakfire 2 has the mirror list being distributed over the mirrors. Therefore it
> > *is* signed.
> > 
> > Pakfire 3 has a different approach. A central service is creating that list on
> > demand and tries to *optimise* it for each client. That means putting mirrors
> > that are closer or have a bigger pipe to the top of the list. Not sure how good
> > our algorithm is right now, but we can change it on the server-side at any time
> > and changes on the list will propagate quicker than with Pakfire 2.
> 
> There are two points where I have a different opinion:
> (a) If I got it right, every client needs to connect to this central
> server sometimes, which I consider quite bad for various reasons
> (privacy, missing redundancy, etc.). If we distributed the mirror list,
> we would only need to connect once, at the start, to learn which mirrors are out
> there.

A decentralised system is better, but I do not see how we can achieve
this. A distributed list could of course not be signed.

> After that, a client can use a cached list, and fetch updates from any
> mirror. In case we have a system at the other end of the world, we also
> avoid connectivity issues, as we currently observe them in connection
> with mirrors in Ecuador.

A client can use a cached list now. The list is only refreshed once a
day (I think). Updates can then be fetched from any mirror as long as
the repository data is recent.

> (b) It might be a postmaster's disease, but I was never a fan of moving
> knowledge from client to server (my favorite example here is MX records,
> which work much better than implementing fail-over and load balancing on
> the server side).
> 
> An individual list for every client is very hard to debug, since it
> becomes difficult to reproduce a connectivity scenario if you do not
> know which servers the client saw. Second, we have a server side
> bottleneck here (signing!) and need an always-online key if we decide to
> sign that list, anyway.

We do not really care about any connectivity issues. There might be
many reasons for that and I do not want to debug any mirror issues. The
client just needs to move on to the next one.

> I have not taken a look at the algorithm yet, but the idea is to prioritise
> mirror servers located near the client, assuming that geographic
> distance correlates with network distance today (not sure if that is
> correct anyway, but it is definitely better than in the 90s).

It puts everything in the same country to the top and all the rest to
the bottom.

It correlates, but that is it. We should have, for each country, a
list of nearby countries. It would make sense to group them together by
continent, etc. But that is for somewhere else.

Basically the client has no way to measure "distance" or "speed". And I
do not think it is right to implement this in the client. Just a GeoIP
lookup requires to resolve DNS for all mirrors and then perform the
database lookup. That takes a long time and I do not see why this is
much better than the server-side approach.

> The only problem here is to determine which public IP a client has. But
> there are ways to work around this, and in the end, we'll probably solve
> most of the issues (especially dealing with signature expiry times) you
> mentioned.

Determining the public IP is a huge problem. See ddns.

> Any thoughts? :-)

Yeah, you didn't convince me by assuring that there will be a solution.
This can be implemented. But is this worth the work and creating a much
more complex system to solve a problem only half-way?

:)

> > 
> > Pakfire 2 also only has one key that is used to sign everything. I do not intend
> > to explain here why that is a bad idea, but Pakfire 3 is not doing this any
> > more. In fact, packages can have multiple signatures.
> > 
> > That leads me to the question with what key the list should be signed. We would
> > need to sign maybe up to one-hundred lists per second since we generate them
> > live. We could now simplify the proximity algorithm so that each country only
> > gets one list or something similar and then deliver that list from cache.
> 
> See above, I do not consider this necessary.
> > 
> > I do not think that the main key of the repository is a good idea. Potentially
> > we should have an extra key just for the mirror lists on the server.
> 
> Either way, I agree here.
> > 
> > We would also need to let the signature expire so that mirrors that are found
> > out to be compromised are *actually* removed. At the moment the client keeps
> > using the mirror list until it can download a new one. What would happen if the
> > download is not possible but the signature on a previous list has expired?
> 
> Since this is a scenario which might happen any time, I'd consider
> falling back to the main mirror the best alternative.

What if that is compromised? What would be the contingency plan there?

> > We would also make the entire package management system very prone to clock
> > issues. If you are five minutes off, or an hour, the list could have expired and
> > you cannot download any packages any more or you would always fall back to the
> > main mirror.
> 
> Another problem solved by a more intelligent client. :-) :-) :-)

How? Please provide detail. Pakfire should not set the system clock.
Ever. That is totally out of scope.

> > > In my opinion, we should sign that list, too, to prevent
> > > an attacker from inserting his/her mirror silently. On
> > > the other hand, packages are still signed, so a manipulation
> > > here would not be possible (and we do not have to trust our
> > > mirrors), but an attacker might still gather some metadata.
> > 
> > So to bring this to a conclusion: what I want to say here is that I do not have
> > a problem with it being signed. I just have a problem with all the new problems
> > being created. If you can give me answers to the questions above and we can come
> > up with an approach that improves security and privacy and also does not make
> > bootstrapping a new system a pain in the rear end, then I am up for it.
> > 
> > But it will by design be a weak signature. We could probably not put the key
> > into an HSM, etc.
> 
> In case we do not use individual mirror lists, using a key stored in an
> HSM would be possible here.

Would bring us back to the signers again. It is hard to do this in a
VM.

> > 
> > > [The mirror list can be viewed at https://mirrors.ipfire.org/,
> > > if anyone is interested.]
> > 
> > Pakfire 3 has its mirrors here: https://pakfire.ipfire.org/mirrors
> > 
> > > (ii) Should we introduce signers?
> > > A package built for IPFire 3.x will be signed at the builder
> > > using a custom key for each machine. Since malicious activity
> > > might take place during the build, the key might become
> > > compromised.
> > > 
> > > Some Linux distributions are using dedicated signers, which
> > > only sign data but never unpack or execute it. That
> > > way, we could also move the signing keys to an HSM (example:
> > > https://www.nitrokey.com/) and run the server at a secure
> > > location (not in a public data centre).
> > 
> > I am in favour of this.
> > 
> > This is just very hard for us to do. Can we bring the entire build service back
> > to work again and then add this?
> > 
> > It is not very straightforward, and since we won't have builders and the signers
> > in the same DC, we would need to have a way to either transfer the package
> > securely or do some remote signing. Neither sounds like a good idea.
> 
> Assuming both builder and signer have good connectivity, transferring a
> package securely sounds good. To avoid MITM attacks, a sort of "builder
> signature" might be useful - in the end, a package has two or three
> signatures then:
> 
> First, it is signed by the builder, to prove that it was built on that
> machine (in case a package turns out to be compromised, this makes
> tracing much easier) and that it was transferred correctly to the signer.
> Second, the signer adds its signature, which is assumed to be trusted by
> Pakfire clients here. If not, we need a sort of "master key", too, but I
> thought that's what we wanted to avoid here.

The signature of the builder is not trustworthy. That is precisely why
we need a signer. The builder is executing untrusted code and can
therefore be easily compromised.

Master keys are bad.

> > 
> > > (b) Privacy
> > > Fetching updates typically leaks a lot of information (such
> > > as your current patch level, or system architecture, or
> > > IP address). By using HTTPS only, we avoid information leaks
> > > to eavesdroppers, which I consider a security benefit, too.
> > > 
> > > However, a mirror operator still has access to that information.
> > > Perhaps the IP address is the most critical one, since it
> > > allows tracing a system back to a city/country, or even to
> > > an organisation.
> > 
> > We hosted a ClamAV mirror once and it was very interesting to see
> > this.
> > 
> > Also, many mirrors seem to open up the usage statistics through webalizer. So
> > this will indeed be a public record.
> > 
> > > Because of that, I do consider mirrors to be somewhat critical,
> > > and would like to see the list signed in 3.x, too.
> > 
> > As stated above, I do not think that this gets rid of the problem that you are
> > describing here.
> > 
> > > (i) Should we introduce mirror servers in the Tor network?
> > > One way to solve this problem is to download updates via a
> > > proxy, or an anonymisation network. In most cases, Tor fits
> > > the bill.
> > > 
> > > For best privacy, some mirror servers could be operated as
> > > so-called "hidden services", so traffic won't even leave the
> > > Tor network and pass some exit nodes. (Debian runs several
> > > services that way, including package mirrors: https://onion.debian.org/ .)
> > > 
> > > Since Tor is considered bad traffic in some corporate networks
> > > (or even some states), this technique should be disabled by
> > > default.
> > > 
> > > What are your opinions here?
> > 
> > I have never hosted a hidden service on Tor. I do not see a problem with that.
> > It might only be that a very tiny number of people are going to use this, and
> > therefore it is a lot of work with only a few people benefiting from
> > it.
> 
> Well, setting up a Tor mirror server is not very hard (_securing_ it is
> the hard task here :-) ), but I am not sure how much development effort
> that will require.

Tell me what it needs.

> > 
> > What does it need so that Pakfire would be able to connect to the Tor network?
> > How would this look from a user's perspective? Where is this being
> > configured? How do we send mirror lists or repository information?
> 
> (i) You can connect to a locally running Tor daemon (which is probably
> what we have on IPFire systems) via SOCKS. To provide an HTTP proxy, some
> additional software is needed (polipo; see here for a configuration
> example:
> https://www.marcus-povey.co.uk/2016/03/24/using-tor-as-a-http-proxy/).

I was hoping that there was something built-in available. To this day
I do not understand why Tor does not implement an HTTP proxy.
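
For illustration, here is a minimal sketch of how a client could fetch a file
through the local Tor SOCKS port without any extra HTTP proxy software. It
assumes the Python "requests" library with SOCKS support installed
(requests[socks]); the port and function names are illustrative, not actual
Pakfire code:

    # Sketch: download over Tor's local SOCKS port (default 9050).
    # The "socks5h" scheme resolves hostnames inside the Tor network,
    # so no DNS queries leak to the local resolver.
    import requests

    TOR_PROXIES = {
        "http": "socks5h://127.0.0.1:9050",
        "https": "socks5h://127.0.0.1:9050",
    }

    def fetch_via_tor(url):
        response = requests.get(url, proxies=TOR_PROXIES, timeout=300)
        response.raise_for_status()  # treat HTTP errors as download failures
        return response.content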

> (ii) What does "user's perspective" mean here? Of course, transferring
> files over Tor is slower, but that does not really matter since updates
> are not that time-critical.

What steps the user will have to take. So setting up Tor is one thing.
Installing another service is another thing. Under those circumstances
it looks like we don't need to change a thing in pakfire, since pakfire
can handle an HTTP proxy. But it wouldn't be a switch in pakfire.

Then, there will be DNS traffic.

> (iii) /etc/tor/torrc (and the Pakfire configuration, which I do not know yet).
> (iv) As usual, it does not make any difference whether a mirror is
> accessed via Tor or in plaintext.

Under those circumstances, is it even worth hosting a hidden service?
Why not access the other mirrors?

> A good example might be apt-transport-tor
> (https://packages.debian.org/stretch/apt-transport-tor), not sure how
> good it fits on IPFire.
> > 
> > > (ii) Reducing update connections to anybody else
> > > Some resources (GeoIP database, IDS rulesets, proxy blacklists)
> > > are currently not fetched via the IPFire mirrors, causing
> > > some of the problems mentioned above.
> > > 
> > > For example, to fetch the GeoIP database, all systems sooner
> > > or later connect to "geolite.maxmind.com", so we can assume
> > > they see a lot of IP addresses IPFire systems are located behind. :-\
> > > Michael and I are currently working on a replacement for
> > > this, called "libloc", but that is a different topic.
> > 
> > This is a huge problem for me. We cannot rely on any third parties any more. I
> > guess the reports in the media over the last days and weeks have proven that
> > there is too much of a conflict of interest. There are no free services from an
> > organization that is trying to make billions of dollars.
> > 
> > Since it is very hard to get consent from the IPFire users on every one of those, we
> > should just get everything from one entity only.
> 
> ACK.
> > 
> > > Pushing all these resources into packages (if they are
> > > free, of course) and delivering them over our own mirrors would
> > > reduce some traffic to third party servers here. For libloc,
> > > we plan to do so.
> > 
> > If by package, you are NOT referring to a package in the sense of a pakfire
> > package, then I agree.
> > 
> > > Should we do this for other resources such as rulesets and
> > > blacklists, too?
> > 
> > Ideally yes, but realistically we cannot reinvent everything ourselves. I am
> > personally involved in too many of these side projects, so there is only
> > little time left for the main thing. So I would rather consider that we work together
> > with the blacklist people, or just leave it for now. I guess that is a thing for
> > the blacklists because they are opt-in. People have to pick one and it is
> > obvious that something is being downloaded. However, it is not obvious what the
> > dangers are. The GeoIP database however is not opt-in. And it isn't opt-out
> > either.
> 
> Since blocklists do not eat up much disk space, I'd say we host
> everything ourselves that we can (Emerging Threats IDS signatures, or
> Spamhaus DROP if we want to implement that sometime, ...).

We wouldn't have the license to do that.

> But we probably need to get in touch with the maintainers first.
> > 
> > > Looking forward to read your comments.
> > 
> > Sorry this took a little while.
> 
> No problem. :-)

So, this discussion is getting longer and longer. Let's please try to
keep it on track and high level. If we have certain decisions coming
out of it, then we can split it up and discuss things more in detail.
Just want to make sure it doesn't take me an hour to reply to these
emails...

> 
> Best regards,
> Peter Müller

-Michael

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Discussion] Privacy and security for IPFire updates
  2018-04-16 21:12     ` Michael Tremer
@ 2018-04-21 17:55       ` Peter Müller
  2018-04-24 11:03         ` Michael Tremer
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Müller @ 2018-04-21 17:55 UTC (permalink / raw)
  To: development

[-- Attachment #1: Type: text/plain, Size: 15320 bytes --]

Hello Michael,

> On Mon, 2018-04-16 at 17:25 +0200, Peter Müller wrote:
>> Hello,
>>> [...]
>> Another point I see here is that an attacker running an evil mirror
>> might deny the existence of new updates by simply not publishing them.
> 
> Yes, this is an attack vector and an easy one.
> 
> We have a timestamp in the repository metadata that is downloaded
> first. It also has a hash of the latest version of the package
> database. The client will walk along all mirrors until it can
> download it. The last place will be the base mirror, which will have it.
> 
>   https://pakfire.ipfire.org/repositories/ipfire3/stable/x86_64/repodata/repomd.json
> 
> However, the repository metadata is not signed (as it would be in DNS),
>  but I would argue that it should be.
Agreed.
> 
> It is kind of undefined what will happen when no repository data could
> be downloaded at all or within an interval of about a week.
A client could just move on to the next mirror if the existence of a newer
version is already known (mirrors can be out of sync, too). If not, I am not
sure what the best practice is - DNS lookups might come in handy...
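
For illustration, a minimal sketch of such a walk over the mirror list; the
repomd.json field name ("revision") and the URL layout are assumptions made
for this example, not the actual Pakfire 3.x schema:

    # Sketch: find a mirror whose repository metadata is at least as new
    # as the cached copy; the base mirror is the last resort.
    import json
    import urllib.request

    def find_fresh_mirror(mirrors, base_mirror, cached_revision):
        for server in mirrors + [base_mirror]:
            url = "%s/repodata/repomd.json" % server
            try:
                with urllib.request.urlopen(url, timeout=30) as f:
                    repomd = json.load(f)
            except Exception:
                continue  # unreachable or broken mirror: try the next one
            if repomd.get("revision", 0) >= cached_revision:
                return server, repomd  # this mirror is in sync
        return None, None  # nothing usable: keep the cached data and warn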
> 
>> Of course, we might detect that sooner or later via a monitoring tool,
>> but in combination with an unsigned mirror list this point becomes
>> more relevant.
> 
> Monitoring is good. It ensures the quality of the mirroring. But the
> system itself needs to be resilient against this sort of attack.
Agreed.
> 
>> Should we publish the current update state (called "Core Update" in 2.x,
>> not sure if it exists in 3.x) via DNS, too? That way, we could avoid
>> pings to the mirrors, so installations only need to connect in case an
>> update has been announced.
> 
> They would only download the metadata from the main service and there
> would be no need to redownload the database again which is large. We
> have to assume that people have a slow connection and bandwidth is
> expensive.
I did not get this. Which database are you talking about here?

My idea was to publish a DNS TXT record (similar to ClamAV) containing the
current Core Update version. Since DNSSEC is obligatory in IPFire, this
information is secured. Clients can look up that record at a certain
interval (twice a day?), and in case anything has changed, they try to
reach a mirror in order to download the update.
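
A minimal sketch of such a lookup, assuming the dnspython module (2.x API);
the record name is purely hypothetical and would have to be defined first:

    # Sketch: read the announced Core Update version from a TXT record.
    # With a DNSSEC-validating resolver, the answer is authenticated.
    import dns.resolver

    def announced_core_update():
        answer = dns.resolver.resolve("core-update.ipfire.org", "TXT")
        for record in answer:
            return int(record.strings[0])  # e.g. b"121" -> 121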

This assumes that we will still have Core Updates in 3.x, and I remember
you saying no. Second, for databases (libloc, ...), clients need to connect
to mirrors sooner or later, so maybe the DNS approach does not work well here.
> 
>>> Pakfire 2 has the mirror list being distributed over the mirrors. Therefore it
>>> *is* signed.
>>>
>>> Pakfire 3 has a different approach. A central service is creating that list on
>>> demand and tries to *optimise* it for each client. That means putting mirrors
>>> that are closer or have a bigger pipe to the top of the list. Not sure how good
>>> our algorithm is right now, but we can change it on the server-side at any time
>>> and changes on the list will propagate quicker than with Pakfire 2.
>>
>> There are two points where I have a different opinion:
>> (a) If I got it right, every client needs to connect to this central
>> server sometimes, which I consider quite bad for various reasons
>> (privacy, missing redundancy, etc.). If we'd distribute the mirror list,
>> we would only need to connect once, the first time, to learn which mirrors
>> are out there.
> 
> A decentralised system is better, but I do not see how we can achieve
> this. A distributed list could of course not be signed.
By "distributed list" you mean the mirror list? Why can't it be signed?
> 
>> After that, a client can use a cached list, and fetch updates from any
>> mirror. In case we have a system at the other end of the world, we also
>> avoid connectivity issues, as we currently observe them in connection
>> with mirrors in Ecuador.
> 
> A client can use a cached list now. The list is only refreshed once a
> day (I think). Updates can then be fetched from any mirror as long as
> the repository data is recent.
I hate to say it, but this does not sound very good (signatures expire,
mirrors go offline, and so on).
> 
>> (b) It might be a postmaster disease, but I was never a fan of moving
>> knowledge from client to server (my favorite example here are MX records,
>> which work much better than implementing fail-over and load balancing on
>> the server side).
>>
>> An individual list for every client is very hard to debug, since it
>> becomes difficult to reproduce a connectivity scenario if you do not
>> know which servers the client saw. Second, we have a server-side
>> bottleneck here (signing!) and need an always-online key if we decide to
>> sign that list, anyway.
> 
> We do not really care about any connectivity issues. There might be
> many reasons for that and I do not want to debug any mirror issues. The
> client just needs to move on to the next one.
Okay, but then why bother doing all the signing and calculation at one server?
> 
>> I have not taken a look at the algorithm yet, but the idea is to prioritise
>> mirror servers located near the client, assuming that geographic
>> distance correlates with network distance today (not sure if that is
>> correct anyway, but it is definitely better than in the 90s).
> 
> It puts everything in the same country to the top and all the rest to
> the bottom.
> 
> It correlates, but that is it. We should have a list of countries
> near one another. It would make sense to group them together by
> continent, etc. But that is for somewhere else.
Yes, but it sounds easy to implement (see the sketch after this list):

1. Determine my public IP address.
2. Determine country for that IP.
3. Which countries are near mine?
4. Determine preferred mirror servers from these countries.

Am I missing something here?
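
A rough sketch of steps 2 to 4, assuming the mirror list already carries a
country code per mirror and that a static table of neighbouring countries
exists somewhere; the neighbour table below is purely illustrative:

    # Sketch: prefer mirrors in the client's country and its neighbours,
    # then fall back to all remaining mirrors in random order.
    import random

    NEIGHBOURS = {
        "DE": ["AT", "CH", "FR", "NL", "PL"],  # illustrative only
    }

    def select_mirrors(mirrors, my_country):
        # mirrors is a list of (hostname, country_code) tuples
        nearby = {my_country} | set(NEIGHBOURS.get(my_country, []))
        preferred = [host for host, cc in mirrors if cc in nearby]
        others = [host for host, cc in mirrors if cc not in nearby]
        random.shuffle(preferred)  # spread the load across nearby mirrors
        random.shuffle(others)
        return preferred + others

Step 1 - determining the public IP address - is of course the hard part here.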
> 
> Basically the client has no way to measure "distance" or "speed". And I
> do not think it is right to implement this in the client. Just a GeoIP
> lookup requires to resolve DNS for all mirrors and then perform the
> database lookup. That takes a long time and I do not see why this is
> much better than the server-side approach.
True, we need DNS and GeoIP/libloc database lookups here, but this information
can be safely cached for N days. After that, the lookup procedure is repeated.

I do not consider these lookups to be too bandwidth-consuming, as long as
we do not perform them every time.
> 
>> The only problem here is to determine which public IP a client has. But
>> there are ways to work around this, and in the end, we'll probably solve
>> most of the issues (especially dealing with signature expire times) you
>> mentioned.
> 
> Determining the public IP is a huge problem. See ddns.
Yes (carrier-grade NAT, and so on). But most systems will have a public
IP on RED because they manage PPPoE dialin. If not, why not let them
look it up as they do with DDNS? In case that is not possible, clients are
still able to fall back to no mirror preference and just pick them randomly.
> 
>> Any thoughts? :-)
> 
> Yeah, you didn't convince me by assuring that there will be a solution.
> This can be implemented. But is this worth the work and creating a much
> more complex system to solve a problem only half-way?
Maybe we should split up this discussion:
(a) I assume we agree on the privacy and security aspects
(HTTPS only and maybe Tor services) in general.
(b) Signed mirror list: Yes, but using a local mirror must be possible - which
simply overrides the list but that is all right since the user requested to do
so - and it is not a magic bullet.
(c) Individual mirror lists vs. one-size-fits-all: Both ideas have their pros
and cons: If we introduce mirror lists generated for each client individually,
we have a bottleneck (signing?) and a SPOF. Further, some persons like me might
argue that this leaks IPs since all clients must connect to a central server.
If we distribute a signed mirror list via the mirrors (as we do at the moment),
we need to implement an algorithm for selecting servers from that list on the
clients. Further, we bump into the problem that a client needs to know its public
IP and that we need to cache the selection results to avoid excessive DNS and
GeoIP/libloc queries.

Since we need to implement a selection algorithm _somewhere_, I only consider
determining public IPs a real problem and would therefore prefer the second
scenario.
> 
> :)
No hard feelings. :-)
>>> [...]
>>>
>>> We would also need to let the signature expire so that mirrors that are found
>>> out to be compromised are *actually* removed. At the moment the client keeps
>>> using the mirror list until it can download a new one. What would happen if the
>>> download is not possible but the signature on a previous list has expired?
Try the download again a few times, and if that fails, trigger a warning and stop.
>>
>> Since this is a scenario which might happen any time, I'd consider
>> falling back to the main mirror the best alternative.
> 
> What if that is compromised? What would be the contingency plan there?
I was not thinking about that. Another idea might be to check for every mirror
server whether data with a valid signature is available - as long as the
signature has not expired yet.

But that is not a very good idea, and I propose to discuss these side-channel
issues in another thread. Maybe we can implement some of them later and focus
on the entire construction first.
> 
>>> We would also make the entire package management system very prone to clock
>>> issues. If you are five minutes off, or an hour, the list could have expired and
>>> you cannot download any packages any more or you would always fall back to the
>>> main mirror.
>>
>> Another problem solved by a more intelligent client. :-) :-) :-)
> 
> How? Please provide detail. Pakfire should not set the system clock.
> Ever. That is totally out of scope.
Yes, that's not what I had in mind. My idea here was similar to how Postfix treats
MX records in case one server is offline or causes too much trouble: Disable
that mirror/server locally, and ignore it for a certain time. That might solve
problems in case synchronisation on a mirror is broken.
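
That Postfix-style backoff could be as simple as the following sketch (the
penalty interval is an arbitrary example value):

    # Sketch: remember when a mirror last failed and skip it for a while.
    import time

    PENALTY = 6 * 3600   # ignore a failing mirror for six hours
    _last_failure = {}   # hostname -> timestamp of the last failure

    def mark_broken(mirror):
        _last_failure[mirror] = time.time()

    def is_usable(mirror):
        failed_at = _last_failure.get(mirror)
        return failed_at is None or time.time() - failed_at > PENALTY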

A second idea is to overlap signatures: If there is little time left before a
list expires, the problem you mentioned becomes more relevant. On the other
hand, if we push a new version of the list to the servers, and the signature of
the old one is still valid for a week or two, clients have more time to update.

If the local copy of a mirror list is expired, a client should try the main
server to get a new one. If it cannot validate it (because the server was
compromised), it is the same as above: Stop and alert.

We are talking about security here, and as usual, I consider usability a bit
out of scope. :-D
>>>> [...]
>>>
>>> But it will by design be a weak signature. We could probably not put the key
>>> into an HSM, etc.
>>
>> In case we do not use individual mirror lists, using a key baked into an
>> HSM would be possible here.
> 
> Would bring us back to the signers again. It is hard to do this in a
> VM.
Depends on the interface an HSM uses. If it is USB, chances are good that we
can pass a host port through to a VM directly.
> 
>>> [...]
>> Assuming both builder and signer have good connectivity, transferring a
>> package securely sounds good. To avoid MITM attacks, a sort of "builder
>> signature" might be useful - in the end, a package has two or three
>> signatures then:
>>
>> First, it is signed by the builder, to prove that it was built on that
>> machine (in case a package turns out to be compromised, this makes
>> tracing much easier) and that it was transferred correctly to the signer.
>> Second, the signer adds its signature, which is assumed to be trusted by
>> Pakfire clients here. If not, we need a sort of "master key", too, but I
>> thought that's what we wanted to avoid here.
> 
> The signature of the builder is not trustworthy. That is precisely why
> we need a signer. The builder is executing untrusted code and can
> therefore be easily compromised.
The signature of the builder is not trustworthy indeed, but that is not
what it's good for. It was intended to prove that a package was built
on a certain builder. In case we stumble over compromised packages one
time, we can safely trace them back to a builder and do forensics there.

If we strip the signature off, this step becomes much harder (especially
when no logs are present anymore), and we would have to treat all builders as
potentially compromised since we cannot pinpoint one machine.
> 
> Master keys are bad.
Yes.
> 
>>>
>>>> (b) Privacy
>>>> [...]
>>
>> Well, setting up a Tor mirror server is not very hard (_securing_ it is
>> the hard task here :-) ), but I do not know how much development effort
>> that will be.
> 
> Tell me what it needs.
I would like to do that in a second topic.
>>> [...]
> 
>> (ii) What does "user's perspective" mean here? Of course, transferring
>> files over Tor is slower, but that does not really matter since updates
>> are not that time-critical.
> 
> What steps the user will have to take. So setting up Tor is one thing.
> Installing another service is another thing. Under those circumstances
> it looks like we don't need to change a thing in pakfire, since pakfire
> can handle an HTTP proxy. But it wouldn't be a switch in pakfire.
Technically, we need to both set up Tor and start the Tor service/daemon.
But there are almost no configuration steps to make, and we can hide the
entire procedure behind a "fetch updates via Tor" switch in some web
frontend.
> 
> Then, there will be DNS traffic.
Which DNS traffic exactly?
> 
>> (iii) /etc/tor/torrc (and the Pakfire configuration, which I do not know yet).
>> (iv) As usual, it does not make any difference whether a mirror is
>> accessed via Tor or in plaintext.
> 
> Under those circumstances, is it even worth hosting a hidden service?
> Why not access the other mirrors?
We can contact the existing mirror servers via Tor, but then we have the
problem that Tor traffic must pass some exit nodes, and so on. I consider
hidden services a better alternative.
> 
>> A good example might be apt-transport-tor
>> (https://packages.debian.org/stretch/apt-transport-tor), not sure how
>> good it fits on IPFire.
>>>
>>>> (ii) Reducing update connections to anybody else
>>>> [...]
>>
>> Since blocklists do not eat up much disk space, I'd say we host
>> everything ourselves that we can (Emerging Threats IDS signatures, or
>> Spamhaus DROP if we want to implement that sometime, ...).
> 
> We wouldn't have the license to do that.
Don't know, but at that point, I would just ask them. Maybe we do. :-)
> 
>> But we probably need to get in touch with the maintainers first.
> 
> So, this discussion is getting longer and longer. Let's please try to
> keep it on track and high level. If we have certain decisions coming
> out of it, then we can split it up and discuss things more in detail.
> Just want to make sure it doesn't take me an hour to reply to these
> emails...
> 
Best regards,
Peter Müller

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Discussion] Privacy and security for IPFire updates
  2018-04-21 17:55       ` Peter Müller
@ 2018-04-24 11:03         ` Michael Tremer
  2018-04-24 19:23           ` Peter Müller
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Tremer @ 2018-04-24 11:03 UTC (permalink / raw)
  To: development

[-- Attachment #1: Type: text/plain, Size: 22050 bytes --]

Hi,

On Sat, 2018-04-21 at 19:55 +0200, Peter Müller wrote:
> Hello Michael,
> 
> > On Mon, 2018-04-16 at 17:25 +0200, Peter Müller wrote:
> > > Hello,
> > > > [...]
> > > 
> > > Another point I see here is that an attacker running an evil mirror
> > > might deny the existence of new updates by simply not publishing them.
> > 
> > Yes, this is an attack vector and an easy one.
> > 
> > We have a timestamp in the repository metadata that is downloaded
> > first. It also has a hash of the latest version of the package
> > database. The client will walk along all mirrors until it can
> > download it. The last place will be the base mirror, which will have it.
> > 
> >   https://pakfire.ipfire.org/repositories/ipfire3/stable/x86_64/repodata/repomd.json
> > 
> > However, the repository metadata is not signed (as it would be in DNS),
> >  but I would argue that it should be.
> 
> Agreed.
> > 
> > It is kind of undefined what will happen when no repository data could
> > be downloaded at all or within an interval of about a week.
> 
> A client could just move on to the next mirror if the existence of a newer
> version is already known (mirrors can be out of sync, too). If not, I am not
> sure what the best practice is - DNS lookups might come in handy...

Generally, I think Pakfire should *NOT* rely on DNS. DNS is blocked in a few
networks of government agencies where we have IPFire installations and as a
result they don't install any updates on anything.

If a system is only behind an upstream HTTP(S) proxy, that should be enough to
download updates in the optimal way.

> > > Of course, we might detect that sooner or later via a monitoring tool,
> > > but in combination with an unsigned mirror list this point becomes
> > > more relevant.
> > 
> > Monitoring is good. It ensures the quality of the mirroring. But the
> > system itself needs to be resilient against this sort of attack.
> 
> Agreed.
> > 
> > > Should we publish the current update state (called "Core Update" in 2.x,
> > > not sure if it exists in 3.x) via DNS, too? That way, we could avoid
> > > pings to the mirrors, so installations only need to connect in case an
> > > update has been announced.
> > 
> > They would only download the metadata from the main service and there
> > would be no need to redownload the database again which is large. We
> > have to assume that people have a slow connection and bandwidth is
> > expensive.
> 
> I did not get this. Which database are you talking about here?

The package database.

> My idea was to publish a DNS TXT record (similar to ClamAV) containing the
> current Core Update version. Since DNSSEC is obligatory in IPFire, this
> information is secured. Clients can look up that record at a certain
> interval (twice a day?), and in case anything has changed, they try to
> reach a mirror in order to download the update.

It is not guaranteed that DNSSEC is always on. I am also not trusting DNSSEC
to be around forever. People feel that DNS-over-TLS seems to be enough.
Different debate.

They do that with the repository metadata just like you described. A small file
that is being checked very often and the big database is only downloaded when it
has changed.

> This assumes that we will still have Core Updates in 3.x, and I remember
> you saying no. Second, for databases (libloc, ...), clients need to connect
> to mirrors sooner or later, so maybe the DNS approach does not work well here.

For libloc we can do this in the same way. But a file on a server with a hash
and signature should do the job just as well as DNS. It is easier to implement,
and HTTPS connectivity is required anyway.

The question here is whether we want a central redirect service like the
download links, or whether we want to distribute a list of mirror servers.

> > 
> > > > Pakfire 2 has the mirror list being distributed over the mirrors.
> > > > Therefore it
> > > > *is* signed.
> > > > 
> > > > Pakfire 3 has a different approach. A central service is creating that
> > > > list on
> > > > demand and tries to *optimise* it for each client. That means putting
> > > > mirrors
> > > > that are closer or have a bigger pipe to the top of the list. Not sure
> > > > how good
> > > > our algorithm is right now, but we can change it on the server-side at
> > > > any time
> > > > and changes on the list will propagate quicker than with Pakfire 2.
> > > 
> > > There are two points where I have a different opinion:
> > > (a) If I got it right, every client needs to connect to this central
> > > server sometimes, which I consider quite bad for various reasons
> > > (privacy, missing redundancy, etc.). If we'd distribute the mirror list,
> > > we would only need to connect once, the first time, to learn which mirrors
> > > are out there.
> > 
> > A decentralised system is better, but I do not see how we can achieve
> > this. A distributed list could of course not be signed.
> 
> By "distributed list" you mean the mirror list? Why can't it be signed?

If multiple parties agree on *the* mirror list, then there cannot be a key,
because that would be shared with everyone. I am talking about a distributed
group of people making the list and not a list that is generated and then being
distributed.

> > 
> > > After that, a client can use a cached list, and fetch updates from any
> > > mirror. In case we have a system at the other end of the world, we also
> > > avoid connectivity issues, as we currently observe them in connection
> > > with mirrors in Ecuador.
> > 
> > A client can use a cached list now. The list is only refreshed once a
> > day (I think). Updates can then be fetched from any mirror as long as
> > the repository data is recent.
> 
> I hate to say it, but this does not sound very good (signatures expire,
> mirrors go offline, and so on).

The signature would only be verified when the list is being received and those
mirrors will be added to an internal list as well as any manually configured
ones.

Mirrors that are unreachable will of course be skipped. But the client cannot
know if a mirror is gone temporarily or forever.

> > 
> > > (b) It might be a postmaster disease, but I was never a fan of moving
> > > knowledge from client to server (my favorite example here are MX records,
> > > which work much better than implementing fail-over and load balancing on
> > > the server side).
> > > 
> > > An individual list for every client is very hard to debug, since it
> > > becomes difficult to reproduce a connectivity scenario if you do not
> > > know which servers the client saw. Second, we have a server-side
> > > bottleneck here (signing!) and need an always-online key if we decide to
> > > sign that list, anyway.
> > 
> > We do not really care about any connectivity issues. There might be
> > many reasons for that and I do not want to debug any mirror issues. The
> > client just needs to move on to the next one.
> 
> Okay, but then why bother doing all the signing and calculation at one server?

?!

> > 
> > > I have not taken a look at the algorithm yet, but the idea is to prioritise
> > > mirror servers located near the client, assuming that geographic
> > > distance correlates with network distance today (not sure if that is
> > > correct anyway, but it is definitely better than in the 90s).
> > 
> > It puts everything in the same country to the top and all the rest to
> > the bottom.
> > 
> > It correlates, but that is it. We should have a list of countries
> > near one another. It would make sense to group them together by
> > continent, etc. But that is for somewhere else.
> 
> Yes, but it sounds easy to implement:
> 
> 1. Determine my public IP address.

That problem is a lot harder than it sounds. Look at ddns.

Ultimately, there is a central service that responds with the public IP address
from which the request came. If you want to avoid contacting a central
service at all, then this solution doesn't solve that.

> 2. Determine country for that IP.
> 3. Which countries are near mine?
> 4. Determine preferred mirror servers from these countries.
> 
> Am I missing something here?

I guess you are underestimating that this is quite complex to implement,
especially in environments where DNS is not available or some other oddities
happen. Pakfire needs to work like clockwork.

Things that spring to mind are Cisco appliances that truncate DNS packets when
they are longer than 100 bytes or something (and the TXT record will be a lot
longer than that). Then there needs to be a fallback mechanism, and I think it
would make sense to use HTTPS only directly. If that doesn't work, there won't
be any updates anyway.

> > 
> > Basically the client has no way to measure "distance" or "speed". And I
> > do not think it is right to implement this in the client. Just a GeoIP
> > lookup requires to resolve DNS for all mirrors and then perform the
> > database lookup. That takes a long time and I do not see why this is
> > much better than the server-side approach.
> 
> True, we need DNS and GeoIP/libloc database lookups here, but this information
> can be safely cached for N days. After that, the lookup procedure is repeated.

That can of course be in the downloaded mirror list.

> I do not consider these lookups to be too bandwidth-consuming, as long as
> we do not perform them every time.
> > 
> > > The only problem here is to determine which public IP a client has. But
> > > there are ways to work around this, and in the end, we'll probably solve
> > > most of the issues (especially dealing with signature expire times) you
> > > mentioned.
> > 
> > Determining the public IP is a huge problem. See ddns.
> 
> Yes (carrier-grade NAT, and so on). But most systems will have a public
> IP on RED because they manage PPPoE dialin. If not, why not let them
> look it up as they do with DDNS? In case that is not possible, clients are
> still able to fall back to no mirror preference and just pick them randomly.
> > 
> > > Any thoughts? :-)
> > 
> > Yeah, you didn't convince me by assuring that there will be a solution.
> > This can be implemented. But is this worth the work and creating a much
> > more complex system to solve a problem only half-way?
> 
> Maybe we should split up this discussion:

Yes, I do not want to prolong certain aspects of this. We are wasting too much
time and not getting anywhere with this and I think it is wiser to spend that
time on coding :)

> (a) I assume we agree on the privacy and security aspects
> (HTTPS only and maybe Tor services) in general.

HTTPS is settled. You have been reaching out to the last remaining mirrors
that do not support it yet, and I am sure we can convince a few more to enable
it.

Tor. I have no technical insight nor do I think that many users will be using
it. So please consider contributing the technical implementation of this.

> (b) Signed mirror list: Yes, but using a local mirror must be possible - which
> simply overrides the list but that is all right since the user requested to do
> so - and it is not a magic bullet.

The question that isn't answered for me here is which key should be used. The
repo's key? Guess it would be that one.

> (c) Individual mirror lists vs. one-size-fits-all: Both ideas have their pros
> and cons: If we introduce mirror lists generated for each client individually,
> we have a bottleneck (signing?) and a SPOF. 

SPOF yes, but that is not a problem because clients will continue using an old
list.

We will have to check how long signing takes. It cannot take ages. But we will
need to do this for each client since we randomize all mirrors. Or we implement
the randomization at the client; then we can sign one list per country and
cache it, which is quite feasible.

> Further, some persons like me might
> argue that this leaks IPs since all clients must connect to a central server.
> If we distribute a signed mirror list via the mirrors (as we do at the
> moment), we need to implement an algorithm for selecting servers from that
> list on the clients. Further, we bump into the problem that a client needs
> to know its public IP and that we need to cache the selection results to
> avoid excessive DNS and GeoIP/libloc queries.

"Leaking" the client's IP address isn't solved when there is a fallback to
another central service. That just solves it for a number of clients but not
all.

> Since we need to implement a selection algorithm _somewhere_, I only consider
> determining public IPs a real problem and would therefore prefer the second
> scenario.

Unless you really really object, I would like to cut this conversation short and
would like to propose that we go with the current implementation. It does not
have any severe disadvantages over the other approach which just has other
disadvantages. Our users won't care that much about this tiny detail and we
could potentially change it later.

If you want to hide your IP address, you can use Tor and then the server-based
approach should tick all your boxes.

> > 
> > :)
> 
> No hard feelings. :-)
> >>> [...]
> > > > 
> > > > We would also need to let the signature expire so that mirrors that are
> > > > found
> > > > out to be compromised are *actually* removed. At the moment the client
> > > > keeps
> > > > using the mirror list until it can download a new one. What would happen
> > > > if the
> > > > download is not possible but the signature on a previous list has
> > > > expired?
> 
> Try the download again a few times, and if that fails, trigger a warning and
> stop.

And leave the system unpatched? I consider that a much more severe problem then.

> > > 
> > > Since this is a scenario which might happen any time, I'd consider
> > > falling back to the main mirror the best alternative.
> > 
> > What if that is compromised? What would be the contingency plan there?
> 
> I was not thinking about that. Another idea might be to check for every mirror
> server whether data with a valid signature is available - as long as the
> signature has not expired yet.

That will always happen before a client connects to the base mirror.

> But that is not a very good idea, and I propose to discuss these side-channel
> issues in another thread. Maybe we can implement some of them later and focus
> on the entire construction first.
> > 
> > > > We would also make the entire package management system very prone to
> > > > clock
> > > > issues. If you are five minutes off, or an hour, the list could have
> > > > expired and
> > > > you cannot download any packages any more or you would always fall back
> > > > to the
> > > > main mirror.
> > > 
> > > Another problem solved by a more intelligent client. :-) :-) :-)
> > 
> > How? Please provide detail. Pakfire should not set the system clock.
> > Ever. That is totally out of scope.
> 
> Yes, that's not what I had in mind. My idea here was similar to how Postfix
> treats MX records in case one server is offline or causes too much trouble:
> Disable that mirror/server locally, and ignore it for a certain time. That
> might solve problems in case synchronisation on a mirror is broken.

We have a very easy way to determine if a mirror is out of sync: we try to
download a file, and if we get a 404, then the file does not exist and we move to
the next mirror. If the file is corrupted and the checksum does not match,
then we do the same. The last resort will be the base mirror, which by definition
is always in sync.
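
That logic might look roughly like the following sketch - names and URL layout
are illustrative only, not the actual Pakfire implementation:

    # Sketch: walk the mirrors; a 404, a connection error or a checksum
    # mismatch moves us on to the next mirror. The base mirror comes
    # last and is by definition in sync.
    import hashlib
    import urllib.error
    import urllib.request

    def download(path, expected_sha512, mirrors, base_mirror):
        for server in mirrors + [base_mirror]:
            try:
                with urllib.request.urlopen("%s/%s" % (server, path),
                                            timeout=60) as f:
                    data = f.read()
            except urllib.error.HTTPError:
                continue  # e.g. 404: mirror out of sync
            except OSError:
                continue  # mirror unreachable
            if hashlib.sha512(data).hexdigest() == expected_sha512:
                return data
            # checksum mismatch: corrupted file, try the next mirror
        raise RuntimeError("no mirror delivered a verifiable file")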

> A second idea is to overlap signatures: If there is little time left before a
> list expires, the problem you mentioned becomes more relevant. On the other
> hand, if we push a new version of the list to the servers, and the signature
> of the old one is still valid for a week or two, clients have more time to
> update.
> 
> If the local copy of a mirror list is expired, a client should try the main
> server to get a new one. If it cannot validate it (because the server was
> compromised), it is the same as above: Stop and alert.
> 
> We are talking about security here, and as usual, I consider usability a bit
> out of scope. :-D
> >>>> [...]

Yes you do. And that will leave some systems unpatched if it doesn't try to
download updates very aggressively.

> > > > 
> > > > But it will by design be a weak signature. We could probably not put the
> > > > key
> > > > into a HSM, etc.
> > > 
> > > In case we do not use individual mirror lists, using a key baked into an
> > > HSM would be possible here.
> > 
> > Would bring us back to the signers again. It is hard to do this in a
> > VM.
> 
> Depends on the interface an HSM uses. If it is USB, chances are good that we
> can pass a host port through to a VM directly.

We move VMs across multiple hardware nodes for load-balancing and maintenance.
Not really feasible then. Clearly this introduces a SPOF.

> > 
> > > > [...]
> > > 
> > > Assuming both builder and signer have good connectivity, transferring a
> > > package securely sounds good. To avoid MITM attacks, a sort of "builder
> > > signature" might be useful - in the end, a package has two or three
> > > signatures then:
> > > 
> > > First, it is signed by the builder, to prove that it was built on that
> > > machine (in case a package turns out to be compromised, this makes
> > > tracing much easier) and that it was transferred correctly to the signer.
> > > Second, the signer adds its signature, which is assumed to be trusted by
> > > Pakfire clients here. If not, we need a sort of "master key", too, but I
> > > thought that's what we wanted to avoid here.
> > 
> > The signature of the builder is not trustworthy. That is precisely why
> > we need a signer. The builder is executing untrusted code and can
> > therefore be easily compromised.
> 
> The signature of the builder is not trustworthy indeed, but that is not
> what it's good for. It was intended to prove that a package was built
> on a certain builder. In case we stumble over compromised packages one
> time, we can safely trace them back to a builder and do forensics there.
> 
> If we strip the signature off, this step becomes much harder (especially
> when no logs are present anymore), and we would have to treat all builders as
> potentially compromised since we cannot pinpoint one machine.

The name of the builder is always inside the package. So a lacking signature
does not mean that the package has not been built by that builder.

> > 
> > Master keys are bad.
> 
> Yes.
> > 
> > > > 
> > > > > (b) Privacy
> > > > > [...]
> > > 
> > > Well, setting up a Tor mirror server is not very hard (_securing_ it is
> > > the hard task here :-) ), but I do not know how much development effort
> > > that will be.
> > 
> > Tell me what it needs.
> I would like to do that in a second topic.
> >>> [...]
> > 
> > (ii) What does "user's perspective" mean here? Of course, transferring
> > files over Tor is slower, but that does not really matter since updates
> > are not that time-critical.
> > 
> > What steps the user will have to take. So setting up Tor is one thing.
> > Installing another service is another thing. Under those circumstances
> > it looks like we don't need to change a thing in pakfire, since pakfire
> > can handle an HTTP proxy. But it wouldn't be a switch in pakfire.
> 
> Technically, we need to both set up Tor and start the Tor service/daemon.
> But there are almost no configuration steps to make, and we can hide the
> entire procedure behind a "fetch updates via Tor" switch in some web
> frontend.

Please send patches :)

> > 
> > Then, there will be DNS traffic.
> 
> Which DNS traffic exactly?

Determining your own IP address with help of an external service, the mirrors,
etc. That does not go through Tor by default, does it?

> > 
> > (iii) /etc/tor/torrc (and the Pakfire configuration, which I do not know yet).
> > (iv) As usual, it does not make any difference whether a mirror is
> > accessed via Tor or in plaintext.
> > 
> > Under those circumstances, is it even worth hosting a hidden service?
> > Why not access the other mirrors?
> 
> We can contact the existing mirror servers via Tor, but then we have the
> problem that Tor traffic must pass some exit nodes, and so on. I consider
> hidden services a better alternative.

One instance of that will do, won't it? So if you set up your mirror as a hidden
service, then this should be fine.

> > 
> > > A good example might be apt-transport-tor
> > > (https://packages.debian.org/stretch/apt-transport-tor), not sure how
> > > good it fits on IPFire.
> > > > 
> > > > > (ii) Reducing update connections to anybody else
> > > > > [...]
> > > 
> > > Since blocklists do not eat up much disk space, I'd say we host
> > > everything ourselves that we can (Emerging Threats IDS signatures, or
> > > Spamhaus DROP if we want to implement that sometime, ...).
> > 
> > We wouldn't have the license to do that.
> 
> Don't know, but at that point, I would just ask them. Maybe we do. :-)

You can ask, but I am certain that they will disagree. I assume a huge part of
their business is seeing who is downloading their databases.

> > > But we probably need to get in touch with the maintainers first.
> > 
> > So, this discussion is getting longer and longer. Let's please try to
> > keep it on track and high level. If we have certain decisions coming
> > out of it, then we can split it up and discuss things more in detail.
> > Just want to make sure it doesn't take me an hour to reply to these
> > emails...

We kind of failed at this. We talked about this on the phone yesterday and I
have added my thoughts to this email again. Do not really have much more to say.

> 
> Best regards,
> Peter Müller

-Michael

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Discussion] Privacy and security for IPFire updates
  2018-04-24 11:03         ` Michael Tremer
@ 2018-04-24 19:23           ` Peter Müller
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Müller @ 2018-04-24 19:23 UTC (permalink / raw)
  To: development

[-- Attachment #1: Type: text/plain, Size: 13123 bytes --]

Hello,
> Hi,
>>>>> [...]
>>
>> A client could just move on to the next mirror if the existence of a newer
>> version is already known (mirrors can be out of sync, too). If not, I am not
>> sure what the best practice is - DNS lookups might come in handy...
> 
> Generally, I think Pakfire should *NOT* rely on DNS. DNS is blocked in a few
> networks of government agencies where we have IPFire installations and as a
> result they don't install any updates on anything.
For the record: We talked about this yesterday and decided to drop the DNS
idea since it does not solve more problems than it creates. So I agree with you here.
> 
> If a system is only behind an upstream HTTP(S) proxy, that should be enough to
> download updates in the optimal way.
Yes. I think access to HTTPS services (either directly or via a proxy) can be
safely added to IPFire's system requirements.
> [...]
>>>
>>>> Should we publish the current update state (called "Core Update" in 2.x,
>>>> not sure if it exists in 3.x) via DNS, too? That way, we could avoid
>>>> pings to the mirrors, so installations only need to connect in case an
>>>> update has been announced.
>>>
>>> They would only download the metadata from the main service and there
>>> would be no need to redownload the database again which is large. We
>>> have to assume that people have a slow connection and bandwidth is
>>> expensive.
>>
>> I did not get this. Which database are you talking about here?
> 
> The package database.
Since DNS does not seem to be a good idea, my point here has become obsolete.
> 
>> My idea was to publish a DNS TXT record (similar to ClamAV) containing the
>> current Core Update version. Since DNSSEC is obligatory in IPFire, this
>> information is secured. Clients can look up that record at a certain
>> interval (twice a day?), and in case anything has changed, they try to
>> reach a mirror in order to download the update.
> 
> It is not guaranteed that DNSSEC is always on. I am also not trusting DNSSEC
> to be around forever. People feel that DNS-over-TLS seems to be enough.
> Different debate.
> 
> They do that with the repository metadata just like you described. A small file
> that is being checked very often and the big database is only downloaded when it
> has changed.
See above.
> 
>> This assumes that we will still have Core Updates in 3.x, and I remember
>> you saying no. Second, for databases (libloc, ...), clients need to connect
>> to mirrors sooner or later, so maybe the DNS approach does not work well here.
> 
> For libloc we can do this in the same way. But a file on a server with a hash
> and signature should do the job just as well as DNS. It is easier to implement,
> and HTTPS connectivity is required anyway.
Yes.
> 
> The question here is whether we want a central redirect service like the
> download links, or whether we want to distribute a list of mirror servers.
My opinion is to use a distributed list of mirror servers. To avoid unnecessary
and expensive DNS and libloc updates on clients, we can just include the libloc
information in the distributed list, since it won't matter who does the lookup.
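
One possible shape for such a list - a purely hypothetical schema, just to
make the idea concrete; the field names are not taken from the actual Pakfire
code:

    # Sketch: a distributed mirror list that embeds the location data, so
    # clients need neither DNS nor local libloc lookups to group mirrors.
    MIRROR_LIST = {
        "timestamp": 1524500000,  # when the list was generated
        "mirrors": [
            {"hostname": "mirror1.example.org", "country": "DE"},
            {"hostname": "mirror2.example.org", "country": "US"},
        ],
        "signature": "...",  # detached signature over the payload
    }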
> [...]
>>>
>>> A decentralised system is better, but I do not see how we can achieve
>>> this. A distributed list could of course not be signed.
>>
>> By "distributed list" you mean the mirror list? Why can't it be signed?
> 
> If multiple parties agree on *the* mirror list, then there cannot be a key,
> because that would be shared with everyone. I am talking about a distributed
> group of people making the list and not a list that is generated and then being
> distributed.
I consider this a different topic and would like to discuss it in
case we actually settle on signing the mirror list.
> 
>>>
>>>> After that, a client can use a cached list, and fetch updates from any
>>>> mirror. In case we have a system at the other end of the world, we also
>>>> avoid connectivity issues, as we currently observe them in connection
>>>> with mirrors in Ecuador.
>>>
>>> A client can use a cached list now. The list is only refreshed once a
>>> day (I think). Updates can then be fetched from any mirror as long as
>>> the repository data is recent.
>>
>> I hate to say it, but this does not sound very good (signatures expire,
>> mirrors go offline, and so on).
> 
> The signature would only be verified when the list is being received and those
> mirrors will be added to an internal list as well as any manually configured
> ones.
Good idea.
> 
> Mirrors that are unreachable will of course be skipped. But the client cannot
> know if a mirror is gone temporarily or forever.
> 
>>>> (b) It might be a postmaster disease, but I was never a fan of moving
>>>> knowledge from client to server (my favorite example here are MX records,
>>>> which work much better than implementing fail-over and load balancing on
>>>> the server side).
>>>>
>>>> An individual list for every client is very hard to debug, since it
>>>> becomes difficult to reproduce a connectivity scenario if you do not
>>>> know which servers the client saw. Second, we have a server-side
>>>> bottleneck here (signing!) and need an always-online key if we decide to
>>>> sign that list, anyway.
>>>
>>> We do not really care about any connectivity issues. There might be
>>> many reasons for that and I do not want to debug any mirror issues. The
>>> client just needs to move on to the next one.
>>
>> Okay, but then why bother doing all the signing and calculation at one server?
Well, the idea behind this was to let the client decide which mirrors it will
use. By delivering libloc information with the mirror list itself,
we solve most of the problems you saw here initially, leaving the task of
determining its public IP to the client, which works well in most cases (direct
PPPoE dialin, etc.); otherwise, we have a fallback.

From my point of view, this way solves more problems than it causes.
> 
> ?!
> 
>>>
>>>> I have not taken a look at the algorithm yet, but the idea is to prioritise
>>>> mirror servers located near the client, assuming that geographic
>>>> distance correlates with network distance today (not sure if that is
>>>> correct anyway, but it is definitely better than in the 90s).
>>>
>>> It puts everything in the same country to the top and all the rest to
>>> the bottom.
>>>
>>> It correlates, but that is it. We should have a list of countries
>>> near one another. It would make sense to group them together by
>>> continent, etc. But that is for somewhere else.
>>
>> Yes, but it sounds easy to implement:
>>
>> 1. Determine my public IP address.
> 
> That problem is a lot harder than it sounds. Look at ddns.
But in the end, it works. :-)
> 
> Ultimately, there is a central service that responds with the public IP address
> from which the request came. If you want to avoid contacting a central
> service at all, then this solution doesn't solve that.
Yes. In case a system is unable to determine its public IP address, we
need to either randomly select a mirror (and ignore all selection logic
for that client) or lose a bit of privacy by connecting to a central server.

I prefer the second option here.
> 
>> 2. Determine country for that IP.
>> 3. Which countries are near mine?
>> 4. Determine preferred mirror servers from these countries.
>>
>> Am I missing something here?
> 
> I guess you are underestimating that this is quite complex to implement,
> especially in environments where DNS is not available or some other oddities
> happen. Pakfire needs to work like clockwork.
In case we deliver the libloc information with the mirror list, we only
have the "determine-my-public-IP" problem left, which I consider to be
solvable.
> 
> Things that spring to mind are Cisco appliances that truncate DNS packets when
> they are longer than 100 bytes or something (and the TXT record will be a lot
> longer than that). Then there needs to be a fallback mechanism, and I think it
> would make sense to use HTTPS only directly. If that doesn't work, there won't
> be any updates anyway.
See above. And about the Cisco devices stripping some DNS traffic... It's
a commercial appliance, isn't it? *vomit*
> 
>>>
>>> Basically the client has no way to measure "distance" or "speed". And I
>>> do not think it is right to implement this in the client. Just a GeoIP
>>> lookup requires to resolve DNS for all mirrors and then perform the
>>> database lookup. That takes a long time and I do not see why this is
>>> much better than the server-side approach.
>>
>> True, we need DNS and GeoIP/libloc database lookups here, but this information
>> can be safely cached for N days. After that, the lookup procedure is repeated.
> 
> That can of course be in the downloaded mirror list.
Yep, that solves many problems. Let's do it. :-)
> [...] 
> Yes, I do not want to prolong certain aspects of this. We are wasting too much
> time and not getting anywhere with this and I think it is wiser to spend that
> time on coding :)
> 
>> (a) I assume we agree on the privacy and security aspects
>> (HTTPS only and maybe Tor services) in general.
> 
> HTTPS is settled. You have been reaching out to the last remaining mirrors
> that do not support it yet, and I am sure we can convince a few more to enable
> it.
> 
> Tor. I have no technical insight nor do I think that many users will be using
> it. So please consider contributing the technical implementation of this.
I can do so. Settled.
> 
>> (b) Signed mirror list: Yes, but using a local mirror must be possible - which
>> simply overrides the list but that is all right since the user requested to do
>> so - and it is not a magic bullet.
> 
> The question that isn't answered for me here is which key should be used. The
> repo's key? Guess it would be that one.
I do not know the answer to that. Perhaps you have to give me a crash course in
current Pakfire 3.x first (in a different mailing list topic). Or did you mean
for 2.x? In that case, it will be the repository key.
> 
>> (c) Individual mirror lists vs. one-size-fits-all: Both ideas have their pros
>> and cons: If we introduce mirror lists generated for each client individually,
>> we have a bottleneck (signing?) and a SPOF. 
> 
> SPOF yes, but that is not a problem because clients will continue using an old
> list.
> 
> We will have to check how long signing takes. It cannot take ages. But we will
> need to do this for each client since we randomize all mirrors. Or we implement
> the randomization at the client; then we can sign one list per country and
> cache it, which is quite feasible.
Signing _can_ take ages, especially when we use HSMs here. Mostly, they are not
optimised for speed, but for security. Different mirror lists for several countries
are what Ubuntu does (http://mirrors.ubuntu.com/), but this causes some other problems
(we actually do not want countries, but world "zones"; signing takes time; etc.), so
I consider delivering one mirror list in general the best practice here.

What if an IPFire system moves (mobile device, laptop, LTE uplink in a truck, ...)?
It would need to connect to a central server - which is actually what we are
trying to avoid - and fetch a new list suitable for its current location to
benefit from faster mirrors. Does not sound very great to me. :-|

(Playing the devil's advocate here, please do not take this personally.)
> 
>> Further, some persons like me might
>> argue that this leaks IPs since all clients must connect to a central server.
>> If we distribute a signed mirror list via the mirrors (as we do at the
>> moment), we need to implement an algorithm for selecting servers from that
>> list on the clients. Further, we bump into the problem that a client needs
>> to know its public IP and that we need to cache the selection results to
>> avoid excessive DNS and GeoIP/libloc queries.
> 
> "Leaking" the client's IP address isn't solved when there is a fallback to
> another central service. That just solves it for a number of clients but not
> all.
As mentioned above, we have to lose some privacy if we want connectivity; I consider
that OK in special cases such as the ones mentioned above.
> 
>> Since we need to implement a selection algorithm _somewhere_, I only consider
>> determining public IPs a real problem and would therefore prefer the second
>> scenario.
> 
> Unless you really really object, I would like to cut this conversation short and
> would like to propose that we go with the current implementation. It does not
> have any severe disadvantages over the other approach which just has other
> disadvantages. Our users won't care that much about this tiny detail and we
> could potentially change it later.
Okay, I agree. You won. :-)

Best regards,
Peter Müller
> 
> If you want to hide your IP address, you can use Tor and then the server-based
> approach should tick all your boxes.
> 
[...]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-04-24 19:23 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-10 17:15 [Discussion] Privacy and security for IPFire updates Peter Müller
2018-04-14  6:35 ` Matthias Fischer
2018-04-16 11:23 ` Michael Tremer
2018-04-16 15:25   ` Peter Müller
2018-04-16 21:12     ` Michael Tremer
2018-04-21 17:55       ` Peter Müller
2018-04-24 11:03         ` Michael Tremer
2018-04-24 19:23           ` Peter Müller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox