public inbox for development@lists.ipfire.org
From: "Peter Müller" <peter.mueller@link38.eu>
To: development@lists.ipfire.org
Subject: Re: [Discussion] Privacy and security for IPFire updates
Date: Sat, 21 Apr 2018 19:55:32 +0200
Message-ID: <e8883d81-16a7-4559-7294-66b8dce3f11b@link38.eu>
In-Reply-To: <1523913133.2700.30.camel@ipfire.org>

Hello Michael,

> On Mon, 2018-04-16 at 17:25 +0200, Peter Müller wrote:
>> Hello,
>>> [...]
>> Another point I see here is that an attacker running an evil mirror
>> might deny the existence of new updates by simply not publishing them.
> 
> Yes, this is an attack vector and an easy one.
> 
> We have a timestamp in the repository metadata that is downloaded
> first. It also has a hash with the latest version of the package
> database. The client will walk along all mirrors until it could
> download it. The last place will be the base mirror that will have it.
> 
>   https://pakfire.ipfire.org/repositories/ipfire3/stable/x86_64/repodata/repomd.json
> 
> However, the repository metadata is not signed (as it would be in DNS),
>  but I would argue that it should be.
Agreed.
> 
> It is kind of undefined what will happen when no repository data could
> be downloaded at all or in an interval of about a week.
A client could just move on to the next mirror if the existence of a newer
version is already known (mirrors can be out of sync, too). If not, I am not
sure what the best practice is - DNS lookups might come in handy...
> 
>> Of course, we might detect that sooner or later via a monitoring tool,
>> but in combination with an unsigned mirror list this point becomes
>> more relevant.
> 
> Monitoring is good. It ensures the quality of the mirroring. But the
> system itself needs to be resilient against this sort of attack.
Agreed.
> 
>> Should we publish the current update state (called "Core Update" in 2.x,
>> not sure if it exists in 3.x) via DNS, too? That way, we could avoid
>> pings to the mirrors, so installations only need to connect in case an
>> update has been announced.
> 
> They would only download the metadata from the main service and there
> would be no need to redownload the database again which is large. We
> have to assume that people have a slow connection and bandwidth is
> expensive.
I did not get this. Which database are you talking about here?

My idea was to publish a DNS TXT record (similar to ClamAV) containing the
current Core Update version. Since DNSSEC is obligatory in IPFire, this
information is secured. Clients can look up that record in a certain
time period (twice a day?), and in case anything has changed, they try to
reach a mirror in order to download the update.
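
A minimal sketch of the client-side check, assuming a hypothetical TXT
payload of the form "core-update=120" (neither the record name nor the format
is decided anywhere in this thread). The DNS query and DNSSEC validation are
left to the resolver; only the comparison is shown:

```python
import re

# Hypothetical TXT payload format, e.g. "core-update=120"; this is an assumption.
TXT_PATTERN = re.compile(r"^core-update=(\d+)$")


def parse_core_update(txt_payload):
    """Return the announced Core Update number, or None if the payload is malformed."""
    match = TXT_PATTERN.match(txt_payload.strip())
    return int(match.group(1)) if match else None


def update_announced(txt_payload, installed):
    """True only if the record is well-formed and newer than the installed version."""
    announced = parse_core_update(txt_payload)
    return announced is not None and announced > installed
```

Only when update_announced() returns True would the client contact a mirror
at all; a malformed record is treated as "no update".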

This assumes that we will still have Core Updates in 3.x, and I remember
you saying no. Second, for databases (libloc, ...), clients need to connect
to mirrors sooner or later anyway, so maybe the DNS approach does not work
well here.
> 
>>> Pakfire 2 has the mirror list being distributed over the mirrors. Therefore it
>>> *is* signed.
>>>
>>> Pakfire 3 has a different approach. A central service is creating that list on
>>> demand and tries to *optimise* it for each client. That means putting mirrors
>>> that are closer or have a bigger pipe to the top of the list. Not sure how good
>>> our algorithm is right now, but we can change it on the server-side at any time
>>> and changes on the list will propagate quicker than with Pakfire 2.
>>
>> There are two points I have a different opinion:
>> (a) If I got it right, every client needs to connect to this central
>> server sometimes, which I consider quite bad for various reasons
>> (privacy, missing redundancy, etc.). If we'd distribute the mirror list,
>> we only need a connect at the first time to learn which mirrors are out
>> there.
> 
> A decentralised system is better, but I do not see how we can achieve
> this. A distributed list could of course not be signed.
By "distributed list" you mean the mirror list? Why can't it be signed?
> 
>> After that, a client can use a cached list, and fetch updates from any
>> mirror. In case we have a system at the other end of the world, we also
>> avoid connectivity issues, as we currently observe them in connection
>> with mirrors in Ecuador.
> 
> A client can use a cached list now. The list is only refreshed once a
> day (I think). Updates can then be fetched from any mirror as long as
> the repository data is recent.
I hate to say it, but this does not sound very good (signatures expire,
mirrors go offline, and so on).
> 
>> (b) It might be a postmaster disease, but I was never a fan of moving
>> knowledge from client to server (my favorite example here are MX records,
>> which work much better than implementing fail-over and loadbalancing on
>> the server side).
>>
>> An individual list for every client is very hard to debug, since it
>> becomes difficult to reproduce a connectivity scenario if you do not
>> know which servers the client saw. Second, we have a server side
>> bottleneck here (signing!) and need an always-online key if we decide to
>> sign that list, anyway.
> 
> We do not really care about any connectivity issues. There might be
> many reasons for that and I do not want to debug any mirror issues. The
> client just needs to move on to the next one.
Okay, but then why bother doing all the signing and calculation at one server?
> 
>> I have not taken a look at the algorithm yet, but the idea is to prioritise
>> mirror servers located near the client, assuming that geographic
>> distance correlates with network distance today (not sure if that is
>> correct anyway, but it is definitely better than in the 90s).
> 
> It puts everything in the same country to the top and all the rest to
> the bottom.
> 
> It correlates, but that is it. We should have a list of countries
> nearby an other one. It would make sense to group them together by
> continent, etc. But that is for somewhere else.
Yes, but it sounds easy to implement:

1. Determine my public IP address.
2. Determine country for that IP.
3. Which countries are near mine?
4. Determine preferred mirror servers from these countries.

Am I missing something here?
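
The four steps above can be sketched as follows. This is only an illustration:
the NEIGHBOURS table, the (url, country) tuples and the function name are
assumptions, and steps 1 and 2 (public IP and its country) are taken as an
input. If the country is unknown, the sketch falls back to a random order:

```python
import random

# Hypothetical neighbour table; a real one would come from a maintained data set.
NEIGHBOURS = {"DE": {"NL", "FR", "AT", "CH"}, "EC": {"CO", "PE"}}


def order_mirrors(mirrors, client_country, rng=random):
    """Order (url, country) pairs: same country, then neighbours, then the rest.

    If client_country is None (public IP unknown), return a plain shuffle.
    """
    shuffled = list(mirrors)
    rng.shuffle(shuffled)  # randomise first so load is spread within each tier
    if client_country is None:
        return shuffled
    near = NEIGHBOURS.get(client_country, set())

    def tier(mirror):
        country = mirror[1]
        if country == client_country:
            return 0
        return 1 if country in near else 2

    # sorted() is stable, so the shuffled order survives inside each tier.
    return sorted(shuffled, key=tier)
```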
> 
> Basically the client has no way to measure "distance" or "speed". And I
> do not think it is right to implement this in the client. Just a GeoIP
> lookup requires to resolve DNS for all mirrors and then perform the
> database lookup. That takes a long time and I do not see why this is
> much better than the server-side approach.
True, we need DNS and GeoIP/libloc database lookups here, but this information
can be safely cached for N days. After that, the lookup procedure is repeated.

I do not consider these lookups too bandwidth-consuming or otherwise expensive
as long as we do not perform them every time.
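
A minimal sketch of such a cache, assuming an in-memory dictionary and a
caller-supplied lookup function (both hypothetical; a real client would
persist the entries to disk):

```python
import time


class LookupCache:
    """Cache lookup results (e.g. mirror host -> country) for a fixed number of days."""

    def __init__(self, max_age_days, clock=time.time):
        self.max_age = max_age_days * 86400  # seconds
        self.clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get(self, key, lookup):
        """Return the cached value, calling lookup(key) only if it is missing or stale."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.max_age:
            return entry[1]
        value = lookup(key)
        self._entries[key] = (now, value)
        return value
```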
> 
>> The only problem here is to determine which public IP a client has. But
>> there are ways to work around this, and in the end, we'll probably solve
>> most of the issues (especially dealing with signature expire times) you
>> mentioned.
> 
> Determining the public IP is a huge problem. See ddns.
Yes (carrier-grade NAT, and so on). But most systems will have a public
IP on RED because they manage the PPPoE dial-in themselves. If not, why not
let them look it up as they do with DDNS? In case that is not possible, clients
can still fall back to no mirror preference and just pick mirrors randomly.
> 
>> Any thoughts? :-)
> 
> Yeah, you didn't convince me by assuring that there will be a solution.
> This can be implemented. But is this worth the work and creating a much
> more complex system to solve a problem only half-way?
Maybe we should split up this discussion:
(a) I assume we agree on the privacy and security aspects
(HTTPS only and maybe Tor services) in general.
(b) Signed mirror list: Yes, but using a local mirror must be possible - which
simply overrides the list but that is all right since the user requested to do
so - and it is not a magic bullet.
(c) Individual mirror lists vs. one-size-fits-all: Both ideas have their pros
and cons: If we introduce mirror lists generated for each client individually,
we have a bottleneck (signing?) and a SPOF. Further, some persons like me might
argue that this leaks IPs since all clients must connect to a central server.
If we distribute a signed mirror list via the mirrors (as we do at the moment),
we need to implement an algorithm for selecting servers from that list on the
clients. Further, we bump into the problem that a client needs to know its public
IP and that we need to cache the selection results to avoid excessive DNS and
GeoIP/libloc queries.

Since we need to implement a selection algorithm _somewhere_, I consider only
determining the public IP a real problem and would therefore prefer the second
scenario.
> 
> :)
No hard feelings. :-)
>>> [...]
>>>
>>> We would also need to let the signature expire so that mirrors that are found
>>> out to be compromised are *actually* removed. At the moment the client keeps
>>> using the mirror list until it can download a new one. What would happen if the
>>> download is not possible but the signature on a previous list has expired?
Try the download again a few times, and if that fails, trigger a warning and stop.
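
As a rough sketch of "retry, then warn and stop" (the function and exception
names are hypothetical, and fetch stands for whatever downloads and verifies
the list):

```python
class MirrorListUnavailable(Exception):
    """Raised when no fresh list could be fetched; the caller should warn and stop."""


def fetch_with_retries(fetch, attempts=3):
    """Call fetch() up to `attempts` times; raise instead of reusing an expired list."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except OSError as error:  # network-level failures only
            last_error = error
    raise MirrorListUnavailable(f"gave up after {attempts} attempts") from last_error
```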
>>
>> Since this is a scenario which might happen any time, I'd consider
>> falling back to the main mirror the best alternative.
> 
> What if that is compromised? What would be the contingency plan there?
I was not thinking about that. Another idea might be to check every mirror
server for data with a valid signature - in case the signature has not expired
yet.

But that is not a very good idea, and I propose to discuss these side-channel
issues in another thread. Maybe we can implement some of them later and focus
on the entire construction first.
> 
>>> We would also make the entire package management system very prone to clock
>>> issues. If you are five minutes off, or an hour, the list could have expired and
>>> you cannot download any packages any more or you would always fall back to the
>>> main mirror.
>>
>> Another problem solved by a more intelligent client. :-) :-) :-)
> 
> How? Please provide detail. Pakfire should not set the system clock.
> Ever. That is totally out of scope.
Yes, that's not what I had in mind. My idea here was similar to how Postfix treats
MX records in case one server is offline or causes too much trouble: Disable
that mirror/server locally, and ignore it for a certain time. That might solve
problems in case synchronisation on a mirror is broken.
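
A minimal sketch of that idea, with assumed thresholds (three consecutive
failures, one hour of penalty) and a hypothetical class name. It does not
mirror how Postfix implements this internally:

```python
import time


class MirrorQuarantine:
    """Skip mirrors locally for a while after repeated failures."""

    def __init__(self, max_failures=3, penalty_seconds=3600, clock=time.monotonic):
        self.max_failures = max_failures
        self.penalty = penalty_seconds
        self.clock = clock
        self._failures = {}       # mirror -> consecutive failure count
        self._blocked_until = {}  # mirror -> time at which it may be retried

    def report_failure(self, mirror):
        count = self._failures.get(mirror, 0) + 1
        self._failures[mirror] = count
        if count >= self.max_failures:
            self._blocked_until[mirror] = self.clock() + self.penalty
            self._failures[mirror] = 0

    def report_success(self, mirror):
        self._failures.pop(mirror, None)
        self._blocked_until.pop(mirror, None)

    def usable(self, mirrors):
        """Return the mirrors that are not currently quarantined, keeping their order."""
        now = self.clock()
        return [m for m in mirrors if self._blocked_until.get(m, float("-inf")) <= now]
```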

A second idea is to overlap signatures: If there is little time left before a
list expires, the problem you mentioned becomes more relevant. On the other
hand, if we push a new version of the list to the servers, and the signature of
the old one is still valid for a week or two, clients have more time to update.

If the local copy of a mirror list has expired, a client should try the main
server to get a new one. If it cannot validate it (because the server was
compromised), it is the same as above: Stop and alert.
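
To illustrate the overlap, here is a small sketch. The dictionary layout and
the 30-day/two-week numbers are assumptions, and signature verification is
assumed to have happened before this point:

```python
from datetime import datetime, timedelta, timezone


def pick_valid_list(candidates, now):
    """Return the newest list whose validity window contains `now`, else None.

    Each candidate is a dict with 'issued' and 'expires' datetimes.
    A result of None means: stop and alert.
    """
    valid = [c for c in candidates if c["issued"] <= now < c["expires"]]
    return max(valid, key=lambda c: c["issued"]) if valid else None


t0 = datetime(2018, 4, 1, tzinfo=timezone.utc)
old = {"issued": t0, "expires": t0 + timedelta(days=30)}
new = {"issued": t0 + timedelta(days=16), "expires": t0 + timedelta(days=46)}
# Between day 16 and day 30 both lists are valid, so a client that has not
# fetched the new list yet can keep working with the old one.
chosen = pick_valid_list([old, new], t0 + timedelta(days=20))
```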

We are talking about security here, and as usual, I consider usability a bit out of scope. :-D
>>>> [...]
>>>
>>> But it will by design be a weak signature. We could probably not put the key
>>> into a HSM, etc.
>>
>> In case we do not use individual mirror list, using a key baked into a
>> HSM would be possible here.
> 
> Would bring us back to the signers again. It is hard to do this in a
> VM.
Depends on the interface an HSM uses. If it is USB, chances are good that we
can pass a host USB port through to a VM directly.
> 
>>> [...]
>> Assumed both builder and signer have good connectivity, transferring a
>> package securely sounds good. To avoid MITM attacks, a sort of "builder
>> signature" might be useful - in the end, a package has two or three
>> signatures then:
>>
>> First, it is signed by the builder, to prove that it was built on that
>> machine (in case a package turns out to be compromised, this makes
>> tracing much easier) and it was transferred correctly to the signer.
>> Second, the signer adds its signature, which is assumed to be trusted by
>> Pakfire clients here. If not, we need a sort of "master key", too, but I
>> thought that's what we wanted to avoid here.
> 
> The signature of the builder is not trustworthy. That is precisely why
> we need a signer. The builder is executing untrusted code and can
> therefore be easily compromised.
The signature of the builder is not trustworthy indeed, but that is not
what it's good for. It was intended to prove that a package was built
on a certain builder. In case we stumble over compromised packages some
day, we can safely trace them back to a builder and do forensics there.

If we strip the signature off, this step becomes much harder (especially
when no logs are present anymore), and all builders suddenly could be
compromised since we cannot pinpoint one machine.
> 
> Master keys are bad.
Yes.
> 
>>>
>>>> (b) Privacy
>>>> [...]
>>
>> Well, setting up a Tor mirror server is not very hard (_securing_ it is
>> the hard task here :-) ), but I am unaware how much development effort
>> that will be.
> 
> Tell me what it needs.
I would like to do that in a second topic.
>>> [...]
> 
>> (ii) What does "user's perspective" mean here? Of course, transferring
>> files over Tor is slower, but that does not really matter since updates
>> are not that time critical.
> 
> What steps the user will have to do. So setting up Tor is one thing.
> Installing another service is another thing. Under those circumstances
> it looks like we don't need to change a thing in pakfire since pakfire
> can handle a HTTP proxy. But it wouldn't be a switch in pakfire.
Technically we need to both install Tor and start the Tor service/daemon.
But there are almost no configuration steps to be made, and we can hide the
entire procedure behind a "fetch updates via Tor" switch in some web
frontend.
> 
> Then, there will be DNS traffic.
Which DNS traffic exactly?
> 
>> (iii) /etc/tor/torrc (and Pakfire configuration I do not know, yet).
>> (iv) As usual, it does not make any difference whether a mirror is
>> accessed via Tor or plaintext.
> 
> Under those circumstances, is it even worth hosting a hidden service?
> Why not access the other mirrors?
We can contact the existing mirror servers via Tor, but then we have the
problem that Tor traffic must pass some exit nodes, and so on. I consider
hidden services a better alternative.
> 
>> A good example might be apt-transport-tor
>> (https://packages.debian.org/stretch/apt-transport-tor), not sure how
>> good it fits on IPFire.
>>>
>>>> (ii) Reducing update connections to anybody else
>>>> [...]
>>
>> Since blocklists do not eat up much disk space, I'd say we host
>> everything ourselves we can do (Emerging Threats IDS signatures, or
>> Spamhaus DROP if we want to implement that sometimes, ...).
> 
> We wouldn't have the license to do that.
Don't know, but at that point, I would just ask them. Maybe we do. :-)
> 
>> But we probably need to get in touch with the maintainers first.
> 
> So, this discussion is getting longer and longer. Let's please try to
> keep it on track and high level. If we have certain decisions coming
> out of it, then we can split it up and discuss things more in detail.
> Just want to make sure it doesn't take me an hour to reply to these
> emails...
> 
Best regards,
Peter Müller
