From: Michael Tremer
To: "IPFire: Development-List"
Subject: This Week In Pakfire: Mirror Management
Date: Fri, 14 Mar 2025 08:00:02 +0000
Message-Id: <27124191-C293-40F6-A60A-2DE353D63523@ipfire.org>

This is the start of a small blog series. Yes, blog series. Although this list is not a blog, I think this is the best place to keep everyone in the loop of what I am actually doing. I believe that I have been working too much in the dark as far as large parts of the community are concerned, which I don’t want to do.

This series might go on for a little while and the posts will be deep dives into some technology that has been built into Pakfire. Feel free to ask me questions if you are interested.

## Mirror Management

Although this is not the most essential feature of Pakfire, it is necessary to ensure that we can provide packages fast and guarantee their integrity for users all over the world.

For that, we will have a number of mirror servers spread all over the globe. They are hosted by universities, web hosting providers, and organisations of all kinds. Although we vet them and they all have good credentials, there is a chance that their infrastructure might be compromised. With so many mirror servers in the world, how could we possibly keep track of them?

And of course, they are there to fail. The system is designed so that mirrors can be down, can serve corrupted files, and so on. Pakfire is there to make sure that it finds the fastest, functioning mirror server.

This needs to happen in two places:

The first is when downloading packages. They are small files and they are downloaded fairly often.
IPFire is a modular distribution but most people will have the same set of base packages installed. If there is an update, Pakfire will try to download a package from the first mirror server on the mirror list that was provided. If the mirror server responds with a 404, or if there is any other problem, Pakfire will move on to the next mirror server. Simples.

Servers might not have the right files if they are out of sync. To avoid trying an out-of-sync mirror too often, Pakfire will keep track of how many downloads have failed, and if there have been too many, it will disable the mirror. A mirror might also be disabled immediately if there has been an unrecoverable error, for example an expired TLS certificate, the mirror not responding at all, etc…

During the download, the checksum of the downloaded file will be computed and compared against the one we expect. If it does not match, we know that the mirror is either out of sync and serving an old file, or that there has been a problem where the file has been corrupted by a filesystem problem or a broken hard drive, or has been replaced by some adversary. In that case, we will throw away the package and download it again from another mirror until we have found the right file. This feature means that we do not have to trust any of the providers, and it also guarantees that nobody else, like a web proxy, has tampered with the file.

## How do packages get onto a mirror server?

In the Pakfire Build Service, each repository has a flag that can be enabled to sync it to our master mirror. Usually, we only do this for anything that will be downloaded by a lot of people, like stable releases.

Testing repositories change too often and will be downloaded only by a few people so the sync traffic is not worth it, and since mirrors are very likely to be out of sync with the fast development pace, chances are high that Pakfire will come back to the master mirror anyway.

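The client-side fallback described in the previous section (walk the mirror list, verify the checksum, disable mirrors that keep failing) is also what sends a client back to the master mirror when everything else is out of sync. Here is a minimal sketch of that loop, not Pakfire's actual code. The names `Mirror`, `FetchError`, `MAX_FAILURES` and the `fetch` callable are made up for illustration, SHA-256 is an assumed checksum algorithm, and the threshold of three failures is a guess, since "too many" is not quantified above.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical threshold: the post only says "too many" failed downloads.
MAX_FAILURES = 3


class FetchError(Exception):
    """A download attempt failed; `fatal` marks unrecoverable errors
    such as an expired TLS certificate or a mirror that does not respond."""

    def __init__(self, message: str, fatal: bool = False):
        super().__init__(message)
        self.fatal = fatal


@dataclass
class Mirror:
    url: str
    failures: int = 0
    enabled: bool = True

    def record_failure(self, fatal: bool = False) -> None:
        # Disable immediately on a fatal error, otherwise after too many failures
        self.failures += 1
        if fatal or self.failures >= MAX_FAILURES:
            self.enabled = False


def download(
    path: str,
    expected_sha256: str,
    mirrors: List[Mirror],
    fetch: Callable[[Mirror, str], bytes],
) -> bytes:
    """Walk the (already sorted) mirror list from top to bottom and
    return the first payload whose checksum matches."""
    for mirror in mirrors:
        if not mirror.enabled:
            continue

        try:
            data = fetch(mirror, path)
        except FetchError as e:
            # 404, timeout, TLS problem, ...: move on to the next mirror
            mirror.record_failure(fatal=e.fatal)
            continue

        # Throw away anything that is not the file we asked for.
        # Assumption: a mismatch counts as an ordinary failed download.
        if hashlib.sha256(data).hexdigest() != expected_sha256:
            mirror.record_failure()
            continue

        return data

    raise FetchError(f"no mirror could provide {path}")
```

Passing `fetch` in as a parameter keeps the sketch independent of any HTTP library; in practice it would wrap whatever performs the actual transfer.
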
To allow downloads when the master mirror is down we might want to add maybe a few selected mirrors, but currently this is not important enough to be implemented.

The build service is running on a different host to the master mirror, and so repositories will be generated on a different machine and regularly synced to the master mirror, from which all other mirror servers pull.

## What about that second place you mentioned?

We will also have to steer people to the right mirror server for them when they are downloading an image - like an ISO file. This currently happens from the build service only, but I can see how this will also handle actual downloads from the main website.

For this, there is a special handler implemented in the build service that has a lot of features. Mainly it will redirect a downloader to a certain mirror based on where they are coming from - by their IP address that is.

But this is a rather complicated algorithm. We will select all mirror servers and then order them by priority. This priority is very different for each client, because it tries to estimate how “close” you are to the mirror.

Proximity on the Internet is hard to determine. The borders of a country don’t matter, so we don’t start there. The algorithm starts with checking if the client is in the same Autonomous System as the mirror. If so, that mirror will be most preferred, because you will be downloading from the same provider or even the same building. Nothing should be as fast as this. Then we consider the country code of the client and the mirror. If they match, the mirror will be preferred next. Lastly, we will check if a mirror is on the same continent.

Starting from the closest mirror, the build service will check if the file that we are looking for is available on the mirror and redirect the client.
This all happens within milliseconds using IPFire Location, and we will cache whether a mirror had a file available or not.

The same algorithm is used when Pakfire clients download a mirror list from the build service. This happens at least once every 24 hours and mirrors will be sorted closest first. That allows the download process described above to be dumb and simply walk through the list from top to bottom. It would be too complicated to measure mirror distance in Pakfire itself.

Ah, and last, the download handler mentioned above can also answer HEAD requests to give download managers some extra meta information and avoid downloading images again if the client thinks it might already have the right file.

You can see the code here - of course it uses all other features we have like rate limiting, etc:

https://git.ipfire.org/?p=pbs.git;a=blob;f=src/web/mirrors.py;h=b68fb6284ae88e6f47bb6bc481f77ad0cf88b28e;hb=HEAD#l129
https://git.ipfire.org/?p=pbs.git;a=blob;f=src/buildservice/mirrors.py;h=5456535c91051bb8c97e953ad11e3d21c101738a;hb=HEAD#l317

I believe that all this will set us up very nicely to ensure that people can download IPFire fast and are guaranteed to get the right software that has not been injected with malware or just accidentally corrupted.
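
For anyone who would rather read the mirror ordering as code than as prose, the following is a rough sketch of the priority steps (same Autonomous System, then same country, then same continent) and of picking the first mirror that has the file. It is not the build service's implementation, which is linked above. The `Location` and `Mirror` classes, the numeric priority values and the `has_file` callable are assumptions made up for this example, and the lookup of a client's location via IPFire Location is left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Location:
    # Hypothetical container for what a location lookup might return
    asn: Optional[int]
    country: Optional[str]
    continent: Optional[str]


@dataclass(frozen=True)
class Mirror:
    hostname: str
    location: Location


def priority(client: Location, mirror: Mirror) -> int:
    """Lower is closer: 0 = same AS, 1 = same country,
    2 = same continent, 3 = everything else."""
    loc = mirror.location

    if client.asn is not None and client.asn == loc.asn:
        return 0
    if client.country is not None and client.country == loc.country:
        return 1
    if client.continent is not None and client.continent == loc.continent:
        return 2
    return 3


def sort_mirrors(client: Location, mirrors: List[Mirror]) -> List[Mirror]:
    # sorted() is stable, so mirrors with equal priority keep their input order
    return sorted(mirrors, key=lambda m: priority(client, m))


def pick_mirror(
    client: Location,
    mirrors: List[Mirror],
    has_file: Callable[[Mirror], bool],
) -> Optional[Mirror]:
    """Return the closest mirror that has the file, or None if none has it.
    `has_file` stands in for the (cached) availability check."""
    for mirror in sort_mirrors(client, mirrors):
        if has_file(mirror):
            return mirror
    return None
```

The same `sort_mirrors` result could serve as the mirror list handed to clients, which is what lets the download side stay a plain top-to-bottom walk.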