From: Michael Tremer
To: Bernhard Bitsch
Cc: development@lists.ipfire.org
Subject: Re: Feedback about the DNS FW
Date: Wed, 6 May 2026 09:29:22 +0100

Hello Bernhard,

> On 2 May 2026, at 21:23, Bernhard Bitsch wrote:
>
> The runtime of 'fast_reload' is very long. The reason is that it minimizes response time. This is a known effect in real-time systems: if you minimize the function runtime, you pay for it with worse responsiveness.
> In the case of an unbound reload, you can either restart the process ( resulting in no response between stop and end of startup ), or you update the config while the functionality is running ( with full response based on the intermediate states ). The latter means that two processes are running during the reload, and they must synchronise.
>
> There are three possibilities:
> - fast_reload, unbound is fully responsive but the operation takes some time.
> - reload or reload_keep_cache, unbound may be unresponsive for some time, and in some cases this may result in a restart.
> - restart ( stop, start ), unbound is not responsive until it is fully initialised again.
>
> Which option to choose may depend on the runtime on some devices ( as Michael mentioned ).
> But, as investigated, the fast_reload operation needs a temporary cache. This isn't freed after the reload function terminates. This yields an increase in memory consumption over time. Each change in configuration adds the cache size.
>
> The problem of what to do remains.
> Whether the newest version of unbound handles this problem, I couldn't find out.

I don’t think that this is a very complicated question. It simply isn’t an option to restart Unbound or run a classic reload, because reloading the zones for a minute without any DNS resolution for the network is an outage.

Therefore fast-reload is the only option we have. That it does not free its memory is a bug in my opinion. So that has to be fixed.

-Michael

> Regards,
> Bernhard
>
> On 02.05.2026 at 19:03, Jon Murphy wrote:
>> I am seeing the same as Bernhard in his first post. Last night I disabled phishing (and clicked save) and this morning I saw a big memory jump (from last night).
>> This is on an LWL Mini (APU4D4) with 4 GB RAM.
>> FYI - after clicking Save, it takes about 2 minutes for the DNS Firewall WebGUI page to reload, but unbound is still processing RPZ requests.
>> --
>> Jon
>>> On Thursday, Apr 30, 2026 at 5:06 AM, Michael Tremer wrote:
>>> Hello Bernhard,
>>>
>>>> On 29 Apr 2026, at 23:19, Bernhard Bitsch wrote:
>>>>
>>>> Hello Michael,
>>>>
>>>> On 29.04.2026 at 22:09, Michael Tremer wrote:
>>>>> Hello Bernhard,
>>>>>> On 29 Apr 2026, at 18:43, Bernhard Bitsch wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> after using the new DNS FW ( congrats on this nice feature! ), I found some issues.
>>>>> Thanks. I believe that the entire feature has received very poor testing. Considering how many people have stated how important it is to them, really critical issues have been reported very late in the release process, which indicates that the feature has not been tested, or that people who found those bugs did not report them.
>>>>> I have to say that I am very disappointed about this. But it has nothing to do with your question.
>>>>>> - Each 'save' in the WUI page increases the memory consumption, even if nothing changed. A restart of unbound frees this huge allocation.
>>>>> Yes, this is known.
>>>>> It is a problem inside Unbound and there is nothing we can do about it. I did not report it to Unbound, but I am sure they should be made aware.
>>>>> Unbound in general uses a lot of memory when it is downloading the lists. I have imported the lists into PowerDNS Recursor and it raises its memory consumption by about 300 MiB, whereas Unbound goes up to 1.6-1.7 GiB.
>>>>
>>>> After some more investigation in the unbound docs about the operation fast_reload ("This command is experimental at this time.") and some experiments, I can state that fast_reload doesn't free the copy of the state. This causes the increase in memory consumption.
>>>> The runtime for the 'reload_keep_cache' operation is about the same as for 'fast_reload' ( just my feeling ). But the memory load doesn't increase ( measured just with the WUI memory stats ).
>>>
>>> Well, that is not experimental then, it is utterly broken.
>>>
>>> We are however between a rock and a hard place, because the alternative would be to run a regular reload, which will result in Unbound stopping query processing, reloading the zones and then resuming. On some hardware, it takes minutes to load the zones, which will cause absolute chaos if there is no DNS resolution for that time.
>>>
>>>>> For now we are stuck with Unbound, but it has always been giving us a lot of trouble.
>>>>
>>>> Does this mean we are switching to PowerDNS? But we should have a stable system in the meantime. What about going back to the 'reload_keep_cache' operation?
>>>
>>> No, we are not switching to anything at the moment because I simply don’t have the time. We will however do it at some point in the future. PowerDNS Recursor is one of the candidates because it is very scriptable with Lua. Knot Resolver would also be a good option.
>>>
>>> I tested running them on IPFire and they both work great, but we have a lot of custom tooling which will be quite time-consuming to migrate to any of the other solutions.
>>>
>>> -Michael
>>>
>>>> Regards,
>>>> Bernhard
>>>>>> - Knowing this from using Jon's RPZ prototype, I checked whether a single reload ( used in DNS FW? ) really propagates the changes, i.e. new lists and/or allow/deny entries. I found cases where this isn't true. An unbound restart yielded the right behaviour.
>>>>>>
>>>>>> I must apologize for not having tested the release. But I don't have the equipment yet ( only one production system ).
>>>>>>
>>>>>> Regards,
>>>>>> Bernhard
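
The memory growth per reload described in this thread could be checked outside the WUI. What follows is only a minimal Python sketch, not part of IPFire or Unbound. It assumes a Linux system, that the Unbound PID is passed as the first argument, that `unbound-control fast_reload` is available for the fast_reload operation named above, and that waiting 120 seconds after each reload is enough for memory to settle. The helper names `parse_vmrss_kib`, `read_rss_kib` and `growth_per_step` are hypothetical.

```python
import re
import subprocess
import sys
import time


def parse_vmrss_kib(status_text):
    """Return the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def read_rss_kib(pid):
    """Read the current resident set size of a process from procfs."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vmrss_kib(status_file.read())


def growth_per_step(samples):
    """Return the difference between each pair of consecutive samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def main(pid, rounds=3, settle_seconds=120):
    # Assumption: three rounds and a 120 s settle time are arbitrary choices.
    samples = [read_rss_kib(pid)]
    for _ in range(rounds):
        # Trigger the reload without changing any configuration.
        subprocess.run(["unbound-control", "fast_reload"], check=True)
        time.sleep(settle_seconds)
        samples.append(read_rss_kib(pid))
    print("RSS samples (KiB):", samples)
    print("Growth per reload (KiB):", growth_per_step(samples))


# Only run the measurement when a PID was given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(int(sys.argv[1]))
```

If the growth per reload stays roughly constant and never drops back, that would be consistent with the temporary copy not being freed. Swapping `fast_reload` for `reload_keep_cache` in the command list would give the comparison Bernhard describes.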