From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nick Howitt To: development@lists.ipfire.org Subject: Re: [PATCH] Stop unbound-dhcp-leases-bridge from continually restarting unbound Date: Wed, 27 Mar 2024 15:28:10 +0000 Message-ID: In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1946740898555500792==" List-Id: --===============1946740898555500792== Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable On 26/03/2024 11:10, Michael Tremer wrote: > > >> On 25 Mar 2024, at 13:11, Nick Howitt wrote: >> >> >> >> On 25/03/2024 11:31, Michael Tremer wrote: >>> Hello, >>> I don=E2=80=99t think that the risk is very high, but I would account for= this problem. >>> With my proposed changes from my previous email this should not be a larg= e problem at all. When opening /etc/unbound/dhcp-leases.conf with mode =E2=80= =9Cr+=E2=80=9D you can catch the Exception and skip the comparison and open t= he file again with =E2=80=9Cw=E2=80=9D. >>>> On 25 Mar 2024, at 10:10, Nick Howitt wrote: >>>> >>>> If there is a risk that the /etc/unbound/dhcp-leases.conf does not exist= , for example on a clean install (I don't know), the filecmp.cmp line needs t= o be wrapped in a try...except to stop it blowing up: >>>> >>>> try: >>>> RequireUnboundReload =3D not filecmp.cmp(f.name, self.path, shallow=3DFa= lse) >>>> except: >>>> RequireUnboundReload =3D True >>> I generally don=E2=80=99t like blanco exception catches. You will never k= now about the daemons that you are hiding. >>>> This will force the file to be to be generated, even if it is only copyi= ng in a blank temp file, and unbound will restart. It will only ever happen o= nce. >>>> >>>> >>>> On 24/03/2024 15:49, Nick Howitt wrote: >>>>> Hi all, >>>>> **** This is a repost of an email incorrectly submitted. **** >>>>> Please bear with me as I am new to IPFire and not at all used to this s= tyle of development, so any guidance would be appreciated. >>>>> There is a bug, https://bugzilla.ipfire.org/show_bug.cgi?id=3D13254, wh= ere unbound-dhcp-leases-bridge restarts unbound a lot when it is totally unne= cessary. It appears to be because when a lease is renewed, the script gets tr= iggered, created a new /etc/unbound/dhcp-leases.conf then restarts Unbound to= read the file. Generally this seems to be unnecessary as, with a renewed lea= se, the old and new /etc/unbound/dhcp-leases.conf files are the same. With 5 = leases in that file (one for a machine not active) I am getting 3-4 restarts = an hour of Unbound when I have a min/max lease time of 60/120min. >>>>> Looking at the code, it is fairly easy to fix. The current code creates= a temp file then copies it into place then restarts unbound. All I am doing = is doing a file comparison before the copy and skipping the restart if the fi= les are the same. >>>>> There were a couple of gotchas because setting the file attributes and = copying the file were done inside the "with" block for generating the tempora= ry file. This meant a file comparison always returned False as the temp file = was still open and so never the same is the old file. >>>>> I moved those two statements outside the "with". This forced me to chan= ge the fchmod to chmod. >>>>> It could be argued that the file copy should not be done if the files a= re not different, but it would take an extra "if" and you'd have to remember = to delete the temp file. If required, I can make that change. >>>>> Also, one small thing I noticed once is that the old and new dhcp-lease= s.conf files could occasionally contain the same leases but in a different or= der. I have been unable to reproduce, but to sidestep it, instead of stepping= through the leases variable directly, I sort it and step through that. It sh= ould make the resulting file completely deterministic and make the file compa= rison more effective. >>>>> My patch is: >>>>> From 73873d4944944a2f02317a73074a6894726f36f7 Mon Sep 17 00:00:00 2001 >>>>> From: Nick Howitt >>>>> Date: Sun, 24 Mar 2024 15:17:19 +0000 >>>>> Subject: [PATCH] Stop unbound-dhcp-leases-bridge from restarting unboun= d every >>>>> time it write dhcp-leases.conf as it is very often unchanged and does= not >>>>> require a restart. >>>>> --- >>>>> config/unbound/unbound-dhcp-leases-bridge | 28 +++++++++++++++-------- >>>>> 1 file changed, 19 insertions(+), 9 deletions(-) >>>>> diff --git a/config/unbound/unbound-dhcp-leases-bridge b/config/unbound= /unbound-dhcp-leases-bridge >>>>> index e9f022aff..d22772066 100644 >>>>> --- a/config/unbound/unbound-dhcp-leases-bridge >>>>> +++ b/config/unbound/unbound-dhcp-leases-bridge >>>>> @@ -22,6 +22,7 @@ >>>>> import argparse >>>>> import datetime >>>>> import daemon >>>>> +import filecmp >>>>> import functools >>>>> import ipaddress >>>>> import logging >>>>> @@ -516,26 +517,35 @@ class UnboundConfigWriter(object): >>>>> def update_dhcp_leases(self, leases): >>>>> # Write out all leases >>>>> - self.write_dhcp_leases(leases) >>>>> + if self.write_dhcp_leases(leases): >>>>> - log.debug("Reloading Unbound...") >>>>> + log.debug("Reloading Unbound...") >>>>> - # Reload the configuration without dropping the cache >>>>> - self._control("reload_keep_cache") >>>>> + # Reload the configuration without dropping the cache >>>>> + self._control("reload_keep_cache") >>>>> + >>>>> + else: >>>>> + >>>>> + log.debug("Not reloading Unbound. Leases not changed.") >>>>> def write_dhcp_leases(self, leases): >>>>> log.debug("Writing DHCP leases...") >>>>> with tempfile.NamedTemporaryFile(mode=3D"w", delete=3DFalse) = as f: >>>>> - for l in leases: >>>>> + for l in sorted(leases): >>>>> for rr in l.rrset: >>>>> f.write("local-data: \"%s\"\n" % " ".join(rr)) >>>>> - # Make file readable for everyone >>>>> - os.fchmod(f.fileno(), stat.S_IRUSR|stat.S_IWUSR|stat.S_IRG= RP|stat.S_IROTH) >>>>> + filecmp.clear_cache() >>>>> + RequireUnboundReload =3D not filecmp.cmp(f.name, self.path, sh= allow=3DFalse) >>>>> + >>>>> + # Make file readable for everyone >>>>> + os.chmod(f.name, stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S= _IROTH) >>>>> + >>>>> + # Move the file to its destination >>>>> + os.rename(f.name, self.path) >>>>> - # Move the file to its destination >>>>> - os.rename(f.name, self.path) >>>>> + return RequireUnboundReload >>>>> def _control(self, *args): >>>>> command =3D ["unbound-control"] >> Resubmitting the patch. >> >> Following my previous reply, I would like to stick with filecmp as I am us= ing it with "shallow=3DFalse" to avoid doing the comparison just with stat() = calls. Note that I also flush the filecmp cache before using the command so i= t is running with the latest file data. Talking about file descriptors, atomi= c file replacement etc., scares me. > File descriptors are great. They avoid race-conditions and all sorts of thi= ngs. But I have never used them so I don't intend to go that way. Too much=20 for me to get my brain round and a limited ability to test. > > I don=E2=80=99t think that module is the best choice really if you have to = hack around and flush the cache. Why would it even cache things in the first = place? I don=E2=80=99t see where the cache helps. From the filecmp.clear_cache documentation: Clear the filecmp cache. This may be useful if a file is compared so quickly after it is modified that it is within the mtime resolution of the underlying filesystem. Now this makes me believe that the clear_cache is only relevant when=20 using shallow=3DTrue, which I'm not and I don't know how to check the=20 underlying code. It is used to make sure the stat() info is updated, but=20 we are not using the stat() info. I guess we could as easily add a one=20 second sleep instead as that is way longer than the mtime resolution. The module is chosen because all the references I read about comparing=20 files preferred this one-liner for comparing files, although other=20 methods were given and there were lots of discussions about large files=20 (GB+) because if you chomp them it may be much quicker to say they are=20 not the same when you find the first block that does not match. Having=20 said this, I don't know how filecmp works, but we only have a small=20 file. At the same time, I am not a regular programmer so simple=20 one-liners are good for me. >> I have modified the earlier try/except I proposed to catch only FileNotFou= ndError, and I have changed CamelCase to snake_case. >> >> --- >> config/unbound/unbound-dhcp-leases-bridge | 31 ++++++++++++++++------- >> 1 file changed, 22 insertions(+), 9 deletions(-) >> >> diff --git a/config/unbound/unbound-dhcp-leases-bridge b/config/unbound/un= bound-dhcp-leases-bridge >> index e9f022aff..a9ec8e1e2 100644 >> --- a/config/unbound/unbound-dhcp-leases-bridge >> +++ b/config/unbound/unbound-dhcp-leases-bridge >> @@ -22,6 +22,7 @@ >> import argparse >> import datetime >> import daemon >> +import filecmp >> import functools >> import ipaddress >> import logging >> @@ -516,26 +517,38 @@ class UnboundConfigWriter(object): >> >> def update_dhcp_leases(self, leases): >> # Write out all leases >> - self.write_dhcp_leases(leases) >> + if self.write_dhcp_leases(leases): >> >> - log.debug("Reloading Unbound...") >> + log.debug("Reloading Unbound...") >> >> - # Reload the configuration without dropping the cache >> - self._control("reload_keep_cache") >> + # Reload the configuration without dropping the cache >> + self._control("reload_keep_cache") >> + >> + else: >> + >> + log.debug("Not reloading Unbound. Leases not changed.") >> >> def write_dhcp_leases(self, leases): >> log.debug("Writing DHCP leases...") >> >> with tempfile.NamedTemporaryFile(mode=3D"w", delete=3DFalse) as f: >> - for l in leases: >> + for l in sorted(leases): >> for rr in l.rrset: >> f.write("local-data: \"%s\"\n" % " ".join(rr)) >> >> - # Make file readable for everyone >> - os.fchmod(f.fileno(), stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROT= H) >> + filecmp.clear_cache() >> + try: >> + require_unbound_reload =3D not filecmp.cmp(f.name, self.path, shallow=3D= False) >> + except FileNotFoundError: >> + require_unbound_reload =3D True > If you want to go this route, you will have to flush the temporary file fir= st. Otherwise you might have some data left in the buffer that has not been w= ritten (because the file is still open). You will then compare incomplete out= put. This would not become a problem if you used the file descriptors. The temp file is closed at the point of doing the filecmp, and the=20 filecmp cache has been flushed. How do I flush the temporary file=20 further. Reading up, is it just having the last statement in the with=20 block as f.flush() with the same indent as "for l in sorted(leases):" > > Catching FileNotFound would work for me. Good > >> + >> + # Make file readable for everyone >> + os.chmod(f.name, stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH) >> + >> + # Move the file to its destination >> + os.rename(f.name, self.path) >> >> - # Move the file to its destination >> - os.rename(f.name, self.path) >> + return require_unbound_reload >> >> def _control(self, *args): >> command =3D ["unbound-control"] >> --=20 >> 2.43.0 >> >> >> Nick --===============1946740898555500792==--