From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nick Howitt <nick@howitts.co.uk> To: development@lists.ipfire.org Subject: Re: [PATCH] Stop unbound-dhcp-leases-bridge from continually restarting unbound Date: Thu, 28 Mar 2024 17:25:37 +0000 Message-ID: <74172f8f-7b5e-4c1a-b073-743dd12714d9@howitts.co.uk> In-Reply-To: <E74713AF-AF6B-409A-9885-2F6162FA9D9B@ipfire.org> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="===============5626133137340091642==" List-Id: <development.lists.ipfire.org> --===============5626133137340091642== Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Next version. On 28/03/2024 10:41, Michael Tremer wrote: >=20 > Hello, >=20 >> On 27 Mar 2024, at 15:28, Nick Howitt <nick(a)howitts.co.uk> wrote: >> >> >> >> On 26/03/2024 11:10, Michael Tremer wrote: >>> >>> >>>> On 25 Mar 2024, at 13:11, Nick Howitt <nick(a)howitts.co.uk> wrote: >>>> >>>> >>>> >>>> On 25/03/2024 11:31, Michael Tremer wrote: >>>>> Hello, >>>>> I don=E2=80=99t think that the risk is very high, but I would account f= or this problem. >>>>> With my proposed changes from my previous email this should not be a la= rge problem at all. When opening /etc/unbound/dhcp-leases.conf with mode =E2= =80=9Cr+=E2=80=9D you can catch the Exception and skip the comparison and ope= n the file again with =E2=80=9Cw=E2=80=9D. >>>>>> On 25 Mar 2024, at 10:10, Nick Howitt <nick(a)howitts.co.uk> wrote: >>>>>> >>>>>> If there is a risk that the /etc/unbound/dhcp-leases.conf does not exi= st, for example on a clean install (I don't know), the filecmp.cmp line needs= to be wrapped in a try...except to stop it blowing up: >>>>>> >>>>>> try: >>>>>> RequireUnboundReload =3D not filecmp.cmp(f.name, self.path, shallow=3D= False) >>>>>> except: >>>>>> RequireUnboundReload =3D True >>>>> I generally don=E2=80=99t like blanco exception catches. You will never= know about the daemons that you are hiding. >>>>>> This will force the file to be to be generated, even if it is only cop= ying in a blank temp file, and unbound will restart. It will only ever happen= once. >>>>>> >>>>>> >>>>>> On 24/03/2024 15:49, Nick Howitt wrote: >>>>>>> Hi all, >>>>>>> **** This is a repost of an email incorrectly submitted. **** >>>>>>> Please bear with me as I am new to IPFire and not at all used to this= style of development, so any guidance would be appreciated. >>>>>>> There is a bug, https://bugzilla.ipfire.org/show_bug.cgi?id=3D13254, = where unbound-dhcp-leases-bridge restarts unbound a lot when it is totally un= necessary. It appears to be because when a lease is renewed, the script gets = triggered, created a new /etc/unbound/dhcp-leases.conf then restarts Unbound = to read the file. Generally this seems to be unnecessary as, with a renewed l= ease, the old and new /etc/unbound/dhcp-leases.conf files are the same. With = 5 leases in that file (one for a machine not active) I am getting 3-4 restart= s an hour of Unbound when I have a min/max lease time of 60/120min. >>>>>>> Looking at the code, it is fairly easy to fix. The current code creat= es a temp file then copies it into place then restarts unbound. All I am doin= g is doing a file comparison before the copy and skipping the restart if the = files are the same. >>>>>>> There were a couple of gotchas because setting the file attributes an= d copying the file were done inside the "with" block for generating the tempo= rary file. This meant a file comparison always returned False as the temp fil= e was still open and so never the same is the old file. >>>>>>> I moved those two statements outside the "with". This forced me to ch= ange the fchmod to chmod. >>>>>>> It could be argued that the file copy should not be done if the files= are not different, but it would take an extra "if" and you'd have to remembe= r to delete the temp file. If required, I can make that change. >>>>>>> Also, one small thing I noticed once is that the old and new dhcp-lea= ses.conf files could occasionally contain the same leases but in a different = order. I have been unable to reproduce, but to sidestep it, instead of steppi= ng through the leases variable directly, I sort it and step through that. It = should make the resulting file completely deterministic and make the file com= parison more effective. >>>>>>> My patch is: >>>>>>> From 73873d4944944a2f02317a73074a6894726f36f7 Mon Sep 17 00:00:00 20= 01 >>>>>>> From: Nick Howitt <nick(a)howitts.co.uk> >>>>>>> Date: Sun, 24 Mar 2024 15:17:19 +0000 >>>>>>> Subject: [PATCH] Stop unbound-dhcp-leases-bridge from restarting unbo= und every >>>>>>> time it write dhcp-leases.conf as it is very often unchanged and do= es not >>>>>>> require a restart. >>>>>>> --- >>>>>>> config/unbound/unbound-dhcp-leases-bridge | 28 +++++++++++++++-----= --- >>>>>>> 1 file changed, 19 insertions(+), 9 deletions(-) >>>>>>> diff --git a/config/unbound/unbound-dhcp-leases-bridge b/config/unbou= nd/unbound-dhcp-leases-bridge >>>>>>> index e9f022aff..d22772066 100644 >>>>>>> --- a/config/unbound/unbound-dhcp-leases-bridge >>>>>>> +++ b/config/unbound/unbound-dhcp-leases-bridge >>>>>>> @@ -22,6 +22,7 @@ >>>>>>> import argparse >>>>>>> import datetime >>>>>>> import daemon >>>>>>> +import filecmp >>>>>>> import functools >>>>>>> import ipaddress >>>>>>> import logging >>>>>>> @@ -516,26 +517,35 @@ class UnboundConfigWriter(object): >>>>>>> def update_dhcp_leases(self, leases): >>>>>>> # Write out all leases >>>>>>> - self.write_dhcp_leases(leases) >>>>>>> + if self.write_dhcp_leases(leases): >>>>>>> - log.debug("Reloading Unbound...") >>>>>>> + log.debug("Reloading Unbound...") >>>>>>> - # Reload the configuration without dropping the cache >>>>>>> - self._control("reload_keep_cache") >>>>>>> + # Reload the configuration without dropping the cache >>>>>>> + self._control("reload_keep_cache") >>>>>>> + >>>>>>> + else: >>>>>>> + >>>>>>> + log.debug("Not reloading Unbound. Leases not changed.") >>>>>>> def write_dhcp_leases(self, leases): >>>>>>> log.debug("Writing DHCP leases...") >>>>>>> with tempfile.NamedTemporaryFile(mode=3D"w", delete=3DFalse= ) as f: >>>>>>> - for l in leases: >>>>>>> + for l in sorted(leases): >>>>>>> for rr in l.rrset: >>>>>>> f.write("local-data: \"%s\"\n" % " ".join(rr)) >>>>>>> - # Make file readable for everyone >>>>>>> - os.fchmod(f.fileno(), stat.S_IRUSR|stat.S_IWUSR|stat.S_I= RGRP|stat.S_IROTH) >>>>>>> + filecmp.clear_cache() >>>>>>> + RequireUnboundReload =3D not filecmp.cmp(f.name, self.path, = shallow=3DFalse) >>>>>>> + >>>>>>> + # Make file readable for everyone >>>>>>> + os.chmod(f.name, stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat= .S_IROTH) >>>>>>> + >>>>>>> + # Move the file to its destination >>>>>>> + os.rename(f.name, self.path) >>>>>>> - # Move the file to its destination >>>>>>> - os.rename(f.name, self.path) >>>>>>> + return RequireUnboundReload >>>>>>> def _control(self, *args): >>>>>>> command =3D ["unbound-control"] >>>> Resubmitting the patch. >>>> >>>> Following my previous reply, I would like to stick with filecmp as I am = using it with "shallow=3DFalse" to avoid doing the comparison just with stat(= ) calls. Note that I also flush the filecmp cache before using the command so= it is running with the latest file data. Talking about file descriptors, ato= mic file replacement etc., scares me. >>> File descriptors are great. They avoid race-conditions and all sorts of t= hings. >> But I have never used them so I don't intend to go that way. Too much for = me to get my brain round and a limited ability to test. >=20 > Well, I pretty much gave you the entire solution=E2=80=A6 There was nothing= left to do apart from copying and pasting the snippets. >=20 Ok I've tried to use your version but struggled to get it going. If I=20 used it as was and called compare(file1_name, file2_name) it failed as=20 "compare" was not defined. I could call it using self.compare but then it failed with "TypeError:=20 UnboundConfigWriter.compare() takes 2 positional arguments but 3 were=20 given". I changed the function to compare(self, f1, f2). Then it failed with "AttributeError: 'str' object has no attribute=20 'seek'", so I changed it again to accept file names then open the files=20 before doing the seeks. I really hope it is OK as I am out of my depth when defining functions=20 with "self" as an argument. Please can you check it? If it is not OK,=20 can I revert to filecmp? I have made one futher change, the sorted(leases) line was not working=20 and not sorting the object ever, and I saw lines changing places in the=20 resulting leases file. I have changed it to "sorted(leases, key=3Dlambda=20 x: x.ipaddr)". It definitely sorts this time. >>> I don=E2=80=99t think that module is the best choice really if you have t= o hack around and flush the cache. Why would it even cache things in the firs= t place? I don=E2=80=99t see where the cache helps. >> From the filecmp.clear_cache documentation: >> >> Clear the filecmp cache. This may be useful if a file is compared so >> quickly after it is modified that it is within the mtime resolution >> of the underlying filesystem. >> >> Now this makes me believe that the clear_cache is only relevant when using= shallow=3DTrue, which I'm not and I don't know how to check the underlying c= ode. It is used to make sure the stat() info is updated, but we are not using= the stat() info. I guess we could as easily add a one second sleep instead a= s that is way longer than the mtime resolution. >=20 > The function is here and it is rather simple: https://github.com/python/cpy= thon/blob/3.12/Lib/filecmp.py#L30 >=20 > It will always stat() both files and compares three items in case shallow i= s False: The type of the file, the size and mtime. If the file is not a regul= ar file, the function will exist rather early. So in our case, that is all wo= rk the function is doing for thing. It boils down to something very similar t= o what I suggested to use to compare the files. >=20 > On top of that, there is a cache which will always be used but will never r= eturn a match for us because the temporary file at least will always have ano= ther mtime. >=20 > You could clear the cache after every call, but I would rather not have our= code messy and care about implementation details of some Python module. >=20 >> The module is chosen because all the references I read about comparing fil= es preferred this one-liner for comparing files, although other methods were = given and there were lots of discussions about large files (GB+) because if y= ou chomp them it may be much quicker to say they are not the same when you fi= nd the first block that does not match. Having said this, I don't know how fi= lecmp works, but we only have a small file. At the same time, I am not a regu= lar programmer so simple one-liners are good for me. >=20 > You would always want to read both files block for block and compare the bl= ocks. There is no way to implement this significantly faster. >=20 >>>> I have modified the earlier try/except I proposed to catch only FileNotF= oundError, and I have changed CamelCase to snake_case. >>>> >>>> --- >>>> config/unbound/unbound-dhcp-leases-bridge | 31 ++++++++++++++++------- >>>> 1 file changed, 22 insertions(+), 9 deletions(-) >>>> >>>> diff --git a/config/unbound/unbound-dhcp-leases-bridge b/config/unbound/= unbound-dhcp-leases-bridge >>>> index e9f022aff..a9ec8e1e2 100644 >>>> --- a/config/unbound/unbound-dhcp-leases-bridge >>>> +++ b/config/unbound/unbound-dhcp-leases-bridge >>>> @@ -22,6 +22,7 @@ >>>> import argparse >>>> import datetime >>>> import daemon >>>> +import filecmp >>>> import functools >>>> import ipaddress >>>> import logging >>>> @@ -516,26 +517,38 @@ class UnboundConfigWriter(object): >>>> >>>> def update_dhcp_leases(self, leases): >>>> # Write out all leases >>>> - self.write_dhcp_leases(leases) >>>> + if self.write_dhcp_leases(leases): >>>> >>>> - log.debug("Reloading Unbound...") >>>> + log.debug("Reloading Unbound...") >>>> >>>> - # Reload the configuration without dropping the cache >>>> - self._control("reload_keep_cache") >>>> + # Reload the configuration without dropping the cache >>>> + self._control("reload_keep_cache") >>>> + >>>> + else: >>>> + >>>> + log.debug("Not reloading Unbound. Leases not changed.") >>>> >>>> def write_dhcp_leases(self, leases): >>>> log.debug("Writing DHCP leases...") >>>> >>>> with tempfile.NamedTemporaryFile(mode=3D"w", delete=3DFalse) as f: >>>> - for l in leases: >>>> + for l in sorted(leases): >>>> for rr in l.rrset: >>>> f.write("local-data: \"%s\"\n" % " ".join(rr)) >>>> >>>> - # Make file readable for everyone >>>> - os.fchmod(f.fileno(), stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IR= OTH) >>>> + filecmp.clear_cache() >>>> + try: >>>> + require_unbound_reload =3D not filecmp.cmp(f.name, self.path, shallow= =3DFalse) >>>> + except FileNotFoundError: >>>> + require_unbound_reload =3D True >>> If you want to go this route, you will have to flush the temporary file f= irst. Otherwise you might have some data left in the buffer that has not been= written (because the file is still open). You will then compare incomplete o= utput. This would not become a problem if you used the file descriptors. >> The temp file is closed at the point of doing the filecmp, and the filecmp= cache has been flushed. How do I flush the temporary file further. Reading u= p, is it just having the last statement in the with block as f.flush() with t= he same indent as "for l in sorted(leases):" >=20 > If you have left the with block you will already have flushed the buffers a= nd closed the file. It looks like I misread the indentation in the patch then. >=20 >>> >>> Catching FileNotFound would work for me. >> Good >>> >>>> + >>>> + # Make file readable for everyone >>>> + os.chmod(f.name, stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH) >>>> + >>>> + # Move the file to its destination >>>> + os.rename(f.name, self.path) >>>> >>>> - # Move the file to its destination >>>> - os.rename(f.name, self.path) >>>> + return require_unbound_reload >>>> >>>> def _control(self, *args): >>>> command =3D ["unbound-control"] >>>> --=20 >>>> 2.43.0 >>>> >>>> >>>> Nick >=20 >=20 --- config/unbound/unbound-dhcp-leases-bridge | 50 +++++++++++++++++++---- 1 file changed, 41 insertions(+), 9 deletions(-) diff --git a/config/unbound/unbound-dhcp-leases-bridge=20 b/config/unbound/unbound-dhcp-leases-bridge index e9f022aff..80d9523ea 100644 --- a/config/unbound/unbound-dhcp-leases-bridge +++ b/config/unbound/unbound-dhcp-leases-bridge @@ -516,26 +516,37 @@ class UnboundConfigWriter(object): def update_dhcp_leases(self, leases): # Write out all leases - self.write_dhcp_leases(leases) + if self.write_dhcp_leases(leases): - log.debug("Reloading Unbound...") + log.debug("Reloading Unbound...") - # Reload the configuration without dropping the cache - self._control("reload_keep_cache") + # Reload the configuration without dropping the cache + self._control("reload_keep_cache") + + else: + + log.debug("Not reloading Unbound. Leases not changed.") def write_dhcp_leases(self, leases): log.debug("Writing DHCP leases...") with tempfile.NamedTemporaryFile(mode=3D"w", delete=3DFalse) as f: - for l in leases: + for l in sorted(leases, key=3Dlambda x: x.ipaddr): for rr in l.rrset: f.write("local-data: \"%s\"\n" % " ".join(rr)) - # Make file readable for everyone - os.fchmod(f.fileno(),=20 stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH) + try: + require_unbound_reload =3D not self.compare(f.name, self.path) + except FileNotFoundError: + require_unbound_reload =3D True + + # Make file readable for everyone + os.chmod(f.name, stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH) + + # Move the file to its destination + os.rename(f.name, self.path) - # Move the file to its destination - os.rename(f.name, self.path) + return require_unbound_reload def _control(self, *args): command =3D ["unbound-control"] @@ -551,6 +562,27 @@ class UnboundConfigWriter(object): raise e + def compare(self, if1, if2): + buffersize =3D 16 * 1024 + + f1 =3D open(if1, "r") + f2 =3D open(if2, "r") + + # Reset file handles + f1.seek(0) + f2.seek(0) + + while True: + b1 =3D f1.read(buffersize) + b2 =3D f2.read(buffersize) + + if not b1 =3D=3D b2: + return False + + elif not b1 or not b2: + break + + return True if __name__ =3D=3D "__main__": parser =3D argparse.ArgumentParser(description=3D"Bridge for DHCP Leases=20 and Unbound DNS") --=20 2.43.0 I trust that you will find these changes OK. Nick --===============5626133137340091642==--