From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nick Howitt To: development@lists.ipfire.org Subject: Re: Stop unbound-dhcp-leases-bridge from continually restarting unbound Date: Mon, 25 Mar 2024 11:46:58 +0000 Message-ID: <21c9ef28-5340-4712-adde-4def514038f4@howitts.co.uk> In-Reply-To: <1333C6E1-0D16-4E09-90E3-E0CF132383EA@ipfire.org> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="===============8737468404005113021==" List-Id: --===============8737468404005113021== Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable On 25/03/2024 11:21, Michael Tremer wrote: >=20 > Hello, >=20 >> On 24 Mar 2024, at 15:25, Nick Howitt wrote: >> >> Hi all, >> >> Please bear with me as I am new to IPFire and not at all used to this styl= e of development, so any guidance would be appreciated. >=20 > No worries, feel free to ask any questions that you have=E2=80=A6 There is = documentation around all of this here: >=20 > https://www.ipfire.org/docs/devel >=20 > Especially here: >=20 > https://www.ipfire.org/docs/devel/submit-patches >=20 >> There is a bug, https://bugzilla.ipfire.org/show_bug.cgi?id=3D13254, where= unbound-dhcp-leases-bridge restarts unbound a lot when it is totally unneces= sary. It appears to be because when a lease is renewed, the script gets trigg= ered, created a new /etc/unbound/dhcp-leases.conf then restarts Unbound to re= ad the file. Generally this seems to be unnecessary as, with a renewed lease,= the old and new /etc/unbound/dhcp-leases.conf files are the same. With 5 lea= ses in that file (one for a machine not active) I am getting 3-4 restarts an = hour of Unbound when I have a min/max lease time of 60/120min. >=20 > I am still not entirely sure why it is such an issue to reload Unbound that= often, but I am happy to avoid doing this whenever possible. >=20 I am seeing odd pauses in browsing and I was wondering if it was related=20 to having to wait on DNS lookups as Unbound was reloading. That is when=20 I bumped into the bugzilla report and noticed others possibly=20 experiencing something similar. >> Looking at the code, it is fairly easy to fix. The current code creates a = temp file then copies it into place then restarts unbound. All I am doing is = doing a file comparison before the copy and skipping the restart if the files= are the same. >=20 > Yeah, I think this is a workable solution to mitigate the problem slightly. >=20 > Note that Jon has sent a couple of ideas a while back solving this problem = by removing the bridge and running shell commands instead, but I did not have= the time to look into how robust that solution would be. With the bridge, we= have at least a certain amount of confidence. >=20 I like what Jon has done and want to evaluate it. Also, as it is in Bash=20 I have a better chance of understanding all of it. I would consider my=20 solution as a short term sticking plaster and Jon's as the proper way to=20 go longer term. >> There were a couple of gotchas because setting the file attributes and cop= ying the file were done inside the "with" block for generating the temporary = file. This meant a file comparison always returned False as the temp file was= still open and so never the same is the old file. >> >> I moved those two statements outside the "with". This forced me to change = the fchmod to chmod. >=20 > In theory you could have left it there, because even if the file is being r= emoved later, there is no harm in changing permissions before. >=20 Agreed. It just didn't seem to be the right place inside the with=20 statement. It also meant using a file descriptor rather than a file name=20 which I find harder to read/understand as I am not a programmer, just a=20 hacker. As I was in that area, I thought it a good opportunity to change it. >> It could be argued that the file copy should not be done if the files are = not different, but it would take an extra "if" and you'd have to remember to = delete the temp file. If required, I can make that change. >=20 > I think I can live with that. This while thing is very far from efficient a= nyways and serves as some glue between Unbound and ISC DHCP. The latter will = definitely go soon as it has become EOL and whenever we make that transition = we will have to come up with something new which hopefully does not have the = shortcomings that the bridge has. >=20 >> Also, one small thing I noticed once is that the old and new dhcp-leases.c= onf files could occasionally contain the same leases but in a different order= . I have been unable to reproduce, but to sidestep it, instead of stepping th= rough the leases variable directly, I sort it and step through that. It shoul= d make the resulting file completely deterministic and make the file comparis= on more effective. >=20 > I think we are evaluating where to do the maths. I always considered it per= fectly safe to just throw the new file into the Unbound configuration, have U= nbound read it again and then everything is good. I would also argue that tha= t works without any problems even in places where I have 2000 active leases w= ith a rather short lease time. However, there is the log message=E2=80=A6 >=20 > Sorting creates some extra work, but I am sure that this is a lot cheaper t= han reloading Unbound. So once again, I am happy to go down this route. >=20 >> >> My patch is: >> >> From 73873d4944944a2f02317a73074a6894726f36f7 Mon Sep 17 00:00:00 2001 >> From: Nick Howitt >> Date: Sun, 24 Mar 2024 15:17:19 +0000 >> Subject: [PATCH] Stop unbound-dhcp-leases-bridge from restarting unbound e= very >> time it write dhcp-leases.conf as it is very often unchanged and does not >> require a restart. >> >> --- >> config/unbound/unbound-dhcp-leases-bridge | 28 +++++++++++++++-------- >> 1 file changed, 19 insertions(+), 9 deletions(-) >> >> diff --git a/config/unbound/unbound-dhcp-leases-bridge b/config/unbound/un= bound-dhcp-leases-bridge >> index e9f022aff..d22772066 100644 >> --- a/config/unbound/unbound-dhcp-leases-bridge >> +++ b/config/unbound/unbound-dhcp-leases-bridge >> @@ -22,6 +22,7 @@ >> import argparse >> import datetime >> import daemon >> +import filecmp >=20 > This module does however not seem very fit for purpose for me. It tries to = check the metadata of the file using stat() calls which will not do at all wh= at we want them to do and it will cache any results which is also not at all = helpful. >=20 > I would suggest to simply implement this function yourself like so: >=20 > def compare(f1, f2): > buffersize =3D 16 * 1024 >=20 > # Reset file handles > f1.seek(0) > f2.seek(0) >=20 > while True: > b1 =3D f1.read(buffersize) > b2 =3D f2.read(buffersize) >=20 > if not b1 =3D=3D b2: > return False >=20 > elif not b1 or not b2: > break >=20 > return True >=20 > You can pass the file descriptors to each of the files and since we are now= changing a couple of things, former assumptions can be dropped: >=20 > * Unbound used to reload the configuration file, but there could have been = the case of concurrent reads. Therefore the entire atomic file replacement ca= n be removed. > * With that, you could remove the delete=3DFalse attribute from the tempora= ry file. You already have the target file open and you can just copy the cont= ent from the temporary file. Because of the former assumption I would conside= r that safe. >=20 > That way, you will be able to avoid copying anything if the files don=E2=80= =99t match and you will not have to introduce any extra code to remove the te= mporary file. And the chmod statement can go, too :) >=20 If you ise filecmp.cmp with "shallow=3DFalse" it does not use stat() and=20 does a proper file comparison. It that is OK with you, I'd like to keep=20 using filecmp as it is a one-liner. >> import functools >> import ipaddress >> import logging >> @@ -516,26 +517,35 @@ class UnboundConfigWriter(object): >> >> def update_dhcp_leases(self, leases): >> # Write out all leases >> - self.write_dhcp_leases(leases) >> + if self.write_dhcp_leases(leases): >> >> - log.debug("Reloading Unbound...") >> + log.debug("Reloading Unbound...") >> >> - # Reload the configuration without dropping the cache >> - self._control("reload_keep_cache") >> + # Reload the configuration without dropping the cache >> + self._control("reload_keep_cache") >> + >> + else: >> + >> + log.debug("Not reloading Unbound. Leases not changed.") >> >> def write_dhcp_leases(self, leases): >> log.debug("Writing DHCP leases...") >> >> with tempfile.NamedTemporaryFile(mode=3D"w", delete=3DFalse) as f: >> - for l in leases: >> + for l in sorted(leases): >> for rr in l.rrset: >> f.write("local-data: \"%s\"\n" % " ".join(rr)) >> >> - # Make file readable for everyone >> - os.fchmod(f.fileno(), stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROT= H) >> + filecmp.clear_cache() >> + RequireUnboundReload =3D not filecmp.cmp(f.name, self.path, shallow=3DFa= lse) >=20 > Oh and please use snake case for variables. It is Python after all. > Never heard of the expression, but then I'm a hacker, not a programmer!=20 I'll change it. >> + # Make file readable for everyone >> + os.chmod(f.name, stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH) >> + >> + # Move the file to its destination >> + os.rename(f.name, self.path) >> >> - # Move the file to its destination >> - os.rename(f.name, self.path) >> + return RequireUnboundReload >> >> def _control(self, *args): >> command =3D ["unbound-control"] >> --=20 >> 2.43.0 >> >> I have this locally on a branch on my firewall. If it is OK, do I need som= e sort of rights to push my branch and then how I get it merged? >=20 > I can create your own Git repository for you to backup/manage your own chan= ges, but we do not allow anyone apart from two people to push into the master= repository. >=20 I only have basic knowledge of Git and am not sure how much I will do as=20 contribs. I am happy to work locally. If I had my own git repo, would I=20 then have to create a merge request as well as submit by email? > I am also very happy to see your first code contribution to IPFire! >=20 > -Michael >=20 >> >> Regards, >> >> Nick >=20 --===============8737468404005113021==--