Please can I bump this again to request a review of my patch submitted on 28th March? Thanks, Nick.

On 10/04/2024 12:04, Nick Howitt wrote:
Bump

On 28/03/2024 17:25, Nick Howitt wrote:
Next version.

On 28/03/2024 10:41, Michael Tremer wrote:

Hello,

On 27 Mar 2024, at 15:28, Nick Howitt <nick@howitts.co.uk> wrote:



On 26/03/2024 11:10, Michael Tremer wrote:


On 25 Mar 2024, at 13:11, Nick Howitt <nick@howitts.co.uk> wrote:



On 25/03/2024 11:31, Michael Tremer wrote:
Hello,
I don’t think that the risk is very high, but I would account for this problem.
With my proposed changes from my previous email this should not be a large problem at all. When opening /etc/unbound/dhcp-leases.conf with mode “r+” you can catch the Exception and skip the comparison and open the file again with “w”.
On 25 Mar 2024, at 10:10, Nick Howitt <nick@howitts.co.uk> wrote:

If there is a risk that the /etc/unbound/dhcp-leases.conf does not exist, for example on a clean install (I don't know), the filecmp.cmp line needs to be wrapped in a try...except to stop it blowing up:

try:
RequireUnboundReload = not filecmp.cmp(f.name, self.path, shallow=False)
except:
RequireUnboundReload = True
I generally don’t like blanco exception catches. You will never know about the daemons that you are hiding.
This will force the file to be to be generated, even if it is only copying in a blank temp file, and unbound will restart. It will only ever happen once.


On 24/03/2024 15:49, Nick Howitt wrote:
Hi all,
**** This is a repost of an email incorrectly submitted. ****
Please bear with me as I am new to IPFire and not at all used to this style of development, so any guidance would be appreciated.
There is a bug, https://bugzilla.ipfire.org/show_bug.cgi?id=13254, where unbound-dhcp-leases-bridge restarts unbound a lot when it is totally unnecessary. It appears to be because when a lease is renewed, the script gets triggered, created a new /etc/unbound/dhcp-leases.conf then restarts Unbound to read the file. Generally this seems to be unnecessary as, with a renewed lease, the old and new /etc/unbound/dhcp-leases.conf files are the same. With 5 leases in that file (one for a machine not active) I am getting 3-4 restarts an hour of Unbound when I have a min/max lease time of 60/120min.
Looking at the code, it is fairly easy to fix. The current code creates a temp file then copies it into place then restarts unbound. All I am doing is doing a file comparison before the copy and skipping the restart if the files are the same.
There were a couple of gotchas because setting the file attributes and copying the file were done inside the "with" block for generating the temporary file. This meant a file comparison always returned False as the temp file was still open and so never the same is the old file.
I moved those two statements outside the "with". This forced me to change the fchmod to chmod.
It could be argued that the file copy should not be done if the files are not different, but it would take an extra "if" and you'd have to remember to delete the temp file. If required, I can make that change.
Also, one small thing I noticed once is that the old and new dhcp-leases.conf files could occasionally contain the same leases but in a different order. I have been unable to reproduce, but to sidestep it, instead of stepping through the leases variable directly, I sort it and step through that. It should make the resulting file completely deterministic and make the file comparison more effective.
My patch is:
 From 73873d4944944a2f02317a73074a6894726f36f7 Mon Sep 17 00:00:00 2001
From: Nick Howitt <nick@howitts.co.uk>
Date: Sun, 24 Mar 2024 15:17:19 +0000
Subject: [PATCH] Stop unbound-dhcp-leases-bridge from restarting unbound every
  time it write dhcp-leases.conf as it is very often unchanged and does not
  require a restart.
---
  config/unbound/unbound-dhcp-leases-bridge | 28 +++++++++++++++--------
  1 file changed, 19 insertions(+), 9 deletions(-)
diff --git a/config/unbound/unbound-dhcp-leases-bridge b/config/unbound/unbound-dhcp-leases-bridge
index e9f022aff..d22772066 100644
--- a/config/unbound/unbound-dhcp-leases-bridge
+++ b/config/unbound/unbound-dhcp-leases-bridge
@@ -22,6 +22,7 @@
  import argparse
  import datetime
  import daemon
+import filecmp
  import functools
  import ipaddress
  import logging
@@ -516,26 +517,35 @@ class UnboundConfigWriter(object):
      def update_dhcp_leases(self, leases):
          # Write out all leases
-        self.write_dhcp_leases(leases)
+        if self.write_dhcp_leases(leases):
-        log.debug("Reloading Unbound...")
+            log.debug("Reloading Unbound...")
-        # Reload the configuration without dropping the cache
-        self._control("reload_keep_cache")
+            # Reload the configuration without dropping the cache
+            self._control("reload_keep_cache")
+
+        else:
+
+            log.debug("Not reloading Unbound. Leases not changed.")
      def write_dhcp_leases(self, leases):
          log.debug("Writing DHCP leases...")
          with tempfile.NamedTemporaryFile(mode="w", delete=False) as f:
-            for l in leases:
+            for l in sorted(leases):
                  for rr in l.rrset:
                      f.write("local-data: \"%s\"\n" % " ".join(rr))
-            # Make file readable for everyone
-            os.fchmod(f.fileno(), stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
+        filecmp.clear_cache()
+        RequireUnboundReload = not filecmp.cmp(f.name, self.path, shallow=False)
+
+        # Make file readable for everyone
+        os.chmod(f.name, stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
+
+        # Move the file to its destination
+        os.rename(f.name, self.path)
-            # Move the file to its destination
-            os.rename(f.name, self.path)
+        return RequireUnboundReload
      def _control(self, *args):
          command = ["unbound-control"]
Resubmitting the patch.

Following my previous reply, I would like to stick with filecmp as I am using it with "shallow=False" to avoid doing the comparison just with stat() calls. Note that I also flush the filecmp cache before using the command so it is running with the latest file data. Talking about file descriptors, atomic file replacement etc., scares me.
File descriptors are great. They avoid race-conditions and all sorts of things.
But I have never used them so I don't intend to go that way. Too much for me to get my brain round and a limited ability to test.

Well, I pretty much gave you the entire solution… There was nothing left to do apart from copying and pasting the snippets.

Ok I've tried to use your version but struggled to get it going. If I used it as was and called compare(file1_name, file2_name) it failed as "compare" was not defined.
I could call it using self.compare but then it failed with "TypeError: UnboundConfigWriter.compare() takes 2 positional arguments but 3 were given". I changed the function to compare(self, f1, f2).
Then it failed with "AttributeError: 'str' object has no attribute 'seek'", so I changed it again to accept file names then open the files before doing the seeks.
I really hope it is OK as I am out of my depth when defining functions with "self" as an argument. Please can you check it? If it is not OK, can I revert to filecmp?

I have made one futher change, the sorted(leases) line was not working and not sorting the object ever, and I saw lines changing places in the resulting leases file. I have changed it to "sorted(leases, key=lambda x: x.ipaddr)". It definitely sorts this time.

I don’t think that module is the best choice really if you have to hack around and flush the cache. Why would it even cache things in the first place? I don’t see where the cache helps.
 From the filecmp.clear_cache documentation:

   Clear the filecmp cache. This may be useful if a file is compared so
   quickly after it is modified that it is within the mtime resolution
   of the underlying filesystem.

Now this makes me believe that the clear_cache is only relevant when using shallow=True, which I'm not and I don't know how to check the underlying code. It is used to make sure the stat() info is updated, but we are not using the stat() info. I guess we could as easily add a one second sleep instead as that is way longer than the mtime resolution.

The function is here and it is rather simple: https://github.com/python/cpython/blob/3.12/Lib/filecmp.py#L30

It will always stat() both files and compares three items in case shallow is False: The type of the file, the size and mtime. If the file is not a regular file, the function will exist rather early. So in our case, that is all work the function is doing for thing. It boils down to something very similar to what I suggested to use to compare the files.

On top of that, there is a cache which will always be used but will never return a match for us because the temporary file at least will always have another mtime.

You could clear the cache after every call, but I would rather not have our code messy and care about implementation details of some Python module.

The module is chosen because all the references I read about comparing files preferred this one-liner for comparing files, although other methods were given and there were lots of discussions about large files (GB+) because if you chomp them it may be much quicker to say they are not the same when you find the first block that does not match. Having said this, I don't know how filecmp works, but we only have a small file. At the same time, I am not a regular programmer so simple one-liners are good for me.

You would always want to read both files block for block and compare the blocks. There is no way to implement this significantly faster.

I have modified the earlier try/except I proposed to catch only FileNotFoundError, and I have changed CamelCase to snake_case.

---
config/unbound/unbound-dhcp-leases-bridge | 31 ++++++++++++++++-------
1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/config/unbound/unbound-dhcp-leases-bridge b/config/unbound/unbound-dhcp-leases-bridge
index e9f022aff..a9ec8e1e2 100644
--- a/config/unbound/unbound-dhcp-leases-bridge
+++ b/config/unbound/unbound-dhcp-leases-bridge
@@ -22,6 +22,7 @@
import argparse
import datetime
import daemon
+import filecmp
import functools
import ipaddress
import logging
@@ -516,26 +517,38 @@ class UnboundConfigWriter(object):

def update_dhcp_leases(self, leases):
# Write out all leases
- self.write_dhcp_leases(leases)
+ if self.write_dhcp_leases(leases):

- log.debug("Reloading Unbound...")
+ log.debug("Reloading Unbound...")

- # Reload the configuration without dropping the cache
- self._control("reload_keep_cache")
+ # Reload the configuration without dropping the cache
+ self._control("reload_keep_cache")
+
+ else:
+
+ log.debug("Not reloading Unbound. Leases not changed.")

def write_dhcp_leases(self, leases):
log.debug("Writing DHCP leases...")

with tempfile.NamedTemporaryFile(mode="w", delete=False) as f:
- for l in leases:
+ for l in sorted(leases):
for rr in l.rrset:
f.write("local-data: \"%s\"\n" % " ".join(rr))

- # Make file readable for everyone
- os.fchmod(f.fileno(), stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
+ filecmp.clear_cache()
+ try:
+ require_unbound_reload = not filecmp.cmp(f.name, self.path, shallow=False)
+ except FileNotFoundError:
+ require_unbound_reload = True
If you want to go this route, you will have to flush the temporary file first. Otherwise you might have some data left in the buffer that has not been written (because the file is still open). You will then compare incomplete output. This would not become a problem if you used the file descriptors.
The temp file is closed at the point of doing the filecmp, and the filecmp cache has been flushed. How do I flush the temporary file further. Reading up, is it just having the last statement in the with block as f.flush() with the same indent as "for l in sorted(leases):"

If you have left the with block you will already have flushed the buffers and closed the file. It looks like I misread the indentation in the patch then.


Catching FileNotFound would work for me.
Good

+
+ # Make file readable for everyone
+ os.chmod(f.name, stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
+
+ # Move the file to its destination
+ os.rename(f.name, self.path)

- # Move the file to its destination
- os.rename(f.name, self.path)
+ return require_unbound_reload

def _control(self, *args):
command = ["unbound-control"]
-- 
2.43.0


Nick


---
 config/unbound/unbound-dhcp-leases-bridge | 50 +++++++++++++++++++----
 1 file changed, 41 insertions(+), 9 deletions(-)

diff --git a/config/unbound/unbound-dhcp-leases-bridge b/config/unbound/unbound-dhcp-leases-bridge
index e9f022aff..80d9523ea 100644
--- a/config/unbound/unbound-dhcp-leases-bridge
+++ b/config/unbound/unbound-dhcp-leases-bridge
@@ -516,26 +516,37 @@ class UnboundConfigWriter(object):

     def update_dhcp_leases(self, leases):
         # Write out all leases
-        self.write_dhcp_leases(leases)
+        if self.write_dhcp_leases(leases):

-        log.debug("Reloading Unbound...")
+            log.debug("Reloading Unbound...")

-        # Reload the configuration without dropping the cache
-        self._control("reload_keep_cache")
+            # Reload the configuration without dropping the cache
+            self._control("reload_keep_cache")
+
+        else:
+
+            log.debug("Not reloading Unbound. Leases not changed.")

     def write_dhcp_leases(self, leases):
         log.debug("Writing DHCP leases...")

         with tempfile.NamedTemporaryFile(mode="w", delete=False) as f:
-            for l in leases:
+            for l in sorted(leases, key=lambda x: x.ipaddr):
                 for rr in l.rrset:
                     f.write("local-data: \"%s\"\n" % " ".join(rr))

-            # Make file readable for everyone
-            os.fchmod(f.fileno(), stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
+        try:
+            require_unbound_reload = not self.compare(f.name, self.path)
+        except FileNotFoundError:
+            require_unbound_reload = True
+
+        # Make file readable for everyone
+        os.chmod(f.name, stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
+
+        # Move the file to its destination
+        os.rename(f.name, self.path)

-            # Move the file to its destination
-            os.rename(f.name, self.path)
+        return require_unbound_reload

     def _control(self, *args):
         command = ["unbound-control"]
@@ -551,6 +562,27 @@ class UnboundConfigWriter(object):

             raise e

+    def compare(self, if1, if2):
+        buffersize = 16 * 1024
+
+        f1 = open(if1, "r")
+        f2 = open(if2, "r")
+
+        # Reset file handles
+        f1.seek(0)
+        f2.seek(0)
+
+        while True:
+            b1 = f1.read(buffersize)
+            b2 = f2.read(buffersize)
+
+            if not b1 == b2:
+                return False
+
+            elif not b1 or not b2:
+                break
+
+        return True

 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description="Bridge for DHCP Leases and Unbound DNS")