From: Michael Tremer
To: development@lists.ipfire.org
Subject: Re: Long delays restarting unbound if Red is down
Date: Thu, 13 Dec 2018 16:41:59 +0000
Message-ID: <11684CBC-A4C2-4F5E-B8B8-27AA1D32EBFE@ipfire.org>
In-Reply-To: <94190F08-05FC-41C1-B68D-6D8CEE8195DC@rymes.com>

> On 13 Dec 2018, at 16:36, Tom Rymes wrote:
>
> That’s precisely the problem, Michael. Everything works fine, until something (like an internet outage) prevents you from reaching the DNS servers. Then, when you reboot the router as part of a troubleshooting process, it takes forever while unbound flails about.

Yes. If you have seen my conversation with Erik on this list, we are aware of loads of issues with this script and want to remove those problems.

However, getting rid of that script might have other implications that we do not know yet. It is actually quite risky.

Unbound just does not handle those DNS servers well that do not conform to (recent) RFC standards. I am not sure if that has changed with recent releases and with more people who have deployed unbound in recent months and years. I hope it did.

> In this instance, we were replacing a failed router, and someone had misconfigured the red interface settings, so we were modifying those settings in setup. It took us a while to figure out what we were doing wrong, so we ended up having to wait over and over again as unbound tried to restart and failed. We were the problem, but unbound made it a real hassle while we worked to figure out what we were doing wrong.
>
> Shouldn’t it have a reasonably short timeout for when the DNS servers are unavailable?

It does, but that timeout is still quite long.
Some DNS servers do not respond when the EDNS0 buffer size is too large. We try to go down from a large size to a smaller one and hope that the server responds at some point (old Cisco equipment is to blame for this - they assumed that a DNS packet will never be larger than a couple of bytes). Each try has a small timeout, but we will test quite a couple of sizes until we finally decide that the server does not respond at all.

The algorithm could of course be tuned to figure that out earlier, but I never considered many people running into this problem. However, it has an effect too when the name servers are not reachable at all.

I want to get rid of this test though. That should be the solution instead of making a shitty script a little bit better. It will still be a bit shitty.

Best,
-Michael

>
> Tom
>
>> On Dec 13, 2018, at 11:19 AM, Michael Tremer wrote:
>>
>> Hi Tom,
>>
>> the script is usually quite fast when it can reach the DNS servers.
>>
>> If it can’t it is waiting quite long and runs into a timeout.
>>
>> Is there a reason why it cannot reach the DNS servers? It should be able to or you should not have functioning DNS.
>>
>> -Michael
>>
>>> On 12 Dec 2018, at 15:31, Tom Rymes wrote:
>>>
>>> Last night we were fighting with a router that had a non-functioning red interface. It ended up being a misconfiguration on our part, but while we were troubleshooting, we routinely had to wait for very long periods as unbound failed to contact the defined name servers.
>>>
>>> Is there a way to make the startup script time out more quickly when things are not functioning?
>>>
>>> Tom
>>
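For readers following the thread, here is a rough sketch of the EDNS0 buffer-size fallback described in the reply above, and of why it gets slow when no name server is reachable. It is not the actual IPFire script: the candidate sizes, the per-try timeout, the function names and the `probe` callable (which stands in for whatever DNS query the script really sends) are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence

# Hypothetical values: the thread does not state the real sizes or timeout.
CANDIDATE_SIZES = (4096, 2048, 1480, 1232, 512)
PER_TRY_TIMEOUT = 2.0  # seconds per attempt, assumed

# probe(server, buffer_size, timeout) -> True if the server answered in time.
Probe = Callable[[str, int, float], bool]


def find_edns_buffer_size(
    server: str,
    probe: Probe,
    sizes: Sequence[int] = CANDIDATE_SIZES,
    timeout: float = PER_TRY_TIMEOUT,
) -> Optional[int]:
    """Try buffer sizes from largest to smallest and return the first
    one the server answers, or None if every attempt times out."""
    for size in sorted(sizes, reverse=True):
        if probe(server, size, timeout):
            return size
    return None


def worst_case_delay(
    num_servers: int,
    sizes: Sequence[int] = CANDIDATE_SIZES,
    timeout: float = PER_TRY_TIMEOUT,
) -> float:
    """Upper bound on the time spent when no server answers at all:
    every size is tried against every server and each try times out."""
    return num_servers * len(sizes) * timeout
```

With these made-up numbers, two unreachable name servers would cost up to 2 x 5 x 2.0 = 20 seconds before the test gives up, which matches the behaviour Tom describes even though each individual timeout is small. A single cheap reachability check before the loop would be one way to "figure that out earlier", but that is a guess at a possible tuning, not something proposed in the thread.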