Last night we were fighting with a router that had a non-functioning red interface. It ended up being a misconfiguration on our part, but while we were troubleshooting, we routinely had to wait for very long periods as unbound failed to contact the defined name servers.
Is there a way to make the startup script time out more quickly when things are not functioning?
Tom
Hi Tom,
The script is usually quite fast when it can reach the DNS servers.
If it can’t, it waits for quite a long time and runs into a timeout.
Is there a reason why it cannot reach the DNS servers? It should be able to, or you would not have functioning DNS.
-Michael
On 12 Dec 2018, at 15:31, Tom Rymes trymes@rymes.com wrote:
Last night we were fighting with a router that had a non-functioning red interface. It ended up being a misconfiguration on our part, but while we were troubleshooting, we routinely had to wait for very long periods as unbound failed to contact the defined name servers.
Is there a way to make the startup script time out more quickly when things are not functioning?
Tom
That’s precisely the problem, Michael. Everything works fine, until something (like an internet outage) prevents you from reaching the DNS servers. Then, when you reboot the router as part of a troubleshooting process, it takes forever while unbound flails about.
In this instance, we were replacing a failed router, and someone had misconfigured the red interface settings, so we were modifying those settings in setup. It took us a while to figure out what we were doing wrong, so we ended up having to wait over and over again as unbound tried to restart and failed. We were the problem, but unbound made it a real hassle while we worked to figure out what we were doing wrong.
Shouldn’t it have a reasonably short timeout for when the DNS servers are unavailable?
Tom
On Dec 13, 2018, at 11:19 AM, Michael Tremer michael.tremer@ipfire.org wrote:
Hi Tom,
The script is usually quite fast when it can reach the DNS servers.
If it can’t, it waits for quite a long time and runs into a timeout.
Is there a reason why it cannot reach the DNS servers? It should be able to, or you would not have functioning DNS.
-Michael
On 13 Dec 2018, at 16:36, Tom Rymes trymes@rymes.com wrote:
That’s precisely the problem, Michael. Everything works fine, until something (like an internet outage) prevents you from reaching the DNS servers. Then, when you reboot the router as part of a troubleshooting process, it takes forever while unbound flails about.
Yes. If you have seen my conversation with Erik on this list, we are aware of loads of issues with this script and want to remove those problems.
However, getting rid of that script might have other implications that we do not know about yet. It is actually quite risky.
Unbound just does not cope well with DNS servers that do not conform to (recent) RFC standards. I am not sure whether that has changed with recent releases, now that more people have deployed unbound in recent months and years. I hope it has.
In this instance, we were replacing a failed router, and someone had misconfigured the red interface settings, so we were modifying those settings in setup. It took us a while to figure out what we were doing wrong, so we ended up having to wait over and over again as unbound tried to restart and failed. We were the problem, but unbound made it a real hassle while we worked to figure out what we were doing wrong.
Shouldn’t it have a reasonably short timeout for when the DNS servers are unavailable?
It does, but that timeout is still quite long. Some DNS servers do not respond when the EDNS0 buffer size is too large. We start with a large size, step down to smaller ones and hope that the server responds at some point (old Cisco equipment is to blame for this: they assumed that a DNS packet would never be larger than 512 bytes). Each try has a short timeout, but we test quite a few sizes before we finally decide that the server does not respond at all.
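To make the step-down idea concrete, here is a rough sketch in Python. It is not the actual startup script: the size ladder in EDNS_SIZES, the one-second timeout, the example.org query name and all function names are assumptions chosen for illustration only.

```python
import socket
import struct

# Hypothetical size ladder, largest first; the real script's values may differ.
EDNS_SIZES = (4096, 2048, 1232, 512)


def build_query(name, bufsize, qid=0x1234):
    """Build a minimal DNS A query carrying an EDNS0 OPT record."""
    # Header: ID, flags (RD set), 1 question, 0 answers, 0 authority, 1 additional
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 1)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    # OPT pseudo-record: root name, TYPE=41, CLASS field carries the buffer size
    opt = b"\x00" + struct.pack("!HHIH", 41, bufsize, 0, 0)
    return header + question + opt


def send_query(server, bufsize, timeout):
    """Return True if the server sends any reply to a query advertising bufsize."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query("example.org", bufsize), (server, 53))
            sock.recvfrom(bufsize)
            return True
        except OSError:  # covers timeouts and unreachable networks
            return False


def probe_edns_size(server, sizes=EDNS_SIZES, timeout=1.0, query=send_query):
    """Try each size in turn; return the first one answered, or None."""
    for size in sizes:
        if query(server, size, timeout):
            return size
    return None


def worst_case_wait(sizes=EDNS_SIZES, timeout=1.0):
    """Time spent on one server that never answers: one timeout per size."""
    return len(sizes) * timeout
```

The last helper shows why an unreachable server is the slow case: every size on the ladder has to time out before the loop gives up, and that repeats for each configured name server.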
The algorithm could of course be tuned to figure that out earlier, but I never expected many people to run into this problem. However, it also has an effect when the name servers are not reachable at all.
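One possible way to tune it, sketched in Python under stated assumptions: cap the total time spent on a server with an overall deadline, so that an unreachable server is given up on after a few seconds no matter how many sizes are left to try. The function name, the three-second deadline and the injected query and clock callables are all hypothetical and not part of the existing script.

```python
import time


def probe_with_deadline(server, sizes, query, per_try=1.0, deadline=3.0,
                        clock=time.monotonic):
    """Step down through sizes, but stop once the overall deadline is used up.

    query(server, size, timeout) must return True when the server answered.
    Returns the first size that got an answer, or None.
    """
    start = clock()
    for size in sizes:
        remaining = deadline - (clock() - start)
        if remaining <= 0:
            break  # budget exhausted: treat the server as unreachable
        # Never wait longer than what is left of the overall budget.
        if query(server, size, min(per_try, remaining)):
            return size
    return None
```

With eight sizes and a one-second timeout per try, the plain loop would wait eight seconds on a dead server, while this variant stops after roughly three. It only shortens the wait and does not remove the test itself.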
I want to get rid of this test though. That should be the solution instead of making a shitty script a little bit better. It will still be a bit shitty.
Best, -Michael
On Dec 13, 2018, at 11:42 AM, Michael Tremer michael.tremer@ipfire.org wrote:
<snip>
I want to get rid of this test though. That should be the solution instead of making a shitty script a little bit better. It will still be a bit shitty.
Best, -Michael
It’s certainly not an urgent problem, but I am definitely interested in anything that can reduce the level of shittiness!
Tom