Background
Here are some BIND troubleshooting tips. They relate to Webmin’s cluster name servers, which Virtualmin uses when creating a name server hierarchy.
failed to determine a valid IP address for this system (got 127.0.1.1)
Adding new DNS zone .. .. done
Adding secondary zone on ns1.example.com ns2.example.com .. .. failed to determine a valid IP address for this system (got 127.0.1.1). You will need to set the IP in the BIND DNS Server module.
The error above may be accompanied by this one:
refused notify from non-primary: 1.2.3.4
Jul 18 04:23:12 ns2 named[110656]: client @0x7f91d106cb18 41.203.6.203#56022: received notify for zone 'host.example.com'
Jul 18 04:23:12 ns2 named[110656]: zone host.example.com/IN: refused notify from non-primary: 1.2.3.4#56022
To fix this, go to:
Webmin -> Servers -> BIND DNS Server -> Cogwheel -> Cluster slave servers
Change the following setting to the server's real IP address:
Default master server IP for remote slave zones: 1.2.3.4
This can happen when a server’s IP address has changed.
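To see which address the hostname resolves to, something like the sketch below may help. It assumes, which the text above does not state, that Webmin derives the system's IP address by resolving the hostname, and that the 127.0.1.1 entry comes from /etc/hosts, where Debian and Ubuntu installers commonly place it. The function name is_loopback is made up for illustration.

```shell
#!/bin/sh
# is_loopback is a hypothetical helper: it succeeds if the address
# is in 127.0.0.0/8 or is the IPv6 loopback ::1.
is_loopback() {
    case $1 in
        127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

# Resolve this machine's hostname through the system resolver
# (which normally consults /etc/hosts first).
addr=$(getent hosts "$(hostname)" | awk '{ print $1; exit }')

if is_loopback "$addr"; then
    echo "hostname resolves to loopback address $addr; set the master IP explicitly in Webmin"
else
    echo "hostname resolves to ${addr:-nothing}"
fi
```

If the first message appears, setting the default master server IP as described above avoids relying on hostname resolution.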
When creating a new DNS record, REFUSED from existing secondaries
This can occur when you’ve just added a new slave whose IP address differs from the one the existing servers know about.
The error means the existing secondaries do not know the IP address of the new slave and have to be updated. If you look at the zone definitions on your existing secondaries, you’ll see something like this:
zone "example.com" {
    type slave;
    masters { A.B.C.D; E.F.G.H; I.J.K.L; M.N.O.P; };
    allow-transfer { A.B.C.D; E.F.G.H; I.J.K.L; M.N.O.P; };
    file "/var/lib/bind/example.com.hosts";
};
Essentially, the transfer is refused if the new server’s IP address isn’t in that list. You now need to update /etc/bind/named.conf.local with the new IP address.
/etc/bind# cp named.conf.local named.conf.local.backup
/etc/bind# sed -i 's/A\.B\.C\.D/W.X.Y.Z/g' named.conf.local
The dots in the search pattern are escaped so that they match literal dots only.
When making these changes on CentOS, the directory is /etc and the file is /etc/named.conf.
sed -i replaces the text in the file itself. If you use sed without -i, the result is written to standard output and the file is left unchanged.
Reference: https://phoenixnap.com/kb/sed-replace
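A slightly more defensive version of the replacement could look like the sketch below. The function name replace_ip is hypothetical, and the sketch assumes GNU sed (for -i and the \b word-boundary operator), which is what Debian, Ubuntu and CentOS ship.

```shell
#!/bin/sh
# replace_ip is a hypothetical helper: replace one IPv4 address with
# another in a config file, keeping a backup copy next to the original.
# Usage: replace_ip FILE OLD_IP NEW_IP
replace_ip() {
    file=$1 old=$2 new=$3

    # Escape the dots so that 1.2.3.4 does not also match 1x2y3z4.
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')

    # Keep a backup before editing in place.
    cp -p "$file" "$file.backup" || return 1

    # \b (GNU sed) stops 1.2.3.4 from matching inside 11.2.3.4 or 1.2.3.40.
    sed -i "s/\b${old_re}\b/${new}/g" "$file"
}

# Example (run as root, from the directory holding the file):
#   replace_ip named.conf.local A.B.C.D W.X.Y.Z
```

Comparing the file with its backup using diff afterwards shows exactly which lines changed before BIND is reloaded.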
refused notify from non-master
When adding a new DNS record, you might see this problem:
zone example.com/IN: refused notify from non-master: A.B.C.D#42935
A.B.C.D is the IP address of the new slave server. You might see this on, say, your fourth name server, so first work out which server is actually reporting the error.
A good place to look for a consistent setup is in:
/etc/bind/named.conf.options
You will find allow-transfer and also-notify settings there. Are these values consistent across all your slaves?
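One way to compare those settings is to print them in a normalised form on each slave and diff the results. The sketch below is only an illustration: the function name show_transfer_settings is made up, and it assumes each statement sits on a single line of the file.

```shell
#!/bin/sh
# show_transfer_settings is a hypothetical helper.
# It prints every allow-transfer and also-notify statement in the given
# file with whitespace squeezed, so output from two servers can be diffed.
# Assumption: each statement is written on a single line.
show_transfer_settings() {
    grep -E 'allow-transfer|also-notify' "$1" |
        sed -E 's/[[:space:]]+/ /g; s/^ //'
}

conf=/etc/bind/named.conf.options
if [ -r "$conf" ]; then
    show_transfer_settings "$conf"
fi
```

Saving the output from each slave to a file and running diff on the files makes an address that is missing on one server easy to spot.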
bad zone transfer request / non-authoritative zone
Scenario:
Your existing name server is complaining about a bad zone transfer request / non-authoritative zone.
One possible solution:
Look for the zone. Perhaps it was cancelled and never properly removed from the old server?
You can delete it from the primary name server; the Webmin cluster should then offer to remove it from the slaves as well.
Restarting Slave Fails
Re-starting slave DNS servers ..
.. some slave servers failed
nsx.example.com :
This probably means that rndc is failing somewhere. You can check it on nsx.example.com like so:
rndc status
If you get this, it means something is wrong:
# rndc status
rndc: connection to remote host closed.
* This may indicate that the
* remote server is using an older
* version of the command protocol,
* this host is not authorized to connect,
* the clocks are not synchronized,
* the key signing algorithm is incorrect
* or the key is invalid.
For more detail, run the same command with verbose logging enabled:
rndc -V status
This can happen when the rndc key is defined in more than one place. If you also have third-party DNS integration (e.g. Plesk), be extra careful. See here for some guidance:
https://unix.stackexchange.com/questions/489748/bind-9-9-4-rndc-connection-to-remote-host-closed
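To find out whether a key is defined in more than one place, a search along the lines of the sketch below can help. The function name list_key_definitions is hypothetical, and the paths in the example call are the usual Debian/Ubuntu and CentOS locations, which may differ on your system.

```shell
#!/bin/sh
# list_key_definitions is a hypothetical helper.
# It prints file:line:text for every 'key "name" {' statement found
# under the given files or directories (searched recursively).
list_key_definitions() {
    grep -rnE '^[[:space:]]*key[[:space:]]+"?[^"[:space:]{]+"?[[:space:]]*\{' "$@" 2>/dev/null
}

# Assumed common locations; paths that do not exist are silently skipped.
list_key_definitions /etc/bind /etc/rndc.key /etc/rndc.conf /etc/named.conf ||
    echo "no key statements found in the usual locations"
```

If the same key name shows up in several files with different secrets, rndc and named are probably not using the same key.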
Permissions Problem in Cluster Slaves
This problem sometimes appears on slaves even though the master is working properly. It’s best to diagnose it using tail in combination with grep.
Variation 1
You may see this error when running tail -f /var/log/syslog | grep domain:
zone example.com/IN: refresh: could not set file modification time of '/var/lib/bind/example.com.hosts': permission denied
Variation 2
Jul 18 03:40:12 ns2 named[110656]: zone example.com/IN: loading from master file /var/lib/bind/example.com.hosts failed: end of file
Jul 18 03:40:12 ns2 kernel: [540002.553467] audit: type=1400 audit(1721266812.060:70): apparmor="DENIED" operation="link" profile="named" name="/var/lib/bind/db-tPmcdazf" pid=110656 comm="isc-net-0000" requested_mask="l" denied_mask="l" fsuid=113 ouid=0 target="/var/lib/bind/example.com.hosts"
Jul 18 03:40:13 ns2 named[110656]: zone example.com/IN: Transfer started.
Jul 18 03:40:13 ns2 named[110656]: transfer of 'example.com/IN' from 129.232.252.163#53: connected using 129.232.252.163#53
Jul 18 03:40:13 ns2 named[110656]: zone example.com/IN: transferred serial 1621244761
Jul 18 03:40:13 ns2 named[110656]: zone example.com/IN: transfer: could not set file modification time of '/var/lib/bind/example.com.hosts': permission denied
Variation 2 also includes an AppArmor denial.
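To pull just these messages out of a busy log, a filter like the sketch below may be handy. The function name zone_problems is made up, and the keywords it looks for (failed, denied, refused) are simply the ones that appear in the log excerpts above, not an exhaustive list.

```shell
#!/bin/sh
# zone_problems is a hypothetical helper.
# It prints named's error-like messages for one zone, followed by any
# AppArmor denials recorded against the "named" profile.
# Usage: zone_problems ZONE LOGFILE
zone_problems() {
    zone=$1 log=$2

    # Messages for this zone that mention a failure (fixed-string zone match).
    grep -F "zone ${zone}/IN:" "$log" | grep -E 'failed|denied|refused'

    # Kernel audit lines where AppArmor blocked named.
    grep -F 'apparmor="DENIED"' "$log" | grep -F 'profile="named"'
}

# Example:
#   zone_problems example.com /var/log/syslog
```

Routine lines such as "Transfer started" are dropped, so only the entries worth investigating remain.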
Variation 1 Solution and Explanation
cd /var/lib/bind/
Check the owner and permissions of the zone files. Some might be:
bind:bind and rw-r--r--
These work. Others might be:
root:bind and rwxrwxr-x
Although the second set is more permissive, those zones didn’t transfer, most likely because named runs as the bind user and can only set the modification time on files it owns. The fix is somewhat hidden: go to Webmin -> Servers -> BIND DNS Server and click the small cogwheel icon on the left. Then navigate to Zone file options and set the ownership and permissions there.
Variation 2 Solution and Explanation
Variation 2 might throw you off and make you think it’s an AppArmor issue. In fact, the cause was the same: the zone files on the secondary weren’t owned by bind:bind.
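Since both variations came down to file ownership, a quick check for zone files with the wrong owner can save some time. The sketch below is an illustration only: the function name list_wrong_owner is hypothetical, and the bind user and /var/lib/bind directory are the Debian/Ubuntu values seen above (other distributions may differ).

```shell
#!/bin/sh
# list_wrong_owner is a hypothetical helper.
# It lists regular files directly inside DIR that are NOT owned by USER.
# Usage: list_wrong_owner DIR USER
list_wrong_owner() {
    find "$1" -maxdepth 1 -type f ! -user "$2" -exec ls -l {} +
}

# Example on a Debian/Ubuntu slave:
#   list_wrong_owner /var/lib/bind bind
#
# Any file listed is a candidate for the "permission denied" errors.
# Correcting it by hand with chown is an assumption on my part; the fix
# described above is made through Webmin's Zone file options.
```

An empty result means every zone file in the directory already belongs to the expected user.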