Down in the Pi-hole

A couple of months ago I made the ill-fated decision to upgrade my Pi-hole from version 5 to 6. Since then, local DNS names have stopped resolving from any device other than the Pi itself.

Searching online didn't turn up much: the odd forum post where someone changes a seemingly arbitrary setting and things magically work for them. Not so satisfying. So I figured, why not take a deeper dive into how Pi-hole works?

First, the problem. If I run dig blah.home @<pi address> on my laptop I get a timeout error on pi-address#53, where blah.home is in my list of custom DNS entries in the Pi-hole admin page. If I run it on the Pi itself, it resolves the IPv4 address just fine. Looking at Pi-hole's logs, in both cases the custom DNS file is consulted and the address is found for the A record. On the Pi an AAAA record is also checked. Now, you may say...firewall! Which would be reasonable, except that external addresses resolve just fine on that port, as do CNAME entries.
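To make that comparison repeatable, here is a minimal sketch of a probe that tells "no reply at all" apart from "replied, but with nothing". It relies on dig's documented exit status 9 (no reply from server). The function name probe and the 2-second, single-try timeout are arbitrary choices for illustration, not anything Pi-hole provides.

```shell
# Classify one DNS lookup by dig's exit status.
# Usage: probe NAME TYPE SERVER [PORT]
# The body runs in a subshell, so its variables do not leak into the caller.
probe() (
  name=$1 type=$2 server=$3 port=${4:-53}
  # +short prints only the answer data; one try with a 2 s timeout keeps it quick.
  answer=$(dig +short +time=2 +tries=1 -p "$port" "$name" "$type" "@$server" 2>/dev/null)
  rc=$?
  case $rc in
    0)
      if [ -n "$answer" ]; then
        echo "$type $name: $answer"
      else
        echo "$type $name: reply with no records"
      fi
      ;;
    9) echo "$type $name: no reply (timeout)" ;;   # dig: no reply from server
    *) echo "$type $name: dig failed with status $rc" ;;
  esac
)
```

Running it for the A and AAAA records of a local name and for an external name, once from the laptop and once from the Pi, should show at a glance which combinations go unanswered.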

So what is going on? Well, Pi-hole serves DNS queries using its FTL server, which embeds a fork of dnsmasq. So the first stop is figuring out how that works. You can enable packet dumping, but my pcap game is not strong and I figured looking at packets wouldn't be enough to tell me exactly what is going on.

I already had dnsmasq installed on the Pi and launched it with:

dnsmasq -h -d -q -p 5353 -A /domain.name/192.168.0.10

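Here -A is a custom name-address mapping, -p sets the listening port, and -h, -d and -q enable 'no-hosts mode', 'no-daemon mode' and 'log query mode' respectively. (domain.name stands in for the real name, which is grafana.home in the output below.)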
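The short flags are easy to mix up, so here is the same invocation sketched with dnsmasq's long option names. The wrapper function test_dnsmasq and the DNSMASQ override variable are hypothetical helpers added for illustration; only the options themselves come from dnsmasq.

```shell
# Start a throwaway dnsmasq in the foreground with a single static mapping.
# Usage: test_dnsmasq NAME ADDRESS [PORT]
test_dnsmasq() (
  name=$1 addr=$2 port=${3:-5353}
  # DNSMASQ is a hypothetical override: set it to "echo" to print the
  # arguments instead of starting the server.
  ${DNSMASQ:-dnsmasq} \
    --no-hosts \
    --no-daemon \
    --log-queries \
    --port="$port" \
    --address="/$name/$addr"
)
```

For example, test_dnsmasq grafana.home 192.168.0.10 should behave like the short-flag command above, listening on port 5353 by default.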

On the client side if I then run:

dig domain.name @192.168.0.10 -p 5353

I get on the server:

dnsmasq: query[A] grafana.home from 192.168.0.74
dnsmasq: config grafana.home is 192.168.0.10

and on the client:

;; QUESTION SECTION:
;grafana.home.			IN	A

;; ANSWER SECTION:
grafana.home.		0	IN	A	192.168.0.10

;; Query time: 7 msec
;; SERVER: 192.168.0.10#5353(192.168.0.10) (UDP)

All as expected. If I try the same against Pi-hole itself on port 53, I get:

;; communications error to 192.168.0.10#53: timed out
;; communications error to 192.168.0.10#53: timed out
;; communications error to 192.168.0.10#53: timed out

; <<>> DiG 9.19.24 <<>> navidrome.home @192.168.0.10 -p53
;; global options: +cmd
;; no servers could be reached

What gives here? I should at least see some sort of resolution error somewhere, rather than a refusal to respond at all.

Since there isn't much to go on in the logs, let's see if we can build FTL and poke around, following: https://docs.pi-hole.net/ftldns/compile/. I had some trouble building mbedtls due to a seemingly defective archive download link, but otherwise this went fine.

Now we can run directly against the FTL development version:

sudo service pihole-FTL stop
sudo pihole-FTL debug

Some extra logging revealed that the wrapped dnsmasq was indeed trying to send a reply via its sendmsg call. So, time to take a look at some packets after all. Using tcpdump on the Pi and my laptop, e.g.:

sudo tcpdump -i wlp2s0 host 192.168.0.10 and port 53

I could see the Pi sending a response to the A record request, but no response arriving at the laptop.

15:37:04.236194 IP rex.local.42138 > raspberrypi.domain: 33845+ [1au] A? navidrome.home. (55)
15:37:04.319626 IP rex.local.34406 > raspberrypi.domain: 64344+ PTR? 10.0.168.192.in-addr.arpa. (43)
15:37:04.327971 IP raspberrypi.domain > rex.local.34406: 64344* 1/0/0 PTR raspberrypi. (68)
15:37:09.238439 IP rex.local.41264 > raspberrypi.domain: 33845+ [1au] A? navidrome.home. (55)
15:37:14.244129 IP rex.local.34274 > raspberrypi.domain: 33845+ [1au] A? navidrome.home. (55)
15:37:19.328530 IP rex.local.46479 > raspberrypi.domain: 61373+ PTR? 10.0.168.192.in-addr.arpa. (43)
15:37:19.351417 IP raspberrypi.domain > rex.local.46479: 61373* 1/0/0 PTR raspberrypi. (68)
15:39:54.619470 IP raspberrypi.domain > rex.59592: 51011* 1/0/1 A 192.168.0.10 (59)
15:39:59.622195 IP rex.36985 > raspberrypi.domain: 51011+ [1au] A? navidrome.home. (55)
15:39:59.622504 IP raspberrypi.domain > rex.36985: 51011* 1/0/1 A 192.168.0.10 (59)
15:40:04.627099 IP rex.51835 > raspberrypi.domain: 51011+ [1au] A? navidrome.home. (55)
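
Eyeballing a capture for missing replies gets tedious. Below is a rough sketch of a filter that pairs queries with responses by client endpoint and DNS transaction ID, then prints the queries that never got an answer. It assumes tcpdump's default one-line output as shown above, with the server side appearing as either .domain or .53. The function name unanswered_queries is made up for illustration.

```shell
# Print DNS queries from tcpdump text output that have no matching response.
# Usage: unanswered_queries [FILE...]   (reads stdin when no file is given)
unanswered_queries() {
  awk '
    $2 == "IP" && $4 == ">" {
      src = $3
      dst = $5; sub(/:$/, "", dst)           # drop the trailing colon
      id = $6;  sub(/[^0-9].*$/, "", id)     # "33845+" -> "33845"
      if (dst ~ /\.(domain|53)$/) {          # client -> server: a query
        key = src "/" id
        if (!(key in seen)) { seen[key] = 1; order[++n] = key }
        pending[key] = $0
      } else if (src ~ /\.(domain|53)$/) {   # server -> client: a response
        delete pending[dst "/" id]
      }
    }
    END {
      # Report in first-seen order so the output is stable.
      for (i = 1; i <= n; i++)
        if (order[i] in pending) print pending[order[i]]
    }
  ' "$@"
}
```

Piping the tcpdump output from each machine through this should make it easier to compare which A queries go unanswered on the laptop side against what the Pi reports sending. A query whose reply falls outside the capture window will also be listed, so the last line or two deserves some scepticism.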