In a production environment, you will want to be able to monitor PowerDNS performance. Furthermore, PowerDNS can perform a configurable amount of operational logging.
On modern Linux distributions, the PowerDNS recursor logs to stderr, which is consumed by systemd-journald
.
This means that looking into the logs that are produced, journalctl can be used:
# journalctl -u pdns-recursor -n 100
Additionally, the Recursor can log to syslog on these systems.
Logging to syslog is disabled in the unit file to prevent double logging.
To enable this, create an drop in unit file at /etc/systemd/systemd/pdns-recursor.service.d/use-syslog.conf
:
[Service]
ExecStart=
ExecStart=/usr/sbin/pdns_recursor --daemon=no --write-pid=no --enable-syslog
This chapter assumes familiarity with syslog, the unix logging device. PowerDNS logs messages with different levels. The more urgent the message, the lower the ‘priority’.
By default, PowerDNS will only log messages with an urgency of 3 or lower, but this can be changed using the loglevel setting in the configuration file. Setting it to 0 will eliminate all logging, 9 will log everything.
By default, logging is performed under the ‘DAEMON’ facility which is shared with lots of other programs. If you regard nameserving as important, you may want to have it under a dedicated facility so PowerDNS can log to its own files, and not clutter generic files.
For this purpose, syslog knows about ‘local’ facilities, numbered from LOCAL0 to LOCAL7. To move PowerDNS logging to LOCAL0, add logging-facility=0 to your configuration.
Furthermore, you may want to have separate files for the differing priorities - preventing lower priority messages from obscuring important ones.
A sample syslog.conf
might be:
local0.info -/var/log/pdns.info
local0.warn -/var/log/pdns.warn
local0.err /var/log/pdns.err
Where local0.err would store the really important messages.
For performance and disk space reasons, it is advised to audit your syslog.conf
for statements also logging PowerDNS activities.
Many syslog.conf
s have a *.*
statement to /var/log/syslog
, which you may want to remove.
For performance reasons, be especially certain that no large amounts of synchronous logging take place.
Under Linux, this is indicated by file names not starting with a -
- indicating a synchronous log, which hurts performance.
Be aware that syslog by default logs messages at the configured priority and higher!
To log only info messages, use local0.=info
Sometimes a domain fails to resolve due to an error on the domain owner’s end, or records for your own domain have updated and you want your users to immediately see them without waiting for the TTL to expire. The rec_control tool can be used to selectively wipe the cache.
To wipe all records for the exact name ‘www.example.com’:
rec_control wipe-cache www.example.com
Whole subtrees can we wiped as well, to wipe all cache entries for ‘example.com’ and everything below it, suffix the name with a ‘$’:
rec_control wipe-cache example.com$
Note
When wiping cache entries, matching entries in all caches (packet cache, recursor cache, negative cache) are removed.
When debugging resolving issues, it can be advantageous to have a dump of all the cache entries. rec_control can write the caches of all threads to a file:
rec_control dump-cache /tmp/cache
To investigate failures with resolving certain domain names, the PowerDNS Recursor features a tracing infrastructure. This infrastructure will log every step the Recursor takes to resolve a name and will log all DNSSEC related information as well.
To enable tracing for all queries, enable the trace setting. Trace information will be written to the log.
Warning
Enabling tracing for all queries on a system with a high query rate can severely impact performance.
Tracing can also be enabled at runtime, without restarting the Recursor, for specific domains. These specific domains can be specified as a regular expression. This can be done using rec_control trace-regex:
rec_control trace-regex '.*\.example.com\.$'
Will enable tracing for any query in the example.com domain (but not example.com itself).
Since version 4.9.0 trace_regex
takes an extra file argument.
Trace information will be written to the file and not to the log.
If the file argument is a hyphen (-
), trace information will be written to the standard output stream.
For example:
rec_control trace-regex 'example\.com\.$' - | grep asking
will show which authoritative servers were consulted.
Do not forget to disable tracing after diagnosis is done:
rec_control trace-regex
In some cases a tracing provides too much information, and we want to follow what the recursor is doing on a higher level.
By setting quiet to true
the recursor will produce a log line for each client query received and answered.
Be aware that this causes overhead and should not be used in a high query-per-second production environment:
Jul 09 09:08:31 msg="Question" subsystem="syncres" level="0" prio="Info" tid="4" ts="1720508911.919" ecs="" mtid="1" proto="udp" qname="www.example.com" qtype="A" remote="127.0.0.1:54573"
Jul 09 09:08:32 msg="Answer" subsystem="syncres" level="0" prio="Info" tid="4" ts="1720508912.549" additional="1" answer-is-variable="0" answers="1" dotout="0" ecs="" into-packetcache="1" maxdepth="3" mtid="1" netms="617.317000" outqueries="13" proto="udp" qname="www.example.com" qtype="A" rcode="0" rd="1" remote="127.0.0.1:54573" tcpout="0" throttled="0" timeouts="0" totms="627.060000" validationState="Secure"
When quiet
is set to false
, the following keys and values are logged for questions and answers not
answered from the packet cache.
Refer to Structured Logging Dictionary for more details on the common keys used for structured logging messages.
Note that depending on record cache content a single client query can result into multiple queries to authoritative servers.
If the exact answer is available from the record cache no outgoing queries are needed.
Keys common to Questions and Answers | ||
Key | Description | Remarks |
ecs |
Client ECS info | Filled in if enabled |
proto |
Protocol used by client | udp or tcp |
qname |
Query name | |
qtype |
Query type | |
remote |
Client address | IP:port combination |
Keys specific to Answers | ||
additional |
Number of additional records in answer | |
answer-is-variable |
Is answer marked variable by recursor? | e.g. ECS dependent answers |
answers |
Number of answer records in answer | |
dotout |
Number of outgoing DoT queries sent to authoritative servers to resolve answer | |
into-packetcache |
Is the answer being stored into the packetcache? | Variable answers (as determined by the recursor or marked as such by Lua code) will not be put into the packet cache |
maxdepth |
Depth of recursion needed to resolve answer | Some queries need resolving multiple targets, e.g. to find the right delegation or answers containing CNAMEs |
netms |
Time spent waiting for answers from authoritative servers | |
outqueries |
Total queries sent to authoritative servers | A single client query can cause multiple queries to authoritative servers, depending on record cache content and the query itself. |
rcode |
Result code | If no rcode is available (e.g. in the case of timeouts) this value can be negative |
rd |
Did the client set the Recursion Desired DNS Header flag? | |
tcpout |
Number of outgoing TCP queries sent to authoritative servers to resolve answer | |
throttled |
Number of potential outgoing queries not sent out because the target was marked as unreliable by previous interactions | If a target is throttled, the recursor will try another suitable authoritative server (if available) |
timeouts |
Number of outgoing queries that timed out | |
totms |
Total time spent resolving | |
validationState |
The DNSSEC status of the answer |