DNS · February 28, 2026 · 9 min read

DNS Health Audit: Best Practices for a Reliable Domain

Ensure your DNS is properly configured with this health audit guide — redundancy, TTL, delegation, and monitoring.

A healthy DNS configuration is the invisible foundation of every online service. Misconfigurations—lame delegations, stale records, missing DNSSEC signatures—silently degrade availability and security. This guide provides a structured DNS health audit checklist, recommended TTL values, and monitoring strategies to keep your zones in top shape.

What Is a DNS Health Audit?

A DNS health audit is a systematic review of your domain's DNS configuration to identify misconfigurations, security gaps, and performance issues before they cause outages. It covers nameserver redundancy, record accuracy, DNSSEC integrity, and TTL tuning.

📖 Definition — A lame delegation occurs when a parent zone lists a nameserver as authoritative for a child zone, but that nameserver does not actually serve the zone—returning REFUSED or SERVFAIL instead of valid answers.

The DNS Health Checklist

Run through each item below during every audit cycle:

1. NS Redundancy — Verify at least two geographically diverse nameservers are configured and responding authoritatively.

2. SOA Serial — Confirm the SOA serial number increments with every zone change. Stale serials prevent secondary servers from syncing.

3. TTL Sanity — Check that TTL values match the record type's volatility. Overly short TTLs increase query load; overly long ones delay propagation.

4. Dangling CNAMEs — Scan for CNAME records pointing to decommissioned services—these are subdomain takeover vectors.

5. DNSSEC Validation — Verify the DS → DNSKEY → RRSIG chain is intact and signatures are not expired.

6. MX & SPF Alignment — Ensure MX records resolve and SPF includes all legitimate sending IPs.
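Item 2 of the checklist can be illustrated with a small sketch. It assumes the common (but not mandatory) YYYYMMDDnn serial convention and compares serials with the wrap-around arithmetic of RFC 1982. The function names `serial_is_newer` and `next_serial` are hypothetical, chosen only for this example.

```python
from datetime import date

SERIAL_BITS = 32                 # SOA serials are unsigned 32-bit integers
HALF = 2 ** (SERIAL_BITS - 1)


def serial_is_newer(candidate: int, current: int) -> bool:
    """True if candidate counts as newer than current under RFC 1982 arithmetic."""
    diff = (candidate - current) % (2 ** SERIAL_BITS)
    return 0 < diff < HALF


def next_serial(current: int, today: date) -> int:
    """Bump a YYYYMMDDnn serial (assumed convention, not required by DNS).

    Starts at nn=00 on a new day, otherwise adds one. More than 100 changes
    in one day spill into the next day's range, which still compares as newer.
    """
    base = int(today.strftime("%Y%m%d")) * 100
    return base if base > current else current + 1
```

A deployment step could call `next_serial` before publishing a zone and refuse to publish if `serial_is_newer` returns False for the result, since secondaries would then skip the transfer.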

Recommended TTL Values by Record Type

TTL values balance caching efficiency against change propagation speed. Use the table below as a starting point:

Record Type | Recommended TTL | Rationale
A / AAAA | 300–3600 s | Short for failover-enabled hosts; longer for static servers
CNAME | 3600 s | Rarely changes once set; cache-friendly
MX | 3600 s | Mail routing changes infrequently
TXT (SPF/DKIM) | 3600 s | Stable after deployment; lower during rollout
NS | 86400 s (24 h) | Nameserver changes are rare and must propagate widely
SOA | 86400 s | Negative caching TTL (SOA MINIMUM) should be 300–900 s

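💡 Before a planned migration, lower TTLs to 60–300 seconds at least 48 hours in advance, so that caches still holding the old value (up to 24 hours for an 86400 s TTL) expire before the change. After the migration is verified, restore production TTLs to reduce resolver load.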
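As a rough illustration, the TTL table can be turned into an automated sanity check. The sketch below treats the table values as inclusive bounds, which is stricter than "a starting point" implies, so the ranges should be adjusted to local policy. `RECOMMENDED_TTL` and `check_ttl` are hypothetical names for this example.

```python
# Recommended TTL bounds in seconds, copied from the table above as (min, max).
RECOMMENDED_TTL = {
    "A": (300, 3600),
    "AAAA": (300, 3600),
    "CNAME": (3600, 3600),
    "MX": (3600, 3600),
    "TXT": (3600, 3600),
    "NS": (86400, 86400),
    "SOA": (86400, 86400),
}


def check_ttl(record_type: str, ttl: int) -> str:
    """Classify a TTL as 'ok', 'too_short', 'too_long', or 'unknown'."""
    bounds = RECOMMENDED_TTL.get(record_type.upper())
    if bounds is None:
        return "unknown"      # record type not covered by the table
    low, high = bounds
    if ttl < low:
        return "too_short"    # more query load than necessary
    if ttl > high:
        return "too_long"     # changes will propagate slowly
    return "ok"
```

For example, `check_ttl("A", 86400)` returns `"too_long"`, matching the load-balanced host mistake described later in this article.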

DNSSEC Chain of Trust

DNSSEC adds cryptographic signatures to DNS responses, preventing cache poisoning. The chain of trust works as follows:

  1. The parent zone publishes a DS record containing a hash of the child's DNSKEY.
  2. The child zone holds a DNSKEY RRset (KSK + ZSK) signed by the KSK.
  3. Every other authoritative record set in the child zone has an RRSIG created with the ZSK.
  4. Resolvers validate from the root trust anchor down through each DS → DNSKEY → RRSIG link.

⚠️ Expired DNSSEC signatures are one of the most common causes of DNSSEC-related outages. Monitor RRSIG expiry dates and automate re-signing and key rollovers.
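Monitoring RRSIG expiry can be sketched in a few lines. In presentation format (as printed by dig), the RRSIG expiration field is a UTC timestamp of the form YYYYMMDDHHMMSS. The 7-day warning window below is borrowed from the alerting threshold in the monitoring section; the function names are hypothetical.

```python
from datetime import datetime, timezone


def days_until_expiry(rrsig_expiration: str, now: datetime) -> float:
    """Days between now and an RRSIG expiration given as YYYYMMDDHHMMSS (UTC)."""
    expires = datetime.strptime(rrsig_expiration, "%Y%m%d%H%M%S")
    expires = expires.replace(tzinfo=timezone.utc)
    return (expires - now).total_seconds() / 86400


def needs_resign(rrsig_expiration: str, now: datetime, warn_days: float = 7.0) -> bool:
    """True if the signature expires within warn_days (or has already expired)."""
    return days_until_expiry(rrsig_expiration, now) < warn_days
```

Passing `now` explicitly, for instance `datetime.now(timezone.utc)`, keeps the check deterministic and easy to test.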

Zone Delegation & Lame Delegation

When you delegate a subdomain (e.g., app.example.com) to a separate set of nameservers, the parent zone contains NS records pointing to those servers. A lame delegation occurs when:

  • The listed nameserver is unreachable or misconfigured.
  • The nameserver responds but is not authoritative for the zone (returns REFUSED).
  • The glue records in the parent zone have stale IP addresses.

Detect lame delegations by querying each NS record directly with dig +norec @ns1.example.com example.com SOA and verifying the aa (Authoritative Answer) flag is set.
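The check above can be scripted. This sketch wraps the same dig command and inspects the header lines of its output. It assumes dig is installed and on the PATH, and that its output uses the usual `status:` and `;; flags:` header layout. `parse_dig_header`, `is_lame`, and `query_ns` are hypothetical helper names.

```python
import re
import subprocess


def parse_dig_header(output: str) -> tuple[str, set[str]]:
    """Extract the response status and header flags from dig's text output."""
    status = re.search(r"status: (\w+)", output)
    flags = re.search(r";; flags:([^;]*);", output)
    return (
        status.group(1) if status else "UNKNOWN",
        set(flags.group(1).split()) if flags else set(),
    )


def is_lame(output: str) -> bool:
    """Treat anything other than NOERROR with the aa flag set as lame."""
    status, flags = parse_dig_header(output)
    return status != "NOERROR" or "aa" not in flags


def query_ns(nameserver: str, zone: str) -> str:
    """Run: dig +norec @<nameserver> <zone> SOA (requires dig on PATH)."""
    result = subprocess.run(
        ["dig", "+norec", f"@{nameserver}", zone, "SOA"],
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout
```

Calling `is_lame(query_ns("ns1.example.com", "example.com"))` for every nameserver in the NS set gives a quick delegation report. An unreachable server produces no header at all, which this sketch also counts as lame.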

Monitoring Strategies

Passive Monitoring

Enable query logging on your authoritative servers and analyze patterns: unexpected NXDOMAIN spikes, query volume anomalies, and SERVFAIL rates.
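One possible way to analyze such logs is to compute the share of each response code per time window and compare it with a baseline. Query log formats differ between server implementations, so the sketch below assumes the response codes have already been extracted into a list. The `factor` and `floor` defaults are illustrative assumptions, not recommendations from this article.

```python
from collections import Counter


def rcode_rates(rcodes: list[str]) -> dict[str, float]:
    """Fraction of responses per rcode (e.g. NOERROR, NXDOMAIN, SERVFAIL)."""
    counts = Counter(rcodes)
    total = sum(counts.values())
    return {rc: n / total for rc, n in counts.items()} if total else {}


def is_spike(current_rate: float, baseline_rate: float,
             factor: float = 3.0, floor: float = 0.01) -> bool:
    """Flag a rate that exceeds factor x baseline, ignoring rates below floor."""
    return current_rate > max(baseline_rate * factor, floor)
```

For instance, an NXDOMAIN rate of 0.2 against a baseline of 0.05 would be flagged, while 0.06 would not.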

Active Monitoring

Schedule synthetic queries from multiple global vantage points every 60 seconds. Alert on response time > 200 ms, SERVFAIL responses, or RRSIG expiration within 7 days.
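The alert rules above could be evaluated per probe roughly as follows. This is a minimal sketch that assumes a probe has already run and reported its result. The `ProbeResult` structure and the `evaluate` function are hypothetical; only the thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    vantage_point: str       # label for the probing location (hypothetical field)
    rcode: str               # e.g. "NOERROR" or "SERVFAIL"
    response_ms: float       # measured response time in milliseconds
    rrsig_days_left: float   # days until the nearest RRSIG expires


def evaluate(probe: ProbeResult, max_ms: float = 200.0,
             min_rrsig_days: float = 7.0) -> list[str]:
    """Return the alert messages triggered by one synthetic query."""
    alerts = []
    if probe.rcode == "SERVFAIL":
        alerts.append(f"{probe.vantage_point}: SERVFAIL response")
    if probe.response_ms > max_ms:
        alerts.append(f"{probe.vantage_point}: slow response ({probe.response_ms:.0f} ms)")
    if probe.rrsig_days_left <= min_rrsig_days:
        alerts.append(f"{probe.vantage_point}: RRSIG expires in {probe.rrsig_days_left:.1f} days")
    return alerts
```

A scheduler would run one probe per vantage point every 60 seconds and forward any non-empty result to the alerting system.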

Best Practices

  • Use at least two nameservers on different networks, ideally with different providers (a multi-provider setup).
  • Automate SOA serial increments in your CI/CD pipeline.
  • Audit DNS records quarterly—remove orphaned records from decommissioned services.
  • Set the SOA MINIMUM field, which RFC 2308 defines as the negative caching TTL, to 300–900 seconds.
  • Store zone files in version control for auditability.

Common Mistakes

Mistake | Impact | Fix
Single nameserver | Complete DNS failure on one server outage | Add a secondary NS on a different network
TTL of 86400 s on A records for load-balanced hosts | 24-hour delay before failover is visible | Lower TTL to 300 s or use DNS-based health checks
Forgetting to update DS record after KSK rollover | DNSSEC validation failure → domain unreachable | Automate DS updates; use CDS/CDNSKEY (RFC 7344)
Dangling CNAME to deprovisioned cloud resource | Subdomain takeover risk | Audit CNAMEs monthly; remove stale records
Lame delegation left after nameserver migration | Intermittent resolution failures | Query every NS directly and verify authoritative flag

Tools

DNS Checker — Global propagation check across 50+ resolvers.

DNS Lookup — Query any record type against any resolver.

CNAME Lookup — Resolve CNAME chains and detect dangling records.

🎯 A DNS health audit is not a one-time task—schedule quarterly reviews covering NS redundancy, TTL tuning, DNSSEC signature freshness, and dangling record cleanup. Automate what you can and monitor the rest.
