Per-Request Pacing
How v.recipes uses a token-bucket rate limiter to prevent upstream DNS congestion while maintaining low latency for individual users.
How It Works
Token Assignment
Each incoming DNS query requests a token from a shared bucket. The bucket holds up to 1,000 tokens and refills at a constant rate of 1,000 tokens per second.
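The bucket described above can be sketched in a few lines. This is a minimal illustration, not the actual v.recipes implementation; the class and method names are hypothetical, and refill is computed lazily from elapsed time rather than by a background timer.

```python
import time

class TokenBucket:
    """Holds up to `capacity` tokens, refilled continuously at `rate` tokens/second."""

    def __init__(self, capacity=1000, rate=1000.0):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)   # start full
        self.last_refill = time.monotonic()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now

    def try_acquire(self):
        """Take one token if available; return True on success, False if empty."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Lazy refill keeps the limiter lock-free of timers: each query pays only a clock read and a couple of arithmetic operations.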
Immediate Forwarding
If a token is available, the query is forwarded to the upstream resolver immediately with zero added latency. The user experiences no difference from a direct connection.
Short Wait Queue
If the bucket is empty (burst exceeded), the query enters a brief wait queue — typically under 5 milliseconds — until a token becomes available. This micro-delay is imperceptible to users.
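The wait-queue behavior amounts to polling for a token until a short deadline expires. A minimal sketch, assuming a `try_acquire` callable like the bucket above provides (the function name and 5 ms default are illustrative, taken from the figures in this section):

```python
import time

def acquire_with_wait(try_acquire, max_wait=0.005):
    """Poll try_acquire() until it succeeds or max_wait seconds elapse.

    Returns True if a token was obtained within the window, False otherwise.
    """
    deadline = time.monotonic() + max_wait
    while True:
        if try_acquire():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(0.0005)  # brief pause before retrying, well under the window
```

A production resolver would block on a condition variable or channel rather than sleep-poll, but the contract is the same: succeed within the micro-delay window or hand the query to the overflow path.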
Overflow Protection
Queries that cannot be served within the wait window receive a stale answer from the cache if one is available, or a SERVFAIL as a last resort. The upstream provider is never overwhelmed.
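The full decision flow, forward when paced, otherwise fall back to a stale cached answer, otherwise SERVFAIL, can be sketched as below. All names (`resolve`, `have_token`, `forward`) are hypothetical, and the cache is a plain dict standing in for a real DNS cache with TTLs:

```python
SERVFAIL = "SERVFAIL"

def resolve(query, cache, have_token, forward):
    """Overflow-protection sketch: pace the upstream, degrade gracefully."""
    if have_token():
        # Paced path: forward upstream and refresh the cache.
        answer = forward(query)
        cache[query] = answer
        return answer
    if query in cache:
        # Burst exceeded: serve the stale cached answer rather than drop.
        return cache[query]
    # Nothing cached and no token: fail explicitly instead of overloading upstream.
    return SERVFAIL
```

Serving stale answers under load matches the "serve-stale" behavior standardized for DNS resolvers in RFC 8767.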
Why Pacing Matters
Without pacing:
- ✗ Upstream provider rate-limits kick in at ~1,500 QPS
- ✗ Queries start getting dropped silently
- ✗ All users sharing the resolver experience degraded service
- ✗ Retry storms amplify the congestion further

With pacing:
- ✓ Upstream rate stays well below provider limits
- ✓ No dropped queries under normal operation
- ✓ Individual user latency unaffected (cache absorbs bulk traffic)
- ✓ Graceful degradation under extreme load via stale cache
Real-World Example: ECS Optimized Variant
Google Public DNS enforces a rate limit of approximately 1,500 queries per second per source IP. The v.recipes /dns-ecs variant uses Google as its upstream for ECS (EDNS Client Subnet) support.
The caching layer absorbs the vast majority of traffic, and pacing ensures the remaining upstream queries never exceed Google's rate limit — even during traffic spikes.