
Per-Request Pacing

How v.recipes uses a token-bucket rate limiter to prevent upstream DNS congestion while maintaining low latency for individual users.

[Diagram: Token-Bucket Rate Limiting. Incoming DNS queries from five clients enter the v.recipes DNS cache + pacing layer, where a token bucket (refill rate: 1,000 tokens/s) produces paced output toward the upstream resolver (e.g., 1.1.1.1). A gauge shows the request rate at ~700 req/s of 1,000 max.]

How It Works

1. Token Assignment

Each incoming DNS query requests a token from a shared bucket. The bucket holds up to 1,000 tokens and refills at a constant rate of 1,000 tokens per second.
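A bucket with this behavior can be sketched in a few lines of Python (a minimal illustration; the class and parameter names are our own, not the v.recipes implementation):

```python
import time

class TokenBucket:
    """Token bucket: holds up to `capacity` tokens, refilled continuously
    at `refill_rate` tokens per second (1,000 and 1,000/s in the article)."""

    def __init__(self, capacity=1000, refill_rate=1000.0):
        self.capacity = capacity
        self.refill_rate = refill_rate        # tokens per second
        self.tokens = float(capacity)         # bucket starts full
        self.last_refill = time.monotonic()

    def _refill(self):
        now = time.monotonic()
        # Credit tokens for the elapsed time, capped at bucket capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

    def try_acquire(self):
        """Take one token if available; returns True on success."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Refilling lazily on each acquisition attempt avoids a background timer: the bucket only needs to know how much time has passed since it last handed out a token.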

2. Immediate Forwarding

If a token is available, the query is forwarded to the upstream resolver immediately with zero added latency. The user experiences no difference from a direct connection.

3. Short Wait Queue

If the bucket is empty (the burst allowance has been exhausted), the query enters a brief wait queue, typically under 5 milliseconds, until a token becomes available. This micro-delay is imperceptible to users.
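The bounded wait can be sketched as a poll-with-deadline loop. Here `try_acquire` is a hypothetical zero-argument callable standing in for the shared bucket's acquire method; the 5 ms default mirrors the window described above:

```python
import time

def paced_forward(try_acquire, max_wait=0.005):
    """Wait up to max_wait seconds (5 ms) for a token.

    Returns True once a token is taken (the query may be forwarded),
    or False if the window expires (overflow handling takes over)."""
    deadline = time.monotonic() + max_wait
    while not try_acquire():
        if time.monotonic() >= deadline:
            return False          # window expired: hand off to overflow path
        time.sleep(0.0005)        # brief sleep instead of busy-spinning
    return True
```

A production resolver would more likely park the query on a wakeup queue than poll, but the contract is the same: either a token arrives within the window or the query falls through to overflow protection.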

4. Overflow Protection

Queries that cannot be served within the wait window receive a stale cached response if one is available, or a SERVFAIL as a last resort. The upstream provider is never overwhelmed.
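A minimal sketch of that fallback decision, assuming the cache keeps expired entries around for stale service (function and variable names are illustrative):

```python
SERVFAIL = "SERVFAIL"

def resolve_overflow(qname, stale_cache):
    """Fallback for queries that got no token within the wait window:
    serve a stale cached answer if one exists, otherwise SERVFAIL."""
    stale = stale_cache.get(qname)    # expired entries retained for fallback
    if stale is not None:
        return stale                  # stale but usable answer
    return SERVFAIL                   # last resort: explicit failure
```

Serving stale data in preference to failing outright is the same trade-off standardized for DNS in RFC 8767: a slightly outdated answer is almost always better for the client than no answer.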

Why Pacing Matters

Without Pacing
  • Upstream provider rate limits kick in at ~1,500 QPS
  • Queries start getting dropped silently
  • All users sharing the resolver experience degraded service
  • Retry storms amplify the congestion further
With v.recipes Pacing
  • Upstream rate stays well below provider limits
  • No dropped queries under normal operation
  • Individual user latency unaffected (cache absorbs bulk traffic)
  • Graceful degradation under extreme load via stale cache

Real-World Example: ECS Optimized Variant

Google Public DNS enforces a rate limit of approximately 1,500 queries per second per source IP. The v.recipes /dns-ecs variant uses Google as its upstream for ECS (EDNS Client Subnet) support.

  • 10,000+ incoming queries/s
  • ~90% served from cache
  • <1,000/s forwarded to Google

The caching layer absorbs the vast majority of traffic, and pacing ensures the remaining upstream queries never exceed Google's rate limit — even during traffic spikes.
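The arithmetic behind those figures, as a quick sanity check (the numbers are the ones quoted above):

```python
incoming_qps = 10_000       # incoming queries per second
cache_hit_rate = 0.90       # ~90% answered from cache
bucket_rate = 1_000         # pacing ceiling: tokens (queries) per second
google_limit = 1_500        # approx. Google Public DNS per-source-IP limit

cache_hits = int(incoming_qps * cache_hit_rate)   # 9,000 qps absorbed by cache
misses = incoming_qps - cache_hits                # 1,000 qps reach the pacer
upstream_qps = min(misses, bucket_rate)           # bucket caps the forward rate

assert upstream_qps < google_limit                # comfortably under the limit
```

Note that the cache alone already brings the miss rate to roughly the bucket's refill rate; the pacer's job is to hold that line when the hit rate dips during a spike.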