
Per-Request Pacing

How v.recipes uses a token-bucket rate limiter to prevent upstream DNS congestion while maintaining low latency for individual users.

[Diagram: Token-Bucket Rate Limiting. Incoming DNS queries from five clients enter the v.recipes DNS cache + pacing layer, where a token bucket (refill rate: 1,000 tokens/s) produces paced output toward the upstream resolver (e.g., 1.1.1.1). A gauge shows the request rate at ~700 req/s of 1,000 max.]

How It Works

1. Token Assignment

Each incoming DNS query requests a token from a shared bucket. The bucket holds up to 1,000 tokens and refills at a constant rate of 1,000 tokens per second.
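A bucket with this behavior can be sketched in a few lines of Python (a minimal illustration; the class and parameter names are our own, not the v.recipes implementation):

```python
import time

class TokenBucket:
    """Token bucket: holds up to `capacity` tokens, refilled continuously
    at `refill_rate` tokens per second (1,000 and 1,000/s in the article)."""

    def __init__(self, capacity=1000, refill_rate=1000.0):
        self.capacity = capacity
        self.refill_rate = refill_rate        # tokens per second
        self.tokens = float(capacity)         # bucket starts full
        self.last_refill = time.monotonic()

    def _refill(self):
        now = time.monotonic()
        # Credit tokens for the elapsed time, capped at bucket capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

    def try_acquire(self):
        """Take one token if available; returns True on success."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Refilling lazily on each acquisition attempt avoids a background timer: the bucket only needs to know how much time has passed since it last handed out a token.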

2. Immediate Forwarding

If a token is available, the query is forwarded to the upstream resolver immediately with zero added latency. The user experiences no difference from a direct connection.

3. Short Wait Queue

If the bucket is empty (the burst allowance has been exhausted), the query enters a brief wait queue, typically under 5 milliseconds, until a token becomes available. This micro-delay is imperceptible to users.
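The bounded wait can be sketched as a poll-with-deadline loop. Here `try_acquire` is a hypothetical zero-argument callable standing in for the shared bucket's acquire method; the 5 ms default mirrors the window described above:

```python
import time

def paced_forward(try_acquire, max_wait=0.005):
    """Wait up to max_wait seconds (5 ms) for a token.

    Returns True once a token is taken (the query may be forwarded),
    or False if the window expires (overflow handling takes over)."""
    deadline = time.monotonic() + max_wait
    while not try_acquire():
        if time.monotonic() >= deadline:
            return False          # window expired: hand off to overflow path
        time.sleep(0.0005)        # brief sleep instead of busy-spinning
    return True
```

A production resolver would more likely park the query on a wakeup queue than poll, but the contract is the same: either a token arrives within the window or the query falls through to overflow protection.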

4. Overflow Protection

Queries that cannot be served within the wait window receive a stale cached response if one is available, or a SERVFAIL as a last resort. The upstream provider is never overwhelmed.
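A minimal sketch of that fallback decision, assuming the cache keeps expired entries around for stale service (function and variable names are illustrative):

```python
SERVFAIL = "SERVFAIL"

def resolve_overflow(qname, stale_cache):
    """Fallback for queries that got no token within the wait window:
    serve a stale cached answer if one exists, otherwise SERVFAIL."""
    stale = stale_cache.get(qname)    # expired entries retained for fallback
    if stale is not None:
        return stale                  # stale but usable answer
    return SERVFAIL                   # last resort: explicit failure
```

Serving stale data in preference to failing outright is the same trade-off standardized for DNS in RFC 8767: a slightly outdated answer is almost always better for the client than no answer.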

Why Pacing Matters

Without Pacing
  • Upstream provider rate limits kick in at ~1,500 QPS
  • Queries start getting dropped silently
  • All users sharing the resolver experience degraded service
  • Retry storms amplify the congestion further
With v.recipes Pacing
  • Upstream rate stays well below provider limits
  • No dropped queries under normal operation
  • Individual user latency unaffected (cache absorbs bulk traffic)
  • Graceful degradation under extreme load via stale cache

Real-World Example: ECS Optimized Variant

Google Public DNS enforces a rate limit of approximately 1,500 queries per second per source IP. The v.recipes /dns-ecs variant uses Google as its upstream for ECS (EDNS Client Subnet) support.

  • 10,000+ incoming queries/s
  • ~90% served from cache
  • <1,000/s forwarded to Google

The caching layer absorbs the vast majority of traffic, and pacing ensures the remaining upstream queries never exceed Google's rate limit — even during traffic spikes.
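The arithmetic behind those figures, as a quick sanity check (the numbers are the ones quoted above):

```python
incoming_qps = 10_000       # incoming queries per second
cache_hit_rate = 0.90       # ~90% answered from cache
bucket_rate = 1_000         # pacing ceiling: tokens (queries) per second
google_limit = 1_500        # approx. Google Public DNS per-source-IP limit

cache_hits = int(incoming_qps * cache_hit_rate)   # 9,000 qps absorbed by cache
misses = incoming_qps - cache_hits                # 1,000 qps reach the pacer
upstream_qps = min(misses, bucket_rate)           # bucket caps the forward rate

assert upstream_qps < google_limit                # comfortably under the limit
```

Note that the cache alone already brings the miss rate to roughly the bucket's refill rate; the pacer's job is to hold that line when the hit rate dips during a spike.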