What does a CDN actually do?

A CDN sits in front of your origin server and answers requests from edge nodes located close to your users. For cacheable content (static files, public API responses, even rendered HTML if marked cacheable), the edge node serves the response directly without bothering your origin. The user sees lower latency, your origin sees less load, and attackers attempting volumetric DDoS hit the CDN first.

Do I need a CDN if I have low traffic?

Probably yes, for two reasons that have nothing to do with traffic volume. First, the TLS termination and HTTP/3 support on modern CDNs are usually better than what a small origin can configure correctly. Second, the DDoS protection turns "any motivated stranger can take you offline" into "any motivated stranger has to bring tens of gigabits per second", which is a much higher bar. Cloudflare has a free tier that covers most low-traffic cases, and budget providers such as Bunny.net cost very little at that scale.

How does a CDN find the closest server to me?

Usually anycast routing. The CDN advertises the same IP from hundreds of locations worldwide; BGP routes the user's traffic to whichever location is closest in network terms, which is usually but not always the geographically closest. With pure anycast there is no DNS-based geographic guesswork: your packet lands at the nearest edge as a side effect of ordinary routing. (Some CDNs instead, or additionally, steer users by returning different IPs from DNS.) This is why a CDN's IPs look like a small handful of addresses but map to thousands of physical servers.

Can my origin IP leak through a CDN?

Yes, and it happens constantly. Common leaks: outbound email headers (mail servers often live on the origin), historical DNS records (archived by services such as Censys or SecurityTrails), error pages that include the origin's hostname, and certificate transparency logs, which publish every hostname you've ever been issued a certificate for. The fix is to put the origin behind a firewall that only accepts traffic from the CDN's published IP ranges, so even if the IP leaks, direct connections are refused.

What is a CDN? Content delivery networks, explained from scratch

The problem CDNs were invented to solve

In the late 1990s, websites had one server (often in someone's garage or a single datacenter in Virginia) and increasingly distant users (for visitors in East Asia and Europe, a page could take minutes to load). Bandwidth was expensive, transcontinental latency was punishing, and an unexpected mention on Slashdot would flatten a small site for hours. Akamai, founded out of MIT in 1998, pioneered the model that became the industry: copy the content to thousands of servers spread around the world, and serve each user from the one geographically nearest.

Twenty-five years later that model has expanded enormously. A modern CDN is not just a cache — it terminates TLS, runs WAF rules, makes routing decisions, executes serverless code at the edge, transcodes images on the fly, and absorbs terabit-scale DDoS attacks. But the original premise still holds: get the response physically close to the user, and let your origin do as little work as possible.

What you actually get from a CDN

Latency reduction. Static assets — JavaScript bundles, CSS, images, fonts — live in cache at hundreds of points of presence. A user in Tokyo fetches your site from a Tokyo edge in 5–15 ms instead of crossing the Pacific for 120–180 ms. The bigger the bundle, the bigger the win, because the time-to-first-byte improvement compounds with the throughput improvement.
Origin shielding. Most requests never touch your origin server. Cloudflare reports that for the average site behind it, more than 70% of bytes are served from cache. Your backend can run on a tiny VPS while still surviving viral traffic — the cache absorbs the spike.
TLS termination. The CDN does the TLS handshake; your origin can run plain HTTP over a private link. Lower CPU on origin, hundreds of regional locations doing the certificate work, automatic HTTP/3 and TLS 1.3 even if your origin only speaks HTTP/1.1.
DDoS mitigation. Large CDNs absorb attacks measured in terabits per second; the Mirai botnet attacks of 2016 were reported at roughly 1 Tbps against OVH and around 1.2 Tbps against Dyn. No single origin can absorb that. Sitting behind a serious CDN means the cost of absorbing an attack is the CDN's, not yours.
WAF and bot management. Layer 7 filtering at the edge blocks SQL injection probes, credential-stuffing attempts, scraper traffic, and malicious user agents before they reach your server. Most CDNs ship a default WAF ruleset that covers many common OWASP Top Ten attack patterns without configuration.
Image and video optimization. Many CDNs transcode on the fly — sending AVIF or WebP to browsers that support it, MP4 with adaptive bitrate to video clients, and resized variants of the same source image depending on the requested viewport. You upload one master file; the CDN serves the right derivative.
Edge compute. Modern CDNs (Cloudflare Workers, Fastly Compute@Edge, AWS Lambda@Edge, Vercel Edge Functions, Netlify Edge Functions) let you run code at the edge — A/B tests, auth checks, redirects, even full server-rendered HTML. The distinction between "CDN" and "platform" is steadily blurring.

How a request actually flows through a CDN

DNS resolution. Your domain's A record (or a CNAME chain that ends in one) points at a CDN-owned IP, not your origin. DNS lookup tools show this: query a CDN-fronted site and the returned address falls inside the provider's published ranges rather than the site's own hosting.
Anycast routing. The CDN advertises that same IP from hundreds of BGP points-of-presence worldwide. The user's ISP routes them to whichever PoP is closest in BGP terms — usually but not always the geographically closest. See the BGP glossary entry for how this works.
Cache lookup. The edge node hashes the request (URL + relevant query parameters + relevant headers like Accept-Encoding) and checks its local cache. A hit returns the cached response immediately, with a header like cf-cache-status: HIT or x-cache: HIT telling you the response came from cache.
Tier 2 / shield. If the edge missed, most modern CDNs send the request to a regional shield (a second-tier cache closer to your origin) instead of straight to origin. The shield catches a miss-from-edge before it becomes a hit on your origin server, dramatically reducing origin traffic.
Origin fetch. If both levels of cache miss, the request finally reaches your origin. The CDN stores the response (according to your Cache-Control headers), serves it to the user, and serves subsequent users in that region from the cache.
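
The cache lookup, shield, and origin fetch steps above can be sketched as a toy simulation. This is a minimal illustration, not how any particular CDN is implemented: the Tier class, the cache_key function, and the edge names are all hypothetical, the key uses only the URL and Accept-Encoding, and TTLs from Cache-Control are ignored entirely.

```python
import hashlib


def cache_key(url, headers):
    """Hash the URL plus the one request header this sketch treats as cache-relevant."""
    encoding = headers.get("Accept-Encoding", "")
    return hashlib.sha256(f"{url}|{encoding}".encode()).hexdigest()


class Tier:
    """One cache layer: an edge PoP, or a regional shield when used as a parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.store = {}

    def fetch(self, url, headers, origin):
        """Return (body, served_by), filling this tier's cache on a miss."""
        key = cache_key(url, headers)
        if key in self.store:
            return self.store[key], self.name
        if self.parent is not None:
            body, served_by = self.parent.fetch(url, headers, origin)
        else:
            body, served_by = origin(url), "origin"
        self.store[key] = body  # a real CDN would consult Cache-Control first
        return body, served_by


origin_hits = []


def origin(url):
    """Stand-in for the origin server; records every request that reaches it."""
    origin_hits.append(url)
    return f"body of {url}"


shield = Tier("shield")
tokyo = Tier("edge-tokyo", parent=shield)
osaka = Tier("edge-osaka", parent=shield)
headers = {"Accept-Encoding": "br"}

print(tokyo.fetch("/app.js", headers, origin)[1])  # origin
print(tokyo.fetch("/app.js", headers, origin)[1])  # edge-tokyo
print(osaka.fetch("/app.js", headers, origin)[1])  # shield
print(len(origin_hits))                            # 1
```

The third request is the interesting one: a different edge misses, but the shield answers, so the origin still sees only one request for the asset.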

The cache-control language

The behavior of every CDN is governed by HTTP cache headers your origin sends. The most important ones:

Cache-Control: max-age=31536000, immutable — cache for a year, and the client can skip revalidation entirely, even on a reload, because the content at that URL never changes. Used for content-hashed assets like main.a8f3c2.js.
Cache-Control: public, s-maxage=300, stale-while-revalidate=86400 — CDN caches for 5 minutes; after expiry, serve the stale copy for up to 24 hours while asynchronously fetching a fresh one. Excellent for "mostly-stable" pages.
Cache-Control: no-store — never cache, anywhere. Used for personalized HTML or anything containing auth state.
Vary: Accept-Encoding, Accept-Language — instructs the CDN to cache a different version for each value of these headers, so a French-speaking user and an English-speaking user don't get each other's localized HTML.
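
To make the directives above concrete, here is a rough sketch of how a shared cache such as a CDN edge might classify a stored response by its age. The function names and the four state labels are invented for illustration, and the parser handles only simple name=seconds directives; real caches follow the full HTTP caching rules and handle many more cases.

```python
def parse_cache_control(value):
    """Split a Cache-Control header into {directive: argument-string}."""
    directives = {}
    for part in value.split(","):
        part = part.strip().lower()
        if part:
            name, _, arg = part.partition("=")
            directives[name] = arg  # "" for valueless directives like no-store
    return directives


def seconds(directives, name, default=0):
    """Read a numeric directive, falling back to default if absent or malformed."""
    arg = directives.get(name)
    return int(arg) if isinstance(arg, str) and arg.isdigit() else default


def shared_cache_state(header, age_seconds):
    """Classify a response of the given age as a shared cache would see it."""
    cc = parse_cache_control(header)
    if "no-store" in cc or "private" in cc:
        return "do-not-store"
    # s-maxage applies only to shared caches and takes precedence over max-age.
    ttl = seconds(cc, "s-maxage", seconds(cc, "max-age"))
    if age_seconds < ttl:
        return "fresh"
    if age_seconds < ttl + seconds(cc, "stale-while-revalidate"):
        return "stale-but-servable"  # serve now, refetch in the background
    return "expired"


page = "public, s-maxage=300, stale-while-revalidate=86400"
print(shared_cache_state(page, 120))     # fresh
print(shared_cache_state(page, 600))     # stale-but-servable
print(shared_cache_state(page, 90000))   # expired
print(shared_cache_state("no-store", 0)) # do-not-store
```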

The traditional "cache invalidation is hard" problem is mostly solved by content-hashed URLs: rather than invalidating /styles.css, you publish /styles-d4f9e7c2.css and update the HTML to reference it. The old version is still cached (and harmlessly so); nobody asks for it anymore.
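
Build tools normally generate these hashed names for you, but the idea is small enough to sketch by hand. The hashed_name helper below is hypothetical, and the choice of SHA-256 truncated to 8 hex characters is an assumption; bundlers pick their own hash and length.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path, content, digest_len=8):
    """Return a URL path with a content digest inserted before the extension."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(p.with_name(f"{p.stem}-{digest}{p.suffix}"))


old = hashed_name("/styles.css", b"body { color: black; }")
new = hashed_name("/styles.css", b"body { color: navy; }")

print(old)         # /styles-<8 hex chars>.css
print(old != new)  # True: changed bytes produce a new URL, so no purge is needed
```

Because the name depends only on the bytes, an unchanged file keeps its URL across deploys and stays cached, while any edit yields a URL no cache has seen before.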

The major providers, in plain English

The CDN market is concentrated at the top but increasingly competitive at the budget tier. The shortlist:

Cloudflare — 300+ cities, generous free tier (yes, including HTTP/3 and TLS 1.3), broad feature set (Workers, R2 object storage, D1 SQL at the edge, Zero Trust, DNS, Pages). Best default for most projects, especially those starting small. See AS13335 on the IPFerret ASN explorer.
Akamai — the originator (1998), tremendous global reach including many ISPs as embedded caching partners, enterprise pricing, complex configuration. Where you go when you have a CDN budget and a procurement department.
Fastly — VCL-based configuration (very expressive — you can ship complex routing logic directly to the edge), strong developer community, loved by content sites (NYT, BuzzFeed, GitHub). Higher per-request cost but lower origin-egress cost.
AWS CloudFront — tight integration with the rest of AWS, the obvious pick if your origin is already an S3 bucket or an EC2 instance. Less compelling as a free-standing product.
Bunny.net — much smaller footprint than Cloudflare, much lower per-GB pricing (single-digit dollars per terabyte vs. Cloudflare R2's competitive but higher rate), great for video hosting and high-egress static delivery.
Vercel, Netlify, Render — platform-CDN hybrids; the CDN is invisible because the platform provides the whole stack. Easiest path if your application is a modern Next.js / Astro / Remix / SvelteKit site.

Trade-offs and gotchas

TLS visibility

The CDN sees your decrypted traffic by definition: it has to, to make routing and caching decisions. That means you are trusting the CDN with whatever your users send through it: form posts, authenticated session tokens, file uploads. Most CDNs publish clear data-handling policies; pick one whose privacy posture you can defend to your users. For sensitive workloads, encrypt the payload at the application layer so the CDN only ever relays ciphertext. Features such as Cloudflare's Encrypted Client Hello or dedicated per-customer TLS keys narrow who else can observe or impersonate the connection, but the CDN still terminates TLS and sees the plaintext.

Origin IP leakage

If your origin's real IP is ever exposed — through email headers, historical DNS records on tools like SecurityTrails, certificate transparency logs publishing every domain you've held a cert for, or a careless error page — attackers can bypass the CDN entirely and hit your origin directly. The fix is a firewall on the origin that only accepts traffic from the CDN's published IP ranges. Most CDNs publish those ranges as a downloadable JSON; rotate the firewall rules from cron.
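
The allowlist check itself is simple. Below is a minimal sketch, assuming the ranges arrive as JSON with a "ranges" array; that shape is invented here (each provider uses its own format), and the CIDRs shown are documentation placeholder networks, not any CDN's real ranges. In production the same logic usually lives in firewall rules rather than application code.

```python
import ipaddress
import json

# Hypothetical payload: the field name and the placeholder CIDRs are assumptions.
published = json.loads(
    '{"ranges": ["203.0.113.0/24", "198.51.100.0/24", "2001:db8::/32"]}'
)


def load_networks(cidrs):
    """Parse CIDR strings into network objects once, at rule-refresh time."""
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def is_from_cdn(client_ip, networks):
    """True if the connecting address falls inside any allowlisted range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)


networks = load_networks(published["ranges"])

print(is_from_cdn("203.0.113.7", networks))  # True: inside an allowlisted range
print(is_from_cdn("192.0.2.1", networks))    # False: a direct hit, refuse it
print(is_from_cdn("2001:db8::1", networks))  # True: IPv6 ranges work the same way
```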

Cache poisoning

When the cache key doesn't include a header that affects the response (forgetting Vary: Authorization, for example), one user's logged-in response can be served to another user. This is a real bug, not a hypothetical: it has appeared at major sites. Solution: explicit Vary headers, careful keying of the cache, and synthetic monitoring that fetches as multiple identities and compares.
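
A toy cache key makes the failure mode easy to see. The vary_cache_key function and the two users below are hypothetical; real CDNs build their keys internally and expose different knobs for customizing them.

```python
def vary_cache_key(url, request_headers, vary=()):
    """Build a key from the URL plus each request header named in Vary."""
    lowered = {name.lower(): value for name, value in request_headers.items()}
    parts = [url]
    for name in sorted(header.lower() for header in vary):
        parts.append(f"{name}={lowered.get(name, '')}")
    return "|".join(parts)


alice = {"Authorization": "Bearer alice-token"}
bob = {"Authorization": "Bearer bob-token"}

# Without Vary, both users map to the same cache entry: whoever arrives
# second is served the first user's response.
print(vary_cache_key("/account", alice) == vary_cache_key("/account", bob))  # True

# With Vary: Authorization, each identity gets its own entry.
keyed_alice = vary_cache_key("/account", alice, vary=("Authorization",))
keyed_bob = vary_cache_key("/account", bob, vary=("Authorization",))
print(keyed_alice == keyed_bob)  # False
```

The same comparison is what the synthetic monitoring mentioned above automates: fetch as two identities and alert if the responses come back identical when they should differ.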

When NOT to use a CDN

Some workloads don't benefit: long-running WebSocket connections (the cache adds nothing, the routing layer adds latency), uncacheable real-time APIs that hit the origin on every request anyway (you're just adding a hop), and trusted internal services where you control both ends and don't want a third party in the middle. For those, point at your origin directly or build your own routing layer.

How to tell if a site is using a CDN

Run a DNS lookup on the site. If the A record points at a known CDN range (104.16.x.x, 199.232.x.x, 23.227.x.x), you have your answer.
Fetch the page and inspect headers. cf-cache-status means Cloudflare, x-cache + x-served-by means Fastly, x-amz-cf-id means CloudFront, x-akamai-transformed means Akamai.
WHOIS the IP. The org will read "Cloudflare, Inc.", "Fastly, Inc.", "Akamai Technologies", etc.
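
The header check is easy to script. This sketch encodes only the four header signatures listed above; the guess_cdn name and the sample responses are made up, and a missing signature proves nothing, since providers can strip or rename these headers.

```python
# (provider, headers that must all be present), using the signatures from the text.
SIGNATURES = [
    ("Cloudflare", {"cf-cache-status"}),
    ("Fastly", {"x-cache", "x-served-by"}),
    ("CloudFront", {"x-amz-cf-id"}),
    ("Akamai", {"x-akamai-transformed"}),
]


def guess_cdn(response_headers):
    """Return the first provider whose signature headers all appear, else None."""
    present = {name.lower() for name in response_headers}
    for provider, required in SIGNATURES:
        if required <= present:
            return provider
    return None


print(guess_cdn({"CF-Cache-Status": "HIT", "Server": "example"}))  # Cloudflare
print(guess_cdn({"X-Cache": "HIT", "X-Served-By": "cache-1"}))     # Fastly
print(guess_cdn({"Server": "nginx"}))                              # None
```

Feed it the response headers from any HTTP client as a mapping of header name to value; matching is case-insensitive on the names.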
