
What does a website actually see about you?

When you load a page, your browser hands over far more than your IP. Here is the full catalog of channels — what each one reveals, why it exists, and how identifying the combination is in practice.

The mental model: each request is a tiny self-introduction

Every HTTP request your browser makes carries a packet of self-introducing metadata. A normal request to a normal site reveals at least 20 distinct facts about your environment without doing anything tricky. Combined, those facts are usually enough to identify a particular browser across sessions, even when cookies are blocked and incognito mode is on.

The information falls into four tiers, from "everyone sees this" to "only sites that go looking will."

Tier 1: visible to every site, every request

Your IP address

Inevitable. The site needs to know where to send the response. From the IP, anyone can look up the geographic region (city, sometimes neighborhood, sometimes wrong by a continent for mobile), the ISP or hosting provider via WHOIS / RDAP, the ASN that announces the block, and any reputation flags. See the full "what is my IP" explainer for more depth.

The request line itself

Which URL you asked for, including the path and any query parameters. Alongside it, the Referer header (yes, misspelled in the spec) often reveals the previous page. That is useful for analytics, but it leaks more than people realize: if you click a link in your webmail to a SaaS app, the app and its CDN may receive a Referer indicating which email provider you use.

User-Agent header

A free-form string identifying your browser, your operating system, and (usually) the browser's major and minor version. It looks like Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/.... Inspect yours on the user-agent parser. The combination of browser, OS, and version narrows your environment to a small population of similar setups, typically a few thousand browsers in any large sample.
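
As a rough illustration of how little effort this takes on the server side, here is a minimal sketch of a User-Agent parser. The helper name parseUserAgent and its patterns are assumptions made for this example; it covers only a few common desktop cases and is nowhere near a complete parser.

```javascript
// Illustrative sketch only: parseUserAgent is a hypothetical helper, not a library API.
// It recognizes a handful of common browsers and operating systems.
function parseUserAgent(ua) {
  // Order matters: Edge strings also contain "Chrome/", and Chrome strings
  // also contain "Safari/", so the most specific pattern is tried first.
  const browserPatterns = [
    ["Edge", /Edg\/(\d+)/],
    ["Chrome", /Chrome\/(\d+)/],
    ["Firefox", /Firefox\/(\d+)/],
    ["Safari", /Version\/(\d+).*Safari\//],
  ];

  let browser = "unknown";
  let majorVersion = null;
  for (const [name, pattern] of browserPatterns) {
    const match = ua.match(pattern);
    if (match) {
      browser = name;
      majorVersion = Number(match[1]);
      break;
    }
  }

  // Android strings also contain "Linux", so Android is checked first.
  let os = "unknown";
  if (/Windows NT/.test(ua)) os = "Windows";
  else if (/Mac OS X/.test(ua)) os = "macOS";
  else if (/Android/.test(ua)) os = "Android";
  else if (/Linux/.test(ua)) os = "Linux";

  return { browser, majorVersion, os };
}
```

Calling parseUserAgent with a request's User-Agent value returns an object such as { browser, majorVersion, os }, which is already enough to bucket visitors into environment groups.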

Accept-Language

Which human languages your browser is configured to prefer, in priority order. The value frequently reveals not just your language but the region your browser was configured in: en-US vs. en-GB vs. en-AU are all "English" but tell very different stories about where you set up your machine.
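
To make the "priority order" concrete, the sketch below parses an Accept-Language value into tags sorted by their q-values. The function name parseAcceptLanguage is an assumption for this example, and the region detection is a simple heuristic (the first two-letter subtag after the language).

```javascript
// Sketch: parse an Accept-Language header into entries sorted by preference.
// parseAcceptLanguage is a hypothetical helper name.
function parseAcceptLanguage(header) {
  return header
    .split(",")
    .map((part, index) => {
      const [tag, ...params] = part.trim().split(";");
      // A missing q parameter means the default weight of 1.
      let q = 1;
      for (const param of params) {
        const [key, value] = param.trim().split("=");
        if (key === "q") q = Number(value);
      }
      // Heuristic: treat the first two-letter subtag after the language as the region.
      const subtags = tag.trim().split("-");
      const region = subtags.slice(1).find((s) => /^[A-Za-z]{2}$/.test(s)) || null;
      return { tag: tag.trim(), region, q, index };
    })
    .filter((entry) => entry.tag !== "" && !Number.isNaN(entry.q))
    // Highest q first; ties keep the order in which the browser listed them.
    .sort((a, b) => b.q - a.q || a.index - b.index)
    .map(({ tag, region, q }) => ({ tag, region, q }));
}
```

For a header such as en-GB,en;q=0.9,de;q=0.7 the first entry carries the region GB, which is the kind of regional hint described above.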

Accept-Encoding

Which compression algorithms your browser supports. Modern browsers all support gzip and Brotli (Brotli is generally advertised only over HTTPS); a request that lacks Brotli suggests an older browser, a curl-style client or bot, or a proxy that strips the header.

Sec-CH-UA hints (Chromium client hints)

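A modern replacement for the messy User-Agent string, sent by Chromium browsers as structured headers: which browser brand, which major version, which platform, whether the device is mobile. By default only these low-entropy hints are sent; a site has to opt in with Accept-CH to receive details such as the full version or CPU architecture. Much the same information, more cleanly formatted, and easier to parse for fingerprinting. Inspect them on the request headers tool.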
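
A minimal sketch of reading the Sec-CH-UA brand list follows. The helper names are assumptions for this example, and the filter for placeholder brands (entries like "Not-A.Brand" that Chromium includes deliberately) is a heuristic, not part of any specification.

```javascript
// Sketch: extract brand/version pairs from a Sec-CH-UA header value.
// parseSecChUa and realBrands are hypothetical helper names.
function parseSecChUa(header) {
  const brands = [];
  // Each list member looks like: "Brand Name";v="124"
  for (const match of header.matchAll(/"([^"]*)"\s*;\s*v="([^"]*)"/g)) {
    brands.push({ brand: match[1], version: match[2] });
  }
  return brands;
}

// Heuristic (an assumption of this sketch): drop the deliberately fake
// placeholder brand, whose name varies but resembles "Not A Brand".
function realBrands(brands) {
  return brands.filter(({ brand }) => !/not.*brand/i.test(brand));
}
```

Usage: realBrands(parseSecChUa(headerValue)) leaves only the entries that describe the actual browser.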

Tier 2: revealed as soon as the page loads anything beyond plain HTML

Screen resolution and color depth

Any JavaScript can read screen.width, screen.height, screen.colorDepth, window.devicePixelRatio, and the inner window dimensions. The combination is surprisingly distinctive: a 3440×1440 ultrawide attached to a HiDPI MacBook is a different fingerprint from a 2560×1440 desktop monitor.
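
The properties named above can be gathered in a few lines. This sketch takes a window-like object as a parameter so it can be exercised outside a browser; the function names and the key format are assumptions made for illustration.

```javascript
// Sketch: collect display-related signals from a window-like object.
// In a browser this would be called as collectScreenSignals(window).
function collectScreenSignals(win) {
  const { screen } = win;
  return {
    screen: `${screen.width}x${screen.height}`,
    colorDepth: screen.colorDepth,
    pixelRatio: win.devicePixelRatio,
    viewport: `${win.innerWidth}x${win.innerHeight}`,
  };
}

// Flatten the signals into one comparable string, with keys in a fixed order
// so the same values always produce the same key.
function signalKey(signals) {
  return Object.keys(signals)
    .sort()
    .map((name) => `${name}=${signals[name]}`)
    .join("|");
}
```

Two visitors with different monitors or zoom levels end up with different keys, even when every header they send is identical.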

Timezone and locale

Intl.DateTimeFormat().resolvedOptions().timeZone returns your IANA timezone identifier (America/New_York, Europe/Berlin). Combined with Accept-Language, it identifies your probable region with high precision — and reveals a discrepancy if you're using a VPN with an exit in a different timezone from your real location.
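
The VPN discrepancy check can be sketched as a comparison of UTC offsets. Here ipZone stands for a timezone obtained from some IP geolocation lookup, which is an assumption of this example; the helper names are hypothetical too. Offsets are compared rather than zone names so that Europe/Berlin and Europe/Paris do not count as a mismatch.

```javascript
// The zone the browser reports (the call quoted in the text above).
function browserTimeZone() {
  return Intl.DateTimeFormat().resolvedOptions().timeZone;
}

// Offset of an IANA zone from UTC, in minutes, at a given instant.
// Works by formatting the instant in that zone and reading the parts back.
function utcOffsetMinutes(timeZone, date = new Date()) {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hourCycle: "h23",
    year: "numeric",
    month: "numeric",
    day: "numeric",
    hour: "numeric",
    minute: "numeric",
    second: "numeric",
  }).formatToParts(date);
  const get = (type) => Number(parts.find((p) => p.type === type).value);
  const wallClockAsUtc = Date.UTC(
    get("year"), get("month") - 1, get("day"),
    get("hour"), get("minute"), get("second")
  );
  return Math.round((wallClockAsUtc - date.getTime()) / 60000);
}

// ipZone is assumed to come from an IP geolocation database (hypothetical input).
function timeZoneMismatch(browserZone, ipZone, date = new Date()) {
  return utcOffsetMinutes(browserZone, date) !== utcOffsetMinutes(ipZone, date);
}
```

A site would call timeZoneMismatch(browserTimeZone(), ipZone) and treat a true result as a hint, not proof, since travelers and misconfigured clocks trigger it as well.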

Installed fonts

A site can probe which fonts are available by rendering text in each candidate font and measuring the bounding box: if the box differs from the one produced by the fallback font, the candidate is installed. The list of installed fonts is a high-entropy signal, because it combines the operating system's defaults with whatever applications (Office, Adobe, design tools) have added. Two browsers with identical user agents but different font lists are readily distinguishable.
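
The measuring trick can be sketched as follows. The detection logic takes a measure function as a parameter so it can be tested without a browser; canvasMeasure shows one plausible browser-side implementation using canvas text metrics. All names, the test string, and the font size are assumptions of this sketch.

```javascript
// Sketch: a font counts as installed if text set in it measures differently
// from the same text set in a generic fallback family.
function detectFonts(candidates, measure) {
  const fallbacks = ["monospace", "serif", "sans-serif"];
  const baseline = new Map(fallbacks.map((family) => [family, measure(family)]));
  return candidates.filter((font) =>
    // If the candidate is missing, the browser falls back and the width
    // equals the baseline for every generic family.
    fallbacks.some((family) => measure(`"${font}", ${family}`) !== baseline.get(family))
  );
}

// Browser-only measure function (requires a DOM): width of a test string
// rendered with the given font-family list.
function canvasMeasure(fontFamily) {
  const ctx = document.createElement("canvas").getContext("2d");
  ctx.font = `72px ${fontFamily}`;
  return ctx.measureText("mmmmmmmmmmlli").width;
}
```

In a page this would run as detectFonts(["Futura", "Calibri"], canvasMeasure); a candidate whose metrics happen to match every fallback exactly would be missed, which is a known limit of the approach.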

Plugins and MIME types

Less informative than it used to be: modern browsers have consigned Flash, Java, and Silverlight to history, and what's left in navigator.plugins is a short list of PDF viewers. Still inspectable, but only a low-entropy contributor to the fingerprint.

Tier 3: revealed when a site actively fingerprints

Canvas fingerprinting

The site draws text into an HTML5 canvas, reads the resulting pixel data back, and hashes it. Different combinations of OS, browser, GPU, font rendering settings, and installed fonts produce slightly different pixels, so the same source code yields a different hash on each configuration. The hash is a stable identifier across sessions on the same machine. This is the workhorse of modern anti-fraud and ad-tracking systems.
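
A minimal sketch of the draw, read back, and hash sequence is below. It takes a document-like object as a parameter so the flow can be exercised with a stub. The drawing commands, the function names, and the choice of a 32-bit FNV-1a hash are all assumptions made for illustration; production fingerprinting scripts draw more elaborate scenes and typically use stronger hashes.

```javascript
// 32-bit FNV-1a over the string's character codes, returned as 8 hex digits.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Draw a fixed scene, serialize the pixels, and hash the result.
// In a browser this would be called as canvasFingerprint(document).
function canvasFingerprint(doc) {
  const canvas = doc.createElement("canvas");
  canvas.width = 240;
  canvas.height = 60;
  const ctx = canvas.getContext("2d");
  ctx.textBaseline = "top";
  ctx.font = "16px Arial";
  ctx.fillStyle = "#f60";
  ctx.fillRect(10, 10, 100, 30);
  ctx.fillStyle = "#069";
  ctx.fillText("Fingerprint test 123", 4, 18);
  // toDataURL encodes the rendered pixels; small rendering differences
  // between machines change this string and therefore the hash.
  return fnv1a(canvas.toDataURL());
}
```

The drawing code is identical for every visitor; only the rendered output, and so the hash, varies by machine.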

WebGL fingerprinting

Same idea but with hardware-accelerated 3D rendering. The site queries WebGL's identification strings (UNMASKED_VENDOR_WEBGL, UNMASKED_RENDERER_WEBGL) and renders a known scene; the rasterization differences between GPU models, drivers, and OS versions produce a fingerprint that's harder to randomize than canvas.
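
The identification-string half of this can be sketched in a few lines. The function takes a WebGL context as a parameter; in a browser that context would come from a canvas's getContext("webgl") call. The helper name is an assumption, and the rendering half of the technique is omitted here.

```javascript
// Sketch: read the GPU vendor and renderer strings from a WebGL context.
// webglIdentity is a hypothetical helper name.
function webglIdentity(gl) {
  // The generic strings are always available but often masked.
  const info = {
    vendor: gl.getParameter(gl.VENDOR),
    renderer: gl.getParameter(gl.RENDERER),
    unmasked: false,
  };
  // The debug extension, when exposed, reveals the real GPU and driver.
  const ext = gl.getExtension("WEBGL_debug_renderer_info");
  if (ext) {
    info.vendor = gl.getParameter(ext.UNMASKED_VENDOR_WEBGL);
    info.renderer = gl.getParameter(ext.UNMASKED_RENDERER_WEBGL);
    info.unmasked = true;
  }
  return info;
}
```

Browsers that withhold the extension make getExtension return null, in which case the sketch falls back to the generic strings.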

AudioContext fingerprinting

Generate an oscillator, pass it through an audio graph, and read back the processed samples or FFT output. Different browser and OS audio stacks produce subtly different results. Less common than canvas and WebGL, but in active use.

Battery and device information

On older browsers, the Battery Status API exposed enough detail to identify a device. Modern browsers have largely removed or coarsened the API in response. The generic navigator.hardwareConcurrency (logical CPU count), navigator.deviceMemory (RAM tier), and accelerometer / gyroscope on mobile are all still readable.

WebRTC IP leakage

The serious one. WebRTC ICE candidate gathering can reveal your real IP address even behind a VPN, by inspecting OS network interfaces and probing STUN servers. We have a dedicated WebRTC leak test and a long-form explainer covering both the mechanism and the mitigations.
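
What a page actually receives from ICE gathering is a series of candidate lines, each containing an address. The sketch below parses one such line; the function name and the sample addresses in the usage note are assumptions for illustration (the addresses come from documentation-reserved and private ranges).

```javascript
// Sketch: pull the address and candidate type out of an ICE candidate line.
// parseIceCandidate is a hypothetical helper name.
function parseIceCandidate(line) {
  const fields = line
    .replace(/^a=/, "")
    .replace(/^candidate:/, "")
    .trim()
    .split(/\s+/);
  // Layout: foundation component protocol priority address port "typ" type ...
  if (fields.length < 8 || fields[6] !== "typ") return null;
  const result = {
    protocol: fields[2].toLowerCase(),
    address: fields[4],
    port: Number(fields[5]),
    type: fields[7], // "host", "srflx" (seen via STUN), "relay", ...
  };
  // Optional key/value pairs follow, e.g. "raddr <address> rport <port>".
  for (let i = 8; i + 1 < fields.length; i += 2) {
    if (fields[i] === "raddr") result.relatedAddress = fields[i + 1];
  }
  return result;
}
```

For a line like candidate:842163049 1 udp 1677729535 203.0.113.7 46154 typ srflx raddr 192.168.1.5 rport 46154, the srflx address is the public IP as seen by the STUN server, which is the value that can bypass a VPN when the browser sends the STUN request outside the tunnel.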

HTTP/2 + TLS fingerprints (JA3, JA4)

The least-obvious channel. The TLS handshake itself (the order in which your browser advertises cipher suites, extensions, and elliptic curves) is distinctive per browser and version. JA3 and JA4 are fingerprinting schemes that condense this information into a short string. A client whose User-Agent claims one browser but whose JA3 fingerprint matches another is a strong bot signal, and this mismatch check has caught lots of bots.
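
To show why ordering matters, here is a sketch of the JA3 construction as publicly described: five ClientHello fields rendered as decimal numbers, joined with dashes and commas, with GREASE placeholder values removed, then hashed with MD5. The shape of the hello object and the function names are assumptions of this example, and it runs in Node.js since it uses the built-in crypto module.

```javascript
const { createHash } = require("crypto");

// GREASE values (0x0a0a, 0x1a1a, ... 0xfafa) are random placeholders that
// browsers insert; JA3 ignores them so they do not change the fingerprint.
function isGrease(value) {
  return (value & 0x0f0f) === 0x0a0a && (value >> 8) === (value & 0xff);
}

// hello is a hypothetical pre-parsed ClientHello:
// { version, ciphers, extensions, curves, pointFormats }, all numeric.
function ja3String(hello) {
  const join = (values) => values.filter((v) => !isGrease(v)).join("-");
  return [
    hello.version,
    join(hello.ciphers),
    join(hello.extensions),
    join(hello.curves),
    join(hello.pointFormats),
  ].join(",");
}

function ja3Hash(hello) {
  return createHash("md5").update(ja3String(hello)).digest("hex");
}
```

Because the lists are joined in the order the client sent them, two clients offering the same cipher suites in a different order produce different hashes.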

Tier 4: combined with cross-site signals

Third-party cookies (declining but still around)

A cookie set by ad-network-x.com on one site is sent back when you visit any other site that loads from ad-network-x.com. This used to be the universal tracking primitive. Safari and Firefox now block or partition such cookies by default and Chrome blocks them in Incognito, but they persist in pockets.

localStorage and IndexedDB

Per-origin client-side storage that survives across sessions. A site can stash an identifier and read it back the next time you visit. Incognito windows get a separate store that is discarded when they close, and "Clear site data" wipes it, but otherwise it is sticky.
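
The stash-and-read-back pattern is only a few lines. This sketch accepts any object with getItem and setItem, so in a browser it would be handed window.localStorage; the key name visitor_id and the function name are assumptions for illustration.

```javascript
// Sketch: return the stored identifier, creating and saving one on first visit.
// storage is any object with the Web Storage getItem/setItem methods.
function getOrCreateVisitorId(storage, generateId) {
  const key = "visitor_id"; // hypothetical key name
  let id = storage.getItem(key);
  if (id === null) {
    id = generateId();
    storage.setItem(key, id);
  }
  return id;
}
```

In a page, the call might look like getOrCreateVisitorId(window.localStorage, () => crypto.randomUUID()); every later visit from the same browser profile then returns the same value until the storage is cleared.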

Login state at common services

A site can probe whether you're logged into Facebook, Google, GitHub, etc. by attempting to load a known-protected resource and observing whether it loads successfully (logged in) or fails (not). A cross-origin page cannot read the status code itself, so the probe relies on side effects such as an image's load and error events. The full technique has been narrowed by recent browser changes but variants still work.

Behavioral signals

Mouse movement patterns, typing cadence, scroll-wheel granularity, touch event characteristics, time between page loads — all distinctive enough to fingerprint users across sessions if a site cares to track them. This is at the edge of "what a normal site actually does" but it's well-instrumented in fraud-detection systems.
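
As one small example of a behavioral feature, typing cadence can be reduced to summary statistics over the gaps between keystrokes. The function name and the choice of mean and standard deviation are assumptions of this sketch; real fraud-detection systems use far richer features.

```javascript
// Sketch: summarize typing rhythm from keydown timestamps (in milliseconds).
// keystrokeProfile is a hypothetical helper name.
function keystrokeProfile(timestampsMs) {
  const gaps = [];
  for (let i = 1; i < timestampsMs.length; i++) {
    gaps.push(timestampsMs[i] - timestampsMs[i - 1]);
  }
  if (gaps.length === 0) return null; // need at least two keystrokes

  const mean = gaps.reduce((sum, gap) => sum + gap, 0) / gaps.length;
  const variance =
    gaps.reduce((sum, gap) => sum + (gap - mean) ** 2, 0) / gaps.length;
  return { count: gaps.length, meanMs: mean, stdDevMs: Math.sqrt(variance) };
}
```

In a page the timestamps would come from event.timeStamp on keydown events; a perfectly regular rhythm (a standard deviation near zero) is itself a signal, since scripted input tends to look like that.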

How distinguishing is all this in practice?

The seminal Panopticlick study (EFF, 2010) found that 84% of browsers in their test population had a fingerprint unique to that browser within the study. Subsequent research has bounced the number up and down based on the population — homogeneous deployments (corporate fleets of identical machines) have much higher overlap; heterogeneous consumer populations (a random sample of all browsers visiting an ad-supported site) approach uniqueness for any actively-fingerprinted user.
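
Uniqueness figures like this are usually reasoned about in bits of identifying information: a value shared by a fraction p of browsers carries -log2(p) bits. The sketch below does that arithmetic. The function names and the shares in the usage note are made up for illustration, and adding bits across signals assumes the signals are independent, which real signals are not, so the sum is an upper bound.

```javascript
// Bits of identifying information in a value shared by a given fraction
// of the population (for example 0.125 means one browser in eight).
function surprisalBits(shareOfPopulation) {
  return -Math.log2(shareOfPopulation);
}

// Upper bound on combined information, ASSUMING the signals are independent.
function combinedBits(shares) {
  return shares.reduce((sum, share) => sum + surprisalBits(share), 0);
}

// Roughly how many browsers a fingerprint with this many bits can tell apart.
function distinguishablePopulation(bits) {
  return 2 ** bits;
}
```

With hypothetical shares of 1/8 for a user agent, 1/4 for a timezone, and 1/2 for a language setting, combinedBits returns 6, enough to single out one browser in 64 under the independence assumption.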

The practical effect: a site that wants to identify returning visitors across sessions can usually do so without any consent-required identifier, by combining IP + User-Agent + canvas hash + a few other signals. This is why "I cleared my cookies" is no longer a meaningful privacy boundary on actively-tracked sites.

What you can do about it

The honest summary

A modern website that wants to identify you across sessions has many channels for doing so. Cookies are no longer the main lever; the underlying fingerprint of your specific browser-and-network combination is. For most users, on most sites, this doesn't matter: the site uses the fingerprint for fraud detection and analytics, not to find your name. For users who do care, the only meaningful defenses are browsers explicitly designed to resist fingerprinting (Tor Browser, Brave, Firefox in strict mode), and the trade-off is giving up some usability in exchange for stronger anti-tracking protection.
