Official Definition

“PII (Personally Identifiable Information) refers to any information that can be used to distinguish or trace an individual’s identity, either alone or when combined with other information that is linked or linkable to a specific individual.”

— NIST SP 800-122

Types of PII

Not all PII carries equal risk. Understanding the three categories helps determine the right level of protection for each data type.

Direct Identifiers

Data that can identify a person on its own, without needing any additional context or cross-referencing.

  • Full name
  • Social Security Number (SSN)
  • Passport number
  • Driver’s license number
  • Biometric data (fingerprints, face scans)

Quasi-Identifiers

Data that cannot identify a person alone but can do so when combined with other quasi-identifiers or external datasets.

  • Date of birth
  • ZIP code / postal code
  • Gender
  • Job title
  • Nationality
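
The re-identification risk of quasi-identifiers is commonly measured with k-anonymity: group records by their quasi-identifier tuple and check the smallest group size. Latanya Sweeney famously estimated that ZIP code, birth date, and gender alone uniquely identify roughly 87% of the US population. A minimal sketch with toy data (field names and values are illustrative):

```python
from collections import Counter

# Toy records containing only quasi-identifiers, no direct identifiers.
records = [
    {"dob": "1984-03-12", "zip": "10115", "gender": "F"},
    {"dob": "1984-03-12", "zip": "10115", "gender": "F"},
    {"dob": "1990-07-01", "zip": "80331", "gender": "M"},
]

def k_anonymity(rows, keys):
    """Smallest equivalence-class size over the quasi-identifier tuple.
    k == 1 means at least one record is uniquely re-identifiable."""
    groups = Counter(tuple(r[k] for k in keys) for r in rows)
    return min(groups.values())

print(k_anonymity(records, ["dob", "zip", "gender"]))  # -> 1
```

Here the third record forms a group of one, so the dataset is only 1-anonymous: anyone who knows that person's birth date, ZIP, and gender can single them out.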

Sensitive PII

Data that requires heightened protection due to the potential for discrimination, harm, or significant privacy impact if exposed.

  • Medical records
  • Financial data (credit cards, bank accounts)
  • Racial or ethnic origin
  • Political opinions
  • Sexual orientation

PII Categories at a Glance

A comprehensive breakdown of PII categories, common examples, associated risk levels, and the primary regulations that govern each type.

Category   | Examples                                 | Risk Level | Key Regulation
-----------|------------------------------------------|------------|-----------------
Identity   | Name, SSN, passport number               | High       | All
Contact    | Email, phone, physical address           | Medium     | GDPR, CCPA
Financial  | Credit card, IBAN, tax ID                | Critical   | PCI-DSS, GDPR
Health     | Medical records, prescriptions           | Critical   | HIPAA, GDPR
Digital    | IP address, device ID, cookies           | Medium     | GDPR, ePrivacy
Biometric  | Fingerprints, face scans, voice prints   | Critical   | GDPR Art. 9
Employment | Employee ID, salary, performance reviews | Medium     | GDPR
Education  | Student records, grades, transcripts     | Medium     | FERPA

How Laws Define PII

There is no single global definition of PII. The two major frameworks — GDPR and US law — take fundamentally different approaches.

GDPR — “Personal Data” (Art. 4(1))

The EU’s General Data Protection Regulation uses the broadest definition in global privacy law.

  • Any data relating to an identified or identifiable natural person
  • Includes online identifiers (IP addresses, cookie IDs, device fingerprints)
  • Pseudonymized data is still personal data
  • Only fully anonymized data falls outside GDPR scope
  • Special categories (Art. 9): health, biometric, racial origin, political opinions

US Law — Sector-Specific Definitions

The United States has no single federal privacy law. PII definitions vary by sector, state, and regulation.

  • No unified federal definition of PII
  • CCPA (California): broad — includes IP addresses, browsing history
  • HIPAA (Health): Protected Health Information (PHI), 18 identifier types
  • FERPA (Education): student education records
  • Generally narrower than GDPR — often requires data to directly identify a person

Where PII Hides

PII is rarely confined to a single database column. It spreads across systems, documents, and workflows — often in places organizations never think to check.

Emails & Messages

Names, phone numbers, addresses, and account details embedded in email threads, Slack messages, and support tickets. Often forwarded and copied without redaction.

Documents & Spreadsheets

Contracts, invoices, HR files, and Excel exports containing SSNs, salaries, medical information, and customer records shared across departments.

AI Prompts & Chat Logs

Users paste customer data, code snippets with credentials, and personal details into ChatGPT, Claude, and Copilot — all logged by the AI provider.

Database Fields

Free-text columns, notes fields, and unstructured data in CRM, ERP, and ticketing systems where PII is entered without validation or tagging.

Log Files & Analytics

IP addresses, user agents, session IDs, and sometimes full request bodies with PII captured in application logs, web server logs, and analytics platforms.
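
A first-pass log scrubber for identifiers like these can be sketched with regular expressions. The two patterns below are deliberately simplified for illustration; a production engine uses many more recognizers plus checksum validation:

```python
import re

# Simplified patterns for two common PII types found in logs.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")

def scrub(line: str) -> str:
    """Replace IP addresses and email addresses with placeholder tokens."""
    line = IPV4.sub("[IP]", line)
    return EMAIL.sub("[EMAIL]", line)

log = "203.0.113.42 - GET /reset?user=jane.doe@example.com 200"
print(scrub(log))  # -> '[IP] - GET /reset?user=[EMAIL] 200'
```

Scrubbing at write time (in the logging layer) is safer than scrubbing stored logs later, since PII never lands on disk in the first place.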

Images & Scanned Documents

Scanned IDs, passport photos, medical forms, and screenshots containing visible PII that survives text-based redaction because it exists as pixels, not characters.

How to Protect PII

Effective PII protection requires automated detection at scale, multiple anonymization methods, and architecture that eliminates single points of failure.

  • Automated PII Detection — 260+ entity types across 48 languages, powered by NLP and Pattern engines with checksum validation. No manual tagging required.
  • Multiple Anonymization Methods — Replace with realistic fakes, redact entirely, mask partially, hash for consistency, or encrypt for reversible protection. Choose per entity type.
  • AI Workflow Protection — MCP Server for IDE integration, Chrome Extension for AI chat platforms, REST API for pipelines. Anonymize before data reaches any LLM.
  • Compliance Presets — Pre-configured detection profiles for regulatory requirements (e.g., GDPR). Select a preset and the system automatically targets the relevant entity types.
  • Zero-Knowledge Architecture — Your data passes through, gets anonymized, and returns. We never store, log, or access your original content. Even our team cannot see your data.
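
Three of the anonymization strategies listed above (redact, mask, hash) can be sketched in a few lines. The entity names and the per-entity strategy mapping here are illustrative assumptions, not the product's actual configuration:

```python
import hashlib

def redact(value: str) -> str:
    """Remove the value entirely."""
    return "[REDACTED]"

def mask(value: str, visible: int = 4) -> str:
    """Keep only the last `visible` characters."""
    return "*" * (len(value) - visible) + value[-visible:]

def hash_consistent(value: str) -> str:
    """Same input always yields the same token (preserves joins/grouping)."""
    return "tok_" + hashlib.sha256(value.encode()).hexdigest()[:12]

# Hypothetical per-entity-type strategy mapping.
STRATEGY = {"SSN": redact, "CREDIT_CARD": mask, "EMAIL": hash_consistent}

def anonymize(entity_type: str, value: str) -> str:
    return STRATEGY[entity_type](value)

print(anonymize("CREDIT_CARD", "4111111111111111"))  # -> '************1111'
```

Choosing the method per entity type matters: masking keeps enough of a card number for support workflows, while consistent hashing lets analysts count distinct users without ever seeing an email address.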

Deterministic, Not AI-Based

We use NLP + Pattern engines — not LLMs — to detect PII. Same input always produces the same output. No hallucinations, no variation between runs, full audit trail for every detection.

100% EU Infrastructure

All processing happens on Hetzner infrastructure in Germany. No data leaves the EU. No US Cloud Act exposure. No third-party sub-processors with access to your content.

Frequently Asked Questions

Common questions about PII, privacy regulations, and how automated anonymization works.

Is “PII” the same as the GDPR’s “personal data”?

The GDPR’s definition of “personal data” is broader than the US concept of “PII.” Under GDPR, personal data includes any information relating to an identified or identifiable natural person — including online identifiers like IP addresses and cookie IDs. US “PII” definitions vary by sector and state law, and are generally narrower, often requiring the data to directly identify an individual. In practice, complying with GDPR’s broader definition will generally cover US PII definitions as well.

Is an email address PII?

Yes, in virtually all cases. A personal email like firstname.lastname@provider.com is a direct identifier: it can uniquely identify a specific individual without needing additional context, under both GDPR (personal data) and US privacy frameworks (PII). Even a generic work address like info@company.com may qualify if it can be linked to a specific person.

Are IP addresses PII?

Under the GDPR, yes — IP addresses are explicitly classified as personal data because they can be used to identify an individual, especially when combined with data held by an ISP. The CJEU confirmed this in Breyer v. Bundesrepublik Deutschland (2016). In the US, it depends on context: the CCPA considers IP addresses personal information, while other federal frameworks may not classify them as PII unless they can be directly linked to a specific person. Best practice: treat IP addresses as PII regardless of jurisdiction.

What happens if PII is exposed in a breach?

The consequences are severe and multi-layered:

  • Regulatory fines — GDPR allows penalties up to 4% of annual global revenue or €20 million, whichever is higher.
  • Legal liability — class-action lawsuits, individual compensation claims, and mandatory breach notifications within 72 hours (GDPR) or varying state deadlines (US).
  • Reputational damage — loss of customer trust, negative press coverage, and long-term brand impact.
  • Operational costs — forensic investigation, remediation, credit monitoring for affected individuals, and increased insurance premiums.

How many entity types and languages are supported?

Over 260 entity types across 48 languages. The hybrid detection engine combines NLP-based named entity recognition (25 spaCy languages + 7 Stanza + 16 XLM-RoBERTa) with 317 regex recognizers that include checksum validation (Luhn for credit cards, MOD-97 for IBANs, format rules for SSNs). Entity types span identity, contact, financial, health, digital, biometric, employment, and education categories — covering the full spectrum of PII as defined by major regulations including GDPR.
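
Both checksum algorithms mentioned above are public standards and easy to sketch. Checksum validation is what separates a real card number or IBAN from a random 16-digit string, cutting false positives dramatically:

```python
def luhn_valid(number: str) -> bool:
    """Luhn checksum (credit cards): double every second digit from the
    right, subtract 9 from results over 9, and require a sum divisible by 10."""
    digits = [int(d) for d in number if d.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def iban_valid(iban: str) -> bool:
    """MOD-97 check (ISO 13616): move the first 4 chars to the end, convert
    letters to numbers (A=10 ... Z=35), and require remainder 1 mod 97."""
    s = iban.replace(" ", "").upper()
    s = s[4:] + s[:4]
    num = "".join(str(int(c, 36)) for c in s)
    return int(num) % 97 == 1

print(luhn_valid("4111111111111111"))        # True  (well-known test card number)
print(iban_valid("DE89370400440532013000"))  # True  (standard example IBAN)
```

A regex alone would flag any plausible-looking digit run; combining it with the checksum means only values that could actually be issued numbers are reported.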

How do I protect PII in AI workflows like ChatGPT or Copilot?

Pre-processing anonymization: detect and remove PII before data reaches the LLM. This eliminates the risk at the source rather than relying on the AI provider’s data handling policies. Use a deterministic detection engine (not another LLM) to avoid circular dependencies where you send sensitive data to one AI to protect it from another. Ensure the anonymization layer supports reversible tokens so authorized users can re-identify data in AI responses when needed. For implementation, use an MCP Server for IDE integration, a Chrome Extension for browser-based AI chats, or a REST API for automated pipelines.
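
The reversible-token flow described above might look like the following sketch. The regex, token format, and function names are illustrative assumptions, not a real API:

```python
import re

# Simplified recognizer; a real engine covers 260+ entity types.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")

def anonymize(prompt: str):
    """Swap each email for a stable token; return safe text plus a reverse map."""
    mapping = {}
    def repl(m):
        return mapping.setdefault(m.group(), f"<EMAIL_{len(mapping) + 1}>")
    safe = EMAIL.sub(repl, prompt)
    return safe, {token: original for original, token in mapping.items()}

def deanonymize(text: str, reverse: dict) -> str:
    """Restore the original values in the LLM's response."""
    for token, original in reverse.items():
        text = text.replace(token, original)
    return text

safe, rev = anonymize("Draft a reply to anna.lee@example.com about her refund.")
print(safe)  # -> 'Draft a reply to <EMAIL_1> about her refund.'
# `safe` goes to the LLM; `rev` never leaves your machine, so the model
# response can be re-identified locally with deanonymize().
```

Because the token map stays local, the LLM provider only ever sees placeholders, yet the final response shown to an authorized user contains the real values.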

Start protecting PII automatically

260+ entity types, 48 languages, Zero-Knowledge architecture. Detect and anonymize PII before it becomes a compliance risk.