What are the 18 HIPAA PHI identifiers?

The 18 PHI identifiers under HIPAA are: names, geographic data smaller than a state, dates (except year) related to an individual, phone numbers, fax numbers, email addresses, Social Security numbers, medical record numbers, health plan beneficiary numbers, account numbers, certificate/license numbers, vehicle identifiers and serial numbers, device identifiers and serial numbers, web URLs, IP addresses, biometric identifiers, full-face photographs, and any other unique identifying number or code.

What is the difference between HIPAA Safe Harbor and Expert Determination?

Safe Harbor requires removing all 18 specified PHI identifiers and having no actual knowledge that remaining information could identify an individual. Expert Determination requires a qualified statistical or scientific expert to determine that the risk of identifying an individual is very small. Safe Harbor is prescriptive and easier to implement; Expert Determination is flexible but requires specialist involvement.

Does HIPAA apply to anonymized data?

No. Once data is properly de-identified under either the Safe Harbor or Expert Determination method as defined in 45 CFR 164.514, the data is no longer considered PHI and HIPAA protections no longer apply. This means de-identified data can be used, disclosed, and shared without HIPAA restrictions.

How does anonymize.solutions help with HIPAA compliance?

anonymize.solutions provides a HIPAA preset that covers all 18 PHI identifiers. The NLP Engine detects names, dates, and contextual health data while the Pattern Engine validates structured identifiers like SSNs, medical record numbers, and phone numbers with checksum validation. Role-based encryption keys enable department-level access control for healthcare workflows.

HIPAA Data De-identification — 18 PHI Types Guide 2026

Regulation Overview

What is HIPAA and Why Does De-Identification Matter?

The Health Insurance Portability and Accountability Act (HIPAA) of 1996 establishes national standards for protecting individuals’ medical records and other individually identifiable health information. The Privacy Rule (45 CFR Part 160 and Subparts A and E of Part 164) defines the requirements for de-identification.

The Privacy Rule

45 CFR §164.514 defines two methods for de-identifying Protected Health Information (PHI): Safe Harbor and Expert Determination. Both produce data that is no longer considered PHI and falls outside HIPAA’s scope.

Who Must Comply

HIPAA applies to covered entities (health plans, healthcare clearinghouses, and healthcare providers) and to business associates that create, receive, maintain, or transmit PHI on their behalf. Statutory civil penalties range from $100 to $50,000 per violation, up to $1.5 million per year for identical violations; these amounts are adjusted annually for inflation.

The Goal

De-identified data is no longer PHI. It can be used for research, analytics, public health, and quality improvement without patient consent, breach notification obligations, or HIPAA restrictions. This makes de-identification a critical enabler for healthcare innovation.

Safe Harbor Requirements

The 18 PHI Identifiers Under HIPAA

The Safe Harbor method (§164.514(b)(2)) requires removal of all 18 types of identifiers. If all 18 are removed and the covered entity has no actual knowledge that the remaining information could identify an individual, the data is considered de-identified.

Personal Identifiers

Names
Geographic data smaller than a state (street address, city, county, precinct, ZIP code)
Dates (except year) directly related to an individual (birth date, admission date, discharge date, death date, and all ages over 89)
Phone numbers
Fax numbers
Email addresses
Social Security numbers
Medical record numbers
Health plan beneficiary numbers

Technical & Other Identifiers

Account numbers
Certificate/license numbers
Vehicle identifiers and serial numbers (including license plate numbers)
Device identifiers and serial numbers
Web URLs
IP addresses
Biometric identifiers (fingerprints, voice prints)
Full-face photographs and comparable images
Any other unique identifying number, characteristic, or code
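As a rough illustration of the date and age rule in the list above, the sketch below keeps only the year of a date and collapses ages over 89 into a single "90+" category. The function names are hypothetical. The sketch does not cover the related requirement that date elements revealing an age over 89, including the year, also be generalized.

```python
from datetime import date


def generalize_date(d: date) -> str:
    """Keep only the year of a date tied to an individual."""
    return str(d.year)


def generalize_age(age: int) -> str:
    """Report ages over 89 as a single '90+' category."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return "90+" if age > 89 else str(age)
```

For example, generalize_date(date(1984, 3, 17)) yields "1984", and generalize_age(93) yields "90+".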

Note on ZIP codes: The first three digits of a ZIP code may be retained if the geographic unit formed by combining all ZIP codes with the same three initial digits contains more than 20,000 people. Otherwise, the first three digits must be replaced with “000.”
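The ZIP code rule can be sketched as a small function. This is a minimal example, assuming the caller supplies the set of three-digit prefixes whose combined population is 20,000 or fewer; the function name and the sample prefixes are hypothetical, and the real set has to be built from current Census data.

```python
def generalize_zip(zip_code: str, low_population_prefixes: set[str]) -> str:
    """Reduce a ZIP code to its first three digits, or to '000' when the
    three-digit area appears in the caller-supplied low-population set."""
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    if len(digits) < 5:
        raise ValueError(f"not a 5-digit ZIP code: {zip_code!r}")
    prefix = digits[:3]
    return "000" if prefix in low_population_prefixes else prefix


# Illustrative values only; derive the real set from current Census data.
EXAMPLE_LOW_POPULATION = {"036", "059"}
```

Passing the low-population set as a parameter keeps the rule separate from the population data, which changes with each Census release.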

De-Identification Methods

Safe Harbor vs Expert Determination

HIPAA provides two paths to de-identification. The choice between them depends on your use case, data utility requirements, and available resources.

Comparison of Safe Harbor and Expert Determination de-identification methods under HIPAA
Dimension	Safe Harbor §164.514(b)(2)	Expert Determination §164.514(b)(1)
Approach	Prescriptive — remove all 18 identifiers	Analytical — statistical risk assessment
Ease of Implementation	Straightforward — clear checklist of identifiers	Complex — requires qualified expert
Data Utility	Lower — all 18 identifier types fully removed	Higher — expert may allow partial retention
Documentation	Demonstrate removal of 18 identifiers + no actual knowledge	Expert’s methods and results must be documented
Cost	Lower — can be automated with detection tools	Higher — requires engaging a qualified expert
When to Use	Standard de-identification for most use cases	Research requiring higher data utility, rare disease studies, small populations

Safe Harbor in Practice

The Safe Harbor method is the most commonly used approach because it provides a clear, auditable checklist. Organizations verify that all 18 identifier types have been removed or generalized and confirm that they have no actual knowledge that the remaining information could identify an individual. Automated PII detection tools can handle most of the identifier removal.

Expert Determination in Practice

Expert Determination is used when preserving data utility is critical — for example, in clinical research with small cohorts or rare disease registries. The expert applies statistical and scientific principles to determine that the risk of identifying any individual is “very small.” Results and methods must be documented.

Implementation

De-Identification Techniques for Healthcare Data

Five core techniques for transforming PHI. Each serves a different purpose depending on whether the goal is irreversible anonymization, reversible pseudonymization, or partial masking for clinical workflows.

REPLACE

Replacement

Substitute PHI with a placeholder or with realistic synthetic data. “John Smith” becomes “[PATIENT_1]” (placeholder) or “Jane Doe” (synthetic). Medical record numbers become synthetic IDs. Maintains document readability for EHR exports, clinical trial reports, and training datasets.
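A minimal sketch of placeholder replacement, assuming the names have already been detected by some upstream step. The function name and the label parameter are hypothetical. Each distinct name gets a stable placeholder so that repeated mentions stay linked within the document.

```python
import re


def replace_names(text: str, names: list[str], label: str = "PATIENT") -> tuple[str, dict[str, str]]:
    """Replace each known name with a stable placeholder such as [PATIENT_1]."""
    mapping: dict[str, str] = {}
    for name in names:
        if name and name not in mapping:
            mapping[name] = f"[{label}_{len(mapping) + 1}]"
    if not mapping:
        return text, mapping
    # Try longer names first so "John Smith" wins over "John".
    ordered = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(n) for n in ordered))
    # Single pass, so inserted placeholders are never re-scanned.
    return pattern.sub(lambda m: mapping[m.group(0)], text), mapping
```

The returned mapping is itself sensitive: anyone holding it can reverse the replacement, so it should be discarded or protected like PHI.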

REDACT

Redaction

Remove PHI entirely. Detected identifiers are deleted from the text with no replacement. Best for FOIA responses, public health reports, and documents shared with external researchers where zero PHI exposure is required.
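Redaction can be sketched as deleting character spans that a detector has already located. This example assumes the spans are (start, end) offsets that do not overlap; the function name is hypothetical.

```python
def redact_spans(text: str, spans: list[tuple[int, int]]) -> str:
    """Delete each (start, end) character span from text."""
    # Work right to left so earlier offsets stay valid after each deletion.
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + text[end:]
    return text
```

Deleting with no replacement leaves gaps such as doubled spaces, which is expected for this technique.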

MASK

Masking

Partially obscure sensitive values while preserving enough for verification. An SSN becomes “***-**-4532” and a medical record number becomes “MRN-****-789.” Ideal for patient portals where individuals need to verify their own records.
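The SSN example above can be sketched in a few lines. The function name is hypothetical. Masking keeps part of the identifier, so it suits display contexts like the patient portal case rather than producing a de-identified dataset.

```python
import re


def mask_ssn(ssn: str) -> str:
    """Mask all but the last four digits of an SSN."""
    digits = re.sub(r"\D", "", ssn)
    if len(digits) != 9:
        raise ValueError("expected a 9-digit SSN")
    return f"***-**-{digits[-4:]}"
```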

HASH

Hashing

One-way cryptographic transformation for link analysis. The same patient identifier always produces the same hash, enabling longitudinal studies across datasets without exposing identity. Note: hashing produces pseudonymized data — it is not considered de-identified under HIPAA Safe Harbor unless combined with other safeguards.
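A minimal sketch of consistent pseudonymization for linkage. It uses a keyed hash (HMAC-SHA256) rather than a bare hash; that is a choice made for this sketch, not something the text above specifies. An unkeyed hash of a short identifier can be reversed by hashing every candidate value. The function name and key handling are assumptions.

```python
import hashlib
import hmac


def pseudonymize_id(patient_id: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest (HMAC) of a patient identifier."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same identifier and key always give the same digest, which is what makes longitudinal linkage possible. Datasets hashed with different keys cannot be joined.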

ENCRYPT

Encryption

Reversible transformation with a key. Authorized clinicians can restore original PHI for treatment purposes. Role-based encryption keys enable department-level access: radiology sees imaging IDs, billing sees account numbers, and researchers see de-identified records only. Encryption uses AES-256-GCM with per-entity keys.

Compliance Checklist

HIPAA De-Identification Implementation Checklist

A step-by-step implementation plan for deploying HIPAA-compliant de-identification across your healthcare organization.

Inventory PHI data flows

Map all systems that create, receive, maintain, or transmit PHI: EHR systems, claims processing, lab results, imaging, billing, patient portals, and third-party integrations.

Choose de-identification method

Select Safe Harbor (prescriptive, automated) or Expert Determination (analytical, specialist-driven) based on your use case, data utility requirements, and budget.

Map all 18 PHI identifiers to detection rules

For each of the 18 identifier types, configure detection rules. NLP engines detect names, dates, and contextual data. Pattern engines validate SSNs, phone numbers, medical record numbers, and account numbers.
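The pattern side of this step might look like the sketch below. The patterns are illustrative assumptions covering only four structured identifier types; they are not the rules any particular product uses. Production rules need broader coverage and validation, and free-text identifiers such as names need an NLP model instead.

```python
import re

# Illustrative patterns only, keyed by a hypothetical identifier label.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def detect(text: str) -> list[tuple[str, int, int, str]]:
    """Return (identifier_type, start, end, matched_text) for every pattern hit."""
    hits = []
    for kind, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append((kind, m.start(), m.end(), m.group(0)))
    # Order by position so downstream steps can process hits left to right.
    return sorted(hits, key=lambda h: h[1])
```

Returning offsets along with the type lets a later step choose a different anonymization method per identifier.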

Select anonymization method per identifier

Match techniques to identifiers: Replace for names and dates, Redact for geographic data below state level, Mask for SSNs in patient-facing contexts, Encrypt for data that authorized staff must later access.

Deploy automated detection and anonymization

Integrate the anonymization engine into EHR export pipelines, research data repositories, and inter-organizational data sharing workflows. anonymize.solutions HIPAA preset covers all 18 identifiers automatically.

Configure role-based encryption keys

Set up department-specific encryption keys so that radiology, billing, clinical research, and administration each see only the PHI relevant to their function.

Establish audit trails

Log every de-identification operation: what was detected, what method was applied, who initiated it, and when. These logs are essential for demonstrating HIPAA compliance during audits by the HHS Office for Civil Rights (OCR).
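One way to structure such a log entry is sketched below. The field names and the JSON-lines format are assumptions, not a required schema. The record deliberately stores the identifier type rather than the detected value, so the log does not become a second copy of the PHI.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    entity_type: str   # what was detected, e.g. "SSN"
    method: str        # technique applied, e.g. "REDACT"
    initiated_by: str  # user or service that ran the operation
    timestamp: str     # ISO 8601 timestamp in UTC


def make_record(entity_type: str, method: str, initiated_by: str) -> AuditRecord:
    """Build a record stamped with the current UTC time."""
    now = datetime.now(timezone.utc).isoformat()
    return AuditRecord(entity_type, method, initiated_by, now)


def to_json_line(record: AuditRecord) -> str:
    """Serialize one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```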

Conduct re-identification risk assessment

Verify that de-identified datasets cannot be re-identified using reasonably available means. Test against known attack vectors: linkage attacks, inference attacks, and reconstruction attacks.
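A simple first screen for linkage risk is a k-anonymity check: count how many records share each combination of quasi-identifiers and look at the smallest group. This technique is not named in the text above and is offered only as one possible starting point. The function name and column names are hypothetical, and what counts as an acceptable k is a policy decision the sketch does not answer.

```python
from collections import Counter


def smallest_group_size(rows: list[dict], quasi_identifiers: list[str]) -> int:
    """Return k, the size of the smallest group of rows sharing the same
    quasi-identifier values. k == 1 means at least one record is unique."""
    if not rows:
        raise ValueError("no rows to assess")
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())
```

A result of 1 flags records that are unique on those columns and therefore the easiest to link against an outside dataset.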

Train staff and schedule reviews

Ensure all staff handling PHI understand de-identification procedures. Schedule annual reviews of de-identification effectiveness as new data sources and technologies emerge.

Our Platform

How anonymize.solutions Helps With HIPAA

Purpose-built infrastructure for healthcare data de-identification. Every feature is designed to simplify Safe Harbor compliance and reduce implementation time.

HIPAA Preset

Pre-configured detection for all 18 PHI identifier types. Names, dates, SSNs, medical record numbers, health plan IDs, device identifiers, IP addresses, and more — all mapped to appropriate anonymization methods.

Dual Detection Engines

NLP Engine detects names, dates, and contextual health data. Pattern Engine validates structured identifiers (SSNs, medical record numbers, phone numbers) with checksum validation. Hybrid mode combines both for maximum accuracy.

Audit Trail

Complete processing logs for every de-identification operation. Entity type, method applied, confidence score, timestamp — everything needed for OCR compliance audits and internal governance.

Zero-Knowledge

We never see your patient data. Password-derived encryption with Argon2id means only mathematical proofs are transmitted. Even our team cannot access PHI — the strongest data minimisation guarantee for healthcare.

EU Hosting

100% Hetzner Germany infrastructure. No data leaves the European Union. For US-based organizations, Self-Managed deployment runs on your own HIPAA-compliant infrastructure.

Role-Based Keys

Department-specific encryption keys. Radiology sees imaging data, billing sees financial data, researchers see de-identified records only. Granular access control built into the anonymization layer.

Cross-Regulation

Healthcare organisations operating internationally often need to comply with both HIPAA (US) and GDPR (EU). The two frameworks overlap in purpose but differ in scope, definitions, and enforcement.

Comparison of HIPAA and GDPR across key compliance dimensions
Dimension	HIPAA (US)	GDPR (EU)
Scope	PHI held by covered entities and business associates	Personal data of individuals in the EU, in any sector
Protected Data	Individually identifiable health information (PHI); Safe Harbor lists 18 identifier types	Any information relating to an identified or identifiable person
De-Identification Standard	Safe Harbor (remove 18 identifiers) or Expert Determination	Recital 26 “means reasonably likely” test
Penalties	$100–$50,000 per violation, up to $1.5M per year	Up to €20M or 4% of global annual turnover, whichever is higher
Effect of De-Identification	De-identified data is no longer PHI	Anonymized data falls outside GDPR scope entirely
Enforcement	HHS Office for Civil Rights (OCR)	National Data Protection Authorities (DPAs)

anonymize.solutions supports both HIPAA and GDPR compliance. Our HIPAA preset covers all 18 PHI identifiers, while the GDPR preset covers the broader personal data categories. For organisations subject to both, use the combined preset to satisfy both frameworks simultaneously.

Implement HIPAA-compliant de-identification today

From the 18 PHI identifiers to automated Safe Harbor compliance — we provide the tools, presets, and infrastructure to make your healthcare data protection production-ready.

