PHI vs PII: What Every Compliance Officer Needs to Know in 2026
by VIDIZMO Team, Last updated: April 1, 2026

Protected Health Information (PHI) and Personally Identifiable Information (PII) are governed by different laws, carry different penalties, and demand different handling protocols. Confusing the two exposes organizations to regulatory fines, data breaches, and loss of public trust.
The stakes are real. The U.S. Department of Health and Human Services reported over 700 major healthcare data breaches in 2024 alone, each affecting 500 or more individuals. Many involved records that qualified as both PHI and PII. Organizations across healthcare, law enforcement, insurance, and legal sectors encounter both data types regularly, often in the same file or the same piece of digital evidence.
Getting the classification right determines which regulations apply, which safeguards you need, who can access what, and how long you retain it.
Key Takeaways
- PHI is health information linked to an individual, governed by HIPAA. PII is any data that can identify a person, covered by multiple federal and state laws.
- All PHI contains PII, but not all PII qualifies as PHI. Healthcare context is what elevates PII to PHI status.
- Organizations handling both need distinct workflows for detection, redaction, access control, and retention.
- Misclassifying PHI as PII can trigger HIPAA penalties up to $2.13 million per violation category per year.
- Automated detection tools can flag both data types across video, audio, and documents before human review.
What Is PII and Why Does It Require Protection?
Personally Identifiable Information (PII) is any data that can be used, alone or combined with other information, to identify a specific individual. The U.S. Government Accountability Office (GAO) defines PII broadly, and the definition varies slightly across agencies and regulations.
PII falls into two categories:
Directly Identifiable PII
This data identifies someone on its own, without needing additional context:
- Full name
- Social Security number (SSN)
- Driver's license number
- Passport number
- Biometric data (fingerprints, facial geometry, retinal scans)
- Email address tied to a real name
Indirectly Identifiable PII
This data can't identify someone alone but becomes identifying when combined with other data points:
- Date of birth
- ZIP code
- Gender
- Race or ethnicity
- Employment information
- Phone number (in some contexts)
Research from Latanya Sweeney of Harvard's Data Privacy Lab estimated that 87% of the U.S. population can be uniquely identified using just three indirect identifiers: five-digit ZIP code, full date of birth, and gender. That single finding reshaped how privacy professionals think about "non-sensitive" data.
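To make the re-identification risk concrete, a minimal sketch can count how many records in a dataset are unique on the combination of ZIP code, birth date, and gender. The sample records, field names, and the `unique_fraction` helper are invented for illustration; this is not Sweeney's method or data.

```python
from collections import Counter

# Hypothetical sample records; field names and values are illustrative only.
records = [
    {"zip": "02138", "dob": "1961-07-14", "gender": "F"},
    {"zip": "02138", "dob": "1961-07-14", "gender": "M"},
    {"zip": "02139", "dob": "1985-03-02", "gender": "F"},
    {"zip": "02139", "dob": "1985-03-02", "gender": "F"},
    {"zip": "02140", "dob": "1990-11-30", "gender": "M"},
]

QUASI_IDENTIFIERS = ("zip", "dob", "gender")


def unique_fraction(rows, keys=QUASI_IDENTIFIERS):
    """Return the share of rows whose quasi-identifier combination is unique."""
    if not rows:
        return 0.0
    combos = Counter(tuple(row[k] for k in keys) for row in rows)
    unique_rows = sum(
        1 for row in rows if combos[tuple(row[k] for k in keys)] == 1
    )
    return unique_rows / len(rows)
```

In this toy dataset three of the five records are unique on all three fields, while only one is unique on ZIP code alone. Combining fields is what makes "non-sensitive" data identifying.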
No single law governs PII. Instead, a patchwork of federal statutes (the Privacy Act of 1974, FISMA, COPPA) and state laws (CCPA/CPRA in California, the Colorado Privacy Act, Virginia's CDPA) covers it. The rules that apply depend on your industry, the data subject's location, and how the data was collected.
What Is PHI and How Does HIPAA Define It?
Protected Health Information (PHI) is individually identifiable health information that is created, received, maintained, or transmitted by a covered entity or business associate. The HHS Office for Civil Rights specifies that PHI includes any data in a medical record that can identify an individual and was created during healthcare delivery.
Three conditions must all be true for data to qualify as PHI:
- It relates to a person's past, present, or future physical or mental health condition, healthcare services, or payment for healthcare.
- It identifies the individual (or provides a reasonable basis for identification).
- It is held by a covered entity (healthcare provider, health plan, healthcare clearinghouse) or a business associate.
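The three-part test above can be sketched as a simple check. The `Record` fields, the holder labels, and the `classify` helper are assumptions made for this example; a real classifier would have to infer these facts from the document and its source rather than receive them as flags.

```python
from dataclasses import dataclass

# Holder types named in the text; the string labels are this sketch's own.
HIPAA_HOLDERS = {
    "healthcare_provider",
    "health_plan",
    "healthcare_clearinghouse",
    "business_associate",
}


@dataclass
class Record:
    relates_to_health: bool      # condition 1: health condition, care, or payment
    identifies_individual: bool  # condition 2: identifies or could identify
    holder_type: str             # condition 3: who holds the data


def is_phi(record: Record) -> bool:
    """True only when all three conditions hold at once."""
    return (
        record.relates_to_health
        and record.identifies_individual
        and record.holder_type in HIPAA_HOLDERS
    )


def classify(record: Record) -> str:
    """Label a record as PHI, PII, or neither."""
    if is_phi(record):
        return "PHI"
    return "PII" if record.identifies_individual else "not personal data"
```

For instance, `classify(Record(True, True, "healthcare_provider"))` yields "PHI", while the same identifying data held by a gym (`Record(False, True, "gym")`) yields "PII".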
The 18 HIPAA Identifiers
HIPAA's Privacy Rule defines 18 specific identifiers that make health information "individually identifiable." Removing all 18 is one path to de-identification (the Safe Harbor method).
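- Names
- Geographic subdivisions smaller than a state (street address, city, county, ZIP code)
- All elements of dates other than the year (birth, admission, discharge, death) and ages over 89
- Telephone numbers
- Fax numbers
- Email addresses
- Social Security numbers
- Medical record numbers
- Health plan beneficiary numbers
- Account numbers
- Certificate and license numbers
- Vehicle identifiers and serial numbers, including license plates
- Device identifiers and serial numbers
- Web URLs
- IP addresses
- Biometric identifiers, including fingerprints and voiceprints
- Full-face photographs and comparable images
- Any other unique identifying number, characteristic, or code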
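Remove all 18 from a health record, and, provided the holder has no actual knowledge that the remaining data could identify someone, it's no longer PHI under the Safe Harbor standard. Leave even one, and the record stays PHI unless it is de-identified through the Expert Determination method instead.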
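A rough sketch of Safe Harbor-style stripping on a structured record follows. The field names in `SAFE_HARBOR_FIELDS` are hypothetical stand-ins for the identifier categories, and the sketch assumes a flat dictionary. Real records also hide identifiers in free text, images, and audio, which simple field removal does not catch.

```python
# Hypothetical field names, one or more per Safe Harbor identifier category.
SAFE_HARBOR_FIELDS = {
    "name", "street_address", "city", "zip_code", "birth_date",
    "admission_date", "phone", "fax", "email", "ssn",
    "medical_record_number", "health_plan_id", "account_number",
    "license_number", "vehicle_id", "device_serial", "url",
    "ip_address", "biometric_id", "photo", "other_unique_id",
}


def strip_identifiers(record: dict) -> dict:
    """Return a copy of the record without any listed identifier field."""
    return {k: v for k, v in record.items() if k not in SAFE_HARBOR_FIELDS}


def remaining_identifiers(record: dict) -> set:
    """List identifier fields still present, to check before release."""
    return SAFE_HARBOR_FIELDS & set(record)
```

Running `remaining_identifiers` on the output of `strip_identifiers` gives a quick pre-release check that no listed field slipped through. It does not replace the "no actual knowledge" judgment that Safe Harbor also requires.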
How Does PHI Overlap with PII?
Every piece of PHI contains PII by definition. A medical record with a patient's name, diagnosis, and treatment plan includes PII (the name) and health data (the diagnosis). Together, they form PHI. But a name on a gym membership form? That's PII only. No covered entity, no health treatment context.
The overlap creates confusion because the same data element can be PII in one context and PHI in another. Consider a Social Security number:
- On a tax return: PII, governed by IRS regulations and the Privacy Act.
- On a hospital intake form: PHI, governed by HIPAA.
- In a law enforcement report referencing a suspect's medical treatment: Potentially both, depending on the source and the entity holding it.
This contextual nature is why classification can't be done with simple keyword matching. The same field in a database carries different regulatory weight depending on who created the record, why it was created, and who holds it now.
Why Does Misclassifying PHI and PII Lead to Penalties?
Getting the classification wrong isn't theoretical. It triggers specific, measurable consequences.
HIPAA violations carry a tiered penalty structure that HHS adjusts annually for inflation. Under the adjustment in effect at the start of 2024, the tiers were:
- Tier 1 (Did not know): $137 to $68,928 per violation
- Tier 2 (Reasonable cause, not willful neglect): $1,379 to $68,928 per violation
- Tier 3 (Willful neglect, corrected within 30 days): $13,785 to $68,928 per violation
- Tier 4 (Willful neglect, not corrected): $68,928 to $2,067,813 per violation
Under that adjustment, each tier carried an annual maximum of $2,067,813 per violation category. HHS's August 2024 adjustment raised the cap to $2,134,831, which is the source of the roughly $2.13 million figure used elsewhere in this article.
PII breaches carry different penalties depending on the applicable law. California's CCPA allows statutory damages of $100 to $750 per consumer per incident. The FTC has pursued enforcement actions with penalties exceeding $5 million for PII mishandling.
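The arithmetic behind both penalty regimes can be sketched in a few lines. The dollar figures are the ones quoted in this section; the function names are this example's own, and actual penalties depend on the regulator's or court's assessment rather than a formula.

```python
HIPAA_ANNUAL_CAP = 2_067_813   # annual maximum per violation category, from the text
CCPA_MIN, CCPA_MAX = 100, 750  # statutory damages per consumer per incident


def hipaa_exposure(violations: int, per_violation: int) -> int:
    """Total for one violation category in one year, limited by the annual cap."""
    return min(violations * per_violation, HIPAA_ANNUAL_CAP)


def ccpa_exposure(consumers: int) -> tuple[int, int]:
    """Low and high statutory-damage totals for one incident."""
    return consumers * CCPA_MIN, consumers * CCPA_MAX
```

Ten Tier 2 violations at the $1,379 minimum total $13,790, while fifty violations at $68,928 each would exceed the cap and are limited to $2,067,813. A single incident affecting 10,000 California consumers carries statutory damages between $1 million and $7.5 million.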
Here's the practical risk: if an organization treats PHI as ordinary PII, it may apply insufficient safeguards. Standard PII protections like encryption and access controls are necessary but not sufficient for PHI. HIPAA requires specific administrative, physical, and technical safeguards, plus documentation, training, business associate agreements, and breach notification without unreasonable delay and no later than 60 days after discovery.
What Are Real-World Examples of PHI vs PII?
Body-Worn Camera Footage
A police officer's body camera records a domestic violence call. The victim's face is PII (biometric identifier). A visible prescription bottle is potential PHI. The officer reading out the victim's name and address over dispatch is PII. If the camera then records a hospital intake, that portion likely contains PHI. A single piece of footage can carry both data types, each requiring different protection before it is shared. Learn how video and audio redaction addresses these challenges in law enforcement evidence workflows.
Insurance Claim File
An auto insurance claim begins with PII: the claimant's name, policy number, and vehicle information. Once the adjuster adds medical records from the treating physician, those records become PHI. The entire claim file now contains both data types under one roof, with different handling requirements for each.
Employee Wellness Program
An HR database stores employee names, addresses, and job titles, all PII. When health screening results from a healthcare provider are added, those results become PHI. The HR system now needs HIPAA-level protections for the health data, even though the PII sitting alongside it does not.
Court Evidence
A prosecutor subpoenas medical records for a criminal trial. Those records are PHI when held by the hospital. Once part of the court record, HIPAA treatment depends on whether the court is a covered entity (it typically is not). But the PII within those records still requires protection under other applicable laws.
Which Compliance Frameworks Govern Each Data Type?
PHI and PII sit under different regulatory umbrellas, though those umbrellas sometimes overlap. Knowing which framework applies determines your safeguard requirements, breach notification timeline, and documentation obligations.

Organizations in healthcare, law enforcement, and insurance often fall under multiple frameworks at once. A hospital's security camera system, for example, may need to comply with HIPAA (patient areas), state privacy law (employee areas), and potentially CJIS requirements if footage is shared with law enforcement during investigations.
Building a Data Protection Workflow
Protecting PHI and PII requires a layered approach. No single control addresses both data types across all contexts.
Step 1: Classify at Ingestion
Every file entering your system needs classification. Is it PII? PHI? Both? That answer determines which retention rules, access controls, and redaction policies apply. Manual classification does not scale at thousands of records daily. Automated detection that scans for the 18 HIPAA identifiers and common PII patterns catches what human reviewers miss. For a deeper look at how digital evidence management systems handle this at scale, see our 2026 agency guide.
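As a rough illustration of the pattern-scanning part of this step, the sketch below flags candidate identifiers in text with regular expressions. The patterns are deliberately simplified assumptions and do not represent how any particular product detects PII. A hit only marks a candidate; whether it counts as PII or PHI still depends on the context of the record.

```python
import re

# Simplified patterns for illustration; production detectors need far more
# than regular expressions (context, validation, OCR and speech transcripts).
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "us_phone": re.compile(r"\b\d{3}[ .-]\d{3}[ .-]\d{4}\b"),
}


def detect(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for every pattern hit, in text order."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), label, match.group()))
    return [(label, value) for _, label, value in sorted(hits)]
```

Given a line such as "SSN 123-45-6789, reachable at jane.roe@example.com or 555-867-5309", `detect` returns one hit per identifier, which a reviewer or downstream classifier can then evaluate.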
Step 2: Apply Access Controls
PII typically requires role-based access. PHI requires more: the HIPAA "minimum necessary" standard, where each user accesses only the specific PHI needed for their job. Granular permissions are essential.
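One way to picture "minimum necessary" is field-level filtering on top of roles. The roles, field names, and `view_for` helper below are hypothetical; a real policy comes from your own minimum-necessary analysis.

```python
# Hypothetical roles and field lists, for illustration only.
ROLE_FIELDS = {
    "billing_clerk": {"patient_name", "insurance_id", "amount_due"},
    "nurse": {"patient_name", "diagnosis", "medications"},
    "records_auditor": {"insurance_id"},
}


def view_for(role: str, record: dict) -> dict:
    """Return only the fields the role may see; unknown roles see nothing."""
    allowed = ROLE_FIELDS.get(role, set())
    return {field: value for field, value in record.items() if field in allowed}
```

The deny-by-default lookup (`ROLE_FIELDS.get(role, set())`) is a deliberate choice: a role missing from the policy gets an empty view rather than the full record.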
Step 3: Redact Before Sharing
When records leave your organization through public records requests, legal discovery, or inter-agency sharing, both data types must be identified and redacted. Video evidence is particularly complex. PII appears visually (faces, license plates) and audibly (spoken names, SSNs). Audio can also carry PHI, such as a person describing their medical condition in a recorded interview. Automated redaction tools can handle both detection and removal across formats simultaneously.
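For spoken content, redaction usually comes down to deciding which time ranges to mute. The sketch below merges flagged word timings into mute ranges. The input format (start, end, sensitive flag) and the padding default are assumptions about what a speech-to-text engine with word timestamps might provide, not a description of any specific tool.

```python
def mute_intervals(words, padding=0.15):
    """Merge flagged word timings into padded, non-overlapping mute ranges.

    `words` is a list of (start_sec, end_sec, is_sensitive) tuples
    (an assumed format for this sketch).
    """
    ranges = sorted(
        (max(0.0, start - padding), end + padding)
        for start, end, sensitive in words
        if sensitive
    )
    merged = []
    for start, end in ranges:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Merging adjacent ranges avoids choppy audio when a full name or a nine-digit number spans several consecutive words, and the padding covers small timestamp inaccuracies.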
Step 4: Log Everything
HIPAA and CJIS both require audit trails. Every access, modification, share, and export should be logged with timestamps, user identity, and the specific data accessed. These logs demonstrate compliance during audits and provide evidence in breach investigations.
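A minimal sketch of such a log entry follows, recording timestamp, user, action, and resource. The hash chain that links each entry to the previous one is this example's own addition to make tampering detectable. It is one possible design, not something HIPAA or CJIS prescribes, and the function names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def append_entry(log: list, user: str, action: str, resource: str) -> dict:
    """Append an audit entry whose hash covers its content and the previous hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "resource": resource,
        "prev_hash": prev_hash,
    }
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    entry["hash"] = hashlib.sha256(payload).hexdigest()
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = "0" * 64
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        if entry["prev_hash"] != prev_hash:
            return False
        if hashlib.sha256(payload).hexdigest() != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash depends on the one before it, changing an old entry invalidates every later entry, so `verify_chain` can surface edits during an audit.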
Step 5: Enforce Retention and Disposal
HIPAA requires covered entities to retain compliance documentation, such as policies, procedures, and risk assessments, for six years; retention periods for the medical records themselves are set by state law and vary by state. PII retention depends on the applicable regulation and your data minimization policy. Disposal must be verifiable, whether through cryptographic erasure aligned with NIST SP 800-88 or physical destruction.
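Retention rules translate naturally into a date calculation. In the sketch below, only the six-year HIPAA figure comes from the text; the other categories and periods are placeholders, and the function names are hypothetical. It also assumes the clock starts at the creation date, which may not match the trigger your governing rule uses.

```python
from datetime import date

# Retention periods in years. Only the six-year HIPAA figure comes from the
# text; the other values are placeholders to replace with your own policy.
RETENTION_YEARS = {
    "hipaa": 6,
    "pii_marketing": 2,   # hypothetical
    "pii_employment": 4,  # hypothetical
}


def disposal_date(created: date, category: str) -> date:
    """Earliest date a record in this category may be disposed of."""
    years = RETENTION_YEARS[category]
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # Feb 29 in a non-leap target year: fall forward to Mar 1.
        return date(created.year + years, 3, 1)


def is_due(created: date, category: str, today: date) -> bool:
    """True once the retention period for the record has fully elapsed."""
    return today >= disposal_date(created, category)
```

Passing `today` explicitly rather than reading the system clock inside the function keeps the check reproducible, which helps when you need to show an auditor why a record was or was not disposed of on a given date.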
How VIDIZMO Redactor Handles PHI and PII
VIDIZMO Redactor detects and redacts both PHI and PII across video, audio, documents, and images in a single platform. It is purpose-built for organizations in healthcare, law enforcement, insurance, and legal sectors.
For PII, Redactor automatically identifies 40+ entity types across all content formats. Faces, license plates, on-screen text, and signatures are redacted with frame-by-frame precision. Spoken PII, including names, SSNs, and addresses, is detected across 82+ languages and muted or bleeped automatically. Teams that need to target specific categories can use selective PII redaction without over-redacting surrounding content.
For PHI, Redactor supports HIPAA-compliant workflows with AES-256 encryption at rest, TLS 1.2+ in transit, role-based access controls, MFA, and configurable retention policies. Every redaction action is logged in a full audit trail for compliance documentation. PDFs, DOCX, XLSX, PPTX, and scanned images are processed natively through OCR, so no separate tool is needed when PHI or PII appears in documents alongside video or audio.
Redactor deploys as SaaS, on-premises, government cloud, or hybrid to match your data sovereignty requirements.
Start your free trial to see how automated PII and PHI detection works across your evidence workflows.
Is SSN Considered PHI or PII?
A Social Security number is always PII. It becomes PHI only when it appears in a healthcare context held by a covered entity or business associate. This is one of the most common classification questions compliance teams face, and the answer is always context-dependent.
An SSN on a hospital registration form is both PII and PHI. The same SSN on a job application is PII only. On a law enforcement report, it's PII governed by CJIS and applicable state laws. The number itself doesn't change. The regulatory treatment changes based on who holds it and why.
This contextual sensitivity is why automated classification systems need more than pattern matching. They need to account for the document type, the holding entity, and the data's relationship to healthcare delivery.
Frequently Asked Questions
Are PHI and PII the same thing?
No. PHI is a subset of PII that specifically relates to health information held by HIPAA-covered entities. All PHI contains PII (the identifying elements), but most PII has nothing to do with healthcare and isn't governed by HIPAA. A person's name is PII; that same name on a medical record alongside a diagnosis is PHI.
What are the four types of PHI?
PHI exists in four primary forms: written (paper medical records, printed lab results), electronic (ePHI, including digital health records and scanned documents), oral (spoken health information during consultations or phone calls), and visual (medical imaging, video recordings of patient interactions). HIPAA's Security Rule specifically addresses ePHI, while the Privacy Rule covers all four forms.
Is PHI considered HIPAA?
PHI is the category of data that HIPAA protects. HIPAA is the law; PHI is the data type that law governs. If your organization creates, receives, stores, or transmits PHI, the Privacy Rule, Security Rule, and Breach Notification Rule all apply. Organizations that don't handle PHI aren't subject to HIPAA, even if they handle large volumes of PII.
Can video evidence contain both PHI and PII simultaneously?
Yes. Body-worn camera footage, surveillance video from healthcare facilities, and recorded insurance interviews frequently contain both. A single video might show a person's face (PII), capture their spoken name and address (PII), and record discussion of their medical condition or treatment (PHI). Each data type requires its own protection level before the evidence is shared, making automated visual and spoken redaction critical for compliance.
What happens if an organization treats PHI as regular PII?
Treating PHI as ordinary PII means applying insufficient safeguards. HIPAA requires specific administrative, physical, and technical controls beyond standard PII protections, including business associate agreements, workforce training, minimum necessary access standards, and 60-day breach notification. Failure to meet these requirements can result in penalties up to $2.13 million per violation category annually, plus potential criminal charges for knowing violations.
How do you de-identify PHI under HIPAA?
HIPAA provides two methods. The Safe Harbor method requires removing all 18 specified identifiers and confirming that remaining data can't reasonably identify an individual. The Expert Determination method uses a qualified statistician to certify that re-identification risk is very small. Once properly de-identified, the data is no longer PHI and HIPAA protections no longer apply. Organizations managing digital evidence at scale often use automated tools to strip identifiers from records before sharing.
Does VIDIZMO DEMS support HIPAA-compliant evidence management?
VIDIZMO DEMS supports HIPAA-compliant deployments through AES-256 encryption at rest, TLS 1.2+ encryption in transit, role-based access control with minimum necessary enforcement, audit logging in tamper-proof WORM storage, and automated PII/PHI detection and redaction. The platform deploys on Azure Government Cloud infrastructure that maintains its own HIPAA compliance posture, giving organizations a compliant foundation for managing evidence containing protected health information.
Protect Sensitive Data Across Every Evidence Type
The line between PHI and PII isn't always obvious, especially when your organization handles digital evidence from body cameras, surveillance systems, interview recordings, and insurance claim files. Getting the classification right is the first step. Building automated workflows that detect, protect, and redact both data types is what actually reduces your risk exposure.
Whether you're managing evidence in law enforcement, processing healthcare-related records, or handling insurance claims that cross into medical territory, the right platform eliminates guesswork and keeps your organization on the correct side of HIPAA, CJIS, and state privacy laws.