AI and HIPAA: Using Patient Data Without Breaking the Rules

You cannot build useful medical AI without medical data, and medical data is some of the most regulated information there is. Here is how those two facts are reconciled, from a security perspective.

By Sajed Khan, May 12, 2026

This is not legal advice, and the specifics of any project should go through people who do compliance for a living. But the principles are not mysterious, and they shaped how I approached the privacy side of our patent.

What the rules actually care about

The point of health privacy rules is simple even when the regulations are dense. Protected health information should only be used for legitimate purposes, only by people who should see it, only as much as is needed, and with safeguards that keep it from leaking. Strip away the legalese and you get four ideas: legitimacy, least access, minimum necessary, and protection.

Most failures are not exotic. They are a researcher copying a dataset to a laptop, a storage bucket left open, an account with far more access than it needed. The rules exist because the careless path is so easy.

How you do AI without exposing patients

The strongest move is to work as far from raw, identifiable data as you can. Where possible, data is de-identified before it is used. Beyond that, a system can operate on transformed representations of the data rather than the raw records, which is part of why our design converts scans into privacy-preserving representations instead of passing the original images around.
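To make the de-identification step concrete, here is a minimal sketch in Python. It drops direct identifiers and coarsens a birth date to a year before a record goes anywhere near a training pipeline. The field names (`name`, `mrn`, `birth_date`, and so on) and the identifier list are assumptions for illustration, not our schema, and nothing here claims to meet any formal de-identification standard. That determination belongs with compliance.

```python
from datetime import date

# Hypothetical field names, for illustration only. A real list of
# identifiers would come out of a compliance review, not a blog post.
DIRECT_IDENTIFIERS = frozenset({"name", "address", "phone", "email", "mrn"})


def strip_identifiers(record: dict) -> dict:
    """Return a copy of the record with direct identifiers removed
    and the birth date reduced to a year. The input is not modified."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    birth = cleaned.pop("birth_date", None)
    if birth is not None:
        cleaned["birth_year"] = birth.year
    return cleaned


record = {
    "name": "Jane Doe",
    "mrn": "12345",
    "birth_date": date(1980, 4, 2),
    "diagnosis_code": "E11.9",
}
cleaned = strip_identifiers(record)
```

The function returns a new dictionary rather than editing the record in place, so the identifiable original never has to travel with the cleaned copy.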

Then you apply the unglamorous controls that actually prevent breaches. Encrypt the data. Control access by identity, and grant only the minimum each person or process needs. Assume nothing is trusted just because it is inside the system. And log every access, so that if a question is ever asked, you can answer it with a record instead of a guess.
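Two of those controls, minimum access and logging, can be sketched in a few lines of Python. The roles, action names, and the `authorize` helper below are all hypothetical. The point is only the shape: deny by default, and record every attempt whether or not it succeeds. Encryption is left out because it depends on the storage platform and key management in use.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical role-to-permission map. Anything not listed is denied.
ROLE_PERMISSIONS = {
    "radiologist": {"read_scan"},
    "ml_pipeline": {"read_deidentified"},
}


@dataclass
class AuditLog:
    """In-memory stand-in for an append-only audit store."""
    entries: list = field(default_factory=list)

    def record(self, actor: str, action: str, resource: str, allowed: bool) -> None:
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "action": action,
            "resource": resource,
            "allowed": allowed,
        })


def authorize(role: str, actor: str, action: str, resource: str, log: AuditLog) -> bool:
    """Allow only explicitly granted actions, and log every attempt."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    log.record(actor, action, resource, allowed)
    return allowed


log = AuditLog()
ok = authorize("ml_pipeline", "train-job-7", "read_deidentified", "dataset-a", log)
denied = authorize("ml_pipeline", "train-job-7", "read_scan", "scan-001", log)
```

The denied request is logged too. If a question is ever asked, the failed attempts are often the more interesting half of the record.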

Why this is a design choice, not a checkbox

The mistake I have watched teams make is treating compliance as paperwork to finish at the end. By then the architecture has already decided whether you are safe, and usually it has decided badly. Privacy has to be a constraint you build around from the first day, the same way you would design around any other hard requirement.

That is the conviction behind the security side of our work, and there is more on the patents page. Done right, you get the benefit of the data without putting the patient at risk. Done as an afterthought, you get a breach with your name on it.

FAQ

Can you use patient data to train AI under HIPAA?

Yes, with proper safeguards: de-identification or patient authorization, access limited to the minimum each person or process needs, encryption, and audit controls. Many systems reduce risk further by working from de-identified or transformed representations rather than raw records.