CureDAO

Privacy

To protect privacy, CureDAO will de-identify data and use obfuscated but equivalent synthetic data derived from actual patient data.


Last updated 1 year ago



The Health Insurance Portability and Accountability Act of 1996 (“HIPAA”) protects the privacy of patients and sets forth guidelines on how this private health information can be shared. Though the privacy of a patient must be protected, the legal right of a business to sell health information of patients has been upheld by the Supreme Court of the United States.

De-identification Methods

Data de-identification is the process of removing Personally Identifiable Information (PII), including an individual’s Protected Health Information (PHI), from any document or other media.
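As a rough illustration of removing identifiers from a document, the sketch below masks a few identifier types in free text with regular expressions. The patterns, labels, and the `redact` function are assumptions made for this example; they cover only simple US-style formats and are nowhere near sufficient for real PHI detection.

```python
import re

# Illustrative patterns only (assumed, US-style formats). Real PHI
# detection needs far broader coverage than three regular expressions.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(
        r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
    ),
}


def redact(text: str) -> str:
    """Replace each matched identifier with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


note = "Call 555-123-4567 or mail jane.doe@example.org, SSN 123-45-6789."
print(redact(note))
```

Pattern matching like this catches only well-formatted identifiers; names, addresses, and free-form dates need dictionaries or trained models, which is why dedicated tools exist.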

Safe Harbor Method of De-identification

The HIPAA Safe Harbor Method is a precise standard for the de-identification of personal health information when disclosed for secondary purposes. It requires the removal of 18 identifiers from a dataset:

  1. Names

  2. All geographical subdivisions smaller than a State, including street address, city, county, precinct, ZIP code, and their equivalent geocodes, except for the initial three digits of a ZIP code if, according to the current publicly available data from the Bureau of the Census:

    1. The geographic unit formed by combining all ZIP codes with the same three initial digits contains more than 20,000 people; and

    2. The initial three digits of a ZIP code for all such geographic units containing 20,000 or fewer people is changed to 000.

  3. All elements of dates (except year) for dates directly related to an individual, including birth date, admission date, discharge date, and date of death; and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older

  4. Phone numbers

  5. Fax numbers

  6. Electronic mail addresses

  7. Social Security numbers

  8. Medical record numbers

  9. Health plan beneficiary numbers

  10. Account numbers

  11. Certificate/license numbers

  12. Vehicle identifiers and serial numbers, including license plate numbers

  13. Device identifiers and serial numbers

  14. Web Universal Resource Locators (URLs)

  15. Internet Protocol (IP) address numbers

  16. Biometric identifiers, including finger and voice prints

  17. Full face photographic images and any comparable images; and

  18. Any other unique identifying number, characteristic, or code (note this does not mean the unique code assigned by the investigator to code the data).
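A few of the rules above (dropping direct identifiers, truncating ZIP codes, and aggregating ages over 89) can be sketched as a record-level transform. This is a minimal sketch, not a complete Safe Harbor implementation: the field names, the `safe_harbor` function, and the contents of `LOW_POPULATION_ZIP3` are hypothetical placeholders, and a real low-population prefix list must be derived from current Census data.

```python
from datetime import date

# Hypothetical field names for direct identifiers in a record.
DIRECT_IDENTIFIER_FIELDS = {
    "name", "phone", "fax", "email", "ssn",
    "medical_record_number", "ip_address",
}

# Placeholder 3-digit ZIP prefixes covering 20,000 or fewer people.
# A real list must come from current Census Bureau data.
LOW_POPULATION_ZIP3 = {"036", "059", "102"}


def age_on(birth_date: date, today: date) -> int:
    """Age in whole years on the given reference date."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (0 if had_birthday else 1)


def safe_harbor(record: dict, today: date) -> dict:
    """Apply a subset of the Safe Harbor rules to one record."""
    out = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIER_FIELDS and key not in ("zip", "birth_date")
    }
    # Rule 2: keep only the first three ZIP digits, or 000 for sparse areas.
    prefix = record["zip"][:3]
    out["zip3"] = "000" if prefix in LOW_POPULATION_ZIP3 else prefix
    # Rule 3: keep only the year; for ages over 89, suppress the year too.
    age = age_on(record["birth_date"], today)
    if age > 89:
        out["age"] = "90+"
        out["birth_year"] = None
    else:
        out["age"] = age
        out["birth_year"] = record["birth_date"].year
    return out


patient = {
    "name": "Jane Doe",
    "email": "j@example.org",
    "zip": "03601",
    "birth_date": date(1930, 5, 1),
    "diagnosis": "asthma",
}
print(safe_harbor(patient, date(2024, 1, 1)))
```

The sketch ignores free-text fields and the catch-all rule 18, both of which typically need manual review or dedicated tooling.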

Data De-identification Tools

1. ARX Data Anonymization Tool

2. deid software package

3. Synthetic Patient Generation


ARX is an open-source tool that anonymizes sensitive personal information. It supports a range of privacy and risk models, techniques for data transformation, and techniques to analyze the utility of output data.

The deid software package includes code and dictionaries that automatically locate and remove PHI in free text from medical records. It was developed using over 2,400 nursing notes that were methodically de-identified by a multi-pass process including various automated methods as well as reviews by multiple experts working independently.

Synthea is an open-source, synthetic patient generator that models the medical history of synthetic patients. Its mission is to provide high-quality, synthetic, realistic but not real, patient data and associated health records covering every aspect of healthcare. The resulting data is free from cost, privacy, and security restrictions, enabling research with Health IT data that is otherwise legally or practically unavailable.
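One widely used privacy model that anonymization tools of this kind support is k-anonymity: every combination of quasi-identifier values must be shared by at least k records. The sketch below measures k for a small table in plain Python. It does not use ARX's API, and the column names and the `k_anonymity` helper are assumptions made for illustration.

```python
from collections import Counter


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share the
    same values for all quasi-identifier columns (0 for an empty table)."""
    groups = Counter(
        tuple(row[column] for column in quasi_identifiers) for row in rows
    )
    return min(groups.values()) if groups else 0


# Hypothetical records with already-truncated ZIP codes.
rows = [
    {"zip3": "941", "birth_year": 1980, "diagnosis": "A"},
    {"zip3": "941", "birth_year": 1980, "diagnosis": "B"},
    {"zip3": "100", "birth_year": 1975, "diagnosis": "A"},
    {"zip3": "100", "birth_year": 1976, "diagnosis": "C"},
]

print(k_anonymity(rows, ["zip3", "birth_year"]))
print(k_anonymity(rows, ["zip3"]))
```

Here the table is only 1-anonymous on ZIP prefix plus birth year, because two records are unique on that pair. Dropping or coarsening birth year raises k to 2, which shows the trade-off between privacy and data utility that such tools help analyze.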


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.