Honest Broker System & Services of the RIS

Each division within the collaborative honest broker service should have established policies and procedures for its department-level activities related to the release of information and the confidentiality of protected health information (PHI), in accordance with system-wide UPMC and/or University policies. The Registry policy #AF03, Data Requests and Release of Information, is revised on a regular basis to ensure that all UPMC, University and federal policies are encompassed.

All data requests are tracked in a secure, web-based data request tool, regardless of whether the purpose is clinical or research related. The tool makes it possible to trend the use of Registry and other data sources, and it serves as the mechanism for ensuring compliance with IRB policies by tracking broker certification and IRB-approved/exempt projects. Standard reports are available within the tool. Ad-hoc queries are built using Crystal Reports.

Types of Requests

As outlined previously, the Registry receives various types of requests throughout the year. The "purpose" for each request is stored in the data request tracking system.

Hospital Operations – Requests for data to be used for clinical, quality improvement, process improvement, incidence, educational presentations/abstracts and marketing do not require IRB approval.

Preparatory for Research – In accordance with UPMC Policy #HS-EC1611, UPMC permits researchers to review PHI for the purpose of preparing a research hypothesis and protocol. The researcher must complete a Data Use Agreement for “Preparatory for Research” activities and abide by the agreement. This agreement is kept on file by the honest broker service. A copy is sent to the requestor.

Research involving Human Subjects – Generally, a covered entity may not use or disclose PHI for research purposes without patient consent if the research involves human subjects. Individual requests for PHI must include the full IRB proposal and the IRB approval letter before any data will be released. This documentation is kept on file.

Exempt Research – The IRB deems the following circumstances exempt from the research authorization requirement:

  • Use of De-Identified Health Information provided by a certified honest broker, “Safe Harbor”, no PHI
  • Limited Data Sets provided by a certified honest broker, modified “Safe Harbor” permitting state, city, full zip code and dates (Data Use Agreement required)
  • Research Limited to Deceased Patients (Data Use Agreement required)
  • IRB Waiver of Authorization Requirement (Data Use Agreement required)

De-Identification Process

    De-identification of PHI can be done manually by the honest broker, who strips the HIPAA-specified identifiers listed below and replaces them with a unique code for each case, or through the use of automated de-identification applications. The unique codes associated with each record must be maintained on a project-specific basis by the designated honest brokers.

  • full name
  • street address, city, county, precinct, complete zip code
  • all elements of dates (except year) directly related to an individual (date of birth, admission/discharge, death, etc.) and all ages over 89
  • telephone numbers
  • fax numbers
  • email addresses
  • social security numbers
  • medical record numbers
  • health plan numbers
  • account numbers
  • certificate/license numbers
  • vehicle identifiers (VIN, license plate numbers)
  • device identifiers and serial numbers
  • web Universal Resource Locators (URLs)
  • IP addresses
  • biometric identifiers

    When full de-identification of electronic text-based documents is required for a research project, the request is referred to the Center for Clinical Research Information Services (CRIS).

    CRIS

    CRIS is a jointly sponsored service of the Office of Clinical Research and the Center for Biomedical Informatics. CRIS is available for use by faculty in the Schools of the Health Sciences, University of Pittsburgh, and for UPMC special projects requiring de-identified datasets. CRIS is a certified honest broker with the University of Pittsburgh IRB and has a business associate agreement with UPMC. The policies and procedures of CRIS are posted on the Office of Clinical Research website.

    CRIS uses the De-ID© application developed by the Center for Biomedical Informatics at the University of Pittsburgh and licensed by the University to De-ID Data Corp, Philadelphia, PA. The De-ID application is used by the National Cancer Institute and other academic medical centers for various research applications.

    De-ID© uses a set of heuristics to identify the presence of any of the seventeen HIPAA-specified identifiers (images are not included) within electronically stored medical text. The De-ID© application has a configurable option for either Safe Harbor or Limited Data Sets. The downside of applying De-ID© is that a small amount of clinical information is removed during the de-identification process. In the work to date, only minor problems have been found with this approach. Most of the inappropriately de-identified text (over-markings) consists of (1) addresses that contain commonly used words (e.g., the "MI" in Lansing, MI is mistaken for the abbreviation for myocardial infarction), and (2) names that are medical terms but are not in the UMLS (e.g., Hickman catheter).

    De-ID©’s main process is to locate identifiable text, as defined by either the Safe Harbor method or the Limited Data Set. Identifiers are located in the text by firing a set of rules one sentence at a time. De-ID© replaces identifiable text with specific tags, so that every potential identifier removed from the text leaves a tag to hold its place. For example, when a telephone number is removed, the tag "**PHONE-NUMBER" is left in its place so that the reader can see that something was removed. Each tag begins with a double asterisk. Names found multiple times in a report are consistently replaced with the same tag to improve the readability of the report.
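The tag-replacement behaviour described above can be illustrated with a minimal Python sketch. This is not the De-ID© rule set, which is not given here. The phone-number pattern, the `**NAME[n]` tag format and the `replace_with_tags` helper are assumptions made for illustration; only the `**PHONE-NUMBER` tag and the double-asterisk prefix come from the description above.

```python
import re

# Assumed pattern for simple U.S.-style phone numbers (illustration only).
PHONE_RE = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")


def replace_with_tags(text: str, known_names: list[str]) -> str:
    """Replace phone numbers and the listed names with placeholder tags."""
    # Phone numbers are replaced by the tag quoted in the text above.
    text = PHONE_RE.sub(lambda _m: "**PHONE-NUMBER", text)

    # Each distinct name gets one tag, reused for every occurrence,
    # so the reader can still follow who is who in the report.
    name_tags: dict[str, str] = {}
    for name in known_names:
        # "**NAME[n]" is a hypothetical tag format, not De-ID's own.
        tag = name_tags.setdefault(name.lower(), f"**NAME[{len(name_tags) + 1}]")
        pattern = re.compile(rf"\b{re.escape(name)}\b", re.IGNORECASE)
        text = pattern.sub(lambda _m, t=tag: t, text)
    return text


report = "Dr. Smith called 412-555-0100. Smith will follow up with Jones."
print(replace_with_tags(report, ["Smith", "Jones"]))
# Dr. **NAME[1] called **PHONE-NUMBER. **NAME[1] will follow up with **NAME[2].
```

In this sketch the names to remove are passed in explicitly. A real system would have to find them itself, for example with dictionaries and rules.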

    Supplemental dictionaries of geographic locations, hospital names, and popular names found in the U.S. Census are used to locate identifiable text. The UMLS Metathesaurus is used to ensure that words or phrases that are medical terms are preserved.

    De-ID© automatically creates a linkage file when a dataset is processed. The linkage file is stored in an encrypted format and can be viewed only with the password given at the time of processing. The study identifier is a two-part code: part one is the number of the report for that patient, and part two is a unique 12-character alphanumeric code for that patient. This keeps the study ID consistent across data sets while allowing different admissions and/or multiple reports to be easily identified.
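A minimal Python sketch of the two-part study identifier might look like the following. The hyphen separator, the uppercase-plus-digits alphabet, and the `LinkageTable` class are assumptions for illustration, since the text does not specify the actual De-ID© format. Encryption of the linkage file is omitted here.

```python
import secrets
import string

# Assumed alphabet: uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits


class LinkageTable:
    """Hypothetical in-memory stand-in for the linkage file."""

    def __init__(self) -> None:
        self._codes: dict[str, str] = {}          # patient id -> 12-char code
        self._report_counts: dict[str, int] = {}  # patient id -> reports seen

    def _new_code(self) -> str:
        # Draw random codes until one is unused, so codes stay unique.
        while True:
            code = "".join(secrets.choice(ALPHABET) for _ in range(12))
            if code not in self._codes.values():
                return code

    def study_id(self, patient_id: str) -> str:
        """Return '<report number>-<patient code>' for the next report."""
        if patient_id not in self._codes:
            self._codes[patient_id] = self._new_code()
        count = self._report_counts.get(patient_id, 0) + 1
        self._report_counts[patient_id] = count
        return f"{count}-{self._codes[patient_id]}"


table = LinkageTable()
first = table.study_id("MRN-001")   # report 1 for this patient
second = table.study_id("MRN-001")  # report 2, same patient code
print(first, second)
```

Because the patient part of the code is reused, two reports for the same patient can be linked without exposing the original record number.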

    The Center for Biomedical Informatics (CBMI) performs formal evaluations of the De-ID© software. Five physicians are currently conducting an evaluation at UPMC Presbyterian. In addition, the Center for Pathology Informatics performed an independent evaluation of the De-ID© software last year.

    In addition, the output generated by De-ID© is briefly reviewed before it is released to the investigator. De-ID© is improved on a monthly basis, drawing on lessons learned from use of the application. The goal of the software is to have no under-markings (identifiable text remaining in the document) while minimizing over-markings (text that was de-identified but should not have been), so that the clinical content is not obscured.
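As an illustration of the two error types, the sketch below counts under-marked and over-marked characters from character-offset spans. The span representation and the `marking_errors` helper are assumptions for illustration. The text does not describe how the evaluations of De-ID© are actually scored.

```python
def covered(spans: list[tuple[int, int]]) -> set[int]:
    """Character positions covered by half-open (start, end) spans."""
    return {i for start, end in spans for i in range(start, end)}


def marking_errors(
    identifier_spans: list[tuple[int, int]],
    marked_spans: list[tuple[int, int]],
) -> dict[str, int]:
    """Compare reviewer-identified spans with spans the software marked."""
    truth = covered(identifier_spans)
    marked = covered(marked_spans)
    return {
        # Identifiable text that was left in the document.
        "under_marked_chars": len(truth - marked),
        # Text removed although it was not an identifier.
        "over_marked_chars": len(marked - truth),
    }


# A reviewer found identifiers at 0-5 and 10-14; the software marked 0-5 and 20-22.
print(marking_errors([(0, 5), (10, 14)], [(0, 5), (20, 22)]))
# {'under_marked_chars': 4, 'over_marked_chars': 2}
```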

    The table below shows the correspondence between the HIPAA regulations for de-identification and the current de-identification goals of De-ID©. A "Yes" in the second column indicates that De-ID© attempts to remove the type of information listed in the first column.

    HIPAA Information Item | Handling of the Information Item by De-ID©
    Names | Yes, for all types of names of individuals, including patients, family members, friends, and clinicians
    All geographic subdivisions smaller than a State | Removes addresses, ZIP codes, hospital names, company names, and names of towns and cities in the U.S.
    All elements of dates (except year) for dates directly related to an individual; patient age | Yes, replaces dates with relative dates and replaces ages in years with ages in deciles
    Telephone numbers | Yes
    Fax numbers | Yes
    Electronic mail addresses | Yes
    Social security numbers | Yes
    Medical record numbers | Yes
    Health plan beneficiary numbers | Yes*
    Account numbers | Yes*
    Certificate/license numbers | Yes*
    Vehicle identifiers and serial numbers | Yes*
    Device identifiers and serial numbers | Yes
    Web Universal Resource Locators (URLs) | Yes
    Internet Protocol (IP) address numbers | Yes
    Biometric identifiers, including finger and voice prints | Not applicable to text-only records
    Full face photographic images and any comparable images | Not applicable to text-only records
    Any other unique identifying number, characteristic, or code | Includes surgical pathology slide numbers and other types of case-specific codes

    * The de-identification occurs for those numbers that contain nine or more digits, any of which may be separated by hyphens.
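
The nine-or-more-digits rule in the footnote can be sketched in Python as follows. The `**ID-NUM` tag and the `mask_long_numbers` helper are hypothetical names used for illustration. Only the threshold of nine digits and the allowance for hyphens come from the footnote.

```python
import re

# A run of digit groups joined by single hyphens, e.g. "123-45-6789".
CANDIDATE_RE = re.compile(r"\d+(?:-\d+)*")


def mask_long_numbers(text: str, tag: str = "**ID-NUM", min_digits: int = 9) -> str:
    """Replace numbers that contain at least `min_digits` digits, ignoring hyphens."""

    def repl(match: re.Match) -> str:
        digit_count = sum(ch.isdigit() for ch in match.group())
        # Shorter numbers (room numbers, extensions) are left untouched.
        return tag if digit_count >= min_digits else match.group()

    return CANDIDATE_RE.sub(repl, text)


sample = "Policy 123-45-6789 and acct 1234567890, room 412, ext 55-0100."
print(mask_long_numbers(sample))
# Policy **ID-NUM and acct **ID-NUM, room 412, ext 55-0100.
```

A length-based rule like this is simple, but it misses shorter identifiers, which is consistent with the footnote applying only to the starred rows.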