410. Maintaining Data Confidentiality

Updated July 13, 2021

The IRB is responsible for evaluating proposed research to ensure adequate provisions to protect the privacy of participants and to maintain the confidentiality of data. Research involving human participants must include adequate provisions to maintain the confidentiality of research data. Maintaining confidentiality requires safeguarding the information that an individual has disclosed in a relationship of trust and with the expectation that it will not be disclosed to others without permission, except in ways that are consistent with the original disclosure. Confidentiality in the context of human research also refers to the investigator’s agreement with participants, when applicable (i.e., through participants’ informed consent), about how their identifiable private information will be handled, managed, and disseminated. Individuals may only be willing to share information for research purposes with an understanding that the information will remain protected from disclosure outside of the research setting or to unauthorized persons.

NOTE: For the purposes of this policy, the term "data" is used in the widest sense, and includes numeric data files, and qualitative materials such as interview transcripts, diaries, and field notes. Research data may include audio and video formats, geospatial information, biometrics, Web sites, and data archives (including those available online).

When possible, it is best to retain research data without any identifiers so that individual participation is anonymous and the data collected cannot be linked to the individual. Requirements for confidentiality protections apply to Protected Personally Identifiable Information (PPII) obtained:

  • preliminary to research (e.g., PPII is obtained from private records to assess eligibility or contact prospective participants);
  • during data collection, analysis, and dispensation; and
  • after study closure (if PPII is retained).

Researchers are responsible for:

  • abiding by the IRB-approved researcher-participant agreement for the collection and protection of research data, and
  • protecting participants from harms that may result from breaches of confidentiality (e.g., psychological distress, loss of insurance, loss of employment, or damage to social standing).

Protecting Data Confidentiality

Routine Precautions to Protect Confidentiality

Where anonymity is not possible, researchers should take steps to preserve the confidentiality of study participants and the data collected from them. Methods for keeping data confidential range from using routine precautions, such as substituting codes for participant identifiers and storing data in locked cabinets, to more elaborate procedures involving statistical methods (e.g., error inoculation) or data encryption. Consideration should be given to requirements for data security and retention throughout and following completion of the study. Methods for handling and storing data (including the use of personal computers and portable storage devices) must comply with University policies. Restricted data, including protected health information, must be encrypted if stored or used on portable devices, if removed from a secure university location, or if electronically transmitted. In most research, assuring confidentiality is only a matter of following some routine practices:

  • PPII are replaced with research identification codes (ID Codes). NOTE: Names and social security numbers may not be incorporated into or used for ID Codes.
  • Face sheets containing PPII are removed from completed survey instruments.
  • Access to master code lists or key codes is limited.
  • Master lists are stored separately from the data and destroyed as soon as reasonably possible.
  • Contact lists, recruitment records, or other documents that contain PPII are destroyed when no longer required for the research.
  • Files containing electronic data are password-protected and encrypted (at least when data are transferred or transported).
  • Research data/specimens are stored securely in locked cabinets or rooms.
  • Electronic data are stored in password-protected computers or files.
  • Files containing electronic data are closed when computers will be left unattended.
  • Consent and HIPAA authorization forms are stored securely in locked cabinets or rooms, separately from the research data.
  • Research staff are trained in the IRB-approved methods for managing and storing research data/specimens.
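
The coding practices above (replacing PPII with ID Codes and keeping the master list apart from the data) can be sketched in a few lines of Python. This is a minimal illustration, not a University-approved tool: the function name, field names, and the "P-" code format are all assumptions made for the example. The codes are drawn at random so that no name or social security number is incorporated into them.

```python
import secrets


def assign_id_codes(records, ppii_fields, code_bytes=4):
    """Split records into coded research data and a separate master code list.

    `records` is a list of dicts; `ppii_fields` names the keys treated as PPII.
    All names here are hypothetical and chosen only for this illustration.
    """
    coded_data = []
    master_list = {}
    for record in records:
        # Draw a random code so it carries no trace of a name or SSN.
        code = "P-" + secrets.token_hex(code_bytes).upper()
        while code in master_list:  # redraw on the rare collision
            code = "P-" + secrets.token_hex(code_bytes).upper()
        # The master list is the only place PPII and ID Code appear together.
        master_list[code] = {f: record[f] for f in ppii_fields if f in record}
        # The research dataset keeps the ID Code and the non-PPII fields only.
        coded = {k: v for k, v in record.items() if k not in ppii_fields}
        coded["id_code"] = code
        coded_data.append(coded)
    return coded_data, master_list


# Made-up example records; no real participant data.
participants = [
    {"name": "Example Person A", "phone": "555-0100", "age": 34, "score": 12},
    {"name": "Example Person B", "phone": "555-0101", "age": 58, "score": 7},
]
coded_data, master_list = assign_id_codes(participants, ["name", "phone"])
```

In this sketch `coded_data` and `master_list` would then be written to separate, access-restricted locations, and the master list destroyed as soon as reasonably possible.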

Considerations for Protecting Confidentiality During Data Collection

  • Inclusion of PPII: Will PPII be collected along with the data/specimens? What are the minimum PPII necessary to conduct the research?
  • Coding Data/Specimens: Will PPII be replaced with ID Codes when the data/specimens are collected/obtained (recommended)? If no, why not? If yes, will a master code list be used to link PPII with ID Codes? How will the confidentiality of the master code list be protected? Should numerical data be top- or bottom-coded?
  • Access to Clinic, Education, Program or Personnel Records for Research: How will researchers ensure only authorized persons access clinic or other private records that will be used for the research? How will researchers ensure confidentiality is maintained during the collection of private information from clinic or other records?
  • Electronic Records: How will researchers ensure electronic data are protected during data collection? Will participants completing online surveys be advised to close the browser to limit access to their responses?
  • Use of Translators or Interpreters: When data collection requires use of translators or interpreters who are not members of the research team, how will researchers ensure the confidentiality of the information collected?
  • In-person Interviews: What safeguards will be in place to maintain the confidentiality of data obtained through in-person interviews?
  • Focus Groups or Other Group Settings (schools, jails, clinics, and treatment centers): What protections will be in place to minimize the possibility that information shared in a group setting is disclosed outside of the group or for purposes other than those described in study documents?
  • Internet Research: How will researchers restrict access to survey responses during data collection (e.g., restricted access, data encryption, and virus and intruder protections)?
  • Data Collection via Mobile Applications (apps): What data will be collected? (Research data? Other data captured from the device the app is installed on?) Will the data being captured be identifiable? How will the data be obtained (e.g., data sent automatically from the app or device via the internet, or manual export of data)? Where will the data be stored and how? (Encryption utilized? University devices, firewalls, etc. utilized?) In case of a commercial app, what is the app’s privacy policy and will the app have access to the research data? Do participants need to be trained on how to use their mobile devices (e.g., how to adjust security features on the device, how to use encryption, how to use virtual private networks)? Does the app require usernames and passwords? (If yes, are they generated by the user or by the researcher? What if a participant forgets their username and/or password?)

NOTE: The University IRB does not allow research data to be collected or dispensed via email.

  • Field Procedures: What safeguards will be in place to maintain the confidentiality of data during collection in the field? During storage at field sites? During transport to the University?
  • Biometric or Genetic Testing: How will researchers protect the confidentiality of diagnostic or genetic information, especially if tests are outsourced?
  • Re-contacting Participants: What is the minimum information necessary for re-contacting participants? How will the confidentiality of the contact information be maintained during the research? When will the contact information be destroyed?
  • Linking Multiple Datasets: Research involving multiple datasets often requires that a common identifier be present in the various datasets (e.g., name, address, social security number). Will researchers use standard inter-file linkage procedures for merging the datasets? If not, how will confidentiality be protected?
  • Breach of Confidentiality Risks: Should documentation of consent be waived to protect participants in the event of a breach of confidentiality?
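
The Coding Data/Specimens consideration above asks whether numerical data should be top- or bottom-coded. In top- and bottom-coding, extreme values are replaced with a cutoff so that unusually high or low values cannot single out an individual. The sketch below shows the idea under stated assumptions: the function name and the cutoffs of 18 and 90 are hypothetical, and appropriate cutoffs depend on the study.

```python
def top_bottom_code(values, lower=None, upper=None):
    """Replace values above `upper` or below `lower` with the cutoff itself."""
    coded = []
    for value in values:
        if upper is not None and value > upper:
            value = upper  # top-code: cap unusually high values
        elif lower is not None and value < lower:
            value = lower  # bottom-code: raise unusually low values
        coded.append(value)
    return coded


# Made-up ages; the cutoffs are illustrative assumptions only.
ages = [17, 34, 52, 91, 103]
coded_ages = top_bottom_code(ages, lower=18, upper=90)
```

After coding, a value equal to a cutoff should be read as "at or beyond the cutoff" (here, 90 means "90 or older"), and the data documentation should say so.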

Considerations for Protecting Confidentiality When Storing Data/Specimens

NOTE: Considerations for data storage apply both before and after analysis.

  • Retaining PPII: Will PPII be stored with the data/specimens? Why?
  • Access to PPII: If PPII will be stored with data/specimens, who will have access? If stored data/specimens are coded, who will have access to the master code list? When will the master code list be destroyed?

NOTE: Access to PPII should be limited to researchers who require such access to fulfill research objectives. The master code list should be destroyed as soon as is feasible (e.g., immediately after data are cleaned).

  • Identification of Participants through Linked Elements: Will stored, coded data/specimens contain elements that may be used (alone or in combination) to link an individual with her/his data/specimens? This is particularly relevant to research with small cell sizes.
  • Storage of Electronic Records: How will researchers manage and store electronic data to protect confidentiality?
  • Audio, Video, and Photographic Records: What additional precautions will be used to protect the confidentiality of audio, video, or photographic records in that individual participants may be identified through voice analysis (audio and video) or physical characteristics (video or photographic images)?
  • Security of Storage Facility: Are the security features of the storage site (or storage mechanisms for electronic data) sufficient to ensure data confidentiality?
  • Inclusion in Clinical or Program Records: Will research data be recorded in permanent clinical or program records? If yes, what information will be recorded and why will it be recorded in these records?
  • Placement of Data in Repositories: What are the requirements of the repository related to file formats; data management and sharing plans; documentation of form and content; variable names, labels, and groups; coding; and missing data?
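
For the question above about identification through linked elements, one possible check is to count how many stored records share each combination of potentially identifying elements and to flag combinations held by only a few records. The sketch below is illustrative: the function name, the field names (`zip3`, `age_band`), and the minimum group size of 5 are assumptions, not values set by this policy.

```python
from collections import Counter


def find_small_groups(rows, quasi_identifiers, k=5):
    """Return value combinations shared by fewer than k records."""
    combos = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in combos.items() if n < k}


# Made-up coded records: six share one combination, one stands alone.
rows = (
    [{"id_code": f"P-{i:03d}", "zip3": "100", "age_band": "30-39"} for i in range(6)]
    + [{"id_code": "P-900", "zip3": "100", "age_band": "80+"}]
)
risky = find_small_groups(rows, ["zip3", "age_band"])
```

Here `risky` contains the single record in the "80+" age band. That record could be linked to an individual even though the data are coded, so the element might need to be coarsened or removed before storage.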

Considerations for Protecting Confidentiality When Using Electronic Data

Many researchers are purchasing mobile apps or building their own app to interact with study participants. Even if the participant is asked to download a free app or provided monies for the download, the researcher is still responsible for disclosing potential risks. It is possible that the app the participant downloaded will capture other data stored or linked to the phone on which it is installed (e.g., contact list, GPS information, access to other applications such as Facebook). The researcher has the responsibility to understand known or potential risks and convey them to the study participant. Commercially available apps publish “terms of service” that detail how app data will be used by the vendor and/or shared with third parties. It is the researcher’s responsibility to understand these terms, relay that information to participants, and monitor said terms for updates. Additionally, it is important that the researcher collect from the app only the minimum data necessary to answer the research questions.

Many investigators wish to collect the IP addresses of survey participants to provide a method of determining whether the user has previously completed the survey. This is important to consider when conducting surveys, especially if the consent process indicates that a participant’s responses will be anonymous. When using Qualtrics, check the option to anonymize the data collection process and do not collect the IP address. If IP addresses are necessary to the research, include in the consent process that you will be recording this information.

Email notifications are generally not secure, except in very limited circumstances, and should not be used to share or transmit research data. Text messages are stored by the telecommunications provider and therefore are not secure. Data should be encrypted when “in-transit.”

The University’s standard Zoom environment is not HIPAA compliant. If sessions are being recorded, the researcher needs to make sure the recordings are stored in a secure location. In addition, researchers must ensure that anti-virus software is up to date, operating systems are patched with the newest versions, and access is limited. Recorded sessions should be stored in a cloud service or on a University-managed server.

Considerations for Protecting Confidentiality During Data Analysis and Presentation

  • Presenting Data: How will data be presented to ensure discrete variables cannot be used (alone or in combination) to identify an individual? This is especially important for research with small cell sizes.
  • Geocoding and Mapping: For research involving geocoding and mapping, what precautions will be implemented to protect the identities of individuals in the sample populations? Is it possible the mapped information may stigmatize or provoke anxiety among the individuals living in specific locales identified on the map?
  • Secondary or Incidental Findings: Will participants (or affected, biological family members) be told about secondary or incidental findings? If no, why not? If yes, how and to whom will the disclosure be made?
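
For the Presenting Data consideration above, one common way to handle small cell sizes in published tables is to suppress counts that fall below a minimum. The sketch below is illustrative only: the function name and the threshold of 5 are assumptions, not requirements stated in this policy.

```python
from collections import Counter


def suppress_small_cells(categories, min_cell_size=5):
    """Tabulate categories, masking any count below `min_cell_size`."""
    counts = Counter(categories)
    table = {}
    for category, n in counts.items():
        # Report the exact count only when the cell is large enough.
        table[category] = n if n >= min_cell_size else f"<{min_cell_size}"
    return table


# Made-up responses; "rural" has only two respondents.
responses = ["urban"] * 12 + ["suburban"] * 7 + ["rural"] * 2
table = suppress_small_cells(responses)
```

Suppression alone may not be sufficient when totals are also published, because a masked count can sometimes be recovered by subtracting the visible cells from the total.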

Informing Participants of Confidentiality Protections and Limitations

In general, researchers are obliged to provide the level of confidentiality specified in the consent materials. Individuals are to be informed about the extent to which confidentiality of their data will be maintained during all phases of the study, including who will have access to the data, what security measures will be used, and where data will be stored. Extensive security procedures may be needed in some studies, either to give individuals the confidence they need to participate and answer questions honestly, or to enable researchers to offer strong assurances of confidentiality. Complete confidentiality should not be promised, however, unless personal identifiers have not been obtained or recorded.

The information researchers are required to disclose to participants is commensurate with risk. More information about processes to protect confidentiality should be provided to participants in studies in which unauthorized disclosure may place them at risk, compared to participants in studies in which disclosure is not likely to expose them to harms.

Investigators may access PPII without informing the individuals to whom the information pertains if the IRB approves a waiver of the requirement to obtain informed consent. In such cases, researchers should be especially cognizant of the importance of keeping participants' information confidential because private information is being accessed without participants' knowledge or permission.

Required Disclosures Related to Confidentiality Protections

Researchers must tell participants:

  • how the information collected from/about them will be used (i.e., study purpose);
  • if PPII will be collected, and whether PPII will be disclosed in reports or publications resulting from the research;
  • who will have access to their PPII and the other information collected about them; and
  • whether audio, video, or photographic records will be collected. For such records, researchers must obtain signed video/photo releases.

Optional Disclosures Related to Confidentiality Protections

Participants may benefit from being told:

  • why the collection/retention of PPII is necessary for the research;
  • if PPII will be stored with the data or linked to the data via a master code list;
  • how long the researchers will retain their PPII;
  • when data will be de-identified, or if not de-identified, when it will be destroyed; and
  • what procedures will be put in place to preclude unauthorized access to the research data.

Informing Participants about Secondary and Incidental Findings

When communicating the fundamental aspects of their research to the IRB and to participants, researchers must also consider whether study tests or procedures may reveal information about a study participant that is not the primary focus of the research but that may have clinical significance for the individual. Such findings may be secondary or incidental to the research and anticipated or unanticipated.

Tests/procedures more likely to lead to secondary or incidental findings include large-scale genetic sequencing (e.g., whole genome sequencing, non-specific genomic analyses); non-discrete testing of blood and other biological specimens (e.g., metabolic panels); and imaging (e.g., MRI, CT, X-rays, ultrasounds). For more information, see the IRB policy for disclosing findings to participants.

Disclosures Related to Limits to Confidentiality

There are ethical or legal limits to confidentiality, for example when a researcher obtains information subject to mandatory reporting, such as evidence of child abuse. If it is probable that information subject to mandatory reporting may be collected during the study, a researcher should state these exceptions to confidentiality in the consent form. Researchers must tell participants about limitations on the protection of data confidentiality such as:

  • inspection of medical or research records by the IRB, FDA or sponsor;
  • mandatory reporting laws for communicable diseases; and
  • mandatory reporting laws for child or elder abuse.

Limits to Confidentiality for Humanities Projects

Humanities projects may not expect to keep participants' identities or their responses confidential; sometimes interviewees want their names associated with their responses. This practice is acceptable if research participants are made aware of whether or not their names will be associated with their responses and told of any inherent risks associated with such disclosure.

Additional Confidentiality Considerations

Certificates of Confidentiality

Research involving illegal activities or the collection of sensitive data may require researchers to obtain a Certificate of Confidentiality for protection from subpoena.

Waivers of Documentation of Informed Consent

Research in which the principal risk is related to a breach of confidentiality may be eligible for an IRB waiver of signed consent. For example, in studies where participants are selected because of a sensitive, stigmatizing, or illegal characteristic (e.g., persons with illegal immigration status; or who have sexually abused children, sought treatment in a drug abuse program, or tested positive for HIV), keeping the identity of participants confidential may be more important than keeping the data obtained about the participants confidential. See IRB policy for consent waivers for more information.

Data Use and Materials Transfer Agreements

When researchers are sharing data/specimens with other entities, whether as the provider or recipient, formal agreements may be warranted. See the University's Office of Sponsored Projects policy and form for establishing Data Use Agreements. Contact the University Technology Transfer Office for information about Materials Transfer Agreements.

When applicable, investigators must attach approved Data Use Agreements and Materials Transfer Agreements to new projects or amendment packages (for newly added agreements) in IRBNet for IRB review or exempt determination.

IRB Review of Confidentiality Protections

When research data will be linked, directly or indirectly, to PPII, the University IRB will not approve the research unless precautions are adequate to safeguard data confidentiality during data collection, storage, analysis, and dispensation. The University IRB balances requirements for protecting the confidentiality of research data with the level of risk associated with unauthorized disclosure, legal obligations related to confidentiality, and the confidentiality commitment made to research participants.

For research involving information that may be considered sensitive (e.g., mental illness, cognitive impairment, physical disabilities, STDs, drug and alcohol abuse), the IRB will assess the need for more robust safeguards, including Certificates of Confidentiality.

Unauthorized Disclosure of Information

Investigators must inform the IRB immediately in the event of an unauthorized release or loss of participants' private or confidential information. The IRB may determine the breach of confidentiality to constitute noncompliance and/or an unanticipated problem involving risks to participants or others. For more information, see IRB policy for reporting problems in research.