285 Secondary Use of Data or Specimens

Updated Oct. 1, 2020

This policy applies to activities that involve the secondary analysis of data (such as medical records, student records, data collected from previous studies, audio/video recordings, etc.) or biospecimens that were or will be collected for non-research purposes or for research studies other than the proposed research study. Even though such projects do not involve interactions or interventions with humans, they may still require University IRB review, because 45 CFR 46.102(e)(1) defines a "human subject" as a living individual about whom an investigator (whether professional or student) conducting research:

(1) Obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens; or
(2) Obtains, uses, studies, analyzes, or generates identifiable private information or identifiable biospecimens.

Data analysis activities that meet the definition of research with humans may qualify for an exemption or require expedited or even full committee review. Any such project must receive University IRB approval or a determination of exemption before the investigator accesses the data. Researchers are encouraged to contact Research Integrity for consultation about whether and what type of review is required.

When does the secondary use of data or specimens not require review?

In general, the secondary analysis of data or biospecimens does not require University IRB review when it does not meet the regulatory definition of research involving humans, as referenced above.

Public use data sets: Generally, public use data sets (such as portions of U.S. Census data, data from the National Center for Education Statistics, National Center for Health Statistics, etc.) are data sets prepared with the intent of making them available to the public. The data available to the public are not individually identifiable, and therefore their analysis would not involve humans.

In addition to being identifiable, the existing data must include "private information" in order to constitute research involving humans. Private information is defined as information which has been provided for specific purposes by an individual and which the individual can reasonably expect will not be made public (e.g., a medical or school record). Information that contains identifiers and can be accessed freely by the public (without special permission or application) is not "private," and the research therefore does not involve humans. For example, a study involving only analysis of the published salaries and benefits of public university presidents would not need University IRB review since this information is not private.

De-identified data or specimens: If the dataset or biospecimens have been stripped of all identifying information and there is no way that the information or biospecimens could be linked back to the persons from whom they were originally collected (through a key to a coding system or by any other means), their subsequent use by the Principal Investigator or another investigator would not constitute research with humans, since they are no longer identifiable. Identifiable means the identity of the person is known or may be readily ascertained by the investigator or associated with the information. In general, information is considered to be identifiable when it can be linked to specific individuals by the investigator(s) either directly or indirectly through coding systems, or when characteristics of the information obtained are such that by their nature a reasonably knowledgeable person could ascertain the identities of individuals. Therefore, even though a dataset may have been stripped of direct identifiers (names, addresses, student ID numbers, etc.), it may still be possible to identify an individual through a combination of other characteristics (e.g., age, gender, ethnicity, and place of employment).
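The risk of identification through a combination of characteristics can be illustrated with a small screening check. The following is a minimal sketch, not a University IRB procedure: the function name `small_groups`, the sample records, and the threshold `k` are all hypothetical, and this policy does not prescribe any numeric threshold. The sketch counts how many records share each combination of indirect characteristics and flags combinations held by fewer than `k` records.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k=5):
    """Return value combinations shared by fewer than k records.

    `k` is an illustrative threshold (an assumption), not a policy value.
    """
    counts = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical records with direct identifiers already removed.
records = [
    {"age": 34, "gender": "F", "employer": "Clinic A"},
    {"age": 34, "gender": "F", "employer": "Clinic A"},
    {"age": 51, "gender": "M", "employer": "Clinic B"},
]

risky = small_groups(records, ["age", "gender", "employer"], k=2)
print(risky)  # {(51, 'M', 'Clinic B'): 1}
```

In this example, only one record matches the combination (51, M, Clinic B), so a reasonably knowledgeable person might be able to identify that individual even though no name or ID number is present. A check like this can help a researcher prepare for a consultation, but it does not replace a determination by Research Integrity.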

Example: Many student research projects involve secondary analysis of data that belongs to, or was initially collected by, their faculty advisor or another investigator. If the student is provided with a de-identified, non-coded data set, the use of the data does not constitute research with humans because there is no interaction with any individual and no identifiable private information will be used. The project therefore does not require University IRB review.

Coded data or specimens: Secondary analysis of coded private information or biospecimens is not considered to be research involving humans and would not require University IRB review if the investigator(s) cannot readily ascertain the identity of the individual(s) to whom the coded private information or biospecimens pertain as a result of one of the following circumstances:

  1. The investigators and the holder of the key have entered into an agreement prohibiting the release of the key to the investigators under any circumstances, until the individuals are deceased (DHHS regulations for research with humans do not require the IRB to review and approve this agreement);
  2. There are IRB-approved written policies and operating procedures for a repository or data management center that prohibit the release of the key to the investigator under any circumstances, until the individuals are deceased; or
  3. There are other legal requirements prohibiting the release of the key to the investigators, until the individuals are deceased.

An exception is if a student is analyzing coded data from a faculty advisor/sponsor who retains a key. This would be deemed research with humans, because the faculty advisor is considered an investigator on the student's protocol, and can readily ascertain the identity of the participants since he/she holds the key to the coded data. If the student's work fits within the scope of the initial protocol from which the dataset originates, the faculty advisor (or investigator who holds the dataset) may wish to consider adding the student and his/her work to the original protocol by means of an amendment application rather than having the student submit a new application for review.

Example: Researcher A plans to examine the relationships between attention deficit hyperactivity disorder (ADHD), oppositional defiant disorder, and teen drug abuse using data collected by Agencies I, II, and III that work with "at risk" youth. The data will be coded, and the agencies have entered into an agreement prohibiting release to the researcher of the key that could connect the data with identifiers. The use of the data would not constitute research with humans and does not require University IRB review.

When is the secondary use of data or specimens exempt?

There are several categories of research activities involving humans that may be exempt from the requirements of the Federal Policy for the Protection of Human Subjects (45 CFR 46). Category 4 applies specifically to secondary research uses of identifiable private information or identifiable biospecimens. To determine whether research qualifies for an exempt determination, researchers must submit the exempt application form for review of existing data or specimens to Research Integrity. Research involving collection or study of data, documents, records, or biospecimens for secondary uses can be exempted under Category 4 of the federal regulations if: (i) the sources of such data or biospecimens are publicly available; or (ii) the information is recorded by the investigator in such a manner that participants cannot be identified, directly or through identifiers linked to the participants. The latter condition applies in cases where the investigators initially have access to identifiable private information or identifiable biospecimens but abstract the data needed for the research in such a way that the information can no longer be connected to the identity of the participants. This means that the abstracted data set or biospecimens do not include direct identifiers (names, social security numbers, addresses, phone numbers, etc.) or indirect identifiers (codes or pseudonyms that are linked to the participant's identity). Furthermore, it must not be possible to identify participants by combining a number of characteristics (e.g., date of birth, gender, position, and place of employment). This is especially relevant in smaller datasets, where the population is confined to a limited participant pool.

Examples:

  1. A researcher conducts a study of treatment outcomes for a certain drug that involves the review of patient charts at a non-UCB medical facility. The researcher records patient age, sex, diagnosis, and treatment outcome in such a way that the information cannot be linked back to the patient. This project could qualify for an exemption.
  2. Student B will be given access to data from her faculty advisor's health survey research project. The data consists of coded survey responses, and the advisor will retain a key that would link the data to identifiers. The student will extract the information she needs for her project without including any identifying information and without retaining the code. The use of the data does constitute research with humans because the initial data set is identifiable (albeit through a coding system); however, it would qualify for exempt status.

Expansion of Exempt Category 4 under the University IRB Flex Policy

Non-federally funded research on secondary data where the data contains identifiers or a master code list linking codes to an individual would be eligible for exempt level review under the University IRB Flex policy.

When is the secondary use of existing data or specimens non-exempt?

If secondary analysis of existing data does involve research with humans and does not qualify for exempt status as explained above, the project must be reviewed either through expedited procedures or by the full committee, and researchers should submit the exempt core application review of existing data or specimens form to Research Integrity for IRB review. Expedited review would be required if identifiers or master code lists are included with the data for federally funded research.
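The review levels described in the preceding sections can be summarized as a rough triage. The sketch below is illustrative only and is not an official decision tool: the class `SecondaryDataProject`, its field names, and the function `review_level` are hypothetical, and each field collapses a judgment that this policy discusses in more detail. The actual determination is made by Research Integrity and the University IRB.

```python
from dataclasses import dataclass


@dataclass
class SecondaryDataProject:
    """Hypothetical summary of a project; all field names are assumptions."""
    private: bool                       # individuals reasonably expect the information is not public
    identifiable: bool                  # investigators can readily ascertain identities (directly or via a key)
    publicly_available: bool            # Category 4 (i): sources are publicly available
    recorded_without_identifiers: bool  # Category 4 (ii): data recorded so participants cannot be identified
    federally_funded: bool


def review_level(project: SecondaryDataProject) -> str:
    # Data that is not both private and identifiable is not research with humans.
    if not (project.private and project.identifiable):
        return "not human research"
    # Exempt Category 4 conditions (i) and (ii).
    if project.publicly_available or project.recorded_without_identifiers:
        return "exempt (Category 4)"
    # Flex policy: non-federally funded research on data with identifiers or a master code list.
    if not project.federally_funded:
        return "exempt-level review (Flex policy)"
    return "non-exempt (expedited or full committee)"


# Federally funded analysis of coded data where the key is available to the investigators.
coded_project = SecondaryDataProject(
    private=True,
    identifiable=True,
    publicly_available=False,
    recorded_without_identifiers=False,
    federally_funded=True,
)
print(review_level(coded_project))
```

The order of the checks mirrors the order of the sections: first whether the activity is research with humans at all, then whether an exemption applies, and only then whether non-exempt review is needed.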

Consent: Researchers using data previously collected under another study should consider whether the currently proposed research is a "compatible use" with what participants agreed to in the original consent form. For non-exempt, federally-funded projects, a consent process description or justification for a waiver must be included in the research protocol. The University IRB may require that informed consent for secondary analysis be obtained from participants whose data will be accessed. Alternatively, the University IRB can consider a request for a waiver of one or more elements of informed consent under 45 CFR 46.116(f)(3). In order to approve such a waiver, the IRB must determine that:

  1. the research presents minimal risk (no risks of harm, considering probability and magnitude, greater than those ordinarily encountered in daily life or during the performance of routine examinations or tests); and
  2. the research could not practicably be carried out without the waiver or alteration; and
  3. if the research involves using identifiable private information or identifiable biospecimens, the research could not practicably be carried out without using such information or biospecimens in an identifiable format; and
  4. the waiver or alteration will not adversely affect the rights and welfare of the participants; and
  5. whenever appropriate, the participants will be provided with additional pertinent information after participation.

See University IRB policies for more details regarding informed consent and waivers of consent.

"Restricted Use Data": Certain agencies and research organizations release files to researchers with specific restrictions regarding their use and storage. The records frequently contain identifiers or extensive variables that combined might enable identification, even though this is not the intent of the researcher. Research using these data sets most often requires non-exempt level review.

Examples:

  1. Student C will be given access to coded mental health assessments from his faculty advisor's research project. The student plans to analyze the data with a code attached to each record, and the advisor will retain a key to the code that would link the data to identifiers. The use of the data does constitute research with humans and does not qualify for exempt status since participants can be identified. This student project would require an application to be submitted for non-exempt review by the University IRB.
  2. Student D is applying to the National Center for Health Statistics for use of data from the National Health and Nutrition Examination Survey that includes geographic identifiers and date of examination. The analysis of this restricted use data would require non-exempt review by the University IRB.

Secondary Data Research Classification

When is secondary data (e.g., medical records, purchased data, data from the Internet, etc.) considered research with humans? Research involving secondary data analysis is considered research with humans when data about individuals is both private and identifiable.

Projects that are unlikely to be human research because they involve only:

  • Public use data sets, such as data from the National Center for Health Statistics, where the data is available to the public at large and not restricted to researchers.
  • Data sets from an outside source that have been stripped of all identifying information and of links back to identifiers before being provided to the researcher.
  • Facebook public profiles found from Google searches.
  • Twitter tweets not in a private setting.
  • Publicly accessible forums or comments sections where users have no expectation of privacy (e.g., New York Times, YouTube, etc.).

Projects that might be human research because they involve:

  • Purchasing/obtaining enhanced data sets: data on individuals which may include enough information to potentially identify the individuals. This would include data sets where the owner or vendor requires local IRB approval and a data use agreement prior to allowing access to the data.
  • Restricted use files that certain agencies and research organizations release to researchers with specific restrictions regarding their use and storage. The records frequently contain identifiers or extensive variables that combined might enable identification, even though this is not the intent of the researcher. Research using these data sets most often requires expedited or full committee review.
  • Receipt of coded data where the data holder has the code key. The determination depends on whether the data holder only provides data or is a collaborator in the research, and whether an agreement between institutions prohibits the receiver from ever receiving identifiers, etc.
  • Forums or chats where users must register as belonging to a certain group (e.g., cancer survivors) or that are housed in areas that are not public (e.g., where special passwords are needed to join).

Projects that are human research because they involve:

  • Private data sets obtained with identifiers (e.g., traffic violation data with driver's license numbers, survey data with email addresses, medical records with protected health information [PHI], restricted use datasets, etc.).
  • Stolen, hacked, or accidentally released data about individuals. Although the data may now be publicly available (such as on the surface web or the dark web), the individuals whom the data is about had an expectation of privacy, i.e., that the data would not be hacked, stolen, etc.

Human research must be reviewed and either determined exempt or obtain University IRB approval before the research begins.