285 Secondary Use of Data

Updated July 1, 2019

This policy applies to activities that involve the secondary analysis of existing data, such as medical records, student records, data collected from previous studies, audio/video recordings, etc., that were initially collected for another purpose. To count as existing, the information must be "on the shelf" (i.e., it has already been collected) at the time that the current research is proposed. While such projects do not involve interactions or interventions with humans, they may still require UNR IRB review, since 45 CFR 46.102(f) defines a "human subject" as a living individual about whom an investigator (whether professional or student) conducting research obtains:

(1) Data through intervention or interaction with the individual, or
(2) Identifiable private information.  

Data analysis activities that meet the definition of research with humans may qualify for an exemption or require expedited or even full committee review. Any such project must receive UNR IRB approval or a determination of exemption before the investigator accesses the data. Researchers are encouraged to contact Research Integrity for consultation about whether and what type of review is required.  

When does the secondary use of existing data not require review?

In general, the secondary analysis of existing data does not require UNR IRB review when it does not meet the regulatory definition of research involving humans, as referenced above.  

Public use data sets: Generally, public use data sets (such as portions of U.S. Census data and data from the National Center for Education Statistics, the National Center for Health Statistics, etc.) are prepared with the intent of making them available to the public. The data available to the public are not individually identifiable, and therefore their analysis would not involve humans.

In addition to being identifiable, the existing data must include "private information" in order to constitute research involving humans. Private information is defined as information which has been provided for specific purposes by an individual and which the individual can reasonably expect will not be made public (e.g., a medical or school record). Information that contains identifiers and can be accessed freely by the public (without special permission or application) is not "private," and the research therefore does not involve humans. For example, a study involving only analysis of the published salaries and benefits of public university presidents would not need UNR IRB review, since this information is not private.

De-identified data: If the dataset has been stripped of all identifying information and there is no way that it could be linked back to the persons from whom it was originally collected (through a key to a coding system or by any other means), its subsequent use by the PI or another investigator would not constitute research with humans, since it is no longer identifiable. Identifiable means the identity of the person is known or may be readily ascertained by the investigator or associated with the information. In general, information is considered to be identifiable when it can be linked to specific individuals by the investigator(s) either directly or indirectly through coding systems, or when characteristics of the information obtained are such that by their nature a reasonably knowledgeable person could ascertain the identities of individuals. Therefore, even though a dataset may have been stripped of direct identifiers (names, addresses, student ID numbers, etc.), it may still be possible to identify an individual through a combination of other characteristics (e.g., age, gender, ethnicity, and place of employment).  

Example: Many student research projects involve secondary analysis of data that belongs to, or was initially collected by, their faculty advisor or another investigator. If the student is provided with a de-identified, non-coded data set, the use of the data does not constitute research with humans because there is no interaction with any individual and no identifiable private information will be used. The project therefore does not require UNR IRB review.

Coded data: Secondary analysis of coded private information is not considered to be research involving humans and would not require UNR IRB review if the investigator(s) cannot readily ascertain the identity of the individual(s) to whom the coded private information pertains as a result of one of the following circumstances:  

  1. The investigators and the holder of the key have entered into an agreement prohibiting the release of the key to the investigators under any circumstances, until the individuals are deceased (DHHS regulations for research with humans do not require the IRB to review and approve this agreement);
  2. There are IRB-approved written policies and operating procedures for a repository or data management center that prohibit the release of the key to the investigator under any circumstances, until the individuals are deceased; or
  3. There are other legal requirements prohibiting the release of the key to the investigators, until the individuals are deceased.  

An exception is if a student is analyzing coded data from a faculty advisor/sponsor who retains a key.  This would be deemed research with humans, because the faculty advisor is considered an investigator on the student's protocol, and can readily ascertain the identity of the participants since he/she holds the key to the coded data.  If the student's work fits within the scope of the initial protocol from which the dataset originates, the faculty advisor (or investigator who holds the dataset) may wish to consider adding the student and his/her work to the original protocol by means of an amendment application rather than having the student submit a new application for review.  

Example: Researcher A plans to examine the relationships between attention deficit hyperactivity disorder (ADHD), oppositional defiant disorder, and teen drug abuse using data collected by Agencies I, II, and III that work with "at risk" youth. The data will be coded, and the agencies have entered into an agreement prohibiting release to the researcher of the key that could connect the data with identifiers. The use of the data would not constitute research with humans and does not require UNR IRB review.

When is the secondary use of existing data exempt?

There are six categories of research activities involving humans that may be exempt from the requirements of the Federal Policy for the Protection of Human Subjects (45 CFR 46). However, only Category 4 applies specifically to existing data. To determine if research qualifies for an exempt determination, researchers must submit an exempt core application review of existing data or specimens form to RI for review.

Research involving collection or study of existing data, documents, and records can be exempted under Category 4 of the federal regulations if: (i) the sources of such data are publicly available; or (ii) the information is recorded by the investigator in such a manner that participants cannot be identified, directly or through identifiers linked to the participants.

The latter condition of this category applies in cases where the investigators initially have access to identifiable private information but abstract the data needed for the research in such a way that the information can no longer be connected to the identity of the participants. This means that the abstracted data set does not include direct identifiers (names, social security numbers, addresses, phone numbers, etc.) or indirect identifiers (codes or pseudonyms that are linked to the participant's identity). Furthermore, it must not be possible to identify participants by combining a number of characteristics (e.g., date of birth, gender, position, and place of employment). This is especially relevant in smaller datasets, where the population is confined to a limited participant pool.
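The concern about combinations of characteristics can be illustrated with a small check. The following is only a sketch under stated assumptions, not a UNR IRB procedure: the field names, the sample records, and the `min_group_size` threshold are all hypothetical, and this policy does not set a numeric threshold. It counts how many records share each combination of characteristics and flags combinations held by very few records, since those are the ones most likely to point to a single person.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, min_group_size=5):
    """Return combinations of characteristics shared by fewer than
    min_group_size records (hypothetical threshold, not set by policy)."""
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < min_group_size}


# Hypothetical abstracted records containing no direct identifiers.
records = [
    {"birth_year": 1980, "gender": "F", "position": "Nurse", "employer": "Clinic A"},
    {"birth_year": 1980, "gender": "F", "position": "Nurse", "employer": "Clinic A"},
    {"birth_year": 1975, "gender": "M", "position": "Director", "employer": "Clinic A"},
]

risky = small_groups(
    records,
    ["birth_year", "gender", "position", "employer"],
    min_group_size=2,
)
for combo, n in risky.items():
    print(f"{combo} appears in only {n} record(s)")
```

In this made-up data set the single director at Clinic A is flagged, which mirrors the point above: in a small participant pool, a handful of ordinary characteristics can be enough to identify someone. Whether a real data set is identifiable is a judgment for the investigator and Research Integrity, not for a count alone.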

Examples:

  1. A researcher conducts a study of treatment outcomes for a certain drug that involves the review of patient charts at a non-UNR medical facility. The researcher records patient age, sex, diagnosis, and treatment outcome in such a way that the information cannot be linked back to the patient. This project could qualify for an exemption.
  2. Student B will be given access to data from her faculty advisor's health survey research project. The data consists of coded survey responses, and the advisor will retain a key that would link the data to identifiers. The student will extract the information she needs for her project without including any identifying information and without retaining the code. The use of the data does constitute research with humans because the initial data set is identifiable (albeit through a coding system); however, it would qualify for exempt status.  

Expansion of Exempt Category 4 under the UNR IRB Flex Policy

Non-federally funded research on secondary data where the data contains identifiers or a master code list linking codes to an individual would be eligible for exempt level review under the UNR IRB Flex policy. Non-federally funded research involving retrospective and/or prospective analysis of secondary data is also permitted under this category.

When is the secondary use of existing data non-exempt?

If secondary analysis of existing data does involve research with humans and does not qualify for exempt status as explained above, the project must be reviewed either through expedited procedures or by the full committee, and researchers should submit the exempt core application review of existing data or specimens form to RI for IRB review. Expedited review would be required if identifiers or master code lists are included with the data for federally funded research.

Consent: Researchers using data previously collected under another study should consider whether the currently proposed research is a "compatible use" with what participants agreed to in the original consent form. For non-exempt, federally-funded projects, a consent process description or justification for a waiver must be included in the research protocol. The UNR IRB may require that informed consent for secondary analysis be obtained from participants whose data will be accessed. Alternatively, the UNR IRB can consider a request for a waiver of one or more elements of informed consent under 45 CFR 46.116(d). In order to approve such a waiver, the UNR IRB must determine that:

  1. the research presents minimal risk (no risks of harm, considering probability and magnitude, greater than those ordinarily encountered in daily life or during the performance of routine examinations or tests); and
  2. the waiver or alteration will not adversely affect the rights and welfare of the participants; and
  3. the research could not practicably be carried out without the waiver or alteration; and
  4. whenever appropriate, the participants will be provided with additional pertinent information after participation.  

See UNR IRB policies for more details regarding informed consent and waivers of consent.  

"Restricted Use Data": Certain agencies and research organizations release files to researchers with specific restrictions regarding their use and storage. The records frequently contain identifiers or extensive variables that, when combined, might enable identification, even though this is not the intent of the researcher. Research using these data sets most often requires non-exempt level review.

Examples:

  1. Student C will be given access to coded mental health assessments from his faculty advisor's research project. The student plans to analyze the data with a code attached to each record, and the advisor will retain a key to the code that would link the data to identifiers. The use of the data does constitute research with humans and does not qualify for exempt status since participants can be identified. This student project would require an application to be submitted for non-exempt review by the UNR IRB.  
  2. Student D is applying to the National Center for Health Statistics for use of data from the National Health and Nutrition Examination Survey that includes geographic identifiers and date of examination. The analysis of this restricted use data would require non-exempt review by UNR IRB.

Secondary Data Research Classification

When is secondary data (e.g., medical records, purchased data, data from the Internet, etc.) considered research with humans? Research involving secondary data analysis is considered research with humans when data about individuals is both private and identifiable.  

Projects that are unlikely to be research with humans because they involve only:

  • Public use data sets such as data from the National Center for Health Statistics: the data is available to the public at large and not restricted to researchers.
  • Data sets from an outside source that have been stripped of all identifying information and of links back to identifiers before being provided to the researcher.
  • Facebook public profiles found from Google searches.
  • Twitter tweets not in a private setting.
  • Publicly accessible forums or comments sections where users have no expectation of privacy (e.g., New York Times, YouTube, etc.).  

Projects that might be research with humans because they involve:

  • Purchasing/obtaining enhanced data sets: data on individuals which may include enough information to potentially identify the individuals. This would include data sets where the owner or vendor requires local IRB approval and a data use agreement prior to allowing access to the data.
  • Receipt of coded data where the data holder has the code key, depending on whether the data holder only provides data or is a collaborator in the research, and whether an agreement between institutions prohibits the receiver from ever receiving identifiers, etc.
  • Forums or chats where users must register as belonging to a certain group (e.g., cancer survivors) or that are housed in areas that are not public, e.g., where special passwords are needed to join.

Projects that are human subjects research because they involve:

  • Private data sets obtained with identifiers (e.g., traffic violation data with driver's license numbers, survey data with email addresses, medical records with protected health information [PHI], restricted use datasets, etc.).
  • Stolen, hacked, or accidentally released data about individuals: although the data may now be publicly available (such as on the surface web or the dark web), the individuals whom the data is about had an expectation of privacy, i.e., that the data would not be hacked, stolen, etc.

Human subjects research must be reviewed and either determined exempt or approved by the UNR IRB before the research begins.
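The classification described in this policy can be summarized as a rough decision sketch. This is a simplified illustration only, not an official determination tool: the class name, field names, and return strings are hypothetical, and the sketch ignores nuances such as restricted use data, coded-data agreements, and consent requirements. Actual determinations are made by Research Integrity and the UNR IRB.

```python
from dataclasses import dataclass


@dataclass
class SecondaryDataUse:
    """Hypothetical summary of a proposed secondary data project."""
    identifiable: bool                  # investigator can readily ascertain identities
    private: bool                       # individuals reasonably expect non-disclosure
    recorded_without_identifiers: bool  # abstracted so participants cannot be identified
    federally_funded: bool


def review_level(use: SecondaryDataUse) -> str:
    """Rough mapping from project characteristics to a likely review level."""
    # Research with humans requires data that is both private and identifiable.
    if not (use.identifiable and use.private):
        return "not research with humans"
    # Exempt Category 4: data recorded so participants cannot be identified.
    if use.recorded_without_identifiers:
        return "exempt (Category 4)"
    # Flex policy: identifiers or code lists retained, no federal funding.
    if not use.federally_funded:
        return "exempt level review (Flex policy)"
    return "non-exempt review"


# Published salaries of public university presidents: identifiable but not private.
print(review_level(SecondaryDataUse(True, False, False, False)))
# Coded survey data abstracted without identifiers or codes (as in the Student B example).
print(review_level(SecondaryDataUse(True, True, True, False)))
```

The order of the checks follows the order of the sections above: first whether the project is research with humans at all, then whether it fits Exempt Category 4, then whether the Flex policy applies, and only then non-exempt review.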