Author: Valery Gordon PhD, MPH
Date: January 9, 2023


Data from patients and health care systems (referred to as Real World Data) are being increasingly used along with data from traditional controlled clinical research to support clinical and regulatory decision making. For our first blog of the new year, Learn eCORE turned to Dr. Valery Gordon, an expert in bioethics and federal research policy, to provide guidance on the ethical issues associated with research using Real World Data.


Real World Data (RWD)1 have received increasing amounts of attention since the Patient-Centered Outcomes Research Institute (PCORI) and All of Us were established in 2010 and 2018, respectively. A great deal may be learned from studying RWD, but there are also risks associated with this type of research. It is important to use these data to benefit individuals and populations and equally important to take steps to minimize potential harms.


For example, many clinical researchers are exploring databases that contain whole or partial information from Electronic Health Records (EHRs). These databases may contain additional information, such as genetic information or insurance data. Some databases have collected EHR data with the consent of patients; however, many have not obtained consent to store and use the information.


Researchers query these databases to identify characteristics that lead to conditions or diseases. The more granular the data, the more likely it is that clinical decisions based on research findings will benefit specific sub-populations of patients. The more information about individuals that is available from a single source, the more likely it is that those individuals can be identified and receive affordable, evidence-based treatment.


At the same time, EHRs often contain sensitive information that individuals or groups wish to remain confidential. Informational risks are often mentioned as the most likely harm to result from research with EHRs. To maintain confidentiality, EHR information is often deidentified by hiding or removing certain identifying characteristics. These include the identifiers covered by the Standards for Privacy of Individually Identifiable Health Information (HIPAA identifiers) and may also include genomic or cultural information. The identifiers that are hidden to protect individual privacy may depend on the research protocol. Protocols may need to be tailored to limit the generation of individual identifiers.
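For readers who work with these databases directly, the idea of hiding or removing identifying characteristics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`name`, `mrn`, `birth_date`, and so on) and the `deidentify` helper are hypothetical, the list is not a complete set of HIPAA identifiers, and the sketch is not a compliance tool. The `extra_fields` argument stands in for protocol-specific choices, such as also hiding genomic information.

```python
# Hypothetical field names, for illustration only. This is NOT a complete
# list of HIPAA identifiers and NOT a substitute for a compliance review.
DIRECT_IDENTIFIERS = {"name", "street_address", "phone", "email", "ssn", "mrn"}


def deidentify(record, extra_fields=frozenset()):
    """Return a copy of `record` with identifying fields removed.

    `extra_fields` lets a research protocol hide additional fields
    (for example, genomic or cultural information).
    """
    to_remove = DIRECT_IDENTIFIERS | set(extra_fields)
    # Build a new dict so the original record is left untouched.
    cleaned = {key: value for key, value in record.items() if key not in to_remove}
    # Generalize an ISO-formatted birth date (YYYY-MM-DD) to the year only.
    if "birth_date" in cleaned:
        cleaned["birth_year"] = cleaned.pop("birth_date")[:4]
    return cleaned


if __name__ == "__main__":
    example = {
        "name": "Jane Doe",
        "mrn": "000123",
        "birth_date": "1970-05-17",
        "diagnosis": "asthma",
        "genotype": "hypothetical-variant",
    }
    # A protocol that also hides genomic information:
    print(deidentify(example, extra_fields={"genotype"}))
```

Even after a step like this, the remaining fields can sometimes be combined to re-identify a person, which is one reason the set of hidden identifiers may need to vary by protocol.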


Use of these databases can sometimes lead to confusion as to whether the information is being gathered for treatment or for research. Examining medical records to make a decision about clinical treatment would likely not be considered research; however, examining EHR data to identify generalizable information would meet the regulatory definition of research. Historically, confusion between research and treatment led to inadequacies in informed consent, and the trust of some populations in research and researchers was damaged. A great deal of work on the part of the government, research organizations, and researchers is focused on restoring trust in research. For example, Section A of the Belmont Report describes the differences between research and treatment. RWD can be used for either purpose, and clinical researchers need to determine how the data will be used before they access these databases.


The “gold standard” for evidence-based medicine has been the randomized clinical trial (RCT). The availability of RWD has led to debates as to whether research with RWD is superior to RCTs. RWD can be used for observational studies to generate new treatment approaches, as well as large simple trials and pragmatic clinical trials. In terms of design features, RCTs can use masking and controlled conditions to minimize confounding factors and bias while maximizing differences in efficacy and safety among interventions. Highly controlled settings provide confidence in RCT data. On the other hand, Real World Evidence can be generated from a broad range of patients during routine clinical practice and is increasingly important in determining effectiveness outside of the tightly controlled conditions of RCTs. RWD provides additional important information that may reflect a larger group of patients and may help inform decisions for care.


While RCTs and research with RWD each have benefits, each also has limitations. RCTs often do not reflect real-world populations: there is limited generalizability when study designs control variability in ways that are not representative of real-world care and outcomes. That said, EHR-based RWD research can pose difficulties in controlling for biases, and confounding factors may affect study outcomes. The quality of data can be low, and missing fields can bias findings.


In terms of human research protections, in RCTs, protections against physical and informational risks to individuals are proactively considered, and informed consent can be more specific. In RWD research, accumulation of large-scale patient data and clinical trials using such data raise issues around protection of private information. Concerns about privacy and confidentiality are raised often in publications that discuss the use of RWD in research.


In summary, RCTs and RWD research each have advantages and limitations that make them complementary. Much research with EHR data is hypothesis-generating; that is, a pattern identified from the data may lead to the development of a clinical trial. Both RCTs and RWD research provide important information to further public health goals, and the same regulatory requirements for protections for research participants often apply. Protections against research risks are essential to maintain public trust. Given that the salient concern about RWD research revolves around privacy, patients might feel more comfortable knowing that their data will be used for research if they are informed that their data could be used to design a research study and that their consent will be sought before they participate in clinical research.


1 In 2018, the US Food and Drug Administration (FDA) published its Framework for FDA’s Real-World Evidence Program, which provides standards and definitions that are used for both FDA-regulated and non-regulated activities. This document uses the FDA’s definitions for Real World Data and Real World Evidence: “Real-World Data (RWD) are data relating to patient health status and/or the delivery of health care routinely collected from a variety of sources” and “Real-World Evidence (RWE) is the clinical evidence about the usage and potential benefits or risks of a medical product derived from analysis of RWD.”


About the Author: Valery Gordon, PhD, MPH has more than three decades of experience in science, clinical research, and public health policy. Dr. Gordon has been a researcher, Biosafety Officer, Research Integrity Liaison Officer, Director of the Clinical Research Policy Program at the National Institutes of Health (NIH) and served as the Inclusion Policy Officer responsible for the inclusion of women, minorities, and children in the Office of the NIH Director, at the National Institute of Biomedical Imaging and Bioengineering (NIBIB, NIH) and the National Center for Advancing Translational Sciences (NCATS, NIH). Dr. Gordon has experience in both leading and participating on projects to develop policies and guidance at NIH, the Department of Health and Human Services, and Research and Health Policy Centers. Currently, Dr. Gordon acts as a private consultant on public health policy.

Dr. Gordon’s expertise has made her a valued specialist on ethics, and regulatory and policy issues around human subjects research across the NIH and among the HHS agencies and offices. She excels in clarifying complex concepts to make them understandable to professional and lay audiences, both orally and in writing.

Dr. Gordon received her Ph.D. in pharmacology from the University of Virginia and her M.P.H. in bioethics and health policy from the Johns Hopkins School of Hygiene and Public Health.
