Author: Dr. Ann Hardy and Dr. Sherry Mills
Date: December 5, 2022


This blog continues Learn eCORE’s discussion with Dr. Martin Mendoza, Director of Health Equity for the National Institutes of Health’s All of Us Program, and focuses on the data generated by the All of Us Program and efforts to encourage its use by diverse investigators. See Part I of this blog series to learn about the All of Us Program’s ambitious goal to enroll persons from a variety of groups that are historically underrepresented in biomedical research. Learn eCORE is proud to be a member of the All of Us Community Advocate Network and to support the All of Us Program’s goals, particularly those related to diversity, equity, and inclusion.


Dr. Mendoza, in addition to efforts to enroll diverse participants, you are also very involved in ensuring the robust but appropriate use of the data collected by All of Us.


What is unique about the All of Us data? Size and diversity are two features that set our data apart. Most cohort studies with genetic data are smaller and less diverse, with over 90% of cohort members being of European ancestry. The All of Us cohort is 50% non-white, and when we consider the other population groups that make up our diversity, equity, and inclusion (DEI) vision, 80% of the All of Us cohort are from underrepresented groups.


We also collect many socio-demographic measures, such as country of origin, gender identity, education, home ownership, income, and employment, that can enrich the analysis of our genetic and health-related information.


That said, we view ourselves as collaborative with, not competitive with, other research initiatives. Each dataset supports research in important ways. Some of the things that set our program apart, though, are the diversity of our participant cohort and our approach to broad data accessibility.


What data are available for analysis? The All of Us data are organized into tiers:

  • The Public Tier has basic data about our cohort. This is useful to anyone interested in knowing more about All of Us, and researchers can also use it to determine whether our data are appropriate for answering their research questions.
  • The Registered Tier enables access to a curated data set of anonymized individual-level data from surveys, health records, wearables, and physical measurements. Users whose institutions have a data use agreement with All of Us can register to use data in this tier.
  • The Controlled Tier has more granular data in terms of demographics and genetics. Researchers must sign an individual Data Use Agreement and complete other requirements before they can use these data.


How is the All of Us Program ensuring broad use of its data? In addition to enrolling diverse participants, we are also committed to fostering diversity among the researchers using All of Us data. To this end, we host a Minority Student Research Symposium to invite students from diverse backgrounds to conduct research with All of Us data and present their findings via virtual presentations. We have also conducted specific outreach with minority serving institutions and have data use and registration agreements in place with more than 65 of these institutions. The Voices of All of Us webpage includes stories of some of the diverse researchers who have used All of Us data.


Does All of Us promote or encourage methodological research? All of Us views itself as a data resource, so it does not specifically drive the research topics. We want the data to be used for all types of research, which could include methodological research. We are trying to make the data as widely accessible as possible. For example, we are considering ways to make the data more readily usable with a larger variety of statistical software packages.


In our previous blog, we discussed the impact that the COVID pandemic had on recruitment activities for All of Us. Does All of Us also have data related to COVID? All of Us conducted the COVID-19 Participant Experience (COPE) Survey to gather data on COVID infection in participants and on COVID-related stress, employment, coping, and other psycho-social impacts. We also did another survey on COVID vaccination status. One interesting example of the utility of the All of Us data is an analysis of 24,000 All of Us blood samples, which found evidence that the COVID-19 virus was present in five U.S. states earlier than initially reported. This demonstrates the ability of All of Us data to readily contribute knowledge about a new health problem.


Has the All of Us mission shifted at all because of COVID? Our fundamental mission and goals remain the same – to create an important health data resource by recruiting a cohort of one million or more participants from diverse groups and backgrounds. We want our data to have broad utility and are pleased that we are able to quickly pivot to provide insight into new health problems like COVID.


Going forward, what can we expect in terms of data sharing and analysis of the rich data resulting from All of Us? We plan to continue our efforts to ensure that diverse researchers gain access to the All of Us datasets, including researchers of varying races and ethnicities, career stages, and institutions, among them minority serving institutions and those that are less research intensive. We are also looking at ways to encourage the use of our data by international researchers, for example, to compare our data to cohort data from other countries, like the UK Biobank. Click here to learn more about the data available from All of Us.


In closing, I want to thank Learn eCORE for sharing information about the All of Us Research Program data. I hope this will encourage your readers to develop innovative research projects using our data!


Click here to learn more about the All of Us Program’s data and opportunities for researchers.
