Abstract
Background: The Observational Health Data Sciences and Informatics (OHDSI) network enables access to billions of deidentified, standardized health records and built-in analytics software for observational health research, with numerous potential applications to dermatology. While the use of OHDSI has increased steadily over the past several years, review of the literature reveals few studies utilizing OHDSI in dermatology. To our knowledge, the University of Colorado School of Medicine is unique in its use of OHDSI for dermatology big data research. Summary: A PubMed search was conducted in August 2020, followed by a literature review, with 24 of the 72 screened articles selected for inclusion. In this review, we discuss the ways OHDSI has been used to compile and analyze data, improve prediction and estimation capabilities, and inform treatment guidelines across specialties. We also discuss the potential for OHDSI in dermatology, specifically the ways that it could reveal adherence to available guidelines, establish standardized protocols, and ensure health equity. Key Messages: OHDSI has demonstrated broad utility in medicine. Adoption of OHDSI by the field of dermatology would facilitate big data research, allow for examination of current prescribing and treatment patterns for conditions that lack clear best practice guidelines, improve the dermatologic knowledge base and, by extension, improve patient outcomes.
Introduction
The Observational Health Data Sciences and Informatics (OHDSI) collaborative is an international volunteer network of researchers, coordinated from a central hub at Columbia University. OHDSI is the successor of the Observational Medical Outcomes Partnership (OMOP), a public-private partnership between the FDA, pharmaceutical companies, and healthcare providers [1]. One result of OMOP is the Common Data Model (CDM), which establishes conventions for transforming observational health data, including insurance claims, electronic health records (EHR), and hospital billing data, into a standardized format (Fig. 1). The CDM supports large-scale analytics across the many participating data partners, allowing access to billions of deidentified health records for observational health research [2]. Another fundamental tool developed from OMOP is the Standard Vocabulary, which enables interoperability between systems, facilitates homogeneity and transparency of data across diverse observational databases, and supports the OHDSI network in conducting effective, high-quality research. The latest version of the Standard Vocabulary is available for download from the Athena vocabularies repository site (https://athena.ohdsi.org).
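To make the idea of standardization concrete, the following is a minimal sketch of how diagnosis codes from different source coding systems might be translated into one standard concept and one common row layout. It is an illustration only: the lookup table, the concept IDs, and the `to_cdm_row` helper are hypothetical and are not taken from the Athena vocabularies or from any OHDSI software; only the column names follow the CDM's condition table.

```python
from dataclasses import dataclass

# Hypothetical lookup: (source vocabulary, source code) -> standard concept ID.
# The IDs are placeholders for illustration, not real Athena identifiers.
SOURCE_TO_STANDARD = {
    ("ICD10CM", "L40.0"): 900001,  # psoriasis vulgaris (placeholder ID)
    ("ICD9CM", "696.1"): 900001,   # psoriasis in the older coding system
    ("ICD10CM", "L20.9"): 900002,  # atopic dermatitis, unspecified (placeholder ID)
}


@dataclass
class SourceRecord:
    person_id: int
    vocabulary: str
    code: str
    start_date: str


def to_cdm_row(record):
    """Translate one source diagnosis into a CDM-style row, or None if unmapped."""
    concept_id = SOURCE_TO_STANDARD.get((record.vocabulary, record.code))
    if concept_id is None:
        return None
    return {
        "person_id": record.person_id,
        "condition_concept_id": concept_id,
        "condition_start_date": record.start_date,
        "condition_source_value": record.code,  # original code kept for traceability
    }


records = [
    SourceRecord(1, "ICD10CM", "L40.0", "2019-03-01"),
    SourceRecord(2, "ICD9CM", "696.1", "2012-07-15"),
    SourceRecord(3, "ICD10CM", "Z99.9", "2020-01-10"),  # no entry in the toy table
]
cdm_rows = [row for row in (to_cdm_row(r) for r in records) if row is not None]
```

In this toy example, records coded in two different systems end up under the same concept ID, which is what allows a single analysis to run unchanged across databases that were originally coded differently.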
The Common Data Model (image from Observational Health Data Sciences and Informatics, 2020). Available from: https://www.ohdsi.org/data-standardization/the-common-data-model/.
OHDSI also provides access to analytic and data quality tools for working with relevant data. Studies can be conducted on OMOP CDM data by writing custom code, or by using built-in software such as ATLAS, ACHILLES (Automated Characterization of Health Information at Large-Scale Longitudinal Evidence Systems) and its successor data quality software, the Data Quality Dashboard. For data analysis, ATLAS is a web-based component of OHDSI used to design and execute observational analyses of OMOP CDM data. Within ATLAS, users create specifications and codes to identify cohorts, outcome variables, and other analytic variables of interest, and can search and browse the vocabulary. For data quality, the Data Quality Dashboard precomputes summary statistics, describes datasets, and creates reports that visualize those statistics. Overall, it characterizes a database and provides data quality assessment.
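For readers who write custom code rather than use ATLAS, the following minimal sketch shows what identifying a cohort and an outcome on CDM-shaped data can look like. It is an assumption-laden illustration and does not reproduce ATLAS logic: the rows are invented, the concept IDs are placeholders, and the `first_dates` helper is hypothetical. Only the table and column names follow the OMOP CDM.

```python
from datetime import date

# Placeholder concept IDs for illustration only.
PSORIASIS = 900001  # hypothetical condition concept
BIOLOGIC = 900101   # hypothetical drug concept

# Invented rows shaped like two OMOP CDM tables.
condition_occurrence = [
    {"person_id": 1, "condition_concept_id": PSORIASIS, "condition_start_date": date(2018, 5, 2)},
    {"person_id": 2, "condition_concept_id": PSORIASIS, "condition_start_date": date(2019, 1, 20)},
    {"person_id": 3, "condition_concept_id": 900002, "condition_start_date": date(2019, 6, 1)},
]
drug_exposure = [
    {"person_id": 1, "drug_concept_id": BIOLOGIC, "drug_exposure_start_date": date(2018, 9, 1)},
    {"person_id": 3, "drug_concept_id": BIOLOGIC, "drug_exposure_start_date": date(2019, 7, 1)},
]


def first_dates(rows, concept_field, concept_id, date_field):
    """Return the earliest qualifying date per person for one concept."""
    earliest = {}
    for row in rows:
        if row[concept_field] != concept_id:
            continue
        pid, when = row["person_id"], row[date_field]
        if pid not in earliest or when < earliest[pid]:
            earliest[pid] = when
    return earliest


diagnosed = first_dates(condition_occurrence, "condition_concept_id",
                        PSORIASIS, "condition_start_date")
treated = first_dates(drug_exposure, "drug_concept_id",
                      BIOLOGIC, "drug_exposure_start_date")

# Cohort: everyone with the diagnosis.
# Outcome: biologic started on or after the first diagnosis date.
cohort = sorted(diagnosed)
outcome = sorted(pid for pid in cohort
                 if pid in treated and treated[pid] >= diagnosed[pid])
```

Because the definition refers only to standard concepts and CDM columns, the same logic could in principle be run at any site whose data follow the model.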
The OHDSI community is open to all. Participation is as easy as joining a standing OHDSI community call, participating in a work group, joining a research network by converting data to the OMOP CDM, or leading a study across the network, and researchers do not need to have an affiliation with data partners to propose research studies (Fig. 2). Additionally, OHDSI maintains a commitment to reproducible research and makes code and protocols available to the public on GitHub, a code hosting platform.
OHDSI: Join the Journey (image from Observational Health Data Sciences and Informatics, 2020). Available from: https://www.ohdsi.org/join-the-journey.
As an open community, OHDSI is essentially a collaboration of volunteers; in this regard, anyone can be an OHDSI member. Once a member, anyone can lead or participate in a study, whether intellectually, as a contributor to the study design, analysis, or manuscript development, as a data contributor, or both. Study leads are responsible for ensuring that the data methods and distributed analytics conform to OHDSI standards. The protocol and analytic code are distributed to community members, who volunteer to run the analysis on their data and provide the aggregate results to the lead site. In this sense, there is no formal governance or prioritization of the studies conducted by partners; what is accomplished is determined by the follow-through of the lead site and, of course, the appropriateness of the available data to the research question.
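The distributed workflow described above can be sketched as follows. This is a simplified illustration under stated assumptions, not OHDSI's actual study packaging: the functions `run_locally` and `pool`, the site names, and the patient data are all hypothetical. The point it shows is that each partner runs the same code on its own data and returns only aggregate counts.

```python
def run_locally(patients):
    """Runs at a data partner; returns aggregate counts only, never patient rows."""
    exposed = [p for p in patients if p["exposed"]]
    return {
        "n_exposed": len(exposed),
        "n_events": sum(1 for p in exposed if p["event"]),
    }


def pool(site_results):
    """Runs at the lead site on the aggregates returned by each partner."""
    n_exposed = sum(r["n_exposed"] for r in site_results)
    n_events = sum(r["n_events"] for r in site_results)
    risk = n_events / n_exposed if n_exposed else None
    return {"n_exposed": n_exposed, "n_events": n_events, "risk": risk}


# Invented patient-level data held separately by two hypothetical sites.
site_a = [
    {"exposed": True, "event": True},
    {"exposed": True, "event": False},
    {"exposed": False, "event": False},
]
site_b = [
    {"exposed": True, "event": False},
    {"exposed": True, "event": False},
]

pooled = pool([run_locally(site_a), run_locally(site_b)])
```

Simple pooling of counts is used here only for brevity; a real network study would typically use a prespecified method for combining site-level estimates.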
Research proposal presentations and participant recruitment are generally conducted via weekly OHDSI community meetings and by way of an OHDSI Microsoft Teams site. OHDSI partners frequently participate in other funded research, such as research funded by the FDA, using the community tools, but these collaborations are not governed by the community, and participants are free to leverage the benefits of the OHDSI collaborative for other endeavors.
To our knowledge, the University of Colorado School of Medicine is unique in its use of OHDSI to support large-scale, observational dermatology research for the investigation of rare conditions, therapies, and outcomes. Within the field of dermatology, there is an urgent need to establish standardized treatment guidelines for a variety of acute and chronic dermatologic conditions, to better understand adherence to presently available guidelines, and to further elucidate the effect of patient demographics and population-related variables on treatment modalities and health outcomes.
To this end, we performed a review of the available literature citing OHDSI; upon review, there appears to be a relative dearth of studies utilizing OHDSI in dermatology. Here, we explore the merits of OHDSI and discuss its application as described in the literature.
Review Process
The PubMed database was used for publication review; a simple search for the term “OHDSI” was performed on August 3, 2020, returning a total of 72 articles published since 2014 (Fig. 3). These were screened by the principal author and assessed for content and relevance (Table 1). A total of 24 articles of clinical import were selected for inclusion and are discussed below.
Over the years, randomized controlled trials (RCTs) have assumed prominence in the scientific community and been widely accepted as presenting the highest-quality evidence among research methods. However, the “gold standard” status of RCTs has more recently come into question, with several publications highlighting their inherent flaws, along with the misconception that observational studies exaggerate exposure-outcome associations [3-5]. A review of meta-analyses in The New England Journal of Medicine that examined both RCTs and observational studies (case-control or cohort) found the summary results of observational studies and RCTs to be extremely similar, and noted that observational studies did not overestimate the extent of exposure-outcome associations when compared to the results of RCTs on the same topics [5]. Rapid medical innovation often outpaces the slow process of clinical trials’ development and execution, and while the internal validity of RCTs is strong, external validity remains problematic [4]. The strongest evidence may be obtained with the complementary use of observational studies, which enable the assessment of both generalizability to populations of interest and the applicability of study results to individuals [3, 4]. The ease and speed of conducting observational studies have grown dramatically with the advent of large-scale databases and online research networks such as OHDSI.
The potential applications of OHDSI are many. Its large scale provides increased power and reproducibility for observational research and helps characterize the generalizability of clinical trials to real-world populations [6-8]. The large sample sizes provided by OHDSI increase the accuracy of estimations and predictions, facilitate the study of rare exposures, diseases, and outcomes, and are ideal for investigations across diverse populations [9, 10]. The following illustrative studies utilize OHDSI in a broad range of applications, including the reporting of adverse events, the estimation of heritability, adherence to treatment guidelines, and the characterization of prescribing patterns.
Yu et al. [11] converted spontaneous reporting system (SRS) data from the FDA’s Adverse Event Reporting System (FAERS) into the OMOP CDM, creating and assessing a platform, “ADEpedia-on-OHDSI,” that combines SRS and EHR data to improve pharmacovigilance. In a 2020 follow-up study, they compared the detection of adverse event signals between FAERS and EHR data for a specific adverse drug event (ipilimumab-induced hypopituitarism) [12]. Signal detection in EHR data occurred 4 months earlier than in FAERS, and the authors concluded that an integrated CDM-based approach to the detection of adverse events would enhance drug safety monitoring. Li et al. [13] likewise advocate integrating SRS data with large-scale, standardized healthcare databases, noting that this can speed up the detection of adverse events.
Chandler [14] reached a similar conclusion in a 2020 study on the association of nintedanib with ischemic colitis, in which safety signaling from SRS and observational health data was assessed. This research suggested that the use of real-world data from the OHDSI complements the spontaneous reporting of adverse drug reactions and has the potential to increase the efficiency of signal management [14].
A 2019 study by Jiang et al. [15] represents one possible extension of the OHDSI CDM to medical device safety, in the form of a unique device identifier (UDI) research database. This pilot study developed a UDI vocabulary for medical devices and proposed a framework for building a UDI research database in the CDM format.
Hripcsak et al. [2] discussed the utility of OHDSI in yielding prognostic information useful for comparing treatment options and assessing these options at the patient level. This concept was adopted by Duke et al. [16] in a comparative study of antiepileptic agents spanning 10 databases and 2 countries, in which the risk of angioedema in new users exposed to levetiracetam was compared with the risk in those exposed to phenytoin. Notably, they found the risk associated with levetiracetam to be equal to or lower than that associated with phenytoin (a drug that, unlike levetiracetam, does not carry a warning label for angioedema).
In a 2016 study by Hripcsak et al. [17], 11 data sources from 4 countries (encompassing 250 million patients) were mapped to the OHDSI CDM standard to investigate treatment pathways for type II diabetes mellitus, hypertension, and depression. This study revealed a trend toward more consistent treatment across diseases and locations, although the authors did note some heterogeneity in treatment pathways and prescribing patterns. Zhang et al. [18] conducted a similar study in 2018, converting national data from China to the OMOP CDM to examine treatment pathways for hypertension, type II diabetes mellitus, and depression, compare these to recommended guidelines, uncover medication differences in clinical practice and, ultimately, urge the standardization of such guidelines.
As mentioned in the work of Hripcsak et al. [2], discussed previously, OHDSI has demonstrated usefulness in obtaining prognostic information. Polubriaginof et al. [19] employed next-of-kin information from EHR standardized to OHDSI’s CDM to identify familial relationships, computing heritability estimates for 500 disease phenotypes and yielding knowledge useful for estimating disease risk, identifying risk factors, and tailoring the care and treatment of patients. Wang et al. [20] used OHDSI to develop a logistic regression prediction model able to assess a patient’s risk of hemorrhagic transformation within 30 days of an ischemic stroke, thereby showing OMOP CDM to be a valuable tool for the development and validation of prognostic prediction models.
Choi et al. [21] utilized data converted to the OHDSI CDM to analyze how age at diagnosis of inflammatory bowel disease (IBD) affected prognosis in Korea. This study verified the use of the CDM in gastrointestinal research and found that early-onset IBD portends a worse prognosis than late-onset IBD does. The authors defined “worsened disease prognosis” as the initiation of a biologic agent (infliximab or adalimumab), reporting odds ratios for initiation of a biologic agent in early- versus late-onset disease of 2.3 and 5.4 in patients with ulcerative colitis and Crohn’s disease, respectively.
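As a reminder of how an odds ratio of this kind is computed, the following is a minimal worked sketch. The counts are invented for illustration and are not data from Choi et al. [21]; the function names are hypothetical, and the confidence interval uses the standard log-scale (Woolf) approximation rather than any method attributed to that study.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b: outcome yes/no in group 1; c, d: outcome yes/no in group 2.
    """
    return (a * d) / (b * c)


def wald_ci(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval computed on the log-odds scale."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Invented counts: 30 of 100 early-onset and 15 of 100 late-onset
# patients start a biologic agent.
early_yes, early_no = 30, 70
late_yes, late_no = 15, 85

or_value = odds_ratio(early_yes, early_no, late_yes, late_no)
low, high = wald_ci(early_yes, early_no, late_yes, late_no)
```

With these invented counts the odds ratio is (30 × 85) / (70 × 15), or roughly 2.4, meaning the odds of starting a biologic are about 2.4 times higher in the early-onset group than in the late-onset group.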
OHDSI has proven to be a complement to tools used in clinical trials research. Meystre et al. [22] used the OHDSI’s open-source data analytics platform (ATLAS) and natural language processing (NLP) to find patients with breast cancer who were eligible for specific clinical trials. NLP matched patients with clinical trials with an area under the curve (AUC) of up to 89.8%, illustrating its potential to decrease the demands on human workers who screen for clinical trials.
Bartlett et al. [8] examined the use of OHDSI as it relates to clinical trials, performing a study aimed at determining whether clinical trials could be replicated by using only observational data obtained from EHR or insurance claims. The study found that only 15% of US-based clinical trials published in 2017 could be entirely replicated via observational data alone. However, many additional clinical trial end points, interventions, indications, and inclusion/exclusion criteria could be identified from observational data, suggesting that the use of real-world data is complementary to clinical trials, particularly for comparing the generalizability of trial populations to real-world populations of interest. In their 2016 study, Sen et al. [7] also found OHDSI to be a useful tool for assessing the generalizability of clinical trials.
The advantages of data standardization and the development of large-scale national and international databases are increasingly being recognized and addressed at diverse sites around the globe. The literature includes articles from a range of locales, including Germany, Korea, Brazil, and Saudi Arabia, describing the experience of standardizing existing data to the OMOP CDM to facilitate the interinstitutional sharing of information [23-27]. Boyce et al. [28] recognized the benefit of using OHDSI’s data characterization tools for longitudinal research on topics important to skilled nursing facilities (SNFs), e.g., patient-level falls prediction, and therefore converted EHR from 5 SNFs to the OMOP CDM.
Conclusion
For clinicians and researchers alike, the OHDSI network presents a means to enhance knowledge and improve patient care. For dermatology, specifically, OHDSI has the potential to assess prescribing patterns and treatment pathways for diseases that presently lack clear best practice guidelines. The rapid expansion of available biologic agents for the management of dermatologic conditions necessitates deliberate efforts to support pharmacovigilance, explore variations in the prescribing of novel and costly pharmaceuticals according to demographic and socioeconomic factors and, ultimately, to ensure health equity and inform clinical practice guidelines. Predictive studies utilizing biologic prescription data, such as that of Choi et al. [21], could be adapted to immune-mediated skin diseases and enable more individualized patient care. While this paper provides only a brief introduction to the OHDSI and its diverse applications, it is clear that, in this increasingly globalized and digitized world, the OHDSI’s practicality and value will likewise continue to grow.
Key Message
The OHDSI network has demonstrable merit; its use in dermatology has the potential to advance observational research.
Appendix
Acknowledgements
The authors wish to acknowledge John Barbieri, David Margolis, and Jonathan Silverberg for their ongoing collaboration and review of this manuscript.
Statement of Ethics
This study complies with internationally accepted standards for research practice. This study did not involve human subjects, so no institutional Ethics Committee approval was required.
Conflict of Interest Statement
The authors have no conflicts of interest to declare.
Funding Sources
T.E.S. receives funding from the 58858477 Pfizer Global Medical Grant – Dermatology Fellowship 2020 (principal investigator: R.D.). The authors have no other funding or financial interests to declare.
Author Contributions
T.E.S. contributed to study design, data acquisition and interpretation, manuscript drafting and revising, and design and formulation of tables. T.R. contributed to data acquisition, manuscript drafting, and designing of the figures. L.M.S. contributed to review and revision of the manuscript and OHDSI expertise. M.B. contributed to review of the manuscript and OHDSI expertise. R.D. contributed to study design, manuscript revision and review, and approval of the final version.