Abstract
Introduction: Documentation as well as IT-based management of medical data is of ever-increasing relevance in modern medicine. As radiation oncology is a rather technical, data-driven discipline, standardization, and data exchange are in principle possible. We examined electronic healthcare documents to extract structured information. Planning CT order entry documents were chosen for the analysis, as this covers a common and structured step in radiation oncology, for which standardized documentation may be achieved. The aim was to examine the extent to which relevant information may be exchanged among different institutions.
Materials and Methods: We contacted representatives of nine radiation oncology departments. Departments using standardized electronic documentation for planning CT were asked to provide templates of their records, which were analyzed in terms of form and content. Structured information was extracted by identifying definite common data elements, containing explicit information. Relevant common data elements were identified and classified. A quantitative analysis was performed to evaluate the possibility of data exchange.
Results: We received data of seven documents that were heterogeneous regarding form and content. 181 definite common data elements considered relevant for the planning CT were identified and assorted into five semantic groups. 139 data elements (76.8%) were present in only one document. The other 42 data elements were present in two to six documents, while none was shared among all seven documents.
Conclusion: Structured and interoperable documentation of medical information can be achieved using common data elements. Our analysis showed that a lot of information recorded with healthcare documents can be presented with this approach. Yet, in the analyzed cohort of planning CT order entries, only a few common data elements were shared among the majority of documents. A common vocabulary and consensus upon relevant information is required to promote interoperability and standardization.
Introduction
Data and Electronic Healthcare Records in Radiation Oncology
Data exchange of electronic healthcare records (EHRs) is a topic of rising relevance in modern medicine. As a result, efforts are being made to promote semantic interoperability, e.g., initiatives of the Health Level Seven International (HL7) organization, which continues to develop standards like the Fast Healthcare Interoperability Resources (FHIR) [1]. Further approaches include the Integrating the Healthcare Enterprise (IHE) initiative and the IHE-RO initiative (specific to radiation oncology [RO]) [2]. Nevertheless, data exchange in healthcare (electronic and analog) remains challenging because of problems such as missing standardization [3, 4], commercial interests of health IT developers [5], and unclear, irrelevant, or incomplete data [6].
RO is a digitalized medical discipline. In short, the patient pathway in radiation therapy may be described in steps from the first diagnosis and treatment indication through treatment planning, treatment delivery, and treatment outcome evaluation [7]. Due to its technical nature and the need for communication between interdisciplinary professional groups during treatment, certain steps of the treatment course are characterized by a high level of automatization and computer data processing.
One essential step in the patient pathway is the treatment planning computed tomography (TPCT), which is crucial for contemporary image-guided RO [8]. The planning scan is acquired while taking into consideration treatment site, treatment modality, and individual patient characteristics (e.g., pain or anatomical considerations). Individualization of the TPCT may, e.g., involve positioning and immobilization devices (like masks or boards), control of patient-dependent factors (like bladder filling or breathing instructions), or technical details (like CT slice thickness) [9]. Specifications for an individual situation are usually communicated by physician order entry (POE) documents. As most RO treatments rely on the TPCT as an essential step of treatment planning, which is performed similarly among various facilities, a standardized way to communicate instructions could in principle be achieved. Nevertheless, to our knowledge, there is currently no standardized TPCT order entry available that could be used on a larger scale among different institutions to promote standardization and enable interoperable data exchange.
Benefits of Standardization
Standardization of medical vocabularies and the utilization of structured reports have been shown to be beneficial for clinical practice and secondary purposes, such as clinical research or public health studies [10, 11]. In addition, a standardized approach to clinical documentation benefits clinicians, secondary data use, and patient safety [12, 13]. In general, efforts for standardization in medicine regarding documentation, decision-making, and treatment procedures have been shown to improve clinical safety, quality, and efficiency [14‒16]. Integration of standardized information flow results in more accurate and detailed clinical information and more secure communication between care teams [17]. Furthermore, standardization promotes interoperability in healthcare data management as it provides a common language for communication and exchange of information [18].
Extraction of Interoperable Data from Healthcare Documents
Lacking standardization and impaired data exchange between computerized systems in RO remain unresolved issues, which is why the American Society for Radiation Oncology (ASTRO) published a consensus paper on a list of ten minimum data elements to facilitate some basic data exchange in RO [19]. The issue was also addressed at the meeting of the International Society for Radiation Oncology Informatics (ISROI) in 2022 [20, 21]. As a result, the ISROI initiated a project attempting to identify common data elements (CDEs) [22] in prescription documents. Current POEs for TPCTs were chosen as a case scenario, as they cover a common and crucial step in basic RO treatment, for which standardization of communication may be achieved.
In this work, we evaluate how the POEs used to order TPCTs in Swiss radiation therapy centers can be compared. Furthermore, to enable interoperable data exchange, we introduce the concept of analyzing healthcare documents by identifying “definite CDEs.” We used the definition of CDEs provided by the National Institutes of Health (NIH): “A common data element (CDE) is a standardized, precisely defined question, paired with a set of allowable responses, used systematically across different sites, studies, or clinical trials to ensure consistent data collection” [11, 22].
The purpose of the study was to investigate whether a general structure and common semantics can be found in the documentation of different RO departments. This information would help to understand which challenges need to be overcome when a common data framework is to be created. It may present a starting point for future approaches to data standardization and interoperability.
Materials and Methods
Collection of POE Documents
Two authors (F.D. and N.C.) contacted nine representatives of Swiss-based RO departments. The contacted centers were chosen based on the authors’ preference without any formal selection methodology, but with the intent to include some centers with a specific focus within radiotherapy, such as brachytherapy, hyperthermia, or proton therapy. The first contact was made by email. We asked whether they used a standardized TPCT-POE and whether they were willing to share more details. Representatives who answered positively were asked for the following information:
Does the department use a standardized TPCT-POE?
Describe narratively how the TPCT-POE is implemented within the department.
Participants using a paper-based POE were asked to send a scanned form, while those who used a word processing or PDF POE were asked to deliver the original template. Finally, participants with a custom-developed POE for the TPCT implemented as a desktop or web application were asked to provide screenshots of the graphical user interface, including the content of all individual data elements (e.g., the entries of dropdown lists).
As POEs for the TPCTs were heterogeneous in form and content, we developed a normalization methodology for the comparison process. The goal was to enable quantitative and qualitative analysis of the POE for the TPCT structure and semantics. The documents were evaluated for the presence of CDEs.
Definite CDEs in Healthcare Documents
The presence or absence of a CDE within a document was determined based on whether the document can be used to answer the related question of a CDE. It thereby does not matter whether the information is presented textually or graphically.
After a relevant question (and thereby a CDE) had been identified in a document, we checked whether the same question was also addressed in the other documents of interest. Only questions that can be answered unambiguously were counted as “definite CDEs”. This is important to guarantee the acquisition of definite and therefore semantically and structurally interoperable data.
A CDE may have one of the following data types: “Time”, “Number”, “Date”, “Value List”, or “Text” (the NIH also uses the data types “File” and “Externally Defined”, which are not relevant for our purpose). CDEs of the types “Time”, “Number”, and “Date” may per se be regarded as “definite”, as the encoded information is unambiguous (assuming a general consensus on these quite basic concepts). Value list CDEs are in general also “definite”, as they have a clearly defined list of permissible entry values. For example, the CDE “Cancer Extent” provided by the NIH has the permissible values “Localized”, “Regional”, and “Metastatic” [23]. However, comparing CDEs identified in different POEs can be problematic, since the value lists of two otherwise identical CDEs may differ. For example, POE1 may contain the CDE “Cancer Extent” with the three permissible values provided in the NIH definition, whereas POE2 may only allow the values “Localized” and “Metastatic.” To address this issue, identified CDEs with the “Value List” data type were split into corresponding “Value List” CDEs with binary [YES/NO] values. The “Cancer Extent [Localized/Regional/Metastatic]” CDE would therefore be replaced with “Cancer Extent – Localized [YES/NO]”, “Cancer Extent – Regional [YES/NO]”, and “Cancer Extent – Metastatic [YES/NO].” In this example, POE1 would contain all three of these binary (Boolean) CDEs, while POE2 would contain only two of them.
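The splitting of “Value List” CDEs into binary CDEs can be illustrated with a minimal Python sketch. The class and function names (`ValueListCDE`, `split_into_binary_cdes`) are hypothetical and only serve the illustration; they do not describe a tool used in the study.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class ValueListCDE:
    """A CDE whose answer is chosen from a fixed list of permissible values."""
    question: str
    permissible_values: Tuple[str, ...]


def split_into_binary_cdes(cde: ValueListCDE) -> Set[str]:
    """Return one binary [YES/NO] CDE label per permissible value."""
    return {f"{cde.question} – {value} [YES/NO]" for value in cde.permissible_values}


# Example from the text: the same CDE with different value lists in two POEs.
poe1 = ValueListCDE("Cancer Extent", ("Localized", "Regional", "Metastatic"))
poe2 = ValueListCDE("Cancer Extent", ("Localized", "Metastatic"))

binary_poe1 = split_into_binary_cdes(poe1)
binary_poe2 = split_into_binary_cdes(poe2)

shared = binary_poe1 & binary_poe2        # directly comparable in both POEs
only_in_poe1 = binary_poe1 - binary_poe2  # present in POE1 only

print(sorted(shared))
print(sorted(only_in_poe1))
```

After the split, two POEs can be compared with plain set operations even though their original value lists differ.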
“Text” CDEs were only declared “definite” if the encoded text was supposed to contain a clear value. Figure 1a shows the concept of identifying a “definite CDE” for the question “Should a 4D-CT be acquired? [YES/NO]”. However, as shown in Figure 1b, the text data element “Previous therapy? [WHICH ONE?]” is not definite since the answer to this question may be ambiguous (an entry by a physician may, e.g., be an acronym like “CRT,” which could stand for “cardiac resynchronization therapy,” “chemoradiotherapy”, or “conformal radiotherapy”).
Methodology of identifying common data elements from healthcare documents. A common data element that includes a piece of certain information (answering a specific question) is identified if that information is addressed in the document. The data element may be defined as “definite” if the information that can be inserted (answer) is clearly defined (a). On the other hand, a CDE, where the inserted information is not defined, is “unclear” (b).
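The rules for deciding whether a data element is “definite” can be summarized in a short Python sketch. It is an illustration under assumptions: in the study the decision for “Text” elements was made by the reviewing authors, which is modeled here by a hypothetical flag `expects_clear_value`. Treating “Patient name” as a “Text” element with a clear value is likewise an assumption of the example.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    TIME = "Time"
    NUMBER = "Number"
    DATE = "Date"
    VALUE_LIST = "Value List"
    TEXT = "Text"


@dataclass(frozen=True)
class DataElement:
    question: str
    data_type: DataType
    # Only meaningful for "Text" elements: reviewer judgment on whether the
    # field is supposed to contain one clear value (hypothetical flag).
    expects_clear_value: bool = False


def is_definite(element: DataElement) -> bool:
    """Time, Number, Date, and Value List are definite per se; Text only
    if the field is supposed to contain a clear value."""
    if element.data_type is DataType.TEXT:
        return element.expects_clear_value
    return True


four_d_ct = DataElement("Should a 4D-CT be acquired?", DataType.VALUE_LIST)
previous_therapy = DataElement("Previous therapy?", DataType.TEXT)
patient_name = DataElement("Patient name", DataType.TEXT, expects_clear_value=True)

for element in (four_d_ct, previous_therapy, patient_name):
    label = "definite" if is_definite(element) else "unclear"
    print(f"{element.question}: {label}")
```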
Scanning of TPCT-POEs
All the submitted POEs were scanned for “definite CDEs”. Two authors (F.D. and N.C.) independently scanned the POEs for CDEs and a list of all identified CDEs was created. The CDEs were then grouped according to relevance for the TPCT by consensus of the two authors into “Relevant for the TPCT”, “Not Relevant for TPCT”, or “Unclear Meaning.” Out of these, only the CDEs relevant for the TPCT-POE were further analyzed. The relevant CDEs were classified into five semantic data groups describing different facets relevant to the TPCT.
We then performed a quantitative analysis of the number of definite CDEs identified in each of the submitted TPCT-POEs. We further checked whether each CDE was present in only one of the POEs (unique) or in two or more of the POEs (shared), which could facilitate direct interoperable data exchange.
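The distinction between unique and shared CDEs amounts to counting in how many POEs each CDE occurs. A minimal sketch in Python, using a small hypothetical toy input rather than the study data:

```python
from collections import Counter

# Hypothetical toy input: POE identifier -> set of definite CDEs found in it.
poe_cdes = {
    "POE1": {"Patient ID", "Supine position [YES/NO]", "4D-CT [YES/NO]"},
    "POE2": {"Patient ID", "Supine position [YES/NO]"},
    "POE3": {"Patient ID", "Bladder full [YES/NO]"},
}


def count_presence(poes):
    """Count in how many POEs each CDE occurs."""
    return Counter(cde for cdes in poes.values() for cde in cdes)


presence = count_presence(poe_cdes)

unique = {cde for cde, n in presence.items() if n == 1}              # one POE only
shared = {cde for cde, n in presence.items() if n >= 2}              # two or more POEs
in_all = {cde for cde, n in presence.items() if n == len(poe_cdes)}  # every POE

print("unique:", sorted(unique))
print("shared:", sorted(shared))
print("in all POEs:", sorted(in_all))
```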
Results
Collection of TPCT-POEs
Eight of the nine contacted representatives of Swiss-based RO centers answered our initial email. Seven institutions replied positively and delivered the necessary POE data with accompanying information. The RO centers included institutions with different radiotherapy machines (6 departments conventional Linac, 1 department Tomotherapy, 1 department Cyberknife, 1 department MR Linac) and different treatment modalities (7 departments 3D conformal radiotherapy, 4 departments brachytherapy, 2 departments hyperthermia, 1 department protons).
The following radiation therapy departments submitted their institutional POE for the TPCT for evaluation and further processing:
Institute of Radiation-Oncology, Kantonsspital St. Gallen
Institute of Radiation-Oncology, University of Bern, Inselspital
Institute of Radiation-Oncology, Kantonsspital Aarau
Institute of Radiation-Oncology, Kantonsspital Graubünden
Institute of Radiation-Oncology, Hirslanden Klinik Zurich
Institute of Radiation-Oncology, University Hospital Zurich
Paul Scherrer Institute
Four institutions (57%) integrated Word documents into the Radiation Oncology Clinical Information System (ROCIS). They implemented the TPCT-POE in the Microsoft Office Word 97–2003 binary file format. The ROCIS calls the preinstalled Microsoft Word software and opens the TPCT-POE as a document. One institution uses a built-in data entry form builder of the ROCIS to collect data from the end user. One institution developed an in-house customized TPCT-POE in Microsoft .NET technology, communicating with the ROCIS over an API. Finally, one institution uses its primary hospital information system for capturing TPCT-POE data.
Analysis of POEs
The provided POEs were heterogeneous in terms of structure, data documentation, and semantic content. Information was recorded in the POEs using various items, including checkboxes, open text fields, schematic illustrations, and electronic dropdown menus. In addition, several documents included CDEs that were generally relevant for certain parts of the radiotherapy care path but not for the TPCT itself (e.g., CDE about “dose per fraction”). Therefore, these elements were classified as “Not Relevant for TPCT” and not further included in the analysis. Nine data elements were labeled as “unclear meaning,” as the semantic information was not completely clear to the two authors doing the analysis. Since it appeared obvious that none of these elements was of relevance for the TPCT, their semantic meaning was not further investigated, and they were excluded from further analysis.
181 definite CDEs classified as “Relevant for the TPCT” were identified. The CDEs were classified into the following five semantic data groups that constitute different facets relevant to the TPCT: patient data (29 CDEs, median of 3 per POE), positioning data (72 CDEs, median of 21 per POE), CT data (38 CDEs, median of 6 per POE), brachytherapy (24 CDEs, all from one POE), administrative data (18 CDEs, median of 4 per POE). These groups were furthermore divided into subgroups (Table 1).
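Table 1. Identified groups and subgroups of CDEs in the POEs
Group | Subgroup | Common data elements |
---|---|---|
Patient data | (all subgroups) | 29 |
Patient data | Personal data | 3 |
Patient data | Patient data relevant for contrast medium | 11 |
Patient data | Risks/problems/peculiarities | 15 |
Positioning data | (all subgroups) | 72 |
Positioning data | General positioning data | 13 |
Positioning data | Positioning devices | 36 |
Positioning data | Markers | 10 |
Positioning data | Organ position | 11 |
Positioning data | Others | 2 |
CT data | (all subgroups) | 38 |
CT data | CT modality | 10 |
CT data | CT slice thickness | 3 |
CT data | Contrast medium | 11 |
CT data | CT region | 14 |
Brachytherapy | (all subgroups) | 24 |
Brachytherapy | General brachytherapy data | 13 |
Brachytherapy | Preparation | 11 |
Administrative data | (all subgroups) | 18 |
Administrative data | Appointment | 8 |
Administrative data | Document | 4 |
Administrative data | Physician data | 3 |
Administrative data | Others | 3 |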
All the CDEs for the brachytherapy group were from the POE of one institution. For the other four groups, CDEs were provided in the POEs of all seven participating institutes (Fig. 2).
Number of common data elements per group. Quantitative analysis of definite CDEs of the POEs from the seven participating institutes for the five semantic data groups.
139 of the 181 definite CDEs (76.8%) were “unique,” as they were present in only one of the POEs. The other 42 “shared” CDEs were present in two (18 = 9.9%), three (8 = 4.4%), four (8 = 4.4%), five (7 = 3.9%), or six (1 = 0.6%) of the seven analyzed POEs (Fig. 3). None of the CDEs was found in all the POEs.
A list of all the CDEs present in the majority (at least four) of the POEs is shown in Table 2. A list of all 181 relevant CDEs is provided in the online supplementary material (for all online suppl. material, see https://doi.org/10.1159/000534204). The number of unique and shared CDEs identified per POE ranged from 90 (53 unique and 37 shared elements) to 11 (five unique and six shared elements) across the seven institutions (Fig. 4).
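The reported proportions can be recomputed from the counts given above. The following Python sketch uses only the numbers stated in the text; the variable names are illustrative.

```python
# Number of definite CDEs by the number of POEs in which they appear
# (counts taken from the text).
cdes_by_presence = {1: 139, 2: 18, 3: 8, 4: 8, 5: 7, 6: 1, 7: 0}

total = sum(cdes_by_presence.values())
unique = cdes_by_presence[1]
shared = sum(n for k, n in cdes_by_presence.items() if k >= 2)
# "Majority" means present in at least four of the seven POEs.
majority = sum(n for k, n in cdes_by_presence.items() if k >= 4)

for k, n in cdes_by_presence.items():
    print(f"present in {k} POE(s): {n:3d} CDEs ({100 * n / total:.1f}%)")

print(f"total: {total}, unique: {unique}, shared: {shared}")
print(f"in the majority of POEs: {majority} ({100 * majority / total:.1f}%)")
```

Computed directly from 16 of 181, the share of CDEs present in the majority of POEs is about 8.8%; summing the individually rounded percentages (4.4% + 3.9% + 0.6%) gives 8.9%.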
Table 2. List of definite CDEs present in the majority of POEs
Common data element | Group | Number of POEs (of seven) containing the CDE |
---|---|---|
Patient name | Patient data | 4 |
Patient’s birth date | Patient data | 4 |
Patient ID | Patient data | 5 |
Prone position [YES/NO] | Positioning data | 5 |
Supine position [YES/NO] | Positioning data | 5 |
Arms up [YES/NO] | Positioning data | 4 |
Arms down [YES/NO] | Positioning data | 4 |
Arms on chest/Arms bent [YES/NO] | Positioning data | 4 |
Radiation Mask [YES/NO] | Positioning data | 5 |
Bite block [YES/NO] | Positioning data | 4 |
Dental splint [YES/NO] | Positioning data | 4 |
Bladder full [YES/NO] | Positioning data | 5 |
Bladder empty [YES/NO] | Positioning data | 5 |
4D-CT [YES/NO] | CT data | 4 |
Intravenous contrast medium [YES/NO] | CT data | 6 |
Name of responsible physician | Administrative data | 5 |
Number of shared and unique definite CDEs of the participating institutions.
Discussion
Documentation in Healthcare
In this work, we aimed at comparing healthcare documents in the form of TPCT-POEs, focusing on the semantic and possible interoperable data exchange. In modern medicine, healthcare documents, primarily those managed within EHRs, should ideally be structured, semantically interoperable, and support data exchange over common standards such as the FHIR.
However, healthcare documents that may contain the same or similar information are often arranged in inconsistent ways, even though a high level of structure and standardization could be achieved in principle [13]. Therefore, for our analysis, we selected the example of TPCT-POEs, as it covers an already structured process in a data-driven medical specialty for which, to date, no standardized documentation exists in the Swiss healthcare system (nor, to our knowledge, in any other country).
We detected heterogeneities in structure, implementation, and semantic content among analyzed POEs. For example, several TPCT-POEs included information necessary for other aspects of radiotherapy, such as fractionation schemes or patient history, next to relevant data for the TPCT. Also, how the information was recorded differed among POEs (checkboxes, open text, dropdown menus).
Semantic Analysis and Documentation Using the Concept of Definite CDEs
For the semantic analysis, we introduced the concept of identifying “definite CDEs”. The purpose was to find a common and clearly defined basis of meaning on which data exchange could be possible and misconceptions could be eliminated. Using this rather strict approach, we extracted vital information from the POEs, identifying 181 definite CDEs relevant to the TPCT. This approach furthermore allowed the classification of data elements into semantic data groups and subgroups for structuring information entries. Remarkably, not a single data element was found in all seven analyzed POEs, and only 16/181 elements (8.8%) appeared in the majority of the POEs. Even with a clearly defined and structured way of extracting semantic values from healthcare records that cover a structured process, only a minority of the recorded information fulfills the basic requirements for direct data comparison. To achieve a high level of standardization and facilitate semantic interoperability, processes and documentation should be structured based on shared semantic specifications [24].
As for the documentation of healthcare data, we propose using (definite) CDEs to define meaning clearly. As we have seen, a large proportion of information can be recorded and classified using this highly structured concept. Definite CDEs can be used to encode the same semantic information among different facilities. If a consensus can be reached on which CDEs are relevant for documentation, it is possible to define a standard based on such elements. CDEs can also be used for information unique to a specific document or institution (such as “brachytherapy” in our analysis). Therefore, a modular concept would be possible with, e.g., CDEs termed “relevant for brachytherapy” that are only used for documentation of these situations. CDEs may be sorted and grouped, which allows flexible application to the situation at hand. This is, e.g., done in radiology, for which a comprehensive collection of CDEs has already been created [25, 26]. For different situations and examinations, different groups of CDEs may be appropriate (e.g., CDEs of the head and neck in radiology as described by Rajamohan et al. [27]).
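One conceivable way to express such a modular set of CDEs is sketched below in Python. The structure is loosely modeled on the HL7 FHIR Questionnaire resource (groups of items with `linkId`, `text`, and `type`), but it is an illustration only and has not been validated against the FHIR specification. The module contents, the brachytherapy item, and the helper `build_questionnaire` are hypothetical and not part of the study.

```python
import json

# Hypothetical modules: group name -> list of (CDE text, item type).
# The brachytherapy item is invented purely for illustration.
MODULES = {
    "Positioning data": [
        ("Supine position", "boolean"),
        ("Prone position", "boolean"),
    ],
    "CT data": [
        ("4D-CT", "boolean"),
        ("Intravenous contrast medium", "boolean"),
    ],
    "Brachytherapy": [
        ("Applicator already in place", "boolean"),
    ],
}


def build_questionnaire(title, module_names):
    """Assemble a Questionnaire-like dict from the selected CDE modules."""
    items = []
    for g, name in enumerate(module_names, start=1):
        items.append({
            "linkId": str(g),
            "text": name,
            "type": "group",
            "item": [
                {"linkId": f"{g}.{i}", "text": text, "type": item_type}
                for i, (text, item_type) in enumerate(MODULES[name], start=1)
            ],
        })
    return {
        "resourceType": "Questionnaire",
        "status": "draft",
        "title": title,
        "item": items,
    }


# A core order and an extended order that adds the brachytherapy module.
standard_order = build_questionnaire("TPCT order", ["Positioning data", "CT data"])
brachy_order = build_questionnaire(
    "TPCT order (brachytherapy)",
    ["Positioning data", "CT data", "Brachytherapy"],
)

print(json.dumps(standard_order, indent=2))
```

In this sketch, institutions would share the core modules and add specialized modules only where needed, so that the shared part remains directly comparable.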
Notably, the approach discussed in our work could also be applied to other medical documents and is not limited to TPCT-POEs. The concepts presented in our study may therefore be used as a basis for future initiatives of standardizing collection and exchange of relevant data in RO.
Controlled Vocabulary
Nevertheless, not all information that might be important can be represented with CDEs that are “definite.” For example, free-text annotations or prescriptions for very complex, individual situations may require a more flexible framework.
In any case, unambiguous communication (via documentation) necessitates the usage of a controlled vocabulary. Previous work showed that existing vocabulary resources are insufficient to cover radiation therapy needs [28]. Consequently, the lack of centralized authority results in situations where no precise nomenclature exists. Therefore, institutions are forced to create local definitions and representations on the user interfaces (labels). Those definitions and labels are frequently based on previous user experience, local standards, care path setup and processes, borrowed titles based on vendor brand names (e.g., VacFix® or Fibreplast®) or historically inherited knowledge. A need for controlled vocabulary in RO arises from the basic features of natural language. It is challenging to compare practice patterns and resulting outcomes without a common nomenclature.
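The effect of a controlled vocabulary can be illustrated by mapping local interface labels to agreed terms before documents are compared. The following Python sketch is hypothetical: the mapping entries and the controlled terms are illustrative assumptions and do not represent an existing terminology.

```python
from collections import defaultdict
from typing import Optional

# Illustrative mapping of local labels (lowercase, without trademark signs)
# to controlled terms. Both sides are assumptions made for this example.
LOCAL_TO_CONTROLLED = {
    "vacfix": "Vacuum cushion",
    "vacuum bag": "Vacuum cushion",
    "fibreplast": "Thermoplastic mask",
    "radiation mask": "Thermoplastic mask",
}


def normalize(label: str) -> Optional[str]:
    """Return the controlled term for a local label, or None if unmapped."""
    key = label.replace("®", "").strip().casefold()
    return LOCAL_TO_CONTROLLED.get(key)


local_labels = ["VacFix®", "Vacuum bag", "Fibreplast®", "Knee support"]

by_term = defaultdict(list)
unmapped = []
for label in local_labels:
    term = normalize(label)
    if term is None:
        unmapped.append(label)  # needs review by a terminology authority
    else:
        by_term[term].append(label)

print(dict(by_term))
print("unmapped:", unmapped)
```

Labels that cannot be mapped are collected rather than guessed, which reflects the point made above that local definitions arise where no precise nomenclature exists.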
For example, one of the important differences in data quality between clinical studies and clinical routine emerges from the data collection methodology. In clinical studies, the data collection process is defined by instruments called “case report forms” (CRFs), which are a prerequisite for conducting a study in the first place [29]. CRFs are, in fact, collections of semantic elements organized in a document-like format, whereas data in clinical practice are not always collected using standardized forms or elements. Filling out CRFs can be an effortful process following strict protocols, and using such a concept for data collection in daily clinical practice may not always be feasible [30]. However, a lot of structured and semi-structured data is recorded in clinical practice, be it in the form of EHRs or analog documents. We have seen in our study that (definite) CDEs can be used to identify common meaning among different documents. Conversely, CDEs may be defined beforehand when creating EHRs or entry documents to determine which data should be included to promote standardization and data exchange.
Similar Initiatives
Standardization is not a new concept in RO. Noteworthy examples are the following initiatives and concepts developed or in development by other groups. In contrast to our approach, other groups are mainly concentrating their efforts on the standardization of terminology.
A classic example is the ICRU 83 report delivered by the International Commission on Radiation Units and Measurements, which provides the information necessary to standardize techniques and procedures and to harmonize the prescribing, recording, and reporting of intensity-modulated radiation therapy [31]. Furthermore, the ASTRO and the American Association of Physicists in Medicine (AAPM) recognized the need for standardization. As a result, ASTRO and AAPM published several papers addressing problems of dose prescription [32] and general naming convention in RO [33].
From an information technology perspective, our efforts are not competitive but additive. Semantic resources are a critical part of the interoperability pyramid [13]. However, we are evaluating one level of abstraction higher: interinstitutional interoperability. Semantic resources combined with CDEs represent the cornerstone of further development for common sharable data in healthcare.
Limitations of the Study
Our study has some considerable limitations. We contacted nine Swiss-based RO centers to provide their TPCT-POEs. With the selection of contacted RO centers, we wanted to depict a wide variety of centers with different concepts, RT machines, and treatment modalities (including hyperthermia, brachytherapy, and proton therapy). While the participating centers may be representative of RO in Switzerland, the group is nevertheless highly selected. Furthermore, we did not use any formal methodology for selecting RO centers but chose them based on the authors’ personal preferences. Notably, all participating centers are German speaking. Therefore, our analysis is not a robust evaluation of TPCT documentation that can be extrapolated to other centers. Furthermore, the methodology of identifying “definite CDEs” is prone to subjective judgment. We tried to use strict rules and a structured approach to mitigate this issue. While there was a high consensus regarding identifying definite CDEs among the authors who did the analysis, the semantic evaluation of heterogeneous documents cannot be completely objective. Beyond that, the classification into “relevant/not relevant for TPCT” and the grouping of CDEs into semantic groups are also subjective to some extent. Furthermore, the authors who did the analysis use the respective POEs of their facilities during their work as radiation oncologists, which is a considerable source of bias.
Besides that, only the provided POEs (which were not necessarily developed with interoperability in mind) were individually analyzed. Additional documents in the ROCIS that may also contain CDEs relevant for the TPCT were not included in the analysis. Therefore, relevant information recorded in documents other than the TPCT-POEs we received may have been missed.
One may argue that TPCT-POEs are usually not exchanged among different facilities and that interoperability is therefore less relevant in this situation. While data exchange of other healthcare documents (like physician intent, planning documentation, etc.) may be more relevant, their semantic analysis would be considerably more difficult and controversial.
As our study represents a starting point for further initiatives of standardized data documentation, we deliberately wanted to avoid additional confusion and heterogeneity. Finding consensus to facilitate standardization remains an effortful task facing considerable multi-institutional challenges (as, e.g., described in the project about standardization of nomenclatures in RO, which was carried out by the AAPM task group 263 [33]).
Documentation of data relevant for TPCT-POEs was chosen for this work, as it covers a clearly defined and structured step in the RO pathway. It is remarkable that even for such a defined situation, the possibility for direct data exchange among the included facilities is rather scarce.
Despite the mentioned limitations, we believe that this work has a particular value. We are proposing a feasible methodology for the analysis of healthcare documents and believe that the results of this work will contribute to efforts for standardization in the RO domain.
Conclusion
The objective of this work was to find out how a semantic comparison of heterogeneous healthcare documents in the form of TPCT-POEs could be made. We therefore introduced the concept of identifying “definite CDEs” as a structured approach to finding common meaning among different documents. We found considerable heterogeneity among the analyzed POEs, with only a fraction of CDEs being commonly shared. While the direct semantic comparison is currently limited in this case, definite CDEs can form the basis for defined data documentation. In addition, using a controlled vocabulary and applying recommended standards may facilitate standardized documentation in healthcare. The presented concepts may be used as a framework for future approaches to standardizing documentation and exchange of data in RO.
Acknowledgments
We thank Matthias Hartmann from the Department of Radiation Oncology at Hirslanden Klinik in Zurich for providing a POE document for this study.
Statement of Ethics
An ethics statement was not required for this study type, as no human or animal subjects or materials were used.
Conflict of Interest Statement
Nikola Cihoric is a technical lead for the SmartOncology© project and medical advisor for Wemedoo AG, Steinhausen AG, Switzerland. The authors declare no other conflicts of interest.
Funding Sources
No funding was received for this study.
Author Contributions
F.D. developed the extraction methodology of CDEs from the POEs. Preparation of the POEs and clarification of semantic meaning of data elements was done by P.M.P., M.H., E.V.B., B.G.B., D.L., and N.C. Scanning of POEs, extraction of CDEs, and classification was done by F.D. and N.C. All authors discussed the results and contributed to the final manuscript.
Data Availability Statement
The data in this study are based on the Treatment Planning CT POE documents of the participating RO departments. The authors received permission from the participating institutions to analyze the documents for scientific purposes. However, the authors do not own the documents and on legal ground are not permitted to share them in their original form, which is why the data are not publicly available. Inquiries may be directed to the corresponding author.