Introduction: Books and papers are the most relevant source of theoretical knowledge for medical education. New technologies of artificial intelligence can be designed to assist in selected educational tasks, such as reading a corpus made up of multiple documents and extracting relevant information in a quantitative way. Methods: Thirty experts were selected transparently using an online public call on the website of the sponsor organization and on its social media. Six books edited or co-edited by members of this panel, containing general knowledge of breast cancer or specific surgical knowledge, were acquired. This collection was used by a team of computer scientists to train an artificial neural network based on a technique called Word2Vec. Results: The corpus of six books contained about 2.2 billion words, represented as 300-dimensional vectors. A few tests were performed. We evaluated cosine similarity between different words. Discussion: This work represents an initial attempt to derive formal information from a textual corpus. It can be used to perform an augmented reading of the corpus of knowledge available in books and papers as part of a discipline. This can generate new hypotheses and provide an actual estimate of their association within the expert opinions. Word embedding can also be a good tool for accruing narrative information from clinical notes, reports, etc., and for producing predictions about outcomes. More work is expected in this promising field to generate “real-world evidence.”

Books and papers are the most relevant source of theoretical knowledge for medical education. New technologies of artificial intelligence can be designed to assist in selected educational tasks, such as reading a corpus made up of multiple documents and extracting relevant information in a quantitative way [1].

Natural language processing is a branch of computer science dealing with human language interpretation so that it can be possible to analyze texts or speeches, extract meaningful information, categorize it, and organize it [2, 3]. This study aimed to demonstrate how an artificial intelligence system trained on a corpus of books edited by senior experts was able to extract meaningful associations between words and recreate semantic contexts.

In March 2021, the G.Re.T.A. (Group for Therapeutic and Reconstructive Advancements) Fondazione ETS gathered a core team of senior oncoplastic surgeons representing the most important societies dealing with surgical oncology and oncoplastic surgery of the breast (ETHOS collaborative group). A subgroup of senior oncoplastic surgeons (facilitators) was invited to coordinate this activity.

Thirty experts were selected transparently using an online public call on the website of the sponsor organization and on its social media. Six books edited or co-edited by members of this panel, containing general knowledge of breast cancer or specific surgical knowledge, were acquired (see Table 1 for list of books). This collection was used by a team of computer scientists at the University of Catania, Dipartimento di Scienze del Farmaco, to train an artificial neural network based on a technique called Word2Vec [4‒6].

Table 1.

Cosine similarity values and complications (words selected by experts in a list of 1,000 words produced by ETHOS according to cosine similarity values)

| Complication | Cosine similarity |
|---|---|
| Infection | 0.597 |
| Hematoma | 0.502 |
| Necrosis | 0.424 |
| Extrusion | 0.393 |
| Contracture | 0.310 |
| Hernia | 0.257 |
| Scarring | 0.2426 |

Word2Vec is a natural language processing algorithm able to learn word associations, suggest additional words, and detect synonyms within a large corpus of written documents. A mathematical function (the “cosine similarity” between words) indicates the level of semantic association [7‒11]. Using this function, each word is assigned its own coordinates, and the text is represented by the vector of the number of occurrences of each word in the document. The final product of this process is a software tool named ETHOS, whose graphical interface allows the calculation of a few sub-functions based on cosine similarity (shown in Table 1).
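For reference, cosine similarity between two word vectors is conventionally defined as below. The symbols are notation introduced here for illustration and do not come from the source: u and v are the vectors of two words, and d is the number of dimensions.

```latex
\[
\mathrm{CS}(\mathbf{u}, \mathbf{v})
  = \frac{\mathbf{u} \cdot \mathbf{v}}{\lVert \mathbf{u} \rVert \, \lVert \mathbf{v} \rVert}
  = \frac{\sum_{i=1}^{d} u_i v_i}
         {\sqrt{\sum_{i=1}^{d} u_i^{2}} \; \sqrt{\sum_{i=1}^{d} v_i^{2}}}
\]
```

The value depends only on the angle between the two vectors, not on their lengths, which is why it is bounded between -1 and 1.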

The Word2Vec artificial neural network was interrogated to explore associations between different clusters of words. More specifically, words representative of surgical techniques (e.g., implant-based-reconstructions, mastectomy, breast conserving surgery) or specific clinical conditions (old-diabetes, etc.) were associated with a list of potential outcomes (extrusion-hematoma, etc.). Cosine similarity can also be calculated between clusters of words. This function was used to estimate how “close” clusters of words were in the vector space. The smaller the distance, the higher the semantic association. Values range from -1 to 1, with CS = 1 representing two identical words.
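A minimal sketch of how a cosine similarity between clusters of words might be computed. The source does not state how ETHOS pools the words of a cluster, so averaging the word vectors is an assumption here (it is a common choice). The function names and the three-dimensional toy vectors are hypothetical and carry no clinical meaning.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def cluster_vector(words, embeddings):
    """Average the vectors of a cluster of words (assumed pooling rule)."""
    vectors = [embeddings[w] for w in words]
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def cluster_similarity(words_a, words_b, embeddings):
    """Cosine similarity between the mean vectors of two word clusters."""
    return cosine_similarity(
        cluster_vector(words_a, embeddings),
        cluster_vector(words_b, embeddings),
    )


# Toy 3-dimensional embeddings, invented for illustration only.
toy_embeddings = {
    "old": [0.9, 0.1, 0.0],
    "diabetes": [0.7, 0.3, 0.1],
    "complications": [0.6, 0.4, 0.2],
    "young": [0.0, 0.2, 0.9],
}

same = cosine_similarity(toy_embeddings["old"], toy_embeddings["old"])
older = cluster_similarity(["old", "diabetes"], ["complications"], toy_embeddings)
younger = cluster_similarity(["young"], ["complications"], toy_embeddings)
```

Comparing a vector with itself gives a value of 1 (up to floating-point rounding), matching the statement that CS = 1 represents two identical words. Libraries such as gensim offer a comparable mean-vector comparison (`KeyedVectors.n_similarity`); whether ETHOS relies on it is not stated in the source.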

In this specific task, this function was used to generate a list of 1,000 words close to the word “complication-s.” Among these, the panelists manually identified a second list representative of multiple potential clinical scenarios. For each of the explored situations, the function produced an estimate that was subsequently interpreted by the panel in order to extract meaningful associations.
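The ranking step described above can be sketched as follows: every word in the vocabulary is scored against a query word, and the highest-scoring words are returned. This is an illustrative sketch, not the ETHOS implementation. The function `most_similar`, its `top_n` parameter, and the toy vectors are assumptions made for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(query, embeddings, top_n=1000):
    """Return up to top_n (word, similarity) pairs, most similar first."""
    query_vec = embeddings[query]
    scored = [
        (word, cosine_similarity(query_vec, vec))
        for word, vec in embeddings.items()
        if word != query  # the query itself would trivially score 1
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy 3-dimensional embeddings, invented for illustration only.
toy_embeddings = {
    "complication": [0.8, 0.5, 0.1],
    "infection": [0.7, 0.6, 0.2],
    "hematoma": [0.6, 0.3, 0.5],
    "textbook": [0.0, 0.1, 0.9],
}

ranked = most_similar("complication", toy_embeddings, top_n=2)
ranked_words = [word for word, _ in ranked]
```

With these toy vectors, `ranked_words` is `["infection", "hematoma"]`. In the study, the equivalent list held 1,000 candidates, from which the panelists selected the clinically meaningful words by hand.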

The corpus of six books contained about 2.2 billion words, represented as 300-dimensional vectors [12‒17]. A few tests were performed by the facilitators group. We report some examples here.

First, the words closest (according to cosine similarity) to the word “complication” were assessed. From a suggested list of 1,000 words, the experts then selected those meaningfully associated with “complications.” These included infection (CS = 0.509), hematoma (CS = 0.502), and dehiscence (CS = 0.4842), among others (shown in Table 1).

After this, cosine similarity was assessed between “autologous-breast-reconstructions” and the words representing complications as previously identified. The same was done with “implant-based-reconstructions” (shown in Tables 1‒3).

Table 2.

Cosine similarity values associated to autologous breast reconstructions

| Technique | Complication | Cosine similarity |
|---|---|---|
| Autologous breast reconstruction | Necrosis | 0.597 |
| Autologous breast reconstruction | Infection | 0.465 |
| Autologous breast reconstruction | Scarring | 0.447 |
| Autologous breast reconstruction | Hematoma | 0.426 |
| Autologous breast reconstruction | Hernia | 0.404 |

Table 3.

Cosine similarity values associated to implant-based reconstructions

| Technique | Complication | Cosine similarity |
|---|---|---|
| Implant-based reconstruction | Infection | 0.603 |
| Implant-based reconstruction | Extrusion | 0.595 |
| Implant-based reconstruction | Contracture | 0.587 |
| Implant-based reconstruction | Hematoma | 0.232 |

A second, more complex clinical scenario was tested. We calculated the cosine similarity between words related to clinical cases with opposite features. For clinical scenario 1, “young-small-breast-conserving-surgery” versus “complications,” the cosine similarity value was 0.274; for clinical scenario 2, “old-diabetes-breast-reconstruction” versus “complications,” it was 0.3736.

Some other scenarios were explored in the semantic area of oncological outcomes. For instance, the words “poor-outcome” and “triple-negative” had a higher cosine similarity value (CS = 0.24) than the words “poor-outcome” and “hormone-receptor-positive” (CS = 0.1901).

The ETHOS Word2Vec artificial neural network tested in this study, trained on the corpus of knowledge of senior experts in breast cancer surgery, can be used for two purposes. First, it can be used to perform an enhanced and quantitative reading, so that one can explore semantic associations and extract meaningful information. Hypotheses generated by the enhanced reading can inform surveys or be part of the design of clinical trials. For instance, the ETHOS survey [18, 19] used this tool to inform the panel about the semantic relevance of a list of preoperative features and postoperative outcomes. The panel expressed its opinion on approving or rejecting the proposed drivers after reviewing the quantitative information received. In this regard, the current corpus of six books probably represents a rather narrow sample of the narrative information available. Some artificial neural networks available online (http://epsilon-it.utu.fi/wv_demo/) are based on databases that can reach 4.5 billion words or more, and most of them are derived from general language texts. The current corpus can be enlarged using a larger dataset that also includes journal articles or other books. On the other hand, to obtain a more refined view of selected subspecialties, it would be possible to restrict the texts to a specialized subset (for instance surgery), accepting the risk of losing some relevant information.

Alternatively, the system can be developed to reconstruct semantic scenarios (made up of patients’ features and outcomes) that are described in an unstructured narrative way and associate them with structured information (laboratory tests, ICD-9 codes, SNOMED, BMI, etc.). Using this strategy, clinical notes, letters to GPs, and other narrative text (including social media, chats with breast care nurses, etc.) can recreate personal profiles that indicate potential outcomes [20‒22]. In this way, the corpus of knowledge on a clinical condition will be derived not only from clinical trials, observational studies, or even expert opinions but also from so-called “real world data” [23]. To do this, a training population including a large number of patients should be used to train the artificial neural network. Narrative information will be collected prospectively and retrospectively from textual sources, together with structured data about the clinical condition of the patient. A second population (the testing population) will be used to test the ability of the system to identify the selected outcomes. A final external validation may be required for further improvement of the model.

The word embedding technique still has some limitations. Cosine similarity estimates the distance between two words within a corpus, assuming that closer words are likely to retain some kind of semantic association. This is not entirely true: some words that lie very close together (e.g., “breast” and “cancer”) can be highly inter-related without being similar or synonymous. In the ETHOS tool, for instance, breast and cancer retain a high similarity value (CS = 0.81), as do breast and patient (CS = 0.70), but the meaning behind these two examples is completely different. The first pair represents a single entity (a disease); the second indicates two different entities (a human being and one of its organs).

This work represents an initial attempt to derive formal information from a textual corpus. It can be used to perform an augmented reading of the corpus of knowledge available in books and papers as part of a discipline. This can generate new hypotheses and provide an actual estimate of their association within the expert opinions. These data can also be accrued prospectively and offer insight into changes over time. Word embedding can also be a good tool for accruing narrative information from clinical notes, reports, etc., and for producing predictions about outcomes. More work is expected in this promising field to generate “real-world evidence.”

An ethics statement was not required for this study type as no human or animal subjects or materials were used.

The authors have no conflicts of interest to declare.

There are no funding sources to declare.

Nicola Rocco, Giuseppe Catanuto, Maurizio Bruno Nava, Yazan Masannat, Konstantina Balafa, Andreas Karakatsanis, Anna Maglia, Peter Barry, Francesco Pappalardo, and Francesco Caruso: all the authors have been involved in the design and preparation of the manuscript.

All data generated or analyzed during this study are included in this article. Further inquiries can be directed to the corresponding author.

1. Chary M, Parikh S, Manini AF, Boyer EW, Radeos M. A review of natural language processing in medical education. West J Emerg Med. 2019 Jan;20(1):78‒86.
2. Koskenniemi K. Two-level morphology: a general computational model of word-form recognition and production. Department of General Linguistics, University of Helsinki (accessed February 2023).
3. Guida G, Mauri G. Evaluation of natural language processing systems: issues and approaches. Proc IEEE. 1986;74(7):1026‒35.
4. Magna AR, Allende-Cid H, Taramasco C, Becerra C, Figueroa RL. Application of machine learning and word embeddings in the classification of cancer diagnosis using patient anamnesis. IEEE Access. 2020;8:106198‒213.
5. Rahimian M, Warner JL, Jain SK, Davis RB, Zerillo JA, Joyce RM. Significant and distinctive n-grams in oncology notes: a text-mining method to analyze the effect of OpenNotes on clinical documentation. JCO Clin Cancer Inform. 2019;3:1‒9.
6. Kim S, Lee H, Kim K, Kang J. Mut2Vec: distributed representation of cancerous mutations. BMC Med Genomics. 2018;11(Suppl 2):33.
7. Li B, Drozd A, Guo Y, Liu T, Matsuoka S, Du X. Scaling Word2Vec on big corpus. Data Sci Eng. 2019;4(2):157‒75.
8. Hill F, Reichart R, Korhonen A. SimLex-999: evaluating semantic models with (genuine) similarity estimation. Computational Linguistics. 2015;41(4):665‒95.
9. Pakhomov S, McInnes B, Adam T, Liu Y, Pedersen T, Melton GB. Semantic similarity and relatedness between clinical terms: an experimental study. AMIA Annu Symp Proc. 2010;2010:572‒6.
10. Pappalardo F, Russo G, Reche PA. Toward computational modelling on immune system function. BMC Bioinformatics. 2020;21(Suppl 17):546.
11. Leviant I, Reichart R. Judgment Language matters: multilingual vector space models for Judgment Language aware lexical semantics. 2015. Available from: https://arxiv.org/abs/1508.00106.
12. Urban C, Rietjens M. Oncoplastic and reconstructive breast surgery. Springer Milan Heidelberg New York Dordrecht London; 2013. ISBN 978-88-470-2652-0.
13. Markopoulos C, Wyld L, Leidenius M, Senkus-Konefka E. Breast cancer management for surgeons: A European multidisciplinary textbook. Springer International Publishing AG; 2018.
14. Benson JR, Gui G, Tuttle TM. Early breast cancer: From screening to multidisciplinary management. 3rd ed. London: CRC Press - Taylor & Francis Group eBooks; 2013.
15. Rubio IT, Kovacs T, Suzanne Klimberg V. Oncoplastic breast surgery techniques for the general surgeon. Springer Nature Switzerland AG; 2020.
16. Fitzal F, Schrenk P. Oncoplastic breast surgery: A guide to clinical practice. Springer-Verlag Vienna; 2015.
17. Benson JR, Nava MB, Kronowitz SJ. Oncoplastic and reconstructive surgery of the breast. Productivity Press; 2020.
18. Catanuto G, Rocco N, Maglia A, Barry P, Karakatsanis A, Sgroi G. Text mining and word embedding for classification of decision making variables in breast cancer surgery. Eur J Surg Oncol. 2022 Jul;48(7):1503‒9. Epub 2022 Apr 1.
19. Desai A, Zumbo A, Giordano M, Morandini P, Laino ME, Azzolini E. Word2vec word embedding-based artificial intelligence model in the triage of patients with suspected diagnosis of major ischemic stroke: a feasibility study. Int J Environ Res Public Health. 2022 Nov 19;19(22):15295.
20. Zhu Z, Li J, Huang J, Li Z, Zhang H, Chen S. An intelligent prediagnosis system for disease prediction and examination recommendation based on electronic medical record and a Medical-Semantic-aware Convolution Neural Network (MSCNN) for pediatric chronic cough. Transl Pediatr. 2022 Jul;11(7):1216‒33.
21. Ozyegen O, Kabe D, Cevik M. Word-level text highlighting of medical texts for telehealth services. Artif Intell Med. 2022 May;127:102284. Epub 2022 Mar 23.
22. Lu Z, Sim JA, Wang JX, Forrest CB, Krull KR, Srivastava D. Natural Language processing and machine learning methods to characterize unstructured patient-reported outcomes: validation study. J Med Internet Res. 2021 Nov 3;23(11):e26777.
23. Available from: https://www.fda.gov/science-research/science-and-research-special-topics/real-world-evidence (accessed February 2023).