Introduction: Books and papers are the most relevant source of theoretical knowledge for medical education. New artificial intelligence technologies can be designed to assist in selected educational tasks, such as reading a corpus made up of multiple documents and extracting relevant information in a quantitative way.

Methods: Thirty experts were selected transparently through a public online call posted on the sponsor organization's website and social media. Six books edited or co-edited by members of this panel, covering general knowledge of breast cancer or specific surgical knowledge, were acquired. A team of computer scientists used this collection to train an artificial neural network based on a technique called Word2Vec.

Results: The corpus of six books contained about 2.2 billion words, represented as 300-dimensional (300d) vectors. A few tests were performed, in which we evaluated the cosine similarity between different words.

Discussion: This work represents an initial attempt to derive formal information from a textual corpus. The approach can be used to perform an augmented reading of the body of knowledge available in the books and papers of a discipline. This can generate new hypotheses and provide an actual estimate of how closely the terms involved are associated within expert opinion. Word embedding can also be a good tool for accruing narrative information from clinical notes, reports, etc., and for producing predictions about outcomes. More work is expected in this promising field to generate “real-world evidence.”
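The cosine similarity mentioned in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors and the words attached to them are hypothetical values invented for illustration (the study reports 300d vectors learned with Word2Vec), and the helper names `cosine_similarity` and `rank_by_similarity` are assumptions introduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional embeddings, invented for illustration only.
# A trained Word2Vec model would supply one learned vector per vocabulary word.
toy_vectors = {
    "mastectomy": [0.9, 0.1, 0.3],
    "reconstruction": [0.8, 0.2, 0.4],
    "screening": [0.1, 0.9, 0.2],
}


def rank_by_similarity(query, vectors):
    """Return the other words sorted from most to least similar to `query`."""
    q = vectors[query]
    scores = [
        (word, cosine_similarity(q, vec))
        for word, vec in vectors.items()
        if word != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for word, score in rank_by_similarity("mastectomy", toy_vectors):
        print(f"{word}: {score:.3f}")
```

Values close to 1 indicate vectors pointing in nearly the same direction, which in a word-embedding space is read as the two words appearing in similar contexts across the corpus. Values near 0 indicate little association.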
