Introduction: Most tools that automatically extract information from medical publications are domain-agnostic and process publications from any field. However, retrieving only trials from a specific field could simplify further processing of the data.

Methods: We trained a small transformer model to classify publications as randomized controlled trials (RCTs) versus non-RCTs and as oncology versus non-oncology publications. In addition, we used two large language models (GPT-4o and GPT-4o mini) for the same task. We assessed the performance of all three models and then developed a simple set of rules to extract the tumor entity from the retrieved oncology RCTs.

Results: On the unseen test set of 100 publications, the small transformer achieved an F1 score of 0.96 (95% CI: 0.92–1.00) with a precision of 1.00 and a recall of 0.92 for predicting whether a publication was an RCT. For predicting whether a publication covered an oncology topic, its F1 score was 0.84 (95% CI: 0.77–0.91) with a precision of 0.75 and a recall of 0.95. GPT-4o achieved an F1 score of 0.94 (95% CI: 0.90–0.99) with a precision of 0.89 and a recall of 1.00 for predicting whether a publication was an RCT. For predicting whether a publication covered an oncology topic, its F1 score was 0.91 (95% CI: 0.85–0.97) with a precision of 0.91 and a recall of 0.91. The rule-based system correctly assigned every oncology RCT in the test set to a tumor entity.

Conclusion: Classifying publications by whether they report randomized controlled oncology trials was feasible and enabled further processing with more specialized tools, such as rule-based systems and potentially dedicated machine learning models.
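
The F1 scores reported in the Results follow from the stated precision and recall values, since F1 is their harmonic mean. The short check below recomputes them as a sanity test; the function name `f1_score` and the dictionary labels are illustrative choices and do not come from the study's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract.
reported = {
    "small transformer, RCT": (1.00, 0.92),
    "small transformer, oncology": (0.75, 0.95),
    "GPT-4o, RCT": (0.89, 1.00),
    "GPT-4o, oncology": (0.91, 0.91),
}

for name, (p, r) in reported.items():
    print(f"{name}: F1 = {f1_score(p, r):.2f}")
```

Rounded to two decimals, the recomputed values match the reported F1 scores of 0.96, 0.84, 0.94, and 0.91.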

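A rule-based step of the kind described in the Methods could look roughly like the sketch below. The abstract does not list the actual rules, so the keyword patterns in `TUMOR_RULES`, the entity labels, and the function name `extract_tumor_entity` are all hypothetical and serve only to illustrate the general idea of matching tumor-related terms in a title or abstract.

```python
import re
from typing import Optional

# Hypothetical keyword rules (assumption): the study's real rule set is not
# given in the abstract. Each tumor entity maps to regular expressions that
# are searched for in the publication text.
TUMOR_RULES = {
    "breast cancer": [r"\bbreast (cancer|carcinoma)\b"],
    "lung cancer": [r"\blung (cancer|carcinoma)\b", r"\bNSCLC\b"],
    "multiple myeloma": [r"\bmyeloma\b"],
}


def extract_tumor_entity(text: str) -> Optional[str]:
    """Return the first tumor entity whose pattern matches, else None."""
    for entity, patterns in TUMOR_RULES.items():
        if any(re.search(p, text, flags=re.IGNORECASE) for p in patterns):
            return entity
    return None


# Made-up example titles, used only to exercise the rules.
print(extract_tumor_entity("A randomized trial of adjuvant therapy in early breast cancer"))
print(extract_tumor_entity("A randomized trial of statins for primary prevention"))
```

Because such rules run only on publications already classified as oncology RCTs, a small keyword list can stay simple; a publication that matches no rule returns `None` and can be flagged for manual review.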