Abstract
Introduction: Essential tremor is a common movement disorder. Numerous validated clinical rating scales quantify essential tremor severity through rater-dependent visual observation, but they have limitations, including the need for trained human raters and lower precision and sensitivity than technology-based objective measures. Other continuous objective methods to quantify tremor amplitude have been developed, but they frequently provide unitless measures (e.g., tremor power), limiting real-world interpretability. We propose a novel algorithm to measure kinetic tremor amplitude using digital spiral drawings, applying the V3 framework (sensor verification, analytical validation, and clinical validation) to establish reliability and clinical utility.
Methods: Archimedes spiral drawings were recorded on a digitizing tablet from participants (n = 7) enrolled in a randomized, double-blind, placebo-controlled crossover pilot trial evaluating the efficacy of oral cannabinoids in reducing essential tremor. We developed an algorithm to calculate the mean and maximum tremor amplitude derived from the spiral tracings. We compared the digitally measured tremor amplitudes to manual measurement to evaluate sensor reliability, determined the test-retest reliability of the digital output across two short-interval repeated measures, and compared the digital measure to kinetic tremor severity graded using The Essential Tremor Rating Assessment Scale (TETRAS) score for spiral drawings.
Results: This algorithm for automated assessment of kinetic tremor amplitude from digital spiral tracings demonstrated a high correlation with manual spot measures of tremor amplitude, excellent test-retest reliability, and a high correlation with human ratings of the TETRAS score for spiral drawing severity when the tremor severity was rated “slight tremor” or worse.
Discussion: This digital measure provides a simple and clinically relevant evaluation of kinetic tremor amplitude that shows promise as a potential future endpoint for use in clinical trials of essential tremor.
Introduction
Essential tremor (ET) is one of the most common neurological disorders. The predominant clinical finding in patients with ET is action tremor (involuntary oscillatory movements) involving the arms, which may also affect the head, voice, and legs; it occurs during many voluntary movements and inevitably impacts activities of daily living [1]. Clinically, action tremors can be evaluated during isometric tonic posturing of the arms against gravity (i.e., postural tremor), while the patient is reaching for a target (i.e., intention tremor), or while drawing an Archimedes spiral (i.e., kinetic tremor). These tasks have proven reliable in assessing tremor severity and related disability in rater-administered visual rating scales [2‒4]. Although these clinical rating scales have been validated in ET, they have several limitations: they rely on a human rater, are ordinal in nature, and lack the sensitivity and precision of a continuous objective measure.
Objective measures have numerous advantages over traditional clinical rating scales since they provide continuous quantitative measures, are rater-independent and therefore not prone to variable bias, are less likely to suffer floor or ceiling effects, enable remote administration, and have a potentially lower cost for administration. Various objective methods have been developed to measure tremor severity in ET [2‒9]. To derive a measure of tremor severity (i.e., amplitude), most prior methods have utilized some variant of a fast Fourier transformation. Fast Fourier transformation is a mathematical algorithm that converts space and time data to frequency and power, with power reflecting the relative proportion of a signal at a particular frequency [10]. Tremors typically demonstrate a tight peak at a particular frequency, and the height of that peak (e.g., power spectral density) correlates with the tremor severity [11]. Other methods have used neural network algorithms [12], wavelet transforms [13], or time domain characteristics, such as spiral width variability [14], to characterize the degree of tremor within a spiral. While all these methods provide a valid continuous measure of tremor severity, they lack a real-world measure that represents actual tremor magnitude. To address this knowledge gap, we propose a novel quantitative algorithm for measuring the actual tremor magnitude during digital spiral drawings. We apply the Digital Medicine Society (DiMe) V3 evaluation framework to this digital tremor measurement [15]. V3 is a tripartite structure that was created by the DiMe as a universal best practice for biometric monitoring technologies in clinical research, which involves: (1) sensor verification, (2) analytical validation, and (3) clinical validation. Sensor verification (V1) demonstrates that a sensor technology measures the intended information appropriately and produces the appropriate output. 
Analytical validation (V2) uses processed data to evaluate the ability of the algorithm to generate reliable physiological or behavioral measures. Clinical validation (V3) evaluates how the technology detects, measures, or predicts a meaningful outcome in the context of its use; for example, does the digital measure correspond to an established clinical assessment and is the measure relevant and clinically meaningful to the target population. Taken together, the V3 framework provides some of the supportive evidence necessary for endorsement or qualification by regulatory agencies, such as the Food and Drug Administration (FDA)’s Drug Development Tool Program [16], to enable the use of digital outcome assessments as approved endpoints in clinical trials.
Methods
Digital spiral drawing data were collected from subjects 21 years or older diagnosed by a movement disorder specialist with definite or probable ET, using TRIG criteria [17], who provided written informed consent and participated in an experimental trial of cannabinoids for symptomatic treatment of ET. The overall goal of this double-blind, placebo-controlled crossover pilot trial (NCT03805750) was to determine whether TILRAY TN-CT120LM (FDA IND#137400), an oral capsule containing tetrahydrocannabinol (THC) 5 mg/cannabidiol (CBD) 100 mg, is a viable therapeutic option for ET. Exclusion criteria included the presence of resting tremor or other abnormalities on neurological exam suggestive of an alternative diagnosis (e.g., Parkinson’s disease).
An in-depth review of the study design and results is beyond the scope of this report. Briefly, thirteen subjects were screened, and seven met the eligibility criteria. Participants included were on stable treatment for tremor and remained on the same regimen throughout the study. After completing the screening visit, subjects were assigned to one of two treatment arms, either oral THC/CBD or a matched placebo, based on a predetermined randomization matrix (1:1). Both investigators and participants were blinded to the agent. Participants attended a total of nine visits (one screening visit and eight study visits: four visits on the study drug and four on placebo). Visits were completed at 0, 4, 7, and 21 days in each treatment arm, with a 28-day washout period between arms. During study visits 1–4, participants underwent a titration period where the dose was increased as tolerated up to a maximum dose of three capsules daily, taken in one or two doses daily depending on tolerability. Participants then underwent a 4-week wash-out period and received the other agent in a crossover design, and the same procedures were repeated during study visits 5–8.
During each visit, participants underwent clinical assessments that included The Essential Tremor Rating Assessment Scale (TETRAS) score, and digital spiral drawings [4] on a tablet as detailed below (Fig. 1). The neurological exam was video recorded, and the TETRAS score was administered by a movement disorders specialist (KL or FBN). TETRAS is an examiner-administered scale that evaluates tremor of various body parts during posture, kinesis, and tasks. Items are scored from 0 to 4 in 0.5-point intervals, with 4 representing the highest degree of severity, and the maximum score is 64 [4]. This scale was adapted from the long-used Fahn-Tolosa-Marin scale [18].
Photograph of the digital tablet used for collecting spiral drawings.
Digitized Spiral Collection
Digitized spiral drawings were collected using a Lenovo ThinkPad X60 tablet PC and a magnetic pen. The tablet resolution was 1,024 × 768 pixels (XGA) with a sampling frequency of 120 Hz, which far exceeded our anticipated tremor frequency range of 2–7 Hz [19]. The non-ink magnetic digitizing pen also allowed the collection of “air points” within 10 mm of the tablet surface. The entire stream of (x, y) coordinate pairs during spiral execution was stored in binary format for offline analysis. Subjects were instructed to hold the digitizing pen within 2 cm of the pen tip and trace the spiral template at a comfortable pace while maintaining their arms parallel to the surface of the digitizing tablet. Right-handed spirals were drawn starting at the center and working outward in a clockwise direction, while left-handed spirals were drawn in a counterclockwise direction. The Archimedes spiral template had three loops, was centered on the digitizing tablet, had a maximum radius of 8.4 cm, and had an inter-loop distance of 3 cm. This task takes less than a minute to complete. Each recording was triggered by the subject’s pen coming into contact with the center of the template (±2.3 cm) and was stopped upon reaching the end of the template (±2.3 cm). At each timepoint, participants drew two trials per hand. Assessment frequency included baseline data collected at the screening visit and prior to drug intake on visits 1 and 5, and then repeated at 15, 25, 60, 120, 200, and 230 min after taking the study drug/placebo (7 time points total per visit). Thus, there were approximately 228 data points collected per subject. The study drug was a novel formulation that had not been studied previously; the measurement intervals were based on existing literature showing that the half-life of a single oral dose of cannabidiol ranges from approximately 1–3 h [20‒23].
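The sampling margin and the per-subject count quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the breakdown of the 228 recordings (seven time points at each of eight study visits, plus one baseline at screening, with two hands and two trials per hand) is an assumption that happens to reproduce the stated total, and the function names are hypothetical.

```python
# Illustrative arithmetic only; the breakdown below is an assumption,
# not a protocol detail confirmed by the text.
SAMPLING_HZ = 120.0
TREMOR_BAND_HZ = (2.0, 7.0)   # anticipated tremor frequency range

TIME_POINTS_PER_VISIT = 7     # baseline plus 15, 25, 60, 120, 200, 230 min
STUDY_VISITS = 8              # four visits in each of two treatment arms
SCREENING_TIME_POINTS = 1     # baseline only at the screening visit
HANDS = 2
TRIALS_PER_HAND = 2


def nyquist_hz(sampling_hz: float) -> float:
    """Highest frequency that can be represented without aliasing."""
    return sampling_hz / 2.0


def spirals_per_subject() -> int:
    """Total spiral recordings per participant under the assumed breakdown."""
    per_time_point = HANDS * TRIALS_PER_HAND
    study = TIME_POINTS_PER_VISIT * STUDY_VISITS * per_time_point
    screening = SCREENING_TIME_POINTS * per_time_point
    return study + screening


print(nyquist_hz(SAMPLING_HZ) > TREMOR_BAND_HZ[1])  # sampling margin check
print(spirals_per_subject())                        # assumed per-subject total
```

Under these assumptions the Nyquist frequency (60 Hz) sits well above the 7 Hz upper bound of the tremor band, and the tally comes to 7 × 8 × 4 + 4 = 228 recordings.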
Method for Measuring Tremor Amplitude
The goal of our proposed method was to measure the distance between the subjects’ tremulous spiral drawing and an idealized spiral calculated from the subject’s own data, whereby the “ideal spiral” represents the subject’s voluntary intended movement when tremor is removed. This algorithm would therefore provide the ability to calculate parameters such as the maximum, mean, or standard deviation of the tremor in real-world units (cm). The recorded data of x-y coordinate pairs were processed using custom-designed algorithms within MATLAB (version 2013a; Mathworks, Natick, MA, USA). Each spiral trace was analyzed based on the following sequence of steps.
Spiral Feature Extraction and Tremor Amplitude Metrics
Tremor is defined as a roughly sinusoidal, or oscillatory, involuntary motion, which is typically 2 Hz or greater [24]. In contrast, the voluntary movement or slow drift energy associated with drawing a spiral is mostly limited to frequencies below 1 Hz [25]. The task of tracing an Archimedes spiral inevitably leads to an imperfect rendering regardless of the user’s tremor status. In someone without tremor, their own rendering can be considered their ideal spiral. This becomes more challenging in a participant with tremor, as their tremor oscillations are then superimposed upon their intended rendering of the Archimedes spiral displayed on the screen. The goals of calculating an ideal spiral are (1) to estimate the spiral the participant would have drawn in the theoretical absence of tremor, rather than relying on the template, since it is unlikely that any participant will trace the template exactly, and (2) to preserve the low-frequency (0–2 Hz) components (voluntary motion and slow drift) while excluding tremor-related motion (>2 Hz). In the end, the tremor amplitude metrics were calculated based on the deviation between the subjects’ spiral drawing points and the ideal spiral. The procedures of spiral analysis are described below and illustrated in Figure 2.
Diagram of spiral analysis. a The red line is the spiral template displayed on the tablet. The black line is a hand-drawn spiral. b The hand-drawn spiral (black line in a) was unraveled/converted to polar (radius-angular) coordinates. c The ideal spiral (green line) is the ideal (best-fit) spiral of the unraveled hand-drawn spiral (black line in b). d The participant’s hand-drawn spiral (black) is now superimposed on the ideal spiral, which has been converted back to (x, y) Cartesian coordinates for display purposes (green), and the template spiral (red). e The area selected in the box from (d) is magnified to show the subject’s data points (arrowed lines) and the ideal spiral line (solid green line). The radial errors are then calculated to generate the tremor amplitudes: r_subject,i − r_ideal,i.
Sample spirograms from a normal subject (a) and ET subjects with mild to increasingly severe tremors (b–e). The template (red line), ideal spiral (green line), and subject’s spiral (black line) data are shown using the proposed method. The tremor magnitude or perpendicular displacement [2, 30] thus represents the difference between the subject’s actual spirogram and the ideal spirogram.
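The spiral analysis described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: coordinates are assumed to be in centimetres and already centred on the spiral origin, and the ideal spiral is approximated with a centred moving average of the radius over time (a crude low-pass filter near 2 Hz). The text does not specify the smoothing or fitting method actually used, and all function names here are hypothetical.

```python
import math


def to_polar(xs, ys, cx=0.0, cy=0.0):
    """Unravel (x, y) samples into unwrapped angle (rad) and radius about (cx, cy)."""
    thetas, radii = [], []
    offset, prev = 0.0, None
    for x, y in zip(xs, ys):
        raw = math.atan2(y - cy, x - cx)
        if prev is not None:
            step = raw - prev
            if step > math.pi:        # wrapped while turning clockwise
                offset -= 2.0 * math.pi
            elif step < -math.pi:     # wrapped while turning counterclockwise
                offset += 2.0 * math.pi
        prev = raw
        thetas.append(raw + offset)
        radii.append(math.hypot(x - cx, y - cy))
    return thetas, radii


def centered_moving_average(values, window):
    """Moving average with an odd window, returned only where the full window fits.

    Output element k is aligned with values[k + window // 2].
    """
    if window % 2 == 0 or len(values) < window:
        raise ValueError("window must be odd and no longer than the data")
    running = sum(values[:window])
    out = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        out.append(running / window)
    return out


def tremor_amplitudes(xs, ys, fs_hz=120.0, cutoff_hz=2.0):
    """Return (mean, max) absolute radial error between drawn and smoothed radius."""
    _, radii = to_polar(xs, ys)                  # angle is kept only for plotting
    window = int(round(fs_hz / cutoff_hz)) | 1   # force odd: 61 samples at 120 Hz
    ideal = centered_moving_average(radii, window)
    half = window // 2
    errors = [abs(r - r0) for r, r0 in zip(radii[half:], ideal)]
    return sum(errors) / len(errors), max(errors)


# Synthetic example: three-loop spiral with a 0.3 cm, 6 Hz radial tremor.
fs = 120.0
ts = [i / fs for i in range(int(20 * fs))]
xs, ys = [], []
for t in ts:
    theta = 2.0 * math.pi * 3.0 * t / 20.0
    r = 1.0 + 0.37 * t + 0.3 * math.sin(2.0 * math.pi * 6.0 * t)
    xs.append(r * math.cos(theta))
    ys.append(r * math.sin(theta))

mean_amp, max_amp = tremor_amplitudes(xs, ys)
print(f"mean = {mean_amp:.2f} cm, max = {max_amp:.2f} cm")
```

A moving average is used here only because it is short and dependency-free; it attenuates rather than sharply removes content above 2 Hz, and samples within half a window of either end are dropped. A zero-phase low-pass filter or a parametric spiral fit would be reasonable alternatives.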
Statistical Methods
Pearson’s correlation coefficients were calculated to assess the correlation between manual and digital measures of maximum amplitude and the test-retest reliability of the digital amplitude measures. The relationship between automated measures of amplitude and TETRAS rating scores was evaluated using a linear spline model with one knot (breakpoint) at the TETRAS score of 1.33. All statistical analyses were performed using R (version 4.2.1, R Core Team, 2022). The significance level α was set to 5%.
Sensor Verification of Tremor Amplitude Algorithm
A key first step in verifying this method was to ensure that tremor amplitudes calculated by our tablet PC and tremor algorithm matched the manual measurement of tremor displacement data points. To verify that the raw data collection and algorithm outputs were valid, we randomly selected 50 spiral drawings (10% of all spirals collected) and compared the algorithm output of a single tremor displacement to the tremor amplitude printed to scale on paper and measured with the aid of digital calipers capable of ±0.01 cm precision by two independent raters (F.B.N. and K.L.). The maximum amplitude ranged from 2.55 cm to 21.10 cm (mean = 8.35 cm). We then calculated the Pearson correlation coefficient between the manual and automated digital measures of tremor amplitude. The manual versus automated tremor amplitudes showed a high correlation (r [95% CI] = 0.91 [0.85, 0.95], p < 0.001) (Fig. 4).
Correlation between automated digital measure of maximum tremor amplitude and human manual measures. r, Pearson’s correlation coefficient.
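For readers who want to see how a Pearson coefficient and its confidence interval of this kind can be obtained, the sketch below uses the Fisher z-transformation. The text does not state how the interval was computed, so this method is an assumption (it is the approach used by R's `cor.test` for Pearson correlations), and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for r via the Fisher z-transformation (needs n > 3)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Plugging in the reported sensor-verification values (r = 0.91, n = 50 spirals).
low, high = fisher_ci(0.91, 50)
print(f"95% CI: [{low:.2f}, {high:.2f}]")
```

With r = 0.91 and n = 50, this approximation agrees with the reported interval of [0.85, 0.95] to two decimal places, which is consistent with, but does not confirm, the use of a Fisher-type interval.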
Analytical Validation
Tremor symptoms can vary widely for a variety of reasons, including diurnal factors, stress or anxiety, and treatment effects [32, 33]. Understanding the amount of variability during a short-interval test-retest paradigm provides essential information about the reliability of the measurement paradigm and the degree of background tremor biological variability necessary to estimate sample sizes for therapeutic trials. To show test-retest reliability of this method, we compared the two consecutive digital spiral tracings (collected ∼1 min apart) that participants drew with each hand on the electronic tablet. In some cases, the tablet did not record a tracing properly on one attempt due to technical errors; in these cases, we repeated the spiral drawing task (e.g., comparing the second and third tracings). We evaluated the reliability between the two spiral drawing trials (n = 480 pairs of spiral drawings) using Pearson’s correlation coefficient. The first and second trials were found to be strongly correlated for maximum amplitude (r [95% CI] = 0.80 [0.76, 0.83], p < 0.001) and mean amplitude (r [95% CI] = 0.91 [0.90, 0.93], p < 0.001), as shown in Figure 5.
Correlation plots of maximum (a) and mean (b) amplitude between Trial 1 and Trial 2. r, Pearson’s correlation coefficient; measures of amplitude were log10 transformed prior to correlation analysis.
Clinical Validation
The final aspect of the V3 framework is to determine how the digital measure correlates with an established “gold standard,” which is a frequently and widely used clinical rating scale. Here we compared the maximum and mean amplitudes obtained from the algorithm for the same fifty spirals described above with the TETRAS ratings for spiral drawing (item number 6 on the TETRAS). The scores were defined on a 0–4 scale, with higher numbers indicating more severe tremor [34]. The scoring instructions for the spiral drawing task are (0) no tremor; (1) slight: tremor barely visible; (2) mild: obvious tremor; (3) moderate: portions of the figure not recognizable; and (4) severe: figure not recognizable. We used 0.5 increments for instances when there was ambiguity between two integer ratings, e.g., the spiral was rated as 2.5 if between mild (2) and moderate (3) [34]. Three independent raters (F.B.N., B.W., K.L.) determined the severity of the spiral drawings (range: 0.5–3.5, mean = 1.71), and the scores from these raters were averaged to produce a single score. Since automated measures of amplitude and TETRAS rating scores do not have a linear relationship, a non-linear model (i.e., linear spline model) was used to determine the relationship. We found TETRAS ratings to be significantly associated with maximum amplitudes from the algorithm (slope = 0.42 + 0.08, p < 0.006) when the TETRAS score was greater than 1.33. We saw the same findings for measures of mean amplitude. When TETRAS ratings for spiral drawings were less than 1.33, there was no association with the automated measures (slope = 0.08, p = 0.50) (Fig. 6).
Correlation plot between automated measures of maximum (a) and mean (b) amplitude and mean TETRAS score on item number 6 (spiral drawing severity). Automated ratings were significantly correlated with TETRAS scores above 1.33.
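A linear spline with a single knot can be written as an ordinary least-squares model with a hinge term, y = b0 + b1·x + b2·max(0, x − knot), so that the slope is b1 below the knot and b1 + b2 above it. The sketch below illustrates that idea with made-up numbers; it is not the authors' R code, it assumes the TETRAS score is the predictor and the amplitude is the response, and it omits the standard errors and p-values reported in the text. The function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear_spline(x, y, knot=1.33):
    """Least-squares fit of y = b0 + b1*x + b2*max(0, x - knot)."""
    rows = [(1.0, xi, max(0.0, xi - knot)) for xi in x]
    # Normal equations: (A^T A) b = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    b0, b1, b2 = solve_linear_system(ata, aty)
    return {"intercept": b0, "slope_below": b1, "slope_above": b1 + b2}


# Made-up data: flat below the knot, rising above it (not study data).
scores = [0.5, 1.0, 1.33, 2.0, 2.5, 3.0, 3.5]
amplitudes = [1.0 + 0.4 * max(0.0, s - 1.33) for s in scores]
fit = fit_linear_spline(scores, amplitudes)
print(fit)
```

Solving the normal equations directly keeps the example dependency-free; for real data, a regression routine that also reports standard errors would be the more practical choice.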
Discussion
Here we present validation findings for a novel algorithm to determine tremor amplitude derived from digital spiral drawings performed on an electronic tablet. This digital measure was applied in a randomized double-blind, placebo-controlled clinical trial of oral cannabinoids in seven subjects with ET. We found that the automated digital measure for tremor amplitude correlated highly with the manual measurement. The digital spiral drawings had excellent test-retest reliability for both maximum and mean amplitudes, though the mean amplitudes expectedly had higher reliability. We demonstrated similar performance between this new digital tool and an established validated clinical measure of kinetic tremor amplitude. In the case of ET, the TETRAS is the established “gold standard,” which is often used as an endpoint in clinical trials. Our results demonstrate a high correlation between this novel objective method and the more subjective rater-dependent TETRAS when the tremor severity scores are greater than 1.33 (i.e., severity greater than a TETRAS score of 1.5 representing “slight tremor”). At TETRAS scores below 1.33, we found no correlation between the clinical rating scale and the digital measure. Considering the high test-retest reliability, this latter finding suggests that the automated measure was more sensitive and precise than the human raters for detecting low amplitude tremor from spiral drawings. We interpret this finding as evidence of a floor effect of clinical rating scales [35, 36], which corresponds to a common practice employed in ET clinical trials to enroll subjects with a minimum severity score of two or greater for an individual item since participants with lower levels of tremor are unlikely to have detectable improvement.
In addition to improved sensitivity and reliability in patients with milder tremors, this method provides numerous additional advantages, including (1) the ability to measure tremor severity with real-world continuous metric units rather than an ordinal measure (rating scales) or a continuous unitless measure (tremor power), (2) the simplicity and short administration time, (3) the ability to support independent remote administration, and (4) the lack of reliance on raters to score tremor with traditional rating scales that add variability and cost to clinical trials. Beyond these advantages, we are currently developing enhancements that will enable tremor measurement without the need for a digital tablet, thereby further reducing the technology burden and cost of administration.
Although we did not evaluate the link between kinetic tremor severity and patient-reported assessments of the impact on activities of daily living, these measures may predict impairments in activities such as eating, drinking, and hygiene tasks. Furthermore, our study captured many data points across subjects longitudinally. We acknowledge several limitations of this method. It currently requires proper equipment, software, and time beyond the spiral tracing acquisition to organize and process the data until this process is fully automated. This method does not calculate tremor frequency, though that information can be derived using existing methods previously discussed. Finally, this digital measure was only applied in persons with ET, and further work is needed to generalize these findings to other tremor disorders.
This work provides the basis for the use of an objective measure of tremor as a concept of interest with the context of use as an endpoint for therapeutic clinical trials of ET. Additional work is needed before such a digital measure can gain regulatory endorsement or qualification as a drug development tool. In conclusion, this digital measure provides a valid, clinically relevant, easy-to-administer outcome measure for maximum and mean tremor amplitude that can be useful for both clinical care and future clinical trials in ET.
Acknowledgments
We thank the subjects who participated in this research and the staff at the University of California San Diego Altman Clinical and Translational Research Center.
Statement of Ethics
The University of California San Diego Institutional Review Board reviewed and approved this study, Project # 180414. All participants provided written informed consent.
Conflict of Interest Statement
K.L. works for the University of California San Diego Health. Her research has been funded in part by the National Institutes of Health Grant UL1TR001442 and KL2TR001444 and by 1R21NS114764-01A1. She has worked as a consultant for Boston Scientific (relationship ended). Other authors have no conflicts to declare.
Funding Sources
This project was supported by a grant from the International Essential Tremor Foundation and The University of California Center for Medicinal Cannabis Research. The project described was partially supported by the National Institutes of Health, Grant UL1TR001442. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
Author Contributions
Katherine Longardner: data collection, clinical rating of tremor, and drafting and editing of the manuscript. Qian Shen: contributed to the tremor algorithm and manuscript review. Bing Tan: contributed to statistical analysis. Brenton A. Wright: contributed to the clinical rating of tremor and manuscript review. Prantik Kundu: contributed to data collection procedures and manuscript review. Fatta B. Nahab: designed and conceptualized the study, data collection, clinical rating of tremor, and drafting and editing of the manuscript.
Data Availability Statement
The data that support the findings of this study are not publicly available due to privacy reasons but are available from the corresponding author upon reasonable request.