Abstract
Introduction: Auditory-perceptual assessments of cough are commonly used by speech-language pathologists working with people with swallowing disorders, and emerging evidence is beginning to demonstrate their validity; however, their reliability among novice clinicians is unknown. Therefore, the primary aim of this study was to characterize the reliability of auditory-perceptual assessments of cough among a group of novice clinicians. As a secondary aim, we assessed the effects of a standardized training protocol on the reliability of auditory-perceptual assessments of cough.
Methods: Twelve novice clinicians blindly rated ten auditory-perceptual cough descriptors for 120 cough audio clips. The clinicians then completed standardized training. The same cough audio clips were then re-randomized and blindly rated again. Reliability was analyzed pre- and post-training within each clinician (intra-rater), between each unique pair of raters (dyad-level inter-rater), and for the entire group of raters (group-level inter-rater) using intraclass correlation coefficients and Cohen’s Kappa.
Results: Pre-training reliability was greatest for measures of strength, effectiveness, and normality and lowest when judging the type of expiratory maneuver (cough, throat clear, huff, other). The measures that improved the most with training were ratings of perceived crispness, amount of voicing, and type of expiratory maneuver. Intra-rater reliability coefficients ranged from 0.580 to 0.903 pre-training and from 0.756 to 0.904 post-training. Dyad-level inter-rater reliability coefficients ranged from 0.295 to 0.745 pre-training and from 0.450 to 0.804 post-training. Group-level inter-rater reliability coefficients ranged from 0.454 to 0.919 pre-training and from 0.558 to 0.948 post-training.
Conclusion: Reliability of auditory-perceptual assessments varied across perceptual cough descriptors, but all appeared within the range of what has been historically reported for auditory-perceptual assessments of voice and visual-perceptual assessments of swallowing and cough airflow. Reliability improved for most cough descriptors following 30–60 min of standardized training. Future research is needed to examine the validity of auditory-perceptual assessments of cough by assessing the relationship between perceptual cough descriptors and instrumental measures of cough effectiveness to better understand the role of perceptual assessments in clinical practice.
Introduction
Cough and Dysphagia
Cough is defined as a “forced expulsive maneuver, usually against a closed glottis, associated with a characteristic sound” [1, 2]. It is an important airway protective behavior intended to clear penetrant and aspirate material out of the airway in order to maintain a healthy and homeostatic pulmonary environment [3]. Disordered coughing (dystussia) is characterized by reductions in cough airflow, expiratory pressures, and blunted sensorimotor responses [4‒9] and, when present, is associated with an increased risk of pneumonia [10‒15].
The relationship between dysphagia, dystussia, and pneumonia highlights the importance of assessing cough within the context of routine clinical swallow evaluations [16]. Emerging evidence demonstrates that assessing cough during a standardized swallowing evaluation protocol and then triaging accordingly significantly reduces rates of aspiration pneumonia in dysphagic patients [14]. Assessing cough may also help clinicians determine whether exercise-based treatments for cough should be included as part of a comprehensive airway protective treatment plan when working with people with swallowing disorders.
Aerodynamic and Acoustic Assessments
Acoustic and aerodynamic (airflow) assessments are two well-established instrumental methods used to objectively evaluate cough function [17]; however, their use in clinical practice is limited. A recent survey found that of the 85% of speech-language pathology (SLP) respondents who assessed cough as part of their standard clinical practice for dysphagia management, only 6.8% did so using aerodynamic methods and none did so using acoustic methods [18]. The limited use of acoustic and aerodynamic assessments of cough in clinical practice is due to multiple factors including limited education and training related to cough assessment techniques [18] and likely also pragmatic issues related to equipment costs and handling. Therefore, there is a need to improve clinical education and training related to acoustic and aerodynamic cough assessments and to improve the clinical feasibility of cough assessments by developing and refining other assessment techniques.
Auditory-Perceptual Assessments
Auditory-perceptual assessments are free and non-invasive, easy and expedient to complete, and do not require extensive technical training or knowledge. These qualities make them feasible for use in clinical practice. It is therefore perhaps unsurprising that 87–98% of speech pathologists who perform clinical swallow evaluations already include auditory-perceptual assessments of cough as part of their standardized practice pattern [18‒22].
In the area of voice, auditory-perceptual assessments are considered to be an integral component of the gold standard for voice evaluation [23]. Auditory-perceptual assessments typically involve judging multiple voice descriptors, including but not limited to overall severity, roughness, breathiness, asthenia, strain, pitch, and loudness [24, 25]. Importantly, auditory-perceptual assessments of voice appear to exhibit stronger relationships to stroboscopic evaluations of voice than acoustic and aerodynamic assessments of voice [26‒29] and have been found to have stronger relationships to patient-reported voice outcome measures than acoustic, aerodynamic, or laryngeal stroboscopic measures [29, 30].
Despite the benefits of auditory-perceptual assessments, these evaluation techniques are subjective and perceptual in nature, making their reliability within and between raters a justifiable concern. Current research demonstrates that reliability coefficients for auditory-perceptual assessments of voice typically range from 0.45 to 0.75 [24, 31‒34]. While these findings demonstrate that reliability of auditory-perceptual assessments of voice is not perfect, they continue to be used in clinical practice and can be further improved with relatively simple trainings [35‒41].
Because coughs have a characteristic sound [1, 2] and known acoustic properties [3, 42], it stands to reason that auditory-perceptual assessments may be a meaningful approach to subjectively evaluate cough in clinical practice. A 2016 study by Laciuga et al. [43] explored the validity of auditory-perceptual assessments of cough by comparing auditory-perceptual ratings to measures of cough airflow. In that study, ten exemplar coughs produced by six healthy adults were perceptually rated by 30 experienced clinicians and compared to objective measures of cough airflow. The perceptual cough descriptors used in that study included ratings of perceived strength, effectiveness, duration, strain, normality, type of expiratory maneuver (cough vs. throat clear vs. huff), and number of expiratory maneuvers (single vs. multiple). While inferential statistics were not completed as part of that study, descriptive data revealed several interesting findings. First, clinicians perceived differences between coughs, throat clears, and huffs, and each of these expiratory maneuvers had distinct airflow parameters. Second, listeners accurately perceived differences in the number of expiratory maneuvers across audio samples. This is important since the number of coughs within a trial influences cough airflow measures [44]. Third, perceptual ratings of strength, effectiveness, and normality appeared associated with cough peak expiratory flow rate, compression phase duration, and cough expired volume. Fourth, perceptual ratings of duration appeared associated with compression phase duration and total cough duration. Lastly, perceptual ratings of strain appeared associated with compression phase duration and peak cough flow rise time.
The above findings provide preliminary evidence that auditory-perceptual assessments of cough may be a valid way to grossly estimate cough function. However, the reliability of auditory-perceptual assessments of cough is currently inconclusive [43, 45‒49]. It is important to gain an immediate understanding of the reliability of auditory-perceptual assessments of cough for two reasons. First, given the widespread use of auditory-perceptual assessments of cough in clinical practice [18‒22], it is of utmost importance to have a clear understanding of how reliable clinicians are with themselves (intra-rater reliability) and with each other (inter-rater reliability). Second, by first identifying which cough descriptors can be judged with an acceptable level of reliability among novice listeners, future research can begin to explore the validity of said descriptors. Characterizing the validity of cough descriptors before thoroughly understanding their reliability could inadvertently result in clinicians using cough descriptors in clinical practice that appear meaningful (valid) but may unknowingly have poor reliability and low diagnostic accuracy.
Aims
The primary aim of this exploratory study was to characterize the reliability of auditory-perceptual assessments of cough among a group of novice clinicians. The secondary aim of this study was to explore if standardized training could be used to improve rater reliability of auditory-perceptual assessments of cough.
Materials and Methods
Design Overview
This study protocol was reviewed and approved by the Institutional Review Board at Teachers College, Columbia University, with approval number IRB #21-392. Twelve SLP graduate students who were completing internship training at a university-based clinical research dysphagia laboratory were used as the novice clinicians in this study. This study was a secondary analysis of data collected from the clinical trainings of the novice clinicians. The novice clinicians were blinded to the aims of this study and provided ratings of auditory-perceptual assessments of cough for ten cough descriptors for 50 exemplar cough audio clips and 50 dystussic cough audio clips.
Cough audio clips were presented in a randomized order. Twenty percent of the cough audio clips (20 of the 100) were randomly selected and presented a second time to examine intra-rater reliability, yielding 120 clips per rating session. The 120 cough audio clips were rated within a single day. One week later, the novice clinicians completed a standardized, auditory-perceptual cough assessment training protocol. Three days after training completion, the clinicians re-rated the same 120 cough audio clips, presented in a re-randomized order, within a single day. The novice clinicians were not informed that some cough audio clips were repeated to assess intra-rater reliability or that the same audio clips were used pre- and post-training. A post-training questionnaire was completed by the clinicians to obtain information related to demographics, self-perceptions of confidence and reliability in perceptual cough assessments, and comments about the training.
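The rating workload implied by this design can be tallied in a few lines. The following is only an illustrative Python sketch (the study's own analyses were run in R); the constant names are hypothetical, and the counts are taken from the design described above.

```python
# Illustrative tally of the rating workload; all names are hypothetical.
UNIQUE_CLIPS = 50 + 50   # exemplar clips + dystussic clips
REPEAT_PERCENT = 20      # share of clips presented twice (intra-rater check)
N_RATERS = 12            # novice clinicians
N_SESSIONS = 2           # pre-training and post-training

# Clips that each rater hears a second time within one session.
repeated_clips = UNIQUE_CLIPS * REPEAT_PERCENT // 100

# Clips presented to each rater in one session.
clips_per_session = UNIQUE_CLIPS + repeated_clips

# Clip presentations summed over all raters and both sessions.
total_clip_presentations = clips_per_session * N_RATERS * N_SESSIONS

print(clips_per_session)         # 120
print(total_clip_presentations)  # 2880
```

Under these counts, each clinician hears 120 clips per session, and the two sessions together give 2,880 rated clips across the group.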
Auditory-Perceptual Cough Descriptors
Ten auditory-perceptual cough descriptors were analyzed in this study, including: strength, crispness, amount of voicing, strain, duration, type of expiratory maneuver, number of expiratory maneuvers, coordination, effectiveness, and normality. All cough descriptors, except for “type” and “number” of expiratory maneuvers, were rated using a 100-point numerical rating scale. The type of expiratory maneuver was categorically rated by selecting one of four categories: “cough,” “throat clear,” “huff,” or “other.” The number of expiratory maneuvers was rated using one of four ordinal categories: 1-, 2-, 3-, or ≥3-expiratory maneuvers. The term “expiratory maneuver” was used instead of the type/number of “cough” since non-cough behaviors such as throat clears and huffs were included for rating.
Cough Audio Clips
Two sets of cough audio clips were pulled from a clinical research database and used to assess reliability: 50 cued voluntary coughs and 50 dystussic voluntary coughs. The 50 cued coughs, originally compiled for clinical training purposes, were produced by three healthy adult volunteers. The healthy adults were instructed to complete 20 unique experimentally induced voluntary coughs (Fig. 1) intended to differ by auditory-perceptual cough descriptors. Definitions of the cough descriptors (Table 1) were provided to the healthy adults to guide performance of the cued coughs. These were the same definitions used in the standardized training for the graduate interns. The cued coughs were recorded directly into Praat acoustical analysis software (Boersma and Weenink; Institute of Phonetic Sciences, University of Amsterdam, The Netherlands) using MacBook Pro laptop computers (Apple, Cupertino, CA, USA). Recordings were made in a quiet environment with a mouth-to-microphone distance of approximately 30 cm. Audio samples were digitized with a sampling frequency of 44,100 Hz and saved as .wav files.
Fig. 1. Description of the 20 coughs completed by (1) the healthy adults for analysis of the “exemplary coughs”; (2) the audio clips of the two clinicians in the part 3 listening practice; and (3) the 12 novice clinicians in the part 4 imitation practice.
Table 1. Definitions of cough descriptors
Descriptors | Definitions |
---|---|
Quality | |
Strength | Perceived force and loudness of expired airflow, taking into consideration distance from the sound source |
Crispness | Perception of an abrupt (as opposed to gradual) and distinct pop of expired airflow at the onset of the expulsive cough phase |
Voicing | Perception of vocal fold vibration during the expulsive cough phase |
Strain | Perception of excessive vocal effort (hyperfunction), when voicing is present |
Duration | Length of time of the expiratory phase of cough |
Effectiveness | Perceived effectiveness at clearing material from the airway |
Normality | How normal (as opposed to abnormal) the expiratory maneuver sounds |
Coordination | How coordinated (as opposed to discoordinated) the expiratory maneuver sounds |
Type | |
Cough | A forced expiratory maneuver, usually against a closed glottis, associated with a characteristic sound |
Throat clear | Similar to but distinctly different from a cough – occurring with less thoracic pressure |
Huff | A sharp, forced expiration without glottic closure |
Number | The number of expiratory maneuvers present, counted in whole numbers (e.g., 0, 1, 2, 3, etc.) |
The 50 dystussic coughs were recorded during flexible endoscopic evaluations of swallowing (FEES) previously completed using a standardized method for prospective research purposes. The FEES equipment included a 3.0 mm-diameter flexible distal chip laryngoscope (ENT-5000; Cogentix Medical, New York, NY, USA) and a video recording system (Cogentix Medical, DPU-7000A). Audio during the FEES was recorded using a lapel microphone secured to the patient’s right shoulder with a mouth-to-microphone distance of approximately 30 cm. Audio-only samples were digitized with a sampling frequency of 44,100 Hz and saved as .wav files for blinded (no video) analysis. The dystussic cough audio clips were from people with Parkinson’s disease (n = 22), progressive supranuclear palsy (n = 7), and multiple system atrophy (n = 1) who demonstrated aspiration during the FEES and were cued by a clinician to cough in response to the aspiration.
Standardized Training Protocol
The training protocol used in this study was developed by the first author and included four parts: (1) background information on cough physiology and instrumental assessments; (2) standardized definitions of the ten cough descriptors; (3) listening practice; and (4) imitation practice. A pre-recorded training video was viewed by the clinicians to facilitate standardized completion of parts 1–3. Part 3 listening practice included cough audio clips originally recorded by two of the authors, each performing the 20 exemplar coughs (Fig. 1). Part 4 was completed immediately after viewing the training video and involved each clinician practicing the 20 exemplar coughs four times each. The fourth trial of each exemplar cough was recorded directly into Praat by each clinician using their personal computer’s built-in microphone at a sampling frequency of 44,100 Hz. The clinicians submitted .wav files of their imitation practice audio recording to ensure completion of the imitation practice.
Statistical Analysis
All analyses were performed in R version 4.1.1 [50]. All data, R code, and a copy of the training video and training PDF document that support the findings from this study are openly available for data transparency and re-use by clinicians and researchers in the Open Science Framework repository at https://osf.io/kwjcy/.
Intra-rater reliability was calculated for each clinician for each cough descriptor (n of 12 per cough descriptor). Inter-rater reliability was calculated for the entire group (“group-level”; n of 1 per cough descriptor) and for each unique pair of raters (“dyad-level”; n of 66 per cough descriptor). Group-level reliability with all 12 clinicians was calculated to adhere to statistical guidelines which recommend using >2 raters when determining reliability of an assessment technique [51]. Dyad-level reliability, which typically yields lower reliability coefficients than when relying on >2 raters, was calculated to provide a point of reference for future research, whereby studies may rely only on two raters when assessing inter-rater reliability. Intra- and inter-rater reliability were calculated at baseline (pre-training) and 3 days after training (post-training).
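The count of 66 dyads follows from choosing 2 raters out of 12. As a brief illustrative sketch in Python (not the authors' R code; the rater labels are hypothetical), the unique pairs can be enumerated as follows.

```python
from itertools import combinations

# Hypothetical rater labels R01..R12; the study does not name its raters.
rater_ids = [f"R{i:02d}" for i in range(1, 13)]

# Every unordered pair of distinct raters is one dyad.
dyads = list(combinations(rater_ids, 2))

print(len(dyads))  # 66
print(dyads[0])    # ('R01', 'R02')
```

A dyad-level coefficient would then be computed once per pair and per cough descriptor, which is why each descriptor has 66 dyad-level values.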
Intraclass correlation coefficients (ICC) were used to measure reliability for all 100-point, continuous variables (i.e., strength, effectiveness, normality, duration, strain, coordination, voicing, crispness). ICCs were calculated using two-way random effects, with absolute agreement, and an average of k raters. The average of k raters was used because the true value of a perceptual measure was unknown, and therefore the average of two or more ratings was used as the unit of analysis. ICC analyses using a “single rater” approach were also completed (online suppl. Tables 1–3; for all online suppl. material, see https://doi.org/10.1159/000533372) to serve as a point of reference for future research.
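For readers unfamiliar with this ICC form, a two-way random-effects, absolute-agreement, average-of-k-raters coefficient is commonly written as ICC(2,k) = (MSR − MSE) / (MSR + (MSC − MSE)/n), where MSR, MSC, and MSE are the mean squares for clips (rows), raters (columns), and residual error, and n is the number of clips. The Python sketch below illustrates that formula under these assumptions; it is not the authors' R code, the function name `icc2k` is hypothetical, and the small ratings matrix is illustrative data rather than study data.

```python
def icc2k(ratings):
    """ICC(2,k): two-way random effects, absolute agreement, average of k raters.

    ratings: list of n rows (one per clip), each holding k scores (one per rater).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for clips (rows), raters (columns), and residual error.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (msc - mse) / n)


# Illustrative matrix: 6 clips (rows) rated by 4 raters (columns).
toy_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc2k(toy_ratings), 2))  # 0.62
```

Because the coefficient uses absolute agreement, systematic differences between raters (the MSC term) lower the value, which would not be the case for a consistency-type ICC.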
Kappa (κ) was used to measure reliability for categorical variables (type and number of expiratory maneuvers). Fleiss Kappa was used for group-level inter-rater reliability to account for >2 raters, and Cohen’s Kappa was used for intra-rater reliability and dyad-level inter-rater reliability. Cohen’s Kappa was weighted for the “number of expiratory maneuvers” cough descriptor since this descriptor is ordinal in nature and was unweighted for the “type of expiratory maneuver” cough descriptor since this descriptor is nominal in nature.
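To make the distinction between unweighted and weighted Cohen's Kappa concrete, the following Python sketch computes both for two raters. It is an illustration only, not the authors' R code: the function name `cohen_kappa` and the example ratings are hypothetical, and the linear weighting scheme is an assumption, since the text does not state whether linear or quadratic weights were used. Fleiss Kappa is not covered here.

```python
def cohen_kappa(rater_a, rater_b, categories, weights=None):
    """Cohen's Kappa for two raters.

    weights=None gives the unweighted (nominal) form; "linear" or "quadratic"
    gives a weighted (ordinal) form, with `categories` listed in rank order.
    Undefined (division by zero) if both raters use a single category throughout.
    """
    n = len(rater_a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint proportions and each rater's marginal proportions.
    joint = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        joint[index[a]][index[b]] += 1 / n
    marg_a = [sum(joint[i]) for i in range(k)]
    marg_b = [sum(joint[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if weights is None:
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d ** 2

    observed = sum(disagreement(i, j) * joint[i][j]
                   for i in range(k) for j in range(k))
    expected = sum(disagreement(i, j) * marg_a[i] * marg_b[j]
                   for i in range(k) for j in range(k))
    return 1 - observed / expected


# Nominal example: type of expiratory maneuver (hypothetical ratings).
types = ["cough", "throat clear", "huff", "other"]
type_a = ["cough", "cough", "huff", "throat clear", "cough", "huff"]
type_b = ["cough", "huff", "huff", "throat clear", "cough", "cough"]
kappa_type = cohen_kappa(type_a, type_b, types)

# Ordinal example: number of expiratory maneuvers, coded 1..4 (hypothetical).
counts = [1, 2, 3, 4]
num_a = [1, 2, 3, 4, 1, 2]
num_b = [1, 3, 3, 4, 2, 2]
kappa_num_unweighted = cohen_kappa(num_a, num_b, counts)
kappa_num_weighted = cohen_kappa(num_a, num_b, counts, weights="linear")
```

In the ordinal example, both disagreements are off by a single category, so the weighted coefficient is higher than the unweighted one; this is the rationale for weighting an ordinal descriptor while leaving a nominal descriptor unweighted.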
Group-level inter-rater reliability was considered “improved” for a cough descriptor if the mean reliability score post-training was greater than the upper limit of the mean’s 95% confidence interval pre-training. Intra-rater reliability and dyad-level inter-rater reliability were considered improved for a cough descriptor if the median reliability score post-training was greater than the 75th percentile of the reliability scores pre-training. Median scores were used for intra-rater reliability and dyad-level inter-rater reliability due to skewness in the distribution of the reliability scores.
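The two decision rules above reduce to simple comparisons. The Python sketch below is an illustration with hypothetical function names, applied to summary values reported in the intra-rater and group-level inter-rater reliability tables of this article; it is not the authors' R code.

```python
def improved_by_percentile(post_median, pre_75th_percentile):
    """Rule for intra-rater and dyad-level inter-rater reliability."""
    return post_median > pre_75th_percentile


def improved_by_ci(post_estimate, pre_ci_upper_limit):
    """Rule for group-level inter-rater reliability."""
    return post_estimate > pre_ci_upper_limit


# Intra-rater reliability (values from the intra-rater reliability table).
crispness_intra = improved_by_percentile(post_median=0.903, pre_75th_percentile=0.845)
strength_intra = improved_by_percentile(post_median=0.904, pre_75th_percentile=0.916)

# Group-level inter-rater reliability (values from the group-level table).
type_group = improved_by_ci(post_estimate=0.558, pre_ci_upper_limit=0.471)
strength_group = improved_by_ci(post_estimate=0.935, pre_ci_upper_limit=0.938)

print(crispness_intra, strength_intra)  # True False
print(type_group, strength_group)       # True False
```

Under these rules, a descriptor that was already highly reliable before training (such as strength) has little room to count as improved, even if its coefficient rises slightly.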
Percent change (%Δ) was used to characterize the magnitude of change in reliability from baseline to post-training. Percent change was calculated using the median scores for intra-rater reliability and dyad-level inter-rater reliability and the mean score for group-level inter-rater reliability. Percent change in reliability was calculated using the following formula:
%Δ = [(post-training reliability – pre-training reliability) ÷ (pre-training reliability)] × 100
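As a worked illustration of this calculation, expressed as a percentage to match the percent-change table, the Python sketch below applies it to two pairs of median values reported in the reliability tables. The function name `percent_change` is hypothetical, and this is not the authors' R code.

```python
def percent_change(pre, post):
    """Percent change in reliability from pre-training to post-training."""
    return (post - pre) / pre * 100


# Strain, intra-rater reliability: median 0.580 pre-training, 0.803 post-training.
strain_intra = round(percent_change(0.580, 0.803), 1)

# Amount of voicing, dyad-level reliability: median 0.295 pre, 0.506 post.
voicing_dyad = round(percent_change(0.295, 0.506), 1)

print(strain_intra)  # 38.4
print(voicing_dyad)  # 71.5
```

Because the change is expressed relative to the pre-training value, descriptors with low baseline reliability can show large percent changes from modest absolute gains.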
Results
A total of 2,880 cough audio clips (120 clips × 12 clinicians × 2 rating sessions) were rated by the 12 novice clinicians (cis women = 11, gender non-binary = 1). Median age of the raters was 23 years (range: 22–40 years). None of the clinicians reported any hearing loss. Eight of the 12 clinicians reported experience completing auditory-perceptual assessments of voice – all within the context of a 3-month graduate-level voice disorders class. Reliability results are outlined in Tables 2–5 and Figures 2 and 3.
Table 2. Percent change from baseline to post-training
Descriptor | Change in intra-rater reliability (%) | Change in dyad-level inter-rater reliability (%) | Change in group-level inter-rater reliability (%) |
---|---|---|---|
Strength (ICC) | 0.1 | 5.3 | 2.4 |
Effectiveness (ICC) | −1.4 | 11.1 | 2.6 |
Amount of voicing (ICC) | −0.1 | 71.5 | 19.4 |
Normality (ICC) | −2.3 | 16.8 | 3.1 |
Duration (ICC) | 3.2 | 30.9 | 5.5 |
Strain (ICC) | 38.4 | 47.0 | 15.5 |
Crispness (ICC) | 15.0 | 27.6 | 8.4 |
Coordination (ICC) | 10.0 | 35.3 | 13.3 |
Type (Cohen’s Kappa) | 9.3 | 19.3 | 22.9 |
Number (weighted Kappa) | 18.3 | 7.9 | 8.2 |
ICC, intraclass correlation coefficient; type, type of expiratory maneuver; number, number of expiratory maneuvers.
Table 3. Intra-rater reliability (average)
Descriptor | Mean | 95% C.I. (mean L.L.) | 95% C.I. (mean U.L.) | SD | Median | 25th | 75th | Min | Max |
---|---|---|---|---|---|---|---|---|---|
Pre-training | | | | | | | | | |
Strength (ICC) | 0.849 | 0.630 | 0.939 | 0.141 | 0.903 | 0.849 | 0.916 | 0.486 | 0.973 |
Effectiveness (ICC) | 0.878 | 0.661 | 0.953 | 0.083 | 0.895 | 0.835 | 0.933 | 0.719 | 0.982 |
Amount of voicing (ICC) | 0.609 | 0.029 | 0.843 | 0.337 | 0.757 | 0.443 | 0.852 | 0.037 | 0.945 |
Normality (ICC) | 0.873 | 0.674 | 0.950 | 0.077 | 0.898 | 0.834 | 0.918 | 0.715 | 0.976 |
Duration (ICC) | 0.718 | 0.270 | 0.889 | 0.157 | 0.765 | 0.692 | 0.806 | 0.355 | 0.890 |
Strain (ICC) | 0.555 | −0.077 | 0.820 | 0.317 | 0.580 | 0.462 | 0.835 | 0.079 | 0.886 |
Crispness (ICC) | 0.740 | 0.314 | 0.899 | 0.149 | 0.785 | 0.649 | 0.845 | 0.441 | 0.908 |
Coordination (ICC) | 0.632 | 0.023 | 0.857 | 0.327 | 0.767 | 0.605 | 0.827 | 0.163 | 0.905 |
Type (Cohen’s Kappa) | 0.767 | 0.531 | 1.000 | 0.201 | 0.824 | 0.774 | 0.897 | 0.253 | 0.914 |
Number (weighted Kappa) | 0.710 | 0.512 | 0.887 | 0.142 | 0.696 | 0.640 | 0.867 | 0.457 | 0.883 |
Post-training | | | | | | | | | |
Strength (ICC) | 0.865 | 0.653 | 0.948 | 0.095 | 0.904 | 0.837 | 0.929 | 0.613 | 0.956 |
Effectiveness (ICC) | 0.864 | 0.642 | 0.947 | 0.068 | 0.882 | 0.834 | 0.909 | 0.732 | 0.949 |
Amount of voicing (ICC) | 0.690 | 0.207 | 0.878 | 0.244 | 0.756 | 0.643 | 0.826 | 0.033 | 0.934 |
Normality (ICC) | 0.883 | 0.691 | 0.954 | 0.060 | 0.877 | 0.836 | 0.928 | 0.795 | 0.983 |
Duration (ICC) | 0.740 | 0.344 | 0.898 | 0.188 | 0.790 | 0.632 | 0.872 | 0.275 | 0.963 |
Strain (ICC) | 0.744 | 0.416 | 0.892 | 0.221 | 0.803 | 0.704 | 0.855 | 0.087 | 0.931 |
Crispness (ICC) | 0.886 | 0.718 | 0.956 | 0.059 | 0.903 | 0.853 | 0.932 | 0.776 | 0.950 |
Coordination (ICC) | 0.793 | 0.467 | 0.919 | 0.148 | 0.844 | 0.762 | 0.896 | 0.476 | 0.924 |
Type (Cohen’s Kappa) | 0.847 | 0.674 | 1.000 | 0.136 | 0.901 | 0.755 | 0.930 | 0.619 | 1.000 |
Number (weighted Kappa) | 0.836 | 0.618 | 1.000 | 0.063 | 0.824 | 0.784 | 0.884 | 0.764 | 0.958 |
C.I., confidence interval; L.L., lower limit; U.L., upper limit; SD, standard deviation; 25th, 25th percentile; 75th, 75th percentile; Min, minimum; Max, maximum; type, type of expiratory maneuver; number, number of expiratory maneuvers.
Table 4. Dyad-level inter-rater reliability
Descriptor | Mean | 95% C.I. (mean L.L.) | 95% C.I. (mean U.L.) | SD | Median | 25th | 75th | Min | Max |
---|---|---|---|---|---|---|---|---|---|
Pre-training | | | | | | | | | |
Strength (ICC) | 0.621 | 0.331 | 0.768 | 0.098 | 0.690 | 0.555 | 0.678 | 0.411 | 0.861 |
Effectiveness (ICC) | 0.626 | 0.306 | 0.778 | 0.118 | 0.643 | 0.555 | 0.697 | 0.340 | 0.860 |
Amount of voicing (ICC) | 0.291 | −0.063 | 0.523 | 0.204 | 0.295 | 0.154 | 0.454 | −0.185 | 0.619 |
Normality (ICC) | 0.640 | 0.094 | 0.779 | 0.105 | 0.642 | 0.574 | 0.701 | 0.385 | 0.868 |
Duration (ICC) | 0.379 | 0.046 | 0.592 | 0.236 | 0.397 | 0.231 | 0.569 | −0.388 | 0.815 |
Strain (ICC) | 0.314 | −0.066 | 0.561 | 0.148 | 0.306 | 0.215 | 0.421 | −0.0731 | 0.612 |
Crispness (ICC) | 0.482 | 0.117 | 0.681 | 0.131 | 0.510 | 0.375 | 0.580 | 0.246 | 0.766 |
Coordination (ICC) | 0.372 | 0.010 | 0.586 | 0.220 | 0.416 | 0.220 | 0.535 | −0.009 | 0.764 |
Type (Cohen’s Kappa) | 0.475 | 0.288 | 0.646 | 0.170 | 0.512 | 0.376 | 0.592 | 0.064 | 0.749 |
Number (weighted Kappa) | 0.726 | 0.594 | 0.858 | 0.085 | 0.745 | 0.662 | 0.786 | 0.508 | 0.870 |
Post-training | | | | | | | | | |
Strength (ICC) | 0.708 | 0.511 | 0.816 | 0.081 | 0.727 | 0.644 | 0.768 | 0.487 | 0.849 |
Effectiveness (ICC) | 0.710 | 0.529 | 0.814 | 0.080 | 0.715 | 0.641 | 0.764 | 0.515 | 0.854 |
Amount of voicing (ICC) | 0.485 | 0.094 | 0.699 | 0.176 | 0.506 | 0.334 | 0.630 | 0.160 | 0.810 |
Normality (ICC) | 0.739 | 0.585 | 0.830 | 0.088 | 0.750 | 0.678 | 0.802 | 0.457 | 0.914 |
Duration (ICC) | 0.506 | 0.180 | 0.687 | 0.168 | 0.520 | 0.397 | 0.639 | 0.028 | 0.740 |
Strain (ICC) | 0.460 | 0.066 | 0.677 | 0.174 | 0.450 | 0.360 | 0.605 | −0.077 | 0.798 |
Crispness (ICC) | 0.656 | 0.473 | 0.774 | 0.094 | 0.651 | 0.594 | 0.728 | 0.453 | 0.863 |
Coordination (ICC) | 0.547 | 0.236 | 0.717 | 0.145 | 0.563 | 0.473 | 0.657 | 0.061 | 0.825 |
Type (Cohen’s Kappa) | 0.572 | 0.386 | 0.758 | 0.163 | 0.611 | 0.513 | 0.693 | 0.158 | 0.837 |
Number (weighted Kappa) | 0.788 | 0.665 | 0.911 | 0.092 | 0.804 | 0.745 | 0.848 | 0.508 | 0.970 |
C.I., confidence interval; L.L., lower limit; U.L., upper limit; SD, standard deviation; 25th, 25th percentile; 75th, 75th percentile; Min, minimum; Max, maximum; type, type of expiratory maneuver; number, number of expiratory maneuvers.
Group-level inter-rater reliability
Descriptor | Test statistic | p value | Reliability | 95% C.I. lower limit | 95% C.I. upper limit |
---|---|---|---|---|---|
Pre-training | |||||
Strength (ICC) | 13.9 | <0.0001 | 0.913 | 0.880 | 0.938 |
Effectiveness (ICC) | 14.6 | <0.0001 | 0.912 | 0.877 | 0.939 |
Amount of voicing (ICC) | 4.6 | <0.0001 | 0.706 | 0.577 | 0.799 |
Normality (ICC) | 14.5 | <0.0001 | 0.919 | 0.890 | 0.942 |
Duration (ICC) | 5.9 | <0.0001 | 0.811 | 0.746 | 0.863 |
Strain (ICC) | 5.2 | <0.0001 | 0.720 | 0.588 | 0.811 |
Crispness (ICC) | 8.6 | <0.0001 | 0.852 | 0.793 | 0.897 |
Coordination (ICC) | 7.1 | <0.0001 | 0.781 | 0.669 | 0.855 |
Type (Fleiss Kappa) | 49.2 | <0.0001 | 0.454 | 0.437 | 0.471 |
Number (Fleiss Kappa) | 80.6 | <0.0001 | 0.773 | 0.759 | 0.787 |
Post-training | |||||
Strength (ICC) | 16.9 | <0.0001 | 0.935 | 0.913 | 0.953 |
Effectiveness (ICC) | 16.9 | <0.0001 | 0.936 | 0.915 | 0.954 |
Amount of voicing (ICC) | 9.8 | <0.0001 | 0.843 | 0.762 | 0.897 |
Normality (ICC) | 20.4 | <0.0001 | 0.948 | 0.931 | 0.963 |
Duration (ICC) | 8.4 | <0.0001 | 0.856 | 0.801 | 0.898 |
Strain (ICC) | 8.4 | <0.0001 | 0.832 | 0.753 | 0.886 |
Crispness (ICC) | 13.8 | <0.0001 | 0.924 | 0.899 | 0.945 |
Coordination (ICC) | 10.0 | <0.0001 | 0.885 | 0.844 | 0.918 |
Type (Fleiss Kappa) | 62.9 | <0.0001 | 0.558 | 0.541 | 0.575 |
Number (Fleiss Kappa) | 90.7 | <0.0001 | 0.837 | 0.823 | 0.851 |
The test statistics and p values for each reliability calculation are outlined in the left two columns. The reliability estimate (ICC or Kappa) for each descriptor is outlined in the middle column with the 95% confidence interval for each reliability estimate in the right two columns.
C.I., confidence interval; type, type of expiratory maneuver; number, number of expiratory maneuvers.
Distributions of the intra-rater reliability coefficients (n = 12) for each auditory-perceptual cough descriptor at baseline (purple) and at post-training (yellow). ICC, intraclass correlation coefficients.
Distributions of the dyad-level inter-rater reliability coefficients (n = 66) for each auditory-perceptual cough descriptor at baseline (purple) and at post-training (yellow). ICC, intraclass correlation coefficients.
Intra-Rater Reliability
Median ICCs ranged from 0.580 to 0.903 pre-training and from 0.756 to 0.904 post-training. Median κ ranged from 0.696 to 0.824 pre-training and from 0.824 to 0.901 post-training. Reliability improved post-training for “crispness,” “coordination,” and “type of expiratory maneuver,” with reliability coefficients increasing by an average of 11.4%. Intra-rater reliability did not worsen for any of the cough descriptors after training.
Inter-Rater Reliability
Dyad-Level
ICCs ranged from 0.295 to 0.690 pre-training and from 0.450 to 0.727 post-training. κ ranged from 0.512 to 0.745 pre-training and from 0.611 to 0.804 post-training. Reliability improved post-training for all cough descriptors except “duration,” with reliability coefficients increasing by an average of 26.9%. Dyad-level inter-rater reliability did not worsen for any of the cough descriptors after training.
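As a rough illustration of a dyad-level analysis, the sketch below computes unweighted Cohen's κ for every unique pair of raters and summarizes the distribution with the median. With 12 raters this pairing yields 66 dyads, which matches the n = 66 reported for the dyad-level distributions. The function names, the three-rater toy labels, and the use of Python rather than the authors' R code are assumptions made for illustration; the sketch is not the study's analysis script and does not implement the weighted κ used for the number of expiratory maneuvers.

```python
from collections import Counter
from itertools import combinations
from math import nan
from statistics import median


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters' categorical labels."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(a), Counter(b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return nan  # undefined when both raters use one identical category
    return (observed - expected) / (1.0 - expected)


def dyad_kappas(ratings):
    """Kappa for each unique rater pair; `ratings` maps rater -> labels."""
    return {
        (r1, r2): cohen_kappa(ratings[r1], ratings[r2])
        for r1, r2 in combinations(sorted(ratings), 2)
    }


# Hypothetical labels for "type of expiratory maneuver" (not study data).
toy_ratings = {
    "rater_a": ["cough", "cough", "huff", "huff", "throat clear", "cough"],
    "rater_b": ["cough", "cough", "huff", "cough", "throat clear", "cough"],
    "rater_c": ["cough", "huff", "huff", "huff", "throat clear", "other"],
}

kappas = dyad_kappas(toy_ratings)
print(len(kappas), "dyads; median kappa =", round(median(kappas.values()), 3))
```

Reporting the median and percentiles of the pairwise coefficients, rather than a single pooled value, shows how much agreement varies from one pair of raters to the next.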
Group-Level
ICCs ranged from 0.706 to 0.919 pre-training and from 0.832 to 0.948 post-training. κ ranged from 0.454 to 0.773 pre-training and from 0.558 to 0.837 post-training. Reliability improved post-training for “amount of voicing,” “normality,” “strain,” “crispness,” “coordination,” “type,” and “number,” with reliability coefficients increasing by an average of 13.0%. Group-level inter-rater reliability did not worsen for any of the cough descriptors after training.
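For the categorical descriptors, group-level agreement was summarized with Fleiss' κ. The sketch below shows the standard calculation from a table of per-item category counts. The function name and the toy counts (5 clips, 12 raters, four maneuver categories) are hypothetical and are not study data; the authors' own R implementation may differ in detail.

```python
from math import nan


def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = raters assigning item i to category j."""
    if not counts:
        raise ValueError("at least one rated item is required")
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    total = n_items * n_raters
    # Overall proportion of all ratings falling in each category.
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Per-item agreement: share of rater pairs that chose the same category.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_item) / n_items      # mean observed agreement
    p_exp = sum(p * p for p in p_cat)  # agreement expected by chance
    if p_exp == 1.0:
        return nan  # undefined when every rating falls in one category
    return (p_bar - p_exp) / (1.0 - p_exp)


# Hypothetical counts for 5 clips rated by 12 raters (not study data).
# Columns: cough, throat clear, huff, other.
toy_counts = [
    [12, 0, 0, 0],
    [9, 3, 0, 0],
    [1, 10, 0, 1],
    [2, 0, 9, 1],
    [6, 2, 3, 1],
]
print(round(fleiss_kappa(toy_counts), 3))
```

Unlike Cohen's κ, this statistic takes all raters at once, so it gives one coefficient per descriptor rather than a distribution across pairs.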
Post-Training Interview
Clinicians’ Self-Perceptions of Cough Rating Reliability and Confidence
When asked what negatively impacted perceived confidence in performing auditory-perceptual cough assessments pre-training, the majority of clinicians reported lack of definitions related to perceptual cough descriptors (n = 12; 100%), lack of experience listening to normal coughs (n = 11; 91%), lack of experience listening to disordered coughs (n = 9; 75%), lack of experience listening to exemplar coughs (i.e., “anchor” cough audio clips; n = 9; 75%), and lack of experience imitating/modeling different types of normal and disordered coughs (n = 9; 75%).
When asked to rate how consistent they thought their own auditory-perceptual cough ratings were (intra-rater reliability) on a scale from 0 (not at all reliable) to 10 (completely reliable), clinicians reported a median score of 4.5 pre-training and 7 post-training. Similarly, when asked to rate how reliable they thought they were with their colleagues (inter-rater reliability), clinicians reported a median score of 4.0 pre-training and 6.0 post-training.
Clinicians’ Comments about the Training
When asked which part or parts of the training were most helpful for improving their confidence in auditory-perceptual cough assessment ratings, the most common response was listening practice (n = 10; 83%), followed by imitation practice (n = 5; 41%), and the provision of standardized definitions of cough descriptors (n = 5; 41%). When asked which part or parts of the training were least helpful for improving their confidence in auditory-perceptual cough assessment ratings, the most common response was education (n = 7; 58%). When asked how easy the training was on a scale of 1 (very difficult) to 5 (very easy), clinicians reported a median score of 2.5, with scores ranging from 2 to 4.
Discussion
The reliability of ten auditory-perceptual cough descriptors was analyzed in this study because of their potential relevance to clinical practice. Perceived strength, duration, strain, coordination, effectiveness, normality, type of expiratory maneuver, and number of expiratory maneuvers were included because they are either currently used in clinical practice [18‒22] or have preliminary evidence suggesting important relationships with objective measures of cough airflow [43, 49]. Furthermore, research has found that perceptual ratings of cough strength and effectiveness can predict “safe” versus “unsafe” swallowers [49] – findings which have been repeatedly seen when evaluating people with dysphagia using objective methods to assess cough airflow [9]. “Amount of voicing” was included since voicing is an established component of the acoustic cough phase and has physiologic relevance [17]. Perceived “crispness” was included as a cough descriptor since it is hypothesized to be associated with the “characteristic cough sound” and since it relates to pitch, loudness, and sound energy [52, 53] – important acoustic properties associated with cough airflow [54].
Results from this study demonstrated that reliability varied across the ten perceptual cough descriptors, with perceived strength, effectiveness, and normality tending to exhibit the highest degree of inter-rater reliability and with type of expiratory maneuver tending to exhibit the lowest degree of inter-rater reliability. Differences in reliability among the ten cough descriptors were especially pronounced pre-training, suggesting that some cough descriptors were more intuitive and/or easier to rate than others prior to being provided with standardized definitions, listening examples, and imitation practice.
Using common benchmarks to judge the level of observed reliability [51], pre-training inter-rater reliability was found to be “poor” for type of expiratory maneuver, “moderate” for perceived strain and amount of voicing, “good” for perceived coordination, crispness, duration, and number of expiratory maneuvers, and “excellent” for perceived strength, effectiveness, and normality. However, while categorizing reliability into groups such as “poor,” “moderate,” “good,” and “excellent” is common in research, such classifications are arbitrary and need to be considered within their clinical and research context [55‒57]. Therefore, to guide interpretation of the reliability findings from the present study, results can be compared to three related areas of reliability research: auditory-perceptual assessments of voice, visual-perceptual assessments of swallowing, and instrumental assessments of cough.
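The benchmark labels above can be reproduced from the pre-training group-level coefficients with a simple threshold lookup, sketched below. The cutoffs (below 0.50 poor, 0.50 to 0.75 moderate, 0.75 to 0.90 good, above 0.90 excellent) are an assumption based on commonly cited ICC guidelines; the text cites [51] without listing the values, and how values that fall exactly on a boundary are handled here is arbitrary. The function name is hypothetical.

```python
def benchmark_label(coefficient):
    """Map a reliability coefficient to a qualitative benchmark label."""
    # ASSUMED cutoffs (commonly cited ICC guidelines); the article cites
    # [51] without listing them. Boundary handling is an arbitrary choice.
    if coefficient < 0.50:
        return "poor"
    if coefficient < 0.75:
        return "moderate"
    if coefficient <= 0.90:
        return "good"
    return "excellent"


# Pre-training group-level reliability estimates as reported in the table.
pre_training_group = {
    "strength": 0.913,
    "effectiveness": 0.912,
    "amount of voicing": 0.706,
    "normality": 0.919,
    "duration": 0.811,
    "strain": 0.720,
    "crispness": 0.852,
    "coordination": 0.781,
    "type": 0.454,
    "number": 0.773,
}

labels = {name: benchmark_label(v) for name, v in pre_training_group.items()}
for name, value in pre_training_group.items():
    print(f"{name:>18}: {value:.3f} -> {labels[name]}")
```

Under these assumed cutoffs the lookup returns the same grouping as the text, which suggests the labels were derived from the group-level estimates. Several coefficients sit close to a cutoff (for example 0.773 and 0.781 against 0.75), which illustrates why such categories should be read with caution.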
From a voice standpoint, the reliability coefficients from the present study were within the range of those reported for auditory-perceptual assessments of voice, as previously outlined. Similarly, from a swallowing standpoint, the reliability coefficients in the present study are consistent with those frequently reported for visual-perceptual assessments of swallowing, including measures related to pharyngeal residue [58‒61], penetration/aspiration [60‒64], and swallowing physiology [65‒67]. Lastly, while the reliability of acoustic and aerodynamic instrumental cough assessment measures is not routinely reported in research, current data demonstrate reliability coefficients range from 0.61 to 0.89 for non-automated measures (e.g., number of coughs, compression phase duration, peak flow rise time, inspiratory/expiratory volume) and 0.90–0.99 for automated measures (e.g., peak flow rate) [68‒77]. Together, these data demonstrate that the reliability of auditory-perceptual assessments of cough is within the range of reliability commonly reported for perceptual and non-automated measures of voice, swallowing, and cough.
Auditory-Perceptual Cough Assessment Training
This is the first study to compare reliability ratings before and after a standardized training program for auditory-perceptual assessments of cough. Training resulted in significant improvements in inter- and intra-rater reliability, as well as clinicians’ reported confidence in auditory-perceptual cough ratings. Nine of the 10 cough descriptors improved for dyad-level inter-rater reliability, five of the 10 cough descriptors improved for group-level inter-rater reliability, and three of the 10 cough descriptors improved for intra-rater reliability.
Training of these ten cough descriptors appeared feasible, with most clinicians completing the training in less than 1 h and reporting that the training was neither too easy nor too difficult. Furthermore, clinicians reported feeling ∼50% more confident in interpreting auditory-perceptual cough assessments after training. Given that clinicians reported listening practice to be the most helpful part of the training to improve their confidence in ratings, future studies should examine if providing more listening examples and auditory anchors can further improve reliability.
Together, these data demonstrate that cough training can be relatively easy and expedient to complete and can result in immediate improvements in both intra- and inter-rater reliability. The training used in this study is freely available (https://osf.io/kwjcy/) as an example for future trainings.
Limitations and Future Directions
This study is not without limitations. One limitation of the present study is that no standard values exist delineating what is versus is not a clinically acceptable level of reliability. This is in part because reliability coefficients are influenced by multiple factors aside from rater agreement, including the number of raters, sample size, and the variability of scores across the sampled data [51]. Low reliability coefficients could occur for reasons unrelated to rater reliability, including a small number of raters, a small sample size, or low variance of the sampled data. Dyad-level inter-rater reliability was analyzed in the present study to serve as a point of reference for clinicians and researchers who may need to rely on examining reliability between just two raters. However, statistical guidelines recommend analyzing reliability between three or more raters [51]. To address this, we also included group-level inter-rater reliability in the present study, which involved analyzing reliability from 12 raters. To enhance accuracy of reliability analyses, it is also recommended to obtain a sample size of at least 30 data points that are heterogeneous in nature [51]. We closely adhered to these guidelines by including a sample size of 100, with samples composed of experimentally altered coughs from healthy adults and naturally occurring dystussic coughs from people with movement disorders.
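The point that low variance in the sampled data can depress a reliability coefficient, independent of how closely raters agree, can be illustrated with a small worked example. The sketch below uses a one-way random-effects, single-rating ICC, which is an assumption chosen for simplicity and is not necessarily the ICC model used in this study. The function name and both toy data sets are hypothetical.

```python
from math import nan


def icc_oneway(scores):
    """One-way random-effects, single-rating ICC (often written ICC(1,1)).

    scores[i][j] = rating of item i by rater j; every item has k ratings.
    NOTE: this ICC form is an illustrative assumption, not the study's model.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 items, each with the same >= 2 ratings")
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    item_means = [sum(row) / k for row in scores]
    # Between-item and within-item mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in item_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, item_means) for x in row
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        return nan  # undefined when all ratings are identical
    return (ms_between - ms_within) / denominator


# In both toy sets the two raters differ by exactly 4 points on every item;
# only the spread between items changes (hypothetical values, not study data).
wide_spread = [[8, 12], [48, 52], [88, 92]]
narrow_spread = [[46, 50], [48, 52], [50, 54]]

print("wide spread  :", round(icc_oneway(wide_spread), 3))
print("narrow spread:", round(icc_oneway(narrow_spread), 3))
```

In this toy case the rater disagreement is identical in both sets, yet the coefficient is about 0.995 for the widely spread items and 0 for the narrowly spread ones, which is why a heterogeneous sample is recommended.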
A second limitation relates to generalizability of findings. Rater reliability was assessed among SLP graduate students who served as novice listeners. Therefore, findings may not generalize to SLPs with greater clinical experience, non-SLP healthcare providers, or clinicians without any experience interpreting auditory-perceptual assessments of voice. Future research should expand on the present study by examining reliability across raters with different levels of listening experience.
Lastly, this was a preliminary study intended to explore the reliability, but not the validity, of a variety of descriptors which could be potentially used for standardized auditory-perceptual assessments of cough. While preliminary data suggest auditory-perceptual assessments of cough may have valid relationships with measures of cough airflow [43] and swallowing safety [49], more research is needed to better understand the association between varying perceptual cough descriptors and cough acoustics and airflow. By identifying which cough descriptors can be reliably judged by clinicians while also obtaining meaningful diagnostic information, future research can begin to establish a standardized, consensus-based approach to auditory-perceptual assessments of cough.
Conclusions
The reliability of auditory-perceptual assessments of cough appears similar to the reliability that has been frequently observed for auditory-perceptual assessments of voice and visual-perceptual assessments of swallowing. Furthermore, relatively short and simple standardized trainings can elicit robust improvements in intra- and inter-rater reliability. Future research is needed to more conclusively determine which cough descriptors should be routinely included in an auditory-perceptual assessment by further characterizing the validity of perceptual cough descriptors and ultimately by developing a standardized, consensus-based approach to auditory-perceptual assessments of cough.
Acknowledgments
The authors would like to acknowledge and thank the clinicians who participated in this study as part of the clinical training.
Statement of Ethics
All procedures were performed in accordance with the ethical standards of the Institutional Research Committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Informed consent was not obtained from the participants of this study as it was a secondary analysis of data collected for clinical training purposes. This study was reviewed and approved by the Institutional Review Board at Teachers College, Columbia University, with approval number IRB #21-392.
Conflict of Interest Statement
The authors have no relevant financial disclosures or conflicts of interest.
Funding Sources
This work was supported by a Clinical Research Training Scholarship in Parkinson’s Disease awarded to Dr. James Curtis from the American Brain Foundation and the Parkinson’s Foundation in collaboration with the American Academy of Neurology (Grant No. #2360).
Author Contributions
All authors met the minimum criteria for authorship status, as proposed by the International Committee of Medical Journal Editors (ICMJE). Specific authorship contributions are outlined below using CRediT (https://credit.niso.org/).
Roles: 1: conceptualization, 2: data curation, 3: formal analysis, 4: funding acquisition, 5: investigation, 6: methodology, 7: project administration, 8: resources, 9: software, 10: supervision, 11: validation, 12: visualization, 13: writing – original draft, 14: writing – reviewing and editing.
Authors:
James A. Curtis: 1, 2, 3, 5, 6, 7, 9, 10, 11, 12, 13, 14.
James C. Borders: 3, 9, 11, 12, 14.
Avery E. Dakin: 3, 11, 14.
Michelle S. Troche: 6, 8, 10, 14.
Data Availability Statement
All data and R code from this study are openly available in the Open Science Framework repository at https://osf.io/kwjcy/. Further inquiries can be directed to the corresponding author.