Background: Genomic testing is increasingly employed in clinical, research, educational, and commercial contexts. Genomic literacy is a prerequisite for the effective application of genomic testing, creating a corresponding need for validated tools to assess genomics knowledge. We sought to develop a reliable measure of genomics knowledge that incorporates modern genomic technologies and is informative for individuals with diverse backgrounds, including those with clinical/life sciences training.

Methods: We developed the GKnowM Genomics Knowledge Scale to assess the knowledge needed to make an informed decision for genomic testing, appropriately apply genomic technologies, and participate in civic decision-making. We administered the 30-item draft measure to a calibration cohort (n = 1,234) and subsequent participants to create a combined validation cohort (n = 2,405). We performed a multistage psychometric calibration and validation using classical test theory and item response theory (IRT) and conducted a post-hoc simulation study to evaluate the suitability of a computerized adaptive testing (CAT) implementation.

Results: Based on exploratory factor analysis, we removed 4 of the 30 draft items. The resulting 26-item GKnowM measure has a single dominant factor. The scale internal consistency is α = 0.85, and the IRT 3-PL model demonstrated good overall and item fit. Validity is demonstrated by a significant correlation (r = 0.61) with an existing genomics knowledge measure and significantly higher scores for individuals with adequate health literacy and for healthcare providers (HCPs), including HCPs who work with genomic testing. The item bank is well suited to CAT, achieving high accuracy (r = 0.97 with the full measure) while administering a mean of 13.5 items.
Conclusion: GKnowM is an updated, broadly relevant, rigorously validated 26-item measure for assessing genomics knowledge that we anticipate will be useful for assessing population genomic literacy and evaluating the effectiveness of genomics educational interventions.
Millions of individuals have had or will obtain personal genomic testing in educational, clinical, research, or commercial settings. Achieving some degree of population genomic literacy (including provider genomic literacy) is a prerequisite to the effective application of that genomic testing [2-8]. The critical role of genomic literacy in genomic testing creates a corresponding need for validated tools to assess genomics knowledge.
We define genomic literacy as the knowledge needed to (1) make an informed decision for genomic testing, (2) appropriately apply genomic technologies and accurately interpret genomic data, and (3) participate in decisions about genetics and genomics policy questions as a member of society. This definition extends those described previously to include the application of modern genomic technologies, the interpretation of genomic data, and the nonclinical personal, familial, and societal implications of genomic testing. The expanded definition reflects the increasing opportunities for recipients of genomic testing to obtain and analyze their own genomic data [10, 11] and the concerns of all parties about the ethical, legal, and social implications (ELSI) of genomic testing.
A survey of existing genetic or genomics knowledge measures is summarized in online suppl. Table 1 (see www.karger.com/doi/10.1159/000515006 for all online suppl. material). As genomic technology evolves, new measures are needed to address the new capabilities, limitations, and implications of those technologies. For example, the Kaphingst et al. measure and the recently developed UNC Genomics Knowledge Scale (UNC-GKS) and Knowledge of Genome Sequencing (KOGS) measures were designed to assess genomics (vs. genetics) knowledge and to incorporate modern genomic technologies such as whole exome sequencing or whole genome sequencing (WES/WGS). Existing measures generally do not cover ELSI topics, such as the legal protections for genomic information, which are important for informed decision-making but are poorly understood, and/or exhibit ceiling effects in genomic study cohorts. Personal genomic studies, for example, are often enriched for participants with high educational attainment and life science/health professionals in particular [18, 19]. Evaluating the effectiveness of genomics educational interventions, especially those targeting current or future healthcare providers or life scientists, requires a measure that is informative for more knowledgeable examinees.
Here we present the GKnowM Genomics Knowledge Scale, an updated and broadly relevant measure for assessing population genomic literacy and evaluating the effectiveness of genomics educational interventions. We developed GKnowM to address the need for a genomics knowledge measure that incorporates modern genomic technologies such as WES and WGS that are now in widespread use, addresses knowledge relevant to ELSI concerns, and is informative for a broad range of examinees. We performed a rigorous psychometric validation of the proposed measure with members of the general public, students/trainees, and genetics/genomics professionals, including simulations to demonstrate the suitability of the item bank for efficient computerized adaptive testing (CAT). Our goal was to create a robust and useful tool for assessing an individual’s genomics knowledge across a wide range of experience, education, and expertise.
Definition of Content Domain
We developed the initial content domain based on a detailed literature review of competencies published by relevant professional societies (including the Accreditation Council for Genetic Counseling, American Board of Medical Genetics and Genomics, American College of Medical Genetics, American Nurses Association, and Association of Professors of Human and Medical Genetics), textbooks, content domains developed for existing genetics knowledge measures, and the investigators’ professional expertise. Nine published competencies or content domains [21-28] were independently reviewed and coded for genomics concepts by 2 investigators (M.D.L. and S.A.S.); any differences were then resolved through consultation to produce the draft content domain. We revised the draft content domain based on the feedback from genomics professionals, including clinicians, researchers, and educators.
This process produced a final content domain with 3 top-level concepts: (1) “General Genetics and Genomics Knowledge,” (2) “Applications of Genomics Technology,” and (3) “Genomics and Society,” each with multiple sub-concepts. Table 1 summarizes the content domain. For conciseness, in the remainder of the study, we will use the term “genomics” to encompass genetics and genomics.
We undertook a multistage item development process incorporating item identification, classification, adaptation and/or creation, and revision. We began the item development process by reviewing 235 items from existing genetics and genomics knowledge measures, classroom exercises, and other sources (published measures are noted in online suppl. Table 1). Two investigators (M.D.L. and S.A.S.) coded the existing items with the content domain. Approximately 80% of the existing items were coded as “General Genetics and Genomics Knowledge,” with fewer items coded “Applications of Genomics Technologies,” and “Genomics and Society.” As a result, we focused our novel item development effort in the latter 2 areas. In prior work, we observed ceiling effects with existing genomics knowledge measures. To ensure that items would be informative for a broad range of examinees, including those with prior genomics education, we also focused on drafting more difficult items that would complement existing measures. The proposed measure is intended to be used in a wide variety of settings, not just in the context of a specific genomic test, such as diagnostic WES. Thus, we aimed to develop an item bank that would be relevant to multiple genomic testing technologies and contexts.
We developed an initial item bank of 74 multiple-choice items (with 3 or 4 response options) consisting of novel items and items adapted from published measures [9, 31-33]. We iteratively refined the item bank with a focus on ensuring item relevance, removing ambiguity, and clarifying the item language through feedback from genomics professionals and others. We administered the draft items to the general public recruited via Amazon Mechanical Turk and flyers, undergraduate students, healthcare professionals, and genomics professionals. Based on preliminary testing, we selected 30 high-performing items with a range of difficulties and broad coverage of the content domain. The complete text of the 30 items and the key are included in the online suppl. material.
We developed an online survey comprising demographic questions, the 30 test items, the UNC Genomics Knowledge Scale (UNC-GKS), a 1-item health literacy measure, and a single-item self-assessment of genomics knowledge (“How would you rate your knowledge of genomics” with 5 answer options: not knowledgeable – extremely knowledgeable). We administered the survey via the Qualtrics platform to cohorts recruited via commercial panel providers (targeting a US census panel), flyers on an undergraduate campus and at an academic medical center, an undergraduate psychology department student pool, advertisements in the National Society of Genetic Counselors e-mail bulletin, and e-mail advertisements. We recruited participants in 2 phases, an initial calibration cohort and a subsequent validation cohort. The 30 test items were also administered as part of a different study to 20 students in an intensive undergraduate human genome analysis course before and after the course (the survey was an optional element of the course). This project was reviewed and approved by the Institutional Review Boards at the Icahn School of Medicine and Middlebury College.
Respondents who completed the survey faster than a minimum time threshold (“speeders”), spent longer than 1 h, “straightlined” gridded answers, answered 85% or more of the multiple-choice items identically, or answered educational attainment demographic questions inconsistently (e.g., reported being a practicing physician without the corresponding educational attainment) were excluded from further analysis. We separately determined the minimum completion times for the calibration phase (110 s to complete the GKnowM items) and the validation phase (330 s to complete the entire survey) based on the fastest entirely correct response to either the GKnowM or the UNC-GKS measures. We treated missing responses to GKnowM items as incorrect.
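The exclusion logic above can be illustrated with a short sketch. This is not the study's actual code; the time thresholds are taken from the text, while the function and variable names are our own, and the 85% identical-answer rule is applied here as a simple most-common-answer fraction:

```python
from collections import Counter

# Illustrative quality-filter sketch (assumptions noted in the lead-in):
# drop "speeders" below the minimum completion time, sessions over 1 h,
# and respondents answering >= 85% of multiple-choice items identically.
MIN_SECONDS = 110      # calibration-phase threshold from the text
MAX_SECONDS = 3600     # 1 h upper limit
MAX_IDENTICAL = 0.85

def exclude_respondent(seconds, answers):
    """Return True if a respondent should be dropped from analysis."""
    if seconds < MIN_SECONDS or seconds > MAX_SECONDS:
        return True
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / len(answers) >= MAX_IDENTICAL
```

In practice these filters would be combined with the straightlining and demographic-consistency checks described above.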
Psychometric analysis was performed with a combination of R 3.5 and Xcalibre 4.2. Dimensionality and item response theory (IRT) analyses were performed on the calibration cohort, modeled on the approaches employed by Langer et al. and Sanderson et al.
We performed the Kaiser-Meyer-Olkin (KMO) analysis to assess suitability for factor analysis. We considered items with measure of sampling adequacy (MSA) values less than the “meritorious” threshold of 0.8 for removal prior to factor analysis. We performed exploratory factor analysis (EFA) to assess the dimensionality of the items using tetrachoric correlations, a minimum residual approach, and oblimin rotation. We employed parallel analysis (PA) to identify the number of factors to be retained, generating multiple replicates (n = 100) of randomly simulated and randomly resampled data matrices of the same size as the respondent data. Factors from the real data whose actual eigenvalues exceeded the average simulated and resampled eigenvalues were retained. KMO, EFA, and PA were performed with the R psych package. We performed confirmatory factor analysis (CFA) with the R lavaan package using the diagonally weighted least squares estimator, robust standard errors, and mean- and variance-adjusted test statistics. We evaluated CFA model fit with the root mean square error of approximation (RMSEA) (acceptable <0.05), the standardized root mean square residual (SRMR) (acceptable <0.05), the Tucker-Lewis index (TLI) (acceptable >0.95), and the comparative fit index (CFI) (acceptable >0.95), as well as modification indices.
We performed classical test theory analyses and IRT calibration with Xcalibre. We calculated difficulty (p) and point-biserial correlation (rpb) for each item and evaluated internal consistency with Cronbach’s α. We fit a 3-parameter logistic (3-PL) IRT model with discrimination (a), difficulty (b), and guessing (c) parameters using a scaling constant D = 1.7. IRT scores (θ) were estimated with a weighted maximum likelihood approach to handle all correct or all incorrect responses without assuming a prior distribution; individuals with extreme IRT scores (<−4 or >4) were excluded. We evaluated overall model fit, item fit, local dependence, and reliability with the R mirt package. Overall model fit was evaluated with the M2 statistic and RMSEA, SRMR, TLI, and CFI as described above in the context of CFA. We evaluated item fit with the S-Χ2 goodness of fit indices. To control the false discovery rate in the presence of multiple comparisons, we performed Benjamini-Hochberg (BH) correction [45, 46] when determining significance. We evaluated local dependence with the Q3 statistic of residual correlations.
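The 3-PL model above can be written compactly in code. The following is an illustrative sketch, not the Xcalibre implementation; the parameter values used in the test are hypothetical, not calibrated GKnowM items:

```python
import math

# 3-PL item response function with discrimination a, difficulty b,
# guessing c, and scaling constant D = 1.7, as described in the text.
D = 1.7

def p_correct(theta, a, b, c):
    """Probability of a correct response under the 3-PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def item_information(theta, a, b, c):
    """Fisher information contributed by one 3-PL item at ability theta."""
    p = p_correct(theta, a, b, c)
    q = 1.0 - p
    return (D * a) ** 2 * ((p - c) / (1.0 - c)) ** 2 * (q / p)
```

At theta = b the model predicts p = c + (1 - c)/2, i.e., halfway between the guessing floor and 1; summing item_information over items gives the test information function discussed in the Results.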
We evaluated differential item functioning (DIF) across sex, ethnicity (white vs. non-white), and age (<45 vs. 45+ years) to identify items that performed differently across these subgroups while holding genomics knowledge constant. For each item, we performed logistic regression with 3 models: (1) IRT score as the sole predictor, (2) IRT score and the DIF variable as predictors, and (3) IRT score, the DIF variable, and an interaction term as predictors. To evaluate uniform DIF (effect is similar across the entire construct range), we performed a likelihood ratio test between models 1 and 2; to evaluate nonuniform DIF (effect differs across the construct range), we performed a likelihood ratio test between models 2 and 3. Similar to the item fit analyses, we employed the BH correction procedure when evaluating significance.
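The significance machinery for the DIF analysis, a likelihood ratio test between nested logistic models followed by BH correction across items, can be sketched as follows. This is a simplified illustration (df = 1, made-up log-likelihood values), not the study's analysis code:

```python
import math

def lr_test_p(loglik_restricted, loglik_full):
    """p value of a likelihood ratio test with 1 degree of freedom.

    Uses the chi-square(1) survival function, which equals
    erfc(sqrt(stat / 2)) for a statistic of 2 * (ll_full - ll_restricted).
    """
    stat = 2.0 * (loglik_full - loglik_restricted)
    return math.erfc(math.sqrt(stat / 2.0))

def bh_reject(pvalues, alpha=0.05):
    """Benjamini-Hochberg: which hypotheses are rejected at FDR level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank i with p_(i) <= (i / m) * alpha
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * alpha:
            k = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k:
            reject[i] = True
    return reject
```

For uniform DIF the restricted model is model 1 and the full model is model 2; for nonuniform DIF, models 2 and 3.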
Statistical analyses of the GKnowM measure were performed with R 3.5 on the combined validation cohort. We evaluated concurrent validity with the UNC-GKS T score, a previously validated genomics knowledge measure, and convergent validity with the Chew et al. single question measure of health literacy. Respondents missing >20% of the UNC-GKS items were excluded; for the remaining respondents, missing items were scored as incorrect. We computed the Pearson correlation between the GKnowM IRT score and UNC-GKS T score and performed an independent samples unequal variances t-test to compare GKnowM IRT scores between individuals with adequate and inadequate health literacy. We predicted that the GKnowM IRT score should be positively correlated with the UNC-GKS and that individuals with adequate health literacy will have increased genomics knowledge due to their increased ability to understand health information.
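The unequal-variances (Welch) t-test used for these group comparisons can be sketched as follows; this is a generic illustration with made-up samples, not the study's analysis code:

```python
import math

def welch_t(x, y):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)  # sample variances
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    se2 = vx / nx + vy / ny
    t = (mx - my) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df
```

The fractional degrees of freedom produced by the Welch-Satterthwaite approximation are why the Results report values such as t(1,439.7).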
We similarly performed an independent samples unequal variances t-test to compare GKnowM IRT scores between individuals who did and did not self-report working as or studying to be a healthcare practitioner, clinical researcher, or life scientist (abbreviated HCP) and between HCPs who did and did not report working with genomic testing/data. Since genomics is directly relevant to an HCP’s professional activities, we predicted that those individuals should have increased genomics knowledge. We performed a paired t-test to evaluate the GKnowM IRT scores of students before and after they completed an intensive 1-month undergraduate course in human genome analysis (based on a similar graduate course). The course material is aligned with the GKnowM construct, and so we predicted that students would have significantly higher IRT scores after completing the course. We evaluated the correlation between the GKnowM IRT score and self-assessed knowledge with Kendall’s tau.
To evaluate the utility and specification of a CAT implementation of the proposed GKnowM measure, we performed a post-hoc simulation study of the item bank with the R catR package. Using the existing responses of calibration cohort examinees to the fixed-form test, we “readministered” the items in simulation to identify the minimum subset of items for each examinee that resulted in IRT score estimates that were highly correlated with the IRT score from the fixed-form test.
Since the GKnowM measure is intended for measuring population genomic literacy (and not high-stakes testing or other applications), we employed a static initial θ and selected items by maximum information. The goal was to measure genomics knowledge with the minimum number of items, so we used a target standard error of measurement (SEM) as the termination criterion, with a maximum number of items administered even if the target SEM was not reached. We simulated target SEM values of 0.30, 0.35, and 0.40, with and without a 20-item maximum limit for the calibration cohort. Given the difficulty of the items, this maximum is most likely to apply to examinees with lower knowledge levels. For this group, it is sufficient to estimate that the IRT score is well below 0. We evaluated the CAT simulation with the correlation between the full item bank score and the CAT score estimate, the number of items administered, and the percentage of respondents reaching the maximum number of items.
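The post-hoc CAT procedure above, maximum-information item selection with an SEM-based termination criterion and a maximum test length, can be sketched as follows. This is a simplified illustration rather than the catR configuration used in the study: it uses grid-search maximum likelihood scoring, and the item parameters and response pattern are hypothetical:

```python
import math

D = 1.7
GRID = [g / 10.0 for g in range(-40, 41)]  # theta grid from -4 to 4

def p3(theta, a, b, c):
    """3-PL probability of a correct response."""
    return c + (1 - c) / (1 + math.exp(-D * a * (theta - b)))

def info(theta, a, b, c):
    """Fisher information of one 3-PL item at ability theta."""
    p = p3(theta, a, b, c)
    return (D * a) ** 2 * ((p - c) / (1 - c)) ** 2 * ((1 - p) / p)

def ml_theta(administered, items, responses):
    """Grid-search maximum likelihood ability estimate."""
    def loglik(theta):
        ll = 0.0
        for i in administered:
            p = p3(theta, *items[i])
            ll += math.log(p) if responses[i] else math.log(1 - p)
        return ll
    return max(GRID, key=loglik)

def simulate_cat(items, responses, target_sem=0.40, max_items=20):
    """Readminister recorded responses adaptively; return (theta, n_items)."""
    administered, theta = [], 0.0  # static initial theta, per the text
    while len(administered) < min(max_items, len(items)):
        remaining = [i for i in range(len(items)) if i not in administered]
        # select the unadministered item with maximum information
        administered.append(max(remaining, key=lambda i: info(theta, *items[i])))
        theta = ml_theta(administered, items, responses)
        sem = 1 / math.sqrt(sum(info(theta, *items[i]) for i in administered))
        if sem <= target_sem:
            break
    return theta, len(administered)
```

For an all-correct response pattern the ability estimate is driven to the top of the grid and the target SEM is never reached, so the simulation stops at the item maximum, mirroring how the maximum mostly binds for examinees whose scores the bank cannot pin down precisely.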
Of the 1,499 respondents who completed the survey during the calibration phase, 265 respondents were excluded by the quality filters. A combined total of 3,287 respondents completed the survey in the calibration and validation phases, of which 882 were excluded by the quality filters. An unknown number of respondents were excluded by the commercial cohort providers for failing to complete the survey or failing the vendor’s quality filters. Table 2 lists the participant characteristics. The median time to complete the 30 draft items was 10.8 min in the calibration cohort and 10.4 min in the validation cohort after applying quality filters.
Psychometric analysis and calibration were performed on the n = 1,234 respondents in the calibration cohort. The overall KMO MSA was 0.92. Two items, Q22 and Q28, had KMO MSA values below the threshold of 0.80 and were removed; Q29 had a borderline MSA value (0.756) and so was retained for further analysis. The KMO results for the remaining items indicate that the data were suitable for factor analysis. In preliminary EFA with a single factor, Q26 had the smallest loading of any of the items retained after KMO analysis. Performing EFA on the subset of respondents who self-identified as HCPs, Q26 had a loading of 0.28, while all other items had loadings >0.40. Based on the factor analysis and feedback from respondents that item Q26 was confusing, we removed that item from the final measure.
PA of the remaining items indicated that 11 factors had eigenvalues greater than the eigenvalues of the random data. Online suppl. Figure 1 shows the scree plot. However, the eigenvalue of the first factor is 7.8-fold greater than that of the second factor, indicating the presence of a single dominant factor. The actual eigenvalues for factors 3–11 are very similar to the eigenvalues for the random data; thus, we focused on a two-factor analysis. A single-factor solution accounted for 30% of the variance, while the two-factor solution accounted for 34%. The second factor appeared to be associated with a single item, Q29, which had a much larger loading (>0.7) than all other items. Based on the borderline KMO MSA, factoring, and fit issues in preliminary IRT analysis, we removed that item and proceeded with a single factor approach.
A CFA model with a single factor showed good fit (χ2 = 523.0, p < 0.001; RMSEA = 0.025, SRMR = 0.048, CFI = 0.976, TLI = 0.974). The RMSEA and SRMR met the acceptable thresholds for good fit (<0.05) as did CFI and TLI (>0.95). Three pairs of items had modification indices >10 suggesting local dependence (between items Q10 and Q13, Q2 and Q13, and Q17 and Q27), but all 3 were close to the threshold (<13.3). Fitting a CFA model in which the errors for these item pairs could correlate produced a good fit (χ2 = 480.3, p < 0.001; RMSEA = 0.022, SRMR = 0.046, CFI = 0.980, TLI = 0.978) that was statistically significantly better than the initial model (χ2 = 45.7, p < 0.001). However, the residual correlations were <0.16, and the items had different content areas, so we retained all items.
Table 3 lists the classical test theory difficulty (p) and discrimination (rpb) results, the IRT 3-PL model parameters, and S-Χ2 statistic for the 26-item GKnowM measure. The overall IRT 3-PL model fit met the acceptable thresholds for good fit (M2 = 414.4, p = 0.011; RMSEA = 0.012, SRMR = 0.029, TLI = 0.996, CFI = 0.996). We observed significant item misfit at the α = 0.05 level with BH correction for multiple items. Visual inspection of those items (Q1, Q14, Q16–Q18, Q21, Q25) indicated adequate fit, suggesting that the statistical misfit was due to sample size. Based on content relevance, we retained all items. All residual correlations were below 0.2, indicating local independence.
Online suppl. Tables 2–4 list the regression coefficients and p values for DIF analysis of all items. We did not observe uniform DIF for gender and ethnicity that was significant at the α = 0.05 level with BH correction but did observe significant nonuniform DIF for multiple items for gender and ethnicity. These results indicate that more knowledgeable white or female respondents generally performed better on these items than other individuals at the same knowledge level. However, the nonuniform DIF results may reflect demographic biases among individuals with health care, life sciences, or social sciences training. As shown in online suppl. Figure 2, many of the highest scoring respondents are, or are studying to be, genetic counselors. Genetic counselors are overwhelmingly white and female , and the gender and ethnicity of the high-scoring respondents in this sample reflect those demographics (with fewer nonwhite or male individuals with higher IRT scores). When we excluded genetic counselors, we did not observe significant nonuniform DIF for gender or ethnicity. We observed uniform DIF in both directions for age for multiple items that were significant at the α = 0.05 level with BH correction. However, this analysis may similarly reflect the demographics of genetic counselors who are more likely to be younger than 45 years, younger undergraduate students who may be more likely to have learned about newer genomic technologies in school, and/or older individuals who may be more likely to have encountered genomic testing professionally or personally.
We performed the same DIF analysis with the combined validation cohort. We observed 1 item, Q1, to have significant uniform DIF for gender with BH correction in the larger cohort (p < 0.001). A review of the item content did not indicate a potential explanation for gender bias, and so the item was retained.
Figure 1a shows the test information function (TIF), and Figure 1b shows the conditional standard error of measurement function. The maximum TIF was 14.1, and the minimum conditional standard error of measurement was 0.267 at an IRT score of 0.78. Reliability was >0.70 for IRT scores of −1.1 to 2.3. Figure 1c shows the full test IRT score distributions for participants in the calibration and validation cohorts. The peak at higher knowledge levels reflects the HCPs, and in particular, the genomic professionals in the sample. The internal consistency of the 26-item measure was α = 0.850. To facilitate applications with summed scoring, Table 4 provides a conversion between the summed score and the IRT scaled T-score.
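The reported conditional SEM and reliability follow directly from the test information function: SEM(θ) = 1/√TIF(θ), and on a unit-variance θ scale, reliability(θ) = 1 − SEM(θ)². A minimal sketch of these relationships:

```python
import math

def conditional_sem(tif):
    """Conditional standard error of measurement from test information."""
    return 1.0 / math.sqrt(tif)

def irt_reliability(tif):
    """IRT reliability on a unit-variance theta scale: 1 - SEM^2 = 1 - 1/TIF."""
    return 1.0 - conditional_sem(tif) ** 2
```

With the reported maximum TIF of 14.1, conditional_sem(14.1) ≈ 0.266, matching the reported minimum conditional SEM of 0.267; reliability exceeds 0.70 wherever the TIF exceeds 1/0.3 ≈ 3.33.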
The GKnowM IRT score and the previously validated UNC-GKS T score, both of which measure genomics knowledge, were significantly correlated (r = 0.61 [0.58–0.63], p < 0.001). Online suppl. Figure 3 shows respondents’ UNC-GKS T scores versus their GKnowM IRT scores. We observed a ceiling effect in which a range of higher-scoring participants answered all UNC-GKS items correctly.
Consistent with our predictions, respondents reporting adequate health literacy had higher GKnowM IRT scores (M = 0.10) than did those reporting inadequate health literacy (M = −0.19), t(1,439.7) = 8.02, p < 0.001. Similarly, self-reported HCPs had higher GKnowM IRT scores (M = 0.53) than non-HCPs (M = −0.08), t(392.6) = 8.28, p < 0.001. Among self-reported HCPs, those individuals who reported ordering, interpreting, or implementing genetic/genomic tests or using genomic technologies in their research or employment had higher GKnowM IRT scores (M = 0.87) than those who did not (M = 0.05), t(331.7) = 6.38, p < 0.001.
Of the 20 students enrolled in an intensive undergraduate human genome analysis course, 14 completed the draft items before and after the course. There was a significant increase in students’ GKnowM IRT scores from before (M = 0.93) to after (M = 1.27) the course, t(13) = 5.28, p < 0.001. To facilitate cohort comparisons when using summed scores, online suppl. Table 5 provides mean scores for the above groups using T-scores computed with the mapping in Table 4.
Figure 2 shows the percentage of individuals at each knowledge level endorsing different answers to “How would you rate your knowledge of genomics.” Individuals with higher IRT scores accurately rated themselves as “very” or “extremely” knowledgeable, while the self-assessments of individuals with lower IRT scores were less or even negatively correlated with their objectively assessed knowledge. We observed a correlation of τ = 0.04 (p = 0.016) between IRT score and self-assessed knowledge for all individuals, a significant positive correlation (τ = 0.29, p < 0.001) for individuals with an IRT score ≥0, and a significant negative correlation (τ = −0.16, p < 0.001) for individuals with an IRT score <0. The sharp increase in self-assessed knowledge at IRT scores >1.5 is consistent with the knowledge level above which the majority of participants are genomics professionals.
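The rank correlations above can be illustrated with a minimal Kendall tau computation. For simplicity this sketch computes tau-a, which omits the tie correction of the tau-b variant typically used with tied ordinal ratings; the test data are made up:

```python
def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx = (x[i] > x[j]) - (x[i] < x[j])  # sign of x[i] - x[j]
            dy = (y[i] > y[j]) - (y[i] < y[j])  # sign of y[i] - y[j]
            s += dx * dy  # +1 concordant, -1 discordant, 0 tied
    return s / (n * (n - 1) / 2)
```

A tau near 0 over the full sample, with opposite-signed tau in the upper and lower score ranges, is exactly the pattern reported above.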
Table 5 lists the CAT post-hoc simulation results for the calibration cohort for different target SEM values, with and without a maximum number of items. A CAT implementation with a target SEM = 0.40 and a 20-item maximum is accurate (r = 0.965 compared to the full item bank) while administering approximately half the number of items on average. Incorporating a maximum number of items did not impact accuracy but reduced the test length by >2 items. Given the overall difficulty of the item bank, the maximum number of items is most likely to impact examinees at lower knowledge levels. For such examinees, there is minimal value in administering additional items, as doing so does not improve score accuracy.
The NHGRI’s 2011 vision for the future of genomic medicine specifically cites the need for both providers and patients/consumers to achieve genomic literacy. Doing so will require effective tools for assessing population genomic literacy and rigorously evaluating the effectiveness of genomics educational interventions. Existing genomics knowledge measures can be too narrowly targeted in both content and intended examinee knowledge level. Here we present the GKnowM Genomics Knowledge Scale, a rigorously validated 26-item measure of genomic literacy. GKnowM is designed to meet the need for a genomics knowledge measure that incorporates modern genomic technologies (such as WES/WGS), is informative across a wide range of examinees (including current or future genomics professionals), and addresses the breadth of knowledge relevant to informed decision-making as both a patient and a member of society.
We anticipate the GKnowM scale will be useful in a variety of research, clinical, and educational contexts. Translational genomic studies use genomic literacy measures for multiple purposes, including evaluating patient-focused educational interventions, such as those used in genetic counseling protocols, and as a cross-sectional measure to be tested against participant outcomes. These applications are critical and timely. The rapid advance of genomic technologies has created a corresponding need to develop new guidelines and best practices for patients/participants and providers alike. Those best practices will be informed by the data collected during the coming years on the impact of genomic literacy on the application of genomic testing and how best to promote the development of genomic literacy. Similarly, genomics knowledge measures are needed to rigorously evaluate the educational interventions being developed for all stages of the educational pipeline to close the gap between the demand for and supply of genomics professionals [54-58].
These are overlapping needs. There is not always a clear distinction between the knowledge required of patients/participants versus that required of providers. For example, many customers of consumer-facing genomic tests and research study participants can obtain their genomic data for further self-directed analysis and interpretation [10, 11]. Thus, in GKnowM, we sought to define a content domain that reflects the different and overlapping roles for an individual, that is, patient/participant, provider, and citizen, and an item bank that could effectively measure genomic literacy across a broad range of knowledge levels. Deploying a common measure that can be used across different settings and participant/student populations will reduce duplicated effort and facilitate comparison and meta-analysis of population genomic literacy and the educational efforts to enhance that literacy.
We performed a rigorous psychometric evaluation of GKnowM with a large, educationally and ethnically diverse cohort drawn from the general public, students, and genomics professionals. Model fit, item fit, and dimensionality analyses indicate that GKnowM successfully measures an essentially unidimensional construct. GKnowM items performed similarly across male/female and white/nonwhite examinees (uniform DIF), although we did observe significant nonuniform DIF for several items. The latter reflects the limited diversity in this sample at higher knowledge levels. Many of the higher-scoring participants are genetic counselors, a profession that is overwhelmingly white and female. Excluding genetic counselors eliminated significant nonuniform DIF between male/female and white/nonwhite examinees. Validity analyses showed that GKnowM is significantly positively correlated with a previously validated genomics knowledge measure and a related measure of health literacy, and that groups, such as healthcare providers, whom we would expect to have increased levels of genomics knowledge, do indeed have higher scores. Thus, the GKnowM score successfully measures an individual’s genomics knowledge.
GKnowM is informative for a wide range of examinees (from approximately 1 standard deviation below the mean to 2 standard deviations above the mean) and is most informative for examinees with above-average genomics knowledge. As such, it may be most useful with cohorts where participants have a range of educational backgrounds, including individuals with higher educational attainment or health or life science training [59, 60], and in evaluating educational interventions where individuals are actively learning about genomics and genome analysis. For example, we showed an application of GKnowM evaluating the knowledge gained in an undergraduate human genome analysis course (in which baseline knowledge scores were already above average). GKnowM complements existing measures, such as UNC-GKS and KOGS, which are most informative for less knowledgeable individuals (i.e., peak TIF is at IRT scores below zero). GKnowM is less informative for differentiating individuals with none versus minimal knowledge but can differentiate individuals who might otherwise “max out” other measures.
A potential limitation of the GKnowM scale is its length (26 items); the median time spent completing the original 30 online items was 10.4 min (after removing the fastest respondents). Our simulations showed that a CAT implementation of the measure could achieve high accuracy while administering only 13.5 items on average. A CAT approach would retain the benefits of the larger item bank while reducing examinee burden to a level similar to that of other measures (KOGS has 9 true-false items, UNC-GKS has 19 or 25 true-false items, and Kaphingst et al. have 11 items). A CAT implementation has been deployed as part of a Qualtrics survey using its survey customization tools (contact the authors for the customization code) and could be deployed on other online testing platforms with CAT support. Developing more time-efficient versions of GKnowM is a focus of future work. Another potential limitation is the inclusion of US-centric items; for example, one GKnowM item references the Genetic Information Nondiscrimination Act, a U.S. federal law. Users outside the USA could drop those items (noted in the online suppl. materials). As genomics technology and policy evolve, some items may no longer be valid and could be dropped.
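To illustrate the mechanics behind such a CAT simulation, the following self-contained sketch administers a fixed number of items from a 3PL bank, selecting at each step the unused item with maximum Fisher information at the current ability estimate. The item parameters are invented for demonstration (they are not the GKnowM bank), the stopping rule is fixed-length for simplicity rather than the precision-based rule a production CAT would likely use, and scoring uses a simple grid-based EAP estimator.

```python
import math
import random

# Illustrative 3PL item bank: (discrimination a, difficulty b, guessing c).
# These parameters are invented for demonstration, not the GKnowM item bank.
random.seed(0)
BANK = [(random.uniform(0.8, 2.0), random.uniform(-2.0, 2.0), 0.2)
        for _ in range(26)]

def p_correct(theta, a, b, c):
    """3PL probability of a correct response at ability theta."""
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

def item_information(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta."""
    p = p_correct(theta, a, b, c)
    return a ** 2 * ((1 - p) / p) * ((p - c) / (1 - c)) ** 2

GRID = [i / 10 for i in range(-40, 41)]  # ability grid for EAP scoring

def eap_estimate(responses):
    """Expected a posteriori ability from (item, score) pairs; N(0,1) prior."""
    num = den = 0.0
    for theta in GRID:
        like = math.exp(-theta ** 2 / 2)  # unnormalized standard-normal prior
        for (a, b, c), u in responses:
            p = p_correct(theta, a, b, c)
            like *= p if u else (1 - p)
        num += theta * like
        den += like
    return num / den

def run_cat(true_theta, max_items=13):
    """Adaptively administer items, always picking the most informative one."""
    unused, responses, theta_hat = list(BANK), [], 0.0
    for _ in range(max_items):
        item = max(unused, key=lambda it: item_information(theta_hat, *it))
        unused.remove(item)
        u = random.random() < p_correct(true_theta, *item)  # simulated answer
        responses.append((item, u))
        theta_hat = eap_estimate(responses)
    return theta_hat

estimate = run_cat(true_theta=1.0)
```

Repeating this over many simulated examinees and correlating the CAT estimates with full-measure scores is the essence of the post-hoc evaluation described above.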
The observed relationship between self-assessed and objectively measured knowledge is consistent with previous observations that less knowledgeable individuals are less aware of that lack of knowledge (what Dunning et al. describe as the "double burden" of individuals' incomplete/misguided knowledge of both the domain itself and when they are mistaken). This item was presented after the knowledge items in the survey (but before respondents could learn their score) and so would reflect their perceived performance on those items. These results may be confounded, however, by low-quality responses.
Further study is needed to evaluate the dimensionality of item subsets, multidimensional models of genomics knowledge, the stability of GKnowM scores over time, group differences at higher knowledge levels, finer-grained group differences (e.g., among different racial/ethnic groups), and whether GKnowM scores are associated with differences in attitudes toward genomic testing, decision-making for genomic testing, and/or post-test psychosocial outcomes. Compared to many existing measures, the GKnowM content domain incorporates more topics focused on ELSI concerns for genomic testing. Little is known about individuals' understanding of the scientific, legal, and other principles that underlie those concerns. Future studies of the association between genomics knowledge and ELSI concerns could inform future educational materials and counseling best practices.
We present GKnowM, a rigorously validated 26-item measure for assessing genomics knowledge that is amenable to an efficient CAT implementation. The GKnowM content domain incorporates the application of modern genomic technologies, the interpretation of genomic data, and the nonclinical implications of genomic testing. This expanded definition of genomic literacy captures the knowledge needed to (1) make an informed decision for genomic testing, (2) appropriately apply genomic technologies and accurately interpret genomic data, and (3) participate in decisions about genetics and genomics policy questions as a member of society. GKnowM is an up-to-date and broadly relevant measure for assessing population genomic literacy and evaluating the pedagogical effectiveness of genomics educational interventions.
Statement of Ethics
This project was conducted in accordance with the World Medical Association Declaration of Helsinki. Electronic written informed consent was obtained from all participants. The project was determined to be exempt by the Institutional Review Boards at the Icahn School of Medicine at Mount Sinai (#15-0355-00002-01) and Middlebury College (#16086).
Conflict of Interest Statement
N.T. is an employee of, and N.T. and D.J.W. hold an ownership interest in, Assessment Systems Corporation. R.C.G. has received compensation for advising the following companies: A.I.A., Genomic Life, Grail, Humanity, Kneed Media, OptumLabs, Verily, and VibrentHealth. R.C.G. is also a co-founder of Genome Medical, Inc., a technology and services company providing genetics expertise to patients, providers, employers, and care systems.
Funding Sources
Research reported in this publication was supported by NHGRI of the National Institutes of Health under award No. R03HG008809, as well as NIH grants HD077671, HL143295, HG009922, TR003201, and HG008685, and the Franca Sozzani Fund for Preventive Genomics. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Author Contributions
M.D.L. conceptualized the project and wrote the manuscript. All authors contributed to developing and refining the item bank. M.D.L. and S.A.S. performed interviews. M.D.L., S.A.S., and J.S.R. recruited survey participants. M.D.L., N.T., and D.J.W. performed the psychometric analysis. All authors have read and approved the manuscript.