Objective: To evaluate a new approach for creating a composite measure of cognitive function, we calibrated a measure of general cognitive performance from existing neuropsychological batteries. Methods: We applied our approach in an epidemiological study and scaled the composite to a nationally representative sample of older adults. Criterion validity was evaluated against reference standard clinical diagnoses. Convergent validity was evaluated against the Mini-Mental State Examination (MMSE). Results: The general cognitive performance factor was scaled to have a mean of 50 and standard deviation of 10 in a nationally representative sample of older adults. A cutoff point of approximately 45, corresponding to an MMSE of 23/24, optimally discriminated participants with and without dementia (sensitivity = 0.94, specificity = 0.90, area under the curve = 0.97). The general cognitive performance factor was internally consistent (Cronbach's α = 0.91) and provided reliable measurement across a wide range of cognitive functioning. It demonstrated minimal floor and ceiling effects, an improvement over most individual cognitive tests. Conclusions: The cognitive composite is a highly reliable measure with minimal floor and ceiling effects. We calibrated it to a nationally representative sample of adults over the age of 70 in the USA and established diagnostically relevant cutoff points. Our methods can be used to harmonize neuropsychological test results across diverse settings and studies.

Despite the central role of cognitive function in the daily lives of older adults, there is no widely used, standardized method for assessing overall or global cognitive function across a wide range of functioning. Over 500 neuropsychological tests exist for clinical and research purposes [1]. This tremendous diversity complicates comparison and synthesis of findings about cognitive functioning across a broad range of performance in multiple samples in which different neuropsychological batteries were administered. Although test batteries are used to examine domain-specific cognitive function, summary measures provide global indices of function [2]. Brief global cognitive tests such as the Mini-Mental State Examination (MMSE) are limited by prominent ceiling effects and skewed score distributions [3,4,5,6], and thus evaluate only a limited range of cognitive function. These measurement properties substantially hamper the capacity to measure longitudinal change, since score ranges are limited and a one-point change has different meanings across the range of values [3,7]. Summary measures of general cognition, if properly calibrated, may be more sensitive to impairments across a broader range of cognitive function and more sensitive to changes over time [8].

Approaches to creating summary cognitive measures have been limited and controversial. One approach involves standardizing scores of component cognitive tests, which are then averaged into a composite [9,10]. Although widely used, this approach is limited because standardizing variables does not address skewed response distributions, does not allow differential weighting of tests, and ultimately does not facilitate comparisons of findings across studies. An alternative approach uses latent variable methods to summarize tests. The tests are weighted and combined into a composite measure that has more favorable measurement properties, including minimal floor and ceiling effects, measurement precision over a wide range of function, and interval-level properties that make the composite optimal for studying longitudinal change [8,11].
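
For concreteness, the standardize-and-average approach critiqued above can be sketched in a few lines; this is a minimal illustration only, with a hypothetical persons-by-tests score matrix as input:

```python
import numpy as np

def zscore_composite(tests):
    """Standardize each test within-sample, then average into a composite.
    Every test receives equal weight, and skewed score distributions
    remain skewed after standardization: the two limitations noted above."""
    tests = np.asarray(tests, dtype=float)
    z = (tests - tests.mean(axis=0)) / tests.std(axis=0, ddof=1)
    return z.mean(axis=1)  # one composite score per participant
```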

In a previous study, Jones et al. [12] used confirmatory factor analysis to develop a general cognitive performance factor from a neuropsychological battery [13]. This measure was shown to be unidimensional and internally consistent (Cronbach's α = 0.82). The factor was defined by high loadings on 6 of the 10 component tests. The cognitive factor was designed to be sensitive to a wide range of cognitive function from high functioning to scores indicative of dementia. Scores were normally distributed and reliable (reliability index >0.90). With these properties, the general cognitive performance factor provides a robust approach to assess cognitive function over time.

Despite these clear advantages, an important limitation of the general cognitive performance factor is that its scores are not yet clinically interpretable or generalizable across studies. To address this limitation, the aims of the present study were as follows: (1) to calibrate a general cognitive performance factor to a nationally representative sample of adults over the age of 70 in the USA, (2) to validate the general cognitive performance factor against reference standard clinical diagnoses, (3) to examine convergent validity of the cognitive performance factor score and (4) to identify clinically meaningful cutoff points for the cognitive factor score. Our overall goal was to create a clinically meaningful measurement tool and, importantly, to demonstrate an approach that generalizes to other neuropsychological test batteries on a broader scale.

Study Samples

Participants were drawn from the Successful Aging after Elective Surgery (SAGES) Study and the Aging, Demographics and Memory Study (ADAMS), a substudy of the Health and Retirement Study. SAGES is a prospective cohort study of long-term cognitive and functional outcomes of hospitalization in elective surgery patients. After recruitment, a neuropsychological battery was administered to participants just before surgery and periodically thereafter for up to 3 years. Because data collection was ongoing at the time of this study, we used preoperative data for the first 300 patients enrolled. Eligible participants were at least 70 years of age, English speaking, and scheduled for elective surgery at one of two academic teaching hospitals in Boston, Mass., USA. Exclusion criteria included evidence of dementia or delirium at baseline. Study procedures were approved by the Institutional Review Board at the Beth Israel Deaconess Medical Center.

ADAMS is a nationally representative sample of 856 older adults in the USA interviewed in 2002-2004 [14]. Its parent study, the Health and Retirement Study, is a longitudinal survey of over 20,000 community-living retired persons. ADAMS, which began as a population-based study of dementia, initially identified a stratified random sample of 1,770 participants; 687 refused and 227 died before they were interviewed, yielding 856 participants. Participants with probable dementia and minorities were oversampled. We used survey weights to account for the complex survey design and to make estimates representative of adults over the age of 70 in the USA [15]. ADAMS was approved by Institutional Review Boards at the University of Michigan and Duke University.

Measures

Neuropsychological Test Batteries

In SAGES, a neuropsychological test battery was administered during an in-person evaluation that consisted of 11 tests of memory, attention, language and executive function. We used 10 tests from the ADAMS battery, of which 7 were in common with SAGES (table 1). As explained in more detail in the Statistical Analyses section, our modeling approach allows cognitive ability to be estimated based on responses to any subset of cognitive tests [16].

Table 1

Neuropsychological test batteries in the SAGES and ADAMS samples


Clinical Diagnoses

Clinical diagnoses, grouped as normal cognitive function, cognitive impairment-no dementia (CIND) and all-cause dementia, were assigned in ADAMS by an expert clinical consensus panel [14,17]. Diagnoses were determined after a review of data collected during in-home participant assessments, which included neuropsychological and functional assessments from participants and proxies. A diagnosis of dementia was based on the Diagnostic and Statistical Manual of Mental Disorders III-R and IV [18,19] and for the present study included probable and possible AD, probable and possible vascular dementia, dementia associated with other conditions (Parkinson's disease, normal pressure hydrocephalus, frontal lobe dementia, severe head trauma, alcoholic dementia, Lewy body dementia) and dementia of undetermined etiology. CIND was defined as functional impairment that did not meet criteria for all-cause dementia or as below average performance on any test [17]. CIND included participants with mild cognitive impairment or cognitive impairment due to other causes (e.g. vascular disease, depression, psychiatric disorder, mental retardation, alcohol abuse, stroke, other neurological condition).

MMSE

The MMSE is a brief 30-point cognitive screening instrument used to assess global mental status. It is widely used in clinical and epidemiological settings. Its validity as a screening test for all-cause dementia in clinical populations has been previously established [20,21]. George et al. [22] recommended cutoff points of 23/24 for moderate cognitive impairment and 17/18 for severe cognitive impairment. These cutoff points have been widely applied in clinical and research settings [21,22,23,24,25]. An MMSE cutoff of 9/10 has also been used to indicate severe impairment [26]. Although the MMSE is not a preferred test for identification of mild cognitive impairment [21], cutoff points of 26/27 [23,27,28] and 28/29 [29] have been used for that purpose. Although the MMSE has poor measurement properties and ceiling effects, we used it as a standard for comparison in this study because it remains widely used and its scores and cutoffs are well recognized.

Statistical Analyses

We used descriptive statistics to characterize the SAGES and ADAMS samples. Analyses were subsequently conducted in four steps: (1) scoring the general cognitive performance factor in SAGES and calibrating it to ADAMS using item response theory (IRT) methods, (2) assessing criterion validity of the general cognitive performance factor using reference standard clinical ratings in ADAMS, (3) assessing convergent validity of the general cognitive performance factor with the MMSE and (4) identifying cutoff points on the general cognitive performance factor corresponding to published MMSE cutoff points.

Scoring the General Cognitive Performance Factor in SAGES and Calibrating to ADAMS

We calculated internal consistency of the cognitive tests in SAGES using Cronbach's α [13]. The Cronbach's α statistic has a theoretical range between 0 and 1, with 1 indicating high internal consistency. A generally accepted reliability for analysis of between-person group differences is ≥0.80, and for within-person change it is ≥0.90 [13]. Next, we calculated the general cognitive performance factor in SAGES from a categorical variable factor analysis of cognitive tests. The general cognitive performance factor score was scaled to have a mean of 50 and standard deviation (SD) of 10 in the US population over the age of 70. The factor analysis is consistent with the IRT graded response model and facilitated precise estimation of reliability across the range of performance [30,31,32,33]. In IRT, reliability is conceptualized as the complement of the squared standard error of measurement (SEM) [34]. The SEM is estimated from the test information function, which varies over the range of ability. We described the precision of the measure over the range of general cognitive performance using the SEM calculated from the IRT measurement model [35,36].
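
As a minimal sketch of these two reliability quantities: Cronbach's α from a persons-by-tests score matrix, and conditional IRT reliability assuming a latent trait scaled to unit variance, so that SEM(θ) = 1/√I(θ):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a persons-by-tests matrix of scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def conditional_reliability(information):
    """IRT reliability as the complement of the squared SEM, where
    SEM(theta) = 1 / sqrt(I(theta)) from the test information function;
    reliability therefore varies over the ability range."""
    sem = 1.0 / np.sqrt(np.asarray(information, dtype=float))
    return 1.0 - sem**2
```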

To scale the general cognitive performance factor in SAGES to the nationally representative ADAMS sample, we took advantage of tests in common between the studies to calibrate scores in SAGES to ADAMS [3]. We categorized cognitive tests into up to 10 discrete equal-width categories [37] to avoid model convergence problems caused by outlying values [38] and to place all tests on a similar scale [12] (see online suppl. table 2; for all online suppl. material, see www.karger.com/doi/10.1159/000357647). Tests in common between the studies were categorized based on the sample distribution in ADAMS. We assigned model parameters for anchor tests in the SAGES factor analysis based on corresponding estimates from the ADAMS-only model that used population-based survey weighting. This procedure allowed us to scale the general cognitive performance factor to reflect the general population of US adults aged 70 and older. Importantly, because the IRT model handles missing data under the assumption that cognitive performance data are missing at random conditional on variables in the model, general cognitive performance can be calculated for each participant based on responses to any subset of cognitive tests, as long as not all test scores are missing. The factor score is the expected a posteriori (EAP) estimate, i.e. the mean of the posterior probability distribution of the latent trait. The posterior probability distribution refers to the conditional density of the latent cognitive performance trait given observed cognitive test scores. Because the factor score is computed on the basis of all available information in a participant's response patterns, it can be computed regardless of missing tests [39].
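
To illustrate how such a score can be computed from any subset of tests, the following is a sketch of EAP scoring under the graded response model using numerical quadrature. The item parameters `a` (discrimination) and `b` (ordered thresholds) stand in for the anchored calibration estimates, and all names here are hypothetical:

```python
import numpy as np
from scipy.stats import norm

def grm_category_probs(theta, a, b):
    """Samejima graded response model: P(X = k | theta) for one item,
    evaluated over a grid of theta values. a: discrimination;
    b: ordered category thresholds (length = n_categories - 1)."""
    cum = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(b)[:, None])))
    upper = np.vstack([np.ones_like(theta)[None, :], cum])   # P(X >= 0) = 1
    lower = np.vstack([cum, np.zeros_like(theta)[None, :]])  # P(X >= K) = 0
    return upper - lower  # shape: (n_categories, len(theta))

def eap_score(responses, items, grid=np.linspace(-4, 4, 81)):
    """EAP factor score: the posterior mean of theta given observed
    categories. Missing tests are simply omitted from `responses`,
    e.g. eap_score({'word_recall': 3, 'digit_span': 5}, items)."""
    log_post = norm.logpdf(grid)  # standard normal prior on theta
    for name, k in responses.items():
        a, b = items[name]
        log_post += np.log(grm_category_probs(grid, a, b)[k])
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    theta = float((grid * post).sum())
    # Rescale to mean 50, SD 10, assuming theta was calibrated so the
    # ADAMS-weighted population has mean 0 and SD 1.
    return 50.0 + 10.0 * theta
```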

We conducted model diagnostics using Monte Carlo simulation, generating 100,001 hypothetical observations with the MMSE and all SAGES and ADAMS cognitive measures. Simulated cognitive test distributions matched those of our empirical samples. This simulation allowed us to rigorously compare study-specific SAGES and ADAMS scores to the overall general cognitive performance factor, using correlations and Bland-Altman plots to examine systematic differences between the measures [40].
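
A sketch of the agreement statistics behind a Bland-Altman comparison (the plot itself is omitted; `x` and `y` stand for the SAGES-calibrated and ADAMS-calibrated scores of the same simulated observations):

```python
import numpy as np

def bland_altman_stats(x, y):
    """Bland-Altman agreement: mean difference (bias) and 95% limits
    of agreement, computed from paired measurements of the same units."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    diff = x - y
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    return {"mean": (x + y) / 2.0,   # x-axis of the plot
            "diff": diff,            # y-axis of the plot
            "bias": bias,
            "limits": (bias - half_width, bias + half_width)}
```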

Criterion Validity of the General Cognitive Performance Factor

To evaluate criterion validity for distinguishing dementia and CIND, we used logistic regression. We report overall areas under the curve (AUC) for the general cognitive performance factor and diagnostic characteristics for the score that maximized sensitivity and specificity [41].
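
As a sketch of that search: the cutoff maximizing sensitivity plus specificity is Youden's J, and the AUC below is computed non-parametrically via the Mann-Whitney identity, which for a single continuous score coincides with the AUC from a logistic regression on that score. Lower scores are taken to indicate impairment, and all names are hypothetical:

```python
import numpy as np

def roc_best_cutoff(score, disease):
    """AUC and the cutoff maximizing sensitivity + specificity
    (Youden's J), treating lower scores as indicating impairment."""
    score = np.asarray(score, dtype=float)
    disease = np.asarray(disease, dtype=bool)
    cases, controls = score[disease], score[~disease]
    # AUC = P(case score < control score) + 0.5 * P(tie)
    auc = ((cases[:, None] < controls[None, :]).mean()
           + 0.5 * (cases[:, None] == controls[None, :]).mean())
    cuts = np.unique(score)
    sens = np.array([(cases < c).mean() for c in cuts])      # P(case < c)
    spec = np.array([(controls >= c).mean() for c in cuts])  # P(control >= c)
    best = int(np.argmax(sens + spec - 1.0))
    return auc, cuts[best], sens[best], spec[best]
```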

Convergent Validity of the General Cognitive Performance Factor

We correlated the general cognitive performance factor with MMSE using Pearson's correlation coefficients.

Linking the General Cognitive Performance Factor and MMSE

The MMSE is a widely used screening test for global cognitive status. Because of its widespread use, many clinicians and researchers are familiar with its scores. Thus, the MMSE provides a set of readily recognizable cutoff points, which we used as guideposts for comparison with the general cognitive performance measure. To produce a ‘crosswalk' between the general cognitive performance factor and the MMSE, we used equipercentile linking methods [42]. This step allowed direct identification of the general cognitive performance factor scores that correspond to MMSE scores. Equipercentile linking identifies scores on 2 measures (MMSE and general cognitive performance factor) with the same percentile rank, and assigns general cognitive performance a value from the reference test, the MMSE, at that percentile. This approach is appropriate when 2 tests are on different scales, and is most useful when the distribution of the reference test is not normally distributed [43].
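
A minimal sketch of equipercentile linking as defined above, assuming raw score vectors from the same linking sample and the common midpoint convention for the percentile rank of a discrete score:

```python
import numpy as np

def equipercentile_link(gcp_scores, mmse_scores):
    """For each MMSE value 0-30, find the general cognitive performance
    (GCP) score with the same percentile rank in the linking sample."""
    gcp = np.sort(np.asarray(gcp_scores, dtype=float))
    mmse = np.asarray(mmse_scores, dtype=float)
    crosswalk = {}
    for m in range(31):
        # midpoint percentile rank of MMSE score m: P(X < m) + 0.5 * P(X = m)
        rank = (mmse < m).mean() + 0.5 * (mmse == m).mean()
        crosswalk[m] = float(np.quantile(gcp, rank))
    return crosswalk
```

For example, if about 25% of the sample scores below 24 on the MMSE, the linked value is the 25th percentile of the general cognitive performance distribution (approximately 45 in ADAMS, per fig. 3).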

In SAGES (n = 300), most participants were female (56%), white (95%), on average 77 years old (range 70-92), and had at least a college education (70%; table 1); few had dementia (n = 5, 1.7%) or CIND (n = 19, 6.3%). By comparison, in ADAMS (n = 856), which is representative of persons over the age of 70, participants were mostly female (61%), white (89%), on average 79 years of age (range 70-110), and 37% had at least a college education. Relative to SAGES, the ADAMS sample was older (p < 0.001), more ethnically diverse (p = 0.001), less highly educated (p < 0.001), and had higher levels of cognitive impairment (n = 308, 13.7% had dementia and n = 241, 22.0% had CIND).

Derivation of the General Cognitive Performance Factor in SAGES and Calibration to ADAMS

By design, the general cognitive performance factor in ADAMS had a mean of 50 and SD of 10. The general cognitive performance factor in SAGES was 0.9 SD above the national average, reflecting SAGES participants' higher education and younger average age. The cognitive tests in ADAMS were internally consistent (Cronbach's α = 0.91). The reliability of the general cognitive performance factor, derived from the SEM, was above 90% for scores between 40 and 70, a range that included 84% of the ADAMS sample. Correlations between the general cognitive performance factor and component tests from SAGES were above 0.80, with the exception of HVLT delayed recall (r = 0.65). The tests represent multiple domains, including memory, executive function, language and attention, suggesting that the factor represents general cognitive performance and is not dominated by a particular cognitive domain. Figure 1 demonstrates that general cognitive performance factor scores in SAGES and ADAMS were normally distributed; on average, SAGES participants had higher levels of cognitive function.

Fig. 1

Distribution of general cognitive performance score in SAGES and ADAMS.


Using simulated data, the correlation between the study-specific general cognitive performance factor for SAGES and ADAMS was above 0.97. Bland-Altman plots further revealed no systematic bias across the range of general cognitive performance scores, suggesting the general cognitive performance factor was not measured differently across the two studies (see online suppl. fig. 1).

Criterion Validity of the General Cognitive Performance Factor

Figure 2 shows receiver operating characteristic curves for distinguishing dementia and CIND in ADAMS. The general cognitive performance factor score that best discriminated participants with dementia from cognitively normal participants was less than 44.8 (sensitivity = 0.94, specificity = 0.90; fig. 2, right panel). This cutoff point correctly classified 94% of the sample. The AUC was 0.97. The general cognitive performance factor score that best discriminated CIND participants from cognitively normal participants was less than 49.5 (sensitivity = 0.80, specificity = 0.76; fig. 2, left panel). This cutoff point correctly classified 79% of the sample (AUC = 0.84).

Fig. 2

Receiver operating characteristic curves for general cognitive performance predicting CIND and dementia: results from ADAMS (n = 856). In the right panel, the general cognitive performance score that best discriminated participants with dementia from other participants was <44.8 (sensitivity = 0.94, specificity = 0.90); this cutoff point correctly classified 93.8% of the sample (AUC = 0.97). In the left panel, the general cognitive performance score that best discriminated CIND participants from cognitively normal participants was <49.5 (sensitivity = 0.80, specificity = 0.76); this cutoff point correctly classified 78.5% of the sample (AUC = 0.84).

The AUC for the general cognitive performance factor was significantly greater than the AUC for each constituent test (online suppl. table 1). The only exception was immediate word recall, whose AUC for predicting dementia was superior.

Convergent Validity of the General Cognitive Performance Factor

The correlation between the general cognitive performance factor and the MMSE was 0.91 in ADAMS (p < 0.001), indicating strong evidence of convergent validity.

Crosswalk between General Cognitive Performance Factor and MMSE

The equipercentile linking procedure is illustrated in figure 3. Scores for the general cognitive performance factor and MMSE were matched based on percentile ranks. For example, a score of 24 on the MMSE has the same percentile rank as a score of 45 on the general cognitive performance factor.

Fig. 3

Corresponding scores and percentile ranks for the general cognitive performance factor and MMSE. General cognitive performance scores are shown on the top axis. MMSE scores, with the proportion of participants in ADAMS falling below each score, are shown on the bottom x-axis. Dotted lines show that approximately 25% of participants score <45 on the general cognitive performance factor and <24 on the MMSE.

After equipercentile linking of the general cognitive performance factor with the MMSE, the score corresponding to an MMSE cutoff point of 23/24 was <45.2, or 0.48 SD below the national average (table 2). This cutoff point was nearly identical to the score that best discriminated participants with and without dementia (fig. 2). General cognitive performance factor scores of <50.9 and <40.7 corresponded to MMSE scores of 26/27 and 17/18, respectively. Table 2 provides sensitivity, specificity, positive and negative predictive values, and positive and negative likelihood ratios for predicting dementia and CIND for general cognitive performance factor scores corresponding to MMSE scores of 29, 27, 24, 18 and 10. The general cognitive performance factor cutoff point of 45.2 correctly classified 90% of persons with dementia (sensitivity) and 93% of persons without dementia (specificity). This cutoff point is moderately strong for confirming dementia (positive predictive value = 73%, positive likelihood ratio = 12.8) and has an excellent negative predictive value (98%).
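
Because predictive values depend on the base rate, readers applying these cutoffs elsewhere may wish to recompute them; the quantities in table 2 follow from sensitivity, specificity and prevalence by Bayes' rule. A sketch is given below; the prevalence in the example is an illustrative assumption, so its outputs will not exactly reproduce table 2, which reflects the ADAMS weighted base rates:

```python
def predictive_values(sens, spec, prevalence):
    """PPV, NPV and likelihood ratios implied by sensitivity,
    specificity and disease prevalence (Bayes' rule)."""
    ppv = sens * prevalence / (
        sens * prevalence + (1.0 - spec) * (1.0 - prevalence))
    npv = spec * (1.0 - prevalence) / (
        (1.0 - sens) * prevalence + spec * (1.0 - prevalence))
    lr_pos = sens / (1.0 - spec)   # positive likelihood ratio
    lr_neg = (1.0 - sens) / spec   # negative likelihood ratio
    return ppv, npv, lr_pos, lr_neg

# Illustration with the dementia figures at the 45.2 cutoff and an
# assumed prevalence of 14%:
# predictive_values(0.90, 0.93, 0.14)
```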

Table 2

Clinically important cutoff points on the MMSE and corresponding general cognitive performance scores


We constructed a crosswalk to show scores on the general cognitive performance factor and corresponding scores on the MMSE (fig. 4). The irregular intervals between adjacent MMSE scores and the limited observable range of the MMSE evident in this figure underscore the broader range and better interval scaling properties of the general cognitive performance factor.

Fig. 4

Crosswalk between MMSE and general cognitive performance factor: results from ADAMS (n = 856). The left side of the crosswalk shows MMSE scores (range 0-30) and the right side shows corresponding general cognitive performance scores. To facilitate comparison of the distribution of general cognitive performance scores against MMSE scores, y-axes are plotted on an inverse normalized percentile scale. Original units are labeled. We used data from the Monte Carlo simulation because the distribution of the observed data did not permit a finely graded linkage between the MMSE and general cognitive performance factor.


We developed a unidimensional factor of cognitive performance in the SAGES study, scaled it to national norms for adults over 70 years of age living in the USA, and evaluated its criterion and convergent validity. When validated against reference standard diagnoses for dementia, a cutoff score of approximately 45 had a sensitivity of 94% and specificity of 90% (AUC = 0.97), indicating outstanding performance. Convergent validity with the MMSE was excellent (correlation = 0.91, p < 0.001). The cognitive tests comprising the general cognitive performance factor are internally consistent (Cronbach's α = 0.91). The factor is highly precise across most of the score range (reliability >0.90). To enhance the clinical relevance of the scores, we provided a crosswalk to widely used MMSE cutoff points. Notably, the score of 45 was the optimal cutoff point for dementia and also corresponded to an MMSE of 23/24.

General cognitive performance factors have previously been shown to have minimal floor and ceiling effects, measurement precision over a wide range of cognitive function, and interval scaling properties that make them well suited to measuring change [12,31]. We replicated these findings and identified meaningful cutoff points to further enhance the potential utility of the measure. Strengths of this study include calibration of the cognitive composite to a large, nationally representative sample of older adults in which rigorous reference standard clinical diagnoses were available. With this sample, we were able to evaluate criterion and convergent validity of the general cognitive performance factor for detecting cognitive impairment and demonstrate its favorable test performance characteristics.

Several caveats merit attention. First, the general cognitive performance factor is not intended to diagnose dementia or mild cognitive impairment. It simply provides a refined cognitive measure which, like any cognitive measure, represents only one piece of the necessary armamentarium for establishing a clinical diagnosis. Second, only 7 tests in common were available to calibrate the cognitive composites in ADAMS and SAGES, although Bland-Altman plots confirmed the composites were similar between studies. Although we are convinced that the general cognitive performance factors developed in the two samples were equivalent, further research is needed to determine the minimum sample sizes, the number of cognitive tests available in common between studies, and the degree of correlation among tests needed to estimate a reliable composite in a new sample. Existing research suggests that at least 5 anchor items in an IRT analysis such as ours are enough to produce reasonably accurate parameter estimates [44]; one previous study used a single item in common to calibrate different scales [45]. Third, some criterion contamination is potentially present in our evaluation of criterion validity because the general cognitive performance factor in ADAMS is a unique combination of shared variability among test items that were available to clinicians when assigning clinical diagnoses. However, comparison of AUCs (online suppl. table 1) for the general factor and individual cognitive tests revealed that the former performed better than most of its constituent parts. Fourth, positive and negative predictive values in our study are dependent on base rates and may differ in other samples. Fifth, while reliability of the general cognitive performance factor was excellent across a range of performance that included 84% of the ADAMS population, results suggest it is less reliable at more impaired levels. Future calibration efforts should consider cognitive tests that are sensitive to impairment in populations with more severe degrees of dementia. A final caveat is that our approach is not intended to replace examination and interpretation of individual neuropsychological tests. Such examination remains an important approach for examining domain-specific cognitive changes, assisting with clinical diagnosis, and understanding pathophysiological correlates of various cognitive disorders.

An important implication of the present work is the potential for deriving the general cognitive performance factor in other samples whose neuropsychological batteries overlap with the battery used in ADAMS. Extrapolation of these methods holds the potential for harmonizing batteries to enhance comparability and even to synthesize results across studies through integrative data analysis. This method would thus address a substantial limitation of combining existing studies that used disparate neuropsychological batteries [46]. Harmonizing samples with different research designs and demographic characteristics provides opportunities to make findings more generalizable. Without such a common metric, or at least 1 test in common across studies, integrative analysis must resort to comparing multiple data points from normative tests that potentially measure diverse cognitive domains.

The need for uniform measures of cognitive function, derived using rigorous psychometric methods, has been recognized by national groups [47]. Uniform, psychometrically sound measures are a central focus of the NIH PROMIS and Toolbox initiatives. Our study is consistent with these goals. The innovative approach demonstrated here used psychometric methods to generate a unidimensional general cognitive performance composite with excellent performance characteristics that can be used to measure cognitive change over time and across studies. We established clinically meaningful, population-based cutoff points. This measure, and the methods used to create it, holds substantial promise for advancing work to evaluate the progression of cognitive functioning over time. Perhaps most importantly, our methods can facilitate future strategies to integrate cognitive test results across epidemiological and clinical studies of cognitive aging.

We created a composite factor for general cognitive performance from widely used neuropsychological tests using psychometrically sophisticated methods. We used publicly available neuropsychological performance data from ADAMS to calibrate general cognitive performance to a nationally representative sample of adults aged 70 and older in the USA. This calibration enabled us to describe cognitive functioning in our study on a nationally representative scale. The general cognitive performance factor was internally consistent, provided reliable measurement across a wide range of cognitive functioning, and demonstrated minimal floor and ceiling effects. It also demonstrated criterion validity: a cutoff point of approximately 45, corresponding to an MMSE of 23/24, optimally discriminated participants with and without dementia (sensitivity = 0.94, specificity = 0.90, AUC = 0.97). Our approach has broad applicability and usefulness for directly comparing cognitive performance in new and existing studies when items overlapping with the ADAMS neuropsychological battery are present. These methods can facilitate interpretation and synthesis of findings in existing and future research studies.

This work was supported by a grant from the National Institute on Aging (P01AG031720 to Dr. Inouye). Dr. Gross was supported by a National Institutes of Health Translational Research in Aging Postdoctoral Fellowship (T32AG023480) and by a grant from the NIA (R03AG045494). Dr. Inouye holds the Milton and Shirley F. Levy Family Chair in Alzheimer's Disease. Dr. Fong was supported by a National Institute on Aging Career Development Award (K23AG031320). The contents do not necessarily represent the views of the funding entities. Dr. Gross had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

None of the authors report any conflicts of interest or received compensation for this work.

1. Lezak MD, Howieson DB, Loring DW: Neuropsychological Assessment. New York, Oxford University Press, 2004.
2. Chandler MJ, Lacritz LH, Hynan LS, Barnard HD, Allen G, Deschner M, Weiner MF, Cullum CM: A total score for the CERAD neuropsychological battery. Neurology 2005;65:102-106.
3. Crane PK, Narasimhalu K, Gibbons LE, Mungas DM, Haneuse S, Larson EB, et al: Item response theory facilitated calibrating cognitive tests and reduced bias in estimated rates of decline. J Clin Epidemiol 2008;61:1018-1027.
4. Folstein MF, Folstein SE, McHugh PR: ‘Mini-Mental State': a practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res 1975;12:189-198.
5. Schultz-Larsen K, Kreiner S, Lomholt RK: Mini-Mental State Examination: mixed Rasch model item analysis derived two different cognitive dimensions of the MMSE. J Clin Epidemiol 2007;60:268-279.
6. Simard M: The Mini-Mental State Examination: strengths and weaknesses of a clinical instrument. Can Alzheimer Dis Rev 1998;12:10-12.
7. Wouters H, van Gool WA, Schmand B, Zwinderman AH, Lindeboom R: Three sides of the same coin: measuring global cognitive impairment with the MMSE, ADAS-cog and CAMCOG. Int J Geriatr Psychiatry 2010;25:770-779.
8. McArdle JJ, Grimm KJ, Hamagami F, Bowles RP, Meredith W: Modeling life-span growth curves of cognition using longitudinal data with multiple samples and changing scales of measurement. Psychol Methods 2009;14:126-149.
9. Tuokko H, Woodward TS: Development and validation of a demographic correction system for neuropsychological measures used in the Canadian Study of Health and Aging. J Clin Exp Neuropsychol 1996;18:479-616.
10. Wilson RS, Mendes de Leon CF, Barnes LL, Schneider JA, Bienias JL, Evans DA, et al: Participation in cognitively stimulating activities and risk of incident Alzheimer's disease. JAMA 2002;287:742-748.
11. Strauss ME, Fritsch T: Factor structure of the CERAD neuropsychological battery. J Int Neuropsychol Soc 2004;10:559-565.
12. Jones RN, Rudolph JL, Inouye SK, Yang FM, Fong TG, Milberg WP, et al: Development of a unidimensional composite measure of neuropsychological functioning in older cardiac surgery patients with good measurement precision. J Clin Exp Neuropsychol 2010;32:1041-1049.
13. Nunnally JC, Bernstein IH: Psychometric Theory. New York, McGraw-Hill, 1994.
14. Langa KM, Plassman BL, Wallace RB, Herzog AR, Heeringa SG, Ofstedal MB, et al: The Aging, Demographics and Memory Study: study design and methods. Neuroepidemiology 2005;25:181-191.
15. Juster FT, Suzman R: An overview of the Health and Retirement Study. J Hum Res 1995;30(suppl):7-56.
16. Bjorner JB, Kosinski M, Ware JE Jr: Computerized adaptive testing and item banking; in Fayers PM, Hays RD (eds): Assessing Quality of Life. Oxford, Oxford University Press, 2004.
17. Plassman BL, Langa KM, Fisher GG, Heeringa SG, Weir DR, Ofstedal MB, et al: Prevalence of dementia in the United States: the Aging, Demographics and Memory Study. Neuroepidemiology 2007;29:125-132.
18. American Psychiatric Association: Diagnostic and Statistical Manual of Mental Disorders, ed 3, revised (DSM-III-R). Washington, APA, 1987.
19. American Psychiatric Association: Diagnostic and Statistical Manual of Mental Disorders, ed 4 (DSM-IV). Washington, APA, 1994.
20. Feher EP, Mahurin RK, Doody RS, Cooke N, Sims J, Pirozzolo FJ: Establishing the limits of the Mini-Mental State: examination of ‘subtests'. Arch Neurol 1992;49:87-92.
21. Mitchell AJ: A meta-analysis of the accuracy of the Mini-Mental State Examination in the detection of dementia and mild cognitive impairment. J Psychiatr Res 2009;43:411-431.
22. George L, Landerman R, Blazer D, Anthony J: Cognitive impairment; in Robins L, Regier D (eds): Psychiatric Disorders in America. New York, Free Press, 1991, pp 291-327.
23. Crum RM, Anthony JC, Bassett SS, Folstein MF: Population-based norms for the Mini-Mental State Examination by age and educational level. JAMA 1993;269:2386-2391.
24. Huppert FA, Cabelli ST, Matthews FE: Brief cognitive assessment in a UK population sample - distributional properties and the relationship between the MMSE and an extended mental state examination. MRC Cognitive Function and Ageing Study. BMC Geriatr 2005;5:1-14.
25. Moylan T, Das K, Gibb A, Hill A, Kane A, Lee C: Assessment of cognitive function in older hospital inpatients: is the Telephone Interview for Cognitive Status (TICS-M) a useful alternative to the Mini Mental State Examination? Int J Geriatr Psychiatry 2004;19:1008-1009.
26. Mungas D: In-office mental status testing: a practical guide. Geriatrics 1991;46:54-58.
27. Kilada S, Gamaldo A, Grant EA, Moghekar A, Morris JC, O'Brien RJ: Brief screening tests for the diagnosis of dementia: comparison with the Mini-Mental State Exam. Alzheimer Dis Assoc Disord 2005;19:8-16.
28. Xu G, Meyer JS, Thornby J, Chowdhury M, Quach M: Screening for mild cognitive impairment (MCI) utilizing combined mini-mental-cognitive capacity examinations for identifying dementia prodrome. Int J Geriatr Psychiatry 2002;17:1027-1033.
29. Tang-Wai DF, Knopman DS, Geda YE: Comparison of the short test of mental status and the Mini-Mental State Examination in mild cognitive impairment. Arch Neurol 2003;60:1777-1781.
30. Jöreskog KG, Moustaki I: Factor analysis of ordinal variables: a comparison of three approaches. Multivariate Behav Res 2001;36:347-387.
31. McHorney CA: Ten recommendations for advancing patient-centered outcomes measurement for older persons. Ann Intern Med 2003;139:403-409.
32. Mislevy RJ: Recent developments in the factor analysis of categorical variables. J Educ Stat 1986;11:3-31.
33. Samejima F: Estimation of latent ability using a response pattern of graded scores. Psychometrika 1969;34:1-17.
34. Green BF, Bock RD, Humphreys LG, Linn RL, Reckase MD: Technical guidelines for assessing computerized adaptive tests. J Educ Meas 1984;21:347-360.
35. Embretson SE, Reise SP: Item Response Theory for Psychologists. Mahwah, Erlbaum, 2000.
36. Hambleton RK, Swaminathan H, Rogers HJ: Fundamentals of Item Response Theory. Newbury Park, Sage, 1991.
37. Kotsiantis S, Kanellopoulos D: Discretization techniques: a recent survey. GESTS Int Trans Comput Sci Eng 2006;32:47-58.
38. Heywood HB: On finite sequences of real numbers. Proc R Soc Lond A Math Phys Sci 1931;134:486-501.
39. Muraki E, Engelhard G: Full-information item factor analysis: applications of EAP scores. Appl Psychol Meas 1985;9:417.
40. Bland JM, Altman DG: Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;327:307-310.
41. Coffin M, Sukhatme S: Receiver operating characteristic studies and measurement errors. Biometrics 1997;53:823-837.
42. Kolen M, Brennan R: Test Equating: Methods and Practices. New York, Springer, 1995.
43. Gross AL, Inouye SK, Rebok GW, Brandt J, Crane PK, Parisi JM, et al: Parallel but not equivalent: challenges and solutions for repeated assessment of cognition over time. J Clin Exp Neuropsychol 2012;34:758-772.
44. Wang WC: Effects of anchor item methods on the detection of differential item functioning within the family of Rasch models. J Exp Educ 2004;72:221-261.
45. Jones RN, Fonda SJ: Use of an IRT-based latent variable model to link different forms of the CES-D from the Health and Retirement Study. Soc Psychiatry Psychiatr Epidemiol 2004;39:828-835.
46. Curran PJ, Hussong AM: Integrative data analysis: the simultaneous analysis of multiple data sets. Psychol Methods 2009;14:81-100.
47. Hendrie HC, Albert MS, Butters MA, Gao S, Knopman DS, Launer LJ, Wagster MV: The NIH Cognitive and Emotional Health Project: report of the Critical Evaluation Study Committee. Alzheimers Dement 2006;2:12-32.
48. Reitan R: Validity of the trail making test as an indicator of organic brain damage. Percept Mot Skills 1958;8:271-276.
49. Wechsler D: Wechsler Memory Scale, revised. San Antonio, Psychological Corporation, 1987.
50. Benton A, Hamsher K: Multilingual Aphasia Examination. Iowa City, University of Iowa, 1976.
51. Williams BW, Mack W, Henderson VW: Boston Naming Test in Alzheimer's disease. Neuropsychologia 1989;27:1073-1079.
52. Smith A: Symbol Digits Modalities Test. Los Angeles, Western Psychological Services, 1973.
53. Morris JC, Heyman A, Mohs RC, Hughes JP, van Belle G, Fillenbaum G, Mellits ED, Clark C: The Consortium to Establish a Registry for Alzheimer's Disease (CERAD). I. Clinical and neuropsychological assessment of Alzheimer's disease. Neurology 1989;39:1159-1165.
54. Brandt J, Benedict RHB: Hopkins Verbal Learning Test, revised: professional manual. Odessa, Psychological Assessment Resources, 2001.
55. Trenerry MR, Crosson B, DeBoe J, Leber WR: Visual Search and Attention Test (VSAT). Odessa, Psychological Assessment Resources, 1990.