Background: Despite positive psychology being increasingly recognised as an important agent in well-being, there is a lack of standardised outcome measures for psychosocial dementia research. This review assessed positive psychology outcome measures against standardised criteria in populations that were identified as having shared characteristics with people with dementia. It aimed to identify robust measures that were suitable for potential adaptation or use within a dementia population. Summary: The review identified 16 positive psychology outcome measures (and 8 further psychometric assessments of these) within the constructs of resilience, self-efficacy, religiousness/spirituality, life valuation, sense of coherence, autonomy, resourcefulness and a combined measure (CASP-19). Scale development studies were subject to a quality assessment, and most were found to be lacking information on reproducibility and responsiveness. Key Messages: A wide range of measures within the constructs of positive psychology was identified as having potential utility for psychosocial research within a dementia population. Examples included the CD-RISC, GSWB, SWLS, MPAQ, RSOA and CASP-19. It is recommended that such scales are further adapted or validated for people with dementia. Underreporting of appropriate psychometric analyses hampered this review, and it is recommended that future authors endeavour to report such analyses.
The psychology of dementia has generally been constructed in terms of progressive deficits, negative aspects of behaviour or mood and progressive dependency. Whilst a number of recent psychosocial intervention studies have aimed to promote quality of life, the outcomes utilised within these studies assess neuropsychiatric symptoms such as agitation and depression, thereby inferring well-being or quality of life by the reduction or absence of these behaviours. Furthermore, the measurement of quality of life within dementia research is not without its challenges. Quality of life is a highly subjective concept and, therefore, there is debate as to whether proxy or observational measures of this concept truly reflect an individual's appraisal of their own quality of life.
Positive psychology (PP) is recognised as an important contributing factor in terms of well-being. However, there is little research in relation to it within the dementia field, possibly due to the lack of suitable PP outcome scales. Whilst a consensus on more general measures to be used in dementia research has been reached, it has also highlighted the absence of PP outcome measures in the field. In fact, there are no specifically tested or designed PP outcome measures for people with dementia.
For the purpose of this review, PP was defined as the study of positive emotions that enable individuals, communities and organisations to thrive. Whilst other models and definitions were considered, Seligman's theory was chosen as the basis of this review due to its inclusive and influential nature.
Since the reintroduction of PP as a meaningful branch of psychology, a number of studies have explored positive attributes for dementia caregivers including resilience and self-efficacy [7,8], in both qualitative and quantitative settings.
PP research focusing predominantly on the person with dementia has largely been of a qualitative nature, for example the use of spirituality in coping with a diagnosis of Alzheimer's disease. More recently, a meta-synthesis of living positively with dementia was undertaken. It highlighted retained capacities to utilise character strengths and actively seek enjoyment and pleasure, and it provides a strong rationale for the development of PP outcome measures within dementia research.
In order to consider the outcomes which may have potential utility for people with dementia, this review identified populations which may have shared characteristics. Chronic illness populations were selected due to the persistent, incurable nature of their conditions; traumatic brain injury (TBI) populations for their shared symptoms, including impairment of cognitive, physical and psychosocial functions; and older adults because they share similar issues in old age and have the highest prevalence of dementia.
The aims of this review were to: (1) assess the psychometric properties of PP outcome measures in use for chronic illness, TBI and older adults, and (2) appraise the potential applicability of measures of positive outcomes for people with dementia.
A systematic search and psychometric property appraisal of published PP outcome measures for people with chronic illness, TBI and for older adults was undertaken. Systematic principles were followed for searching, screening and appraising results. Constructs denoting PP were sourced from current literature [13,14,15] to identify salient and pertinent constructs. Such constructs included resilience, hope, optimism, autonomy and spirituality.
The following electronic databases were searched: PsycINFO, MEDLINE and PubMed. The search terms were: ‘measure', ‘instrument', ‘questionnaire', ‘quiz', ‘test' and ‘scale' combined with ‘goal', ‘life satisfaction', ‘self-efficacy', ‘hope', ‘resilience', ‘cope', ‘wisdom', ‘growth', ‘coherence', ‘control', ‘autonomy', ‘pleasure', ‘self-realisation', ‘sense of agency', ‘gratitude', ‘happiness', ‘optimism', ‘transcendence', ‘positive', ‘dignity', ‘social participation', ‘social inclusion', ‘self-concept', ‘humour', ‘creativity', ‘flow', ‘spirituality', ‘love', ‘compassion', ‘benefit finding', ‘community integration', ‘opportunity', ‘social adjustment', ‘mindfulness', ‘acceptance' and ‘successful aging'. These search terms were then, again, combined with: ‘chronic illness', ‘traumatic brain injury' and ‘older adult'. Truncations of search terms were used where necessary. Search terms such as ‘quality of life' and ‘well-being' were not included as the review focused on concepts that contribute to these dimensions.
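The three-way combination of search terms described above can be sketched as a Boolean query. This is a minimal illustration only: the use of OR within each group and AND between groups is an assumption, since the exact database syntax is not given, and only a subset of the construct terms is shown.

```python
# Illustrative sketch: combine measure terms, construct terms and population
# terms into one Boolean search string. The OR/AND structure is an assumption.
measure_terms = ["measure", "instrument", "questionnaire", "quiz", "test", "scale"]
construct_terms = ["resilience", "hope", "optimism", "autonomy", "spirituality"]  # subset only
population_terms = ["chronic illness", "traumatic brain injury", "older adult"]


def or_group(terms):
    """Join the terms of one group with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Join several OR-groups with AND."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(measure_terms, construct_terms, population_terms)
print(query)
```

Truncation symbols and field tags differ between PsycINFO, MEDLINE and PubMed, so a string like this would need adapting for each database.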
In the first instance, studies were screened for the development of a PP outcome measure. Studies were also screened for independent assessments of psychometric properties of a PP outcome measure in a chronic illness, TBI or older adult population (hereafter referred to as ‘validation studies’). If a validation study was identified, a search for the original psychometric development study was performed, even if its publication date preceded 1998. Finally, a manual check of text and reference lists was conducted to identify additional measures.
The inclusion criteria were: (1) outcome measures published in a peer-reviewed journal; (2) an outcome measure purporting to measure a specific construct, as identified in the search terms, within PP and developed or validated in chronic illness, TBI or older adult populations, and (3) published between 1998 and 2015 (1998 was the date when the term PP was re-introduced by Seligman).
The exclusion criteria were: (1) papers published in a language other than English if a translation was not available, and (2) measures that focused on external or situational factors rather than an internal trait within PP.
Identified abstracts were exported to EndNote, where they were screened against the eligibility criteria. Full-text articles were then sought for the studies included. In uncertain cases, of which there were 6, scales were given to A.S. to screen against the eligibility criteria and were discussed until a decision on their inclusion or exclusion was reached. The final list of measures included was also reviewed by A.S.
Appraisal of Psychometric Properties
Included measures were grouped within the construct they intended to measure, and a quality assessment was undertaken using published criteria that appraise the development process of outcome measures. These criteria have been applied in other reviews and provide a scoring system based on reported aspects of reliability and validity during measure development (table 1). This analysis was undertaken for measure development papers only by the primary author and corroborated by A.S. For each criterion, a positive score was awarded when the study was adequately designed and appropriate statistics were reported. An intermediate score (?) was given if there were methodological shortfalls, including inadequate description of the design or analysis, or sampling issues. A negative rating was awarded if, despite adequate study design and methods, the study produced results indicating poor psychometric properties. A zero was awarded if the authors failed to report the appropriate information. A positive score was awarded 2 points, an intermediate score 1 point, and both negative ratings and zero ratings were awarded 0 points. These scores were then added together to produce an overall quality score for the development process of the scale, with a possible range of 0-18.
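The scoring rules above can be sketched in a few lines of code. This is an illustration under assumptions: the symbols "+", "-" and "0" are invented here (only "?" is given in the text), the nine criterion names are inferred from the 0-18 range and from criteria mentioned elsewhere in the review, and the example ratings are hypothetical rather than taken from table 3.

```python
# Points per rating as described in the text: positive = 2, intermediate = 1,
# negative = 0, not reported = 0. Symbols other than "?" are assumptions.
RATING_POINTS = {"+": 2, "?": 1, "-": 0, "0": 0}


def quality_score(ratings):
    """Sum the points for a dict mapping criterion name -> rating symbol."""
    unknown = {r for r in ratings.values() if r not in RATING_POINTS}
    if unknown:
        raise ValueError(f"unrecognised rating symbols: {sorted(unknown)}")
    return sum(RATING_POINTS[r] for r in ratings.values())


# Hypothetical ratings for an imaginary scale (not data from this review).
# The criterion names are illustrative placeholders.
example_ratings = {
    "content validity": "+",
    "internal consistency": "+",
    "criterion validity": "0",
    "construct validity": "?",
    "agreement": "0",
    "reliability": "?",
    "responsiveness": "0",
    "floor/ceiling effects": "-",
    "interpretability": "0",
}

score = quality_score(example_ratings)
max_score = 2 * len(example_ratings)
print(f"{score}/{max_score}")
```

Because negative and unreported ratings both contribute zero points, the total alone cannot distinguish a scale with poor properties from one whose properties were simply not reported; keeping the per-criterion symbols alongside the total preserves that distinction.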
The appraisal of validation papers was undertaken to assess the degree of translatability to other populations and therefore guide selection of a measure that could be used for people with dementia. An analysis of reported psychometric properties, including internal consistency (employing magnitude guidelines) and convergent validity, was undertaken. A measure was identified as potentially applicable to people with dementia if reported psychometric properties within a validation study were robust and consistent with the original scale. This would indicate the measure's stability and usability across populations. Study characteristics including sample size and psychometric properties were synthesised to compare properties of a measure when used in a different population (table 2).
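For readers less familiar with internal consistency, the sketch below computes Cronbach's α from raw item scores using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The descriptive bands (0.7 acceptable, 0.8 good, 0.9 excellent) are commonly cited rules of thumb and are assumed here; they are not necessarily the exact magnitude guidelines applied in this review. The response data are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    columns = list(zip(*item_scores))  # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def describe_alpha(alpha):
    """Map alpha to a descriptive band (assumed rule-of-thumb thresholds)."""
    if alpha >= 0.9:
        return "excellent"
    if alpha >= 0.8:
        return "good"
    if alpha >= 0.7:
        return "acceptable"
    return "below acceptable"


# Invented responses: 5 respondents x 3 items.
responses = [
    [3, 4, 3],
    [4, 5, 5],
    [2, 2, 3],
    [5, 5, 4],
    [1, 2, 2],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3), describe_alpha(alpha))
```

The sample (n − 1) variance is used for both the items and the totals; using the population variance throughout would give the same α, because the correction factor cancels in the ratio.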
A total of 6,709 results were identified from the databases PsycINFO, MEDLINE and PubMed, from which 109 potential scales and 32 validation papers were identified. Figure 1 summarises the steps taken during the review when including or excluding potential scales. Of the 109 potential scales, only 16 met the criteria for inclusion within this review.
The main reason for the exclusion was that scales did not measure a trait or characteristic indicative of PP (64 out of 109 were excluded on this basis). A breakdown of the inclusion and exclusion process is shown in figure 1.
Of the 32 validation papers, 8 met the inclusion criteria. The majority of validation papers (19 out of 32) were excluded on the basis that the original scale did not meet the inclusion criteria (fig. 1).
The appraisal of the scale development process is contained within table 3. Scores were relatively low, ranging from 2 to 9 out of a possible 18. Overall, the Control, Autonomy, Self-Realisation and Pleasure (CASP-19) was awarded the highest score, demonstrating its comprehensive reliability and validity for older adults.
Table 2 provides a description of the 15 included measures and validations in populations of interest. In one instance, a short form version of a scale was utilised, and both versions were included in the quality assessment.
Four scales measuring resilience were identified: the Connor-Davidson Resilience Scale (CD-RISC), the Brief Resilient Coping Scale (BRCS), the Resilience Scale (RS) and the Brief Resilience Scale (BRS). The CD-RISC, RS and BRS were awarded the highest scores for resilience measurement (7/18). In particular, the RS was rigorously developed, with items being generated following an extensive literature review and in-depth interviews with the target population, contributing to its high score on the content validity criterion.
Internal consistency using Cronbach's α was reported in all 4 development studies. Scores ranged from acceptable to good, of which the BRS had the highest score. Test-retest reliability was reported for 3 of the 4 (CD-RISC, BRCS and BRS), and scores again ranged from acceptable to good, of which the CD-RISC had the highest score.
Convergent validity was reported for all 4 scales with expected and significant results, of which the CD-RISC was the most thorough. For the BRCS, correlations between the overall scale and the measures used to assess convergent validity were not established. However, expected and significant correlations were reported for subscales. The BRS was found to be positively correlated with a number of scales and subscales including the CD-RISC. The RS was found to be positively correlated with life satisfaction and morale and negatively correlated with depression.
Sensitivity to change was established for 2 scales (CD-RISC and BRCS), with the CD-RISC showing a significant effect of time and an interaction effect between the time and response categories, indicating an increase in score associated with overall clinical improvement. The BRCS demonstrated a significant linear effect across four assessment periods and an increase in the mean average score before and after intervention (table 2).
Predictive validity was reported for 2 scales (BRCS and BRS). For the BRCS, the authors created an Outcomes Index, consisting of 6 standardised variables reflecting post-intervention scores. This outcomes index had an adequate α reliability score (α = 0.86) and was found to be a moderately significant predictor of post-intervention outcomes (p < 0.03). The BRS predicted outcomes for perceived stress, anxiety, depression, positive affect and physical symptoms.
Two validation papers were identified within this review. The CD-RISC was validated in a sample of Native American older adults and was found to have excellent internal consistency (α = 0.93). Its convergent validity was also established by significant positive correlations with depression and scales of self-efficacy and mastery, and by a negative correlation with handgrip strength. The RS was translated and validated in a Spanish chronic musculoskeletal pain sample and was found to have adequate psychometric properties. The authors also analysed the scale's stability and found no significant differences across two time points. The RS was also found to be positively correlated with pain scales and negatively correlated with pain catastrophising.
Overall, the CD-RISC appears to be the most psychometrically robust measure, as reflected in the quality assessment and validation stages. Although the RS scored as well as the CD-RISC, the latter was subject to more stringent validity checks, including sensitivity to change analyses. The CD-RISC therefore seems the most appropriate scale for further validation in a dementia population.
The Care-Receiver Efficacy Scale (CRES) was the only measure of self-efficacy to meet the inclusion criteria for this review. It was given a moderate 7/18 for the scale development, notably lacking information regarding reproducibility, responsiveness and interpretability. The CRES had an adequate reported internal consistency for most of its subscales; however, one of these came close to the minimum required score of α = 0.70. The authors conceded that this subscale was of questionable practical use but was retained for potential future analysis and modification.
The authors reported expected negative correlations between depression and subscale 4 (performance-related quality of life) but, overall, subscale correlations with validation measures were only moderate, ranging from r = 0.3 to 0.4. No validation studies for the CRES were identified within this review. The CRES, although scoring moderately on the quality assessment, appears to be of questionable practical value and would benefit from further development and analysis.
The Daily Spiritual Experience Scale (DSES), the Spirituality Index of Well-Being (SIWB), the Geriatric Spiritual Well-Being Scale (GSWS) and the Functional Assessment of Chronic Illness Therapy-Spiritual Well-Being Scale (FACIT-Sp) were identified for inclusion within this review, of which the final 2 were given the highest rating during the quality assessment stage (7/18).
Internal consistency was reported for all 4 scales and ranged from good to excellent, of which the DSES scored highest. However, 2 items were found to be collinear (α = 0.96) for this scale, as participants seemed unable to distinguish between finding comfort and finding strength at an item level. The authors conceded that if similar patterns were to be found in other populations, 1 of the items should be omitted. These items were nevertheless included within the final scale. Test-retest reliability was only reported for the GSWS with a significant relationship being found (p < 0.001).
Convergent validity was reported for all scales with acceptable, expected correlations for each (table 2). Of particular note was the DSES, for which the authors reported positive correlations for a range of factors including optimism, social support and quality of life, and negative correlations with anxiety and alcohol consumption. Furthermore, the DSES was reported to have good construct validity.
Validations in appropriate populations were identified for the DSES and the SIWB. The DSES has been translated and validated for French older adults and was found to have excellent internal consistency, test-retest reliability and convergent validity, highlighting its possible applicability for older adults with dementia. The SIWB was validated in a chronic illness setting and was also found to have excellent internal consistency, test-retest reliability and convergent validity.
Overall, of the 4 scales identified, the GSWB and FACIT-Sp were more rigorously developed, as reflected in their higher scoring on the quality criteria (table 3). Also, the GSWB and the SIWB were developed for older adults, and the SIWB has been validated in a chronic illness population and therefore might be more applicable in a dementia setting.
The Satisfaction with Life Scale (SWLS), the Valuation of Life Scale (VLS) and the Attitude towards Aging Scale (AtA) were identified and grouped under the construct of ‘life valuation’. The AtA received the highest rating on the quality assessment in the development stage (6/18). This reflects the thoroughness of its reporting: it received a positive score for its reporting of floor/ceiling effects, a criterion that appears to be underreported in scale development.
All 3 scales reported good internal consistency and suitable convergent assessments (table 2). The VLS consists of 2 subscales (positive valuation and negative valuation) and appeared most thorough in the assessment of convergent validity. The authors noted a significant, positive relationship between well-being, hardiness and mastery and positive valuation of life, and a significant negative relationship between negative valuation of life and well-being, hardiness and mastery. Furthermore, a negative relationship was found between positive valuation of life and depression. In contrast, the AtA reported aspects of construct validity but only weak negative correlations with other scales, which is a questionable indicator of discriminant validity.
Three validation studies were identified for the SWLS, consisting of a translation and validation in Turkish older adults, a translation and factor analysis in Spanish adolescents and older adults and a Portuguese translation and validation for older adults. Two of these studies reported good to excellent internal consistency for older adults, with appropriate sample sizes and expected significant relationships. The Spanish translation examined factorial variance between adolescents and older adults and concluded that the SWLS was indeed sensitive to both these age groups, with an acceptable one-factor model identified for both. No validation studies were identified for the VLS or AtA.
Overall, the SWLS appears most appropriate for future use for people with dementia. Although its development was not as rigorous as that of the VLS, it has been the subject of at least 3 validation studies for older adults, illustrating its applicability to older adults cross-culturally. The AtA is a new scale, which would benefit from additional development with regard to convergent validity before adaptation for people with dementia.
The Maastricht Personal Autonomy Questionnaire (MPAQ) received a score of 8/18 for the quality assessment, illustrating adequate development, particularly with regard to establishing content validity. Acceptable internal consistency was reported using ICC for each of the 3 subscales, and a wide range of expected correlations were noted, thereby establishing its convergent validity (table 2). No validation studies were identified for the MPAQ, but this is unsurprising as the scale was published in 2014.
Sense of Coherence
The Sense of Coherence Scale (SOCS) is a 29- or 13-item measure. Both the 29- and 13-item scales were given a low score of 2/18 for their development process, largely because information was not available for most of the criteria. For example, whilst it is noted that the scale was developed with a Jewish population, there was no indication of norms for this sample. Furthermore, content validity was difficult to establish as there was no record of target population involvement in the generation of items, and as there was no examination of convergent/divergent validity, construct validity for the scale was questionable.
Internal consistency for both the 29- and 13-item versions of the scale was reported and found to be high. However, convergent validity was not examined as the scale was proposed as ‘novel’, and face validity was established from colleague feedback to the author. The 13-item SOCS was subject to a confirmatory factor analysis in a Dutch sample of young adults living with a chronic illness (n = 2,781). Results indicated that 2 items should be dropped to improve overall consistency and, furthermore, the 3 subscales loaded onto a single order factor model with factor loadings ranging from 0.58 to 1.00.
Whilst the development of the scale was lacking in some basic areas, it has since been subject to extensive psychometric assessments. In a review of 124 studies, the SOCS was reported to have adequate internal consistency, to be relatively stable over time and predictive of health outcomes including risk of post-traumatic stress symptoms. As such, the SOCS is a well-established outcome measure that could be adopted within psychosocial dementia research.
The Resourcefulness Scale for Older Adults (RSOA) is a 28-item scale developed in a sample of older adults who had an average of three chronic health conditions per participant. It was awarded 5/18 for the development of the scale; notably, the authors conducted in-depth factor analyses and reported appropriate levels of internal consistency for the subscales and the overall scale. However, convergent validity was not established for this scale and, therefore, further development is needed before the possibility of adaptation for people with dementia.
The CASP-19 is a 19-item scale developed in a sample of older adults with an age range of 65-75 years and had the greatest score at the quality assessment stage, achieving a score of 9/18. This illustrates its thorough psychometric development, including the use of experts, discussion groups with target populations and factor analyses.
Internal consistency was reported for each of the 4 subscales and ranged from α = 0.59 to α = 0.77. Whilst the lower end of this range falls below the acceptable limit for Cronbach's α, the authors compensated for this by undertaking a factor analysis which suggested evidence for a single, underlying quality of life factor, with strong factor loadings occurring for each of the subscales (0.71-0.88) on a latent factor. The scale was also strongly positively correlated with the Life Satisfaction Index-W (p = 0.001).
Whilst the CASP-19 is used as an indicator of quality of life, it was developed using a needs satisfaction model, strongly linked with Maslow's work on human motivation, and assesses quality of life by the degree to which the requirements of its four domains are satisfied. However, as the CASP-19 is used as a quality of life indicator, validation studies were unlikely to be identified within this review. Nevertheless, the CASP-19 appears to be a psychometrically sound measure that could be used for people with dementia in future.
Whilst the quality criteria used within this review have been applied to a review of resilience measures, this is, to our knowledge, the first review to use well-defined criteria to assess outcome measures in the broad field of PP.
It is debatable why PP outcome measures have not been developed for or validated in dementia populations, as there was no shortage of scales in populations identified as being similar within this review. This may be because of the prevailing medical model of diagnosing and treating dementia despite the emergence of more person-centred models, or it may be due to the continuing stigma surrounding the perception of dementia as a negative and debilitating condition, for which there is little to offer.
However, we identified a wide range of measures within the constructs of PP that could be further validated for people with dementia. These included the CD-RISC for resilience, the GSWB, FACIT-Sp and SIWB for spirituality, the SWLS and AtA for life valuation, the MPAQ for autonomy, the SOCS for sense of coherence, the RSOA for resourcefulness and the CASP-19 as a combined measure. For the construct of self-efficacy, however, no suitable scales were identified. Although most scales identified scored moderately on a quality assessment, it is recommended that they are subject to further psychometric assessments, so that clinicians may better understand the potential role of positive traits within well-being for a dementia population. As the efficacy of non-pharmacological interventions within dementia has been established, positive outcomes may aid the facilitation of more appropriate psychosocial intervention studies that aim to enhance quality of life.
Methodological Problems and Limitations
Whilst an effort was made to include all-encompassing PP search terms, results often included outcome measures that were not indicative of PP, and the vast majority were subsequently excluded from the review on this basis. Furthermore, only one definition of PP was used for the review. Whilst this definition was selected for its inclusive nature, it is noted that there are variations as to what constitutes PP.
Obtaining the original development paper of outcome measures was sometimes difficult and often accomplished through extensive searching of databases. For example, the SOCS proved difficult to obtain, and an additional review of the measure was included in the review to more comprehensively assess its psychometric properties.
It is important to acknowledge that 14 of the 16 scales were developed in American populations, the exceptions being the CASP-19, which was developed in a British sample and the AtA, developed in a Portuguese sample. As such, it is questionable as to whether these scales are culturally appropriate for other populations as definitions of positive constructs may differ between cultures. However, of the 14 scales developed in American samples, 6 were translated into other languages and were validated appropriately, the most comprehensive of which was the SWLS, for which we identified three translations within the confines of this review.
The quality criteria employed here may have been overly constraining, and their use was largely hindered by a lack of available information. For example, it was nearly impossible to award a score for the responsiveness or reproducibility section of the assessment as authors rarely reported this. This difficulty was also reported by authors using these criteria in previous reviews. Future researchers may wish to include such information for the purpose of future reviews. However, the criteria employed here are amongst the few assessment criteria comprehensive enough to cover most aspects of measurement development.
It has been argued that below a certain level of cognitive function, it is difficult for people with dementia to accurately appraise their quality of life, but nevertheless some people still appear able to give a view on their quality of life even with severe impairment. Similar issues complicate the related assessment of PP ratings with people with dementia, which are likely to require more challenging appraisals than quality of life. However, there are so few studies in PP and dementia that further evidence is needed before a more definitive statement can be made.
This review highlights the need for authors to better report aspects of scale development and validation. For example, most of the scales identified failed to report a minimal important change (MIC) under the responsiveness criteria of the quality assessment. The exception to this was the MPAQ, although the authors conceded the MIC used during scale development was arbitrarily defined. Only one paper (CD-RISC) examined change in response during treatment for specific subgroups, but no MIC was identified. It is recommended that this should be a requirement of authors when developing scales, to determine a clinically meaningful change in score, as most of these scales are in use for clinical populations. Qualitative research alongside scale development may aid the identification of a clinically meaningful change or sensitivity to the effect of treatment.
The populations that were selected for the purpose of this review were chosen with the view to assessing the most suitable and adaptable scales for people with dementia, and consequently future researchers may wish to expand their search to include other populations as well, where there would be a larger range of measures available.
The CASP-19 and the AtA were the only measures for which the authors reported a floor and ceiling effect, identifying the range of the scale, skew and kurtosis of results. It is recommended that future authors endeavour to report these factors.
Only one scale of self-efficacy was identified for inclusion. The CRES scored moderately at the quality assessment stage, and no further validation studies were identified. Whilst there is a wealth of research measuring self-efficacy for caregivers of people with dementia, there appears to be a lack of research concerning self-efficacy for people with dementia. This may be due to an absence of specific measures developed for this population. As such, it is recommended that a domain-specific scale of self-efficacy be developed and validated for people with dementia. Domain-specific measures of self-efficacy are often reported to have greater predictive ability and a greater capacity to inform theoretical models.
Also of note was the MPAQ, a very recent scale, which scored 8/18 at the quality assessment stage. This scale was developed with older adults who reported a chronic physical illness and, therefore, represents two populations identified as being suitably similar to dementia within this review. Future researchers may wish to refine this measure and validate it for people with dementia.
As the CASP-19 received the highest score at the quality assessment stage and was developed specifically for older adults, it may be appropriate to assess the psychometric properties of this measure within a dementia population and adapt it, if necessary.
The authors would like to thank Dr. Emese Csipke who provided advice on methodology.
The authors declare no competing interests.