Background: Routinely collected databases are kept for administrative purposes. We have refined the analyses of the Swedish National Patient Register and the Cause of Death Register and explored their validity to monitor stroke at the population level. Methods: First-ever strokes (incident cases) and all stroke events were measured by combining the two administrative registers and adding refinements. The administrative registers were validated against the Northern Sweden MONICA, a well-validated population-based epidemiological stroke register. Positive predictive values (PPVs) and sensitivity were calculated. Results: After refinements (restriction to first-ever strokes and additional minor delineations), the PPV of the two administrative registers combined was 94% and sensitivity 92% when compared with all MONICA stroke categories together. For stroke attacks (first and recurrent events together), the PPV in the administrative registers was 85% and sensitivity 91%. The PPV was higher in women than in men, whereas the sensitivity was similar. The PPV was lower but sensitivity higher in people below compared with those above 75 years of age. Both PPV and sensitivity were lower among fatal cases than among cases that survived 28 days. Conclusions: After refinement, Swedish national administrative registers may, with some caveats, be used as a low-resource-consuming alternative to crudely monitor stroke incidence rates at the national level. If further accuracy is strived for, high-quality conventional epidemiological registers are required.
Provided that they are thoroughly validated, administrative registers could be cost-effective tools for surveillance of common diseases to support health care policy decisions, public health research, or health care quality evaluation and research. The alternatives, disease-specific registers, are usually of high validity and provide more in-depth information, but they are very resource consuming, not least if the goal is to cover a whole nation.
An example of how administrative registers have been used to monitor national incidence rates (first-ever events) and attack rates (all events) is acute myocardial infarction. By combining hospital discharge data and cause of death records, and after thorough validation, the routine administrative registers in the Nordic countries have been found to be precise enough to monitor myocardial infarction incidence and attack rates in the population [2,3,4].
The accuracy of stroke diagnoses in administrative registers has been validated in Denmark [5,6], Finland [7,8], Norway and Sweden [10,11] using data from the 1980s and 1990s. Generally, these studies have shown a considerable proportion of over- as well as underdiagnosis, limiting the usefulness of the registers for surveillance and research purposes. Most of these authors have concluded that the registers cannot be used for monitoring stroke at the population level without further validation studies.
Since 2000, the quality of stroke care has improved considerably in Sweden, as shown by the national stroke quality register Riks-Stroke. We hypothesize that the quality of stroke diagnoses in administrative registers has also improved, to the extent that the registers may now be used for surveillance and research.
In this study, the analyses of the administrative Swedish National Patient Register (NPR), which comprises all hospital discharges, and the Swedish Cause of Death Register (CDR) have been refined to provide high-validity data on stroke events at the national level. Stroke events in the two administrative registers have been validated against a well-established population-based stroke register, the Northern Sweden MONICA register [13,14]. The validation concerns both a measure of incidence (first-ever events) and the attack rate (first and recurrent events together) derived from the administrative registers.
Administrative register-based cases of stroke were obtained for the year 2004 from the Swedish NPR and CDR at the National Board of Health and Welfare. The Swedish NPR comprises all admissions to hospital care. At the time this study was performed, it was possible to record 1 primary diagnosis and up to 7 secondary diagnoses. The personal identification number makes it possible to follow a patient between different hospitals and over time. In 2004, the number of admissions with a missing personal identification number was below 0.5% among stroke patients in the two counties involved in this study. In the same year, at least a primary diagnosis was registered for more than 99% of the hospital stays for acute somatic care.
In the CDR, all deaths among persons registered as Swedish residents are recorded. Diagnoses are coded as the underlying cause of death and up to 20 contributing causes. About 0.5% of all deaths in the Swedish CDR lack a diagnosis for underlying cause of death. For the year 2004, all persons had a valid personal identification number in the CDR.
To ascertain all cases of acute stroke (including stroke as a secondary diagnosis, e.g. associated with surgery), admissions or deaths with any diagnosis of hemorrhagic stroke (ICD-10 I61), ischemic stroke (I63) or acute stroke not specified as hemorrhage or infarction (I64), from all hospitals as well as from general practitioner reports, were used in this study. Subarachnoid hemorrhages (I60) were not included.
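The case selection described above can be illustrated with a minimal sketch. It assumes that each admission or death record carries a list of ICD-10 code strings covering all diagnosis positions; the function name `is_acute_stroke` and the record layout are hypothetical and are not taken from the registers themselves.

```python
# ICD-10 three-character categories named in the text as acute stroke.
# I60 (subarachnoid hemorrhage) is deliberately absent.
STROKE_CATEGORIES = {"I61", "I63", "I64"}


def is_acute_stroke(diagnoses):
    """Return True if any recorded diagnosis falls in I61, I63 or I64.

    `diagnoses` is a hypothetical list of ICD-10 code strings holding the
    primary and all secondary diagnoses of an admission, or the underlying
    and contributing causes of a death. Subcodes such as "I63.9" or "I639"
    are matched on their first three characters.
    """
    return any(code.strip().upper()[:3] in STROKE_CATEGORIES for code in diagnoses)


# A stroke coded only as a secondary diagnosis is still selected.
print(is_acute_stroke(["S72.0", "I63.9"]))  # True
# Subarachnoid hemorrhage alone is not selected.
print(is_acute_stroke(["I60.1"]))           # False
```

Matching on any diagnosis position, rather than on the primary diagnosis only, mirrors the stated aim of also capturing stroke recorded as a secondary diagnosis.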
The Northern Sweden MONICA Stroke Register
The Northern Sweden MONICA register was initiated in 1985 as a part of the WHO MONICA Project [15,16] and has since then continuously monitored cardiovascular events (myocardial infarction, stroke, and sudden cardiovascular death). In short, stroke was defined according to the WHO criteria [17,18] and stroke events were registered in the same standardized way during all years according to the MONICA manual [15,16]. The register covers the two northernmost counties in Sweden, Västerbotten and Norrbotten, with a total population of 510,000.
On a regular basis, MONICA data on stroke events are collected only for the ages of 25–74 years. For this study, however, all medical records from hospitals and nursing homes and all death certificates with a stroke-related diagnosis were reviewed and recorded also for ages between 20 and 25 and ages above 74. In this way, all events in institutional care and all deaths were evaluated with regard to any of the MONICA acute categories.
In the MONICA register, each individual case with a clinical diagnosis of acute stroke except subarachnoid hemorrhage (i.e. ICD codes I61, I63 and I64), as well as each case with an acute nonstroke diagnosis under which misclassified stroke events may be found (G45–46, I60, I62, I65–69 and R96–99), is evaluated against strict MONICA diagnostic criteria for acute stroke [13,15,18]. A detailed community-based validation study with emphasis on identifying stroke events occurring out of hospital has been performed. The results of extensive quality controls are available at the WHO MONICA Project website.
All suspected stroke events were evaluated by a small group of specially trained research nurses and, in uncertain cases, together with a stroke physician.
In accordance with the MONICA criteria, acute stroke events were subdivided into definite stroke (acute focal signs supported by findings at brain imaging by CT or MRI), possible stroke (typical acute focal symptoms or signs without supportive findings by brain imaging), unclassifiable stroke (no diagnosis except stroke to explain the event but insufficient information on symptoms; this category was mainly applicable to fatal cases) and stroke from noncardiovascular causes [13,16]. Events reviewed by the MONICA team but assessed as other than acute stroke (e.g. transient ischemic attacks, nonacute cerebrovascular disorders and symptom diagnoses without any evidence of an acute stroke) were categorized as ‘not stroke’.
Fourteen stroke events (0.6%), of which 7 were first-time events, recorded in the administrative registers were not evaluated by the MONICA team due to missing death certificates or missing medical records. As it could not be determined whether the reason for this was an error in the administrative registers or in the MONICA data collection process, these events were not included in the analyses.
Definitions and Refinements
To test the impact on validity, all stroke events and first-ever stroke events were analyzed separately. A ‘stroke attack’ was defined as any acute stroke event, whether first or recurrent. In accordance with the procedure used in the WHO MONICA Project [13,16], admissions and deaths within a time span of 28 days (days 0–27 from the day of onset) were assigned to a single attack. This delineation was employed with two exceptions: (1) when a patient was admitted to hospital for more than 28 days with a secondary diagnosis for stroke and the patient died on the day of discharge with stroke as a cause of death, only one attack was recorded; the date of death was recorded as the day of attack; (2) when a patient was admitted to acute hospital care for more than 28 days with a stroke diagnosis and transferred to nonacute care where acute stroke was registered as a diagnosis, only the first admission was counted as an attack.
To assess the risk of overestimating the attack rate, the analyses were repeated with all admissions and deaths within 90 days (instead of 28 days) assigned to a single attack.
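The basic delineation rule (days 0–27 from onset belong to one attack) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually run on the registers: the function name `attack_onsets` is hypothetical, the input is assumed to be the admission and death dates of a single person, and the two exceptions for hospital stays longer than 28 days are not implemented.

```python
from datetime import date


def attack_onsets(event_dates, window_days=28):
    """Group one person's stroke records into attacks.

    The earliest record opens an attack (day 0). Later records on days
    0 .. window_days - 1 from that onset are merged into the same attack;
    a record on day `window_days` or later opens a new attack.
    Returns the list of onset dates, one per attack.
    """
    onsets = []
    for d in sorted(event_dates):
        if not onsets or (d - onsets[-1]).days >= window_days:
            onsets.append(d)
    return onsets


records = [date(2004, 1, 1), date(2004, 1, 20), date(2004, 1, 29), date(2004, 3, 15)]

# 28-day rule: 29 January is day 28 after 1 January, so it opens a new attack.
print(len(attack_onsets(records, window_days=28)))  # 3
# 90-day rule: all four records fall within 90 days of 1 January.
print(len(attack_onsets(records, window_days=90)))  # 1
```

Widening the window can only merge attacks, never split them, which is why the 90-day variant yields the same or a lower attack count than the 28-day rule.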
First-ever (incident) stroke was defined as the first appearance in any of the administrative databases or the MONICA stroke register in 2004 and no previous stroke admission in NPR for the years 1987 (the first year available in NPR) to 2003.
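A minimal sketch of this lookback definition is given below. It assumes that the 2004 events are available as (person identifier, onset date) pairs and that the persons with any stroke admission in the NPR during 1987–2003 have already been collected into a set; the names `first_ever_events`, `events_2004` and `prior_stroke_persons` are hypothetical.

```python
from datetime import date


def first_ever_events(events_2004, prior_stroke_persons):
    """Return {person_id: onset_date} for first-ever strokes in 2004.

    events_2004          -- iterable of (person_id, onset_date) pairs
    prior_stroke_persons -- set of person_ids with any stroke admission
                            in the lookback period (1987-2003 in the text)
    A person with a prior stroke admission contributes no first-ever event;
    for everyone else only the earliest 2004 event counts.
    """
    first = {}
    for person_id, onset in events_2004:
        if person_id in prior_stroke_persons:
            continue  # recurrent stroke, not an incident case
        if person_id not in first or onset < first[person_id]:
            first[person_id] = onset
    return first


events = [
    ("A", date(2004, 5, 2)),
    ("A", date(2004, 2, 10)),  # earlier event for the same person
    ("B", date(2004, 7, 1)),   # B had a stroke admission before 2004
]
print(first_ever_events(events, prior_stroke_persons={"B"}))
# {'A': datetime.date(2004, 2, 10)}
```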
Positive predictive values (PPVs) and sensitivities were calculated as measures of validity. Here, the PPV is the proportion of stroke cases in the administrative registers that were assigned to any of the MONICA stroke categories ‘definite or possible stroke’ and ‘definite or possible or unclassifiable or stroke from other causes’, respectively. For the same MONICA categories, the sensitivity is the proportion of all stroke cases recorded by MONICA that were also diagnosed as stroke by any of the administrative registers.
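The two measures can be written out as a short sketch. It assumes that register events and MONICA-confirmed events have already been linked, so that the same event carries the same identifier in both sources; the function name and the toy identifiers are hypothetical.

```python
def ppv_and_sensitivity(register_events, reference_events):
    """Compute PPV and sensitivity from two collections of event identifiers.

    register_events  -- events recorded as stroke in the administrative registers
    reference_events -- events classified as stroke in the reference register
                        (for whichever MONICA categories are being compared)
    Both collections must be non-empty.
    """
    register_events = set(register_events)
    reference_events = set(reference_events)
    confirmed = register_events & reference_events  # found in both sources

    ppv = len(confirmed) / len(register_events)           # share of register cases confirmed
    sensitivity = len(confirmed) / len(reference_events)  # share of reference cases captured
    return ppv, sensitivity


# Toy data: 4 register events, 5 reference events, 3 in common.
ppv, sens = ppv_and_sensitivity({1, 2, 3, 4}, {2, 3, 4, 5, 6})
print(ppv, sens)  # 0.75 0.6
```

Changing which MONICA categories are counted as stroke changes only `reference_events`, which is how the same register data give different PPVs and sensitivities for ‘definite or possible stroke’ and for all categories combined.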
Number of Events in the Administrative Registers and in the MONICA Register
During the year 2004, 2,311 acute stroke attacks occurring in Västerbotten and Norrbotten counties were extracted from the administrative registers. Of these, 279 (12.1%) were registered solely in the CDR. In the same year, the Northern Sweden MONICA center recorded 2,166 acute stroke events. An additional 1,368 events were reviewed by the MONICA team but recorded as ‘not stroke’.
Among the 2,311 acute stroke events in the administrative registers and 2,166 events identified by MONICA, 1,528 (66.1%) and 1,562 (72.1%), respectively, were first-ever strokes (incident cases). Of the first-ever events in the administrative register, 102 (6.7%) were derived from the CDR only.
Positive Predictive Values
When compared with all stroke categories in the MONICA stroke register, the PPV was 88.1% for stroke events registered in the NPR only and 62.0% for events registered in the CDR only (table 1). The PPV for stroke attacks in either of the two administrative registers was 85.0%. If the 14 events that were recorded in the administrative registers but missing from the MONICA evaluation were included in the analysis, the PPV would be reduced to 84.5%.
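The size of this reduction can be reproduced approximately from the published figures. The sketch below rests on two assumptions that the text does not state explicitly: that the 2,311 register attacks include the 14 unevaluated events, and that these 14 are counted as not confirmed. The confirmed count is back-calculated from the rounded PPV and is therefore itself approximate.

```python
total_register_attacks = 2311  # from the text
unevaluated = 14               # from the text
reported_ppv = 0.850           # from the text (rounded to one decimal)

# Assumption: the 14 unevaluated events are part of the 2,311.
evaluated = total_register_attacks - unevaluated

# Approximate number of confirmed attacks implied by the rounded PPV.
confirmed = round(reported_ppv * evaluated)

# Assumption: all 14 unevaluated events are treated as false positives.
ppv_with_unevaluated = confirmed / total_register_attacks
print(f"{100 * ppv_with_unevaluated:.1f}%")  # 84.5%
```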
Among first-ever events (table 2), 94.0% in the NPR and 87.3% in the CDR were also first-ever events according to the MONICA evaluation. The PPV for first-ever events in the two administrative registers together was 93.6% when compared with first-ever events in the MONICA register (all categories together).
Sensitivity
As shown in table 4, the sensitivity of the administrative registers to identify first-ever cases of ‘definite or possible stroke’ recorded in the MONICA register was 93.6%. The corresponding proportion was 91.5% when compared with all MONICA stroke categories combined.
Possible Refinement by Using 90-Day Stroke Attacks
When the upper time limit for recording one attack was extended from 28 to 90 days, the number of acute stroke attacks in the administrative registers was reduced to 2,195 (94.4% of the 28-day number).
When compared with 28-day attacks in the administrative registers (all events, both first-ever and recurrent), the PPV versus all MONICA categories together increased from 85.0 to 88.0% when the 90-day limit was applied, but the sensitivity decreased from 90.7 to 88.8%.
Subgroup Analyses
Subgroup analyses by sex, age group, type of event (first event vs. all events) and fatal versus nonfatal stroke are shown in table 5.
When all stroke attacks, whether first or recurrent, were analyzed together, the PPV was higher in women than in men, the difference reaching statistical significance when comparing with all MONICA categories combined. The sensitivity was very similar in men and women.
The PPV was similar in the 20- to 74-year and ≥75-year age groups when the administrative registers were compared with the MONICA category ‘definite or possible stroke’ but significantly lower in the younger age group when compared with all MONICA stroke categories combined (table 5). On the other hand, the sensitivity of the administrative registers was significantly higher in the younger age group.
When first-ever events were compared with stroke attacks (first and recurrent stroke combined), the PPV in the administrative registers was 9 percentage points higher for first-ever events, reaching 93.6% versus all MONICA stroke categories. The sensitivity was essentially the same whether first-ever events only or all stroke events were analyzed.
Both PPV and sensitivity were lower among fatal cases than in people who survived 28 days. This pattern was particularly obvious for all fatal stroke events (PPV 78.1% for fatal vs. 87.2% for nonfatal events when compared with all MONICA categories together). The sensitivity was also lower in fatal than in nonfatal cases when compared with all MONICA stroke categories combined (table 5).
The present study shows that, when information from the two Swedish administrative registers NPR and CDR is combined and refined, the PPV and the sensitivity for acute stroke events are high. Some of our observations indicate, however, that caution should be exercised when using the registers for monitoring stroke events in the population. First, the precision in terms of PPV and sensitivity is substantially higher when first-ever stroke events rather than all stroke events (first and recurrent) are recorded. Second, the PPV and sensitivity are considerably lower in fatal than in nonfatal events. Third, there are differences in PPV and sensitivity by age group and, to some extent, by sex.
International comparisons and national statistics on stroke are usually limited to mortality statistics. Our analyses show that mortality data that rely solely on a cause of death register are of limited validity. Adding information on nonfatal cases recorded in the NPR markedly improves both PPV and sensitivity. It also provides more useful surveillance data on the community burden of stroke and permits temporal changes to be dissected into changes in attack rates and case fatality.
By using personal identification numbers, we were able to trace a patient treated in more than one hospital for one and the same stroke event. In our refinement procedure, we have included additional details that serve to avoid double registration of events with long hospital stays. When the analyses of administrative registers were further refined by selecting first-ever stroke events only, both PPV and sensitivity reached levels well above 90% when compared with all MONICA stroke categories together, making the registers suitable for monitoring of stroke incidence (although with some caveats, as discussed below). Absence of stroke in the NPR during the previous 17 years was used to define first-ever events. In view of the low risk of recurrent stroke after the first 3 years and the low long-term survival after stroke, it seems unlikely that recurrent strokes occurring more than 17 years after a previous stroke would have any significant impact on the present results.
The present data indicate that the validity of Swedish administrative registers to identify acute stroke is better than what has been reported in previous studies based on administrative register data from the 1980s and 1990s, particularly when first-time (incident) stroke events are monitored. The validity reported here also seems better than that reported from Denmark [5,6], Finland [4,8] and Norway, all using administrative data from the 1990s. The improved validity in the present study may be due to better access to brain imaging over time, but it may also be the result of our refinements of the administrative stroke data. Our conclusion is that, at least from 2004 (the year covered by the present validation study) and with some caveats, stroke events in the Swedish administrative registers have sufficient validity to be used for monitoring stroke incidence at the population level.
For stroke attacks (first and recurrent events), the validity was somewhat lower than for first-ever events. We explored how extension of the limit for counting a new attack from 28 to 90 days affected PPV and sensitivity. An increase in PPV was counterbalanced by a decline in sensitivity of almost the same magnitude. There seems to be no obvious reason to deviate from the 28-day limit presently used.
Our validation of the administrative registers was performed against the Northern Sweden MONICA stroke register. In this register, each individual case of possible stroke is evaluated against the WHO MONICA criteria [13,18]. The Northern Sweden MONICA stroke register fulfills 8 out of 9 criteria related to definitions and data collection procedures for an ‘ideal’ stroke epidemiology study, as defined by Sudlow and Warlow. The criterion ‘prospective study design, ideally with hot pursuit of cases’ is partly fulfilled in that the registration of data has been prospective (since 1986) but cold pursuit has been used, i.e. cases have been identified at hospital discharge (or the corresponding point for out-of-hospital cases) or at death. Hot pursuit means that cases are recorded at hospital admission. It has been argued that this makes the distinction between transient ischemic attack and stroke more precise. A possible limitation of using MONICA as a reference is therefore that some cases of stroke may have been misclassified as transient ischemic attack (and not included in the register) and vice versa (falsely included).
Despite being close to the ‘ideal’ criteria by Sudlow and Warlow, a possible limitation remains that the coverage, in particular for fatal cases, may have been incomplete in the MONICA register. A detailed community-based validation study with emphasis on identifying stroke events occurring out of hospital showed that 4% of all stroke events were not recorded in the Northern Sweden MONICA study. In the present study, this would result in a slight underestimate of the PPV of the administrative registers. A further caveat is that the MONICA register covers only 2 of the 21 counties in Sweden and that local variations in the diagnosis of acute stroke in routine clinical practice may limit the generalizability of the present data. The design of administrative registers is usually country specific. Although the present study may give some indication of how administrative registers may be used to monitor stroke at the population level, each country has to validate its own registers.
We conclude that the two Swedish administrative registers NPR and CDR used together with additional refinements provide a 94% PPV and a 92% sensitivity for first-ever stroke events. This may be a cost-effective alternative to more resource-consuming epidemiological stroke registers to crudely monitor stroke incidence rates over time. This approach does not have perfect accuracy and must be used with several caveats. It is, however, clearly superior to using stroke mortality alone to monitor stroke at the national level. With some additional caveats, the combined and refined administrative registers may also be used for monitoring of crude stroke attack rates (first and recurrent events together). It seems evident that, if further accuracy is strived for, high-quality conventional epidemiological registers are required. The conclusions rest on the assumption that the Northern Sweden MONICA stroke register serves as an accurate reference; this assumption is supported by several quality control studies.
The Northern MONICA Study has since its start in 1985 been supported by the county councils of Västerbotten and Norrbotten and by multiple noncommercial research funds.
Conflicts of Interest
M.K. and B.S. are employed by the National Board of Health and Welfare that administers the Swedish NPR and the CDR. The authors have no other potential conflicts of interest.