Abstract
Background: There are many failures in treatment development for Alzheimer’s disease (AD). Some of these failures are the result of development programs that lacked critical information about candidate drugs as these were advanced from one phase of development to the next. Translational scoring (TS) has been proposed as a means of increasing the rigor with which treatment development programs are executed. Previously, these approaches were not specific to AD or to the phase of drug development. Detailed information on the characteristics needed to advance a candidate agent from one phase to the next is the basis for success in subsequent phases. Summary: The TS approach is presented with a score range of 0–25 for agents entering phases 1, 2, and 3 of development and those that have completed phase 3 and are being considered for regulatory review. Each phase has 5 essential categories scored from 0 to 5 indicating the completeness of the data available when the agent is being considered for promotion to the next phase. Lower scores suggest that the development program should be reexamined for missing information while higher scores increase the confidence that the agent has the potential to succeed in the next phase. Scoring guidelines and examples of scores for drugs in recent development programs are provided to illustrate the principles of TS. Key Messages: Successful development of drugs for AD treatment requires disciplined informed decision-making at each phase of development. TS is a methodology for more rigorous drug development to help ensure that inadequately characterized drugs are not advanced and that the development platform at each phase is optimal to support success at the next phase.
Introduction
Alzheimer’s disease (AD) is a progressive neurodegenerative disorder with cognitive, functional, and behavioral manifestations. AD progresses from asymptomatic preclinical phases, through prodromal and mildly symptomatic phases, to mild, moderate, and severe dementia [1-3]. Pathologically, AD is characterized by amyloid aggregation in the form of neuritic plaques, tau protein hyperphosphorylation, and neurofibrillary tangles as well as neurodegeneration with synaptic loss and neuronal death [4]. Comorbid pathologies are commonly present including neuroinflammation, vascular changes, and other types of protein accumulation including TDP-43 and α-synuclein [5]. AD becomes more common with aging and is rapidly increasing in prevalence with the greater longevity and increasing number of older individuals in the world’s population. There are currently 50 million individuals with AD globally, and this is expected to burgeon to 131 million by 2050 [6]. There will be a concomitant staggering increase in the toll on patients, families, and society.
There is an urgent need for new therapies that will prevent or delay the onset, slow the progression, or improve the symptoms of AD. A drug introduced by 2025 that delays the onset of AD by 5 years would decrease the prevalence of AD by 41% and the cost of the disease by 40% in 2050 [7]. Currently available treatments include cholinesterase inhibitors (donepezil, galantamine, and rivastigmine), an N-methyl-D-aspartate antagonist (memantine), and a fixed combination of donepezil and memantine (Namzaric®). These agents transiently improve symptoms and delay decline but have limited benefit and do not alter the progression of the disease. Despite hundreds of clinical trials of promising agents, there has been only 1 novel agent approved for the treatment of AD since 2003 (GV-971 was approved in China in 2019 and is available only in China), and the failure rate of AD drug development exceeds 99% [8].
There are many factors contributing to the lack of success of AD drug development. The complex biology of AD is difficult to reduce to a single target whose manipulation will impact the broad range of pathologies and symptoms; the drugs tested may be flawed by lack of efficacy or unacceptable adverse events; the biomarkers available and used in development programs may have been inappropriate as reporters of drug action; the populations tested in trials may have been too late in their disease course or in the wrong phase of the disease to benefit from the mechanism of action (MOA) of the test agent; and the trial may not have been well conducted, so that the data collected failed to inform on drug efficacy [9-12].
Translational medicine aims to integrate nonclinical observations and data from translational tools, like biomarkers, with clinical observations derived from trials to inform drug development [13]. Translational medicine provides an approach to improve drug development strategies for AD. Checklists have been developed as translational tools to promote preemptive error management in drug development [14]; scoring of drug attributes represents a next step in identifying optimal or deficient drug features and guiding the drug development process. Translational scoring (TS) is a tool of translational medicine and has been proposed as a means of assessing the readiness of drugs in the pipeline to be advanced for further testing [15-17]. Here, TS is discussed and its application specifically to AD drug development is described. TS will evolve as better targets, improved agents, a more comprehensive repertoire of biomarkers, better definitions of trial populations, and optimized trial conduct are achieved. This paper provides a foundation onto which these advances can be mapped. Successful development programs for agents approved by the US Food and Drug Administration (FDA) and other regulatory agencies will be particularly influential in the TS of drug development programs; drug development successes will comprise a standard against which to modify the TS approach to best predict translational and developmental success. The TS framework described here is designed to be sufficiently elastic to accommodate evolving information.
TS in AD Drug Development: An Overview
Here, TS is applied to the development of disease-modifying therapies (DMTs) for AD. Alternate TS frameworks will be required for cognition-enhancing agents or drugs targeting behavioral symptoms in AD. DMTs have comprised the majority of agents tested in trials in the last several years; the incorporation of animal models in nonclinical assessments and biomarkers for patient selection, proof of mechanism, and disease modification is more advanced in DMT development programs than for cognition-enhancing or psychotropic agents [18-22]. TS was not developed specifically for AD therapies [15-17] and is presented here for the AD-specific context. TS was also not originally conceived as being phase-related and is here rendered in phase-defined tranches since candidate therapies must be serially reassessed as they progress along the pipeline.
TS is a scoring system that systematically assesses key determinants of translational success, such as biomarkers, dose observations, and safety, with the intention of identifying deficiencies in information and determining the readiness of agents to progress to the next stage of development [15]. TS is the semi-quantitative application of translational medicine and is appropriate for biopharma, venture capital, philanthropic, federal, and academic drug development programs. TS allows the prioritization of agents to be advanced based on the likelihood of their success in the next phase.
In the current approach, the TS consists of a maximum score of 25 based on 5 items scored from 0 to 5. Tables 1-4 provide examples of how scores are generated. A score of 5 means the agent has excellent pharmaceutical characteristics and has been rigorously assessed. Lower scores may mean that the agent has some undesirable features or was not comprehensively evaluated; either circumstance increases the risk of failure at the next stage of testing if the agent is advanced. It is not possible to give specific examples of each potential scoring situation; rather, the examples in the tables provide guidance for the application of scores. There is no “cut-off” score for when a compound is likely to succeed if advanced to the next phase of development. The scoring provides insight into the completeness of the data generated for each phase and its readiness to be advanced to the next phase. Lower scores indicate a higher risk of failure at the next level. In some cases, a compound might have a score suggestive of future success but 1 element of the development is incomplete; this should be addressed prior to advancing the agent to the next level.
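The scoring arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration and not part of the published TS method: the function name `total_ts`, the category labels, the restriction to integer scores, and the `weak_below` threshold are all hypothetical. The threshold in particular is an assumption, included only to show how a single weak item can be surfaced even when the total looks favorable.

```python
def total_ts(items, weak_below=3):
    """Sum five category scores (each 0-5) and list the weak categories.

    `weak_below` is an illustrative assumption; the TS approach itself
    defines no cut-off score. Integer scores are also assumed here.
    """
    if len(items) != 5:
        raise ValueError("TS uses exactly 5 categories per phase")
    for name, score in items.items():
        if not isinstance(score, int) or not 0 <= score <= 5:
            raise ValueError(f"{name}: score must be an integer from 0 to 5")
    total = sum(items.values())
    weak = [name for name, score in items.items() if score < weak_below]
    return total, weak


# Hypothetical agent with made-up scores, used only to exercise the function.
scores = {
    "category A": 5,
    "category B": 4,
    "category C": 5,
    "category D": 1,
    "category E": 5,
}
total, weak = total_ts(scores)
print(total, weak)  # 20 ['category D']
```

Here the total of 20 would look encouraging on its own, while the flagged category points to the incomplete element that should be addressed before the agent is advanced.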
TS reflects only the phase immediately prior to the calculation and the scores are not additive across phases. For example, nonclinical data are included only at the juncture of nonclinical and clinical development when an agent is being considered for testing in phase 1. The nonclinical data are not reintroduced into the TS at later phases. A repurposed agent might enter development in phase 2 as an AD therapy based on epidemiologic or patient observations without a strong AD-specific nonclinical assessment [23]. The TS 2 score for repurposed agents would be based on the phase 1 data available and would not reflect the lack of a broad range of nonclinical assessments. The sponsor will continue to consider the strength of each previous stage as a candidate drug progresses through the development cycle. New nonclinical data may be discovered internally or in other laboratories after a compound is in the clinic that might influence development decisions, but this would not lead to recalculation in later phases. Development leaders continuously assess all available data and TS is one means of quantitating, prioritizing, and derisking the agent at any point in the development process.
TS is not a means of predicting the ultimate success and approval of an agent as a DMT. It is a means of determining the readiness of the candidate treatment to be advanced to the next level of assessment. TS is described here for treatments being assessed at the end of their nonclinical testing program, at the point of readiness to transition from phase 1 to phase 2, when ready to be advanced from phase 2 to phase 3, and at the end of phase 3 when being prepared for presentation to the regulatory authorities and considered for commercial development (Fig. 1). When applied in development programs, data for TS calculations are derived from the published literature, or, in the case of sponsors using TS to assess internal candidates, from the combination of published literature and proprietary information.
Not all aspects of drug development can be compressed into a single score, and TS will be complemented by other decision-support tools to determine if an agent is ready to be promoted to the next stage of development. By objectifying the compound characterization with TS, missing elements of development programs can be more easily identified, therapies at similar levels of development can be compared, and investment in agents (from within a sponsor company or from venture capital, granting agencies, or philanthropies) can be prioritized. Agents with higher scores are more thoroughly derisked and more likely to succeed at the next level of development. TS can function as an important tool in research portfolio management.
TS 1 for Compounds Being Considered for Phase 1 and Exiting the Nonclinical Phase of Development
To enter human clinical trials with features that predict success in further development, a potential treatment must have been characterized from a pharmacologic perspective, toxicity must be acceptable, the doses to be advanced must be determined, and there must be preliminary evidence of histologic and behavioral efficacy in experimental models relevant to AD [24, 25]. The models for efficacy are typically rodent species, usually transgenic (tg) mice [26, 27], although other species are used in some development programs [28]. Human induced pluripotent stem cells (iPSC) are increasingly employed in an attempt to better predict the behavior of the agent in humans using human-derived cells [29].
Pharmacokinetic (PK) characterization of the test agent includes absorption, distribution, metabolism, and excretion (ADME) [30]. Absorption, bioavailability, and blood-brain barrier (BBB) penetration must be known before progressing the agent to human testing, and the features must be compatible with use in humans to treat AD (e.g., BBB penetration demonstrated unless the mechanism of action is posited to depend on peripheral effects). Metabolic characterization includes how the molecule is metabolized, what metabolites are produced, and whether the metabolites are biologically active. Metabolites can contribute to both the efficacy and the toxicity of the administered agent. The liver is the major organ of drug metabolism and key aspects of hepatic metabolism to be understood are the role of the cytochrome P450 (CYP450) enzymes in the metabolism of the agent and whether it is an inducer or an inhibitor of these enzymes. Such observations are key to predicting drug-drug interaction liabilities of the candidate agent. The findings will determine if the dose of the agent may require adjustment in patients with hepatic compromise. Excretion is typically through the urine or feces, and so the percentage of the unchanged parent compound and its metabolites found in the urine and feces must be understood. Dosage adjustments may be required in patients with renal disease if the excretion is primarily through the kidney.
The toxicity of the test agent must be characterized in animals to avoid, as much as possible, the introduction of agents with toxic liabilities into human trials. Structural alerts of agents identified before animal testing are increasingly able to predict toxicity and are used to prevent testing in animals or humans [31]. Regulatory guidance requires that drugs be assessed in at least 2 species: rats are the species of choice for most studies, and cardiac toxicity is typically assayed in dogs since they are sensitive to cardiac effects and more likely to be able to predict cardiotoxicity in humans [32]. The most sensitive organs to drug-induced toxicity are determined by exposing the animals to escalating doses, and the “no observed adverse effect level” (NOAEL) dose is established. The NOAEL for the most sensitive species is used for further dose investigations and for determining the doses to be advanced to human testing in phase 1 clinical trials [32].
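As one illustration of how a NOAEL can inform a first-in-human dose, the sketch below applies the body-surface-area conversion commonly used for this purpose. It is not drawn from the TS framework itself. The Km factors (mouse 3, rat 6, dog 20, adult human 37) and the default 10-fold safety factor are the standard values given in the FDA 2005 guidance on estimating a maximum safe starting dose; the function names and the NOAEL values are hypothetical.

```python
# Km factors (body weight divided by body surface area) from the FDA 2005
# starting-dose guidance; all other names and numbers are illustrative.
KM = {"mouse": 3, "rat": 6, "dog": 20, "human": 37}


def human_equivalent_dose(noael_mg_per_kg, species):
    """Convert an animal NOAEL (mg/kg) to a human equivalent dose (mg/kg)."""
    return noael_mg_per_kg * KM[species] / KM["human"]


def starting_dose(noael_by_species, safety_factor=10):
    """Pick the most sensitive species (lowest HED) and apply a safety factor."""
    heds = {
        species: human_equivalent_dose(dose, species)
        for species, dose in noael_by_species.items()
    }
    most_sensitive = min(heds, key=heds.get)
    return most_sensitive, heds[most_sensitive] / safety_factor


# Hypothetical NOAELs, not taken from any real development program.
species, dose = starting_dose({"rat": 50.0, "dog": 10.0})
print(species, round(dose, 2))  # dog 0.54
```

In this made-up case the dog yields the lower human equivalent dose and therefore sets the starting dose, even though the rat was tested at a higher mg/kg level.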
Nonclinical efficacy employs animal model systems of some aspect of AD that is targeted by the test agent. Preliminary observations conducted in cell assays will have identified a mechanism of interest (e.g., the modulation of γ-secretase in assays of amyloid-beta (Aβ) production or cell survival in assays for neuroprotective agents). Animal models that allow assessment of the impact of the candidate on the putative mechanism are deployed to determine the histologic or behavioral effects of treatment. Animal models do not reproduce the entire spectrum of AD-related pathological changes; tg models created with amyloid-related mutations (e.g., amyloid precursor protein/presenilin 1 [APP/PS1]) allow for interrogation of the effects on amyloid metabolism, while tau-related mutations (e.g., P301S tg mice) facilitate investigating effects on tau biology [27, 33]. Models are created to serve as specific assays of the biological effect of interest. The timing of the intervention (before pathological changes occur in the model or after their development) determines if the intervention is more relevant for the prevention of disease or the treatment of existing disease. Observing similar biological effects across >1 model increases the confidence in the impact on the molecular mechanism under study. The rigor, reproducibility, and quality of the studies conducted are critical to the successful interpretation of the results and decisions about whether to advance the test agent to human trials [24, 34].
Behavioral assays are often used in the nonclinical characterization of candidate therapies, although their relevance in DMT development is debatable [35]. The Morris water maze (MWM) assesses hippocampal-type memory and is commonly used to determine if hippocampal function, relevant to human AD, is better preserved or enhanced in treated versus untreated tg animals and how the behavior compares to wild-type control animals. Similarly, novel object recognition (NOR), radial-arm water mazes, or fear conditioning can be used to explore learning and memory in treated versus untreated animals [35]. Animal model studies, histologic or behavioral, have not translated from success in animals to success in humans; however, advancing an agent to human testing without evidence of its efficacy in models is unwise. Both elements are included in TS 1.
In addition to the TS approach, animal models can be scored to determine the confidence to be placed in the model. Ferreira et al. [36] identified 8 parameters to be assessed in determining the degree to which animal model observations can be applied to human drug development including epidemiologic, natural history, genetic, biochemical, etiological, histological, pharmacological, and clinical/behavioral data. Integration of this rigorous assessment of animal models will strengthen the application of the TS approach. Treatment success in more fully validated models would be awarded a higher TS score.
Induced PSC represent an alternative or complement to animal model testing. Human fibroblasts are harvested and used to create PSC that can then be transitioned to neurons and grown in 3-dimensional cultures that recreate many aspects of the human brain [29, 37]. Introduction of human mutations leads to the production of amyloid and AD-related changes including the production of tau protein and tangle-like inclusions [29, 38]. These preparations can be used to assess the pharmacologic effects of proposed therapies [39]. iPSC may provide a more human-like means of assessing efficacy with possibly better predictive validity for treatment success in clinical trials. These nonanimal approaches can be captured in the TS calculation (Table 1).
TS 2 for Those Candidate Drugs Considered for Phase 2 and Exiting Phase 1
Phase 1 clinical trials comprise observations in single and multiple ascending dose (SAD and MAD) studies on cohorts of individuals receiving drug or placebo at each dose level. A well-conducted phase 1 program has addressed 5 major issues in drug development: (1) safety and tolerability in humans, (2) ADME PK features in humans, (3) doses including the maximum tolerated dose (MTD), (4) doses to be advanced to phase 2, and (5) BBB penetration in humans.
An unsafe compound cannot be advanced under any circumstances, and TS is not required for such agents. Unacceptable cardiac effects on electrocardiography (ECG), elevated liver functions including bilirubin, hematopoietic effects, or other adverse laboratory and organ system outcomes preclude further development of an agent, although back-up drugs or further medicinal chemistry manipulations can lead to related agents with acceptable adverse event profiles. Similarly, a marked lack of tolerability (severe nausea and vomiting, sedation, dizziness, headache, etc.) would lead to the termination of development of that agent. Many drugs, however, have more minor side effects that do not necessitate termination, and these can become part of the TS calculation. Cardiac effects that are uncommon but require ECG monitoring, or side effects that are acceptable but inconvenient, would lower the TS score and lead to ranking the compound at a lower level than agents without such features. Safety and tolerability are scored on a scale of 1–5 as shown in Table 2.
ADME in humans must be understood at the end of phase 1. Phase 1 may include only healthy young volunteers, or at least one cohort of elderly individuals in MAD studies to assess the impact of aging on PK parameters. The ADME features determine the acceptability of the PK profile of the agent and figure into the TS calculation (Table 2). Agents with half-lives compatible with once-daily dosing will be scored more highly than those requiring 2×/3× daily dosing. Agents metabolized by the CYP450 enzymes commonly involved in drug metabolism will receive lower scores as drug-drug interactions are more common with these drugs. Agents with metabolites that are metabolically active and require monitoring or are a potential source of toxicity will receive lower TS scores. Reduced absorption or delay of absorption via food can affect maximal concentrations (Cmax) and time to maximal concentration (Tmax), respectively, and this is determined in phase 1. If renal excretion is the principal means of elimination of an agent, levels can be elevated in patients with renal compromise and dose adjustments will be required.
A critical question to be answered in phase 1 is the extent to which the agent enters the brain. Nonanimal membrane studies (e.g., the Caco-2 permeability assay), rodent cerebrospinal fluid (CSF) and brain studies, and nonhuman primate (NHP) studies all provide insight into whether the agent is likely to cross the BBB in humans, but confirmation of this is necessary in phase 1. A failure to do so impacts TS since agents should not be advanced to phase 2 before this pharmacologic feature is known. Knowing the blood-to-CSF ratio helps inform dose decisions since CSF levels are used to approximate target exposure levels and should be in the dose range of efficacy for animal models (after adjustment by allometric scaling or other equivalency measures). BBB penetration can be supported by post-dose changes in electroencephalography (EEG) or functional magnetic resonance imaging (MRI), but neither of these approaches provides the quantitative information available from CSF sampling. Receptor occupancy studies using positron emission tomography (PET), with radioactive labeling of the agent itself or of a displaceable tool compound, provide evidence of BBB penetration but should be complemented by direct CSF measures of candidate agent levels.
The highest dose to be used at any point in the development program should be determined in phase 1. The highest dose may be the MTD, the dose determined by PET receptor occupancy, or a dose that represents the maximum feasible dose that can be administered. A common error in development programs is to fail to explore the full range of possible doses in phase 1 [10]. If the agent then fails in later phases of development for lack of efficacy, the sponsor will not know if the drug lacked efficacy or was not administered in sufficient doses.
Example 1
Cognition Therapeutics Inc. (Pittsburgh, PA, USA) performed a phase 1 trial on CT1812, an agent that protects synapses from Aβ protein toxicity in assays and models [40]. In the SAD study, 6 doses spanned 10–1,120 mg. The MAD study assessed 3 doses (280, 560, and 840 mg). Plasma concentrations were slightly greater than dose-proportional and showed minimal accumulation with repeated dosing. The agent crossed the BBB with CSF levels approximately 3% of plasma levels. No MTD was determined; the CSF levels of higher doses were comparable to those achieving high levels of receptor occupancy in tg animal models. CT1812 was generally safe and well tolerated; 4 subjects in the MAD study showed elevated liver function tests of <3 times the normal level (1 patient was on placebo). One subject developed a rash which resolved with treatment. TS for CT1812 would take these observations into account. TS would include an MTD score of 2, a dose support of 5, a BBB penetration of 5, a pharmacokinetic score of 5, and a safety and tolerability score of 3. This results in a total score of 20, compatible with advancing the agent to phase 2 for further assessment. CT1812 is currently in a phase 2 study of mild-to-moderate AD (NCT03507790).
Example 2
TPI-287 (abeotaxane) is a putative microtubule stabilizing agent [41] that was assessed in a phase 1 basket trial including patients with AD, progressive supranuclear palsy (PSP), or corticobasal degeneration (CBD) [42]. Three dose cohorts were advanced (2.0, 6.3, and 20 mg/m² intravenously [i.v.]). Three anaphylactoid reactions occurred in the 20-mg/m² AD group, establishing 6.3 mg/m² as the MTD for AD; 20 mg/m² was likely the MTD for PSP/CBD participants as 34% of them experienced falls. Pharmacokinetic studies showed dose-dependent increases in plasma concentrations at 5 and 60 mg. No drug was detectable in the CSF and BBB penetration was not established. Side effects were common, with participants reporting headaches, dizziness, constipation, diarrhea, and nausea. From the study, TS would comprise a 5 for MTD, 4 for doses to be considered for advancement, 0 for BBB penetration, 5 for pharmacokinetics, and 2 for safety and tolerability. The total score of 16 could provide limited support for advancing the agent to phase 2; the absence of evidence of BBB penetration would require resolution prior to phase 2 studies.
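The two phase 1 examples can be tallied as follows. The sketch only re-adds the category scores quoted in the examples for CT1812 and TPI-287. The dictionary keys and the helper name `summarize` are illustrative labels, and flagging any category scored 0 is an assumption chosen to mirror the point that a missing element should be resolved before advancement.

```python
# Category scores as quoted in the two phase 1 examples (each item 0-5).
TS2 = {
    "CT1812": {
        "MTD": 2,
        "dose support": 5,
        "BBB penetration": 5,
        "pharmacokinetics": 5,
        "safety and tolerability": 3,
    },
    "TPI-287": {
        "MTD": 5,
        "dose support": 4,
        "BBB penetration": 0,
        "pharmacokinetics": 5,
        "safety and tolerability": 2,
    },
}


def summarize(scores):
    """Return the total score and any category scored 0 (illustrative rule)."""
    total = sum(scores.values())
    unresolved = [name for name, value in scores.items() if value == 0]
    return total, unresolved


results = {agent: summarize(scores) for agent, scores in TS2.items()}
for agent, (total, unresolved) in results.items():
    print(f"{agent}: {total}/25, unresolved: {unresolved or 'none'}")
# CT1812: 20/25, unresolved: none
# TPI-287: 16/25, unresolved: ['BBB penetration']
```

The totals alone (20 versus 16) understate the difference between the programs; the unresolved BBB item for TPI-287 is what the text identifies as requiring resolution before phase 2.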
TS 3 for Candidate Agents Being Considered for Phase 3 and Exiting Phase 2
Phase 2 incorporates patients with the target illness (e.g., AD) in relatively small numbers (e.g., 100–400 per arm, depending on what outcome is being explored). The goals of phase 2 are to show target engagement, guide the sponsor on what dose or doses should be advanced to phase 3, and accrue additional information on safety and tolerability in AD. In some development programs, a clinical effect is chosen as the basis for a go/no go decision regarding advancing to phase 3. In other programs, a prespecified change in a biomarker or a group of biomarkers is the basis for the go/no go decision. Programs with a clinical benefit demonstrated in phase 2 are at a lower risk of failure in phase 3 and will have a higher TS score. Requiring a clinical benefit in phase 2 typically increases the required size and duration of the study, and sponsors may be satisfied with biomarkers as decision support tools.
Confirmation of AD diagnosis is important in phase 2 and phase 3 drug development programs that target the biology of AD. Amyloid imaging obtained from participants referred for a trial of mild-to-moderate AD dementia showed that 33% had negative amyloid PET scans and lacked the defining biological changes of AD [43]. Confirmation of AD can be achieved by amyloid PET or by showing CSF changes consistent with AD [44]. This diagnostic approach is applicable across the continuum of the asymptomatic preclinical phase of AD, prodromal AD, and AD dementia. Staging of AD using tau PET provides more insight into the underlying biology present in AD patients and a greater understanding of the stage of disease at which the agent is being assessed [45]. Agents tested in patients with confirmed AD receive a higher TS score than those assessed in patients with unvalidated cognitive disorders.
Demonstration of target engagement is critical as a phase 2 objective. Target engagement may be shown directly, e.g., with PET receptor occupancy studies or indirectly through proof-of-pharmacology [46, 47]. Measurement of CSF Aβ showed the pharmacologic efficacy of BACE inhibitors [48]. Demonstration of reduced Aβ production by using the stable isotope labeled kinetics (SILK) [49] technique demonstrated target engagement by γ-secretase inhibitors [50]. Increased Aβ fragments are observed in plasma and CSF with γ-secretase inhibitors and modulators and they represent proof-of-pharmacology of the action of these agents on Aβ [51]. Candidate target engagement/proof-of-pharmacology biomarkers include peripheral indicators of inflammation for use in trials of anti-inflammatory compounds. The predictive value of these biomarkers for changes in cognition has not been established; there are no validated surrogate markers for AD trials. Later in their development, both BACE and γ-secretase inhibitors were shown to impair cognition [52]. Thus, demonstrating target engagement does not guarantee clinical efficacy later in development, but target engagement is an important means of derisking a candidate drug by showing biological activity that may translate into clinical efficacy. Agents without evidence of target engagement will have a lower TS score and are deemed to have a higher risk of failure in phase 3.
Amyloid imaging with amyloid PET is a target engagement biomarker that can demonstrate a reduction in plaque amyloid [53]. Several monoclonal antibodies have shown a dose- and time-dependent plaque reduction. In an early phase 1B trial, aducanumab achieved both significant plaque reduction and benefit on some clinical measures with evidence of a dose-response relationship [54]. Bapineuzumab and gantenerumab decreased plaque Aβ but had no corresponding impact on cognition or function with the doses studied [55, 56]. Removal of plaque amyloid may be necessary but not sufficient for a therapeutic benefit of anti-amyloid agents, or it may be a marker of the engagement of a broad range of amyloid species including those required for a therapeutic response. Tau PET is a target engagement biomarker that can be used in trials of tau-related therapeutics [4]; reduced tau burden or reduced tau spread would indicate a therapeutic response. Aβ and tau signals do not measure neuroprotection and are not necessarily evidence of disease modification [57, 58].
A higher TS score in phase 2 can be achieved by demonstrating clinical benefit. The means of doing so will depend on the population included in the trials. For a phase 2 trial of mild-to-moderate AD, demonstration of clinical benefit on a standard instrument such as the Alzheimer’s Disease Assessment Scale-cognitive subscale (ADAS-cog) [59] is usually chosen but is not required, and alternatives such as the Neuropsychological Test Battery (NTB) [60, 61] are an acceptable means of capturing clinical benefit in phase 2. Global scales including the Clinical Dementia Rating Sum of Boxes (CDR-SB) that combine clinical and functional items can also provide evidence of clinical response [62]. The CDR-SB or NTB can be used in populations with mild cognitive impairment (MCI)/prodromal AD. Any valid and reliable cognition measure can be used in the calculation and no test is ranked higher than another in TS.
The clinical data included in TS must have been prespecified as primary or secondary outcomes in the phase 2 trial. Agents that fail in phase 2 have sometimes been advanced to phase 3 on the basis of a non-prespecified subgroup analysis [10, 11]. Such subgroup analyses do not contribute to the TS calculation and would decrease the likelihood of advancing agents with limited pharmacologic foundations.
Example 1
A phase 2a clinical trial of AZD0530 (a Fyn kinase inhibitor) in mild AD showed that, despite its promising effects in animals [63], the agent had no effect on cognition, function, behavior, brain glucose metabolism measured by fluorodeoxyglucose (FDG) PET, MRI, or CSF biomarkers of total tau or phospho-tau (p-tau) [64]. Participants were required to have mild AD dementia confirmed on positive amyloid imaging. Calculating TS would include a score of 5 for participant diagnosis, 0 for clinical outcomes and target engagement, 1 for dose response (2 doses included; no dose relationship observed), and 2 for safety and tolerability reflecting the 48% rate of gastrointestinal side effects and an excess of serious adverse events (SAEs) in those treated with AZD0530. A total score of 8 would predict a high risk of failure should the agent be advanced to phase 3.
Example 2
A phase 2 clinical trial of PQ912 recruited participants with mild cognitive impairment or mild AD dementia confirmed by a CSF biomarker signature consistent with AD [65]. PQ912 is an inhibitor of glutaminyl cyclase (QC), an enzyme that accelerates the formation of synaptotoxic pyroglutamate-seeded Aβ oligomers. Measures of CSF QC decreased from 118.5 mU/L at baseline to 46.0 mU/L at the end of the trial in the active treatment group, and no change was observed in the placebo group. The measure indicated 93.2% target occupancy and strong target engagement. Large effect sizes for the drug-placebo difference were observed for spectral analyses of oscillatory EEG and small effect sizes were seen for drug-placebo differences on the NTB [61] one-back test and CSF measures of neurogranin and YKL-40. The study would result in a TS score of 5 for participant diagnosis, 2 for clinical outcomes, 5 for biomarker outcomes, 3 for dose and dose response, and 3 for safety and tolerability. The total score of 18, including effects on biomarkers and a clinical outcome, supports advancing this agent to the next level of testing. PQ912 is being advanced to a seamless phase 2A/2B study (NCT03919162).
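The two phase 2 examples can be compared side by side, which is how TS might be used to prioritize agents at a similar level of development. The sketch below uses only the scores quoted in the examples. The shared category label for biomarker outcomes and target engagement, the tuple ordering, and the function name `rank` are assumptions made for illustration.

```python
# Order of the five TS 3 categories used in the tuples below (illustrative labels).
CATEGORIES = (
    "participant diagnosis",
    "clinical outcomes",
    "biomarker outcomes / target engagement",
    "dose and dose response",
    "safety and tolerability",
)

# Scores as quoted in the AZD0530 and PQ912 examples.
TS3 = {
    "AZD0530": (5, 0, 0, 1, 2),
    "PQ912": (5, 2, 5, 3, 3),
}


def rank(agents):
    """Total each agent's scores and sort from highest to lowest total."""
    for name, scores in agents.items():
        if len(scores) != len(CATEGORIES):
            raise ValueError(f"{name}: expected {len(CATEGORIES)} scores")
    totals = {name: sum(scores) for name, scores in agents.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = rank(TS3)
print(ranking)  # [('PQ912', 18), ('AZD0530', 8)]
```

A ranking of this kind supports comparison only; as noted earlier, there is no cut-off score, and the individual category scores still need to be inspected.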
TS 4 for Candidate Agents Exiting Phase 3 and Being Prepared for Regulatory Review
Phase 2 data set the stage for the determination of clinical and biomarker assessment in phase 3. Phase 3 outcomes and trial designs differ for prevention trials where participants are cognitively normal at the time of study entry and trials for prodromal AD or AD dementia where participants have clinical symptoms at baseline. In prevention trials, allowable designs might include biomarkers as primary outcomes with clinical outcomes as secondary outcomes. Accelerated approval could be based on the condition of long-term follow-up or other means of determining clinical benefit over time. The current Food and Drug Administration (FDA) guidance on clinical trials in early AD suggests that no current biomarker has been shown to have sufficient predictive power to provide the basis for accelerated approval [66]. A TS score for an end-of-phase 3 prevention trial would include drug-placebo differences on biomarkers indicative of disease modification and drug-placebo differences on stage-appropriate clinical measures such as the Preclinical Alzheimer Cognitive Composite (PACC) [67] or the Alzheimer Prevention Initiative Cognitive Composite (APCC) [68, 69].
Biomarkers in phase 3 are indicative of disease modification and differ from the target engagement biomarkers that are key in phase 2. Amyloid lowering on amyloid PET or tau reduction or containment on tau PET are target engagement biomarkers, but they can also contribute to a repertoire of biomarkers that in total might support the case for an impact on the basic biology of AD. True biomarkers of disease modification are those that demonstrate neuroprotection [57, 58]. Biomarkers of neurodegeneration include plasma or CSF neurofilament light chain levels, MRI, FDG PET, and CSF total tau (t-tau) [70, 71]. Biomarkers used in phase 3 may be incorporated in product labeling as companion or complementary biomarkers [21, 72]. Companion biomarkers are required for using a product, e.g., amyloid imaging with positive findings may be required prior to administering an anti-amyloid monoclonal antibody. A complementary biomarker is not required but may provide important information for management of a subgroup of patients, e.g., AD patients with the apolipoprotein E4 (ApoE4) gene are more vulnerable to the development of amyloid-related imaging abnormalities (ARIA), and the clinician may make use of genotyping to determine which patients require more vigilance for symptoms indicative of this adverse effect [73].
Biomarkers may be scored as a means of augmenting the TS approach. Wehling [74] proposed a biomarker TS methodology with scores ranging from 1 to 6, with 1 being a biomarker assessed on a single animal species and 6 being biomarkers that have achieved surrogate status and predict clinical outcomes. Similarly, Day et al. [75] proposed scores (1–3; for weak, intermediate, and strong supportive data) for biomarkers of target engagement, pharmacodynamics, and disease modification. Biomarkers qualified by the FDA or that have a proven surrogate status would increase confidence in the information generated from the observations [72, 76]. Assessing the validity of biomarkers as part of TS should improve the predictive value of TS.
Treatment trials of participants with prodromal AD or AD dementia must show a drug-placebo difference on primary clinical outcomes in order to be considered for approval. FDA guidance indicates that a single outcome measure such as the CDR-SB [62, 77] can serve as the primary outcome in patients with prodromal AD, although it is important to show that the drug-placebo difference is not entirely attributable to cognition but that both cognition and function have benefited to some degree, compared to placebo [78]. In patients with more advanced decline and manifesting with mild, moderate, or severe AD dementia, the dual outcome of cognition plus function or cognition plus a global scale must both show a drug-placebo difference in favor of active treatment [79]. The ADAS-cog [59] is the tool most frequently used to assess cognition in patients with mild or moderate AD; the Severe Impairment Battery is the tool commonly used in patients with severe disease [80]. The NTB has served as an alternative to the ADAS-cog in some trials, especially those with participants who are less severely affected [60, 61]. The instrument most commonly used to assess function in moderate-to-severe AD is the Alzheimer’s Disease Cooperative Study (ADCS) Activities of Daily Living (ADL) scale [81], with the Amsterdam Instrumental ADL being increasingly implemented to assess function in AD trials [82]. The CDR-SB [62, 77] is the global scale most commonly used in trials of putative DMTs in AD dementia. The interpretation of the portion of TS derived from clinical measures and biomarkers depends on knowing that the sample size was adequate to detect a drug-placebo difference, given the anticipated effect size. A low TS score would be assigned regardless of the cause of the absent or weak drug-placebo difference, but the interpretation would differ. Testing an agent in an adequately powered trial that showed no treatment effect would result in a discontinuation of the development program, while the absence of a drug-placebo difference in an underpowered study leaves open the possible decision to conduct a larger, adequately powered trial.
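The dependence of sample size on the anticipated effect size can be illustrated with the standard normal-approximation formula for a two-arm comparison of means. The sketch below is not a method described in this article. It assumes equal allocation, a two-sided test, a continuous outcome expressed as a standardized effect size (Cohen's d), and no dropout. The function name `n_per_arm` and its default values are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per arm to detect a standardized
    drug-placebo difference (Cohen's d) with a two-sided test.

    Uses n = 2 * (z_{1-alpha/2} + z_{power})**2 / d**2, the usual normal
    approximation; it ignores dropout and any covariate adjustment.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Smaller anticipated effects require much larger trials.
for d in (0.5, 0.2):
    print(f"d = {d}: about {n_per_arm(d)} participants per arm")
```

Under these assumptions a moderate effect (d = 0.5) needs about 63 participants per arm, whereas a small effect (d = 0.2) needs about 393. A trial enrolling far fewer participants than the anticipated effect size requires cannot be read as evidence of no treatment effect.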
Safety and tolerability are determined in all clinical trials and contribute to the TS of each phase of drug development. A key aspect of the final phase of drug development is whether the agent has met the target product profile (TPP) [83]. The TPP is a strategic development process tool recommended by the FDA [84]. The TPP is used throughout the development process from nonclinical testing to phase 3, and it is dynamically updated as knowledge of the candidate drug accrues. The TPP allows the developer to define the end goal of the development process and to then use the tool to systematically work toward the defined outcome throughout the nonclinical and clinical phases. Ideally, the final version of the TPP will be similar to the approved product labeling. The TPP defines the desirable features of a drug and its usage including primary indications, patient population, treatment duration, delivery mode, dosage, regimen, tolerability, risk/side effects, and differentiating features in a competitive landscape. Analyses of FDA submissions, from programs with and without a TPP, show reduced development cycle times and better success for those where the program incorporates the TPP methodology [85]. The TPP is incorporated in TS as a means of measuring the overall integrity of the development program and the likelihood of regulatory and commercial success. When there are DMTs approved for AD, the TPP will include comparisons of the efficacy, safety, and convenience of the candidate agent with those of the approved therapy.
Example 1
Verubecestat was assessed in a phase 3 study of prodromal AD [52]. This agent showed excellent PK/pharmacodynamic features in the earlier development phases with a dose-dependent reduction of CSF Aβ indicative of target engagement [48]. No meaningful toxicity was observed in phase 1 or phase 2. In phase 3, verubecestat was administered to patients with prodromal AD confirmed by amyloid imaging or CSF Aβ/tau measures. The agent exacerbated cognitive impairment and increased atrophy, consistent with functional toxicity. The TPP was not met. The TS calculation would result in a score of 5 for participant diagnosis and a score of 0 for cognitive outcomes, biomarker outcomes, safety, and TPP. A score of 5 would caution against further trials with this agent, at least until new data about the basis of the adverse reactions and their management become available.
Example 2
Aducanumab was advanced to 2 large phase 3 trials after a phase 1B trial suggested a dose-related reduction in fibrillar Aβ and a clinical benefit at the highest dose (10 mg/kg i.v. delivered monthly) [54]. The trials were terminated due to futility when both failed to show a clinical benefit in the prespecified interim analysis. Further follow-up of the cohorts showed that patients in both trials who had received the highest dose for at least 10 months showed a significant slowing on the CDR-SB scores [77] (the primary outcome of the trial), and a significant reduction in brain amyloid as measured by amyloid PET. ARIA occurred in approximately 35% of the participants (highest in those with the ApoE4 genotype). A TS score of 5 for participant diagnosis, 0 for clinical benefit on the prespecified analysis, 3 for impact on an informative biomarker, 2 for safety, and 3 for the TPP would result in a total score of 13. If no other observations were available, aducanumab would represent a high-risk candidate for further development. However, non-prespecified analyses suggesting clinical benefit consistently observed in groups treated at higher doses for longer periods of time support additional trials of this agent using the exposures (dose and duration) that suggest benefit in the phase 3 trials.
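Beyond the total, the category-level scores show where a program is weakest. The following sketch lists the low-scoring categories for the two phase 3 examples. The category labels, the function name `weak_categories`, and the cutoff of 2 are assumptions chosen for illustration; the article does not define a numeric cutoff for a "weak" category.

```python
# Illustrative labels for the five categories scored at the end of phase 3.
PHASE3_CATEGORIES = (
    "participant_diagnosis",
    "clinical_outcomes",
    "biomarker_outcomes",
    "safety_tolerability",
    "target_product_profile",
)


def weak_categories(scores, cutoff=2):
    """Return categories scoring at or below a cutoff (hypothetical value)."""
    return [name for name in PHASE3_CATEGORIES if scores[name] <= cutoff]


# Category scores as given in Example 1 (verubecestat) and Example 2 (aducanumab).
verubecestat = {
    "participant_diagnosis": 5,
    "clinical_outcomes": 0,
    "biomarker_outcomes": 0,
    "safety_tolerability": 0,
    "target_product_profile": 0,
}
aducanumab = {
    "participant_diagnosis": 5,
    "clinical_outcomes": 0,
    "biomarker_outcomes": 3,
    "safety_tolerability": 2,
    "target_product_profile": 3,
}

for agent, scores in (("verubecestat", verubecestat), ("aducanumab", aducanumab)):
    total = sum(scores[name] for name in PHASE3_CATEGORIES)
    print(agent, total, weak_categories(scores))
```

With this cutoff, verubecestat (total 5) is flagged in every category except participant diagnosis, while aducanumab (total 13) is flagged for clinical outcomes and safety.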
Discussion
Applying TS methodology for AD drug development is intended to support and improve decision-making in the selection of candidate agents for the next step in development and to ensure that the critical aspects of drug characterization (i.e., those needed to position the agent for success at the next stage) have been achieved. TS is one aspect of the disciplined drug development toolkit that can help reduce the failure rate of AD DMTs, especially in later phases, by insisting on full characterization at an early stage [11, 12, 86]. The TS approach presented here builds on a translational foundation and applies the methodology specifically to AD therapies differentiated by phase of development [15-17]. At the end of each phase, the candidate is assessed for the completeness of characterization and the decision to advance is based on the features discovered in a thorough assessment. TS is a semiquantitative means of capturing and summarizing these key features.
TS can be augmented in a variety of ways to assist in drug prioritization (Fig. 2). As noted, animal models can be scored to determine how much confidence can be placed in them to predict human responses [36], and biomarkers can be scored based on the data linking the biomarker to the underlying pathology of AD as well as the robustness, reproducibility, and sensitivity to change of the biomarker assays [72, 75]. Currently, there are no qualified or surrogate biomarkers recognized by the FDA. When emerging biomarkers are qualified and eventually shown to predict clinical outcomes, confidence in the use of biomarkers will be increased and TS will be adjusted to reflect this [87]. Drug development success is greater when the MOA of the agent is genetically linked to the disease target [88-90]. Incorporating this information along with TS into development decisions will further derisk the development process and increase the likelihood of success of well-chosen candidates. The strategic fit of the agent with the company’s portfolio and the competitive landscape anticipated at the time of drug approval are additional considerations not captured in TS.
Scoring of the candidate drug substance is not included in the TS approach but would be conducted as part of a comprehensive assessment prior to testing the drug in animal models. Aspects to be considered are Lipinski’s “rule of 5” which assesses the drug-likeness of the substance [91] and such features as solubility, drug delivery parameters, stability, manufacturability, and in vitro/in vivo relations (IVIVR) [92].
Excellence in drug development will lead to effective therapies only if the target of the therapeutic modulation is meaningfully related to altering the disease course. Bioinformatics and quantitative pharmacology present a means of scoring disease targets to assist with target choices [93]. These approaches can lead to scores that allow the ranking and prioritizing of candidate targets. The proposed scores utilize gene expression data, disease-specific network information, and drug-gene interaction data to improve the inference of target-disease associations and potentially efficacious therapeutic interventions [93, 94].
The TS system presented here provides a foundation that will evolve. Regulatory approval of DMTs will provide standards of success against which drug development processes can be compared and modified. Data from negative trials can be highly informative and will inform TS evolution [10]. Bioinformatic approaches applied to targets, biomarkers, drug characteristics, and drug-disease pathway proximity will facilitate matching drug processes to disease pathways and assist in developing a robust TS approach. Application of a multiattribute utility theory to develop weighting of TS elements may improve predictability [95, 96].
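One simple form that a weighting of TS elements could take is an additive utility: each category score is rescaled to 0–1 and multiplied by a weight, with the weights summing to 1. The sketch below is an assumption-laden illustration, not a weighting scheme proposed in this article. The function name `weighted_ts` and the example weights are hypothetical, and the additive form is only one of several possible multiattribute models.

```python
import math


def weighted_ts(scores, weights):
    """Additive utility on a 0-1 scale: sum of weight * (score / 5).

    Both the weights and the additive form are illustrative assumptions.
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same categories")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] / 5 for name in scores)


# PQ912 phase 2 category scores from the example earlier in the article.
pq912 = {
    "participant_diagnosis": 5,
    "clinical_outcomes": 2,
    "biomarker_outcomes": 5,
    "dose_response": 3,
    "safety_tolerability": 3,
}

# Equal weights reproduce the unweighted total as a fraction of 25.
equal = {name: 0.2 for name in pq912}

# Hypothetical weights that emphasize clinical outcomes.
clinical_heavy = {name: 0.15 for name in pq912}
clinical_heavy["clinical_outcomes"] = 0.40

print(round(weighted_ts(pq912, equal), 2))
print(round(weighted_ts(pq912, clinical_heavy), 2))
```

With equal weights the result is 18/25 = 0.72. Shifting weight toward clinical outcomes, where PQ912 scored 2, lowers the value to 0.64, showing how the choice of weights changes the ranking signal.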
TS is a decision support tool, not a decision maker. It must be integrated with other types of information to guide a final decision on whether to advance a candidate compound for further testing. The TS approach requires experience in drug development to assign the gradations of scoring; it is impossible to anticipate all circumstances that occur in the course of drug development and exhaustively specify the scores. The examples provided help guide use of the tool.
The value of TS is that the process allows the developer to identify the major factors at each phase of drug development that are not adequately characterized. Candidate agents can be compared and prioritized using TS, thereby allowing the funders, be they federal or state agencies, academic programs, philanthropists, venture capitalists, or pharmaceutical companies, to decide where their funds are best invested. TS can guide where resources should be invested to complete characterization of a promising agent. By allowing sponsors to devote resources to the most promising agents and avoiding the advancement of compounds that lack critical data, the drug development process can become more successful in yielding crucially needed new treatments for those suffering from or at risk of developing AD.
This article has been published in celebration of the 30th anniversary of the inception of Dementia and Geriatric Cognitive Disorders (1990–2020).
Acknowledgement
Dr. Cummings acknowledges support from Keep Memory Alive (KMA); COBRE grant # P20GM109025; TRC-PAD # R01AG053798; and DIAGNOSE CTE # U01NS093334.
Disclosure Statement
Dr. Cummings has provided consultation to Acadia, Actinogen, AgeneBio, Alkahest, Alzheon, Annovis, Avanir, Axsome, Biogen, Cassava, Cerecin, Cerevel, Cognoptix, Cortexyme, EIP Pharma, Eisai, Foresight, Gemvax, Green Valley, Grifols, Hisun, Karuna, MapLight, Novo Nordisk, Nutricia, Orion, Otsuka, ReMYND, Resverlogix, Roche, Samumed, Samus Therapeutics, Third Rock, Signant Health, Sunovion, Suven, and United Neuroscience pharmaceutical and assessment companies.
Funding Sources
This was an independent academic project.
References