Abstract
Genome-wide association studies (GWAS) have greatly expanded our understanding of the genetic architecture of cardiovascular diseases in the past decade. They have revealed hundreds of suggestive genetic loci that replicate known biological candidate genes and indicate the existence of a previously unsuspected new biology relevant to cardiovascular disorders. These data have been used successfully to create genetic risk scores that may improve risk prediction and the identification of susceptible individuals. Furthermore, these GWAS-identified novel pathways may herald a new era of drug development and patient stratification. In this review, we briefly summarize the literature on the candidate genes and signals discovered by GWAS on hypertension and coronary artery disease and discuss their implications for clinical medicine.
Introduction
A large majority of heart diseases are polygenic, that is, the result of a combination of multiple common genetic variants and environmental factors. Unravelling the genetic basis of heart diseases has proceeded slowly through linkage studies, because of the much smaller effect size attributable to common modifying variants in complex disorders [1]. Chip-based microarray technology for assaying over 1 million interindividual genetic variants provides the foundation of genome-wide association studies (GWAS), which are defined as any studies of common genetic variation across the entire human genome designed to identify genetic associations with observable traits [2]. Common variants usually refer to those with an allele frequency of at least 5% in a population. A stringent genome-wide significance threshold of p < 5 × 10⁻⁸ is routinely used as a correction for multiple testing; it is based on an estimate of approximately 1 million independent SNPs in a population [3].
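The arithmetic behind the genome-wide threshold can be sketched as follows. This is a minimal illustration, assuming a family-wise error rate of 0.05 divided by the number of independent tests (a Bonferroni-style correction); the 0.05 value and the function names are assumptions for illustration and are not stated in the text.

```python
# Assumptions (not from the text): a family-wise alpha of 0.05 and a
# Bonferroni-style correction; all function names are illustrative.

def genome_wide_threshold(alpha: float = 0.05,
                          n_independent_tests: int = 1_000_000) -> float:
    """Return the per-test p-value threshold: alpha divided by the number of tests."""
    if n_independent_tests <= 0:
        raise ValueError("n_independent_tests must be positive")
    return alpha / n_independent_tests


def is_genome_wide_significant(p_value: float, threshold: float) -> bool:
    """A variant passes only if its p-value is strictly below the threshold."""
    return p_value < threshold


threshold = genome_wide_threshold()
print(f"threshold = {threshold:.0e}")
print(is_genome_wide_significant(3e-9, threshold))  # True: 3e-9 < 5e-8
print(is_genome_wide_significant(1e-6, threshold))  # False: 1e-6 > 5e-8
```

With roughly 1 million independent SNPs, dividing 0.05 by the test count gives the familiar 5 × 10⁻⁸ cutoff; a different estimate of the number of independent tests would shift the threshold proportionally.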
The past decade has witnessed substantial advances in understanding the genetic basis of heart diseases through GWAS. Furthermore, HapMap- and 1000 Genomes Project-based meta-analyses including hundreds of thousands of subjects have expanded our understanding of the genetic architecture of heart diseases to a great extent. In the GWAS catalog (www.ebi.ac.uk/gwas/), hundreds of loci have been reported that show an association with more than 10 heart diseases or traits; the major achievements are listed in Table 1.
Table 1. Genome-wide association studies of hypertension and several other cardiovascular diseases and abnormalities

In this review, we briefly summarize the achievements of GWAS regarding blood pressure (BP) and coronary artery disease (CAD), the progress in risk prediction at the individual level, the novel pathophysiological pathways, and drug discoveries.
GWAS-Significant Loci
Blood Pressure
BP is a quantitative trait that is normally distributed in the general population, and 30–50% of the variation in BP is determined by inherited genetic factors [4, 5]. Hypertension ranks as the leading cause of morbidity and mortality worldwide, contributing to CAD, atrial fibrillation, and heart failure; knowledge of the genetics of hypertension or BP is therefore important for the diagnosis, control, and treatment of all of these heart diseases.
By the middle of 2017, GWAS had identified and replicated genetic variants of modest or weak effect on BP at over 200 loci; the strongest SNPs associated with different BP traits are summarized in Table 2a and b. However, early attempts to identify genetic variants associated with BP were challenging and of relatively low yield. In 2007, the first GWAS (by the Wellcome Trust Case Control Consortium [WTCCC]) [6] adopted a case-control study design using 3,000 shared controls and 14,000 cases (2,000 for hypertension) of European ancestry to study 7 complex diseases simultaneously. Hypertension was the only disease without any significant results. The first GWAS of quantitative BP phenotypes was conducted in the Framingham Heart Study, which included 1,400 family subjects and found no significant results either [5]. These two studies made researchers realize the complexity of the genetic mechanisms underlying BP regulation and the need for much larger sample sizes when looking for genes associated with BP/hypertension.
Table 2. Significant genetic loci for blood pressure and hypertension reported in genome-wide association studies in Europeans (a) and Asians and Africans (b)

To enhance the statistical power, international collaborations were established between studies and organized in consortia. Furthermore, detecting associations with BP as a continuous variable rather than in case-control studies also proved successful. In 2009, two large-scale meta-analyses of GWAS from the Global Blood Pressure Genetics (Global BPgen) [7] and Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) [8] consortia identified associations withstanding correction for multiple testing (genome-wide significance). Each study included nearly 30,000 subjects at the discovery phase and found 8 genomic loci, 3 of which overlapped. In 2011, the International Consortium for Blood Pressure (ICBP) [9] used a multistage design with 200,000 individuals of European descent, replicated the previous 13 loci, and discovered 16 new loci significant at the genome-wide level. The ICBP then analyzed two derived BP phenotypes: mean arterial pressure and pulse pressure (PP) [10]. This study discovered 4 novel loci for PP and 2 novel loci for mean arterial pressure. The latest GWAS among UK Biobank participants of European ancestry included ∼140,000 subjects from a single source at the discovery phase. Dense 1000 Genomes Project and UK10K imputation yielded a data set with ∼9.8 million variants for the meta-analysis. With the help of major international consortia for parallel replication, the authors included GWAS data from 330,956 individuals in total and reported 107 significant loci, among which 24 were associated with systolic BP (SBP), 41 with diastolic BP (DBP), and 42 with PP [11].
The first large meta-analysis of GWAS on BP traits among East Asians was conducted by the Asian Genetic Epidemiology Network (AGEN) consortium [12]. Our group was one of the 8 groups participating in the stage 1 meta-analysis, which included 19,608 individuals in total. After de novo genotyping in two stages of replication involving 10,518 and 20,247 East Asian samples, we confirmed 7 loci previously identified in the European population by Global BPgen and CHARGE, and additionally identified 6 novel loci: ST7L-CAPZA1, FIGN-GRB14, ENPEP, NPR3, TBX3, and ALDH2. In another meta-analysis of a Han Chinese population, which involved 11,816 subjects in the discovery stage and 69,146 subjects for replication [13], the authors replicated 8 previously reported loci and identified 4 novel loci; the 12 loci were CASZ1, MOV10, FGF5, CYP17A1, SOX6, ATP2B1, ALDH2, JAG1, CACNA1D, CYP21A2, MED13L, and SLC4A7. These findings suggest possible allelic heterogeneity in BP regulation between Europeans and Asians and provide new mechanistic insight into hypertension.
African-descent populations are the most ancestral and have smaller regions of linkage disequilibrium due to the accumulation of more recombination events in that group, so that GWAS performed on Africans need more SNPs with better overall genomic coverage [14]. There are fewer loci reaching GWAS significance in African populations than in Europeans or Asians. The largest GWAS performed in an African-origin population analyzed 21 GWAS comprising 31,968 individuals of African ancestry and validated their results with an additional 54,395 individuals from multiethnic studies [15]. The authors found 9 loci with 11 independent variants for either SBP, DBP, hypertension, or combined traits.
Coronary Artery Disease
The heritability of CAD has been estimated at between 40 and 60% based on family and twin studies [16]. The first robust locus associated with CAD identified by GWAS was reported in 2007 [6, 17, 18], when three independent groups reported common variants at the 9p21 locus. It was associated with a ∼30% increased risk of CAD per copy of the risk allele. Since then, progressively larger meta-analyses of additional GWAS, mainly on subjects of European descent, have identified additional loci of smaller effect size. In 2015, the CARDIoGRAMplusC4D Consortium published a GWAS meta-analysis of 185,000 CAD cases and controls, interrogating 6.7 million common variants as well as 2.7 million low-frequency variants [19]. In this study, the authors confirmed most CAD loci known at that time and identified 10 novel loci. The majority of the significant variants had an allele frequency of more than 5%, which indicated that the genetics of CAD was largely determined by the cumulative effect of multiple common risk variants of small effect size and strongly supported the common disease/common variant hypothesis. In 2017, Nelson et al. [20] further meta-analyzed the results from 10,801 cases in the UK Biobank against 137,914 controls in combination with the CARDIoGRAMplusC4D 1000 Genomes and the MIGen/CARDIoGRAM Exome chip studies. They found 12 new signals reaching genome-wide significance. They also reported that 304 independent variants meeting the 5% false discovery rate threshold explained 21.2% of CAD heritability. This finding highlighted the importance of the false discovery rate approach in expanding the set of associated variants. The strongest SNPs associated with CAD that reached genome-wide significance are listed in Table 3a and b.
Table 3. Significant loci for coronary artery disease reported by genome-wide association studies in Europeans (a) and Asians (b)

Although the majority of GWAS for CAD have been carried out on European populations, important studies with smaller sample sizes have also been performed on Asian and Black populations [19]. Whereas the effect sizes of CAD risk alleles were similar in East Asian and European populations, they were apparently attenuated in South Asian and Black populations. The lower level of linkage disequilibrium in the African genome might lead to failure to tag potentially shared causal variants [21].
Clinical Implications
It is undeniable that GWAS have achieved considerable success in exploring the genetic architecture of heart diseases. However, it is important to point out that a significant variant often merely tags the causal variant. The gene indicated by a GWAS signal should not be considered causal until a functional mechanism has been found. The ultimate goals of studying the genetics of CAD or hypertension are to understand the pathophysiological mechanism and subsequently to establish risk prediction methods and develop effective therapeutic strategies (Fig. 1).
The effect of the susceptibility variants identified by GWAS is small individually, but their effects are independent and additive, which can be calculated as a genetic risk score (GRS) in an unweighted manner by adding risk allele numbers or with a weighted mean [22]. As DNA is stable over the course of the lifetime, a genetic risk can be ascertained from birth; therefore, GRS may be particularly useful in risk prediction among younger patients in whom the cumulative impact of lifestyle factors is less pronounced.
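The two scoring schemes described above can be sketched in a few lines. This is an illustrative example only: the function names, the genotype coding (0, 1, or 2 copies of the risk allele per SNP), and the example weights are assumptions made for the sketch and do not come from any study cited here. The weighted score is computed as a weighted sum; dividing it by the number of SNPs or by the sum of the weights would give a weighted mean.

```python
from typing import Sequence


def _check_counts(risk_allele_counts: Sequence[int]) -> None:
    """Each SNP contributes 0, 1, or 2 copies of the risk allele (diploid genotype)."""
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError(f"risk allele count must be 0, 1, or 2, got {count!r}")


def unweighted_grs(risk_allele_counts: Sequence[int]) -> int:
    """Unweighted score: the total number of risk alleles carried."""
    _check_counts(risk_allele_counts)
    return sum(risk_allele_counts)


def weighted_grs(risk_allele_counts: Sequence[int],
                 weights: Sequence[float]) -> float:
    """Weighted score: each risk allele count multiplied by its per-allele effect size."""
    _check_counts(risk_allele_counts)
    if len(risk_allele_counts) != len(weights):
        raise ValueError("one weight is required per SNP")
    return sum(c * w for c, w in zip(risk_allele_counts, weights))


# Hypothetical individual genotyped at four SNPs; the weights are made-up
# per-allele effect sizes (for example, log odds ratios), not real estimates.
counts = [2, 1, 0, 1]
effect_sizes = [0.10, 0.05, 0.20, 0.08]

print(unweighted_grs(counts))                 # 4 risk alleles in total
print(weighted_grs(counts, effect_sizes))     # approximately 0.33
```

The unweighted score treats every risk allele as equally important, whereas the weighted score lets variants with larger estimated effects contribute more; both rely on the assumption stated above that the effects are independent and additive.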
The interpretation of GWAS often is complicated by the nature of significant loci, which are mostly located in the intergenic region. Bioinformatic approaches could narrow down the list, prioritizing a gene for subsequent functional study according to expression quantitative trait locus [23], genome-wide chromatin profiles of histone modifications, data on transcription factor-binding sites, chromatin immunoprecipitation sequencing [24], etc. Below, we describe some examples of clinical implications of GWAS results for risk prediction and mechanism interpretation.
Hypertension
With the expansion of significant BP loci by GWAS, robust associations with some biological candidate genes previously suspected to influence BP were detected. The latest GWAS by Warren et al. [11] identified multiple loci involved in BP regulation, including angiotensin-converting enzyme, voltage-dependent calcium channel auxiliary subunit, metallo-endopeptidase/neutral endopeptidase, adrenergic α2B receptor, and phosphodiesterase 5A. Moreover, their pathway analysis showed an enrichment of pathways associated with cardiovascular disease, including the α-adrenergic pathway, CXCR4 chemokine signaling pathway, endothelin system, and angiotensin-receptor pathways.
In the ICBP GWAS [10], the difference in SBP and DBP between the top and the bottom quintile of the GRS was 4.6 and 3.0 mm Hg, respectively. Furthermore, the GRS generated by combination of 107 loci by Warren et al. [11] showed that the group with GRS in the lowest quintile had an SBP approximately 9–10 mm Hg lower than those with GRS in the highest quintile. Reductions in BP of such a magnitude might lead to significantly lower cardiovascular morbidity and mortality among hypertensive patients. The GRS could therefore be useful in early life to assess the risk of hypertension and direct dietary or lifestyle modifications.
The GWAS findings have also provided clues for personalized prevention and treatment of hypertension. One good example is the uromodulin gene (UMOD), which was identified in an extreme case-control design [25]: rs13333226-G in UMOD showed an association with lower risk of hypertension and reduced urinary UMOD excretion. Uromodulin is mainly expressed in the thick portion of the ascending limb of the loop of Henle, which indicates that it may participate in BP regulation through an effect on sodium homeostasis. Trudu et al. [26] modeled the effect of UMOD in transgenic mice and demonstrated that uromodulin influenced BP through activation of the renal sodium cotransporter NKCC2. Subjects with the UMOD risk variant had increased UMOD excretion, greater salt sensitivity, hypertension, and a greater BP response to loop diuretics. This finding presents an opportunity for hypertension precision medicine and new drug development.
The AGEN study [12] found an ethnicity-specific locus on 12q24.13, where the aldehyde dehydrogenase 2 (ALDH2) gene is located. ALDH2 is a key enzyme in the major pathway of alcohol metabolism, and a glutamate-to-lysine substitution (rs671, E504K) produces an inactive subunit of ALDH2, resulting in an inability to metabolize acetaldehyde and subsequent accumulation of acetaldehyde after alcohol intake [27]. The SNP rs671 is not polymorphic in Europeans, but it is close to rs3184504 at the SH2B3 locus, which is significantly associated with BP in Europeans and is not polymorphic in East Asians. This pattern suggests that natural selection has occurred at this locus. Furthermore, rs671 displays pleiotropic effects both on risk factors for cardiovascular disease and on CAD susceptibility. Interestingly, the locus exhibits a deleterious effect on BP but has protective effects on HDL cholesterol and LDL cholesterol, which results in a net reduction in CAD risk. Moreover, most of the associations between rs671 and each of the cardiovascular risk factors are influenced by alcohol intake.
Coronary Artery Disease
Since 2007, nearly 70 distinct genetic loci for CAD have been found owing to progressively larger sample sizes. A minority of these risk variants appear to modulate CAD risk by influencing classic risk factors such as plasma lipids, diabetes, and hypertension, reinforcing the key role of these pathways in the development of CAD. The rest of the risk variants identified by GWAS are located in regulatory regions of unclear function.
In a large prospective cohort study with a median of 10.7 years of follow-up, Ripatti et al. [28] found that individuals in the top quintile of a multilocus GRS derived from 13 CAD SNPs exhibited a 1.66-fold increased risk after adjusting for traditional risk factors. However, the GRS improved neither the C-index over traditional risk factors and family history nor the net reclassification of risk categories. A recent analysis of 55,685 subjects from three prospective studies and one cross-sectional study confirmed the association between GRS and incidence of CAD [29], with subjects in the top GRS quintile having a 91% higher relative risk than those in the bottom quintile. The authors also reported that a favorable lifestyle was associated with a relative risk of CAD nearly 50% lower than that with an unfavorable lifestyle.
The 9p21.3 locus was the first CAD locus identified by GWAS; it consists of a cluster of 59 linked SNPs in a 53,000-bp region [30]. Among the CAD risk alleles, rs10811656 and rs10757278 are located in an enhancer element and disrupt a binding site for STAT1 [31]. In human vascular endothelial cells, this enhancer interacts with CDKN2A/B and with IFNA21, which encodes interferon-α21 and participates in the inflammatory process. It has also been suggested that this region encodes a long noncoding RNA designated as ANRIL (antisense noncoding RNA in the INK4 locus). ANRIL influences the expression of CDKN2A/B, which is involved in regulating the cell cycle and cellular proliferation [32]. A meta-analysis also confirmed that the 9p21.3 locus increases the burden of atherosclerosis but not the risk of myocardial infarction, which indicates a stimulatory effect of this locus on atherosclerosis [33].
The 1p13 locus has been independently associated with CAD in several GWAS. Among a number of genes located in this region, SORT1 was identified as the target gene by expression quantitative trait locus and expression studies. SORT1 encodes sortilin, a multiligand type 1 receptor that is expressed in many cell and tissue types [34]. Overexpression of Sort1 might decrease the very-low-density lipoprotein secretion rate and increase plasma low-density lipoprotein turnover [35]. However, the knockout or knockdown of Sort1 in different studies resulted in either increased or decreased very-low-density lipoprotein secretion [36, 37]. Whether SORT1 is the causal gene for CAD or merely a signal found in GWAS still needs to be clarified. The precise molecular mechanism by which sortilin influences lipid metabolism and CAD risk has not yet been elucidated.
Perspectives
The GWAS findings have identified novel biological signals involved in human heart diseases. However, there are still certain challenges. The heritability of BP as derived from family studies varies from 30 to 50%, but the collective effect of all BP loci identified through GWAS only explains ∼3.5% of BP heritability. The missing heritability is not unique to BP genetics but is universally observed in almost all common heart diseases. Heritability reflects the degree of phenotypic resemblance between relatives, which not only depends on the genetic architecture contributing to the trait but also is the result of environmental factors and interactions within the genome. In GWAS on hypertension-related phenotypes, significant gene-environment interaction has been identified for alcohol consumption [38], body mass index, smoking [39], educational level, and other modifiable lifestyle factors. These findings may help identify the causal genetic loci that contribute to missing heritability. In addition, gene-gene interaction [40], rare variants [41], and epigenetic mechanisms likely explain a certain fraction of complex heart diseases. Future research will focus on the application of variants identified in GWAS for the identification of individuals at risk of disease, guidance of clinical management decisions, and prediction of prognosis.
Disclosure Statement
No potential conflict of interest was reported by the authors.