Introduction: Chronic kidney disease (CKD) encompasses a spectrum of complex pathophysiological processes. While numerous genome-wide association studies (GWASs) have focused on individual traits such as albuminuria, estimated glomerular filtration rate (eGFR), and eGFR change, few genetic studies have integrated these traits for joint evaluation. Methods: We performed individual GWASs for albuminuria, baseline eGFR, and eGFR slope using data from non-diabetic individuals enrolled in the Taiwan Biobank (TWB). We then applied principal component analysis to transform these three quantitative traits into principal components (PCs) and performed GWAS on these PCs (PC-based GWAS). Results: The individual GWASs of albuminuria, baseline eGFR, and eGFR slope identified 10, 210, and 13 candidate loci, respectively, of which 2, 99, and 3 were previously reported loci. PC-based GWAS identified 20 additional novel candidate loci linked to CKD (p values ranging from 5.8 × 10⁻⁷ to 9.1 × 10⁻⁶). Notably, 4 of these 20 single nucleotide polymorphisms (rs9332641, rs10737429, rs7147621, and rs73360624) showed significant kidney expression quantitative trait locus associations. Conclusion: To our knowledge, this is the first PC-based GWAS integrating albuminuria, baseline eGFR, and eGFR slope. Our approach found 20 novel candidate loci suggestively associated with CKD, underscoring the value of integrating multiple kidney traits in unraveling the pathophysiology of this complex disorder.

Genome-wide association studies (GWASs) of chronic kidney disease (CKD) traits have been widely performed, but an approach that integrates these individual traits has been lacking. We applied principal component analysis to obtain principal components (PCs) of albuminuria, estimated glomerular filtration rate (eGFR), and eGFR slope from the Taiwan Biobank. Using these PCs as pseudotraits, we performed the first PC-based GWAS to evaluate genetic factors associated with CKD and found twenty novel candidate loci. Our results provide new insight into possible common pathogenic mechanisms of CKD.

Chronic kidney disease (CKD) is a major medical concern that affects more than 10% of the general population worldwide and has a complex underlying pathophysiology [1]. Many genome-wide association studies (GWASs) have identified risk loci for CKD, generally defined as an estimated glomerular filtration rate (eGFR) <60 mL/min/1.73 m², or for eGFR per se [2‒10]. However, many patients with stage 3–5 CKD have fairly stable kidney function for years; thus, the change in eGFR over time is more important than a cross-sectional eGFR for predicting long-term prognosis [11]. Among the factors affecting the eGFR trajectory, the degree of albuminuria is a major determinant [12]. The National Kidney Foundation has suggested using albuminuria and eGFR slope to replace the conventional outcome of a 30% reduction in eGFR or kidney failure in clinical trials [13]. Several GWASs have sought genetic factors associated with albuminuria [14‒16], and a few have searched for risk loci for eGFR change [17‒19], with or without a clearly defined time interval. In prior studies these traits were analyzed separately rather than as a whole, perhaps because previous univariate GWASs of eGFR, albuminuria, and eGFR change revealed associations with different loci.

A single quantitative trait offers only a partial view of a complex disease; a GWAS focusing on one trait can therefore uncover only a fraction of the single-nucleotide polymorphisms (SNPs) linked to the disease. Even when all univariate GWAS results are combined, some true disease-associated loci will still be missed because they may not reach significance in any of the GWASs of the traits assessed. Since albuminuria, cross-sectional eGFR, and eGFR slope are quantitative traits, we can apply principal component analysis (PCA) to transform them into principal components (PCs). Using these PCs as pseudotraits to perform GWAS (PC-based GWAS), we expect to identify additional candidate loci associated with CKD.

This study aimed to discover novel risk loci for CKD. Using data from non-diabetic participants in the Taiwan Biobank (TWB), we performed univariate GWASs for albuminuria, baseline eGFR, and eGFR slope, followed by a multiple-trait PC-based GWAS. The PC-based GWAS identified twenty novel candidate loci associated with kidney function.

Participants

The TWB is a prospective study of volunteers 30–70 years of age residing in Taiwan [20]. A total of 140,070 participants were recruited between January 2012 and December 2021. Biological specimens and personal and clinical information were analyzed as delinked data. This study was approved by the Institutional Review Board of the National Taiwan University Hospital (No.: 202312110RINA). All subjects provided written informed consent, and all methods were carried out in accordance with the Declaration of Helsinki and relevant regulations. Individuals with diabetes mellitus (either self-reported in the questionnaire or with HbA1c ≥ 6.5%) were excluded from this study because diabetes substantially affects eGFR.

Albuminuria Measurement

Urine albumin levels were measured using automatic quantitative turbidimetric immunoassays while urine creatinine levels were measured using the compensated Jaffe method and a chemistry analyzer (Hitachi LST008). Albuminuria is presented as urine albumin-to-creatinine ratio (UACR) and is log-transformed as it is not normally distributed.

Serum Creatinine Measurement, Cross-Sectional eGFR, and eGFR Slope Definition

Serum creatinine level was measured using a chemistry analyzer (Hitachi LST008) with the compensated Jaffe method. eGFR was calculated using the 2021 CKD-EPI creatinine equation [21]:

$$\mathrm{eGFR} = 142 \times \min\left(\frac{S_{cr}}{\kappa}, 1\right)^{\alpha} \times \max\left(\frac{S_{cr}}{\kappa}, 1\right)^{-1.2} \times 0.9938^{\mathrm{Age}} \times 1.012\ [\text{if female}]$$

where eGFR is in mL/min/1.73 m², Scr is serum creatinine, κ is 0.7 for females and 0.9 for males, and α is −0.241 for females and −0.302 for males; min indicates the minimum of Scr/κ and 1, and max indicates the maximum of Scr/κ and 1. eGFR slope is defined as the change in eGFR per year, that is, the latest follow-up eGFR (eGFR2) minus the baseline eGFR (eGFR1), divided by the follow-up duration in years:

$$\text{eGFR slope} = \frac{\mathrm{eGFR}_2 - \mathrm{eGFR}_1}{\text{duration}}$$
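The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and serum creatinine is assumed to be in mg/dL, the unit in which the κ constants of the CKD-EPI equation are expressed.

```python
def egfr_ckd_epi_2021(scr_mg_dl: float, age_years: float, female: bool) -> float:
    """2021 CKD-EPI creatinine equation; returns eGFR in mL/min/1.73 m^2.

    Assumes serum creatinine is given in mg/dL.
    """
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = scr_mg_dl / kappa
    egfr = (
        142.0
        * min(ratio, 1.0) ** alpha
        * max(ratio, 1.0) ** -1.2
        * 0.9938 ** age_years
    )
    # The sex coefficient is applied only for females.
    return egfr * 1.012 if female else egfr


def egfr_slope(egfr_baseline: float, egfr_followup: float, years: float) -> float:
    """Annual eGFR change: (eGFR2 - eGFR1) / follow-up duration in years."""
    if years <= 0:
        raise ValueError("follow-up duration must be positive")
    return (egfr_followup - egfr_baseline) / years


# Hypothetical participant: a 50-year-old woman, followed for 5 years.
baseline = egfr_ckd_epi_2021(0.7, 50, female=True)
followup = egfr_ckd_epi_2021(0.8, 55, female=True)
print(round(baseline, 1), round(followup, 1), round(egfr_slope(baseline, followup, 5.0), 2))
```

Because the slope uses only two time points, it is sensitive to measurement noise in either creatinine value; the sketch mirrors the definition in the text and does not attempt any smoothing.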

Genotyping, Imputation, and Quality Control

The TWB genotyped participants using two custom single-nucleotide polymorphism (SNP) arrays developed to enhance genotyping efficiency and accuracy. To harmonize the genotyping data from the two arrays, TWB v1.0 array coordinates were lifted over from Genome Reference Consortium Human Build 37 (GRCh37) to GRCh38 to be concordant with the TWB v2.0 array. SHAPEIT4 [22] and IMPUTE2 [23] were used for phasing and imputation, with both the TWB dataset and the 1000 Genomes reference panel serving as imputation reference panels, which allowed genotype information from the two custom chips to be integrated. For quality control, markers located on the sex chromosomes were excluded. SNPs with a genomic control rate (GCR) below 5% were filtered out, as were those with a minor allele frequency (MAF) lower than 1%. SNPs failing the Hardy-Weinberg equilibrium test with a p value <1 × 10⁻⁸ were also removed. After these quality control steps, a total of 3,637,470 SNPs remained for the subsequent association analyses.
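As an illustration of the MAF and Hardy-Weinberg filters described above, the following Python sketch applies both thresholds to the genotype counts of a single SNP. It is a simplified stand-in rather than the pipeline used in the study: the function names are hypothetical, the Hardy-Weinberg test is a 1-degree-of-freedom chi-square goodness-of-fit test rather than the exact test that GWAS software typically applies, and the call-rate filter is omitted.

```python
import math


def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Minor allele frequency from counts of the AA, AB and BB genotypes."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1.0 - p)


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) p value for deviation from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa: int, n_ab: int, n_bb: int,
              maf_min: float = 0.01, hwe_p_min: float = 1e-8) -> bool:
    """Keep a SNP only if MAF >= 1% and the HWE p value is >= 1e-8."""
    return (minor_allele_frequency(n_aa, n_ab, n_bb) >= maf_min
            and hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_p_min)


print(passes_qc(250, 500, 250))  # common variant in equilibrium
print(passes_qc(500, 0, 500))    # no heterozygotes: strong HWE violation
print(passes_qc(995, 5, 0))      # MAF of 0.25%: too rare
```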

Statistical Analyses

Age, eGFR, eGFR slope, and follow-up duration are expressed as mean and standard deviation. UACR is expressed as median and interquartile range. GWASs were performed using PLINK v1.9 [24] with an additive genetic model. We used a linear regression model adjusted for age, sex, and the first ten principal components of ancestry to test the association between each SNP and logUACR, baseline eGFR, and eGFR slope. Because Bonferroni correction would produce a substantial number of false-negative results, the p value threshold for candidate loci was set at 1.0 × 10⁻⁵ in this study. Manhattan plots and quantile-quantile (Q-Q) plots were generated using the qqman R package [25]. GRCh38 was used for gene annotation. Principal component analyses were performed using the prcomp function in R [26].
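To make the additive model concrete, the sketch below regresses a simulated quantitative trait on allele dosage (0, 1, or 2 copies) with covariates, and compares the suggestive threshold with a Bonferroni threshold for the 3,637,470 SNPs retained after quality control. It is an illustrative NumPy implementation on made-up data, not the PLINK analysis itself: the function name is hypothetical, the ancestry principal components are left out of the toy covariates, and the p value uses a normal approximation to the t distribution, which is reasonable only for large samples.

```python
import math
import numpy as np


def snp_association(dosage, trait, covariates):
    """Regress a quantitative trait on allele dosage (0/1/2) plus covariates.

    Returns (beta, standard error, two-sided p value) for the dosage term.
    """
    n = len(trait)
    # Design matrix: intercept, additive dosage, then the covariates.
    design = np.column_stack([np.ones(n), dosage, covariates])
    coef, *_ = np.linalg.lstsq(design, trait, rcond=None)
    residuals = trait - design @ coef
    dof = n - design.shape[1]
    sigma2 = residuals @ residuals / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    beta = float(coef[1])
    se = math.sqrt(cov[1, 1])
    # Normal approximation to the t distribution (assumes large n).
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return beta, se, p_value


# Simulated toy data (not TWB data): one SNP with a true effect of 0.5.
rng = np.random.default_rng(0)
n = 5000
dosage = rng.binomial(2, 0.3, size=n).astype(float)
age = rng.normal(49.0, 10.0, size=n)
sex = rng.integers(0, 2, size=n).astype(float)
trait = 0.5 * dosage - 0.02 * age + 0.1 * sex + rng.normal(0.0, 1.0, size=n)

beta, se, p_value = snp_association(dosage, trait, np.column_stack([age, sex]))

n_snps = 3_637_470
bonferroni = 0.05 / n_snps   # roughly 1.4e-8
suggestive = 1.0e-5          # threshold used for candidate loci in this study
print(f"beta={beta:.3f} se={se:.3f} p={p_value:.2e}")
print(f"Bonferroni={bonferroni:.2e} suggestive={suggestive:.0e}")
```

The Bonferroni threshold is almost three orders of magnitude stricter than the suggestive threshold, which is the trade-off the paragraph above describes: fewer false negatives at the cost of more false positives.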

Characteristics of Study Population

After quality control, the numbers of participants available for the albuminuria, baseline eGFR, and eGFR slope GWASs and the PC-based GWAS were 11,280, 85,460, 29,480, and 10,887, respectively (Fig. 1). The demographics for each univariate GWAS and the PC-based GWAS are shown in Table 1.

Fig. 1.

Flowchart of the study. GWAS, genome-wide association study; PC, principal component; QC, quality control; TWB, Taiwan Biobank.

Table 1.

Demographic information of each univariate GWAS and the PC-based GWAS

| | Albuminuria GWAS | Baseline eGFR GWAS | eGFR slope GWAS | PC-based GWAS |
|---|---|---|---|---|
| Participants, N | 11,280 | 85,460 | 29,480 | 10,887 |
| Age, years | 48.8±10.4 | 49.1±10.7 | 49.7±10.3 | 48.9±10.3 |
| Male, N (%) | 4,285 (38.0) | 31,040 (36.3) | 9,912 (33.6) | 3,950 (36.3) |
| UACR^a, mg/g | 8.355 (4.819–9.554) | | | 8.384 (4.839–9.587) |
| Baseline eGFR, mL/min/1.73 m² | | 105.9±13.2 | 105.1±13.0 | 106.5±12.2 |
| eGFR slope, mL/min/1.73 m²/year | | | −0.76±1.38 | −0.73±1.24 |

^a Expressed as median (Q1–Q3).

Univariate GWASs Results

The Manhattan plots of the albuminuria, baseline eGFR, and eGFR slope GWASs are shown in Figure 2 (the respective Q-Q plots in online suppl. Fig. S1; for all online suppl. material, see https://doi.org/10.1159/000541982). Individual GWASs of albuminuria, baseline eGFR, and eGFR slope identified 10, 210, and 13 candidate loci, respectively, of which 2, 99, and 3 had previously been reported to be associated with CKD. These candidate loci are listed in online supplementary Tables S1–S3.

Fig. 2.

Manhattan plots and Q-Q plots of logUACR, baseline eGFR, and eGFR slope GWASs performed among TWB non-diabetic individuals. SNPs are plotted on the x axis according to their chromosome position against association with the respective traits on the y axis. The red horizontal line indicates genome-wide significance level of p value = 5.0 × 10⁻⁸, while the blue horizontal line represents suggestive significance level of p value = 1.0 × 10⁻⁵.

PC-Based GWAS Results

A total of 10,887 participants had complete data on albuminuria, baseline eGFR, and eGFR slope. PCA of logUACR, baseline eGFR, and eGFR slope yielded three principal components (PC1-PC3), which together account for the total variance of the three traits. The vectors representing the original three kidney traits on PC1-PC2 are illustrated in Figure 3. Manhattan plots of the PC1-PC3 GWASs are presented in Figure 4 (the respective Q-Q plots in online suppl. Fig. S2), with the top loci listed in Table 2; the implicated genes include AGMAT, F5, OBSCN, TRIML2, AGO2, APBA1, BAZ1A, GCKR, ZNF512, AKAP6, CACNA2D3, SEC24B, RNF157, and PTPRT. Of the thirty top SNPs from the PC-based GWAS, twenty were seen neither in our univariate GWAS results nor in previously published GWASs of albuminuria, cross-sectional eGFR, or eGFR slope (Table 3). Among these twenty novel loci, ten were not intergenic: one near the F5 gene on chromosome 1 (rs9332641; p = 8.39 × 10⁻⁶), another near the OBSCN gene on chromosome 1 (rs10737429; p = 6.66 × 10⁻⁶), a third near the CACNA2D3 gene on chromosome 3 (rs76352818; p = 4.54 × 10⁻⁶), a fourth near the SEC24B gene on chromosome 4 (rs139161904; p = 4.21 × 10⁻⁶), a fifth near the TRIML2 gene on chromosome 4 (rs67785655; p = 1.96 × 10⁻⁶), a sixth near the AGO2 gene on chromosome 8 (rs4961274; p = 1.62 × 10⁻⁶), a seventh near the APBA1 gene on chromosome 9 (rs74445275; p = 1.84 × 10⁻⁶), an eighth near the BAZ1A gene on chromosome 14 (rs7147621; p = 5.02 × 10⁻⁶), a ninth near the RNF157 gene on chromosome 17 (rs73360624; p = 6.19 × 10⁻⁶), and a tenth near the PTPRT gene on chromosome 20 (rs8119626; p = 2.99 × 10⁻⁶).
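The transformation of three traits into three pseudotraits can be sketched as follows. This is a NumPy illustration on simulated values, not the study's R code: the function name is hypothetical, and standardizing each trait before the decomposition is an assumption made here, since the text does not state whether the traits were scaled. With three input traits there are exactly three PCs, so their explained-variance fractions necessarily sum to one.

```python
import numpy as np


def pca_pseudotraits(traits):
    """Return (scores, loadings, explained variance ratio) for an n x k trait matrix."""
    # Center and scale each trait so that measurement units do not dominate
    # (scaling is an assumption of this sketch).
    z = (traits - traits.mean(axis=0)) / traits.std(axis=0, ddof=1)
    # PCA via singular value decomposition of the standardized matrix.
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = u * s          # one column of PC scores per component
    loadings = vt.T         # column j holds the trait weights of PC j+1
    explained = s ** 2 / np.sum(s ** 2)
    return scores, loadings, explained


# Simulated toy traits (not TWB data), loosely correlated with each other.
rng = np.random.default_rng(1)
n = 1000
log_uacr = rng.normal(2.0, 0.6, size=n)
baseline_egfr = 105.0 - 3.0 * log_uacr + rng.normal(0.0, 12.0, size=n)
egfr_slope = -0.7 - 0.2 * log_uacr + rng.normal(0.0, 1.2, size=n)
traits = np.column_stack([log_uacr, baseline_egfr, egfr_slope])

scores, loadings, explained = pca_pseudotraits(traits)
print("explained variance ratio:", np.round(explained, 3))
print("sum:", explained.sum())
```

Each column of `scores` is uncorrelated with the others and can be used as the phenotype in a separate linear regression, which is what makes a PC usable as a pseudotrait.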

Fig. 3.

Biplot demonstrating primary kidney traits’ vectors in dimensions of PC1 and PC2.

Fig. 4.

Manhattan plots and Q-Q plots of PC1-PC3 GWASs, with PCs derived from TWB non-diabetic individuals with all three primary kidney traits. SNPs are plotted on the x axis according to their chromosome position against association with the respective traits on the y axis. The red horizontal line indicates genome-wide significance level of p value = 5.0 × 10⁻⁸, while the blue horizontal line represents suggestive significance level of p value = 1.0 × 10⁻⁵.

Table 2.

Top suggestive loci derived from PC-based GWAS

| Pseudotrait | Chr | BP | SNP | A1 | A2 | MAF | p value | Nearby gene |
|---|---|---|---|---|---|---|---|---|
| PC1 | | 15,586,492 | rs10159261 | | | 0.433 | 1.8E-07 | AGMAT |
| | 1 | 169,524,475 | rs9332641 | | | 0.020 | 8.4E-06 | F5 |
| | 1 | 228,258,445 | rs10737429 | | | 0.434 | 6.7E-06 | OBSCN |
| | | 123,677,383 | rs10171475 | | | 0.227 | 1.8E-06 | - |
| | 4 | 188,111,676 | rs67785655 | | | 0.188 | 2.0E-06 | TRIML2 |
| | 8 | 140,598,947 | rs4961274 | | | 0.087 | 1.6E-06 | AGO2 |
| | 9 | 69,531,003 | rs74445275 | | | 0.020 | 1.8E-06 | APBA1 |
| | 12 | 97,658,889 | rs4762397 | | | 0.399 | 8.6E-07 | - |
| | 14 | 34,806,498 | rs7147621 | | | 0.217 | 5.0E-06 | BAZ1A |
| | 18 | 70,501,770 | rs76982055 | | | 0.034 | 2.4E-06 | - |
| PC2 | | 101,315,274 | rs77814797 | | | 0.056 | 5.1E-06 | - |
| | | 218,026,147 | rs193129422 | | | 0.011 | 6.7E-06 | LOC105372922 |
| | | 27,512,105 | rs6547692 | | | 0.488 | 6.4E-06 | GCKR |
| | | 27,598,615 | rs12989678 | | | 0.465 | 1.3E-06 | ZNF512 |
| | | 51,728,241 | rs190931819 | | | 0.051 | 4.1E-06 | LOC730100 |
| | | 60,669,935 | rs56188710 | | | 0.022 | 2.6E-06 | - |
| | | 116,637,209 | rs10007743 | | | 0.384 | 6.7E-06 | - |
| | | 126,787,643 | rs139816356 | | | 0.022 | 8.7E-06 | - |
| | | 138,579,033 | rs1376095 | | | 0.209 | 1.0E-06 | - |
| | | 138,588,684 | rs10461249 | | | 0.361 | 7.3E-06 | - |
| | | 82,182,829 | rs79425178 | | | 0.031 | 7.7E-06 | LOC105375930 |
| | 14 | 32,571,580 | rs1155069 | | | 0.038 | 6.9E-07 | AKAP6 |
| PC3 | 3 | 54,613,722 | rs76352818 | | | 0.022 | 4.5E-06 | CACNA2D3 |
| | 4 | 109,504,697 | rs139161904 | | | 0.021 | 4.2E-06 | SEC24B |
| | | 54,486,360 | rs78828070 | | | 0.090 | 6.5E-07 | - |
| | | 54,558,447 | rs146677015 | | | 0.015 | 2.9E-07 | - |
| | | 141,371,205 | rs117153205 | | | 0.069 | 9.1E-06 | - |
| | 10 | 103,868,924 | rs117231653 | | | 0.078 | 5.8E-07 | - |
| | 17 | 76,196,746 | rs73360624 | | | 0.047 | 6.2E-06 | RNF157 |
| | 20 | 42,732,473 | rs8119626 | | | 0.013 | 3.0E-06 | PTPRT |
Table 3.

Results of suggestive novel loci derived from PC-based GWAS

| Chr | BP | SNP | A1 | A2 | MAF | p value | Nearby gene | Kidney eQTL^1 | Involved gene expression in kidney tissue |
|---|---|---|---|---|---|---|---|---|---|
| | 101,315,274 | rs77814797 | | | 0.056 | 5.14E-06 | - | No | |
| 1 | 169,524,475 | rs9332641 | | | 0.02 | 8.39E-06 | F5 | Yes | SELL (glo); ATP1B1 (ti) |
| 1 | 228,258,445 | rs10737429 | | | 0.434 | 6.66E-06 | OBSCN | Yes | TRIM17, OBSCN-AS1, IBA57-AS1, MRPL55 (ti) |
| | 123,677,383 | rs10171475 | | | 0.227 | 1.82E-06 | - | No | |
| 3 | 54,613,722 | rs76352818 | | | 0.022 | 4.54E-06 | CACNA2D3 | No | |
| | 60,669,935 | rs56188710 | | | 0.022 | 2.61E-06 | - | No | |
| 4 | 109,504,697 | rs139161904 | | | 0.021 | 4.21E-06 | SEC24B | No | |
| | 116,637,209 | rs10007743 | | | 0.384 | 6.73E-06 | - | No | |
| | 126,787,643 | rs139816356 | | | 0.022 | 8.72E-06 | - | No | |
| | 138,588,684 | rs10461249 | | | 0.361 | 7.29E-06 | - | No | |
| 4 | 188,111,676 | rs67785655 | | | 0.188 | 1.96E-06 | TRIML2 | No | |
| | 141,371,205 | rs117153205 | | | 0.069 | 9.14E-06 | - | No | |
| 8 | 140,598,947 | rs4961274 | | | 0.087 | 1.62E-06 | AGO2 | No | |
| 9 | 69,531,003 | rs74445275 | | | 0.02 | 1.84E-06 | APBA1 | No | |
| 10 | 103,868,924 | rs117231653 | | | 0.078 | 5.82E-07 | - | No | |
| 12 | 97,658,889 | rs4762397 | | | 0.399 | 8.64E-07 | - | No | |
| 14 | 34,806,498 | rs7147621 | | | 0.217 | 5.02E-06 | BAZ1A | Yes | CFL2 (ti) |
| 17 | 76,196,746 | rs73360624 | | | 0.047 | 6.19E-06 | RNF157 | Yes | ST6GALNAC2, UBE2O (glo); CYGB (ti) |
| 18 | 70,501,770 | rs76982055 | | | 0.034 | 2.43E-06 | - | No | |
| 20 | 42,732,473 | rs8119626 | | | 0.013 | 2.99E-06 | PTPRT | No | |

^1 nephQTL (Gillies 2018 [27]).

glo, glomerulus; ti, tubulointerstitium.

In this first PC-based GWAS of albuminuria, cross-sectional eGFR, and eGFR slope, we identified thirty candidate loci associated with CKD, twenty of which were novel. Several genes implicated by the ten non-novel loci, which were identified in both the univariate GWASs and the PC-based GWAS, are already known to be associated with kidney injury or disease. Agmatinase (AGMAT) is significantly differentially expressed between different kidney disease entities [28]. Glucokinase regulator (GCKR) polymorphisms are known to be associated with CKD, end-stage kidney disease (ESKD), and serum uric acid level [29‒31]. Zinc finger protein 512 (ZNF512) has been suggested as a candidate gene for hyperuricemia and gout [32]. A-kinase anchoring protein 6 (AKAP6) has been shown to be associated with nephrolithiasis [33].

Among the twenty novel loci, some of the nearby genes have also been implicated in kidney injury and CKD in the literature. Coagulation factor V (F5) is produced by macrophages in diverse circumstances, including acute kidney injury, and is thus associated with kidney fibrosis [34]. Obscurin (OBSCN) is involved in sarcoplasmic reticulum handling of intracellular calcium. Individuals with bi-allelic loss-of-function mutations in OBSCN suffer from recurrent rhabdomyolysis with myalgia, muscle weakness, or dark urine. Serum creatinine level is also elevated owing to the release of creatine and phosphocreatine from myocytes and to tubular necrosis secondary to obstruction of the kidney tubules by myoglobin [35‒37]. Argonaute RISC catalytic component 2 (AGO2) is an RNA-binding protein of the Argonaute family that assembles the miRNA-induced silencing complex (miRISC) by binding both a miRNA and its target mRNA. miR-429-3p attenuates branched-chain amino acid catabolism in kidney proximal tubules, causing an oxidative-stress-induced form of cell death known as ferroptosis [38]. Because miRNA-mediated silencing of this kind depends on miRISC, AGO2 may be involved in kidney injury and CKD.

Four of the novel loci also had significant kidney eQTL results in the nephQTL database [27]: rs9332641 was associated with SELL expression in kidney glomeruli and ATP1B1 expression in the tubulointerstitium; rs10737429 was associated with tubulointerstitial expression of TRIM17, OBSCN-AS1, IBA57-AS1, and MRPL55; rs7147621 was associated with tubulointerstitial expression of CFL2; and rs73360624 was associated with ST6GALNAC2 and UBE2O expression in kidney glomeruli and CYGB expression in the tubulointerstitium. SELL encodes selectin L, which regulates the inflammatory response in different organs, including the kidney [39]. miR-192-5p targets ATP1B1 mRNA, and in rats with miR-192-5p knockdown, subsequent ATP1B1 knockdown attenuates salt-sensitive hypertension [40]. CFL2 mutation causes nemaline rod myopathy [41], and malnutrition, myopathy, and other conditions that reduce muscle mass all affect serum creatinine level [42]. ST6GALNAC2 is associated with IgA1 sialylation. Desialylation of IgA1 is one of the pathomechanisms of IgA nephropathy, and one study showed that the frequency of haplotype ADG in the promoter region of the ST6GALNAC2 gene was significantly higher in patients with IgA nephropathy [43]. Reduced podocyte numbers are found in Cygb knockout mice and are positively correlated with kidney function decline. Analysis of the CYGB-dependent transcriptome revealed dysregulation of genes involved in redox balance, apoptosis, and CKD [44]. There was also an intergenic SNP on chromosome 10 (rs117231653, p = 5.82 × 10⁻⁷) significantly associated with COL17A1 expression in whole blood in GTEx [45]. Collagen XVII has been identified in both human and murine kidney, where immunoelectron microscopy localized it to podocyte foot processes and the glomerular basement membrane; it is thus possibly related to podocyte maturation and glomerular filtration [46].

PCA identifies vectors (PCs) that best describe a dataset by reducing its dimensionality while retaining as much of its variation as possible [47]. Since logUACR, cross-sectional eGFR, and eGFR slope are all quantitative traits, we applied PC-based GWAS to search for additional loci that could not be identified by conventional univariate GWAS. PC-based GWAS is uncommon in the current literature, and there is only one PC-based GWAS of kidney traits or CKD [48]. Tran et al. [48] selected serum creatinine level, eGFR, and blood urea nitrogen (BUN) as primary traits for their PC-based GWAS. However, eGFR is calculated from serum creatinine, so the two traits overlap heavily, and BUN is not a specific biomarker of kidney function, as individuals with normal kidney function can have elevated BUN when relatively dehydrated or on a high-protein diet. Given these issues with the primary traits, the representativeness of their results is questionable.

Many GWASs face the issue of “missing heritability” [49]. Studies of various diseases and complex traits have noted that low-frequency variants in certain populations play a substantial role [50‒55]. Missing heritability could be at least partially explained by the fact that conventional GWASs identify only common variants with MAF higher than 5% while leaving out rarer variants. Nearly half of the twenty novel candidate loci discovered in this study (9 of 20; Table 3) had MAF between 1 and 5% and thus would not have been detected or replicated in previous GWASs.

Diabetes is the leading cause of CKD worldwide and strongly affects eGFR change [56]. The eGFR trajectory is extremely complex and heterogeneous in patients with type 2 diabetes [57‒59]. In addition, a high proportion of patients with diabetic kidney disease experience kidney hyperfiltration, a stage in which eGFR is elevated rather than decreased [60]. For these reasons, we considered it necessary to exclude diabetic individuals in advance, rather than adjusting for diabetes as a covariate during association analysis, to fully remove the effect of diabetes on kidney function.

Our study has several limitations. First, only slightly more than 10,000 individuals had all three primary kidney traits, a small sample compared with many large-scale GWASs of individual traits. Second, using p values below 10⁻⁵ to identify suggestively associated loci increases the likelihood of capturing false-positive associations. Third, although we excluded individuals with diabetes, some of the remaining individuals may have non-diabetic kidney disease. Moreover, most biobanks, including the TWB, typically measure serum creatinine but not cystatin C because of the higher cost of the latter. Creatinine-based formulas for GFR estimation have inherent limitations, and there is a growing trend toward incorporating serum cystatin C to improve the precision of GFR estimation [61]. Furthermore, current eQTL databases such as nephQTL and GTEx are not based on East Asian populations, so the eQTL information may not be precise for the Taiwanese population. Future studies would therefore benefit from a Taiwanese kidney transcriptomic database that provides more accurate eQTL results for the candidate loci identified in our study.

In conclusion, we conducted the first PC-based GWAS of albuminuria, cross-sectional eGFR, and eGFR slope in the non-diabetic Taiwanese population in search of genetic loci associated with kidney function decline/CKD. Twenty novel candidate loci were identified; several had significant kidney eQTL results, and the genes involved are already known to be associated with kidney function or kidney disease. Future functional studies may further support our findings.

The authors would like to thank all participants in the Taiwan Biobank for providing the data and for their support.

This study was approved by the Institutional Review Board of the National Taiwan University Hospital Research Ethics Committee (Approval No. 202312110RINA). All subjects provided written informed consent, and all methods were carried out in accordance with the Declaration of Helsinki and relevant regulations.

The authors declare no competing interests.

This study is funded by intramural grants from the National Taiwan University Hospital (NTUH110-M4813 and NTUH 112-S0093). The funding agency had no role in the study design, data collection, data analysis, data interpretation, writing of the report, or the decision to submit the report for publication.

G.-T.C. conceptualized the research idea and study design. G.-T.C. and T.P.-H.C. were responsible for data curation and formal analysis. G.-T.C. and Y.-C.C. contributed to funding acquisition. C.-N.H. provided statistical consultation and data interpretation. G.-T.C. drafted the initial manuscript. Y.-C.C. provided critical feedback during data analysis and manuscript drafting. All authors read and approved the final manuscript.

Data that support the results of this study are available from the Taiwan Biobank. Owing to restrictions on data availability, the data were used under license for the current study and are not publicly available. Further enquiries can be directed to the corresponding author.

1.
Kovesdy
CP
.
Epidemiology of chronic kidney disease: an update 2022
.
Kidney Int Suppl
.
2022
;
12
(
1
):
7
11
.
2.
Pattaro
C
,
Teumer
A
,
Gorski
M
,
Chu
AY
,
Li
M
,
Mijatovic
V
, et al
.
Genetic associations at 53 loci highlight cell types and biological pathways relevant for kidney function
.
Nat Commun
.
2016
;
7
:
10023
.
3.
Mahajan
A
,
Rodan
AR
,
Le
TH
,
Gaulton
KJ
,
Haessler
J
,
Stilp
AM
, et al
.
Trans-ethnic fine mapping highlights kidney-function genes linked to salt sensitivity
.
Am J Hum Genet
.
2016
;
99
(
3
):
636
46
.
4.
Kanai
M
,
Akiyama
M
,
Takahashi
A
,
Matoba
N
,
Momozawa
Y
,
Ikeda
M
, et al
.
Genetic analysis of quantitative traits in the Japanese population links cell types to complex human diseases
.
Nat Genet
.
2018
;
50
(
3
):
390
400
.
5.
Morris
AP
,
Le
TH
,
Wu
H
,
Akbarov
A
,
van der Most
PJ
,
Hemani
G
, et al
.
Trans-ethnic kidney function association study reveals putative causal genes and effects on kidney-specific disease aetiologies
.
Nat Commun
.
2019
;
10
(
1
):
29
.
6.
Graham
SE
,
Nielsen
JB
,
Zawistowski
M
,
Zhou
W
,
Fritsche
LG
,
Gabrielsen
ME
, et al
.
Sex-specific and pleiotropic effects underlying kidney function identified from GWAS meta-analysis
.
Nat Commun
.
2019
;
10
(
1
):
1847
.
7.
Wuttke
M
,
Li
Y
,
Li
M
,
Sieber
KB
,
Feitosa
MF
,
Gorski
M
, et al
.
A catalog of genetic loci associated with kidney function from analyses of a million individuals
.
Nat Genet
.
2019
;
51
(
6
):
957
72
.
8.
Hellwege
JN
,
Velez Edwards
DR
,
Giri
A
,
Qiu
C
,
Park
J
,
Torstenson
ES
, et al
.
Mapping eGFR loci to the renal transcriptome and phenome in the VA Million Veteran Program
.
Nat Commun
.
2019
;
10
(
1
):
3842
.
9.
Sinnott-Armstrong
N
,
Tanigawa
Y
,
Amar
D
,
Mars
N
,
Benner
C
,
Aguirre
M
, et al
.
Genetics of 35 blood and urine biomarkers in the UK Biobank [published correction appears in Nat Genet. 2021 Nov;53(11):1622]
.
Nat Genet
.
2021
;
53
(
2
):
185
94
.
10.
Stanzick
KJ
,
Li
Y
,
Schlosser
P
,
Gorski
M
,
Wuttke
M
,
Thomas
LF
, et al
.
Discovery and prioritization of variants and genes for kidney function in >1.2 million individuals
.
Nat Commun
.
2021
;
12
(
1
):
4350
.
11.
Rosansky
SJ
.
Renal function trajectory is more important than chronic kidney disease stage for managing patients with chronic kidney disease
.
Am J Nephrol
.
2012
;
36
(
1
):
1
10
.
12.
Hallan
SI
,
Ritz
E
,
Lydersen
S
,
Romundstad
S
,
Kvenild
K
,
Orth
SR
.
Combining GFR and albuminuria to classify CKD improves prediction of ESRD
.
J Am Soc Nephrol
.
2009
;
20
(
5
):
1069
77
.
13.
Levey
AS
,
Gansevoort
RT
,
Coresh
J
,
Inker
LA
,
Heerspink
HL
,
Grams
ME
, et al
.
Change in albuminuria and GFR as end points for clinical trials in early stages of CKD: a scientific workshop sponsored by the national kidney foundation in collaboration with the US food and drug administration and European medicines agency
.
Am J Kidney Dis
.
2020
;
75
(
1
):
84
104
.
14. Haas ME, Aragam KG, Emdin CA, Bick AG; International Consortium for Blood Pressure, Hemani G, et al. Genetic association of albuminuria with cardiometabolic disease and blood pressure. Am J Hum Genet. 2018;103(4):461-73.
15. Zanetti D, Rao A, Gustafsson S, Assimes TL, Montgomery SB, Ingelsson E. Identification of 22 novel loci associated with urinary biomarkers of albumin, sodium, and potassium excretion. Kidney Int. 2019;95(5):1197-208.
16. Teumer A, Li Y, Ghasemi S, Prins BP, Wuttke M, Hermle T, et al. Genome-wide association meta-analyses and fine-mapping elucidate pathways influencing albuminuria. Nat Commun. 2019;10(1):4130.
17. Gorski M, Jung B, Li Y, Matias-Garcia PR, Wuttke M, Coassin S, et al. Meta-analysis uncovers genome-wide significant variants for rapid kidney function decline. Kidney Int. 2021;99(4):926-39.
18. Han M, Moon S, Lee S, Kim K, An WJ, Ryu H, et al. Novel genetic variants associated with chronic kidney disease progression. J Am Soc Nephrol. 2023;34(5):857-75.
19. Robinson-Cohen C, Triozzi JL, Rowan B, He J, Chen HC, Zheng NS, et al. Genome-wide association study of CKD progression. J Am Soc Nephrol. 2023;34(9):1547-59.
20. Fan CT, Lin JC, Lee CH. Taiwan Biobank: a project aiming to aid Taiwan's transition into a biomedical island. Pharmacogenomics. 2008;9(2):235-46.
21. Inker LA, Eneanya ND, Coresh J, Tighiouart H, Wang D, Sang Y, et al. New creatinine- and cystatin C-based equations to estimate GFR without race. N Engl J Med. 2021;385(19):1737-49.
22. Delaneau O, Zagury JF, Marchini J. Improved whole-chromosome phasing for disease and population genetic studies. Nat Methods. 2013;10(1):5-6.
23. Howie B, Fuchsberger C, Stephens M, Marchini J, Abecasis GR. Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat Genet. 2012;44(8):955-9.
24. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559-75.
25. Turner SD. qqman: an R package for visualizing GWAS results using Q-Q and Manhattan plots. bioRxiv. 2014:005165.
26. R Core Team. R: a language and environment for statistical computing. R Foundation for Statistical Computing; 2021.
27. Gillies CE, Putler R, Menon R, Otto E, Yasutake K, Nair V, et al. An eQTL landscape of kidney tissue in human nephrotic syndrome. Am J Hum Genet. 2018;103(2):232-44.
28. Tajti F, Kuppe C, Antoranz A, Ibrahim MM, Kim H, Ceccarelli F, et al. A functional landscape of CKD entities from public transcriptomic data. Kidney Int Rep. 2020;5(2):211-24.
29. Hishida A, Takashima N, Turin TC, Kawai S, Wakai K, Hamajima N, et al. GCK, GCKR polymorphisms and risk of chronic kidney disease in Japanese individuals: data from the J-MICC Study. J Nephrol. 2014;27(2):143-9.
30. Wang K, Shi M, Yang A, Fan B, Tam CHT, Lau E, et al. GCKR and GCK polymorphisms are associated with increased risk of end-stage kidney disease in Chinese patients with type 2 diabetes: the Hong Kong Diabetes Register (1995-2019). Diabetes Res Clin Pract. 2022;193:110118.
31. Ho LJ, Lu CH, Su RY, Lin FH, Su SC, Kuo FC, et al. Association between glucokinase regulator gene polymorphisms and serum uric acid levels in Taiwanese adolescents. Sci Rep. 2022;12(1):5519.
32. Matsuo H, Yamamoto K, Nakaoka H, Nakayama A, Sakiyama M, Chiba T, et al. Genome-wide association study of clinically defined gout identifies multiple risk loci and its association with clinical subtypes. Ann Rheum Dis. 2016;75(4):652-9.
33. Yuan S, Larsson SC. Coffee and caffeine consumption and risk of kidney stones: a Mendelian randomization study. Am J Kidney Dis. 2022;79(1):9-14.e1.
34. Oh H, Kwon O, Kong MJ, Park KM, Baek JH. Macrophages promote fibrinogenesis during kidney injury. Front Med. 2023;10:1206362.
35. Cabrera-Serrano M, Caccavelli L, Savarese M, Vihola A, Jokela M, Johari M, et al. Bi-allelic loss-of-function OBSCN variants predispose individuals to severe recurrent rhabdomyolysis. Brain. 2022;145(11):3985-98.
36. Stahl K, Rastelli E, Schoser B. A systematic review on the definition of rhabdomyolysis. J Neurol. 2020;267(4):877-82.
37. Rawson ES, Clarkson PM, Tarnopolsky MA. Perspectives on exertional rhabdomyolysis. Sports Med. 2017;47(Suppl 1):33-49.
38. Sone H, Lee TJ, Lee BR, Heo D, Oh S, Kwon SH. MicroRNA-mediated attenuation of branched-chain amino acid catabolism promotes ferroptosis in chronic kidney disease. Nat Commun. 2023;14(1):7814.
39. Stavarachi M, Panduru NM, Serafinceanu C, Moţa E, Moţa M, Cimponeriu D, et al. Investigation of P213S SELL gene polymorphism in type 2 diabetes mellitus and related end stage renal disease. A case-control study. Rom J Morphol Embryol. 2011;52(3 Suppl l):995-8.
40. Baker MA, Wang F, Liu Y, Kriegel AJ, Geurts AM, Usa K, et al. MiR-192-5p in the kidney protects against the development of hypertension. Hypertension. 2019;73(2):399-406.
41. Agrawal PB, Greenleaf RS, Tomczak KK, Lehtokari VL, Wallgren-Pettersson C, Wallefeld W, et al. Nemaline myopathy with minicores caused by mutation of the CFL2 gene encoding the skeletal muscle actin-binding protein, cofilin-2. Am J Hum Genet. 2007;80(1):162-7.
42. Hari P, Bagga A, Mahajan P, Lakshmy R. Effect of malnutrition on serum creatinine and cystatin C levels. Pediatr Nephrol. 2007;22(10):1757-61.
43. Li GS, Zhu L, Zhang H, Lv JC, Ding JX, Zhao MH, et al. Variants of the ST6GALNAC2 promoter influence transcriptional activity and contribute to genetic susceptibility to IgA nephropathy. Hum Mutat. 2007;28(10):950-7.
44. Randi EB, Vervaet B, Tsachaki M, Porto E, Vermeylen S, Lindenmeyer MT, et al. The antioxidative role of cytoglobin in podocytes: implications for a role in chronic kidney disease. Antioxid Redox Signal. 2020;32(16):1155-71.
45. GTEx Consortium; Laboratory, Data Analysis & Coordinating Center (LDACC)—Analysis Working Group; Statistical Methods groups—Analysis Working Group; Enhancing GTEx (eGTEx) groups; NIH Common Fund; NIH/NCI; et al. Genetic effects on gene expression across human tissues [published correction appears in Nature. 2017 Dec 20;:]. Nature. 2017;550(7675):204-13.
46. Hurskainen T, Moilanen J, Sormunen R, Franzke CW, Soininen R, Loeffek S, et al. Transmembrane collagen XVII is a novel component of the glomerular filtration barrier. Cell Tissue Res. 2012;348(3):579-88.
47. Ringnér M. What is principal component analysis? Nat Biotechnol. 2008;26(3):303-4.
48. Tran NK, Lea RA, Holland S, Nguyen Q, Raghubar AM, Sutherland HG, et al. Multi-phenotype genome-wide association studies of the Norfolk Island isolate implicate pleiotropic loci involved in chronic kidney disease. Sci Rep. 2021;11(1):19425.
49. Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, et al. Finding the missing heritability of complex diseases. Nature. 2009;461(7265):747-53.
50. Marouli E, Graff M, Medina-Gomez C, Lo KS, Wood AR, Kjaer TR, et al. Rare and low-frequency coding variants alter human adult height. Nature. 2017;542(7640):186-90.
51. Mola-Caminal M, Carrera C, Soriano-Tárraga C, Giralt-Steinhauer E, Díaz-Navarro RM, Tur S, et al. PATJ low frequency variants are associated with worse ischemic stroke functional outcome. Circ Res. 2019;124(1):114-20.
52. Akiyama M, Ishigaki K, Sakaue S, Momozawa Y, Horikoshi M, Hirata M, et al. Characterizing rare and low-frequency height-associated variants in the Japanese population [published correction appears in Nat Commun. 2020 Mar 9;11(1):1350]. Nat Commun. 2019;10(1):4393.
53. Pang H, Xia Y, Luo S, Huang G, Li X, Xie Z, et al. Emerging roles of rare and low-frequency genetic variants in type 1 diabetes mellitus. J Med Genet. 2021;58(5):289-96.
54. Chang X, Gurung RL, Wang L, Jin A, Li Z, Wang R, et al. Low frequency variants associated with leukocyte telomere length in the Singapore Chinese population. Commun Biol. 2021;4(1):519.
55. Clarelli F, Barizzone N, Mangano E, Zuccalà M, Basagni C, Anand S, et al. Contribution of rare and low-frequency variants to multiple sclerosis susceptibility in the Italian continental population. Front Genet. 2021;12:800262.
56. Alicic RZ, Rooney MT, Tuttle KR. Diabetic kidney disease: challenges, progress, and possibilities. Clin J Am Soc Nephrol. 2017;12(12):2032-45.
57. Christensen PK, Gall MA, Parving HH. Course of glomerular filtration rate in albuminuric type 2 diabetic patients with or without diabetic glomerulopathy. Diabetes Care. 2000;23(Suppl 2):B14-20.
58. Hovind P, Rossing P, Tarnow L, Smidt UM, Parving HH. Progression of diabetic nephropathy. Kidney Int. 2001;59(2):702-9.
59. Zoppini G, Targher G, Chonchol M, Ortalda V, Negri C, Stoico V, et al. Predictors of estimated GFR decline in patients with type 2 diabetes and preserved kidney function. Clin J Am Soc Nephrol. 2012;7(3):401-8.
60. Tonneijck L, Muskiet MH, Smits MM, van Bommel EJ, Heerspink HJL, van Raalte DH, et al. Glomerular hyperfiltration in diabetes: mechanisms, clinical significance, and treatment. J Am Soc Nephrol. 2017;28(4):1023-39.
61. Delgado C, Baweja M, Crews DC, Eneanya ND, Gadegbeku CA, Inker LA, et al. A unifying approach for GFR estimation: recommendations of the NKF-ASN task force on reassessing the inclusion of race in diagnosing kidney disease. Am J Kidney Dis. 2022;79(2):268-88.e1.