Introduction: Rheumatoid arthritis (RA) has become a serious threat to human health and quality of life worldwide. Previous studies have demonstrated that genetic factors play a crucial role in the onset and progression of RA. Owing to the rapid development of genome-wide association studies (GWAS) and large-scale genetic analysis, GWAS research on RA has received widespread attention in recent years. We therefore conducted a comprehensive visualization and bibliometric analysis of publications to identify hotspots and future trends in GWAS research on RA. Methods: Literature on RA and GWAS published between 2002 and 2024 was extracted from the Web of Science Core Collection database by strategic screening. The collected data were analyzed using VOSviewer, CiteSpace, and Excel. The collaboration networks of countries, authors, and institutions and the co-citation networks of publications were visualized. Finally, research hotspots and fronts were examined. Results: A total of 713 publications with 45,773 citations were identified. The numbers of publications and citations have surged since 2007. The United States contributed the most publications globally. Okada, Yukinori, was the most influential author. The most productive institution in this field was the University of Manchester. The analysis of keywords revealed that “mendelian randomization analysis”, “association”, “innate”, “instruments”, “bias”, “pathogenesis”, and “genome-wide association study” are likely to be the frontiers of research in this field. Conclusion: This study can help anticipate future research directions in GWAS on RA and promote academic collaboration among scholars.

Rheumatoid arthritis (RA) is a chronic heterogeneous autoimmune disease characterized by progressive and symmetric inflammation of the affected joints leading to irreversible cartilage destruction, bone erosion, and disability, imposing an enormous burden on individuals and society [1]. Previous studies have demonstrated that genetic factors play a crucial role in the onset and progression of RA [2]. Research in the Finnish and British populations has shown the heritability of RA – which measures how much of the disease risk is explained by genetic factors – to be approximately 65% and 53%, respectively [3]. RA can be classified into two main subtypes based on the presence or absence of specific autoantibodies: seropositive RA and seronegative RA. Seropositive RA is characterized by the presence of rheumatoid factor and/or anti-citrullinated protein antibodies, while seronegative RA lacks these autoantibodies. A multiomics analysis based on the Northwest European genome-wide association study (GWAS) identified a total of 37 sequence variants, including 33 variants associated with seropositive RA [4]. These findings provide valuable insights into the genetic underpinnings of RA, particularly within the seropositive subset, which typically presents with more severe disease progression and joint damage compared to seronegative RA. In addition, this study also revealed that pathogenic genes from the JAK/STAT pathway increased the risk of seropositive RA by 2.27-fold [4]. Thus, investigating this disease from a genetic perspective would contribute to our understanding of its etiology. In the past 20 years, our understanding of the genetic structure of RA has significantly increased. This was primarily due to the development of novel gene typing technologies and statistical tools, which facilitated the implementation of large-scale genetic analysis, particularly GWASs. 
GWAS uses simple linear models to associate genetic variants with disease phenotypes, allowing for the identification of thousands of genetic variants associated with complex traits [5]. Advanced GWAS techniques can also be combined with multi-omics to provide more comprehensive clues to the pathogenesis of disease. However, the rapid growth of GWAS has led to a proliferation of studies, some of them redundant, making it challenging to grasp the full scope of GWAS research on RA. This wealth of information highlights the need for effective management and synthesis to maintain a comprehensive understanding of the current research landscape and to identify emerging trends and knowledge gaps. To address this issue, we used bibliometric analysis, a convenient and accurate tool to summarize the quantity and quality of papers published in different countries/regions, institutions, authors, and fields [6]. A systematic analysis of GWAS publications on RA, including research milestones, collaborative networks, and emerging research hotspots, provides valuable insights to guide future studies on methodological advances and research priorities in this field. To date, however, no comprehensive visualization and bibliometric analysis has been conducted in the field of GWAS on RA. This study used the Web of Science Core Collection (WoSCC) to perform a comprehensive visualization and bibliometric analysis of research on RA and GWAS over the past 22 years, aiming to identify hotspots and development trends for future GWAS on RA.

Search Strategy

The Web of Science is a leading research platform in the life sciences, physical sciences, and technology, and it includes more than 11,000 leading academic journals [7]. Compared to other databases, including Scopus, MEDLINE, and PubMed, the Web of Science provides a more comprehensive and reliable platform for bibliometric analysis [8]. As shown in Figure 1, we used the WoSCC database as the data source for this study. Within WoSCC, we specifically utilized the Science Citation Index Expanded (SCI-EXPANDED) and Social Sciences Citation Index (SSCI) to identify relevant articles. SCI-EXPANDED provides comprehensive coverage of high-impact, peer-reviewed scientific journals with complete citation linkages, standardized author information, and detailed bibliometric data across natural sciences, engineering, and biomedical research [9]. SSCI complements this coverage by focusing on high-quality social science literature, offering extensive citation networks, author profiles, and institutional affiliations, which are particularly valuable for interdisciplinary research in public health and medical sociology [9]. Our research focused on articles and reviews published in English, retrieved with a topic search. The search terms used were as follows: #1: Topic TS = (“Association Stud*, Genome-Wide” OR “Genome-Wide Association Stud*” OR “Study*, Genome-Wide Association” OR “Whole Genome Association Analysis” OR “GWA Stud*” OR “Stud*, GWA” OR “Whole Genome Association Study” OR “Genome-Wide Association Scan” OR “Genome-Wide Association Stud*” OR “Genome-Wide Association Analysis”); #2 TS=(“Arthritis, Rheumatoid” OR “Rheumatoid Arthritis”); Find data source: #1 AND #2. The period of study was from 2002 to 2024. Studies that met the following criteria were included: (1) published between January 1, 2002, and November 25, 2024, (2) focused on GWAS and RA, (3) published as articles or reviews, (4) indexed in SCI-EXPANDED or SSCI, (5) published in English.
The exclusion criteria were: (1) duplicate publications, (2) non-articles or non-review formats, and (3) titles and abstracts that did not mention GWAS and RA. Two researchers screened the literature by examining the abstracts based on these inclusion and exclusion criteria.
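As an illustration of how the two term sets above combine, the following minimal Python sketch assembles the #1 AND #2 topic query as a string. The helper names (`topic_clause`, `combine_and`) are hypothetical and not part of any Web of Science tooling; only the search terms themselves come from the strategy described above. Because “Genome-Wide Association Stud*” appears twice in the listed terms, the sketch drops the repeat.

```python
# Hypothetical helpers for illustration; only the search terms come from the text.
GWAS_TERMS = [
    "Association Stud*, Genome-Wide",
    "Genome-Wide Association Stud*",
    "Study*, Genome-Wide Association",
    "Whole Genome Association Analysis",
    "GWA Stud*",
    "Stud*, GWA",
    "Whole Genome Association Study",
    "Genome-Wide Association Scan",
    "Genome-Wide Association Stud*",  # listed twice in the strategy; dropped below
    "Genome-Wide Association Analysis",
]
RA_TERMS = ["Arthritis, Rheumatoid", "Rheumatoid Arthritis"]


def topic_clause(terms):
    """Join quoted terms with OR inside a TS=(...) clause, dropping repeats."""
    unique = list(dict.fromkeys(terms))  # preserves first-seen order
    quoted = " OR ".join(f'"{term}"' for term in unique)
    return f"TS=({quoted})"


def combine_and(*clauses):
    """Wrap each clause in parentheses and join them with AND."""
    return " AND ".join(f"({clause})" for clause in clauses)


query = combine_and(topic_clause(GWAS_TERMS), topic_clause(RA_TERMS))
print(query)
```

Document type, language, index, and date restrictions are not encoded here; in this sketch they are assumed to be applied as separate filters on the retrieved records.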

Fig. 1.

Flow diagram of the inclusion process. A total of 1,143 publications on the topic of GWAS on RA were retrieved from the WoSCC database. After a manual screening process, 430 publications were excluded, resulting in a final dataset of 713 publications for bibliometric and visualization analysis.

A total of 1,143 records were obtained, and 713 records were retained after screening and exported as full records with cited references on November 25, 2024. The records were saved both as plain text files and as tab-delimited files.
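For readers who wish to inspect such an export outside VOSviewer or CiteSpace, the sketch below parses a plain text export into one dictionary per record. It assumes the usual field-tagged layout of Web of Science plain text files (a two-letter tag at the start of a line, continuation lines indented by three spaces, `ER` closing each record, and `FN`/`VR`/`EF` as file-level markers). The function name and the sample records are made up for illustration and are not taken from the study data.

```python
def parse_wos_plaintext(text):
    """Parse a field-tagged plain text export into a list of dicts.

    Each dict maps a two-letter tag (e.g., AU, TI, PY) to a list of values.
    Assumed layout: continuation lines start with three spaces, 'ER' ends a record.
    """
    records, current, tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank separator between records
        if line.startswith("   "):
            if tag is not None:  # continuation of the previous field
                current[tag].append(line.strip())
            continue
        tag, _, value = line.partition(" ")
        if tag == "ER":  # end of record
            records.append(current)
            current, tag = {}, None
        elif tag in ("FN", "VR", "EF"):  # file header and footer markers
            tag = None
        else:
            current.setdefault(tag, []).append(value.strip())
    return records


# Made-up sample in the assumed layout (not real study records).
SAMPLE = """FN Clarivate Analytics Web of Science
VR 1.0
PT J
AU Author, A
   Author, B
TI A made-up title
PY 2014
ER

PT J
AU Author, C
TI Another made-up title
PY 2021
ER

EF
"""

records = parse_wos_plaintext(SAMPLE)
print(len(records), records[0]["AU"])
```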

Data Analysis

The plain text files were imported into VOSviewer (version 1.6.16) and CiteSpace (version 6.3.R1). VOSviewer is recognized for its capabilities in mapping and visualizing scientific knowledge, highlighting the structure, evolution, and collaboration within specific knowledge domains through graphical display and large-scale data analysis [10]. In our study, the software was used for the following analyses: countries/regions, authors, institutions, journals and co-cited journals, and keyword co-occurrence. The visualizations include keywords and countries with more than 5 occurrences and authors and institutions ranked within the top 100 by publication volume. The visual network representing authors and institutions is further enhanced with a temporal overlay, where lighter shades denote more recent publications. In the generated maps, nodes represent entities such as countries, institutions, journals, and authors, with their size corresponding to the volume of related publications. The thickness of the lines connecting nodes reflects the degree of co-citation or collaboration among them.

CiteSpace provides valuable insights into emerging trends and dynamic changes within research hotspots and disciplines [11]. In our study, CiteSpace was used to create dual-map overlays of journals and to visualize the keywords and references with citation bursts. Parameters were set as follows: timespan: 2002–2024 (slice length = 1), selection criteria: g-index (k = 25). Additionally, Microsoft Excel 2020 was used to conduct a quantitative analysis of the publications.
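To make the g-index selection criterion more concrete, the sketch below computes a scaled g-index for a list of citation counts. It assumes the commonly described definition in which g is the largest rank such that g² ≤ k · (sum of the citations of the top g items), with k as the scale factor. This is an illustrative reading of the criterion, not CiteSpace's actual implementation, and the function name is hypothetical.

```python
def scaled_g_index(citations, k=1):
    """Largest rank g such that g*g <= k * (sum of the top-g citation counts).

    Assumption: this mirrors the scaled g-index criterion as commonly described;
    it is not taken from CiteSpace's source code.
    """
    ranked = sorted(citations, reverse=True)
    g, running_total = 0, 0
    for rank, count in enumerate(ranked, start=1):
        running_total += count
        if rank * rank <= k * running_total:
            g = rank
    return g


toy_counts = [3, 1, 0, 0]  # made-up citation counts for one time slice
print(scaled_g_index(toy_counts, k=1), scaled_g_index(toy_counts, k=25))
```

Under this reading, a larger k admits more items per time slice, which is why the scale factor controls the size of the resulting network.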

Bibliometric Indicators

We utilized several bibliometric indicators to quantify the impact and trends within the field. These indicators include (1) Normalized Citations: a metric that adjusts citation counts to reflect a publication’s citation impact relative to the average for its field, accounting for disciplinary citation practices; (2) total link strength (TLS): this indicates the overall connectivity of a node, reflecting the extent of its collaborations or co-citations; (3) burst strength: this detects significant increases in term frequency, highlighting emerging research trends or hotspots; (4) links: these represent the connections between nodes, such as co-authorships or co-citations, indicating the network of research relationships; (5) publications: this refers to the total articles published; (6) citations: this denotes the number of times an article has been cited; (7) occurrences: this refers to the frequency of specific terms or keywords in the articles.
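The link-based indicators above can be illustrated with a small sketch. It counts links and total link strength from per-paper lists of collaborating entities using full counting (each shared paper adds 1 to the strength of a pair), and it normalizes citations by the mean citation count of documents from the same publication year. The year-based normalization is an assumption made for this example, since the text defines the indicator only as relative to a field average. Function names and toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def link_stats(papers):
    """Return (links, total_link_strength) per node.

    papers: iterable of collections of node names (e.g., countries on one paper).
    Full counting is assumed: each shared paper adds 1 to a pair's link weight.
    """
    weights = defaultdict(int)
    for nodes in papers:
        for a, b in combinations(sorted(set(nodes)), 2):
            weights[(a, b)] += 1
    links, tls = defaultdict(int), defaultdict(int)
    for (a, b), weight in weights.items():
        for node in (a, b):
            links[node] += 1      # one more distinct partner
            tls[node] += weight   # add the strength of this link
    return dict(links), dict(tls)


def normalized_citations(docs):
    """docs: list of (year, citations); divide each count by its year's mean."""
    by_year = defaultdict(list)
    for year, count in docs:
        by_year[year].append(count)
    means = {year: sum(v) / len(v) for year, v in by_year.items()}
    return [count / means[year] if means[year] else 0.0 for year, count in docs]


# Toy data, not taken from the study.
toy_papers = [{"US", "UK"}, {"US", "UK", "Japan"}, {"PRC"}]
links, tls = link_stats(toy_papers)
print(links, tls)
print(normalized_citations([(2010, 30), (2010, 10), (2020, 4)]))
```

In the toy data, the single-country paper contributes a publication but no links, so that country does not appear in either dictionary.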

Trends in Annual Publications

A visualization of the number of publications and frequency of citations could provide a clear picture of the pace and trends of research in the field. A total of 713 articles were included in this study between 2002 and 2024 and received 45,773 citations. As shown in Figure 2, the number of annual publications and annual citations in the field of RA and GWAS shows an increasing trend. Notably, there was a significant surge in the number of publications from 2007 to 2009, with 2024 marking the peak year for article output.

Fig. 2.

Annual number of publications and citations in RA and GWAS.

Contribution of Countries/Regions

A total of 66 countries/regions contributed to GWAS publications on RA between 2002 and 2024. Table 1 lists the top 10 most productive countries/regions. The United States (US) was the most productive country (n = 263, 36.89%), followed by the People’s Republic of China (PRC) (n = 207, 29.03%) and the United Kingdom (UK) (n = 157, 22.02%). Notably, the US and UK also had higher Normalized Citations (US = 359.91, UK = 319.56), ranking them first and second in this metric. Figure 3a shows the geographical cooperation map of GWAS on RA in various regions. The US (TLS = 397) and UK (TLS = 403) had strong international collaboration networks, as indicated by their high TLS that were, respectively, 2.1 and 2.2 times higher than the average TLS of the top 10 productive countries.
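The percentage shares quoted above follow directly from the publication counts and the dataset size of 713. A short check in Python (the variable names are illustrative only):

```python
TOTAL_PUBLICATIONS = 713  # final dataset size reported in the study

# Publication counts for the three most productive countries, from the text above.
top_three = {"US": 263, "PRC": 207, "UK": 157}

shares = {
    country: round(100 * count / TOTAL_PUBLICATIONS, 2)
    for country, count in top_three.items()
}
print(shares)  # {'US': 36.89, 'PRC': 29.03, 'UK': 22.02}
```

Because internationally co-authored papers are counted once for each contributing country, shares computed this way can sum to more than 100% across all countries.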

Table 1.

Top 10 most productive countries/regions

| Countries | Publications | Citations | Normalized Citations | TLS |
| --- | --- | --- | --- | --- |
| US | 263 | 22,977 | 359.91 | 397 |
| PRC | 207 | 7,800 | 226.76 | 85 |
| UK | 157 | 33,351 | 319.56 | 403 |
| Japan | 86 | 5,702 | 97.75 | 80 |
| Sweden | 55 | 8,119 | 143.23 | 228 |
| Spain | 54 | 5,917 | 94.62 | 142 |
| The Netherlands | 51 | 9,695 | 137.54 | 208 |
| Germany | 44 | 3,636 | 99.34 | 158 |
| Canada | 35 | 7,446 | 107.13 | 139 |
| Italy | 35 | 4,027 | 69.46 | 121 |

Publications, total number of published articles; citations, total number of times an article has been cited; Normalized Citations, citation counts adjusted to reflect a publication’s impact relative to the field average; TLS, total link strength, overall connectivity of a node, reflecting the extent of collaborations or co-citations.

Fig. 3.

Collaboration networks among countries/regions, institutions, and authors in RA and GWAS. a Geographical collaboration map of countries/regions. The size of the nodes indicates the number of publications from each country, and the thickness of the connecting lines indicates the intensity of cooperation between countries. b Visual map of the collaboration network among institutions. Nodes and links are distinguished by color, where lighter colors indicate newer article publication and partnerships. c Visual map of the collaboration network among the authors.

Contribution of Institutions

A total of 1,308 institutions contributed to GWAS publications on RA between 2002 and 2024. Table 2 lists the top 20 most productive institutions. The University of Manchester had the most publications (n = 50, 7.01%), followed by the University of Tokyo (n = 34, 4.77%) and Harvard University (n = 33, 4.63%). Moreover, the University of Manchester also had the highest Normalized Citations (61.80) and TLS (152). Figure 3b provides a visualization of the collaboration network among the top 100 institutions, which contains 4,948 links, illustrating the evolution of institution contributions over time. The field was initiated by 12 pioneering institutions before 2012, with the University of California San Francisco (average publication year = 2011.53, citations = 2,066) and University of Groningen (average publication year = 2011.38, citations = 2,499) making significant contributions. During 2012–2014, 25 institutions emerged, led by Harvard University (average publication year = 2012.24, citations = 4,085) and the University of Manchester (average publication year = 2013.72, citations = 4,064), establishing major research networks. The period of 2014–2016 saw 28 new institutions becoming active, notably the University of Tokyo (average publication year = 2014.09, citations = 2,117) and Karolinska Institute (average publication year = 2015.73, citations = 2,556), expanding the field into East Asia. The most recent phase (after 2016) featured 35 institutions, with Soochow University (average publication year = 2020.75), Xi'an Jiaotong University (average publication year = 2022.08), and Zhejiang University (average publication year = 2019.33, citations = 1,760) leading a strong expansion of Chinese research networks. This evolution demonstrates a gradual geographical diversification of GWAS research on RA, with distinct but interconnected regional clusters forming over time.

Table 2.

Top 20 most productive institutions

| Institutions | Publications | Citations | Normalized Citations | TLS | Countries |
| --- | --- | --- | --- | --- | --- |
| University of Manchester | 50 | 4,064 | 61.80 | 152 | UK |
| University of Tokyo | 34 | 2,117 | 36.02 | 68 | Japan |
| Harvard University | 33 | 4,085 | 57.32 | 128 | US |
| Karolinska Institute | 30 | 2,556 | 61.78 | 118 | Sweden |
| Anhui Medical University | 23 | 1,082 | 23.32 | 17 | PRC |
| RIKEN | 22 | 1,441 | 20.07 | 60 | Japan |
| University of Oxford | 21 | 2,966 | 39.04 | 56 | UK |
| Soochow University | 20 | 563 | 22.37 | 15 | PRC |
| Brigham and Women’s Hospital | 18 | 2,656 | 34.41 | 91 | US |
| University of Cambridge | 18 | 1,901 | 55.77 | 61 | UK |
| King’s College London | 16 | 893 | 41.80 | 58 | UK |
| Leiden University | 16 | 2,462 | 33.01 | 91 | The Netherlands |
| CSIC | 15 | 918 | 16.03 | 55 | Spain |
| Tokyo Medical and Dental University | 15 | 606 | 12.00 | 41 | Japan |
| The University of California, San Francisco | 15 | 2,066 | 25.90 | 68 | US |
| The University of Chicago | 15 | 1,268 | 22.32 | 28 | US |
| Broad Institute | 14 | 2,124 | 28.06 | 99 | US |
| Osaka University | 14 | 807 | 24.75 | 35 | Japan |
| University of California | 14 | 1,305 | 15.79 | 41 | US |
| University of Pennsylvania | 14 | 981 | 20.69 | 57 | US |

Publications, total number of published articles; citations, total number of times an article has been cited; Normalized Citations, citation counts adjusted to reflect a publication’s impact relative to the field average; TLS, total link strength, overall connectivity of a node, reflecting the extent of collaborations or co-citations; countries, geographical locations where the research was conducted.

Contribution of Authors

A total of 3,841 authors contributed to GWAS publications on RA between 2002 and 2024. Table 3 lists the top 10 most productive authors. The most productive author was Okada, Yukinori (n = 21), followed by Barton, Anne (n = 20), and Kochi, Yuta (n = 17). Okada, Yukinori, also had the highest Normalized Citations (33.07) and TLS (60). Figure 3c provides a visualization of the collaboration network among the top 100 authors, which contains 4,943 links, illustrating the evolution of author contributions over time. The field was pioneered by several influential researchers before 2012, with Katherine Siminovitch (average publication year = 2010.25, citations = 771) and Robert Plenge (average publication year = 2011.7, citations = 975) making substantial early contributions. During 2012–2014, the research network expanded significantly, led by Anne Barton (average publication year = 2013.1, citations = 1,536) and Jane Worthington (average publication year = 2012.58, citations = 1,394), who established major collaborative networks. The period of 2014–2016 saw the emergence of key researchers like Yukinori Okada (average publication year = 2016.09, citations = 1,371) and Leonid Padyukov (average publication year = 2014.83, citations = 427), expanding the field’s focus into Asian populations. The most recent phase (after 2016) featured researchers like Jing Ni (average publication year = 2022, citations = 100), Sen Yang (average publication year = 2013.33, citations = 140), and Shu-feng Lei (average publication year = 2018.87, citations = 109) leading a new wave of research networks. This evolution demonstrates a gradual diversification of RA genetic research, with distinct but interconnected collaborative clusters forming across different geographical regions over time.

Table 3.

Top 10 most productive authors

| Authors | Publications | Citations | TLS | Normalized Citations | Countries |
| --- | --- | --- | --- | --- | --- |
| Okada, Yukinori | 21 | 1,371 | 60 | 33.07 | Japan |
| Barton, Anne | 20 | 1,536 | 48 | 21.81 | UK |
| Kochi, Yuta | 17 | 898 | 59 | 19.62 | Japan |
| Yamamoto, Kazuhiko | 14 | 823 | 59 | 17.44 | Japan |
| Worthington, Jane | 12 | 1,394 | 47 | 17.43 | UK |
| Gregersen, Peter K | 11 | 1,598 | 22 | 21.15 | US |
| Suzuki, Akari | 11 | 774 | 52 | 16.30 | Japan |
| Orozco, Gisela | 10 | 318 | 21 | 9.09 | UK |
| Plenge, Robert M | 10 | 975 | 20 | 13.53 | US |
| Bae, Sang-Cheol | | 388 | 14 | 8.55 | South Korea |

Publications, total number of published articles; citations, total number of times an article has been cited; TLS, total link strength, overall connectivity of a node, reflecting the extent of collaborations or co-citations; Normalized Citations, citation counts adjusted to reflect a publication’s impact relative to the field average; countries, geographical locations where the research was conducted.

Distribution of Journals and Co-Cited Journals

The top 10 most productive journals published 189 articles, accounting for 26.51% of the total publications. As shown in online supplementary Table S1 (for all online suppl. material, see https://doi.org/10.1159/000543947), Frontiers in Immunology was the journal with the highest number of published articles (n = 26), followed closely by Nature Genetics (n = 25) and PLOS ONE (n = 23). Among these, the journal with the highest impact factor (IF) was Nature Genetics (IF = 31.7). Among the co-cited journals, the New England Journal of Medicine (IF = 96.2) had the highest IF. Moreover, the identification of shared research hotspots and distinct research directions between citing journals and cited journals provides insights into the interconnections and impacts among journals. In online supplementary Figure S1, the citing journals are displayed on the left side of the map, while the cited journals are shown on the right side. The connecting lines indicate that research on certain topics in the cited journals is frequently cited by research on certain topics in the citing journals, and the width of each line reflects the frequency of citations. As shown in online supplementary Figure S1, the yellow and green citation paths indicated that research published in the Molecular Biology and Immunology journals, as well as the Medicine, Medical, and Clinical journals, generally referenced studies published in the Molecular Biology and Genetics journals.

Analysis of Keywords

In the analysis of keywords, only keywords that appeared at least 5 times were considered. As a result, a total of 212 keywords and 7 clusters were obtained, as shown in Figure 4a and online supplementary Table S2. Online supplementary Table S3 shows the 20 most frequently occurring keywords. In addition to “rheumatoid arthritis” (n = 503) and “genome-wide association study” (n = 477), the keywords that appeared most frequently in our dataset included “susceptibility” (n = 246), “risk locus” (n = 231), “systemic lupus erythematosus” (n = 146), “disease” (n = 116), “variants” (n = 111), and “mendelian randomization analysis” (n = 103). Figure 4b shows the 25 keywords with the strongest citation bursts. Notably, “mendelian randomization analysis” (33.94) had the strongest citation burst, followed by “instruments” (15.09) and “risk locus” (10.05). The bursts for “single nucleotide polymorphisms,” “ptpn22”, “multiple sclerosis,” “crohns disease,” “region,” “lymphoid tyrosine phosphatase,” “itgam”, “risk locus,” “polymorphism,” and “nf-kappa-b” ended before 2013. The keywords “mendelian randomization analysis,” “association,” “innate,” “instruments,” “bias,” “pathogenesis,” and “genome-wide association study” have gained significant popularity from 2021 to the present, suggesting a current surge of interest in these topics.

Fig. 4.

Hotspot and frontier analyses in RA and GWAS. a Co-occurrence network of keywords in RA and GWAS. The keywords were divided into seven categories. b Twenty-five keywords with the strongest citation bursts in RA and GWAS. The blue color indicates the timeline, and the red segments on the blue timeline indicate the start year, the end year, and the duration of the burst. c Twenty-five references with the strongest citation bursts in RA and GWAS.

Analysis of Articles and References

The citation patterns of the articles and references revealed insights into the structure and dynamics of scientific paradigms. Online supplementary Table S4 lists the top 15 most frequently cited articles [12‒26]. We also used the reference citation bursts in Figure 4c to identify the major milestones that steered research trends in GWAS on RA [12, 13, 15, 27‒48]. “Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls” [12] was the most frequently cited article (n = 7,417), with a substantial citation burst (strength = 24.85) during 2008–2012, marking a foundational study in the field. “Genetics of rheumatoid arthritis contributes to biology and drug discovery” [13] emerged as the second most frequently cited article (n = 1,658) and demonstrated the strongest recent citation burst (strength = 34.87) from 2015 to 2019, highlighting the translation of genetic findings into therapeutic applications. “Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets” [14] ranked third in citations (n = 1,431) and maintains high contemporary relevance with a high CSNCR of 22.39. This progression of highly cited works reflects the field’s evolution from large-scale genomic studies to therapeutic applications and multi-omics integration, with consistent citation patterns supporting their foundational role in shaping GWAS research on RA.

General Information

In this study, we used CiteSpace and VOSviewer to review the research results and progress of GWAS on RA from 2002 to 2024. We observed that the number and citation frequency of related publications in this field have generally increased since 2007, with publication output peaking in 2024 (Fig. 2). This trend may reflect growing interest among researchers in applying GWAS to RA, which in turn may be related to significant methodological advances in GWAS. Of the 713 publications included, 540 were research articles, while the remaining 173 included review articles and other types of publications, reflecting a balanced composition of original research and synthetic analyses in this field. This distribution suggests both active ongoing research and sufficient scholarly effort to synthesize and summarize the accumulated findings. Online supplementary Figure S1 shows that publications in journals of basic science (Molecular Biology and Genetics) were mainly cited by publications in journals related to basic science (Molecular Biology and Immunology) and clinical medicine (medicine, medical clinical). GWAS research on RA has been conducted in 64 countries and regions, highlighting the global scope of this scientific endeavor. Regarding the number of publications by country (Table 1), the US leads the field, followed by the PRC and the UK, which rank second and third, respectively. The top 10 most productive countries include six European countries (the UK, Sweden, the Netherlands, Spain, Germany, and Italy), two Asian countries (the PRC and Japan), and two North American countries (the US and Canada). This diverse representation underscores the international collaboration and research efforts within the field of GWAS research on RA.
Among the 20 most productive institutions (Table 2), the largest share (7 of 20) were from the US, reflecting both the strong research infrastructure and substantial investments in genetic research in American institutions. This high representation of US institutions and their significant publication output appear to be mutually reinforcing: strong research capabilities and resources have enabled productive research output, while continuous scientific achievements have likely attracted further investment and talent, creating a positive feedback loop in advancing GWAS research on RA. To determine the cooperative relationships between countries/regions, we constructed the geographic cooperation map (Fig. 3a), which clearly showed that European countries tend to collaborate more frequently. The observed results may be influenced by factors such as geography or language. However, it is important to acknowledge that population differences may also have an impact on the study’s results. Given that the current study sample is predominantly from Europe, future studies would benefit from broadening their focus to include diverse racial and ethnic groups. Among the top 10 most productive authors, Okada, Yukinori, not only has the highest number of publications but also the highest Normalized Citations and TLS. He has published articles in this field since 2012, mainly focusing on RA risk in East Asian populations.

As shown in online supplementary Table S3, Table S4, Figures 4b and c, our bibliometric analysis revealed distinct research phases in GWASs on RA. Keyword analysis and citation patterns identified several research hotspots and emerging trends. The early phase (2005–2013) focused on fundamental genetic associations, as evidenced by the burst of keywords such as “lymphoid tyrosine phosphatase” (strength = 5.07) and “PTPN22” (strength = 5.04). This aligned with high citations of early GWAS papers, particularly the landmark study of seven common diseases [12] (7,417 citations). The field progressed toward cross-disease studies (2008–2014), evidenced by burst terms “multiple sclerosis” (strength = 4.22) and “celiac disease” (strength = 4.22), while research scope expanded to diverse populations, marked by “japanese population” (strength = 3.58). This internationalization was exemplified by Okada et al.’s work [40], which conducted the first cross-ethnic meta-GWAS and identified 42 new risk loci. Recent trends (2015–2024) demonstrate methodological advancement, with strong bursts in “Mendelian randomization analysis” (strength = 33.94) and “meta-analysis” (strength = 4.4). The current frontier emphasizes methodological rigor and translational applications, shown by recent bursts in “instruments” (strength = 15.09) and “pathogenesis” (strength = 4.93). This evolution is supported by highly cited papers bridging genetic findings with therapeutic applications and biological mechanisms, indicating the field’s progression toward integrated approaches combining genetic insights with clinical applications.

However, in-depth research on RA using GWAS must consider various types of genetic variation, gene–environment interactions, and gene–gene interactions. Researchers constantly face challenges such as linkage disequilibrium, genetic heterogeneity, and various biases. These biases stem not only from inherent differences between populations or from the observational design of studies but can also be attributed to the analysis methods employed. Three articles that remain at the research front, as listed in Figure 4c, concern Mendelian randomization analysis and how to address bias [47, 49, 50]. In addition, while the WoSCC is recognized as a robust tool with a central role in academic research, and its data are representative, its coverage may not extend to all publications, especially those in the gray literature. Relevant research is ongoing, and future breakthroughs in this field could further inform our analysis of GWAS research on RA.

This study analyzed open literature data from the Web of Science website and did not involve personal information, so ethical review and informed consent were not required.

The authors declare that they have no competing interests.

This study was supported by the National Natural Science Foundation of China (82103922, 81872681, 31401079, and 81473046), the Science and Technology Project of Suzhou (SS202050, SYS2019024), the QingLan Project of Higher Education of Jiangsu Province, and a Project of the Priority Academic Program Development of Jiangsu Higher Education Institutions.

Wen-Hui Wang analyzed and interpreted the data and was the primary contributor in writing the manuscript. Ming-Hui Xia and Xin-Ru Liu assisted in data organization. Shu-Feng Lei and He Pei provided insights for the manuscript and made revisions. All authors have read and approved the final manuscript.

The data that support the findings of this study are not publicly available because the authors do not have permission to share them, but they are available from the corresponding author (e-mail: [email protected]) upon reasonable request.

1. Smolen JS, Aletaha D, McInnes IB. Rheumatoid arthritis. Lancet. 2016;388(10055):2023–38.
2. Deane KD, Demoruelle MK, Kelmenson LB, Kuhn KA, Norris JM, Holers VM. Genetic and environmental risk factors for rheumatoid arthritis. Best Pract Res Cl Rh. 2017;31(1):3–18.
3. MacGregor AJ, Snieder H, Rigby AS, Koskenvuo M, Kaprio J, Aho K, et al. Characterizing the quantitative genetic contribution to rheumatoid arthritis using data from twins. Arthritis Rheum. 2000;43(1):30–7.
4. Saevarsdottir S, Stefansdottir L, Sulem P, Thorleifsson G, Ferkingstad E, Rutsdottir G, et al. Multiomics analysis of rheumatoid arthritis yields sequence variants that have large effects on risk of the seropositive subset. Ann Rheum Dis. 2022;81(8):1085–95.
5. Ozaki K, Ohnishi Y, Iida A, Sekine A, Yamada R, Tsunoda T, et al. Functional SNPs in the lymphotoxin-alpha gene that are associated with susceptibility to myocardial infarction. Nat Genet. 2002;32(4):650–4.
6. Chen C. Science mapping: a systematic review of the literature. J Data Inf Sci. 2017;2(2):1–40.
7. Singh VK, Singh P, Karmakar M, Leta J, Mayr P. The journal coverage of Web of Science, Scopus and Dimensions: a comparative analysis. Scientometrics. 2021;126(6):5113–42.
8. Yeung AWK. Comparison between Scopus, Web of Science, PubMed and publishers for mislabelled review papers. Curr Sci India. 2019;116(11):1909-+.
9. Li K, Rollins J, Yan E. Web of Science use in published research and review papers 1997-2017: a selective, dynamic, cross-domain, content-based analysis. Scientometrics. 2018;115(1):1–20.
10. van Eck NJ, Waltman L. Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics. 2010;84(2):523–38.
11. Chen CM, Hu ZG, Liu SB, Tseng H. Emerging trends in regenerative medicine: a scientometric analysis in CiteSpace. Expert Opin Biol Th. 2012;12(5):593–608.
12. Burton PR, Clayton DG, Cardon LR, Craddock N, Deloukas P, Duncanson A, et al. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447(7145):661–78.
13. Okada Y, Wu D, Trynka G, Raj T, Terao C, Ikari K, et al. Genetics of rheumatoid arthritis contributes to biology and drug discovery. Nature. 2014;506(7488):376–81.
14. Zhu Z, Zhang F, Hu H, Bakshi A, Robinson MR, Powell JE, et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat Genet. 2016;48(5):481–7.
15. Stahl EA, Raychaudhuri S, Remmers EF, Xie G, Eyre S, Thomson BP, et al. Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci. Nat Genet. 2010;42(6):508–14.
16. Kurilshikov A, Medina-Gomez C, Bacigalupe R, Radjabzadeh D, Wang J, Demirkan A, et al. Large-scale association analyses identify host factors influencing human gut microbiome composition. Nat Genet. 2021;53(2):156–65.
17. O'Shea JJ, Plenge R. JAK and STAT signaling molecules in immunoregulation and immune-mediated disease. Immunity. 2012;36(4):542–50.
18. Dubois PC, Trynka G, Franke L, Hunt KA, Romanos J, Curtotti A, et al. Multiple common variants for celiac disease influencing immune gene expression. Nat Genet. 2010;42(4):295–302.
19. The Wellcome Trust Case Control Consortium; Craddock N, Hurles ME, Cardin N, Pearson RD, Plagnol V, Robson S, et al. Genome-wide association study of CNVs in 16,000 cases of eight common diseases and 3,000 shared controls. Nature. 2010;464(7289):713–20.
20. Liu Y, Helms C, Liao W, Zaba LC, Duan S, Gardner J, et al. A genome-wide association study of psoriasis and psoriatic arthritis identifies new disease loci. PLoS Genet. 2008;4(3):e1000041.
21. Hunt KA, Zhernakova A, Turner G, Heap GA, Franke L, Bruinenberg M, et al. Newly identified genetic risk variants for celiac disease related to the immune response. Nat Genet. 2008;40(4):395–402.
22. Eyre S, Bowes J, Diogo D, Lee A, Barton A, Martin P, et al. High-density genetic mapping identifies new susceptibility loci for rheumatoid arthritis. Nat Genet. 2012;44(12):1336–40.
23. Suhre K, Arnold M, Bhagwat AM, Cotton RJ, Engelke R, Raffler J, et al. Connecting genetic risk to disease end points through the human blood plasma proteome. Nat Commun. 2017;8:14357.
24. Zhernakova A, van Diemen CC, Wijmenga C. Detecting shared pathogenesis from the shared genetics of immune-related diseases. Nat Rev Genet. 2009;10(1):43–55.
25. Australia and New Zealand Multiple Sclerosis Genetics Consortium (ANZgene). Genome-wide association study identifies new multiple sclerosis susceptibility loci on chromosomes 12 and 20. Nat Genet. 2009;41(7):824–8.
26. Parkes M, Cortes A, van Heel DA, Brown MA. Genetic insights into common pathways and complex relationships among immune-mediated diseases. Nat Rev Genet. 2013;14(9):661–73.
27. Begovich AB, Carlton VE, Honigberg LA, Schrodi SJ, Chokkalingam AP, Alexander HC, et al. A missense single-nucleotide polymorphism in a gene encoding a protein tyrosine phosphatase (PTPN22) is associated with rheumatoid arthritis. Am J Hum Genet. 2004;75(2):330–7.
28. Todd JA, Walker NM, Cooper JD, Smyth DJ, Downes K, Plagnol V, et al. Robust associations of four new chromosome regions from genome-wide analyses of type 1 diabetes. Nat Genet. 2007;39(7):857–64.
29. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38(8):904–9.
30. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447(7145):661–78.
31. Plenge RM, Seielstad M, Padyukov L, Lee AT, Remmers EF, Ding B, et al. TRAF1-C5 as a risk locus for rheumatoid arthritis--a genomewide study. N Engl J Med. 2007;357(12):1199–209.
32. Plenge RM, Cotsapas C, Davies L, Price AL, de Bakker PI, Maller J, et al. Two independent alleles at 6q23 associated with risk of rheumatoid arthritis. Nat Genet. 2007;39(12):1477–82.
33. Remmers EF, Plenge RM, Lee AT, Graham RR, Hom G, Behrens TW, et al. STAT4 and the risk of rheumatoid arthritis and systemic lupus erythematosus. N Engl J Med. 2007;357(10):977–86.
34. Thomson W, Barton A, Ke X, Eyre S, Hinks A, Bowes J, et al. Rheumatoid arthritis association at 6q23. Nat Genet. 2007;39(12):1431–3.
35. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75.
36. International HapMap Consortium; Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, et al. A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007;449(7164):851–61.
37. Raychaudhuri S, Remmers EF, Lee AT, Hackett R, Guiducci C, Burtt NP, et al. Common variants at CD40 and other loci confer risk of rheumatoid arthritis. Nat Genet. 2008;40(10):1216–23.
38. Raychaudhuri S, Thomson BP, Remmers EF, Eyre S, Hinks A, Guiducci C, et al. Genetic variants at CD28, PRDM1 and CD2/CD58 are associated with rheumatoid arthritis risk. Nat Genet. 2009;41(12):1313–8.
39. Franke A, McGovern DP, Barrett JC, Wang K, Radford-Smith GL, Ahmad T, et al. Genome-wide meta-analysis increases to 71 the number of confirmed Crohn's disease susceptibility loci. Nat Genet. 2010;42(12):1118–25.
40. Okada Y, Terao C, Ikari K, Kochi Y, Ohmura K, Suzuki A, et al. Meta-analysis identifies nine new loci associated with rheumatoid arthritis in the Japanese population. Nat Genet. 2012;44(5):511–6.
41. Raychaudhuri S, Sandor C, Stahl EA, Freudenberg J, Lee HS, Jia X, et al. Five amino acids in three HLA proteins explain most of the association between MHC and seropositive rheumatoid arthritis. Nat Genet. 2012;44(3):291–6.
42. Stahl EA, Wegmann D, Trynka G, Gutierrez-Achury J, Do R, Voight BF, et al. Bayesian inference analyses of the polygenic architecture of rheumatoid arthritis. Nat Genet. 2012;44(5):483–9.
43. Jostins L, Ripke S, Weersma RK, Duerr RH, McGovern DP, Hui KY, et al. Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature. 2012;491(7422):119–24.
44. Trynka G, Sandor C, Han B, Xu H, Stranger BE, Liu XS, et al. Chromatin marks identify critical cell types for fine mapping complex trait variants. Nat Genet. 2013;45(2):124–30.
45. Farh KK, Marson A, Zhu J, Kleinewietfeld M, Housley WJ, Beik S, et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature. 2015;518(7539):337–43.
46. Verbanck M, Chen CY, Neale B, Do R. Publisher Correction: detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet. 2018;50(8):1196.
47. Hemani G, Zheng J, Elsworth B, Wade KH, Haberland V, Baird D, et al. The MR-Base platform supports systematic causal inference across the human phenome. Elife. 2018;7:e34408.
48. Burgess S, Davey Smith G, Davies NM, Dudbridge F, Gill D, Glymour MM, et al. Guidelines for performing Mendelian randomization investigations: update for summer 2023. Wellcome Open Res. 2019;4:186.
49. Verbanck M, Chen CY, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet. 2018;50(5):693–8.
50. Burgess S, Thompson SG. Interpreting findings from Mendelian randomization using the MR-Egger method. Eur J Epidemiol. 2017;32(5):377–89.