Abstract
Background: The Aromatoleum/Azoarcus/Thauera (AAT) cluster comprises anaerobic degradation specialists (Aromatoleum, Thauera) and N2-fixing endophytes (Azoarcus). Omics-based and genetic studies with associated model strains implicate stringent response (SR) in adaptation to nutrient limitation and plant colonization. SR is well studied in standard bacteria such as Escherichia coli and is known as an adaptive strategy to nutrient limitation that adjusts, e.g., transcription and stress response. SR involves the alarmone (p)ppGpp, whose cellular level is controlled by the synthetases/hydrolases RelA/SpoT, and the noncanonical transcription factor DksA, whose interaction with RNA polymerase (RNAP) is enhanced by binding of (p)ppGpp.
Summary: DksA-mediated SR occurs across Proteobacteria and other phylogenetic groups such as Myxococcia and Spirochaetia, where it has mostly been studied in pathogens. Furthermore, all three DksA variants (four, two, or one cysteine residue(s) for Zn2+-binding) were found. Genes encoding SR components are present in all 37 studied genomes representing 31 species from the AAT cluster. Each genome encodes a synthesizing RelA, a hydrolyzing SpoT, a four cysteine-containing DksA, and mostly also a one cysteine-containing DksA. Opposing functions of RelA and SpoT in Aromatoleum aromaticum EbN1T, Aromatoleum sp. strain CIB, Azoarcus olearius BH72, and Thauera aromatica K172T (and in the entire AAT cluster) are implied by full conservation of amino acids (E and D vs. 2H2D motif and ED diad) essential for catalysis by their synthetase versus hydrolase domains. Likewise, functionality of the predicted C4-type DksAs from these four model strains was visually assessed by structural modeling and comparison of key features (binding sites for Zn2+/(p)ppGpp; CC tip for RNAP interaction) to those of the available E. coli DksA cryo-EM structure.
Key Messages: SR as a global adaptation strategy should contribute to the success of the AAT cluster in its distinct habitats: complex and highly variable soils/sediments (high molecular and microbial diversity, fluctuating nutrient availabilities and redox states) of the free-living degradation specialists versus the defined endorhizosphere (more stable conditions, less complex community) of the endophytes. A noteworthy exception is Aromatoleum sp. strain CIB, which combines degradation and endophytic features. Thus, future investigations into the role of SR in the habitat success of these bacteria, reflecting their divergent environmental niches, are both needed and promising.
Introduction
The betaproteobacterial family Rhodocyclaceae harbors a distinct cluster of physiologically specialized, facultative anaerobic bacteria distributed across the three genera Aromatoleum, Azoarcus, and Thauera [1]. While members of the genera Aromatoleum [2] and Thauera [3] comprise anaerobic degradation specialists for hydrocarbons, aromatic compounds, and terpenes, those of the genus Azoarcus stand out for their N2-fixing, endophytic lifestyle in association with plant roots, e.g., of rice [4]. The diversity of the Aromatoleum/Azoarcus/Thauera (AAT) cluster is evident from its 9/4/18 taxonomically described species and 10/7/20 available genomes, with the first complete genomes reported for Aromatoleum aromaticum EbN1T [5] and Azoarcus olearius BH72 [6]. Moreover, a recent biogeographic meta-analysis showed that members of this cluster are globally distributed and occur in a large variety of terrestrial and aquatic habitats [7]. In particular, four members of the AAT cluster have been developed over more than 20 years into well-studied model systems, each investigated with different yet complementary research foci and experimental approaches, as briefly summarized in the following.
Aromatoleum aromaticum EbN1T is the model for architecture and regulation of the catabolic network towards a systems biology-level understanding, with combination of physiology, molecular genetics, and proteogenomics. The strain was originally isolated from freshwater mud samples with ethylbenzene under nitrate-reducing conditions [8]. The research on its complete catabolic network - the degradation part comprises 31 peripheral and 4 central routes, overall recruiting 279 different proteins (178 identified) [5] - integrates several perspectives, such as: (i) reaction sequences and substrate-specific regulation of (novel) peripheral degradation routes [9, 10]; (ii) function of sensory/regulatory proteins and nano-molar thresholds for responsiveness towards aromatic substrates [11]; (iii) inducer-independent formation of uptake systems and degradation routes in response to nutrient limitation (“preparedness for nutritional opportunities”) [12]; (iv) systems biology-level studies including a genome-wide metabolic model [13]; (v) horizontal gene transfer as a shaping force for the strain’s genome and likely also the genus’s pan-genome [7]; and (vi) novel anaerobic enzymes, including C—H bond cleaving ethylbenzene dehydrogenase [14] and ATP-dependent acetophenone carboxylase [15].
Aromatoleum sp. strain CIB is the model for transcriptional regulation of aromatic compound degradation, with application of profound molecular genetics. The strain (formerly Azoarcus sp. strain CIB; taxonomic description is still awaited) was isolated with benzoate under nitrate-reducing conditions [16] from a culture derived from a fuel-contaminated laboratory column [17]. Major insights into transcriptional regulation include: (i) the repressors BzdR and MbdR, which are responsive to benzoyl- versus 3-methylbenzoyl-CoA and hence control the bzd/box gene clusters (anaerobic/aerobic benzoate degradation) versus mbd genes (anaerobic 3-methylbenzoate degradation) [18, 19]; and (ii) the master regulator AccR, which represses bzd gene expression in the context of carbon catabolite repression [20, 21]. Recently, the capacity for colonizing the roots of rice was recognized [22], with the endophytic lifestyle possibly governed by the second messenger cyclic di-GMP (c-di-GMP) [23].
Azoarcus olearius BH72 is the original model for the N2-fixing, endophytic lifestyle within the AAT cluster, with integration of advanced microscopy, physiology, transcriptomics, and molecular genetics. The strain was isolated from surface-sterilized roots of Kallar grass from saline-sodic soil in Pakistan, using nitrogen-free, malate-containing mineral medium [4]; species name according to [24]. Insights into the strategies for endophytic diazotroph-cereal interactions include: (i) colonization of rice roots in high numbers [25], with microoxia in root aerenchyma favoring nif gene expression (nitrogenase) [26‒28]; (ii) adhesion to the root surface requires the main structural pilAB genes (type IV pili) [29], minor pilins [30], and several adhesion proteins [31]; (iii) ingress and systemic spreading inside the plant is facilitated by the endoglucanase EglA [32], as well as twitching (type IV pili) [33] and flagellar motility [34]; (iv) two signal transduction proteins (c-di-GMP signaling) and components of type VI protein secretion systems (T6SS) are involved in endophytic competence [30]; and (v) N-limitation induces downregulation of genetic information processing versus upregulation of flagella and pili formation [35], which is reminiscent of responses earlier linked with stringent response (SR) in E. coli [36].
Thauera aromatica K172T is the model for the intricate enzyme mechanisms of the novel reactions discovered in aromatic compound degradation, with application of sophisticated biochemical approaches. The strain was isolated from anaerobic sludge with phenol under nitrate-reducing conditions [37]. Examples of novel degradative enzymes and reactions are: (i) benzylsuccinate synthase, which homolytically cleaves a benzylic C–H bond of toluene to facilitate the stereoselective addition to fumarate yielding (R)-benzylsuccinate [38, 39]; (ii) a modified β-oxidation, which converts the latter to central benzoyl-CoA [40, 41]; (iii) phosphorylation [42] followed by carboxylation [43], which initiate anaerobic degradation of phenol; (iv) α-oxidation, which is involved in phenylalanine conversion to phenylacetate [44]; (v) reductive dearomatization via an ATP-dependent, Birch-like reaction, which converts central benzoyl-CoA to a cyclic dienoyl-CoA [45, 46]; and (vi) further degradation of the latter via a specific, modified β-oxidation [47, 48].
The strategies of a “preparedness for nutritional opportunities” apparently implemented by Ar. aromaticum EbN1T in response to nutrient limitation [12], as well as the involvement of c-di-GMP in response to exposure to rice plant (Oryza sativa) exudates [30] and the “preparedness for alternative N-source opportunities under N2-fixation” [35] observed for Az. olearius BH72, already suggested that SR and/or similar global regulatory processes play a role in adaptation to niche-specific environmental conditions. This notion was recently substantiated by studies with Aromatoleum sp. strain CIB on how its endophytic lifestyle could be controlled by c-di-GMP levels [23]. Furthermore, the disparate habitats of the free-living degradation specialists (high molecular and microbial diversity and fluctuations) and the N2-fixing endophytes (more stable conditions, less complex community) suggest distinct strategies for niche adaptation in the phylogenetically coherent AAT cluster. Against this background, the present study briefly summarizes the state of the art of SR and then bioinformatically defines the genetic repertoire for conducting SR across the AAT cluster, in order to lay a foundation for future experimental investigations.
SR in Escherichia coli
Role of (p)ppGpp ‒ Historical Perspective
Nearly all bacteria have to deal with the omnipresent risk of nutrient limitation. To overcome such periods of scarcity, bacteria have evolved regulatory mechanisms to globally adjust their metabolism. This involves downregulating processes such as DNA replication, transcription, translation, and monomer biosynthesis to maintain cellular balance while simultaneously activating amino acid biosynthesis, stress responses, and other cellular processes. This adaptive strategy, known as SR, is a conserved feature across bacterial species and has been extensively characterized in standard model microorganisms such as Escherichia coli. SR is regulated by the alarmones (second messengers) guanosine tetraphosphate (ppGpp) and guanosine pentaphosphate (pppGpp), collectively referred to as (p)ppGpp (Fig. 1a) [49]. These molecules were first detected as chromatographic signals, termed “Magic Spots” (MS I and MS II), in amino acid-starved auxotrophic E. coli strains and later identified [50‒52]. A key regulatory mechanism of (p)ppGpp involves its binding to RNA polymerase (RNAP) at two distinct sites: site 1 at the interface of the ω and β′ subunits of RNAP and site 2 at the interface of RNAP and the transcription factor DksA. (p)ppGpp-binding at both sites regulates RNAP through allosteric inhibition: in site 1 by hindering relative movement of RNAP subunits and in site 2 by stabilizing RNAP-DksA interactions in the secondary channel of RNAP, which leads to interference with its promoter interactions (Fig. 2a) [49, 53‒55]. Beyond RNAP, (p)ppGpp has multiple additional regulatory targets in the form of proteins that bind it directly [56].
Multifaceted regulatory and global metabolic control by SR in E. coli. a Structure of the alarmone (second messenger) (p)ppGpp. b Synthesis/hydrolysis of (p)ppGpp and affected processes. c Domain structures of the (p)ppGpp synthetases/hydrolases Rel, RelA, and SpoT. Abbreviations (alphabetic order): ACT, aspartate kinase, chorismate mutase, TyrA (prephenate dehydrogenase); AH/RIS, α-helical/ribosomal intersubunit; CTD, C-terminal domain; NTD, N-terminal domain; and TGS, ThrRS (threonine-tRNA synthetase), GTPase, SpoT. Light blue and dashed line indicate nonfunctional pseudoHD (hydrolase domain) in RelA; pale pink marks largely inactive SD (synthetase domain) in SpoT. Gray section within SD marks the α11-loop, which interlinks it to (pseudo)HD and carries the allosteric site of (p)ppGpp (red dot) in Rel and RelA. d Regulation of the synthetase activity of RelA off the ribosome versus by ternary interaction with deacylated tRNA and stalled ribosome. Subfigures were redrawn/modified from [49].
Role of DksA in SR in E. coli. a (p)ppGpp/DksA interactions with RNA polymerase (RNAP) and effects on transcription; site 1 of RNAP binds only (p)ppGpp, while site 2 accepts DksA and (p)ppGpp. b RNAP/DksA/(p)ppGpp indirectly affect the divisome and thereby the cell length. Subfigures (a) and (b) were redrawn/modified from [49, 53], respectively.
Role of Rel, RelA, and SpoT
The levels of SR-triggering (p)ppGpp in E. coli fluctuate depending on environmental changes, starvation, and stress [57]. The turnover between (p)ppGpp and the nucleotides GTP and GDP is catalyzed by synthetase (SYNTH) domain (SD)-containing enzymes (formation) and by hydrolase domain (HD)-containing enzymes (degradation). The synthetases form pppGpp or ppGpp by ATP-dependent pyrophosphorylation of the 3′-hydroxyl group of GTP or GDP, respectively, while the hydrolases cleave off this diphosphate again (Fig. 1a, b). These enzymes belong to the RelA/SpoT homolog (RSH) superfamily, which includes three main types: (i) long-RSH enzymes with both SD and HD in their N-terminal regions (NTD) and additional regulatory subdomains in the C-terminal region (CTD), such as the TGS domain for interaction with uncharged tRNA; (ii) small alarmone synthetases with only the SD; and (iii) small alarmone hydrolases with only the HD (Fig. 1c) [49, 58]. The inherent sequence similarity between the name-giving RelAs/SpoTs of the RSH superfamily and Rels can be attributed to the evolutionary origin of RelA and SpoT as differentiations of Rel after a gene duplication. The archetypical Rel contains both a functional SD and a functional HD, of which only one at a time is activated by the C-terminal regulatory domains in response to starvation versus nutrient abundance. RelA and SpoT also contain both SD and HD; however, they play contrasting roles in E. coli. While RelA contains an inactive pseudoHD and mainly acts as a switched synthetase, SpoT has only weak synthetase activity and acts mainly as a switched hydrolase [59, 60]. The cellular homeostasis of (p)ppGpp is not only required for aligning SR with multifaceted physiological demands but also needs to be strictly maintained, since over-accumulation of (p)ppGpp is detrimental to the cell [61].
Several appropriate regulation systems, which are based on controlling the activities of RelA and SpoT and reveal also the complex interactions of SR with metabolism, are described in the following.
RelA Regulation
Control of the (p)ppGpp-synthesizing activity of RelA (Fig. 1d) starts from its closed conformation, where pseudoHD and CTD spatially block the catalytic SD such that its substrates can be accommodated only poorly. The ramping up of SD activity can proceed via two distinct routes that are either ribosome-independent or, on the contrary, involve the stalled ribosome (with empty A-site): (i) upon binding of (p)ppGpp to its allosteric site at the α-helix, which interlinks pseudoHD and SD, the NTD adopts an open conformation, releasing its spatial inhibition. This allows efficient entry of substrates and thereby switches on the SD activity. The spatial interaction with the CTD remains preserved, however. (ii) Under conditions of amino acid starvation, closed RelAs interact with deacylated tRNAs and stalled ribosomes, forming the so-called “starved” ribosomal complex. Here, the spatial inhibition of SD by CTD, but not that by pseudoHD, is released, yielding an extended conformation of RelA primed for (p)ppGpp synthesis. Subsequently, binding of (p)ppGpp to its allosteric site lifts the inhibition by pseudoHD, which leads to the full activation of SD [60, 62‒65].
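The two activation routes described above can be condensed into a small truth table. The following Python sketch is purely illustrative: the class and function names (RelAContext, rela_state) and the state labels are hypothetical, and the model only encodes the qualitative logic of which inhibitory contact on the SD is released under which condition, not any kinetic or structural detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelAContext:
    """Hypothetical inputs of the toy model (names are illustrative only)."""
    ppgpp_bound: bool          # (p)ppGpp occupies the allosteric site
    on_starved_ribosome: bool  # bound to deacylated tRNA and stalled ribosome


def rela_state(ctx: RelAContext) -> dict:
    """Return which inhibitory contacts on the SD are released, plus a label."""
    # Route (i): (p)ppGpp binding opens the NTD and lifts inhibition by pseudoHD.
    pseudo_hd_released = ctx.ppgpp_bound
    # Route (ii): the "starved" ribosomal complex lifts inhibition by the CTD.
    ctd_released = ctx.on_starved_ribosome

    if pseudo_hd_released and ctd_released:
        label = "fully active"
    elif ctd_released:
        label = "extended, primed"
    elif pseudo_hd_released:
        label = "open NTD, active off the ribosome"
    else:
        label = "closed"
    return {
        "pseudoHD_released": pseudo_hd_released,
        "CTD_released": ctd_released,
        "state": label,
    }


# Enumerate all four combinations of the two inputs.
for bound in (False, True):
    for starved in (False, True):
        print(bound, starved, rela_state(RelAContext(bound, starved))["state"])
```

Treating both inputs as booleans is a deliberate simplification; it merely makes explicit that full activation in this description requires both the ribosomal complex and allosteric (p)ppGpp binding.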
Activity of RelA and consequently (p)ppGpp homeostasis in E. coli can furthermore be controlled on two different levels with opposing effects, whereby SR is linked to N-metabolism: (i) in response to N-starvation, the hexameric NtrC response regulator of the NtrBC two-component system (controlling nitrogen assimilation) binds to an enhancer-like region upstream of the relA gene. This activates an until then inactive σ54-RNAP-promoter complex, which in turn amplifies the constitutive σ70-dependent relA gene expression and as a result formation of (p)ppGpp [36]. (ii) By contrast, the NirD protein (small subunit of cytoplasmic nitrite reductase, NirBD) can bind to the catalytic NTD of RelA, resulting in inhibition of (p)ppGpp synthesis [66].
SpoT Regulation
The activity of SpoT can be modulated by various means: (i) interaction of the SpoT-TGS domain with the acyl carrier protein switches off the hydrolase function in favor of the synthesis function, such that under conditions of impaired fatty acid metabolism (acyl carrier protein loaded with short- rather than long-chain fatty acids), SR is switched on [67]. (ii) Conversely, under conditions of nutrient richness, the 50S ribosomal subunit-associated GTPase (CgtA/ObgE) interacts with SpoT to maintain its (p)ppGpp-hydrolyzing activity, thereby suppressing SR and ensuring normal growth [68‒70]. (iii) When glucose depletion is succeeded by a shift to an alternative substrate, the Rsd protein (anti-σ activity) is released from the PTS component HPr to then associate with SpoT and stimulate its HD. This curbs excessive formation of (p)ppGpp by RelA during such transitions in C-source utilization [71]. (iv) During fatty acid and phosphate starvation, the small YtfK protein interacts with the catalytic domains of SpoT, shifting its activity from hydrolysis to synthesis [72]. Notably, expression of the ytfK gene is positively regulated by the cAMP-CRP complex, relating it to glucose availability and linked catabolite repression [73]. (v) The YmgB protein binds directly to the TGS and helical domains of SpoT, resulting again in SpoT-dependent accumulation of (p)ppGpp [74]. Earlier structural-functional analyses of YmgB indicated a role as DNA-binding protein and involvement in acid resistance [75].
Role of DksA
DksA is an independent regulator of RNAP, whose function is significantly enhanced in the presence of (p)ppGpp. It belongs to a family of structurally similar proteins that regulate transcription not by binding to DNA but by interacting within the secondary channel of RNAP. Despite sharing a conserved coiled-coil (CC) core motif with similar amino acid residues at functional key positions, other notable members of this family, such as GreA and GreB, exhibit no overall sequence similarity with DksA. These proteins bind competitively to the RNAP secondary channel, where they fulfill both redundant and unique regulatory functions [76, 77]. One regulatory effect on transcription by DksA and Gre factors arises from their ability to bind to RNAP prior to transcription initiation (Fig. 2a), destabilizing the RNAP open complex and thus acting as negative regulators of transcription [78‒80]. Notably, binding of (p)ppGpp at site 2 between DksA and the β′rim helix of RNAP, forming a ternary complex (RNAP-DksA/(p)ppGpp), alters the location of DksA therein, which enhances the destabilizing effect of DksA on the open complex of RNAP with rRNA gene promoters [81]. Additionally, DksA and Gre factors influence transcription by targeting non-transcribing elongation complexes. Specifically, Gre factors bind to elongation complexes that are arrested because of misincorporation and cleave nascent RNA to rescue the complex. Similarly, DksA can bind to stalled elongation complexes during starvation and rescue the complex through an as-yet-unknown mechanism [76, 82]. In the context of SR, DksA employs a second fundamental regulatory mechanism where it, together with (p)ppGpp, exerts either positive or negative control of transcription depending on the GC content of target promoters. Specifically, it enhances transcription from promoters with low GC content while repressing those with high GC content [83, 84].
In sum, the cooperative regulatory effects of DksA and site 2-bound (p)ppGpp outperform the effects of (p)ppGpp bound to site 1, giving rise to diverse regulatory answers to varying (p)ppGpp levels [55]. Regulatory targets of DksA/(p)ppGpp in SR help the cell to manage starvation stress and beyond that influence cell wall homeostasis and cell division (Fig. 2b) [53, 85].
Notably, the intracellular concentration of DksA remains constant under different conditions and throughout the cell’s life cycle [75]. However, oxidative stress can lead to oxidation of Zn2+-binding cysteines (yielding disulfide bonds), distortion of the protein’s globular structure, and thereby to dysfunctional DksA. In Salmonella (closely related to E. coli), the chaperone DnaJ was shown to interact with impaired DksA and restore its functionality by reducing the disulfide bonds [86].
General Dissemination of DksA-Mediated SR
Since its initial discovery and mechanistic elucidation in E. coli, SR has been recognized as a widespread regulatory process, found in symbiotic and pathogenic bacteria-host interactions, plants, and animals [87‒90]. While these examples are predominantly mediated by (p)ppGpp, DksA-mediated SR appears to be largely restricted to Proteobacteria, with a few notable exceptions. In Firmicutes such as B. subtilis, rising (p)ppGpp levels correlate with reduced GTP levels, which in turn impede RNA synthesis through this limitation in the NTP pool [91, 92]. In Archaea, stringent-like phenotypes have been observed; however, these cases are mediated by neither DksA nor (p)ppGpp [93, 94]. In contrast to these SR mechanisms in Firmicutes, Archaea, and eukaryotes, DksA-mediated SR in Proteobacteria represents a more direct and specific transcriptional control mechanism (see above section Role of DksA). Table 1 [95‒122] shows a selection of organisms in which the phenotypic response of DksA-mediated SR has been studied. The diverse phenotypic responses show the lifestyle-dependent role of DksA and (p)ppGpp beyond nutrient limitation-triggered SR. These responses range from virulence control in pathogenic bacteria to facilitating symbiosis in plant symbionts, general stress response and resistance, and phage-host interactions, up to being essential for viability in Myxococcus xanthus. A particularity in some organisms is the presence of two or more DksA paralogs that differ in the cysteine content of the Zn2+-binding site. DksA phylotypes with four, two, or one conserved cysteine(s) have been reported, where the differing Zn2+-binding kinetics directly influence the regulatory response to oxidative and nitrosative stress [104, 123, 124]. Rhodobacter sphaeroides encodes a four cysteine (C4) and a one cysteine (C1) DksA paralog. While the C1 DksA is formed during aerobic and anaerobic growth, the C4 DksA is only induced under anoxic conditions.
Furthermore, the C1 DksA is necessary for photosynthetic growth, and its function can be complemented by the C4 DksA of E. coli and vice versa. The regulatory role of the C4 paralog of R. sphaeroides, which cannot substitute for DksA in E. coli, is unclear, with deletion mutants having no specific phenotype [97].
Dissemination of DksA-mediated SR and phenotypic response
Organism | Lifestyle | DksA typeᵃ | Exper. evidence | Phenotypic response | Ref. |
---|---|---|---|---|---|
Alphaproteobacteria | | | | | |
Bartonella henselae | Pathogen (human) | C1 | Gene KO | | [95] |
Sinorhizobium meliloti | Plant symbiont | C4 + C1 | Gene KO | | [96] |
Rhodobacter sphaeroides | Aquatic, photosynthetic | C4 + C1 | Gene KO | | [97] |
Gammaproteobacteria | | | | | |
Escherichia coli | Opportunistic pathogen (human) | C4 | Crystal structure | | [98‒101] |
E. coli – phage P1 interaction | Bacteriophage | | | | [102] |
Salmonella spp. | Pathogen (human) | C4 | Gene KO | | [103‒105] |
Acinetobacter baumannii | Opportunistic pathogen (human) | C4 | Gene KO | | [106, 107] |
Yersinia enterocolitica | Gastrointestinal foodborne pathogen (human) | C4 | Gene KO | | [108, 109] |
Vibrio cholerae | Pathogen (human) | C4 | Gene KO | | [110] |
Haemophilus ducreyi | Pathogen (human) | C4 + C4 | Gene KO | | [111] |
Shigella flexneri | Pathogen (human) | C4 | Gene KO | | [112] |
Pseudomonas aeruginosa | Broad host pathogen | C4 + C2 | Crystal structure | | [113, 114]; PDB: 4IJJ |
Pseudomonas plecoglossicida | Pathogen (fish) | C4 | Gene silencing (RNA interference) | | [115] |
Erwinia amylovora | Pathogen (plant) | C4 | Gene KO | | [116] |
Xanthomonas citri | Pathogen (plant) | C1 | Gene KO | | [98] |
Pseudomonas putida | Soil | C4 | Gene KO | | [117] |
Azotobacter vinelandii | Soil, N2-fixing | C4 + C4(+C2) | Gene KO | | [118] |
Myxococcia | | | | | |
Myxococcus xanthus | Saprophytic | C4 (+ C4 + C3 + C2) | Artificial regulation | | [119] |
Chlamydiaceae | | | | | |
Chlamydia trachomatis | Pathogen (human) | C1 | Crystal structure | | [120]; PDB: 6PTG |
Spirochaetia | | | | | |
Borrelia burgdorferi | Pathogen (human) | C4 | Gene KO | | [121, 122] |
KO, knock out.
ᵃDksA type refers to the number of conserved cysteines in the Zn2+-binding motif.
Dissemination of SR Components in the AAT Cluster
RelA and SpoT
Analysis of the genomes of representative members of the AAT cluster revealed the consistent presence of the hydrolases/synthetases RelA and SpoT, with one coding gene each per genome (Fig. 3). The orthologs of the cluster share similar sequence identities with their counterparts in E. coli (RelA, 41–44%; SpoT, 48–51%). As RelA and SpoT are the only RSH family proteins detected in the analyzed genomes of the AAT cluster, it can be assumed that RelA acts as the predominant (p)ppGpp synthetase with a presumably inactive pseudoHD and, conversely, SpoT as the predominant (p)ppGpp hydrolase with only weak synthetase activity (see below section Phylogeny, Domain Architecture, Similarity, and Catalytic Residues of RelA and SpoT). As expected for betaproteobacteria, neither a truly bifunctional Rel nor monofunctional small alarmone synthetases or small alarmone hydrolases [49] are encoded by the studied organisms.
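For readers who wish to reproduce identity values of this kind, the following Python sketch computes percent identity from two pre-aligned sequences. It is a minimal illustration under stated assumptions: the function name percent_identity is hypothetical, the two toy sequences are invented, and the denominator (columns without gaps) is only one of several common conventions; the procedure actually used for the values reported above is not specified in this section.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns in which neither sequence has a gap.

    Assumption: both inputs are rows of the same pairwise alignment and
    therefore have equal length. Other conventions (e.g., dividing by the
    full alignment length or by the shorter sequence) yield other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a.upper() == b.upper():
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment (not real RelA/SpoT data): 4 matches in 5 ungapped columns.
print(percent_identity("MKT-AYL", "MRTQA-L"))  # prints 80.0
```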
Dissemination of the SR components (DksA, RelA, and SpoT) across genome-sequenced members of the AAT cluster. Phylotype color coding reflects the phylogenetic clustering of the DksAs (online suppl. Fig. S1), correlating with the number of cysteine residues involved in binding of the Zn2+ cation.
DksA
The studied genomes of the AAT cluster contain at least one copy of the dksA gene, with many species harboring additional paralogs (Fig. 3). These DksA paralogs correspond to three phylotypes (online suppl. Fig. S1; for all online suppl. material, see https://doi.org/10.1159/000546200), differentiated by the cysteine composition of their Zn2+-binding sites and sequence similarity to well-characterized DksA proteins:
The four cysteine-containing C4 phylotype is conserved across all studied genomes, and its representatives share 72–83% sequence similarity with DksA of E. coli (DksAEco). The C4-type retains all functionally relevant amino acids of the Zn2+- and (p)ppGpp-binding sites as well as of the RNAP interaction site (CC tip) described for DksAEco, suggesting functional equivalence (for details see below section Visualizing Structural Features of DksA Serving Key Functions). This phylotype is presumed to play a similar role in transcriptional regulation and SR across species. In Figure 3 and online supplementary Figure S1, DksAs belonging to this phylotype are indicated in blue.
The C1 phylotype, characterized by a single cysteine in the Zn2+-binding site, clusters with DksA proteins from Bartonella henselae and Rhodobacter sphaeroides (see above section General Dissemination of DksA-Mediated SR). Sequence identity to DksAEco lies between 48 and 65%. It is the second most prevalent type and is found in the majority of studied species from the AAT cluster. Members of this phylotype exhibit a characteristic lysine-to-glutamine substitution at the position equivalent to K139 within the (p)ppGpp-binding site of DksAEco, while the RNAP-interacting loop residues remain conserved. Functional evidence from other organisms suggests that this phylotype performs similar regulatory roles. For instance, the single C1-type DksA protein in Bartonella henselae mediates virulence by regulating the type IV secretion system (T4SS) in response to starvation [95]. Similarly, one of the two DksA homologs in Rhodobacter sphaeroides (C1-type) is essential for photosynthetic growth and is functionally interchangeable with DksAEco [97]. In Figure 3 and online supplementary Figure S1, DksAs belonging to this phylotype are labeled in green.
A second C4-type DksA occurs exclusively in species of the genus Azoarcus within the studied AAT cluster. Although the functionally relevant amino acids within the Zn2+-binding and RNAP interaction sites are retained, key amino acids involved in (p)ppGpp-binding are not conserved. Functional assignment of this phylotype remains inconclusive, as the function of the representative second C4-type DksA of Rhodobacter sphaeroides has not been elucidated [97]. Members of this phylotype share 31–34% sequence identity to DksAEco. In Figures 3 and 4, DksAs belonging to this phylotype are labeled in red.
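The distinguishing features of the three phylotypes described above can be condensed into a simple decision rule. The following Python sketch is purely illustrative and is not the procedure used in this study, which relied on phylogenetic clustering and sequence comparison. The names `DksAFeatures` and `classify_phylotype` as well as the toy inputs are hypothetical and were introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DksAFeatures:
    """Hypothetical summary of one DksA sequence after alignment to DksAEco."""
    zn_site_residues: str       # residues at the four aligned Zn2+-site positions
    ppgpp_site_conserved: bool  # True if the key (p)ppGpp-binding residues are conserved


def classify_phylotype(features: DksAFeatures) -> str:
    """Assign a phylotype label from the Zn2+-site cysteine count.

    Simplified rule (assumption): four cysteines with a conserved
    (p)ppGpp-binding site -> "C4"; four cysteines without it -> "second C4";
    a single cysteine -> "C1"; anything else -> "unassigned".
    """
    site = features.zn_site_residues.upper()
    if len(site) != 4:
        raise ValueError("expected exactly four Zn2+-site positions")
    n_cys = site.count("C")
    if n_cys == 4:
        return "C4" if features.ppgpp_site_conserved else "second C4"
    if n_cys == 1:
        return "C1"
    return "unassigned"


# Toy inputs (not taken from any real sequence)
print(classify_phylotype(DksAFeatures("CCCC", True)))   # C4
print(classify_phylotype(DksAFeatures("CCCC", False)))  # second C4
print(classify_phylotype(DksAFeatures("CAAA", False)))  # C1
```

A rule of this kind can only pre-sort candidates; the assignment of a sequence to a phylotype still depends on where it clusters in the tree.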
Comparison of selected RelAs and SpoTs. a Phylogenetic clustering and domain architecture of proteins from Ar. aromaticum EbN1T, Aromatoleum sp. strain CIB, Az. olearius BH72, and T. aromatica K172T, compared to the orthologs from E. coli K12. Accession numbers of proteins are indicated. For abbreviations of domains see legend to Fig. 1c. b Percent identity matrix of HD (top) and SD (bottom) from the same selected four strains of the AAT cluster compared to those of E. coli. The nonfunctional RelA-HD of E. coli is indicated by light gray background.
SR in Ar. aromaticum EbN1T, Aromatoleum sp. CIB, Az. olearius BH72, and T. aromatica K172T
As outlined in the Introduction, within the AAT cluster, the four strains Ar. aromaticum EbN1T, Aromatoleum sp. strain CIB, Az. olearius BH72, and T. aromatica K172T are well-studied model organisms that represent the phylogenetic and phenotypic spectrum of the cluster. Moreover, their seeming “preparedness for nutritional/nitrogen opportunities” and their use of c-di-GMP for plant colonization suggest that second messenger-dependent regulatory processes such as SR are also employed by these nonstandard model bacteria. Therefore, the predicted structures of their RelA, SpoT, and DksA proteins were inspected more closely to assess their functionality and thereby the capacity of the strains to perform SR.
Phylogeny, Domain Architecture, Similarity, and Catalytic Residues of RelA and SpoT
The RelAs and SpoTs from Ar. aromaticum EbN1T, Aromatoleum sp. strain CIB, Az. olearius BH72, and T. aromatica K172T mirror the phylogenetic relations of the strains and share the same composition and order of domains (hydrolase, synthetase, TGS, AH/RIS, and ACT) as known from the experimentally studied E. coli orthologs (Fig. 4a) [49]. Accordingly, percent identity matrices (on the amino acid level) in relation to the E. coli orthologs revealed solid-to-high values for the HDs (63.3–66.0% for SpoTs vs. 34.0–35.3% for RelAs) and the SDs (58.8–62.7% for SpoTs vs. 50.0–54.5% for RelAs) (Fig. 4b). Thus, these predicted RelAs and SpoTs in the four selected model strains should be functional.
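For readers who wish to reproduce such a comparison, a percent identity matrix can be derived from pre-aligned domain sequences as sketched below. This is a minimal Python sketch under stated assumptions: identity is counted over alignment columns in which neither sequence carries a gap (other conventions for the denominator exist), the function names `percent_identity` and `identity_matrix` are hypothetical, and the sequences are toy strings rather than real HD or SD sequences. It does not claim to be the tool used for Fig. 4b.

```python
from typing import Dict, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two aligned sequences of equal length.

    Columns with a gap ('-') in either sequence are skipped (assumed convention).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_matrix(aligned: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """All-against-all percent identities for a dict of aligned sequences."""
    names = list(aligned)
    return {(p, q): percent_identity(aligned[p], aligned[q])
            for p in names for q in names}


# Toy alignment (hypothetical sequences)
toy = {"ref": "ACDEFG", "strain1": "ACDEYG", "strain2": "AC-EFG"}
matrix = identity_matrix(toy)
print(round(matrix[("ref", "strain1")], 1))  # 83.3
print(round(matrix[("ref", "strain2")], 1))  # 100.0
```

Because the gap-handling convention changes the denominator, values from different tools are only comparable when the same convention is applied.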
To further substantiate this potential functionality, we aligned the sequences of the HDs/SDs of SpoTs (Fig. 5a, [125‒127]) and RelAs (Fig. 5b) from the four model strains with those from E. coli. This alignment verified the full conservation of all residues that are essential for catalysis (Fig. 5a bottom). In the catalytic center of the SpoT-HD, the 2H2D (His, Asp) motif and ED (Glu-Asp) diad coordinate Mn2+ and Mg2+, respectively, which jointly interact with the δε-PP group at the 3′-position of (p)ppGpp. Furthermore, the Mg2+ ion activates an H2O molecule for a nucleophilic attack, resulting in the hydrolytic removal of this diphosphate [125]. In the SpoT-SD, the conserved D and E residues provide the essential carboxyl groups that coordinate the catalytic Mg2+ ion. Here, the metal ion catalyzes the transfer of the βγ-PP from ATP to the 3′-hydroxyl group of the ribose moiety in GTP or GDP, forming (p)ppGpp [126]. Taken together, the SpoTs from our model strains should indeed act as bifunctional hydrolase/synthetases, as is typical for Beta- and Gammaproteobacteria but not for the monofunctional hydrolase SpoTs of Moraxellaceae [58]. By contrast, the RelAs of the four model strains lack the 2H2D motif and ED diad, which renders their HD a pseudoHD, while they harbor the conserved D and E residues in the SDs. Thus, the RelAs should serve as monofunctional synthetases. Notably, multiple protein sequence alignments (online suppl. Fig. S2) revealed that all of these residues, which are indispensable for the catalytic activity of RelAs and SpoTs, are also fully conserved across the aforementioned 31 genome-sequenced members of the AAT cluster. However, the regulatory proteins Rsd, YtfK, and YmbG, which modulate the activities of RelA and SpoT in E. coli, are not encoded in any of the studied genomes of the AAT cluster.
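A conservation check of this kind amounts to mapping each catalytic residue of the reference protein onto its alignment column and comparing the residues found there. The Python sketch below illustrates the idea under stated assumptions: the alignment is supplied as equal-length gapped strings, positions are 1-based in the ungapped reference, and the names `column_of` and `conservation_report` as well as the toy alignment are hypothetical. The positions used do not correspond to the actual 2H2D, ED, or D/E residues.

```python
from typing import Dict, Iterable, Tuple


def column_of(ref_aligned: str, position: int) -> int:
    """Alignment column (0-based) of a 1-based position in the ungapped reference."""
    seen = 0
    for col, residue in enumerate(ref_aligned):
        if residue != "-":
            seen += 1
            if seen == position:
                return col
    raise IndexError(f"reference has fewer than {position} residues")


def conservation_report(aligned: Dict[str, str], reference: str,
                        positions: Iterable[int]) -> Dict[int, Tuple[str, bool]]:
    """For each reference position, return (reference residue, fully conserved?)."""
    ref = aligned[reference]
    report = {}
    for pos in positions:
        col = column_of(ref, pos)
        conserved = all(seq[col] == ref[col] for seq in aligned.values())
        report[pos] = (ref[col], conserved)
    return report


# Toy alignment (hypothetical); the reference carries a gap in column 2
toy = {"ref": "MH-DKE", "s1": "MHADKE", "s2": "MH-DRE"}
print(conservation_report(toy, "ref", [2, 3, 4]))
# {2: ('H', True), 3: ('D', True), 4: ('K', False)}
```

The column mapping is the step most easily overlooked: residue numbers from the reference structure shift whenever the alignment introduces gaps upstream of the site.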
Multiple sequence alignment of bi/monofunctional SpoTs (a) and RelAs (b) from E. coli K12, Ar. aromaticum EbN1T, Aromatoleum sp. strain CIB, Az. olearius BH72, and T. aromatica K172T (all truncated after SD). Order of strains is according to phylogenetic relation of proteins. Role of essential residues in the catalytic centers of the HDs and SDs is illustrated in the schemes below the SpoT alignments; red arrows indicate nucleophilic attack (modified from refs. [125‒127]).
Visualizing Structural Features of DksA Serving Key Functions
The crystal and cryo-EM structures of DksAEco revealed that this all α-helical protein is composed of three major structural elements [128, 129] (Fig. 6, left panel, top): (i) the globular (G) domain, which comprises the N- and C-terminal regions; (ii) the central CC domain, consisting of two α-helices connected via an α-turn, which harbors the two highly conserved, functionally indispensable tip residues (D74, A76) that interact with the trigger loop/helix of RNAP [55, 130]; and (iii) a C-terminal α-helix (CT domain), which is loosely connected to the G and CC domains. The G and CT domains each contribute two cysteine residues to the Zn2+-binding site, which likely constrains the mobility of the α-helix of the CT domain. Binary and ternary 3D structures of DksAEco (without and with (p)ppGpp, respectively) in complex with the RNAP holoenzyme bound to the rRNA gene promoter from E. coli were determined by X-ray crystal analysis [81] and cryo-EM [129]. DksAEco apparently uses a total of four conserved residues from its CT (K139) and CC (L95, K98, and R129) domains to establish contacts with the 5′-phosphates and the guanine moiety of (p)ppGpp. This interaction with (p)ppGpp induces a rotation of DksAEco, which drives the CC tip deeper into the secondary channel of RNAP, strengthening the DksA-RNAP interaction and causing the CC tip to interfere with the bridging helix of RNAP. This in turn alters the kinetics of DNA loading, impedes open complex formation, and favors the less stable early stage of the RNAP-promoter complex [81, 129].
3D-structural visualization of selected DksAs. The left panel displays the overall 3D structure of DksA: top, from E. coli as determined by cryo-EM [129]; bottom, overlaid with those from Ar. aromaticum EbN1T and Az. olearius BH72 as predicted by AlphaFold. The right panel presents close-up views of the Zn2+- and (p)ppGpp-binding sites as well as the CC tip interacting with RNAP for E. coli [129], compared to those of the AlphaFold-predicted DksAs from Ar. aromaticum EbN1T, Aromatoleum sp. strain CIB, Az. olearius BH72, and T. aromatica K172T.
The AlphaFold-predicted structures of the DksAs from our four model strains show the same overall architectural features as the DksAEco cryo-EM structure, as shown for the DksAs from Ar. aromaticum EbN1T and Az. olearius BH72 (Fig. 6, left panel, bottom). Moreover, close-up views of the Zn2+- and (p)ppGpp-binding sites as well as of the CC tips of the DksAs from all four model bacteria also corresponded very well to those of DksAEco (Fig. 6, right panel). Thus, these key structural DksA features from the four selected members of the AAT cluster can be assumed to function similarly in negatively regulating initiation of transcription.
Conclusions and Outlook
Although SR has been studied in depth in E. coli, this regulatory mechanism has been observed in a taxonomically broad range of bacteria. As most functional studies were conducted with pathogens, it would be rewarding to study the functions and targets of SR in completely different environmental niches. Bioinformatic analyses allowed us to postulate that functional modules of SR are widespread across the AAT cluster, in which SR has been only poorly investigated, let alone understood. Unraveling SR in model bacteria of this group could help clarify some fundamental questions: Since SR is commonly regarded as a stress response relative to rich laboratory conditions, are soils and sediments generally stressful for bacteria in their environmental niche? Are roots as habitats for beneficial endophytes really the nutrient-rich niche they are commonly thought to be? What roles does SR play in such scenarios beyond stress response?
Materials and Methods
Phylogenetic and Bioinformatic Analyses
Modeling of DksA Structures
Structural modeling of DksAs from Ar. aromaticum EbN1T, Aromatoleum sp. CIB, Az. olearius BH72, and T. aromatica K172T was performed using AlphaFold Server version 3 [133]. To improve the accuracy and relevance of the predicted structures, co-structure modeling of DksA with all RNAP subunits of the respective organisms was employed. Including the RNAP constituents in the modeling query enhanced the structural context of the interaction site and the overall conformational fidelity of DksA. After prediction, the generated 3D models were analyzed to assess their structural quality and accuracy by evaluating the predicted aligned errors, predicted template modeling scores, and interface predicted template modeling scores provided by AlphaFold. The structural models with the highest interface predicted template modeling scores (all 0.81) and lowest predicted aligned errors (14.29‒15.84) were selected for further analysis. Visualizations and structural alignments were performed using UCSF ChimeraX and the integrated MatchMaker tool [134, 135].
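The selection criterion described above (highest interface predicted template modeling score, lowest predicted aligned error) can be expressed as a simple ranking. The following Python sketch is illustrative only: the class `ModelScores`, its field names, and the function `select_model` are hypothetical and do not reflect the output format of AlphaFold Server, the use of a single summary predicted aligned error per model and its role as a tie-breaker are assumptions, and the score values are toy numbers.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ModelScores:
    """Hypothetical per-model summary of confidence metrics."""
    model_id: str
    iptm: float  # interface predicted template modeling score (higher is better)
    ptm: float   # predicted template modeling score (higher is better)
    pae: float   # summary predicted aligned error (lower is better)


def select_model(models: Iterable[ModelScores]) -> ModelScores:
    """Pick the model with the highest ipTM; ties are broken by the lowest PAE."""
    candidates = list(models)
    if not candidates:
        raise ValueError("no models to choose from")
    return min(candidates, key=lambda m: (-m.iptm, m.pae))


# Toy values (not the scores obtained in this study)
predictions = [
    ModelScores("model_0", iptm=0.70, ptm=0.72, pae=18.0),
    ModelScores("model_1", iptm=0.80, ptm=0.78, pae=16.0),
    ModelScores("model_2", iptm=0.80, ptm=0.77, pae=15.0),
]
print(select_model(predictions).model_id)  # model_2
```

Ranking by the interface score first reflects the emphasis on the DksA-RNAP contact region; a different weighting of the two metrics would be equally easy to encode in the sort key.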
Conflict of Interest Statement
Barbara Reinhold-Hurek and Ralf Rabus declare Editorial Board membership with Microbial Physiology. Patrick Becker, Jakob Ruickoldt, and Petra Wendler have no conflicts of interest to declare.
Funding Sources
Bioinformatic analyses and manuscript preparation were conducted with no extramural funding.
Author Contributions
P.B., B.R.-H., and R.R. conceived the study and finalized the manuscript, and all authors agreed to its final version. P.B. performed the bioinformatic analysis and literature searches and drafted the manuscript; J.R., P.W., and P.B. conducted the structural modeling.