Aging is a major risk factor for chronic diseases, which in turn can provide information about the aging of a biological system. This publication serves as an introduction to systems biology and its application to biological aging. Key pathways and processes that impinge on aging are reviewed, and how they contribute to health and disease during aging is discussed. The evolution of this situation is analyzed, and the consequences for the study of genetic effects on aging are presented. Epigenetic programming of aging, as a continuation of development, creates an interface between the genome and the environment. New research into the gut microbiome describes how this interface may operate in practice with marked consequences for a variety of disorders. This analysis is bolstered by a view of the aging organism as a whole, with conclusions about the mechanisms underlying resilience of the organism to change, and is expanded with a discussion of circadian rhythms in aging. Finally, the book presents an outlook for the development of interventions to delay or to reverse the features of aging.
The publication is recommended to students, researchers as well as professionals dealing with public health and public policy related to an aging society.
18 - 34: Applications to Aging Networks
Published: 2014
Christopher Wimble, Tarynn M. Witten, 2014. "Applications to Aging Networks", Aging and Health - A Systems Biology Perspective, S.M. Jazwinski, A.I. Yashin
Abstract
This chapter will introduce a few additional network concepts, and then it will focus on the application of the material in the previous chapter to the study of systems biology of aging. In particular, we will examine how the material can be used to study aging networks in two sample species: Caenorhabditis elegans and Saccharomyces cerevisiae.
In the previous chapter, we addressed the importance of understanding how complexity theory, as manifested through nonlinear dynamics, hierarchies and network analysis, can be used to study the intricate and fascinating behaviors of living systems. We discussed why reductionism, while it has its uses, also causes us to lose important information about the behavior of a system, because breaking a system apart costs us information about how the 'whole' organism functions. We discussed the difference between a complicated system and a complex system, and we then detailed the core properties of a complex system. We pointed out that if we were to examine a large collection of different complex systems, we would find that they have certain common or unifying characteristics. We argued that complex systems have 'emergent properties', meaning that behaviors emerge from the system that could not be predicted even from complete knowledge of its parts. Living systems, whether they are cells or ecosystems, do not function like pieces of a jigsaw puzzle. Instead, they are often fuzzy or stochastic, with backup systems and redundancies that belie their true structure. We also pointed out that an understanding of these systems requires a different conceptual framework. Thus, in order to understand complex systems, we must approach them from a reverse engineering perspective rather than a reductionist perspective. One approach to gaining this understanding is through the use of network representations of living systems.
Networks and Graphs
In the previous chapter, we introduced the basics of network theoretic methods as a first means of understanding the complexity of living systems. In particular, we introduced the concept of a graph G that has nodes n_j (or vertices) and edges E_ij (a connection between node n_i and node n_j). We illustrate an undirected longevity gene-protein network in figure 1 [1,2]. We then introduced the idea of the adjacency matrix A. With this simple set of definitions, we have some powerful tools with which to investigate the structure of a network and how it might inform us about the biological dynamics of the overall network. We began with the concept of connectivity. Consider the network in figure 1. Some of the nodes appear to have very many connections, while others have only a few. Does this imply anything about the network and its behaviors? What information could we glean from this?
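As a minimal sketch of these definitions, the following Python fragment builds the adjacency matrix A of a small undirected graph and reads off each node's connectivity as a row sum. The edge list and the node names (g1 to g5) are hypothetical placeholders chosen for illustration; they are not taken from the network in figure 1.

```python
# Hypothetical toy edge list; node names are placeholders, not data from figure 1.
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g4", "g5")]

def adjacency_matrix(edges):
    """Return (sorted node list, symmetric 0/1 adjacency matrix) for an undirected graph."""
    nodes = sorted({n for edge in edges for n in edge})
    index = {n: i for i, n in enumerate(nodes)}
    A = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        A[index[u]][index[v]] = 1
        A[index[v]][index[u]] = 1  # undirected graph: mirror the entry
    return nodes, A

nodes, A = adjacency_matrix(edges)

# The connectivity (degree) of a node is the sum of its row in A.
degrees = {n: sum(row) for n, row in zip(nodes, A)}
for n in nodes:
    print(n, degrees[n])
```

In this toy graph, g1 has the most connections and g5 the fewest, which is the kind of contrast the questions above are asking about.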
Illustration of a sample C. elegans longevity gene-protein network. See Witten and Bonchev [2] for more details.
Creating a Longevity Network: An Example with Yeast
To understand network dynamics, particularly as applied to aging, we first discuss how one can actually go about creating a longevity gene-protein network. The answer is, it is not easy, and it takes a good deal of time. The network in figure 1 took nearly a year of database searching, literature review and peer collaboration to create. Of course, when we started the project, the available databases were not nearly as efficient or sophisticated and the data far less abundant than they are now.
Let us first consider Saccharomyces cerevisiae. S. cerevisiae was chosen as the model organism for this study because it is well understood, highly studied, and regarded as a good model with which to study aging processes [3]. Yeast has two different modes of aging, characterized by replicative life span (RLS) and chronological life span. Chronological life span is a measure of the length of time a nondividing cell can survive, while RLS measures how many times a cell can divide [4,5]. When constructing a longevity network, genes having an effect on either chronological life span or RLS can be considered. While both are useful, RLS was chosen because it is easier to study in yeast and has been shown to have an overlap with more complicated organisms [4]. For each of the proteins in the longevity network, gene knockout studies have shown that removing the corresponding gene from the genome increases RLS [6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34]. We illustrate the RLS network for S. cerevisiae in figure 2.
Illustration of the S. cerevisiae RLS extension network. The different color connecting lines indicate different types of interaction; binding (purple), regulation (dotted lines), genetic interaction (green) and direct regulation (grey). The different shapes represent different types of factors involved in the connections.
Notice that there are a number of unconnected nodes. These are our islands. They are likely unconnected because we do not know how they are connected in the network.
In this example, we are interested in how different subnetworks (TOR: target of rapamycin, and CRH: cellular response to heat) are tied to the RLS network, and in understanding how the addition of other subnetworks to the RLS network might further inform us about the dynamics of aging at a genetic level. We chose the TOR pathway because of its demonstrated effect on RLS, and we chose the CRH as a model for how the cell responds to environmental stressors [18,35,36,37]. Biological aging has been described as a cascading breakdown of processes resulting in a decreased ability of an organism to respond to stress, and thus a model of stress response was included [5]. We illustrate the TOR and CRH networks in figures 3 and 4. The combined total network (TOT) is illustrated in figure 5. All of the images were created using Pathway Studio software.
Illustration of the S. cerevisiae TOR network. The different color connecting lines indicate different types of interaction; binding (purple), regulation (dotted lines), genetic interaction (green), expression (blue), protein modification (golden green), promotion of binding (lime green) and direct regulation (grey). The different shapes represent different types of factors involved in the connections.
Illustration of the S. cerevisiae CRH network. The different color connecting lines indicate different types of interaction; binding (purple), regulation (dotted lines), genetic interaction (green), expression (blue), protein modification (golden green), promotion of binding (lime green) and direct regulation (grey). The different shapes represent different types of factors involved in the connections.
Illustration of the S. cerevisiae combined network TOT. The combined network is composed of RLS, TOR, and CRH. Here, we used the software program Cytoscape to graphically represent all three networks combined, with nodes color coded to show which network each represented protein belongs to. Yellow nodes represent proteins in the RLS network, green nodes proteins in the CRH network, and blue nodes proteins in the TOR network.
All of the networks were built by mining the literature (for details see [2,24]) and online databases. In particular, we added information from the Saccharomyces Genome Database, YEASTRACT, The Comprehensive Yeast Genome Database, The NetAge Database, Sageweb, and AmiGO. Items on the list were verified by comparing them to results from papers that measured the effect of gene deletions on RLS (see previous references). Lists for the TOR as well as the CRH were constructed using gene ontology terms on AmiGO [35] as well as the Saccharomyces Genome Database [37].
Once the lists were prepared, protein-protein interaction data were obtained using a yeast interaction database developed to work with the software program Pathway Studio. A network of direct connections (DC) was constructed as well as a shortest-path network (SP) for each of the RLS, TOR, and CRH subnetworks. A TOT was also constructed by fusing the three networks together. Pathway Studio not only showed the connections between proteins but also yielded information on their function and the type of interaction. We illustrate a sample list for the CRH network in table 1.
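The fusing step can be sketched in a few lines of Python. This is only an illustration of the idea of taking the union of several edge lists while remembering which subnetwork each protein came from; it is not the Pathway Studio procedure. The edge lists are hypothetical, and the names P1 to P4 are placeholder proteins invented for the example.

```python
# Hypothetical edge lists for illustration only; P1..P4 are placeholder proteins
# and the edges are not taken from the actual RLS, TOR or CRH networks.
subnetworks = {
    "RLS": [("TOR1", "SCH9"), ("HOS2", "P1")],
    "TOR": [("TOR1", "P2"), ("P2", "P3")],
    "CRH": [("HOS2", "P4")],
}

def fuse(subnetworks):
    """Union the edge lists and record which subnetworks each node belongs to."""
    edges = set()
    membership = {}
    for name, edge_list in subnetworks.items():
        for u, v in edge_list:
            edges.add(frozenset((u, v)))  # frozenset makes the edge undirected
            membership.setdefault(u, set()).add(name)
            membership.setdefault(v, set()).add(name)
    return edges, membership

tot_edges, membership = fuse(subnetworks)

# Nodes that belong to more than one subnetwork are the points where networks join.
shared = {n: sorted(m) for n, m in membership.items() if len(m) > 1}
print(len(tot_edges), "edges in the fused network")
print("shared nodes:", shared)
```

The membership table is what a visualization program would need in order to color code nodes by subnetwork.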
Analyzing the Network: An Example with Yeast and Caenorhabditis elegans
As we mentioned in the previous chapter, it is natural to conclude that the more edges going in and out of a node, the more likely it is that the node is of importance to the network. To help us understand the connectivity structure of the network, we create a connectivity plot (fig. 6). To do this, we first count the number of nodes with a given connectivity k, where the connectivity varies from zero to the maximum connectivity value. The number of nodes with a given connectivity k is called the frequency of that connectivity and is denoted f(k). Next, we plot the frequency f(k) versus the connectivity k. We illustrate this for Caenorhabditis elegans in figure 6. Because the C. elegans network is large, it has a smoother look to it. Let us look at the yeast RLS shortest-path network (RLS-SP) and construct the power-law graph (fig. 7).
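The counting procedure just described can be written down directly. The sketch below computes f(k) for a small hypothetical graph; the node names and edges are invented for illustration, and node "f" is deliberately left unconnected so that f(0) is non-zero.

```python
from collections import Counter

# Hypothetical toy network; node "f" is an unconnected island.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]

def degree_distribution(nodes, edges):
    """Return {k: f(k)} for k = 0 .. maximum connectivity."""
    degree = {n: 0 for n in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())
    k_max = max(degree.values())
    # Include connectivities that no node has, so that gaps in the plot are visible.
    return {k: counts.get(k, 0) for k in range(k_max + 1)}

f = degree_distribution(nodes, edges)
for k, fk in f.items():
    print(k, fk)
```

Plotting the printed pairs with any plotting tool gives the connectivity plot; with so few nodes the plot is irregular, which mirrors the difficulty noted for small networks.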
Illustration of a sample connectivity or degree distribution plot for the network in figure 1. See Witten and Bonchev [2] for more details. The rhombs represent the complete distribution. The squares are the data points binned into groups of three. The black solid line is the nonlinear regression line. Results are significant at p < 0.05.
The degree distribution plot for yeast RLS-SP network. The horizontal axis is the connection number k and the vertical axis is the frequency f(k). Notice that the degree distribution is more irregular and that there are an enormous number of zero and one values in the network. This makes fitting a power curve more difficult and also increases the likelihood that the fit will not be statistically significant due to the small sample size (number of nodes). Sometimes binning can help when there are zero node numbers. However, that also affects the fit.
As we mentioned in the previous chapter, studies of the statistical behavior of various network structures have shown that networks fall into a small number of overall topologies: random, regular, small-world and scale-free. Moreover, many real-world networks can be shown to be small-world or scale-free. Because scale-free networks are ubiquitous and highly relevant to our discussion, let us look at them a bit more closely. How can we determine if we have a scale-free distribution?
Power Plots and Scale-Free Networks
It is hard to interpret plots like those illustrated in figures 6 and 7. However, if we take the logarithm of both sides of f(k) = B k^(−γ), we obtain ln[f(k)] = −γ ln(k) + ln(B), so the more linear the log-log plot of the data, the more likely it is that the data follow a power curve. Thus, networks whose connectivity structure follows a power law of the form f(k) = B k^(−γ), where B and γ are parameters to be estimated, should appear as straight lines with negative slope on a log-log plot if they are scale free. The simplest way to estimate the parameters is to perform a linear regression on the log-log transformed f(k) versus k data, dropping the k = 0 data point because those nodes have no connections and ln(0) is undefined. We found that B = 1,992.4 and γ = 2.2499 with r^2 = 0.969 (fig. 8). Thus, our C. elegans longevity gene-protein network [2] can be said to be a scale-free network.
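The regression step can be sketched as follows. This is a plain least-squares fit of ln f(k) against ln k, not the exact procedure used for figure 8. The data are synthetic, generated from f(k) = 1000 k^(−2), so that the fit can be checked against known values; skipping bins with f(k) = 0 is an additional assumption made here because their logarithm is also undefined.

```python
import math

def fit_power_law(freq):
    """Least-squares line through (ln k, ln f(k)); returns (B, gamma, r_squared)."""
    # Skip k = 0 and empty bins: their logarithms are undefined.
    pts = [(math.log(k), math.log(f)) for k, f in freq.items() if k > 0 and f > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    syy = sum((y - mean_y) ** 2 for _, y in pts)
    slope = sxy / sxx                    # slope equals -gamma
    intercept = mean_y - slope * mean_x  # intercept equals ln(B)
    r_squared = sxy ** 2 / (sxx * syy)
    return math.exp(intercept), -slope, r_squared

# Synthetic frequencies from f(k) = 1000 * k^(-2); not data from any real network.
freq = {k: 1000.0 * k ** -2.0 for k in range(1, 21)}
freq[0] = 5.0  # a k = 0 bin, which the fit ignores

B, gamma, r_squared = fit_power_law(freq)
print(B, gamma, r_squared)
```

With real degree data the points scatter around the line, and r^2 indicates how well the straight line describes them.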
Illustration of the log-log data and regression curve through the data of figure 6. The outer lines are the 95% confidence interval boundaries for the linear regression estimate. See Witten and Bonchev [2] for more details.
Categorizing Small-World Networks
Due to the unique nature of scale-free networks, a log-log connectivity plot is enough to let you know if you are dealing with a scale-free network. However, this trick does not work for other network forms. Because many biological systems demonstrate small-world network behavior, we briefly examine how to determine whether or not a network is a small-world network.
To help characterize small-world networks, in the previous chapter, we introduced a few new network descriptors. The first was the average path length of a network. Path length is the distance or number of edges between two nodes in the network. We used the idea of path length to construct the minimum path length between node n_i and node n_j and denoted it by ℓ_ij. We now introduce the concept of the diameter of a network. The diameter is the largest of the minimum path lengths ℓ_ij between any two nodes in the network. Consider the example network in figure 9.
Now, consider the adjacency matrix A derived from the network in figure 9, which is illustrated on the left hand side of table 2. The matrix A represents the number of length 1 paths between node n_i and node n_j. If we multiply A × A, the non-zero entries represent the number of length 2 paths between each pair of nodes (strictly speaking, these are walks, since a node may be visited more than once). If we repeat this process up to the power N − 1, where N is the total number of nodes in the network, then we obtain the number of paths of length 1, 2, …, N − 1 between each pair of nodes. The right hand side of table 2 illustrates A^5 = A × A × A × A × A, which is the number of paths of length five between each pair of nodes. So, for example, there are 29 possible paths of length 5 between node C and node B. In a connected network, at some point between 1 and N − 1 multiplications, every entry for a pair of distinct nodes will have been non-zero in at least one of the matrices A, A^2, A^3, A^4, …, A^(N−1). The diameter is the smallest power m such that each of these entries has taken a value greater than 0 at least once in the sequence A, A^2, …, A^m. For this particular network, the diameter is 4.
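The matrix-power procedure can be sketched in Python. Because the network of figure 9 and table 2 is not reproduced here, the example uses a hypothetical five-node chain, whose diameter is known to be 4. The sketch tracks only pairs of distinct nodes, which is an assumption made so that the zero diagonal of A does not affect the result.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diameter_by_powers(A):
    """Smallest m such that every off-diagonal entry was non-zero in some A^1..A^m.

    Returns None if the network is disconnected (no such m exists up to N - 1).
    """
    n = len(A)
    seen = [[i == j for j in range(n)] for i in range(n)]  # diagonal is ignored
    power = [row[:] for row in A]                          # current power, A^m
    for m in range(1, n):
        for i in range(n):
            for j in range(n):
                if power[i][j] > 0:
                    seen[i][j] = True
        if all(all(row) for row in seen):
            return m
        power = mat_mul(power, A)
    return None

# Hypothetical chain 0-1-2-3-4; this is not the network of figure 9.
chain = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]

print(diameter_by_powers(chain))
print(mat_mul(chain, chain)[0][2])  # number of length 2 walks from node 0 to node 2
```

For large networks a breadth-first search from each node is far cheaper than repeated matrix multiplication, which is one reason to rely on established software.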
A network is considered a small-world network if the diameter is small relative to the number of nodes in the network. Obviously, 4 is not small relative to 8. To really understand small-worldness, one network sample is not sufficient; you actually need a set of graphs. However, we have only the one network graph. Therefore, we need other means of studying the behavior of a network to determine whether it is a small-world network. To do this, we introduced the idea of clustering. Observe that even for a small network such as the one we have illustrated, the calculations can become tedious. We will discuss software at a later point in this chapter.
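One common way to quantify clustering is the local clustering coefficient: for a node with k neighbors, the fraction of the k(k − 1)/2 possible neighbor pairs that are themselves connected. The sketch below assumes this standard definition, which may differ in detail from the one used in the previous chapter, and applies it to a hypothetical four-node graph.

```python
from itertools import combinations

def build_adjacency(edges):
    """Return {node: set of neighbors} for an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def local_clustering(adj, node):
    """Fraction of neighbor pairs of `node` that are directly connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention: undefined cases are reported as 0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))

# Hypothetical graph: a triangle a-b-c with a pendant node d attached to c.
adj = build_adjacency([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])

coefficients = {n: local_clustering(adj, n) for n in adj}
average_clustering = sum(coefficients.values()) / len(coefficients)
print(coefficients)
print(average_clustering)
```

A network with high average clustering and short average path length, compared with a random graph of the same size, is the usual signature of small-world structure.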
Node-Node Connectivities
In the previous chapter, we talked about the idea of the centrality of a node where centrality is a measure of the ‘position' or relative importance of a node in a network. In the literature, there are four main measures of centrality of a node: degree centrality, betweenness centrality, closeness centrality and eigenvector centrality. From an aging-related perspective, understanding node centrality of the nodes in a network could lead to potential targets for pharmaceuticals that might help hinder disease progression or extend life span. Briefly, the main centrality measures are:
• Degree centrality, denoted C_D(n_i), measures the chance that a given node n_i in the network will receive something flowing along the network.
• Closeness centrality, denoted C_C(n_i), can be thought of as a measure of how long it will take to send a chemical or other biological signal out from n_i to all of the other nodes in the network.
• Betweenness centrality, denoted C_B(n_i), looks at how often, in a network, a given node n_i acts as a bridge along the shortest path between two other nodes. From a biological perspective, knocking out a node with high betweenness centrality would force a signal to reroute itself along a path that was not the shortest path.
• Eigenvector centrality, denoted C_E(n_i), is a measure of the 'influence' of a node in a network.
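The first three measures in the list above can be sketched with breadth-first search. The normalizations used here (degree divided by N − 1, closeness as N − 1 over the sum of distances, betweenness left unnormalized) are common conventions and are assumptions, not necessarily those of the previous chapter. Betweenness is computed with Brandes' algorithm, and the five-node chain is a hypothetical example.

```python
from collections import deque

def degree_centrality(adj):
    n = len(adj)
    return {v: len(adj[v]) / (n - 1) for v in adj}

def closeness_centrality(adj):
    """(N - 1) / (sum of shortest-path distances); assumes a connected graph."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        result[source] = (len(dist) - 1) / sum(dist.values())
    return result

def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph (unnormalized)."""
    cb = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}     # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):        # accumulate dependencies, farthest first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                cb[w] += delta[w]
    return {v: c / 2 for v, c in cb.items()}  # each pair was counted twice

# Hypothetical chain a-b-c-d-e.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}

deg = degree_centrality(adj)
clo = closeness_centrality(adj)
btw = betweenness_centrality(adj)
print(deg["c"], clo["c"], btw["c"])
```

In the chain, the middle node c lies on the shortest path between four pairs of other nodes, so removing it would cut every signal passing from one end to the other.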
Eigenvector centrality is commonly used as a centrality measure because it is an influence measure for a node, and highly influential nodes could be potential targets for further study or drug design. In figure 10, we illustrate the SP for the RLS network. For a discussion of SPs in aging, see Managbanag et al. [24]. In figure 11, we illustrate the corresponding eigenvector centrality measures for the various nodes in the RLS-SP network. The larger the eigenvector centrality, the larger the influence in the RLS-SP network.
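Eigenvector centrality assigns each node a score proportional to the sum of its neighbors' scores, that is, the leading eigenvector of the adjacency matrix. A minimal power-iteration sketch follows. Iterating with A + I rather than A is a convergence device assumed here (it leaves the eigenvectors unchanged), and the star-shaped graph is a hypothetical example, not the RLS-SP network.

```python
import math

def eigenvector_centrality(adj, max_iter=1000, tol=1e-12):
    """Power iteration on (A + I); scores are scaled to unit Euclidean length."""
    x = {v: 1.0 for v in adj}
    for _ in range(max_iter):
        # One multiplication by (A + I): own score plus the neighbors' scores.
        new = {v: x[v] + sum(x[u] for u in adj[v]) for v in adj}
        norm = math.sqrt(sum(val * val for val in new.values()))
        new = {v: val / norm for v, val in new.items()}
        if sum(abs(new[v] - x[v]) for v in adj) < tol:
            return new
        x = new
    raise RuntimeError("power iteration did not converge")

# Hypothetical star: hub h connected to three leaves.
adj = {"h": {"a", "b", "c"}, "a": {"h"}, "b": {"h"}, "c": {"h"}}

centrality = eigenvector_centrality(adj)
for node in sorted(centrality, key=centrality.get, reverse=True):
    print(node, round(centrality[node], 4))
```

Sorting nodes by this score gives the kind of ranking that a plot of centrality by node name is meant to convey.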
Illustration of the RLS-SP network. Note how highly connected the one node in the upper left hand side is. That node, by the way, is RPL16B ribosomal 60S subunit protein L16B; N-terminally acetylated, binds 5.8 S rRNA; transcriptionally regulated by Rap1p; homologous to mammalian ribosomal protein L13A and bacterial L13 [37].
Illustration of the eigenvector centrality measures for the yeast replicative life span shortest-path network. The horizontal axis is the node name in the network, and the vertical axis is the eigenvector centrality value. The larger the eigenvector centrality, the larger the influence in the RLS-SP network.
Thus, from figure 11, we would infer that GPA2, CDC25, SCH9 and CYR1 are influential nodes and therefore likely candidates to investigate further:
• SCH9 - AGC family protein kinase; functional ortholog of mammalian S6 kinase; phosphorylated by Tor1p and required for TORC1-mediated regulation of ribosome biogenesis, translation initiation, and entry into G0 phase; involved in transactivation of osmostress-responsive genes; regulates G1 progression, cAPK activity and nitrogen activation of the FGM pathway [37].
• CYR1 - Adenylate cyclase; required for cAMP production and cAMP-dependent protein kinase signaling; the cAMP pathway controls a variety of cellular processes, including metabolism, cell cycle, stress response, stationary phase, and sporulation [37].
• CDC25 - Membrane-bound guanine nucleotide exchange factor; indirectly regulates adenylate cyclase through activation of Ras1p and Ras2p by stimulating the exchange of GDP for GTP; required for progression through G1 [37].
• GPA2 - Nucleotide-binding α-subunit of the heterotrimeric G protein; interacts with the receptor Gpr1p; has a signaling role in response to nutrients [37].
Witten and Bonchev [2] illustrate these concepts for the C. elegans longevity gene network illustrated in figure 1 of that paper.
Most network analysis programs calculate the basic centrality and other measures. The algorithms for these calculations are tedious and not trivial to program. Therefore, it is better to use one of the programs discussed in the upcoming software section rather than to write your own programs to make the calculations. In table 3, we illustrate the properties of all four of our original networks and their corresponding SPs.
Some of the basic network variables for the original CRH, RLS, TOR and TOT networks and their corresponding shortest-path networks

Interpreting the Results
From the eigenvector centrality, we have already seen a number of genes worth investigating due to their influential nature in the networks. From the graph of the DC, we discovered that the TOR1 protein was shared between the RLS and TOR networks, and the HOS2 gene was shared between the RLS and the CRH networks. We also found that the TOR and CRH networks were densely connected to the RLS network. However, there were relatively few connections between the CRH and TOR networks. TOR1 is a PIK-related protein kinase and rapamycin target; subunit of TORC1, a complex that controls growth in response to nutrients by regulating translation, transcription, ribosome biogenesis, nutrient transport and autophagy; involved in meiosis [37]. HOS2 is a histone deacetylase and subunit of Set3 and Rpd3L complexes; required for gene activation via specific deacetylation of lysines in H3 and H4 histone tails; subunit of the Set3 complex, a meiotic-specific repressor of sporulation-specific genes that contains deacetylase activity [37]. We observe that all of these targets are related to growth and division in some way.
We were able to demonstrate that the SPs followed a power-law distribution. This was not the case for the DCs, perhaps because of their relatively small size. Mean vertex degree was noticeably different between the SPs and DCs. The difference was pronounced in the TOR network, where the shortest-path mean vertex degree was nearly double that of the DC, and even more so in the TOTs, where it was more than double. Node densities also differed markedly, with the SPs having lower densities than the DCs. For the RLS networks, this was far less pronounced, with the SP having half the node density of the DC, whereas the TOR shortest-path node density was one tenth that of the DC. The network diameter was similar for the TOR DC and SP, yet for the RLS and TOT networks the shortest-path diameter was half that of the DC.
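A short calculation shows why mean vertex degree and density can move in opposite directions when a network grows. The sketch below assumes the standard definitions for an undirected network with N nodes and E edges, mean degree 2E/N and density 2E/(N(N − 1)); the quantity called node density in table 3 may be defined differently. The node and edge counts are hypothetical and are not values from table 3.

```python
def mean_degree(n_nodes, n_edges):
    """Average number of edges per node in an undirected network."""
    return 2.0 * n_edges / n_nodes

def density(n_nodes, n_edges):
    """Fraction of all possible node pairs that are connected."""
    return 2.0 * n_edges / (n_nodes * (n_nodes - 1))

# Hypothetical sizes: a small direct-connection network and a larger
# shortest-path network that adds intermediate nodes and their edges.
dc_nodes, dc_edges = 10, 12
sp_nodes, sp_edges = 40, 96

print("DC:", mean_degree(dc_nodes, dc_edges), density(dc_nodes, dc_edges))
print("SP:", mean_degree(sp_nodes, sp_edges), density(sp_nodes, sp_edges))
```

In this example the larger network has twice the mean degree but less than half the density, because the number of possible node pairs grows roughly with the square of N.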
Software for Network Analysis
Because many networks have large numbers of nodes and connections, it is not practical to hand-calculate the various network descriptors that we have discussed. Over the past decade, a number of network analysis software packages have become available. Two of the most commonly used packages are Pajek, available at http://vlado.fmf.uni-lj.si/pub/networks/pajek/, and Cytoscape, available at http://www.cytoscape.org/. Another excellent package is NetworkX from Los Alamos National Laboratory, which can be downloaded at http://networkx.lanl.gov/index.html. All of these packages offer free downloads for numerous computing platforms and operating systems.
One of the challenges in understanding large complex networks, including biological networks, is visualizing them. Both Pajek and Cytoscape offer network visualization tools, and a number of other powerful visualization packages are now available. CFinder, available at http://cfinder.org/, is a cluster and community software package designed for finding and visualizing dense groups of nodes in networks. Gephi, available at https://gephi.org, is an open graph visualization program that allows the user to perform exploratory data analysis and link analysis on a given network and to generate high-quality printable network images. There are many other social network analysis software packages now available, and they frequently allow the user to analyze biological networks as well as other network forms. An excellent discussion of available network analysis and visualization software may be found at http://en.wikipedia.org/wiki/Social_network_analysis_software.
Future Directions
Future directions for the research include adding further yeast subnetworks that are believed to have a tie to aging processes. In addition, we will add networks that are believed to be unrelated to replicative aging processes. These unrelated networks will serve as control networks. For the TOT of direct interactions, proteins were labeled to show which group they belonged to (TOR, heat-shock, RLS, or shared). It would be helpful to do the same for the TOT of shortest-path connections.
We will also take what we have learned about studying yeast networks and use it to study protein-protein interaction networks in other species, such as C. elegans, Drosophila melanogaster and eventually humans. By using C. elegans, a wider variety of genes that have an effect on aging can be studied, such as the FOXO gene. However, because C. elegans is multicellular, its genome is more complex, having about 20,000 genes as opposed to only 6,000 in yeast. Homologs to yeast genes/proteins in other organisms can be investigated as possible important genes/proteins. Using human interaction networks allows the study to be directly related to the study of aging in humans, which is the ultimate goal. With an expanded yeast network, it will be easier to show links between existing data and studies of other model organisms. It might also help guide decisions on which networks to study in C. elegans and humans.
Closing Thoughts
In the previous sections, we introduced a large number of concepts and constructs that are based upon the premise that biological systems can be represented as network graphs. These concepts described how network nodes were interconnected and the consequences of certain specific classes of connectivity and network structure. At the 1982 Palo Alto American Mathematical Society meeting, Witten presented a paper on representing aging using the model of network decay. Of course, in those days, network analysis was not what it is today, and we had next to nothing of the genomic and network level data that we now have. However, even then, it was natural to consider aging as the temporal decay of a hypothetical organismal 'aging network'. How then may we extend these ideas to the study of aging?
While little is currently known about how aging-related networks evolve across the organism's life span, it is reasonable to assume that two kinds of change can occur: inactivation of active nodes or activation of inactive nodes, and loss of connectivity or increase in connectivity. How or why nodes become inactive or edges disappear is irrelevant here; what matters is that they do. It turns out that the structure of small-world networks, due to their hub connectivity, makes them vulnerable to targeted attacks aimed at specific hubs. Attacks that knock out essential genes are knocking out the life span network because the organism dies when an essential gene is knocked out. Thus, essential genes are critical hub genes [2,38,39]. Small-world neural networks have been shown to exhibit short-term memory capability. This suggests that memory decay, such as that seen in Alzheimer's disease (AD), may be related to decay of brain neural network structure in such a way as to remove the small-worldness property of the memory network. Understanding patterns in network decomposition could lead to potential early AD detection and to potential pharmaceutical intervention at earlier points in the disease course.
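The vulnerability of hub-dominated networks to targeted attack can be illustrated with a small experiment: remove one node and measure the size of the largest connected piece that remains. The network below is hypothetical, and using the largest connected component as the measure of damage is an assumption made for this sketch.

```python
def largest_component(adj, removed=()):
    """Size of the largest connected component after deleting `removed` nodes."""
    removed = set(removed)
    seen = set()
    best = 0
    for start in adj:
        if start in removed or start in seen:
            continue
        # Depth-first search over the nodes that are still present.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in removed and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

# Hypothetical network: hub h linked to a, b, c and d, plus one extra edge a-b.
adj = {
    "h": {"a", "b", "c", "d"},
    "a": {"h", "b"},
    "b": {"h", "a"},
    "c": {"h"},
    "d": {"h"},
}

print("intact:", largest_component(adj))
print("hub removed:", largest_component(adj, removed=["h"]))
print("peripheral node removed:", largest_component(adj, removed=["c"]))
```

Losing the peripheral node leaves the rest of the network connected, whereas losing the hub fragments it, which is the behavior described above for attacks aimed at hubs.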
Connectivity gain and loss also have implications when it comes to discussing the hierarchical modularity of aging-related network architectures. Loss of connectivity through inactivity of a node or through loss of an edge could unlink an entire module of importance. Thus, nodes that connect modules within a larger network are critical to the functioning of the network. Questions around the role of evolutionary processes in the development of network architectures of various organisms may be of importance in understanding how network architectures related to aging processes are constructed. Why are some components of a network redundant while others are not (see also all of the citations on reliability theory)? What is the role of backup subnetworks? What is the importance of robustness and resilience? Why are some networks more robust to attack [46,47,48,49,50], less fragile than others or more frail [40,41,42,43,44]? How do we balance the need to adapt and evolve with robustness [45]? What, if any, is the association of life span with network architecture? These and many other questions remain to be answered.
Acknowledgements
The authors would like to thank many individuals for their respective support and collaborative kindnesses. In alphabetical order, we would like to acknowledge our colleagues and friends: Danail Bonchev, S. Michal Jazwinski, Tom Johnson, Matt Kaeberlein and Brian Kennedy for their support and access to data and software. An expanded bibliography for both chapters is available at http://www.people.vcu.edu/~tmwitten.