Landscape of gene networks for random parameter perturbation

Chunhe Li ab
aShanghai Center for Mathematical Sciences, Fudan University, Shanghai, China. E-mail: chunheli@fudan.edu.cn
bInstitute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China

Received 21st November 2017 , Accepted 3rd January 2018

First published on 4th January 2018


Landscape approaches have been exploited to study the stochastic dynamics of gene networks. However, how to calculate the landscape with a wide range of parameter variations and how to investigate the influence of the network topology on the global properties of gene networks remain to be elucidated. Here, I developed an approach for the landscape of random parameter perturbation (LRPP) to address this issue. Based on a self-consistent approximation approach, by making perturbations to parameters in a given range, I obtained the landscape for gene network systems. I applied this approach to two biological models, one for the mutual repression model and the other for the embryonic stem (ES) cell differentiation network. For the mutual repression model, my results confirm quantitatively that positive feedback promotes the robustness of multistability. For the ES cell differentiation model, I identify three cell states, representing the ES cell, the differentiation cell, and the intermediate state cell, respectively. I propose that the intermediate states and the wide range of parameter values coming from inhomogeneous cellular environments provide possible explanations for the heterogeneity observed in single cell experiments. I also offer a counterintuitive result that noise could reduce heterogeneity and promote the stability of cell states. These results support that the network topology determines the operating principles of the genetic networks, reflected by the representative landscapes from LRPP. This work provides a new route to obtain the potential landscape for a gene network system given a wide range of parameter values and study the influences of the network topology on the global properties of the system.



Insight, innovation, integration

An approach for the landscape of random parameter perturbation (LRPP) is developed to investigate the global properties of gene networks. The results quantitatively demonstrate that the topology determines the dynamical properties of the gene networks. For the ES cell differentiation model, attractors for the ES cell, the differentiation cell, and the intermediate state cell are identified respectively. The intermediate states and the wide range of parameter values coming from inhomogeneous cellular environments provide possible sources for the heterogeneity observed in single cell experiments. This work provides a general framework to study the global properties of biological systems considering the wide range of parameter variations, and physical insights into understanding quantitatively the relationship between network topology and interaction strengths.

1 Introduction

The Waddington epigenetic landscape,1 as a metaphor, has been frequently exploited to investigate the cell fate decision process during development and differentiation. Recently the Waddington landscape has been quantified for some biological systems, such as for stem cell differentiation and cancer.2–6 Based on the relevant gene regulatory networks, different approaches have been developed to construct the landscape of biological networks.2,3,7–16 However, some critical issues remain challenging. First, in the previous work, the landscape is usually acquired based on certain fixed parameter values, and the influences of parameters on the dynamics of the system are explored by the sensitivity analysis for a single parameter.4,17 Under these circumstances, the landscape acquired is actually based on a small parameter region compared to the whole parameter space, especially for the high-dimensional cases.

Another critical question is how to investigate the effects of the network structure on the global properties of the system. One possible way for that is to compare two network structures with similar parameter values. However, the conclusion drawn from this type of comparison is limited because each network structure could have a wide range of parameter choices, resulting in different dynamical behaviors, and it is impractical to compare two networks with all parameter choices. Previous studies have computationally investigated the circuit architecture for network function by enumerating all three-node topologies.18 The sensitivity of parameters in models has also been explored in different ways, such as the catastrophe theory19 and the sloppy model.20,21 Recently, Huang et al.22 reported a random circuit perturbation method to study the robustness of the dynamical behavior of gene regulatory networks against perturbations of circuit kinetic parameters, which provides a way to explore the effects of parameters on the gene network systems. However, these previous studies are usually based on the deterministic modeling, and the stochasticity of the gene expressions (intrinsic and external gene expression fluctuations) against parameter perturbations has yet to be studied. Also, the global stability of attractors or stable states in the network remains elusive from these methods.

In this study, based on the landscape theory of gene networks,4,13 I develop an approach to generate the landscape of random parameter perturbations (LRPP), so that I can investigate the global properties of the system using the landscape from a wider range of parameter regions. In other words, I ask what will be the consequence in terms of the landscape and the global properties of the system if I introduce random perturbations to the kinetic parameters in the network. I have successfully tested this approach in a simplified mutual repression (MR) model (also called toggle switch model), as well as in a complex gene network for embryonic stem (ES) cell differentiation. For both models, the LRPP can generate the landscape with multi-stable states in a wide parameter range. For example, for the MR model, the landscape displays bistable or tristable properties depending on different network topologies (Fig. 1D–I), and for the ES differentiation network, the landscape displays tristable or multi-stable state properties (Fig. 2B–E).


image file: c7ib00198c-f1.tif
Fig. 1 Landscape generated with LRPP for the mutual repression models. Panels (A–C) show the diagrams for the mutual repression (MR) model (A), the mutual repression with one self-activation (MRSA1) model (B) and the mutual repression with two self-activations (MRSA2) model (C). Arrows represent activation and short bars represent repression. Panels (D–I) show the landscapes for the MR model (D and G), the MRSA1 model (E and H), and the MRSA2 model (F and I). The landscape is robust against different types of distributions (uniform distribution for panels (D–F) and Gaussian distribution for panels (G–I)) used to sample the parameters. X and Y represent the relative gene expression level of genes X and Y, respectively.

image file: c7ib00198c-f2.tif
Fig. 2 Landscape generated with LRPP for the ES cell developmental network. (A) Diagram for an embryonic stem (ES) cell developmental network including 8 gene nodes. Magenta nodes represent the ES cell marker gene and cyan nodes represent differentiation marker genes. Red arrows represent activation and blue short bars represent repression. (B–D) Landscape generated with LRPP for the ES cell developmental network at different parameter range. The uniform distribution is used to sample random parameters. The diffusion coefficient is set as D = 0.01. The parameter range is set as a: 0.01–1, b: 0.01–1 (B), a: 0.01–0.5, b: 0.01–0.5 (C), and a: 0.01–0.5, b: 0.5–1 (D). (E) Landscape for the ES cell developmental network at a larger noise level. The parameter range is the same as the one in (D) but with a different noise level. The diffusion coefficient is set as D = 0.05. The multiple basins on the landscape reflect the heterogeneity of cell populations in the stem cell differentiation process.

For the MR model, my results from the landscape confirm quantitatively that positive feedback promotes multistability, consistent with previous findings. For the ES differentiation network, I identify three basins or attractors on the landscape, which correspond to the ES cell, the differentiation cell, and the intermediate state cell, individually. I propose that the existence of intermediate states and the wide range of parameter values provide a possible source of heterogeneity observed in single cell experiments. Additionally, by studying the effects of the network structure on stem cell differentiation, I found, remarkably, that although the random parameter perturbations are introduced, the critical roles of the network structure in the global properties of the system are preserved, which proposes that the network topology is the major determinant of the global behavior of the system.

The LRPP approach proposed here provides a way to generate the landscape for a multidimensional gene network system with a wide range of parameter perturbations. It can also be used to study the effects of distinct network structures on the global properties of the system in a more meaningful way. Different from previous approaches (e.g. bifurcation theory) for studying the effects of parameters or structures on the behavior of the system, the LRPP considers the stochastic properties in biological networks, and provides a quantitative way to study the global stability of attractors or cell states under parameter fluctuations in biological systems. I expect LRPP to be a useful tool to study the stochasticity and global stability of gene regulatory or other dynamical systems.

2 Methods

2.1 Self-consistent mean field approximation

The stochasticity of the gene regulatory network dynamics, coming from external and intrinsic noise sources, has been extensively studied by many previous studies.23–25 Previously, we have developed a self-consistent mean field approximation approach to study the stochastic dynamics of gene networks by acquiring the potential landscape for multi-dimensional gene networks.4,13,26 The time evolution of a dynamical system is determined by the Fokker–Planck equation (or probabilistic diffusion equation). Given the system state P(X1,X2,…,Xn,t), where X1,X2,…,Xn represent the concentration or number of molecules, we expect to have n-coupled differential equations, which are difficult to solve. Following a self-consistent mean field approach,4,13,26–28 we can split the probability into the products of the individual ones: P(X1,X2,…,Xn,t) ∼ ΠniP(Xi,t) and solve the probability self-consistently. In this way, we effectively reduce the dimensionality of the system (from MN to MN), and therefore make the computation of the problem tractable.

However, for the multidimensional cases, it is still challenging to solve the diffusion equations directly. We start from the moment equations and assume specific probability distribution based on physical arguments, i.e., we assume some specific connections between moments. In principle, once we know all moments, we can obtain the probability distributions. Here, we assume a Gaussian distribution as an approximation, which means that we need to calculate two moments, the mean and the variance.

When the diffusion coefficient D is small, the moment equations can be approximated to29,30

 
image file: c7ib00198c-t1.tif(1)
 
image file: c7ib00198c-t2.tif(2)

Here, x, σ(t) and A(t) are vectors and tensors, and AT(t) is the transpose of A(t). The elements of matrix A are specified as image file: c7ib00198c-t3.tif. Based on these equations, we can solve [x with combining macron](t) and σ(t). Here, we only consider the diagonal elements of σ(t) from the mean field approximation. Therefore, the evolution of probability distribution for each variable can be acquired from the Gaussian approximation:

 
image file: c7ib00198c-t4.tif(3)

Here, [x with combining macron](t) and σ(t) are the solutions from eqn (1) and (2). The probability distribution obtained above corresponds to one steady state or basin of attraction. If the system has multiple steady states, we need several probability distributions localized at each basin with different variances. So, the total probability is the weighted sum of all these probability distributions. From the mean field approximation, we can extend this formulation to the multidimensional systems by assuming that the total probability is the product of each individual probability for each variable. Finally, with the total probability distribution, we can obtain the potential landscape by U(x) = −ln[thin space (1/6-em)]Pss(x). Here Pss(x) represents the steady state probability distribution and U(x) denotes the dimensionless potential.

2.2 Landscape for random parameter perturbations

Based on the above procedures, we can obtain the landscape for gene network systems with multiple dimensions at a certain fixed parameter choice. However, in realistic biological systems, there could be another source of fluctuation, which has yet to be considered explicitly. The heterogeneity has been observed in single cell populations. One indication from this observation is that each cell could have different parameter values. Additionally, the precise parameter values for the kinetics of all kinds of regulatory interactions are usually missing, especially from in vivo experiments. In this case, it is natural to assume that the parameters should take some range, rather than some specific values. In this work, this kind of randomness is studied by randomly picking parameters from a fixed distribution (such as a Gaussian distribution or a uniform distribution) for each model, which leads to an ensemble of models. Therefore, it is important to investigate the landscape and dynamics of the system considering an ensemble of models, which is challenging because it means that one needs to study many different landscapes generated by each set of parameter values. To address this problem, I develop a new approach to generate the landscape of random parameter perturbations (LRPP) for the gene network systems. The LRPP aims to combine these two kinds of randomness (from gene regulatory dynamics and from uncertainty of parameter values, e.g. the strengths for interactions), and discover the landscape for the ensemble of models, rather than the models with some specific parameter values.

Given a network topology, the time evolution of different gene expression level in the system is governed by the corresponding ordinary differential equations (ODEs), shown in eqn (4).

 
image file: c7ib00198c-t5.tif(4)

Here, F represents the driving force for the time evolution of different variables (gene expression level). In eqn (4), i ∈ (1,2,…,N) represents the index of variables. There are in total N equations describing the time evolution of N genes. S is the threshold of the sigmoidal function, and n is the Hill coefficient, which determines the steepness of the sigmoidal function.3,4A and B are the interaction matrices respectively for the activation and repression regulations. Aji = a when gene j activates gene i, otherwise Aji = 0. Bji = b when gene j inhibits gene i, otherwise Bji = 0. In addition, a is the activation constant, b is the repression constant and k is the degradation rate (see Tables 1 and 2 for the description of parameters and the range of parameter values). In eqn (4), the first term represents activation regulation from gene j to gene i (this term represents self-activation when i = j), the second term represents repression from gene j to gene i (this term represents self-repression when i = j), and the last term represents degradation.

Table 1 Parameter value range for the mutual repression (toggle switch) model
Symbol Value Unit Description
k 0.1–1 hour−1 Degradation rate
a 0.1–1 μM hour−1 Activation constant
b 0.1–1 μM hour−1 Repression constant
n 1–6 Hill coefficient
S 0.1S0–1.9S0 Threshold for Hill function, S0 = b/2


Table 2 Parameter value range for the ES cell differentiation model (corresponding to Fig. 2D)
Symbol Value Unit Description
k 0.1–1 hour−1 Degradation rate
a 0.01–0.5 μM hour−1 Activation constant
b 0.5–1 μM hour−1 Repression constant
n 1–6 Hill coefficient
S 0.1S0–1.9S0 Threshold for Hill function, S0 = b/2


From the ODEs, to consider the wide distribution of parameter values, I sample the parameter values from some distributions (uniform distribution or Gaussian distribution) based on the range of parameters and generate an ensemble of models where each model has distinct parameter values. For each of these models, I apply the self-consistent approximation approach to generate the potential landscape as described in the previous section. To find the multistability of the system, I solve the ODEs numerically by giving many different initial conditions (10[thin space (1/6-em)]000 different random initial conditions are picked for each model).

In this study, the initial conditions for the ODEs are randomly picked from a uniform distribution based on the range of different variables in the models. For example, for the mutual repression (MR) model, the mutual repression with one self-activation (MRSA1) model and the mutual repression with two self-activation (MRSA2) model, the range for the variables X and Y (representing the relative expression level of genes X and Y) is set as from 0 to 10. The maximum of X and Y from the ODE solutions is determined by (a + b)/k, so the range from 0 to 10 is reasonable to cover all possible steady states. Then the initial conditions for X and Y are randomly chosen from a uniform distribution from 0 to 10, i.e. U(0,10). For a similar reason, for the ES cell differentiation model, the range of variables (relative gene expression level) involved in the model is set as from 0 to 10, and the initial conditions for the 8 variables are randomly chosen from a uniform distribution U(0,10). In this way, I can get the weight factor wj (j = 1,2,…,Q, where Q is the total number of steady states) of different steady states (attractors) based on the proportion of initial conditions, which lead to the corresponding steady states.

Then, the landscape for the ensemble of the models is obtained by summing over all the landscapes with distinct parameter values. In this work, I randomly sample 10[thin space (1/6-em)]000 parameter sets to generate the landscape. Specifically, image file: c7ib00198c-t6.tif, where Psse(x) represents the steady state distribution of the ensemble of the models, Pssi(x) represents the steady state distribution of each individual model, and M is the number of random parameter sets (in this work I choose M to be 10[thin space (1/6-em)]000). The potential landscape for the ensemble of the models is calculated by U(x) = −ln[thin space (1/6-em)]Psse(x).

In the computational models, I have five critical parameters, which are degradation rate k, activation constant a, repression constant b, threshold S, and Hill coefficient n. The randomization process is performed on all five types of parameters. To make these five parameters representative in the parameter space with biological meanings, I set a large range for each of them estimated from experiments based on some biological constraints.17,31 Specifically, k is set as 0–1 hour−1, a is set as 0–1 μM hour−1, and b is set as 0–1 μM hour−1. The Hill coefficient n is set as a number between 1 and 6 as these values can account for most of the binding behavior for the gene regulations in biological systems. Here each parameter is set by randomly picking values from either a uniform distribution or the Gaussian distribution. How to choose the value of the Hill function threshold S is critical to the system dynamics. I first make a simple guess, and set S0 = b/2, where b is the repression constant. S is the threshold value for the activation or repression of a regulation, so a reasonable assumption is that all regulations sometimes work and sometimes do not work, i.e., the regulator levels need to be sometimes above the threshold and sometimes below the threshold. Therefore, a reasonable guess for S0 is the maximum expression level divided by 2. For the most simple mutual repression model (toggle switch model), the maximum regulator level is determined by b, so I make a simple guess S0 = b/2. Then I give S a big enough perturbation around S0, i.e., S = S0r, where r is a random number between 0.1 and 1.9. This represents a wide perturbation to S0, from −90% to 90%.

To summarize, to get the landscape for a gene network with a wide range of parameter values, the LRPP method takes the following steps:

(1) Write down the ODEs according to the structure (topology) of the gene regulatory network.

(2) Estimate the range of different parameters in the model based on the mathematical and biological constraints.

(3) Randomly choose a set of parameter values to get a model based on some distributions (such as uniform distribution or Gaussian distribution), calculate how many stable steady states the system has by giving many different initial conditions, and get the weight factor for each stable state. Calculate the landscape for the current parameter setting based on the self-consistent approximation.

(4) Repeat step 3 and sum over the landscapes for different models with different randomly chosen parameter values, and obtain the final landscape for the ensemble of the gene network models.

After obtaining the landscape for a gene network, we can study how the topology of the network affects the global behavior of the system. Here I used two different ways for changing activation regulations and repression regulations. For the activation regulations, deleting an activation link in the network represents the knockout of the corresponding activation regulation. For the repression regulations, I set the threshold level parameter S as a huge number (e.g. 1010) to represent the knockout of a repression regulation.

3 Results

3.1 LRPP for the mutual repression model

I first test this approach for the mutual repression (also called toggle switch) model (with no self-activation, with one self-activation and with two self-activations), indicated in Fig. 1A–C. I call them the MR (mutual repression) model, the MRSA1 (with one self-activation) model, and the MRSA2 (with two self-activations) model, respectively. From previous work on the MRSA2 model,3 the major parameters used in the model have been set in the range where the activation constant a is 0–1, repression constant b is 0–1, degradation rate k is 0–1, and the Hill coefficient n is 1–6 (see parameter setting in Table 1).

With the LRPP, I acquired the landscape for the three network structures respectively (Fig. 2, see Fig. S1, ESI for the landscapes from different angles). For the MR model (Fig. 1A), I found a bistable landscape with two deep basins (Fig. 1D and G). After adding a self-activation link to the model (Fig. 1B), the landscape is still bistable, but it is biased to one of the two basins because of the self-activation regulation for one of the two genes (Fig. 1E and H). Furthermore, if both genes are self-activated, the model generates a stable tristable landscape (Fig. 1F and I), with a new intermediate state. It has been suggested that the positive feedback loop promotes bistability and leads to a greater robustness for biological networks.32,33 The landscape results obtained here confirm this notion quantitatively. Specifically, the barrier height (see Fig. 1F for the definition of barrier) can be calculated from the landscape, which provides a quantitative measure for the global stability of the basins or stable states. Therefore, these results show quantitatively that the positive feedback loops actually promote the robustness of the multistability, i.e., for the three models investigated here, the MRSA2 model is most robust to generate a tristable landscape. Additionally, I found that the form of randomization process does not affect this conclusion (in Fig. 1D–I, the uniform distribution is assumed for (D–F) and the Gaussian distribution is assumed for (G–I)). These results suggest that LRPP works well in a two-dimensional model to construct the landscape of the ensemble of the models with randomization of parameters. Considering the wide range of parameter values in LRPP, these results also provide support for the notion that it is the structure of the network, rather than the specific parameter setting, that determines the function of the network.22

Additionally, to show that the landscape generated from LRPP over a parameter range is more representative than the one from specific parameter sets, I also obtained some landscapes for certain specific parameter sets. These results show that some parameter sets can lead to a bistable landscape or a monostable landscape for the MRSA2 model (Fig. S2, ESI).

3.2 LRPP for the ES cell differentiation model

To see if LRPP works in the high-dimensional models, I investigated the embryonic stem (ES) cell differentiation model with LRPP. The heterogeneity and multistable property in stem cell differentiation have been suggested in previous work.4,6,34 Accordingly, I constructed a simplified ES cell differentiation gene network (Fig. 2A). In this network, there are five stem cell marker genes: OCT4, SOX2, NANOG, KLF4, and OCT4SOX2 (the complex between OCT4 and SOX2), and three differentiation marker genes: GCNF, CDX2, and GATA6. Several mutual repression motifs (such as the mutual repression between NANOG and GATA6, and the mutual repression between OCT4 and CDX2) govern the cell fate decision process in ES cell differentiation.

With LRPP, I obtained the landscape for the ES cell network considering a wide range of parameter values in the gene expression space of NANOG/GATA6 (Fig. 2B), for the parameter range of a (0.01–1), b (0.01–1), and k (0.1–1) (Table 2). The landscape displays multistability with a large heterogeneity. The landscape displays an ES cell basin (high NANOG/low GATA6) and a differentiation cell basin (low NANOG/high GATA6). But there is also a deep basin for an intermediate state (high NANOG/high GATA6) and some other small basins (Fig. 2B). So, the landscape results from LRPP are consistent with the observation that heterogeneity exists in the developmental processes.

Next, to see how the range of parameters a and b affects the landscape, I show the landscape for the parameter range of a: 0.01–0.5 and b: 0.01–0.5. With the decrease of a and b, the landscape exhibits a stable intermediate state. This is because the decrease of the repression constant b inhibits the appearance of bistable states, and therefore generates a stable intermediate state. When I tune the parameters to the range of a (0.01–0.5) and b (0.5–1), I obtained a tristable landscape for ES cell differentiation, with an intermediate state. It means that smaller activation and larger repression lead to a robust tristable cell fate decision. This is reasonable partly because a bistable switch needs strong mutual repression, and the self-activations for the ES and differentiation marker genes (e.g. NONOG, GATA6, CDX2) promote the intermediate state. Although the parameter variation from a wide range is introduced, the tristable state property (governed by the network structure) is preserved. This result demonstrates that the tristable state property or the existence of an intermediate state is some intrinsic property of the ES cell differentiation and a natural consequence of the landscape for the ES differentiation network,35 which does not depend on the specific parameter choices. The existence of the intermediate state is critical for the ES cell differentiation and the reprogramming process, which is supported by previous theoretical and experimental work.34,36,37 The intermediate state indicated here could also correspond to the partially reprogrammed cell states because the intermediate state on the landscape has intermediate expression of both ES cell marker genes and differentiation marker genes, and therefore is easier to be reached from certain reprogramming tactics. These results also quantitatively confirm that the partially reprogrammed cells are a natural consequence of a high-dimensional landscape for the ES differentiation.35

I also explored how the noise level (characterized by diffusion coefficient D) affects the landscape for the ES differentiation. When the noise level increases from D = 0.01 (Fig. 2D) to D = 0.05 (Fig. 2E), the intermediate state disappears and the bistable state becomes more robust (see Fig. S3, ESI for the landscapes of ES differentiation from different angles). This provides a counterintuitive result that noise can reduce heterogeneity. One possible explanation for this counterintuitive result is that the landscape generated by the LRPP approach in some sense can be considered as an average landscape from a parameter range. The larger noise will blur the boundary between close attractors and lead to the merger of attractors. This implies that the gene expression fluctuations, in certain circumstances, can reduce heterogeneity and give rise to the robust cell states. Actually, accumulating evidence has shown that the noise plays critical roles in biological systems, such as inducing the entrainment of intracellular oscillators38 and promoting pattern formations in development.39

The tristable landscape indicates that LRPP can replicate the mutually repressed cell fates in stem cell differentiation. In the meantime, for each network structure the landscape is obtained for a certain parameter range (not specific parameter values). So, LRPP actually provides a tool to explore the influence of the network structure on the global properties of the system. Fig. 3 shows the results of the landscape for the knockout of different genes. Compared with Fig. 2D, the knockout of genes SOX2, KLF4, and SOX2OCT4 makes the differentiation state more stable (Fig. 3). Also, the knockout of CDX2 makes the ES cell state (marked as ES state) more stable, indicated by the disappearance of the differentiation state (Fig. 3E). When I perform multiple gene knockout such as knocking out SOX2OCT4 and KLF4 together, the stability of the differentiation state (marked as D state) is significantly enhanced, indicated by the increase of the barrier of the differentiation state and the decrease of the barrier for the ES cell state (Fig. 4A).


image file: c7ib00198c-f3.tif
Fig. 3 Landscape for the ES cell developmental network for the knockout of different genes. It is shown that the knockout of CDX2 makes the differentiation state disappear, and the knockout of OCT4SOX2 makes the differentiation state more stable. OCT4SOX2: the complex formed by SOX2 and OCT4, E: embryonic stem (ES) cell, D: differentiation state cell, I: intermediate state cell.

image file: c7ib00198c-f4.tif
Fig. 4 Landscape for the ES cell developmental network for OCT4SOX2/KLF4 knockout (A) and GCNF/CDX2 knockout (B). It is shown that the knockout of OCT4SOX2 and KLF4 makes the differentiation state more stable, and the knockout of GCNF and CDX2 makes the differentiation state disappear. E: embryonic stem (ES) cell, D: differentiation state cell, I: intermediate state cell.

Here, the potential barrier is defined as the potential difference between the local minimum and the saddle point (see Fig. 1F). A larger barrier means that the corresponding attractor or cell state is more stable, and it will take longer for the system to switch from this attractor to the other attractors. So, the barrier height provides a quantitative measure for the global stability of the attractor or cell state. Fig. 5 shows the change of barrier heights for the knockout of different genes (corresponding to Fig. 3 and 4). It is clearly shown that the knockout of OSK (OCT4SOX and KLF4) is most effective for making the differentiation state more stable (promoting differentiation, suggested by the increase of the barrier for reprogramming and decrease of the barrier for differentiation), and the knockout of GCNF and CDX2 is most effective to make the ES state more stable (promoting reprogramming, suggested by the decrease of the barrier for reprogramming and increase of the barrier for differentiation). This confirms quantitatively the importance of SOX2, OCT4, and KLF4, three of the four Yamanaka factors for reprogramming,40 in the maintenance of the ES cell state. Also, if I knock out CDX2 and GCNF together, the differentiation state will disappear, and a stable ES cell state and an intermediate state coexist (Fig. 4B). This demonstrates that CDX2 and GCNF are critical factors for maintaining the differentiation states.


image file: c7ib00198c-f5.tif
Fig. 5 Change of barrier heights for the landscape of the ES cell developmental network when different genes are knocked out. The X-axis represents the genes for the knockout. The Y-axis represents the change of barrier compared with the case without knockout (Fig. 2D). Cyan bars represent the change of USE (barrier for the differentiation process, USE = USUE), and magenta bars represent the change of USD (barrier for the reprogramming process, USD = USUD). It is clearly shown that the knockout of OSK is most effective for making the differentiation state more stable (promoting differentiation), and the knockout of GCNF and CDX2 is most effective to make the ES state more stable (promoting reprogramming). OSK: knockout of OCT4SOX2 and KLF4; GC: knockout of GCNF and CDX2.

Therefore, by studying the effects of the network structure on stem cell differentiation, I found, remarkably, that although the random parameter perturbations are introduced, the critical roles of the network structure in the global behavior of the system are preserved. It has been debated which is more important to the network properties, the network topology or parameters (e.g. the interaction strengths)? Here, considering the random parameter perturbations as the consequence of the heterogeneous cellular environments, the landscape results actually propose that the network topology, rather than the specific parameter setting, mainly determines the global properties of the network. These results provide new physical insights into understanding quantitatively the relationship between the network topology and the interaction strengths.

Recently, with the development of PCR and RNA sequencing (RNA-seq) technology, the measurement of the transcriptional activity of single cells has become amenable.41 A critical challenge is to interpret the heterogeneity in the single cell data in a meaningful way. Accumulating evidence suggests that the wide distributions of many transcriptional factors actually have functional significance.42 From our landscape view, one possible source of heterogeneity in the single cell data is the existence of the intermediate states (Fig. 2D), which leads to the wide distribution of transcription factors (here NANOG and GATA6). The intermediate state has been suggested to play a critical role in increasing the plasticity in cell fate decision processes.43 Of note, the different timescales introduced by epigenetic regulatory dynamics (such as DNA methylation and histone modification) can lead to the appearance of more stable cell states,34,44 which provide another source of heterogeneity. It would also be interesting to investigate the non-adiabatic effects coming from the epigenetic regulations34,44 on the gene network dynamics based on the LRPP framework. Additionally, it is reasonable to assume that each single cell has different kinetic parameter values due to the fluctuations from intrinsic or external sources.23,24,45 So, the LRPP from this work, generated from an ensemble of the models with different parameters, rather than fixed parameter values, provides a more realistic explanation for the cellular environment, and possibly a better way to explain the heterogeneity observed in the single cell expression data.

4 Discussion

In this work, I have developed an approach to construct the landscape for a wide range of parameters, i.e., the landscape of random parameter perturbations (LRPP). I tested this approach in a simple mutual repression model and a complex ES cell differentiation model. For both models, the LRPP can generate the landscape with multistable states in some biologically reasonable parameter range. For the mutual repression model, the landscape results indicate that positive feedback promotes multistability. For the ES cell developmental network, my results identify three attractors or basins on the landscape representing the ES cell, the differentiation cell, and the intermediate state cell, respectively. The intermediate states generated by the wide range of parameter values provide a source of heterogeneity observed in single cell experiments. These results also suggest that the existence of the intermediate state is some intrinsic property of ES differentiation, as a natural consequence of the landscape for the ES differentiation network.

Additionally, the landscape results also suggest, counterintuitively, that the noise can reduce heterogeneity and promote stable cell states in certain circumstances, which offers a new example for the critical roles of fluctuations in the cell fate decision process of biological systems. By knocking out different genes in the ES cell differentiation network model, I investigated the influence of individual genes on stem cell differentiation. Interestingly, although the parameter perturbations are introduced in a wide range, the determinant roles of the network structure in the global properties of the network are preserved, which suggests that the network structure determines the operating principles of the genetic network systems.

This work provides an approach for obtaining the landscape in a wide range of parameters and studying the influence of the network structure on the global properties of the system. The LRPP approach and associated analysis are general and can be applied to the other gene regulatory systems or cell–cell interaction systems, e.g. the cancer immune cell–cell interaction system,16 the cell fate decision system for the Epithelial-to-Mesenchymal Transition (EMT),22 and cancer gene regulatory networks.5 One limitation of the current approach is that the landscape acquired here provides a quantitative measure for the global stability of gene networks over a wide range of parameter values, but the dynamics of the system is not specifically investigated. Another limitation is that the current approaches focus on the gene network modelling, which does not fully exploit the gene expression data from experiments, such as single cell data.46,47 In the future, the dynamical transition between landscape basins will be further studied by incorporating the kinetic path-based approaches into the current LRRP framework. Also, it would be of great interest to combine the gene network modelling and the single cell data analysis, to discover the landscape of gene regulatory networks.

Conflicts of interest

There are no conflicts to declare.

Acknowledgements

C. L. is supported by the National Natural Science Foundation of China (11771098), the Natural Science Foundation of Shanghai, China (17ZR1444500), and the Thousand Youth Talents Program.

References

  1. C. H. Waddington, The strategy of the genes: a discussion of some aspects of theoretical biology, Allen and Unwin, London, 1957, p. 290 Search PubMed.
  2. J. Wang, K. Zhang, L. Xu and E. K. Wang, Proc. Natl. Acad. Sci. U. S. A., 2011, 108, 8257–8262 CrossRef CAS PubMed.
  3. J. Wang, L. Xu, E. K. Wang and S. Huang, Biophys. J., 2010, 99, 29–39 CrossRef CAS PubMed.
  4. C. Li and J. Wang, PLoS Comput. Biol., 2013, 9, e1003165 CAS.
  5. C. Li and J. Wang, J. R. Soc., Interface, 2014, 10, 20140774 CrossRef PubMed.
  6. M. Sasai, Y. Kawabata, K. Makishi, K. Itoh and T. P. Terada, PLoS Comput. Biol., 2013, 9, e1003380 Search PubMed.
  7. J. Wang, Adv. Phys., 2015, 64, 1–137 CrossRef.
  8. J. Wang, L. Xu and E. K. Wang, Proc. Natl. Acad. Sci. U. S. A., 2008, 105, 12271–12276 CrossRef CAS PubMed.
  9. K. Zhang, M. Sasai and J. Wang, Proc. Natl. Acad. Sci. U. S. A., 2013, 110, 14930–14935 CrossRef CAS PubMed.
  10. C. Liao and T. Lu, J. Phys. Chem. B, 2013, 117, 12995–13004 CrossRef CAS PubMed.
  11. P. Ao, J. Genet. Genomics, 2009, 36, 63–73 CrossRef PubMed.
  12. C. Lv, X. Li, F. Li and T. Li, PLoS Comput. Biol., 2015, 11, e1004156 Search PubMed.
  13. C. Li and J. Wang, Proc. Natl. Acad. Sci. U. S. A., 2014, 111, 14130–14135 CrossRef CAS PubMed.
  14. H. Ge and H. Qian, Chaos, 2012, 22, 023140 CrossRef PubMed.
  15. M. Lu, J. Onuchic and E. Ben-Jacob, Phys. Rev. Lett., 2014, 113, 078102 CrossRef PubMed.
  16. C. Li, Phys. Chem. Chem. Phys., 2017, 19, 7642–7651 RSC.
  17. M. Lu, H. Jolly, H. Levine, J. Onuchic and E. Ben-Jacob, Proc. Natl. Acad. Sci. U. S. A., 2013, 110, 18144–18149 CrossRef CAS PubMed.
  18. W. Ma, A. Trusina, H. El-Samad, W. A. Lim and C. Tang, Cell, 2009, 138, 760–773 CrossRef CAS PubMed.
  19. E. C. Zeeman, Sci. Am., 1976, 4, 65–83 CrossRef.
  20. B. B. Machta, R. Chachra, M. K. Transtrum and J. P. Sethna, Science, 2013, 342, 604–607 CrossRef CAS PubMed.
  21. B. C. Daniels, Y.-J. Chen, J. P. Sethna, R. N. Gutenkunst and C. R. Myers, Curr. Opin. Biotechnol., 2008, 19, 389–395 CrossRef CAS PubMed.
  22. B. Huang, M. Lu, D. Jia, E. Ben-Jacob, H. Levine and J. N. Onuchic, PLoS Comput. Biol., 2017, 13, e1005456 Search PubMed.
  23. P. S. Swain, M. B. Elowitz and E. D. Siggia, Proc. Natl. Acad. Sci. U. S. A., 2002, 99, 12795–12800 CrossRef CAS PubMed.
  24. M. Kaern, T. C. Elston, W. J. Blake and J. J. Collins, Nat. Rev. Genet., 2005, 6, 451–464 CrossRef CAS PubMed.
  25. T. Lu, J. Hasty and P. G. Wolynes, Biophys. J., 2006, 91, 84–94 CrossRef CAS PubMed.
  26. J. Wang, C. Li and E. K. Wang, Proc. Natl. Acad. Sci. U. S. A., 2010, 107, 8195–8200 CrossRef CAS PubMed.
  27. M. Sasai and P. Wolynes, Proc. Natl. Acad. Sci. U. S. A., 2003, 100, 2374–2379 CrossRef CAS PubMed.
  28. B. Zhang and P. G. Wolynes, Proc. Natl. Acad. Sci. U. S. A., 2014, 111, 10185–10190 CrossRef CAS PubMed.
  29. N. G. Van Kampen, Stochastic processes in Chemistry and Physics, North Holland, Amsterdam, 1st edn, 1992, pp. 120–127 Search PubMed.
  30. G. Hu, Stochastic Forces and Nonlinear Systems, Shanghai Scientific and Technological Education Press, Shanghai, 1994, pp. 68–74 Search PubMed.
  31. X. Tian, H. Zhang and J. Xing, Biophys. J., 2013, 105, 1079–1089 CrossRef CAS PubMed.
  32. J. E. Ferrell, Curr. Opin. Cell Biol., 2002, 14, 140–148 CrossRef CAS PubMed.
  33. T. Y.-C. Tsai, Y. S. Choi, W. Ma, J. R. Pomerening, C. Tang and J. E. Ferrell, Science, 2008, 321, 126–129 CrossRef CAS PubMed.
  34. C. Li and J. Wang, J. R. Soc., Interface, 2013, 10, 20130787 CrossRef PubMed.
  35. A. H. Lang, H. Li, J. J. Collins and P. Mehta, PLoS Comput. Biol., 2014, 10, e1003734 Search PubMed.
  36. K. Plath and W. E. Lowry, Nat. Rev. Genet., 2011, 12, 253–265 CrossRef CAS PubMed.
  37. P. Wang, C. Song, H. Zhang, Z. Wu, X.-J. Tian and J. Xing, Interface Focus, 2014, 4, 20130068 CrossRef PubMed.
  38. A. Gupta, B. Hepp and M. Khammash, Cell Syst., 2016, 3, 521–531 CrossRef CAS PubMed.
  39. L. Zhang, K. Radtke, L. Zheng, A. Q. Cai, T. F. Schilling and Q. Nie, Mol. Syst. Biol., 2012, 8, 613 CrossRef PubMed.
  40. K. Takahashi and S. Yamanaka, Cell, 2006, 126, 663–676 CrossRef CAS PubMed.
  41. O. Stegle, S. A. Teichmann and J. C. Marioni, Nat. Rev. Genet., 2015, 16, 133 CrossRef CAS PubMed.
  42. A. M. Arias and J. M. Brickman, Curr. Opin. Cell Biol., 2011, 23, 650–656 CrossRef PubMed.
  43. C. Li, T. Hong and Q. Nie, Phys. Chem. Chem. Phys., 2016, 18, 17949–17956 RSC.
  44. H. Feng and J. Wang, Sci. Rep., 2012, 2, 550 CrossRef PubMed.
  45. G. Balazsi, A. van Oudenaarden and J. J. Collins, Cell, 2011, 144, 910–925 CrossRef CAS PubMed.
  46. D. A. Lawson, N. R. Bhakta, K. Kessenbrock, K. D. Prummel, Y. Yu, K. Takai, A. Zhou, H. Eyob, S. Balakrishnan and C.-Y. Wang, et al. , Nature, 2015, 526, 131–135 CrossRef CAS PubMed.
  47. S. Petropoulos, D. Edsgard, B. Reinius, Q. Deng, S. P. Panula, S. Codeluppi, A. P. Reyes, S. Linnarsson, R. Sandberg and F. Lanner, Cell, 2016, 165, 1012–1026 CrossRef CAS PubMed.

Footnote

Electronic supplementary information (ESI) available. See DOI: 10.1039/c7ib00198c

This journal is © The Royal Society of Chemistry 2018