Open Access Article
A.
Treveil‡
ab,
P.
Sudhakar‡
ac,
Z. J.
Matthews‡
d,
T.
Wrzesiński
a,
E. J.
Jones
ab,
J.
Brooks
abde,
M.
Ölbei
ab,
I.
Hautefort
a,
L. J.
Hall
b,
S. R.
Carding
bd,
U.
Mayer
f,
P. P.
Powell
d,
T.
Wileman
bd,
F.
Di Palma
a,
W.
Haerty
*a and
T.
Korcsmáros
*ab
aEarlham Institute, Norwich Research Park, Norwich, Norfolk NR4 7UZ, UK. E-mail: Tamas.Korcsmaros@earlham.ac.uk; Fax: +44-(0)1603450021; Tel: +44-(0)1603450961
bQuadram Institute Bioscience, Norwich Research Park, Norwich, Norfolk NR4 7UA, UK
cDepartment of Chronic Diseases, Metabolism and Ageing, KU Leuven, Belgium
dNorwich Medical School, University of East Anglia, Norwich, Norfolk NR4 7TJ, UK
eDepartment of Gastroenterology, Norfolk and Norwich University Hospitals, Norwich, UK
fSchool of Biological Sciences, University of East Anglia, Norwich, Norfolk NR4 7TJ, UK
First published on 10th December 2019
The epithelial lining of the small intestine consists of multiple cell types, including Paneth cells and goblet cells, that work in cohort to maintain gut health. 3D in vitro cultures of human primary epithelial cells, called organoids, have become a key model to study the functions of Paneth cells and goblet cells in normal and diseased conditions. Advances in these models include the ability to skew differentiation to particular lineages, providing a useful tool to study cell type specific function/dysfunction in the context of the epithelium. Here, we use comprehensive profiling of mRNA, microRNA and long non-coding RNA expression to confirm that Paneth cell and goblet cell enrichment of murine small intestinal organoids (enteroids) establishes a physiologically accurate model. We employ network analysis to infer the regulatory landscape altered by skewing differentiation, and using knowledge of cell type specific markers, we predict key regulators of cell type specific functions: Cebpa, Jun, Nr1d1 and Rxra specific to Paneth cells, Gfi1b and Myc specific for goblet cells and Ets1, Nr3c1 and Vdr shared between them. Links identified between these regulators and cellular phenotypes of inflammatory bowel disease (IBD) suggest that global regulatory rewiring during or after differentiation of Paneth cells and goblet cells could contribute to IBD aetiology. Future application of cell type enriched enteroids combined with the presented computational workflow can be used to disentangle multifactorial mechanisms of these cell types and propose regulators whose pharmacological targeting could be advantageous in treating IBD patients with Crohn's disease or ulcerative colitis.
To date, various cell types have been identified in the intestinal epithelium based on specific functional and gene expression signatures. Paneth cells residing in the small intestinal crypts of Lieberkühn help to maintain the balance of the gut microbiota by secreting anti-microbial peptides, cytokines and other trophic factors.6 Located further up the intestinal crypts, goblet cells secrete mucin, which aggregates to form the mucus layer, which acts as a chemical and physical barrier between the intestinal lumen and the epithelial lining.7 Both of these cell populations have documented roles in gut-related diseases.8,9 Dysfunctional Paneth cells with reduced secretion of anti-microbial peptides have been shown to contribute to the pathogenesis of CD,10 while reduction in goblet cell numbers and defective goblet cell function has been associated with UC in humans.11
Recent studies have employed single cell transcriptomics sequencing of tissue samples to characterise the proportion and signatures of different epithelial cell types in the intestines of healthy and IBD patients.12–14 However, to provide deeper insights into the role of specific cell populations (such as Paneth cells and goblet cells) in IBD, in vitro models are required for in depth testing and manipulation. Such models can be used to study specific mechanisms of action, host-microbe interactions, intercellular communication, patient specific therapeutic responses and to develop new diagnostic approaches. Due to ease of manipulation, observation and analysis, organoid models, including small intestinal models (enteroids), are increasingly used in the IBD field.15–17 Enteroid cell culture systems employ growth factors to expand and differentiate Lgr5+ stem cells into spherical models of the small intestinal epithelia which recapitulate features of the in vivo intestinal tissue.18–20 It has been shown that these enteroids contain all the major cell types of the intestinal epithelium, and exhibit normal in vivo functions.21 These models, generated using intestinal tissue from mice, from human patient biopsies or from induced pluripotent stem cells (iPSCs), have proven particularly valuable for the study of complex diseases which lack other realistic models and exhibit large patient variability, such as IBD.22,23 Small molecule treatments have been developed that skew the differentiation of enteroids towards Paneth cell or goblet cell lineages, improving representation of these cells within the enteroid cell population.24,25 Specifically, differentiation can be directed towards the Paneth cell lineage through the addition of DAPT, which inhibits notch signaling, and CHIR99021, which inhibits GSK3β-mediated β-catenin degradation. Enteroid cultures enriched in goblet cells can be generated through the addition of DAPT and IWP-2, an inhibitor of Wnt signaling.25 Whilst these methods do not present single cell type resolution, they provide useful tools to study Paneth cell and goblet cell populations in the context of the other major epithelial cell types.26 A recent study by Mead et al. found that Paneth cells from enriched enteroids more closely represent their in vivo counterparts than those isolated from conventionally differentiated enteroids, based on transcriptomics, proteomics and morphologic data.27 Furthermore, we have shown that enteroids enriched for Paneth cells and goblet cells recapitulate in vivo characteristics on the proteomics level28,29 and that they are a useful tool for the investigation of health and disease related processes in specific intestinal cell types.29
Nevertheless, the effect of Paneth cell and goblet cell enrichment of enteroids on key regulatory landscapes has not been extensively characterised. In this study, we comprehensively profiled mRNAs, miRNAs and lncRNAs from mouse derived, 3D conventionally differentiated enteroids (control), Paneth cell enriched enteroids (PCeE) and goblet cell enriched enteroids (GCeE) to determine the extent to which these enteroids display increased Paneth cell and goblet cell signatures. We applied a systems-level analysis of regulatory interactions within the PCeEs and GCeEs to further characterise the effect of cell type enrichment and to predict key molecular regulators involved with Paneth cell and goblet cell specific functions. This analysis was carried out using interactions networks, which are a primary method to collate, visualise and analyse biological systems. These networks are a type of systems biology data representation, which aids the interpretation of -omics read-outs by contextualising genes/molecules of interest and identifying relevant signalling and regulatory pathways.30,31 In the presented analysis, the nodes of the interaction networks represent genes/molecules of interest from the transcriptomics data and the edges represent regulatory connections (molecular interactions) between the nodes inferred from databases. Studying regulatory interactions using interaction networks has proven useful to uncover how cells respond to changing environments at a transcriptional level, to prioritise drug targets and to investigate the downstream effects of gene mutations and knockouts.32–34
We used network approaches to interpret the PCeE and GCeE transcriptomic data by integrating directed regulatory connections from resources containing transcriptional and post-transcriptional interactions. This integrative strategy led us to define regulatory network landscapes altered by Paneth cell and goblet cell enrichment of enteroids, termed the network and GCeE network, respectively (Fig. 1). By incorporating known Paneth cell and goblet cell markers, we used these networks to predict master regulators of Paneth cell and goblet cell differentiation and/or maintenance in the enriched enteroids. Furthermore, we highlighted varying downstream actions of shared regulators between the cell types. This phenomenon, called regulatory rewiring, highlights the importance of changes in regulatory connections in the function and differentiation of specific cell types. Taken together, we show that cell type enriched enteroids combined with the presented network biology workflow have potential for application to the study of epithelial dysfunction and mechanisms of action of multifactorial diseases such as IBD.
![]() | ||
| Fig. 2 Differentially expressed genes in Paneth cell enriched enteroids (PCeEs) and goblet cell enriched enteroids (GCeEs) (compared to conventionally differentiated enteroids). (A) Volcano plots showing log2 fold change and adjusted p value for each gene following differential expression analysis of PCeEs (left) and GCeEs (right). Horizontal and vertical lines indicate the differential expression criteria cut offs (q value ≤ 0.05 and log2 fold change ≥|1|). (B) Number of differentially expressed genes (DEGs). miRNA = microRNA; lncRNA = long non-coding RNA; genes annotated as ‘other’ include pseudogenes and antisense genes. (C) Venn diagrams indicating the number of DEGs (passing the cut off criteria). (D) Top 10 Reactome pathways of the 50 most significant DEGs (by q value). (E) Enrichment of cell type specific marker genes in the DEG lists. Higher significance scores indicate greater enrichment. Number of markers in DEG list out of the total number of markers shown below significance score. Also see Tables S1 and S4 (ESI†). | ||
Pathway analysis was employed to study functional associations of the DEGs (Fig. 2D). The PCeE-specific DEGs were associated with a number of metabolic pathways, including metabolism of vitamins and cofactors, pyruvate metabolism and citric acid (TCA) cycle and cholesterol biosynthesis. On the other hand, GCeE-specific DEGs were associated with the cell cycle through pathways such as cell cycle checkpoints, DNA replication and G1/S transition. Pathways associated with the shared DEGs included transmission across chemical synapses, integration of energy metabolism and a number of pathways linked to hormones. As hormone functions are characteristic of enteroendocrine cells, this analysis suggests that enteroendocrine cells are enriched in both the PCeEs and the GCeEs.
To validate the cell types present in the enteroids, the expression of five previously reported major cell type specific markers were investigated across the enteroids using transcript abundances and RNA differential expression results (Table S2 and Fig. S2, ESI†). The control enteroids and the cell type enriched enteroids expressed all five investigated markers: Lgr5 (stem cells), ChgA (enteroendocrine cells), Muc2 (goblet cells), Lyz1 (Paneth cells) and Vil1 (epithelial cells). We observed an upregulation of Muc2, Lyz1 and ChgA and a downregulation of Lgr5 in PCeEs and GCeEs compared to the control enteroids, confirming the more pronounced differentiated status of the enteroids. In addition, a number of Paneth cell specific antimicrobial peptide genes were differentially expressed in the PCeE dataset, including Ang4, Reg3γ, Pla2g2a and Defa2 (Table S3, ESI†). Some of these genes were also differentially expressed in the GCeE dataset but with smaller log fold change values, e.g. Lyz1 and Ang4. Conversely, a number of goblet cell mucin related genes (including Muc2 and Tff3) were differentially expressed in both datasets although all genes exhibited a smaller increase in the PCeEs (Table S2, ESI†). Therefore, using primary cell type specific markers, antimicrobial peptide genes and mucin-related genes, we show that the enteroids contain all major cell types, and that Paneth cells are most upregulated in the PCeEs, while goblet cells are most upregulated in the GCeEs. We also note that both differentiation methods resulted in increases of other secretory cell types as well.
To further investigate secretory cell type specific signatures of the enteroids, we measured enrichment of secretory lineages in the upregulated DEG lists using additional marker genes of Paneth cells, goblet cells and enteroendocrine cells. While all tests were significant (hypergeometric model, q value ≤ 0.05), we identified greater enrichment of Paneth cell markers in the PCeE DEG list and goblet markers in the GCeE DEG list (Fig. 2E). This confirms that both enteroid enrichment protocols were successful in increasing the proportion of their target cell type, but also increased proportions of other secretory lineages, albeit to a lesser extent (Table S4, ESI†). This observation confirms previous studies that these enteroid differentiation protocols result in enteroendocrine enrichment in addition to Paneth cell and goblet cell enrichment.25,28
In conclusion, we have used image-based validation, pathway analysis and marker gene investigation to show successful enrichment of target cell types in the PCeE's and GCeE's. We also highlighted an additional increase in other secretory lineages, particularly enteroendocrine cells, as a result of both enrichment protocols.
383
897 unique regulatory interactions connecting 23
801 molecular entities. All interactions within the network represent one of the following types of regulation: TF–TG, TF–lncRNA, TF–miRNA, miRNA–mRNA or lncRNA–miRNA. TF–TGs and TF–lncRNAs make up the majority of the network at 77% and 11% of all interactions, respectively. Due to its non-specific nature, this universal network contains many interactions not relevant for the current biological context. In order to get a clearer and valid picture of regulatory interactions occurring in our enteroids, we used the universal network to annotate the PCeE DEGs and GCeE DEGs with regulatory connections. Combining these connections, we generated specific regulatory networks for PCeEs and GCeEs, where every node is a DEG and every interaction has been observed in mice previously.
In total, the PCeE network, generated using differential expression data from the PCeEs compared to the conventionally differentiated enteroids, contained 37
062 interactions connecting 208 unique regulators with 3023 unique targets (Fig. 3A and Item S1, ESI†). The GCeE network, generated using differential expression data from the GCeEs compared to the conventionally differentiated enteroids, contained 19
171 interactions connecting 124 unique regulators with 2095 unique targets (Fig. 3A and Item S1, ESI†). 15.7% of all interactions (8856 out of 56
234) were shared between the PCeE and GCeE networks, however the interacting molecular entities in these interactions (termed nodes) did not all exhibit the same direction of differential expression between the networks (comparing PCeE or the GCeE data to the conventionally differentiated enteroid data). In each of the enriched enteroid regulatory networks, a particular gene was represented (as a node in the network) only once, but may have been involved in multiple different interactions. In different interactions, a single node could act either as a regulator or as a target and in different molecular forms, for example, as a lncRNA in one interaction and as a target gene in another.
![]() | ||
| Fig. 3 Summary and cluster analysis of regulatory network for Paneth cell enriched enteroid (PCeE) and goblet cell enriched enteroid (GCeE) datasets. (A) Summary of number of nodes and interactions in the whole PCeE (left) and GCeE (right) networks. Total number of each regulator type shown in red, number of each target type shown in blue. In the targets pie-chart, mRNAs represent protein coding genes and proteins, miRNAs represent miRNAs genes and lncRNAs represent lncRNA genes. Size of circles represents log10 (total unique regulators/targets). Bar chart represents the distribution of interaction types in the networks (log10 scale). (B) Heatplot of Reactome pathways significantly associated (q value ≤ 0.05) with each cluster of the PCeE (orange) and GCeE (purple) networks. Only the top 5 pathways shown for each group (or more where equal q values). Only the top 3 clusters had significantly associated pathways. Clusters labelled with rank and cell type and colours match the colour of the cluster shown in (C and D). (C and D) Visualisation of the PCeE and GCeE regulatory networks with their associated clusters. The cluster rank and score is given next to each cluster. Black nodes in the whole networks represent nodes which were not found in any cluster, whereas coloured nodes represent the cluster which they are part of. TF – transcription factor; miRNA – microRNA; lncRNA – long non-coding RNA. See Items S1, S2 and Table S5 (ESI†). | ||
To further investigate the makeup of these networks, we employed cluster analysis to identify highly interconnected regions (possible regulatory modules) in the PCeE and GCeE regulatory networks. Using the MCODE software,37 we identified five distinct clusters in the PCeE network and seven distinct clusters in the GCeE networks. A total or 1314 nodes are present in the PCeE network clusters and 698 in the GCeE network clusters. Functional analysis identified Reactome pathways38 associated with each of the modules. Significant pathways (q value ≤ 0.05) were identified only for the highest ranked three modules from each network, with a total of 12 pathways shared between the PCeE and GCeE associated clusters (out of 32 associated with the PCeE clusters and 42 with the GCeE clusters) (Fig. 3B–D). Of particular note, the first cluster of the GCeE network has associations with the endosomal/vacuolar pathway and antigen presentation, the second cluster is associated with the cell cycle. Of the PCeE clusters, the first cluster is associated with a range of functions including nuclear receptor transcription pathway, regulation of lipid metabolism and senescence. The second is associated with response to metal ions and endosomal/vacuolar pathway and the third with G alpha (i) signalling events (Table S5, ESI†).
In conclusion, we have generated regulatory interaction networks, including transcriptional and post-transcriptional interactions, which illustrate the effect of skewing enteroid differentiation towards Paneth cell and goblet cell lineages.
Of the 95 marker regulators, we identified approximately one-third (30/95) as present in both subnetworks (Fig. 4C). Given that the markers are different between the cell types, a regulator shared between the Paneth cell and goblet cell subnetworks must show an altered pattern of regulatory targeting in the two cell types. This phenomenon, referred to as regulatory rewiring, often results in functional differences of shared regulators in different environments39 – for example, in this case, between the Paneth cells and goblet cells.
Further investigation of the distinct regulator–marker interactions highlighted a gradient of regulator specificity. We generated matrices to visualise the markers controlled by each regulator in the goblet cell (Fig. 5A and C) and the Paneth cell (Fig. 5B) subnetworks. Each coloured square indicates that a marker (shown on the y-axis) is regulated by the corresponding regulator (shown on the x-axis). Squares are coloured blue if the associated regulator is shared between the Paneth cell and goblet cell subnetworks and orange if they are specific to one subnetwork. A collection of regulators (both subnetwork specific and shared) appear to regulate large proportions of the markers. For example, Ets1, Nr3c1 and Vdr regulate >50% of the markers in both the Paneth cell and the goblet cell subnetworks. Specific to the Paneth cell subnetwork, Cebpa, Jun, Nr1d1 and Rxra regulate >50% of the markers. Specific to the goblet cell subnetwork, Gfi1b and Myc regulate >50% of the markers. These regulators represent potential master regulators of differentiation or maintenance of the given cell types in the enriched enteroids. Referring back to the highly-interconnected clusters identified in the PCeE and GCeE networks (Fig. 3C and D), we find these predicted master regulators in different clusters. In the PCeE network, Cebpa, Nr1d1, Nr3c1 and Rxra are in cluster 1, Vdr is in cluster 2, Jun is in cluster 3 and Ets1 is unclustered. In the GCeE network, Ets1 and Myc are in cluster 1, Nr3c1 and Vdr are in cluster 2 and Gfi1b is in cluster 3. This suggests a wide range of central functions are carried out by this group of regulators, with possible divergence of roles between the Paneth cell and the goblet cell. In contrast to the predicted master regulators, regulators such as Mafk in the Paneth cell subnetwork and Spdef in the goblet cell subnetwork regulate only one marker. These regulators likely have more functionally specific roles.
![]() | ||
| Fig. 5 Matrices of interactions between markers and their regulators in the Paneth cell and goblet cell subnetworks. Regulators on y-axis, markers (regulator targets) on y-axis. Orange boxes indicate the interaction of a regulator and a marker where the regulator is only found in one of the two subnetworks. Blue boxes signify that the regulator is found in both the Paneth cell and the goblet cell subnetworks. (A) All goblet cell markers12 and their regulators in the goblet cell subnetwork. (B) All Paneth cell markers12 and their regulators in the Paneth cell subnetwork. (C) Sub-section of A showing the markers (and their regulators) which have the most regulatory connections. Interactions in Table S7 (ESI†). | ||
Together, these results highlight potential regulators which likely play key roles in specification and maintenance of Paneth and goblet cells and their functions in cell type enriched enteroids.
Looking at the regulators in more detail, the GCeE specific targets of miR-151-3p, for example, are significantly enriched in functions relating to antigen presentation, cell junction organisation, Notch signalling and the calnexin/calreticulin cycle. None of these functions are enriched in the shared or PCeE targets. Of particular interest is the calnexin/calreticulin cycle, which is known to play an important role in ensuring proteins in the endoplasmic reticulum are correctly folded and assembled.41 Dysfunction of protein folding and the presence of endoplasmic reticulum stress are both associated with IBD.42–44 Therefore, we predict that miR-151-3p plays a role in the secretory pathway of goblet cells and could be an interesting target for IBD research. In addition, different functional profiles were also observed for the targets of Rora in the PCeE and GCeE regulatory networks: targets present in both networks are significantly associated with mitosis, whereas those specific to the PCeE network are associated with metabolism, protein localisation, nuclear receptor transcription pathway, circadian clock and hypoxia induced signalling. GCeE specific targets of Rora are connected to Notch signalling, cell cycle and signalling by Rho GTPases (associated with cell migration, adhesion and membrane trafficking) and interferon.
Altogether these observations show that some of the regulators of both Paneth cell and goblet cell marker genes have different targets (with different associated functions) between the PCeE and the GCeE networks. This suggests that regulatory rewiring occurrs between Paneth cell and goblet cell types.
The literature search was carried out using the three groups of predicted master regulators: those specific to the Paneth cell markers (Cebpa, Jun, Nr1d1 and Rxra), those specific to the goblet cell markers (Gfi1b and Myc) and those which appear to regulate many of the markers of both cell types (Ets1, Nr3c1 and Vdr). We identified five genes (Ets1, Nr1d1, Rxra, Nr3c1 and Vdr) with associations to inflammation, autophagy and/or inflammatory bowel disease (IBD), as shown in Table 1. These genes correspond to 71% (5/7) of the Paneth cell associated master regulators and 60% (3/5) of the goblet cell associated master regulators. Interestingly, four of these genes (all apart from Ets1), encode nuclear hormone receptors.
| Putative master regulator | Autophagy/inflammation/IBD associations | Ref. |
|---|---|---|
| NR1D1 (REV-ERBa) | Modulates autophagy and lysosome biogenesis in macrophages leading to antimycobacterial effects | 45 |
| SNP rs12946510 which has associations to IBD, acts as a cis-eQTL for NR1D1 | 46 | |
| NR3C1 (glucocorticoid receptor) | Associations with cellular proliferation and anti-inflammatory responses | 47 |
| Exogenous glucocorticoids are heavily used as anti-inflammatory therapy for IBD | 48 and 49 | |
| ATG16L1, an autophagy related gene, was down-regulated in patients who do not respond to glucocorticoid treatment | 50 and 51 | |
| Transcriptionally regulates NFKβ1, a SNP affected gene in ulcerative colitis | 52 and 53 | |
| VDR (vitamin D receptor) | Regulates autophagy in Paneth cells through ATG16L1 – dysfunction of autophagy in Paneth cells has been linked to Crohn's disease | 54 and 55 |
| Induces antimicrobial gene expression in other cell lines | 56 and 57 | |
| Specific polymorphisms in the VDR genes have been connected to increased susceptibility to IDB | 58 | |
| A study looking at colonic biopsies of IBD patients observed reduced VDR expression compared to healthy biopsies | 59 | |
| Interacts with five SNP affected UC genes | 60 and 61 | |
| RXRa (retinoid X receptor alpha) | Heterodimerizes with VDR (see above) | 62 |
| ETS1 (ETS proto-oncogene 1) | Important role in the development of hematopoietic cells and Th1 inflammatory responses | 63 and 64 |
| Angiogenic factors in the VEGF-Ets-1 cascades are upregulated in UC and downregulated in CD | 65 | |
| IBD susceptibility gene | 66 | |
Given the possible relationship between the identified master regulators and IBD, we tested the potential of the PCeE and GCeE regulatory networks to study the pathomechanisms of CD or UC. We checked for the presence of known CD or UC associated genes in the networks, using data from two studies of single nucleotide polymorphisms (SNPs)67,68 and one study of CD expression quantitative trait loci (eQTLs).69 Using hypergeometric significance tests, we found that the PCeE network was significantly enriched in all tested lists: genes with UC associated SNPs (13/47, p < 0.005), genes with CD associated SNPs (22/97, p < 0.005) and genes with CD associated eQTLs (290/1607, p < 0.0001) (Tables S10–S12, ESI†). On the other hand, we found that the GCeE network was significantly enriched in genes with UC associated SNPs (10/47, p < 0.005) but regarding CD, the genes with SNP associations were not significantly enriched (12/97, p = 0.11) and the genes with qQTL associations were enriched with a larger p value (p < 0.05) (Tables S10–S12, ESI†).
Next we investigated whether any of the genes with UC or CD associated SNPs acts as regulators in the PCeE and GCeE networks. Of the genes with CD associated SNPs, one acts as a regulator in each network. Similarly, two genes with UC associated SNPs act as regulators in the networks. Specifically regarding CD associated genes, in the PCeE network, the gene Dbp regulates Bik, which encodes the BCL2 interacting killer, a pro-apoptotic, death promoting protein. In the GCeE network, Notch2 regulates Notch3 and Hes1. Specifically, regarding UC associated genes, in the PCeE network, Hnf4a regulates 994 genes/RNAs including nine Paneth cell markers (Cd244a, Fgfrl1, Clps, Habp2, Hspb8, Pnliprp1/2, Defb1, Mymx) and one other gene with UC associated SNPs (Tnfsf15). Additionally, a gene with UC associated SNPs, Nr5a2, was found in both the PCeE and GCeE networks regulating 389 and 276 genes/RNAs respectively. In the PCeE network Nr5a2 targets include 6 Paneth cell markers (Cd244a, Copz2, Pnliprp1/2, Sntb1, Mymx). Ultimately, the large number of targets of these regulatory UC associated genes suggests they have wide ranging effects on the regulatory network of Paneth and goblet cells. To further establish the relevance of the inferred PCeE and GCeE networks, we also found an over-representation of drug target associated genes in both the PCeE and GCeE networks (2683/16223 and 1918/16223 respectively, p < 0.0001), highlighting their potential for the study of therapeutic implications (Table S12, ESI†).
To investigate the link between predicted master regulators and IBD, we observed whether the genes with UC and CD associated SNPs are regulated by the predicted master regulators in the PCeE and GCeE networks (Table S13, ESI†). Given that Paneth cell dysregulation is classically associated with CD and goblet cell dysregulation/depletion with UC,70 we focused this analysis only on these pairings, examining CD genes amongst targets of Paneth cell predicted master regulators, and UC genes amongst targets of goblet cell predicted master regulators. In the PCeE network, we found 21 (of 22) of the CD genes were regulated by at least one of the seven Paneth cell predicted master regulators, while the targets of these master regulators were significantly enriched with CD genes in the PCeE network (p < 0.001). Similarly, we observed that all 10 UC genes in the GCeE network were regulated by at least one of the five goblet cell predicted master regulators, while the targets of these master regulators were significantly enriched with UC genes (p < 0.005).
To confirm the relevance of these predicted master regulators in a human system, a similar analysis was carried out using goblet cell differentially expressed genes from a recent single cell study of human inflamed UC colon biopsies.13 Using the top 100 differentially expressed genes, following conversion to mouse Ensembl identifiers, 20 were found to be targeted by the predicted goblet cell master regulators in the GCeE network. This represents a significant enrichment amongst all master regulator targets (p < 0.005) (Table S13, ESI†).
Ultimately, by integrating functional annotations obtained through literature searches, we show that the Paneth cell and goblet cell regulatory networks contain genes with direct and indirect associations with IBD. Furthermore, we find that the PCEe and GCeE networks and the targets of predicted master regulators are enriched with IBD associated genes – this finding is corroborated using human single cell data from UC colon biopsies. Consequently, these networks and the workflow to reconstruct and analyse them have great potential for the study of IBD pathomechanisms in specific intestinal cell types.
Analysis using known cell type specific markers confirmed that skewing differentiation of enteroids towards Paneth cells or goblet cells results in an increase in the target cell signatures at the transcriptomics level. In addition, signatures of enteroendocrine cells were increased in both differentiation protocols, albeit at a lower amount than the target cell type. This finding correlates with previous investigations at both the transcriptomic and proteomic levels25,27–29 and is likely due to the shared differentiation pathways of these secretory cells. Nevertheless, as enteroids contain a mixed population of cell types by nature and because intercellular communication is key to a functioning epithelium,19,71 the increased proportion of non-targeted secretory lineages should not be an issue for the application of these models to research. In fact, the enrichment of specific cell types is beneficial for enteroid-based research to increase the signal originating from a specific population of cells and to provide a larger population of cells of interest for downstream single cell analysis of enteroids, which is particularly beneficial when studying rare populations such as Paneth cells. This is valuable due to the lack of in vitro models for long-term culture of non-self-renewing small intestinal epithelial cells.72,73 Specifically, the comparison of ‘omics data from a cell type enriched enteroid to a conventionally differentiated enteroid enables generation of cell type signatures with more specificity than can be obtained otherwise – except through single cell sequencing. Single cell sequencing, however, comes at a greater financial cost and provides lower coverage which can be problematic for rare cell types and lowly expressed RNAs. Furthermore, a number of previous studies have shown that these cell type enriched enteroid models, which offer a simplified and manipulatable version of the intestinal environment, are useful for the investigation of health and disease related processes.26,28,29 It must be considered, however, that through applying chemical inhibitors to enteroids to enrich particular cell types, we may observe changes in gene expression which are related to the direct effects of the inhibitor but not to differentiation. Due to the nature of the inhibitors, it would be challenging to separate these effects. For example, CHIR99021, which is used for enriching Paneth cells, is a direct inhibitor of glycogen synthase kinase 3 (GSK-3). GSK-3 has a role in many cellular pathways including cellular proliferation and glucose regulation.74
Using a priori molecular interaction knowledge, we annotated differentially expressed genes (cell type enriched enteroids vs. conventionally differentiated enteroids) with regulatory connections. Collecting all connections together generated regulatory networks for the Paneth cell enriched enteroid (PCeE) and goblet cell enriched enteroid (GCeE) datasets. This approach to collating networks (regulatory or otherwise) has been used for a wide variety of research aims, such as the identification of genes functioning in a variety of diseases,75,76 the prioritisation of therapeutic targets77 and for a more general understanding of gene regulation in biological systems.78,79 The application of prior knowledge avoids the need for reverse engineering/inference of regulatory network connections, which is time consuming, computationally expensive and requires large quantities of high quality data.80 To investigate the substructure and functional associations of the generated PCeE and GCeE regulatory networks we applied a clustering approach. The identified clusters represent collections of highly interconnected nodes, which likely form regulatory modules. Functional analysis confirmed distinct functional associations between the clusters as well as between the networks. The observation that less than half of the network nodes exist in clusters is consistent with the view that regulatory networks are hierarchical and scale free with most genes exhibiting low pleiotropy.81,82
Given the observed additional increase in secretory lineages based on the DEGs of the enriched organoids, we chose to use cell type specific markers to extract interactions specific to Paneth cells and goblet cells from the generated regulatory networks. This enables further enrichment of Paneth cell and goblet cell signatures and reduction in noise in the networks due to the presence of other cell types in the enteroids. Using this approach, we identified possible regulators of cell type specific functions in Paneth cells and goblet cells. Some of these regulators were predicted to be important in both cell types, but exhibited differential targeting patterns between the PCeE and the GCeE networks, indicating rewiring of regulators between the cell types. This highlights apparent redundancy and/or cooperation of regulators which control similar cell type specific functions and shows the potential importance of regulatory rewiring in the evolution of cell type specific pathways and functions, something which has been shown previously to occurr.83,84 Functional analysis of the targets of the most rewired regulators (Etv4, let-7e-5p, miR-151-3p, Myb and Rora) highlights an overrepresentation of metabolism associated targets in the PCeE network and cell cycle associated targets in the GCeE network. A similar result was observed when functional analysis was carried out on genes with significantly different expression levels between the cell type enriched enteroids and the conventionally differentiated enteroids (Fig. 2B). This suggests that transcriptional changes during the skewing of enteroid differentiation could be driven by rewired regulators and that these functions are key features of Paneth cells and goblet cells in the enteroids. The latter is supported by current understanding that Paneth cells rely on high levels of protein and lipid biosynthesis for secretory functions,85 and they play an important role in metabolically supporting stem cells.86 Additionally, as terminally-differentiated cells do not undergo cell division, this result suggests that enteroid goblet cell signatures are derived from a large population of semi-differentiated goblet-like cells, a phenomenon previously observed in tissue sample based studies.13,87 Extension of our workflow to single cell sequencing of enteroid cells could validate these findings by providing greater cell type specificity. However, these techniques pose further technical and economic challenges.88,89 Specifically, a large number of organoids must be sequenced to mitigate cellular complexity and batch heterogeneity and powerful, reproducible and accurate computational pipelines are required to analyse such data.90
We predicted key regulators involved in differentiation or maintenance of Paneth cells and goblet cells in the enteroids: Cebpa, Jun, Nr1d1 and Rxra specific to Paneth cells, Gfi1b and Myc specific for goblet cells and Ets1, Nr3c1 and Vdr shared between them. Validation of these regulators poses significant challenges due to their wide expression and broad function range. If the master regulators are controlling differentiation as opposed to cell function maintenance, evaluating lineage arrest or delay can be carried out using a gene knockout or knock down. However the effects of pleiotropy will significantly hamper the results and such a study would require significant follow-up studies. On the other hand, if key regulators were predicted by applying the presented computational workflow to condition-specific organoids compared to control organoids (e.g. drug treated organoids vs. non-treated organoids), the validation would be much simpler. Literature investigation highlighted that many of the predicted master regulators, particularly those associated with Paneth cells, have connections to autophagy, inflammation and IBD (Table 1). While some of these associations are related to different cell types, we assume that if a gene can contribute to a specific function in one cell type it may also contribute in another. This finding suggests that dysregulation of key cell master regulators could contribute to IBD.
To further investigate this finding, we identified Crohn's disease (CD) and ulcerative colitis (UC) genes in the PCeE and GCeE networks. We found that CD associated genes are more strongly associated with the PCeE network than the GCeE network. Given that Paneth cell dysfunction is classically associated with CD, this finding highlights the relevance of the generated networks to the in vivo situation. In the PCeE network one SNP associated CD gene, Dbp, acts as a regulator. Dbp, encoding the D site binding protein, regulates Bik, which encodes the BCL2 interacting killer, a pro-apoptotic, death promoting protein. Interestingly, rate of apoptosis has been implicated in IBD disease mechanisms91 and has been associated with IBD drug response.92 Therefore, this finding highlights a possible regulatory connection between CD susceptibility genes and IBD pathology on a Paneth cell specific level. In the GCeE network, the SNP associated CD gene Notch2 acts as a regulator for Notch3 and Hes1. It has been previously demonstrated that this pathway can block glucocorticoid resistance in T-cell acute lymphoblastic leukaemia via NR3C1 (predicted master regulator).93 This is relevant to IBD given that glucocorticoids are a common treatment for IBD patients.49 Furthermore, this pathway has been previously associated with goblet cell depletion in humans94 commonly observed in IBD patients. Furthermore, we identified a significant enrichment of UC associated genes in both the PCeE and GCeE networks. The majority of UC associated genes identified in the networks (9/14) were present in both, suggesting that genetic susceptibilities of UC do not have a Paneth cell or goblet cell specific effect. Two of the identified UC associated genes act as regulators in the networks (Nr5a2 and Hnf4a), targeting hundreds of genes and thus suggesting a broad ranging effect on the networks. Building on the identified literature associations of predicted master regulators, we found that the targets of Paneth cell master regulators are enriched with CD associated genes, and the targets of the goblet cell master regulators are enriched with UC associated genes. This finding was further illustrated using UC associated goblet cell genes from a human biopsy study,13 highlighting the relevance of these findings in a human system. Ultimately, the observation of IBD susceptibility genes in the regulatory networks of these enteroids highlights possible application of this model system to study disease regulation in specific intestinal cell types, through understanding specific mechanistic pathways.
We have shown how network biology techniques can be applied to generate interaction networks representing the change in regulatory environments between two sets of enteroids. Here we presented this workflow, and by using transcriptomics data we characterised the effect of Paneth cell and goblet cell differentiation skewing protocols on enteroids. However. the described workflow could be applied to a variety of ‘omics datasets and enteroid conditions. For example, to test the response of enteroids to external stimuli, such as bacteria, and on enteroids grown from human-derived biopsies, enabling patient-specific experiments. The application of further ‘omics data-types to the described approach could generate a more holistic view of cellular molecular mechanisms, including the ability to observe post-translational regulation. In this study, we integrated miRNA and lncRNA expression datasets, in addition to mRNA data, as these molecules have been shown to perform critical regulatory and mediatory functions in maintaining intestinal homeostasis.46,68,95 However, only small proportions of the generated PCeE and the GCeE networks contained miRNA and lncRNA interactions (Fig. 2B), due to lack of published interaction information, particularly from murine studies. Both the application of human enteroid data and the future publication of high-throughput interaction studies involving miRNAs and lncRNAs will improve the ability to study such interactions. Nevertheless, these network approaches are beneficial for contextualising gene lists through annotating relevant signalling and regulatory pathways31 and we can use them to represent and analyse current biological knowledge, to generate hypotheses and to guide further research.
In conclusion, we described an integrative systems biology workflow to compare regulatory landscapes between enteroids from different conditions, incorporating information on transcriptional and post-transcriptional regulation. We applied the workflow to compare Paneth cell and goblet cell enriched enteroids to conventionally differentiated enteroids and predicted Paneth cell and goblet cell specific regulators, which could provide potential targets for further study of IBD mechanisms. Application of this workflow to patient derived organoids, genetic knockout and/or microbially challenged enteroids, alongside appropriate validation and single cell sequencing if available, will aid discovery of key regulators and signalling pathways of healthy and disease associated intestinal cell types.
To chemically induce differentiation, on days two, five and seven post-crypt isolation, ENR media was changed to include additional factors for each cell type specific condition: 3 μM CHIR99021 (Tocris) and 10 μM DAPT (Tocris) [Paneth cells]; 2 μM IWP-2 (Tocris) and 10 μM DAPT [goblet and enteroendocrine cells].25
Library purification combines using BluePippin cassettes (Sage Science Pippin Prep 3% Cassettes Dye-Free (CDF3010), set to collection mode range 125–160 bp) to extract the library molecules followed by a concentration step (Qiagen MinElute PCR Purification, 28004) to produce libraries ready for sequencing. Library concentration and size are established using HS DNA Qubit and HS DNA Bioanalyser. The resulting libraries were then equimolar pooled and qPCR was performed on the pool prior to clustering.
The small RNA reads were analysed using the sRNAbench tool within the sRNAtoolbox suite of tools.104 The barcodes from the 5′ end and adapter sequences from the 3′ end were removed respectively. Zero mismatches were allowed in detecting the adapter sequences with a minimum adapter length set at 10. Only reads with a minimum length of 16 and a read-count of 2 were considered for further analysis. The mice miRNA collection was downloaded from miRBase version 21.105 The trimmed and length filtered reads were then mapped to the mature version of the miRBase miRNAs in addition to the annotated version of the mouse genome (version mm10). No mismatches were allowed for the mapping. A seed length of 20 was used for the alignment with a maximum number of multiple mappings set at 10. Read-counts normalised after multiple-mapping were calculated for all the libraries. The multiple-mapping normalised read-counts from the corresponding cell type enriched enteroids were compared against the conventionally differentiated enteroids to identify differentially expressed miRNAs in a pair-wise manner using edgeR.106 miRNAs with an absolute log2 fold change of 1 and false discovery rate ≤0.05 were considered to be differentially expressed.
Differentially expressed genes were grouped by their presence in the PCeE dataset, the GCeE dataset or in both. Each group of differentially expressed genes was tested for functional enrichment (hypergeometric model, q value ≤ 0.1) based on Reactome and KEGG annotations using the ReactomePA R package38,107–109 following conversion from mouse to human identifiers using Inparanoid (v8).110,111
Hypergeometric distribution testing was carried out using a custom R script to measure enrichment of cell type specific marker genes in the differentially upregulated gene sets. To standardise the universal dataset, only markers which are present in the output of the Wald test (genes with variance greater than zero among samples) were used. Similarly, to enable fair comparisons, only differentially expressed protein coding genes and documented lncRNAs were used from the DEG lists, as was surveyed in the cell type specific marker paper. Bonferroni correction was applied to the hypergeometric distribution p values to account for multiple testing and significance scores were calculated using −log10 (corrected p value). For the mapping of marker genes to the interaction networks, no filters were applied.
To generate PCeE and GCeE regulatory networks, interactions in this collated universal network were filtered using the differential expression data (Fig. 1). The assumption was made that if both nodes of a particular interaction were expressed in the RNAseq data, the interaction is possible. Furthermore, to filter for the interactions of prime interest, only nodes which were differentially expressed and their associated interactors were included in the regulatory networks.
The nodes of each cluster were tested for functional enrichment (hypergeometric model, q value ≤ 0.05) based on Reactome annotations using the ReactomePA R package38,107–109 following conversion from mouse to human identifiers using Inparanoid (v8).110,111 Cases where the number of nodes associated with a pathway <5 were considered not significant regardless of the q value. The top 5 significant Reactome pathways associated with each cluster were visualised using a heatplot generated in R (Fig. 3B). More than 5 pathways were visualised, where multiple Reactome pathways had equal q values.
Footnotes |
| † Electronic supplementary information (ESI) available. See DOI: 10.1039/c9mo00130a |
| ‡ Equal contributions. |
| This journal is © The Royal Society of Chemistry 2020 |