Meenakshi Anurag a, Gajinder Pal Singh b and Debasis Dash *a
a G.N.R. Knowledge Center for Genome Informatics, Institute of Genomics and Integrative Biology (CSIR-IGIB), CSIR, Delhi, 110007, India. E-mail: ddash@igib.res.in
b Institute of Biochemistry, Biological Research Centre, Szeged, Hungary
First published on 25th October 2011
Intrinsic disorder in proteins has been explored to study the structure–function aspects of many proteins that lack a fixed structure. The current study focuses on coiled coils, which are often linked to intrinsic disorder. We present a sequence-level analysis of human coiled coils to find out whether this link holds for all coiled coils. When annotated coiled-coil regions were collected from UniProt and investigated with the disorder prediction tools IUPred and DISpro, three patterns were commonly observed: disordered coiled coils (DisCCs), ordered coiled coils (OCCs) and coiled coils with a disordered region outside the coiled-coil region (DOCCs). Differential enrichment in the gene ontology was seen in these three categories. We found that OCCs are enriched in structural components of the extracellular space including the fibrinogen complex and laminin complex. In contrast, DisCCs were found to be exclusively over-represented in proteins involved in actin filament, lamellipodium, cell junction, macromolecule complexes, ciliary rootlet and nucleolus. DOCCs were found to be associated with many regulatory and adaptor functions including positive regulation of calcium ion transport via store-operated calcium channel activity, cytoskeletal adaptor activity etc. In addition to the GO-based analysis, sequence-level analysis showed that disordered coiled-coil regions bear a higher proportion of low-complexity regions than ordered coiled coils. The former also have a higher probability of forming a dimer than their ordered counterparts. Our study shows that the in silico approach of mapping disorder in or around coiled coils can be applied to other biological systems or organisms to understand and rationalize the mode of action of these dynamic motifs.
Coiled coils, apart from being structurally interesting, play important roles in transcription control, association and organization of complexes, chromosomal and cell cycle maintenance etc. They are also vital for the fusion of viral and cellular membranes and hence are being widely studied in HIV.5,6 Recently, Barth et al.7 used computational methods to design peptides that inhibited coiled-coil formation between the metastable coiled-coil region of the yeast septin Cdc12 and its natural binding partner Cdc3. The study highlights the design of specific peptide inhibitors that can bind to a protein domain lacking intrinsic structural stability.
Coiled coils are often linked to intrinsic disorder at the sequence level because they are frequently disordered as monomers and fold only upon association and formation of quaternary structure.8 Studies of the stalk domain of the ncd motor protein in Drosophila have shown two states of the coiled coil, reversible and irreversible.9 The reversibility and irreversibility can be linked to the intrinsic disorder of the protein or segment. Intrinsically disordered proteins (IDPs) are known to remain disordered in their native state; binding to a partner or altered physiological conditions can induce a conformational transition that folds them into a more compact or rigid structure.
The current work primarily addresses two questions: first, how commonly coiled coils are disordered and second, how they are differentially enriched in terms of functions, cellular components and biological processes. In this study we categorized the coiled-coil proteins into three sets, disordered coiled coils (DisCCs), ordered coiled coils (OCCs) and disorder outside coiled coil (DOCCs), and performed in silico analysis to find out the processes and cellular components that are significantly and differentially associated with the three categories. Studying the human coiled coils will help synthetic biologists and nanotechnologists to design novel inhibitors, switches and regulators.10 This study also highlights the sequence–structure relation of coiled coils.
Fig. 1 Workflow depicting the categorization of proteins as disordered coiled coils (DisCCs), ordered coiled coils (OCCs) and disorder outside coiled coil domains (DOCCs).
Amino acid composition was calculated for the coiled-coil regions of DisCCs and AOCCs. The feature was calculated using the web server PROFEAT,16 which is freely accessible at http://jing.cz3.nus.edu.sg/cgi-bin/prof/prof.cgi. The mean percentage composition of each amino acid was then calculated for each set.
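As an illustration of this step, the following is a minimal Python sketch of a per-region percentage composition followed by a mean and standard error across regions. It is not the PROFEAT implementation; the function names and the toy sequences are hypothetical and serve only to show the arithmetic.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def percent_composition(seq):
    """Return {amino acid: percentage of standard residues} for one region."""
    counts = Counter(seq.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    if n < 2:
        return mean, 0.0
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, sqrt(variance / n)


def summarize(regions):
    """Per-residue (mean %, SEM) across a set of coiled-coil regions."""
    per_region = [percent_composition(s) for s in regions]
    return {aa: mean_and_sem([c[aa] for c in per_region]) for aa in AMINO_ACIDS}


# Toy, made-up sequences; real input would be the annotated coiled-coil regions.
toy_regions = ["LEELKKKLEELLKK", "IEEIKKKIEEIEKK"]
for aa, (mean, sem) in summarize(toy_regions).items():
    if mean > 0:
        print(f"{aa}: {mean:.1f} +/- {sem:.1f}")
```

Running `summarize` once per set (DisCCs and AOCCs) would give the kind of per-residue means and error bars that a composition plot needs.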
Low-complexity regions (LCRs) in the coiled-coil regions of the DisCC and AOCC sets were predicted using Seg (ftp://ftp.ncbi.nih.gov/pub/seg/seg), which implements the method of Wootton and Federhen18 for identifying compositionally biased regions in an amino acid sequence. LCRs were predicted using the default parameters of Seg. The total length of LCRs was calculated for each coiled-coil region of the two sets.
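The total LCR length per region can be turned into a percentage of the coiled-coil length, which is the quantity compared between the two sets. The sketch below assumes the LCR segments have already been extracted as 1-based inclusive (start, end) pairs relative to the coiled-coil region; it does not parse Seg output, and the function name is hypothetical.

```python
def lcr_percent(region_length, lcr_intervals):
    """Percentage of a region covered by low-complexity segments.

    lcr_intervals: iterable of (start, end) pairs, 1-based and inclusive.
    Overlapping segments are counted once; segments are clipped to the region.
    """
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    covered = 0
    last_end = 0  # last position already counted
    for start, end in sorted(lcr_intervals):
        start = max(start, last_end + 1, 1)  # skip positions already counted
        end = min(end, region_length)        # clip to the region
        if end >= start:
            covered += end - start + 1
            last_end = end
    return 100.0 * covered / region_length


# Toy example: two overlapping segments in a 20-residue region.
print(lcr_percent(20, [(1, 10), (5, 15)]))
```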
Finally, a Wilcoxon rank sum test was performed on the oligomerization state probability as well as on the proportion (percentage length) of LCRs in coiled coils. Two separate tests were performed. The first tested the null hypothesis that there is no difference between DisCCs and AOCCs in their dimeric and trimeric state probabilities; the second tested the null hypothesis that there is no difference between DisCCs and AOCCs in the proportion of LCRs in their coiled coils. The tests were performed using the wilcox.test function of R, which implements the Wilcoxon rank sum test on the provided vectors.
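For readers who want to see what the rank sum test computes, here is a minimal Python sketch using only the standard library. It uses the normal approximation with a tie correction and a continuity correction, so its p-values can differ from those of R's wilcox.test, which may use an exact method for small samples without ties. The function name and the toy vectors are hypothetical.

```python
from math import erfc, sqrt


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum (Mann-Whitney U) test, normal approximation.

    Returns (U statistic for sample x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 share ranks i+1..j; each gets the average rank.
        avg_rank = (i + 1 + j) / 2.0
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return u, 1.0  # all values tied: no evidence of a difference
    z = max((abs(u - mean_u) - 0.5) / sqrt(var_u), 0.0)  # continuity correction
    return u, min(erfc(z / sqrt(2.0)), 1.0)


# Toy vectors standing in for, e.g., LCR percentages in two sets.
u_stat, p_value = rank_sum_test([1, 2, 3], [4, 5, 6])
print(u_stat, round(p_value, 4))
```

In the analysis described above, `x` and `y` would be the per-region values (oligomerization state probability or LCR percentage) for DisCCs and AOCCs respectively.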
We examined the enrichment of functional classes in these three categories of coiled coils by performing gene ontology analysis on the protein sets to find relevant biological functions, processes and localizations. GOEAST is a tool for studying the enrichment patterns of gene ontology (GO) terms in a given set of genes or proteins. We used the three sets of proteins to obtain GO enrichment data. 488 out of 598 DisCCs, 375 out of 437 OCCs and 256 out of 285 DOCCs had associated gene ontology information in GOEAST and hence were used further for the enrichment analysis. The GO enrichment data obtained for the three categories were compared to find differentially enriched molecular functions, biological processes and cellular components. Important cellular components enriched in the different sets of coiled coils are depicted in Fig. 2, which was generated using the canvas utility of TinkerCell.19 Three colours, red, blue and green, represent DisCC, DOCC and OCC respectively. The statistical significance of the enrichment of DisCCs, DOCCs and OCCs in various cellular components is provided as ESI† (Table S1).
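The text does not state which statistical test GOEAST applied. A hypergeometric upper tail is a common choice for term enrichment and is assumed in the sketch below, which is meant only to show what an enrichment p-value measures. The function name and all counts in the example are hypothetical, and no multiple-testing correction is included.

```python
from math import comb


def hypergeom_enrichment_p(study_hits, study_size, population_hits, population_size):
    """P(X >= study_hits) when study_size proteins are drawn without
    replacement from a population in which population_hits carry the GO term."""
    upper = min(study_size, population_hits)
    total = comb(population_size, study_size)
    tail = sum(
        comb(population_hits, k) * comb(population_size - population_hits, study_size - k)
        for k in range(study_hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 12 of 100 study proteins carry a term that
# annotates 200 of 10000 background proteins.
print(hypergeom_enrichment_p(12, 100, 200, 10000))
```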
Fig. 2 Enrichment of different categories of coiled coil proteins in various cellular components as found by gene ontology analysis. The disordered coiled coils (DisCCs) are shown in red, ordered coiled coils (OCCs) in green and disorder outside coiled coils (DOCCs) in blue.
Performing the cellular component analysis using multiple set GO comparison utility of GOEAST, we found that OCCs are enriched in structural components of the extracellular space including the fibrinogen complex and laminin complex (Fig. 2). Membrane trafficking complexes like SNARE complexes and exocysts were also found to be enriched in OCCs.
We found DisCCs to be exclusively over-represented in proteins involved in actin filament, lamellipodium, cell junction, macromolecule complexes, ciliary rootlet, nucleolus etc. (Fig. 2). Most of the cellular components listed above contribute to the motility and mechanical integrity of the cell.
DOCCs were rich in cellular components including kinetochore–microtubule complexes, ruffle and midbody (Fig. 2). This subset of coiled coil proteins was observed to be associated with biological processes including regulation of calcium ion transport. They were found to be associated with many regulatory and adaptor functions including positive regulation of calcium ion transport via store-operated calcium channel activity, negative regulation of endocytosis, regulation of GTPase activity, cytoskeletal adaptor activity etc.
Fig. 3 Amino acid composition of DisCCs and AOCCs. The X-axis shows the 20 amino acids and the Y-axis shows the percentage composition of the amino acids in the coiled-coil region. The error bar depicts the standard error of mean (±SEM).
UniProt-id | Gene | Var-id | Mutation | Site | Disease | MIM-id |
---|---|---|---|---|---|---|
B1AK53 | ESPN | VAR_043455 | R774Q | 774 | Deafness autosomal dominant without vestibular involvement (DFNAWVI) | [MIM:606351] |
P02671 | FIBA | VAR_010731 | E545V | 545 | Amyloidosis type 8 (AMYL8) | [MIM:105200] |
P02671 | FIBA | VAR_010732 | R573L | 573 | Amyloidosis type 8 (AMYL8) | [MIM:105200] |
Q3SXY8 | ARL13B | VAR_054372 | R200C | 200 | Joubert syndrome type 8 (JBTS8) | [MIM:612291] |
Q7Z4S6 | KIF21A | VAR_019400 | M947R | 947 | Congenital fibrosis of extraocular muscles type 1 (CFEOM1) | [MIM:135700] |
Q7Z4S6 | KIF21A | VAR_019401 | M947V | 947 | Congenital fibrosis of extraocular muscles type 1 (CFEOM1) | [MIM:135700] |
Q7Z4S6 | KIF21A | VAR_027021 | M947T | 947 | Congenital fibrosis of extraocular muscles type 1 (CFEOM1) | [MIM:135700] |
Q7Z4S6 | KIF21A | VAR_019402 | R954Q | 954 | Congenital fibrosis of extraocular muscles type 1 (CFEOM1) | [MIM:135700] |
Q7Z4S6 | KIF21A | VAR_019403 | R954W | 954 | Congenital fibrosis of extraocular muscles type 1 (CFEOM1) | [MIM:135700] |
Q9ULD2 | MTUS1 | VAR_035184 | Q1201R | 1201 | Hepatocellular carcinoma (HCC) | [MIM:114550] |
The heptad repeats have preferred positions for hydrophobic (a and d) and charged (e and g) amino acids. The current observation suggests that amongst the hydrophobic residues isoleucine (I) is enriched in DisCCs and alanine (A) and leucine (L) are favored by AOCCs (Fig. 3). Since leucine and isoleucine have the same composition and size, the packing geometry plays a major role in oligomerization of coiled coils, and studies have shown that the presence of isoleucine at both a and d positions is preferred in a trimeric oligomeric state.23,24 The marginally higher probability of DisCCs to form trimers compared with AOCCs that we observed may be due to the high frequency of isoleucines in DisCCs (p-value of the Wilcoxon test = 0.01345).
The charged amino acids occupy e and g positions of the heptad repeats. In a study conducted by Kohn et al.25 it was observed that increasing the frequency of glutamic acid (E) destabilizes the helical conformation of dimeric coiled coils, pushing the conformation towards a random coil state. In another report26 the group also showed that the protonation of glutamic acid increases the stability of the coiled coils and that, as the number of protonated glutamic acid residues increases, the dimer shifts from a less stable to a more stable form. Since DisCCs were found to be enriched in glutamic acid, the dimer formation of such a coiled coil might be triggered by protonation of the residue, which in turn is caused by a drop in pH.
Coiled coils are known to be highly versatile motifs despite their simple architecture. The coiled coil was first discovered as a structural feature of alpha-keratin.27 Coiled coils are found particularly enriched in “skeletal proteins” and “motor proteins”.
The actin cytoskeleton and lamellipodium were found to be significantly enriched in DisCCs. Actin cytoskeletal proteins are known to play an integral role in cell shape determination, motility, cytokinesis and cell–cell or cell–matrix interactions. Along with myosins, the actin cytoskeletal disordered set consists of angiomotin, anillin, coronin-1A, espin and other proteins involved in the mechanical integrity of the cell, adhesion and motility. Similarly, the lamellipodium is essential for motility, membrane domain organization, substrate adhesion and phagocytosis.28 These processes involve a high level of regulation, recruitment and organization, with one protein interacting with multiple partners, which is a forte of IDRs.
Our analysis shows that the cellular component kinetochore–microtubules, comprising CLASP proteins, has IDRs which lie outside the coiled-coil motif (DOCC). These proteins have two homologs in humans, CLASP1 and CLASP2, which interact with CLIP proteins and a few others involved in stabilization of the microtubules.29 They are associated with microtubule plus-end binding and regulate the affinity of kinetochore–microtubule attachments by decreasing the frictional drag.30 We found that these CLASP proteins are highly disordered except for the C-terminal coiled-coil domain. This region is involved in kinetochore localization in CLASP1 and is required for cortical localization in CLASP2. These coiled-coil regions are well conserved in the two homologs and hence might contribute to the functionality of the complex.
In contrast, coiled coils associated with the laminin and fibrinogen complex were found to be enriched in OCCs. The proteins represented in these subsets are laminin subunit alpha 1, 2, 3 and beta 3 and fibrinogen like protein 1 and fibroleukin. These are primarily extracellular components which require the proteins to be soluble. A recent review supports the argument that IDRs are more represented in intracellular proteins as compared to the extracellular ones.31
Kinesins are known as the courier machinery responsible for the transport of a variety of cargos including intermediate filaments, mRNA, signaling molecules, membranous organelles etc.32 They are functional as heterodimers which have a globular head connected by a neck-linker, which in turn is connected to the coiled-coil domain followed by the cargo binding domain. Two models have been proposed for the mechanism of kinesin walking: the inchworm model and the hand-over-hand model. The former postulates that the stalk does not rotate during the step and one head always leads the other,33 while in the latter model the neck-linker region changes its conformation in such a way that the two heads alternate in the lead. This difference in the proposed mechanisms may be due to the influence of the position and degree of disorder in the neck-linker and coiled-coil region on the walking of the complex.
Another class of motor proteins, the myosins, was found to be enriched in DisCCs. Most of these proteins are involved in microfilament motor activities and regulation. Myosins play a key role in muscular contraction, which is itself a highly dynamic process and hence depends on flexible coiled coils. The myosins present in this set bear ATP, nucleotide and calmodulin binding domains. Non-muscle myosin II proteins, which were present in the disordered coiled-coil set, are associated with cell migration and cytokinesis and function by converting chemical energy to force through conformational switching;34 hence we speculate that the coiled coils involved in these dynamic processes tend to favor disorder.
Contrary to kinesins and myosins, dynein complexes were predicted to lack significant length of IDRs and hence were categorized as ordered coiled coil proteins. Dynein is structurally composed of an ATPase domain, the microtubule-binding domain and separating the two is the anti-parallel coiled coil stalk domain.35 The stalk region is highly conserved and only an optimal length is found across the family. It has been reported that stalks that are too long or short hamper the proper functioning of dynein by hindering the packing of heads close to each other.36,37 ATP hydrolysis is known to bring about a conformational change in the head domain and the coiled coil domain acts as a rigid lever arm.2 A recent study suggests that ATP-induced motion produces a sliding motion in coiled coils which in turn helps the complex move along the cytoskeletal track.38 Our study supports this observation by providing insight into the lack of ability of the dynein coiled coils to bring about large conformational changes and use a relatively subtle sliding motion.
Similarly, DisCCs bearing mutations were found to be associated with diseases like amyloidosis type 8, Joubert syndrome type 8, congenital fibrosis of extraocular muscles type 1, hepatocellular carcinoma etc. (Table 1). Intrinsically disordered proteins are known to be associated with amyloid related disorders.42 We found that mutations in fibrinogen α protein (FIBA) are associated with amyloidosis type 8. The reported mutations are at residues 545 and 573 and substitute disorder-favoring residues (Glu and Arg)43 with order-favoring residues (Val and Leu respectively).43 This suggests that a decrease in the flexibility of the coiled coil may result in improper functioning of the protein and hence might be associated with the diseased state. The molecular basis of this disease is still under exploration and the current finding can help in gaining new insight into the mechanism of action of this protein in normal and diseased states.
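The disorder-to-order reasoning applied to the FIBA variants can be sketched as a small Python check over substitution strings such as those in Table 1. The residue grouping below is a commonly cited one and is an assumption here; the grouping used in the cited reference may differ, and residues outside both groups are reported as "ambiguous". The function names are hypothetical.

```python
import re

# Assumed grouping of residues; not taken from the text above.
ORDER_PROMOTING = set("WCFIYVLN")
DISORDER_PROMOTING = set("ARGQSPEK")


def residue_class(aa):
    """Classify a one-letter residue as 'order', 'disorder' or 'ambiguous'."""
    if aa in ORDER_PROMOTING:
        return "order"
    if aa in DISORDER_PROMOTING:
        return "disorder"
    return "ambiguous"


def classify_substitution(mutation):
    """Parse a substitution like 'E545V' into (class of wild type, site, class of variant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized substitution: {mutation!r}")
    wild_type, site, variant = match.groups()
    return residue_class(wild_type), int(site), residue_class(variant)


# The two FIBA variants listed in Table 1.
for mutation in ("E545V", "R573L"):
    print(mutation, classify_substitution(mutation))
```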
IDP | Intrinsically disordered proteins |
DisCC | Disordered coiled coil |
DOCC | Disorder outside coiled coil |
OCC | Ordered coiled coil |
AOCC | All ordered coiled coil |
LCR | Low-complexity region |
Footnotes |
† Published as part of a Molecular BioSystems themed issue on Intrinsically Disordered Proteins: Guest Editor M. Madan Babu. |
‡ Electronic supplementary information (ESI) available. See DOI: 10.1039/c1mb05210a |
This journal is © The Royal Society of Chemistry 2012 |