Cascaded walks in protein sequence space: use of artificial sequences in remote homology detection between natural proteins

S. Sandhya; R. Mudgal; C. Jayadev; K. R. Abhinandan; R. Sowdhamini; N. Srinivasan

doi:10.1039/C2MB25113B

S. Sandhya,^ab R. Mudgal,^c C. Jayadev,^bd K. R. Abhinandan,‡^b R. Sowdhamini^a and N. Srinivasan*^b

Author affiliations

* Corresponding authors

^a National Centre for Biological Sciences, UAS-GKVK Campus, Bangalore 560 065, India

^b Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560 012, India
E-mail: ns@mbu.iisc.ernet.in
Fax: +91-80-23600535
Tel: +91-80-22932837

^c IISc Mathematics Initiative, Indian Institute of Science, Bangalore 560 012, India

^d Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore 560 012, India

Abstract

Over the past two decades, many ingenious efforts have been made in protein remote homology detection. Because homologous proteins often diversify extensively in sequence, it is challenging to demonstrate such relatedness through entirely sequence-driven searches. Here, we describe a computational method for generating ‘protein-like’ sequences that bridge gaps in protein sequence space. The algorithm starts from sequence profile information, as embodied in a position-specific scoring matrix of multiply aligned sequences of bona fide family members. At each position in the sequence, the observed amino acid propensities and a random number together determine which residue is selected. By systematically applying this ‘roulette-wheel’ selection at every position, we generate sequences that resemble the parent family and thereby enlarge the sequence space around the family. We demonstrate that, when generated for a large number of families, these sequences expand the utility of natural intermediately related sequences in linking distant proteins. In 91% of the assessed examples, inclusion of designed sequences improved fold coverage by 5–10% over searches made in their absence. Furthermore, using several examples of proteins adopting folds such as TIM, globin and lipocalin, we demonstrate that including designed sequences in a database improves the sensitivity of methods such as PSI-BLAST and Cascade PSI-BLAST, offering a promising route to substantially improved remote homology recognition using sequence information alone.
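The roulette-wheel step described in the abstract can be illustrated with a minimal sketch. It assumes the profile is available as one table of non-negative amino acid weights per alignment position; the abstract does not specify how the position-specific scoring matrix is converted to such weights, so that representation, and the names `roulette_pick` and `generate_sequence`, are hypothetical choices made for illustration and are not the authors' implementation.

```python
import random
from typing import Dict, List


def roulette_pick(weights: Dict[str, float], rng: random.Random) -> str:
    """Pick one residue with probability proportional to its weight."""
    # Assumption: weights are non-negative propensities, not log-odds scores.
    positive = [(res, w) for res, w in weights.items() if w > 0]
    if not positive:
        raise ValueError("position has no residue with positive weight")
    total = sum(w for _, w in positive)
    threshold = rng.random() * total  # where the wheel stops
    cumulative = 0.0
    for residue, weight in positive:
        cumulative += weight
        if threshold < cumulative:
            return residue
    # Guard against floating-point rounding at the upper edge of the wheel.
    return positive[-1][0]


def generate_sequence(profile: List[Dict[str, float]], rng: random.Random) -> str:
    """Draw one artificial sequence, one residue per profile position."""
    return "".join(roulette_pick(column, rng) for column in profile)


if __name__ == "__main__":
    # Toy three-position profile (made-up weights, for illustration only).
    toy_profile = [
        {"A": 0.7, "G": 0.3},
        {"L": 0.5, "I": 0.3, "V": 0.2},
        {"K": 1.0},
    ]
    rng = random.Random(42)  # fixed seed so runs are repeatable
    for _ in range(3):
        print(generate_sequence(toy_profile, rng))
```

Drawing many sequences from the same profile yields family-like variants in which frequently observed residues dominate each position while rarer ones still appear occasionally, which is the sense in which such sampling enlarges the sequence space around a family.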

Article information

DOI: https://doi.org/10.1039/C2MB25113B
Article type: Paper
Submitted: 22 Mar 2012
Accepted: 15 May 2012
First published: 13 Jun 2012

Mol. BioSyst., 2012, 8, 2076-2084

Molecular BioSystems