The nature of the conserved basic amino acid sequences found among 437 heparin binding proteins determined by network analysis
In multicellular organisms, a large number of proteins interact with the polyanionic polysaccharides heparan sulfate (HS) and heparin. This binding is usually assumed to be dominated by charge–charge interactions between the anionic carboxylate and/or sulfate groups of the polysaccharide and the cationic amino acids of the protein. A major question is whether conserved amino acid sequences for HS/heparin binding exist among these diverse proteins. Potentially conserved HS/heparin binding sequences were sought amongst 437 HS/heparin binding proteins. Amino acid sequences were extracted and compared using a Levenshtein distance metric. The resultant similarity matrices were visualised as graphs, enabling extraction of strongly conserved sequences from highly variable primary sequences while excluding short, core regions. This approach did not reveal extensive, conserved HS/heparin binding sequences, but rather a number of shorter, more widely spaced sequences that may work in unison to form heparin-binding sites on protein surfaces, arguing for convergent evolution. Thus, it is the three-dimensional arrangement of these conserved motifs on the protein surface, rather than the primary sequence per se, that constitutes the evolutionary element.
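The comparison step described above (pairwise Levenshtein distances, a similarity matrix, and a graph view of that matrix) can be illustrated with a minimal sketch. This is not the authors' pipeline: the normalisation by the longer sequence length, the similarity threshold of 0.8, the function names, and the three toy sequences are all assumptions made purely for illustration.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-residue insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; dividing by the longer length is an assumed normalisation."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def similarity_graph(sequences, threshold):
    """Adjacency sets linking every pair whose similarity reaches the threshold."""
    graph = {s: set() for s in sequences}
    for a, b in combinations(sequences, 2):
        if similarity(a, b) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Groups of mutually reachable sequences, i.e. candidate conserved clusters."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical toy fragments, not taken from the 437-protein data set.
toy_sequences = ["KRKKKG", "KRKKRG", "AGDLEV"]
toy_graph = similarity_graph(toy_sequences, threshold=0.8)  # threshold is assumed
toy_clusters = connected_components(toy_graph)
```

In this toy case the two lysine/arginine-rich fragments differ by one substitution and end up in the same cluster, while the unrelated fragment remains isolated. On real data, the choice of threshold and normalisation would strongly affect which clusters emerge.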