A short guide to abbreviations and their use in peptide science

Abbreviations, acronyms and symbolic representations are very much part of the language of peptide science, in conversational communication as much as in its literature. They are not only a convenience, either: they enable the necessary but distracting complexities of long chemical names and technical terms to be pushed into the background so that the wood can be seen among the trees. Many of the abbreviations in use are so much in currency that they need no explanation. The main purpose of this editorial is to identify them and free authors from the hitherto tiresome requirement to define them in every paper. Those in the tables that follow, which will be updated from time to time, may in future be used in this Journal without explanation.

All other abbreviations should be defined. Previously published usage should be followed unless it is manifestly clumsy or inappropriate. Where it is necessary to devise new abbreviations and symbols, the general principles behind established examples should be followed. Thus, new amino-acid symbols should be of the form Abc, with due thought for possible ambiguities (Dap might be obvious for diaminopropionic acid, for example, but what about diaminopimelic acid?).
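The two checks suggested above (the Abc form and possible ambiguity) can be sketched in a few lines of Python. This is only an illustration and not part of the original guide: the function name `check_new_symbol` and the registry `known` are hypothetical, the registry holds a handful of entries chosen for the example, and reading "of the form Abc" as one capital followed by two lower-case letters is an assumption.

```python
import re

# Hypothetical registry of symbols already in use (illustrative subset only).
known = {
    "Ala": "alanine",
    "Orn": "ornithine",
    "Dap": "diaminopropionic acid",
}

def check_new_symbol(symbol, name, registry):
    """Return a list of problems with a proposed three-letter symbol."""
    problems = []
    # Assumed reading of "of the form Abc": one capital, then two lower-case letters.
    if not re.fullmatch(r"[A-Z][a-z]{2}", symbol):
        problems.append("not of the form Abc")
    # A symbol already assigned to a different compound is ambiguous.
    if symbol in registry and registry[symbol] != name:
        problems.append("already used for " + registry[symbol])
    return problems
```

With this registry, proposing Dap for diaminopimelic acid is reported as a clash with diaminopropionic acid, which is the ambiguity the text warns about.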

Where alternatives are indicated below, the first is preferred.

Amino Acids

Proteinogenic Amino Acids

Symbol  Name            One-letter symbol
Ala     Alanine         A
Arg     Arginine        R
Asn     Asparagine      N
Asp     Aspartic acid   D
Asx     Asn or Asp
Cys     Cysteine        C
Gln     Glutamine       Q
Glu     Glutamic acid   E
Glx     Gln or Glu
Gly     Glycine         G
His     Histidine       H
Ile     Isoleucine      I
Leu     Leucine         L
Lys     Lysine          K
Met     Methionine      M
Phe     Phenylalanine   F
Pro     Proline         P
Ser     Serine          S
Thr     Threonine       T
Trp     Tryptophan      W
Tyr     Tyrosine        Y
Val     Valine          V
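As an illustration of how the table above can be used, the following Python sketch converts a sequence written in three-letter symbols into one-letter symbols. The mapping is transcribed from the table; the function name `to_one_letter` and the choice of a hyphen as separator are assumptions made for the example. Asx and Glx are left out because the table gives no one-letter symbol for them.

```python
# Three-letter to one-letter symbols, transcribed from the table above.
ONE_LETTER = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_one_letter(peptide):
    """Convert a hyphen-separated sequence such as 'Ala-Gly-Ser' to 'AGS'."""
    residues = peptide.split("-")
    # Refuse to guess: any symbol without a listed one-letter code is an error.
    unknown = [r for r in residues if r not in ONE_LETTER]
    if unknown:
        raise ValueError("no one-letter symbol listed for: " + ", ".join(unknown))
    return "".join(ONE_LETTER[r] for r in residues)
```

For example, `to_one_letter("Ala-Gly-Ser")` gives `"AGS"`, while a sequence containing Asx raises a `ValueError` rather than guessing.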

Copyright © 1999 European Peptide Society and John Wiley & Sons, Ltd. Reproduced with permission from J. Peptide Sci., 1999, 5, 465–471.

Other Amino Acids

Symbol  Name
Aad     α-Aminoadipic acid
βAad    β-Aminoadipic acid
Abu     α-Aminobutyric acid
Aib     α-Aminoisobutyric acid; α-methylalanine
βAla    β-Alanine; 3-aminopropionic acid (avoid Bal)
Asu     α-Aminosuberic acid
Aze     Azetidine-2-carboxylic acid
Cha     β-Cyclohexylalanine
Cit     Citrulline; 2-amino-5-ureidovaleric acid
Dha     Dehydroalanine (also ΔAla)
Gla     γ-Carboxyglutamic acid
Glp     Pyroglutamic acid; 5-oxoproline (also pGlu)
Hph     Homophenylalanine (Hse = homoserine, and so on); see the note on the prefix homo below
Hyl     δ-Hydroxylysine
Hyp     4-Hydroxyproline
aIle    allo-Isoleucine; 2S, 3R in the L-series
Lan     Lanthionine; S-(2-amino-2-carboxyethyl)cysteine
MeAla   N-Methylalanine (MeVal = N-methylvaline, and so on); see the note on methyl residues below
Nle     Norleucine; α-aminocaproic acid
Orn     Ornithine; 2,5-diaminopentanoic acid
Phg     Phenylglycine; 2-aminophenylacetic acid
Pip     Pipecolic acid; piperidine-2-carboxylic acid
Sar     Sarcosine; N-methylglycine
Sta     Statine; (3S, 4S)-4-amino-3-hydroxy-6-methylheptanoic acid
Thi     β-Thienylalanine
Tic     1,2,3,4-Tetrahydroisoquinoline-3-carboxylic acid
aThr    allo-Threonine; 2S, 3S in the L-series
Thz     Thiazolidine-4-carboxylic acid, thiaproline
Xaa     Unknown or unspecified (also Aaa)

Note on the prefix homo: Caution is necessary over the use of the prefix homo in relation to α-amino-acid names and the symbols for homo-analogues. When the term first became current, it was applied to analogues in which a side-chain CH2 extension had been introduced. Thus homoserine has a side-chain CH2CH2OH, homoarginine CH2CH2CH2NHC(=NH)NH2, and so on. In such cases, the convention is that a new three-letter symbol for the analogue is derived from the parent by taking H for homo and combining it with the first two characters of the parental symbol; hence Hse, Har and so on. Now, however, there is a considerable literature on β-amino acids which are analogues of α-amino acids in which a CH2 group has been inserted between the α-carbon and carboxyl group. These analogues have also been called homo-analogues, and there are instances, for example, not only of ‘homophenylalanine’, NH2CH(CH2CH2Ph)CO2H, abbreviated Hph, but also of ‘homophenylalanine’, NH2CH(CH2Ph)CH2CO2H, abbreviated Hph. Further, members of the analogue class with CH2 interpolated between the α-carbon and the carboxyl group of the parent α-amino acid structure have been called both ‘α-homo’ and ‘β-homo’. Clearly great care is essential, and abbreviations for ‘homo’ analogues ought to be fully defined on every occasion. The term ‘β-homo’ seems preferable for backbone extension (emphasizing as it does that the residue has become a β-amino acid residue), with abbreviated symbolism as illustrated by βHph for NH2CH(CH2Ph)CH2CO2H.

Note on methyl residues: The MeAla style should not be used for α-methyl residues, for which either a separate unique symbol (such as Aib for α-methylalanine) should be used, or the position of the methyl group should be made explicit, as in αMeTyr for α-methyltyrosine.
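The convention for deriving symbols of homo-analogues (H for homo plus the first two characters of the parental symbol) can be sketched as a small Python function. This is an illustration under stated assumptions, not an established tool: the name `homo_symbol` and the `backbone` flag are hypothetical, and applying the β prefix to every parent simply generalizes the single example βHph given in the text.

```python
def homo_symbol(parent, backbone=False):
    """Derive a symbol for a 'homo' analogue from a three-letter parent symbol."""
    if len(parent) != 3 or not parent.isalpha():
        raise ValueError("expected a three-letter symbol such as 'Ser'")
    # H for homo, combined with the first two characters of the parental symbol.
    symbol = "H" + parent[:2].lower()
    # Assumption: backbone (beta-homo) extension is marked with a beta prefix, as in βHph.
    return "β" + symbol if backbone else symbol
```

Here `homo_symbol("Ser")` gives `"Hse"`, `homo_symbol("Arg")` gives `"Har"`, and `homo_symbol("Phe", backbone=True)` gives `"βHph"`. A derived symbol does not remove the need to define it in full on each occasion, as the text recommends.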

The three-letter symbols should be used in accord with the IUPAC-IUB conventions, which have been published in many places (e.g. European J. Biochem. 1984; 138: 9–37), and which are (May 1999) also available with other relevant documents at: http://www.chem.qnw.ac.uk/iubmb/iubmb.html#03

It would be superfluous to attempt to repeat all the detail which can be found at the above address, and the ramifications are extensive, but a few remarks focussing on common misuses and confusions may assist. The three-letter symbol standing alone represents the unmodified intact amino acid, of the L-configuration unless otherwise stated (but the L-configuration may be indicated if desired for emphasis: e.g. L-Ala). The same three-letter symbol, however, also stands for the corresponding amino acid residue. The symbols can thus be used to represent peptides (e.g. AlaAla or Ala-Ala = alanylalanine). When nothing is shown attached to either side of the three-letter symbol, it is meant to be understood that the amino group (always understood to be on the left) or carboxyl group is unmodified, but this can be emphasized, so AlaAla = H-AlaAla-OH. Note, however, that indicating free termini by presenting the terminal group in full is wrong; NH2AlaAlaCO2H implies a hydrazino group at one end and an α-keto acid derivative at the other. Representation of a free terminal carboxyl group by writing H on the right is also wrong, because that implies a terminal aldehyde.
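The rules on termini given above lend themselves to a simple check. The Python sketch below is a hedged illustration only: the function names `with_explicit_termini` and `termini_problems` are hypothetical, and the input is assumed to be written with every residue and terminal group separated by a hyphen (as in Ala-Ala), which is a simplification of the notation discussed in the text.

```python
def with_explicit_termini(peptide):
    """Write a free peptide such as 'Ala-Ala' with its termini shown: 'H-Ala-Ala-OH'."""
    parts = peptide.split("-")
    # Avoid doubling terminal groups that are already shown.
    if parts[0] == "H" or parts[-1] == "OH":
        raise ValueError("termini already shown")
    return "-".join(["H"] + parts + ["OH"])

def termini_problems(peptide):
    """List the misuses of terminal groups described in the text."""
    parts = peptide.split("-")
    problems = []
    if parts[0] == "NH2":
        problems.append("NH2 on the left implies a hydrazino group; write H")
    if parts[-1] == "CO2H":
        problems.append("CO2H on the right implies an alpha-keto acid derivative; write OH")
    if len(parts) > 1 and parts[-1] == "H":
        problems.append("H on the right implies a terminal aldehyde; write OH")
    return problems
```

Under these assumptions `with_explicit_termini("Ala-Ala")` gives `"H-Ala-Ala-OH"`, and `termini_problems("NH2-Ala-Ala-CO2H")` reports both the hydrazino and the α-keto acid misreadings, whereas `H-Ala-Ala-OH` passes without complaint.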

Side chains are understood to be unsubstituted if nothing is shown, but a substituent can be indicated by use of brackets or attachment by a vertical bond up or down. Thus an O-methylserine residue could be shown as 1, 2, or 3.

Note that the oxygen atom is not shown: it is contained in the three-letter symbol; showing it, as in Ser(OMe), would imply that a peroxy group was present. Bonds up or down should be used only for indicating side-chain substitution. Confusions may creep in if the three-letter symbols are used thoughtlessly in representations of cyclic peptides. Consider by way of example the hypothetical cyclopeptide threonylalanylalanylglutamic acid. It might be thought that this compound could be economically represented as 4.

But this is wrong because the left-hand vertical bond implies an ester link between the two side chains, and strictly speaking, if the right-hand vertical bond means anything, it means that the two Ala α-carbons are linked by a CH2CH2 bridge. This objection could be circumvented by writing the structure as in 5.

But this is now ambiguous because the convention that the symbols are to be read as having the amino nitrogen to the left cannot be imposed on both lines. The direction of the peptide bond needs to be shown with an arrow pointing from CO to N, as in 6.

Actually the simplest representation is on one line, as in 7.

Substituents and Protecting Groups

Symbol  Name
Ac      Acetyl
Acm     Acetamidomethyl
Adoc    1-Adamantyloxycarbonyl
Alloc   Allyloxycarbonyl
Boc     t-Butoxycarbonyl
Bom     π-Benzyloxymethyl
Bpoc    2-(4-Biphenylyl)isopropoxycarbonyl
Btm     Benzylthiomethyl
Bum     π-t-Butoxymethyl
Buⁱ     i-Butyl
Buⁿ     n-Butyl
Buᵗ     t-Butyl
Bz      Benzoyl
Bzl     Benzyl (also Bn); Bzl(OMe) = 4-methoxybenzyl and so on
Cha     Cyclohexylammonium salt
Clt     2-Chlorotrityl
Dcha    Dicyclohexylammonium salt
Dde     1-(4,4-Dimethyl-2,6-dioxocyclohex-1-ylidene)ethyl
Ddz     2-(3,5-Dimethoxyphenyl)-isopropoxycarbonyl
Dnp     2,4-Dinitrophenyl
Dpp     Diphenylphosphinyl
Et      Ethyl
Fmoc    9-Fluorenylmethoxycarbonyl
For     Formyl
Mbh     4,4′-Dimethoxydiphenylmethyl, 4,4′-dimethoxybenzhydryl
Mbs     4-Methoxybenzenesulphonyl
Me      Methyl
Mob     4-Methoxybenzyl
Mtr     2,3,6-Trimethyl-4-methoxybenzenesulphonyl
Mtt     4-Methyltrityl
Nps     2-Nitrophenylsulphenyl
OAll    Allyl ester
OBt     1-Benzotriazolyl ester
OcHx    Cyclohexyl ester
ONp     4-Nitrophenyl ester
OPcp    Pentachlorophenyl ester
OPfp    Pentafluorophenyl ester
OSu     Succinimido ester
OTce    2,2,2-Trichloroethyl ester
OTcp    2,4,5-Trichlorophenyl ester
Pac     Phenacyl, PhCOCH2 (care! Pac also = PhCH2CO)
Ph      Phenyl
Pht     Phthaloyl
Pmc     2,2,5,7,8-Pentamethylchroman-6-sulphonyl
Prⁱ     i-Propyl
Prⁿ     n-Propyl
Scm     Methoxycarbonylsulphenyl
Tfa     Trifluoroacetyl
Tmob    2,4,5-Trimethoxybenzyl
Tos     4-Toluenesulphonyl (also Ts)
Troc    2,2,2-Trichloroethoxycarbonyl
Trt     Trityl, triphenylmethyl
Xan     9-Xanthydryl
Z       Benzyloxycarbonyl (also Cbz); Z(2Cl) = 2-chlorobenzyloxycarbonyl and so on

Amino Acid Derivatives

Symbol  Name
DKP     Diketopiperazine
NCA     N-Carboxyanhydride
PTH     Phenylthiohydantoin
UNCA    Urethane N-carboxyanhydride

Reagents and Solvents

Symbol  Name
BOP     1-Benzotriazolyloxy-tris-dimethylamino-phosphonium hexafluorophosphate
CDI     Carbonyldiimidazole
DBU     Diazabicyclo[5.4.0]-undec-7-ene
DCCI    Dicyclohexylcarbodiimide (also DCC)
DCHU    Dicyclohexylurea (also DCU)
DCM     Dichloromethane
DEAD    Diethyl azodicarboxylate (DMAD = the dimethyl analogue)
DIPCI   Diisopropylcarbodiimide (also DIC)
DIPEA   Diisopropylethylamine (also DIEA)
DMA     Dimethylacetamide
DMAP    4-Dimethylaminopyridine
DMF     Dimethylformamide
DMS     Dimethylsulphide
DMSO    Dimethylsulphoxide
DPPA    Diphenylphosphoryl azide
EEDQ    2-Ethoxy-1-ethoxycarbonyl-1,2-dihydroquinoline
HATU    See the note on HATU, HBTU and TBTU below
HMP     Hexamethylphosphoric triamide (also HMPA, HMPTA)
HOAt    1-Hydroxy-7-azabenzotriazole
HOBt    1-Hydroxybenzotriazole
HOCt    1-Hydroxy-4-ethoxycarbonyl-1,2,3-triazole
NDMBA   N,N′-Dimethylbarbituric acid
NMM     N-Methylmorpholine
PAM     Phenylacetamidomethyl resin
PEG     Polyethylene glycol
PyBOP   1-Benzotriazolyloxy-tris-pyrrolidinophosphonium hexafluorophosphate
SDS     Sodium dodecyl sulphate
TBAF    Tetrabutylammonium fluoride
TBTU    See the note on HATU, HBTU and TBTU below
TEA     Triethylamine
TFA     Trifluoroacetic acid
TFE     Trifluoroethanol
TFMSA   Trifluoromethanesulphonic acid
THF     Tetrahydrofuran
WSCI    Water soluble carbodiimide: 1-ethyl-3-(3′-dimethylaminopropyl)-carbodiimide hydrochloride (also EDC)

Note on HATU, HBTU and TBTU: HATU is the acronym for the ‘uronium’ coupling reagent derived from HOAt, which was originally thought to have the structure 8, the Hexafluorophosphate salt of the O-(7-Azabenzotriazol-1-yl)-Tetramethyl Uronium cation. In fact this reagent has the isomeric N-oxide structure 9 in the crystalline state, the unwieldy correct name of which does not conform logically with the acronym, but the acronym continues in use. Similarly, the corresponding reagent derived from HOBt has the firmly attached label HBTU (the tetrafluoroborate salt is also used: TBTU), despite the fact that it is not actually a uronium salt.

Techniques

Symbol  Name
CD      Circular dichroism
COSY    Correlated spectroscopy
CZE     Capillary zone electrophoresis
ELISA   Enzyme-linked immunosorbent assay
ESI     Electrospray ionization
ESR     Electron spin resonance
FAB     Fast atom bombardment
FT      Fourier transform
GLC     Gas liquid chromatography
hplc    High performance liquid chromatography
IR      Infra red
MALDI   Matrix-assisted laser desorption ionization
MS      Mass spectrometry
NMR     Nuclear magnetic resonance
nOe     Nuclear Overhauser effect
NOESY   Nuclear Overhauser enhanced spectroscopy
ORD     Optical rotatory dispersion
PAGE    Polyacrylamide gel electrophoresis
RIA     Radioimmunoassay
ROESY   Rotating frame nuclear Overhauser enhanced spectroscopy
RP      Reversed phase
SPPS    Solid phase peptide synthesis
TLC     Thin layer chromatography
TOCSY   Total correlation spectroscopy
TOF     Time of flight
UV      Ultraviolet

Miscellaneous

Symbol  Name
Ab      Antibody
ACE     Angiotensin-converting enzyme
ACTH    Adrenocorticotropic hormone
Ag      Antigen
AIDS    Acquired immunodeficiency syndrome
ANP     Atrial natriuretic polypeptide
ATP     Adenosine triphosphate
BK      Bradykinin
BSA     Bovine serum albumin
CCK     Cholecystokinin
DNA     Deoxyribonucleic acid
FSH     Follicle stimulating hormone
GH      Growth hormone
HIV     Human immunodeficiency virus
LHRH    Luteinizing hormone releasing hormone
MAP     Multiple antigen peptide
NPY     Neuropeptide Y
OT      Oxytocin
PTH     Parathyroid hormone
QSAR    Quantitative structure–activity relationship
RNA     Ribonucleic acid
TASP    Template-assembled synthetic protein
TRH     Thyrotropin releasing hormone
VIP     Vasoactive intestinal peptide
VP      Vasopressin

J. H. Jones

© The Royal Society of Chemistry 2015 (2014)