The identification of peptides by nanoLC-MS/MS from human surface tooth enamel following a simple acid etch extraction †

Tooth enamel is the hardest, densest and most mineralized tissue in vertebrates. This is due to the high crystallinity of enamel. During enamel formation, proteins responsible for mineralization are degraded by proteases, which results in mature enamel having less than 1% proteinaceous material, mostly as peptides. Many toxicological studies have taken advantage of the stability of tooth enamel to study heavy metal exposure; however, few studies have been successful in identifying peptides from the enamel, especially from a single tooth. Furthermore, amelogenin, the most abundant protein involved in tooth development, is expressed from both the X and Y chromosomes and is dimorphic. Sequencing of the gender dimorphic peptide regions may be useful in determining gender, especially when no other biomaterial is available and no intact DNA remains. In light of this, a method employing nanoflow liquid chromatography (nanoLC) electrospray ionization tandem mass spectrometry (MS/MS) was used to analyse peptides released through an acid etch of the enamel from individual teeth. Two approaches were investigated, one with trypsin digest following acid etch and one without. Peptide identification was accomplished using typical proteomics methodology by searching against the human proteome. Peptides from the major enamel structural proteins were identified, including amelogenin isoforms, ameloblastin, and enamelin. Furthermore, Y-chromosome-specific amelogenin peptides were also detected in mature enamel. Peptides were identified by nanoLC-MS/MS from the enamel of single teeth, from both present-day and archaeological samples, using a non-destructive and minimally invasive method. The identification of tooth enamel specific peptides with this approach allows for potential applications in forensic analysis and archaeological studies.


Introduction
Dental enamel is the outermost tissue that covers the tooth crown. It is the hardest, densest, and most calcified tissue in toothed vertebrates (96% by weight). 1 Because of its unique physical characteristics, enamel is the best preserved tissue from remains which contain teeth. Its mostly unaltered structure is often used for phylogenetic classification of species based, for instance, on morphological features such as the Hunter-Schreger bands. 2,3 The epi-illumination of enamel and the differential spread of light through groups of enamel rods has recently been shown to produce light-and-dark patterns resembling a digital impression. 4 Toxicological studies have taken advantage of the stability of tooth enamel to study heavy metal exposure. 5,6 While enamel exhibits these rich morphological features and important toxicological information in its "rocky matter", it does not contain DNA due to a lack of cells. Mature enamel, however, does contain a small amount of proteins (<1%). These proteins are mainly the heterogeneous amelogenins (AMELX and AMELY, comprising 90% of the immature enamel matrix), 7 ameloblastin (AMBN), enamelin (ENAM), matrix metalloproteinase 20 (MMP20), and kallikrein 4 (KLK4), 1 as characterized from immature enamel. 8 Most of these proteins are enamel-specific and they are thought to play a significant role in the formation of mature enamel, having evolved from independent genes ~500 million years ago. 9 Biochemical characterization of enamel has been done, mostly from abundant porcine material in order to purify sufficient amounts of protein 10 or by expressing and purifying the cloned proteins from bacteria. 11
Peptides can be recovered from thousand-year-old to million-year-old remains; for example, enamel-specific peptides were recovered from 1100 year-old mummy teeth from which one peptide was identified by matrix assisted laser desorption ionization tandem time-of-flight (MALDI-TOF/TOF) MS, 12 bone proteins were recovered from a femur of a 43 000 year-old mammoth, 13 and collagen has been obtained from mastodon and dinosaur bone. 14,15 The ability to recover peptides from old to ancient samples has significant value in many fields where the study of proteins preserved from the past could shed light on diet, lifestyle, and the evolution of the proteins themselves, since evidence of such changes may not be encoded in present DNA. 15-17 Schweitzer et al. 18 previously discussed the challenges of obtaining biomolecules from ancient material, and mention many advantages of using ancient peptides/proteins over DNA for evolutionary phylogenetic analyses.
Previously we have shown the feasibility of identifying peptides from mature enamel from fully calcified human and porcine teeth using MALDI-TOF/TOF MS. 19 However, single tooth acid-etchings did not provide sufficient material for MS analysis. We were also successful in identifying two N-terminal peptides from AMELX from >1000 year-old mummy teeth, but this required etching of the whole crown. 12 Workflows using nano-flow liquid chromatography tandem mass spectrometry (nanoLC-MS/MS) followed by protein database searching have become a routine approach in the field of proteomics, as this is a sensitive and selective method for detecting and identifying low levels of peptides from complex mixtures.
In this report, we describe the recovery of enamel-specific peptides from an acid etch sampling of single teeth and identification of unique peptides by nanoLC-MS/MS. Two sampling methods are presented: one with reductive alkylation of cysteines followed by a trypsin digest and one without these steps. The first method was performed on 6 teeth (3 male, 3 female) and the second method was performed on 4 teeth (2 male, 2 female) and 2 teeth (1 male, 1 female) originating from an archeological site c. 600-900 AD. This approach has great potential for application in many different fields, including forensics, paleontology and archaeology, where enamel, due to its unique properties, may be the only remaining source of unaltered preserved tissue.

Materials & methods
All chemicals and materials were of reagent grade unless otherwise mentioned. The first set of human teeth (3 male, 3 female) was obtained from patients whose third molars were being extracted for various reasons at the Dental Surgery Clinic at the University of Pittsburgh following ethical guidelines under the University of Pittsburgh Institutional Review Board, protocol number: IRB 0511110. The second set of teeth (2 male, 2 female) was obtained from patients attending the Dental Surgery Clinic at the Dental School of Ribeirão Preto of the University of São Paulo (Faculdade de Odontologia de Ribeirão Preto, FORP/USP), as approved by the Institutional Review Board protocol number CAAE 0229412.0.0000.5419. Two teeth (one female and one male) were obtained from skeletons from a mid to late Anglo-Saxon cemetery (c. 600-900 AD) in Seaham, UK.

Sample preparation: acid etch method
In both experimental sets each tooth provided one sample for analysis. Teeth were freed from any macroscopic soft tissue. The enamel crown was first washed with 3% H2O2 for 5 min, etched in 1 mL 10% (v/v) HCl for 1 min, followed by a second etch for 5 min, in the cap of a separate microcentrifuge tube, in 10% HCl containing the protease inhibitors phenanthroline, N-ethylmaleimide, and phenylmethylsulfonyl fluoride (all at 1.0 mM, added just prior to use; Sigma-Aldrich, St. Louis, MO, USA). The crown was then washed twice with 500 µL ddH2O (milliQ water, Millipore). The initial 1 min etch was discarded and samples from the second etch (5 min acid etch) were dried by vacuum centrifugation (Speed-Vac, Thermo Scientific) and used for analysis. Samples were desalted using a pipet tip packed with reversed-phase resin (POROS R2 50 µm, Life Technologies, NY, USA; 50 µL of resin/tip) to retain proteins/peptides. Elution from the resin was accomplished using 50 µL of acetonitrile at 50% (v/v) containing formic acid at 0.2% (v/v) (both from Sigma, HPLC grade). Tryptic digestion of the enamel peptides was carried out on three male and three female samples (first set of teeth). Each dried sample was resuspended in 20 µL water:acetonitrile (50% v/v, 50 mM ammonium bicarbonate). Five µL of DTT (45 mM, Sigma, HPLC grade) were added to the solution, which was then incubated for 1 hour in the dark at 56 °C. Afterwards, five µL of iodoacetamide (Sigma) at 100 mM were added to the sample, which was incubated for another hour in the dark at room temperature. Samples were then diluted 5 times with 100 mM ammonium bicarbonate solution, and trypsin (Trypsin Gold, Mass Spectrometry grade, Promega, Madison, WI, USA) was added to the solution, which was incubated at 37 °C in the dark for 22 hours. Five µL of formic acid (PA, 98%, Sigma) were added to stop the reaction, and samples were passed again through a tip column with POROS R2 resin. 
Elution of peptides from the resin was carried out using 50% (v/v) methanol containing 5% (v/v) acetic acid. Samples were dried under vacuum centrifugation (Speed Vac, Savant Thermo Scientific) and redissolved in 50 µL of 0.1% trifluoroacetic acid in ddH2O prior to MS analysis.
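The reagent concentrations reached during reduction and alkylation follow from simple dilution arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes ideal mixing with additive volumes. Because only volume ratios enter the calculation, the result does not depend on the volume unit.

```python
def final_concentration(stock_mM: float, added_vol: float, start_vol: float) -> float:
    """Concentration (mM) after adding `added_vol` of a stock to `start_vol` of sample.

    Volumes may be in any unit as long as both use the same one.
    """
    return stock_mM * added_vol / (start_vol + added_vol)


# Volumes and stock concentrations as stated in the protocol text.
resuspension_vol = 20.0   # sample resuspended in water:acetonitrile
dtt_added = 5.0           # 45 mM DTT stock
iaa_added = 5.0           # 100 mM iodoacetamide stock

# DTT concentration during the reduction step (20 + 5 volume units).
dtt_mM = final_concentration(45.0, dtt_added, resuspension_vol)

# Iodoacetamide concentration during alkylation (25 + 5 volume units).
iaa_mM = final_concentration(100.0, iaa_added, resuspension_vol + dtt_added)

# Molar excess of iodoacetamide over DTT (amounts, not concentrations).
iaa_to_dtt_ratio = (100.0 * iaa_added) / (45.0 * dtt_added)

print(f"DTT during reduction:  {dtt_mM:.1f} mM")
print(f"IAA during alkylation: {iaa_mM:.1f} mM")
print(f"IAA:DTT molar ratio:   {iaa_to_dtt_ratio:.2f}")
```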

NanoLC-MS/MS
Two microliters of each sample from the first sample set were analyzed by reversed phase nanoLC-MS/MS using a high performance liquid chromatography (HPLC) system (Ultimate 3000, Dionex) equipped with a static flow-splitter and a binary solvent system (solvent A: 0.1% formic acid in HPLC grade water; solvent B: 0.1% formic acid in acetonitrile), coupled to a linear ion trap (LTQ XL, Thermo Scientific). Samples were loaded on the column (75 µm ID, 15 µm tip packed with 10.5 cm of Reprosil-PUR C18, 3 µm particle size, 120 Å pore size, Pico-Chip, New Objective) for 8 min in 2% solvent A at a flow rate of 0.5 µL min⁻¹; the flow was split post sample loop at 8.5 min, and chromatography was performed using a linear gradient program (8.5-50 min, 2-40% B; 50-51 min, 40-95% B; 51-52 min, 95% B; 52-52.5 min, 95-2% B; 52.5-55 min, 2% B). The data-dependent acquisition mode was used to collect MS/MS spectra for the most intense ions (up to 5) from the preceding full-scan mass spectrum (350-1800 m/z) for a total acquisition time of 60 min. One microliter injections of each sample from the second sample set (no trypsin digest) and five microliter injections of the archeological sample set were subjected to reversed phase nanoLC-MS/MS (nanoRS U3000, Thermo Scientific) with a binary solvent system (solvent A: 0.1% formic acid, 3% DMSO in HPLC grade water; solvent B: 0.1% formic acid, 3% DMSO in acetonitrile), coupled to a hybrid linear ion trap orbitrap (Orbitrap XL, Thermo Scientific). Peptides were loaded onto a C18 trapping cartridge (Pepmap100C18; 0.3 × 5 mm ID; 5 µm particle size) for 5 min at a flow rate of 5 µL min⁻¹ in 0.1% TFA loading buffer. Peptides were separated on an analytical column (25 cm × 75 µm; 5 µm particle size, C18 PepMap100) with a flow rate of 300 nL min⁻¹ and a gradient of 0 to 30% solvent B over 40 min, 30% to 70% solvent B over 5 min, 70% to 90% solvent B over 5 min, held constant at 90% for 10 min, 90% to 0% in one min and equilibrated at 0% for 10 min. Nanospray was performed with a 10 µm uncoated silica tip emitter (New Objective, FS360-20-10-N-20).
The MS was operated in data-dependent MS/MS mode in which each full MS scan was collected in the orbitrap, precursor ion range of 300-1600 m/z (R = 60 000 at m/z 400), followed by up to eight MS/MS scans performed in the linear ion trap, where the most abundant peptide molecular ions were selected for collision-induced dissociation (CID) using a normalized collision energy of 35%. Total MS acquisition time was 72 min.

This journal is © The Royal Society of Chemistry 2016
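The two gradient programs described in this section are piecewise-linear, so the solvent B percentage at any time point can be obtained by interpolating between the stated breakpoints. The following is a minimal sketch: the breakpoints are transcribed from the text, while the names `GRADIENT_SET1`, `GRADIENT_SET2` and `percent_b` are hypothetical. Times outside a program are clamped to its first or last value, which is an assumption and does not model the sample loading step.

```python
from bisect import bisect_right

# (time in min, % solvent B) breakpoints transcribed from the text.
GRADIENT_SET1 = [(8.5, 2.0), (50.0, 40.0), (51.0, 95.0),
                 (52.0, 95.0), (52.5, 2.0), (55.0, 2.0)]
GRADIENT_SET2 = [(0.0, 0.0), (40.0, 30.0), (45.0, 70.0), (50.0, 90.0),
                 (60.0, 90.0), (61.0, 0.0), (71.0, 0.0)]


def percent_b(program, t):
    """Linearly interpolate % solvent B at time t (min) for a gradient program."""
    times = [point[0] for point in program]
    if t <= times[0]:
        return program[0][1]      # clamp before the first breakpoint (assumption)
    if t >= times[-1]:
        return program[-1][1]     # clamp after the last breakpoint (assumption)
    i = bisect_right(times, t)    # index of the first breakpoint later than t
    (t0, b0), (t1, b1) = program[i - 1], program[i]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


for minute in (10, 30, 50, 51.5, 54):
    print(f"set 1, {minute:>5} min: {percent_b(GRADIENT_SET1, minute):5.1f}% B")
for minute in (20, 42.5, 55, 65):
    print(f"set 2, {minute:>5} min: {percent_b(GRADIENT_SET2, minute):5.1f}% B")
```

Summing the segments of the second program gives 71 min, which is consistent with the 72 min total acquisition time stated in the text.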

Database searches
Data from the rst set of samples was searched against the human proteome (Uniprot 02/2013, 87 656 entries) with no enzyme constraint, methionine oxidation as variable modication, using average mass with a peptide tolerance of 1.4 Da and a MS/MS tolerance of 0.5 Da using Mascot (v2.4.0, Matrix Science Ltd.). Filtering the data was performed using Scaffold (version 4.2.0, Proteome Soware Inc., Portland, OR). Peptide identications were accepted if they could be established at greater than 95.0% probability by the Scaffold Local FDR algorithm. Protein identications were accepted if they could be established at greater than 99.0% probability and contained at least 2 identied peptides.
For the second set and archaeological sets of samples, data was searched against the human proteome (UniprotKB, 10/2015, canonical and isoform, 92 035 entries) using MaxQuant (Version 1.5.1.2) employing default search settings, methionine oxidation as variable modification, unspecific digestion mode, with a first search peptide tolerance of 20 ppm and a main search peptide tolerance of 4.5 ppm.
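To give a sense of scale for the ppm tolerances quoted above, the sketch below converts a ppm tolerance into an absolute m/z window. The function names are hypothetical; the example m/z of 440.2233 is the value reported later in this paper for the doubly charged SM(ox)IRPPY ion, and the "observed" value is made up for illustration.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def tolerance_window(mz: float, ppm: float):
    """Lower and upper m/z bounds for a symmetric ppm tolerance."""
    delta = mz * ppm * 1e-6
    return mz - delta, mz + delta


def within_tolerance(observed_mz: float, theoretical_mz: float, ppm: float) -> bool:
    """True if the observed m/z lies inside the ppm tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= ppm


theoretical = 440.2233
for tol in (20.0, 4.5):   # first-search and main-search tolerances from the text
    low, high = tolerance_window(theoretical, tol)
    print(f"{tol:>4} ppm window: {low:.4f} to {high:.4f} m/z")

observed = 440.2253       # hypothetical measurement
print(f"error: {ppm_error(observed, theoretical):.2f} ppm")
```

At this m/z, 4.5 ppm corresponds to a window of roughly ±0.002 m/z, far narrower than the 1.4 Da peptide tolerance used for the ion trap data of the first sample set.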

Results and discussion
A simple acid etch methodology (see Fig. 1) was applied to two sample sets of teeth to produce peptides for nanoLC-MS/MS analysis. The first sample set, which was reduced and alkylated followed by an overnight trypsin digest, produced peptides predominantly specific to tooth enamel proteins (see Table 1). Peptides derived from the X/Y isoforms of amelogenin made up the bulk of peptides identified. Other tooth specific peptides identified were from osteopontin and the structural proteins ameloblastin and enamelin. Non-tooth specific peptides were also identified from serum albumin, hemoglobin and collagen, and this may be due to the natural lifetime contact of the teeth with the oral cavity. The collagen and osteopontin may also have originated from the root's cementum during the etching procedure, as the acid solution tends to spread over the tooth during the etching and may have come into contact with the root.
Similar results were obtained from the second set of teeth with the reductive alkylation and trypsin steps omitted (see Table 2), with the exception that osteopontin was not identified. Results from the first set (with trypsin digest) produced additional peptide identifications for amelogenin which are likely the result of the cleavage C-terminal to lysine 24 (AMELX) by trypsin (e.g. K.WYQSIRPP.Y). Although the use of trypsin increased the variety of peptides identified, it did not greatly improve sequence coverage. Overall, from both methods, the list of proteins is small (a dozen or less), indicative of the nature of the sample. Peptides specific to the Y isoform of amelogenin were identified; this is a significant finding of this study, as it opens the possibility of determining sex from enamel sampling alone. The sequence coverage of amelogenin is shown in Fig. 2 with the dimorphic peptides identified highlighted (all peptide sequences identified for amelogenin isoforms are shown in ESI Table 1 †). Of note, the identified peptides originated from two regions of the protein sequence: the tyrosine-rich amelogenin polypeptide (TRAP) N-terminal region (AA1-45) and the hydrophilic charge containing C-terminus (AA165-180) region. Peptides from the central region of the protein, which includes a histidine-rich coil-domain region (AA46-125) and the PXX repeat domain region (AA126-164), 20 were not identified. The loss of the central domains is thought to be the result of proteolytic processing during maturation of enamel by matrix metalloproteinase 20 (MMP20) and kallikrein 4 (KLK4). 21
The dimorphic sequences of AMELY found in the enamel samples analyzed in this study are: YEVLTPLKWYQSMIRPPYS, [...] from isoform Y (see Fig. 3). The amelogenin peptides identified (ESI Tables 1 and 2 †) have sequences which agree with predicted cleavage products of KLK4 determined from recombinant porcine amelogenin substrates and fluorogenic peptide substrates. 22 Using this simple method to produce enamel derived peptides, without the need of a trypsin digest, different depths of the enamel could be probed by varying the time of exposure to acid. This method may be useful in the identification of inherited diseases of the enamel when DNA material is unavailable, such as in Amelogenesis Imperfecta (AI), where one cause of AI is a mutation in the gene encoding AMELX. It is widely accepted that amelogenins are transcribed from both the X and the Y chromosomes, but that only 10% of the transcripts originate from chromosome Y. 23 Jobling et al. studied samples from 45 males with deletions in the short arms of the Y chromosome, in which AMELY is deleted, and although these teeth had a normal appearance, they suggest that this deletion may be of functional significance. 24 Information on the presence of protein transcribed and translated from the AMELY gene is still lacking. Our study suggests that AMELY specific transcripts are translated into protein and that some of these peptides remain in the mature enamel; the significance of the Y-specific transcripts with respect to function can be further studied.

Table 3 List of proteins with accession numbers identified from peptides recovered from the acid etch of tooth enamel from archaeological samples

Peptide count  Accession numbers  Protein name                        Sex
11             Q99217, Q99218-1   Amelogenin, X isoforms              Male
4              Q9NRM1             Enamelin                            Male
3              Q99217-2           Isoform 2 of amelogenin, X isoform  Male
3              Q9NP70, Q9NP70-2   Ameloblastin                        Male
9              Q99217, Q99218-1   Amelogenin, X isoforms
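The difference between a trypsin-constrained and an unconstrained ("no enzyme") peptide search, as used in this study, can be illustrated with an in-silico digest of the AMELY sequence YEVLTPLKWYQSMIRPPYS reported above. This is a simplified sketch, not the search engines' implementation: function names are hypothetical, the trypsin rule used is the common one (cleave C-terminal to K or R unless followed by P, no missed cleavages), and the minimum peptide length of 7 is an illustrative assumption.

```python
import re


def tryptic_digest(sequence: str):
    """Cleave C-terminal to K or R, except when the next residue is P."""
    return [pep for pep in re.split(r"(?<=[KR])(?!P)", sequence) if pep]


def nonspecific_peptides(sequence: str, min_len: int = 7):
    """Every contiguous sub-sequence of at least `min_len` residues."""
    n = len(sequence)
    return [sequence[i:j]
            for i in range(n)
            for j in range(i + min_len, n + 1)]


amely_fragment = "YEVLTPLKWYQSMIRPPYS"   # dimorphic AMELY sequence from the text

tryptic = tryptic_digest(amely_fragment)
unconstrained = nonspecific_peptides(amely_fragment)

print("tryptic peptides:", tryptic)
print("unconstrained candidates:", len(unconstrained))
# SMIRPPY has neither a tryptic N- nor C-terminus, so only an
# unconstrained (or otherwise relaxed) search can consider it.
print("SMIRPPY among tryptic peptides:", "SMIRPPY" in tryptic)
print("SMIRPPY among unconstrained candidates:", "SMIRPPY" in unconstrained)
```

The much larger candidate list in the unconstrained case is the price paid for being able to match naturally processed peptides.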
The central region of mammalian amelogenin was described by Sire et al. 25 to be highly variable in sequence when compared to that of reptiles; however, the N- and C-terminal regions were highly conserved (over 250 million years of evolution). This implies that these sequence regions of amelogenin are evolutionarily critical. The feasibility of obtaining peptides from these highly conserved N- and C-terminal sequences directly from mature enamel, using the acid etch method described herein, may contribute to the study of evolution.
Amelogenin peptides have been found in mummy teeth, 12 suggesting that peptides may be preserved inside dental enamel, protected by the hardest of all mammal tissues. Since many species have dimorphism in the amelogenin gene, and since DNA rarely survives more than 10-15 thousand years, the use of enamel peptides may open a window into the past to determine the gender of ancient humans and possibly fossils. To test whether the method can be used on "old" archeological samples, we applied the direct acid etch method to two teeth, one male and one female, recovered from an Anglo-Saxon cemetery (c. 600-900 AD, Seaham, UK). Similar results to those of "present-day" samples were obtained, as shown in Table 3, where the majority of peptides identified originate from the predominant enamel proteins: amelogenin, ameloblastin and enamelin. The peptide SM(ox)IRPPY ([M + 2H]2+ = 440.2233 m/z) from amelogenin isoform Y was not identified in the male sample from the database search (peptides identified are shown in ESI Table 3 †). This is thought to be due to the "unspecific" enzyme search parameter, since the peptide is identified when the search is performed using kallikrein-like specificity 26 (see MS/MS spectrum; ESI Fig. 4, † which is identical to ESI Fig. 2 †). Examples of a few peptides from the two archaeological samples are shown in Fig. 4, as the RICs of their corresponding m/z. Again, peptide SM(ox)IRPPY is not present in the female sample; however, it seems to be in higher abundance than in the present-day samples (Fig. 3). This difference in intensity may be indicative of increased oxidation of the methionine, due to its age or other factors which must be investigated. From these results, it is evident that this method can be applied to archaeological samples, and we are currently pursuing this, but results may differ from "present-day" samples and further study is required to characterize these variations. 
Also, further work would be required before this method could be implemented in the fields of forensics and archaeology.
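The m/z value quoted above for the doubly protonated SM(ox)IRPPY peptide can be checked from standard monoisotopic residue masses. The sketch below is a simple illustration: the function name is hypothetical, the "(ox)" notation is parsed naively as one oxidation (+15.994915 Da) per occurrence, and no other modifications are handled.

```python
# Standard monoisotopic residue masses (Da).
MONOISOTOPIC = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565       # H2O added for the free N- and C-termini
PROTON = 1.007276       # mass of a proton
OXIDATION = 15.994915   # one oxygen atom, e.g. methionine sulfoxide


def peptide_mz(sequence: str, charge: int) -> float:
    """Monoisotopic m/z of [M + zH]z+ for a peptide such as 'SM(ox)IRPPY'."""
    n_oxidations = sequence.count("(ox)")
    residues = sequence.replace("(ox)", "")
    neutral_mass = (sum(MONOISOTOPIC[aa] for aa in residues)
                    + WATER + n_oxidations * OXIDATION)
    return (neutral_mass + charge * PROTON) / charge


mz = peptide_mz("SM(ox)IRPPY", 2)
print(f"[M + 2H]2+ of SM(ox)IRPPY: {mz:.4f} m/z")
```

The computed value agrees with the 440.2233 m/z given in the text to four decimal places.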
During the preparation of this manuscript, Castiblanco et al. identified peptides from human tooth enamel without the use of trypsin digest using LC-MS/MS and a Mascot database search. 27 However, their methodology involved cutting the tooth crown and removing the enamel under a stereomicroscope, followed by grinding in liquid nitrogen. Results from our method clearly demonstrate that this intricate process is avoidable, if not unnecessary, and that in comparison our method is minimally invasive, a key feature in the preservation of precious archeological samples.
In conclusion, by means of a simple acid etch technique followed by nanoLC-MS/MS, peptides specific to enamel proteins were identified: isoforms X and Y of amelogenin, ameloblastin and enamelin. This offers possibilities for studying both present-day and older specimens since, due to its hard and dense properties, dental enamel is ideal for preserving such peptides. The etching from one tooth provided ample material to easily identify peptides specific to tooth enamel, including sex specific peptides from amelogenin isoform Y.