DOI: 10.1039/B008012H (Paper)
Analyst, 2001, 126, 52–57
Searching the Porphyromonas gingivalis genome
with peptide fragmentation mass spectra
Received 1st September 2000, Accepted 30th October 2000
First published on 7th December 2000
Abstract
An approach is described for genomic database searching based
on experimentally observed proteolytic fragments, e.g., isolated
from 1D or 2D gels or analyzed directly, that can be applied to unfinished
prokaryotic genomic data in the absence of annotations or previously
assigned open reading frames (ORFs). This variation on the database search
is in contrast to the more familiar use of peptide mass spectral
fragmentation data to search fully annotated inferred protein databases,
e.g., OWL or SWISS-PROT. We compared the SEQUEST search results
from a six reading frame translation of the Porphyromonas
gingivalis genome DNA sequence with those from computationally derived
ORFs created using publicly available genomics software tools. The ORF
approach eliminated many of the artifacts present in output from the six
reading frame search. The method was applied to uninterpreted tandem mass
spectrometric data derived from proteins secreted by the periodontal
pathogen Porphyromonas gingivalis in response to the gingival
epithelial cell environment, a model system for the study of
host–pathogen interactions relevant to human periodontal disease.
Introduction
The use of databases of known expressed or inferred protein sequences to
assist in the identification of unknown proteins analyzed using peptide
collision-induced dissociation (CID) data is a well-established practice.
This is, in essence, the use of model data (the database) to elucidate the
experimental observation (the peptide CID, that intrinsically contains
partial amino acid sequence information, that in turn can be related back
to a specific gene via the database). Two recent reviews summarize
current progress in proteomics research as it relates to
microbiology1 and genetics.2 We present here an approach by which the
experimental observation is used to test and elucidate the model data found
within new or unfinished prokaryotic genomic databases, which consist of
one or more long DNA contigs rather than entries for individual proteins
translated from the gene sequence. The object of the exercise was to
utilize fully new genomic data that are not normally available in an
annotated protein database format, often for months or years after the
actual DNA sequence itself is complete. The mass spectrometry based
‘reverse genomics’ approach presented here is complementary to
cDNA array technology in that it represents a view of the bacterial genome
that is firmly based on experimentally observed protein expression. At the
present time, protein mass spectrometry and cDNA arrays are the most
promising approaches for linking the worlds of conventional hypothesis
driven pathogenesis research and functional genomics.
Mass spectrometry software programs for protein analysis are geared
towards searching large, fully annotated protein databases, e.g.,
OWL,3 SWISS-PROT4 and others. Our approach allows database searching
strategies, based on the SEQUEST5,6
program, to be applied to raw prokaryotic genomic DNA data in the absence
of any prior knowledge regarding protein expression. While the data
reduction and analysis approach presented here is hardly elegant, it has
provided our laboratory with an interim solution pending the availability
of the genome data in an annotated format that is more compatible with the
current generation of protein mass spectrometry software programs. The
absence of extended sequences of non-coding DNA, relative to the situation
in higher organisms, coupled with the small sizes of the genomes
(e.g., 2.2 million base pairs (MBP) for Porphyromonas
gingivalis), makes our approach reasonable with modest computational
resources.
P. gingivalis is a Gram-negative anaerobic coccobacillus that
is an etiological agent of severe adult periodontitis. P.
gingivalis possesses a number of virulence factors including the
ability to invade the epithelial cells of the gingiva. In primary cultures
of gingival epithelial cells the internal bacteria rapidly locate in the
cytoplasm, predominantly in the perinuclear area, where they can replicate
and reach a high density.7 The molecules of
P. gingivalis that direct these events have yet to be determined.
In most invasive Gram-negative bacteria, the type III protein secretion
machinery mediates secretion of bacterial molecules into the cytoplasm of
host cells. The targets of the type III secretion apparatus often have the
ability to subvert information flow within the host cells and facilitate
bacterial entry.8 Similarly, P.
gingivalis secretes a novel set of proteins when in contact with
epithelial cells which may have intracellular effector activity.9 However, the genomic sequence for P.
gingivalis reveals that the organism does not have the genes for a
conventional type III protein secretion apparatus. In order to investigate
the nature and role of the proteins secreted by P. gingivalis in
response to epithelial cells and the process by which they are secreted, it
is first necessary to identify the secreted proteins. We acquired tryptic
peptide CID data from a differential proteomics experiment using proteins
secreted by P. gingivalis grown under normal laboratory conditions
and after exposure to conditioned growth medium from human gingival
epithelial cells (GECs). GECs are a model system used to study
host–pathogen interactions involved in human periodontal disease. The
data set from these experiments was used both to validate the analytical
strategy and to generate biologically relevant information about P.
gingivalis and its interactions with GECs. The data reduction and
analysis strategy that has evolved from this work consists of using a
computational open reading frame (ORF) finding tool and a locally hosted
BLAST server to augment our existing mass spectrometry and protein database
search capabilities.
Experimental
Bacterial growth conditions, protein extraction,
SDS-PAGE
P. gingivalis 33277 was grown anaerobically at 37 °C in
Trypticase soy broth supplemented per liter with 1 g of yeast extract, 5 mg
of hemin and 1 mg of menadione. Proteins secreted by P. gingivalis
were collected as described by Park and Lamont.9 Briefly, P. gingivalis (PG) cells washed
with phosphate-buffered saline (PBS) were resuspended in conditioned
keratinocyte basal medium (KBM) (culture supernatant of gingival epithelial
cells) or PBS and incubated for 6 h at 37 °C. Proteins in the cell-free
medium were obtained by precipitation with 10% trichloroacetic acid (TCA)
as described previously.9 Three gel bands
containing the proteins of interest were excised from Coomassie Brilliant
Blue-stained gels after SDS-PAGE and digested in situ with
trypsin.10 For our purposes, a protein was
of interest if it was observed as a gel band after exposure to the
conditioned GEC growth medium described above, but not observed in the PG
containing control buffer. A blank control was also run consisting of a gel
containing only laboratory background without PG protein.
Protein mass spectrometry
All CID (MS2) data were collected in an automated, data
dependent manner11 using a Finnigan (San
Jose, CA, USA) TSQ 7000 mass spectrometer coupled with a microcapillary
HPLC inlet system12 and a modified
electrospray ionization interface13 that
have been described previously. Briefly, Magic C18 stationary
phase material (Michrom BioResources, Auburn, CA, USA) was packed into a 12
cm × 75 μm id fused-silica capillary column. Approximately 1 μl
of digest from each gel spot was loaded pneumatically.14,15 The columns were eluted with a 45 min linear gradient
of 2–95% acetonitrile in water (0.4% v/v acetic acid) at a flow rate
of 250 nl min−1, as measured at the beginning of the
gradient. A script written in Instrument Control Language (ICL) (Finnigan)
instructed the TSQ to collect a scan of centroid mode main beam
(MS1) data over the range m/z 200–2000 every 1.5
s, until a signal was detected above a threshold value of 40000 counts with
a signal-to-noise ratio (S/N) >5. Once the main beam signal exceeded the
threshold, the instrument acquired several scans of CID product ion mass
spectra (MS2) while invoking a subroutine to optimize the
collision offset, before automatically switching back to the MS1
mode. This process was repeated for the entire HPLC run. A constant
pressure of 3.0 mTorr of argon was maintained in the octopole collision
cell at all times. The finished data files from each gel band typically
consisted of roughly 50–200 CID spectra per run, which were acquired
and stored on a DEC AlphaStation Model 200 4/166 computer (Compaq/DEC,
Houston, TX, USA). The hard drive containing the raw data in Finnigan ICIS
format was cross-mounted via an intranet connection and the Unix
Network File System (NFS) to both a second DEC AlphaStation 200 and a much
faster Compaq/DEC AlphaStation 500. The first AlphaStation was dedicated to
real time control of the mass spectrometer only and was not used for
post-run computations. Subsequent post-run analyses and database searching
queues were initiated from either the second AlphaStation 200 or the
AlphaStation 500.
Computational hardware, software and procedures
Our approach to the genome-as-database problem was first to pre-compile
on disk our own six reading frame translation in FASTA format, based on a
PERL script (PERL version 5.005_03, www.perl.com) provided by
the UW Genome Center. Second, we investigated the ORFIND program (author:
T. Tatusov) from the National Center for Biotechnology Information (NCBI, a
division of the National Institutes of Health, Bethesda, MD, USA) for
purposes of creating a second FASTA database of putative ORFs. ORFIND
source code (C language) was compiled locally to run under Digital Unix.
The SEQUEST program itself will make the six reading frame translation of
the DNA ‘on the fly.’ However, we preferred to pre-compile our
databases for reasons of speed, to ensure the use of a translation table
optimized specifically for bacteria (NCBI Table 11, NIH, Bethesda, MD,
USA), and to have a permanent record of the theoretical translation. The
six reading frame search output from SEQUEST was itself searched against
the PG ORF database using BLAST 2.1 (see below), as shown schematically in
Fig. 1. The putative ORF database was also
searched by SEQUEST directly (see Fig. 2).
CIDs with high scoring matches common to both databases were inspected
manually to verify the peptide sequence in the SEQUEST output. High scoring
matches against the six reading frame database not found in the ORF
database were also inspected manually. All searches were run on an
AlphaStation 500 running Digital Unix 4.0D, the APACHE web server
(www.apache.com) and SEQUEST (Unix version 27, University of
Washington, Seattle, WA, USA). General protein database searches were
conducted using OWL version 31.4
(www.biochem.ucl.ac.uk/bsm/dbbrowser/OWL/OWL.html).
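The pre-compiled six reading frame translation described above can be illustrated with a minimal Python sketch. This is not the PERL script used in this work; the function names (`translate`, `reverse_complement`, `six_frame_fasta`) and the FASTA header format are hypothetical. The sketch relies on the fact that NCBI Table 11 assigns the same amino acids to codons as the standard code (the two differ in permitted start codons), and it assumes that codons containing ambiguous bases are written as X.

```python
from itertools import product

# Codon table in NCBI order (first base varies slowest, bases ordered T, C, A, G).
# Amino acid assignments of Table 11 are identical to the standard code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_fasta(name, dna):
    """Return a FASTA string with three forward and three reverse frames."""
    dna = dna.upper()
    records = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            protein = translate(seq[offset:])
            # Hypothetical header format: contig name, strand and frame number.
            records.append(f">{name}_frame{strand}{offset + 1}\n{protein}")
    return "\n".join(records)


if __name__ == "__main__":
    print(six_frame_fasta("contig1", "ATGAAATAA"))
```

Each contig yields six long pseudo-protein entries, stop codons included as `*`, which is why each contig is treated by the search program essentially as a single large protein.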
Fig. 1 Flow chart describing the order of events when searching the six reading
frame translation of the PG genome with peptide tandem mass spectrometric
data. Early in our studies the ORF database was used purely to verify
putative amino acid sequences already assigned by SEQUEST, and it was not
searched directly.
Fig. 2 Flow chart describing the order of events when SEQUEST was used to
search the ORF database directly. Prior to the ORF search the CID data are
normally searched against a large protein database (e.g., OWL) to
identify any non-PG proteins that are present; see the text for further
discussion.
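To make the idea of a putative ORF database concrete, the sketch below extracts stop-free stretches from one translated reading frame. It is only an illustration: the criteria actually applied by ORFIND are not described in the text, so the stop-to-stop rule, the `min_length` parameter and the function name `putative_orfs` are all assumptions.

```python
def putative_orfs(frame_translation, min_length=50):
    """Return (start_index, segment) for each stop-free stretch of residues.

    frame_translation: one translated reading frame, with '*' marking stops.
    min_length: hypothetical cut-off; shorter stretches are discarded.
    """
    orfs = []
    start = 0
    for segment in frame_translation.split("*"):
        if len(segment) >= min_length:
            orfs.append((start, segment))
        start += len(segment) + 1  # move past the segment and its stop codon
    return orfs


if __name__ == "__main__":
    # A short toy frame; a small cut-off is used so that something is kept.
    print(putative_orfs("MKT*AA*MSTNPKPQR", min_length=3))
```

A low cut-off gives a permissive database that lists most plausible ORFs, at the cost of more entries to search.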
The six reading frame and ORF PG databases were based on the May 17,
1999, release of the PG genome (www.tigr.org). SEQUEST runs were
controlled using the DQS queuing system version 3.2.7 (Florida State
University, Tallahassee, FL, USA). The search results were fed into HTML
based data summary tools and presented using standard HTML browsers.
Subsequent searches, conducted using a variant of the Basic Local Alignment
Search Tool (BLAST), TBLASTN, were used to confirm the locations of the
codons in the PG database corresponding to the experimentally observed
peptides common to both the raw and putative ORF databases. These searches
were run locally on our AlphaStation 500 using the stand alone web-based
BLAST server (NCBI). For short peptides five amino acid residues in length
(e.g., DLLFK, ESLTK; see Table
1) that failed to work with BLAST, we used a web browser
(Netscape) based text editor to locate the fragments in the original PG DNA
database or the putative ORF protein database.
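The text-based lookup of short peptides can be sketched as an exact string search over the six reading frames, reporting 1-based base pair coordinates. This is an illustrative Python sketch, not the procedure run in this work; `locate_peptide` is a hypothetical name, and reporting reverse-strand hits with the start coordinate larger than the end coordinate is an assumption made for this example.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def locate_peptide(dna, peptide):
    """Return (first_bp, last_bp, strand) for every exact match of peptide."""
    dna = dna.upper()
    n = len(dna)
    hits = []
    for strand, seq in (("+", dna), ("-", dna.translate(COMPLEMENT)[::-1])):
        for offset in range(3):
            protein = translate(seq[offset:])
            pos = protein.find(peptide)
            while pos != -1:
                first = offset + 3 * pos + 1        # 1-based, on `seq`
                last = first + 3 * len(peptide) - 1
                if strand == "-":
                    # Map back to forward-strand numbering (start > end).
                    first, last = n - first + 1, n - last + 1
                hits.append((first, last, strand))
                pos = protein.find(peptide, pos + 1)
    return hits


if __name__ == "__main__":
    toy = "CC" + "GATCTGCTGTTTAAA" + "G"  # toy sequence encoding DLLFK
    print(locate_peptide(toy, "DLLFK"))
```

Unlike a BLAST search, an exact match of this kind has no lower limit on peptide length, although a five-residue sequence may occur by chance at more than one location and each hit still needs to be checked against the ORF context.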
Table 1 SEQUEST search results for tryptic fragments with amino acid sequences
that could be derived from both the six reading frame and the ORF databases
generated from the P. gingivalis genome. Each sample represents a
band excised from an SDS-PAGE gel
| Sample name | Location [a] | Amino acid sequences for matching peptides | Homology [b] |
|---|---|---|---|
| Band 1 | 917688–917717 | QAIVYWKTLK | HipA protein (E. coli) |
| | 1720447–1720418 | SDELRLMIHR | DNA damage-inducible protein F (Vibrio cholerae) |
| | 1987395–1987427 | QSSKEHIPSNK | Acrosomal protein ACR55 (Homo sapiens) [c] |
| Band 2 | 155209–155177 | HNRGFLTPELK | Lipopolysaccharide biosynthesis protein (Thermotoga maritima) |
| | 705350–705321 | DLLFK | Probable phosphoserine phosphatase (Streptomyces coelicolor) |
| | 1539806–1539780 | DSPVCEAIPK | Hypothetical protein A (Bacillus stearothermophilus) [c] |
| | 2118271–2118242 | GAAPINHAIR | Chain A, methionyl-tRNAfMet formyltransferase complexed with formyl-methionyl-tRNAfMet (E. coli) |
| Band 3 | 1853282–1853308 | ESAPRSFEK | RTN2-C (Homo sapiens) [c] |
| | 573866–573895 | KALGYLLSER | Amidophosphoribosyltransferase (Pasteurella multocida) |
| | 1369783–1369754 | KNGENLLLIK | Hypothetical protein RP819 (Rickettsia prowazekii) |
| | 2047261–2047275 | ESLTK | Lycopene β-cyclase (Citrus sinensis) [c] |
[a] Base pair location from genome sequence in the TIGR database (www.tigr.org).
[b] Determined by translating the ORF containing the peptide sequence and BLAST searching for homology in the GenBank database.
[c] Low homology score, E value >0.1 (www.ncbi.nlm.nih.gov/BLAST/).
Results and discussion
Initial observations and OWL database search
results
A representative data-dependent capillary LC-MS-MS experiment is shown
in Fig. 3. The majority of the peptides
observed were from a protein contaminant of human origin, siderophilin
(GenBank accession No. P02787), as identified in the OWL database, that was
present in all three gel bands at much higher abundance than the proteins
secreted by PG. We believe the siderophilin originated with the cell
culture medium. However, the experiment is robust enough that even this
relatively high contamination level did not significantly impede the
identification of the PG proteins. Signals from the PG related tryptic
fragments were weak, but adequate to generate at least one high quality
match per putative protein. As has been noted previously in the literature
with respect to inferred protein databases,5,6 only a small amount of protein sequence coverage is
necessary to relate it back to the database. The OWL or other general
protein database search is an important tool even in experiments targeted
towards a single genome because of the ease with which peptide fragments
from common background sources, e.g., human keratin or trypsin
autolytic fragments, are quickly eliminated from consideration as useful
data. However, on our equipment a search of several hundred CID mass
spectra against the entire OWL database may take 4 h or longer. A search of
the same raw data against a single small genome, e.g., PG, takes
only 2–3 min.
Fig. 3 Representative mass spectrometric data derived from P.
gingivalis proteins used for the OWL and PG genome database searches.
(a) Reconstructed ion chromatogram from the microcapillary HPLC
electrospray ionization auto-CID analysis of the tryptic peptides from a
protein band excised from SDS-PAGE. (b) The mass chromatogram trace of
m/z 438, [M + 3H]3+ parent ion from the peptide
HNRGFLTPELK (see Table 1 and text). (c)
CID mass spectrum of daughter ions from the parent ion at m/z 438,
with labels showing the most informative fragments. The ion
m/z 110 is most likely an immonium ion diagnostic for the
presence of histidine. Ions marked with an asterisk have lost either
ammonia or water from the mass of the indicated y series ion. The
nomenclature for peptide CID fragment ions has been reviewed by
Biemann.18
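The m/z values quoted in the caption can be checked with a short calculation. The sketch below is illustrative only: it uses standard monoisotopic residue masses, and the function names `peptide_mz` and `immonium_mz` are hypothetical. For HNRGFLTPELK it gives a triply protonated ion close to the m/z 438 reported, and for histidine an immonium ion close to m/z 110.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565    # H2O added for the free N- and C-termini
PROTON = 1.007276    # mass of a proton
CO = 27.994915       # carbon monoxide lost on immonium ion formation


def peptide_mz(sequence, charge):
    """Monoisotopic m/z of an unmodified peptide carrying `charge` protons."""
    neutral = sum(MONO[residue] for residue in sequence) + WATER
    return (neutral + charge * PROTON) / charge


def immonium_mz(residue):
    """Monoisotopic m/z of the singly charged immonium ion of one residue."""
    return MONO[residue] - CO + PROTON


if __name__ == "__main__":
    print(round(peptide_mz("HNRGFLTPELK", 3), 2))
    print(round(immonium_mz("H"), 2))
```

The calculation assumes no post-translational or chemical modifications; a modified residue would shift the expected parent ion accordingly.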
PG genome search results
The searches of all three gel bands against the raw PG genome database
led to roughly 20 high quality matches. Mathematical details of how SEQUEST
calculates these matches and criteria for determining exactly what
constitutes ‘high scores’ have been published.5,6 Briefly, the program matches theoretical
peptide mass spectra based on the sequences found in the database against
the observed peptide CIDs, with adjustable parameters for the resolution
and mass accuracy of the mass spectral data, the type of database (DNA or
protein), the proteolytic cleavage specificity of the enzyme used to digest
the protein and mass increments for possible post-translational
modifications, among others. Each experimental CID is given a preliminary
score (Sp) and a cross correlation score (Xcorr) based on the quality of
the match with the theoretical spectra derived from the FASTA format
database. The quality of the match is indicated by rank order and values
for both Sp and Xcorr. Higher numerical values are better for both Sp and
Xcorr. The actual numbers observed and their interpretation depend to some
extent on the characteristics of the database being searched. By way of
example, for a search of the full OWL protein database an Xcorr value
>2.0 is usually significant. For each peptide we look for a top ranked
entry whose score is significantly higher than that of the next best
match, as indicated by the dCn (deltaCn, or 1 − Cn; Cn
= normalized Xcorr) parameter in the output. When the top ranked match
in terms of Xcorr or Cn is separated from the second ranked match by a
dCn of >0.1 units, this suggests that the
cross-correlation algorithm converged on a unique sequence in the
database.5 Readers are urged to study
refs. 5 and 6 and the references contained therein carefully in order
to understand fully the strengths and limitations of the algorithm with
respect to genome searches of the type reported here. For the data set
summarized in Table 1, the best 20
matches were abstracted from an output consisting of about 540 matches of
lesser quality. Of those 20, roughly half were rejected because of failure
to give good matches also with the ORF database. The 11 peptides judged to
give good matches in both databases are summarized in Table 1. Coverage was poor owing to the high level
of background from human proteins, which we estimate were more abundant in
the gel bands by a factor of >1000 on a weight basis. In general, the
more peptides matched with a given putative ORF, the more reliable is the
assignment. However, even one peptide from a noisy low quality data set can
serve to identify an ORF correctly. For the protein secretion data in
Table 1, we were able to identify
several PG genes for further study. The secretion data represented a
‘worst case scenario’ for our laboratory in terms of S/N and
high background, which was why it was chosen to test the robustness of the
method.
Manual inspection of all high scoring matches of any length, found only
in the six reading frame ‘raw’ database (three forward, three
backwards, or one for each letter of the triplet genetic code taken in both
directions), but not in the ORF database, indicated that these hits were
artifacts. They did not correspond to real expressed protein or real
reading frames, as verified by genetic analysis of the surrounding DNA 1000
base pairs on either side of the codons assigned to the putative peptide
sequence. In order to use the experimentally derived peptide low energy CID
data16 to its greatest advantage, as a way
to probe microbial genomes, a practical way had to be found to reduce the
volume and improve the quality of the output from our search program. False
positives were expected, in that the search algorithm is designed to
achieve the best possible match of the CID spectra with the
database,5 i.e., there exists a
presumption that the database being searched contains accurate sequence
information for the CID to match. Searching a small database made up
primarily of biologically irrelevant codon assignments, and where each
contig is treated computationally essentially as a single large protein, is
prone to artifact. Although the actual situation is more complex, to
illustrate our logic it can be assumed as an approximation that only about
17% (one frame in six) of the six reading frame translation from DNA triplets to single amino
acids will be correctly in-frame, if one makes the simplifying assumption
that all the DNA is transcribed and translated into protein. In the case of
prokaryotic genomes, most of the DNA does in fact code for polypeptide.
Therefore, the search program is being used to optimize the best fit
possible of a CID spectrum against a database consisting mostly of codon
assignments that are out-of-frame. Or, alternatively, if non-coding DNA is
present, the potential exists for codon assignments due to the purely
theoretical translation of a stretch of DNA that is not translated as
protein in nature, e.g., regulatory sequences or other intergenic
sequences. The consequence is that much of the voluminous output from the
six frame database search was not useful, and a way had to be found to
filter out efficiently much of the output, ideally leaving only the matches
within a bona fide, biologically relevant ORF. This problem is inherent in
the use of any protein database searching software in the context of raw
genomic data and is not unique to SEQUEST.
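The filtering problem described above can be sketched as a two-step screen: keep a six reading frame hit only if its scores pass a threshold and its peptide sequence also occurs within a putative ORF. The Python sketch below is an illustration under stated assumptions, not the procedure used in this work, which relied on BLAST and manual inspection. The `Hit` record, the function `filter_hits` and the default thresholds are hypothetical; the Xcorr value of 2.0 is the figure quoted in the text for OWL searches and may not be appropriate for a single small genome.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One top ranked search result for a CID spectrum (hypothetical record)."""
    spectrum: str   # identifier of the CID spectrum
    peptide: str    # peptide sequence assigned by the search program
    xcorr: float    # cross correlation score
    dcn: float      # separation from the second ranked match


def filter_hits(hits, orf_proteins, min_xcorr=2.0, min_dcn=0.1):
    """Keep hits that score well and whose peptide lies inside a putative ORF."""
    kept = []
    for hit in hits:
        good_score = hit.xcorr > min_xcorr and hit.dcn > min_dcn
        in_orf = any(hit.peptide in protein for protein in orf_proteins)
        if good_score and in_orf:
            kept.append(hit)
    return kept


if __name__ == "__main__":
    # Toy data: only the first hit passes both the score and the ORF test.
    hits = [
        Hit("s1", "HNRGFLTPELK", 3.1, 0.25),
        Hit("s2", "AAAAAK", 3.0, 0.30),
        Hit("s3", "DLLFK", 1.2, 0.02),
    ]
    orfs = ["MSTHNRGFLTPELKAA", "MQDLLFKR"]
    print([h.spectrum for h in filter_hits(hits, orfs)])
```

A screen of this kind only reduces the volume of output; borderline matches still call for manual inspection of the CID spectrum.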
The conventional wisdom in the genomics community seems to be that only
the ORF database search should be necessary for the purposes described in
this paper.17 However, we felt that such a
comparison should be done at least once for P. gingivalis as an
important part of validating the method. Based on prior experience with
ORFIND in the context of its intended application, as a tool for purely
genetic applications, we believe that the risk of false negatives using the
ORF database is not great. However, the previous statement assumes that in
fact a reasonable quality CID spectrum with at least one higher S/N y or b
ion series18 can be acquired, which is not
always the case. No useful information (see Table
1) was found in the much larger output from the six reading
frame search that was not present in the ORF search. The ORFIND database
was fairly ‘liberal’ in that it was biased towards listing all
plausible ORFs. This bias was preferred for our application, which relies
on having experimental data from direct measurements of expressed protein.
Hence the ORFIND program seems to fulfil our need for a way to access
unfinished or unannotated genome data, for purposes of locating genes for
proteins observed experimentally using tandem mass spectrometry. Another
way of looking at the putative ORF database is as a substitute for an
annotated database corresponding to a single organism version of a fully
annotated general protein database. This is very similar in concept to an
established method of shortening the time required for computations by
abstracting single species protein databases from much larger general
protein databases.5 Even with the ORF
database, in borderline cases with noisy data and in the presence of
suspected DNA sequencing errors, it is still necessary to interpret data
manually. Such manual interpretation presumes at least an empirical
knowledge of the underlying data structure,16,19,20 i.e., peptide fragmentation behavior
under low energy CID conditions.
To complete the circle from genome to experimentally observed protein
back to the genome, we employed TBLASTN,21
running on our local host (see Fig. 1 and
2). This last step was necessary to utilize
fully the partial amino acid sequences that were found in both the raw and
ORF databases as probes of the protein’s origin in the PG genome. The
observed amino acid sequence, after giving a match in both databases, was
compared by the TBLASTN program against its own six reading frame
translation of the PG DNA, and the match was assigned to a specific
location in the PG genome, as shown in
Table 1. This step provided both a
location in the original genome database and a confirmation of the reading
frame identified earlier in the process by ORFIND. Once the location,
reading frame and the SEQUEST generated peptide sequence were validated by
comparison with the in-frame partial DNA sequence, homology searches of the
genes and (or) the putative protein products were carried out over the
internet. In the absence of a suitable fully annotated ‘positive
control genome’, the validation of the method at present lies
ultimately with its ability to identify genes that express protein under a
given set of experimental conditions. The peptides shown in Table 1 served as the key to locate real ORFs
relevant to the interaction of PG and gingival epithelial cells. Two
examples taken from Table 1 are briefly
summarized below.
Peptide HNRGFLTPELK (see Fig. 3). A BLAST search demonstrated homology to polysaccharide and
lipopolysaccharide biosynthesis proteins. This suggests that P.
gingivalis may alter either the composition or amount of surface
carbohydrates when in an epithelial cell environment. Modification of
surface carbohydrates may be required to facilitate interaction of cell
surface proteins with cognate receptors on the host cell. Previous studies
have shown that polysaccharide capsular material can interfere with
attachment to, and invasion of, epithelial cells.22 Furthermore, digestion of P. gingivalis
with amylglucosidase, that partially degrades capsule, increases the
invasion efficiency for endothelial cells.23 Modification of LPS could affect recognition of
the organism by epithelial cells.24
Peptide DLLFK. A BLAST search with the translated ORF containing this short sequence
revealed homology to phosphoserine phosphatase enzymes. This molecule thus
has the potential to subvert host cell signaling pathways, many of which
are dependent on a tightly controlled series of phosphorylation and
dephosphorylation events. Similarly, Yersinia spp. and
Salmonella typhimurium produce phosphatases that are delivered
into the host cell by the type III system and mediate cytoskeletal
rearrangements.25 An additional advantage
to our procedure is that short sequences of four or five amino acid
residues can be used to locate an ORF quickly, as described in the
Experimental section, even though they often cannot be used effectively by
themselves with the BLAST programs.
Conclusions
We expect the incorporation of whole genome data into our own protein
methods for PG and other bacterial pathogens to evolve in the direction of
(1) eliminating the six reading frame search entirely and relying on the
putative ORF database alone, and eventually the fully annotated genomes as
they develop, (2) streamlining the search for host cell proteins by using
smaller genomic databases, e.g., full length cDNA libraries, that
avoid the problems associated with non-coding DNA in the genomes of higher
organisms, and (3) incorporating parallelism either at the hard disk access
or processor (CPU) level26 to speed the
searches, particularly for human proteins and those from other higher
organisms serving as sources for model target cells.
Acknowledgements
The authors thank Dr A. Kaas for PERL scripts that translate DNA to
protein and Dr D. O. V. Alonso for assistance with PERL and his comments.
Dr Maynard Olson and Mr Jimmy Eng also provided helpful comments and
criticism. The PG database was provided through a pre-publication license
from TIGR. We thank Kerry Nugent and Michrom BioResources for the HPLC
packing materials. This work was funded under NIH grants NIDCR DE11111 and
DE13061.
References
1. M. P. Washburn and J. R. Yates, III, Curr. Opin. Microbiol., 2000, 3, 292.
2. J. R. Yates, III, Trends Genet., 2000, 16, 5.
3. A. J. Bleasby, D. Akrigg and T. K. Attwood, Nucleic Acids Res., 1994, 22, 3574.
4. A. Bairoch and R. Apweiler, Nucleic Acids Res., 1997, 25, 31.
5. J. K. Eng, A. L. McCormack and J. R. Yates, III, J. Am. Soc. Mass. Spectrom., 1994, 5, 976.
6. J. R. Yates, III, J. K. Eng, A. L. McCormack and D. Schietz, Anal. Chem., 1995, 67, 1426.
7. C. M. Belton, K. T. Izutsu, P. C. Goodwin, Y. Park and R. J. Lamont, Cell. Microbiol., 1999, 1, 215.
8. C. J. Hueck, Microbiol. Mol. Biol. Rev., 1998, 62, 379.
9. Y. Park and R. J. Lamont, Infect. Immun., 1998, 66, 4777.
10. A. Shevchenko, M. Wilm, O. Vorm and M. Mann, Anal. Chem., 1996, 68, 850.
11. A. Ducret, I. V. Oostveen, J. K. Eng, J. R. Yates, III and R. Aebersold, Protein Sci., 1998, 7, 706.
12. H. Wang, K. B. Lim, R. F. Lawrence, W. N. Howald, J. A. Taylor, L. H. Ericsson, K. A. Walsh and M. Hackett, Anal. Biochem., 1997, 250, 162.
13. H. Wang and M. Hackett, Anal. Chem., 1998, 70, 205.
14. R. T. Kennedy and J. W. Jorgenson, Anal. Chem., 1989, 61, 1128.
15. M. A. Moseley, L. J. Deterding, K. B. Tomer and J. W. Jorgenson, Anal. Chem., 1991, 63, 1467.
16. D. F. Hunt, J. R. Yates, III, J. Shabanowitz, S. Winston and C. R. Hauer, Proc. Natl. Acad. Sci. USA, 1986, 83, 6233.
17. A. Kass, personal communication.
18. K. Biemann, Annu. Rev. Biochem., 1992, 61, 977.
19. D. F. Hunt, J. E. Alexander, A. L. McCormack, P. A. Martino, H. Michel, J. Shabanowitz, N. Sherman, M. A. Moseley, J. W. Jorgenson, L. J. Deterding and K. B. Tomer, in Techniques in Protein Chemistry II, ed. J. J. Villafranca, Academic Press, New York, 1991, p. 441.
20. I. A. Papayannopoulos, Mass Spectrom. Rev., 1995, 14, 49.
21. S. F. Altschul, T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang, W. Miller and D. J. Lipman, Nucleic Acids Res., 1997, 25, 3389.
22. J. W. St. Geme and S. Falkow, Infect. Immun., 1991, 59, 1325.
23. R. G. Deshpande, M. Khan and C. A. Genco, Invasion Metastasis, 1998, 18, 57.
24. R. P. Darveau, A. Tanner and R. C. Page, Periodontol. 2000, 1997, 14, 12.
25. I. DeVinney, I. Steele-Mortimer and B. B. Finlay, Trends Microbiol., 2000, 8, 29.
26. D. Tabb, J. Eng and J. R. Yates, III, in Proteome Research: Mass Spectrometry, ed. P. James, Springer, Berlin, 2000, p. 125.
This journal is © The Royal Society of Chemistry 2001