The functional capacity of the natural amino acids for molecular recognition

Sara Birtalan, Robert D. Fisher and Sachdev S. Sidhu§*
Department of Protein Engineering, Genentech Inc., 1 DNA Way, South San Francisco, CA 94080, USA. E-mail: sachdev.sidhu@utoronto.ca

Received 19th January 2010, Accepted 26th March 2010

First published on 9th April 2010


Abstract

We tested the functional capacity of the natural amino acids for molecular recognition in a minimalist background of binary Tyr/Ser diversity. In phage-displayed synthetic antibody libraries, we replaced either Tyr or Ser with other residues. We find that Tyr is optimal for mediating contacts that contribute favourably to both affinity and specificity, but it can be replaced by Trp, which contributes favourably to affinity but is detrimental to specificity. Arg exhibited only a limited capacity for mediating molecular recognition: it was less effective than either Tyr or Trp and, moreover, was the major contributor to non-specific interactions. Nine other residue types (Phe, Leu, Ile, Asn, Thr, Pro, Cys, Ala, and Gly) were found to be ineffective as replacements for Tyr. By replacing Ser with Gly or Ala, we found that Gly is as effective as Ser for providing conformational flexibility that allows bulky Tyr residues to achieve optimal binding contacts, while Ala is less effective but still functional in this capacity. For some antigens, high affinity antibodies could be derived using only Tyr/Ser/Gly diversity, but for others, additional chemical diversity was required to achieve high affinity. Our results establish a minimal benchmark for the generation of synthetic antigen-binding sites with affinities comparable to those of natural antibodies. Moreover, our findings illuminate the fundamental principles underlying protein–protein interactions and provide valuable guidelines for engineering synthetic binding proteins with functions beyond the scope of natural proteins.


Introduction

Protein–protein interactions are involved in essentially all cellular processes,1 and thus, there is a great demand for affinity reagents that can target proteins, both for detection and for modulation of function.2 At present, monoclonal antibodies from immunized animals are the major class of affinity reagents. However, almost two decades ago, phage display technology enabled the development of synthetic antibodies with man-made binding sites,3,4 and in recent years, highly optimized synthetic antibody libraries have emerged as viable alternatives to natural repertoires.5 Synthetic approaches are particularly attractive to protein engineers, because they offer the opportunity to precisely control repertoire designs. Synthetic libraries can be designed for improved performance and also to investigate the basic principles underlying molecular recognition.6,7 Thus, aside from providing access to affinity reagents for practical applications, synthetic antibody repertoires serve as a powerful experimental system for exploring these principles.

We have previously used synthetic antibody libraries to define the minimal requirements for molecular recognition.8 We arrived at a simple but functional repertoire built on a single antigen-binding fragment (Fab) framework with diversity restricted to four of the six complementarity-determining region (CDR) loops and only two amino acids (Tyr and Ser).9 The structures of several minimalist antibodies revealed that bulky tyrosines act mainly as contact residues that mediate interactions with antigens, while small serines act mainly as conformation residues that help to shape the CDR loops for antigen recognition.9–12 Herein, we take advantage of this simplified background to compare the capacity of different amino acids to act as either contact or conformation residues by assessing their ability to replace either Tyr or Ser in minimalist antigen-binding sites.

Many studies aimed at understanding the role of amino acid diversity in molecular recognition have relied on inference from in silico analysis of structural databases.13–20 In contrast, our study provides direct, empirical assessment of the functional capacity of natural amino acids for contributing to the affinity and specificity of molecular recognition.

Results

Experimental strategy

We have previously generated functional antibodies from phage-displayed libraries built on a single Fab framework by introducing binary Tyr/Ser diversity into the three heavy-chain CDRs (CDR-H1, -H2 and -H3) and the third light-chain CDR (CDR-L3) (Fig. 1).9 Because CDR-H3 is usually the most important component of antigen-binding sites,9,18,21 we allowed length diversity in this loop. In the other three CDRs, we fixed the length and diversified only those positions that are typically solvent exposed (i.e. paratope positions that may contact antigen). In this background we constructed synthetic repertoires to test the ability of different amino acids to replace either Tyr or Ser, either in the CDR-H3 loop only or in all four randomized CDR loops.
Fig. 1 Design of libraries. The main chains of the humanized 4D5 heavy- and light-chain variable domains are coloured grey and blue, respectively. The CDR positions that were diversified are shown as coloured spheres as follows: CDR-L3, purple; CDR-H1, yellow; CDR-H2, orange; CDR-H3, red or grey. The grey positions were diversified as follows: 100b, Ala/Gly; 100c, Phe/Ile/Leu/Met. The other coloured positions were randomized with various binary combinations, as described in the main text. In CDR-H3, the red positions were replaced by all loop lengths between 6 and 17 residues in all repertoires except repertoire H3-YSGX in which all loop lengths between 4 and 17 residues were used. In the heavy-chain of repertoire H3-YSGX, position 28 was not diversified and additional positions were diversified as follows: 29, Phe/Ile/Val; 34, Ile/Met; 52a, Pro/Ser; 55, Gly/Ser. In CDR-L3 of repertoire H3-YSGX, positions 91–94 were replaced by four, five or six codons encoding Tyr/Ser and additional positions were diversified as follows: 95, Ile/Pro; 96, Phe/Ile/Val. Positions are numbered according to the nomenclature of Kabat et al.38 The figure was generated from crystal structure coordinates (PDB entry 1FVC) using the computer program PYMOL (http://pymol.sourceforge.net).

Each of these repertoires was cycled through rounds of binding selection against four human antigens: vascular endothelial growth factor (VEGF), epidermal growth factor receptor 2 (HER2), insulin, and insulin-like growth factor-1 (IGF-1). We also selected for binding to protein A, which recognizes the heavy-chain variable domain22,23 and can be used to select for correctly folded protein.11,24 We assembled large panels of unique binding clones for statistical analysis to determine which amino acids are best suited to enable antigen recognition with high specificity.

Contact residues

We first assessed the ability of different amino acids to function as contact residues in CDR-H3 loops of antigen-binding sites containing Tyr/Ser diversity in the other CDRs. Ser is encoded by six codons, and consequently, we were able to use standard DNA synthesis to design 12 different CDR-H3 libraries combining Ser with one of 12 other amino acids (Tyr, Trp, Arg, Phe, Leu, Ile, Asn, Thr, Pro, Cys, Ala or Gly). The 12 libraries were pooled to form repertoire “H3-SX”. After selection against the four antigens, we obtained the sequences of 174 unique binding clones (Fig. S1, ESI) and binned these on the basis of CDR-H3 content. Aside from a single anti-IGF-1 CDR-H3 sequence that contained Pro, the CDR-H3 sequences contained Ser combined with Tyr, Trp, Arg or Phe (Fig. 2A), indicating that these residues are preferred for mediating antigen recognition when combined with Ser.
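The link between the six Ser codons and the 12 partner residues can be illustrated with a short sketch. The codon designs are not given here, so the code below assumes that each binary position is encoded by a degenerate codon in which exactly one nucleotide position is a two-base mixture, and it uses the standard genetic code. The function name `binary_partners` is hypothetical. Under this assumption, the residues reachable from any Ser codon by a single base change are the same 12 residues listed above.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G (third position varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def binary_partners(target):
    """Return {partner: [(target_codon, partner_codon), ...]} for codon pairs
    that differ at exactly one base (assumed encoding of a binary position)."""
    partners = {}
    for codon, aa in CODON_TABLE.items():
        if aa != target:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                neighbour = codon[:pos] + base + codon[pos + 1:]
                other = CODON_TABLE[neighbour]
                # Skip synonymous changes and stop codons.
                if other in (target, "*"):
                    continue
                partners.setdefault(other, []).append((codon, neighbour))
    return partners


ser_partners = binary_partners("S")
print(sorted(ser_partners))
# ['A', 'C', 'F', 'G', 'I', 'L', 'N', 'P', 'R', 'T', 'W', 'Y']
```

In one-letter code, the result corresponds to Ala, Cys, Phe, Gly, Ile, Leu, Asn, Pro, Arg, Thr, Trp and Tyr. Some pairs, such as TCG/TGG for Ser/Trp, are reachable from only one Ser codon.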
Fig. 2 Antibodies from repertoires derived by combining Ser with different amino acids. (A) Chemical composition of antigen-binding CDR-H3 loops from repertoire H3-SX, in which the CDR-H3 loops contained Ser combined with one of 12 different amino acids (Tyr, Trp, Arg, Phe, Leu, Ile, Asn, Thr, Pro, Cys, Ala or Gly), and CDR-H1, -H2 and -L3 loops contained Ser combined with Tyr. A total of 174 unique clones were analyzed (Fig. S1, ESI). (B) Chemical composition of antigen-binding sites from repertoire All-SX, in which all four randomized CDR loops contained Ser combined with either Tyr, Trp, Phe or Arg. The following populations and numbers were analyzed: naïve (n = 125, white bars), protein A binding (n = 346, grey bars), antigen binding (n = 105, black bars, Fig. S2, ESI). Statistically significant deviations (an unadjusted p value < 0.05) from the naïve populations are indicated with an asterisk (*). (C) Relationship between nonspecific binding and the chemical composition of antigen-binding sites. Groups of antibodies with different chemical composition (x-axis) in CDR-H3 (white bars) or the entire antigen-binding site (black bars) were assayed for mean nonspecific binding (y-axis) using a phage ELISA to measure nonspecific binding to a panel of noncognate antigens. The number above each bar indicates the number of clones in the group.

To assess the capacity of different amino acids to function as contact residues across the entire antigen-binding site, we next designed four libraries in which all four randomized CDR loops were diversified with a binary combination of Ser and one of Tyr, Trp, Arg or Phe. The resulting repertoire “All-SX” yielded 105 unique antigen-binding clones after selection against the four antigens. The Phe library yielded only a single clone, which targeted IGF-1 (Fig. S2, ESI). The Phe library was also depleted amongst the protein A-selected clones relative to the naïve repertoire (Fig. 2B), suggesting that high densities of solvent-exposed Phe residues compromise the structural integrity of the Fab protein. The Arg library was also poorly represented, yielding only eight clones, all raised against a single antigen, insulin. In contrast, the Tyr and Trp libraries were well represented by numerous clones raised against four or three antigens, respectively.

To assess specificity, we used a phage ELISA to measure binding of the antigen-binding clones against a panel of proteins and calculated a mean nonspecific binding signal for each clone by averaging its ELISA signals against the non-cognate antigens (Fig. S1 and S2, ESI).25 To determine the level of nonspecific binding associated with each amino acid, we binned the sequences into groups based on the amino acid type combined with Ser and, for each group, quantified nonspecific binding as the mean of the mean nonspecific binding signals for all clones in the group (Fig. 2C). From the H3-SX repertoire, clones containing Tyr in CDR-H3 exhibited the lowest nonspecific binding signals and those containing Trp or Phe were also relatively specific. In contrast, those containing Arg exhibited high nonspecific binding signals. From the All-SX repertoire, the clones containing Tyr and the single clone containing Phe in all four CDR loops were highly specific, while the clones containing either Arg or Trp exhibited high nonspecific binding signals. Taken together, these results show that binding surfaces containing high levels of Tyr are very specific. Surfaces containing Trp limited to CDR-H3 are also fairly specific, but a high Trp content across all four randomized CDR loops causes substantial nonspecific binding. In contrast, surfaces containing Arg are very nonspecific, as this residue causes nonspecific binding whether it is present in CDR-H3 only or in all four randomized CDR loops.
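The two-level averaging described above (a mean per clone, then a mean of those means per group) can be sketched as follows. The clone records, field names and signal values are hypothetical placeholders for illustration only; they are not data from this study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: each clone has the residue type combined with Ser and
# its ELISA signals against a panel of non-cognate antigens (made-up values).
clones = [
    {"name": "clone-1", "partner": "Tyr", "noncognate_signals": [0.1, 0.1, 0.1]},
    {"name": "clone-2", "partner": "Tyr", "noncognate_signals": [0.2, 0.4]},
    {"name": "clone-3", "partner": "Arg", "noncognate_signals": [1.0, 2.0]},
]


def group_nonspecific_binding(records):
    """Mean of per-clone mean nonspecific signals, grouped by partner residue."""
    per_group = defaultdict(list)
    for record in records:
        # Step 1: one mean nonspecific binding signal per clone.
        per_group[record["partner"]].append(mean(record["noncognate_signals"]))
    # Step 2: average the per-clone means within each group.
    return {group: mean(values) for group, values in per_group.items()}


summary = group_nonspecific_binding(clones)
for group, value in summary.items():
    print(f"{group}: {value:.2f}")
```

Averaging per clone first gives every clone equal weight in its group, regardless of how many non-cognate antigens it was tested against.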

Conformation residues

We next assessed the ability of the small residues Ala and Gly to substitute for Ser as conformation residues in antigen-binding sites. We constructed three libraries in each of which CDR-H3 loops contained binary combinations of Tyr with Ser, Ala or Gly and the other randomized CDR loops contained Tyr/Ser diversity (repertoire “H3-YX”). Following selection for binding to protein A, all three libraries were well represented, indicating that Ser, Ala and Gly are all well tolerated in the CDR-H3 loop (Fig. 3A). Following selection for binding to antigens, we sequenced 224 unique clones (Fig. S3, ESI) and found that all three libraries yielded binding clones against all four antigens, although the CDR-H3 sequences that contained Ser or Gly were more abundant than those that contained Ala (Fig. 3A). Thus, we conclude that Ser, Ala or Gly can combine with Tyr to generate CDR-H3 loops that can effectively recognize diverse antigens, but Ser and Gly are more effective than Ala.
Fig. 3 Antibodies from repertoires derived by combining Tyr with different amino acids. (A) Chemical composition of CDR-H3 loops from repertoire H3-YX, in which CDR-H3 loops contained Tyr combined with either Ser, Gly or Ala, and CDR-H1, -H2 and -L3 loops contained Tyr combined with Ser. The following populations and numbers were analyzed: naïve (n = 50, white bars), protein A binding (n = 151, grey bars), antigen binding (n = 197, black bars, Fig. S3, ESI). Statistically significant deviations (an unadjusted p value < 0.05) from the naïve populations are indicated with an asterisk (*). (B) Chemical composition of antigen-binding sites from repertoire All-YX, in which all four randomized CDR loops contained Tyr combined with either Ser, Gly or Ala. The following populations and numbers were analyzed: naïve (n = 47, white bars), protein A binding (n = 123, grey bars), antigen binding (n = 178, black bars, Fig. S4, ESI). (C) Relationship between nonspecific binding and the chemical composition of antigen-binding sites. Groups of antibodies with different chemical composition (x-axis) in CDR-H3 (white bars) or the entire antigen-binding site (black bars) were assayed for mean nonspecific binding (y-axis) using a phage ELISA to measure nonspecific binding to a panel of noncognate antigens. The number above each bar indicates the number of clones in the group.

When all four randomized CDR loops were constructed by combining Tyr with Ser, Ala or Gly (repertoire “All-YX”), selection for binding to protein A resulted in the depletion or enrichment of CDR sequences that contained Ala or Gly, respectively, while the abundance of CDR sequences that contained Ser was not changed (Fig. 3B). These results suggest that high densities of Ala in the antigen-binding site tend to destabilize the antibody fold relative to antigen-binding sites containing Ser residues. High densities of Gly may stabilize the antigen-binding site relative to Ser, but it is also possible that the enrichment for sequences that contain Gly may be due to higher nonspecific binding of these clones (see below). Following selection for binding to antigens, the sequences of 179 binding clones revealed that the Gly library generated binders against all four antigens, while the Ala and Ser libraries each generated binders against three antigens (Fig. S4, ESI). However, the Ala library only generated a total of six binding clones, as most of the binding clones were from the Gly or Ser libraries (Fig. 3B). Taken together, these results show that Gly is as effective as, or perhaps even more effective than, Ser for generating functional antigen-binding sites in combination with Tyr. Antigen-binding sites that contain high densities of Ala residues appear to be compromised for stability, and Ala residues are not as effective as Ser or Gly residues for mediating antigen recognition.

For specificity, the addition of Ala to the CDR-H3 loops appears to be detrimental in comparison with Ser (Fig. 3C). Surprisingly, the clones that contain Ala in all four CDR loops appear to be more specific than those with Ala in CDR-H3 only, but this may be due to the limited amount of data since we only isolated six clones of this type. Adding Gly to the CDR-H3 loops increases nonspecific binding, and in this case, the clones that contain Gly in all four CDR loops exhibit even higher nonspecific binding. These results show that Ser-rich binding sites are more specific than Gly-rich binding sites and are at least as specific as Ala-rich binding sites.

Characterization of anti-HER2 Fabs

Our results show that Tyr and Trp are effective as contact residues and Ser, Gly and Ala are effective as conformation residues, but Tyr and Ser appear to be best for specificity. Against the antigen HER2, we were successful in generating binders from all the various libraries using these amino acids and we compiled a panel containing a total of 146 anti-HER2 Fabs. This panel was screened for affinity using a rapid, single-point phage ELISA, and subsequently, detailed competitive phage ELISAs of the best clones revealed that many of the Fabs recognize HER2 with affinities in the low to sub-nanomolar range (Fig. S1–S4, ESI). We identified seven high affinity Fabs representing the seven different libraries containing combinations of these five amino acids (Tyr/Ser, Trp/Tyr/Ser, Trp/Ser, Tyr/Ser/Ala, Tyr/Ala, Tyr/Ser/Gly and Tyr/Gly). Interestingly, all seven Fabs use an identical CDR-H3 length (Fig. 4A), even though substantial length diversity was encoded in this region, and this suggests that the Fabs all recognize a similar epitope (see below).
Fig. 4 Characterization of anti-HER2 Fabs. (A) Sequences of the heavy chain CDR loops. Residues in grey are at positions that were not diversified in the libraries. Residues at diversified positions are coloured as follows: Tyr (yellow), Ser (red), Gly (green), Ala (blue), Trp (orange). (B) Kinetic analysis by SPR for Fabs binding to immobilized HER2. (C) Epitope mapping. HER2 was captured with the indicated immobilized antibody (x-axis) and, subsequently, phage-displayed Fab was added and simultaneous binding of the two antibodies was detected by phage ELISA (y-axis). The following Fabs were analyzed: Fab-H-YS (white bars), Fab-H-WYS (grey bars), Fab-H-WS (black bars). (D) Flow cytometric analysis of Fabs binding to NR6 cells (grey trace) or stably transfected NR6 cells expressing HER2 (black trace).

We analyzed in further detail those Fabs containing Ser combined with Tyr and/or Trp. In total, we had a panel of 43 antibodies of this type and we used competitive phage ELISAs to determine affinities for the entire panel (Fig. S1–S4, ESI). On average, the 12 clones that contained Trp/Ser exhibited the tightest affinities (mean IC50 = 1.1 nM), the 18 clones that contained Trp/Tyr/Ser exhibited intermediate affinities (mean IC50 = 4.4 nM) and the 13 clones that contained Tyr/Ser exhibited the lowest affinities (mean IC50 = 10.4 nM). However, the average non-specific binding activity for the group that contained Trp/Ser was significantly higher than for the other two groups.

We purified the highest affinity Fabs with antigen-binding sites that were derived from diversity restricted to only Tyr/Ser (Fab-H-YS), Trp/Tyr/Ser (Fab-H-WYS) or Trp/Ser (Fab-H-WS) (Fig. 4A). Kinetic analysis of the purified Fabs by surface plasmon resonance (SPR) agreed with the results of the competitive phage ELISA, showing that all three Fabs bind tightly to HER2 and Fab-H-WS exhibits the highest affinity (Fig. 4B). We also conducted epitope mapping experiments to compare the three Fabs to each other and to 4D5 and 2C4, two anti-HER2 antibodies with non-overlapping epitopes.25–27 A phage ELISA was used to test whether HER2 captured with each immobilized antibody could bind simultaneously to other phage-displayed Fabs. Fab-H-YS, Fab-H-WYS and Fab-H-WS cannot bind simultaneously with one another, but each can bind to HER2 captured by either 4D5 or 2C4, suggesting that the three Fabs recognize a common epitope that is distinct from the epitopes recognized by 4D5 and 2C4 (Fig. 4C). All three Fabs also recognize antigen on cells, as evidenced by flow cytometric analyses showing that the Fabs specifically label cells expressing cell-surface HER2 (Fig. 4D).

Effects of adding chemical diversity to a Tyr/Ser/Gly background

Taken together, our results show that antigen-binding sites that are rich in Tyr and Ser residues are highly specific and functional, but the addition of Gly residues can improve function. Thus, we next designed a repertoire to test whether other amino acid types improve the performance of a library constructed with Tyr/Ser/Gly diversity. We constructed a series of libraries in which paratope positions in CDR-H1, -H2 and -L3 were restricted to Tyr/Ser diversity and certain non-paratope positions were also diversified in a restricted manner, as described previously.11 In CDR-H3, all of the libraries contained Tyr/Ser/Gly diversity but each library contained one additional amino acid type (Ala, Asp, Trp, Leu, His, Asn, Thr, Phe, Lys, or Pro). The ten libraries were combined, the phage pool (repertoire “H3-YSGX”) was cycled through rounds of selections against protein A and each of the four antigens described above, and the distribution of binding clones was analyzed (Fig. 5).
Fig. 5 Chemical composition of CDR-H3 loops. Clones from repertoire H3-YSGX were selected for binding to the indicated antigens and were binned into groups on the basis of amino acid types found in addition to Tyr/Ser/Gly in the CDR-H3 loop. The dash (-) indicates CDR-H3 loops that contain only Tyr/Ser/Gly residues. The number above each set of bars indicates the total number of binding clones for each antigen.

All of the libraries are well represented following selection for binding to protein A, indicating that, as expected, all of the different amino acid types are well tolerated in CDR-H3. There appears to be a slight bias in favour of the positively charged Lys/His residues, but this may be due to favourable electrostatic interactions with protein A, which has a net negative surface charge (data not shown). Notably, there is also a high abundance of CDR-H3 sequences composed solely of Tyr/Ser/Gly residues, and these arise from a fraction of each library that does not contain additional chemical diversity.

For clones selected for binding to insulin or IGF-1, there is a strong bias in favour of CDR-H3 sequences that contain Lys residues, and this is likely due to the fact that these antigens have overall negatively charged surfaces (data not shown). We used competitive phage ELISAs to survey affinities and found that most of the clones recognize antigen with low affinity. In the case of IGF-1, only eight of 94 clones exhibited greater than 50% inhibition of binding to immobilized antigen in the presence of 100 nM solution-phase antigen (Fig. S5A, ESI). The CDR-H3 sequences of these eight clones all contained additional chemical diversity beyond the basic Tyr/Ser/Gly background, and detailed analysis by competitive phage ELISA revealed that the two clones exhibiting the lowest IC50 values contain Lys residues in their CDR-H3 sequences (Fig. 6A). In the case of insulin, 12 of 54 clones exhibited greater than 25% inhibition of binding in the presence of 100 nM solution-phase antigen but none exhibited greater than 50% inhibition (Fig. S5B, ESI), and thus, we could not determine accurate IC50 values for these low affinity interactions. However, seven of the 12 clones that exhibited greater than 25% inhibition contain Lys residues in their CDR-H3 sequences (Fig. 6B).
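The single-point screen described above can be expressed as a percent-inhibition calculation. The sketch below also shows how one inhibition value maps to a rough IC50 under a simple one-site competition model. That model, the function names and the example numbers are assumptions made for illustration; they are not the analysis used in this study, and a single-point estimate is no substitute for a full titration.

```python
def percent_inhibition(signal_no_competitor, signal_with_competitor):
    """Percent loss of ELISA signal when solution-phase antigen is present."""
    if signal_no_competitor <= 0:
        raise ValueError("signal without competitor must be positive")
    return 100.0 * (1.0 - signal_with_competitor / signal_no_competitor)


def estimate_ic50_nM(inhibition_percent, competitor_nM=100.0):
    """Rough IC50 from one data point.

    Assumes fraction inhibited = c / (c + IC50), i.e. simple one-site
    competition. This is an illustrative assumption only.
    """
    fraction = inhibition_percent / 100.0
    if not 0.0 < fraction < 1.0:
        raise ValueError("inhibition must lie strictly between 0 and 100%")
    return competitor_nM * (1.0 - fraction) / fraction


# Made-up signals: 50% inhibition at 100 nM corresponds to IC50 = 100 nM,
# and 25% inhibition corresponds to roughly 300 nM under this model.
print(percent_inhibition(1.0, 0.5))
print(estimate_ic50_nM(50.0))
print(estimate_ic50_nM(25.0))
```

Under this model, the 50% threshold at 100 nM solution-phase antigen separates clones with IC50 values below 100 nM from those above it. This is consistent with treating clones that show only 25 to 50% inhibition as low affinity binders.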


Fig. 6 Sequences and affinities of antigen-binding Fabs. Heavy-chain CDR sequences are shown for the highest affinity Fabs selected from repertoire H3-YSGX for binding to (A) IGF-1, (B) insulin, (C) HER2 or (D) VEGF (Fig. S5, ESI). Residues in grey are at positions that were not diversified in the libraries. Residues at diversified positions are coloured as follows: Tyr (yellow), Ser (red), Gly (green), additional diversity in CDR-H3 (purple). Binding parameters (kon, koff, Kd) were determined from kinetic analysis of Fabs binding to immobilized antigen by SPR. IC50 values were determined by competitive phage ELISA.

For clones selected for binding to HER2, there is a bias in favour of clones that contain only Tyr/Ser/Gly in their CDR-H3 sequences (Fig. 5), suggesting that additional chemical diversity does not significantly improve binding to this antigen. In this case, we found numerous clones that exhibited almost complete inhibition of binding in the presence of 100 nM solution-phase antigen (Fig. S5C, ESI), and detailed analysis of 16 clones revealed IC50 values in the low to sub-nanomolar range (Fig. 6C). We purified five of these anti-HER2 Fabs and analysis of binding kinetics by SPR confirmed high affinity binding in the sub-nanomolar range. Two of the four highest affinity Fabs (H-23 and H-35) contain only Tyr/Ser/Gly in their CDR-H3 sequences, thus confirming that high affinity binding to HER2 can be achieved with only this limited chemical diversity.
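For readers less familiar with SPR kinetics, the equilibrium dissociation constant follows from the rate constants as Kd = koff/kon. The sketch below applies this standard relation. The rate constants in the example are hypothetical round numbers chosen to land in the sub-nanomolar range; they are not measurements from this study, and the function name is illustrative.

```python
def dissociation_constant_nM(kon_per_M_per_s, koff_per_s):
    """Kd in nM from the association rate (M^-1 s^-1) and dissociation rate (s^-1)."""
    if kon_per_M_per_s <= 0:
        raise ValueError("kon must be positive")
    kd_molar = koff_per_s / kon_per_M_per_s
    return kd_molar * 1e9  # convert M to nM


# Hypothetical example: kon = 1e6 M^-1 s^-1 and koff = 5e-4 s^-1 give Kd of about 0.5 nM.
print(dissociation_constant_nM(1e6, 5e-4))
```

A slower off-rate or a faster on-rate lowers Kd, so two Fabs with the same Kd can still differ in their binding kinetics.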

For clones selected for binding to VEGF, no single population dominates, but there is a significant proportion that contains only Tyr/Ser/Gly in CDR-H3 and a slight bias in favour of CDR-H3 sequences that also contain His residues (Fig. 5), suggesting that high affinity binding to VEGF can be achieved with only Tyr/Ser/Gly diversity, but the addition of His residues may improve affinity. As in the case of HER2, many of the anti-VEGF clones exhibited substantial inhibition of binding in the presence of 100 nM solution-phase antigen (Fig. S5D, ESI), and the IC50 values of the 14 best clones were in the low to sub-nanomolar range (Fig. 6D). SPR analysis of four purified Fabs confirmed that a Fab that contains only Tyr/Ser/Gly (V-8) or a Fab that contains only Tyr (V-11) in the CDR-H3 loop is capable of binding to VEGF with affinity in the single-digit nanomolar range. However, we found that the highest affinity anti-VEGF Fab (V-38) contains two His residues in addition to Tyr residues in its CDR-H3 loop.

In summary, the effect of additional diversity beyond Tyr/Ser/Gly diversity in the CDR-H3 loop depends on the antigen. For the negatively charged antigens IGF-1 and insulin, the addition of positively charged Lys residues into the CDR-H3 loops produces Fabs that dominate in the binding selections and recognize antigen with higher affinities than Fabs that contain only Tyr/Ser/Gly residues in their CDR-H3 loops. In contrast, extremely high affinities can be achieved for binding to HER2 by using only Tyr/Ser/Gly diversity and additional chemical diversity has no appreciable effect on affinity. The case of VEGF appears to be intermediate between these two extremes, because Tyr/Ser/Gly diversity is sufficient to achieve affinities in the single-digit nanomolar range but the highest affinity Fab contains two His residues in its CDR-H3 loop.

Discussion

Within a minimalist Tyr/Ser background, we assessed the intrinsic capacity of the natural amino acids for molecular recognition. Amongst the twelve residue types tested for mediating binding contacts, Tyr is the clear winner when considering favourable contributions to both affinity and specificity. Trp emerges as a viable alternative to Tyr, especially when limited to CDR-H3, but high Trp content across a large region of the antibody tends to increase nonspecific binding. Although Arg can mediate contacts to some extent, the specificity of Arg-rich binding sites is severely compromised, and these results are consistent with our previous findings with another synthetic repertoire25 and with the finding that self-reactive natural antibodies typically contain Arg-rich CDR-H3 loops.28

Our results also confirm the favourable contributions of Ser residues acting as conformation residues, but in this case, we find that Gly is comparable to Ser for facilitating antigen recognition, albeit with somewhat reduced specificity. These results are also consistent with our previous findings that Gly residues located at key positions within synthetic antigen-binding sites can be crucial for high affinity antigen recognition.11,25 We have reduced the requirements for molecular recognition beyond even our previous minimal system of binary Tyr/Ser diversity9 and have achieved affinities in the single-digit nanomolar range using only Tyr side chains presented by combinations of Tyr/Gly main chains.

Overall, our results with synthetic antigen-binding sites are consistent with studies of natural protein–protein interactions, which have revealed that interfaces are enriched for Tyr, Trp and Arg residues that are often “hot spots” of binding energy.13–16,20,29,30 But our work goes further to provide an empirical assessment of the relative fitness of the various natural amino acids for mediating contacts in molecular recognition. Moreover, while previous studies focused on large residues that mediate intermolecular contacts, our results also highlight the importance of small residues that provide space and conformational flexibility for productive binding. Clearly, cooperation between large and small residues is critical for optimal molecular recognition, as demonstrated by the remarkably tight affinities achieved by our minimalist Fabs.

For the practical purposes of synthetic library design, our results show that Tyr/Ser/Gly diversity is sufficient for generating high affinity antibodies against some antigens, but additional chemical diversity in CDR-H3 is necessary for achieving high affinity for others. Overall, our findings suggest a library design that biases diversity in favour of Tyr/Ser/Gly residues but also adds small quantities of other amino acid types, and indeed, we have recently used this strategy to construct a highly functional synthetic antibody library.11 This library has provided numerous high affinity antibodies against diverse protein antigens and has enabled the development of exquisitely specific antibodies for targeting structured RNA,31 protein post-translational modifications,32 conformational epitopes33 and integral membrane proteins.34 Our new findings will enable further optimization of antibody design to produce synthetic repertoires with recognition capacities beyond the scope of natural repertoires.

Experimental

Library construction, selection and analysis

Phage display libraries were constructed by previously described methods,11,25,35 using oligonucleotide-directed mutagenesis to replace CDR positions with degenerate codons (Fig. 1). For repertoires H3-SX and All-SX, positions were mutagenized using oligonucleotides synthesized with standard methods and the following degenerate codons: Tyr/Ser (T[A/C]T), Trp/Ser (T[C/G]G), Arg/Ser ([A/C]GC), Phe/Ser (T[C/T]C), Leu/Ser (T[C/T]A), Ile/Ser (A[T/G]C), Asn/Ser (A[A/G]C), Thr/Ser (A[C/G]C), Pro/Ser ([C/T]CT), Cys/Ser (T[C/G]C), Ala/Ser ([G/T]CT), Gly/Ser ([A/G]GC). For repertoires H3-YX and All-YX, for positions that were diversified as Tyr/Gly or Tyr/Ala, mutagenic oligonucleotides were synthesized using equimolar mixes of Trimer Phosphoramidites encoding the two amino acids (Glen Research, Sterling, VA). For repertoire H3-YSGX, CDR-H1, -H2 and -L3 were diversified as described for “library D”.11 Ten separate libraries were constructed, in each of which the CDR-H3 loop was diversified using mutagenic oligonucleotides synthesized with Trimer Phosphoramidites to encode a 50 ∶ 15 ∶ 15 ∶ 20 ratio of Tyr ∶ Ser ∶ Gly ∶ X, where X is either Ala, Asp, Trp, Leu, His, Asn, Thr, Phe, Lys or Pro. Each library contained greater than 10¹⁰ unique members.
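Each degenerate codon listed above mixes two bases at one position so that exactly two amino acids are encoded. The Python sketch below expands each bracketed codon against the standard genetic code and checks that it yields the stated binary pair. It is an illustrative check, not part of the original protocol, and the names `expand` and `encodes_expected_pair` are hypothetical.

```python
import re
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_TO_ONE = {
    "Tyr": "Y", "Ser": "S", "Trp": "W", "Arg": "R", "Phe": "F",
    "Leu": "L", "Ile": "I", "Asn": "N", "Thr": "T", "Pro": "P",
    "Cys": "C", "Ala": "A", "Gly": "G",
}

# Degenerate codons exactly as listed in the text.
PAPER_CODONS = {
    "Tyr/Ser": "T[A/C]T", "Trp/Ser": "T[C/G]G", "Arg/Ser": "[A/C]GC",
    "Phe/Ser": "T[C/T]C", "Leu/Ser": "T[C/T]A", "Ile/Ser": "A[T/G]C",
    "Asn/Ser": "A[A/G]C", "Thr/Ser": "A[C/G]C", "Pro/Ser": "[C/T]CT",
    "Cys/Ser": "T[C/G]C", "Ala/Ser": "[G/T]CT", "Gly/Ser": "[A/G]GC",
}


def expand(degenerate):
    """Return the set of one-letter amino acids encoded by e.g. 'T[A/C]T'."""
    # Each position is either a bracketed choice of bases or a single base.
    positions = re.findall(r"\[([ACGT/]+)\]|([ACGT])", degenerate)
    choices = [grp.split("/") if grp else [base] for grp, base in positions]
    if len(choices) != 3:
        raise ValueError("expected three codon positions: " + degenerate)
    return {CODON_TABLE["".join(codon)] for codon in product(*choices)}


def encodes_expected_pair(label, degenerate):
    """Check that a degenerate codon encodes exactly the pair named in label."""
    expected = {THREE_TO_ONE[name] for name in label.split("/")}
    return expand(degenerate) == expected


if __name__ == "__main__":
    for label, codon in PAPER_CODONS.items():
        print(label, codon, sorted(expand(codon)), encodes_expected_pair(label, codon))
```

Because each codon varies at a single position between two bases, every diversified position contributes a factor of two to the theoretical diversity of a binary library.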

Phage from the antibody repertoires were cycled through three rounds of binding selection against antigen or protein A coated on 96-well Maxisorp Immunoplates (NUNC, Rochester, NY), as described.11,35 Clones that bound antigen in phage ELISAs were subjected to DNA sequence analysis.

Affinity assays

To estimate affinities of Fabs for antigen, a single-point competitive phage ELISA was performed, as described,25 to measure Fab-phage binding to immobilized antigen in the presence or absence of 100 nM solution-phase antigen. Clones that exhibited significant inhibition of phage binding in the presence of solution-phase antigen were subjected to a detailed competitive phage ELISA to determine IC50 values, as described,25 by measuring the binding of a subsaturating concentration of phage in the presence of serial dilutions of antigen. The binding affinities were estimated as IC50 values defined as the concentration of antigen that blocked 50% of the phage binding to the immobilized antigen. Kinetic measurements were carried out by SPR on a BIACORE 3000 instrument (General Electric Healthcare) using immobilized HER2 or VEGF and purified Fab proteins, as described.36
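The two phage ELISA readouts described above reduce to simple arithmetic: a fractional inhibition at a single competitor concentration, and an IC50 read from a dilution series. The Python sketch below shows one way to compute both, using log-linear interpolation between the two dilutions that bracket 50% binding. This is an assumption made for illustration; the fitting procedure actually used is described in ref. 25 and may differ. The function names and example values are hypothetical.

```python
import math


def fraction_inhibited(signal_with_antigen, signal_without_antigen):
    """Fractional loss of phage binding caused by solution-phase antigen."""
    return 1.0 - signal_with_antigen / signal_without_antigen


def ic50_by_interpolation(concentrations_nM, signals, signal_without_antigen):
    """Estimate the antigen concentration that blocks 50% of phage binding.

    Interpolates linearly in log10(concentration) between the two points
    that bracket 50% of the uninhibited signal. Concentrations must be > 0.
    Returns None if the series never crosses 50%.
    """
    fractions = [s / signal_without_antigen for s in signals]
    points = sorted(zip(concentrations_nM, fractions))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= 0.5 >= f_hi and f_lo != f_hi:
            t = (f_lo - 0.5) / (f_lo - f_hi)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + t * (log_hi - log_lo))
    return None


if __name__ == "__main__":
    # Illustrative values only (not data from the paper).
    print(fraction_inhibited(0.10, 1.00))
    print(ic50_by_interpolation([0.1, 1, 10, 100], [0.9, 0.75, 0.25, 0.05], 1.0))
```

A full four-parameter curve fit would use all points of the series; interpolation is shown here only because it needs no external libraries.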

Specificity assays

To assess specificity, a modified phage ELISA was used to detect binding to a panel of eight proteins, including the cognate antigen, as described.25 Phage were produced from individual clones grown in a 96-well format and the culture supernatants were diluted threefold in phosphate-buffered saline (PBS), 0.5% (w/v) bovine serum albumin (BSA; Sigma-Aldrich, St Louis, MO), 0.1% (v/v) Tween 20 (Sigma-Aldrich) (PBT buffer). The diluted phage supernatants were incubated for 2 h in 384-well Maxisorp Immunoplates (NUNC) coated with antigen (2 µg/ml). The plates were washed six times with PBS, 0.05% (v/v) Tween 20 (PT buffer) and incubated for 30 min with horseradish peroxidase/anti-M13 antibody conjugate (1 ∶ 5000 dilution in PBT buffer) (Pharmacia). The plates were washed six times with PT buffer and twice with PBS, developed for 15 min with 3,3′,5,5′-tetramethyl-benzidine/H2O2 peroxidase substrate (Kirkegaard & Perry Laboratories), quenched with 1.0 M H3PO4 and read spectrophotometrically at 450 nm. The readings were corrected for background binding by subtracting the signal from wells on the same plate containing approximately equivalent concentrations of M13-KO7 helper phage (New England Biolabs, Beverly, MA).
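The background correction described above is a per-plate subtraction of the helper-phage signal from each reading. The Python sketch below shows that step, followed by one possible summary of specificity: the ratio of the cognate signal to the strongest off-target signal. The ratio, the `floor` parameter, the function names and the example readings are all assumptions made for illustration and are not a metric defined in this work.

```python
def background_corrected(readings, helper_phage_signal):
    """Subtract the same-plate helper-phage signal from each A450 reading."""
    return {protein: a450 - helper_phage_signal for protein, a450 in readings.items()}


def specificity_ratio(corrected, cognate, floor=0.01):
    """Ratio of cognate signal to the strongest off-target signal.

    The floor (a hypothetical choice) prevents division by zero or by a
    negative value when off-target signals fall at or below background.
    """
    off_target = [value for protein, value in corrected.items() if protein != cognate]
    strongest = max(max(off_target), floor)
    return corrected[cognate] / strongest


if __name__ == "__main__":
    # Illustrative A450 readings only (not data from the paper).
    plate = {"HER2": 1.85, "BSA": 0.10, "VEGF": 0.12}
    corrected = background_corrected(plate, helper_phage_signal=0.05)
    print(corrected)
    print(specificity_ratio(corrected, cognate="HER2"))
```

A high ratio indicates binding concentrated on the cognate antigen, whereas a ratio near one indicates comparable binding to at least one other protein in the panel.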

Flow cytometry

NR6 cells or stably transfected cells expressing cell-surface HER2 (ref. 37) were harvested using cell dissociation solution (Sigma). Cells were washed and resuspended in PBS and incubated with Fab protein for 1 h. After washing with PBS, the cells were stained with goat anti-human IgG conjugated to Alexa Fluor® 488 (Molecular Probes). Cells were again washed thoroughly in PBS and cell-associated fluorescence was analyzed with a FACSCalibur flow cytometer (BD Biosciences).

References

  1. T. Pawson and P. Nash, Protein–protein interactions define specificity in signal transduction, Genes Dev., 2000, 14, 1027–1047.
  2. A. Bradbury, Antibodies in proteomics I: generating antibodies, Trends Biotechnol., 2003, 21, 275–281.
  3. C. F. Barbas, 3rd, Synthetic human antibodies, Nat. Med. (N. Y.), 1995, 1, 837–839.
  4. C. F. Barbas, 3rd, J. D. Bain, D. M. Hoekstra and R. A. Lerner, Semisynthetic combinatorial antibody libraries: a chemical solution to the diversity problem, Proc. Natl. Acad. Sci. U. S. A., 1992, 89, 4457–4461.
  5. S. S. Sidhu and F. A. Fellouse, Synthetic therapeutic antibodies, Nat. Chem. Biol., 2006, 2, 682–688.
  6. S. S. Sidhu and S. Koide, Phage display for engineering and analyzing protein interaction interfaces, Curr. Opin. Struct. Biol., 2007, 17, 481–487.
  7. S. S. Sidhu and A. A. Kossiakoff, Exploring and designing protein function with restricted diversity, Curr. Opin. Chem. Biol., 2007, 11, 347–354.
  8. S. Koide and S. S. Sidhu, The importance of being tyrosine: lessons in molecular recognition from minimalist synthetic binding proteins, ACS Chem. Biol., 2009, 4, 325–334.
  9. F. A. Fellouse, L. Bing, D. M. Compaan, A. A. Peden, S. G. Hymowitz and S. S. Sidhu, Molecular recognition by a binary code, J. Mol. Biol., 2005, 348, 1153–1162.
  10. F. A. Fellouse, P. A. Barthelemy, R. F. Kelley and S. S. Sidhu, Tyrosine plays a dominant functional role in the paratope of a synthetic antibody derived from a four amino acid code, J. Mol. Biol., 2006, 357, 100–114.
  11. F. A. Fellouse, K. Esaki, S. Birtalan, D. Raptis, V. J. Cancasci, A. Koide, P. Jhurani, M. Vasser, C. Wiesmann, A. A. Kossiakoff, S. Koide and S. S. Sidhu, High-throughput generation of synthetic antibodies from highly functional minimalist phage-displayed libraries, J. Mol. Biol., 2007, 373, 924–940.
  12. F. A. Fellouse, C. Wiesmann and S. S. Sidhu, Synthetic antibodies from a four amino acid code: a dominant role for tyrosine in antigen recognition, Proc. Natl. Acad. Sci. U. S. A., 2004, 101, 12467–12472.
  13. P. Chakrabarti and J. Janin, Dissecting protein–protein recognition sites, Proteins: Struct., Funct., Genet., 2002, 47, 334–343.
  14. S. Jones and J. M. Thornton, Principles of protein–protein interactions, Proc. Natl. Acad. Sci. U. S. A., 1996, 93, 13–20.
  15. S. Jones and J. M. Thornton, Analysis of protein–protein interaction sites using surface patches, J. Mol. Biol., 1997, 272, 121–132.
  16. L. Lo Conte, C. Chothia and J. Janin, The atomic structure of protein–protein recognition sites, J. Mol. Biol., 1999, 285, 2177–2198.
  17. I. S. Mian, A. R. Bradwell and A. J. Olson, Structure, function and properties of antibody binding sites, J. Mol. Biol., 1991, 217, 133–151.
  18. E. A. Padlan, Anatomy of the antibody molecule, Mol. Immunol., 1994, 31, 169–217.
  19. D. Reichmann, O. Rahat, M. Cohen, H. Neuvirth and G. Schreiber, The molecular architecture of protein–protein binding sites, Curr. Opin. Struct. Biol., 2007, 17, 67–76.
  20. H. O. Villar and L. M. Kauvar, Amino acid preferences at protein binding sites, FEBS Lett., 1994, 349, 125–130.
  21. J. L. Xu and M. M. Davis, Diversity in the CDR3 region of V(H) is sufficient for most antibody specificities, Immunity, 2000, 13, 37–45.
  22. M. Graille, E. A. Stura, A. L. Corper, B. J. Sutton, M. J. Taussig, J. B. Charbonnier and G. J. Silverman, Crystal structure of a Staphylococcus aureus protein A domain complexed with the Fab fragment of a human IgM antibody: structural basis for recognition of B-cell receptors and superantigen activity, Proc. Natl. Acad. Sci. U. S. A., 2000, 97, 5399–5404.
  23. M. A. Starovasnik, M. P. O'Connell, W. J. Fairbrother and R. F. Kelley, Antibody variable region binding by Staphylococcal protein A: thermodynamic analysis and location of the Fv binding site on E-domain, Protein Sci., 1999, 8, 1423–1431.
  24. C. J. Bond, J. C. Marsters, Jr. and S. S. Sidhu, Contributions of CDR3 to VHH domain stability and the design of monobody scaffolds for naive antibody libraries, J. Mol. Biol., 2003, 332, 643–655.
  25. S. Birtalan, Y. Zhang, F. A. Fellouse, L. Shao, G. Schaefer and S. S. Sidhu, The intrinsic contributions of tyrosine, serine, glycine and arginine to the affinity and specificity of antibodies, J. Mol. Biol., 2008, 377, 1518–1528.
  26. H. S. Cho, K. Mason, K. X. Ramyar, A. M. Stanley, S. B. Gabelli, D. W. Denney, Jr. and D. J. Leahy, Structure of the extracellular region of HER2 alone and in complex with the Herceptin Fab, Nature, 2003, 421, 756–760.
  27. M. C. Franklin, K. D. Carey, F. F. Vajdos, D. J. Leahy, A. M. de Vos and M. X. Sliwkowski, Insights into ErbB signaling from the structure of the ErbB2-pertuzumab complex, Cancer Cell, 2004, 5, 317–328.
  28. H. Wardemann, S. Yurasov, A. Schaefer, J. W. Young, E. Meffre and M. C. Nussenzweig, Predominant autoantibody production by early human B cell precursors, Science, 2003, 301, 1374–1377.
  29. A. A. Bogan and K. S. Thorn, Anatomy of hot spots in protein interfaces, J. Mol. Biol., 1998, 280, 1–9.
  30. W. L. DeLano, Unraveling hot spots in binding interfaces: progress and challenges, Curr. Opin. Struct. Biol., 2002, 12, 14–20.
  31. J.-D. Ye, V. Tereshko, J. K. Frederiksen, A. Koide, F. A. Fellouse, S. S. Sidhu, S. Koide, A. A. Kossiakoff and J. A. Piccirilli, Synthetic antibodies for specific recognition and crystallization of structured RNA, Proc. Natl. Acad. Sci. U. S. A., 2008, 105, 82–87.
  32. K. Newton, M. L. Matsumoto, I. E. Wertz, D. S. Kirkpatrick, J. R. Lill, J. Tan, D. Dugger, N. Gordon, S. S. Sidhu, F. A. Fellouse, L. Komuves, D. M. French, R. E. Ferrando, C. Lam, D. Compaan, C. Yu, I. Bosanac, S. G. Hymowitz, R. F. Kelley and V. M. Dixit, Ubiquitin chain editing revealed by polyubiquitin linkage-specific antibodies, Cell (Cambridge, Mass.), 2008, 134, 668–678.
  33. J. Gao, S. S. Sidhu and J. A. Wells, Two-state selection of conformation-specific antibodies, Proc. Natl. Acad. Sci. U. S. A., 2009, 106, 3071–3076.
  34. S. Uysal, V. Vasquez, V. Tereshko, K. Esaki, F. A. Fellouse, S. S. Sidhu, S. Koide, E. Perozo and A. A. Kossiakoff, The crystal structure of closed conformation of the full-length KcsA potassium channel, Proc. Natl. Acad. Sci. U. S. A., 2009, 106, 6644–6649.
  35. F. A. Fellouse and S. S. Sidhu, Making antibodies in bacteria, in Making and using antibodies, ed. G. C. Howard, M. R. Kaser, CRC Press, Boca Raton, FL, 2007, pp. 157–180.
  36. R. B. Gerstner, P. Carter and H. B. Lowman, Sequence plasticity in the antigen-binding site of a therapeutic anti-HER2 antibody, J. Mol. Biol., 2002, 321, 851–862.
  37. G. Schaefer, L. Shao, K. Totpal and R. W. Akita, Erlotinib directly inhibits HER2 kinase activation and downstream signaling events in intact cells lacking epidermal growth factor receptor expression, Cancer Res., 2007, 67, 1228–1238.
  38. E. A. Kabat, T. T. Wu, M. Reid-Miller, H. M. Perry and K. S. Gottesman, Sequences of Proteins of Immunological Interest, National Institutes of Health, Bethesda, MD, 4th edn, 1987.

Footnotes

Electronic supplementary information (ESI) available: Sequences, affinities and specificities of antigen-binding Fabs. See DOI: 10.1039/b927393j
These authors contributed equally to the work.
§ Current address: Banting and Best Department of Medical Research, Department of Molecular Genetics, and the Terrence Donnelly Center for Cellular and Biomolecular Research, University of Toronto, 160 College Street, Toronto, Ontario, Canada, M5S 3E1.

This journal is © The Royal Society of Chemistry 2010