Natalia V. Povarova a, Nina G. Bozhanova a, Karen S. Sarkisyan a, Roman Gritcenko a, Mikhail S. Baranov ab, Ilia V. Yampolsky ab, Konstantin A. Lukyanov a and Alexander S. Mishin *a
a Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Miklukho-Maklaya 16/10, 117997 Moscow, Russia. E-mail: mishin@ibch.ru
b Pirogov Russian National Research Medical University, Ostrovitianova 1, Moscow 117997, Russia
First published on 23rd February 2016
Synthetic analogs of the Green Fluorescent Protein (GFP) chromophore emerge as promising fluorogenic dyes for labeling in living systems. Here, we report the computational identification of protein hosts capable of binding to and enhancing fluorescence of GFP chromophore derivatives. Automated docking of GFP-like chromophores to over 3000 crystal structures of Escherichia coli proteins available in the Protein Data Bank allowed the identification of a set of candidate proteins. Four of these proteins were tested experimentally in vitro for binding with the GFP chromophore and its red-shifted Kaede chromophore-like analogs. Two proteins were found to possess sub-micromolar affinity for some Kaede-like chromophores and activate fluorescence of these fluorogens.
An alternative approach, molecular docking, is a potent tool for identifying interacting pairs of small molecules and protein hosts, and even for the computational design of such pairs from a known binding mode.10 Unfortunately, judging by the extensive experimental evidence on the role of the molecular properties of small-molecule ligands in receptor–ligand interactions,11 the GFP chromophore lacks the size, flexibility, and number of hydrogen-bonding groups needed to be a good candidate for specific protein binding. Therefore, very accurate docking with full-atom scoring functions and flexible receptor geometry would be necessary to identify protein hosts for the GFP chromophore, which increases the computational cost of such a screening. These costs can be lowered if chromophores with auxochromic groups in the core are considered, as these groups increase ligand quality and simplify the search for high-affinity protein–chromophore pairs. Nevertheless, the tendency of docking algorithms to converge towards the lowest interface energy often leaves the specific requirements for fluorescence recovery out of the equation.
Here, we performed automated docking of GFP-like chromophores to available crystal structures of Escherichia coli proteins and tested selected candidate proteins experimentally.
E. coli strain BW25113 was used for protein expression. Expression was induced with 0.1% arabinose at 25 °C for 16 h. Cells were harvested by centrifugation and sonicated in PBS pH 7.4 (10 mM phosphate buffer, 137 mM NaCl, 2.7 mM KCl) containing the PMSF protease inhibitor (Thermo Fisher Scientific), and the protein was then purified using TALON metal-affinity resin (Clontech). Finally, the protein was dialyzed against a 10000× volume of PBS pH 7.4 supplemented with 5 mM β-mercaptoethanol.
Fluorescence quantum yields (QYs) of the 3HO2–A12H protein–fluorogen pair (20 μM protein, 0.5 μM chromophore, PBS pH 7.4, 23 °C) were determined by direct comparison with EGFP (QY = 0.6). For unbound chromophores, compound 4c from ref. 13 was used as a reference for QY measurements. The solubility of the chromophores was determined by absorption measurements.
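The comparison with a reference fluorophore can be illustrated with a short calculation. The sketch below assumes the standard relative quantum yield relation (integrated emission normalized by absorbance at the excitation wavelength, with a refractive-index correction); the text does not state the exact formula that was used, and the function name and the numbers in the usage note are purely illustrative.

```python
def relative_quantum_yield(f_sample, a_sample, f_ref, a_ref, qy_ref,
                           n_sample=1.0, n_ref=1.0):
    """Estimate a quantum yield relative to a reference fluorophore.

    f_*  : integrated fluorescence intensity (arbitrary but identical units)
    a_*  : absorbance at the excitation wavelength
    qy_ref : quantum yield of the reference (e.g. 0.6 for EGFP, as in the text)
    n_*  : refractive indices of the solvents (equal for identical buffers)

    Assumption: the standard relative-QY relation applies; the source does
    not specify the exact procedure.
    """
    if a_sample <= 0 or a_ref <= 0 or f_ref <= 0:
        raise ValueError("absorbances and reference intensity must be positive")
    return qy_ref * (f_sample / f_ref) * (a_ref / a_sample) * (n_sample / n_ref) ** 2
```

For example, a hypothetical sample with one tenth of the reference emission at equal absorbance in the same buffer would be assigned one tenth of the reference quantum yield.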
The ligand and receptor files for docking with rigid protein geometry were prepared using AutoDockTools23 version 1.5.6. The bounding box for the docking region was calculated automatically using PyMol. Docking was performed using AutoDock Vina as described24 with exhaustiveness set to 20 and the maximum number of collected distinct binding modes set to 20.
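As a minimal sketch of how such a rigid-receptor run could be scripted, the code below builds an AutoDock Vina command line with the exhaustiveness and number of binding modes stated above. The box calculation is a plain-Python stand-in for the PyMol-based step, and the `padding` value, function names, and file names are assumptions for illustration, not the settings used in this work.

```python
def bounding_box(coords, padding=4.0):
    """Axis-aligned box around atom coordinates, returned as (center, size).

    `padding` (in angstroms, added on every side) is a hypothetical value.
    """
    xs, ys, zs = zip(*coords)
    lo = (min(xs), min(ys), min(zs))
    hi = (max(xs), max(ys), max(zs))
    center = tuple((a + b) / 2 for a, b in zip(lo, hi))
    size = tuple((b - a) + 2 * padding for a, b in zip(lo, hi))
    return center, size


def vina_command(receptor, ligand, out, center, size,
                 exhaustiveness=20, num_modes=20):
    """Build the argument list for one AutoDock Vina run (not executed here)."""
    cmd = ["vina", "--receptor", receptor, "--ligand", ligand, "--out", out]
    for axis, c, s in zip("xyz", center, size):
        cmd += [f"--center_{axis}", f"{c:.3f}", f"--size_{axis}", f"{s:.3f}"]
    cmd += ["--exhaustiveness", str(exhaustiveness),
            "--num_modes", str(num_modes)]
    return cmd
```

The returned list can be passed to `subprocess.run` once the receptor and ligand PDBQT files exist; keeping command construction separate from execution makes it easy to inspect the parameters before launching thousands of runs.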
Docking with flexible geometry of the ligand-binding pocket was performed using Rosetta software (weekly release 2015.05.57576). The crystal structures, PDB IDs 1PVS and 3HO2 (residue A163 was manually changed to the native C), were preminimized using the relax application25 with the following flags: -flip_HNQ -no_optH false -relax:constrain_relax_to_start_coords -nstruct 50 -ex1 -ex2 -use_input_sc. The output structures with the lowest total score were used for further calculations. For ligand docking we used the RosettaLigand docking algorithm.26 Three rounds of docking were performed for each protein–ligand pair. During the first round, 5000 structures were generated using the transform mover with move_distance = 5 Å and angle = 360°. Fifty structures with the lowest interface_delta score among the 2000 structures with the best total score were selected for the second round, in which another 5000 structures were generated (100 from each selected structure) with move_distance decreased to 1 Å and angle to 45°. The third round of ligand docking was performed using the best 50 structures of the second round and the transform mover with move_distance = 0.2 Å and angle = 5°.
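The selection funnel between rounds can be sketched in a few lines. The code below is an illustration only: it assumes each docked structure is represented as a dictionary holding its `total_score` and `interface_delta` values (lower is better for both), and the function and variable names are hypothetical. The criterion for picking the "best 50" after the second round is not spelled out in the text, so the same two-step filter is assumed.

```python
# Transform-mover settings for the three rounds, as given in the text.
ROUNDS = [
    {"move_distance": 5.0, "angle": 360.0},
    {"move_distance": 1.0, "angle": 45.0},
    {"move_distance": 0.2, "angle": 5.0},
]


def select_for_next_round(poses, n_total=2000, n_interface=50):
    """Keep the n_total poses with the lowest total_score, then return the
    n_interface of those with the lowest interface_delta."""
    by_total = sorted(poses, key=lambda p: p["total_score"])[:n_total]
    return sorted(by_total, key=lambda p: p["interface_delta"])[:n_interface]


def structures_in_next_round(n_selected=50, n_per_seed=100):
    """Number of structures generated from the selected seeds."""
    return n_selected * n_per_seed
```

With 50 seeds and 100 structures per seed, each subsequent round again contains 5000 structures, matching the numbers quoted above.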
We applied molecular docking of the GFP chromophore to over 3000 crystal structures of E. coli proteins available in the Protein Data Bank with a resolution better than 2.0 Å and a chain length shorter than 500 amino acid residues. We chose one of the fastest available docking tools24 and performed docking in a blind and automatic manner: ligands were stripped from the PDB files and only the first protein chain was examined. This massive docking analysis allowed us to select top-scoring structures that can putatively bind the GFP-like chromophore (Table S1, ESI†). We then docked larger GFP-like chromophore derivatives (so-called Kaede-like chromophores27) against the top 500 structures from the previous round of docking. As a result, chromophores A5, A12, A12H, A24, A26, A27, and A28 (ESI† Methods) were selected for further tests. From the candidate protein list, we succeeded in cloning and expressing four proteins with a high GFP or Kaede docking rank (Table 1), here and below named by their PDB IDs: 3HO2 (β-ketoacyl-acyl-carrier-protein synthase II), 1DOS (fructose-bisphosphate aldolase), 2QRY (thiamine binding protein), and 1PVS (3-methyladenine glycosylase II).
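The structure selection and clean-up described above can be sketched as follows. This is a minimal illustration under stated assumptions: the thresholds come from the text, but the function names are hypothetical, "first chain" is interpreted as the first chain identifier encountered in the file, and ligands are removed simply by discarding every record that is not an `ATOM` record (which also drops waters and ions).

```python
def passes_filter(resolution, chain_length,
                  max_resolution=2.0, max_length=500):
    """True if a structure meets the resolution and chain-length cut-offs."""
    return resolution < max_resolution and chain_length < max_length


def first_chain_protein_only(pdb_lines):
    """Return ATOM records of the first chain found in a PDB file.

    The chain identifier sits in column 22 of the fixed-width PDB format
    (index 21 in Python). HETATM and all other records are skipped.
    """
    first_chain = None
    kept = []
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 22:
            continue
        chain = line[21]
        if first_chain is None:
            first_chain = chain
        if chain == first_chain:
            kept.append(line)
    return kept
```

A blind, fully automatic preparation of this kind is fast but indiscriminate: cofactors and structural ions are removed together with ligands, which is one reason to treat the resulting ranks as a coarse first filter.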
Purified proteins were immobilized on beads and examined under a fluorescence microscope. However, no fluorescence increase was observed upon incubation with 100 μM GFP chromophore. At the same time, the addition of chromophores A5, A12, A12H, and A24 resulted in a considerable increase of bead fluorescence (Fig. 1).
Next, we studied spectral changes upon mixing of the chosen chromophore–protein pairs in solution. Some of the tested pairs exhibited submicromolar Kd (Fig. 2A). Interestingly, in the case of binding of chromophores A5, A12, and A24 to the 3HO2 protein in solution, the fluorescence intensity increase was minor, less than 2-fold, under protein- or chromophore-saturating conditions (Fig. 2A). Thus, the corresponding signal increase in the bead-based assay can probably be attributed to chromophore accumulation uncoupled from an increase in fluorescence quantum yield. In contrast, A12H and A12 showed a strong fluorescence increase upon binding to 3HO2 and 1PVS in solution (Fig. 2B), demonstrating truly fluorogenic behavior. Indeed, the binding of the chromophore to the protein host increased the fluorescence quantum yield of the chromophore by up to two orders of magnitude (A12H–3HO2 pair: QY increase from 0.0003 to 0.052 and A12–3HO2 pair: QY increase from 0.0005 to 0.026). The observed fluorescence enhancement in the A12H–3HO2 fluorogen–protein pair is stronger than that for the human serum albumin protein host bound to GFP-like chromophore analogs identified by high-throughput protein screening5 and further directed chemical modifications.4
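The quantum yield values quoted above correspond to roughly 173-fold (A12H–3HO2) and 52-fold (A12–3HO2) enhancements. The sketch below reproduces this arithmetic and adds an illustrative 1:1 binding model for a titration. The quadratic (ligand-depletion) form is an assumption chosen because the protein concentration is comparable to a submicromolar Kd; the text does not state which model was used to obtain Kd, and the function names are hypothetical.

```python
import math


def fold_increase(qy_free, qy_bound):
    """Ratio of bound to free fluorescence quantum yield."""
    return qy_bound / qy_free


def bound_fraction(protein_total, ligand_total, kd):
    """Fraction of protein in complex for 1:1 binding.

    All three arguments must share the same concentration unit. Uses the
    exact quadratic solution, so it stays valid when Kd is similar to or
    smaller than the protein concentration.
    """
    s = protein_total + ligand_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * protein_total * ligand_total)) / 2.0
    return complex_conc / protein_total


# Quantum yields of the free and bound chromophores, taken from the text.
a12h_fold = fold_increase(0.0003, 0.052)
a12_fold = fold_increase(0.0005, 0.026)
```

Fitting measured fluorescence against `bound_fraction` over a range of chromophore concentrations would yield Kd under this model; for weak binders the simpler hyperbolic form is usually sufficient.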
Fig. 2 Fluorescence increase in solution upon binding of the chromophore to the protein host. (A) Representative titration curve of 1 μM 3HO2 or 1PVS protein solutions with chromophores A12 or A12H; (B) emission response of the A12H chromophore upon binding to protein hosts 3HO2 or 1PVS; (C) TDDFT studies of the A12H chromophore. Dashed lines show the correspondence between the experimental absorption spectrum and theoretically obtained S0 → S1 transitions at the ZORA-PBE0/def2-TZVP (COSMO: H2O) level of theory. See ESI† (Fig. S3) for the description of the neutral and anionic species, which were taken into account.
The fluorescence of A12H in complex with proteins 3HO2 and 1PVS is spectrally similar to the dim fluorescence of the free chromophore in aqueous solution (Fig. 2B) and is likely to arise from one of the multiple possible anionic states. As reported previously, spectral shifts of the maxima of Kaede-like chromophores are determined by the electron-donating and electron-withdrawing properties of the aryl substituent at the ethylene double bond.28,29 Thus, the smallest bathochromic shifts were observed (Table S2, ESI†) for the neutral substituent (A24 with a tolyl group, absorbance maximum at 436 nm), while the hydroxyphenyl or indolyl substituents shifted the absorption and emission maxima by 40 nm. The absence of the alkyl group in the first position of the imidazolone ring leads to a small hypsochromic shift (15 nm for A12H in comparison to A12), which is in good agreement with the literature30 and results from the better electron-donating properties of the alkyl group. Deprotonation of the hydroxyl group in all the compounds resulted in ∼80 nm red shifts of the absorption and emission maxima. This process is characterized by a pKa of 7–8 and can take place in aqueous buffer solutions at neutral pH.28,29 This suggests that in some cases (A12 and A12H) we might have observed the emission of the anionic form in the protein complexes.
In order to further investigate this point, we performed time-dependent density functional theory (TDDFT) calculations on the free chromophores. According to these studies, the main peak in the experimental absorption spectrum of the A12H chromophore (Fig. 2C), at 465 nm, along with the 485 nm excitation peak in the complex with the 3HO2 protein host can be attributed to S0 → S1 excitations of different anionic species (Fig. S3, ESI†). Similar red shifting of anionic species in comparison to neutral species was found for all the studied chromophores (see ESI†, Fig. S4–S6).
To address the strong fluorescence increase observed in the 3HO2/A12H and 1PVS/A12H pairs despite the inconsistencies with their ranking in rigid-residue docking, we performed a docking analysis with flexible residues and the full-atom Rosetta scoring function. High-resolution docking resulted in highly converged docking poses of the A12H chromophore within a tight binding pocket of the 3HO2 host (Fig. 3B), which were present in the top 50 of the 5000 structures optimized in the last round (Fig. S2, ESI†). Also, one of the anionic forms of A12H provided a slightly better overall docking score than the neutral one.
Fig. 3 High-resolution docking. Top-scoring docking modes for A12H–1PVS (A) and A12H–3HO2 (B) interactions, respectively. Residues with a significant contribution are shown, A12H is shown in green, and the surface of the binding pocket is outlined in grey mesh.
Among the amino acid residues of 3HO2 with a significant contribution to ligand docking (ΔΔG < −1), F399 (PDB numbering) formed stacking interactions with the chromophore in all cases (A12H, A12, A24, A5), and these interactions might be the major factor hindering chromophore isomerization. Importantly, fluorescence recovery of GFP-like chromophores upon binding to the Spinach RNA aptamer was also found to occur through stacking interactions.31,32 Additional stacking and H-bonding interactions are present in some cases (Table S3, ESI†). Apparently, the convergence of docking modes should be used as an additional metric for the computational design of fluorogenic chromophore–protein pairs. It should be noted that both 3HO2 and 1PVS showed a fluorescence increase in the bead assay with four out of the seven chromophores selected by the initial computational screening; therefore, less computationally intensive rigid-residue docking could indeed be applied to narrow down the search for suitable ligand candidates before accurate high-resolution analysis.
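One simple way to put a number on the convergence of docking modes is the mean pairwise ligand RMSD among the top-scoring poses. The sketch below is an assumption about how such a metric could be computed, not the procedure used in this work: it takes ligand heavy-atom coordinates in the receptor frame with identical atom ordering, applies no superposition, and ignores ligand symmetry; all names are hypothetical.

```python
import math
from itertools import combinations


def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses given as lists of
    (x, y, z) tuples with matching atom order (no superposition)."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and have the same atom count")
    sq = sum((a - b) ** 2
             for atom_a, atom_b in zip(pose_a, pose_b)
             for a, b in zip(atom_a, atom_b))
    return math.sqrt(sq / len(pose_a))


def mean_pairwise_rmsd(poses):
    """Average RMSD over all pairs of poses; small values indicate that the
    top-scoring poses have converged to one binding mode."""
    pairs = list(combinations(poses, 2))
    if not pairs:
        raise ValueError("need at least two poses")
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)
```

A set of top-scoring poses with a small mean pairwise RMSD points to a single well-defined binding mode, whereas a large value indicates that the scoring function cannot discriminate between alternative placements.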
We believe that the present approach could be further applied in various fields. First, the fluorescence increase of fluorogenic dyes upon binding to a protein host provides an easy and reliable way to verify various docking protocols experimentally. Second, bacterial proteins found in this work or analogous protein hosts and their mutant variants can potentially be used as fluorescent tags in heterologous expression models (e.g., mammalian cells), similar to antibody-based Fluorogen Activating Proteins8 and the recently developed Yellow Fluorescence-Activating and absorption-Shifting Tag (Y-FAST).33 Finally, this method can provide a way to visualize native endogenous proteins (e.g., 3HO2) in living cells.
Footnote
† Electronic supplementary information (ESI) available: Supplementary data and figures. See DOI: 10.1039/c5tc03931b
This journal is © The Royal Society of Chemistry 2016