RNA model evaluation based on MD simulation of four tRNA analogs

Anna Grzybkowskaa, Dominika Jędrzejczykb, Michał Rostkowskia, Arkadiusz Chworos*b and Agnieszka Dybala-Defratyka*a
aInstitute of Applied Radiation Chemistry, Faculty of Chemistry, Lodz University of Technology, Zeromskiego 116, 90-924, Lodz, Poland. E-mail: agnieszka.dybala-defratyka@p.lodz.pl
bCenter of Molecular and Macromolecular Studies, Polish Academy of Sciences, Sienkiewicza 112, 90-363 Lodz, Poland. E-mail: achworos@cbmm.lodz.pl

Received 8th June 2016 , Accepted 17th October 2016

First published on 20th October 2016


Abstract

High resolution 3D structures of tRNA molecules are scarce. Therefore, there is a burning need for tools aiming at three-dimensional (3D) RNA structure characterization able to provide accurate information on the tertiary structure of ribonucleic acids. In this study, RNA structure assessment based on primary sequence, molecular dynamics (MD) simulations as well as selected RNA model evaluation methods are used to propose and evaluate four mitochondrial tRNA analogs. The 3D structures of tRNAAla, tRNAGly, tRNAHis, and tRNAPhe are generated from primary sequence using mFold and RNAComposer, and subjected to 100 ns explicit solvent MD simulations with AMBER. The global and local root-mean-square deviations (RMSD), interaction network fidelity (INF), deformation index (DI), the contact area difference-score (CAD-score) as well as principal component analysis (PCA) revealed that the largest changes occurring during MD simulation were in the D–T and anticodon loops but each of the studied tRNA analog is affected differently. tRNAAla and tRNAPhe undergo limited structure perturbation, mostly in the D–T loops regions, while tRNAGly changes the geometry within the anticodon loop. The tRNAHis analog is the most flexible, changes its structure significantly, including separation of the D and T loops. Furthermore, the anticodon loops are visibly more stable than the D–T region and their structure does not change that significantly, except for tRNAGly.


Introduction

Ribonucleic acid (RNA) earned its place in structural biology, and so in computational studies, after the discovery of numerous types of RNA playing an important role in cell biochemistry. It is commonly known that RNA molecules are crucial for protein biosynthesis and the regulation of many processes connected with gene expression.1 There are three classes of RNAs directly involved in peptide biosynthesis: messenger RNA (mRNA), transfer RNA (tRNA) and ribosomal RNA (rRNA). The mRNA is a molecular matrix written in a series of three-base-long codons, each corresponding to the defined amino acid (AA). Codons are identified by specific tRNAs through complementary interaction between codon and anticodon sequence. The tRNA molecules are ones of the most abundant non-coding (ncRNA) molecules in the cell (10% of total cellular RNA), and are found in all living organisms. The sequence of mRNA is translated into amino acid (AA) language in the translation process involving ribosomal RNA (rRNA) protein biosynthesis machinery.

RNA shares the levels of structural organization with DNA and proteins. The primary structure is defined as a sequence of nucleotides for RNA and DNA, or amino acids for protein. The secondary structure of nucleic acids accounts for canonical Watson–Crick interactions between complementary bases. Finally, the tertiary structure of RNA that is pre-determined by primary and secondary structures, shows the three-dimensional organization including long-range interactions between components. The tRNA molecules are typically 70–80 nucleotides long, folding into four-helix system. Although sequence varies between tRNAs, the cloverleaf secondary structure and the L-shaped tertiary structure are common.2 Another common feature is abundance of nucleobase modifications present in tRNA. There are more than 100 modified nucleobases found in tRNAs.3,4 Some are typical for tRNA structure, such as dihydro (D) and pseudouridine (Y) characteristic for D and T loops, respectively, others are unique for specific tRNAs.5 There are multiple examples where modifications are important for the tRNA recognition during aminoacylation by their cognate synthetases, however unmodified tRNA transcripts have been shown to be substrates for the processing enzymes.6 The presence of chemical modifications is linked to occurrence of cancer, diabetes type II, mitochondrial related diseases and neurological disorders.7 Some reports show that mutations in mitochondrial tRNA genes and cleavage cause cellular disorder.8,9 Apparently, tRNA fragments, which are formed under stress conditions, may influence gene expression regulation.10,11 Recently, tRNAs were found to interact with cytochrome c and influence the programmed cell death process.12–14 This is important in the view that the higher level of the tRNA pool was reported for cancer cells, which are known to alter the apoptosis and consequently escape the cell arrest.15–18 Interestingly, interaction of mitochondrial tRNAs with cytochrome c is more prevalent then cytosolic12,13,19 and this interaction seems to be sequence/structure specific.

There are over 450 tRNA genes identified in the human genome, out of them, 22 are found in mitochondrial genome.20,21 However, to date, there are limited high-resolution (<3.0 Å) tRNA structures available in Protein Data Bank: three for yeast tRNAs (tRNAAsp3TRA, tRNAiMet1YFG, tRNAPhe1EHZ), one for human tRNASec (3A3A), mouse variant tRNASec (3RG5), and three structures for E. coli (3LOU, 1B23, 3CW5). The need to understand the folding process and function of tRNA leads to the use of computational predictions and modeling for structures that cannot be established otherwise, by crystallographic or NMR methods. In the view of theoretical structural predictions and experimental methods, one can wonder if we can omit modified nucleobases in tRNA. Zhang et al.5 analyzed the influence of modifications in three yeast tRNAs (tRNAAsp, tRNAPhe, and tRNAiMet). They reported that, based on the RMSD calculations, modifications did not introduce any significant rigidity in tRNA structure, however, there might be differences in correlated motions in the presence and absence of modifications. Although tRNA modifications are essential for the biological function in the cell, their influence on the tRNA molecular structure could be neglected. Application of theoretical methods for RNA-protein interactions prediction is also simplified without the nucleobase modifications.22 Such simplification would not only lower the cost of in silico modeling, but will also allow for structure validation via experimental methods. For instance, inexpensive in vitro transcription methods can be used to generate molecules mimicking tRNAs.

In order to test our ability to characterize structures of tRNA analogs, which models were constructed without having prior structures resulted from high-resolution experimental techniques, we have built respective tRNA fragments and subjected them to 100 ns molecular dynamics simulations. Trajectories resulted from those simulations were then subsequently analyzed using various RNA model evaluation methods. In this study, we investigated the structural and dynamical features of selected tRNA analogs and tested the performance of evaluation methods ranging from the well established (backbone and all atoms RMSD, PCA) on one hand, to the brand new (INF, CAD-score, local/residue RMSD) on the other, in assessing the studied systems. Such studies are necessary in order to ascertain the RNA tertiary structure and can be potentially used for creating complexes with other biological components such as proteins.

Methods

Structures and models

The structures for four mammalian (human), mitochondrial tRNA analogs (alanine, glycine, phenylalanine and histidine tRNA) were generated. Primary sequences for selected tRNAs were extracted from the Mamit-tRNA database.23 In order to establish secondary structure base-pairing elements the mFold algorithm24 was used. Obtained structures were cross-checked with those listed in the literature.25 Finally, the tertiary structures of tRNA analogs were determined using the RNAComposer software.26,27 For independent, experimental evaluation there might be a need to synthetize such tRNA analogs and for that the sequences GGG or GGGA have to be added at the 5′-termini for each tRNA molecule. In this theoretical prediction, the xLEaP module of the AMBER 12 simulation package28 was used to prepare starting topology and coordinates of all models. The charge of the system was neutralized by adding 68, 67, 68, and 70 sodium ions into tRNAAla, tRNAGly, tRNAHis, and tRNAPhe models, respectively. All systems were soaked in a cubic box of TIP3P29 water molecules, such that no solute molecule was present in the region of less than 15 Å from the border of the box. Magnesium, calcium and other divalent ions have been found in the RNA structures obtained by crystallography methods (PDB database). Particularly, Mg2+ is known to be important for catalytic RNAs, such as ribozymes,30 aptamers and RNA recognition elements, for instance P4–P6 domain.31,32 Such ions can stabilize the conformation at the active site, but also entire phosphate-sugar backbone of RNA,33 which might limit the dynamics. Variability of divalent ions across the cellular compartments creates the challenge in the precise evaluation of the amount used for the calculation. Additionally there are limitations in the van der Waals parametrization for Mg2+ ions.5,34 For these reasons divalent ions were not included in the simulation.

MD simulations

tRNA molecules were described by the bsc0χOL3 force field35–39 which is a default force field for RNA in AMBER since 2010. All simulations were performed using the PMEMD program from AMBER 12 with the CUDA implementation for calculations on graphics cards.40,41

All systems were energy-minimized in two steps. First, a tRNA molecule was held fixed using the position restraints with a force constant of 5000 kcal mol−1 Å−2, water molecules and sodium ions were free to move during 500 steps of steepest descent and 500 subsequent steps of conjugate gradient minimization. Then, the whole system was energy-minimized for 2500 steps of steepest descent minimization and 1000 steps of conjugate gradient minimization. Weak positional restraints (force constant of 10 kcal mol−1 Å−2) were applied to the solute atoms during the heating of the system to the temperature of 300 K for 20 ps under constant volume periodic boundary conditions (NVT). Then, constant pressure and temperature ensemble (NPT) was applied to equilibrate the system and perform subsequent production simulation runs. An average pressure of 1 atm and temperature of 300 K were maintained by using the isotropic position scaling with a relaxation time of 2 ps and Langevin dynamics42 with collision frequency of 1 ps−1. The SHAKE method43 was used for all bonds involving hydrogen atoms and the time step for numerical integration was 2 fs. The Particle Mesh Ewald (PME)44 was applied to calculate electrostatic interactions. The cutoff distance of the nonbonded Lennard Jones interactions equal to 10 Å was applied. Each analog was subjected to three independent 100 ns simulations and the snapshots were written every 2 ps. All three trajectories for the same analog started from the same initial configuration but were run with different initial velocities.

Data analysis

PTRAJ and CPPTRAJ modules45 of AMBER 12 and the Visual Molecular Dynamics program (VMD)46 were used for analyzing obtained trajectories. Root mean squared deviations (RMSDs) from the reference structure (geometry resulted from RNAComposer) for the whole tRNA molecules and their anticodon loops (nucleotides 32–37) were calculated for all atoms and backbone heavy atoms (P, O3′, O5′, C3′, C4′, C5′). RMSD is a classical index giving general information about shape similarity of the whole modeled molecule with respect to a structure selected as a reference. However, it does not provide any information about local changes in the structure or base-pairing and base-stacking. To gain a more detailed insight in local structural changes, local RMSD analyses were performed, in which a residue being analyzed, alone or together with its one to three former and subsequent neighboring residues along the sequence, was aligned over the trajectory saved every 20 ps, using in-house made VMD scripts. To extend this analysis for base-pairing and base-stacking interactions, tool named deformation index (DI)47 was applied. DI is defined as the RMSD divided by the base interaction network fidelity (INF). The INF indicates how good the predicted model is in terms of interaction network as compared to a reference system.42 The corresponding INF values were computed from base-pairing and base-stacking interactions of the aligned structures using MC-Annotate.48,49 Furthermore, CAD-score (the Contact Area Difference-score)50 online server was used to obtain more details about differences between modeled systems and their reference structures. CAD-score provides information about similarity of various contact areas in the two structures (model and reference). Contact areas such as A–A (all atoms–all atoms), S–S (side chain–side chain), S–S stacking and S–S non-stacking were analyzed.

Dynamical changes during the MD simulations were analyzed by visual inspection of trajectories and by the principal component analysis (PCA) approach, which is an unsupervised method to reduce problem dimensionality in order to facilitate analysis of multivariate systems.51 Applying PCA to trajectories of biological systems allows to extract modes that affect system variance the most, omitting statistically insignificant fluctuations.52,53 These analyses were performed in R54 using Bio3D55,56 package for simulation snapshots saved every 60 ps using backbone heavy atoms. Molecular modes corresponding to the most important principal components for each tRNA analog were saved in a PDB format for visualization purpose and animated with VMD.

Results

Initial structures assessment

The tRNA is a group of molecules essential for the translation process of protein biosynthesis and the regulatory function. Although the tRNA is known for more than 50 years, there are currently only few high-resolution crystal structures available, besides the ones in the complex with ribosomal RNAs. The aim of this research is to gain better understanding of structural differences and folding of this small molecule, based on the example of four tRNA analogs (tRNAAla, tRNAGly, tRNAHis and tRNAPhe) lacking of nucleobase modifications. We started the assessment with secondary structure Watson–Crick base-pair prediction using mFold. Because mFold, as well as other nearest neighbor-based software, often predicts the structure of the tRNA inaccurately, this has to be validated and cross-checked with the general tRNA pattern (Table S1). Sequence, along with the secondary structure (written in a linear representation), had been used for the initial tertiary structural assessment done by the RNAComposer software. Generated structures for all four tRNA analogs resemble the classical L-shape of tRNA (Fig. 1). Generally tRNAAla and tRNAPhe have more compact structure with anticodon and acceptor stems at close to 90° angle towards the acceptor stem. However, the structures of tRNAGly and tRNAHis analogs are more extended and less structurally defined. As the RNAComposer is known to create sometimes inaccurately folded structures, with the RNA strands passing over each other, all structures were closely examined, but none of such defects were detected. Interestingly, classical shape of anticodon loop, two nucleotide stacks 32–33 nt and 34–38 nt with a sharp turn at wobble 34 nt position, has been found in all but one case (discussed later).
image file: c6ra14933b-f1.tif
Fig. 1 Graphical representation of tRNA analogs (tRNAAla, tRNAGly, tRNAHis, tRNAPhe). The 3D structures were obtained with the RNAComposer software and used as reference. The D–T interactions and anticodon loop for tRNAGly are shown in red.

RMSD analysis on global structures

From each 100 ns simulation 100 structures were chosen for further analysis based on total energy in a function of simulation time. From each 10 ns interval 10 structures having the lowest total energy (Fig. S1) were selected. The RMSD values for the whole trajectories were calculated as a function of time for all heavy atoms in the backbone, all atoms in the structure of the overall tRNA, and for the anticodon loop (Fig. 2). We analyzed individual RMSD values calculated for each trajectory of respective tRNA analog separately, as well as average values derived from the overall inputs from all structures selected from all three trajectories for a single analog (Table S2). The obtained values from three independent simulations for each analog are relatively similar and oscillate around the average values. The largest fluctuations in RMSDs for the whole tRNA molecules were observed for both the tRNAGly and tRNAHis. Interestingly, in the case of the latter one RMSD for the anticodon loop seemed to fluctuate the least among the studied analogs. As for the whole structure, the lowest average RMSD values were obtained for tRNAPhe and tRNAAla (6.03 ± 1.21 and 6.00 ± 1.89 Å, respectively). The RMSD of tRNAPhe resulted from this study is larger than the value reported for the same species but coming from yeast and having the crystal structure available (4.32 ± 0.095 Å).5 Further discussion on this issue is presented in the next section. The mean RMSDs for tRNAGly and tRNAHis are 7.92 ± 1.19 and 8.54 ± 2.45 Å, respectively. Much smaller differences among the studied molecules are observed with respect to the mean anticodon loop RMSDs. We calculated 2.76 ± 0.75, 3.92 ± 0.74, 2.29 ± 0.31, and 2.71 ± 0.48 Å for the tRNAAla, tRNAGly, tRNAHis, and tRNAPhe analogs, respectively. Zhang et al.5 reported the average all-atom RMSD as a function of time for unmodified tRNAPhe anticodon loop of 2.57 ± 0.82 Å.
image file: c6ra14933b-f2.tif
Fig. 2 Backbone heavy atom (left top plot) and all atoms (left bottom plot) RMSD as a function of time for studied tRNA analogs and backbone heavy atom (right top plot) and all atoms (right bottom plot) RMSD as a function of time for anticodon loop. Red, black and blue are designated to separate simulations of each analog.

RMSD analysis on local fragments

In order to study structural changes in the course of simulations in more detail, we performed partial RMSD analysis for trajectory snapshots saved every 20 ps. Standard RMSD analysis for nucleotides considered separately can reveal changes in geometry of a particular residue, but it lacks any information about changes in its local environment. Therefore, we extended such analysis to compute local RMSD values for more than one residue at a time, by considering also former and consecutive neighbors in the sequence for each residue in the structure. In order to choose the most informative description that can be obtained from such approach, local RMSD values along the simulation progress were computed for every nucleotide alone, or together with its one, two and three closest neighboring residues. The results are presented for the first simulation of each tRNA analog as 2D RMSD maps in Fig. S2a–d. As it can be seen from comparison between maps involving n = 0 and n = 1 neighboring residues, obtained for standard RMSD analysis and extended calculation of RMSD values for three consecutive nucleotides (residue i and its one closest neighbor residue on each side of the sequence), respectively, the latter analysis facilitates location of locally-disturbed regions due to the changes in relative orientation between the residues taken into account. Further extension of residue selection along the RNA sequence used for computing RMSD values does not change RMSD maps significantly (maps n = 2 and n = 3) but can blur the analysis due to more extensive averaging, hence, only RMSD maps for three residues (n = 1) were used in further analyses.

Local RMSDs were computed for each 100 ns simulation and the results are collected in Fig. S3a–d. Even though local RMSD maps for each 100 ns simulation corresponding to particular analog are not identical, similar regions undergo local changes during the simulations with respect to the initial structure. The differences between the corresponding analyses result mostly in the time range of the occurrence of higher RMSD regions and their intensity. This observation is also confirmed by the comparison of the local RMSD values averaged over the whole simulation time for each analyzed simulation (Fig. 3). The most pronounced differences among all compared time-averaged local RMSD were found only for the third simulation of the tRNAAla analog (blue curve in Fig. 3). Nucleotides especially above number 40 in the sequence appeared to be less disturbed than in the case of the two remaining simulations. Thus, in order to facilitate comparisons between different tRNA analogs, local RMSD values were averaged over the three corresponding simulations (Fig. 4).


image file: c6ra14933b-f3.tif
Fig. 3 Local RMSD plots obtained from averaging RMSD maps over total simulation time for each 100 ns trajectory, collected for tRNAAla, tRNAGly, tRNAHis, and tRNAPhe systems, respectively.

image file: c6ra14933b-f4.tif
Fig. 4 Local RMSD analysis for each residue considered together with its 2 closest neighbors (former and subsequent residue in the sequence) obtained from averaging of local RMSD values calculated for three independent simulations for tRNAAla, tRNAGly, tNAHis, and tRNAPhe systems, respectively.

In general, all four tRNA systems share a similarity in local changes in the two corresponding sequence regions; first, involving nucleotides 8 to 10 and from 13–14 to 19–21, depending on the analog studied, and the second, for nucleotides 43 to 45 (shifted for tRNAPhe to nucleotides between 45 and 48) and between nucleotides 50–51 to 53–55 (shifted for tRNAPhe to nucleotides 54 and 59, respectively). The RMSD changes occurring in the central sequence region show that region between nucleotides 29 and 31 is the most disturbed in the case of the tRNAAla and tRNAGly analogs, while the least in the case of tRNAHis.

Principal component analysis (PCA)

PCA allowed us to extract molecular motions responsible for the highest system fluctuations during the performed simulations of the tRNA analogs. PCA results for three highest principal components for all simulations are depicted in Fig. S4a–d. Structure movements related to the highest principal components were visualized and compared (movie files for respective analogs are in ESI). In general, results obtained in the PCA show a mixture of different movements, i.e. rotations along own stem axes, winding–unwinding motions of the anticodon as well as in and out-of-plane motions. Relative movements of the horizontal and vertical helices cause bending of the whole structure and displacements of D and T loops. Although PCA of separate trajectories gives molecular motions related to the statistically most important principal components not exactly the same or in the same order, similar overall motion can be found among visualized modes. Therefore, we concluded that all separate PCAs still reveal the same dynamical features for each tRNA analog and none of them should be disregarded or treated as non-representative. To confirm this, we calculated the degree of overlap between essential subspaces obtained from PCAs of different trajectories of the same systems. Root mean square inner product (RMSIP) between the subspaces value is one, when compared subspaces are the identical, and zero when they are completely orthogonal. Averaged values of RMSIPs obtained from pairwise comparison of the three trajectories for each system are 0.70, 0.67, 0.63, 0.77 for a subspace of the first three principal components, and 0.81, 0.78, 0.78, 0.84 for a subset of the first ten principal components for tRNAAla, tRNAGly, tRNAHis, and tRNAPhe, respectively.

Based on the comparative analysis of three trajectories run independently for each analog using RMSD and PCA approaches we concluded that the structural evolution for all studied systems were qualitatively similar and the same regions were affected. Therefore, for further RNA model evaluation analysis one trajectory for each analog was selected (black curves, Fig. 2).

Interaction network fidelity (INF) and deformation index (DI)

INF and DI values were calculated for each selected structure (Fig. 5). For each set of analyzed structures INFs are nearly constant. The largest deviation, in particular in the first half of the simulation, was observed for tRNAAla. The INF values for conformations from the last 10 ns of simulation were within the range of 0.7–0.8 for all four tRNA analogs.
image file: c6ra14933b-f5.tif
Fig. 5 Backbone heavy atom RMSD (red), DI (black), and INF (green) indices calculated for all selected structures from 100 ns simulations.

Individual values of RMSD, INF, and DI for selected lowest total energy structures (Table S3) were converted into the color coded figure for visualization purpose (Fig. 6).


image file: c6ra14933b-f6.tif
Fig. 6 RMSD, INF (Interaction Network Fidelity) and DI (Deformation Index) values for 100 selected conformations having the lowest total energy. From the top: tRNAAla, tRNAGly, tRNAHis, and tRNAPhe analogs, respectively. The color scheme used is as follows: green represents the best whereas red the worst values, respectively.

Contact area difference-score (CAD-score)

In order to assess the discrepancies between the final models resulted from MD simulation and reference structures in further detail we also applied different CAD-score variants (Table 1). The nature of all atom–all atom (A–A) contacts allows for the most complete structure description whereas the quality of base–base (S–S) contacts can tell us more on 3D shape of the studied RNAs. Very similar values of global A–A CAD-score are the indication of the very similar accuracy of all four studied models. Slightly lower value obtained for tRNAGly might be associated with worse modeling of S–S interactions. The lowest score of 0.48 for non-stacking interactions for this analog apparently affected also stacking and altogether led to the lower global value. S–S CAD-scores for other analogs are comparable.
Table 1 Specific CAD-score values obtained using different variants: A–A, S–S, S–Sstacking, and S–Snon-stacking
Variant tRNAAla tRNAGly tRNAHis tRNAPhe
A–A 0.77 0.71 0.71 0.75
S–S 0.70 0.63 0.67 0.66
S–Sstacking 0.68 0.62 0.66 0.67
S–Snon-stacking 0.54 0.48 0.58 0.52


Superimposition of contact maps of all atom–all atom contacts along with local errors profiles were also assessed (Fig. S5). Agreement in contact between the reference and the model structure is shown on color-coded scheme using blue-white-red gradient (blue for regions that are in good agreement, red for those which are not). For the four studied tRNA analogs the following residues were indicated as responsible spots for deviations from the reference structure. These are as follows: 15A, 17U, 18A, 44U, 52G, 53G, 69A for tRNAAla, 1A, 16A, 17U, 18A, 19G, 33C, 42A, and 68A for tRNAGly, 18C, 19C, 20U and 52C, 53U, 54C, and 55A for tRNAPhe, and 8U, 9A, 16C, 43A, 44C, 50C, 53A, 54C, 55G, and 56A for tRNAHis. Differences in local S–S contacts between reference and model molecules for all studied analogs are shown in Fig. 7. Very similar regions in the sequences affected by the most substantial deviations from the initial constructed structures were found during the local RMSD analysis. We also analyzed local errors in the context of 3D structures using different variants (Fig. S6a–d).


image file: c6ra14933b-f7.tif
Fig. 7 The S–S local contact differences between the reference structure (left) and the final conformation from 100 ns simulation for studied systems (right). Other CAD-score variants are available in ESI.

Discussion

Four mitochondrial tRNA analogs were modeled using 100 ns classical MD simulation on GPUs and assessed with selected model evaluation methods. With the limited number of crystallographic coordinates available for tRNA molecules, we aimed to characterize the 3D structures, the flexibility of their components and stability for four specific tRNAs. Important question how these four tRNA analogs (tRNAAla, tRNAGly, tRNAHis, and tRNAPhe) are structurally similar/different was asked. In these studies we considered the RNA molecules without nucleobase modifications as it was suggested that modifications do not alter tRNA structural rigidity during MD simulations.5 The initial structures were generated using the RNAComposer software that is a comparative tool based on the 3D coordinates deposited in RNA database. As such, it creates non-optimized artificial structures, which have to be further evaluated. However, we can already see differences between the tRNA analogs used in this study.

Global and local RMSD evaluation method

Visible differences (Fig. 1) are observed and more pronounced during the MD simulations. Considering the RMSD values, one can tell that the structures of tRNAAla analog fluctuates significantly in the course of simulations to reach the relative stability at around 80–90 ns (black and red curves in Fig. 2) or much earlier (around 20 ns) as presented by the blue curve. tRNAGly shows similar fluctuations to tRNAAla, however three simulations resulted in slightly different points of reaching and loosing stability. In the case of tRNAPhe the structure is the least prone for changes. In contrast, the tRNAHis analog depending on the simulation either changes its structure entirely during the simulation and within 100 ns does not reach the stable conformation (black curve, Fig. 2) or is stable from 20 ns of simulation on (red curve) or reaches it at the end of the simulation time (70–100 ns, blue curve). This is also clearly seen in 3D structural analysis of the D–T motif, described later (Fig. 1 and S7a–d). However, these aforementioned changes among individual plots of RMSD values as a function of time do not necessary mean that each individual simulation led to a totally different conformation of a given analog. To ascertain whether tRNAHis structure can reach proper folding, we continued simulation for additional 400 ns. Results indicate that the D–T kissing complex is a flexible interaction, which can open and close (Fig. S9). Local RMSD analyses for tRNA analogs reveal the “hottest” regions for D and T loops. Besides, local structural changes that develop during simulations affect also nucleotides 8 and 9, the variable region and the anticodon loop, especially in region of coding nucleotides. In simulations of tRNAAla, tRNAGly, and tRNAPhe D loop is the most changed region, while in the case of tRNAHis the T loop becomes more locally rearranged in comparison to the initial structure.

Principle component analysis for tRNA analogs simulations

Visualization of dynamical changes in the simulated systems was carried out using the PCA method, since it allows to extract only structural rearrangements that are statistically the most significant. PCA revealed that in the case of tRNAAla the most dominating movement was related to the rotation of the anticodon arm, while fluctuations of amino-acyl stem and T arm were small. Other analogs show larger movements related to the vertical helix. In the case of tRNAPhe, the latter undergoes small, rather in-plane motions while rotation of the anticodon is replaced with out-of-plane motion. This results in small relative displacements of T and D loops for tRNAAla analog, and in the case of tRNAPhe these loops remain almost intact, in close proximity to each other. On the other hand, for the tRNAGly and tRNAHis analogs, horizontal helix additionally undergoes significant in- and out-of-plane movements, what causes larger bending between the vertical and horizontal fragments. This strongly affects flexibility and relative displacement of D and T loops. Moreover, in the case of the tRNAHis analog, there is a clearly visible tendency for opening and closing movement of the D–T kissing complex by displacement mainly of the D loop. It is important to note, these large movements are facilitated by the short flexible undefined regions acting as molecular hinges, one between the amino acyl and D stems, and the other between the anticodon and T stems. Similar dynamic was observed and described by Zhang et al.5 for unmodified yeast tRNAs (tRNAPhe, tRNAAsp, and tRNAiMet). In- and out-of-plane motions of the two arms of the L-shaped tertiary structure of tRNA, bending motion as well as compression in a spring-fashion were found the most predominant. Furthermore, the authors did not observe other major differences in the correlated motions in the majority of the studied trajectories.

Insight from INF and CAD-score

To gain more details from the analysis, we also applied the Interaction Network Fidelity (INF) that provides a quality measure during modeling of RNA 3D architectures.57 It was demonstrated that in the case of structural fidelity comparison of the INF values for RNA better describes any deviation from the model.58 Here, we compared results of the MD simulation to the initially generated structure using RMSD, INF and deformation index (DI). The DI values are derived from RMSD so the traces are very similar among all tRNA analogs (Fig. 5). However, the INF values exhibit small deviations and may indicate minor perturbations in the structures resulted from the simulations. The differences between fragments along the simulation are better pronounced in a graphical view of the table values (Fig. 6 and Table S3). The deviations in RMSD/DI and INF do not always overlap. For instance, in the case of tRNAHis the structure changes significantly throughout the simulation, mostly due to the D–T loops disconnection, but changes captured by the INF index are less significant (Fig. 5). Majority of commonly used methods are based on the structural change between positions of atoms or/and RNA fragments. In order to ascertain the relation between neighboring base-pairs we used CAD-score. Clearly, fluctuations in helical regions are smaller than for the D–T and anticodon loops (Fig. 7 and S6a–d). The smallest changes are for tRNAAla and tRNAPhe, in contrast to tRNAGly and tRNAHis. Using the blue-white-red color gradient for representing the conservation of the reproduced contacts in the structures resulted from the presented simulations we can assess our predicted models. tRNAAla seems to be pretty accurate with only two regions (D and T loops) where the contacts were not predicted correctly and did not differ significantly from its reference structure. The predicted model of tRNAGly is similar with the accuracy to the tRNAAla analog. However, the non-stacking base pair contacts are a bit worse. Analyzing the tRNAHis fate from the A–A and S–S CAD-score perspective one may get an impression that only one region is largely affected (D and T loops). However, the analysis of S–S stacking and non-stacking reveals that other regions, like in tRNAAla and tRNAGly, also undergo significant changes (variable loop, central motif between anticodon stem and D–T loops, and anticodon loop). In the case of tRNAPhe the spotted errors seem to be dispersed throughout the structure although overall similarity is quite satisfying, even during longer simulation (Fig. S9).

Based on the overall comparison of different evaluation methods (RMSD, INF, DI, and CAD-score) the highest CAD-score values obtained for tRNAAla correlate nicely with the low value of RMSD and high value of INF. All these three methods predict structures of similar characteristic features. Analysis of simulated structure of tRNAPhe resulted in similar picture with slightly lower CAD-score values. Surprisingly, the structures with the highest INF indices (Gly and His analogs) exhibit the lower CAD-scores for S–S contacts. This observation agrees well with the higher values for RMSD instead.

Analysis of the D–T and the anticodon loops

As the largest changes occurring during MD simulation were in the D–T and anticodon loops, we analyzed the structural changes and their relative position in these regions. Starting with the tRNAAla analog, we took structures with the lowest energies, pulled the loops out and aligned them preserving the distance. The most visible change was disruption of two wobble loop–loop interactions (U16–U17 with G57–G56) induced by the G56 movement outside the bounding plane and the U17 interdigitation between G57 and G56 (ESI, Fig. S7a). Guanine G56 swayed around the backbone along the simulation time, however, the whole structure of D–T interaction was stabilized toward the end of the simulation (70–100 ns). The overall D–T loops structures and relative positions remained quite intact in comparison with other analyzed tRNA analogs. For tRNAGly, the initial assessments (based on RNAComposer) created an interesting set of interactions (between A17, U18, and A56, C57, A58), which instantly broke and changed the entire conformation of D–T loops (Fig. S7b). However, it was surprising to see that with limited number (1–2) of hydrogen bonds, both loops remained in close proximity (data not shown). The A16 and A55 both swung around the backbone during the simulation, but in the end they reached their stable arrangement. There was a visible trend toward enhancement of stacking interactions via nucleobases interdigitation; however, loops were slowly restoring initial structures (Fig. S7b). In the case of tRNAHis simulation the D and T-loops were completely separated. It was difficult to define any structural interaction between loops during MD and from the 79 ns of the simulation they were disconnected. In this particular case it was important to continue the MD calculation for the total of 500 ns and indeed the D–T interaction can open and close during the simulation (Fig. S9). As for the tRNAPhe analog, the distance between loops seems intact, but the nucleobases relative positions changed (Fig. S7d). For instance, C20–U21 from D-loop were embedded inside the T-loop preventing them from any further interaction. Similarly, the C56 was turned towards inside of the T-loop. Throughout the entire simulation, the C17–U18, A55 and U58 were engaged into stacking contributing into stability of interaction.

The anticodon loop flexibility analysis was performed in the same manner. The initial structure was aligned with the ones corresponding to the lowest energies (ESI, Fig. S8a–d). It should be pointed out that the 3 nt anticodon sequence is unique for each tRNA that is coding for the particular amino-acid. The anticodon recognizes complementary mRNA by 3 nt coding sequence, however entire loop is a 7 nt structure and as such allows some flexibility. During the MD simulation in all four tRNA analogs, the anticodon loops were visibly more stable than D–T region and their structure did not change that significantly. Within the tRNAAla analog's anticodon loop the wobble U34 swung back and forth from its initial position. This is important, in the view of the specific recognition for this commonly modified position.59 The U33 changed its conformation from syn to anti during the simulation. The stacking interaction between C36, G37, U38 and U39 remained intact (Fig. S8a). tRNAGly anticodon loop had an unique fold, that during the simulation was drifting toward the classical two-stack shape. Initially tRNAGly analog folded with U34–U35 out of the diagonal plane and C36–C37 on the other side (Fig. S8b, time 0 ns). U33 and C36 seemed to sway the most from their initial position. However, the greatest deviation from initial structure was noted for tRNAHis, where the backbone unfolded to semicircle at 57 ns, but the initial structure was restored at 63 ns (Fig. S8c). The U33 nucleobase was the most flexible but also returned to the original position at the end of the MD simulation. As expected, the tRNAPhe analog behaved similarly to tRNAAla, the second stacking interaction was stable during MD. The flexible U33 moved initially out of the diagonal, but finally returned to its initial position (Fig. S8d).

Conclusions

Transfer RNA is one of the most important RNA molecules existing in every living form, yet its structure is not fully understood. With limited number of tRNA crystal structures we aimed to investigate four specific tRNA analogs constructed using automated tools for predicting their tertiary structures and atomistic explicit solvent (water) MD simulations to sample the time evolution of the resulted 3D structures. Results of 100 ns simulation for each tRNA analog were analyzed with the standard and local variant of RMSD, DI, INF, and CAD-score evaluation methods, as well as with PCA. All of them indicate the two most compact analogs tRNAAla and tRNAPhe undergo limited structure dynamics, mostly in the D–T loops regions, while tRNAGly changes the arrangement of its anticodon loop. The tRNAHis analog is the most flexible and changes its structure significantly by breaking the D–T interactions. The most dramatic changes in its structure happened with the first 100 ns of the overall 0.5 μs simulation, which might indicate that different analogs present different stability and rigidity, and each of them should be treated individually. This may further confirm that the structure of tRNA is very dynamic, can change (adjust) the conformation probably depending of the interacting partners. This can occur whether tRNA is processed by aminoacyl tRNA synthetase or forms a complex with tRNA elongation factor or finally with ribosome, but this has to be further investigated.

In the presented study we are also proposing the methodology for structural analysis, based on the primary sequence, that does not require 3D coordinates. Molecular dynamics trajectories analyzed by RMSD or DI, INF, CAD-score, and PCA show some differences, but overall conclusions they provide are similar, however different scales (global vs. local) of analysis, hence different information can be obtained. RMSD gives very little information about local differences. More importantly, its value averages errors over the whole molecule to get global superimposition. In order to overcome these limitations, as shown in the presented study, local RMSD changes along the simulation can be studied. Colored 2D maps allow for visualization of regions that undergo the largest local structural changes with respect to reference structure in a convenient manner. This approach gives also a possibility to follow these structural changes in a function of time, which preserves information how these changes develop in a course of simulation. This information is lost when the overall RMSD values are computed due to the averaging over trajectory snapshots. While RMSD analysis related to particular structural motif in analyzed tRNAs describes quantitatively how the whole structural fragment is conserved or rearranged during the simulation, local RMSD analysis can pick up short nucleotide regions that may allow for movement of larger, but intact fragments, what in turn can cause significant increase in overall RMSD values. Even though the other tested approaches, namely DI, which is based on INF and RMSD, and CAD-score extend structural comparative analysis also by considering non-covalently bound neighbors explicitly or implicitly, they all lack information on the nature of indicated changes. These different evaluation methods are able to pinpoint where, and possibly when structural changes occur. For extracting information on dynamic characteristics of molecular simulation, PCA can step in as an helpful method since it extracts and ranks molecular modes that are statistically most important, separating these modes from simulation noise. One has to be careful though, as on the other hand it may hide smaller atomic but possibly important changes when the whole structure undergoes larger-scale movements. Therefore, combination of automated evaluation procedures as well as careful visual inspection of structure is necessary to provide as detailed information as possible about the system of interest.

This study provides also the library of RNA 3D structures, which can be further used to screen the RNA-protein complexes. We also believe, that the presented approach could be applied to other tRNA molecules or RNA structures in general.

Acknowledgements

We would like to thank Pawel Adamczyk for setting up GPUs in the laboratory at the Lodz University of Technology. This work was supported by Dz. St. 2016 (for A. D.-D.) and National Science Center in Poland (2011/01/B/NZ3/02090 for A. C.).

References

  1. D. L. Nelson and M. M. Cox, Lehninger Principles of Biochemistry, W. H. Freeman, 5th edn, 2008 Search PubMed.
  2. H. Lodish, A. Berk, S. L. Zipursky, P. Matsudaira, D. Baltimore and J. Darnell, Molecular Cell Biology, W. H. Freeman, 4th edn, 2000 Search PubMed.
  3. The RNA Modification Database: Modifications, http://mods.rna.albany.edu/mods/ Search PubMed.
  4. Modomics–Modified bases, http://modomics.genesilico.pl/modifications/ Search PubMed.
  5. X. Zhang, R. C. Walker, E. M. Phizicky and D. H. Mathews, J. Chem. Theory Comput., 2014, 10, 3473 CrossRef CAS PubMed.
  6. P. J. Beuning and K. Musier-Forsyth, Biopolymers, 1999, 52, 1 CrossRef CAS PubMed.
  7. A. G. Torres, E. Batlle and L. Ribas de Pouplana, Trends Mol. Med., 2014, 20, 306 CrossRef CAS PubMed.
  8. J. A. Abbott, C. S. Francklyn and S. M. Robey-Bond, RNA, 2014, 5, 158 Search PubMed.
  9. T. Ogawa, A. Shimizu, K. Takahashi, M. Hidaka and H. Masaki, Biochem. Biophys. Res. Commun., 2014, 451, 131 CrossRef CAS PubMed.
  10. B. Nawrot, E. Sochacka and M. Düchler, Cell. Mol. Life Sci., 2011, 68, 4023 CrossRef CAS PubMed.
  11. P. Anderson and P. Ivanov, FEBS Lett., 2014, 588, 4297 CrossRef CAS PubMed.
  12. Y. Mei, J. Yong, A. Stonestrom and X. Yang, Cell Cycle, 2010, 9, 2936 CrossRef CAS PubMed.
  13. Y. Mei, J. Yong, H. Liu, Y. Shi, J. Meinkoth, G. Dreyfuss and X. Yang, Mol. Cell, 2010, 37, 668 CrossRef CAS PubMed.
  14. Y.-M. Hou and X. Yang, Antioxid. Redox Signaling, 2013, 19, 583 CrossRef CAS PubMed.
  15. Y. Zhou, J. M. Goodenbour, L. A. Godley, A. Wickrema and T. Pan, Biochem. Biophys. Res. Commun., 2009, 385, 160 CrossRef CAS PubMed.
  16. M. Pavon-Eternod, S. Gomes, R. Geslain, Q. Dai, M. R. Rosner and T. Pan, Nucleic Acids Res., 2009, 37, 7268 CrossRef CAS PubMed.
  17. R. J. White, Oncogene, 2004, 23, 3208 CrossRef CAS PubMed.
  18. S. Kirchner and Z. Ignatova, Nat. Rev. Genet., 2015, 16, 98 CrossRef CAS PubMed.
  19. T. Suryanarayana, J. K. Uppala and U. K. Garapati, Mol. Biol. Rep., 2012, 39, 9187 CrossRef CAS PubMed.
  20. K. A. Dittmar, J. M. Goodenbour and T. Pan, PLoS Genet., 2006, 2, e221 Search PubMed.
  21. B. E. Christian and L. L. Spremulli, Biochim. Biophys. Acta, Gene Regul. Mech., 2012, 1819, 1035 CrossRef CAS PubMed.
  22. M. Krepl, M. Havrila, P. Stadlbauer, P. Banas, M. Otyepka, J. Pasulka, R. Stefl and J. Sponer, J. Chem. Theory Comput., 2015, 11, 1220 CrossRef CAS PubMed.
  23. Compilation of mammalian mitochondrial tRNA genes, http://mamit-trna.u-strasbg.fr/ Search PubMed.
  24. M. Zuker, Nucleic Acids Res., 2003, 31, 3406 CrossRef CAS PubMed.
  25. M. Helm, H. Brulé, D. Friede, R. Giegé, D. Pütz and C. Florentz, RNA, 2000, 6, 1356 CrossRef CAS PubMed.
  26. M. Popenda, M. Szachniuk, M. Antczak, K. J. Purzycka, P. Lukasiak, N. Bartol, J. Blazewicz and R. W. Adamiak, Nucleic Acids Res., 2012, 40, e112 CrossRef CAS PubMed.
  27. RNAComposer, http://rnacomposer.cs.put.poznan.pl/ Search PubMed.
  28. D. A. Case, T. A. Dareden, T. E. Cheatham III, C. L. Simmerling, J. Wang, R. E. Duke, R. Luo, R. C. Walker, W. Zhang, et al., AMBER 12, University of California, San Francisco, 2012 Search PubMed.
  29. W. L. Jorgensen and J. D. Madura, J. Am. Chem. Soc., 1983, 105, 1407 CrossRef CAS.
  30. L. Jaeger, Curr. Opin. Struct. Biol., 1997, 7, 324 CrossRef CAS PubMed.
  31. J. H. Cate, A. R. Gooding, E. Podell, K. Zhou, B. L. Golden, C. E. Kundrot, T. R. Cech and J. A. Doudna, Science, 1996, 273, 1678 CrossRef CAS PubMed.
  32. K. Juneau, E. Podell, D. J. Harrington and T. R. Cech, Structure, 2001, 9, 221 CrossRef CAS PubMed.
  33. J. C. Bowman, T. K. Lenz, N. V. Hud and L. D. Williams, Curr. Opin. Struct. Biol., 2012, 22, 262 CrossRef CAS PubMed.
  34. A. Lahiri and L. Nilsson, Biophys. J., 2000, 79, 2276 CrossRef CAS PubMed.
  35. P. Banáš, D. Hollas, M. Zgarbová, P. Jurečka, M. Orozco, T. E. Cheatham, J. Šponer and M. Otyepka, J. Chem. Theory Comput., 2010, 6, 3836 CrossRef.
  36. W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, D. M. Ferguson, D. C. Spellmeyer, T. Fox, J. W. Caldwell and P. A. Kollman, J. Am. Chem. Soc., 1995, 117, 5179 CrossRef CAS.
  37. J. Wang, P. Cieplak and P. A. Kollman, J. Comput. Chem., 2000, 21, 1049 CrossRef CAS.
  38. A. Pérez, I. Marchán, D. Svozil, J. Sponer, T. E. Cheatham, C. A. Laughton and M. Orozco, Biophys. J., 2007, 92, 3817 CrossRef PubMed.
  39. M. Zgarbová, M. Otyepka, J. Šponer, A. Mládek, P. Banáš, T. E. Cheatham and P. Jurečka, J. Chem. Theory Comput., 2011, 7, 2886 CrossRef PubMed.
  40. R. Salomon-Ferrer, A. W. Götz, D. Poole, S. Le Grand and R. C. Walker, J. Chem. Theory Comput., 2013, 9, 3878 CrossRef CAS PubMed.
  41. S. Le Grand, A. W. Götz and R. C. Walker, Comput. Phys. Commun., 2013, 184, 374 CrossRef CAS.
  42. R. J. Loncharich, B. R. Brooks and R. W. Pastor, Biopolymers, 1992, 32, 523 CrossRef CAS PubMed.
  43. J.-P. Ryckaert, G. Ciccotti and H. J. C. Berendsen, J. Comput. Phys., 1977, 23, 327 CrossRef CAS.
  44. T. Darden, D. York and L. Pedersen, J. Chem. Phys., 1993, 98, 10089 CrossRef CAS.
  45. D. R. Roe and T. E. Cheatham, J. Chem. Theory Comput., 2013, 9, 3084 CrossRef CAS PubMed.
  46. W. Humphrey, A. Dalke and K. Schulten, J. Mol. Graphics, 1996, 14, 33 CrossRef CAS PubMed.
  47. M. Parisien, J. A. Cruz, E. Westhof and F. Major, RNA, 2009, 15, 1875 CrossRef CAS PubMed.
  48. P. Gendron, S. Lemieux and F. Major, J. Mol. Biol., 2001, 308, 919 CrossRef CAS PubMed.
  49. S. Lemieux and F. Major, Nucleic Acids Res., 2002, 30, 4250 CrossRef CAS PubMed.
  50. K. Olechnovič and C. Venclovas, Nucleic Acids Res., 2014, 42, W259 CrossRef PubMed.
  51. J. Edward Jackson, A User's Guide to Principal Components, 2004 Search PubMed.
  52. A. Amadei, A. B. M. Linssen and H. J. C. Berendsen, Proteins: Struct., Funct., Bioinf., 1993, 17, 412 CrossRef CAS PubMed.
  53. A. E. García, Phys. Rev. Lett., 1992, 68, 2696 CrossRef PubMed.
  54. R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Austria, Vienna, 2015 Search PubMed.
  55. B. J. Grant, A. P. C. Rodrigues, K. M. ElSawy, J. A. McCammon and L. S. D. Caves, Bioinformatics, 2006, 22, 2695 CrossRef CAS PubMed.
  56. L. Skjærven, X.-Q. Yao, G. Scarabelli and B. J. Grant, BMC Bioinf., 2014, 15, 399 CrossRef PubMed.
  57. E. Westhof, B. Masquida and F. Jossinet, Cold Spring Harbor Perspect. Biol., 2011, 3, a003632 CrossRef PubMed.
  58. Z. Miao, R. W. Adamiak, M.-F. Blanchet, M. Boniecki, J. M. Bujnicki, S.-J. Chen, C. Cheng, G. Chojnowski, F.-C. Chou, P. Cordero, J. A. Cruz, A. R. Ferré-D'Amaré, R. Das, F. Ding, N. V. Dokholyan, S. Dunin-Horkawicz, W. Kladwang, A. Krokhotin, G. Lach, M. Magnus, F. Major, T. H. Mann, B. Masquida, D. Matelska, M. Meyer, A. Peselis, M. Popenda, K. J. Purzycka, A. Serganov, J. Stasiewicz, M. Szachniuk, A. Tandon, S. Tian, J. Wang, Y. Xiao, X. Xu, J. Zhang, P. Zhao, T. Zok and E. Westhof, RNA, 2015, 21, 1066 CrossRef CAS PubMed.
  59. A. Rozov, N. Demeshkina, I. Khusainov, E. Westhof, M. Yusupov and G. Yusupova, Nat. Commun., 2016, 7, 10457 CrossRef CAS PubMed.

Footnote

Electronic supplementary information (ESI) available: Additional material includes graphic representation of tRNA structures, total energy as function of time, average backbone heavy atom and all atoms RMSD for tRNA analogs and backbone heavy atom and all atoms RMSD for anticodon loops, table of sequences and secondary structures interactions for tRNA analogs, local RMSD color-maps, table of RMSD, INF and DI values for selected conformations, contact maps of A–A contacts, local differences between the reference structure and final conformation, the 3D models of D–T loop–loop interaction, the 3D models of anticodon loops, movies representing the most important movements of each tRNA molecule resulted from PCA, PCA plots for all trajectories, PDB files containing starting and end coordinates of the studied analogs. See DOI: 10.1039/c6ra14933b

This journal is © The Royal Society of Chemistry 2016
Click here to see how this site uses Cookies. View our privacy policy here.