Molecular insights into the improved clinical performance of PEGylated interferon therapeutics: a molecular dynamics perspective

PEGylation is a widely adopted process to covalently attach a polyethylene glycol (PEG) polymer to a protein drug for the purpose of optimizing drug clinical performance. While the outcomes of PEGylation in imparting pharmacological advantages have been examined through experimental studies, the underlying molecular mechanisms remain poorly understood. Using interferon (IFN) as a representative model system, we carried out comparative molecular dynamics (MD) simulations of free PEGx, apo-IFN, and PEGx-IFN (x = 50, 100, 200, 300) to characterize the molecular-level changes in IFN introduced by PEGylation. The simulations yielded molecular evidence directly linked to the improved protein stability, bioavailability, retention time, as well as the decrease in protein bioactivity with PEG conjugates. Our results indicate that there is a tradeoff between the benefits and costs of PEGylation. The optimal PEG chain length used in PEGylation needs to strike a good balance among the competing factors and maximizes the overall therapeutic efficacy of the protein drug. We anticipate the study will have a broad implication for protein drug design and development, and provide a unique computational approach in the context of optimizing PEGylated protein drug conjugates.


Introduction
Recent advances in recombinant DNA technology have led to a signicant increase of protein therapeutics development and production. Interferons (IFNs) are a representative class of immunomodulatory protein drugs. IFNs are an evolutionarily conserved family of signaling cytokines that participate as extracellular messengers in a wide range of host responses. 1 IFNs are 166-residue proteins comprised of 5 alpha helices and weighs about 20 kDa. The secondary structure of the protein and the key residues Arg 27, Arg 35, and Lys 123 known to be crucial to IFN antiviral activity are shown in Fig. 1. 2 Binding of IFNs to specic cell surface receptors triggers an intracellular signaling cascade resulting in the synthesis of proteins that mediate immunomodulatory responses to maintain homoeostasis and host defense. 3 IFNs are the rst-line treatment of multiple sclerosis 4 and also have been shown to possess antiviral activity 5,6 and anti-proliferative effects on tumor cells. 7 IFNs and many other protein therapeutics suffer from bioavailability issues such as immunogenicity, protein aggregation, rapid renal clearance, and instability, leading to a shortened serum half-life and reduced therapeutic efficacy. As a results, frequent parenteral administrations are necessary, which signicantly increase medication costs and skin adverse reactions in patients. Immunogenic effects are observed when a self-antigen is offered to the immune system in the same manner over an extended period of time. The immune system's tolerance for the protein deteriorates, and the protein is targeted for elimination. 8 Aggregation is commonplace in protein drugs as a result of insolubility. Aggregation is an entropically favorable process that minimizes the exposure of protein hydrophobic surface residues by driving the hydrophobic residues of one protein in contact with the hydrophobic residues of another protein. Protein instability can be attributed to the same entropic effects because proteins have a tendency to fold into conformations in which the number of solvent-exposed surface residues is minimized. 9 Rapid renal clearance is the result of a combination of IFN's small size and its identication as a xenobiotic.
PEGylation, a process that covalently attaches polyethylene glycol (PEG) polymer chains to macromolecules, is a widely adopted strategy to alter the pharmacokinetic and pharmacodynamic properties of protein therapeutics such as IFNs. The benets of PEGylation can be achieved by a number of mechanisms, including increased protein stability, solubility, and bioavailability, reduced renal clearance, degradation by proteolysis, and antibody-mediated clearance. [10][11][12][13][14] In addition, the hydrophobic nature of IFNs makes them prone to protein aggregation in vivo, an issue that can be effectively addressed by PEGylation. 15 Nevertheless, an important requirement is that PEGylation must enhance the pharmacokinetic properties of the protein without signicantly affecting its therapeutic activity.
Non-PEGylated IFNs have been experimentally characterized using Fourier transform infrared spectroscopy, differential scanning calorimetry, and hydrogen-deuterium exchange mass spectroscopy. 16,17 There has been only a limited number of simulations studies that involved PEGylated biosystems. [18][19][20][21][22][23] The molecular mechanisms underlying the effects of PEGylation on protein therapeutics, however, are still poorly understood.
Here we present a comparative molecular dynamics (MD) study that specically examines the molecular mechanisms of how PEGylation affects IFN molecular structures and therapeutic properties with respect to different PEG chain lengths. We further analyze the molecular interactions between PEG and IFN, and reveal their implications to the benets and costs of PEGylation. Our goal of this study is to provide atomic insights into the effects of PEGylation on IFN with the expectation that they will be used to better understand the mechanisms of PEGylation in a broader range of protein therapeutics.

Site of PEGylation
PEG is not reactive to covalently bond to a protein by itself, so a linker molecule must be used. Studies have shown that reductive amination by the PEG-propionaldehyde linker, a widely adopted PEGylation process, selectively PEGylates the primary amine of protein N-terminus. 2 While there are other potential PEGylation sites, they are less common. Therefore, we focused on PEGylations at the protein N-terminus in this study.

MD simulation validation
A small number of simulation studies have been published for various PEGylated biosystems. [18][19][20][21][22][23] These earlier studies, however, did not parametrize the PEGs in a systematic fashion, nor did they validate these PEG models and MD simulations against experimental data. Here we used the persistence length (l), a fundamental mechanical property quantifying the stiffness of a polymer 24 and one of the most important metrics, to evaluate the quality of our MD simulations involving PEGs. To check if the MD simulations were capable of reproducing the experimentally observed mechanical property for the PEGs, the mean end-to-end distances h 2 were calculated from the MD simulations (see Materials and methods section). Nonlinear tting of eqn (1) yielded the persistence length l ¼ 3.67 AE 0.14 A, consistent with the experimental measurements obtained from atomic force microscopy 25 and other techniques 26 ( Table  1). The excellent agreement between the calculated and experimental data supports that the MD simulations based on our AM1-BCC partial charge model accurately reproduced the correct characteristics and behavior of free and conjugated PEGs.

Effects of PEGylation on IFN stability
The root mean square deviation (RMSD) of Ca atoms of the apo-IFN and PEGx-IFNs (protein only) as a function of the simulation time is shown in Fig. 2A. The IFN crystal structure was used as a reference structure for superimposing the MD trajectories in the RMSD calculations. The RMSD of apo-IFN uctuates around 0.45 nm whereas the RMSDs for the PEGx-IFNs are comparatively smaller, indicating a stabilizing effect of the attached PEG chain on the IFN conformation. IFN protein RMSD was only slightly increased with the increase of PEG chain length. In Fig. 2B, the RMSD of the PEGx-IFN (protein and PEG) showed relatively large uctuations compared to the Table 1 PEG persistence length (l) Experimental 25 3.80 AE 0.02 A Experimental 26 3.73 AE 0.29 A Calculated (this work) 3.67 AE 0.14 A protein-only RMSDs in Fig. 2A due to the movement of the exible PEG chains included in the calculations. However, the uctuations quickly plateaued aer 5 ns into the simulations. The results showed it only took approximately 5 to 15 ns for the attached PEG chain to fold into low-energy congurations on the IFN surface.
The change of radii of gyration (R g ) over time presented a similar picture. Fig. 2C shows that the IFNs in PEGx-IFNs became more compact than the apo-IFN. Fig. 2D shows that the overall size of the PEGx-IFN conjugated systems (protein and PEG) stabilized as the PEG chains adopted low-energy congurations on the IFN surface. Similar to the RMSD data, the  This journal is © The Royal Society of Chemistry 2018 longer the PEG chain length, the longer the time required to fold into low-energy congurations. The nal overall size of the PEGx-IFN conjugated systems (protein and PEG) is only marginally larger than the size of the apo protein. The mean R g increased from 1.7 nm in apo-IFN to 1.8-1.9 nm in PEG200-IFN and PEG300-IFN. The nal overall sizes of PEG100-IFN and PEG50-IFN were almost identical to that of the apo IFN, indicating tight wrapping of the PEG chains on the IFN surface.
It is worth noting that the overall RMSD and R g data ( Fig. 2B  and D) showed that all the simulations had indeed converged at 25 ns. The small uctuations seen in the apo-INF protein ( Fig. 2A and C) corroborates our aforementioned perspective that PEGylation stabilizes INF by holding the protein tightly together. Without PEGylation, INF would be more conformationally exible due to the lack of b sheets.
The secondary structure of IFN remained unchanged during the simulations for all PEGx-IFN conjugated systems. Fig. 3 shows that the 5 a-helices that make up the IFN structure were clearly visible in all simulations. The results conrmed that the covalently bound PEG chains did not affect the native structure of IFN, which is important for IFN to maintain its biological functions. There were occasional transitions between coils, bends, and turns, but they were all transient. Overall, the number of structural transitions were fewer in the conjugated PEGx-IFNs than in the apo-IFN, suggesting a stabilizing effect of the PEG chain on the IFN secondary structure. The residue root mean square uctuations (RMSF) shows the locations and scales of the secondary structure transitions (Fig. 4). Larger uctuations were found near residues 40, 50, 80, and 120, corresponding to the white and yellow bands (loop/coil/turn segments) found in Fig. 3. All of the PEGx-IFN conjugated systems appeared to have smaller magnitude of uctuations than the apo-IFN, again, indicating a stabilizing effect on the IFN structure.

Effects of PEGylation on IFN bioavailability
Serum half-life or retention time is a critical pharmacokinetic property for protein drugs. It is a measure of how long a drug stays in the circulation before being cleared by the system. Since IFN has poor solubility due to a large number of hydrophobic residues on the protein surface, non-PEGylated IFNs are quickly ltered out from the plasma. PEGs are highly soluble because of its ability to form a large number of H-bonds with water. The attached PEG chains increase IFN solubility by shielding the hydrophobic residues on the protein surface and at the same time forming H-bonds with surrounding water molecules. Fig. 5A shows that the number of IFN intramolecular H-bonds was not signicantly altered by the PEGs whereas the number of IFN-PEG intermolecular H-bonds increased substantially, contributing to the shielding effect on the IFN hydrophobic residues.
The effect of PEGylation on IFN solubility can be estimated using solvent accessible surface area (SASA). Fig. 5B shows that as the PEG chain length increases, a large amount of hydrophilic SASA was generated in the PEGx-IFN conjugated system while the hydrophobic SASA of IFN was substantially reduced. This drastic modication of SASA and protein hydrophilicity allowed more H-bonds to be formed between the PEGylated IFN and surrounding water molecules, contributing to the increase of IFN solubility. Furthermore, this effect also prevents IFN aggregation. Hydrophobic proteins such as IFNs have a strong tendency to clump together and minimize their contact with the  aqueous environment due to the favorable entropy-driven hydrophobic interactions. Shielding the hydrophobic regions of IFN with hydrophilic PEGs reduces the contact between the hydrophobic regions and the chances of establishing hydrophobic interactions, thus effectively resolving the protein aggregation issue.
Another contributing factor is neutralizing antibody (NAb) immunoglobulin G3 binding to IFN, which demolishes IFN bioactivity. 27 PEF shielding obscures the IFN epitope (residues 1-12 and residues 151-162) recognized by NAbs, effectively blocking NAb binding and contributing to the increased IFN bioavailability and serum half-life. 28

Effects of PEGylation on IFN bioactivity
Fig. 6A-D illustrate the atomic details of the IFN-PEG interactions. A 4 A cut-off was used to determine the contact between IFN residues and the PEG chains. The contact frequency was calculated as the average number of contacts between each IFN residue and the PEG chain over the course of the MD simulations. We observed that the PEG chain length increased a larger IFN surface was covered and shielded. Table 2 shows that the residues known to be crucial for antiviral activity, Arg 27, Arg 35, and Lys 123, were not affected in PEG50-IFN and PEG100-IFN. However, Arg 27 was signicantly affected in PEG200-IFN and PEG300-IFN. Arg 35 was moderately affected in PEG300-FN. Lys 123 was not affected in PEG200-IFN and PEG300-IFN because of its location at the opposite side of the protein surface. Our data revealed that the benets of PEGylation are not without limit, and that the intended therapeutic functions of a protein drug could be compromised beyond a certain PEG chain length because of the shielding effect of key functional residues. Indeed, studies have reported that longer PEG chains increase the bioavailability of IFN, but not the therapeutic efficacy of the protein. 15 In this case, the retention time of a PEGylated IFN system with a 12 kDa PEG chain (equivalent to PEG300-IFN in this study) has been determined to be 29.5 to 34 hours, a 2-3 fold increase over non-PEGylated IFNs. 29 However, its antiviral activity has been reduced to 66%. 15 The pharmacokinetic data is consistent with our molecular-level observations in Fig. 6D. In PEG300-IFN, Arg 27 was completely blocked; Arg 35 was partially blocked; and Lys 123 was not blocked at all, allowing the protein to retain a certain level of antiviral properties. Studies have also shown that the antiviral activity of a 20 kDa PEG chain (equivalent to PEG500-IFN) drops to 55% with 4-6 fold increase of retention time over non-PEGylated IFNs. When the PEG chain length is below 200, IFN antiviral activity remains at 100%. Its retention time, however, is merely 10 to 15 hours. 15 Taken together, the results suggest that there is a tradeoff between increasing retention time and decreasing therapeutic activity. It appears that the optimal PEG chain length with regards to IFN antiviral activity lies between PEG300 and PEG500 (12-20 kDa). Other protein therapeutics have the same issue. It has been reported that PEG500 (20 kDa) is the optimal PEG chain length for the PEGylated human granulocyte colony stimulating factor (GCSF), a protein drug of similar size to IFN, to achieve longer circulation time while retaining bioactivity. 30

IFN protein
IFN b-1a (referred as IFN) was used as the representative model system in this study. The crystal structure of IFN was obtained from the RCSB Protein Data Bank (PDB ID: 1AU1). 31

Building PEG
We focused on linear PEG polymers in this study. The PRODRG online server was used to produce a set of 10 mol2 les for each PEG molecule of length ranging from PEG10 to PEG30. 32 The AM1-BCC atomic partial charges 33 for each set of 10 structurally identical PEG molecules were assigned and averaged with antechamber in the AMBER 14 package. 34 The observed changes of partial charges from short to long PEG chains were used to construct a systematic and self-consistent charge model for assigning atomic partial charges for PEG chains of any length. PEG and propylamine fragments were added to the organic_bld le of Maestro academic version. 35 PEG-propylamine molecules containing 50, 100, 200 and 300 PEG units were constructed in Maestro.

Linking PEG to IFN
VMD RMSD calculator extension was used to align PEG-propylamines and the IFN N-terminal Met 1 residue on the atoms of the amine group while IFN coordinates were xed. 36 The structures were visually inspected to ensure that there were no atomic clashes. Mol2 les containing the new coordinates of the aligned PEG-propylamines were created. VMD molefacture extension was used to remove the amine group of PEG-propylamines, along with one of the amine H atoms in the Met 1 residue. AMBER LEaP was used to create a bond between the N atom of Met 1 and the C1 atom of the PEG-propylamines. 34 A frcmod le was also created to provide LEaP with the information necessary to recognize the nonstandard residue "PEG" and dene the proper dihedral angles and torsions. AMBER LEaP was then used to create the initial les (inpcrd, prmtop, and PDB) for the MD simulations.

MD simulations
MD simulations were carried out using AMBER PMEMD 37 on free PEGx, apo-IFN, and PEGx-IFN (x ¼ 50, 100, 200, 300) parametrized with the AMBER ff99SB force eld 34 and the custom PEG charge model. Each system was solvated in a TIP3P water box, leaving 10 A between the solute surface and the box boundary. All crystallographically resolved water molecules were retained. The IFN systems had a net charge of +9, which were neutralized by adding 9 Cl À counterions. An ionic concentration of 0.15 M NaCl was introduced to mimic experimental conditions. Protonation state of the protein residues was determined at pH 7 for the AMBER ff99SB force eld by the PDB2PQR web server at http://nbcr-222.ucsd.edu. Simulated annealing procedures were used to effectively optimize the PEGs. With IFN backbone restrained, the systems were gradually heated from 0 K to 500 K at 1 K incremental for every 2 ps, and equilibrated for 200 ps at 300 K, 400 K and 500 K. The systems were then gradually cooled back to 300 K using the same procedure. 25 ns production runs were performed aer the simulated annealing process. A time step of 2 fs was used in all simulations. Bonds to hydrogen atoms were constrained using the SHAKE algorithm with a relative geometrical tolerance of 10 À6 . A cutoff of 14 A was applied to the non-bonded interactions. The output frequency of trajectory les was every 1000 steps (2 ps). The isothermal-isobaric (NPT) ensemble was applied in all simulations with temperature regulation (thermostat) achieved by Langevin dynamics (collision frequency of g ¼ 1.0 ps À1 ) and constant pressure regulated by Berendsen barostat.
To ensure reproducibility, multiple MD simulations were performed for each system. No statistically signicant differences were observed among these independent simulations. The MD trajectory of a single representative simulation for each system was analyzed using DSSP, 38 VMD, 36 Wordom, 39 Gromacs, 40 and in-house scripts. Hydrogen bonds were determined by a donor-acceptor distance of 3.0 A and a 120-degree angle cutoff. A 4 A cutoff was used to determine the contact between IFN residues and the PEG chains.

PEG persistence length
To validate our charge model used to parametrize the PEGs and that our MD simulations reect contact the dynamics of PEGs in physiological conditions, we calculated the persistence length l of the free PEGs from the MD trajectories using the worm-like chain model: 24 where h is the mean-squared end-to-end distance of the PEG series, L is the length of the fully extended PEGs; l was calculated by a nonlinear least-squares t, standard errors of l were estimated by statistical errors of random samples of hh 2 i generated from the MD trajectories.

Conclusions
Using comparative molecular dynamics simulations, we have provided atomic insights into the improved clinical performance of PEGylated IFNs. We demonstrated the molecular-level evidence related to three areas of IFN clinical performance: (1) PEGylation improves IFN stability by stabilizing protein overall conformation and secondary structures, and restricting residue exibility without increasing the overall size of the protein drug.

Conflicts of interest
The authors declare no conict of interest.