Edward Peter Booker and Ghassan E. Jabbour*
Department of Electrical Engineering and Computer Science, University of Ottawa, Canada. E-mail: gjabbour@uottawa.ca
First published on 1st July 2021
To help contain the spread of the COVID-19 pandemic and to protect front-line workers, new antiviral measures are required. Antiviral nanoparticles are one such possible measure. Metal nanoparticles made from a variety of metals including gold, silver, and copper can kill or disable viruses that cause significant health problems in humans (such as SARS-CoV-2, HIV, or influenza). To promote interaction between nanoparticles and viruses, the stabilizing ligands on the nanoparticle surface should be optimized for docking with proteins. The enormous chemical space of possible nanoparticle ligands makes this optimization experimentally and computationally intractable. Here we present a datamining-based study that searched for nanoparticle ligands that have previously been used, and computationally tested these for their ability to dock with the SARS-CoV-2 spike glycoprotein. These ligands will coat future antiviral nanoparticles to be used outside of the body, not as drugs. The best of the ligands identified were nitric acid (score: 0.95), phosphoroselenoic acid (score: 0.88), hydroxyammonium (score: 0.83), and pyrophosphoric acid (score: 0.81). Inspection of the best of these ligands has suggested design principles for future antiviral nanoparticle ligands, and we suggest further ligands based on these principles. These results will be used to inspire further in vitro and in silico experimentation to accelerate the development of antiviral nanoparticles.
Over the last 15 years, metal and metal-oxide nanoparticles (NPs) have emerged as a novel class of highly customizable broad-spectrum viricide. These NPs have been shown to deactivate a wide range of viruses that present significant public health threats, including HIV,4 herpes,5 influenza,6 and arenavirus.7 In the current model, NP viricides deactivate viruses by binding to their external glycoproteins, which prevents the viruses from docking with cells, so they cannot infect cells or reproduce (Fig. 1b).8 The mechanisms driving NP deactivation of viruses rely on different aspects of these NPs. These aspects include the constituent elements of the NP (Zn, Au, Ag, or Cu; ref. 6 and 9–11), their size (three to fifty nanometers12,13), and their coatings (shells, ligands, or none in the case of bare NPs5,13,14). To remain stable in a solvent without dissolving or agglomerating, nanoparticles often need to be coated in a chemical.15 These coatings can be referred to as surfactants, coordinating solvents, or ligands. Improving the performance of a material by varying just one of these properties (size, composition, coating, etc.) presents a significant challenge. With three or more properties to vary there are hundreds of thousands of combinations of materials, many of which will give some antiviral effect. Screening these to find a scalable and effective viricidal solution will require down-selecting candidate materials for the ideal formulation for commercialization.
High-throughput and automated experiments can accelerate materials discovery.16a,b However, in the context of devices, biomaterials applications or finding new reagents, experiments can be time-consuming, expensive, or have bottlenecks due to procurement that prevent screening many tens or hundreds of candidate materials. An active area of research is the application of machine learning and datamining to pre-select experiments or reagents to facilitate rapid prototyping and development of new materials and compounds.17
In the context of nanomaterials, there are so many published papers and methods of synthesis that it is impractical for researchers to read all the literature. The Royal Society of Chemistry (RSC) alone has published over 118 000 articles or chapters that respond to a search for nanoparticles.18 A recently developed text classification tool, ChemDataExtractor (CDE),19 lets researchers pass whole articles to computer software that can identify chemical names. Fortuitously, the RSC's bibliography is available electronically upon request. In this paper, we used the RSC's electronic archive and the CDE tool to search through several thousand of the most recent research papers on nanoparticles. We identified several tens of thousands of compounds that could be candidates as nanoparticle ligands, and we used protein docking simulations to find out which of these ligands dock best with the SARS-CoV-2 virus spike glycoprotein.
Our approach is based on how well the identified nanoparticle ligands dock with the spike protein, and not on their suitability as ligands while docking with the protein, nor on their specificity as components in drugs. Some of the identified species may destroy the target nanoparticles, or be unsuitable for toxicity reasons. However, the results here still inspire other potential ligands. Further, the methods used here do not select for docking with other SARS-CoV-2 proteins or for other interactions of nanoparticles and viruses. The goal is to provide a first-pass screening to choose nanoparticle ligands that may augment any other mechanisms by which nanoparticles deactivate viruses by allowing them to bind more effectively.
The design rules suggested by this study should provide generic advice for the synthesis of antiviral nanoparticles: a large number of functional groups that can form hydrogen bonds with virus proteins to prevent those proteins from bonding with host cells. These conclusions are very speculative, however, and to validate them extensive in vitro and in silico experimentation is required which goes beyond the scope of this work. Further, this work highlights that data-mining-based high-throughput screening may provide valuable insight to accelerate experimental programs.
The ChemDataExtractor (CDE) package was developed by the Cole group at the University of Cambridge.23 This package analyzes text documents to extract the chemicals that are mentioned in them. In this study we used the Selenium Python package to search the RSC publishing website for articles corresponding to the term nanoparticles. These articles were read as HTML files by the CDE package, which looked for paragraphs containing the terms ligand, surfactant or coordinating solvent, as these were considered sufficiently broad to capture all the chemicals that stabilized the nanoparticles in the articles. These paragraphs were then searched for chemicals. The chemical names found were filtered to remove any lone elements, and the rest were converted first to CID numbers, then to SMILES structures. As in the Galaxy project study, we varied the charge states of these SMILES structures from a pH of 4.4 to 10.4. This provided a list of around 12000 unique chemicals (see ESI†). We then determined the 3D structures of these molecules using the RDKit tool. This process is illustrated in Fig. 1c, and a detailed breakdown of the methodology is available in the ESI.†
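The paragraph-selection and name-filtering steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the three search terms come from the text, while the function names, the short list of lone elements, and the `extract_names` callable (a stand-in for a chemical entity recognizer such as CDE) are assumptions made for the example.

```python
# Search terms quoted in the text; every other name below is illustrative.
KEYWORDS = ("ligand", "surfactant", "coordinating solvent")

# Hypothetical and deliberately incomplete list of bare elements to discard.
LONE_ELEMENTS = {
    "gold", "silver", "copper", "zinc", "selenium", "lead",
    "au", "ag", "cu", "zn", "se", "pb",
}


def mentions_stabilizer(paragraph):
    """True if the paragraph contains any search term (substring match,
    so 'ligands' also counts)."""
    text = paragraph.lower()
    return any(term in text for term in KEYWORDS)


def candidate_names(paragraphs, extract_names):
    """Collect chemical names from keyword-bearing paragraphs.

    `extract_names` is a stand-in for a chemical entity recognizer: any
    callable that maps a paragraph to a list of chemical names.
    """
    seen = set()
    kept = []
    for paragraph in paragraphs:
        if not mentions_stabilizer(paragraph):
            continue
        for name in extract_names(paragraph):
            key = name.strip().lower()
            if key in LONE_ELEMENTS or key in seen:
                continue  # drop lone elements and repeated names
            seen.add(key)
            kept.append(name.strip())
    return kept
```

Passing the recognizer in as a callable keeps the filtering logic independent of any particular text-mining package, so the same sketch can be exercised with a simple lookup table before being connected to a real extractor.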
Our study used similar principles, and made use of many of the same tools. We took the structure of the spike glycoprotein, which has been suggested from published studies as the part of the coronavirus that initially binds to the host.24 The part of the host that the virus appears to bind to is the ACE2 layer. This is a large protein, and many parts of it interact with the spike glycoprotein. To simplify the analysis, the largest fragment of the ACE2 layer that interacts with the spike glycoprotein was chosen and this was used to identify the active site in the glycoprotein (from residue 21 to 61 of the ACE2 layer as informed by Benton et al.,25a see ESI document 2† for the ACE2 fragment used, and ESI document 4† for the ACE2 fragment used). This active site was used in our rDock analyses. The ∼12000 chemicals were converted into the SDF format, suitable for the rDock package, and then the docking was carried out. As in the initial Galaxy project study we then used the xchem rdsort tool, which is a machine learning-based method,25b to score the docked poses of our identified nanoparticle ligands. This provided a ranked list of the nanoparticle ligands that were assessed manually.
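As an illustration of the fragment-selection step, the sketch below pulls the atoms of residues 21 to 61 of one chain out of a PDB-format structure file. It is a hedged example under stated assumptions rather than the procedure used in the study: the residue window is the one quoted above, but the function names and the chain identifier are hypothetical, and the parsing relies only on the fixed column layout of PDB `ATOM` records.

```python
def select_residue_range(pdb_lines, chain_id, first, last):
    """Return ATOM records of one chain with first <= residue number <= last.

    Uses the fixed PDB columns: chain ID in column 22 and residue
    sequence number in columns 23-26 (1-based).
    """
    selected = []
    for line in pdb_lines:
        if line[:6].strip() != "ATOM" or len(line) < 26:
            continue  # skip HETATM, TER, REMARK and truncated lines
        if line[21] != chain_id:
            continue
        residue_number = int(line[22:26])
        if first <= residue_number <= last:
            selected.append(line)
    return selected


def residue_numbers(pdb_lines):
    """Sorted unique residue numbers present in a list of ATOM records."""
    return sorted({int(line[22:26]) for line in pdb_lines})


# Residue window quoted in the text; which chain holds ACE2 depends on
# the structure file, so the chain ID must be supplied by the user.
ACE2_FIRST, ACE2_LAST = 21, 61
```

A call such as `select_residue_range(lines, "A", ACE2_FIRST, ACE2_LAST)` would return the fragment for a file in which ACE2 happens to be chain A; the chain letter should be checked against the header of the structure actually used.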
The results from the rDock procedure gave a ranked list of the chemicals which may be used as nanoparticle ligands. Molecules based on the acids of nitrogen and phosphorus were the best small molecules (less than 20 non-hydrogen atoms). Fig. 2 presents the top four of these small molecules.
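The size cut-off and ranking can be made concrete with a short Python sketch. The scores below are the ones quoted in this paper and the threshold is the stated limit of fewer than 20 non-hydrogen atoms; the molecular formulas, the function names, and the simple formula parser are additions made for illustration (the parser ignores net charge and does not handle parentheses).

```python
import re

SMALL_LIMIT = 20  # "less than 20 non-hydrogen atoms", as stated in the text

# (name, molecular formula, score). Scores are those quoted in the text;
# the formulas are added here for illustration and ignore net charge.
REPORTED = [
    ("nitric acid", "HNO3", 0.95),
    ("phosphoroselenoic acid", "H3O3PSe", 0.88),
    ("hydroxyammonium", "H4NO", 0.83),
    ("pyrophosphoric acid", "H4O7P2", 0.81),
    ("decaprenyl diphosphate", "C50H84O7P2", 0.98),
    ("5-thio-2-nitrobenzoic acid", "C7H5NO4S", 0.078),
]


def heavy_atom_count(formula):
    """Count non-hydrogen atoms in a flat formula such as 'C7H5NO4S'."""
    total = 0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol != "H":
            total += int(count) if count else 1
    return total


def ranked(records):
    """Sort records by docking score, highest first."""
    return sorted(records, key=lambda record: record[2], reverse=True)


def split_by_size(records):
    """Return (small, large) lists, each ranked by score."""
    small, large = [], []
    for record in ranked(records):
        is_small = heavy_atom_count(record[1]) < SMALL_LIMIT
        (small if is_small else large).append(record)
    return small, large
```

Applied to `REPORTED`, the first four entries of the small-molecule list are nitric acid, phosphoroselenoic acid, hydroxyammonium and pyrophosphoric acid, while decaprenyl diphosphate falls in the large-molecule list.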
What can be seen from these molecules is that large numbers of very polar bonds should provide good docking opportunities with the amino acids in the spike glycoprotein. While this is not a particularly surprising result, it is an interesting one. Very polar acid and amine groups will also allow these molecules to form many different bonds with the surface of ionic nanoparticles (e.g., PbSe; ref. 26a), so potentially many different groups of each ligand will be exposed to the spike protein.
In addition to small molecules, some of the best larger ligands identified in this study also present interesting options for antiviral nanoparticle ligands. The most interesting examples of these are shown in Fig. 3:
Again, these sizable ligands indicate that large numbers of groups enabling polar and hydrogen bonding between the ligand and the spike glycoprotein favor docking. What can also be seen is that these large ligands do not have fully conjugated backbones, which may allow the ligands to rotate and distort sufficiently to give good docking postures with the virus protein. These molecules may be less ideal for the development of antiviral solutions on a large scale, but they do inspire further investigation of large molecules that may bind well with the virus. It should be noted that the overall dipole moment of the molecule or ion is not as important as the functional groups themselves. For example, decaprenyl diphosphate scores higher than pyrophosphoric acid (0.98 compared to 0.81), even though the long aliphatic end group on decaprenyl diphosphate clearly renders the molecule as a whole less polar than pyrophosphoric acid.
We elaborated on our finding that molecules with large aliphatic groups and many polar groups tended to dock best with the spike glycoprotein to explore the potential for other molecules to act as ligands for antiviral nanoparticles. Five groups of molecules were investigated, related to: famotidine, sucrose, remdesivir, fabric softeners and trimetaphosphoric acid. Famotidine is an antacid that is purported to have significant benefits against COVID-19 (ref. 16b). Several structures related to famotidine were investigated; these have also been suggested in other in silico studies as good molecules to dock with different SARS-CoV-2 proteins (3CLpro,26c and the papain-like protease26d). Sucrose and related sugars have been used to coat nanoparticles27 and are very widely available, so sucrose and sucrose esters were also investigated. The drug remdesivir has been used as a COVID-19 treatment and may inhibit the virus.28 By inspection of the structures of the best ligands identified in the previous sections, it was hypothesized that the surfactant chemicals in fabric softeners (which prevent triboelectric charging in tumble dryers)29,30 may be interesting candidates. Trimetaphosphoric acid, a widely available surfactant, was one of the high-performing data-mined candidates, so we also investigated compounds related to this structure. The docking results of these compounds and their related groups can be seen in Fig. 4, along with diagrams of these molecules. The full list of compounds investigated and their scores may be seen in the ESI.†
The results from these inspired molecules suggest that, although many polar functional groups are necessary for successful docking with virus proteins, this alone is not sufficient. Molecules like remdesivir (Fig. 4c) and famotidine (Fig. 4d) do not perform as effectively as other molecules identified in this survey; both scored less than 0.01. We stress that while our results suggest that these molecules will dock poorly with the spike glycoprotein as we have investigated it, they do not account for other protein bindings, chemical reactions or other pharmacological effects of these compounds. Our findings for the compounds related to 2,2-dichlorovinyl dihydrogen phosphate and 1,2-dioleoyl-3-trimethylammonium propane are also encouraging, and prompt the investigation of additional ligands related to these structures.
Higher ranked docking compounds (4a, 4c and 4d) have been excluded as candidates based on the impracticality of using them as nanoparticle ligands, either due to high cost (4a), or the fact that they cannot be sourced commercially at this time, and so would be unsuitable for rapid scaling (4c, 4d).
The main drawback with our approach is that it selects molecules based solely on how well they dock with the spike protein, and not on their suitability as ligands. In certain conditions, attempting to use these very strong oxidizers as ligands may result in the destruction of the target nanoparticles. The results here inspire other potential ligands. Small organic molecules with nitric acid groups, such as 5-thio-2-nitrobenzoic acid (used in nanoparticle synthesis by Lai et al.30) may be suitable. The same procedure used in this study suggested this molecule would dock with the spike protein with a score of 0.078. The trimetaphosphate ion may be a suitable nanoparticle ligand that would interact well with the spike protein, but not destroy the nanoparticles in the process. Sodium trimetaphosphate is used in the food industry, and so may be suitable for incorporation in breathing apparatus, for example.
These suggested ligands shall be used with the leading antiviral metallic nanoparticle cores to produce fast and highly effective antiviral solutions, similar to work carried out by our group that is currently in press. The datamining aspect of this study returned chemicals that were in the same paragraph as the word ligand. This was done in the hope that the chemicals had been used as nanoparticle ligands before. The full list of chemicals found and the associated DOI references are found in ESI document 1.† The scores, and docked ligand-amino acid poses of the candidate ligands have been included in ESI documents 4 and 5 respectively.† Many of these molecules have been used in nanoparticle syntheses before, and as such the incorporation into our fabrication methods should be relatively straightforward. Further, as the small molecules identified are often used in industrial applications (nitrobenzoic acid, in the synthesis of procaine,31 sodium trimetaphosphate, as an additive to chewing gum32) they present cost-effective molecules that will facilitate the scale-up of production of antiviral nanoparticles for efforts to protect front-line workers and the general public in the current and future pandemics. The results of our study may be compared to the study by Mulholland and coworkers.31b Molecules such as vitamin K and dexamethasone were successful with their method. These have similar features to the structures identified in our study: highly polar functional groups and aliphatic tails (in the case of vitamin K), which support the methodology pursued here.
Based on these results, we will synthesize several batches of antiviral nanoparticles, both with silver cores and with silica cores, to determine whether it is the reduction of the virus proteins by the nanoparticle core or the binding of the protein to the reactive ligands that causes these nanoparticles to deactivate viruses so effectively. These results will be complemented by additional molecules to provide a training set for the development of a generative adversarial network to design and evaluate potential new ligands that could be synthesized or purchased for next-generation antiviral nanoparticles.
Using a basic data mining approach, we have identified guiding principles for the design of antiviral nanoparticle ligands that will be used to help determine the mechanisms of action for antiviral nanoparticles and to improve their ability to kill viruses. We have also seen that our method of discovering and testing chemicals to be used for nanoparticle ligands has been dramatically accelerated by electronic access to scientific publications. We also suggest that all publishers should provide means to carry out datamining campaigns for their research projects, and that experimentalists should use these tools to accelerate their literature reviews and lower the cost of their research.
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d1ra02293h
This journal is © The Royal Society of Chemistry 2021