Issue 28, 2020

Machine learning model for fast prediction of the natural frequencies of protein molecules

Abstract

Natural vibrations and resonances are intrinsic features of protein structures and enable differentiation of one structure from another. These nanoscale features are important to help to understand the dynamics of a protein molecule and identify the effects of small sequence or other geometric alterations that may not cause significant visible structural changes, such as point mutations associated with disease or drug design. Although normal mode analysis provides a powerful way to accurately extract the natural frequencies of a protein, it must meet several critical conditions, including availability of high-resolution structures, availability of good chemical force fields and memory-intensive large-scale computing resources. Here, we study the natural frequency of over 100 000 known protein molecular structures from the Protein Data Bank and use this dataset to carefully investigate the correlation between their structural features and these natural frequencies by using a machine learning model composed of a Feedforward Neural Network made of four hidden layers that predicts the natural frequencies in excellent agreement with full-atomistic normal mode calculations, but is significantly more computationally efficient. In addition to the computational advance, we demonstrate that this model can be used to directly obtain the natural frequencies by merely using five structural features of protein molecules as predictor variables, including the largest and smallest diameter, and the ratio of amino acid residues with alpha-helix, beta strand and 3–10 helix domains. These structural features can be either experimentally or computationally obtained, and do not require a full-atomistic model of a protein of interest. This method is helpful in predicting the absorption and resonance functions of an unknown protein molecule without solving its full atomic structure.

Graphical abstract: Machine learning model for fast prediction of the natural frequencies of protein molecules

Supplementary files

Article information

Article type
Paper
Submitted
03 Jun 2019
Accepted
03 Apr 2020
First published
27 Apr 2020
This article is Open Access
Creative Commons BY-NC license

RSC Adv., 2020,10, 16607-16615

Machine learning model for fast prediction of the natural frequencies of protein molecules

Z. Qin, Q. Yu and M. J. Buehler, RSC Adv., 2020, 10, 16607 DOI: 10.1039/C9RA04186A

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.

To request permission to reproduce material from this article in a commercial publication, please go to the Copyright Clearance Center request page.

If you are an author contributing to an RSC publication, you do not need to request permission provided correct acknowledgement is given.

If you are the author of this article, you do not need to request permission to reproduce figures and diagrams provided correct acknowledgement is given. If you want to reproduce the whole article in a third-party commercial publication (excluding your thesis/dissertation for which permission is not required) please go to the Copyright Clearance Center request page.

Read more about how to correctly acknowledge RSC content.

Social activity

Spotlight

Advertisements