Joseph Lahoud Sleiman‡ a, Filippo Conforto‡ a, Yair Augusto Gutierrez Fosado a and Davide Michieletto* ab
aSchool of Physics and Astronomy, University of Edinburgh, Peter Guthrie Tait Road, Edinburgh, EH9 3FD, UK. E-mail: davide.michieletto@ed.ac.uk
bMRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK
First published on 20th October 2023
Knots are deeply entangled with every branch of science. One of the biggest open challenges in knot theory is to formalise a knot invariant that can unambiguously and efficiently distinguish any two knotted curves. Additionally, the conjecture that the geometrical embedding of a curve encodes information on its underlying topology is, albeit physically intuitive, far from proven. Here we attempt to tackle both these outstanding challenges by proposing a neural network (NN) approach that takes as input a geometric representation of a knotted curve and tries to make predictions of the curve's topology. Intriguingly, we discover that NNs trained with a so-called geometrical “local writhe” representation of a knot can distinguish curves that share one or many topological invariants and knot polynomials, such as mutant and composite knots, and can thus classify knotted curves more precisely than some knot polynomials. We also show that our approach can be scaled up to classify all prime knots up to 10 crossings with more than 95% accuracy. Finally, we show that our NNs can also be trained to solve knot localisation problems on open and closed curves. Our main discovery is that the pattern of “local writhe” is a potentially unique geometric signature of the underlying topology of a curve. We hope that our results will suggest new methods for quantifying generic entanglements in soft matter and even inform new topological invariants.
To rigorously prove that the early tabulated knots did not contain duplicates, so-called topological invariants and knot polynomials were developed, the first of which was the Alexander polynomial,1,5,6 followed more recently by the Jones and HOMFLY polynomials.1,7 Knot polynomials are mathematical constructs that can be computed on knot diagrams and are invariant under smooth deformations of the curve, i.e. deformations that preserve the curve topology. However, there are knots that share many topological invariants and cannot even be distinguished by knot polynomials. Famously, the 11-crossing Conway knot has the same Alexander polynomial as the unknot and shares the same Jones polynomial as its mutant, the Kinoshita–Terasaka (KT) knot.1 More generally, all mutants of a knot have the same HOMFLY polynomial and the same hyperbolic volume,1 while some composite knots have homeomorphic complements.8–10
Alongside the development of topological invariants, several attempts were made to identify a relationship between a specific geometric embedding of a knot and its underlying topology.11 We note that this relationship is different from the one sought between so-called geometric and algebraic invariants,12,13 e.g. between the hyperbolic volume of a knot and its Jones polynomial.14 Perhaps one of the most rigorous results in this direction is the Fáry–Milnor theorem, stating that the total absolute curvature of non-trivially knotted curves must be greater than 4π.15 Unfortunately, this result only imposes a weak constraint on the topology of the underlying curve, as an unknot can itself have large total curvature due to, for example, deformations of its contour. In parallel, a large body of work on so-called “ideal knots” was carried out with the aim of finding geometric features that could reflect the underlying knot topology. One impressive result in this context is that different DNA knots display a spatial separation in gel electrophoresis that is linearly proportional to the so-called average crossing number;16,17 this result entails that there is an intimate relationship between the physical shapes assumed by knots and their underlying topology. Another result that inspired our work is that the total so-called “writhe” (see below) of an ideal knot is the same (up to a constant that is only a function of the curve length) as that of a non-ideal, thermally agitated curve with the same topology.18 Though this suggests that “writhe” may be a good measure that is invariant under thermal fluctuations, there is no one-to-one relationship between the global writhe of a knot and its underlying topology; for instance, the global writhe of the 4_1 knot is 0, the same as that of the unknot.11
Thus, the problem of determining a curve topology based only on the geometric information of its segments (without using any projection or algebraic invariant) is an open challenge in knot theory that has ramifications in many fields, for instance polymer physics, biophysics and fluid dynamics. In this paper, we propose to address this open challenge by using the power of artificial intelligence, and in particular deep learning, in recognising and classifying patterns in certain knot geometric features. Our main discovery is that by using a quantity we dub “local writhe”, even simple machine learning (ML) algorithms can identify the topology of knotted curves undergoing thermal fluctuations with very high accuracy. We argue that this is an example of geometric learning, whereby the only quantity we pass to the ML algorithm is one that can be computed from the Cartesian positions of a curve's segments, without the need to compute algebraic invariants such as Alexander or Jones polynomials. Our method can even distinguish 11-crossing knots that are otherwise impossible to distinguish using standard invariants (the Conway and KT knots). Finally, we show how this algorithm can be scaled to classify all 250 prime knots up to 10 crossings with 95% accuracy, and can even be employed to solve knot localisation problems. Overall, we argue that local writhe is an excellent feature, determined purely by the 3D positions of a curve's segments, that results in patterns easily identifiable by ML algorithms. We expect that our approach can be applied to other classification problems, such as threading19,20 and entanglements21,22, and that it will prompt knot theorists to employ local writhe to define new geometric knot invariants.
(1) |
The StA writhe, ωStA(x), is a 1D geometrical representation of a knot that we hypothesise may display some patterns that are topology-dependent (Fig. 1(A)–(C)). Since complex pattern recognition is a task that naturally lends itself to a machine learning approach, we asked ourselves whether a neural network (NN) trained to recognise patterns within ωStA(x) could solve ambiguous knot classification problems. To do this, we built feed-forward and recurrent (long short-term memory, LSTM) neural networks (FFNN and RNN, respectively) and trained them using 10^5 statistically uncorrelated and pre-labelled conformations for each knot. To generate these conformations, we initialised a bead-spring polymer with known topology, N = 100 beads, and persistence length lp = 10σ (other lengths and lp are reported in the ESI†) using KnotPlot (knotplot.com) and subsequently evolved the polymer configurations in LAMMPS29 via Langevin dynamics in an implicit solvent at fixed temperature, using a Kremer–Grest model30 to preserve polymer topology (see Methods and ESI† for more details). The code to generate these conformations is available open access at https://git.ecdf.ed.ac.uk/taplab/mlknotsproject. We confirmed that the topology of each conformation was conserved either by computing its Alexander determinant via KymoKnot (https://kymoknot.sissa.it)31 or, when ambiguous, visually.
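To make the idea of a per-bead writhe profile concrete, the following is a minimal sketch of a segment-to-all local writhe for a closed bead chain, using only the Python standard library. It is not the code from the repository above, and the exact definition in eqn (1) is not reproduced here: the discretisation (Gauss integrand evaluated at segment midpoints), the periodic window convention and the names `segment_writhe`, `sta_writhe`, `half_window` and `trefoil` are all assumptions made for illustration.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def segment_writhe(beads):
    """Sum the discretised Gauss integrand of each segment against all others.

    `beads` is a list of (x, y, z) tuples describing a closed curve.
    """
    n = len(beads)
    # Segment vectors and midpoints; the last bead connects back to the first.
    t = [_sub(beads[(i + 1) % n], beads[i]) for i in range(n)]
    m = [tuple(beads[i][k] + 0.5 * t[i][k] for k in range(3)) for i in range(n)]
    out = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a segment does not contribute with itself
            d = _sub(m[i], m[j])
            dist = math.sqrt(_dot(d, d))
            out[i] += _dot(_cross(t[i], t[j]), d) / (4.0 * math.pi * dist ** 3)
    return out

def sta_writhe(beads, half_window):
    """Assumed StA profile: per-segment values summed over a periodic window."""
    w = segment_writhe(beads)
    n = len(w)
    return [sum(w[(i + k) % n] for k in range(-half_window, half_window + 1))
            for i in range(n)]

def trefoil(n):
    """Illustrative closed curve (a standard trefoil parametrisation)."""
    pts = []
    for i in range(n):
        s = 2.0 * math.pi * i / n
        pts.append((math.sin(s) + 2.0 * math.sin(2.0 * s),
                    math.cos(s) - 2.0 * math.cos(2.0 * s),
                    -math.sin(3.0 * s)))
    return pts

profile = sta_writhe(trefoil(60), half_window=5)  # one signed value per bead
```

Under these assumptions the profile is signed: mirroring the curve flips the sign of every entry, a planar curve gives zero everywhere, and a window spanning the whole chain returns the same total for every bead.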
The NNs were built with an input layer that was determined according to the input representation being studied, e.g., the Cartesian (XYZ) coordinate representation used 3 neurons (one for each dimension) per polymer bead. Other local input features, such as StA writhe, used one neuron per bead, while the StS writhe feature required N × N input neurons. The optimal number of hidden layers, hidden units, learning rate and batch size were determined via an automated hyperparameter tuning method conducted on the Cartesian representation (KerasTuner32). Unless otherwise stated, our NNs contained 4 hidden layers, with around 4 × 10^5 trainable parameters. The output layer consisted of C output neurons, corresponding to the C knot types being classified, implemented with a softmax activation function in order to return the probability that a given input is a certain knot type. We took the sparse categorical cross-entropy as the loss function, as it is the most appropriate choice for individual class probabilities and integer target labels, i.e. our knot types (Fig. 1(D)).
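The role of the softmax output layer and of the sparse categorical cross-entropy can be illustrated with a short, framework-free sketch. It reproduces only the textbook definitions of these two functions for a single sample; the function names and the example values are hypothetical and are not taken from our training code.

```python
import math

def softmax(logits):
    """Turn C raw output values into C class probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_cross_entropy(logits, label):
    """Loss for one sample whose target is an integer class index."""
    return -math.log(softmax(logits)[label])

# Hypothetical raw outputs for C = 5 knot types; index 1 is the true class.
logits = [0.2, 3.1, -1.0, 0.5, 0.0]
probs = softmax(logits)
loss = sparse_categorical_cross_entropy(logits, 1)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

The "sparse" variant takes the target as a single integer rather than a one-hot vector, which is convenient when each conformation carries exactly one knot-type label.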
We found that these results are generally robust for different choices of dataset splitting, persistence length (lp = 1–10σ), window length chosen to perform the StA calculation, and length of the chains (see ESI†). Nevertheless, they do display a significant reduction in accuracy when tested on knots generated using a different method (for instance freely jointed chains), and also when the window length for the StA writhe calculation is comparable to the full contour of the chain. In the latter case, the StA writhe is constant and equals the global writhe of the knot, which is not unique for different knots.11 This is also in agreement with principal component analysis (PCA, see ESI†) of the StA-trained NNs, where we see that different knots are clearly separable in the reduced 2D PCA space, yet the 0_1 and 4_1 knots cluster together because they share the same global writhe (zero), which is related to the mean value of ωStA(x) along the contour.
We then asked ourselves if our NN could distinguish knots sharing multiple knot polynomials. As mentioned above, mutant knots share the same hyperbolic volume and several knot polynomials, including HOMFLY. We therefore performed simulations of the Conway (K11n34) knot and one of its mutants, the Kinoshita–Terasaka (KT, K11n42) knot. These 11-crossing knots have a number of identical knot invariants, sharing the same Jones, Alexander, and Conway polynomials.1 Intriguingly, the latter two are also shared with the unknot. Thus, we generated 10^5 statistically uncorrelated conformations of N = 200 beads long polymers with the Conway, KT, and unknot topologies (Fig. 2(D)), and trained our FFNN to classify them using either a COM-subtracted XYZ or a ωStA(x) (Fig. 2(E)) representation. When tested on unseen conformations, we found that while the XYZ-trained NN could not distinguish the Conway and KT knots, both were accurately distinguished from the unknot (Fig. 2(F)). In marked contrast, we discovered that the StA-trained NN distinguishes the three knots with 99.6% accuracy (Fig. 2(F)). We therefore conclude that the StA-trained NN has the ability to convert StA patterns into a topological knot classification, even for knots sharing multiple knot polynomials, such as mutants and composites. In turn, we argue that the StA writhe is a geometric quantity computed on a particular 3D embedding of a curve that carries high-density information about its underlying topology. Importantly, we stress that to classify these knots, the network does not compute any knot polynomial, as standard software does.
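For reference, a COM-subtracted XYZ input can be built as in the sketch below: the centre of mass of the beads is removed and the coordinates are flattened into 3 values per bead. The function name and the assumption of equal bead masses are ours, introduced for illustration only.

```python
def com_subtracted_features(beads):
    """Flatten bead positions into 3N inputs after removing the centre of mass.

    `beads` is a list of (x, y, z) tuples; all beads are assumed to have equal mass.
    """
    n = len(beads)
    com = [sum(b[k] for b in beads) / n for k in range(3)]
    # Order of the output: x0, y0, z0, x1, y1, z1, ...
    return [b[k] - com[k] for b in beads for k in range(3)]
```

This removes the dependence on where the polymer sits in the simulation box, but the resulting features still change under rotations of the conformation.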
Somewhat unsatisfactorily, we cannot fully pinpoint why StA writhe is so powerful at identifying different topologies. We hypothesise that the 1D patterns generated by StA writhe (specifically the sequence, sign and amplitudes assumed by consecutive maxima and minima) contain information on the relative orientation and magnitude of consecutive entanglements. As mentioned above, the average value of ωStA(x) is related to the global writhe of the knot, which by itself carries only ambiguous information about its topology. Thus, we argue that the NNs can extract additional information from the full ωStA(x) patterns, related to the chirality of individual entanglements, and that this additional information makes the classification unique. This hypothesis is also supported by the fact that the unsigned StA writhe (which cannot distinguish chirality) yields, in general, a lower accuracy (see Fig. 1(E)). We thus hypothesise that the information encoded in the pattern of the StA writhe may be related to the underlying knot's Dowker code. These hypotheses will be tested in more detail in future works.
Based on these results, we argue that the StS writhe is the most scalable and precise geometric feature to utilise for knot classification problems. Most importantly, we would like to stress that the impressive accuracy demonstrated for a 250-class problem was achieved with a simple feed-forward NN with 4 layers (around 3600k parameters for the StS writhe and 400k for the StA writhe). A natural extension going forward will be to employ more complex architectures, and in particular convolutional NNs, to classify the 2D StS writhe maps.
We first tackled this problem using the same FFNN architecture as in the knot classification task, but the resulting accuracies were very low. We hypothesised that this was due to the fact that FFNNs do not preserve the sequential information along the polymer. For this reason, we considered a long short-term memory (LSTM) model, which is a type of recurrent NN (RNN). More specifically, we employed a sequence-to-sequence LSTM, with an output layer corresponding to a binary sequence of N = 100 neurons, equivalent in dimension to the length of the input polymer. Each output neuron is passed through a sigmoid function, which converts the output into a probability between 0 and 1 representing the likelihood that a given monomer is within the knotted segment of the polymer conformation. The true output labels were generated using KymoKnot,31 which employs a minimally-interfering closure algorithm followed by a standard Alexander determinant calculation to identify the start and end monomers of the knot. These data were then transformed into a vector of 100 bits, i.e. values of 0 or 1, indicating whether each monomer was part of the knotted arc.
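The conversion from the start and end monomers of a knot to a per-monomer label vector can be sketched as follows. The function name, the inclusive treatment of both end points, the handling of arcs that wrap around a closed chain and the use of `None` for an unknotted conformation are assumptions made for illustration; they do not describe the output format of KymoKnot.

```python
def knotted_arc_labels(start, end, n_beads=100):
    """Return n_beads labels: 1 for monomers inside the knotted arc, else 0.

    `start` and `end` are 0-based, inclusive monomer indices (assumed convention).
    Passing start=None marks an unknotted conformation (all labels 0).
    """
    if start is None:
        return [0] * n_beads
    if start <= end:
        inside = set(range(start, end + 1))
    else:
        # The arc wraps past the last monomer of a closed chain.
        inside = set(range(start, n_beads)) | set(range(0, end + 1))
    return [1 if i in inside else 0 for i in range(n_beads)]

labels = knotted_arc_labels(20, 59)  # 40 monomers labelled as knotted
```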
Unlike normal multi-class classification problems where the classes are mutually exclusive, here we consider a multi-label classification task, with mutually non-exclusive class labels (multiple classes per prediction).39 To quantify the error in a multi-label classification task, we use the binary cross-entropy (BCE) function, suited to an output layer of sigmoid functions, given by
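$$\mathrm{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\left[\,y_i \log y_{\mathrm{prob},i} + (1 - y_i)\log\left(1 - y_{\mathrm{prob},i}\right)\right] \qquad (2)$$
where y_i ∈ {0, 1} is the true label of monomer i and y_prob,i is the corresponding sigmoid output.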
Finally, to determine the accuracy of the model, we converted the probabilities generated by the sigmoid function yprob into binary values using a Heaviside step function (ypred = Θ(yprob − 0.5)), and compared the result to the true binary value obtained using KymoKnot. The final accuracy is given by the binary accuracy, i.e. Accuracy = correct/total.
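The loss and the accuracy described above can be written out in a few lines. This is a sketch of the standard binary cross-entropy and of the thresholded binary accuracy, not an excerpt of our training code; the function names, the clipping constant `eps` and the choice that a probability of exactly 0.5 maps to 0 are assumptions.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy over the monomers of one conformation."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)

def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of monomers whose thresholded prediction matches the label."""
    y_pred = [1 if p > threshold else 0 for p in y_prob]  # step function at 0.5
    correct = sum(1 for y, yp in zip(y_true, y_pred) if y == yp)
    return correct / len(y_true)

# Hypothetical labels and sigmoid outputs for five monomers.
y_true = [0, 1, 1, 1, 0]
y_prob = [0.1, 0.8, 0.6, 0.4, 0.3]
loss = binary_cross_entropy(y_true, y_prob)
acc = binary_accuracy(y_true, y_prob)  # 4 of 5 monomers are correct
```

Because every monomer is scored independently, a prediction can be partly right: here the fourth monomer is misclassified while the other four are correct.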
Overall, we find that the StA-trained RNNs perform extremely well, reaching above 90% accuracy in localising any knot that we tested: the 5 simplest knot types, 0_1, 3_1, 4_1, 5_1 and 5_2 (Fig. 4). We argue that this excellent performance relies on the effectiveness of RNNs in handling multi-scale sequential data and tracking multi-scale correlations along the polymer. This capability likely plays a major role in allowing the network to recognise that nearby monomers are more likely to be in the same knotted arc. More precisely, we find that the StA writhe representation is superior to all other descriptors, with a localisation accuracy of 93%, confirming its potential usefulness as a tool to help in knot localisation tasks. For instance, in Fig. 4(D) we report the prediction and ground truth for the 4_1 knot shown in Fig. 4(A) and (B). In this case, the StA writhe perfectly agrees with the KymoKnot ground truth, whereas the XYZ and unsigned StA writhe yield less accurate localisation predictions.
In the ESI† (Fig. S9), we also used our StA-trained RNN model to track the unknotting of a 5_1 knot tied on an open curve. Despite the fact that the algorithm was not trained on open curves, the results were surprisingly accurate. The model can be seen to clearly detect the presence of short knotted arcs even at the final step before complete unknotting. Once again, we find that the StA-trained model is largely superior to the XYZ-trained model.
Overall, our results highlight the power of StA and StS writhe in not only classifying but also localising knots. We acknowledge that our results are non-exhaustive and more work will be needed in the future to find the best architectures and models to optimally solve these tasks.
We stress that this method only requires a snapshot of a knot embedding with a list of 3D coordinates for each polymer segment and is trained on thermal conformations under a readily tunable temperature. For this reason, it will require longer training for longer polymers but should be essentially insensitive to the number of non-essential crossings, as shown by the excellent accuracy achieved in spherically confined polymers.24 This feature is in marked contrast to standard knot topology algorithms, which take 2D projections and need to compute matrices as big as the number of crossings in a given projection, irrespective of whether they are essential or not.1 Finally, we show that by deploying recurrent NNs, our geometric StA descriptor can also solve knot localisation problems (Fig. 4). More work will be needed in the future to determine optimal NN architectures.
We note that though we do not have a full understanding of how the NNs are using StA and StS writhe features to identify knots, we hypothesise that they are classifying the patterns of consecutive maxima and minima, thus capturing the entanglement of pairs of segments, accounting for their chirality and magnitude. This argument directly suggests that employing a distance map between segments or other geometric “unsigned” representations will yield lower accuracies, because they do not capture the chiral nature of the entanglements between segments. For these reasons, we believe that StS (or StA) representations are possibly the best features to connect the geometry of a given curve embedding to its underlying topology. A possible limitation of this method is that it is restricted to pair-wise entanglement. Generalising the Gauss linking number to higher-order relations is itself an active field of research, and it is foreseeable that a local version of the Milnor triple linking number40 may be used to generate 3D tensors of Brunnian links, for example.
In conclusion, we established that StS/StA-trained NNs are powerful tools to accurately classify and localise knots in thermally equilibrated curves. Importantly, knot classification and localisation are achieved without any explicit calculation of Alexander or other algebraic invariants. We propose that the local writhe – once fed through deep NNs – yields an accurate map from the configurational space of a curve to its underlying topology. The approach we reported in our paper naturally lends itself to be applied to protein folding,34,41 DNA42,43 and, in general, entanglements in open curves and complex systems.20,21,36,44–47 We hope that our results will also inspire mathematicians and topologists to formulate new topological invariants based on the geometrical embeddings of knotted curves.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3sm01199b
‡ Joint first author.
This journal is © The Royal Society of Chemistry 2024