Quirine J. S. Braat ‡a, Giulia Janzen ‡ab, Bas C. Jansen ‡a, Vincent E. Debets a, Simone Ciarella cd and Liesbeth M. C. Janssen *ae
a Department of Applied Physics, Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, The Netherlands. E-mail: l.m.c.janssen@tue.nl
b Department of Theoretical Physics, Complutense University of Madrid, 28040 Madrid, Spain
c Netherlands eScience Center, Amsterdam 1098 XG, The Netherlands
d Laboratoire de Physique de l’Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université de Paris, F-75005 Paris, France
e Institute for Complex Molecular Systems, Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, The Netherlands
First published on 24th June 2025
Cell motility in dense cell collectives is pivotal in various diseases like cancer metastasis and asthma. A central aspect in these phenomena is the heterogeneity in cell motility, but identifying the motility of individual cells is challenging. Previous work has established the importance of the average cell shape in predicting cell dynamics. Here, we aim to identify the importance of individual cell shape features, rather than collective features, to distinguish between high-motility and low-motility (or zero-motility) cells in heterogeneous cell layers. Employing the cellular Potts model, we generate simulation snapshots and extract static features as inputs for a simple machine-learning model. Our results show that when cells are either motile or non-motile, this machine-learning model can accurately predict a cell's phenotype using only single-cell shape features. Furthermore, we explore scenarios where both cell types exhibit some degree of motility, characterized by high or low motility. In such cases, our findings indicate that a neural network trained on shape features can accurately classify cell motility, particularly when the number of highly motile cells is low, and high-motility cells are significantly more motile compared to low-motility cells. This work offers potential for physics-inspired predictions of single-cell properties with implications for inferring cell dynamics from static histological images.
Recent breakthroughs have already revealed important morphodynamic links that correlate static, structural features with the collective dynamics of multicellular aggregates. Indeed, pioneering work has established that the average cell shape (as quantified by a dimensionless shape index) in confluent cell layers can serve as a remarkably good proxy for collective cell dynamics, including jamming and unjamming behaviour.9–16 Additional static features such as the shape and size of cell nuclei can further refine the predictive power.5,6,17 However, these studies have focused mainly on morphodynamic links for the emergent collective cell dynamics. The question to what extent static or structural information can also inform on single-cell dynamical properties, such as individual cell motility, has thus far remained largely unexplored. Gaining knowledge about such single-cell properties is particularly important in heterogeneous cell layers, where the presence of more intrinsically motile cells, as in the context of a partial epithelial-to-mesenchymal transition (EMT), is associated with more aggressive cancer progression.18–22
Here, we seek to derive information about individual cell motility from purely static cell data. In particular, we aim to discriminate between two different cellular phenotypes, high-motility and low-motility cells, based on static images of a minimally heterogeneous in silico confluent cell layer. The static information that is extracted includes both single-cell geometric shape features and structural properties of the neighbouring cells surrounding a given cell. Our work draws inspiration from Janzen et al.,23 who recently investigated the possibility of predicting particle motility in a dense, heterogeneous mixture of spherical active and passive colloidal particles. Briefly, they showed that the shapes of the Voronoi polygons surrounding active particles exhibit distinct characteristics which can serve as sufficient static information to accurately classify different particle motilities. In the present work, we expand upon this approach to study the more challenging, and more biologically realistic, case of a heterogeneous confluent cell layer. Our primary goal is to infer the phenotype of individual cells based on their static properties. This approach allows us to make predictions about single-cell behaviour without relying on collective cell data.
Our confluent cell model is based on the cellular Potts model (CPM), a simulation technique that allows for cell-resolved dynamics with controllable single-cell motilities.13,24–29 The CPM, despite its simplicity, has been used successfully in the past to capture the behaviour of biological systems.27,30,31 To distinguish between high-motility and low-motility cells, we employ a machine-learning (ML) approach that takes as input instantaneous static information derived from CPM simulation snapshots. Our choice to invoke machine learning stems from the fact that, in recent years, ML has emerged as a powerful tool for identifying structure-dynamics relations in dense disordered passive systems,32–51 purely active systems,51–59 and active–passive colloidal mixtures.23 Moreover, it has been successfully employed in experimental studies to predict information about the properties of cell collectives.60–63 We therefore envision that this work could not only advance our understanding of distinguishing motile and non-motile cells in simulations, but also find future applications in studying the behaviour of individual cells in biological confluent cell layers.
A schematic overview of our methodology is shown in Fig. 1. Briefly, we extract different static features for a given cell from an instantaneous CPM configuration, from which a simple ML algorithm subsequently seeks to classify the cell's motility phenotype. The static input features are subdivided into four categories, namely single-cell (local) shape features, neighbouring-cell (non-local) shape features, local structural features and non-local structural features. The shape features refer to the geometric properties of the cells, such as their size, aspect ratio, and perimeter. Structural features, on the other hand, encompass the spatial arrangement and include metrics such as the cell's position relative to the neighbouring cells. The distinction between shape and structural features allows us to identify how much information regarding a cell's intrinsic motility is captured by its shape. By comparing the predictive power of shape features with structural features, we can determine if the intrinsic motility of a cell can be accurately classified solely on the basis of its shape or if the structural context provides essential additional information. Additionally, focusing on shape features helps minimize the number of parameters to be extracted from images, as pinpointing structural features that require an accurate centre of mass position can be more challenging and computationally intensive. To test the validity range of our ML model, we vary the number of motile cells and their motility strength, thus allowing us to control the cell properties in the heterogeneous confluent layer.
Fig. 1 Schematic overview of our machine learning approach for identifying active cells within a mixture of active and passive cells. The cellular Potts model generates static cell snapshots, and from each snapshot, a set of shape and structure features is extracted. These features include both local and non-local characteristics. Local features are determined by information on individual cells, while non-local features depend on the cell's neighbours, encompassing neighbour averages, neighbour maximum, and neighbour minimum values. Table 1 shows the complete list of features and the corresponding formulas used to compute them. Local shape features are highlighted in blue, local structure features in green, non-local shape features in orange, and non-local structure features in violet. Following feature extraction, a multilayer perceptron is trained to classify cell types, distinguishing between low motility and high motility cells solely based on the features extracted from a snapshot.
Our analysis reveals that local (single-cell) shape features alone are sufficient to predict whether a cell is highly motile or non-motile for the computational model at hand. The local shape features work particularly well in the regime where the number of motile cells is small and the difference in cell motility between the two cell types is large. In this regime, the cells have a clearly distinct phenotype and local distortions due to a small number of motile cells can be more easily detected. These results illustrate that the shape of a single cell contains a significant amount of information about the motility of an individual cell. We also investigate how the ML algorithm performs with different cell parameters and show that the model trained only on local shape features is also successful in generalising to data with a different number of motile cells.
We simulate a two-dimensional confluent layer composed of cells with either a high or low motility. The reference Hamiltonian without motility is defined as follows24,25
![]() | (1) |
The individual pixels are indicated with i, j. All cells can be identified with their cell number σ and have an associated cell type α; the cell type is either active or passive, indicating high and zero (or low) motility of the cells, respectively. Each of the terms in the Hamiltonian corresponds to a different physical aspect of the cells. The first term, Hadhesion, accounts for the change in adhesion energy associated with cell–cell adhesion contacts. The magnitude of the cell–cell adhesion term between cell types is set by Jαi,αj. The Kronecker delta function (δ(σi,σj)) ensures that cells do not experience adhesion interactions with themselves. The second term, Harea, penalises large differences between a cell's actual area Aσ and its preferred area At and thereby maintains the cell's size. Similar to the area constraint, an energy penalty term, Hperimeter, is included for large variations of a cell's perimeter. Contrary to the area constraint, this penalty is only applied if the cell's perimeter Pσ exceeds a threshold value Pt. When a cell's perimeter is below the threshold Pt, no perimeter constraint is applied. We include the perimeter constraint to avoid cell shapes with non-physically large perimeters, which we observed primarily when the motility of the cells was large. Because the constraint acts only on these non-physical cell shapes, it does not otherwise affect the emerging cell shapes.
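For orientation, one way to write the three terms described above is sketched here. The quadratic form of the area and perimeter penalties, the Heaviside step function Θ, the sum over neighbouring pixel pairs ⟨i,j⟩, and the prefactors λA and λP (named after the constraint strengths quoted with the simulation parameters) are assumptions based on common cellular Potts conventions; they are not taken from eqn (1) itself.

```latex
H = \underbrace{\sum_{\langle i,j \rangle} J_{\alpha_i,\alpha_j}
      \bigl[ 1 - \delta(\sigma_i,\sigma_j) \bigr]}_{H_\mathrm{adhesion}}
  + \underbrace{\lambda_A \sum_{\sigma} \bigl( A_\sigma - A_t \bigr)^2}_{H_\mathrm{area}}
  + \underbrace{\lambda_P \sum_{\sigma} \bigl( P_\sigma - P_t \bigr)^2\,
      \Theta\bigl( P_\sigma - P_t \bigr)}_{H_\mathrm{perimeter}}
```

In this assumed form, the factor 1 − δ(σi,σj) removes pixel pairs that belong to the same cell, and Θ switches the perimeter penalty on only when Pσ > Pt, matching the verbal description.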
To implement the motility of the cells, we include an energy bias26,67 in the Monte Carlo algorithm using
![]() | (2) |
Here, Δ is the centre-of-mass displacement due to the proposed pixel-copy attempt. The strength of the cell motility is given by κα and depends on the specific cell type α (either active or passive) for each individual cell. The unit vector associated with cell σ represents the directional persistence of the cell. When the centre-of-mass displacement is in the direction of this unit vector, the cell is biased to migrate in that direction. The dynamics of the unit vector is governed by rotational diffusion, i.e., the direction gets updated every Monte Carlo step (MCS) with a random angular perturbation η. We set η in the range from −π/36 to π/36, which keeps the direction persistent for sufficiently long to allow the cells to escape their local environment and explore space. Overall, for a single motile cell, this implementation effectively amounts to a persistent random walk akin to, e.g., active Brownian particles.68
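The motility rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the bias is taken to be −κ times the projection of the centre-of-mass displacement onto the cell's unit vector (so that aligned moves lower the energy), η is drawn uniformly from [−π/36, π/36] (the text gives only the range), and the function names `motility_bias` and `rotate_polarity` are hypothetical.

```python
import math
import random


def motility_bias(kappa, displacement, polarity):
    """Assumed energy bias of one pixel-copy attempt: -kappa * (displacement . polarity)."""
    dx, dy = displacement
    px, py = polarity
    return -kappa * (dx * px + dy * py)


def rotate_polarity(polarity, eta_max=math.pi / 36, rng=random):
    """Rotate the unit vector by a random angle eta in [-eta_max, eta_max] (uniform is an assumption)."""
    eta = rng.uniform(-eta_max, eta_max)
    px, py = polarity
    c, s = math.cos(eta), math.sin(eta)
    return (c * px - s * py, s * px + c * py)


polarity = (1.0, 0.0)
# A displacement along the polarity lowers the energy; the opposite one raises it.
aligned = motility_bias(1500.0, (0.2, 0.0), polarity)
opposed = motility_bias(1500.0, (-0.2, 0.0), polarity)
# One rotational-diffusion update per Monte Carlo step; the vector stays normalised.
new_polarity = rotate_polarity(polarity, rng=random.Random(0))
```

Because each update rotates the direction by at most 5°, many Monte Carlo steps are needed before the direction decorrelates, which is the persistence the text refers to.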
Phenotypic heterogeneity is included via the motility term. In biology, motility is controlled by many intrinsic and external factors,3,4,69 but here we reduce this complexity to a single parameter. The key difference between the high-motility and low-motility cells is the strength of the active force κα (which depends on the cell type α). We distinguish between two different scenarios in the simulations, namely
(1) zero-motility cells (κp = 0; passive) combined with high-motility cells (κa = 1500; active);
(2) both cell types are motile, but the high-motility cells are more motile than the low-motility ones (κa > κp > 0).
The first situation allows us to investigate how active cells distort the cellular arrangements in a purely passive cellular environment. The second resembles a more realistic representation of confluent cell layers, as the motility of cells shows heterogeneity even within confluent tissue.70 The actual heterogeneity in cell motility can vary significantly between different biological systems and experimental conditions. In this study, we aim to provide a proof of principle by varying the number of highly motile cells (Na) and the ratio γ between the motility strength of the low-motility cells (κp) and that of the high-motility cells (κa). This approach allows us to explore the effects of motility heterogeneity in a controlled manner.
For the numerical implementation, we employ a two-dimensional square lattice of 300 by 300 pixels with periodic boundary conditions. The simulation contains 144 cells, of which a number Na are randomly chosen to be active, creating a mixture of active and passive cells. We vary the number of active cells between 1 and 60. We set the adhesion strength Jαi,αj = 5.0 for all cells, and each cell has a target area At of 625 pixels, which is enforced with an energy penalty constraint of λA = 1.0. To avoid any cell fragmentation, the pixels of an individual cell are forced to remain connected throughout the entire simulation. This can cause artefacts in the cell shapes (long tails are formed). To circumvent this problem, the perimeter constraint (with λP = 1.0) is applied when the cell perimeter exceeds a value of Pt = 150 pixels. The complete set of simulation parameters is provided in Table S1 in the ESI.†71 The same simulation set-up is used for the heterogeneous mixture of high-motility and low-motility cells.
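The parameters quoted above can be collected in a small container, which also makes it easy to check that the set-up is confluent: 144 cells with a target area of 625 pixels tile the 300 × 300 lattice exactly. The class and field names below are hypothetical; only the numerical values come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CPMParameters:
    """Simulation parameters quoted in the text (field names are illustrative)."""
    lattice_size: int = 300          # pixels per side, periodic boundaries
    n_cells: int = 144
    target_area: int = 625           # A_t, in pixels
    adhesion_strength: float = 5.0   # J, identical for all cell-type pairs
    lambda_area: float = 1.0
    lambda_perimeter: float = 1.0
    perimeter_threshold: int = 150   # P_t, in pixels

    def is_confluent(self):
        """True when the summed target areas fill the lattice exactly."""
        return self.n_cells * self.target_area == self.lattice_size ** 2


params = CPMParameters()
print(params.is_confluent())  # 144 * 625 == 300 * 300
```

For scale, a square cell of area 625 pixels has a side of 25 pixels and a perimeter of 100 pixels, so the threshold Pt = 150 only becomes active for cells that are strongly elongated or ragged.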
After equilibration, the static snapshots are stored every 1000 MCS. This time interval is chosen such that the high-motility cells can move sufficiently between consecutive snapshots. Snapshots for different parameters are shown in Fig. 2. It is challenging to distinguish between the two cell types in the static images by visual inspection only. We therefore extract physical features from these snapshots to determine whether a machine-learning algorithm can predict the phenotype of the cell based on a set of simple physical properties.
To evaluate the model's performance, we calculate the accuracy, defined as the number of correct predictions divided by the total number of predictions. Correct predictions include both accurately identifying high-motility cells as motile and low-motility cells as non-motile. Here, a prediction is a single classification attempt (motile or non-motile) based on the static input features for one given cell from one simulation snapshot (detailed in the section below). When the number of motile cells Na deviates from the number of non-motile cells, indicating an imbalanced dataset, we address this by randomly selecting a subset of non-motile cells and excluding them. This method ensures a balanced dataset with the same number of motile and non-motile cells. We use multiple independent snapshots to obtain a total of 120 000 cells. Note that the number of snapshots used depends on Na, but the overall number of cells remains fixed. We randomly divide the dataset into training and test sets, allocating 80% of the data to the training set and 20% to the test set. We train 20 independent neural networks, and the reported accuracy is the average accuracy obtained from these neural networks. Consequently, while the single-cell features used for training are extracted from multiple snapshots, the trained model can be tested on features extracted from a single cell. This means that although multiple snapshots are used during the training phase to improve the model's robustness, the properties of an individual cell are sufficient for making predictions during the testing phase.
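The data handling described above (balancing by random undersampling, an 80/20 split, and accuracy as the fraction of correct predictions) can be sketched with the standard library alone. This is an illustrative sketch, not the authors' pipeline: the helper names are hypothetical, and the toy labels simply mimic one snapshot with Na = 15 motile cells out of 144.

```python
import random


def balance_by_undersampling(samples, labels, rng):
    """Randomly drop majority-class samples until both classes are equally frequent."""
    motile = [i for i, y in enumerate(labels) if y == 1]
    non_motile = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(motile), len(non_motile))
    keep = rng.sample(motile, n) + rng.sample(non_motile, n)
    rng.shuffle(keep)
    return [samples[i] for i in keep], [labels[i] for i in keep]


def train_test_split(samples, labels, test_fraction, rng):
    """Shuffle the indices and reserve a fraction of the data for testing."""
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_test = int(round(test_fraction * len(indices)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [samples[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    x_test = [samples[i] for i in test_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, y_train, x_test, y_test


def accuracy(predicted, true):
    """Number of correct predictions divided by the total number of predictions."""
    return sum(p == t for p, t in zip(predicted, true)) / len(true)


rng = random.Random(0)
labels = [1] * 15 + [0] * 129          # toy snapshot: 15 motile, 129 non-motile cells
samples = list(range(len(labels)))     # stand-ins for per-cell feature vectors
balanced_x, balanced_y = balance_by_undersampling(samples, labels, rng)
x_train, y_train, x_test, y_test = train_test_split(balanced_x, balanced_y, 0.2, rng)
```

On a balanced dataset, a classifier that guesses at random scores an accuracy of about 0.5, which is the natural baseline against which the reported accuracies can be read.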
While this paper presents results based on the application of a multilayer perceptron, we have confirmed that similar results can be achieved using a more sophisticated ML algorithm, specifically a gradient-boosting model, which is a machine-learning method based on decision trees.77 Additionally, our results show that a simple logistic regression model78–80 exhibits markedly lower accuracy in predicting cell motility than either the multilayer perceptron or the gradient-boosting algorithm (see Table S2 in the ESI†71). Consequently, we can conclude that, for this classification problem, a more advanced non-linear model such as a multilayer perceptron is necessary.
Structural features are derived from the centres of mass (COM) of cells and are based solely on properties akin to local structural metrics commonly used for dense, disordered particle systems. These features encompass bond order parameters ψn with n = 2, …, 12,81 along with the first and second moment of the neighbour distance and its standard deviation. The single-cell shape features, instead, are computed based on the pixels that constitute each cell. These geometric features include cell size, border length, semi-minor and semi-major axes, parallel and perpendicular alignment, number of neighbouring cells (calculated for each cell to determine how many other cells are adjacent to it), and eccentricity. The eccentricity is determined by fitting cells with an ellipse using a least squares approach.82
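As an illustration of the structural features, the sketch below computes a bond order parameter and the neighbour-distance statistics from centre-of-mass positions. It assumes a common definition, |ψn| = |(1/N) Σj exp(i n θj)|, where θj is the angle of the bond to neighbour j; the exact formulas used in this work are those listed in Table 1, and the function names here are hypothetical. Periodic boundaries are not handled, so positions are assumed to be already unwrapped relative to the central cell.

```python
import cmath
import math


def bond_order(center, neighbours, n):
    """Magnitude of psi_n for one cell, from the angles of the bonds to its neighbours."""
    cx, cy = center
    total = 0j
    for x, y in neighbours:
        theta = math.atan2(y - cy, x - cx)
        total += cmath.exp(1j * n * theta)
    return abs(total) / len(neighbours)


def neighbour_distance_stats(center, neighbours):
    """First moment, second moment and standard deviation of the neighbour distance."""
    cx, cy = center
    distances = [math.hypot(x - cx, y - cy) for x, y in neighbours]
    mean = sum(distances) / len(distances)
    second_moment = sum(d * d for d in distances) / len(distances)
    std = math.sqrt(max(second_moment - mean * mean, 0.0))
    return mean, second_moment, std


# Six neighbours on a regular hexagon: perfect sixfold order, no fourfold order.
hexagon = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)) for k in range(6)]
psi6 = bond_order((0.0, 0.0), hexagon, 6)
psi4 = bond_order((0.0, 0.0), hexagon, 4)
```

For the regular hexagon, ψ6 is 1 and ψ4 vanishes up to floating-point error, which is why distortions of the local neighbourhood show up as deviations of ψn from these ideal values.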
For both shape and structural features, we further divide these types into two categories: local features, derived from information about individual cells, and non-local features, which depend on the properties of a cell's neighbours. Since motile cells tend to deform their neighbourhoods more significantly, examining non-local features provides additional insights. These non-local properties include neighbour averages, and maximum and minimum distances between the centres of mass of a cell and its neighbouring cells. Cells are classified as neighbours when they share at least one pixel. Similar to previous work,83 local shape alignment between neighbouring cells has also been included. Table 1 illustrates local shape features in blue, local structure features in green, non-local shape features in orange, and non-local structure features in violet. The distributions of various features used in the ML model are provided in the ESI.†71
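A minimal sketch of how non-local features could be built from a lattice snapshot is given below. It assumes that "sharing a pixel" means two cells have pixels that are adjacent along a lattice edge (4-connectivity) on a periodic lattice; the function names and the toy 4 × 4 grid are hypothetical.

```python
def find_neighbours(grid):
    """Map each cell label to the set of labels it touches (4-connectivity, periodic lattice)."""
    n_rows, n_cols = len(grid), len(grid[0])
    neighbours = {}
    for r in range(n_rows):
        for c in range(n_cols):
            a = grid[r][c]
            neighbours.setdefault(a, set())
            # Checking only the pixel below and to the right visits every edge once.
            for b in (grid[(r + 1) % n_rows][c], grid[r][(c + 1) % n_cols]):
                if b != a:
                    neighbours[a].add(b)
                    neighbours.setdefault(b, set()).add(a)
    return neighbours


def non_local_features(feature, neighbours):
    """Neighbour average, maximum and minimum of a per-cell feature."""
    result = {}
    for cell, nbrs in neighbours.items():
        values = [feature[n] for n in nbrs]
        if values:  # a cell without neighbours gets no non-local features
            result[cell] = (sum(values) / len(values), max(values), min(values))
    return result


grid = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
feature = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}   # any local feature, e.g. cell size
neighbours = find_neighbours(grid)
non_local = non_local_features(feature, neighbours)
```

In this toy grid, cells 1 and 4 touch only at a corner and are therefore not neighbours under the edge-sharing assumption; a different adjacency convention would change the neighbour lists and hence the non-local features.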
Note that the list of features used here is by no means complete. Depending on the specific biological situations, other features could be relevant as well. For example, individual human bone marrow stromal cells (hBMSCs) exhibit strong surface curvature, which can also be a relevant shape characteristic to include.84 These features have not been included here, since the cells in the simulations do not exhibit strong curvature. Moreover, it is worth noting that additional radial and angular descriptors can be incorporated into the structural features, as outlined previously.42 However, we choose to focus on a simpler approach for computational efficiency23 and because, as will be explained in the results section, our approach, though simple, is robust and provides sufficiently accurate results.
Our second approach involves using SHAP85 to determine the relative contribution of each feature to the prediction. In essence, the SHAP explanation method computes Shapley values by integrating concepts from cooperative game theory. The objective of this analysis is to distribute the total payoff among players, considering the significance of their contributions to the final outcome. In this context, the feature values act as players, the model represents the coalition, and the payoff corresponds to the model's prediction.
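To make the game-theoretic picture concrete, the sketch below computes exact Shapley values for a three-feature toy model by enumerating all coalitions. It is not the SHAP library, which relies on approximations to scale to many features; here, features outside a coalition are replaced by a baseline value (an assumption of this sketch), and `toy_model` is a hypothetical stand-in for a trained classifier.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical score: one additive feature plus an interaction between two others.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

The additive feature receives its full contribution (2), while the interaction term (6) is split equally between the two interacting features (3 each); the values sum to the difference between the prediction and the baseline prediction, which is the "total payoff" being distributed.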
Lastly, our third approach involves applying PCA86 to our dataset, including shape and structural features. PCA is a valuable tool for condensing multidimensional data with correlated variables into new variables that are linear combinations of the original ones. Essentially, PCA serves as a method to reduce the dimensionality of high-dimensional data. By identifying the features with significant variances, we can reveal the inherent characteristics within our dataset. The first principal component corresponds to the projection axis along which the variance of the data is maximal, whereas the second principal component represents an orthogonal projection axis that maximises the remaining variance. This iterative process can be continued to identify additional components.
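For two features, the principal components can be written in closed form from the 2 × 2 covariance matrix, which makes the idea easy to check by hand. The sketch below is only an illustration with a hypothetical function name; a real analysis with many features would use a numerical eigen-decomposition or singular value decomposition, typically after standardising the features.

```python
import math


def pca_2d(points):
    """Variances and unit axes of the two principal components of 2D data."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Eigenvalues of the symmetric covariance matrix [[sxx, sxy], [sxy, syy]].
    half_trace = 0.5 * (sxx + syy)
    root = math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy ** 2)
    variances = (half_trace + root, half_trace - root)
    # Orientation of the leading eigenvector.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    first = (math.cos(angle), math.sin(angle))
    second = (-math.sin(angle), math.cos(angle))
    return variances, (first, second)


# Perfectly correlated toy data: all variance lies along the diagonal.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
variances, (first_axis, second_axis) = pca_2d(points)
```

For these perfectly correlated points, the first component points along the diagonal and carries all of the variance, so a single new variable replaces the two original ones without loss of information.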
To gain a deeper understanding of the importance of shape features, we have further subdivided the shape features into local features (single-cell information) and non-local features (information from neighbouring cells). As shown in Fig. 3, it is noteworthy that relying solely on local shape features predicts the correct cell phenotype with almost the same accuracy as using the full set of features.
Apart from the accuracy, we have also investigated the types of errors made by our machine learning model. The model can generate false negatives (high-motility cell is not identified as motile) or false positives (zero-motility cells identified as motile). These results for the ML models trained on all features and local shape features are presented in Fig. S13 in the ESI.†71 This analysis shows that when all features are used, both types of error occur with approximately the same frequency, with a slight bias toward not identifying the active cell as Na increases. When only local shape features are used, active cells are missed more frequently, while false identification of passive cells as active is less likely. This suggests that the local environment actually contains some information to improve the prediction of an active cell in a confluent layer. Despite these differences, overall performance remains similar and is still reliable for predicting cell phenotype.
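The two error types can be counted directly from the predicted and true labels, as in the short sketch below. The labels (1 for motile, 0 for non-motile) and the function name are illustrative conventions, and the example lists are made up purely to show the bookkeeping.

```python
def error_counts(predicted, true):
    """Count false negatives (motile cell missed) and false positives (non-motile cell flagged)."""
    false_negatives = sum(1 for p, t in zip(predicted, true) if t == 1 and p == 0)
    false_positives = sum(1 for p, t in zip(predicted, true) if t == 0 and p == 1)
    return false_negatives, false_positives


true = [1, 1, 1, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 1]
false_negatives, false_positives = error_counts(predicted, true)
```

Two models with the same accuracy can differ in how their errors are distributed over these two counts, which is the distinction drawn above between the model trained on all features and the one trained on local shape features only.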
Lastly, we perform analyses using both SHAP and PCA. Both reveal that the list of important features is not limited to local shape features but rather encompasses a combination of the four feature groups: shape (both local and non-local) and structural (both local and non-local) features. Retraining the neural network with the features selected by SHAP or with the principal components obtained from the PCA yields an accuracy almost identical to that obtained with a neural network trained with all features.71 As shown in Fig. S16 and Table S3 in the ESI,†71 these analyses indicate that features related to neighbour distance (e.g., standard deviation of neighbour distance) are often the most important ones. Since the neighbour distance-related features are indirectly connected to the shape of the cells, it is perhaps not surprising that these analyses identify these features as the most important ones.
Although SHAP and PCA reveal that the most important features are a combination of both shape and structure, the list of relevant features selected by these machine-learning approaches changes with Na, making these analyses less computationally efficient. This inefficiency arises from the need to repeat these analyses (SHAP or PCA) for each specific configuration to obtain this list of most important features. Therefore, we can conclude that our simpler approach of selecting only shape features is sufficient for achieving reasonable accuracy for our simplest CP model and is the most robust, consistently yielding results almost identical to those obtained using all features, regardless of Na.
Following a similar approach as in the previous section, we train a neural network for each dataset using static properties, as introduced in Section 2.3. Here, each dataset corresponds to a distinct ratio between low and high cell motility, denoted as γ = κp/κa, along with the number of highly motile cells Na. Fig. 4 shows the accuracy within the (γ, Na)-plane for a neural network trained with only local shape features. Consistent with the results observed for non-motile (passive) cells in the previous section, the neural network exclusively trained on local shape features has nearly identical accuracy compared to the one trained with all 145 features (see Fig. S14 in ESI†71). This figure shows that when the number of highly motile cells Na is low, and the ratio between cell motility γ is small, indicating a substantial difference between high-motility and low-motility cells, the model can accurately classify the cell motility.
While the machine learning model relies on individual static images, our numerical CPM simulations also enable the explicit tracking of the emergent dynamics. Notably, we find that our machine learning model tends to fail only when the emergent dynamics, specifically the long-time diffusion coefficients, of high-motility and low-motility cells are very similar (see Fig. S15 in the ESI†).71 These findings align with those presented in earlier work,23 where it was shown that in an active–passive mixture of spherical, rigid particles, a machine learning model can correctly classify particle types when the number of active particles is low, and the activity is high.
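One standard way to estimate a long-time diffusion coefficient from tracked centres of mass is through the mean squared displacement (MSD), using MSD ≈ 4Dt in two dimensions at long times. The sketch below illustrates this route; whether it matches the procedure used for Fig. S15 is an assumption, the function names are hypothetical, and the trajectory must be unwrapped across the periodic boundaries beforehand.

```python
def mean_squared_displacement(trajectory, lag):
    """Time-averaged MSD of one unwrapped 2D centre-of-mass trajectory at a given lag."""
    n_pairs = len(trajectory) - lag
    total = 0.0
    for t in range(n_pairs):
        dx = trajectory[t + lag][0] - trajectory[t][0]
        dy = trajectory[t + lag][1] - trajectory[t][1]
        total += dx * dx + dy * dy
    return total / n_pairs


def diffusion_estimate(trajectory, lag, time_per_frame=1.0):
    """Estimate D from MSD = 4 D t, meaningful only once the motion is diffusive."""
    return mean_squared_displacement(trajectory, lag) / (4.0 * lag * time_per_frame)


# Arithmetic check on a deterministic toy trajectory (not a diffusive one).
trajectory = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
msd_lag2 = mean_squared_displacement(trajectory, 2)
```

For a persistent random walker, the MSD grows ballistically at short times and crosses over to linear growth only beyond the persistence time, so the lag used for the estimate has to lie in the linear regime.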
Finally, invoking a SHAP analysis or PCA, we achieve accurate predictions using only the most important SHAP- or PCA-selected features (see Fig. S16 and Table S3 in the ESI†71). Similarly to the previous section, where the low-motility cells are passive, we observe that the most important features identified by these analyses are a combination of shape (both local and non-local) and structural (both local and non-local) features. While this feature list remains consistent for fixed γ > 0 and different Na, it varies for different γ. Consequently, as discussed in the previous section, this approach is less computationally efficient than using local shape features, which yields accurate results for different configurations.
In summary, our findings indicate that our machine-learning model can accurately classify cell motility when the number of motile cells is low, and the motility of highly-motile cells significantly surpasses that of low-motility cells. Comparable to the case in which the low-motility cells are passive, accurate predictions can be achieved using local shape features alone. While this approach shows its potential for the simplified CP model that we have developed, we speculate that these results can generalize to other computational models and potentially to experimental results.
In Fig. 5, we compare the accuracy obtained when the ML model is trained and tested on a single, specific value of the number of motile cells Na (black dots), with the accuracy of models trained with Na = 1 (red stars), Na = 15 (blue triangles), or Na = 60 (orange inverted triangles). As expected, this figure shows that the model performs best when trained and tested on a single, specific value of Na. Nevertheless, all four curves yield an accuracy surpassing 0.7, suggesting that a single model trained at a fixed Na can provide accurate predictions for unseen parameter regions.
Fig. 5 reveals that the ML model trained with an intermediate number of motile cells, Na = 15, yields nearly identical results compared to the model trained and tested on a single, specific value of Na. This model is the most effective across the entire range of Na. We attribute this to the fact that a system with an intermediate number of motile cells shares similarities with both low and high numbers of motile cells, contributing to its robust performance. Additionally, while the model trained on one motile cell generalises better to different data associated with a small number of motile cells (Na < 10), the model trained on Na = 60 generalises better to different data corresponding to a high number of motile cells (Na > 30). This discrepancy arises from the distinctive system structures (see feature distributions in ESI†71) between scenarios with only one motile cell and those with a substantial number, respectively.
Lastly, we explore whether the machine learning approach can generalise to a different data set when the number of motile cells is constant, and the ratio between cell motilities γ varies. For each value of Na, we train four distinct models: one with γ = 0 (where κp = 0 and κa = 1500), another with γ = 0.1 (where κp = 150 and κa = 1500), a third with γ = 0.2 (where κp = 150 and κa = 750), and the last one with γ = 0.4 (where κp = 150 and κa = 375). Subsequently, each of these four models is tested with the local shape features corresponding to the dataset with a fixed ratio γ = 0.1. We have decided to test the generalization of the ML model with γ = 0.1, as the previous section has shown that the accuracy is highest for this mixture of high-motility and low-motility cells.
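The four training sets can be tabulated together with their motility ratio γ = κp/κa, which makes explicit how each ratio is realised. The dictionary keys and the function name below are hypothetical; the κ values are those quoted above.

```python
training_sets = [
    {"kappa_p": 0.0, "kappa_a": 1500.0},
    {"kappa_p": 150.0, "kappa_a": 1500.0},
    {"kappa_p": 150.0, "kappa_a": 750.0},
    {"kappa_p": 150.0, "kappa_a": 375.0},
]


def motility_ratio(kappa_p, kappa_a):
    """gamma = kappa_p / kappa_a: low-motility strength relative to high-motility strength."""
    return kappa_p / kappa_a


gammas = [motility_ratio(**s) for s in training_sets]
for s, gamma in zip(training_sets, gammas):
    print(f"kappa_p = {s['kappa_p']:6.1f}, kappa_a = {s['kappa_a']:6.1f}, gamma = {gamma:.1f}")
```

Note that γ = 0.2 and γ = 0.4 are obtained by lowering κa at fixed κp = 150, so a larger γ in this comparison also means less motile high-motility cells than in the γ = 0.1 test set.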
Fig. 6 demonstrates that, as expected, the highest performance is achieved by a model trained and tested on the identical ratio between cell motilities, γ = 0.1 (black dots). Additionally, the figure shows that a model trained on γ = 0 (represented by red stars) yields an accuracy nearly indistinguishable from the model trained and tested on γ = 0.1 when the number of motile cells is high (Na > 10). In both datasets corresponding to γ = 0 and 0.1, the motility of the highly motile cells remains constant. Consequently, the model exhibits effective generalisation within this parameter range, even if it is trained on data with a different low-motility strength. This generalisation can be attributed to the similarity in behaviour between the two systems, given that the abundant high-motility cells share the same motility.
When the model is trained on γ = 0.2 and tested on γ = 0.1 (blue triangles), the accuracy is always lower than that of the model trained and tested on γ = 0.1. Nonetheless, the accuracy is consistently higher than 0.7, indicating that this model can generalise reasonably well to unseen data. Lastly, when the model is trained on γ = 0.4 (orange inverted triangles), the accuracy significantly diminishes compared to the model trained and tested on γ = 0.1. Furthermore, the accuracy drops below 0.7 as the number of motile cells increases (Na > 20). These results indicate that a decrease in the motility of the motile cells in the training set corresponds to a lower predictive power of the model when tested on unseen data. We expect that testing the trained networks on larger values of γ would also make generalisation more difficult, as the baseline prediction (see Fig. 4) is significantly worse for such values in the first place.
In summary, we find that the generalisation capability of our machine-learning approach to different unseen data is reasonable. The model is capable of making fairly accurate predictions when the number of motile cells is unknown, but its predictive power diminishes when the motilities of the highly motile cells in the training and testing sets differ significantly.
Common limitations of machine-learning approaches are that they may generalise poorly to unseen data, and that they may offer limited physical insight. We find that our model exhibits reasonably good generalisation when the number of motile cells or the motility ratio is unknown, provided that the motility strengths in the training and testing sets do not differ greatly. This reaffirms that the power of machine-learning methods relies heavily on the use of a sufficiently diverse data set. Additionally, to gain some physical insight from our machine-learning predictions, we have employed three different methods to assess the importance of the various input features. Of these, the analyses based on SHAP and PCA reveal that there is not a universal list of most important static features: in general, the most important features combine cellular shape and structural characteristics, and the list varies with different heterogeneity settings (Na). Nonetheless, if we restrict the data set to local single-cell shape features alone, we find that this simple approach leads to remarkably robust predictions across the different settings studied in this work. This suggests that the full list of structural input features may contain some redundancies. Importantly, it also allows us to conclude that a cell's instantaneous shape, though not perfect, can serve as a remarkably useful informant on a cell's phenotype.
Our work, which establishes a morphodynamic link for individual cells, is complementary to recent research on morphodynamic links at the collective cell level. In particular, previous studies have demonstrated that the average cell shape within confluent tissue can be used as a static order parameter for emergent, collective cell jamming and unjamming dynamics.9–16 By integrating these insights, our work not only reinforces the significance of cell shape in understanding collective behaviour, but it also provides a more nuanced perspective on how intrinsic single-cell properties are coupled to a cell's morphology. This study also opens up avenues for further research on the role of heterogeneity in dense cell collectives, following previous work that has studied the heterogeneity in size and softness of cells.89,90 Given the simplicity, performance, and computational efficiency of our machine-learning approach, we anticipate that a similar approach could ultimately prove valuable in analyzing experimental cell data—particularly for diagnostic tasks like assessing the progression of partial or complete EMT in tumors or tissues.
Footnotes
† Electronic supplementary information (ESI) available: Details on additional shape and structure characterization, additional figures about accuracy of the machine-learning algorithm and results for SHAP and PCA analysis on the data. See DOI: https://doi.org/10.1039/d5sm00222b
‡ These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2025