An Integrative Omics-Machine Learning Framework for Predicting Pulmonary Responses to Titanium Dioxide Nanoparticles
Abstract
We address the challenge of using complex lung transcriptomic data in nanosafety assessment by re-optimizing the Transcriptomic Response Index (TRI) as a single, New Approach Methodology (NAM)-ready omics endpoint for instilled titanium dioxide nanoparticles (TiO2-NPs). Using mouse lung gene-expression profiles collected after intratracheal instillation of five TiO2-NPs at three doses and two post-exposure times (30 conditions), we compressed 621 differentially expressed genes (DEGs) into a one-variable index via Principal Component Analysis (PCA) and systematically evaluated TRI variants built from the first two principal components (PC1–PC2; TRI2) up to the first 29 (PC1–PC29; TRI29). The full composition (TRI29) reconstructs ~99.9% of the transcriptomic variance, whereas the compact TRI2 captures only ≈44% yet gave the best predictive performance when linked to exposure predictors via ridge regression with Genetic Algorithm-based feature selection (R² = 0.79; Q²CV = 0.70; Q² = 0.79). Because TRI is built from PC loadings, it can be mapped back to individual genes. Collectively, our results demonstrate that TRI is portable across different nanomaterials but must be re-optimized per dataset to balance explained variance against predictive performance. TRI2 provides a compact, interpretable, and externally validated bridge from TiO2-NP attributes to system-level transcriptomic response. By replacing hundreds of gene-specific models with a single, validated endpoint, this workflow streamlines omics-driven screening, potency ranking, and study-internal extrapolation in regulatory-relevant NAM pipelines.
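The workflow summarized above (PCA compression of the DEG matrix, a k-component index, and a ridge model scored by cross-validation) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the data are random numbers with the same shape as the study (30 conditions, 621 DEGs), the three exposure descriptors are hypothetical, the index is taken to be a variance-weighted sum of the first k PC scores (the abstract does not state the aggregation rule), Q² is computed by leave-one-out as one common definition, and the Genetic Algorithm feature-selection step is omitted. Note that centering 30 conditions leaves at most 29 components with non-zero variance, which is consistent with TRI29 being the fullest composition considered.

```python
import numpy as np


def pca_scores(X):
    """PCA via SVD: return PC scores, loadings, and explained-variance ratios."""
    Xc = X - X.mean(axis=0)                      # center each gene (column)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * s                               # condition coordinates on each PC
    var_ratio = s**2 / np.sum(s**2)              # fraction of variance per PC
    return scores, Vt, var_ratio


def tri(scores, var_ratio, k):
    """Hypothetical index: variance-weighted sum of the first k PC scores."""
    return scores[:, :k] @ var_ratio[:k]


def ridge_fit(Z, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    z_mean, y_mean = Z.mean(axis=0), y.mean()
    Zc = Z - z_mean
    w = np.linalg.solve(Zc.T @ Zc + alpha * np.eye(Z.shape[1]), Zc.T @ (y - y_mean))
    b = y_mean - z_mean @ w
    return w, b


def q2_loo(Z, y, alpha=1.0):
    """Leave-one-out cross-validated Q2 = 1 - PRESS / total sum of squares."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i                 # hold out condition i
        w, b = ridge_fit(Z[keep], y[keep], alpha)
        press += (y[i] - (Z[i] @ w + b)) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


# Synthetic stand-ins (NOT the study data): 30 conditions x 621 DEGs,
# plus three hypothetical exposure descriptors per condition.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 621))
Z = rng.normal(size=(30, 3))

scores, loadings, var_ratio = pca_scores(X)
cumulative = np.cumsum(var_ratio)                # variance covered by PC1..PCk

tri2 = tri(scores, var_ratio, 2)                 # compact index
tri29 = tri(scores, var_ratio, 29)               # fullest index

w, b = ridge_fit(Z, tri2)
q2 = q2_loo(Z, tri2)

# Mapping back to genes: PC scores times loadings rebuild the centered matrix.
X_rebuilt = scores[:, :29] @ loadings[:29]
```

On real data, the loop over k = 2 … 29 would repeat the `tri`, `ridge_fit`, and `q2_loo` calls for each variant and compare the resulting statistics against `cumulative[k - 1]`, which is the trade-off between explained variance and predictability that the abstract describes.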