Computational Modeling Using Multi-Omics to Extract Early Predictive Signatures of T-Cell Quality
Chimeric Antigen Receptor (CAR) T-cell therapy involves the genetic modification of a patient's autologous T cells to find and attack cancer cells throughout the body. Establishing critical quality attributes (CQAs) and critical process parameters (CPPs) is crucial for ensuring the potency, safety, and consistency of this therapy. To better understand these therapies, a design of experiments evaluating the expansion of T cells was carried out. Through a supervised learning approach, multi-omics predictors (i.e., secretome and NMR metabolomics features) can be used to understand T-cell behavior in terms of growth and memory responses. The purpose of this work is to develop a computational pipeline that characterizes multi-omics profiles predictive of quality responses at early stages of the manufacturing process. This includes the design of a computational tool that measures the predictive power of omics variables and the sensitivity of the resulting models to highly correlated predictors. The tool was developed using mathematical modeling and machine learning techniques, namely Random Forest (RF), Gradient Boosted Trees (GBT), Support Vector Machines (SVM), and Symbolic Regression (SR). A consensus measurement across the different models was used to identify potential CQAs and CPPs. The biological meaning of these features is then assessed through discussions with domain experts. The models reached a degree of consensus in identifying important variables when modeling the percentage of T cells that are CCR7+CD62L+CD4+ (CD4_mem_frac). The models showed high prediction performance, with R-squared ranging from 75% to 95%; SVM and SR gave the lowest and highest prediction performance, respectively. However, the ranking of important features from even the best-performing models can be misleading if these features are highly correlated.
Hence, to mitigate the computational impact of highly correlated predictors, an approach that clusters these variables before the model fitting process is proposed. Preliminary results showed that this methodology not only ranked groups of correlated predictors but also improved the performance of RF models. The findings of this work could enable the discovery of new knowledge necessary to achieve scalable, automated biomanufacturing of CAR T-cell therapies.
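The idea of clustering correlated predictors before model fitting can be illustrated with a minimal sketch. The abstract does not state which clustering method was used, so the sketch assumes a simple one: predictors are grouped whenever their absolute Pearson correlation meets a threshold (single-linkage grouping), and each group is then summarized by the mean of its standardized, sign-aligned members. The feature names, the data values, the 0.9 threshold, and the helper functions are all hypothetical and serve only as illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_correlated(columns, threshold=0.9):
    """Group predictor names whose |r| >= threshold (assumed rule).

    columns maps a predictor name to its list of measurements.
    Groups are the connected components of the thresholded
    correlation graph, found with a small union-find.
    """
    names = list(columns)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                parent[find(b)] = find(a)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())


def cluster_representatives(columns, clusters):
    """One feature per cluster: mean of z-scored, sign-aligned members."""
    reps = {}
    for members in clusters:
        ref = columns[members[0]]
        scored = []
        for m in members:
            col = columns[m]
            mu, sd = mean(col), pstdev(col)
            # Flip members that are negatively correlated with the first one.
            sign = -1.0 if pearson(ref, col) < 0 else 1.0
            scored.append([sign * (v - mu) / sd for v in col])
        reps["+".join(members)] = [mean(vals) for vals in zip(*scored)]
    return reps


# Hypothetical measurements for three predictors across five runs.
data = {
    "cytokine_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "cytokine_b": [2.1, 3.9, 6.2, 8.0, 9.9],
    "metabolite_x": [5.0, 1.0, 4.0, 2.0, 3.0],
}

clusters = cluster_correlated(data, threshold=0.9)
reps = cluster_representatives(data, clusters)
print(clusters)
print(sorted(reps))
```

Under these assumptions, the cluster-level features in `reps` would replace the raw predictors as model inputs, so that any importance score is assigned to a group of correlated variables rather than split arbitrarily among its members.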