Time: 2024-08-11
The framework for predicting patient survival and response incorporates knowledge graphs (KGs) into patient survival prediction models. Using the patient's genomic signature, gene information is projected onto the KG, expanding the available knowledge to include gene-gene interactions. The resulting connected graph is then projected to a lower-dimensional embedding using the SocioDim algorithm. Gene-specific embeddings are aggregated into a patient-level embedding that serves as input to the machine learning (ML) patient survival prediction model. Several methods for generating low-dimensional embeddings from KGs were evaluated, with the SocioDim algorithm proving advantageous.
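The embedding and aggregation steps can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the study: the toy graph and the gene names G1 to G6 are hypothetical, only the leading dimension of a SocioDim-style embedding is computed (the leading eigenvector of the modularity matrix, found by power iteration, whereas a full embedding would keep the top several eigenvectors), and gene embeddings are aggregated by a simple mean, which may differ from the aggregation the framework uses.

```python
import math

def leading_modularity_vector(adj, iters=500):
    """Leading eigenvector of the modularity matrix B = A - k k^T / (2m).

    adj maps each node to the set of its neighbours (undirected graph).
    Power iteration is run on B + shift*I so that all eigenvalues are
    non-negative and the iteration converges to the top eigenvector of B.
    """
    nodes = sorted(adj)
    deg = {n: len(adj[n]) for n in nodes}
    two_m = sum(deg.values())
    shift = 2 * max(deg.values())  # Gershgorin bound on |eigenvalues of B|
    v = {n: float(i + 1) for i, n in enumerate(nodes)}  # deterministic start
    for _ in range(iters):
        k_dot_v = sum(deg[n] * v[n] for n in nodes)
        w = {}
        for n in nodes:
            a_v = sum(v[nb] for nb in adj[n])  # (A v)[n]
            w[n] = a_v - deg[n] * k_dot_v / two_m + shift * v[n]
        norm = math.sqrt(sum(x * x for x in w.values()))
        v = {n: x / norm for n, x in w.items()}
    return v

def patient_embedding(mutated_genes, gene_embedding):
    """Mean of the embeddings of the patient's genes found in the KG (assumed rule)."""
    vecs = [gene_embedding[g] for g in mutated_genes if g in gene_embedding]
    if not vecs:
        return None  # none of the patient's genes map onto the KG
    dim = len(vecs[0])
    return [sum(vec[d] for vec in vecs) / len(vecs) for d in range(dim)]

# Hypothetical toy KG: two gene triangles joined by a single G3-G4 edge.
toy_kg = {
    "G1": {"G2", "G3"},
    "G2": {"G1", "G3"},
    "G3": {"G1", "G2", "G4"},
    "G4": {"G3", "G5", "G6"},
    "G5": {"G4", "G6"},
    "G6": {"G4", "G5"},
}

# One-dimensional gene embeddings (leading modularity dimension only).
emb = {g: [x] for g, x in leading_modularity_vector(toy_kg).items()}

patient_a = patient_embedding(["G1", "G2"], emb)
patient_b = patient_embedding(["G5", "G6", "G9"], emb)  # G9 is not in the KG and is skipped
```

In this toy graph the two patients end up with embeddings of opposite sign, because their mutated genes sit in different densely connected groups of the graph.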
A stand-alone software application programming interface (API) was developed to combine patient genomic data with relevant prior knowledge from KGs as input to the ML model for predicting patient survival. The study compared the performance of the internally developed knowledge graph BIKG with that of the well-established biomedical KG Hetionet, demonstrating that the framework adapts to different KG platforms.
The proposed framework was applied to assess how much KGs improve survival prediction in non-small cell lung cancer (NSCLC) patients from different studies and clinical trials. The study evaluated multiple ML-based predictive algorithms for patient survival, with random survival forest proving to be the fastest, most stable, and most generalizable method.
Incorporating prior knowledge from KGs improved the accuracy of overall survival (OS) prediction for NSCLC patients treated with immune checkpoint inhibitors (IO) in clinical trials. Models trained with the BIKG in combination with gene panel data outperformed models based solely on gene panel data in predicting OS in both the OAK and POPLAR clinical trials.
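The summary does not state which metric was used to compare the models. One common choice for survival models is Harrell's concordance index, sketched below as an assumption rather than as the study's method. The function name and the toy follow-up data are hypothetical and do not represent trial results.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i had an observed event and
    a shorter follow-up time than patient j. It is concordant when the
    model assigns i the higher risk. Tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical follow-up times (months) and event flags (1 = death observed).
times = [5, 10, 12, 20]
events = [1, 1, 0, 1]

c_perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
c_reversed = concordance_index(times, events, [0.1, 0.4, 0.7, 0.9])
c_uninformative = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to an uninformative model, and 0.0 to a fully reversed ranking. Under this metric, a model using KG-derived features would be compared with a gene-panel-only model by computing the index for both on the same held-out cohort.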
The study identified a biomarker-based signature, built from a combination of key genes, for differentiating OS in NSCLC patients. This gene mutation signature significantly differentiated OS in patient cohorts from the OAK and MSK studies. The prevalence of mutations in the predicted high- and low-risk groups was also compared, with models using the BIKG identifying more high-risk patients associated with specific gene mutations.
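To make the idea of a signature that differentiates OS concrete, the sketch below splits a toy cohort by mutation status and computes a Kaplan-Meier survival curve for each group. Everything specific here is an assumption: the gene names, the rule that a patient carries the signature if any signature gene is mutated, and the patient records are all hypothetical, and the study's actual signature and statistical test are not given in this summary.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate, returned as [(t, S(t))] at each event time."""
    order = sorted(zip(times, events))
    at_risk = len(order)
    surv = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(order) and order[i][0] == t:
            deaths += order[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed  # deaths and censored patients leave the risk set
    return curve

# Hypothetical signature and cohort (time in months, event 1 = death observed).
signature = {"GENE_A", "GENE_B"}
patients = [
    {"mut": {"GENE_A"}, "time": 4, "event": 1},
    {"mut": {"GENE_B", "GENE_C"}, "time": 6, "event": 1},
    {"mut": {"GENE_A", "GENE_B"}, "time": 9, "event": 0},
    {"mut": {"GENE_C"}, "time": 8, "event": 1},
    {"mut": set(), "time": 15, "event": 0},
    {"mut": {"GENE_C"}, "time": 20, "event": 1},
]

curves = {}
for carrier in (True, False):
    group = [p for p in patients if bool(p["mut"] & signature) == carrier]
    curves[carrier] = kaplan_meier(
        [p["time"] for p in group], [p["event"] for p in group]
    )
```

Comparing `curves[True]` with `curves[False]` shows how survival diverges between signature carriers and non-carriers in the toy cohort. A real analysis would add a significance test such as the log-rank test.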
Overall, incorporating prior knowledge from KGs into predictive models shows promise for improving the accuracy of OS prediction in NSCLC patients, particularly when combined with gene panel data. The study also highlights the potential of biomarker-based signatures to differentiate OS outcomes in NSCLC patients across various datasets and clinical trials.