Abstract:
Objective: To develop a stacked ensemble model that integrates clinical risk factors and plasma proteins to predict the risk of bladder cancer.
Methods: This study used the UK Biobank cohort and included 419 incident bladder cancer cases and 33,453 controls. Cox proportional hazards models were used to screen for plasma proteins associated with the risk of bladder cancer. Random forest and Boruta algorithms were then applied to the screened proteins, and the proteins selected by both algorithms were retained as feature proteins. A bladder cancer risk prediction model was constructed using a stacked ensemble learning strategy, trained on the clinical risk factors together with the feature proteins.
Results: The Cox proportional hazards models identified 104 proteins associated with the risk of bladder cancer, of which 20 were retained as feature proteins after algorithmic screening. The stacked ensemble model integrating clinical risk factors and feature proteins achieved an area under the receiver operating characteristic (ROC) curve of 0.788 for predicting the risk of bladder cancer.
Conclusion: Multiple plasma proteins are associated with the risk of bladder cancer, and a stacked ensemble model that combines these proteins with clinical risk factors can predict the risk of developing bladder cancer.
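
For readers unfamiliar with stacked ensembles, the following is a minimal sketch of the general technique named in the Methods, not the study's actual pipeline. It assumes scikit-learn, uses synthetic data in place of UK Biobank, treats the outcome as a binary label rather than a time-to-event endpoint, and picks the base learners, the meta-learner, and the 25-column feature layout (standing in for clinical factors plus 20 feature proteins) purely for illustration.

```python
# Illustrative sketch only: synthetic data and hypothetical model choices.
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for "clinical risk factors + feature proteins"
# with an imbalanced outcome (about 10% positives).
X, y = make_classification(
    n_samples=2000,
    n_features=25,
    n_informative=8,
    weights=[0.9, 0.1],
    random_state=0,
)

# Hold out a stratified test set so the AUC is not computed on training data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Level-0 (base) learners; these three are assumptions, not the paper's list.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("gb", GradientBoostingClassifier(random_state=0)),
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
]

# Level-1 (meta) learner is fitted on out-of-fold base predictions (cv=5),
# which limits leakage from the base learners' training fit.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
    stack_method="predict_proba",
)
stack.fit(X_train, y_train)

# Predicted probability of the positive class, scored by ROC AUC.
risk = stack.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, risk)
print(f"Held-out ROC AUC on synthetic data: {auc:.3f}")
```

The printed AUC describes only the synthetic data and is not comparable to the 0.788 reported above. The point of the sketch is the structure: several base models produce out-of-fold risk estimates, and a simple meta-model learns how to combine them.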