Construction of a risk prediction model for postoperative pulmonary infection after grade Ⅳ thoracoscopic surgery for lung cancer based on machine learning algorithms

  • Abstract:
    Objective To develop and validate risk prediction models based on five machine learning algorithms for assessing the risk of postoperative pulmonary infection (PPI) in lung cancer patients undergoing grade Ⅳ thoracoscopic surgery.
    Methods In this retrospective study, 2,380 lung cancer patients who underwent grade Ⅳ thoracoscopic surgery at a tertiary grade A hospital in Shanghai between January 2022 and June 2024 were enrolled and divided into a training set (n=1,665) and a validation set (n=715). Five machine learning algorithms, namely Logistic regression (LR), artificial neural network (ANN), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGB), were used to construct PPI risk prediction models, and a nomogram was plotted.
    Results Of the 2,380 patients, 226 (9.5%) developed PPI. Least absolute shrinkage and selection operator (LASSO) regression selected eight feature variables: daily cigarette consumption, history of diabetes, preoperative diffusing capacity, maximal tumor diameter, chest drainage volume in the first 24 h after surgery, perioperative oral nutritional supplementation (ONS), postoperative indwelling urinary catheter, and degree of intraoperative pleural adhesion. For the five models built on these eight clinical features, the area under the curve (AUC) ranged from 0.862 to 0.947. The XGB model performed best (AUC=0.947, 95%CI: 0.937 to 0.962), followed by the LR model (AUC=0.926, 95%CI: 0.918 to 0.933).
    Conclusion Machine learning-based models can help accurately identify patients at high risk of PPI after grade Ⅳ thoracoscopic surgery for lung cancer. The nomogram may serve as a practical tool for nursing staff in managing risk in these patients.
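
    The Results report each model's discrimination as an AUC with a 95%CI, but the abstract does not say how those intervals were estimated. As a minimal sketch only, the following assumes a percentile bootstrap over patients in the validation set. The function names `auc_score` and `bootstrap_auc_ci` are hypothetical, and the labels and scores are toy values, not study data.

```python
import random


def auc_score(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) formula; tied scores get average ranks."""
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the group of tied scores starting at i.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point AUC plus a percentile-bootstrap CI (assumed method, not the paper's)."""
    point = auc_score(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[max(0, int(alpha / 2 * n_boot))]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)]
    return point, lo, hi


# Toy example: 1 = developed PPI, 0 = did not; scores are predicted risks.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(bootstrap_auc_ci(toy_labels, toy_scores, n_boot=200))
```

    With an outcome rate near the reported 9.5%, resamples that contain only one class are rare, so the skip in the loop would seldom trigger on a validation set of 715 patients.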

     
