Abstract:
Objective To construct and validate risk prediction models for malnutrition in non-dialysis patients with stage 3 to 5 chronic kidney disease (CKD) using logistic regression (LR) and XGBoost algorithms, and to compare the predictive performance of the two models.
Methods A total of 506 patients with CKD were enrolled. They were split in chronological order into a training set (n=404) and a test set (n=102) at a ratio of 8:2. The training set was divided into a case group and a control group according to whether patients were malnourished, with 202 patients in each group. LR and XGBoost models were built, and their performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, specificity, the GiViTI calibration belt and clinical decision curves.
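The chronological 8:2 split and the discrimination metrics named above can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' code: the function names, the 0.5 probability threshold and the toy labels and scores are all assumptions made for the example, and the AUC is computed through its rank interpretation (the probability that a randomly chosen malnourished patient receives a higher score than a randomly chosen non-malnourished one).

```python
from typing import List, Sequence, Tuple


def chronological_split(records: Sequence, train_fraction: float = 0.8) -> Tuple[list, list]:
    """Split records that are already sorted by enrollment date.

    The earliest records form the training set, the latest the test set.
    """
    cut = int(len(records) * train_fraction)
    return list(records[:cut]), list(records[cut:])


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical patient indices standing in for 506 date-ordered records.
train_ids, test_ids = chronological_split(list(range(506)))

# Hypothetical toy data: 1 = malnourished, scores = predicted probabilities.
toy_labels: List[int] = [1, 1, 1, 0, 0, 0]
toy_scores: List[float] = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
toy_auc = auc(toy_labels, toy_scores)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores)
```

With 506 records and a training fraction of 0.8, the cut falls at index 404, which reproduces the 404/102 split reported here. The pairwise AUC is quadratic in the number of patients, which is acceptable at this sample size; a rank-based implementation would be preferable for much larger cohorts.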
Results The LR model identified age ≥60 years, CKD stage 5, reduced appetite, hypoalbuminemia, low prealbumin, low mid-arm muscle circumference and high perceived stress as independent risk factors for malnutrition in non-dialysis CKD patients, and physical activity as a protective factor (P < 0.05). In the XGBoost model, the five most influential variables were serum albumin, appetite, physical activity, prealbumin and mid-arm muscle circumference. The AUCs of the LR and XGBoost models were 0.930 and 0.947 in the training set and 0.925 and 0.933 in the test set, respectively. The predictive ability of the XGBoost model was slightly higher, but the difference was not statistically significant (P > 0.05). The GiViTI calibration belts showed good calibration for both models.
Conclusion The XGBoost model combined with SHapley Additive exPlanations (SHAP) performed slightly better than the LR model in identifying malnourished patients and can help guide precise care.