Development and validation of a LASSO prediction model for cisplatin induced nephrotoxicity: a case-control study in China

Abstract

Background

Early identification of individuals at high risk of cisplatin-induced nephrotoxicity (CIN) is crucial for preventing CIN and improving prognosis. In this study, we developed and validated a CIN prediction model based on general clinical data, laboratory indicators, and genetic features of lung cancer patients before chemotherapy.

Methods

We retrospectively included 696 lung cancer patients receiving platinum chemotherapy regimens from June 2019 to June 2021 as the training set. A predictive model was constructed using least absolute shrinkage and selection operator (LASSO) regression, cross-validation, and Akaike’s information criterion (AIC) to select important variables. We prospectively enrolled 283 independent lung cancer patients from July 2021 to December 2022 as the test set to evaluate the model’s performance.

Results

The prediction model showed good discrimination and calibration, with AUCs of 0.9217 and 0.8288, sensitivity of 79.89% and 45.07%, and specificity of 94.48% and 94.81% in the training and test sets, respectively. Clinical decision curve analysis suggested that the model has value for clinical use when the risk threshold ranges between 0.1 and 0.9. The precision-recall (PR) curve showed that, in the recall interval from 0.5 to 0.75, precision declined gradually with increasing recall while remaining relatively high (up to 0.9).

Conclusions

Predictive models based on laboratory and demographic variables can serve as a beneficial complementary tool for identifying high-risk populations with CIN.

Introduction and background

Cisplatin and its analogues are widely used in chemotherapy regimens for cancer treatment, with approximately 10-20% of cancer patients receiving such treatment. However, the side effects of cisplatin can lead to reduced dosage or the selection of alternative therapies, ultimately affecting prognosis. The lack of effective measures to alleviate side effects, such as gastrointestinal problems, hematologic toxicity, neurotoxicity, and ototoxicity, can decrease quality of life and increase medical costs [1]. Cisplatin-induced nephrotoxicity (CIN) is a common side effect, affecting 20-45% of patients, and is the main limitation on the use of cisplatin [2,3,4]. Chemotherapy itself can cause renal tubular injury, interstitial nephritis, and thrombotic microvascular disease [5]. As cisplatin uptake and excretion are mainly mediated by proximal tubule transporters, its accumulation in renal proximal tubule cells can lead to cell injury [2]. Risk factors reported to date for CIN include advanced age, smoking, type of cancer, comorbidities, baseline blood biochemical levels before chemotherapy (such as creatinine, albumin, and cystatin), exposure to nephrotoxic drugs (such as iodinated contrast agents, long-term use of non-steroidal anti-inflammatory drugs (NSAIDs), and gemcitabine), electrolyte disorders (low serum magnesium levels), alcohol intake, high-dose cisplatin (≥ 50 mg/m2 per dose), frequency of administration, cumulative dose, and insufficient hydration during administration [6, 7]. By investigating related pathological mechanisms, such as reactive oxygen species and mitochondrial dysfunction, cell death pathways, inflammatory responses, autophagy, and other related signaling pathways, researchers have identified differences in the genetic characteristics of key genes in CIN [2, 8,9,10].
However, variations in clinical features, laboratory and genetic results, and the weight of risk factors have been observed across different studies, and there is a lack of sensitive and specific CIN prediction biomarkers for both genetic and non-genetic factors [11]. These differences may be attributed to genetic variability among research subjects, disease types and protocols, inconsistencies in laboratory results and research design and the standardization of data analysis [1, 12].

Predictive models have been widely used in diagnosis, treatment, and prognostic evaluation because they integrate multiple non-specific factors and comprehensively assess their weights [13]. Such models may help identify individuals at risk of nephrotoxicity, guide optimal drug and dose selection, and inform prevention strategies. Given the genetic heterogeneity of tumors, it is necessary to construct a prediction model for a specific tumor type that combines comprehensive clinical information with information on specific target genes.

Candidate gene studies and genome-wide association studies (GWAS) have identified several genetic risk factors for CIN [7, 11]. Okawa et al. [5] developed a prediction model for CIN in elderly prostate cancer patients using a random forest algorithm that incorporated clinical and genomic characteristics extracted from saliva samples. Genomic markers associated with nephrotoxicity are believed to be located in the regions between NAT1, NAT2, CNTN6, and CNTN4. Lung cancer remains the leading cause of cancer-related deaths worldwide, accounting for 30% of all cancer deaths in China [14, 15]. In terms of incidence, lung cancer is the most common cancer in China, with a mortality rate of 50% in Chinese males in 2020 [14]. Commonly recognized genetic variants associated with lung cancer and CIN include single nucleotide polymorphisms in genes such as ERCC1, ERCC2, and SLC22A2 [12]. In our study on mitochondrial pathway disorders, we observed a reduced risk of nephrotoxicity in carriers of the T allele of rs920829 in the TRAP1 gene compared to carriers of the C allele (OR 0.684, 95% CI 0.524–0.894, p = 0.003). Consequently, we plan to include SNP features of the ERCC1, ERCC2, SLC22A2, and TRAP1 genes in future research.

The objective of this study is to utilize Lasso regression to identify suitable clinical and genetic features and construct and validate a CIN risk prediction model for lung cancer patients.

Materials and methods

Study subjects

A retrospective training set was constructed to develop a predictive model for patients with a confirmed lung cancer diagnosis who received a platinum chemotherapy regimen. The training set included 696 patients who were hospitalized at Sichuan Provincial People’s Hospital between June 2019 and June 2021, of whom 189 had CIN. A test set of 283 patients with lung cancer who received a platinum chemotherapy regimen was prospectively and consecutively enrolled from July 2021 to December 2022 in the same hospital. All patients underwent the same preliminary clinical evaluation and treatment observation. The research process is shown in Fig. 1.

Fig. 1 Flow diagram of the study population

Inclusion criteria were as follows: unrelated Han Chinese; receiving carboplatin-based chemotherapy; signed written informed consent; available demographic characteristics, physical examination, and laboratory examinations; pathologically and histologically confirmed lung cancer; normal liver and kidney function before chemotherapy; and no obvious abnormalities in the preliminary clinical evaluation. Exclusion criteria were: age < 18 years; liver or kidney dysfunction prior to initial chemotherapy [16]. This study conformed to the provisions of the Declaration of Helsinki (as revised in 2013) and was approved by the Ethics Committee of Sichuan Provincial People’s Hospital, University of Electronic Science and Technology of China Hospital (Registration Number: AF-02/01.0). The chemotherapy regimens are listed in Table 1.

Table 1 Clinical characteristics and Indications for clinical laboratory tests of the study subjects

Definitions

Throughout each treatment cycle, toxicity information pertinent to the evaluation of cisplatin therapy (defined using the Common Terminology Criteria for Adverse Events version 5.0) was documented at least twice weekly [17]. Nephrotoxicity was graded as follows: grade 1, creatinine increase of more than 0.3 mg/dL or 1.5–2.0 times baseline; grade 2, 2–3 times baseline; grade 3, more than 3 times baseline, absolute level above 4.0 mg/dL, or requiring hospitalization; and grade 4, life-threatening consequences or requiring dialysis [17]. After 2 and 14 cycles, oncologic outcome reporting criteria were used to classify patient responses to treatment into 4 categories: complete response (CR), partial response (PR), stable disease (SD), and progressive disease (PD) [18].
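The grading rules above can be written as a small function. This is a minimal sketch rather than the authors' implementation: the function name, its arguments, and the handling of boundary values (exactly 2.0 times baseline is treated as grade 1 and exactly 3.0 times as grade 2) are assumptions made for illustration.

```python
def grade_nephrotoxicity(baseline, current, hospitalized=False,
                         dialysis_or_life_threatening=False):
    """Return a nephrotoxicity grade (0-4) from serum creatinine in mg/dL.

    Boundary handling is an assumption; the text gives overlapping ranges.
    """
    if dialysis_or_life_threatening:
        return 4
    ratio = current / baseline
    # Grade 3: > 3x baseline, absolute level > 4.0 mg/dL, or hospitalization
    if ratio > 3.0 or current > 4.0 or hospitalized:
        return 3
    # Grade 2: more than 2x and up to 3x baseline
    if ratio > 2.0:
        return 2
    # Grade 1: rise of more than 0.3 mg/dL, or 1.5-2.0x baseline
    if ratio >= 1.5 or (current - baseline) > 0.3:
        return 1
    return 0


# Hypothetical creatinine values, for illustration only
print(grade_nephrotoxicity(baseline=1.0, current=2.5))
```

A patient whose creatinine rises from 1.0 to 2.5 mg/dL has a ratio of 2.5 and falls in grade 2 under these assumptions.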

Data collection, preprocessing, and feature variable screening

The definitive diagnosis of CIN and the basic medical history of subjects were exported from the Hospital Information System (HIS) by data collectors, and all relevant laboratory indicators were exported from the Laboratory Information System (LIS), including complete blood count (SYSMEX XN-10, Sysmex, Japan), coagulation tests (SYSMEX CS-5100, Sysmex, Japan), and biochemical examination (Cobas c702, Roche, Germany) (Table 1). Candidate SNP loci were typed using 48-Plex SNPscan® high-throughput SNP typing technology (18). Thirty samples were randomly selected for double-blind experiments to ensure the repeatability and stability of the genotyping results, and all genotype calling success rates were greater than 99.0% [19]. For variables measured multiple times, we retrieved from the HIS the admission records of patients who underwent cisplatin chemotherapy regimens and included their initial test records upon admission. Where < 10% of values were missing, they were filled with the median for continuous variables and the mode for categorical variables; data with > 10% missing were excluded. The medical records were used by data collectors to diagnose CIN, and any records without a definitive diagnosis were excluded after confirmation by a consulting clinician. Genetic polymorphism testing staff and clinical data collectors worked independently, and data analysts used all data jointly to build predictive models and perform performance validation. Least absolute shrinkage and selection operator (LASSO) regression was used to initially screen candidate variables, with the penalty coefficient lambda (λ) selected at 1 standard error (1se).
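The missing-data rule described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `impute_column` is hypothetical, missing values are represented as `None`, and a column with exactly 10% missing is kept because the text does not say how that boundary was handled.

```python
from statistics import median, mode


def impute_column(values, kind, max_missing=0.10):
    """Fill missing entries (None) or signal exclusion by returning None.

    kind is "continuous" (median fill) or "categorical" (mode fill).
    """
    n_missing = sum(v is None for v in values)
    if n_missing / len(values) > max_missing:
        return None  # too much missing data: exclude the variable
    observed = [v for v in values if v is not None]
    fill = median(observed) if kind == "continuous" else mode(observed)
    return [fill if v is None else v for v in values]


# Hypothetical albumin values with one of eleven missing (about 9%)
albumin = [40.0, 38.5, None, 42.0, 39.0, 41.0, 37.5, 40.5, 43.0, 36.0, 39.5]
print(impute_column(albumin, "continuous"))
```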

Identification of candidate predictors and construction of prediction models

The prediction model was constructed using multivariate logistic regression based on demographic variables and laboratory panel data [20]. STATA software v15.0 was used to model candidate variables, with the goodness of fit evaluated using Akaike’s Information Criterion (AIC) [13, 21]. The selection criteria were AIC minimization and candidate variable minimization without affecting predictive efficacy [21].
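For reference, AIC trades goodness of fit against model size. In its standard definition, with k the number of estimated parameters and \hat{L} the maximized likelihood (symbols introduced here for illustration, not taken from the original text):

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

Among candidate models fitted to the same data, the one with the smaller AIC is preferred, so an additional predictor lowers the AIC only if it raises the log-likelihood by more than 1.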

Adjustment for model confounders and evaluation of predictive efficacy using training and test set data

Through 10-fold cross-validation, the model with the highest accuracy was selected. Collinearity and interaction analyses were also performed on the candidate predictors. Sensitivity, specificity, positive predictive value, negative predictive value, receiver operating characteristic (ROC) curves, and the C-index were used to assess model discrimination, while calibration curve plots were used to assess calibration [20].
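The four threshold-based measures listed above all derive from a 2×2 confusion matrix. The sketch below shows the arithmetic. The counts are an assumption: they were back-calculated here from the group sizes (189 CIN and 507 controls in the training set, 71 CIN and 212 controls in the test set) and the sensitivity and specificity reported in this paper, and are not taken from the authors' tables.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Back-calculated counts (assumption, see lead-in)
training = classification_metrics(tp=151, fn=38, tn=479, fp=28)
test = classification_metrics(tp=32, fn=39, tn=201, fp=11)

for name, result in (("training", training), ("test", test)):
    print(name, {k: round(100 * v, 2) for k, v in result.items()})
```

With these counts the sensitivity and specificity reproduce the reported 79.89% and 94.48% (training) and 45.07% and 94.81% (test); the PPV and NPV printed by the sketch follow from the same assumed counts and should be checked against Table 5.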

Statistical analysis

The clinical and laboratory data were analyzed using SPSS software (version 23.0). Quantitative data with normal distribution were analyzed using t-tests or ANOVA, while non-normal quantitative data were analyzed using Mann-Whitney or Kruskal-Wallis nonparametric tests. Count data were analyzed using the chi-square test or logistic regression [16]. Potential predictors were screened using Lasso regression in R version 3.6.1. Multivariable analysis was performed using STATA version 14 with stepwise logistic regression, and the model was constructed based on the minimum AIC and the minimum number of predictors. The precision-recall (PR) curve was plotted using the “ggplot2” package in R version 3.6.1. A nomogram was used to visualize the prediction model, and decision curves were used to analyze its clinical application value. For the sample size calculation, the incidence of CIN in the Chinese population was taken as approximately 20% [22]. The two-sided significance level was set at 5%, with a test power of 80%. Taking into account a 10% loss to follow-up, the sample size for each group was estimated at approximately 100 cases [23].

Results

Basic information about the study population and clinical characteristics

In total, 979 patients were included in this study, with 696 patients (189 CIN vs. 507 controls) in the training set and 283 patients (71 CIN vs. 212 controls) in the test set. There was no significant difference in the frequency of CIN between the two sets. Table 1 presents the clinical characteristics of the study subjects, while Table 2 displays the distributions of allele and genotype frequencies of all SNPs.

Table 2 The distributions of allele and genotype frequencies of all SNPs

Model predictor screening

Lasso regression was used to screen variables in the training set. The optimal subset of variables with non-zero coefficients comprised 36 variables at the λ giving the minimum 10-fold cross-validation error (λ = 0.006521281) and 11 variables at the 1se value (λ = 0.02185674), as depicted in Figs. 2 and 3.
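The two λ values come from the cross-validation curve: the minimum-error λ, and the largest λ whose mean error is still within one standard error of that minimum. A minimal sketch of this selection follows; the function name and the toy error values are hypothetical and are not the study's cross-validation results.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from cross-validation summaries."""
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    # Largest penalty whose mean CV error stays within one SE of the minimum
    lambda_1se = max(l for l, m in zip(lambdas, cv_mean) if m <= threshold)
    return lambdas[i_min], lambda_1se


# Toy values for illustration only
lambdas = [0.1, 0.05, 0.02, 0.01, 0.005]
cv_mean = [0.60, 0.50, 0.45, 0.44, 0.47]
cv_se = [0.03, 0.02, 0.02, 0.02, 0.02]
print(select_lambdas(lambdas, cv_mean, cv_se))
```

Because the 1se λ is never smaller than the minimum-error λ, it applies a stronger penalty and keeps fewer variables.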

Fig. 2 Determination of the optimal penalty factor λ = 0.006521281 (minimum) and λ = 0.02185674 (1se) in the Lasso model using 10-fold cross-validation

Fig. 3 Distribution of Lasso coefficients for the 69 clinical characteristics. The left dashed vertical line shows the 36 non-zero coefficient variables selected when λ was chosen as the minimum; 11 non-zero coefficient variables were selected when λ was chosen as the 1se value

Identification of candidate predictors and prediction model building

The 36 candidate predictors were modeled in various ways, and the screening p values, AIC, and BIC are presented in Table 3. Model 1 had the smallest AIC of 246.41, but it contained an excessive number of predictive factors. Model 2 incorporated 11 variables with an AIC of 274.35, model 4 incorporated 10 variables with an AIC of 285.48, and model 8 incorporated 9 variables with an AIC of 344.94. A comparison of model 2, model 4, and model 8 using the “lrtest” command of STATA software revealed that although model 4 and model 8 incorporated fewer variables, their predictive efficacy was reduced (both p < 0.05). The inclusion of rs3212986 as a dummy variable in the predictive factors did not improve the predictive efficiency, as both the AIC and the number of predictive factors of the model increased. Therefore, model 2 was considered the best model, with the characteristics of the incorporated variables shown in Table 3.
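The likelihood-ratio comparison can be approximated from the reported AIC values alone, since AIC = 2k − 2 ln L implies −2 ln L = AIC − 2k. The sketch below makes several assumptions that are not confirmed by the text: that models 4 and 8 are nested within model 2, that all three were fitted to the same observations, and that the parameter counts differ only by the number of predictors (the intercept cancels out). The helper names are hypothetical, and the chi-square tail is implemented only for 1 and 2 degrees of freedom.

```python
import math


def lr_statistic_from_aic(aic_reduced, k_reduced, aic_full, k_full):
    """Likelihood-ratio statistic recovered from AIC = 2k - 2 lnL."""
    deviance_reduced = aic_reduced - 2 * k_reduced
    deviance_full = aic_full - 2 * k_full
    return deviance_reduced - deviance_full


def chi2_sf(x, df):
    """Upper tail of the chi-square distribution for df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("only df = 1 or 2 is supported in this sketch")


# AIC values and variable counts as reported for models 2, 4 and 8
lr_4 = lr_statistic_from_aic(285.48, 10, 274.35, 11)
lr_8 = lr_statistic_from_aic(344.94, 9, 274.35, 11)
print(lr_4, chi2_sf(lr_4, df=1))
print(lr_8, chi2_sf(lr_8, df=2))
```

Under these assumptions both comparisons give p values well below 0.05, in line with the reported result that dropping variables from model 2 reduced predictive efficacy.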

Table 3 Multiple models using multivariate logistic regression for comparison

Adjustment for model confounders and evaluation of predictive efficacy

In the adjustment for model confounders, interaction and collinearity were evaluated among the variables included in model 2 using the “corr test” command of STATA software. There was no interaction or collinearity between the predictors (data available on request). The logistic regression model was then applied to the test set using the regression coefficients from the training set model:

Odds(CIN)=1/(1+exp(-(6.62-2.191709*mg-0.1459131*alb-0.0252943*gfr-+0.0626123*tp-19.34694*cys+0.0132124*ldh+0.6151795*urea1+5.472858*p-0.686818*ca+0.3402039*dbil))).
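The expression above has the logistic form 1/(1 + exp(−z)), which returns a predicted probability between 0 and 1. A minimal sketch of evaluating it is given below. Two points are assumptions: the printed formula shows "-+" before the tp term, and the sketch takes that coefficient as positive; and the input units are those of Table 4, which is not reproduced here, so the example inputs are hypothetical.

```python
import math

INTERCEPT = 6.62
COEFFICIENTS = {
    "mg": -2.191709,
    "alb": -0.1459131,
    "gfr": -0.0252943,
    "tp": 0.0626123,  # assumption: the source prints "-+" before this term
    "cys": -19.34694,
    "ldh": 0.0132124,
    "urea1": 0.6151795,
    "p": 5.472858,
    "ca": -0.686818,
    "dbil": 0.3402039,
}


def predicted_probability(values):
    """Logistic transform of the linear predictor: 1 / (1 + exp(-z))."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * values[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical inputs, chosen only to exercise the function
example = {"mg": 0.9, "alb": 40, "gfr": 90, "tp": 70, "cys": 0.9,
           "ldh": 200, "urea1": 5, "p": 1.1, "ca": 2.3, "dbil": 4}
print(predicted_probability(example))
```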

Table 4 presents the variables and characteristics that were ultimately included in Model 2. The predictive performance of the model is displayed in Table 5 and Fig. 4, while the nomogram based on this prediction model is presented in Fig. 5. The agreement between the predicted and observed risk of CIN is compared in Fig. 6, and the clinical decision curve for the CIN prediction model is shown in Fig. 7. The model is deemed clinically valuable when the risk threshold ranges between 0.1 and 0.9.
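A decision curve plots net benefit against the risk threshold. In the usual formulation, net benefit at threshold pt is TP/n − (FP/n) × pt/(1 − pt), compared with treating everyone and treating no one (net benefit 0). The sketch below illustrates the calculation on toy data; the function names and values are hypothetical and unrelated to the study cohort.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data for illustration only
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1, 0.1, 0.05, 0.05, 0.02]
print(net_benefit(y_true, y_prob, 0.5), net_benefit_treat_all(y_true, 0.5))
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero.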

Table 4 Variables and characteristics eventually included in the model
Table 5 Performance of prediction model in training and test set
Fig. 4 (a) ROC curve of the prediction model built from the training set data. The area under the curve is 0.9217, indicating good discrimination. (b) ROC curve obtained by applying the CIN prediction model to the validation set. The area under the ROC curve is 0.8288, indicating good discrimination. ROC, receiver operating characteristic

Fig. 5 CIN prediction model presented as a nomogram

Fig. 6 (a) Agreement between the risk predicted by the CIN prediction model and the observed risk of CIN in the training set. The gray straight line at 45° through the origin represents the ideal line; the gray dashed line represents the actual observed value and the black straight line represents the predicted value according to the logistic model; S:p = 0.790. CIN, cisplatin-induced nephrotoxicity; Dxy, Somers’ rank correlation between p and y, Dxy = 2(C − 0.5); C, ROC area; ROC, receiver operating characteristic; R2, Nagelkerke-Cox-Snell-Magee R-squared index; D, discrimination index; U, unreliability index; Q, quality index; Brier, Brier score (average squared difference between p and y); Emax, maximum absolute difference between predicted and loess-calibrated probabilities; E90, the 0.9 quantile of the absolute difference between predicted and loess-calibrated probabilities; Eavg, the average absolute difference between predicted and loess-calibrated probabilities; S:z, the Spiegelhalter Z-test for calibration accuracy; S:p, the two-tailed p value of the Spiegelhalter Z-test

Fig. 7 Clinical decision curves for the established CIN prediction model. The thin blue line is the net benefit of therapeutic intervention for all patients; the thin green line is the net benefit of therapeutic intervention for patients selected on the basis of the statistical model; the thick black line is the net benefit of therapeutic intervention for no patients. The threshold probability on the X-axis and the net benefit on the Y-axis are displayed as ratios. Pr, threshold probability

Given the class imbalance, we used the precision-recall (PR) curve to assess the model’s predictive performance, as shown in Fig. 8. In the recall interval from 0.5 to 0.75, precision declines gradually with increasing recall while remaining relatively high (up to 0.9). Within this range, the model maintains high precision in identifying positive samples and makes few false positive errors. In the recall interval from 0.75 to 0.90, precision drops more rapidly, from 0.9 to 0.60. To improve recall further and identify more positive samples, the model sacrifices more precision, resulting in more false positives. In the recall interval from 0.90 to 1.0, as recall approaches completeness, precision decreases sharply to about 0.10. In the pursuit of complete recall, the model’s precision diminishes significantly, introducing a large number of false positive predictions.
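The trade-off described above arises because each point on a PR curve corresponds to one cut-off on the predicted risk. A minimal sketch of how the points are computed follows; the function name and the toy labels and scores are hypothetical and are not the study data.

```python
def precision_recall_points(y_true, y_score):
    """Return (threshold, precision, recall) for each distinct score,
    from the strictest cut-off to the most permissive."""
    total_positive = sum(y_true)
    points = []
    for t in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 0)
        points.append((t, tp / (tp + fp), tp / total_positive))
    return points


# Toy data for illustration only
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
for threshold, precision, recall in precision_recall_points(labels, scores):
    print(threshold, round(precision, 2), round(recall, 2))
```

At the most permissive cut-off every sample is called positive, so recall reaches 1 and precision falls to the prevalence of positives in the data.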

Fig. 8 Precision-recall (PR) curve of the prediction model. The vertical axis represents precision, the horizontal axis represents recall, and the curve shows the corresponding precision and recall values at different cut-off points

Independent validation

The proposed model’s performance was evaluated using the test set data, and its fit was consistent with that in the training set data, as determined by the Hosmer-Lemeshow test (p = 0.4636). The overall predictive performance of the model is illustrated in Table 5, Fig. 4, and Fig. 6.
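The Hosmer-Lemeshow test compares observed and expected event counts within groups of patients ordered by predicted risk; a large p value indicates no evidence of miscalibration. The sketch below is illustrative only. The source does not state how many risk groups were used, so the default of 10 equal-sized groups is an assumption, as are the function name and the toy data. The chi-square tail is computed in closed form, which restricts the sketch to an even number of groups.

```python
import math


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Return (statistic, p_value) with groups - 2 degrees of freedom."""
    df = groups - 2
    if df < 2 or df % 2:
        raise ValueError("this sketch needs an even number of groups >= 4")
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        mean_p = expected / size
        statistic += (observed - expected) ** 2 / (size * mean_p * (1 - mean_p))
    # Closed-form chi-square upper tail, valid for even degrees of freedom
    half = statistic / 2
    tail = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return statistic, math.exp(-half) * tail


# Toy, perfectly calibrated data: observed events equal expected events
probs = [0.2] * 5 + [0.4] * 5 + [0.6] * 5 + [0.8] * 5
events = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0]
print(hosmer_lemeshow(events, probs, groups=4))
```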

Discussion

This study utilized machine learning algorithms to construct a CIN prediction model based on clinical, laboratory, and genetic variables. The construction process strictly followed the reporting statement for clinical prediction models: developing the prediction model, validating the prediction model, and evaluating its predictive effectiveness [24]. The model demonstrated good sensitivity and specificity, indicating that combining laboratory and clinical variables can effectively identify populations at high risk of CIN. While the model cannot be used as an independent diagnostic method, it can serve as a supplementary tool because its predictive factors are common, objective, and easily obtainable.

The candidate predictor set included 69 feature variables, 8 of which were genetic. If the genetic variables were coded as dummy variables, the total number of variables would increase to nearly 80. We employed LASSO regression with a 1se penalty coefficient to consolidate the laboratory variables. This method effectively reduced the number of predictors and eliminated unimportant variables. LASSO is a shrinkage estimation method based on model reduction. By imposing a penalty function, the regression coefficients of variables are shrunk, and the regression coefficients of unimportant variables eventually shrink to zero. Compared with classical screening methods, Lasso can effectively avoid the influence of factors such as different orders of magnitude, different units, and possible collinearity between variables [25]. To screen candidate variables, we opted for Lasso regression over classic single-factor regression, using a 1 standard error penalty coefficient lambda (λ) as the screening parameter to prevent the exclusion of relatively unimportant variables [7, 26, 27]. The LASSO algorithm was executed using the “glmnet” R package, while the logistic regression model was constructed using the “glm” function in R [20]. Subsequently, we employed multifactor logistic stepwise regression to identify a concise and effective set of variables, which were then fitted into the formula based on their respective weights. This standardized approach to variable selection and weight conversion helps mitigate differences in the same indicator arising from different laboratory methods [13, 28].
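The way the penalty drives small coefficients to exactly zero can be seen in the simplest setting. For orthonormal predictors, and with λ scaled appropriately, the lasso estimate is the ordinary least-squares coefficient passed through a soft-thresholding operator. The sketch below shows only that operator; it is not the glmnet algorithm used in the study, and the coefficient names and values are hypothetical.

```python
def soft_threshold(beta, lam):
    """Shrink beta toward zero by lam; values within [-lam, lam] become 0."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0


# Hypothetical unpenalized coefficients, for illustration only
unpenalized = {"strong_positive": 1.8, "strong_negative": -0.9, "weak": 0.05}
penalized = {name: soft_threshold(b, 0.1) for name, b in unpenalized.items()}
print(penalized)
```

Large coefficients are shrunk by λ but retained, while the weak one is removed entirely, which is the variable-selection behavior described above.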

In the training set, the genetic variable rs3212986 of ERCC1 exhibited statistically significant differences in allele frequency and genotype distribution between the CIN group and the control group. The frequency of the A allele was higher in the CIN group (31.21%) than in the control group (24.92%). The proportions of AA, CA, and CC genotypes were 11.64%, 39.15%, and 49.20% in the CIN group, and 12.03%, 25.64%, and 62.32% in the control group, respectively. These findings suggest that carriers of the A allele of rs3212986 are more likely to develop CIN, which is consistent with previous studies [29]. Similarly, the allele frequency and genotype distribution of rs920829 of TRPA1 were also statistically different between the CIN group and the control group. The frequency of the T allele was lower in the CIN group (22.75%) than in the control group (28.69%). The proportions of TT, CT, and CC genotypes were 8.46%, 28.57%, and 62.96% in the CIN group, and 16.96%, 23.47%, and 59.57% in the control group, respectively. These results suggest that T allele carriers of rs920829 are less likely to develop CIN. However, during the optimization of variables through multivariable logistic regression, neither rs3212986 nor rs920829 was incorporated. It is possible that these variables lack independent predictive power or that their independent predictive value is not significant enough [30].
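The allele percentages quoted above follow from the genotype percentages: each homozygote contributes two copies of the allele and each heterozygote one, so the allele frequency is the homozygote share plus half the heterozygote share. A short check, using a hypothetical helper name and the genotype percentages from the text:

```python
def allele_frequency(homozygote_pct, heterozygote_pct):
    """Allele frequency (%) from genotype percentages."""
    return homozygote_pct + heterozygote_pct / 2


# Genotype percentages as reported in the text
rs3212986_a = {
    "CIN": allele_frequency(11.64, 39.15),
    "control": allele_frequency(12.03, 25.64),
}
rs920829_t = {
    "CIN": allele_frequency(8.46, 28.57),
    "control": allele_frequency(16.96, 23.47),
}
print(rs3212986_a, rs920829_t)
```

The computed values agree with the quoted allele percentages to within about 0.1 percentage points; the small gap for the rs3212986 control group may reflect rounding or a few uncalled genotypes, which the text does not specify.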

Cystatin-C (Cys-C) was identified as the independent risk factor with the highest odds ratio (OR) in the prediction model, surpassing other factors in predictive performance. The association between elevated Cys-C and a high risk of CIN may be explained as follows. First, Cys-C is produced by all nucleated cells in the body. Cys-C in the blood is filtered by the glomerulus and degraded after reabsorption in the renal tubules, and it is not secreted by the renal tubules. This process makes it a more effective indicator of early glomerular filtration function than creatinine, urea nitrogen, and other indicators [31, 32]. Second, Cys-C is a member of the cysteine protease inhibitor family, and an imbalance between cathepsins and protease inhibitors may lead to tumor invasion and metastasis, which can also promote an elevation of Cys-C [33, 34]. Other factors in the model, such as dbil and LDH, are not traditional renal function indicators and are not related to the cisplatin metabolism pathway, but may reflect changes in physiological or pathological pathways (such as secretion and excretion, inflammatory response, oxidative stress damage, and electrolyte imbalance) during the occurrence and development of CIN [27]. Therefore, using appropriately weighted models for joint evaluation can aid in the earlier identification of CIN risk.

The model showed high sensitivity and negative predictive value (NPV), which can help to recognize patients at high risk of CIN and draw clinical attention to the selection of the chemotherapy regimen and drug dosage. The results also showed satisfactory discrimination and a prediction curve close to the actual curve, which indicates that the model can provide predictions that are highly consistent with observed outcomes. The model had a C-index of 0.922 in the training set, and the calibration assessment gave Emax = 0.044, Eavg = 0.007, and S:p = 0.790, suggesting that both the discrimination and the calibration of the model were good. To guard against overfitting due to random and systematic errors, the model was validated in another prospective independent data set. The fit of the model in the test set was consistent with its fit in the training set. Further clinical decision curve analysis revealed that the model was of good value for clinical use when the high-risk threshold was between 0.1 and 0.9. Meanwhile, the precision-recall curve showed that, in the recall interval from 0.5 to 0.75, precision declined gradually with increasing recall while remaining relatively high (up to 0.9).
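The C-index for a binary outcome is the proportion of case-control pairs in which the case receives the higher predicted risk (ties count one half), and Somers' Dxy is a rescaling of it, Dxy = 2(C − 0.5). A minimal sketch with hypothetical function names and toy data:

```python
def c_index(y_true, y_prob):
    """Proportion of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    concordant = sum((p > q) + 0.5 * (p == q) for p in cases for q in controls)
    return concordant / (len(cases) * len(controls))


def somers_dxy(c):
    """Somers' rank correlation from the C-index."""
    return 2 * (c - 0.5)


# Toy data for illustration only
toy_c = c_index([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
print(toy_c, somers_dxy(0.9217))
```

Applied to the training-set AUC of 0.9217, the rescaling gives a Dxy of about 0.843.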

The prediction model developed in this study has certain limitations. First, it is a single-center study, and although the test set was prospectively enrolled, the training set data were obtained retrospectively from the electronic medical record system. Consequently, unavoidable factors such as missing data resulted in a final training set of 696 patients, which may limit the model’s generalizability and necessitates further multicenter research and external validation. Second, the study did not incorporate the latest CIN-related biomarkers, such as malondialdehyde (MDA), NADPH oxidases (NOX), or heme oxygenase 1 (HO-1), which could potentially impact the results [2]. Future research should focus on gradually conducting validation studies across multiple centers to continuously refine and enhance the model and provide guidance for clinical practice.

Conclusion

Predictive models based on laboratory and demographic variables can serve as a beneficial complementary tool for identifying high-risk populations with CIN.

Data availability

All data generated or analysed during this study are included in this published article and its supplementary information files.

References

  1. Trendowski M R, El Charif O, Dinh P C JR, et al. Genetic and modifiable risk factors contributing to cisplatin-induced toxicities [J]. Clin Cancer Res. 2019;25(4):1147–55.

    Article  PubMed  Google Scholar 

  2. Holditch SJ, Brown C N, Lombardi AM et al. Recent advances in models, mechanisms, biomarkers, and interventions in Cisplatin-Induced Acute kidney Injury [J]. Int J Mol Sci, 2019;20(12).

  3. Miyoshi T, Uoi M, Omura F, et al. Risk factors for Cisplatin-Induced Nephrotoxicity: a Multicenter Retrospective Study [J]. Oncology. 2021;99(2):105–13.

    Article  CAS  PubMed  Google Scholar 

  4. Kidera Y, Kawakami H, Sakiyama T, et al. Risk factors for cisplatin-induced nephrotoxicity and potential of magnesium supplementation for renal protection [J]. PLoS ONE. 2014;9(7):e101902.

    Article  PubMed  PubMed Central  Google Scholar 

  5. Okawa T, Mizuno T, Hanabusa S, et al. Prediction model of acute kidney injury induced by cisplatin in older adults using a machine learning algorithm [J]. PLoS ONE. 2022;17(1):e0262021.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  6. Jelinek M J, Lee S M, Wyche Okpareke A, et al. Predicting Acute Renal Injury in Cancer patients receiving cisplatin using urinary Neutrophil Gelatinase-Associated Lipocalin and Cystatin C [J]. Clin Transl Sci. 2018;11(4):420–7.

    Article  PubMed  PubMed Central  Google Scholar 

  7. Zazuli Z, De Jong C, Xu W et al. Association between Genetic variants and Cisplatin-Induced nephrotoxicity: a genome-wide Approach and Validation Study [J]. J Pers Med, 2021;11(11).

  8. Zhang J, Zhou W. Ameliorative effects of SLC22A2 gene polymorphism 808 G/T and cimetidine on cisplatin-induced nephrotoxicity in Chinese cancer patients [J]. Food Chem Toxicol. 2012;50(7):2289–93.

    Article  CAS  PubMed  Google Scholar 

  9. Liu H E, Bai K J, Hsieh Y C et al. Multiple analytical approaches demonstrate a complex relationship of genetic and nongenetic factors with cisplatin- and carboplatin-induced nephrotoxicity in lung cancer patients [J]. Biomed Res Int, 2014;2014(937429.

  10. Chang C, Hu Y. Hogan S L, Pharmacogenomic variants may influence the urinary excretion of novel kidney Injury biomarkers in patients receiving cisplatin [J]. Int J Mol Sci, 2017;18(7).

  11. Wang S Y, Gao J, Song Y H et al. Identification of potential gene and MicroRNA biomarkers of Acute kidney Injury [J]. Biomed Res Int, 2021;2021(8834578.

  12. Zazuli Z, Vijverberg S et al. SLOB E,. Genetic Variations and Cisplatin Nephrotoxicity: A Systematic Review [J]. Front Pharmacol, 2018;9(1111.

  13. Huang Y, Liang C. He L, et al. Development and validation of a Radiomics Nomogram for Preoperative Prediction of Lymph Node Metastasis in Colorectal Cancer [J]. Journal Of Clinical Oncology; 2016;34(10):109.

  14. Yang D, Liu Y, Bai C et al. Epidemiology of lung cancer and lung cancer screening programs in China and the United States [J]. Cancer Lett, 2020:468(82 – 7.

  15. Oliver Al. Lung Cancer: epidemiology and screening [J]. Surg Clin North Am. 2022;102(3):335–44.

    Article  PubMed  Google Scholar 

  16. Zhang J, Zhou W. Combined electronic medical records and gene polymorphism characteristics to establish an anti-tuberculosis drug-induced hepatic injury (ATDH) prediction model and evaluate the prediction value [J]. Annals Translational Med. 2022;10(20):1114.

    Article  CAS  Google Scholar 

  17. WHO. Common terminology criteria for adverse events (CTCAE) Version 5.0 [J]. Uppsala Monit Centre, 2017, https://www.who-umc.org/media/2768/standardised-case-causality-assessment.pdf): 1–155.

  18. Miller A B, Hoogstraten B. Reporting results of cancer treatment [J]. Cancer. 1981;47(1):207–14.

    Article  PubMed  Google Scholar 

  19. Zhang J, Jiao L, Song J, et al. Genetic and functional evaluation of the role of FOXO1 in Antituberculosis Drug-Induced hepatotoxicity [J]. Evidence-Based Complementary and Alternative Medicine; 2021;2021(1–13.

  20. Meng Z, Wang M, Guo S et al. Development and Validation of a LASSO Prediction Model for Better Identification of Ischemic Stroke: A Case-Control Study in China [J]. Frontiers in aging neuroscience, 2021;13(630437.

  21. Jaddoe V W, De Jonge L L, Hofman A, et al. First trimester fetal growth restriction and cardiovascular risk factors in school age children: population based cohort study [J]. BMJ. 2014;34(8):14–25.

    Article  Google Scholar 

  22. Wang Z, Xu B, Lin D, et al. XRCC1 polymorphisms and severe toxicity in lung cancer patients treated with cisplatin-based chemotherapy in Chinese population [J]. Lung Cancer. 2008;62(1):99–104.

  23. Lu T, He L, Zhang B, et al. Percutaneous mastoid electrical stimulator improves Poststroke depression and cognitive function in patients with ischaemic stroke: a prospective, randomized, double-blind, and sham-controlled study [J]. BMC Neurol. 2020;20(1):217.

  24. Moons K G, Altman D G, Reitsma JB, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): explanation and elaboration [J]. Ann Intern Med. 2015;162(1):W1–73.

  25. Tibshirani R. Regression shrinkage and selection via the Lasso [J]. J Royal Stat Society Series B (Methodological). 1996;58(1):267–88.

  26. Huang S H, Chu C Y, Hsu Y C, et al. How platinum-induced nephrotoxicity occurs? Machine learning prediction in non-small cell lung cancer patients [J]. Comput Methods Programs Biomed. 2022;221:106839.

  27. Mcsweeney K R, Gadanec L K, Qaradakhi T, et al. Mechanisms of Cisplatin-Induced Acute Kidney Injury: pathological mechanisms, pharmacological interventions, and genetic mitigations [J]. Cancers. 2021;13(7).

  28. Tang X-R, Li Y-Q, Liang S-B, et al. Development and validation of a gene expression-based signature to predict distant metastasis in locoregionally advanced nasopharyngeal carcinoma: a retrospective, multicentre, cohort study [J]. Lancet Oncol. 2018;19(3):382–93.

  29. Li F, Sun X. Association between polymorphisms of ERCC1 and XPD and clinical response to platinum-based chemotherapy in advanced non-small cell lung cancer [J]. Am J Clin Oncol. 2010;33(5):489–94.

  30. Kimcurran V, Zhou C, Schmid-Bindert G, et al. Lack of correlation between ERCC1 (C8092A) single nucleotide polymorphism and efficacy/toxicity of platinum based chemotherapy in Chinese patients with advanced non-small cell lung cancer [J]. Adv Med Sci. 2011;56(1):30–8.

  31. Papassotiriou G P, Kastritis E, Gkotzamanidou M, et al. Neutrophil Gelatinase–Associated Lipocalin and Cystatin C are sensitive markers of Renal Injury in patients with multiple myeloma [J]. Clinical Lymphoma, Myeloma & Leukemia. 2016;16(1):29–35.

  32. Taha MM, Mahdy-Abdallah H, Shahy E M, et al. Diagnostic efficacy of cystatin-c in association with different ACE genes predicting renal insufficiency in T2DM [J]. Sci Rep. 2023;13(1):5288.

  33. Tan P, Shi M, Chen J, et al. The preoperative serum cystatin-C as an independent prognostic factor for survival in upper tract urothelial carcinoma [J]. Asian J Androl. 2019;21(2):163–9.

  34. Kwon W S, Kim T S, Nahm C H, et al. Aberrant cystatin-C expression in blood from patients with breast cancer is a suitable marker for monitoring tumor burden [J]. Oncol Lett. 2018;16(5):5583–90.

Acknowledgements

We would like to thank Dr. Deng and Dr. Li for their valuable contributions to this research.

Funding

This work was supported by the Sichuan Natural Science Foundation Project (2023NSFSC0550), the Sichuan Medical Research Project (S21058), the Open Project of the Sichuan Provincial Key Laboratory for Clinical Immunology Translational Medicine (LCMYZHYX-KFKT202304), and the Chengdu Science and Technology Project (2019-YF09-00220-SN).

Author information

Contributions

Jingwei Zhang and Xuyang Luo made substantial contributions to the conception of the work; Yi Fan and Wei Zhou made substantial contributions to the design of the work; Shijie Ma and Yuwei Kang made substantial contributions to the acquisition and analysis of data; Wei Yang and Xiaoxia Geng made substantial contributions to the interpretation of data; Heping Zhang and Fei Deng drafted the work or substantively revised it. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Heping Zhang or Fei Deng.

Ethics declarations

Ethics approval and consent to participate

Ethical approval for this study was obtained from the Institutional Review Board of Sichuan Provincial People’s Hospital Jinniu Hospital. The certificate number is 2023NSFSC0550.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Zhang, J., Luo, X., Fan, Y. et al. Development and validation of a LASSO prediction model for cisplatin induced nephrotoxicity: a case-control study in China. BMC Nephrol 25, 194 (2024). https://doi.org/10.1186/s12882-024-03623-w

  • DOI: https://doi.org/10.1186/s12882-024-03623-w

Keywords