  • Research article
  • Open access

Changing relative risk of clinical factors for hospital-acquired acute kidney injury across age groups: a retrospective cohort study



The likelihood of developing acute kidney injury (AKI) increases with age. We aimed to explore whether the predictability of AKI varies between age groups and to assess the volatility of risk factors using electronic medical records (EMR).


We constructed a retrospective cohort of adult patients from all inpatient units of a tertiary care academic hospital and stratified it into four age groups: 18–35, 36–55, 56–65, and > 65. Potential risk factors collected from the EMR for the study cohort included demographics, vital signs, medications, laboratory values, past medical diagnoses, and admission diagnoses. AKI was defined based on the Kidney Disease Improving Global Outcomes (KDIGO) serum creatinine criteria. We analyzed the relative importance of the risk factors in predicting AKI using the Gradient Boosting Machine algorithm and explored the predictability of AKI across age groups using multiple machine learning models.


In our cohort, older patients showed a significantly higher incidence of AKI than younger adults: 18–35 (7.29%), 36–55 (8.82%), 56–65 (10.53%), and > 65 (10.55%) (p < 0.001). However, the predictability of AKI decreased with age: the best cross-validated area under the receiver operating characteristic curve (AUROC) achieved for age groups 18–35, 36–55, 56–65, and > 65 was 0.784 (95% CI, 0.769–0.800), 0.766 (95% CI, 0.754–0.777), 0.754 (95% CI, 0.741–0.768), and 0.725 (95% CI, 0.709–0.737), respectively. We also observed that the relative risk of AKI predictors fluctuated between age groups.


As case complexity increases with age, it is more difficult to quantify AKI risk for older adults in the inpatient population.



Acute kidney injury (AKI) is a common and highly lethal clinical problem, affecting 11–12% of all hospitalized patients worldwide with a mortality rate of ~ 10% [1]. AKI is associated with significant short- and long-term morbidity and mortality [2], and prevention is the best means of dealing with AKI. Delays in identification and intervention for AKI may lead to rapid progression of the kidney injury and increase the likelihood of developing chronic kidney disease (CKD), the need for renal replacement therapy, and the risk of death [3]. Hence, AKI risk assessment and management based on susceptibilities and exposures are recommended by the Kidney Disease Improving Global Outcomes (KDIGO) guidelines because they may trigger early effective interventions such as drug dose adjustment, avoidance of nephrotoxins, and intravenous fluid management [4]. Early subspecialist (nephrologist, intensivist) or pharmacist involvement in the care of AKI patients can reduce the risk of further kidney function decline [5].

AKI is associated with various risk factors including inherent factors, exposure to nephrotoxins (e.g. non-steroidal anti-inflammatory drugs [6]), acute illnesses (e.g. sepsis [7]) and major surgeries (e.g. cardiopulmonary bypass or coronary angiography [8,9,10]). Inherent risk factors include susceptibilities of each individual patient (e.g. age [11]) and those associated with reduced kidney reserve or failure of other organs with known cross-talk with the kidneys (e.g. heart, liver, and respiratory system) [12]. There is strong evidence supporting the role of advanced age in AKI. Elderly patients are at much higher risk for developing AKI due to their decreasing renal reserve and structural changes in the aged kidney that impair its ability to withstand and recover from injury [13].

The primary focus of existing AKI studies has been prediction tools for the early identification of at-risk patients. Studies [11, 14, 15] mainly used a small set of highly correlated risk factors based on existing evidence to build prediction models, which may miss potential unknown risk factors. In addition, there has been significant progress in applying machine learning to predict AKI risk using electronic health records (EHR) [16]. Sutherland et al. [17] found that most models had modest predictive success, with AUCs of approximately 0.75. Li et al. applied a convolutional neural network to ICU patients, achieving an AUC of 0.78. In particular, Tomasev et al. [18] used EHR data from the US VA health system to build a deep prediction model that achieved an AUC of 0.92 for the 48-h prediction time window.

Existing studies suggest that age can modify the intensity of relationships between other factors and AKI. For instance, Kane-Gill et al. [11] analyzed risk factors of AKI for older patients in intensive care units (ICU) and found that the impact of age was so substantial that other risk factors (e.g. sepsis, hypertension, nephrotoxins) lost their ability to predict AKI risk among patients older than 75 years. However, most previous studies [11, 14, 15] examined age as the main effect and considered its interactions with other risk factors one at a time. Despite the higher AKI incidence in older adults, how the predictability of AKI risk changes with age remains an unanswered question in the current literature. In this study, we investigated the predictability trend of hospital-acquired AKI across age groups using machine learning algorithms and assessed whether the relative importance of risk factors in predicting AKI changes across age groups.


Study population

All adult patients (age at visit ≥ 18) admitted to the University of Kansas Health System (a tertiary academic hospital) for 2 days or more from November 2007 to December 2016 were included in this retrospective observational cohort study, which covered all ICU, surgical, and general wards. From a total of 179,370 encounters, we excluded those that lacked the data elements required to determine the outcome, that is, encounters with fewer than two serum creatinine measurements. We also excluded patients with evidence of moderate or severe kidney dysfunction at admission (estimated Glomerular Filtration Rate (eGFR) less than 60 mL/min/1.73 m² or serum creatinine (SCr) level of > 1.3 mg/dL). eGFR was calculated with the Modification of Diet in Renal Disease (MDRD) equation. The final analysis cohort contained 76,957 encounters.
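The eligibility screen described above could be sketched as follows. This is an illustration only, not the study's code: the encounter field names are hypothetical, the first SCr value is taken as the admission value, and the IDMS-traceable form of the four-variable MDRD equation (coefficient 175) is assumed because the text does not state which version was used.

```python
def egfr_mdrd(scr_mg_dl, age_years, female, black):
    """Estimated GFR (mL/min/1.73 m^2) from the four-variable MDRD equation.

    Assumption: the IDMS-traceable coefficient 175 is used; the original
    MDRD publication used 186 for non-standardized creatinine assays.
    """
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def eligible(encounter):
    """Apply the inclusion and exclusion rules to one encounter (a dict).

    The keys used here are hypothetical names chosen for this sketch.
    """
    if encounter["age"] < 18 or encounter["length_of_stay_days"] < 2:
        return False
    if len(encounter["scr_values"]) < 2:
        # The outcome cannot be determined from fewer than two SCr values.
        return False
    admission_scr = encounter["scr_values"][0]
    egfr = egfr_mdrd(admission_scr, encounter["age"],
                     encounter["female"], encounter["black"])
    # Exclude reduced kidney function at admission.
    return egfr >= 60.0 and admission_scr <= 1.3
```

Under these assumptions, a 50-year-old non-Black man with an admission SCr of 1.0 mg/dL has an eGFR of about 79 mL/min/1.73 m² and is retained, whereas a woman with the same values falls just below 60 and is excluded.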

AKI definition

We staged AKI for severity according to the SCr-based criteria described in the KDIGO clinical practice guidelines [19] (see Supplementary Table S1). Baseline SCr level was defined as the most recent SCr value within the two-day window prior to admission, if available; otherwise, the first SCr value after admission was used as the baseline. All pairs of SCr levels measured between admission and discharge were then evaluated on a rolling basis to determine the occurrence of AKI.
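One possible reading of this rolling pairwise evaluation is sketched below. The thresholds are those of the published KDIGO SCr criteria (a rise of at least 0.3 mg/dL within 48 h, or a rise to at least 1.5 times the reference value within 7 days); treating the earlier value of each pair as the reference, and rounding differences to two decimals, are assumptions of this sketch and not details confirmed by the text.

```python
def kdigo_stage(measurements):
    """Return the highest KDIGO SCr stage (0 = no AKI) for one encounter.

    measurements: list of (hours_since_admission, scr_mg_dl) in any order.
    Assumption: every earlier value serves as the reference for each
    later value, which is one way to read "on a rolling basis".
    """
    points = sorted(measurements)
    worst = 0
    for i, (t0, ref) in enumerate(points):
        for t1, cur in points[i + 1:]:
            elapsed = t1 - t0
            stage = 0
            # Absolute rise of >= 0.3 mg/dL within 48 hours (stage 1).
            # Rounding avoids floating-point artefacts such as 1.2 - 0.9.
            if elapsed <= 48 and round(cur - ref, 2) >= 0.3:
                stage = 1
            # Relative rise within 7 days determines stages 1 to 3.
            if elapsed <= 7 * 24:
                ratio = cur / ref
                if ratio >= 3.0:
                    stage = 3
                elif ratio >= 2.0:
                    stage = 2
                elif ratio >= 1.5:
                    stage = max(stage, 1)
            # An SCr of >= 4.0 mg/dL counts as stage 3 once AKI is present.
            if stage >= 1 and cur >= 4.0:
                stage = 3
            worst = max(worst, stage)
    return worst
```

Initiation of renal replacement therapy, which also defines stage 3 in the guideline, is not represented here because it is not an SCr measurement.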

Clinical variables

For each hospital encounter in the final analysis cohort, we extracted time-stamped clinical data on demographics, vital signs, medications, laboratory values, past medical diagnoses, and admission diagnoses. This study explored the entirety of the above-mentioned EHR data types except for laboratory tests, for which we considered a selected list of labs that may indicate the presence of a comorbidity correlated with AKI [14]. Details of the 1888 clinical variables considered are available in Table 1. It is important to note that SCr and eGFR were not included as predictive variables because they were used to determine the outcome variable, and we aimed to focus on the contribution of other factors. Laboratory values were categorized as unknown, less than the reference normal range, within the normal range, or greater than the reference normal range. Patient vital signs were discretized into groups as shown in Supplementary Table S2.
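The four-level coding of laboratory values can be illustrated with a short sketch. The function name, the category labels, and the potassium reference range in the usage note are illustrative assumptions; the study's actual reference ranges are not given in the text.

```python
def categorize_lab(value, low, high):
    """Map a lab result to one of four categories.

    value: numeric result, or None when the test was not performed.
    low, high: bounds of the reference normal range (inclusive).
    """
    if value is None:
        return "unknown"
    if value < low:
        return "below_normal"
    if value > high:
        return "above_normal"
    return "normal"
```

For example, with an assumed potassium reference range of 3.5–5.0 mmol/L, a result of 5.6 would be coded as above normal, and a patient with no potassium test would be coded as unknown.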

Table 1 Clinical variables extracted for the study cohort

Drug exposure included inpatient drugs (i.e. dispensed during hospitalization) and outpatient drugs (i.e. medication reconciliation and prior outpatient prescriptions). All medication names were standardized by mapping to RxNorm components. Admission diagnoses, that is, the All Patient Refined Diagnosis Related Groups (APR-DRG), were collected from the University Health System Consortium (UHC) data source in HERON. Patient past medical history was captured as primary diagnoses (ICD-9 codes) grouped according to the Clinical Classifications Software (CCS) diagnosis categories of the Agency for Healthcare Research and Quality.

Data processing and statistical analysis

Only the most recently recorded vitals and lab tests before the AKI prediction point (i.e. 24 h prior to the AKI event, or the last normal SCr for non-AKI cases) were used for each encounter. For vital signs, if no values were available, the median value of that variable across the entire cohort was imputed [20] (information on missing percentages is available in Supplementary Table S3). Missing values among lab tests were captured as a separate category because the choice not to perform a particular test may itself be informative [14]. Medication exposure was defined as true if the drug was taken within 7 days before the AKI prediction point. Medical history was defined as true if the diagnosis occurred before the AKI prediction point. Hence, medical history, medication, and admission diagnosis were all binary variables (i.e. presence or absence). Finally, we stratified the cohort into four age groups: 18–35, 36–55, 56–65, and > 65 years.
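The three feature-construction rules above (most recent value before the prediction point, cohort-median imputation for vitals, and a 7-day medication window) might be sketched as below. Timestamps are expressed in hours, all function names are hypothetical, and "before" is read as strictly before the prediction point, which is an assumption.

```python
from statistics import median


def latest_before(observations, cutoff):
    """Most recent value recorded strictly before `cutoff`, else None.

    observations: list of (timestamp_hours, value) pairs.
    """
    prior = [(t, v) for t, v in observations if t < cutoff]
    if not prior:
        return None
    return max(prior, key=lambda tv: tv[0])[1]


def impute_median(values):
    """Replace missing entries (None) with the median of observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def exposed_within(drug_times, cutoff, window_hours=7 * 24):
    """True if any administration falls in the window before the cutoff."""
    return any(cutoff - window_hours <= t < cutoff for t in drug_times)
```

For an AKI case, `cutoff` would be the onset time minus 24 h. Lab tests are deliberately not imputed in this sketch, since missing labs are kept as their own category.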

To analyze the volatility of relative risk and prediction performance associated with AKI across age groups, we implemented the following steps: (a) Feature selection or ranking: we applied a multivariate embedded Gradient Boosting Machine (GBM [21]) method to rank individual variables according to their importance in AKI prediction. This step ranked the 1888 candidate variables to obtain the top-k most important predictors of AKI; (b) Predictive modeling: we explored four machine learning methods, i.e. logistic regression, support vector machine (SVM) [2], LogitBoost [22, 23], and random forest [24], to assess the prediction performance across age strata. The area under the receiver operating characteristic curve (AUROC) [25] was calculated as the evaluation metric for prediction performance through a 10-fold cross-validation scheme. To obtain a stable feature ranking across the 10 folds, we averaged the relative importance weights of the variables obtained from each fold. Additionally, to address the imbalance between the positive and negative classes (AKI to non-AKI ratio), we implemented an under-sampling strategy that ensured the same number of samples per class when training the model for each fold while keeping the original class ratio in the test dataset. Under-sampling of the training dataset is necessary because skewed samples can mislead machine learning algorithms into favoring the majority class, in this case non-AKI samples. For comparison, an evaluation strategy without under-sampling was also established for the prediction models. Two-tailed P values < 0.05 were used to denote statistical significance for all comparisons. Data extraction and processing were executed using Python 3.7 with the scikit-learn package, and other analyses and graphs were produced using MATLAB, version R2017b.
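The under-sampling scheme described above, in which only the training portion of each fold is balanced while the test fold keeps the original class ratio, can be sketched with the standard library alone. The stratified assignment of samples to folds and the function names are assumptions of this sketch; the text does not say how folds were formed.

```python
import random


def stratified_folds(labels, k, rng):
    """Assign sample indices to k folds, preserving the class ratio."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def undersampled_splits(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    The majority class is randomly under-sampled in the training part
    only; the test fold is left untouched.
    """
    rng = random.Random(seed)
    folds = stratified_folds(labels, k, rng)
    for f in range(k):
        test_idx = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        pos = [i for i in train if labels[i] == 1]
        neg = [i for i in train if labels[i] == 0]
        n = min(len(pos), len(neg))
        train_idx = rng.sample(pos, n) + rng.sample(neg, n)
        yield train_idx, test_idx
```

Balancing after the split, and not before it, keeps discarded majority-class samples from influencing which cases end up in the test fold, so the reported AUROC reflects the original AKI to non-AKI ratio.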


Of the 76,957 encounters meeting the inclusion and exclusion criteria, AKI of any stage occurred in 7259 (9.43%), and 38,887 (50.53%) involved patients aged 56 years or older. Table 2 summarizes the characteristics of patients by age group, showing that the incidence of AKI rises with age from 7.29% in the youngest group to 10.55% in the oldest group, and that the incidence of AKI in male patients is slightly higher than in female patients. Additionally, Table 2 shows that most AKI episodes (measured as AKI onset time in days from admission) occurred within a week after admission, with no significant difference between age groups.

Table 2 Demographic characteristics and AKI onset time of patients by age category

Figure 1 is the Venn diagram of the top 200 risk factors identified for each of the four age groups obtained by the GBM algorithm and shows the number of overlapping factors identified across strata. Figure 2 shows the common factors that appear in the top 200 important risk factor list across all four age groups. The variable importance plots for the top-ranked features for predicting AKI across age strata are shown in Supplementary Fig. S1, which illustrated an exponential decline trend in the contribution of top-k variables to AUROC gain. Moreover, overlapping factors in the top 200 list across only three age groups are shown in Supplementary Fig. S2.
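How fold-averaged importance weights lead to the top-k lists and their overlap across age groups can be illustrated with a small sketch. The feature names and weights in the usage note are invented for illustration, and counting a feature as zero in folds where it received no weight is an assumption.

```python
from collections import defaultdict


def average_importance(fold_importances):
    """Mean importance per feature over folds (absent features count as 0).

    fold_importances: list of {feature_name: weight} dicts, one per fold.
    """
    totals = defaultdict(float)
    for importance in fold_importances:
        for name, weight in importance.items():
            totals[name] += weight
    n_folds = len(fold_importances)
    return {name: total / n_folds for name, total in totals.items()}


def top_k(importance, k):
    """Names of the k most important features (ties broken by name)."""
    ranked = sorted(importance, key=lambda name: (-importance[name], name))
    return ranked[:k]


def shared_features(top_by_group):
    """Features that appear in the top-k list of every age group."""
    sets = [set(names) for names in top_by_group.values()]
    return set.intersection(*sets)
```

For instance, `shared_features({"18-35": ["a", "b", "c"], "36-55": ["b", "c", "d"]})` returns the set containing `"b"` and `"c"`, which corresponds to the central region of a Venn diagram such as Fig. 1.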

Fig. 1

Venn diagram for the top 200 features identified in four age groups. This figure shows the number of overlapping features identified as top 200 across the four age groups

Fig. 2

Heat map of the top-200 risk factors that appeared in all four age groups. The figure shows the corresponding ranking of each factor in the GBM model

To further assess the discrimination of the top-ranked features for AKI prediction, we conducted a series of prediction experiments by including different numbers of top-k features (i.e. k = 10 to 300) in four machine learning models (i.e. logistic regression, SVM, LogitBoost, and random forest). Figure 3 shows the predictability trend of the top-k important features with under-sampling of the majority class (see Supplementary Fig. S3 for results without under-sampling). Supplementary Table S4 lists the AUROC and corresponding 95% confidence interval (CI) for AKI prediction with under-sampling using the top-200 variables of the four age groups based on the four machine learning models. Supplementary Table S5 provides results for several predicted probability cutoffs for the final model and the corresponding sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) in predicting AKI across age groups. Figure 4 shows the AUROC achieved by random forest using the top-200 features without under-sampling for age groups 18–35, 36–55, 56–65, and > 65 years: 0.809 (95% CI, 0.769–0.842), 0.787 (95% CI, 0.758–0.813), 0.776 (95% CI, 0.729–0.803), and 0.740 (95% CI, 0.716–0.756), respectively. These results demonstrate that the predictability of AKI in the general inpatient population decreased as age increased, which may be due to the more complex physiology of older adults. Table 3 shows the significance levels of pairwise comparisons of AKI incidence and prediction performance between age groups based on the four machine learning methods: older patients showed a significantly higher incidence of AKI than younger age groups (p < 0.001), whereas the predictive power for the oldest group (i.e. the > 65 age group) was significantly lower than that for the younger groups (p < 0.05).
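For readers less familiar with the metrics reported above, the sketch below computes the AUROC as the probability that a randomly chosen AKI case receives a higher score than a randomly chosen non-AKI case, together with sensitivity, specificity, PPV, and NPV at a given probability cutoff. It is a plain-Python illustration with hypothetical function names, not the study's evaluation code; the pairwise AUROC is quadratic in the sample size and is meant for small examples only.

```python
def auroc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _ratio(numerator, denominator):
    """Safe division: None when the denominator is zero."""
    return numerator / denominator if denominator else None


def cutoff_metrics(y_true, scores, cutoff):
    """Classification metrics when score >= cutoff is called AKI."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in pairs if y == 1 and s < cutoff)
    fp = sum(1 for y, s in pairs if y == 0 and s >= cutoff)
    tn = sum(1 for y, s in pairs if y == 0 and s < cutoff)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }
```

Because PPV and NPV depend on the prevalence of AKI in the evaluated sample, they are only meaningful when computed on test data that keep the original class ratio.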

Fig. 3

Prediction trends of the top-ranking features with under-sampling for the four age groups across different machine learning models

Fig. 4

ROC curves of random forest without under-sampling for the four age groups

Table 3 Significance levels of pairwise comparisons


Advanced age is an established independent risk factor for AKI [17], which may be due to the deterioration of renal function and the reduced capacity for drug detoxification in the elderly [11, 26], making the elderly highly sensitive to nephrotoxic drugs and susceptible to AKI. Findings from previous studies [11, 13, 27,28,29,30] support the proposition that age is an important risk factor among the spectrum of risk factors for AKI. Although the incidence of AKI increases with age, we observed that the predictability of AKI in the general inpatient population decreased with age (Fig. 4). When comparing two data-sample processing mechanisms for addressing the imbalanced AKI vs non-AKI classification problem, namely with or without under-sampling, we consistently observed that the predictive power of the four age-stratified models decreased as age increased. The predictability of AKI risk in the older age group was significantly lower than that of the younger groups (p < 0.05).

Our research reached the same conclusions as Kane-Gill et al. [11]; however, our study had more patients (179,370 vs. 45,655) from all inpatient units (only ICU patients in Kane-Gill et al.), collected more clinical variables (1888 vs. 25), and achieved higher overall AUROCs. Moreover, the recent AKI prediction work by Google published in Nature [18] utilized EHR data from the U.S. Department of Veterans Affairs with over 700,000 patients and 366,856 distinct clinical variables, and their subgroup analysis on age showed lower AUROCs for patients in the older age groups which may also be due to patient heterogeneity.

To examine the change in relative risk of AKI predictors in the general inpatient population, we applied a machine learning-based feature selection algorithm to a large EMR dataset with close to two thousand variables to derive relative ranking profiles and compared these profiles across age groups. Based on our previous research [31], we acknowledge that relative importance rankings of variables are affected by data sampling and feature selection methods. This study does not aim to provide an absolute ranking of important predictors; rather, it analyzes and compares the relative variability and volatility of predictors between age groups using a single feature selection method (i.e. GBM). Figures 1 and 2 and Supplementary Fig. S2 illustrate that the relative risk of AKI predictors fluctuated between age groups.

Specifically, as shown in Table 2, low body mass index (BMI) was associated with higher AKI risk in younger patients, whereas high BMI was associated with higher AKI risk in elderly patients; and younger patients with cystic fibrosis, nutritional deficiencies, or esophageal disorders had a higher risk of developing AKI than older patients. Since quantifying the risk of AKI in older patients may be more difficult, and older adults frequently have impaired drug clearance in addition to polypharmacy [26], clinical decision support systems that ensure proper drug usage and dosing in the elderly may have special value. These findings imply that AKI risk factors are heterogeneous and that age can modify the intensity of relationships between other factors and AKI. Therefore, future studies that evaluate risk factors need to consider complex interactions between factors and their combinatorial effect on the outcome.

It is worth noting that feature selection methods can identify factors with strong predictive ability, but these factors are not necessarily causal. More specifically, some medicines do not by themselves increase the risk of AKI, but the disease treated by the medicine does. The data we extracted are time-stamped at daily intervals since admission, so the data granularity is coarse, and it is difficult to establish whether a disease caused AKI or whether taking a medicine for that illness led to AKI. For example, in Supplementary Fig. S3, insulin (MED548) was identified as an important predictor of AKI in all age groups, and presumably this is just a marker for diabetes, i.e. patients with diabetes or diabetic nephropathy are at higher risk for AKI. Another example is polyethylene glycol 3350 (MED677, see Supplementary Fig. S3), an osmotically acting laxative whose relative risk ranking for predicting AKI increases with age (namely, 69, 84, 22, 18 across the four age groups). However, we cannot establish in this case whether AKI was caused by the clinical indication that requires laxatives or whether using a large amount of such laxatives disturbed water and electrolyte balance in the intestine, thereby inducing AKI. Thus, whether a drug increases patient risk for developing AKI requires rigorous demonstration from clinical experiments.

Furthermore, the granularity of medication data extraction and processing may not change prediction performance but will affect the knowledge learned by the machine learning models [32]. Considering the drug metabolism cycle, in this study we only considered medications taken within a week, which treats long-term (> 7 days) and short-term medication intake the same. In recent years, clinical studies have recognized that long-term use of drugs that inhibit gastric acid secretion (e.g. proton pump inhibitors [33, 34]) is likely to cause acute renal failure. Our model identified glycopyrronium bromide (MED566, see Supplementary Fig. S3), which is typically used for functional gastrointestinal disorders and inhibits gastric secretion and regulates gastrointestinal motility, as having a relative risk ranking with respect to AKI that rose with age (namely, 136, 40, 28, 23 for the four age groups). Hence, future work needs to consider the duration and amount of drug usage.

Several limitations of the present research must be considered. First, we limited the analysis to patients with a minimum eGFR (estimated glomerular filtration rate) of 60 mL/min/1.73 m² and normal serum creatinine at hospital admission. We acknowledge that patients with reduced eGFR have an increased risk of developing AKI; however, we made the decision to focus on hospital-acquired AKI. Second, our discretization of lab tests and vitals, chosen to enhance machine learning model interpretability, leads to the loss of some information in the data. Third, we did not include service unit as a risk factor and only selected certain lab tests based on previous literature for AKI prediction. Fourth, since our study was not limited to the ICU, we did not include urine output criteria as a predictor, nor did we use them to define AKI. Fifth, our age stratification was not fine-grained; for example, patients > 65 years old were lumped into one category. Finally, although we utilized a large cohort observed for up to a decade, it only reflects the population of one academic medical center. Replicating this study in other institutions would help generalize the conclusions.


In conclusion, we took advantage of a large EMR dataset and applied machine learning methods to analyze the changing relative risk and prediction performance of AKI across age strata. The results demonstrate that (a) AKI risk increases with age, but the ability to predict AKI declines with age due to the increasing complexity of the patients; and (b) the relative importance of clinical predictors in predicting hospital-acquired AKI fluctuates between age groups. The study findings suggest that accurate AKI risk prediction in the elderly may require additional effort. They highlight the importance of considering age-specific risk differences in hospitalized patients to enhance AKI prevention in clinical care.

Availability of data and materials

The clinical dataset used for the analysis described in this study was obtained from the University of Kansas Medical Center (KUMC) HERON clinical data repository, which is not publicly available. Upon reasonable request to the corresponding author, an amendment can be requested to share the necessary data.



AKI: Acute kidney injury
EMR: Electronic medical records
KDIGO: Kidney Disease Improving Global Outcomes
AUROC: Area under the receiver operating characteristic curve
CKD: Chronic kidney disease
ICU: Intensive care units
eGFR: Estimated glomerular filtration rate
SCr: Serum creatinine
MDRD: Modification of Diet in Renal Disease
HERON: Health Enterprise Repository for Ontological Narration
HIPAA: Health Insurance Portability and Accountability Act
Lab tests
Admission diagnoses
Medical history
GBM: Gradient boosting machine
SVM: Support vector machine
CI: Confidence interval
PPV: Positive predictive value
NPV: Negative predictive value


  1. Al-jaghbeer M, Dealmeida D, Bilderback A, Ambrosino R, Kellum JA. Clinical decision support for in-hospital AKI. J Am Soc Nephrol. 2018;29:654–60.

  2. Kate RJ, Perez RM, Mazumdar D, Pasupathy KS, Nilakantan V. Prediction and detection models for acute kidney injury in hospitalized older adults. BMC Med Inform Decis Mak. 2016;16:39.

  3. Himmelfarb J, Joannidis M, Molitoris B, Schietz M, Okusa MD, Warnock D, et al. Evaluation and initial management of acute kidney injury. Clin J Am Soc Nephrol. 2008;3:962–7.

  4. Kidney Disease: Improving Global Outcomes (KDIGO) Acute Kidney Injury Work Group. KDIGO clinical practice guideline for acute kidney injury. Kidney Int Suppl. 2012;2.

  5. Balasubramanian G, Moiz A, Rauchman M, Zhang Z, Gopalakrishnan R, Balasubramanian S. Early nephrologist involvement in hospital-acquired acute kidney injury: a pilot study. Am J Kidney Dis. 2011;57:228–34.

  6. Cronin RM, VanHouten JP, Siew ED, Eden SK, Fihn SD, Nielson CD, et al. National Veterans Health Administration inpatient risk stratification models for hospital-acquired acute kidney injury. J Am Med Inform Assoc. 2015;22:1054–71.

  7. Malhotra R, Kashani KB, Macedo E, Kim J, Bouchard J, Wynn S, et al. A risk prediction score for acute kidney injury in the intensive care unit. Nephrol Dial Transplant. 2017;32:814–22.

  8. Jiang W, Teng J, Xu J, Shen B, Wang Y, Fang Y, et al. Dynamic predictive scores for cardiac surgery-associated acute kidney injury. J Am Heart Assoc. 2016;5:1–10.

  9. Pablo Jorge-Monjas C, Bustamante-Munguira J, Lorenzo M, Heredia-Rodríguez M, Fierro I, Gómez-Sánchez E, et al. Predicting cardiac surgery-associated acute kidney injury: the CRATE score. J Crit Care. 2016;31:130–8.

  10. Palomba H, De Castro I, Neto ALC, Lage S, Yu L. Acute kidney injury prediction following elective cardiac surgery: AKICS score. Kidney Int. 2007;72:624–31.

  11. Kane-Gill SL, Sileanu FE, Murugan R, Trietley GS, Handler SM, Kellum JA. Risk factors for acute kidney injury in older adults with critical illness: a retrospective cohort study. Am J Kidney Dis. 2015;65:860–9.

  12. Leblanc M, Kellum JA, Gibney RTN, Lieberthal W, Tumlin J, Mehta R. Risk factors for acute renal failure: inherent and modifiable risks. Curr Opin Crit Care. 2005;11:533–6.

  13. Chao C, Wu V, Lai C, Shiao C, Huang T, Wu P. Advanced age affects the outcome-predictive power of RIFLE classification in geriatric patients with acute kidney injury. Kidney Int. 2012;82:920–7.

  14. Matheny ME, Miller RA, Ikizler TA, Waitman LR, Denny JC, Schildcrout JS, et al. Development of inpatient risk stratification models of acute kidney injury for use in electronic health records. Med Decis Mak. 2010;30:639–50.

  15. Kashani K. Acute kidney injury risk prediction. In: Annual update in intensive care and emergency medicine 2018. Cham: Springer; 2018. p. 321–32.

  16. Shickel B, Tighe PJ, Bihorac A, Rashidi P. Deep EHR: a survey of recent advances in deep learning techniques for electronic health record (EHR) analysis. IEEE J Biomed Heal Inform. 2017;22:1589–604.

  17. Sutherland SM, Chawla LS, Kane-Gill SL, Hsu RK, Kramer AA, Goldstein SL, et al. Utilizing electronic health records to predict acute kidney injury risk and outcomes: workgroup statements from the 15th ADQI consensus conference. Can J Kidney Heal Dis. 2016;3:99.

  18. Tomašev N, Glorot X, Rae JW, Zielinski M, Askham H, Saraiva A, et al. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature. 2019;572:116.

  19. Fliser D, Laville M, Covic A, Fouque D, Vanholder R, Juillard L, et al. A European renal best practice (ERBP) position statement on the kidney disease improving global outcomes (KDIGO) clinical practice guidelines on acute kidney injury: part 1: definitions, conservative management and contrast-induced nephropathy. Nephrol Dial Transplant. 2012;27:4263–72.

  20. Koyner JL, Adhikari R, Edelson DP, Churpek MM. Development of a Multicenter Ward–Based AKI Prediction Model. Clin J Am Soc Nephrol. 2016;11(11):1935–43.

  21. Koyner JL, Carey KA, Edelson DP, Churpek MM. The development of a machine learning inpatient acute kidney injury prediction model. Crit Care Med. 2018;46:1070–7.

  22. Zhang G, Fang B. LogitBoost classifier for discriminating thermophilic and mesophilic proteins. J Biotechnol. 2007;127:417–24.

  23. Zuo YC, Chen W, Fan GL, Li QZ. A similarity distance of diversity measure for discriminating mesophilic and thermophilic proteins. Amino Acids. 2013;44:573–80.

  24. Flechet M, Güiza F, Schetz M, Wouters P, Vanhorebeek I, Derese I, et al. AKIpredictor, an online prognostic calculator for acute kidney injury in adult critically ill patients: development, validation and comparison to serum neutrophil gelatinase-associated lipocalin. Intensive Care Med. 2017;43:764–73.

  25. Bradley AP. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recogn. 1997;30:1145–59.

  26. Ftouh S, Thomas M. Acute kidney injury: summary of NICE guidance. Bmj. 2013;347:f4930.

  27. Chia-Ter C, Hung-Bin T, Chia-Yi W, Yu-Feng L, Nin-Chieh H, Jin-Shin C, et al. Cumulative Cardiovascular Polypharmacy Is Associated With the Risk of Acute Kidney Injury in Elderly Patients. Med (Baltimore). 2015;94:e1251.

  28. Ter Chao C, Bin TH, Wu CY, Hsu NC, Lin YF, Chen JS, et al. Cross-sectional study of the association between functional status and acute kidney injury in geriatric patients. BMC Nephrol. 2015;16:186.

  29. Grams ME, Sang Y, Ballew SH, Gansevoort RT, Kimm H, Kovesdy CP, et al. A meta-analysis of the Association of Estimated GFR, albuminuria, age, race, and sex with acute kidney injury. Am J Kidney Dis. 2015;66:591–601.

  30. Chao C-T, Wang J, Wu H-Y, Huang J-W, Chien K-L. Age modifies the risk factor profiles for acute kidney injury among recently diagnosed type 2 diabetic patients: a population-based study. GeroScience. 2018;40:201–17.

  31. Wu L, Hu Y, Liu X, Zhang X, Chen W, Yu ASL, et al. Feature ranking in predictive models for hospital-acquired acute kidney injury. Sci Rep. 2018;8:17298.

  32. Song X, Waitman LR, Hu Y, Yu ASL, Robbins D, Liu M. An exploration of ontology-based EMR data abstraction for diabetic kidney disease prediction. AMIA Summits Transl Sci Proc. 2019;2019:704.

  33. Klepser DG, Collier DS, Cochran GL. Proton pump inhibitors and acute kidney injury: a nested case–control study. BMC Nephrol. 2013;14:150.

  34. Xie Y, Bowe B, Li T, Xian H, Yan Y, Al-aly Z. Long-term kidney outcomes among users of proton pump inhibitors without intervening acute kidney injury. Kidney Int. 2017;91:1482–94.



The authors are grateful to the reviewers for their valuable suggestions.


This research was partially supported by the Major Research Plan of the National Natural Science Foundation of China (Key Program, Grant No. 91746204), the Youth Science Fund of the National Natural Science Foundation of China (Grant No. 61802149), the Science and Technology Development in Guangdong Province (Major Projects of Advanced and Key Techniques Innovation, Grant No.2017B030308008), Guangdong Engineering Technology Research Center for Big Data Precision Healthcare (Grant No.603141789047), and the Fundamental Research Funds for the Central Universities (Grant No.21618315). ML, LRW, AY, and JK were supported by the National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health (NIH) under award number R01DK116986. The clinical dataset used for analysis described in this study was obtained from the University of Kansas Medical Center (KUMC) HERON clinical data repository which is supported by institutional funding and by the KUMC Clinical Translational Science Award (CTSA) grant UL1TR002366 from NIH.

Author information


LW designed the overall study, carried out the experiments, and wrote the manuscript. ML and YH critically appraised and revised the manuscript. ML and LRW performed the EMR data extraction. XZ and WQ contributed to data processing. AY and JK advised on the clinical experiment design and result interpretation. All authors reviewed the manuscript critically for scientific content and gave final approval of the manuscript for publication.

Corresponding author

Correspondence to Mei Liu.

Ethics declarations

Ethics approval and consent to participate

The retrospective cohort was built using the University of Kansas Medical Center’s de-identified clinical data repository, HERON (Health Enterprise Repository for Ontological Narration). No approval by the Institutional Review Board was required for the study because all identifiers were removed and event dates were shifted, meeting the de-identification criteria specified in the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule. The de-identified data request for this study was approved by the HERON Data Request Oversight Committee.

Consent for publication

Not applicable.

Competing interests

The authors declare no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1: Table S1.

The KDIGO serum creatinine based staging system for acute kidney injury.

Additional file 2: Table S2.

Discretization for patient vital signs.

Additional file 3: Table S3.

The percentage of missing values in vital signs.

Additional file 4: Table S4.

Prediction performance in terms of area under the receiver operating characteristic curve (AUROC) for models built with the top-200 important features and under-sampling of majority class samples.

Additional file 5: Table S5.

Sensitivity and specificity at different operating probability cutoffs for the random forest model prediction of acute kidney injury.

Additional file 6: Figure S1.

Variable importance plot for top-ranked features across four age groups.

Additional file 7: Figure S2.

Heat map of the top-200 important risk factors that appeared in only three age groups with the corresponding ranking of each factor in the GBM model.

Additional file 8: Figure S3.

Prediction performance trend of different machine learning models learned with top-ranking features and without under-sampling of majority class samples across the four age groups.

Additional file 9: Supplementary Method:

Gradient Boosting Machine (GBM).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Wu, L., Hu, Y., Zhang, X. et al. Changing relative risk of clinical factors for hospital-acquired acute kidney injury across age groups: a retrospective cohort study. BMC Nephrol 21, 321 (2020).
