Comparison of urine and blood NGAL for early prediction of delayed graft function in adult kidney transplant recipients: a meta-analysis of observational studies

Background Neutrophil gelatinase-assoicated lipocalin (NGAL) appears to be a promising proximal tubular injury biomarker for early prediction of delayed graft function (DGF) in kidney transplant recipients. However, its predictive values in urine and blood were varied among different studies. Here, we performed the meta-analysis to compare the predictive values of urine NGAL (uNGAL) and blood NGAL (bNGAL) for DGF in adult kidney transplant recipients. Methods We systematically searched Medline, Cochrane library and Embase for relevant studies from inception to May 2018. The summary receiver operating characteristic (SROC) curves, the pooled sensitivity, specificity and diagnostic odds ratio (DOR) were used to evaluate the prognostic performance of uNGAL and bNGAL for the identification of DGF. Results A total of 1036 patients from 14 eligible studies were included in the analysis. 8 studies focused on NGAL in urine and 6 reported NGAL in serum or plasma. The composite area under the ROC (AUC) for 24 h uNGAL was 0.91 (95% CI, 0.89–0.94) and the overall DOR for 24 h uNGAL was 24.17(95% CI, 9.94–58.75) with a sensitivity of 0.88 (95% CI, 0.75–0.94) and a specificity of 0.81 (95% CI, 0.68–0.89). The composite AUC for 24 h bNGAL was 0.95 (95% CI, 0.93–0.97) and the overall DOR for 24 h bNGAL was 43.11 (95% CI, 16.43–113.12) with a sensitivity of 0.91 (95% CI, 0.81–0.96) and a specificity of 0.86 (95% CI, 0.78–0.92). Conclusions Urine and serum/plasma NGAL were valuable biomarkers for early identification of DGF in kidney transplantation. In addition, the bNGAL was superior to uNGAL in early prediction of DGF. Electronic supplementary material The online version of this article (10.1186/s12882-019-1491-y) contains supplementary material, which is available to authorized users.


Background
Delayed graft function (DGF), traditionally defined as the need for dialysis within 7 days after kidney transplantation, remains one of the major obstacles for patients to recover from transplantation during the postoperative course. It significantly prolongs hospitalization stay, increases medical costs, and even worsens patients with increased risk of chronic kidney diseases or graft loss in the first year after transplantation [1]. In addition, the incidence of DGF is highly ranged from 5 to 50% in deceased-donor recipients and from 4 to 10% in living-donor recipients [2]. Therefore early identification of DGF is warranted as it not only provides adequate time for clinicians to adjust therapeutic intervention to limit the further development of allograft injury, but also greatly reduces the economic burden of patients [3].
Previously, the requirement for dialysis, the failure of serum creatinine (Scr) to decrease, the graft histopathology and the urine output following transplantation have been applied to identify DGF in kidney transplant recipients during a few days after transplantation [4]. However, these criteria not only induce marked variation in the diagnosis of DGF, but also delay the identification of DGF, thus disturbing the timely medication adjustment in kidney transplant recipients. Therefore, a lot of efforts have been made to find novel ideal biomarkers that can detect DGF soon after transplantation. Among a variety of these potential biomarkers [5], neutrophil gelatinase associated lipocalin (NGAL), a kidney proximal tubular injury biomarker, has been intensively studied in urine and blood (serum/plasma) and appears to be a non-invasive and valuable marker of DGF with high sensitivity and specificity in many centers. However, whether urine (uNGAL) or blood NGAL (bNGAL) is superior to its blood or urine counterpart in predicting the occurrence of DGF remains unclear so far. Therefore, we conducted the present meta-analysis of 14 observational studies to compare the accumulative predictive values of uNGAL and bNGAL for early identification of DGF in kidney transplant recipients.

Data sources and literature search strategies
Two investigators (Y.M.L. and Y.L.) independently searched the literature in Medline (via Pubmed), Cochrane Library and Embase (via Ovid) electronic databases from inception to May 2018. Following medical subject headings or keywords were systematically searched without language restriction: "NGAL", "neutrophil gelatinase-associated lipocalin", "Lipocalin 2", "delayed graft function", "DGF" and "kidney transplantation". In addition, reference lists of included articles were hand-searched to identify other relevant studies. All potentially relevant records were imported into Endnote X7.7 (Thomson Corporation, Connecticut, USA) for further management.

Study selection and included criteria
Two reviewers (Y.M.L. and Y.L.) independently screened the titles and abstracts and further retrieved the full text of each potentially relevant study to determine study eligibility. Disagreements were resolved by consensus adjudication. Prospective and retrospective cohort studies were eligible for inclusion if they evaluated the predictive performance of urine, serum or plasma NGAL for DGF in adult kidney transplant recipients. Conference abstracts, reviews, editorials, commentaries, letters and studies without mandatory predictive variables including the area under the receiver operating characteristic curve (AUC), sensitivity and specificity were excluded.

Data extraction and quality assessment
Two investigators (Y.M.L. and Y.L.) independently extracted the data from eligible articles according to previously prepared data sheet. The following information were included in the sheet: (1) Study and patient characteristics, including first author, publish year, study design, research country, sample resource and size, DGF definition, mean or median NGAL levels of DGF and Non-DGF groups, the time of obtaining specimen, cold ischemia time (CIT), expanded criteria donor (ECD) or donor after cardiac death (DCD) ratio, donor age and donor terminal Scr level. (2) Predictive values, including sensitivity, specificity, cut-off value and AUC with 95% confidence interval (CI). If multiple AUC values of different sampling time points were provided in the same study, the AUC values of 24 h and 48 h were recorded. In addition, if there were multiple cut-off values at the same sampling time point, the one showing the highest Youden index (sensitivity + specificity-1) was extracted. We evaluated the quality of the included articles via Quality Assessment of Diagnosis Accuracy Studies-2 (QUADAS-2) tool that comprises 14 questions [6]. "Yes", "Unclear" and "No" were chosen for each signaling question in the context of original articles.

Statistical analysis
Based on the heterogeneity of included studies, a random effect model (DerSimonian Laird) or fixed effect model (Mantel-Haenszel) was employed to estimate the pooled sensitivity, specificity and diagnostic odds ratio (DOR) with 95% CI. Moreover, forest plots of sensitivity, specificity, DOR and summary ROC (SROC) with overall AUC value were presented. Cochran's Q test was used to assess heterogeneity across studies and inconsistency index I 2 was applied to estimate the degree of heterogeneity. I 2 values of 25, 50 and 75% were thought to be low, moderate and high heterogeneity, respectively [7]. When substantial heterogeneity was found (I 2 > 50%), Spearman correlation coefficient test was performed to detect the presence of threshold effect; Sensitivity analysis was conducted to explore the source of non-threshold effect heterogeneity. Publication bias was statistically assessed by Deeks' linear regression test. In addition, we compared the risk factors of DGF (including DCD or ECD ratio, donor age, donor Scr and CIT) between uNGAL and bNGAL studies to figure out whether they were significantly different between these two groups of studies. The overall levels of these risk factors in each study were calculated using available mean or median values and patient numbers from DGF and Non-DGF groups. Student's t-test or Mann-Whitney U-test were utilized to compare continuous variables with normal distribution and skewed distribution, respectively. All statistical analyses were performed with Stata version 12.0 (Stata Corporation, College Station, Texas) and SPSS version 23.0 (SPSS Inc., Chicago, IL, USA). A two tailed P-value less than 0.05 was considered statistically significant.

Search results and study selection
Our initial search yielded 810 records. After removing all duplicates, 606 records remained for further screening. By looking through titles and abstracts, the remaining 29 articles were evaluated in full-text manner. Eventually, 14 eligible articles [8][9][10][11][12][13][14][15][16][17][18][19][20][21] were accepted in the meta-analysis according to pre-established inclusion criteria. No additional articles were found by sifting for titles and abstracts of the references in eligible articles and most related reviews. Detailed flow diagram of study selection is shown in Fig. 1.

Study characteristics
Characteristics of individual study were summarized in Table 1. 14 eligible articles from 10 different countries were published between 2006 and 2016. All studies were prospective cohort studies except for 2 studies that were designed in retrospective manner [12,13]. Eight studies including 10 datasets (seven 24 h uNGAL and three 48 h uNGAL) evaluated uNGAL and six studies evaluated bNGAL (three serum and three plasma  [13,[17][18][19] combined the dialysis-based criterion and the failure of Scr to decrease or the graft histopathology criterion to define DGF and pezeshgi's study [20] did not describe the detailed definition of DGF. No significant differences were observed between 8 uNGAL studies and 6 bNGAL studies in terms of DCD or ECD ratio, donor age, donor Scr and CIT (P > 0.05).

Quality assessment and publication bias
The quality of studies based on QUADAS-2 tool was summarized in Additional file 1: Figure S1 Deeks' Funnel plot indicated no significant publication biases were existed among the included studies of both uNGAL and bNGAL (Additional file 2: Figure S2).

Data synthesis and heterogeneity analysis
AUC value, cut-off value, sensitivity and specificity of individual dataset were summarized in Table 2. Truepositive (TP), false-positive (FP), true-negative (TN) and false-negative (FN) were calculated based on the   Fig. 2 (a, c) and Fig. 3 (a, c), respectively. Pooled sensitivities and specificities for 24 h uNGAL and 24 h bNGAL were both presented in Additional file 3: Figure S3. To explore possible reasons for heterogeneity, Spearman correlation coefficient tests and Sensitivity analyses were performed. No significant threshold effect was observed among bNGAL studies (r = 1.00, P = 1.00). While the coefficient of 7 datasets analyzing 24 h uNGAL (r = − 0.10, P = 0.01) indicated the presence of threshold effect. Therefore, hierarchical SROC modeling was plotted to pool the sensitivity and specificity [22]. Sensitivity analyses were conducted by omitting study one by one to investigate the source of non-threshold effect heterogeneity across studies. As demonstrated in Fig. 4a, the estimate value of uNGAL was beyond the limitation of upper CI when removing Hollmen's (2011) study, suggesting that Hollmen's (2011) study was the main source of the heterogeneity. Then, we performed the pooled analysis without including Hollmen's (2011) study. The results revealed that the composite AUC for uNGAL was 0.91 (95% CI, 0.89-0.94); the overall DOR was 43.11 (95% CI, 16.43-113.12) with I 2 value dropped from 61.9 to 35.1%; the sensitivity was 0.88 (95% CI, 0.75-0.94) and the specificity was 0.81 (95% CI, 0.68-0.89) (Figs. 2b 3b, 4b showed an overlapping CI, enhancing the stability of our data synthesis for bNGAL.

Discussion
Previously, several meta-analysis and systematic reviews have summarized the significance of uNGAL and bNGAL in predicting acute kidney injury (AKI) in different clinical settings, such as in cardiac surgery [23], contrast-induced AKI after cardiac catheterization [24] and found they were promising biomarkers for early detection of AKI. However, their predictive ability for DGF, a form of AKI posttransplantation, varied among different studies. So we quantitatively investigated the predictive values of uNGAL In addition, bNGAL appeared to be a better biomarker than uNGAL in predicting DGF with consideration of AUC values, sensitivities and specificities. NGAL, also known as lipcalin-2 (LCN2), is secreted by various tissues such as kidney tubules, liver, lung and gastrointestinal tract at a low level in healthy controls [25]. It has been reported to involve in several pathways such as apoptosis, bacteriostasis, renal tubule epithelial cell proliferation and regeneration [26]. NGAL has several forms including monomeric froms, dimers and trimers. The majority of NGAL was in monomeric form (with a molecular weight of 25 kDa) that was mainly produced by injured kidney tubule epithelium [27]. When there is proximal tubular injury, NGAL levels would increase rapidly. So NGAL has emerged as a promising predictor of AKI in recent years and been regarded as the "troponin of kidney" [28].
In kidney transplantation, uNGAL and bNGAL have been extensively studied in the prediction and diagnosis of short-and long-term renal functions [29]. The sources of NGAL are different in urine and blood. Most uNGAL was derived from distal nephron synthesis rather than filtered from blood, while bNGAL not only came from the damaged kidney, but also the systematic pool [30]. So, theoretically, uNGAL was expected to be more representative of kidney injury than bNGAL. However, this hypothesis was not borne out in our analysis of DGF. The results of the current meta-analysis demonstrated that bNGAL had a pooled DOR of 43.11 that was almost twice as higher as uNGAL (DOR, 24.17). Moreover, bNGAL showed an obviously higher sensitivity and a slightly higher specificity than those of uNGAL, which supported the superiority of bNGAL over uNGAL in predicting DGF within 24 h in kidney transplant recipients. This was consistent with Buemi's study [31], which investigated the predictive values of uNGAL and plasma NGAL (pNGAL) for DGF in 97 recipients and found that pNGAL was superior to uNGAL in terms of AUC values. In addition, two studies [11,15] from Hollmen's group separately explored the predictive potential of uNGAL and serum NGAL for DGF in the same kidney transplant recipient cohort and found higher AUC value of sNGAL (AUC, 0.85) compared to uNGAL (AUC, 0.74). One of the possible reasons was that many factors (such as urine concentration and glomerular filtration rate) might affect the levels of uNGAL, thus weakening the predictive ability of uNGAL. These findings were interesting and had great significance. Because in addition to its greater predictive performance, bNGAL would be more feasible for kidney transplant recipients when they suffered from a severe condition of oliguria and even anuria post-operatively. However, due to the inconsistent cut-off values and unavailability of original data from included studies, we were unable to determine the optimal predictive cut-off value for bNGAL to be implement in clinical practice.
Determining when to detect NGAL is important for clinicians to identify DGF effectively in kidney transplant recipients. It was reported that NGAL began to rise 2 h after the injury, peak at 8-12 h and return after 24-48 h [9]. So we recorded the prediction data when the sampling time was within 24 h and within 48 h, trying to  Limitations in this meta-analysis should be mentioned here. Firstly, the inconsistence of DGF definition may alter the predictive performance of uNGAL and bNGAL for DGF in individual studies, thereby may bring some deviations. Secondly, we only included the published literatures whose predictive variables (TP, TN, FP and FN) were available or can be calculated. So the predictive performance of uNGAL and bNGAL may be overestimated due to the publication and selection bias. Thirdly, different cut-off values were used in the included studies. These discrepancies may be attributed to the differences in experimental designs including detection methods, sample size and patient baseline characteristics, which may affect the sensitivity and specificity in individual study. To find an optimal cut-off value that can be implemented in clinical settings, further investigation in large prospective cohorts are required. Importantly, it might be necessary for each research center to pre-define specific normal ranges for urine and blood NGAL in normal kidney transplant recipients before its clinical application.

Conclusions
In conclusion, the results of the present meta-analysis demonstrated that both urine NGAL and blood NGAL appeared to be valuable biomarkers of DGF in adult kidney transplant recipients. In addition, the blood NGAL was superior to urinary NGAL in early prediction of DGF. However, further large-scale prospective cohort studies are needed to determine the optimal cut-off value for NGAL that is suitable for clinical use.

Additional files
Additional file 1: Figure S1. Quality assessment of the 14 included studies using QUADAS-2 tool. The assessment of risk of bias and applicability concerns for individual study (PDF 98 kb) Additional file 2: Figure S2. Deeks' funnel plots for the evaluation of potential publication bias in prediction of uNGAL (a) and bNGAL (b) for DGF (PDF 876 kb) Additional file 3: Figure S3. Forest plots of the pooled sensitivities and specificities of uNGAL (a) and bNGAL (b) level in predicting DGF in kidney transplant recipients. The black squares in the gray squares and the horizontal lines represent the point estimate and 95% CI, respectively. The dotted lines represent the pooled estimate, and the hollow diamonds represent the 95% CI of the pooled estimate (PDF 5145 kb)