The cost-effectiveness of using chronic kidney disease risk scores to screen for early-stage chronic kidney disease

Background Better treatment during early stages of chronic kidney disease (CKD) may slow progression to end-stage renal disease and decrease associated complications and medical costs. Achieving early treatment of CKD is challenging, however, because a large fraction of persons with CKD are unaware of having this disease. Screening for CKD is one important method for increasing awareness. We examined the cost-effectiveness of identifying persons for early-stage CKD screening (i.e., screening for moderate albuminuria) using published CKD risk scores.
Methods We used the CKD Health Policy Model, a microsimulation model, to simulate the cost-effectiveness of using two published CKD risk scores, by Bang et al. and Kshirsagar et al., to identify persons in the US for CKD screening with testing for albuminuria. We tested alternative risk score thresholds (0.20, 0.15, 0.10, 0.05, and 0.02) above which persons were assigned to receive screening, and alternative intervals (1, 2, and 5 years) for follow-up screening if the first screening was negative. To assess cost-effectiveness, we examined incremental cost-effectiveness ratios (ICERs), defined as incremental lifetime costs divided by incremental lifetime quality-adjusted life years (QALYs), relative to the next higher screening threshold. Cost-effective scenarios were defined as those with ICERs less than $50,000 per QALY. Among the cost-effective scenarios, the optimal scenario was the one that resulted in the highest lifetime QALYs.
Results ICERs ranged from $8,823 per QALY to $124,626 per QALY for the Bang et al. risk score and $6,342 per QALY to $405,861 per QALY for the Kshirsagar et al. risk score. The Bang et al. risk score with a threshold of 0.02 and 2-year follow-up screening was found to be optimal because it had an ICER less than $50,000 per QALY and resulted in the highest lifetime QALYs.
Conclusions This study indicates that using these CKD risk scores may allow clinicians to cost-effectively identify a broader population for CKD screening with testing for albuminuria and potentially detect people with CKD at earlier stages of the disease than current approaches of screening only persons with diabetes or hypertension.


Background
Chronic kidney disease (CKD) affected 13.6% of U.S. adults in 2007-2012 [1] and is projected to affect nearly 17% of adults by 2030 [2]. All stages of CKD have been shown to impose a significant health and economic burden [3][4][5]. Better treatment during early stages of CKD may slow progression to end-stage renal disease (ESRD), the most severe stage of CKD, and reduce complications, medical costs, and mortality associated with CKD [6][7][8][9]. Achieving early treatment of CKD is challenging, however, because as many as 94.5% of persons with CKD are unaware of having the disease [10,11]. Therefore, increasing awareness among patients and clinicians about CKD and CKD screening is important to achieve earlier treatment of CKD and mitigate its associated costs and complications. Screening for moderately increased albuminuria (microalbuminuria), a marker of early-stage CKD, was found to be cost-effective in populations with diabetes or hypertension [12][13][14]. Diabetes and hypertension are primary risk factors for CKD, but approximately 52% of those with CKD do not have diabetes, and approximately 10% do not have hypertension [1]. Thus, identifying cost-effective methods of screening for CKD in other populations is crucial to increase awareness of CKD and CKD screening among patients and clinicians and to improve early detection and management of CKD.
Using CKD risk scores to identify persons for CKD screening with testing for albuminuria may prove to be a cost-effective method for identifying a population broader than just those with diabetes or hypertension. In this study, we used the CKD Health Policy Model, a microsimulation model of CKD progression, to examine the cost-effectiveness of identifying persons for early-stage CKD screening (i.e., screening for moderate albuminuria) using two published CKD risk scores: one published by Bang et al. [15] and one published by Kshirsagar et al. [16]. We assessed the cost-effectiveness of alternative screening scenarios by varying two factors: the risk score threshold above which persons were assigned to receive screening, and the frequency of follow-up screening if the initial test was negative.

Model overview
This study used the CKD Health Policy Model in 2015, a microsimulation model of CKD progression [2,12,17,18]. Briefly, the model simulates progression of CKD and its complications in a nationally representative cohort drawn from the National Health and Nutrition Examination Survey (NHANES) through age 90 years or death. The model includes eight states: no CKD, CKD stages 1 through 5 (with stage 3 divided into 3a and 3b), and death. CKD stages are defined by estimated glomerular filtration rates (eGFR) and the presence of elevated albuminuria (urinary albumin to creatinine ratio ≥30 mg/g) [19]. The model concomitantly simulates the natural history of complications from CKD. Model parameters are derived from the epidemiological literature, clinical trials, and a previous cost-effectiveness study [14]. Importantly, the model simulates screening for moderate and severe albuminuria. Figure 1 presents the screening and treatment pathway in the model for a person diagnosed with CKD. In the model, treatment with angiotensin-converting enzyme (ACE) inhibitors or angiotensin receptor blockers (ARBs) decreases the probability of progression from moderate to severe albuminuria, slows the annual decline in GFR for persons with moderate albuminuria, and reduces the annual mortality rate for persons with moderate albuminuria. The model does not include parameters related to possible harms associated with screening, incidental findings, or over-diagnosis, but because of the two-stage test and the sensitivity and specificity parameters, we expect misdiagnosis to be low. Model parameters related to CKD screening and treatments with ACE inhibitors or ARBs are shown in Table 1.
Fig. 1 Flowchart of CKD screening and treatment in the CKD Health Policy Model. ACE, angiotensin-converting enzyme inhibitor; ACR, albumin-to-creatinine ratio; ARB, angiotensin receptor blockers; CKD, chronic kidney disease; GFR, glomerular filtration rate
Table 1 Model parameters related to CKD screening and treatment with ACE inhibitors or ARBs (surviving entries only: No diabetes, 85.23, Boulware et al. [14]; CMS [28]. General practitioner visit, 132.88, Boulware et al. [14]; CMS [28]. Annual drug therapy, [21])
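The eight model states described above can be illustrated with a short sketch. The albuminuria cutoff (ACR of at least 30 mg/g) comes from the text; the eGFR cut points are the conventional staging boundaries and are an assumption here, since the text does not list them. The function name `ckd_state` and the constant names are hypothetical.

```python
# Sketch only: conventional eGFR boundaries are assumed, not taken from the paper.
ACR_THRESHOLD_MG_G = 30.0  # elevated albuminuria cutoff stated in the text

MODEL_STATES = (
    "no CKD", "stage 1", "stage 2", "stage 3a",
    "stage 3b", "stage 4", "stage 5", "death",
)


def ckd_state(egfr: float, acr: float, alive: bool = True) -> str:
    """Map eGFR (mL/min per 1.73 m^2) and ACR (mg/g) to one of the eight states."""
    if not alive:
        return "death"
    if egfr < 15:
        return "stage 5"
    if egfr < 30:
        return "stage 4"
    if egfr < 45:
        return "stage 3b"
    if egfr < 60:
        return "stage 3a"
    # With eGFR of 60 or more, CKD is present only if albuminuria is elevated.
    if acr >= ACR_THRESHOLD_MG_G:
        return "stage 2" if egfr < 90 else "stage 1"
    return "no CKD"
```

In this sketch, stages 1 and 2 require elevated albuminuria, which is why screening for albuminuria is the route to detecting early-stage CKD.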

Risk scores
We assigned persons to receive CKD screening based on published risk scores. Risk scores were identified from a literature review based on four criteria: (1) the predictive factors are commonly collected as part of regular physician office visits, so that the risk score can be feasibly implemented to identify a broad population for CKD screening; (2) the study pertains to the U.S. population; (3) the study has good internal predictive ability; and (4) the study has good external predictive ability as measured using external data sources. We allowed for the inclusion of some factors (diabetes, cholesterol, and anemia) that are collected at office visits with slightly less regularity. Two risk scores were identified based on these criteria: one published by Bang et al. [15] and one published by Kshirsagar et al. [16]. The coefficients for both risk scores are given in Table 3. These coefficients are derived from logistic regressions, so a logistic transformation was used to construct risk scores in the model cohort, based on each person's risk factors. The two risk scores are constructed using largely similar risk factors.
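As a minimal sketch of the logistic transformation mentioned above: a person's risk factors are combined linearly with the regression coefficients, and the result is passed through the logistic function to give a probability between 0 and 1. The coefficient values and factor names below are hypothetical placeholders for illustration; they are not the Table 3 coefficients of either risk score.

```python
import math

# Hypothetical coefficients for illustration only; NOT the values in Table 3.
EXAMPLE_COEFFICIENTS = {
    "intercept": -5.0,
    "age_60_69": 1.5,
    "diabetes": 0.5,
    "hypertension": 0.7,
}


def risk_score(risk_factors: dict, coefficients: dict) -> float:
    """Return a predicted probability from 0/1 risk-factor indicators.

    The linear predictor is the intercept plus the sum of coefficient times
    indicator; the logistic function maps it to the (0, 1) interval.
    """
    linear = coefficients["intercept"]
    for name, value in risk_factors.items():
        linear += coefficients[name] * value
    return 1.0 / (1.0 + math.exp(-linear))


# Example person with all three illustrative risk factors present.
example = risk_score(
    {"age_60_69": 1, "diabetes": 1, "hypertension": 1},
    EXAMPLE_COEFFICIENTS,
)
```

Because the output is a probability, thresholds such as 0.02 or 0.20 can be read directly as predicted risk levels.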
After determining a person's risk score, it was necessary to determine the risk score threshold above which a person is assigned to receive early-stage CKD screening (i.e., screening for moderate albuminuria). No optimal threshold was defined ex-ante, so five alternative thresholds (0.20, 0.15, 0.10, 0.05, and 0.02) were tested for the Bang et al. and Kshirsagar et al. risk scores. This range of thresholds was chosen because risk scores are concentrated at the lower end of the distribution: for both risk scores, 95% of the cohort had a risk score less than 0.20. Persons with risk scores less than or equal to the threshold did not receive any screening or follow-up until their risk scores rose above the threshold. Once a person's risk score rose above the threshold, he or she was assigned to receive screening for moderate albuminuria. If the initial screening was negative, the person received a follow-up screening for moderate albuminuria at a specified interval. Because no optimal interval was defined ex-ante, we tested three intervals: 1 year, 2 years, and 5 years. These intervals were chosen because they have been used in past measures of CKD screening [12], although there is no definitive recommendation for follow-up interval.
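The assignment rule described above can be sketched as a small scheduling function. It assumes one risk score per simulated year, assumes every test is negative, and assumes that follow-up continues at the fixed interval once screening has started; the last two points are simplifications for illustration, and the function name `screening_years` is hypothetical.

```python
from typing import List, Optional


def screening_years(risk_by_year: List[float], threshold: float,
                    interval: int) -> List[int]:
    """Years (0-based) in which a person is screened, assuming negative tests.

    Screening starts in the first year the risk score is strictly above the
    threshold (scores equal to the threshold are not screened) and then
    repeats every `interval` years.
    """
    years: List[int] = []
    next_due: Optional[int] = None
    for year, score in enumerate(risk_by_year):
        if next_due is None:
            if score > threshold:
                years.append(year)
                next_due = year + interval
        elif year == next_due:
            years.append(year)
            next_due = year + interval
    return years


# Illustrative trajectory: risk rises by 0.01 per year from 0.01 to 0.07.
trajectory = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07]
two_year_schedule = screening_years(trajectory, threshold=0.02, interval=2)
```

With this trajectory and a threshold of 0.02, the first screening falls in the year the score reaches 0.03, because a score equal to the threshold does not trigger screening.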

Incremental cost-effectiveness ratios
Lifetime costs and quality-adjusted life years (QALYs) were simulated for the screening scenarios using the Bang et al. and Kshirsagar et al. risk scores described above for a nationally representative cohort aged 30 or older drawn from the 1999-2010 NHANES. The increment we evaluate is a change from the next higher risk score threshold, so each incremental change represents an increase in the number of people screened due to a lower risk score threshold. Lifetime costs and lifetime QALYs for each scenario were compared with those for the next higher risk score threshold to evaluate incremental costs and QALYs. The incremental cost-effectiveness ratios (ICERs) were computed as incremental lifetime cost divided by incremental lifetime QALYs for each screening scenario. Costs and QALYs were discounted at a 3% annual rate, as recommended for all cost-effectiveness analyses by Weinstein et al. [21]. We computed 95% confidence intervals for each ICER using a probabilistic sensitivity analysis in which key model parameters were allowed to vary.
Cost-effectiveness of any screening scenario depends on the specific willingness to pay for additional QALYs. The commonly used benchmark is $50,000 per QALY [22]. A screening scenario was determined to be cost-effective if its ICER was less than the willingness to pay threshold. The optimal scenario was determined as the cost-effective scenario that yielded the highest lifetime QALYs.
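The calculations described above can be sketched in a few lines. The 3% discount rate and the $50,000 per QALY benchmark come from the text; the function names, the convention that year 0 is undiscounted, the handling of scenarios with no QALY gain, and the scenario values are all assumptions made for illustration and are not results from Table 4.

```python
from typing import Dict, List, Optional, Tuple

WILLINGNESS_TO_PAY = 50_000.0  # dollars per QALY, the benchmark in the text


def discounted_total(annual_values: List[float], rate: float = 0.03) -> float:
    """Present value of annual costs or QALYs (assumes year 0 is undiscounted)."""
    return sum(v / (1.0 + rate) ** t for t, v in enumerate(annual_values))


def icers_by_threshold(
    scenarios: Dict[float, Tuple[float, float]]
) -> Dict[float, float]:
    """ICER of each threshold relative to the next higher threshold.

    `scenarios` maps threshold -> (lifetime cost, lifetime QALYs). The highest
    threshold has no comparator in this sketch and is omitted.
    """
    ordered = sorted(scenarios, reverse=True)
    result: Dict[float, float] = {}
    for higher, lower in zip(ordered, ordered[1:]):
        d_cost = scenarios[lower][0] - scenarios[higher][0]
        d_qaly = scenarios[lower][1] - scenarios[higher][1]
        # No QALY gain: treated as not cost-effective in this sketch.
        result[lower] = d_cost / d_qaly if d_qaly > 0 else float("inf")
    return result


def optimal_threshold(scenarios: Dict[float, Tuple[float, float]],
                      wtp: float = WILLINGNESS_TO_PAY) -> Optional[float]:
    """Cost-effective threshold (ICER < wtp) with the highest lifetime QALYs."""
    icers = icers_by_threshold(scenarios)
    cost_effective = [t for t, icer in icers.items() if icer < wtp]
    if not cost_effective:
        return None
    return max(cost_effective, key=lambda t: scenarios[t][1])


# Made-up values for illustration: threshold -> (lifetime cost, lifetime QALYs).
example_scenarios = {
    0.20: (100_000.0, 20.000),
    0.10: (100_100.0, 20.010),
    0.05: (100_250.0, 20.015),
    0.02: (101_000.0, 20.020),
}
best = optimal_threshold(example_scenarios)
```

In this made-up example, the thresholds 0.10 and 0.05 fall below the benchmark while 0.02 does not, so the rule selects 0.05 as the cost-effective scenario with the most QALYs.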

Model validation
The external validity of the model was tested against data from the longitudinal Atherosclerosis Risk in Communities (ARIC) study. The ARIC study tracked persons over approximately 9 years and included 4 office visits to collect laboratory data and health status information. Data from the first ARIC office visit were used to populate the simulation cohort in the model validation. We simulated 9 years in the model for this cohort and generated the distribution of the change in eGFR. The distribution of the actual change in eGFR between the first and last office visit in the ARIC study was compared with the simulated distribution to examine the performance of the model.
Results from validation testing of eGFR progression in the model demonstrated strong model performance. The model simulated a 9.72 mL/min per 1.73 m² average decrease in eGFR over 9 years. In the ARIC data, the actual decrease in eGFR over the 9-year study period was 9.24 mL/min per 1.73 m². The difference between the simulated and actual change in eGFR was not statistically significant (p > 0.05).

Sensitivity analysis
To test the sensitivity of our results and conclusions to the choice of parameters for risks and costs, we conducted a number of one-way sensitivity analyses by varying key model parameters by ±25%: the hazard ratio of ACE inhibitor/ARB treatment on transition from moderate to severe albuminuria, the hazard ratio of ACE inhibitor/ARB treatment on eGFR decline, the hazard ratio of ACE inhibitor/ARB treatment on the annual mortality rate, ACE inhibitor/ARB adherence, the costs of screening, and the costs of ACE inhibitor/ARB treatment. We performed these tests for the optimal screening scenarios for each risk score identified in the main analysis. For each test, we examined the ICER relative to the no screening scenario and determined the percentage change from results in the main analysis. These parameters relate to the benefits and costs of early screening, so varying them tests the sensitivity of results to these benefits and costs. We also conducted probabilistic sensitivity analysis to generate 95% confidence intervals for simulation results.

Results
Table 4 shows the cost-effectiveness of screening using the Bang et al. [15] and Kshirsagar et al. [16] risk scores for various risk score thresholds and screening follow-up frequencies. Using the Bang et al. risk score, lifetime QALYs and costs had only small differences across screening scenarios, but ICERs across the screening scenarios ranged from $8,823 per QALY to $124,626 per QALY. With annual follow-up screening, risk score thresholds of 0.10 or higher had ICERs below the willingness to pay benchmark of $50,000 per QALY. For both the 2-year and 5-year screening follow-up, all risk score thresholds evaluated had ICERs less than the willingness to pay benchmark. Lower risk score thresholds had higher QALYs and in most cases also had higher ICERs than the next higher threshold. Among the cost-effective screening scenarios, a risk score threshold of 0.02 with 2-year follow-up had the highest lifetime QALYs (21.373) with an ICER of $19,116 per QALY.
Using the Kshirsagar et al. risk score (Table 4), lifetime QALYs and costs had only small differences across screening scenarios, but ICERs across the screening scenarios ranged from $5,750 per QALY to $368,000 per QALY. With annual follow-up screening, risk score thresholds of 0.20 and 0.15 had ICERs lower than the willingness to pay benchmark. With 2-year follow-up, risk score thresholds 0.05 and higher had ICERs below the willingness to pay benchmark. With 5-year follow-up, all thresholds had ICERs lower than the willingness to pay benchmark. As with the Bang et al. risk score, lower risk score thresholds had higher QALYs; however, the pattern for ICERs was inconsistent. Among the cost-effective screening scenarios, a risk score threshold of 0.05 with 2-year follow-up screening had the highest lifetime QALYs (21.373) with an ICER of $12,667 per QALY. Comparing the optimal screening scenarios for the two risk scores, both yielded the same level of QALYs, but the optimal screening scenario using the Bang et al. score had a lower lifetime cost and therefore can be considered optimal overall.
Figure 2 shows the results of one-way sensitivity analysis for 25% changes in parameter estimates on the ICER relative to the no screening scenario when using a risk threshold of 0.02 for the Bang et al. risk score with 2-year follow-up screening. Varying the costs of ACE inhibitor/ARB treatment had a large impact on the ICER relative to the no screening scenario. Increasing treatment cost parameters by 25% led to an ICER of $1,066,741 per QALY, and decreasing treatment cost parameters by 25% resulted in cost savings. If treatment costs are higher than the parameters used here, results will be very different. Figure 3 shows a similar sensitivity analysis for the Kshirsagar et al. risk score using a risk threshold of 0.05 and 2-year follow-up screening. As with the Bang et al. risk score, only varying the costs of ACE inhibitor/ARB treatment had a large impact on the ICER relative to the no screening scenario. Increasing treatment cost parameters by 25% led to an ICER of $1,045,704 per QALY, and decreasing treatment cost parameters by 25% resulted in cost savings. Again, if treatment costs are higher than the parameters used here, results will be very different.
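The one-way sensitivity procedure behind Figures 2 and 3 can be sketched as follows: each parameter in turn is scaled by ±25% while the others stay at their base values, and the percentage change in the ICER is recorded. The toy ICER function and all of its numbers are hypothetical and stand in for the full simulation model; they are chosen only to show why an ICER can react strongly to treatment cost when net incremental cost is a small difference between large amounts.

```python
from typing import Callable, Dict


def one_way_sensitivity(model: Callable[[Dict[str, float]], float],
                        base_params: Dict[str, float],
                        change: float = 0.25) -> Dict[str, Dict[str, float]]:
    """Vary one parameter at a time by +/- `change`, others held at base.

    Returns, per parameter, the percentage change in the model output
    relative to the base case for the low and high values.
    """
    base = model(base_params)
    results: Dict[str, Dict[str, float]] = {}
    for name in base_params:
        row: Dict[str, float] = {}
        for label, factor in (("low", 1.0 - change), ("high", 1.0 + change)):
            params = dict(base_params)
            params[name] = base_params[name] * factor
            row[label] = 100.0 * (model(params) - base) / base
        results[name] = row
    return results


def toy_icer(params: Dict[str, float]) -> float:
    """Hypothetical stand-in for the simulation: net cost per QALY gained."""
    averted_costs = 9_000.0  # assumed downstream savings, illustration only
    qaly_gain = 0.05         # assumed incremental QALYs, illustration only
    net_cost = params["screening_cost"] + params["treatment_cost"] - averted_costs
    return net_cost / qaly_gain


toy_params = {"screening_cost": 100.0, "treatment_cost": 9_500.0}
sensitivity = one_way_sensitivity(toy_icer, toy_params)
```

In this toy setting a 25% change in screening cost moves the ICER by only a few percent, whereas a 25% increase in treatment cost multiplies it several times over and a 25% decrease makes net cost negative, which corresponds to cost savings.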

Discussion
The Bang et al. and Kshirsagar et al. risk scores examined here produced, in general, a similar pattern of results. Using lower risk score thresholds when identifying persons for screening led to screening more people, identifying more early cases of moderate albuminuria, and gaining more QALYs. However, because of costs associated with screening more persons, the incremental costs were greater for scenarios with lower risk score thresholds. Less frequent follow-up screening for persons above the risk score threshold whose test was negative mitigated some of these additional costs while preserving most of the QALY gains from early detection. The ICER here summarizes the trade-off between increased QALYs and increased costs from broader screening. Importantly, in these results, for 1-year follow-up screening, ICERs generally increased as risk thresholds decreased, whereas for 2-year and 5-year follow-up, ICERs decreased as risk thresholds decreased. This result illustrates that, with less frequent follow-up, few of the health gains of screening were lost while cost reductions were substantial.
The pattern of change in ICERs when moving to each lower risk score threshold was not always consistent due to different rates of change in the QALYs and costs at each threshold. Although costs and QALYs increased for all lower risk score thresholds, they did not always increase at the same rate. There were some specific differences in results between the two risk scores. The scenarios using the Kshirsagar et al. risk score to identify persons for screening produced slightly greater QALYs, but generally at higher costs than scenarios using the Bang et al. risk score at all thresholds. This could be because, for each risk score threshold, the Kshirsagar et al. risk score classifies more persons as high risk. This would lead to more persons screened and more QALYs gained, albeit at greater cost.
This study showed that several screening scenarios were cost-effective for the given willingness to pay benchmark. Among all the cost-effective screening scenarios, the Kshirsagar et al. risk score with a threshold of 0.05 and 2-year follow-up screening and the Bang et al. risk score with a threshold of 0.02 and 2-year follow-up screening generated the same maximum level of lifetime QALYs (21.373), but the scenario using the Bang et al. risk score had lower lifetime costs than the one using the Kshirsagar et al. risk score ($139,997 vs. $140,022) and can therefore be considered optimal. However, the difference in cost between these two screening scenarios is small ($25), so a clinician could reasonably use either, depending on which patient data are available to construct the risk score. Lifetime QALYs and costs also had only small differences across all screening scenarios using both risk scores, so each lower threshold adds little in costs or QALYs. This should encourage caution when choosing a particular risk score threshold, especially because the confidence intervals around lifetime costs and QALYs are relatively large.
For clinicians, this means that each patient could be evaluated using the Bang et al. risk score and screened for moderate albuminuria if the risk score is greater than 0.02. For example, persons older than age 50; those of any age with diabetes, hypertension, and anemia; or those of any age with a history of CVD would be candidates for CKD screening (Table 2). If the screening test is positive, the clinician would proceed with treatment; if negative, the clinician would conduct a follow-up screening in 2 years.
Past studies have found that screening the broad population for CKD may not be cost-effective, but screening populations at high risk, such as persons with diabetes or hypertension, may be cost-effective [12][13][14]. This study builds upon this past work by using risk scores to identify persons in the broader population to receive CKD screening. These risk scores rely not only on diabetes and hypertension, but also on age, gender, and health history, including CVD and anemia. Table 2 illustrates how this method of screening with risk scores leads to screening higher risk persons based on combinations of age, gender, CVD history, and anemia. These are persons who would not have been screened based on previous research showing that only screening those with diabetes or hypertension is cost-effective [12][13][14]. Using CKD risk scores allows for the examination of various thresholds for screening, which dichotomous criteria, such as history of diabetes or hypertension, do not. The information from this study could be used to frame future recommendations and programs for CKD screening that are effective from both a clinical and a cost perspective.
This analysis is limited by the need to make assumptions regarding costs and other model parameters, such as that all patients will present for initial and follow-up screening and be offered and accept treatment, and that all providers will use risk scores and follow screening guidelines. The analysis only included medical costs and potentially omitted important societal costs, such as opportunity costs of time for screening, which would raise the ICERs associated with screening, and productivity losses and long-term care costs, which would decrease the ICERs. In addition, although model parameters were based on current epidemiologic literature, they may be imperfect or may omit additional unknown factors.

Conclusions
In summary, the Bang et al. CKD risk score with a threshold of 0.02 and 2-year follow-up was found to be the most cost-effective for CKD screening. In contrast with current approaches for CKD screening that rely only on identifying high risk persons with diabetes or hypertension, CKD risk scores could be used by clinicians to identify a broader population for CKD screening. This is an important tool for increasing awareness of CKD and CKD screening in patients and clinicians. In particular, people with CKD who are detected in earlier stages of the disease would consequently benefit from receiving earlier clinical management and treatment to potentially slow down progression and prevent or delay ESRD.