  • Research article
  • Open access

Predicting the prevalence of chronic kidney disease in the English population: a cross-sectional study



There is concern that not all cases of chronic kidney disease (CKD) are known to general practitioners, leading to an underestimate of its true prevalence. We carried out this study to develop a model to predict the prevalence of CKD using a large English primary care dataset which includes previously undiagnosed cases of CKD.


Cross-sectional analysis of data from the Quality Improvement in CKD trial, a representative sample of 743 935 adults in England aged 18 and over. We created multivariable logistic regression models to identify important predictive factors.


A prevalence of 6.76% was recorded in our sample, compared to a national prevalence of 4.3%. Increasing age, female gender and cardiovascular disease were associated with a significantly increased prevalence of CKD (p < 0.001 for all). Age had a complex association with CKD. Cardiovascular disease was a stronger predictive factor in younger than in older patients. For example, hypertension has an odds ratio of 2.02 amongst patients aged over 50 and an odds ratio of 3.91 amongst patients aged below 50.


In England many cases of CKD remain undiagnosed. It is possible to use the results of this study to identify areas with high levels of undiagnosed CKD and groups at particular risk of having CKD.

Trial registration

Current Controlled Trials ISRCTN: ISRCTN56023731. Note that this study reports the results of a cross-sectional analysis of data from this trial.



Chronic Kidney Disease (CKD) is largely asymptomatic [1]. Early identification affords opportunities to prevent and delay disease progression [2]. The potential benefits of active management include: reducing mortality and morbidity from cardiovascular diseases; slowing progression to renal failure amongst patients with proteinuric disease; improving the quality of life for patients with more severe symptomatic disease; and reducing the use of resources and costs for health services [1].

Within England, CKD is included in a national pay-for-performance (P4P) scheme for chronic disease management. However, identification of CKD relies on opportunistic testing, and there is evidence that not everyone with CKD is being identified through the P4P scheme. Data from the Health Survey for England give a national prevalence of 6% [3], whereas the corresponding estimate from the P4P scheme is 4.3% [4]. Because of this difference, modelled estimates of the prevalence of CKD are required to support case-finding for CKD. These would enable public-health practitioners to identify and target areas where CKD is under-detected and where there is therefore a need to promote awareness of its importance and to improve existing local methods for identifying individuals at risk of CKD, such as testing based on currently recommended risk factors [1].

There are existing models that may be used to estimate the prevalence of CKD for an area [5–8]. However, these models all have limitations: none of them use data from English patients and none check for interactions between variables.

We carried out this study to create a prevalence model for CKD. We used routinely collected data and a novel method to identify patients with CKD that has not been identified under the P4P scheme [9].



The Quality Improvement in CKD (QICKD - ISRCTN56023731) trial [10, 11] population includes a representative large sample of patients with CKD stages 3 to 5 in England [12]. The primary aim of the QICKD trial is to compare quality improvement interventions aimed at lowering systolic blood pressure in patients with CKD in primary care; ethical approval has been given for secondary analyses of the data. Ethics approval was received from the Oxford Research Ethics Committee (Committee C) (ref: 07/H0606/141). Here, the QICKD dataset was used to model the association between the prevalence of CKD and its potentially explanatory factors. This dataset contained patient-level data, extracted from the computer systems of 129 English general practices (GP), based in London, Surrey, Leicester, Birmingham, Cambridge and Sussex. The full dataset contained information on 930,997 people, of whom 743,935 are aged 18 or over. We only included people aged 18 or over to be consistent with the P4P scheme. The data are cross-sectional, extracted in 2009.

Cases of CKD stages 3 to 5 were strictly defined in accordance with the 2002 K-DOQI classification [13] on the basis of an estimated glomerular filtration rate (eGFR) of less than 60 ml/min/1.73 m² for at least 90 days. Laboratories in England report eGFR using the four-variable Modification of Diet in Renal Disease (MDRD) formula [14], with correction factors applied under the guidance of the National External Quality Assessment Service [15] to account for differences in local creatinine assays. People without a serum creatinine measurement in their electronic record were assumed for the purposes of the analysis not to have CKD.
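The chronicity requirement in this case definition can be illustrated with a short sketch. This is not the case-finding method of reference [9]; it is one simple reading of the rule, assuming that a patient qualifies when two eGFR readings below 60 lie at least 90 days apart with no reading of 60 or above in between. The function name and the input layout (a list of date and eGFR pairs) are hypothetical.

```python
from datetime import date

EGFR_THRESHOLD = 60.0  # ml/min/1.73 m², threshold quoted in the text
MIN_DAYS = 90          # minimum duration quoted in the text


def has_ckd_stage_3_to_5(readings):
    """Return True if eGFR stayed below the threshold for at least MIN_DAYS.

    `readings` is a list of (date, egfr) pairs (hypothetical layout).
    A reading at or above the threshold resets the run of low readings.
    An empty list returns False, mirroring the assumption that people
    without a creatinine measurement do not have CKD.
    """
    run_start = None
    for when, egfr in sorted(readings):
        if egfr >= EGFR_THRESHOLD:
            run_start = None          # normal reading: start again
        elif run_start is None:
            run_start = when          # first low reading of a new run
        elif (when - run_start).days >= MIN_DAYS:
            return True               # low readings span 90 days or more
    return False


persistent = [(date(2008, 1, 1), 55.0), (date(2008, 6, 1), 52.0)]
interrupted = [(date(2008, 1, 1), 55.0), (date(2008, 3, 1), 65.0),
               (date(2008, 6, 1), 52.0)]
```

In this sketch `persistent` meets the rule, while `interrupted` does not because the normal reading in March breaks the run.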

We considered 11 potentially explanatory variables. These may be loosely classified as socio-demographic variables and variables about the presence or absence of cardiovascular disease. The socio-demographic variables were: age of subject (in years), gender, ethnicity, smoking status and deprivation score. Ethnicity was based on the 2001 England and Wales Census ‘5 + 1’ categories: ‘Asian’, ‘Black’, ‘Mixed’, ‘White’, ‘Other’, and ‘Not Stated’ [16]. There were two additional categories: ‘Not recorded’ occurred when there was an explicit code stating that ethnicity was not recorded, and ‘Missing’ was for missing ethnicity data. Smoking status was recorded as ‘never smoked’, ‘ex-smoker’ or ‘smoker’, or it may be missing. Ethnicity and smoking status were both modelled using dummy variables. Deprivation score was a continuous variable from the 2007 index of multiple deprivation [17], and was based on the patient’s postcode [18].

The cardiovascular diseases considered were: diabetes, ischaemic heart disease (IHD), heart failure, hypertension, peripheral vascular disease (PVD) and stroke. These were modelled using dichotomous indicators. Data on systolic and diastolic blood pressure were available, but were not used due to high levels of missing data (23% of values were missing).

Statistical analyses

Multivariable logistic regression was used to model the dependency of having CKD on the potentially explanatory factors. Model building was mainly based on the recommendations of Hosmer and Lemeshow [19]. Briefly, variables that showed a significant univariate association with CKD were included in a multivariable model. Manual backwards elimination was then applied, dropping non-significant variables one at a time. When no more variables could be deleted, a check was made to see if any variables could be included. This gave a preliminary main-effects model. We checked if transformations were required for any continuous variables in this model, and considered possible interactions. We only considered interactions with age, as this was known to be a strong predictor of CKD prevalence [5–7] and important interactions with age are often identified [20]. We used sample splitting to validate this approach; the sample was randomly split into two sub-sets (of approximately equal size), and the model-building process applied to both sub-sets.
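The sample-splitting step can be sketched as follows. The analysis itself was run in STATA; this Python fragment is only an illustration, and the function name, the seed and the use of patient identifiers are assumptions.

```python
import random


def split_sample(patient_ids, seed=2009):
    """Randomly split identifiers into two sub-sets of roughly equal size.

    The fixed seed (an arbitrary, hypothetical choice) makes the split
    reproducible, so the same two sub-sets are used for model building
    and for out-of-sample evaluation.
    """
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


# Toy identifiers 1..10 stand in for real patient records.
subset_a, subset_b = split_sample(range(1, 11))
```

The model-building process would then be applied to each sub-set separately and the resulting models compared.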

Due to the large sample size, nearly all variables were statistically significant with p-values less than 0.001. We therefore used clinical significance instead; for our study a variable was defined as clinically significant if its odds ratio (OR) was either above 1.49 or below 0.67. These values are derived from published CKD guidelines [1], in which it is stated that a rise in serum creatinine of over 20% should be considered significant. We checked for interactions by plotting the prevalence of CKD against age and stratifying by the levels of each factor (deprivation was categorised into quintiles). We observed a non-linear interaction between age and the cardiovascular diseases. This was modelled by introducing a dummy variable ‘below age 50’ which takes the value 0 if patients are below the age of 50 years and 1 otherwise, and including its interaction with the cardiovascular diseases.
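The clinical-significance rule and the age dummy described above can be written down in a few lines. This is a minimal sketch rather than the authors' code: the thresholds and the cut point of 50 years come from the text, while the function names are hypothetical.

```python
OR_UPPER = 1.49  # upper threshold quoted in the text
OR_LOWER = 0.67  # lower threshold quoted in the text


def is_clinically_significant(odds_ratio):
    """True if the odds ratio lies outside the interval [0.67, 1.49]."""
    return odds_ratio > OR_UPPER or odds_ratio < OR_LOWER


def age_dummy(age_years, cut_point=50):
    """0 for patients below the cut point and 1 otherwise, as in the text."""
    return 0 if age_years < cut_point else 1


def interaction_term(age_years, has_disease):
    """Product of the age dummy and a 0/1 disease indicator."""
    return age_dummy(age_years) * int(has_disease)
```

For example, `is_clinically_significant(2.02)` is true and `is_clinically_significant(1.2)` is false, while `interaction_term(72, True)` is 1 and `interaction_term(30, True)` is 0.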

We constructed a ‘clinical’ model (considering all the potentially explanatory variables) and a ‘parsimonious’ model which considered just age and gender as it was noted that sometimes these are the only variables for which data are available. We also present the results from the full main-effects model, as the use of this is sometimes recommended in the literature [20, 21]. We used STATA version 10.1 [22] for all analyses.

Missing values for smoking status and ethnicity were treated as separate categories. Individuals with missing blood pressure readings (23%) were assumed not to have hypertension. Missing data for deprivation (19%) were due to a computer error during data collection, and so are assumed to be missing completely at random and were imputed by using a single implementation of the ICE procedure [23]. There were no missing data for any of the other variables.

Models were compared based on both Akaike’s and the Bayesian information criteria (AIC and BIC respectively) [24]; models with lower values were preferred. We performed a residual analysis using the deviance residuals to check goodness of fit. The Hosmer-Lemeshow test [19] may also be used to formally check goodness of fit. However, this measure is known to be of limited use for large sample sizes [25], so a graphical alternative was used: predicted and observed CKD prevalence were compared using deciles of predicted values [26, 27]. The ability of the models to correctly classify patients was summarised by their ‘area under the receiver-operating-characteristic curve’ (AUROC) [28], along with their sensitivity, specificity, positive predictive value and negative predictive value. These values are known to be biased (they are too optimistic about model performance) when calculated on the same data on which the model was built. To avoid this, we used a model built on one sub-set of the data, and calculated the statistics on the remaining sub-set.
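The classification measures named above can be computed directly from predicted probabilities and observed outcomes. The sketch below is illustrative only: the toy scores and labels are invented, the cut-off of 0.5 is an arbitrary assumption, and the pairwise AUROC calculation is quadratic in the sample size, so it would not be practical for a dataset of this scale.

```python
def auroc(scores, labels):
    """Chance that a random case scores higher than a random non-case.

    Ties count as half a win (the Mann-Whitney formulation of the AUROC).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def classification_summary(scores, labels, cut_off):
    """Sensitivity, specificity, PPV and NPV at a probability cut-off.

    Assumes every cell total is non-zero; otherwise a division by zero occurs.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_case = score >= cut_off
        if predicted_case and label == 1:
            tp += 1
        elif predicted_case:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Invented toy data: predicted probabilities and observed CKD status.
toy_scores = [0.9, 0.8, 0.35, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auroc = auroc(toy_scores, toy_labels)
toy_summary = classification_summary(toy_scores, toy_labels, cut_off=0.5)
```

To avoid the optimism described above, the scores passed to these functions would come from a model fitted on the other sub-set of the data.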

Additional analyses

The model-building process was repeated separately for CKD that had and had not been identified under the P4P scheme. Descriptive statistics were also produced for these two CKD classifications.


Of the 743 935 patients, 50 321 had CKD, giving a prevalence in the adult population of 6.76%. The mean age of the population was 46.7 years.

Variations in the prevalence of CKD were observed for all of the potentially explanatory factors (Table 1). With the exception of gender and the missing levels of both smoking status and ethnicity, all the univariable odds ratios are shrunk towards unity when controlled for differences in age. This shrinkage is the most notable for the cardiovascular diseases, for example the univariable odds ratio for heart failure changes from 16.07 to 2.78 after controlling for age.

Table 1 Summary statistics of co-variables used in the analysis

To reach the clinical model we applied the manual stepwise method, with the following exception:

  • It was not possible to apply clinical significance to multi-categorical variables (ethnicity and smoking status). These variables only had one significant level, which related to missing data. After examining patterns of missing data, it was decided that smoking (but not ethnicity) data were not missing at random. Briefly, individuals with missing smoking status had a very low recorded prevalence of CKD and of all cardiovascular diseases, and were also more likely to have a missing ethnicity. This suggests that the observed prevalence amongst subjects with missing smoking status is biased downwards, possibly due to a lack of GP contact. The remaining smoking categories were not clinically significant, and so smoking status was dropped.

Using the sample splitting approach, the final model in both sub-sets was the same. Hence these were pooled, and the model re-estimated based on all of the data. Summary measures of classification for the model are based on the sample-splitting approach. For the clinical model the variables deprivation score, PVD, stroke and smoking status were excluded.

A graphical check of the functional form for age [19] indicated that a quadratic term was required. This was confirmed by residual analysis and was also noticeable in the graphs constructed to check for interactions with age (Figure 1). Including the quadratic also reduced both information criteria.

Figure 1

Observed prevalence of CKD by age. Results are stratified by diabetes status (left-hand pane) and hypertension status (right-hand pane).

For all the cardiovascular diseases an interaction with age was observed, which appears to begin at the same age; two examples are shown in Figure 1. Because of this consistency, an interaction with age was included for every cardiovascular disease in the final clinical model, even though this interaction is only clinically significant for hypertension and is not statistically significant for IHD and heart failure. The lack of statistical significance is likely to be due to the small numbers of people with the disease who are below average age.

Results for the clinical and the parsimonious model are presented in Table 2. Increasing age, female gender and white ethnicity were associated with a significantly increased prevalence of CKD, as was the presence of a cardiovascular disease. These cardiovascular diseases were stronger predictive factors in younger than in older patients. For example, using the results from the clinical model, the odds ratio for CKD associated with hypertension is 2.02 amongst patients aged over 50 and 3.91 amongst patients aged below 50. Heart failure is associated with an odds ratio of 2.31 amongst older patients and 3.14 amongst younger patients.
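The two age-specific odds ratios for a disease correspond to a main effect and an interaction term on the log-odds scale. The sketch below shows that arithmetic using the odds ratios reported above. Taking the under-50 group as the reference follows the dummy coding described in the methods (0 below age 50), and the function and constant names are hypothetical; the coefficients printed in Table 2 are not reproduced here.

```python
import math

# Odds ratios reported in the text for the clinical model.
OR_HYPERTENSION_UNDER_50 = 3.91
OR_HYPERTENSION_OVER_50 = 2.02
OR_HEART_FAILURE_YOUNGER = 3.14
OR_HEART_FAILURE_OLDER = 2.31


def log_odds_coefficients(or_reference, or_other):
    """Split two stratum odds ratios into main-effect and interaction terms.

    The main effect is the log odds ratio in the reference group; the
    interaction is the extra log odds ratio applying in the other group.
    """
    main_effect = math.log(or_reference)
    interaction = math.log(or_other) - main_effect
    return main_effect, interaction


def interaction_odds_ratio(or_reference, or_other):
    """Ratio of the two stratum odds ratios (the interaction odds ratio)."""
    return or_other / or_reference


hypertension_ratio = interaction_odds_ratio(
    OR_HYPERTENSION_UNDER_50, OR_HYPERTENSION_OVER_50)
heart_failure_ratio = interaction_odds_ratio(
    OR_HEART_FAILURE_YOUNGER, OR_HEART_FAILURE_OLDER)
```

Under these assumptions the interaction odds ratio is about 0.52 for hypertension and about 0.74 for heart failure, which is consistent with the statement that the interaction is clinically significant (below 0.67) only for hypertension.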

Table 2 Final multivariable logistic regression model for chronic kidney disease

Summary measures and graphs for the three models suggested that they all fit the data well. There was very little difference between the in-sample and the out-of-sample classification measures. This suggests that the optimism due to building and evaluating a model on the same data is almost negligible for this analysis, possibly due to the large sample size. Out-of-sample AUROC scores were 0.898 (full model), 0.898 (clinical model) and 0.889 (parsimonious model) (Table 3). Graphs comparing observed and expected deciles of risk were similar for all three models; only those for the full and clinical model are shown (Figure 2). These graphs have been plotted on a log-scale, and show that use of the full model systematically underestimates CKD prevalence for those with low prevalence. This bias is mostly removed by the use of the clinical model, suggesting that it is due to the omission (in the full model) of the interactions between age and cardiovascular diseases.

Table 3 Summary measures of the regression models considered
Figure 2

Comparison of observed and expected probabilities of having CKD, plotted on the log-scale. Results are presented for the full model (left-hand pane) and the clinical model (right-hand pane).
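The comparison behind Figure 2 groups patients by their predicted probability and sets the mean prediction in each group against the observed proportion with CKD. A minimal sketch of that grouping is given below; the function name and the toy data are invented, and the plotting step (including the log scale) is omitted.

```python
def calibration_by_decile(predicted, observed, groups=10):
    """Return (mean predicted, observed proportion) for each risk group.

    Patients are ranked by predicted probability and cut into `groups`
    consecutive chunks of near-equal size; empty chunks are skipped.
    """
    ranked = sorted(zip(predicted, observed))
    n = len(ranked)
    rows = []
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed_rate = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_predicted, observed_rate))
    return rows


# Invented toy data: 20 patients with predictions 0.00, 0.05, ..., 0.95.
toy_predicted = [i / 20 for i in range(20)]
toy_observed = [0] * 10 + [1] * 10
toy_rows = calibration_by_decile(toy_predicted, toy_observed)
```

A well-calibrated model gives pairs that lie close to the line of equality; systematic underestimation in the low-risk groups would show up as observed proportions above the mean predictions in the first few rows.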

Applying the model-building process to just patients with identified CKD gave similar results to using all cases of CKD; the main difference was the inclusion of diabetes. However, when it was applied to patients with unidentified CKD the resulting model was very different. None of the cardiovascular diseases were clinically significant predictors, whilst being of an Asian or Black ethnicity was a much stronger predictor of not having CKD (Additional file 1 and Additional file 2). More research is required into why these differences arise.


We have developed new models to give accurate predictions of CKD. Increasing age, female gender, white ethnicity and cardiovascular disease were all associated with an increased prevalence of CKD. In addition, we have also shown that there is a complex association with age which in turn interacts with cardiovascular disease. The effects of these diseases were greater amongst younger than older adults. The pattern of this interaction was very similar for all the cardiovascular diseases.

The results of our study support those previously published by confirming the important roles of age, gender [5–8, 29, 30] and CVD [31–33] in predicting cases of CKD. We also found statistically significant associations with deprivation and ethnicity, but these were not clinically significant; this may explain why there is weak or mixed evidence on their importance in predicting the prevalence of CKD [34].

There are four studies that look at multivariable models for predicting the prevalence of CKD [5–8]. ORs for female gender varied between 1.19 and 1.49. All four studies considered the effect of hypertension and diabetes; for the former ORs ranged from 1.4 to 1.72, whilst for the latter ORs ranged from 0.9 to 2.68. None of the studies considered interactions between the cardiovascular diseases and age.

Both Bang et al. [5] and Whaley-Connell et al. [7] considered ethnicity. Bang et al. [5] found that, compared to non-Whites, Whites had a statistically significant univariable odds ratio (2.1, p = 0.03) of having CKD, but that ethnicity was not a significant predictor in the multivariable model. Whaley-Connell et al. [7] considered two different cohorts; in one White ethnicity was associated with a statistically significant odds ratio of 1.23 (p < 0.001), in the other it had a non-significant odds ratio of 0.91 (p = 0.2).

All four studies confirm the important effect of age; Chadban et al. [6] compared subjects aged under 65 to those aged over 65 and reported an odds ratio of 102 (p < 0.001). The other three studies categorised age, and reported significant odds ratios for all categories. Our study further shows that age has a complex association with CKD.

The results of this model are also consistent with cohort studies of CKD in showing that there are interactions between CKD, age and cardiovascular disease [35]. Other interactions between age and cardiovascular disease have also been reported in the literature [20, 36].

This is the first study that we are aware of that provides multivariable models for predicting the prevalence of CKD in England. Our results are similar to those based in other countries in identifying important variables, but the magnitude of the associations often varies. For example, we found an odds ratio for female gender of about two for all three models, a larger value than that reported in the other studies.

We have identified important interactions between age and cardiovascular disease in predicting the prevalence of CKD. These interactions have not been included in any of the existing models for predicting the prevalence of CKD (or in models for predicting the incidence of CKD [29, 30]) despite evidence of their importance in the literature. Service planning based on existing models, which fail to capture these interactions, may result in a mismatch between supply and demand for renal services in primary and secondary care. More accurate predictions of CKD prevalence may allow more accurate targeting of resources toward areas of unmet need. A particular strength of our study is the large sample size available. This resulted in increased power to estimate coefficients, especially for interactions. The large sample size, along with the consistency of findings when employing sample-splitting, suggests that the interactions identified in this study will generalise to the rest of the CKD population in England.

The QICKD study includes patients whose CKD has not been diagnosed in general practice, and so these estimates may be compared with the P4P CKD indicator to determine areas with high levels of unmet need. At a national level, the P4P indicator in England gives a prevalence of CKD of 4.3% [4], whereas we reported a prevalence of 6.76%, suggesting that over a third of people with CKD are not known to their GP. This confirms findings in the recent Health Survey for England, which also included cases of CKD not diagnosed in general practice and reported a prevalence of 6% [3]. Analysis of patients with unidentified CKD suggests that their risk profile may differ from that of patients with identified CKD; this is an area that requires further research.
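The arithmetic behind the "over a third" statement can be checked with the figures reported in this paper. The sketch below uses the sample counts from the results and the P4P prevalence quoted above; the function names are hypothetical, and treating the two prevalences as directly comparable is a simplifying assumption.

```python
ADULTS_IN_SAMPLE = 743_935   # adults aged 18 or over in the sample
CKD_CASES = 50_321           # adults with CKD stages 3 to 5
P4P_PREVALENCE = 0.043       # national P4P register prevalence [4]


def prevalence(cases, population):
    """Proportion of the population classified as cases."""
    return cases / population


def share_not_on_register(study_prevalence, register_prevalence):
    """Fraction of cases implied to be missing from the register."""
    return (study_prevalence - register_prevalence) / study_prevalence


study_prevalence = prevalence(CKD_CASES, ADULTS_IN_SAMPLE)
unrecorded_share = share_not_on_register(study_prevalence, P4P_PREVALENCE)
```

Here `study_prevalence` rounds to 6.76% and `unrecorded_share` is roughly 0.36, that is, a little over a third.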


Using cross-sectional data is a limitation, as it is known that rates of progression vary by patient characteristics [1]. The results of this analysis may be used to identify areas with a high prevalence of CKD, where early identification will be beneficial in reducing both progression to renal failure and morbidity from cardiovascular disease. However, when targeting resources for CKD, consideration should also be given to variations in rates of progression across populations. We also made no distinction between varying levels of kidney disease. The available literature suggests that the risk profile for CKD may vary as kidney disease progresses; for example it has been shown that the proportion of males with CKD increases with worsening stage [37], and a recent study found that non-white ethnicity was a significant predictor of renal replacement therapy [38]. As renal failure can be devastating for the patient and very expensive [1], more research is required into rates of progression.

We have assumed that people without a serum creatinine measurement did not have CKD. Whilst this is consistent with previous approaches [39], there was no measurement recorded for 56% of the sample. Hence the prevalence of CKD reported here is likely to be an under-estimate.

The choice to use clinical significance instead of statistical significance posed some problems. In particular the choice of whether or not to include the multi-categorical variables ethnicity and smoking status was slightly arbitrary. The importance of all the omitted variables warrants further research. For the continuous variables the value of the odds ratio (and hence their clinical significance) depends on the units reported. We used the odds ratio per 10-year increase in age, which is commonly employed in the literature [5, 8, 29, 30, 34]. For deprivation we used a 10-point increase (deprivation values range between 0.75 and 77.37). Using the results from the full main-effects model, we would need to use a 45-point increase in deprivation for it to become clinically significant, and a 5-year increase in age for it to become not clinically significant.
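The dependence of an odds ratio on the reporting unit follows from the log-linear form of the model: an odds ratio for an increase of k units is the per-unit odds ratio raised to the power k. The sketch below illustrates this with hypothetical odds ratios (2.0 per 10 years of age and 1.10 per 10 deprivation points); these are not the estimates from the full main-effects model, and the function names are assumptions.

```python
import math

CLINICAL_THRESHOLD = 1.49  # upper clinical-significance threshold from the text


def rescale_odds_ratio(or_per_base, base_units, new_units):
    """Odds ratio for `new_units`, given the odds ratio for `base_units`."""
    return or_per_base ** (new_units / base_units)


def units_to_reach(or_per_base, base_units, target_or):
    """Size of increase at which the odds ratio equals `target_or`."""
    return base_units * math.log(target_or) / math.log(or_per_base)


# Hypothetical values, chosen only to illustrate the arithmetic.
or_age_per_5_years = rescale_odds_ratio(2.0, base_units=10, new_units=5)
deprivation_points_needed = units_to_reach(
    1.10, base_units=10, target_or=CLINICAL_THRESHOLD)
```

With these illustrative inputs, an odds ratio of 2.0 per 10 years becomes about 1.41 per 5 years, which falls below the 1.49 threshold, and an odds ratio of 1.10 per 10 deprivation points would need an increase of about 42 points to reach it.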

We did not anticipate a priori the nature of the observed interactions between age and the cardiovascular diseases, and this feature needs to be independently confirmed. In addition, there is scope to improve the modelling of this interaction; notably, the choice of the age at which to start modelling the interaction warrants further research.


CKD is largely asymptomatic, making accurate identification and subsequent management of patients at risk of progression difficult. However, identification is important because progression of patients to symptomatic disease impairs their quality of life and results in increased costs for health services. Although included within the P4P scheme, it is recognised that CKD is under-ascertained within primary care in England.

We have developed disease prevalence models for CKD that will allow decision makers to identify areas where the P4P rates are lower than expected and target these for possible public health interventions. The results of this study may also be used to identify sub-groups or patient profiles in whom the demands for renal services and treatment may be increased, such as young people with a cardiovascular disease.



Abbreviations

  • AIC: Akaike’s information criterion
  • AUROC: Area under the receiver operating characteristic curve
  • BIC: Bayesian information criterion
  • CKD: Chronic kidney disease
  • DF: Degrees of freedom
  • eGFR: Estimated glomerular filtration rate
  • IHD: Ischaemic heart disease
  • P4P: Pay for performance
  • PVD: Peripheral vascular disease
  • QICKD: Quality improvement in chronic kidney disease


  1. National Collaborating Centre for Chronic Conditions: Chronic kidney disease: national clinical guideline for early identification and management in adults in primary and secondary care. 2008, London: Royal College of Physicians

    Google Scholar 

  2. Department of Health: The National Service Framework for Renal Services – Part Two: Chronic Kidney Disease, Acute Renal Failure and End of Life Care. 2005, London: Department of Health Renal National Service Framework Team

    Google Scholar 

  3. Roderick P, Roth M, Mindell J: Prevalence of chronic kidney disease in England: findings from the 2009 health survey for England. J Epidemiol Commun H. 2011, 65 (Suppl I): A12-

    Article  Google Scholar 

  4. QOF. 2010,, /11 data tables,

  5. Bang H, Vupputuri S, Shoham DA, Klemmer PJ, Falk RJ, Mazumdar M, Gipson D, Colindres RE, Kshirsagar AV: Screening for occult renal disease (SCORED): a simple prediction model for chronic kidney disease. Arch Intern Med. 2007, 167 (4): 374-381. 10.1001/archinte.167.4.374.

    Article  PubMed  Google Scholar 

  6. Chadban SJ, Briganti EM, Kerr PG, Dunstan DW, Welborn TA, Zimmet PZ, Atkins RC: Prevalence of kidney damage in Australian adults: the AusDiab kidney study. J Am Soc Nephrol. 2003, 14 (7:Suppl 2): S131-S138.

    Article  PubMed  Google Scholar 

  7. Whaley-Connell AT, Sowers JR, Stevens LA, McFarlane SI, Shlipak MG, Norris KC, Chen SC, Qiu Y, Wang C, Li S, Vassalotti JA, Collins AJ: CKD in the united states: kidney early evaluation program (KEEP) and national health and nutrition examination survey (NHANES) 1999–2004. Am J Kidney Dis. 2008, 51 (suppl 2): S13-S20. 4

    Article  CAS  PubMed  Google Scholar 

  8. Kwon KS, Bang H, Bomback AS, Koh DH, Yum JH, Lee JH, Lee S, Park SK, Yoo KY, Park SK, Chang SH, Lim HS, Choi JM, Kshirsagar AV: A simple prediction score for kidney disease in the Korean population. Nephrology. 2012, 17 (3): 278-284. 10.1111/j.1440-1797.2011.01552.x.

    Article  PubMed  Google Scholar 

  9. de Lusignan S, Chan T, Stevens P, O’Donoghue D, Hague N, Dzregah B, Van Vlymen J, Walker M, Hilton S: Identifying patients with chronic kidney disease from general practice computer records. Fam Pract. 2005, 22 (3): 234-241. 10.1093/fampra/cmi026.

    Article  PubMed  Google Scholar 

  10. de Lusignan S, Gallagher H, Chan T, Thomas N, van Vlymen J, Nation M, Jain N, Tahir A, du Bois E, Crinson I, Hague N, Reid F, Harris K: The QICKD study protocol: a cluster randomised trial to compare quality improvement interventions to lower systolic BP in chronic kidney disease (CKD) in primary care. Implement Sci. 2009, 4: 39-10.1186/1748-5908-4-39.,

    Article  PubMed  PubMed Central  Google Scholar 

  11. de Lusignan S, Tomson C, Harris K, van Vlymen J, Gallagher H: Creatinine fluctuation has a greater effect than the formula to estimate glomerular filtration rate on the prevalence of chronic kidney disease. Nephron Clin Pract. 2010, 117: 213-224.

    Article  Google Scholar 

  12. de Lusignan S, Gallagher H, Jones S, Chan T, van Vlymen J, Tahir A, Thomas N, Jain N, Dmitrieva O, Rafi I, McGovern A, Harris K: Using audit-based education to lower systolic blood pressure in chronic kidney disease (CKD): results of the quality improvement in CKD (QICKD) trial [ISRCTN:. 56023731, ]. Accepted for Publication Kidney International, 13th January 2013, Ref: KI-06-12-0863.R3

    Google Scholar 

  13. Hogg RJ, Furth S, Lemley KV, Portman R, Schwartz GJ, Coresh J, Balk E, Lau J, Levin A, Kausz AT, Eknoyan G, Levey AS: National Kidney Foundation practice guidelines for chronic kidney disease: evaluation, classification, and stratification. Ann Intern Med. 2003, 139: 137-147.

    Article  PubMed  Google Scholar 

  14. Levey AS, Bosch JP, Lewis JB, Greene T, Rodgers N, Roth D: A more accurate method to estimate glomerular filtration rate from serum creatinine: a new prediction equation. Ann Intern Med. 1999, 130: 461-470.

    Article  CAS  PubMed  Google Scholar 

  15. United Kingdom National External Quality Assessment Service (UK NEQAS) - Group website.,

  16. Department of Health: A practical guide to ethnic monitoring in the NHS and social care: Annex D – Detailed breakdown of the ONS 2001 census codes for ethnic group. 2007, London: Department of Heath,,

    Google Scholar 

  17. Communities and Neigbourhoods: Indices of Deprivation 2007. 2007, London: Department for Communities and Local Government,,

    Google Scholar 

  18. de Lusignan S, Nitsch D, Belsey J, Kumarapeli P, Vamos EP, Majeed A, Millett C: Disparities in testing for renal function in UK primary care: cross-sectional study. Fam Pract. 2011, 28 (6): 638-646. 10.1093/fampra/cmr036.

    Article  PubMed  Google Scholar 

  19. Hosmer D, Lemeshow S: Applied Logistic Regression. 2000, New York, NY: John Wiley & Sons Inc, 2

    Book  Google Scholar 

  20. Harrell F, Lee K, Mark D: Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996, 15 (4): 361-387. 10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4.

    Article  PubMed  Google Scholar 

  21. Robins J, Greenland S: The role of model selection in causal inference from non-experimental data. Am J Epidemiol. 1986, 123 (3): 392-402.

    CAS  PubMed  Google Scholar 

  22. STATACORP Stata Statistical Software: Release 10. 2007, College Station, TX: StataCorp LP,,

  23. Royston P: Multiple imputation of missing values: update of ice. Stata J. 2005, 5 (4): 527-536.

    Google Scholar 

  24. Buckland ST, Burnham KP, Augustin NH: Model selection: an integral part of inference. Biometrics. 1997, 53 (2): 603-618. 10.2307/2533961.

    Article  Google Scholar 

  25. Campbell M: Statistics at Square Two: Understanding Modern Statistical Applications in Medicine. 2006, Oxford, UK: Wiley

    Book  Google Scholar 

  26. Pepe MS, Feng Z, Huang Y, Longton G, Prentice R, Thompson IM, Zheng Y: Integrating the predictiveness of a risk marker with its performance as a classifier. Am J Epidemiol. 2008, 167: 362-368.

    Article  PubMed  Google Scholar 

  27. Collins GS, Altman GS: External validation of QDSCORE for predicting the 10-year risk of developing type 2 diabetes. Diabetic Med. 2011, 28: 599-607. 10.1111/j.1464-5491.2011.03237.x.

    Article  CAS  PubMed  Google Scholar 

  28. Zou KH, O’Malley J, Mauri L: Receiver-operating characteristic analysis for evaluating diagnostic tests and predictive models. Circulation. 2007, 115: 654-657. 10.1161/CIRCULATIONAHA.105.594929.

    Article  PubMed  Google Scholar 

  29. Kshirsagar AV, Bang H, Bomback AS, Vupputuri S, Shoham DA, Kern LM, Klemmer PJ, Mazumdar M, August PA: A simple algorithm to predict incident kidney disease. Arch Intern Med. 2008, 168 (22): 2466-2473. 10.1001/archinte.168.22.2466.

    Article  PubMed  PubMed Central  Google Scholar 

  30. Fox CS, Larson MG, Leip EP, Culleton B, Wilson PW, Levy D: Predictors of new-onset kidney disease in a community based population. JAMA. 2004, 291 (7): 844-849. 10.1001/jama.291.7.844.

    Article  CAS  PubMed  Google Scholar 

  31. Parikh N, Hwang S-J, Larson MJ, Meigs JB, Levy D, Fox CS: Cardiovascular disease risk factors in chronic kidney disease: overall burden and rates of treatment and control. Arch Intern Med. 2006, 166 (17): 1884-1891. 10.1001/archinte.166.17.1884.

    Article  PubMed  Google Scholar 

  32. Elsayed EF, Tighiouart H, Griffith J, Kurth T, Levey AS, Salem D, Sarnak MJ, Weiner DE: Cardiovascular disease and subsequent kidney disease. Arch Intern Med. 2007, 167 (11): 1130-1136. 10.1001/archinte.167.11.1130.

    Article  CAS  PubMed  Google Scholar 

  33. Saran AM, DuBose Jnr TD: Cardiovascular disease in chronic kidney disease. Ther Adv Cardiovasc Dis. 2008, 2 (6): 425-434. 10.1177/1753944708096379.

  34. Scottish Intercollegiate Guidelines Network: Diagnosis and management of chronic kidney disease; a national clinical guideline. 2008, Edinburgh: SIGN

  35. Tonelli M, Wiebe N, Culleton B, House A, Rabbat C, Fok M, McAlister F, Garg AX: Chronic kidney disease and mortality risk: a systematic review. J Am Soc Nephrol. 2006, 17: 2034-2047. 10.1681/ASN.2005101085.

  36. Hippisley-Cox J, Coupland C, Vinogradova Y, Robson J, Minhas R, Sheikh A, Brindle P: Predicting cardiovascular risk in England and Wales: prospective derivation and validation of QRISK2. BMJ. 2008, 336 (7659): 1475-1482. 10.1136/bmj.39609.449676.25.

  37. de Lusignan S, Chan T, Gallagher H, van Vlymen J, Thomas N, Jain N, Tahir A, Nation M, Moore J, Reid F, Harris K, Hague N: Chronic kidney disease management in southeast England: a preliminary cross-sectional report from the QICKD – quality improvement in chronic kidney disease study. Prim Care Cardiovasc J. 2009, 33 (9): 33-39.

  38. Dhoul N, de Lusignan S, Dmitrieva O, Stevens P, O’Donoghue D: Quality achievement and disease prevalence in primary care predicts regional variation in renal replacement therapy (RRT) incidence: an ecological study. Nephrol Dial Transplant. 2012, 27 (2): 739-746. 10.1093/ndt/gfr347.

  39. Stevens PE, O’Donoghue DJ, de Lusignan S, Van Vlymen J, Klebe B, Middleton R, Hague N, New J, Farmer CKT: Chronic kidney disease management in the United Kingdom: NEOERICA project results. Kidney Int. 2007, 72: 92-99. 10.1038/


We would like to thank James Hollinshead of the East Midlands Public Health Observatory for support in developing this theme, as well as the practices and their patients who took part in the QICKD trial. The QICKD trial was funded by the Health Foundation with additional support from the Edith Murphy Trust. We would also like to thank the participating practices from the National Institute for Health Research. The trial is run as a partnership between Kidney Research UK, St. George’s – University of London, and the University of Surrey.

Author information

Corresponding author

Correspondence to Benjamin Kearns.

Additional information

Competing interests

There was no specific funding for this project. However, SdeL is the principal investigator for the QICKD trial, led the expert reference group that created the CKD Pay-for-Performance indicator, and is now a GP Advisor to NICE for CKD. SdeL has received funding to attend and present at two European conferences in the last five years, and editorial fees for two articles (joint publications with HG).

Authors’ contributions

BK performed the analysis and interpretation of the data for this manuscript and drafted the article. He had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. SdeL and HG were involved in the conception of the study and acquisition of the data. They also critically revised the manuscript for important intellectual content, contributing to all drafts of the manuscript. SdeL also contributed to the analysis and interpretation of the data. All authors have given approval for the final version to be published.

Electronic supplementary material


Additional file 1: Summary statistics for the total sample, and for subjects with chronic kidney disease (CKD), broken down by identified and unidentified CKD. (DOC 68 KB)

Additional file 2: Full main-effects and ‘clinical’ multivariable logistic regression models for subjects with identified chronic kidney disease. (DOC 119 KB)

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Kearns, B., Gallagher, H. & de Lusignan, S. Predicting the prevalence of chronic kidney disease in the English population: a cross-sectional study. BMC Nephrol 14, 49 (2013).
