Cohort profile: Canadian study of prediction of death, dialysis and interim cardiovascular events (CanPREDDICT)

Background: The Canadian Study of Prediction of Death, Dialysis and Interim Cardiovascular Events (CanPREDDICT) is a large, prospective, pan-Canadian cohort study designed to improve our understanding of determinants of renal and cardiovascular (CV) disease progression in patients with chronic kidney disease (CKD). The primary objective is to clarify the associations between traditional and newer biomarkers in the prediction of specific renal and CV events, and of death, in patients with CKD managed by nephrologists. This information could then be used to better understand biological variation in outcomes, to develop clinical prediction models and to inform enrolment into interventional studies which may lead to novel treatments.

Methods/Designs: Commenced in 2008, the study has enrolled 2546 patients with eGFR between 15 and 45 mL/min/1.73 m² from a representative sample of 25 rural, urban, academic and non-academic centres across Canada. Patients are to be followed for an initial 3 years at 6-monthly intervals, and subsequently annually. Traditional biomarkers include eGFR, urine albumin creatinine ratio (uACR), hemoglobin (Hgb), phosphate and albumin. Newer biomarkers of interest were selected on the basis of biological relevance to important processes, commercial availability and assay reproducibility. They include asymmetric dimethylarginine (ADMA), N-terminal pro-brain natriuretic peptide (NT-pro-BNP), troponin I, cystatin C, high sensitivity C-reactive protein (hsCRP), interleukin-6 (IL-6) and transforming growth factor beta 1 (TGFβ1). Blood and urine samples are collected at baseline and every 6 months, and stored at −80°C. Outcomes of interest include renal replacement therapy, CV events and death, the latter two of which are adjudicated by an independent panel.

Results: The mean age of the cohort was 68 years, 90% were Caucasian, 62% were male, and 48% had diabetes. Forty percent of the cohort had eGFR between 30-45 mL/min/1.73 m², 22% had eGFR values below 20 mL/min/1.73 m²; 61% had uACR < 30. Serum albumin, hemoglobin, calcium and 25-hydroxyvitamin D (25(OH)D) levels were progressively lower in the lower eGFR strata, while parathyroid hormone (PTH) levels increased. Cystatin C, ADMA, NT-pro-BNP, hsCRP, troponin I and IL-6 were significantly higher in the lower GFR strata, whereas 25(OH)D and TGFβ1 values were lower at lower GFR. The distributions of each of the newer biomarkers by eGFR and uACR categories were variable.

Discussion: The baseline distribution of newer biomarkers does not appear to track to markers of kidney function and therefore may offer some discriminatory value in predicting future outcomes. The granularity of the data presented at baseline may foster additional questions. The value of the cohort as a unique resource to understand outcomes of patients under the care of nephrologists in a single payer healthcare system cannot be overstated. Systematic collection of demographic, laboratory and event data should lead to new insights.

Keywords: Chronic kidney disease, Biomarkers, Observational cohort study, Outcomes, Progression, CV disease

Background
Chronic kidney disease (CKD), defined as the presence of persistent reduction in kidney function (i.e. glomerular filtration rate (GFR) <60 mL/min for more than 3 months) or evidence of chronic kidney damage (e.g. proteinuria), is a growing global health problem. CKD afflicts 10-13% of adults in North America, Europe and Australia [1][2][3][4][5]. There is evidence that the prevalence of CKD is increasing in parallel with the increasing prevalence of hypertension, diabetes and obesity. The diagnosis of CKD is important because it is a powerful risk factor for development of end-stage renal disease (ESRD), a condition associated with significant patient morbidity, excessive mortality, and high societal cost related to provision of dialysis (an expensive therapy) [6]. More recently, CKD, even in the early stages, has been associated with accelerated cardiovascular (CV) disease and death [7,8]. For all of these reasons, identification of patients with CKD, appropriate longitudinal follow-up, and treatment with therapies to prevent progression are major current and future challenges for healthcare systems worldwide.
The major causes of CKD in developed societies are hypertension, diabetes, atherosclerotic vascular disease, and certain glomerular diseases (e.g. IgA nephropathy). Even within etiological categories, however, there is wide variation in rates of progression [9][10][11]: some patients progress rapidly to ESRD, whereas other patients remain stable indefinitely with minor reduction in kidney function. The link between CKD and accelerated CV disease adds prognostic complexity because some patients with progressive CKD will succumb to death from CV causes rather than progress to dialysis. This phenomenon of competing risks further complicates the ability to predict specific outcomes in individual patients.
The variable prognosis of CKD is highly problematic for health systems, health practitioners, and patients alike. Patients very reasonably want to know what will happen to their kidneys down the road. Will dialysis be needed, and when? Uncertainty in prognosis is troubling for patients, hampers psychosocial adaptation to illness, and degrades quality of life [12][13][14][15]. Health practitioners need accurate prognostic estimates in order to appropriately counsel CKD patients, plan frequency of follow-up, and determine optimal timing for procedures required in preparation for dialysis, such as arteriovenous fistula creation, or referral for pre-emptive transplantation. From the health systems perspective, CKD care is expensive, requiring specialized resources and frequent visits. These resources would be optimally directed to those patients at true risk of progression, and not to those at minimal risk of adverse outcomes.
Although progress has been made in developing usable prediction models for risk of dialysis in CKD populations [16], much less progress has been made in terms of predicting other important outcomes such as CV disease and death. Better identification and understanding of the factors predisposing to these key outcomes in CKD are needed. In this regard, several newer biomarkers which reflect biological processes linked to renal and cardiac disease progression have shown promise in predicting outcomes in CKD, but have not yet been properly validated and compared in the context of conventional risk factors for progression.
The Canadian study of Prediction of Risk and Evolution to Dialysis, Death and Interim CV events over Time (CanPREDDICT) was established in 2008 to address these questions of interest.

Overarching objectives
CanPREDDICT is a large, prospective, pan-Canadian cohort study with the primary objective of describing the associations between traditional and newer biomarkers in the prediction of renal and CV events in patients with CKD managed by nephrologists. This information will then be used to better understand biological variation in outcomes, to develop clinical prediction models and to inform enrolment into interventional studies which may lead to novel treatments.

Study cohort
CanPREDDICT includes 2546 adult patients recruited from outpatient nephrology clinics in 25 Canadian centres. The centres represent the various types of nephrology practice in Canada: rural and urban, university-affiliated and non-university-affiliated. Recruitment of the cohort was achieved over an 18-month period between June 2008 and December 2009. Patients were eligible for inclusion in the cohort if they had a baseline eGFR of 15-45 mL/min/1.73 m². Patients were excluded if they were unable to provide informed consent, had an organ transplant, were on immunomodulatory therapy for active vasculitis or glomerulonephritis, or had a life expectancy of less than 1 year (e.g. due to cancer) in the opinion of their attending nephrologist (Figure 1).
The study protocol was approved by the institutional review boards of all 25 participating centres, led by the University of British Columbia and Providence Health Care Research Institute as the coordinating site, and the research was conducted in accordance with the Declaration of Helsinki. The study was registered at www.clinicaltrials.gov (NCT00826319).

Funding sources
The direct costs of the study are funded by an unrestricted educational grant from Janssen-Ortho Inc. The concept, design and execution of the study, including all data management and analysis, were entirely investigator driven. Statistical and methodological support is provided by the University of British Columbia and the BC Provincial Renal Agency. Funding from the Kidney Foundation of Canada for ancillary studies (Bioimpedance in CKD) has been received, and other applications for peer-reviewed funding are pending.

Specific study objectives and outcomes of interest
The main objectives of the CanPREDDICT study are 1) to examine the role of both traditional risk factors and a select panel of newer, non-traditional serum and urine biomarkers, in the progression of kidney and CV disease in patients with CKD, alone and separately and 2) to develop robust predictive models to discriminate between high and low risk patients.
The main outcomes of interest in the CanPREDDICT study are the renal endpoint of progression of CKD to renal replacement therapy (RRT), CV events (both heart failure and ischemic events) and death.

Definitions of outcomes and adjudication
RRT is defined as the need for dialysis initiation or renal transplantation. Major CV events are defined as fatal or non-fatal myocardial infarction (MI; defined by chest pain, dynamic troponin change, cardiogenic shock and ECG to distinguish ST-elevated MI vs non-ST-elevated MI), ischemic stroke (defined as an acute focal neurologic deficit of sudden onset attributed to the occlusion of a cervicoencephalic artery by a thrombus, supported by CT or MRI results), or need for coronary revascularization (coronary artery bypass graft/percutaneous coronary intervention/percutaneous transluminal coronary angioplasty, supported by procedural note). Congestive heart failure (CHF) is defined as dyspnea plus 2 of the following: bibasilar rales, raised jugular venous pressure or chest x-ray with evidence of interstitial or alveolar pulmonary edema. A panel of three physicians comprising a nephrologist, a cardiologist, and a neurologist independently adjudicated all CV outcomes based on source documentation.

Duration of the study
Follow-up was originally planned for 3 years, with completion of the main study in December 2012, but has been extended for an additional 2 years.

Data collection
Demographics, clinical status, medications, as well as blood and urine samples are collected at baseline and every 6 months at study visits for the first 3 years. An abbreviated set of data (events and clinical data only) will be collected annually during the 2-year extension.
Measurement details of the newer biomarkers are described in Additional file 1. Traditional biomarkers (creatinine, urine albumin-creatinine ratio (uACR), hemoglobin (Hgb), phosphate, etc.) are all measured in local accredited laboratories across Canada. Serum creatinine is calibrated to local platforms but traceable to NIST standards in all laboratories. eGFR was calculated using the MDRD formula, as was the norm in Canada at the time of study start [36].
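As an illustration of the eGFR calculation mentioned above, the following is a minimal sketch of the 4-variable MDRD equation. It assumes the IDMS-traceable form (coefficient 175) with serum creatinine reported in µmol/L; the text does not state which version of the equation or which units were used, and the function names `egfr_mdrd` and `meets_egfr_criterion` are hypothetical.

```python
def egfr_mdrd(creatinine_umol_l, age_years, female, black=False):
    """Estimate GFR in mL/min/1.73 m^2 with the 4-variable MDRD equation.

    ASSUMPTION: IDMS-traceable form (coefficient 175); the paper does not
    say which MDRD variant was applied.
    """
    if creatinine_umol_l <= 0 or age_years <= 0:
        raise ValueError("creatinine and age must be positive")
    scr_mg_dl = creatinine_umol_l / 88.4  # convert umol/L to mg/dL
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def meets_egfr_criterion(egfr):
    """Inclusion window stated in the text: baseline eGFR of 15-45."""
    return 15.0 <= egfr <= 45.0


# Illustrative input only (not study data): a 68-year-old man with a
# serum creatinine of 176.8 umol/L (2.0 mg/dL).
example_egfr = egfr_mdrd(176.8, 68, female=False)
example_eligible = meets_egfr_criterion(example_egfr)
```

The unit conversion is kept explicit because Canadian laboratories typically report creatinine in µmol/L, whereas the equation is defined for mg/dL.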

Sample size considerations
The primary considerations for the sample size estimation were: 1) to ensure adequate power to demonstrate that inclusion of novel biomarkers in predictive models would enhance discrimination between subjects who will or will not experience outcomes; and 2) to achieve a high level of precision when assessing the discriminatory value of the new predictive models that include biomarkers. A sample size of 2500 would yield estimated standard errors of approximately 1%, which would provide 99% power to demonstrate that the novel biomarkers would be statistically significant predictors if the hypothesized increase in discrimination of 5% existed. This sample size also allowed quantification of the magnitude of the increase with high precision. As described in Additional file 1, we used simulated biomarker behavior, rather than any specific biomarker, to develop the sample size.
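The figures quoted above can be checked for plausibility with a normal approximation. The sketch below is not the simulation-based calculation described in Additional file 1; it assumes a two-sided z-test at alpha = 0.05 and treats the 1% standard error as the standard error of the estimated increase in discrimination. The function name `power_two_sided_z` is hypothetical.

```python
from statistics import NormalDist


def power_two_sided_z(effect, se, alpha=0.05):
    """Approximate power of a two-sided z-test for a given effect and SE."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect / se  # true effect in standard-error units
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothesized 5% increase in discrimination with a standard error of 1%.
approx_power = power_two_sided_z(effect=0.05, se=0.01)
```

Under these assumptions the approximate power exceeds 99%, which is consistent with the stated figure; a different standard error for the increase would change the result.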

Patient follow up
As this is a longitudinal observational cohort study, clinical visits are planned every 6 months for the initial 3 years, and then annually for an additional 2 years. During the first 12 follow-up months, attrition was low, with 4% of the cohort lost to follow-up (see Figure 1 for details).

Variables measured
Clinical and demographic data were obtained at baseline visits. Data elements include age, sex, race, diabetic status, cause of renal disease, and pre-existing comorbidities including ischemic heart disease (IHD), congestive heart failure (CHF), cerebrovascular disease, peripheral arterial disease, chronic lung disease, chronic liver disease, chronic gastrointestinal disease and previous diagnosis of cancers. Blood pressure, height and weight, and routine laboratory test results (near the study visit date, maximum 4 months prior and 2 months after) were also obtained. Serum, plasma and urine samples are collected for analysis in a central laboratory at each visit. Patient follow-up continues after transition to dialysis or transplantation, until death or loss to follow-up. In addition to the demographic and clinical data described above, the six pre-specified newer biomarkers were selected for measurement. As described above, the selection was made on the basis of biological relevance, commercial availability of assays, and published data suggesting prognostic value for heart disease or kidney disease progression, or death. ADMA, a potent inhibitor of endothelial nitric oxide production, impairs vascular relaxation, contributes to hypertension, and is correlated with CV events and renal decline [22][23][24][25].

Baseline findings

Baseline characteristics of patients and correlates of GFR
The baseline demographics and laboratory values of the CanPREDDICT cohort, stratified by GFR, are summarized in Table 1. The mean age of the cohort at enrollment was 68 years, 62% were male, and 48% were diabetic. Forty percent of the cohort was in CKD Stage 3 (eGFR 30-45 mL/min/1.73 m²) and 60% of the cohort was in CKD Stage 4 at baseline, with 22% of the cohort having eGFR below 20 mL/min/1.73 m². Diabetes and hypertensive nephropathy were the most frequent primary kidney diseases (30% each), and 22% of patients had a history of cardiac disease (CHF or IHD) at entry. Sixty-eight percent had either diabetes or CV disease. Compared with patients in the higher eGFR strata, patients at lower eGFR were slightly younger and slightly less likely to be male. Diabetes was significantly more prevalent in the 20-29 mL/min/1.73 m² stratum than in the lower or higher strata. At each eGFR stratum, there was a similar distribution of those with diabetes, IHD, CHF or any combination thereof. Of note, only 32% of the cohort had neither diabetes nor CV comorbidities (Figure 2). The expected relationship between CKD-related complications and GFR is clearly evident in the baseline analysis: abnormalities of Hgb, calcium, phosphate, parathyroid hormone (PTH) and 25(OH)D were progressively more pronounced at lower GFR strata. The majority of patients were microalbuminuric or non-proteinuric (61%); only 22% exhibited heavy albuminuria > 1 g/day. The uACR data are variable across strata of eGFR.

Distribution and expected values of novel biomarkers
The baseline distributions of the newer biomarkers are illustrated in Figure 3a-g. Cystatin C and ADMA had mound-shaped, approximately normal distributions, whereas IL-6, troponin I, hsCRP, TGFβ1 and NT-pro-BNP exhibited marked positive skew, with the majority of measurements at or below the lower limit of detection of the assay. Of note, the median value of hsCRP was 3 mg/L, which corresponds to the upper limit of normal in general populations. The median, range, and proportion above the detection limit for biomarkers with the majority of measurements below the limit of detection are described in the second part of Table 1.
The variation of biomarker levels across strata of eGFR and uACR is presented graphically in Figure 4a-g. The values in each cell represent the mean (cystatin C, ADMA) or median value (NT-pro-BNP, hsCRP, TGFβ1) for the biomarker in that cell, or the proportion of patient results above the upper limit of detection (for IL-6, troponin I), as appropriate. Different colors indicate statistically significant differences between cells. Such graphical representations are useful in discerning at a glance the potential predictive utility of a biomarker.
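The cell-by-cell summaries described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used for Figure 4: the eGFR strata follow those named in the text, the uACR cut-offs (3 and 30) are assumed for illustration only, and the function names, record keys and input values are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def egfr_stratum(egfr):
    """eGFR stratum (mL/min/1.73 m^2), following the strata named in the text."""
    if 15 <= egfr < 20:
        return "<20"
    if 20 <= egfr < 30:
        return "20-29"
    if 30 <= egfr <= 45:
        return "30-45"
    return None  # outside the enrolment range


def uacr_stratum(uacr):
    """uACR stratum. ASSUMPTION: cut-offs 3 and 30 are illustrative only."""
    if uacr < 3:
        return "<3"
    if uacr < 30:
        return "3 to <30"
    return ">=30"


def fraction_above(limit):
    """Build a summary function: share of values strictly above `limit`."""
    def summarize(values):
        return sum(v > limit for v in values) / len(values)
    return summarize


def cell_summary(records, marker, summarize=median):
    """Summarize one biomarker within each (eGFR stratum, uACR stratum) cell."""
    cells = defaultdict(list)
    for rec in records:
        row = egfr_stratum(rec["egfr"])
        if row is None:
            continue  # skip records outside the eGFR range
        cells[(row, uacr_stratum(rec["uacr"]))].append(rec[marker])
    return {cell: summarize(values) for cell, values in cells.items()}


# Made-up values for illustration only; these are not study data.
toy_records = [
    {"egfr": 18, "uacr": 50, "nt_pro_bnp": 900},
    {"egfr": 17, "uacr": 40, "nt_pro_bnp": 1100},
    {"egfr": 35, "uacr": 1, "nt_pro_bnp": 200},
    {"egfr": 50, "uacr": 1, "nt_pro_bnp": 100},
]
median_grid = cell_summary(toy_records, "nt_pro_bnp")
share_grid = cell_summary(toy_records, "nt_pro_bnp", fraction_above(1000))
```

Passing the summary function as a parameter mirrors the description in the text, where some biomarkers are summarized by the mean or median and others by the proportion of results above a detection limit.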

Discussion
CanPREDDICT represents a large cohort of CKD patients followed by nephrologists in a single payer healthcare system, across multiple geographical locations in Canada. As a source of information about the 'current state' of patients in Canada, it is representative of that group. Better understanding of the outcomes of these patients will be important for healthcare planning, and for patient counseling. Through this ongoing work, the development and testing of prediction equations using additional biomarkers should prove important. The baseline characteristics in CanPREDDICT are qualitatively similar to other referred CKD cohorts [41,42], but with some important differences. By design, CanPREDDICT had a higher representation of lower GFR strata than other similar published cohort studies such as CRIC and CRIB [41,42]. Our cohort is also older by a decade and has greater male predominance, but a similar proportion of diabetics. CanPREDDICT patients also have a higher prevalence of IHD and CHF than CRIC, findings which may relate to the aforementioned age and eGFR differences [42]. As with CRIC, NHANES, and AASK, the proportion of proteinuric renal disease was relatively low, but distributed across all strata of eGFR [42][43][44]. We observe a well-described relationship between CKD-related laboratory abnormalities and eGFR [45,46]. As expected, abnormalities of Hgb, calcium, phosphate, PTH and 25(OH)D were progressively more pronounced in the lower eGFR strata. Our cohort is also predominantly white, a finding discussed separately under limitations below.

Figure 4a-g: Biomarker mean, median values or percentage of patients above the upper detection limit/top tertile by eGFR and uACR level at baseline.

The biomarker distribution findings are important with respect to future studies and potential utility in prediction equations. A biomarker exhibiting a high degree of covariation with uACR and eGFR would exhibit a smooth "wave" of colour changing diagonally across the table; such a biomarker would likely provide little additional independent information to a predictive model beyond what is already provided by measurement of uACR and eGFR, themselves strong predictors of renal disease progression. On the other hand, a biomarker which does not co-vary perfectly with eGFR/uACR would exhibit a "patchwork" pattern of colours, indicating that it may be capturing information independent of uACR and eGFR and might therefore prove prognostically useful. Of note, almost all biomarkers measured in the study exhibit this patchwork pattern to some degree, suggesting these biomarkers could add information to conventional measures of CKD severity (via uACR and eGFR).
Our observations on these newer biomarker distributions have both practical and research applications. As noted above, most biomarkers of inflammation and CV disease appear right-shifted in this CKD cohort, indicating that a higher proportion of patients with CKD have elevated values. For example, the lowest NT-pro-BNP value in the cohort, that of the cell with the highest eGFR and lowest uACR (Figure 4c), is within the range suggested for the diagnosis of pulmonary edema in the general population. Caution must be used, therefore, in applying distribution-based thresholds (i.e. "normal ranges") derived from the general population for clinical decision making, as these may not be correct when applied in CKD populations. Ultimately, our objective is to develop true risk-based thresholds, once follow-up is completed and all outcomes of interest are known.

Strengths and weaknesses
The main strength of CanPREDDICT is that it is a large, national, prospective observational study of referred CKD patients, with comprehensive data capture on risk factors for progressive renal and cardiac diseases in Canada. The dataset includes measures of six novel nontraditional biomarkers of cardiorenal disease progression. Biobanking of urine and blood samples will permit future genetic and proteomic analyses. While the CanPREDDICT cohort is qualitatively similar to other CKD cohorts, its sampling of patients at lower eGFR and its setting in the Canadian health system make it complementary to other national cohorts, and provide the basis for international comparisons and cross-validation of findings.
The low prevalence of non-Caucasian individuals enrolled is a relative limitation. Although the proportion of non-white individuals in Canada is lower than in the US, for example, non-white individuals are still underrepresented in the CanPREDDICT Cohort relative to Canadian demographics as a whole. A funding application to extend and to enrich the cohort with non-white individuals so that it more closely reflects Canadian demographics is under review.
As patients were recruited at nephrology clinics across Canada, the results of CanPREDDICT will be applicable to CKD patients seen and followed by nephrologists (referral cohort). This is an important group of patients to characterize and understand, and it is expected that the results of CanPREDDICT over time will inform management in these patients. However, CanPREDDICT results may not necessarily translate to CKD patients who are not referred to nephrologists, as they are not represented in this cohort. The logistics of identifying and intensively following non-referred CKD patients are considerable, and will have to be resolved in future studies. The pre-selected biomarkers, chosen for practical reasons, did not include FGF-23, which has been shown in multiple populations to predict CV outcomes and death. Arrangements to measure FGF-23 in an approved laboratory had been completed at the time of writing; results are pending.

CanPREDDICT data are available for collaborations
CanPREDDICT was designed at the outset to be a platform for further collaborations and studies. An 8-person steering committee, consisting of 6 nephrologists, a statistician and methodologist, and a laboratory physician, evaluates all requests based on a predefined set of criteria. To date, several sub-studies have been approved, including one looking at bioimpedance and outcomes, and one reviewing urine protein evaluation and outcomes.
Requests for collaboration may be directed to Dr. Adeera Levin, principal investigator and chair of the steering committee, at canpreddict@providencehealth.bc.ca.

Additional file
Additional file 1: Additional information regarding study organization, measurement of biomarkers and sample size calculations.

Competing interests
The authors have no competing interests with regard to this manuscript or study. For full transparency, however, additional information about each author is listed below: AL receives grant/research funds from Kidney Foundation of Canada (KFoC), Canadian Institutes of Health Research (CIHR), Merck, Abbott, Amgen, Otsuka; NM receives grant/research funds from Amgen, CIHR; CR receives funds from Manitoba Health Services and Kidney Foundation of