Differences between hospitals in attainment of parathyroid hormone treatment targets in chronic kidney disease do not reflect differences in quality of care

Background: Transparency in quality of care (QoC) is increasingly encouraged, and hospitals are compared and judged on the basis of indicators of performance on specific treatment targets. In patients with chronic kidney disease, QoC differed significantly between hospitals. In this analysis we explored additional parameters to explain differences between centers in attainment of parathyroid hormone (PTH) treatment targets.

Methods: Using MASTERPLAN baseline data, we selected one of the worst (center A) and one of the best (center B) performing hospitals. Differences between the two centers were analyzed from the year prior to the start of the MASTERPLAN study until the baseline evaluation. Determinants of PTH were assessed.

Results: A total of 101 patients from center A (median PTH 9.9 pmol/l, exceeding recommended levels in 67 patients) and 100 patients from center B (median PTH 6.5 pmol/l, exceeding recommended levels in 34 patients) were included. Analysis of clinical practice did not reveal differences in PTH management between the centers. Notably, hyperparathyroidism resulted in a change in therapy in less than 25% of patients. In multivariate analysis kidney transplant status, MDRD-4, and treatment center were independent predictors of PTH. However, when MDRD-6 (which accounts for serum urea and albumin) was used instead of MDRD-4, the center effect was reduced. Moreover, after calibration of the serum creatinine assays, treatment center no longer influenced PTH.

Conclusions: We show that differences in PTH control between centers are not explained by differences in treatment, but depend on incomparable patient populations and laboratory techniques. Therefore, results of hospital performance comparisons should be interpreted with great caution.


Background
Patients with chronic kidney disease (CKD) are at increased risk of developing cardiovascular disease [1][2][3]. Multiple traditional and non-traditional risk factors contribute to this increased cardiovascular risk. In recent years an important role of disturbances in bone and mineral metabolism has been established [4][5][6][7][8]. Patients with CKD are characterized by variable degrees of vitamin D deficiency, hyperparathyroidism, hypercalcemia and hyperphosphatemia. These abnormalities contribute to soft tissue and vascular calcification, and ensuing cardiovascular injury [9]. Parathyroid hormone (PTH) level is already elevated in patients with mild CKD and prevalence of hyperparathyroidism rises substantially when glomerular filtration rate (GFR) decreases [10,11]. Consequently, PTH is one of the most deviant laboratory parameters in CKD patients.
The 2003 K/DOQI clinical practice guideline addressed the treatment of CKD-bone and mineral disorder and defined treatment targets for calcium, phosphate, and PTH [12]. Newer guidelines with different targets were recently published [13]. Defining treatment goals is only one aspect of quality of care (QoC). Health care authorities, health insurance companies, as well as organizations of health care professionals have introduced benchmarking as a way to compare and improve QoC [14][15][16]. Treatment centers are asked to provide figures on treatment targets, complication rates, and survival rates. Hospital performance is determined on the basis of these figures.
Nevertheless, implementation of guidelines is rather difficult and treatment goals are often not met [17][18][19]. We recently showed that in CKD patients QoC, defined as attainment of treatment targets, differed significantly between treatment centers [20,21]. Differences were not explained by available patient characteristics. We hypothesized that detailed comparison of all aspects of treatment in these centers may help to define parameters that are associated with QoC, and eventually improve performance. In the present study, we compared two centers that differed in the attainment of the PTH treatment goal in CKD patients and explored possible explanations.

Methods
The MASTERPLAN (Multifactorial Approach and Superior Treatment Efficacy in Renal Patients with the Aid of Nurse practitioners) study [Trial registration ISRCTN registry: 73187232] is a randomized controlled trial conducted in nine centers with a nephrology department in The Netherlands, evaluating the added value of nurse practitioner care in reducing cardiovascular events and attenuating kidney function decline in patients with prevalent CKD. Rationale and design have been published elsewhere [22,23]. Ethics committee approval was obtained as well as written informed consent of all participants. Between April 2004 and December 2005, patients were enrolled.
We recently analyzed baseline data and evaluated QoC, defined as achievement of treatment goals. We noted significant differences between treatment centers for various treatment goals such as blood pressure, cholesterol, and PTH [20]. In the present study we focused on management of PTH and selected one of the worst (center A) and one of the best (center B) performing hospitals. Study design can be described as a nested case control study (on hospital level).
PTH is one of the quality indicators of the Dutch Federation of Nephrology. It is, therefore, a usual quality of care target.
We used the baseline clinical and laboratory data of the patients as available according to the MASTERPLAN study protocol [23]. We retrieved additional data from the medical records, using a form that specifically addressed known determinants of PTH [24,25] and characteristics of treatment in the year before the baseline MASTERPLAN visit (number of patient visits, number of laboratory tests performed, adjustment of treatment affecting PTH levels). The additional data collection for the present study was covered by the aforementioned ethics committee approval and informed consent.
Initially eGFR was calculated using the abbreviated MDRD formula [26] (MDRD-4). In addition the six-variable MDRD formula [27] (MDRD-6) and the CKD-EPI equation [28] were used in order to achieve a more accurate estimation of the GFR. We also compared the serum creatinine assays between the hospitals. The hospitals used different Jaffé methods to measure serum creatinine. Both methods were compared to one enzymatic method (Roche Diagnostics). The following equations were developed. For center A: enzymatic creatinine = 1.266 × Jaffé creatinine − 29. For center B: enzymatic creatinine = (Jaffé creatinine − 28) / 0.93.
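The two conversion equations can be written out directly. The following is a minimal sketch, assuming that creatinine is expressed in µmol/l (the unit is not stated alongside the equations); the function names are illustrative and are not taken from the study.

```python
# Jaffe-to-enzymatic creatinine conversion, using the two equations
# reported in the text. Assumption: values are in umol/l.

def enzymatic_from_jaffe_center_a(jaffe: float) -> float:
    """Center A: enzymatic = 1.266 x Jaffe - 29."""
    return 1.266 * jaffe - 29.0


def enzymatic_from_jaffe_center_b(jaffe: float) -> float:
    """Center B: enzymatic = (Jaffe - 28) / 0.93."""
    return (jaffe - 28.0) / 0.93


if __name__ == "__main__":
    # Hypothetical Jaffe readings, chosen only for illustration.
    for jaffe in (100.0, 150.0, 200.0):
        a = enzymatic_from_jaffe_center_a(jaffe)
        b = enzymatic_from_jaffe_center_b(jaffe)
        print(f"Jaffe {jaffe:6.1f} -> center A {a:6.1f}, center B {b:6.1f}")
```

With these equations, an identical Jaffé reading of 150 corresponds to an enzymatic value of 160.9 in center A and about 131.2 in center B, which illustrates why uncalibrated creatinine values from the two laboratories are not directly comparable.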
In center A PTH was measured by an immunoluminometric assay (Nichols Institute, San Clemente, USA). In center B an immunoradiometric assay from the same manufacturer was used. Since both methods use antibodies directed against the same regions of the PTH molecule and reference values are similar we consider the methods comparable. This is confirmed by Souberbielle et al., as they found no significant difference in measured PTH concentration between both methods [29].
Proteinuria and smoking data were missing in two patients, and income data were missing in 34 patients. Two analyses were performed: one without cases with missing values and another in which missing data were imputed by single imputation. For proteinuria and income, median values were imputed; for smoking, the mode was used. The presented data are after imputation.
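The single-imputation rule described here (median for continuous variables, mode for categorical ones) can be sketched as follows. This is only an illustration: the helper name `impute` and the example values are hypothetical, and missing values are assumed to be represented by `None`.

```python
from statistics import median, mode


def impute(values, strategy):
    """Replace None entries by the median or the mode of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "median":
        fill = median(observed)
    elif strategy == "mode":
        fill = mode(observed)
    else:
        raise ValueError("strategy must be 'median' or 'mode'")
    return [fill if v is None else v for v in values]


if __name__ == "__main__":
    # Hypothetical data, not taken from the study.
    proteinuria = [0.2, None, 1.0, 0.4]             # continuous: use median
    smoking = ["never", "current", None, "never"]   # categorical: use mode
    print(impute(proteinuria, "median"))
    print(impute(smoking, "mode"))
```

Single imputation of this kind keeps all cases in the model but understates uncertainty, which is one reason to run the complete-case analysis alongside it, as was done here.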
In the analysis of characteristics of treatment in the year before MASTERPLAN baseline (management of hyperparathyroidism), we excluded patients who were under specialist physician care for less than six months prior to study enrollment. We used the six month criterion since we aimed to evaluate differences in QoC between hospitals and not between general practitioners.
Characteristics are given for the study population by treatment center and expressed as means (standard deviation) or proportions. Medians [interquartile range] are presented for variables with a skewed distribution. Differences between the two treatment centers were studied using an independent-samples t test for continuous variables and a chi-square test for categorical variables. The natural logarithm (Ln) of PTH had a normal distribution and was used for correlation analyses. Correlation with several potential determinants was studied using a Pearson correlation coefficient for continuous variables and a Spearman's rho correlation coefficient for categorical variables. If necessary, the Ln of continuous variables was used to obtain a normal distribution. Multivariate analyses using stepwise backward linear regression models were performed to find determinants of PTH. Variables with univariate associations and potential confounders were included. Criteria for exclusion and re-inclusion were p ≥ 0.10 and p < 0.05 respectively. Variables were excluded from the final multivariate model if they did not contribute considerably to the explained variance (change in R² < 5%). Multivariate Poisson regression models were used to construct prediction models for meeting the PTH treatment target. Variables with univariate associations were included. All p-values were two-sided, and for all tests p-values less than 0.05 were considered to indicate statistical significance. The univariate analyses were performed with SPSS 16.0 (SPSS Inc., Chicago, USA). The multivariate analyses were performed with Stata 10.1 (StataCorp LP, College Station, USA).
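As an illustration of the chi-square comparison, the sketch below applies a Pearson chi-square test to the counts reported in the abstract (67 of 101 patients above the recommended PTH level in center A versus 34 of 100 in center B). It is an assumption that no continuity correction is applied; the statistical package used in the study may report a corrected value, and the function name is illustrative.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the two-sided p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Rows: center A, center B. Columns: PTH above target, PTH within target.
    chi2, p = chi_square_2x2(67, 101 - 67, 34, 100 - 34)
    print(f"chi-square = {chi2:.2f}, p = {p:.2g}")
```

The statistic is about 21 with one degree of freedom, well below the 0.05 threshold, which is expected because the two centers were selected for their contrasting performance.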

Results
Medical records of 101 patients from center A and 100 patients from center B were studied. Center A is a university clinic with 953 beds and center B is a nonuniversity clinic with 653 beds. Both centers are teaching hospitals that offer a full range of nephrology treatment including kidney replacement therapy and are involved in the care of kidney transplant recipients. The two centers are located in the same region of The Netherlands.
Characteristics of the study population are given in Table 1. The majority of patients were male (71%) and Caucasian (96%). Twenty-one percent of the patients were kidney transplant recipients. Median plasma PTH level was 8.7 pmol/l. By definition the median PTH level and the number of patients with PTH exceeding recommended levels were significantly different between the two treatment centers (Table 1).

Table 2 shows the treatment characteristics of patients in the year before entry in the MASTERPLAN study. Three patients from center A and 18 patients from center B were excluded from this analysis because they had been under specialist physician care for less than six months at MASTERPLAN baseline. In center A patients visited their physicians more often, and more laboratory tests were performed, although the number of PTH tests was not significantly different. In center A a larger number of different nephrologists and general internists were involved in each patient's treatment. PTH levels were not measured in 29-39% of patients in the six to twelve months before the start of MASTERPLAN. Of the patients with known PTH levels, 29 patients in center A and 27 patients in center B had a PTH level exceeding recommended levels. There was no significant difference in the way that center A and center B handled the patients with PTH exceeding recommended levels (Table 2). Specifically, hyperparathyroidism resulted in a change in therapy in less than 25% of patients.
The factors that were univariately associated with plasma PTH were all included in multivariate analyses. Potential confounders that were also included were age, sex, race, history of DM, BMI, smoking status, proteinuria, thiazide use, and season of blood draw. Although alphacalcidol and/or calcium carbonate use and number of drugs were positively correlated with PTH, they were not included in multivariate models because of reverse causation (patients who have higher PTH levels more often use alphacalcidol, but the drug does not lead to higher PTH levels). Table 4 shows the results of multivariate linear regression analysis by which determinants of PTH were identified. Kidney transplant status, MDRD-4, and treatment center were independent predictors of plasma PTH level.
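Because the outcome in these models is Ln(PTH), a regression coefficient translates into a relative rather than an absolute change in PTH. The sketch below shows the back-transformation. The coefficient of -0.03 per ml/min/1.73 m² is an illustrative value chosen to match the roughly three percent decrease per extra ml/min that the paper reports for its final model; it is not the fitted coefficient, and the function name is hypothetical.

```python
import math


def percent_change_in_pth(beta: float, delta_egfr: float = 1.0) -> float:
    """Percent change in PTH for a change of delta_egfr in eGFR.

    beta is the coefficient for eGFR in a linear model of Ln(PTH), so
    PTH is multiplied by exp(beta * delta_egfr).
    """
    return (math.exp(beta * delta_egfr) - 1.0) * 100.0


if __name__ == "__main__":
    beta = -0.03  # illustrative value, see lead-in
    print(f"+1 ml/min:  {percent_change_in_pth(beta):.1f}% change in PTH")
    print(f"+10 ml/min: {percent_change_in_pth(beta, 10.0):.1f}% change in PTH")
```

Note that the effect compounds: under this assumed coefficient a 10 ml/min higher eGFR corresponds to roughly 26% lower PTH, not 30%.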

Role of GFR
Glomerular filtration rate is a well known determinant of PTH. Various methods can be used to estimate GFR. Since there were significant differences between center A and B in serum urea and albumin, we considered that MDRD-6 (which includes these variables) might give a better estimation of GFR than MDRD-4 and subsequently predict plasma PTH level more accurately. Estimated GFR by MDRD-6 was 37.7 (SD 12.0) ml/min/1.73 m² for center A and 37.0 (SD 12.2) ml/min/1.73 m² for center B, p = 0.69. Table 4 shows that, while the explained variance increased, the center effect was reduced when MDRD-6 instead of MDRD-4 was used in the model. We also used the CKD-EPI equation, but this neither improved the model nor reduced the center effect (data not shown).
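To make the difference between the two formulas concrete, the sketch below implements MDRD-4 and MDRD-6 with the coefficients as commonly quoted for the original (not IDMS-traceable) equations. Which coefficient set was used in the study is not stated, so these values are an assumption and should be checked against the cited references [26,27]. Inputs are in conventional units (mg/dl for creatinine and serum urea nitrogen, g/dl for albumin); the helper for converting creatinine from µmol/l is included for convenience, and all function names are illustrative.

```python
def creatinine_umol_to_mg_dl(umol_per_l: float) -> float:
    """Convert serum creatinine from umol/l to mg/dl (1 mg/dl = 88.4 umol/l)."""
    return umol_per_l / 88.4


def mdrd4(scr_mg_dl: float, age: float, female: bool = False,
          black: bool = False) -> float:
    """Abbreviated (four-variable) MDRD estimate in ml/min/1.73 m2."""
    egfr = 186.0 * scr_mg_dl ** -1.154 * age ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212  # quoted as 1.210 in some publications
    return egfr


def mdrd6(scr_mg_dl: float, age: float, sun_mg_dl: float,
          albumin_g_dl: float, female: bool = False,
          black: bool = False) -> float:
    """Six-variable MDRD estimate, adding serum urea nitrogen and albumin."""
    egfr = (170.0 * scr_mg_dl ** -0.999 * age ** -0.176
            * sun_mg_dl ** -0.170 * albumin_g_dl ** 0.318)
    if female:
        egfr *= 0.762
    if black:
        egfr *= 1.180
    return egfr


if __name__ == "__main__":
    # Hypothetical 60-year-old man; values chosen only for illustration.
    print(f"MDRD-4: {mdrd4(2.0, 60):.1f}")
    print(f"MDRD-6, albumin 4.0 g/dl: {mdrd6(2.0, 60, 30.0, 4.0):.1f}")
    print(f"MDRD-6, albumin 3.0 g/dl: {mdrd6(2.0, 60, 30.0, 3.0):.1f}")
```

In this formulation, lower albumin or higher urea nitrogen lowers the MDRD-6 estimate at the same creatinine, which is how differences in these variables between centers can change the estimated GFR without any change in MDRD-4.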
Serum creatinine is an important parameter in the MDRD formulas. After conversion of serum creatinine to enzymatic values, mean eGFR by MDRD-6 was 35.7 ml/min/1.73 m² in center A and 42.1 ml/min/1.73 m² in center B (p = 0.002).
We subsequently used MDRD-6 after creatinine conversion in multivariate analysis, and found that treatment center was no longer a significant predictor of PTH (Table 4).
Since attainment of the PTH treatment target depends not only on PTH but also on eGFR, we extended the multivariate analyses to determine predictors of meeting the PTH treatment target. The results were similar to those reported in Table 4 (data not shown).

Discussion
In our study we compared two hospitals that differed in the attainment of the PTH treatment goal and explored possible explanations. We showed that the apparent differences were explained by incomparable patient populations and laboratory techniques, and the use of the abbreviated MDRD formula to estimate GFR. After correction for kidney transplant status and with the use of the MDRD-6 formula with calibrated creatinine to estimate GFR, treatment center no longer influenced the attainment of the PTH treatment goal.

We evaluated potential differences in treatment which might explain any difference in attainment of the PTH target. We focused on characteristics of the treatment given to patients in the year before the start of the MASTERPLAN study. Plantinga et al. have shown that more frequent patient-physician contacts in patients with end-stage renal disease are positively associated with the achievement of clinical performance targets, including targets of bone and mineral disorder [30]. One would therefore expect that patients in center A, the 'worst performing' hospital, visited their physician less often and had fewer laboratory tests done. Table 2 shows that the opposite was true. Admittedly, the number of visits and laboratory tests may be dictated by patient morbidity and disease history and may not necessarily reflect QoC; for example, kidney transplant recipients in general need more frequent follow-up. In addition, there were no differences in the way both centers handled the patients with PTH values exceeding recommended levels. Table 2 also shows that relatively few PTH tests were performed, even though PTH is one of the most deviant laboratory values in CKD patients. In both centers, treatment was adjusted or started in only a small number of patients with PTH exceeding recommended levels.

Thus, although guidelines give attention to treatment of CKD-bone and mineral disease, and although treatment targets are well defined, physicians are insufficiently aware of the importance of adequate treatment of hyperparathyroidism: our data point to therapeutic inertia towards the PTH treatment target.
In univariate analysis several factors were potential determinants of PTH. Multivariate analyses showed that many of these factors did not independently predict PTH levels. Renal function on the basis of calibrated creatinine values and kidney transplant status are the most important determinants of attainment of the PTH treatment target.
As mentioned before, GFR is a well known determinant of PTH and increases in PTH occur early in the course of renal insufficiency. We observed an inverse relationship between PTH and eGFR (Table 3). Although eGFR proved to be an important, independent predictor of PTH, in the initial analysis the differences in PTH levels between centers could not be explained by differences in eGFR. However, all formulas for estimating GFR have limitations. We showed that MDRD-4 is invalid in patients with proteinuria, where MDRD-6 proved better [31]. MDRD-6 superiority was also shown in kidney transplant recipients [32]. Since MDRD formulas critically depend on the measurement of serum creatinine, differences between serum creatinine assays affect their performance. Therefore, recent guidelines suggest using calibrated serum creatinine values [33].

Table 2 footnote: Only patients who were under the care of a specialist physician (nephrologist or internist) for at least 6 months before the MASTERPLAN study were analyzed. Values given are mean (SD) or n (%). PTH: parathyroid hormone.
Our data clearly show that the use of the MDRD-6 formula reduced the center effect. Moreover, when using the MDRD-6 formula and calibrated serum creatinine values, there were no longer significant differences in attainment of PTH treatment targets between the centers.
From these findings we conclude that it is not always valid to use the abbreviated MDRD formula instead of the six-variable MDRD formula, especially in analyses in which GFR plays a central role. Moreover, comparable creatinine assays must always be used. Physicians should be aware of the limitations of formulas for estimating GFR, especially of the MDRD-4 formula, since this formula is extensively used (in The Netherlands and also in many other countries).
Admittedly, although the results may be explained by the better performance of the MDRD-6 formula as measure of real GFR, we cannot exclude that other factors are involved. The MDRD-6 formula incorporates serum albumin concentration. It is known that there is an inverse relationship between plasma calcidiol levels and magnitude of proteinuria, and thus hypoalbuminemia, because of loss of vitamin D metabolites and vitamin D binding protein in the urine in patients with proteinuria [34,35]. Low plasma calcidiol levels are related to higher PTH concentrations [36][37][38][39].
It is well known that hyperparathyroidism often persists for many years after kidney transplantation [40,41]. Treatment with vitamin D compounds is complicated, since hyperparathyroidism in post-transplant patients is usually associated with hypercalcemia [40]. Consequently, mean PTH level was higher in kidney transplant recipients and a smaller proportion of kidney transplant patients achieved the PTH treatment goal. In center A more kidney transplant recipients were treated than in center B.

Insight into and transparency of QoC are becoming more and more important. Various indicators are used to assess hospital performance, for example attainment of treatment targets, as in our study. Benchmarking has become a way to compare and judge treatment centers. Ranking hospitals on the basis of performance indicators is supposed to give health care professionals, insurance companies as well as (associations of) patients insight into QoC. The results of our study question the reliability of these rankings and other hospital performance comparisons.
There are several difficulties associated with hospital performance comparisons: definitions are not always the same [16,42], laboratory assays vary [18], and data quality is variable between hospitals and even within one hospital [16,42,43]. Another problem is patient case mix [16,18,42,43]. Patient age, race, severity of illness, and comorbidity all influence the outcome of care. Retrospective risk adjustment can only partly adjust for all these factors. There will always be additional residual confounding [42,43]. Moreover, random variation has to be taken into account [16,42,43,44,45]. Therefore, when describing results of hospital performance comparisons, confidence intervals should be provided to give insight into the influence of random variation [16,44]. The practice of summarizing several performance indicators in one composite score adds to the unreliability of performance measures since small differences in methods of constructing the composite score can have substantial impact on the results [44]. Finally, whether performance indicators provide a true reflection of QoC is questionable. Performance indicators represent mainly technical aspects, while the humane side of health care and traditional components of caring, essential when talking about QoC, are ignored [46,47].
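The recommendation to report confidence intervals can be illustrated with the proportions of patients exceeding the recommended PTH level given in the abstract (67 of 101 in center A, 34 of 100 in center B). The sketch below uses the Wilson score interval, which is one common choice and is an assumption here; the paper does not prescribe a specific method, and the function name is illustrative.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    for label, above_target, n in (("center A", 67, 101), ("center B", 34, 100)):
        low, high = wilson_interval(above_target, n)
        print(f"{label}: {above_target / n:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Such an interval only quantifies random variation. It says nothing about case mix or assay differences, which in this study turned out to explain the apparent gap between the two centers.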
A major limitation of our study is that we compared only two hospitals. The extent to which the findings can be generalized beyond the centers and cases studied is unknown. Another limitation is the lack of plasma calcidiol levels. Since plasma calcidiol and PTH are inversely related [36][37][38][39], differences in calcidiol levels can have important consequences for PTH concentrations. Other unmeasured factors that may have had an influence are patient compliance, FGF-23 levels, and dietary intake of calcium, phosphate, protein and vitamin D, including over-the-counter vitamin pills.

Conclusions
In conclusion, the observed differences in hospital performance on PTH management between center A and center B in patients with CKD are explained by incomparable patient populations and laboratory techniques, and use of the abbreviated MDRD formula to estimate GFR. We propose the use of the MDRD-6 formula to estimate GFR in analyses when variables are critically dependent on glomerular filtration rate. Our study shows that great caution is required in interpreting the results of hospital performance comparisons. Uncritical use of these measures can result in treatment centers being wrongly classified in terms of performance.

Table 4 legend: The difference in R² between the model with and without treatment center illustrates the contribution of the treatment center to the explained variance. The linear regression coefficient reflects the association between the independent predictor and (Ln)PTH. eGFR is a very important determinant, since for every extra ml filtration per minute, PTH decreases by three percent (based on the final model using MDRD-6 after creatinine conversion). Ln: natural logarithm; PTH: parathyroid hormone (pmol/l); eGFR: estimated glomerular filtration rate (ml/min/1.73 m²); MDRD: modification of diet in renal disease; CI: confidence interval.