WPP Thesis
Name: Mithilesh Dronavalli
WPP Thesis Semester 1 2008
Supervisors: Project 1) Prof. H. Krum, Dr B. Billah
Project 2) Dr S. Shakib, Prof A. Esterman, Dr K. McCaul
WPP Project 2 units Mithilesh Dronavalli Page 1 of 65
Preface
My name is Mithilesh Dronavalli, and the following is my Work Placement Project
report for the Masters of Biostatistics at Monash University. I have a medical
background: I completed three years of medicine and an honours year in applied
biostatistics, analysing cardiovascular clinical trials. Along the way the
relevant subjects I completed were first-year mathematics and an applied
biostatistics course that introduced all the major analyses. I then enrolled in
the Masters of Biostatistics (January 2007) and, after a semester, co-enrolled
in an MPhil, which involved more applied biostatistics: a lip cancer cohort
study using survival analysis and a meta-analysis of lip cancer studies. The
MPhil is still in progress. I have very little background in mathematics and
theoretical statistics; my main strengths are medical insight and the
experience gained from the MBios.
My role in this work involved statistical analysis, developing the methodology
for the required work with some guidance from the statistical supervisors, and
interpreting the results for clinicians and other collaborators.
Communication played a very important role in this project, as it was mostly
done via correspondence (phone and email). I implemented methodologies
acquired from my MPhil that are not covered in the MBios, including risk
models.
For my WPP I did two projects, both loosely under the umbrella of cardiology
and clinical pharmacology. My first project was given to me by Prof. Henry Krum
of the DEPM, Monash University. He is a clinical pharmacologist. The problem
was to see whether drugs used for heart failure had better or worse efficacy in
preventing mortality in patients with compromised renal status. There were five
drug types involved, with numerous trials for each, whose articles I had to
retrieve from Medline. I pooled the data where it was in the correct form and
obtained the relevant odds ratios (OR) using multiple logistic regression in
Stata. I presented a poster of this work at the Asian Pacific Heart Failure
Conference in Jan 2008. The poster text is copied and pasted into the project
report.
The second project was given to me by Dr Sepehr Shakib, who is head of
clinical pharmacology at the Royal Adelaide Hospital (RAH). This project made
up 75% of the work, whereas the first made up 25%. The second project is a
10-year cohort study of all heart failure admissions at the RAH, with survival
times for death and readmission data. The database has socioeconomic,
comorbidity, pharmacological, echocardiographic and biochemical variables. I
did univariate and multivariate modelling for groups of variables, and also
fitted a full model, predicting time to death and time to readmission using
survival analysis and recurrent event survival analysis. I had to clean the
data and format it so that it would be suitable for both survival analysis and
recurrent event survival analysis. I did a recurrent event survival analysis to
build a risk model for readmissions that used the same themes (socioeconomic
and comorbidity groups of covariates) as the final risk model for survival.
This was done to compare the two outcomes, survival and readmission, to see
whether the predicting covariates were similar or contrasting, and to compare
the effects of these covariates.
I used backward and forward stepwise selection with the swaic command to
obtain the final models. Dr McCaul, my second statistical supervisor, told me
that my modelling might lead to overfitting and suggested bootstrapping the
data (100 runs, with a variable accepted if it was in the model in at least 75
runs). It must be noted that bootstrapping was not used for the recurrent event
model, as the automated selection procedure swaic did not function properly in
this setting.
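The acceptance rule described above (100 bootstrap runs, keep a variable if it is selected in at least 75) can be sketched in Python as follows. This is only an illustration, not the Stata program used in this project: toy_selector is a hypothetical stand-in for the automated selection step, and the rows are made-up data.

```python
import random
from collections import Counter

def bootstrap_inclusion(rows, select_variables, n_runs=100, min_runs=75, seed=0):
    """Count how often each variable is selected across bootstrap resamples."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_runs):
        # Resample patients with replacement, keeping the original sample size.
        sample = [rng.choice(rows) for _ in rows]
        counts.update(set(select_variables(sample)))
    # Accept a variable only if it was selected in at least min_runs runs.
    kept = sorted(v for v, c in counts.items() if c >= min_runs)
    return kept, counts

def toy_selector(sample):
    """Hypothetical selection rule: keep a binary variable if the death
    proportion differs by more than 0.2 between its two levels."""
    chosen = []
    for var in ("age", "noise"):
        exposed = [r["died"] for r in sample if r[var] == 1]
        unexposed = [r["died"] for r in sample if r[var] == 0]
        if exposed and unexposed:
            diff = abs(sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed))
            if diff > 0.2:
                chosen.append(var)
    return chosen

# Made-up data: "age" is strongly related to death, "noise" is not.
rows = []
for i in range(200):
    age = i % 2
    noise = (i // 2) % 2
    died = 1 if (age == 1 and i % 5 != 0) or (age == 0 and i % 7 == 0) else 0
    rows.append({"age": age, "noise": noise, "died": died})

kept, counts = bootstrap_inclusion(rows, toy_selector)
print(kept, dict(counts))
```

In a real analysis the selector would refit the model on each resample; the 75-run threshold is the one suggested by the supervisor.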
In general, as I modelled the readmissions data, I found that the high-risk
patients who died earlier had fewer readmissions than patients with mild or
moderate disease, who were readmitted regularly over time. So the high-risk
cohorts for readmission and for survival differed markedly.
I did the bootstrapping with handwritten programs that I am including in the
appendix. My supervisor helped me with the postfile command but I did the rest
myself. I chose the type of analyses to perform in both projects and carried
them out myself with reassurances from my statistical supervisors.
Ethical Considerations
The first project was a pooled meta-analysis, and data were obtained from the
randomised controlled trial articles. I obtained the SOLVD (Studies of Left
Ventricular Dysfunction) dataset from one of the study authors, on condition
that I did not freely disclose the data or try to contact the patients in any
way. Another trial's results were only
available to Prof. Krum as he was an investigator in the study so confidentiality
had to be kept with those documents.
For the second project I was dealing with confidential hospital data and had to
give an undertaking to the Royal Adelaide Hospital that I would not disclose
the data in any way or contact the patients. The hospital obtained all the
required ethical clearances before I was able to access the data.
Work Patterns
I finished the first project in the summer. I initially met Prof. Krum and Dr.
Billah to discuss the project and met them once again halfway through. In
between we had phone conversations, at times as often as every 2 days, until
the work was completed. Much of the discussion concerned the methodology, and
my suggestions for the analysis were approved. I did the final analyses and
sent the dataset and results to Dr. Billah, as he wanted to verify the work
because it was being sent to a conference. He accepted the work without any
changes. Note that I had to construct the data from summary measures from
different trials.
For the second project I was introduced to the project at a conference, and the
rest was done by weekly phone calls to the clinical and statistical supervisors
(sometimes more often than that, depending on the stage the work had reached).
This project was done from March until the end of July 2008.
Statistical principles, methods and computing
I used Stata for all analyses and I wrote all the code for all the analyses except
for a postfile command in the bootstrapping program which my supervisor did.
I had to read the literature, search for all relevant trials and use the
datasets at my disposal to build a dataset for Angiotensin Converting Enzyme
inhibitors (ACEi) and one for beta-blockers (BB). I initially tried to do the
analyses from the summary odds ratios and developed some original formulas, but
this became too complicated, so I constructed the datasets and used logistic
regression. For this project the Categorical Data Analysis (CDA) unit was very
useful.
In the second project I received the data, but I had to do a lot of cleaning
and some substantial manipulation to make it fit for analysis. I used a
combination of Access, Excel and Stata for the data manipulation. This took a
lot of time, especially the regrouping of categorical variables and the
reformatting of the readmission data.
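As an illustration of what reformatting readmission data for recurrent event analysis can involve, the Python sketch below splits one patient's follow-up into (start, stop, event) intervals, a counting-process layout that is commonly used for recurrent event models. The report does not state the exact layout used, so the function name, arguments and the choice of layout are assumptions.

```python
def to_counting_process(admission_days, end_day):
    """Split one patient's follow-up into (start, stop, event) intervals.

    admission_days: days from the index admission to each readmission
    end_day: day of death or end of follow-up
    event is 1 if the interval ends with a readmission, otherwise 0.
    """
    rows = []
    start = 0
    for day in sorted(admission_days):
        if day <= start or day > end_day:
            continue  # skip same-day duplicates and dates outside follow-up
        rows.append((start, day, 1))
        start = day
    if end_day > start:
        rows.append((start, end_day, 0))  # final interval without a readmission
    return rows

# Hypothetical patient: readmitted on days 30 and 90, followed to day 200.
print(to_counting_process([30, 90], 200))
```

Each patient then contributes one row per interval, which is the shape that recurrent event survival routines generally expect.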
After fitting the models using survival analysis and recurrent event survival
analysis I had a change of supervisor (recommended by the original supervisor
because of the new supervisor's expertise in the area). He suggested
bootstrapping and repeating the analysis. I wrote the bootstrapping programs
for all the work, with help from the second supervisor, Dr. McCaul, on the
postfile command. Future directions in the work include Poisson models and
handling missing data using imputation.
I wrote the second project in an article format to compare risk models
developed using time to death as opposed to time to readmission. This allows us
to understand the predictors of survival and compare them to the predictors of
a high burden on the hospital.
The Survival Analysis (SVA) unit helped me greatly in doing the second project.
I had used survival analysis previously in an honours project but did not have
a real understanding of it. I used the risk model approach that I learned in my
concurrent MPhil, which is a masters by research degree in biostatistics; this
approach was not covered in the Masters of Biostatistics (MBios). Recurrent
event survival analysis was covered lightly in SVA, but I already had good
experience of it from my Honours work and applied it well in this project. I
did not dwell on recurrent event models, as the depth required for this project
had already been met according to my supervisors.
In conclusion, I thoroughly enjoyed this project. It has allowed me to apply my
biostatistical skills, strengthen my communication skills and improve my
research writing skills. I may get some publications from this project and have
already had one poster presentation from the first project. I had a variety of
supervisors who devoted considerable time, effort and resources to my work,
allowing me to do research that is clinically applicable and statistically
acceptable. My professional networking skills in communication and statistical
consultancy have improved as well. I now do research with
collaborators around the world, and I am currently working with confidence on a
project with a Professor of Neurosurgery in Spain.
Meta-analysis of ACE inhibitor and beta-blocker CHF trials on mortality
with regard to creatinine levels.
Location and Dates
This project was carried out from home in Sydney, in correspondence with
Prof. Krum and Dr. Billah, who work at the Department of Epidemiology and
Preventive Medicine at Monash University. The work was done intermittently
from July 2007 to February 2008, when the results were presented at the Asian
Pacific Heart Failure Conference held in Melbourne in February 2008.
Context
The efficacy of various heart failure treatments is unclear in renally
compromised patients. Data in the correct format are available only for
ACE inhibitors and beta-blockers, each with 2 trials. The idea of doing this
search and addressing this problem came from Prof. Krum, who is a clinical
pharmacologist with a special interest in cardiology.
Contribution of Student
Secondary data collection
Data management and manipulation
Choosing methodology
Statistical analysis
Presentation and interpretation of results and communication of these to
the research team.
Preparing the poster and attending the conference to answer questions
Statistical Issues Involved
Collecting data from different trial articles and their respective datasets where
available. Generating dataset from data summaries. Logistic Regression and
assumption testing. Calculating interaction odds ratios and developing standard
error formulas for these. Calculating data summaries for the trial for
presentation.
Signed Declaration by Student
I declare the work placement report presented here describes my own work
unless otherwise specified and contains no material previously published or
written by another person, except where due reference is made in the text.
Project Report 1
Background:
The effect of heart failure drugs in heart failure patients with renal
insufficiency or renal failure is poorly understood. The differential
therapeutic effect of heart failure drugs in patients classified by their serum
creatinine or creatinine clearance (measures of renal insufficiency) is
investigated via pooled meta-analysis of relevant randomised controlled trials.
Methods:
All major randomised controlled trials on the efficacy of individual heart
failure drugs were investigated. The heart failure drugs assessed were ACE
inhibitors, Angiotensin Receptor Blockers (ARBs), beta-blockers, aldosterone
antagonists and digoxin.
Data were collected where mortality was given as counts classified by treatment
or placebo and by whether the patient had high or low serum creatinine (or
creatinine clearance), where the cutoff was the median. Sources of these data
included the trial datasets, trial co-ordinators' results books and any trial
publications.
The results were used to construct a dichotomous dataset for each available
drug class (ACEi and beta-blockers) in a pooled meta-analysis. Multiple
logistic regression was fitted with mortality as the outcome; the predictors
were renal status, treatment and their interaction term (renal status X
treatment), adjusted for a trial variable indicating which trial the data came
from. This adjusts for the trials when they are heterogeneous. Assumptions for
the logistic regression model were also tested.
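Constructing a dichotomous dataset from published counts amounts to writing out one record per patient for every combination of trial, treatment, renal status and outcome. A minimal Python sketch of that step is given below; the analysis itself was done in Stata, and the cell counts here are hypothetical, not data from the trials.

```python
def expand_counts(cells):
    """Turn summary cells into one record per patient.

    Each cell is (trial, treated, poor_renal, died, n), where n is the
    number of patients with that combination.
    """
    records = []
    for trial, treated, poor_renal, died, n in cells:
        for _ in range(n):
            records.append({
                "trial": trial,
                "treated": treated,
                "poor_renal": poor_renal,
                # interaction term: renal status X treatment
                "interaction": treated * poor_renal,
                "died": died,
            })
    return records

# Hypothetical counts for a single made-up trial.
example_cells = [
    ("trial_A", 0, 1, 1, 20), ("trial_A", 0, 1, 0, 80),
    ("trial_A", 0, 0, 1, 30), ("trial_A", 0, 0, 0, 170),
    ("trial_A", 1, 1, 1, 10), ("trial_A", 1, 1, 0, 90),
    ("trial_A", 1, 0, 1, 20), ("trial_A", 1, 0, 0, 180),
]
records = expand_counts(example_cells)
print(len(records), sum(r["died"] for r in records))
```

The resulting records carry the outcome (died) and the predictors named in the Methods (treated, poor_renal, interaction, trial), so they can be passed to any logistic regression routine.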
Results:
The mortality distribution and trial particulars for the four available heart
failure trials are given in Table 1. Tables 2 and 3 give the results of the
logistic regression models for the pooled meta-analyses of beta-blockers and
ACE inhibitors respectively.
For the ACEi trials the follow-up was made exactly the same, as the SOLVD
dataset was available. The beta-blocker trials also had similar follow-up
periods.
In the beta-blocker analysis, beta-blockers were protective against mortality
compared with placebo. Poor renal status was associated with higher mortality
regardless of beta-blocker therapy. Beta-blockers did not selectively benefit
either poor renal status patients or good renal status patients. The trials did
not differ significantly (non-significant P values).
Similarly, ACE inhibitors were protective against mortality compared with
placebo. Poor renal status was also a predictor of mortality regardless of
ACEi therapy, and ACEi therapy did not selectively benefit poor renal status
patients or good renal status patients.
There was no overdispersion in either dataset, and both analyses had
non-significant goodness-of-fit tests, indicating a good fit for the model.
                 Mortality, placebo   Mortality, treatment
                 Poor      Good       Poor      Good
Trial            renal     renal      renal     renal        Cutoff      n       Follow-up

ACE inhibitor trials (both Enalapril; median)
SOLVD            9%        5%         7%        4%           120umol/L   6,655   6 months
CONSENSUS        50%       38%        24%       29%          120umol/L   258     6 months

Beta-blocker trials (CIBIS II: Bisoprolol; COPERNICUS: Carvedilol)
CIBIS II         23%       14%        15%       10%          6ml/min     2,647   1-3 years
COPERNICUS       20%       13%        14%       8%           125umol/L   2,286   10.4 months

Table 1: Mortality distribution in heart failure trials
Mortality      OR       Std. Err.   P>z      [95% Conf. Interval]
Bblocker       0.627    0.073
Poor renal     1.728    0.189

Mortality      OR       Std. Err.   P>z      [95% Conf. Interval]
ACEi           0.762    0.099       0.036    0.591    0.982
Poor renal     1.959    0.275

                        =6.0ml/min
Died           97       131         228
Not Died       318-***-****
               0.234    0.145
OR             1.802
se(logOR)      0.150
EF             1.341
L95%CI         1.344    U95%CI      2.416
logOR          0.589
Table 1

Bisoprolol
                        =6.0ml/min
Died           67       89          156
Not Died       367-***-****
OR             1.649
se(logOR)      0.174
EF             1.405
L95%CI         1.174    U95%CI      2.318
logOR          0.500
Table 2

ORMH = Q/R     Q = sigma d1i*h0i/ni
               R = sigma d0i*h1i/ni
Q              97.471
R              56.173
OR M-H         1.735
V              71.551
seMH           0.114
EF             1.251
L95%CI(MH)     1.387    U95%CI(MH)  2.170968
Difflogor      -0.089   diff se     0.229
diffOR         0.915    diff EF     1.567
L95%CI         0.584    U95%CI      1.434
Table 3
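The last rows of Table 3 compare the two odds ratios through the difference of their logs. A short Python sketch of that arithmetic is given below, using the rounded logOR and se(logOR) values reported in Tables 1 and 2. It assumes the two groups are independent, so that the variances add; the function name is illustrative, and because the inputs are rounded the results agree with Table 3 only to about two or three decimal places.

```python
import math

def ratio_of_odds_ratios(log_or_a, se_a, log_or_b, se_b, z=1.96):
    """Compare two independent odds ratios on the log scale."""
    diff = log_or_b - log_or_a               # difference of log odds ratios
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)  # variances add if independent
    ef = math.exp(z * se_diff)               # error factor
    ratio = math.exp(diff)
    return {
        "diff_log_or": diff,
        "se": se_diff,
        "ratio": ratio,
        "lower": ratio / ef,
        "upper": ratio * ef,
    }

# Reported values: Table 1 (logOR 0.589, se 0.150) and Table 2 (logOR 0.500, se 0.174).
result = ratio_of_odds_ratios(0.589, 0.150, 0.500, 0.174)
print({k: round(v, 3) for k, v in result.items()})
```

A confidence interval for the ratio that includes 1 corresponds to no evidence that the odds ratio differs between the two groups.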
Formulas:
OR = (a*d)/(b*c)   se(logOR) = (1/a+1/b+1/c+1/d)^0.5   EF = exp(se(logOR)*1.96)
L95%CI = OR/EF   U95%CI = OR*EF   logOR = ln(OR)
Q = (a*d/(a+b+c+d){in placebo} + a*d/(a+b+c+d){in treatment})
R = (b*c/(a+b+c+d){in placebo} + b*c/(a+b+c+d){in treatment})
OR M-H = Q/R
V = sum over placebo and treatment of
    (Died0+Died1)*(Not Died0+Not Died1)*(Poor Renal0+Poor Renal1)*(Good
    Renal0+Good Renal1)/ (Trial sample size^2*(Trial sample size-1))
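The formulas above can be checked with a few lines of Python. In this sketch a and b are the numbers who died (poor and good renal status) and c and d the numbers who did not die, so that OR = ad/(bc). The example counts are hypothetical. The standard error of the log Mantel-Haenszel odds ratio is taken as the square root of V/(Q*R); this expression is not written out in the formulas above, but it reproduces the reported seMH of 0.114 from the reported Q, R and V.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """a, b = died (poor, good renal); c, d = not died (poor, good renal)."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ef = math.exp(z * se_log_or)  # error factor
    return odds_ratio, se_log_or, odds_ratio / ef, odds_ratio * ef

def mh_se(q, r, v):
    """Standard error of log(OR M-H), assumed here to be sqrt(V / (Q * R))."""
    return math.sqrt(v / (q * r))

def mantel_haenszel(tables, z=1.96):
    """Pool 2x2 tables (a, b, c, d), one per stratum (placebo, treatment)."""
    q = r = v = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        q += a * d / n
        r += b * c / n
        # (total died) * (total not died) * (total poor) * (total good) / (n^2 (n-1))
        v += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    or_mh = q / r
    se = mh_se(q, r, v)
    ef = math.exp(z * se)
    return or_mh, se, or_mh / ef, or_mh * ef

# Hypothetical strata, not the trial data.
tables = [(20, 30, 80, 170), (10, 20, 90, 180)]
print(odds_ratio_ci(*tables[0]))
print(mantel_haenszel(tables))

# Check against the reported summary values Q, R and V.
print(round(97.471 / 56.173, 3), round(mh_se(97.471, 56.173, 71.551), 3))
```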
Poor Renal
                              OR       Std. Err.  z       P>z     [95% Conf. Interval]
Bblocker                      0.654    0.096      -2.9    0.004   0.491     0.871
high_creat                    1.802    0.270      3.94

Variable   Level              Ratio    Std. Err.  z       P>z     [95% Conf. Interval]  P comb.
Marital    divorced           0.901    0.122      -0.77   0.441   0.691     1.174       0.0235
           unknown            0.662    0.141      -1.94   0.052   0.436     1.003
           nvr married        1.133    0.119      1.19    0.235   0.922     1.392
           separated          2.195    0.802      2.15    0.031   1.073     4.490
           widowed            0.919    0.120      -0.65   0.518   0.711     1.188
cob_fixed  Asia               0.777    0.239      -0.82   0.413   0.425     1.421       0.0131
           easteu             0.995    0.117      -0.04   0.965   0.790     1.253
           mideast            1.177    0.400      0.48    0.631   0.605     2.293
           other              1.524    0.209      3.07    0.002   1.165     1.994
           Uk                 0.822    0.084      -1.93   0.054   0.673     1.003
           westeu             0.914    0.081      -1.01   0.313   0.768     1.088
nok_fixed  child              1.061    0.087      0.72    0.469   0.903     1.246       0.0519
           friend             1.085    0.149      0.6     0.551   0.829     1.421
           other              1.123    0.152      0.86    0.391   0.861     1.464
           other rel          1.273    0.172      1.78    0.074   0.976     1.660
           parent             1.412    0.430      1.13    0.257   0.777     2.565
           sibling            1.629    0.252      3.15    0.002   1.203     2.206
fun_fixed  DVA                1.171    0.144      1.28    0.199   0.920     1.491       0.0538
           Prvt hlth care     0.821    0.085      -1.9    0.058   0.669     1.007
DCNH                          1.387    0.148      3.06    0.002   1.125     1.711
Ihd                           1.162    0.075      2.32    0.021   1.023     1.319
Hypertension                  0.875    0.058      -2.04   0.042   0.769     0.995
Dementia                      1.286    0.181      1.79    0.073   0.977     1.694
acute RF                      1.552    0.233      2.93    0.003   1.156     2.083
chronic RF                    1.327    0.113      3.32    0.001   1.123     1.568
Cervascdisease                1.426    0.179      2.83    0.005   1.115     1.823
Otherpvd                      1.395    0.139      3.33    0.001   1.147     1.696
Anaemia                       1.330    0.130      2.93    0.003   1.099     1.610
Coad                          1.279    0.099      3.18    0.001   1.099     1.488
Female                        0.786    0.054      -3.53

fail = 1127
Variable   Level              Coef.    Std. Err.  z       P>z     [95% Conf. Interval]  Combined P value
amiodarone                    -0.155   0.086      -1.79   0.073   -0.324    0.014
female                        -0.253   0.070      -3.61   0       -0.390    -0.116
nicorandil                    1.287    0.324      3.97    0       0.651     1.922
thiazide                      0.580    0.132      4.41    0       0.322     0.838
acuterenalfailure             0.623    0.144      4.32    0       0.341     0.906
cervascdisease                0.439    0.126      3.49    0       0.192     0.685
longnitrate                   0.304    0.071      4.31    0       0.166     0.442
perhexiline                   0.667    0.142      4.71    0       0.389     0.944
dihidropyridine               -0.245   0.101      -2.42   0.016   -0.444    -0.046
coad                          0.184    0.085      2.17    0.03    0.018     0.350
statin                        -0.247   0.083      -3      0.003   -0.409    -0.086
atenolol                      -0.503   0.130      -3.87   0       -0.757    -0.248
dcnh                          0.382    0.095      4.02    0       0.196     0.568
otherpvd                      0.265    0.100      2.66    0.008   0.070     0.461
nok_fixed  child              0.114    0.107      1.07    0.285   -0.095    0.324       0.0018
           friend             -0.005   0.172      -0.03   0.978   -0.341    0.332
           other              -0.164   0.170      -0.96   0.335   -0.498    0.170
           other rel          0.371    0.155      2.4     0.017   0.067     0.674
           parent             0.332    0.316      1.05    0.293   -0.287    0.951
           sibling            0.546    0.185      2.95    0.003   0.184     0.909
anticholinergic               0.395    0.124      3.19    0.001   0.152     0.638
sulphonylurea                 0.096    0.085      1.13    0.259   -0.071    0.264
warfarin                      0.903    0.375      2.41    0.016   0.168     1.638
marital    divorced           0.104    0.164      0.64    0.524   -0.216    0.425       0.0045
           unknown            0.703    0.210      3.35    0.001   0.292     1.114
           nvr married        0.134    0.159      0.85    0.397   -0.176    0.445
           separated          0.093    0.257      0.36    0.717   -0.411    0.598
           widowed            -0.108   0.102      -1.06   0.291   -0.308    0.092
fun_fixed  DVA                0.118    0.121      0.97    0.331   -0.119    0.355       0.0226
           Prvt hlth care     -0.264   0.107      -2.46   0.014   -0.474    -0.053
dementia                      0.328    0.141      2.32    0.02    0.051     0.604
digoxin                       0.196    0.064      3.08    0.002   0.071     0.320
nsaid                         0.688    0.330      2.09    0.037   0.042     1.334
age_50_less                   0.150    0.185      0.81    0.418   -0.212    0.512       P