Multilevel modelling in the analysis of observational datasets in the health care setting

Schwenkglenks, Matthias Michael. Multilevel modelling in the analysis of observational datasets in the health care setting. 2007, Doctoral Thesis, University of Basel, Faculty of Science.


Official URL: http://edoc.unibas.ch/diss/DissB_7939


In health care-related research, many studies centre on identifying risk factors for clinical events of interest that may have economic consequences, or risk factors for increased health care costs. Multivariate regression methods are typically used to analyse such studies and have become central to the efficient control of confounding and the assessment of effect modification. However, most of the data used for this type of research have a hierarchical (multilevel) structure (e.g., patients are frequently nested within treating physicians or study centres). Standard multivariate regression methods tend to ignore this aspect, and it has been shown that this may lead to a loss of statistical efficiency and, in some cases, to wrong conclusions. Multilevel regression modelling is an emerging statistical technique which claims to address this type of data correctly and to make use of its full potential.
The author conducted and/or analysed three observational studies of factors associated with clinical events or cost endpoints of interest. In all cases, conventional regression methods were used for the primary analysis. In a second step, multilevel re-analyses were performed and the results were compared. The first study addressed the effect of exacerbation status, disease severity and other covariates on the disease-specific health care costs of adult Swiss asthma patients. Among other factors, the occurrence of asthma exacerbations was confirmed to be independently associated with higher costs, and to interact with disease severity. The second study addressed the impact of gatekeeping, a technique widely used to manage the use of health care resources, on the health care costs accrued by a general Swiss population. Where previous research findings had been ambiguous, the author's study indicated substantial cost savings through gatekeeping as opposed to fee-for-service based health insurance.
Finally, a combined dataset of six retrospective audits of breast cancer treatment from several Western European countries was used to estimate, for common chemotherapy regimen types, the frequency of chemotherapy-induced neutropenic events and to identify or confirm potential neutropenic event risk factors. Neutropenic events were shown to occur frequently in routine clinical practice. Several factors, including age, chemotherapy regimen type, planned chemotherapy dose intensity, and planned number of chemotherapy cycles, were shown to be potentially important elements of neutropenia risk models.
Multilevel re-analysis showed higher level variation (i.e., variation at the level of the treating physicians or study centres) to be present in the asthma dataset and in the neutropenia dataset, but not in the gatekeeping dataset. In the asthma and neutropenia datasets, multilevel modelling made it possible to quantify the amount of higher level variation; to identify its sources; to identify spurious findings by analysing influential higher level units; to achieve a gain in statistical precision; and to achieve a modest gain in predictive ability for out-of-sample observations whose corresponding higher level units contributed to model estimation. The main conclusions of the conventional analyses were confirmed.
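As an illustration of what quantifying higher level variation can mean, the simplest multilevel model is a two-level random intercept model. The notation below is a generic textbook sketch and an assumption of this note, not the specification used in the thesis, whose cost and event endpoints may well have required other distributions or link functions. Here $i$ indexes patients, $j$ indexes higher level units (physicians or study centres), and $x_{ij}$ stands for a single patient-level covariate.

```latex
% Two-level random intercept model (illustrative notation only)
\begin{align}
  y_{ij} &= \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}, \\
  u_j    &\sim N(0, \sigma_u^2), \qquad e_{ij} \sim N(0, \sigma_e^2), \\
  \rho   &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}.
\end{align}
```

In this sketch, $\sigma_u^2$ is the higher level variance and $\rho$, the intraclass correlation, is the share of the residual variation that lies between higher level units. A value of $\rho$ close to zero corresponds to the situation reported for the gatekeeping dataset, where no higher level variation was found.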
Based on these findings and in conjunction with published sources, it is concluded that multilevel modelling should be used systematically where hierarchical data structures are present, except if the higher level units must be regarded as distinct, unrelated entities or if their number is very small. Erroneous inferences will thus become less likely. Moreover, multilevel modelling is the only technique to date which allows hypotheses at different hierarchical levels, and hypotheses involving several levels, to be tested efficiently and simultaneously. In the author's opinion, multilevel analysis is of particular interest where characteristics of health care providers, and clinical practice patterns in particular, may affect health outcomes or health economic outcomes. By the same argument, multilevel modelling should also be used in multi-centre studies (including randomised clinical trials) to take into account study centre-specific characteristics and behaviours.
In many instances, a tentative application of the technique will rule out the presence of substantial higher level variation. If so, one can revert to simpler methods.
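A tentative check of this kind can be sketched with a simple moment estimator. The Python example below uses only the standard library and estimates the intraclass correlation from a one-way analysis of variance on simulated data. It is a minimal sketch under stated assumptions and is not the procedure used in the thesis: it does not fit a multilevel model, the function names `anova_icc` and `simulate` are hypothetical, and all numbers are invented for illustration.

```python
import random
from statistics import mean


def anova_icc(groups):
    """One-way ANOVA (moment) estimate of the intraclass correlation.

    groups: list of lists, one inner list of outcomes per higher level unit.
    This is a rough screening statistic, not a fitted multilevel model.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    total = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / total
    group_means = [mean(g) for g in groups]

    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, group_means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (total - k)

    # Effective unit size; equals the common size when the design is balanced.
    n0 = (total - sum(n * n for n in sizes) / total) / (k - 1)

    # Truncate at zero: a negative estimate means no detectable unit variance.
    var_between = max(0.0, (ms_between - ms_within) / n0)
    return var_between / (var_between + ms_within)


def simulate(n_units, n_per_unit, sd_unit, sd_patient, seed):
    """Simulate outcomes for patients nested within units (invented data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_units):
        unit_effect = rng.gauss(0.0, sd_unit)
        data.append(
            [10.0 + unit_effect + rng.gauss(0.0, sd_patient) for _ in range(n_per_unit)]
        )
    return data


# Assumed scenario A: true ICC = 1 / (1 + 4) = 0.2.
clustered = simulate(40, 25, sd_unit=1.0, sd_patient=2.0, seed=1)
# Assumed scenario B: no higher level variation at all.
unclustered = simulate(40, 25, sd_unit=0.0, sd_patient=2.0, seed=1)

icc_clustered = anova_icc(clustered)
icc_unclustered = anova_icc(unclustered)
print(f"ICC with unit effects:    {icc_clustered:.3f}")
print(f"ICC without unit effects: {icc_unclustered:.3f}")
```

An estimate near zero, as in the second scenario, would support falling back on a conventional single-level regression. A clearly positive estimate would argue for a full multilevel analysis, ideally confirmed with a formal test in dedicated software.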
Besides some technical issues, the main disadvantage of multilevel modelling is the complexity of the modelling process and of correctly interpreting the results. A careful approach is therefore needed. Multilevel modelling can be applied to datasets post hoc, as the author has done, but superior results can be expected from studies which are planned with the requirements of multilevel analysis (e.g., appropriate sample size, collection of relevant covariates at all hierarchical levels) in mind.
Advisors: Tanner, Marcel
Committee Members: Szucs, Thomas D. and Schumacher, Martin
Faculties and Departments: 09 Associated Institutions > Swiss Tropical and Public Health Institute (Swiss TPH) > Former Units within Swiss TPH > Molecular Parasitology and Epidemiology (Beck)
UniBasel Contributors: Tanner, Marcel
Item Type: Thesis
Thesis Subtype: Doctoral Thesis
Thesis no: 7939
Thesis status: Complete
Number of Pages: 198
Last Modified: 22 Jan 2018 15:50
Deposited On: 13 Feb 2009 16:06
