Visualizing count data regressions using rootograms

Kleiber, Christian and Zeileis, Achim. (2016) Visualizing count data regressions using rootograms. The American Statistician, 70 (3). pp. 296-303.

Official URL: http://edoc.unibas.ch/43816/

Abstract

The rootogram is a graphical tool associated with the work of J. W. Tukey that was originally used for assessing goodness of fit of univariate distributions. Here we extend the rootogram to regression models and show that this is particularly useful for diagnosing and treating issues such as overdispersion and/or excess zeros in count data models. We also introduce a weighted version of the rootogram that can be applied out of sample or to (weighted) subsets of the data, e.g., in finite mixture models. An empirical illustration revisiting a well-known data set from ethology is included, for which a negative binomial hurdle model is employed. Supplementary materials providing two further illustrations are available online: the first, using data from public health, employs a two-component finite mixture of negative binomial models; the second, using data from finance, involves underdispersion. An R implementation of our tools is available in the R package countreg. It also contains the data and replication code.
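
As a rough illustration of the idea summarized in the abstract, the sketch below computes the quantities behind a hanging rootogram for a plain Poisson fit without covariates, using only the Python standard library. It is not the countreg implementation: the function names `poisson_pmf` and `hanging_rootogram` and the sample data are hypothetical, and the regression case is assumed to differ mainly in that expected frequencies are obtained by summing fitted probabilities over observations.

```python
import math
from collections import Counter


def poisson_pmf(k, lam):
    """Poisson probability P(Y = k) for rate lam > 0, computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def hanging_rootogram(counts, max_count=None):
    """Return rows (k, sqrt_observed, sqrt_expected, bar_bottom) for a Poisson fit.

    Hypothetical helper: each bar hangs from sqrt(expected) and has height
    sqrt(observed), so its lower end sits at sqrt(expected) - sqrt(observed).
    """
    n = len(counts)
    lam = sum(counts) / n  # maximum likelihood estimate of the Poisson rate
    if lam <= 0:
        raise ValueError("need at least one positive count to fit a Poisson rate")
    observed = Counter(counts)
    if max_count is None:
        max_count = max(counts)
    rows = []
    for k in range(max_count + 1):
        sqrt_obs = math.sqrt(observed.get(k, 0))
        sqrt_exp = math.sqrt(n * poisson_pmf(k, lam))
        rows.append((k, sqrt_obs, sqrt_exp, sqrt_exp - sqrt_obs))
    return rows


# Made-up sample with many zeros and one large count (illustration only).
sample_counts = [0, 0, 0, 0, 0, 0, 1, 2, 3, 8]
rows = hanging_rootogram(sample_counts)
for k, sqrt_obs, sqrt_exp, bottom in rows:
    print(f"count={k}  sqrt(obs)={sqrt_obs:.3f}  sqrt(exp)={sqrt_exp:.3f}  bottom={bottom:.3f}")
```

A bar whose lower end falls below zero marks a count that occurs more often than the fitted model expects; with excess zeros this typically shows up at count 0, and with overdispersion also in the right tail.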
Faculties and Departments: 06 Faculty of Business and Economics > Departement Wirtschaftswissenschaften > Professuren Wirtschaftswissenschaften > Ökonometrie und Statistik (Kleiber)
UniBasel Contributors: Kleiber, Christian
Item Type: Article, refereed
Article Subtype: Research Article
Publisher: Taylor & Francis
ISSN: 0003-1305
Note: Publication type according to Uni Basel Research Database: Journal article. This is an Original Manuscript of an article published by Taylor & Francis in The American Statistician in 2016, available online: http://www.tandfonline.com/10.1080/00031305.2016.1173590.
Language: English
Last Modified: 14 Feb 2018 10:02
Deposited On: 16 Dec 2016 08:25
