Hurdle model

A hurdle model is a class of statistical models in which a random variable is modelled in two parts: the first models the probability of attaining the value 0, and the second models the probability of the non-zero values. The use of hurdle models is often motivated by an excess of zeros in the data that is not sufficiently accounted for by more standard statistical models.

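In a hurdle model, a random variable x is modelled as

$$\Pr(x = 0) = \theta$$
$$\Pr(x = k) = (1 - \theta)\, p_{x \neq 0}(k), \quad k \neq 0$$

where $\theta$ is the probability of a zero and $p_{x \neq 0}$ is a truncated probability distribution function, truncated at 0, so that it places all of its mass on the non-zero values.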
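As a minimal sketch of this two-part structure, the following Python code evaluates a hurdle model for count data, assuming a zero-truncated Poisson distribution for the non-zero part. The function names and the parameters `theta` (the probability of a zero) and `lam` (the Poisson rate) are illustrative choices, not notation taken from the cited papers.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing count k with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def hurdle_poisson_pmf(k, theta, lam):
    """Hurdle model: Pr(x = 0) = theta; positive counts follow a
    zero-truncated Poisson(lam), scaled by (1 - theta)."""
    if k < 0:
        return 0.0
    if k == 0:
        return theta
    # Truncating at 0 renormalises the Poisson mass over k >= 1.
    truncated = poisson_pmf(k, lam) / (1.0 - poisson_pmf(0, lam))
    return (1.0 - theta) * truncated


# The two parts together form a proper distribution (sum is numerically 1).
total = sum(hurdle_poisson_pmf(k, theta=0.6, lam=2.0) for k in range(60))
```

Because the zero probability is set by `theta` alone, the count distribution has no influence on how many zeros the model predicts.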

Hurdle models were introduced by John G. Cragg in 1971,[1] where the non-zero values of x were modelled using a normal model, and a probit model was used to model the zeros. The probit part of the model was said to model the presence of "hurdles" that must be overcome for the values of x to attain non-zero values, hence the designation hurdle model. Hurdle models were later developed for count data, with Poisson, geometric,[2] and negative binomial[3] models for the non-zero counts.

Relationship with zero-inflated models

Hurdle models differ from zero-inflated models in that zero-inflated models model the zeros using a two-component mixture model. With a mixture model, the probability of the variable being zero is determined by both the main distribution $p(x = 0)$ and the mixture weight $\pi$. Specifically, a zero-inflated model for a random variable x is

$$\Pr(x = 0) = \pi + (1 - \pi)\, p(x = 0)$$
$$\Pr(x = h_i) = (1 - \pi)\, p(x = h_i), \quad h_i \neq 0$$

where $\pi$ is the mixture weight that determines the amount of zero-inflation. A zero-inflated model can only increase the probability of $x = 0$ relative to the main distribution, but this is not a restriction in hurdle models.[4]
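The difference can be checked numerically. The sketch below assumes a Poisson main distribution and uses the illustrative names `pi` (mixture weight, between 0 and 1) and `lam` (Poisson rate); neither the names nor the parameter values come from the cited papers.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing count k with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zero_inflated_poisson_pmf(k, pi, lam):
    """Mixture of a point mass at 0 (weight pi) and Poisson(lam) (weight 1 - pi)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


lam = 2.0
baseline_zero = poisson_pmf(0, lam)  # zero probability of the main distribution

# For every admissible mixture weight, the zero probability is at least the baseline.
zero_probs = [zero_inflated_poisson_pmf(0, pi, lam) for pi in (0.0, 0.25, 0.5, 1.0)]
never_below_baseline = all(p >= baseline_zero for p in zero_probs)

# A hurdle model sets Pr(x = 0) directly, so it may fall below the baseline.
hurdle_zero = 0.05
hurdle_deflates = hurdle_zero < baseline_zero
```

With `lam = 2.0` the Poisson baseline zero probability is exp(-2), roughly 0.135, so a hurdle model with a zero probability of 0.05 represents fewer zeros than the Poisson distribution alone, which no choice of `pi` can reproduce.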

References

  1. John G. Cragg (1971). "Some Statistical Models for Limited Dependent Variables with Application to the Demand for Durable Goods". Econometrica, Vol. 39, No. 5 (Sep. 1971), pp. 829–844.
  2. John Mullahy (1986). "Specification and testing of some modified count data models". Journal of Econometrics, Vol. 33, No. 3 (Dec. 1986), pp. 341–365.
  3. A.H. Welsh, R.B. Cunningham, C.F. Donnelly, D.B. Lindenmayer (1996). "Modelling the abundance of rare species: statistical models for counts with extra zeros". Ecological Modelling, Vol. 88, No. 1–3 (July 1996), pp. 297–308.
  4. Yongyi Min & Alan Agresti (2005). "Random effect models for repeated measures of zero-inflated count data". Statistical Modelling, Vol. 5, No. 1 (2005).
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.