Count Models

John Poe

July 6, 2016

Count Models

When Count Models Go Bad

R Models

If you want to know more

How to Model A Fishing Trip

Poisson Distributions

Counts 3 Ways

When Count Models Go Bad: Overdispersion

When Count Models Go Bad: Zero Inflation

When Count Models Go Bad: Zero Truncation

When Count Models Go Bad: Multilevel Models

Model Example

Load the data set policy and check it with head(policy)

##        state id year citi_ideo gov_ideo_nom govparty_c policy
## 1    alabama  1 2004   39.1434      42.4537          0      0
## 2     alaska  2 2004   43.5494      31.2409          0      0
## 3    arizona  3 2004   47.7508      51.3161          1      0
## 4   arkansas  4 2004   52.6281      43.8843          0      0
## 5 california  5 2004   59.1004      49.1700          0      0
## 6   colorado  6 2004   54.6555      16.2838          0      1

Histogram of the Count

## Loading required package: Matrix
## 
## Attaching package: 'lme4'
## The following object is masked from 'package:nlme':
## 
##     lmList
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge with max|grad| = 0.00189097 (tol =
## 0.001, component 1)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?
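
The warning above suggests rescaling the predictors. As a minimal sketch of one common way to do that in base R, the block below standardizes the two ideology scores with scale(), using only the six rows printed by head(policy). The `_z` column names are hypothetical, and the output does not show whether standardizing would resolve the warning for this particular model.

```r
# First six rows of the two ideology scores, copied from head(policy) above.
policy_head <- data.frame(
  state        = c("alabama", "alaska", "arizona", "arkansas", "california", "colorado"),
  citi_ideo    = c(39.1434, 43.5494, 47.7508, 52.6281, 59.1004, 54.6555),
  gov_ideo_nom = c(42.4537, 31.2409, 51.3161, 43.8843, 49.1700, 16.2838)
)

# scale() centers each column at 0 and divides by its standard deviation.
# The "_z" column names are hypothetical, introduced only for this sketch.
policy_head$citi_ideo_z    <- as.numeric(scale(policy_head$citi_ideo))
policy_head$gov_ideo_nom_z <- as.numeric(scale(policy_head$gov_ideo_nom))

# Check: each standardized column should have mean 0 and standard deviation 1.
z_cols <- policy_head[, c("citi_ideo_z", "gov_ideo_nom_z")]
print(round(colMeans(z_cols), 10))
print(apply(z_cols, 2, sd))
```

In a real analysis the same transformation would be applied to the full policy data before fitting, and coefficients would then be read per standard deviation rather than per raw point.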

Poisson Results

## Generalized linear mixed model fit by maximum likelihood (Adaptive
##   Gauss-Hermite Quadrature, nAGQ = 25) [glmerMod]
##  Family: poisson  ( log )
## Formula: policy ~ citi_ideo + gov_ideo_nom + govparty_c + (1 | id)
##    Data: policy
## 
##      AIC      BIC   logLik deviance df.resid 
##    446.2    465.5   -218.1    436.2      345 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.4597 -0.7523 -0.4899  0.6148  4.2825 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  id     (Intercept) 0.1003   0.3167  
## Number of obs: 350, groups:  id, 50
## 
## Fixed effects:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -2.329030   0.337739  -6.896 5.35e-12 ***
## citi_ideo     0.038062   0.007127   5.341 9.26e-08 ***
## gov_ideo_nom -0.010572   0.006321  -1.673  0.09441 .  
## govparty_c    0.767968   0.240927   3.188  0.00143 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) citi_d gv_d_n
## citi_ideo   -0.778              
## gov_ideo_nm  0.173 -0.700       
## govparty_c  -0.305  0.561 -0.794
## convergence code: 0
## Model failed to converge with max|grad| = 0.00189097 (tol = 0.001, component 1)
## Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?
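
Because the model uses a log link, the fixed effects above are on the log-count scale. As a rough sketch of how they might be turned into rate ratios, the block below exponentiates the estimates and forms Wald 95% intervals from the reported standard errors. The numbers are copied from the summary; the object names are illustrative, and the intervals inherit whatever problems the convergence warnings point to.

```r
# Fixed-effect estimates and standard errors copied from the Poisson summary above.
est <- c("(Intercept)" = -2.329030, citi_ideo = 0.038062,
         gov_ideo_nom = -0.010572, govparty_c = 0.767968)
se  <- c("(Intercept)" = 0.337739, citi_ideo = 0.007127,
         gov_ideo_nom = 0.006321, govparty_c = 0.240927)

# Wald interval on the log scale, then exponentiate to the rate-ratio scale.
z <- qnorm(0.975)
irr <- data.frame(
  rate_ratio = exp(est),
  lower_95   = exp(est - z * se),
  upper_95   = exp(est + z * se)
)
print(round(irr, 3))
```

Read this way, a one-point increase in citi_ideo multiplies the expected count by about 1.04, and govparty_c = 1 roughly doubles it, holding the state's random intercept fixed.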

Negative Binomial Results

## Generalized linear mixed model fit by maximum likelihood (Adaptive
##   Gauss-Hermite Quadrature, nAGQ = 25) [glmerMod]
##  Family: Negative Binomial(0.26)  ( log )
## Formula: policy ~ citi_ideo + gov_ideo_nom + govparty_c + (1 | id)
##    Data: policy
## 
##      AIC      BIC   logLik deviance df.resid 
##    164.6    187.8    -76.3    152.6      344 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.4916 -0.4243 -0.3556  0.2650  3.8743 
## 
## Random effects:
##  Groups Name        Variance  Std.Dev. 
##  id     (Intercept) 6.764e-13 8.225e-07
## Number of obs: 350, groups:  id, 50
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -2.37868    0.53408  -4.454 8.44e-06 ***
## citi_ideo     0.04180    0.01294   3.231  0.00123 ** 
## gov_ideo_nom -0.01326    0.01109  -1.195  0.23193    
## govparty_c    0.80045    0.42298   1.892  0.05844 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) citi_d gv_d_n
## citi_ideo   -0.803              
## gov_ideo_nm  0.273 -0.746       
## govparty_c  -0.338  0.581 -0.809

Comparisons of Fixed Effects

Model Comparisons

                        Poisson      NB
State Ideology          0.04***      0.04***
                        (0.01)       (0.01)
Government Ideology     -0.01*       -0.01
                        (0.01)       (0.01)
Partisan Control        0.77***      0.80*
                        (0.24)       (0.42)
Constant                -2.33***     -2.38***
                        (0.34)       (0.53)
Observations            350          350
Log Likelihood          -218.08      -76.31
Akaike Inf. Crit.       446.17       164.61
Bayesian Inf. Crit.     465.46       187.76
Note: *p<0.1; **p<0.05; ***p<0.01
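
The information criteria in the table follow from the log-likelihoods and the number of estimated parameters. As a small worked check, the block below recomputes them. The parameter counts (5 for Poisson, 6 for NB) are inferred from the residual degrees of freedom in the summaries (350 - 345 and 350 - 344). The results match the table only up to rounding of the log-likelihood.

```r
# Log-likelihoods and sample size copied from the comparison table.
loglik <- c(Poisson = -218.08, NB = -76.31)
n_obs  <- 350

# Parameter counts: 4 fixed effects + 1 random-intercept variance,
# plus 1 dispersion parameter for the negative binomial.
k <- c(Poisson = 5, NB = 6)

aic <- 2 * k - 2 * loglik           # AIC = 2k - 2 logLik
bic <- log(n_obs) * k - 2 * loglik  # BIC = k log(n) - 2 logLik
print(round(rbind(AIC = aic, BIC = bic), 2))
```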

Looking at the Poisson Random Effects Distribution

[Figure: estimated random effects for id, Poisson model]

Looking at the NB Random Effects Distribution

[Figure: estimated random effects for id, negative binomial model]

Model Comparisons with Different Integration Techniques

Negative Binomial with PQL

## Generalized linear mixed model fit by maximum likelihood (Adaptive
##   Gauss-Hermite Quadrature, nAGQ = 0) [glmerMod]
##  Family: Negative Binomial(2.6051)  ( log )
## Formula: policy ~ citi_ideo + gov_ideo_nom + govparty_c + (1 | id)
##    Data: policy
## 
##      AIC      BIC   logLik deviance df.resid 
##    827.7    850.8   -407.8    815.7      344 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1554 -0.7028 -0.4988  0.5022  4.6484 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  id     (Intercept) 0.02513  0.1585  
## Number of obs: 350, groups:  id, 50
## 
## Fixed effects:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -2.206576   0.311755  -7.078 1.46e-12 ***
## citi_ideo     0.037023   0.007195   5.146 2.66e-07 ***
## gov_ideo_nom -0.010845   0.006587  -1.646  0.09968 .  
## govparty_c    0.747239   0.245741   3.041  0.00236 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) citi_d gv_d_n
## citi_ideo   -0.763              
## gov_ideo_nm  0.220 -0.751       
## govparty_c  -0.337  0.596 -0.799

Negative Binomial with Laplace

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Negative Binomial(2.9593)  ( log )
## Formula: policy ~ citi_ideo + gov_ideo_nom + govparty_c + (1 | id)
##    Data: policy
## 
##      AIC      BIC   logLik deviance df.resid 
##    827.6    850.7   -407.8    815.6      344 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1794 -0.7060 -0.4931  0.5361  4.6058 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  id     (Intercept) 0.03648  0.191   
## Number of obs: 350, groups:  id, 50
## 
## Fixed effects:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -2.236439   0.344309  -6.495 8.28e-11 ***
## citi_ideo     0.037261   0.007584   4.913 8.96e-07 ***
## gov_ideo_nom -0.010857   0.006610  -1.643   0.1005    
## govparty_c    0.752291   0.249779   3.012   0.0026 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) citi_d gv_d_n
## citi_ideo   -0.789              
## gov_ideo_nm  0.246 -0.742       
## govparty_c  -0.368  0.606 -0.802
## convergence code: 0
## Model is nearly unidentifiable: very large eigenvalue
##  - Rescale variables?

Negative Binomial with Adaptive Gauss-Hermite Quadrature

## Generalized linear mixed model fit by maximum likelihood (Adaptive
##   Gauss-Hermite Quadrature, nAGQ = 25) [glmerMod]
##  Family: Negative Binomial(0.26)  ( log )
## Formula: policy ~ citi_ideo + gov_ideo_nom + govparty_c + (1 | id)
##    Data: policy
## 
##      AIC      BIC   logLik deviance df.resid 
##    164.6    187.8    -76.3    152.6      344 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.4916 -0.4243 -0.3556  0.2650  3.8743 
## 
## Random effects:
##  Groups Name        Variance  Std.Dev. 
##  id     (Intercept) 6.764e-13 8.225e-07
## Number of obs: 350, groups:  id, 50
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -2.37868    0.53408  -4.454 8.44e-06 ***
## citi_ideo     0.04180    0.01294   3.231  0.00123 ** 
## gov_ideo_nom -0.01326    0.01109  -1.195  0.23193    
## govparty_c    0.80045    0.42298   1.892  0.05844 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) citi_d gv_d_n
## citi_ideo   -0.803              
## gov_ideo_nm  0.273 -0.746       
## govparty_c  -0.338  0.581 -0.809
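
The three summaries above differ only in the nAGQ setting reported in their headers (0, Laplace, and 25). As a minimal sketch of how such a comparison might be set up in lme4, the block below fits one model three ways on simulated data. The data, variable names, and effect sizes are all made up for illustration. It uses a Poisson family to keep the run fast; the headers above indicate the negative binomial fits took the same nAGQ argument.

```r
library(lme4)

set.seed(2016)

# Simulated stand-in for the policy data: 50 units observed 7 times each.
# All values below are invented for this sketch.
n_id  <- 50
n_per <- 7
sim <- data.frame(
  id = factor(rep(seq_len(n_id), each = n_per)),
  x  = rnorm(n_id * n_per)
)
u <- rnorm(n_id, sd = 0.3)  # true random intercepts, one per unit
sim$y <- rpois(nrow(sim), exp(-0.5 + 0.4 * sim$x + u[as.integer(sim$id)]))

# Same model, three approximations to the integral over the random intercept.
fit_nagq0   <- glmer(y ~ x + (1 | id), data = sim, family = poisson, nAGQ = 0)
fit_laplace <- glmer(y ~ x + (1 | id), data = sim, family = poisson, nAGQ = 1)
fit_agq25   <- glmer(y ~ x + (1 | id), data = sim, family = poisson, nAGQ = 25)

# Fixed effects side by side.
print(round(cbind(nAGQ0   = fixef(fit_nagq0),
                  Laplace = fixef(fit_laplace),
                  AGQ25   = fixef(fit_agq25)), 3))
```

In the summaries above, the reported log-likelihood moves from about -407.8 to -76.3 when only the integration setting changes, so it seems safest to compare information criteria only across fits that use the same approximation.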

Comparisons of Fixed Effects

Negative Binomial Model Comparisons

                        PQL          Laplace      GHQ
State Ideology          0.04***      0.04***      0.04***
                        (0.01)       (0.01)       (0.01)
Government Ideology     -0.01*       -0.01        -0.01
                        (0.01)       (0.01)       (0.01)
Partisan Control        0.75***      0.75***      0.80*
                        (0.25)       (0.25)       (0.42)
Constant                -2.21***     -2.24***     -2.38***
                        (0.31)       (0.34)       (0.53)
Observations            350          350          350
Log Likelihood          -407.83      -407.78      -76.31
Akaike Inf. Crit.       827.66       827.55       164.61
Bayesian Inf. Crit.     850.81       850.70       187.76
Note: *p<0.1; **p<0.05; ***p<0.01

Takeaways