Multimodel inference and model averaging
25 Jan 2016Model selection
When we have multiple competing models (corresponding to various hypotheses), we need some techniques to compare the performance of the models, then select the best model(s) to support specific hypothesis.
Multimodel inference and model averaging
The datasets we got are always complicated and noisy. We seldom have enough evidences to select just one single best model. Within the framework of multimodel inference, the models can be ranked by a measure of its relative support (weight, the relative likelihood of the model given the data). When there exists some uncertainty in model selection, like the potential models have similar levels of support, then model averaging can be used for parameter estimates or predictions to minimize the uncertainty[1-4].
Example in R
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
# Four virtual variables, one dependent variable (y),
# three independent variables (x1, x2, x3)
set.seed(1234)
x1 <- rnorm(50,mean=5,sd=3)
x2 <- rnorm(50,mean=10,sd=2)
x3 <- rnorm(50,mean=15,sd=3)
y <- rnorm(50,mean=20,sd=5)
# I use linear regression without interactions
# and polynomials here as an example
lm1 <- lm(y~x1+x2+x3)
summary(lm1)
## Call:
## lm(formula = y ~ x1 + x2 + x3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.7587 -3.8724 -0.2943 4.1334 12.1460
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 30.42305 6.08497 5.000 8.8e-06 ***
## x1 0.05949 0.32110 0.185 0.854
## x2 -0.20667 0.41578 -0.497 0.622
## x3 -0.54467 0.33476 -1.627 0.111
## ---
## Signif. codes:
## 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
##
## Residual standard error: 5.916 on 46 degrees of freedom
## Multiple R-squared: 0.06738, Adjusted R-squared: 0.006561
## F-statistic: 1.108 on 3 and 46 DF, p-value: 0.3556
# Load package "MuMIn" (Version 1.15.6) for multimodel
# inference and model averaging
library(MuMIn)
options(na.action="na.fail") # the default argument of "na.action"
# is "na.omit" which is not proper here
# Generate a set of models with combinations of terms in the global model.
# "m.lim", "subset", "fixed" (the term to be included in all models,
# the default value is the intercept) can constrain the resulting set
# of models, but think carefully before using them
dd <- dredge(
lm1, # the global model, supports many types
beta="none", # standardize the estimates or not
evaluate=TRUE, # evaluate and rank the models
rank="AICc", # rank the models with AICc for small sample size
m.lim=c(0,3), # limits for the number of terms in a single model,
# the old arguments are "m.min" and "m.max"
subset= x1 | x2 | x3, # logical expression describing models
# to keep in the resulting set
extra="R^2" # function for additional statistics
# to include in the result
)
dd
## Global model call: lm(formula = y ~ x1 + x2 + x3)
## ---
## Model selection table
## (Intrc) x1 x2 x3 R^2 df logLik AICc delta weight
## 5 28.88 -0.5686 6.133e-02 3 -157.907 322.3 0.00 0.424
## 7 30.60 -0.2131 -0.5375 6.669e-02 4 -157.764 324.4 2.08 0.150
## 6 28.73 0.072750 -0.5763 6.237e-02 4 -157.879 324.6 2.31 0.134
## 3 23.76 -0.3351 1.371e-02 3 -159.144 324.8 2.47 0.123
## 2 20.26 0.015250 4.654e-05 3 -159.488 325.5 3.16 0.087
## 8 30.42 0.059490 -0.2067 -0.5447 6.738e-02 5 -157.745 326.9 4.52 0.044
## 4 23.76 -0.001132 -0.3351 1.371e-02 4 -159.144 327.2 4.84 0.038
## Models ranked by AICc(x)
# Model averaging
avg <- model.avg(
dd, # I use the "model.selection" object
subset=cumsum(weight)<=0.95 # model averaging on the subset of the
# resulting models, I use a 95% confidence
# set of the models (the cumulative
# weight is nearly equal to 0.95),
# 5 models will be used here
)
summary(avg)
## Call:
## model.avg(object = dd, subset = cumsum(weight) <= 0.95)
##
## Component model call:
## lm(formula = y ~ <5 unique rhs>)
##
## Component models:
## df logLik AICc delta weight
## 3 3 -157.91 322.34 0.00 0.46
## 23 4 -157.76 324.42 2.08 0.16
## 13 4 -157.88 324.65 2.31 0.15
## 2 3 -159.14 324.81 2.47 0.13
## 1 3 -159.49 325.50 3.16 0.10
##
## Term codes:
## x1 x2 x3
## 1 2 3
##
## Model-averaged coefficients:
## (full average)
## Estimate Std. Error Adjusted SE z value Pr(>|z|)
## (Intercept) 27.62906 5.71991 5.82612 4.742 2.1e-06 ***
## x3 -0.43432 0.37016 0.37588 1.155 0.248
## x2 -0.07972 0.25715 0.26225 0.304 0.761
## x1 0.01203 0.15875 0.16281 0.074 0.941
##
## (conditional average)
## Estimate Std. Error Adjusted SE z value Pr(>|z|)
## (Intercept) 27.62906 5.71991 5.82612 4.742 2.1e-06 ***
## x3 -0.56346 0.32402 0.33246 1.695 0.0901 .
## x2 -0.26809 0.41457 0.42519 0.631 0.5284
## x1 0.05002 0.32073 0.32907 0.152 0.8792
## ---
## Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
##
## Relative variable importance:
## x3 x2 x1
## Importance: 0.77 0.30 0.24
## N containing models: 3 2 2
# Confidence intervals for model parameters
confint(avg)
## 2.5 % 97.5 %
## (Intercept) 16.2100663 39.04804920
## x3 -1.2150750 0.08814727
## x2 -1.1014369 0.56526046
## x1 -0.5949329 0.69497959
“Let the computer find out” is a poor strategy and usually reflects the fact that the researcher did not bother to think clearly about the problem of interest and its scientific setting[3].
So, you must select the potential models, judge the parameter estimates and the predictions with knowledge of your domain through the entire analysis process.
References:
[2]: Grueber, C. E., S. Nakagawa, R. J. Laws, and I. G. Jamieson. 2011. Multimodel inference in ecology and evolution: challenges and solutions. Journal of Evolutionary Biology 24:699-711.
[3]: Burnham KP and Anderson DR. 2002. Model selection and multi-model inference: a practical information-theoretic approach. Springer.
[4]: Burnham, K., D. Anderson, and K. Huyvaert. 2011. AIC model selection and multimodel inference in behavioral ecology: some background, observations, and comparisons. Behavioral Ecology and Sociobiology 65:23-35.