eco {eco} | R Documentation |
eco is used to fit the parametric Bayesian model (based on a
Normal/Inverse-Wishart prior) for ecological inference in 2 × 2
tables via Markov chain Monte Carlo. It gives the in-sample
predictions as well as the estimates of the model parameters. The
model and algorithm are described in Imai and Lu (2004). The
contextual effect can also be modeled by following the strategy
described in Imai and Lu (2005).
eco(formula, data = parent.frame(), N = NULL, supplement = NULL,
    context = FALSE, mu0 = 0, tau0 = 2, nu0 = 4, S0 = 10,
    mu.start = 0, Sigma.start = 10, parameter = TRUE, grid = FALSE,
    n.draws = 5000, burnin = 0, thin = 0, verbose = FALSE)
formula |
A symbolic description of the model to be fit,
specifying the column and row margins of 2 × 2
ecological tables. Y ~ X specifies Y as the
column margin and X as the row margin. Details and specific
examples are given below.
|
data |
An optional data frame in which to interpret the variables
in formula. The default is the environment in which
eco is called.
|
N |
An optional variable representing the size of the unit; e.g., the total number of voters. |
supplement |
An optional matrix of supplemental data. The matrix
has two columns, which contain additional individual-level data such
as survey data for W_1 and W_2, respectively. If
NULL, no additional individual-level data are included in the
model. The default is NULL.
|
context |
Logical. If TRUE, the contextual effect is also
modeled. See Imai and Lu (2005) for details. The default is
FALSE.
|
mu0 |
A scalar or a numeric vector that specifies the prior mean
for the mean parameter μ. If it is a scalar, its value
is repeated to yield a vector of the length of μ; otherwise,
it must be a vector of the same length as μ.
When context=TRUE, the length of μ is 3;
otherwise it is 2. The default is 0.
|
tau0 |
A positive integer representing the prior scale
for the mean parameter μ. The default is 2.
|
nu0 |
A positive integer representing the prior degrees of
freedom of the variance matrix Σ. The default is 4.
|
S0 |
A positive scalar or a positive definite matrix that specifies
the prior scale matrix for the variance matrix Σ. If it is
a scalar, the prior scale matrix is a diagonal matrix with
the same dimensions as Σ whose diagonal elements all equal
S0; otherwise, S0 must have the same dimensions as
Σ. When context=TRUE, Σ is a
3 × 3 matrix; otherwise, it is 2 × 2.
The default is 10.
|
mu.start |
A scalar or a numeric vector that specifies the
starting values of the mean parameter μ.
If it is a scalar, its value is repeated to
yield a vector of the length of μ; otherwise,
it must be a vector of the same length as μ.
When context=FALSE, the length of μ is 2;
otherwise it is 3. The default is 0.
|
Sigma.start |
A positive scalar or a positive definite matrix
that specifies the starting value of the variance matrix
Σ. If it is a scalar, the starting value
is a diagonal matrix with the same dimensions
as Σ whose diagonal elements all equal
Sigma.start; otherwise, Sigma.start must have the same dimensions as
Σ. When context=TRUE, Σ is a
3 × 3 matrix; otherwise, it is 2 × 2.
The default is 10.
|
parameter |
Logical. If TRUE, the Gibbs draws of the population
parameters, μ and Σ, are returned in addition to
the in-sample predictions of the missing internal cells,
W. The default is TRUE.
|
grid |
Logical. If TRUE, the grid method is used to sample
W in the Gibbs sampler. If FALSE, the Metropolis
algorithm is used, where candidate draws are sampled from the uniform
distribution on the tomography line for each unit. Note that the
grid method is significantly slower than the Metropolis algorithm.
The default is FALSE.
|
n.draws |
A positive integer. The number of MCMC draws.
The default is 5000.
|
burnin |
A non-negative integer. The burn-in interval for the Markov
chain, i.e., the number of initial draws that are not stored. The
default is 0.
|
thin |
A non-negative integer. The thinning interval for the
Markov chain, i.e., the number of Gibbs draws skipped between
recorded values. The default is 0.
|
verbose |
Logical. If TRUE, the progress of the Gibbs
sampler is printed to the screen. The default is FALSE.
|
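The scalar-expansion rule described for mu0 and S0 can be sketched in base R as follows. The helpers expand.mean() and expand.scale() are hypothetical names introduced only for illustration; they are not part of the eco package, and eco's internal handling may differ.

```r
# Illustrative sketch only: expand.mean() and expand.scale() are
# hypothetical helpers, not functions exported by eco.

# Expand a scalar prior mean to the length of mu (3 with context, else 2).
expand.mean <- function(mu0, context = FALSE) {
  k <- if (context) 3 else 2
  if (length(mu0) == 1) {
    rep(mu0, k)
  } else {
    stopifnot(length(mu0) == k)
    mu0
  }
}

# Expand a scalar prior scale to a k x k diagonal matrix.
expand.scale <- function(S0, context = FALSE) {
  k <- if (context) 3 else 2
  if (length(S0) == 1) {
    diag(S0, k)
  } else {
    stopifnot(is.matrix(S0), all(dim(S0) == c(k, k)))
    S0
  }
}

mu0.default <- expand.mean(0)                  # length-2 vector of zeros
S0.default  <- expand.scale(10)                # 2 x 2 diagonal matrix
S0.context  <- expand.scale(10, context = TRUE) # 3 x 3 diagonal matrix
print(S0.default)
```

Passing a full vector or matrix instead of a scalar returns it unchanged once its dimensions are checked, which mirrors the "otherwise" clause in the argument descriptions.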
An example of a 2 × 2 ecological table for racial voting is given below:
 | black voters | white voters | |
Voted | W_{1i} | W_{2i} | Y_i |
Not voted | 1-W_{1i} | 1-W_{2i} | 1-Y_i |
 | X_i | 1-X_i | |
where Y_i and X_i represent the observed margins, and W_{1i} and W_{2i} are unknown variables. All variables are proportions and hence bounded between 0 and 1. For each i, the following deterministic relationship holds: Y_i = X_i W_{1i} + (1-X_i) W_{2i}.
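Because every quantity is a proportion, the relationship above restricts each unknown cell to an interval. The base R sketch below computes these deterministic bounds and a point on the tomography line for made-up values of X and Y (strictly between 0 and 1). It is an illustration of the identity, not code taken from eco, although the Wmin and Wmax elements of the returned object presumably hold bounds of this kind.

```r
# Made-up margins for three units; X must lie strictly inside (0, 1) here.
X <- c(0.2, 0.5, 0.8)
Y <- c(0.3, 0.4, 0.9)

# Bounds on W1: solve Y = X*W1 + (1-X)*W2 for W1 at W2 = 1 and W2 = 0,
# then clip to the unit interval.
W1.min <- pmax(0, (Y - (1 - X)) / X)
W1.max <- pmin(1, Y / X)

# Bounds on W2, by the symmetric argument.
W2.min <- pmax(0, (Y - X) / (1 - X))
W2.max <- pmin(1, Y / (1 - X))

# Any W1 inside its bounds determines W2 on the tomography line.
W1 <- (W1.min + W1.max) / 2
W2 <- (Y - X * W1) / (1 - X)

bounds <- cbind(W1.min, W1.max, W2.min, W2.max)
print(round(bounds, 3))
```

Units whose margins are close to 0 or 1 have narrow bounds, so their internal cells are largely determined by the margins themselves.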
An object of class eco containing the following elements:
call |
The matched call. |
X |
The row margin, X. |
Y |
The column margin, Y. |
N |
The size of each table, N. |
burnin |
The number of initial burnin draws. |
thin |
The thinning interval. |
nu0 |
The prior degrees of freedom. |
tau0 |
The prior scale parameter. |
mu0 |
The prior mean. |
S0 |
The prior scale matrix. |
W |
A three-dimensional array storing the posterior in-sample predictions of W. The first dimension indexes the Monte Carlo draws, the second dimension indexes the columns of the table, and the third dimension indexes the observations. |
Wmin |
A numeric matrix storing the lower bounds of W. |
Wmax |
A numeric matrix storing the upper bounds of W. |
mu |
The posterior draws of the population mean parameter, μ. |
Sigma |
The posterior draws of the population variance matrix, Σ. |
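The layout of the W element (draws by table columns by observations) can be summarized with apply(). The sketch below uses a simulated array in place of real output; n.store and n.obs are hypothetical sizes, and the uniform random numbers only stand in for posterior draws.

```r
set.seed(1)
n.store <- 100  # hypothetical number of stored draws
n.obs   <- 4    # hypothetical number of ecological units

# Stand-in for res$W: dimension 1 = draws, 2 = table columns, 3 = units.
W <- array(runif(n.store * 2 * n.obs), dim = c(n.store, 2, n.obs))

# Posterior mean of W_1 and W_2 for each unit: average over the draws.
W.mean <- apply(W, c(2, 3), mean)          # 2 x n.obs matrix

# 95% posterior interval of W_1 for each unit.
W1.ci <- apply(W[, 1, ], 2, quantile, probs = c(0.025, 0.975))

print(round(W.mean, 3))
print(round(W1.ci, 3))
```

With a fitted object, the same calls would be applied to its W element in place of the simulated array.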
Kosuke Imai, Department of Politics, Princeton University kimai@Princeton.Edu, http://imai.princeton.edu; Ying Lu, Institute for Quantitative Social Sciences, Harvard University ylu@Latte.Harvard.Edu
Imai, Kosuke and Ying Lu. (2004) “Parametric and Nonparametric Bayesian Models for Ecological Inference in 2 × 2 Tables.” Working Paper, Princeton University, available at http://imai.princeton.edu/research/einonpar.html
Imai, Kosuke and Ying Lu. (2005) “An Incomplete Data Approach to Ecological Inference.” Working Paper, Princeton University, available at http://imai.princeton.edu/research/coarse.html
ecoNP, predict.eco, summary.eco
## load the registration data
data(reg)

## NOTE: convergence has not been properly assessed for the following
## examples. See Imai and Lu (2004, 2005) for more complete analyses.

## fit the parametric model with the default prior specification
res <- eco(Y ~ X, data = reg, verbose = TRUE)
## summarize the results
summary(res)

## obtain out-of-sample prediction
out <- predict(res, verbose = TRUE)
## summarize the results
summary(out)

## load the Robinson's census data
data(census)

## fit the parametric model with contextual effects and N
## using the default prior specification
res1 <- eco(Y ~ X, N = N, context = TRUE, data = census, verbose = TRUE)
## summarize the results
summary(res1)

## obtain out-of-sample prediction
out1 <- predict(res1, verbose = TRUE)
## summarize the results
summary(out1)