Second Order Accurate One-Sided Bounds

We can use the Cornish-Fisher expansion to construct second order accurate lower and upper confidence bounds in a variety of models.

Example 7.25 Suppose \(X_i \sim_{iid} f_X\) where \(f_X\) is a density with respect to Lebesgue measure. Let \[\begin{equation*} W_n = \sqrt{n}\frac{\bar{X} - \mu}{\widehat{\sigma}} \end{equation*}\] Define the \(\alpha\) quantile of \(W_n\) as \(w_{n\alpha}\), which satisfies \[\begin{equation*} P(W_n \leq w_{n\alpha}) = \alpha \end{equation*}\] Since \(W_n\) converges in distribution to \(N(0,1)\), we can think of \(z_\alpha\), the standard normal \(\alpha\) quantile, as an approximation of \(w_{n\alpha}\) in the sense that \[\begin{equation*} P(W_n \leq z_\alpha) \rightarrow \alpha \end{equation*}\] The Cornish-Fisher expansion refines this approximation to second order: \[\begin{equation*} w_{n\alpha} = z_\alpha + \frac{q_1(z_\alpha)}{\sqrt{n}} + O(n^{-1}) \end{equation*}\] where \(q_1(x) = -p_1(x)\). Example 1.34 found that for the \(W_n\) under consideration \[\begin{equation*} p_1(x) = \frac{1}{6}\kappa_3(2x^2 + 1). \end{equation*}\] A second order approximation to \(w_{n\alpha}\) is therefore \[\begin{equation*} c = z_\alpha - \frac{1}{6\sqrt{n}}\widehat{\kappa}_3(2z_\alpha^2 + 1) \end{equation*}\] where \(\widehat{\kappa}_3 = \frac{n^{-1}\sum (X_i - \bar{X})^3}{\widehat{\sigma}^3}\). This threshold has second order accuracy, i.e. \(P(W_n \leq c) = \alpha + O(n^{-1})\). Replacing \(\alpha\) with \(1-\alpha\) and inverting the set, we obtain a lower confidence bound for \(\mu\) with coverage \(1-\alpha + O(n^{-1})\): \[\begin{equation*} C(x) = \{\mu : W_n \leq c\} = \{\mu : \mu > \bar{X} - z_{1-\alpha}\frac{\widehat{\sigma}}{\sqrt{n}} + \frac{\widehat{\sigma}}{6n}\widehat{\kappa}_3(2 z_{1-\alpha}^2 + 1)\} \end{equation*}\] Similar bounds can be derived for parameters such as \(\sigma^2\). For example, letting \(\widehat{\tau}^2 = n^{-1}\sum (X_i - \bar{X})^4 - \widehat{\sigma}^4\), the asymptotic pivot \[\begin{equation*} W_n = \sqrt{n}\frac{\widehat{\sigma}^2 - \sigma^2}{\widehat{\tau}} \rightarrow N(0,1) \end{equation*}\] admits an Edgeworth / Cornish-Fisher expansion (see the last equation of Example 1.34).
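
To spell out the inversion step: since \(\widehat{\sigma} > 0\), \[\begin{equation*} W_n \leq c \iff \sqrt{n}\frac{\bar{X} - \mu}{\widehat{\sigma}} \leq c \iff \mu \geq \bar{X} - c\frac{\widehat{\sigma}}{\sqrt{n}} \end{equation*}\] and substituting \(c = z_{1-\alpha} - \frac{1}{6\sqrt{n}}\widehat{\kappa}_3(2z_{1-\alpha}^2 + 1)\) gives the displayed form of \(C(x)\).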

Numerical Comparison

The Wald lower bound for the mean \(\mu\) is

\[\begin{align} \bar{X} - \frac{\widehat{\sigma}z_{1-\alpha}}{\sqrt{n}} \end{align}\]

The two-term Edgeworth (Cornish-Fisher) lower bound (Example 7.25 in the textbook, last inline formula, on p. 505) is

\[\begin{align} \bar{X} - \frac{\widehat{\sigma}}{\sqrt{n}}(z_{1-\alpha} - (1/6)n^{-1/2}\widehat{\kappa}_3(2z_{1-\alpha}^2 + 1)) \end{align}\]
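
For concreteness, here is a minimal sketch of both lower bounds as R functions applied to a single sample; the function names `wald_lower` and `edgeworth_lower` are illustrative, not from the textbook.

wald_lower <- function(x, alpha = 0.1){
  n <- length(x)
  sig <- sqrt(mean((x - mean(x))^2)) ## sd estimate dividing by n, matching the simulation code below
  mean(x) - qnorm(1 - alpha)*sig/sqrt(n)
}
edgeworth_lower <- function(x, alpha = 0.1){
  n <- length(x)
  sig <- sqrt(mean((x - mean(x))^2))
  kap3 <- mean((x - mean(x))^3)/sig^3 ## estimated skewness
  z <- qnorm(1 - alpha)
  mean(x) - (sig/sqrt(n))*(z - (1/6)*kap3*(2*z^2 + 1)/sqrt(n))
}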

Note that the relative location of the two bounds is determined by \(\widehat{\kappa}_3\): when \(\widehat{\kappa}_3 > 0\), the Edgeworth lower bound is larger than the Wald lower bound, and hence less conservative.
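
This follows from differencing the two expressions: \[\begin{equation*} \left(\bar{X} - \frac{\widehat{\sigma}}{\sqrt{n}}\left(z_{1-\alpha} - \tfrac{1}{6}n^{-1/2}\widehat{\kappa}_3(2z_{1-\alpha}^2 + 1)\right)\right) - \left(\bar{X} - \frac{\widehat{\sigma}z_{1-\alpha}}{\sqrt{n}}\right) = \frac{\widehat{\sigma}}{6n}\widehat{\kappa}_3(2z_{1-\alpha}^2 + 1) \end{equation*}\] which has the same sign as \(\widehat{\kappa}_3\).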

We compare the Wald and two-term Edgeworth lower bounds from Example 7.25 in the textbook under the following simulation setting:

## simulation parameters
N <- 160000 ## number of simulated datasets
n <- 20 ## sample size
## chi-squared random variable with 2 degrees of freedom
df <- 2
mu <- df ## the mean of a chi-squared equals its degrees of freedom
alpha <- 0.1 ## one-sided miscoverage level
x <- rchisq(n,df=df)
hist(x,breaks=10)

From the histogram, we can see a positive skew, suggesting that \(\widehat{\kappa}_3\) will tend to be positive and the Edgeworth lower bound will typically be larger than the Wald lower bound.
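For reference, the population skewness of a \(\chi^2_k\) random variable is \(\sqrt{8/k}\), which equals \(2\) when \(k = 2\), so the positive skew in the histogram is expected.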

## simulate N datasets of size n, one per row
x <- matrix(rchisq(n*N,df=df),ncol=n)
muhat <- rowMeans(x)
## row-wise sd estimates; muhat recycles row-wise since length(muhat) == nrow(x)
sds <- sqrt(rowSums((x - muhat)^2)/n)

## Wald lower bound
wald <- muhat - qnorm(1-alpha)*sds/sqrt(n)
## two-term Edgeworth lower bound
sig2 <- sds^2 ## per-row variance estimates
kap3 <- rowSums((x - muhat)^3) / (sig2^(3/2)*n) ## per-row skewness estimates
edge <- muhat - sqrt(sig2/n)*(qnorm(1-alpha) - (1/6)*sqrt(1/n)*kap3*(2*qnorm(1-alpha)^2 + 1))

plot(wald,edge,col="#00000010",
     xlab="Wald Lower Bound",ylab="Edgeworth Lower Bound")
abline(a=0,b=1)

From the plot above, we verify that the Edgeworth bound is nearly always larger than the Wald bound. Next we compare the simulation-based empirical coverage probabilities of the two bounds.
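
A minimal sketch of that computation (the same calculation used per sample size in the loop below): for a lower bound, coverage is the proportion of simulated datasets whose bound does not exceed the true mean \(\mu\).

## empirical coverage: proportion of simulations with lower bound <= mu
c(wald = mean(wald <= mu), edge = mean(edge <= mu))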

The Edgeworth-based bound attains coverage somewhat closer to the nominal level of 0.9.

Varying Sample Size

We repeat the above simulation varying the sample size.

ns <- seq(from=10,to=40,by=1)
wald_cov <- rep(NA_real_,length(ns))
edge_cov <- rep(NA_real_,length(ns))
for(ii in seq_along(ns)){
  x <- matrix(rchisq(ns[ii]*N,df=df),ncol=ns[ii])
  muhat <- rowMeans(x)
  sds <- sqrt(rowSums((x - muhat)^2)/ns[ii])
  ## Wald lower bound
  wald <- muhat - qnorm(1-alpha)*sds/sqrt(ns[ii])
  ## two-term Edgeworth lower bound
  sig2 <- sds^2
  kap3 <- rowSums((x - muhat)^3) / (sig2^(3/2)*ns[ii])
  edge <- muhat - sqrt(sig2/ns[ii])*(qnorm(1-alpha) - (1/6)*sqrt(1/ns[ii])*kap3*(2*qnorm(1-alpha)^2 + 1))
  ## empirical coverage: proportion of simulations with lower bound <= mu
  wald_cov[ii] <- 1 - mean(wald > mu)
  edge_cov[ii] <- 1 - mean(edge > mu)
}

We rescale the coverage errors by \(\sqrt{n}\) for Wald and by \(n\) for the two-term Edgeworth. Letting \(C_W(n)\) be the empirical coverage of the Wald bound at sample size \(n\) and \(C_E(n)\) be the empirical coverage of the two-term Edgeworth bound at sample size \(n\), we compute \[\begin{align*} ES_W &= \sqrt{n}(C_W(n) - (1-\alpha))\\ ES_E &= n(C_E(n) - (1-\alpha)) \end{align*}\] where \(ES\) stands for "error scaled" and \(W\) and \(E\) refer to Wald and Edgeworth respectively. Since the Wald bound has coverage error of order \(n^{-1/2}\) and the Edgeworth bound has coverage error of order \(n^{-1}\), the scaled errors should be roughly constant in \(n\), up to simulation error and lower order terms.

wald_scale <- sqrt(ns)*(wald_cov - (1-alpha))
edge_scale <- ns*(edge_cov - (1-alpha))
## empty plotting frame spanning both scaled error curves
plot(0,0,type='n',xlim=range(ns),ylim=range(c(wald_scale,edge_scale)),
     xlab="n",ylab="ES")
lines(ns,wald_scale,lwd=2)
lines(ns,edge_scale,col="red",lwd=2)
legend("bottomleft",c("ESW","ESE"),col=c("black","red"),lty=1,lwd=2)