%%All the programs are in d:/missingdata-robustSEM/
\documentclass[12pt,epsfig,titlepage]{article}
\usepackage{listings}
\textheight=9.20in \textwidth=6.5in
\topmargin=-.68in
%\topmargin=-.88in
\oddsidemargin=-.0in
%\oddsidemargin=-.09in
\setlength{\footskip}{.6truein}
\newcommand{\vech}{{\rm vech}} \newcommand{\ve}{{\rm vec}} \newcommand{\E}{{ E}}
\newcommand{\Var}{{\rm Var}} \newcommand{\Cov}{{\rm Cov}} \newcommand{\Acov}{{\rm Acov}}
\newcommand{\diag}{{\rm diag}} \newcommand{\bone}{{\bf 1}} \newcommand{\RMSEA}{{\rm RMSEA}}
\newcommand{\SE}{{\rm SE}} \newcommand{\tr}{{\rm tr}} \newcommand{\s}{\rm s}
%\newcommand{\min}{\rm min}
\newcommand{\limn}{{\lim_{n\rightarrow\infty}}} \newcommand{\tho}{\theta_0} \newcommand{\htn}{\hat\theta}
\newcommand{\fot}{^{\frac{1}{2}}} \newcommand{\si}{{\sum_{i=1}^n}} \newcommand{\sii}{\frac{1}{n}\si}
\newcommand{\cas}{\stackrel {a.s.}{\longrightarrow}} \newcommand{\cip}{\stackrel {P}{\longrightarrow}} \newcommand{\cid}{\stackrel {\cal L}{\to}}
\newcommand{\bDelta}{{\bf \Delta}} \newcommand{\bGamma}{{\bf \Gamma}} \newcommand{\bLambda}{{\bf \Lambda}}
\newcommand{\bOmega}{{\bf \Omega}} \newcommand{\bPhi}{{\bf \Phi}} \newcommand{\bPi}{{\bf \Pi}}
\newcommand{\bPsi}{{\bf \Psi}} \newcommand{\bSigma}{{\bf \Sigma}} \newcommand{\bUpsilon}{{\bf \Upsilon}} \newcommand{\bXi}{{\bf \Xi}}
\newcommand{\balpha}{\mbox{\boldmath $\alpha$}} \newcommand{\bbeta}{\mbox{\boldmath $\beta$}} \newcommand{\bhbn}{\hat{\mbox{\boldmath $\beta$}}}
\newcommand{\bbo}{\mbox{\boldmath $\beta$}_0} \newcommand{\bdelta}{\mbox{\boldmath $\delta$}} \newcommand{\bepsilon}{\mbox{\boldmath $\epsilon$}}
\newcommand{\betab}{\mbox{\boldmath $\eta$}} \newcommand{\bgamma}{\mbox{\boldmath $\gamma$}} \newcommand{\biota}{\mbox{\boldmath $\iota$}}
\newcommand{\blambda}{\mbox{\boldmath $\lambda$}} \newcommand{\bmu}{\mbox{\boldmath $\mu$}} \newcommand{\bnu}{\mbox{\boldmath $\nu$}}
\newcommand{\bomega}{\mbox{\boldmath $\omega$}} \newcommand{\bphi}{\mbox{\boldmath $\phi$}}
\newcommand{\bpsi}{\mbox{\boldmath $\psi$}} \newcommand{\bsigma}{\mbox{\boldmath $\sigma$}} \newcommand{\btau}{\mbox{\boldmath $\tau$}}
\newcommand{\bTheta}{\mbox{\boldmath $\Theta$}} \newcommand{\btheta}{\mbox{\boldmath $\theta$}}
\newcommand{\bhan}{\hat{\mbox{\boldmath $\alpha$}}} \newcommand{\bhgn}{\hat{\mbox{\boldmath $\gamma$}}}
\newcommand{\btan}{\tilde{\mbox{\boldmath $\alpha$}}} \newcommand{\btgn}{\tilde{\mbox{\boldmath $\gamma$}}}
\newcommand{\btho}{{\mbox{\boldmath $\btheta$}}_0}
%\newcommand{\bhan}{\hat{\mbox{\boldmath $\alpha$}}}
\newcommand{\bhtn}{\hat{\mbox{\boldmath $\theta$}}} \newcommand{\bttn}{\tilde{\mbox{\boldmath $\theta$}}}
\newcommand{\bupsilon}{\mbox{\boldmath $\upsilon$}} \newcommand{\bvarepsilon}{\mbox{\boldmath $\varepsilon$}}
\newcommand{\bvarphi}{\mbox{\boldmath $\varphi$}} \newcommand{\bvartheta}{\mbox{\boldmath $\vartheta$}}
\newcommand{\bxi}{\mbox{\boldmath $\xi$}} \newcommand{\bzeta}{\mbox{\boldmath $\zeta$}}
\newcommand{\bA}{{\bf A}} \newcommand{\ba}{{\bf a}} \newcommand{\bB}{{\bf B}} \newcommand{\bb}{{\bf b}}
\newcommand{\bC}{{\bf C}} \newcommand{\bc}{{\bf c}} \newcommand{\bD}{{\bf D}} \newcommand{\bE}{{\bf E}}
\newcommand{\be}{{\bf e}} \newcommand{\bF}{{\bf F}} \newcommand{\bff}{{\bf f}} \newcommand{\bG}{{\bf G}}
\newcommand{\bg}{{\bf g}} \newcommand{\bH}{{\bf H}} \newcommand{\bh}{{\bf h}} \newcommand{\bI}{{\bf I}}
\newcommand{\cI}{{\cal I}} \newcommand{\bK}{{\bf K}} \newcommand{\bL}{{\bf L}} \newcommand{\bl}{{\bf l}}
\newcommand{\bM}{{\bf M}} \newcommand{\bm}{{\bf m}} \newcommand{\bP}{{\bf P}} \newcommand{\bQ}{{\bf Q}}
\newcommand{\bR}{{\bf R}} \newcommand{\br}{{\bf r}} \newcommand{\bS}{{\bf S}} \newcommand{\bs}{{\bf s}}
\newcommand{\bsn}{{\bf s}} \newcommand{\bt}{{\bf t}} \newcommand{\bu}{{\bf u}} \newcommand{\bU}{{\bf U}}
\newcommand{\bV}{{\bf V}} \newcommand{\bv}{{\bf v}} \newcommand{\ev}{{\textbf{\emph{v}}}} \newcommand{\bW}{{\bf W}}
\newcommand{\bX}{{\bf X}} \newcommand{\bx}{{\bf x}} \newcommand{\cX}{{\cal X}}
\newcommand{\cY}{{{\cal X}^*}} \newcommand{\by}{{\bf y}} \newcommand{\bz}{{\bf z}} \newcommand{\bzero}{{\bf 0}}
\newcommand{\stheta}{{\scriptsize\btheta}} \newcommand{\slambda}{{\scriptsize\blambda}} \newcommand{\spsi}{{\scriptsize\bpsi}}
\begin{document}
\oddsidemargin=-.00in \setlength{\footskip}{.6truein} \baselineskip=18pt
\begin{titlepage}
%\vspace*{3.5cm}
\vspace*{1.5cm}
\begin{center}
Robust Structural Equation Modeling with Missing Data and Auxiliary Variables\\
\vspace{12mm}
Ke-Hai Yuan and Zhiyong Zhang\\
University of Notre Dame\\
\vspace*{7mm}
December 24, 2010\\
\end{center}
\vspace*{12.0cm}
\flushbottom
$^*$The research was supported by Grants DA00017 and DA01070 from the National Institute on Drug Abuse. Correspondence concerning this article should be addressed to Ke-Hai Yuan, Department of Psychology, University of Notre Dame, Notre Dame, IN 46556, USA (kyuan@nd.edu).
\end{titlepage}
%\flushleft
\baselineskip=22.00pt \pagestyle{empty}
\begin{center}
Abstract
\end{center}
This paper develops robust M-procedures for structural equation modeling with missing data and auxiliary variables. M-estimates of the saturated mean vector and covariance matrix of all variables are obtained first. Those corresponding to the variables of interest are then fitted to the structural model by the normal-distribution-based discrepancy function. Many weight functions can be used at the first stage for robustness; details are given for Huber-type weight functions. In particular, an expectation robust (ER) algorithm is introduced for estimating the saturated model. A consistent covariance matrix for the robust estimates is obtained using estimating equations. These are used to obtain standard errors for structural parameter estimates as well as rescaled, adjusted or corrected statistics in the second stage. For applications, SAS IML and R programs are given to perform the first-stage analysis, and EQS codes are provided to perform the second-stage analysis.
An example with open/closed book examination data illustrates the proper use of the provided programs.
\vskip1cm
\noindent
Keywords: Auxiliary variables, estimating equation, missing at random, sandwich-type covariance matrix.
\newpage \baselineskip=22.55pt \newpage \pagestyle{plain} \setcounter{page}{1}
\begin{center}
1. Introduction
\end{center}
Being capable of modeling latent variables and measurement errors simultaneously, structural equation modeling (SEM) has become one of the most popular statistical methods in social science research, where missing data are common, especially when data are collected longitudinally. Many procedures have been developed for modeling missing data in SEM, most of which use normal-distribution-based maximum likelihood (NML). Because data in social sciences are typically nonnormally distributed (Micceri, 1989), there are also a few developments accounting for the effect of nonnormality on test statistics and standard errors associated with NML (Arminger \& Sobel, 1990; Yuan \& Bentler, 2000). But NML estimates (NMLEs) can be very inefficient or even biased when data have heavy tails or are contaminated. Yuan and Bentler (2001) and Yuan, Marshall and Bentler (2002) outlined the modeling of missing data in SEM and factor analysis using maximum likelihood (ML) based on the multivariate $t$-distribution, but they did not provide the details of implementing the procedures. Lee and Xia (2006) developed a missing data procedure for nonlinear SEM in which latent variables as well as measurement and prediction errors are symmetrically distributed. Model estimation and inference are carried out through Monte Carlo methods and BIC. Because model structure and estimation method both affect the value of BIC, Lee and Xia (2006, pp.\ 581--582) noted that the method should be used with caution.
In this paper we develop a two-stage procedure that allows for auxiliary variables, where the saturated mean vector and covariance matrix are obtained by robust M-estimators in the first stage. The robust estimates of means and covariances are then fitted by the structural model in the second stage. Model evaluation is done using well-established statistics or fit indices for complete data. Actually, any model estimation and evaluation methods available for SEM with complete data can be applied to the second-stage analysis.

Missing data can occur for many reasons. The process by which data become incomplete was termed the missing data mechanism by Rubin (1976). Missing completely at random (MCAR) is a process in which missingness of data is independent of both the observed and the missing values; missing at random (MAR) is a process in which missingness is independent of the missing values given the observed data. When missingness depends on the missing values themselves given the observed data, the process is missing not at random (MNAR). When all missing values are MCAR, most ad hoc procedures still generate consistent parameter estimates. When the missingness process is ignored for data that are MAR or MCAR, MLEs or the normal-distribution-based pseudo MLEs are still consistent. When missing values are MNAR, one has to correctly model the missing data mechanism in order to obtain consistent parameter estimates in general. However, whether a mechanism is MAR or MNAR depends on whether the variables accounting for the missingness are observed and included in the estimation. Auxiliary variables are those that are not directly involved in the structural model but have the potential to account for missingness in the variables of interest. The developed procedure aims at SEM with missing data that are MAR after including potential auxiliary variables. Actually, auxiliary variables can be easily included in the first-stage estimation.
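The distinction among the three mechanisms can be made concrete with a small simulation. The following Python sketch is purely illustrative (it is not part of the paper's SAS IML or R programs, and all numeric settings are made up): missingness on the variable of interest $y$ depends only on an always-observed auxiliary variable $u$, so the mechanism is MAR once $u$ is included in the estimation; if $u$ were dropped, missingness would depend on values correlated with $y$ and behave like MNAR.

```python
import random

# Illustrative sketch (hypothetical settings): simulate a MAR mechanism in
# which missingness of y depends only on the observed auxiliary variable u.
random.seed(1)

n = 1000
data = []
for _ in range(n):
    u = random.gauss(0, 1)               # auxiliary variable, always observed
    y = 0.8 * u + random.gauss(0, 0.6)   # variable of interest
    # probability of missingness depends on u only -> MAR given u
    missing = random.random() < (0.8 if u > 1 else 0.1)
    data.append((u, None if missing else y))

n_missing = sum(1 for _, y in data if y is None)
print(f"{n_missing} of {n} y-values are missing")
```

Because cases with large $u$ (and hence, on average, large $y$) are missing more often, ignoring $u$ would bias ad hoc estimates of the mean of $y$ downward.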
Parallel to the procedures in Yuan and Lu (2008) and Savalei and Bentler (2009), only the estimates of means and covariances corresponding to the variables of interest are selected and fitted by the structural model in the second-stage analysis.

Ignoring an MNAR mechanism results in biased parameter estimates in general. Outliers or data contamination can also bring biases to parameter estimates; actually, data contamination brings biases to parameter estimates even without missing values. Comparing the two kinds of biases, those caused by data contamination can be a lot greater than those caused by MNAR data. Notice that the consistency of MLEs with MAR data is a large-sample property. When the population distribution has heavy tails, at smaller sample sizes the NMLEs can contain substantial biases when the proportion of MAR data is not trivial (Yuan, Wallentin \& Bentler, 2009). In addition to bias, efficiency is also a key factor in any statistical estimation. The efficiency of NMLEs goes to zero as the kurtosis of the population increases. Compared to NML, a robust procedure typically has better control of (1) biases due to outliers or data contamination; (2) biases caused by the interaction of a heavy-tailed distribution and missing data; and (3) efficiency loss due to a heavy-tailed distribution. Actually, there exist tuning parameters in all the weight functions for M-estimation. If one chooses the tuning parameter properly, the resulting estimates will be closer to the true MLEs than the pseudo NMLEs are. Since practical data sets often exhibit heterogeneous marginal skewnesses and kurtoses, our purpose is to develop procedures that are robust to violations of distributional assumptions. As we shall see, the robustness of the developed procedure comes from assigning smaller weights to cases that lie far from the center of the majority of the data. Many weight functions can be used for such a purpose.
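The downweighting idea can be illustrated with a minimal univariate sketch. This is not the paper's estimator, which is multivariate and handles missing patterns; the tuning constant 1.345 and the MAD-based scale used below are conventional choices assumed here purely for illustration.

```python
# A minimal univariate sketch of downweighting: cases far from the center get
# weight rho/d instead of 1, which bounds their influence on the estimate of
# the mean. The constant rho = 1.345 is a conventional univariate Huber tuning
# value, assumed here; it is not a value prescribed by the paper.
def huber_weight(d, rho=1.345):
    return 1.0 if d <= rho else rho / d

def robust_mean(xs, iters=50):
    mu = sorted(xs)[len(xs) // 2]  # start from the median
    # MAD-based scale estimate, held fixed over the iterations for simplicity
    scale = 1.4826 * sorted(abs(x - mu) for x in xs)[len(xs) // 2]
    for _ in range(iters):
        w = [huber_weight(abs(x - mu) / scale) for x in xs]
        mu = sum(wi * xi for wi, xi in zip(w, xs)) / sum(w)
    return mu

sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 55.0]  # one gross outlier
print(robust_mean(sample))  # close to 10, unlike the ordinary mean (about 16.4)
```

The outlier 55.0 receives a weight of well under 0.01, so the weighted mean stays near the center of the majority of the data, while the ordinary mean is pulled far away by the single contaminated case.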
We choose the Huber-type weight function because it tends to yield more efficient parameter estimates than other weight functions with real data (Yuan, Bentler \& Chan, 2004). The tuning parameter in the Huber-type weight function is explicitly related to the percentage of data contamination or outliers one would like to control. Another consideration behind choosing the Huber-type weight function is that it does not assign zero weights to cases. When many cases get zero weights, it is very likely that the resulting asymptotic covariance matrix of the robust estimates is close to singular. Then the rescaled, adjusted or corrected residual-based statistics cannot be trusted or even obtained in the second-stage SEM analysis (Bentler \& Yuan, 1999). This is important because the asymptotic covariance matrix can be of very high dimension, and it is already a challenge for it to be positive definite even without any missing data.

Because no SEM software contains robust estimation with missing data, we provide SAS IML and R programs to facilitate the application of the developed procedure. In particular, for any missing data with or without auxiliary variables, modifying a few lines of code in the SAS or R programs will generate the necessary elements for mean and covariance structure analysis in the second stage. With these elements, SEM with missing data is essentially the same as SEM with complete data for any SEM program that allows users to input the first four moments for analysis.

In section 2 of the paper, we describe an expectation robust algorithm for Huber-type M-estimators. The asymptotic covariance matrix of the robust estimators is also provided in section 2. Section 3 describes the second-stage analysis by fitting the normal-distribution-based discrepancy function and the associated adjusted and rescaled statistics.
Section 4 introduces SAS IML and R programs for implementing the first-stage estimation and illustrates their use, together with EQS code, by the test score data of Mardia, Kent and Bibby (1979). Discussion and conclusion are provided in section 5.
\begin{center}
2. M-estimates of the Saturated Mean Vector and Covariance Matrix
\end{center}
Algorithms for robust estimation of means and covariances with missing values were developed in Little (1988) and Liu (1997) using a multivariate $t$-distribution. Cheng and Victoria-Feser (2002) provided an algorithm for obtaining minimum covariance determinant (MCD) estimates. Yuan (2010) extended M-estimators of means and covariances to samples containing missing values using estimating equations, which can be solved by an expectation robust (ER) algorithm. This section describes the ER algorithm. We also provide the asymptotic covariance matrix for the M-estimates with Huber-type weights. In particular, we assume some auxiliary variables are available.

Let $\by$ represent a population of $p$ random variables with $E(\by)=\bmu$ and $\Cov(\by)=\bSigma$. We are interested in modeling $\bmu$ and $\bSigma$ by a structural model. A sample $\by_i$, $i=1$, 2, $\ldots$, $n$, from $\by$ with missing values is available. There also exists a vector $\bu$ of $q-p$ auxiliary variables with the associated sample realizations $\bu_i$, $i=1$, 2, $\ldots$, $n$. Let $\bx=(\by',\bu')'$ with $E(\bx)=\bnu$ and $\Cov(\bx)=\bV$. Due to missing values, the vector $\bx_i=(\by_i',\bu_i')'$ only contains $q_i$ marginal observations of $\bx$. Also, we do not know the distribution of $\bx$, and the observations in $\bx_i$ may contain outliers. In such a case, robust estimates of $\bnu$ and $\bV$ are preferred over the NMLEs.
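Since each $\bx_i$ contains only $q_i$ marginal observations of $\bx$, the first-stage computations repeatedly extract the subvector and submatrix of the saturated moments that correspond to a case's observed coordinates. A hypothetical Python helper (not from the provided SAS IML or R programs) might look like:

```python
# Hypothetical helper: given the index set of observed coordinates in x_i,
# extract the corresponding subvector nu_i and submatrix V_i from the
# saturated mean vector nu and covariance matrix V.
def observed_moments(nu, V, obs_idx):
    nu_i = [nu[j] for j in obs_idx]
    V_i = [[V[j][k] for k in obs_idx] for j in obs_idx]
    return nu_i, V_i

nu = [0.0, 1.0, 2.0, 3.0]
V = [[4.0, 1.0, 0.5, 0.2],
     [1.0, 3.0, 0.4, 0.1],
     [0.5, 0.4, 2.0, 0.3],
     [0.2, 0.1, 0.3, 1.0]]

# case i observes coordinates 0 and 2 of x (so q_i = 2)
nu_i, V_i = observed_moments(nu, V, [0, 2])
print(nu_i)  # [0.0, 2.0]
print(V_i)   # [[4.0, 0.5], [0.5, 2.0]]
```

This selection step is the same device used later to pick out the rows and columns of the means and covariances that belong to the variables of interest for the second-stage analysis.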
Let $\bnu_i$ and $\bV_i$ be the mean vector and covariance matrix corresponding to the observations in $\bx_i$,
$$
d_i^2=d^2(\bx_i,\bnu_i,\bV_i)=(\bx_i-\bnu_i)'\bV_i^{-1}(\bx_i-\bnu_i),
$$
and $w_{i1}(d)$, $w_{i2}(d)$ and $w_{i3}(d)$ be nonincreasing scalar functions. The estimating equations defining the robust M-estimators are given by
$$
\sum_{i=1}^nw_{i1}(d_i)\frac{\partial\bnu_i'}{\partial\bnu}\bV_i^{-1}(\bx_i-\bnu_i)=\bzero
\eqno(1)
$$
and
$$
\sum_{i=1}^n\frac{\partial\ve'(\bV_i)}{\partial\ev}\bW_i\ve[w_{i2}(d_i)(\bx_i-\bnu_i)(\bx_i-\bnu_i)'-w_{i3}(d_i)\bV_i]=\bzero,
\eqno(2)
$$
where $\bW_i=0.5(\bV_i^{-1}\otimes\bV_i^{-1})$. Notice that the subscript $i$ in $w_{i1}$, $w_{i2}$ and $w_{i3}$ is to adjust for the varying number of observations in $\bx_i$. When $w_{i1}(d_i)=w_{i2}(d_i)=w_{i3}(d_i)=1$, equations (1) and (2) define the NMLEs of $\bnu$ and $\bV$ with missing data. When $w_{i1}(d_i)=w_{i2}(d_i)=(m+q_i)/(m+d_i^2)$ and $w_{i3}(d_i)=1$, equations (1) and (2) define the MLEs of $\bnu$ and $\bV$ based on the multivariate $t$-distribution with $m$ degrees of freedom. When the three weight functions are chosen according to Lopuha\"a (1989) or Rocke (1996), equations (1) and (2) define S-estimators of $\bnu$ and $\bV$ for samples with missing data.

Let $0<\varphi<1$ and $\rho_i$ be the $(1-\varphi)$th quantile of $\chi_{q_i}$, the chi-distribution with $q_i$ degrees of freedom. Huber-type weight functions with missing data are given by
$$
w_{i1}(d_i)=\left\{
\begin{array}{ll}
1, &{\rm if}\;d_i\leq \rho_i\\
\rho_i/d_i, &{\rm if}\;d_i>\rho_i,
\end{array}
\right.
\eqno(3)
$$
$w_{i2}(d_i)=[w_{i1}(d_i)]^2/\kappa_i$ and $w_{i3}(d_i)=1$, where $\kappa_i$ is a constant defined by $E[\chi_{q_i}^2w_{i1}^2(\chi_{q_i})/\kappa_i]=q_i$, which aims to yield a consistent estimate of $\bV$ when $\bx\sim N(\bnu,\bV)$. Equations (1) and (2) can be easily solved by an ER algorithm.
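As a concrete illustration of the Huber-type weights in (3), the following sketch computes $d_i$, $w_{i1}$ and $w_{i2}$ for a single case. The values of $\rho_i$ and $\kappa_i$ are hard-coded placeholders: in practice $\rho_i$ is the $(1-\varphi)$th quantile of the chi-distribution with $q_i$ degrees of freedom and $\kappa_i$ solves the consistency condition, neither of which is computed here (a chi inverse CDF, e.g. from a statistics library, would be needed).

```python
import math

# Sketch of the Huber-type weights in (3) for one case with q_i observed
# values. rho_i and kappa_i are treated as known constants here (placeholder
# values); the paper defines them via the chi distribution with q_i degrees
# of freedom.
def huber_weights(x_i, nu_i, Vinv_i, rho_i, kappa_i):
    r = [a - b for a, b in zip(x_i, nu_i)]
    # Mahalanobis distance of the observed part of case i
    d2 = sum(r[j] * Vinv_i[j][k] * r[k]
             for j in range(len(r)) for k in range(len(r)))
    d = math.sqrt(d2)
    w1 = 1.0 if d <= rho_i else rho_i / d   # equation (3)
    w2 = w1 * w1 / kappa_i                  # w_{i2} = w_{i1}^2 / kappa_i
    return d, w1, w2                        # w_{i3} = 1 for Huber-type weights

# two observed coordinates, identity V_i for simplicity (placeholder values)
d, w1, w2 = huber_weights([3.0, 4.0], [0.0, 0.0],
                          [[1.0, 0.0], [0.0, 1.0]],
                          rho_i=2.0, kappa_i=0.9)
print(d, w1, w2)  # d = 5.0, so the case is downweighted: w1 = 0.4
```

Because $d_i=5$ exceeds the placeholder $\rho_i=2$, the case receives weight $w_{i1}=2/5$ in the mean equation (1) and $w_{i2}=w_{i1}^2/\kappa_i$ in the covariance equation (2).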
Let $\bx_{ic}=(\bx_i',\bx_{im}')'$ denote the complete data, where $\bx_{im}$ is the vector containing the $q-q_i$ missing values. Of course, the positions of missing values are not always at the end in practice. We can perform permutations on each missing pattern so that all the algebraic operations in this paper still hold. Let $\bnu^{(j)}$ and $\bV^{(j)}$ be the current values of $\bnu$ and $\bV$, and $\bnu_i^{(j)}$ and $\bV_i^{(j)}$ be the means and covariances corresponding to the observed $\bx_i$.
\newpage
\begin{center}
Appendix A
\end{center}
Differentiating the Huber-type weight functions in (3) yields
$$
dw_{i1}(d_i)=\left\{
\begin{array}{ll}
0, &{\rm if}\;d_i\leq \rho_i\\[5mm]
\displaystyle
\frac{\rho_i}{2d_i^3}[2(d\bnu_i)'\bV_i^{-1}(\bx_i-\bnu_i)+(\bx_i-\bnu_i)'\bV_i^{-1}(d\bV_i)\bV_i^{-1}(\bx_i-\bnu_i)],
&{\rm if}\;d_i>\rho_i
\end{array}
\right. \eqno(A5)
$$
and
$$
dw_{i2}(d_i)=\left\{
\begin{array}{ll}
0, &{\rm if}\;d_i\leq \rho_i\\[5mm]
\displaystyle
\frac{\rho_i^2}{\kappa_i d_i^4}[2(d\bnu_i)'\bV_i^{-1}(\bx_i-\bnu_i)+(\bx_i-\bnu_i)'\bV_i^{-1}(d\bV_i)\bV_i^{-1}(\bx_i-\bnu_i)],
&{\rm if}\;d_i>\rho_i.
\end{array}
\right. \eqno(A6)
$$
Thus, when $d_i>\rho_i$, we have
$$
\begin{array}{ll}
dg_{i1}(\balpha)&= -w_{i1}(d\bnu_i)'\bV_i^{-1}(d\bnu_i)
-w_{i1}(d\bnu_i)'\bV_i^{-1}(d\bV_i)\bV_i^{-1}(\bx_i-\bnu_i) \\[5mm]
&\displaystyle
+\frac{\rho_i}{d_i^3}
(d\bnu_i)'\bV_i^{-1}(\bx_i-\bnu_i)(\bx_i-\bnu_i)'\bV_i^{-1}(d\bnu_i) \\[5mm]
&\displaystyle
+\frac{\rho_i}{2d_i^3}
(\bx_i-\bnu_i)'\bV_i^{-1}(d\bV_i)\bV_i^{-1}(\bx_i-\bnu_i)(\bx_i-\bnu_i)'\bV_i^{-1}(d\bnu_i)
\end{array}
\eqno(A7)
$$
and
$$
\begin{array}{ll}
dg_{i2}(\balpha)&=-\tr\left\{\bV_i^{-1}(d\bV_i)\bV_i^{-1}(d\bV_i)\bV_i^{-1}[w_{i2}(\bx_i-\bnu_i)(\bx_i-\bnu_i)']\right\} \\[5mm]
&\displaystyle +\frac{1}{2}
\tr\left\{\bV_i^{-1}(d\bV_i)\bV_i^{-1}(d\bV_i)\right\} \\[5mm]
&\displaystyle -\frac{1}{2}
\tr\left\{\bV_i^{-1}(d\bV_i)\bV_i^{-1}w_{i2}[(d\bnu_i)(\bx_i-\bnu_i)'+(\bx_i-\bnu_i)(d\bnu_i)']\right\} \\[5mm]
&\displaystyle +\frac{\rho_i^2}{\kappa_i d_i^4}
\tr\left\{\bV_i^{-1}(\bx_i-\bnu_i)(\bx_i-\bnu_i)'\bV_i^{-1}(d\bV_i)\bV_i^{-1}(\bx_i-\bnu_i)(d\bnu_i)'\right\} \\[5mm]
&\displaystyle +\frac{\rho_i^2}{2\kappa_i d_i^4}
\tr\left\{\bV_i^{-1}(\bx_i-\bnu_i)(\bx_i-\bnu_i)'\bV_i^{-1}(d\bV_i)\bV_i^{-1}
(\bx_i-\bnu_i)(\bx_i-\bnu_i)'\bV_i^{-1}(d\bV_i)\right\}.
\end{array}
\eqno(A8)
$$
Notice that, for matrices $\bA$, $\bB$, $\bC$ and $\bD$ of proper orders, there exists
$$
\tr(\bA\bB\bC\bD)=\ve'(\bD)(\bA\otimes \bC')\ve(\bB')=\ve'(\bD')(\bC'\otimes \bA)\ve(\bB).
\eqno(A9)
$$
Let
$$
\bb_i=\bV_i^{-1}(\bx_i-\bnu_i),
\;\;\;\;\;\;
\bH_i=\bV_i^{-1}(\bx_i-\bnu_i)(\bx_i-\bnu_i)'\bV_i^{-1},
$$
$$
\bE_i=\frac{\partial\bnu_i}{\partial\bnu'}
\;\;\; {\rm and} \;\;\;
\bF_i=\frac{\partial\ve(\bV_i)}{\partial\ev'}.
$$
Using (A9), it follows from (A3) to (A8) that, when $d_i\leq\rho_i$,
$$
\frac{\partial\bg_{i1}(\balpha)}{\partial\bnu'}=-\bE_i'\bV_i^{-1}\bE_i,
\;\;\;\;
\frac{\partial\bg_{i1}(\balpha)}{\partial\ev'}=-\bE_i'(\bV_i^{-1}\otimes\bb_i')\bF_i,
$$
$$
\frac{\partial\bg_{i2}(\balpha)}{\partial\bnu'}=-\frac{1}{\kappa_i}\bF_i'(\bb_i\otimes\bV_i^{-1})\bE_i,
\;\;\;\;
\frac{\partial\bg_{i2}(\balpha)}{\partial\ev'}=-\bF_i'[\frac{1}{\kappa_i}(\bH_i\otimes\bV_i^{-1})-\bW_i]\bF_i;
$$
and when $d_i>\rho_i$,
$$
\frac{\partial\bg_{i1}(\balpha)}{\partial\bnu'}
=-\bE_i'(w_{i1}\bV_i^{-1}-\frac{\rho_i}{d_i^3}\bH_i)\bE_i,
$$
$$
\frac{\partial\bg_{i1}(\balpha)}{\partial\ev'}=
-\bE_i'[w_{i1}(\bV_i^{-1}\otimes\bb_i')-\frac{\rho_i}{2d_i^3}
(\bH_i\otimes\bb_i')]\bF_i,
$$
$$
\frac{\partial\bg_{i2}(\balpha)}{\partial\bnu'}=
-\bF_i'[w_{i2}(\bb_i\otimes\bV_i^{-1})-\frac{\rho_i^2}{\kappa_i d_i^4}
(\bH_i\otimes\bb_i)]\bE_i,
$$
$$
\frac{\partial\bg_{i2}(\balpha)}{\partial\ev'}=
-\bF_i'[w_{i2}(\bH_i\otimes\bV_i^{-1})
-\bW_i-\frac{\rho_i^2}{2\kappa_i d_i^4}
(\bH_i\otimes\bH_i)]\bF_i.
$$
\newpage
\baselineskip=15.00pt
\begin{center}
Appendix B
\end{center}
\lstset{numbers=right}
\begin{lstlisting}[basicstyle={\ttfamily}]
library(rsem)
setwd("c:/rsem")
dset<-read.table("mardiaMV25.dat", header=T, na.string="-99")
dset<-data.matrix(dset)
ex1<-rsem_main(dset, c(1,2,4,5), "mcov.eqs")

## Sample output from the above analysis.
Sample size n = 88
Total number of variables p = 5
The following variables are selected for SEM models
Mechanics Vectors Analysis Statistics
There are 2 missing data patterns. They are
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]   57    5    1    2    3    4    5
[2,]   31    3    1    3    4    2    5
Estimated means:
[1] 39.18057 50.90668 47.20384 41.05387
Estimated covariance matrix:
     Mechanics  Vectors Analysis Statistics
[1,]  289.3151 124.7135 105.4223   102.6819
[2,]  124.7135 182.4506  95.7065   108.5030
[3,]  105.4223  95.7065 202.1062   180.7338
[4,]  102.6819 108.5030 180.7338   373.3670
Fit statistics:
            T       p
RML   1.37630 0.71109
AML   1.21980 0.74826
CRADF 1.42660 0.69930
RF    0.47245 0.70228
Parameter estimates:
            Parameter         RSE          z
(E1,E1)   180.6958200 30.85086800  5.8570741
(E2,E2)    41.5747190 27.15030900  1.5312798
(E3,E3)    10.5451330 55.17224100  0.1911311
(E4,E4)   203.8083700 41.22027600  4.9443718
(D1,D1)    87.6602940 34.17143500  2.5653091
(D1,D2)    78.8122260 23.54614800  3.3471388
(D2,D2)   192.6949000 53.88817100  3.5758293
(F1,V999)  39.4471310  1.69824790 23.2281347
(F2,V999)  47.1918250  1.52386050 30.9685992
(V2,F1)     1.2892982  0.04580923 28.1449413
(V4,F2)     0.8755543  0.02442287 35.8497667
\end{lstlisting}
\newpage
\begin{center}
Appendix C
\end{center}
\begin{lstlisting}[basicstyle={\ttfamily}]
/TITLE
 EQS 6.1: Mean and covariance structure analysis.
 The file name is mcov.eqs.
/SPECIFICATION
 weight='weight.txt'; cases=88; variables=4;
 matrix=covariance; analysis=moment;
 methods=ML, robust; data='data.txt';
/LABELS
 V1=Mechanics; V2=Vectors; V3=Analysis; V4=Statistics;
/EQUATIONS
 V1= F1+E1;
 V2= *F1+E2;
 V3= F2+E3;
 V4= *F2+E4;
 F1= *V999+D1;
 F2= *V999+D2;
/VARIANCES
 E1-E4= *; D1=*; D2=*;
/COVARIANCES
 D1,D2= *;
/TECHNICAL
 conv=0.0001; itera=100;
/Means
/OUTPUT
 CODEBOOK; DATA='mcov.ETS';
 PARAMETER ESTIMATES; STANDARD ERRORS; LISTING;
/END
\end{lstlisting}
\newpage
\begin{center}
Appendix D
\end{center}
\begin{verbatim}
/TITLE
 EQS 6.1: Covariance structure analysis.
 The file name is cov.eqs.
/SPECIFICATION
 weight='weight.txt'; cases=88; variables=4;
 matrix=covariance; analysis=covariance;
 methods=ML, robust; data='data.txt';
/LABELS
 V1=Mechanics; V2=Vectors; V3=Analysis; V4=Statistics;
/EQUATIONS
 V1= F1+E1;
 V2= *F1+E2;
 V3= F2+E3;
 V4= *F2+E4;
/VARIANCES
 E1-E4= *; F1=*; F2=*;
/COVARIANCES
 F1,F2= *;
/TECHNICAL
 conv=0.0001; itera=100;
/OUTPUT
 CODEBOOK; DATA='cov.ETS';
 PARAMETER ESTIMATES; STANDARD ERRORS; LISTING;
/END
\end{verbatim}
\newpage
\baselineskip=18pt
\def\bibb{\hangindent=.5 true cm\hangafter=1\noindent}
\def\bib{\vskip3pt\hangindent=.5 true cm\hangafter=1\noindent}
\begin{center}
References
\end{center}
\bib Arminger, G., \& Sobel, M. E. (1990). Pseudo-maximum likelihood estimation of mean and covariance structures with missing data. {\em Journal of the American Statistical Association, 85}, 195--203.
\bib Bentler, P. M. (2008). {\em EQS 6 structural equations program manual}. Encino, CA: Multivariate Software.
\bib Bentler, P. M., \& Yuan, K.-H. (1999). Structural equation modeling with small samples: Test statistics. {\em Multivariate Behavioral Research, 34}, 181--197.
\bib Browne, M. W. (1984). Asymptotic distribution-free methods for the analysis of covariance structures. {\em British Journal of Mathematical and Statistical Psychology, 37}, 62--83.
\bib Cheng, T.-C., \& Victoria-Feser, M.-P. (2002). High-breakdown estimation of multivariate mean and covariance with missing observations. {\em British Journal of Mathematical and Statistical Psychology, 55}, 317--335.
\bib Lee, S.-Y., \& Xia, Y.-M. (2006). Maximum likelihood methods in treating outliers and symmetrically heavy-tailed distributions for nonlinear structural equation models with missing data. {\em Psychometrika, 71}, 565--585.
\bib Little, R. J. A. (1988). Robust estimation of the mean and covariance matrix from data with missing values. {\em Applied Statistics, 37}, 23--38.
\bib Liu, C. (1997).
ML estimation of the multivariate $t$ distribution and the EM algorithm. {\em Journal of Multivariate Analysis, 63}, 296--312.
\bib Lopuha\"{a}, H. P. (1989). On the relation between S-estimators and M-estimators of multivariate location and covariances. {\em Annals of Statistics, 17}, 1662--1683.
\bib Mair, P., Wu, E., \& Bentler, P. M. (2010). EQS goes R: Simulations for SEM using the package REQS. {\em Structural Equation Modeling, 17}, 333--349.
\bib Mardia, K. V. (1970). Measures of multivariate skewness and kurtosis with applications. {\em Biometrika, 57}, 519--530.
\bib Mardia, K. V., Kent, J. T., \& Bibby, J. M. (1979). {\em Multivariate analysis}. New York: Academic Press.
\bib Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. {\em Psychological Bulletin, 105}, 156--166.
\bib Rocke, D. M. (1996). Robustness properties of S-estimators of multivariate location and shape in high dimension. {\em Annals of Statistics, 24}, 1327--1345.
\bib Rubin, D. B. (1976). Inference and missing data (with discussions). {\em Biometrika, 63}, 581--592.
\bib Savalei, V., \& Bentler, P. M. (2009). A two-stage ML approach to missing data: Theory and application to auxiliary variables. {\em Structural Equation Modeling, 16}, 477--497.
\bib Yuan, K.-H. (2010). Expectation-robust algorithm and estimating equation for means and covariances with missing data. Under review.
\bib Yuan, K.-H., \& Bentler, P. M. (2000). Three likelihood-based methods for mean and covariance structure analysis with nonnormal missing data. {\em Sociological Methodology, 30}, 167--202.
\bib Yuan, K.-H., \& Bentler, P. M. (2001). A unified approach to multigroup structural equation modeling with nonstandard samples. In G. A. Marcoulides \& R. E. Schumacker (Eds.), {\em Advanced structural equation modeling: New developments and techniques} (pp.\ 35--56). Mahwah, NJ: Lawrence Erlbaum Associates.
\bib Yuan, K.-H., \& Bentler, P. M. (2010).
Consistency of normal distribution based pseudo maximum likelihood estimates when data are missing at random. {\em American Statistician, 64}, 263--267.
\bib Yuan, K.-H., Bentler, P. M., \& Chan, W. (2004). Structural equation modeling with heavy tailed distributions. {\em Psychometrika, 69}, 421--436.
\bib Yuan, K.-H., Chan, W., \& Bentler, P. M. (2000). Robust transformation with applications to structural equation modeling. {\em British Journal of Mathematical and Statistical Psychology, 53}, 31--50.
\bib Yuan, K.-H., \& Lu, L. (2008). SEM with missing data and unknown population using two-stage ML: Theory and its application. {\em Multivariate Behavioral Research, 62}, 621--652.
\bib Yuan, K.-H., Marshall, L. L., \& Bentler, P. M. (2002). A unified approach to exploratory factor analysis with missing data, nonnormal data, and in the presence of outliers. {\em Psychometrika, 67}, 95--122.
\bib Yuan, K.-H., Wallentin, F., \& Bentler, P. M. (2009). ML versus MI for missing data with violation of distribution conditions.
\bib Zhong, X. (2008). {\em Effects of outlying observations on standard errors in factor analysis}. Master's thesis, University of Notre Dame.
\bib Zu, J., \& Yuan, K.-H. (2010). Local influence and robust procedures for mediation analysis. {\em Multivariate Behavioral Research, 45}, 1--44.
\end{document}