NUMSAMPLES-by-NUMSERIES matrix
with NUMSAMPLES samples of a NUMSERIES-dimensional
random vector. Missing values are indicated by NaNs.

InitMethod

(Optional) String that identifies one of three defined
initialization methods to compute initial estimates for the mean and
covariance of the data. If InitMethod = [] or '',
the default method nanskip is used. The initialization
methods are

nanskip — (Default) Skip
all records with NaNs.

twostage — Estimate mean.
Fill NaNs with the mean. Then estimate the covariance.

diagonal — Form a diagonal
covariance.

Description

[Mean, Covariance] = ecmninit(Data, InitMethod) creates
initial mean and covariance estimates for the function ecmnmle. Mean is
a NUMSERIES-by-1 column vector
estimate for the mean of Data. Covariance is
a NUMSERIES-by-NUMSERIES matrix
estimate for the covariance of Data.
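
As a minimal sketch of the call pattern described above, the following computes initial estimates and passes them to ecmnmle. The synthetic data, the random seed, and the positions of the NaNs are assumptions made only for illustration; the empty arguments to ecmnmle request its default initialization method, iteration limit, and tolerance.

```matlab
% Hypothetical data: 50 samples of a 3-dimensional random vector.
rng(0);                          % fixed seed so the sketch is repeatable
Data = randn(50, 3);
Data(3, 2)  = NaN;               % mark a few values as missing
Data(10, 1) = NaN;
Data(27, 3) = NaN;

% Initial estimates using the default method.
[Mean0, Covar0] = ecmninit(Data, 'nanskip');

% Mean0 is NUMSERIES-by-1; Covar0 is NUMSERIES-by-NUMSERIES.
disp(size(Mean0));
disp(size(Covar0));

% Use the initial estimates as the starting point for ecmnmle.
[Mean, Covariance] = ecmnmle(Data, [], [], [], Mean0, Covar0);
```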

Each row of Data is assumed to be an
iid (independent, identically distributed) observation of a
multivariate normal random vector, and missing values are assumed to be missing
at random (MAR).

Initialization Methods

This routine has three initialization methods that cover most
cases, each with its advantages and disadvantages.

nanskip

The nanskip method works well with small
problems (fewer than 10 series, or monotone
missing-data patterns). It skips any records that contain NaNs
and estimates initial values from complete-data records
only. This initialization method tends to yield the fastest convergence
of the ECM algorithm. The routine switches to
the twostage method if it determines that a significant
number of records contain NaNs.
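
The idea behind nanskip can be sketched by hand as follows. This is an illustration of the description above, not the toolbox's internal code: the small Data matrix is hypothetical, and normalizing the covariance by the number of records (the second argument of cov set to 1) is an assumption.

```matlab
% Hypothetical data: 5 records of 2 series, NaN marks a missing value.
Data = [1 2; NaN 3; 4 5; 6 NaN; 7 9];

% Keep only the records (rows) that contain no NaNs.
complete = ~any(isnan(Data), 2);
X = Data(complete, :);           % rows 1, 3, and 5 remain

% Estimate the mean and covariance from the complete records only.
Mean = mean(X, 1)';              % NUMSERIES-by-1 column vector
Covariance = cov(X, 1);          % normalization by N is an assumption
```

With many missing values, few complete records remain, which is why a fallback to twostage is useful.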

twostage

The twostage method is the best choice for
large problems (more than 10 series). It estimates the
mean of each series from all available data for that series. It
then estimates the covariance matrix with missing values replaced
by the series mean rather than left as NaNs. This
initialization method is quite robust but tends to result in slower
convergence of the ECM algorithm.
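
The two stages can be sketched by hand as shown here. This is an illustration under stated assumptions rather than the toolbox's internal code: the Data matrix is hypothetical, the covariance normalization by the number of records is assumed, and the 'omitnan' flag of mean requires a MATLAB release that supports it.

```matlab
% Hypothetical data: 5 records of 2 series, NaN marks a missing value.
Data = [1 2; NaN 3; 4 5; 6 NaN; 7 9];

% Stage 1: mean of each series from all available (non-NaN) values.
Mean = mean(Data, 1, 'omitnan')';

% Stage 2: replace each NaN with the mean of its series, then
% estimate the covariance from the filled data.
Filled = Data;
for j = 1:size(Data, 2)
    Filled(isnan(Data(:, j)), j) = Mean(j);
end
Covariance = cov(Filled, 1);     % normalization by N is an assumption
```

Because every record contributes, this approach still produces estimates when few records are complete.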

diagonal

The diagonal method is a worst-case approach
for problematic data, such as disjoint series or excessive
missing data (more than 33% of the data missing). Of the three initialization
methods, this one causes the slowest convergence of the ECM algorithm.
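
A diagonal initial covariance can be sketched as follows for two disjoint series, where no record is complete and nanskip would have nothing to work with. The text above states only that a diagonal covariance is formed; computing the diagonal entries as per-series variances of the available values, normalized by the number of available values, is an assumption made for this illustration, as is the hypothetical Data matrix.

```matlab
% Hypothetical disjoint series: no record has both values observed.
Data = [1 NaN; 2 NaN; 3 NaN; NaN 10; NaN 12; NaN 14];

% Per-series mean from the available values.
Mean = mean(Data, 1, 'omitnan')';

% Per-series variance on the diagonal; all covariances set to zero.
% (Assumed construction; normalization by N via the weight argument 1.)
Covariance = diag(var(Data, 1, 1, 'omitnan'));
```

The off-diagonal entries carry no information about how the series move together, which is consistent with the slow convergence noted above.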