
# Statistics Toolbox

## Regression, Classification, and ANOVA

### Regression

With regression, you can model a continuous response variable as a function of one or more predictors. Statistics Toolbox offers a wide variety of regression algorithms for this purpose, from linear models through the regularized, nonparametric, and mixed-effects techniques described below.

**Webinar:** *Fitting with MATLAB: Statistics, Optimization, and Curve Fitting*. Apply regression algorithms with MATLAB.
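As a minimal sketch of the basic workflow, the toolbox-compatible `regress` function fits an ordinary least-squares model; the data here are made up for illustration:

```matlab
% Ordinary least-squares fit with regress (illustrative data)
x = (1:10)';                       % predictor
y = 2*x + 1 + randn(10,1);         % response with noise
X = [ones(10,1) x];                % design matrix with an intercept column
b = regress(y, X);                 % b(1) ~ intercept, b(2) ~ slope
yhat = X*b;                        % fitted values
```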

You can evaluate goodness of fit using a variety of metrics, including:

• Cross-validated mean squared error
• Akaike information criterion (AIC) and Bayesian information criterion (BIC)
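For example, a cross-validated mean squared error can be computed with the `crossval` function; the prediction function and data below are illustrative assumptions:

```matlab
% 10-fold cross-validated MSE for a linear fit (illustrative data)
X = randn(100,3);
y = X*[1; -2; 0.5] + randn(100,1);
% Train on each fold with backslash, predict on the held-out fold
predfun = @(Xtr, ytr, Xte) Xte * (Xtr \ ytr);
cvMse = crossval('mse', X, y, 'Predfun', predfun, 'KFold', 10);
```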

With the toolbox, you can calculate confidence intervals for both regression coefficients and predicted values.
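As a sketch of both kinds of interval, `regress` returns coefficient confidence intervals directly, and `polyconf` produces prediction bounds for a polynomial fit; the data are invented for illustration:

```matlab
% 95% confidence intervals for coefficients and predictions (illustrative data)
x = (1:20)';
y = 3*x + 2 + randn(20,1);
X = [ones(20,1) x];
[b, bint] = regress(y, X);            % bint: 95% CI for each coefficient
[p, S] = polyfit(x, y, 1);            % equivalent degree-1 polynomial fit
[yfit, delta] = polyconf(p, x, S);    % yfit +/- delta bounds the prediction
```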

Statistics Toolbox supports more advanced techniques to improve predictive accuracy when the dataset includes large numbers of correlated variables. The toolbox supports:

• Subset selection techniques, including sequential feature selection and stepwise regression
• Regularization methods, including ridge regression, lasso, and elastic net

**Webinar:** *Computational Statistics: Feature Selection, Regularization, and Shrinkage with MATLAB*. Learn how to generate accurate fits in the presence of correlated data.
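A minimal sketch of regularized fitting with the `lasso` function, using made-up data with two deliberately correlated predictors:

```matlab
% Lasso with 10-fold cross-validation on correlated predictors (illustrative data)
X = randn(100,10);
X(:,2) = X(:,1) + 0.1*randn(100,1);      % make two predictors highly correlated
y = X(:,1) - X(:,3) + randn(100,1);
[B, FitInfo] = lasso(X, y, 'CV', 10);    % coefficient paths over lambda
bBest = B(:, FitInfo.Index1SE);          % sparse fit at the one-standard-error lambda
% Passing 'Alpha', 0.5 instead fits an elastic net
```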

Statistics Toolbox also supports nonparametric regression techniques for generating an accurate fit without specifying a model that describes the relationship between the predictor and the response. Nonparametric regression techniques include decision trees as well as boosted and bagged regression trees.

**Video:** *Nonparametric Fitting* (4:07). Develop a predictive model without specifying a function that describes the relationship between variables.
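As a sketch, the `TreeBagger` class fits bagged regression trees without any model formula; the nonlinear data here are invented for illustration:

```matlab
% Bagged regression trees with TreeBagger (illustrative data)
X = rand(200,2);
y = sin(4*X(:,1)) + X(:,2).^2 + 0.1*randn(200,1);   % nonlinear relationship
B = TreeBagger(50, X, y, 'Method', 'regression');   % ensemble of 50 trees
yhat = predict(B, [0.5 0.5]);                       % predict at a new point
```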

Additionally, Statistics Toolbox supports nonlinear mixed-effects (NLME) models, in which some parameters of a nonlinear function vary across individuals or groups.

*Figure: Nonlinear mixed-effects model of drug absorption and elimination, showing intrasubject concentration-versus-time profiles. The `nlmefit` function in Statistics Toolbox generates a population model using fixed and random effects.*
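A sketch of the `nlmefit` call for a simple one-compartment elimination model, C(t) = phi1·exp(−phi2·t); the three-subject dataset below is synthetic and purely illustrative:

```matlab
% NLME fit of an exponential elimination model (synthetic data, 3 subjects)
t = repmat((0:6)', 3, 1);                    % sampling times per subject
subject = kron((1:3)', ones(7,1));           % subject identifiers
phiTrue = [10 0.5; 12 0.6; 9 0.4];           % per-subject true parameters
conc = phiTrue(subject,1) .* exp(-phiTrue(subject,2).*t) + 0.1*randn(21,1);
model = @(phi, t) phi(:,1) .* exp(-phi(:,2) .* t);
[beta, PSI] = nlmefit(t, conc, subject, [], model, [10 0.5]);
% beta: fixed effects; PSI: covariance of the random effects
```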

### Classification

Classification algorithms enable you to model a categorical response variable as a function of one or more predictors. Statistics Toolbox offers a wide variety of parametric and nonparametric classification algorithms, such as:

• Boosted and bagged classification trees, including AdaBoost, LogitBoost, GentleBoost, and RobustBoost
• Naïve Bayes classification
• k-Nearest Neighbor (kNN) classification
• Linear discriminant analysis

**Video:** *An Introduction to Classification* (9:00). Develop predictive models for classifying data.
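As a minimal sketch, `fitensemble` trains a boosted ensemble of classification trees; the example uses the Fisher iris data shipped with the toolbox, restricted to two classes because AdaBoostM1 is a binary method:

```matlab
% AdaBoost ensemble of classification trees on two iris classes
load fisheriris
idx = ~strcmp(species, 'setosa');                  % keep two of the three classes
ens = fitensemble(meas(idx,:), species(idx), 'AdaBoostM1', 100, 'Tree');
label = predict(ens, meas(idx,:));                 % resubstitution predictions
```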

You can evaluate goodness of fit for the resulting classification models using techniques such as:

• Cross-validated loss
• Confusion matrices
• Performance curves, such as receiver operating characteristic (ROC) curves
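As a sketch, `confusionmat` and `perfcurve` evaluate a naïve Bayes classifier on two classes of the Fisher iris data; the choice of classifier and the assumption that `'virginica'` is the second class level are illustrative:

```matlab
% Confusion matrix and ROC curve for a naive Bayes classifier
load fisheriris
idx = ~strcmp(species, 'setosa');                 % two-class subset
nb = NaiveBayes.fit(meas(idx,:), species(idx));
pred = predict(nb, meas(idx,:));
C = confusionmat(species(idx), pred);             % rows: true, columns: predicted
post = posterior(nb, meas(idx,:));                % class posterior probabilities
% Assumes 'virginica' is the second class level in post
[fpr, tpr, ~, auc] = perfcurve(species(idx), post(:,2), 'virginica');
```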

### ANOVA

Analysis of variance (ANOVA) enables you to assign sample variance to different sources and determine whether the variation arises within or among different population groups. Statistics Toolbox includes these ANOVA algorithms and related techniques:

• One-way ANOVA
• Two-way ANOVA for balanced data
• Multiway ANOVA for balanced and unbalanced data
• Multivariate ANOVA (MANOVA)
• Nonparametric one-way and two-way ANOVA (Kruskal-Wallis and Friedman)
• Analysis of covariance (ANOCOVA)
• Multiple comparison of group means, slopes, and intercepts
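A minimal sketch combining one-way ANOVA with a multiple comparison of group means; the three-group data are invented for illustration:

```matlab
% One-way ANOVA followed by pairwise group comparisons (illustrative data)
y = [randn(10,1); randn(10,1)+1; randn(10,1)+2];   % three groups, shifted means
group = [ones(10,1); 2*ones(10,1); 3*ones(10,1)];
[p, tbl, stats] = anova1(y, group, 'off');         % p-value for equal-means test
c = multcompare(stats, 'Display', 'off');          % pairwise mean differences with CIs
```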