PCA

The main idea of PCA is to seek the most accurate representation of the data in a lower-dimensional space. For a formal definition, according to Wikipedia, principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.
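As a quick, non-authoritative illustration of that definition, the NumPy sketch below (toy data and variable names are made up for the example) centers a small correlated dataset, diagonalizes its covariance, and checks that the transformed coordinates are uncorrelated:

```python
import numpy as np

# Toy data: 200 samples of 2 correlated variables (illustrative only).
rng = np.random.default_rng(0)
x = rng.multivariate_normal(mean=[0, 0], cov=[[3, 2], [2, 2]], size=200)

# Center the data, then diagonalize the covariance matrix.
x_centered = x - x.mean(axis=0)
cov = np.cov(x_centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]          # sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Orthogonal transformation: project onto the eigenvectors (principal components).
scores = x_centered @ eigvecs

# The transformed variables are (numerically) uncorrelated: off-diagonals are ~0.
print(np.cov(scores, rowvar=False).round(6))
```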

Demo

A nice application of PCA is plane fitting (3D → 2D dimension reduction); a code sketch is given after the figure below.

Another example is projecting data from 2D onto a 1D subspace:

[Figure: PCA illustration, projecting 2D data onto a 1D subspace]
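A minimal plane-fitting sketch, assuming synthetic 3D points generated purely for the example: the two leading eigenvectors of the covariance span the fitted plane, the remaining eigenvector is its normal, and the in-plane coordinates give the 3D → 2D reduction:

```python
import numpy as np

# Hypothetical noisy 3D points lying near a plane (for illustration only).
rng = np.random.default_rng(1)
u, v = rng.normal(size=(2, 300))
noise = 0.05 * rng.normal(size=300)
points = np.column_stack([u, v, 0.5 * u - 0.3 * v + noise])

# PCA plane fit: the two leading eigenvectors of the covariance span the plane,
# and the smallest-eigenvalue eigenvector is the plane's normal.
centered = points - points.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
normal = eigvecs[:, 0]        # direction of least variance
basis = eigvecs[:, 1:]        # plane basis (two largest eigenvalues)

# 3D -> 2D: coordinates of each point within the fitted plane.
coords_2d = centered @ basis
print(normal, coords_2d.shape)
```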

Goal

The goal of PCA is to minimize the projection error. For centered samples x_1, …, x_n approximated in a k-dimensional subspace with orthonormal basis e_1, …, e_k and coefficients a_{ij}, the error is:

$$ J(e_1, \dots, e_k, a_{11}, \dots, a_{nk}) = \sum_{i=1}^{n} \Big\| x_i - \sum_{j=1}^{k} a_{ij} e_j \Big\|^2 $$

The optimal value for each coefficient is just the dot product between x_i and e_j, i.e. a_{ij} = x_i^\top e_j; substituting this back, the total error simplifies to:

$$ J = \sum_{i=1}^{n} \|x_i\|^2 - \sum_{j=1}^{k} e_j^\top S\, e_j, \qquad S = \sum_{i=1}^{n} x_i x_i^\top , $$

so minimizing the error is equivalent to maximizing \(\sum_{j} e_j^\top S\, e_j\) subject to \(\|e_j\| = 1\).

Maximizing \(e^\top S e\) subject to \(\|e\| = 1\) (via a Lagrange multiplier \(\lambda\)) gives

$$ S e = \lambda e , $$

so the optimal directions are the eigenvectors of the scatter matrix S with the largest eigenvalues.

The larger the eigenvalue of S, the larger the variance in the direction of the corresponding eigenvector.
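A small numerical check of these statements (synthetic data, not from the original post): the eigenvalue of the scatter matrix matches the variance along the corresponding eigenvector, and projecting onto the top eigenvector yields a smaller total projection error than projecting onto the other one:

```python
import numpy as np

# Illustrative 2D data with one dominant direction of variance.
rng = np.random.default_rng(2)
x = rng.multivariate_normal([0, 0], [[4.0, 1.5], [1.5, 1.0]], size=1000)
x = x - x.mean(axis=0)

# Scatter matrix S and its eigendecomposition (eigenvalues ascending).
S = x.T @ x
eigvals, eigvecs = np.linalg.eigh(S)
e_min, e_max = eigvecs[:, 0], eigvecs[:, 1]

# Variance along an eigenvector, times n, equals its eigenvalue (data is centered).
print(np.var(x @ e_max) * len(x), eigvals[1])   # approximately equal

# Projecting onto the top eigenvector gives a smaller total error than the other one.
def projection_error(e):
    recon = np.outer(x @ e, e)                  # rank-1 reconstruction
    return np.sum((x - recon) ** 2)

print(projection_error(e_max) < projection_error(e_min))  # True
```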

FLD / LDA

PCA finds the most accurate data representation in a lower-dimensional space. However, the directions of maximum variance may be useless for classification. The Fisher Linear Discriminant (FLD) instead projects onto a line whose direction is useful for classification. Linear discriminant analysis (LDA) is a generalization of Fisher's linear discriminant, a method used in statistics, pattern recognition and machine learning to find a linear combination of features that characterizes or separates two or more classes of objects or events. The resulting combination may be used as a linear classifier, or, more commonly, for dimensionality reduction before later classification.

Main Idea

The main idea of FLD is to find a projection onto a line such that samples from different classes are well separated: the projected class means should be far apart while the variance within each class stays small.

Objective Function

For two classes with projected means \(\tilde m_1, \tilde m_2\) and within-class scatters \(\tilde s_1^2, \tilde s_2^2\), Fisher's criterion is to maximize

$$ J(w) = \frac{(\tilde m_1 - \tilde m_2)^2}{\tilde s_1^2 + \tilde s_2^2} = \frac{w^\top S_B\, w}{w^\top S_W\, w} , $$

which is maximized by \(w \propto S_W^{-1}(m_1 - m_2)\).
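A minimal sketch of the two-class case, with synthetic Gaussian data assumed purely for illustration: it builds S_W, solves for w proportional to S_W^{-1}(m_1 − m_2), and measures the separation of the projected classes:

```python
import numpy as np

# Two illustrative Gaussian classes in 2D.
rng = np.random.default_rng(3)
x1 = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=200)
x2 = rng.multivariate_normal([2, 2], [[1.0, 0.6], [0.6, 1.0]], size=200)

# Class means and within-class scatter matrix S_W.
m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
S_W = (x1 - m1).T @ (x1 - m1) + (x2 - m2).T @ (x2 - m2)

# Fisher direction: w proportional to S_W^{-1} (m1 - m2).
w = np.linalg.solve(S_W, m1 - m2)
w /= np.linalg.norm(w)

# 2D -> 1D projection; the two classes should be well separated along w.
p1, p2 = x1 @ w, x2 @ w
print(abs(p1.mean() - p2.mean()) / np.sqrt(p1.var() + p2.var()))
```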

Two-class FLD projects the data onto a single line chosen to keep the classes easy to separate (with C classes, LDA projects onto at most C − 1 dimensions), whereas PCA can project N-dimensional data onto any number k < N of principal directions, chosen for reconstruction accuracy rather than class separability.