Learning Algorithms

melanie edited this page Mar 9, 2018 · 3 revisions

Supervised Learning

Concepts

Supervised learning is the machine learning task of learning a function that maps from input variables to output variables. The target result is an approximate mapping function that generates predictions from new input data. It is called "supervised" because the process of an algorithm learning from the training dataset can be thought of as a teacher supervising the learning process. Generally, it is known as learning from data that comes with labels.

Supervised learning is mainly used for regression and classification problems.

Approaches

  • Linear Regression
  • Logistic Regression
  • Support Vector Machine
  • Naive Bayes
  • Decision Trees
  • Random Forest
  • Neural Networks
    • Convolutional Neural Network
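As a minimal illustration of the supervised setting, the sketch below fits a linear regression with NumPy's least-squares solver; the dataset and the true weights are invented for the example:

```python
import numpy as np

# Toy labeled dataset: inputs X with a known linear mapping to targets y
# (the data and true weights are invented for this example).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                   # 100 samples, 2 features
true_w = np.array([2.0, -1.0])
y = X @ true_w + 0.01 * rng.normal(size=100)    # labels "supervise" the fit

# Learn the mapping function: ordinary least squares.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# The learned weights approximate the true mapping, so the model can
# generate predictions for new, unseen input data.
x_new = np.array([1.0, 1.0])
prediction = x_new @ w
```

Classification works the same way, except the labels are discrete classes rather than continuous targets.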

Unsupervised Learning

Concepts

Unsupervised learning is the machine learning task of inferring a function that describes hidden patterns in the input data without corresponding labels. The target result is the underlying structure or distribution of the data, modeled in order to learn more about the data. It is called "unsupervised" since there is no teacher supervising the learning process.

Unsupervised learning is mainly used for clustering and association problems.

Approaches

  • K-means Clustering
  • Principal Component Analysis
  • Singular Value Decomposition
  • Independent Component Analysis
  • Nonnegative Matrix Factorization
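As a sketch of the unsupervised setting, the example below runs a bare-bones K-means (Lloyd's algorithm) with NumPy on made-up unlabeled data; the farthest-point initialization is a simplification chosen to keep the sketch deterministic:

```python
import numpy as np

# Two well-separated 2-D clusters; the points carry no labels.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)),
               rng.normal(3.0, 0.3, (50, 2))])

k = 2
# Simple deterministic initialization: the first point, then the point
# farthest from it.
centers = np.array([X[0], X[np.argmax(((X - X[0]) ** 2).sum(axis=1))]])

for _ in range(20):
    # Assignment step: each point joins its nearest center.
    labels = np.argmin(((X[:, None, :] - centers) ** 2).sum(axis=-1), axis=1)
    # Update step: each center moves to the mean of its assigned points.
    centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
```

The algorithm recovers the hidden cluster structure from the data alone, which is the point of the "no labels" setting.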

Non-Negative Matrix Factorization

Concepts

Matrix factorization finds two or more matrix factors whose product is a good approximation to the original matrix. In practice, the dimensions of the factor matrices are usually much smaller than those of the original matrix, which yields a compact representation of the data points and facilitates other learning tasks (e.g., clustering and classification).

Non-negative Matrix Factorization (NMF) adds the constraint that the factor matrices be nonnegative (i.e., all elements must be greater than or equal to zero). The nonnegativity constraint leads to a parts-based representation of the object, in the sense that it allows only additive combinations of the original data. NMF is an ideal dimension-reduction algorithm for image processing, face recognition, and document clustering, where it is natural to consider the object as a combination of parts forming a whole.

Characteristics
  • As an unsupervised learning algorithm (unlabeled data), plain NMF cannot exploit the limited knowledge that domain experts can often provide for real-world problems.
  • Since a small set of labeled data is relatively inexpensive and combining it with unlabeled data improves accuracy, extending NMF to semi-supervised learning has great practical value.
Theories
  • Since the factorization of a matrix is non-unique and can be performed in many ways, methods that incorporate different constraints have been developed.
    • PCA/SVD: decomposes the matrix as a linear combination of principal components (via eigenvalue decomposition)
    • NMF: enforces the constraint that the elements of the factor matrices must be nonnegative.
  • Suppose we have n data points, each of which is m-dimensional and represented by a vector. Placing the vectors as columns, the dataset is represented by a matrix X. NMF then aims to find two nonnegative matrix factors U and V whose product approximates the original matrix:

    X ≈ UV,  where U is m×k, V is k×n, U ≥ 0, V ≥ 0, and k is much smaller than m and n
  • The approximation is quantified by a cost function, which can be constructed from some distance measure.
    • Frobenius norm - the square of the Euclidean distance between the two matrices. The goal of NMF is then to solve:

      min ||X - UV||_F^2  subject to  U ≥ 0, V ≥ 0
    • Divergence of X from Y, where Y is the product UV:

      D(X || Y) = Σ_ij ( X_ij log(X_ij / Y_ij) - X_ij + Y_ij )

      This measure is not symmetric: the divergence of X from Y is not necessarily the same as the divergence of Y from X.
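As a concrete sketch of the factorization and both cost measures, the example below runs Lee and Seung's standard multiplicative updates for the Frobenius objective on a made-up nonnegative matrix:

```python
import numpy as np

# Toy nonnegative data matrix of exact rank k (invented for the example).
rng = np.random.default_rng(2)
m, n, k = 20, 30, 5
X = rng.random((m, k)) @ rng.random((k, n))

# Lee & Seung multiplicative updates for the Frobenius objective
#   min ||X - UV||_F^2  subject to U >= 0, V >= 0.
# Multiplicative updates keep U and V nonnegative at every step.
U = rng.random((m, k))
V = rng.random((k, n))
eps = 1e-12  # guards against division by zero
for _ in range(500):
    U *= (X @ V.T) / (U @ V @ V.T + eps)
    V *= (U.T @ X) / (U.T @ U @ V + eps)

# Frobenius cost: squared Euclidean distance between X and UV.
frob = np.linalg.norm(X - U @ V) ** 2

# Divergence of X from Y = UV (not symmetric in X and Y).
Y = U @ V + eps
div = np.sum(X * np.log((X + eps) / Y) - X + Y)
```

With k much smaller than m and n, the 20×5 and 5×30 factors give a far more compact representation than the original 20×30 matrix.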

Semi-supervised Learning

Concepts

Semi-supervised learning is the machine learning task of learning from a large amount of input data of which only a small amount is labeled. Semi-supervised settings are common with real-world data, since unlabeled data is cheap and easy to collect and store.

Approaches

  • Constrained Nonnegative Matrix Factorization

Constrained Nonnegative Matrix Factorization

  • Takes the label information as additional hard constraints
  • The data points from the same class are merged together in the new representation space, so the obtained representation is consistent with the labels of the original data and has more discriminating power.
  • Parameter free: avoids the cost of tuning parameters to get the best result
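A minimal sketch of how the hard label constraints can be encoded, assuming the common CNMF formulation in which the representation is written as V = AZ with A a label-indicator matrix (the data sizes and labels here are invented):

```python
import numpy as np

# Hypothetical setup: 6 data points, the first 3 labeled (classes 0, 1, 0),
# the rest unlabeled.
labels = [0, 1, 0]          # labels of the first l points
n, l, c = 6, 3, 2           # points, labeled points, classes

# Class-indicator block for the labeled points.
C = np.zeros((l, c))
C[np.arange(l), labels] = 1

# A = [[C, 0], [0, I]]: labeled points of the same class share a row of the
# auxiliary factor Z, while unlabeled points keep free rows.
A = np.zeros((n, c + n - l))
A[:l, :c] = C
A[l:, c:] = np.eye(n - l)

# Writing the representation as V = A @ Z forces same-class labeled points
# to have identical representations, whatever Z the factorization learns.
Z = np.random.default_rng(3).random((c + n - l, 4))
V = A @ Z
```

Because the constraint is built into the structure of A rather than into a penalty term, there is no trade-off weight to tune, which is what "parameter free" means above.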

Comparison: NMF vs CNMF

| NMF | CNMF |
| --- | --- |
| unsupervised learning | semi-supervised learning |
| does not incorporate the label information | takes the label information as constraints |
| no way to directly obtain a solution | parameter free; no cost of tuning parameters |