The class implements the Expectation Maximization algorithm.

#include <ml.hpp>

Public Types
enum	Types { COV_MAT_SPHERICAL = 0, COV_MAT_DIAGONAL = 1, COV_MAT_GENERIC = 2 }
	Type of covariance matrices.
enum
	Default parameters.
enum
	The initial step.
enum	Flags { , RAW_OUTPUT = 1 }
	Predict options.
Public Member Functions
virtual CV_WRAP int	getClustersNumber () const =0
	The number of mixture components in the Gaussian mixture model.
virtual CV_WRAP void	setClustersNumber (int val)=0
	The number of mixture components in the Gaussian mixture model.
virtual CV_WRAP int	getCovarianceMatrixType () const =0
	Constraint on covariance matrices which defines type of matrices.
virtual CV_WRAP void	setCovarianceMatrixType (int val)=0
	Constraint on covariance matrices which defines type of matrices.
virtual CV_WRAP TermCriteria	getTermCriteria () const =0
	The termination criteria of the EM algorithm.
virtual CV_WRAP void	setTermCriteria (const TermCriteria &val)=0
	The termination criteria of the EM algorithm.
virtual CV_WRAP Mat	getWeights () const =0
	Returns weights of the mixtures.
virtual CV_WRAP Mat	getMeans () const =0
	Returns the cluster centers (means of the Gaussian mixture)
virtual CV_WRAP void	getCovs (CV_OUT std::vector< Mat > &covs) const =0
	Returns covariance matrices.
virtual CV_WRAP Vec2d	predict2 (InputArray sample, OutputArray probs) const =0
	Returns a likelihood logarithm value and an index of the most probable mixture component for the given sample.
virtual CV_WRAP bool	trainEM (InputArray samples, OutputArray logLikelihoods=noArray(), OutputArray labels=noArray(), OutputArray probs=noArray())=0
	Estimate the Gaussian mixture parameters from a samples set.
virtual CV_WRAP bool	trainE (InputArray samples, InputArray means0, InputArray covs0=noArray(), InputArray weights0=noArray(), OutputArray logLikelihoods=noArray(), OutputArray labels=noArray(), OutputArray probs=noArray())=0
	Estimate the Gaussian mixture parameters from a samples set.
virtual CV_WRAP bool	trainM (InputArray samples, InputArray probs0, OutputArray logLikelihoods=noArray(), OutputArray labels=noArray(), OutputArray probs=noArray())=0
	Estimate the Gaussian mixture parameters from a samples set.
virtual CV_WRAP int	getVarCount () const =0
	Returns the number of variables in training samples.
virtual CV_WRAP bool	empty () const
	Returns true if the Algorithm is empty (e.g. in the very beginning or after unsuccessful read).
virtual CV_WRAP bool	isTrained () const =0
	Returns true if the model is trained.
virtual CV_WRAP bool	isClassifier () const =0
	Returns true if the model is a classifier.
virtual CV_WRAP bool	train (const Ptr< TrainData > &trainData, int flags=0)
	Trains the statistical model.
virtual CV_WRAP bool	train (InputArray samples, int layout, InputArray responses)
	Trains the statistical model.
virtual CV_WRAP float	calcError (const Ptr< TrainData > &data, bool test, OutputArray resp) const
	Computes error on the training or test dataset.
virtual CV_WRAP float	predict (InputArray samples, OutputArray results=noArray(), int flags=0) const =0
	Predicts response(s) for the provided sample(s)
virtual CV_WRAP void	clear ()
	Clears the algorithm state.
virtual void	write (FileStorage &fs) const
	Stores algorithm parameters in a file storage.
virtual void	read (const FileNode &fn)
	Reads algorithm parameters from a file storage.
virtual CV_WRAP void	save (const String &filename) const
	Saves the algorithm to a file.
virtual CV_WRAP String	getDefaultName () const
	Returns the algorithm string identifier.
Static Public Member Functions
static CV_WRAP Ptr< EM >	create ()
	Creates an empty EM model.
template<typename _Tp >
static Ptr< _Tp >	train (const Ptr< TrainData > &data, int flags=0)
	Create and train model with default parameters.
template<typename _Tp >
static Ptr< _Tp >	read (const FileNode &fn)
	Reads algorithm from the file node.
template<typename _Tp >
static Ptr< _Tp >	load (const String &filename, const String &objname=String())
	Loads algorithm from the file.
template<typename _Tp >
static Ptr< _Tp >	loadFromString (const String &strModel, const String &objname=String())
	Loads algorithm from a String.

Detailed Description

The class implements the Expectation Maximization algorithm.

See also: ml_intro_em

Definition at line 732 of file ml.hpp.

Member Enumeration Documentation

anonymous enum

Default parameters.

Definition at line 757 of file ml.hpp.

anonymous enum

The initial step.

Definition at line 760 of file ml.hpp.

enum Flags [inherited]

Predict options.

Enumerator:

RAW_OUTPUT

makes the method return the raw results (the sum), not the class label

Reimplemented in DTrees.

Definition at line 296 of file ml.hpp.

enum Types

Type of covariance matrices.

Enumerator:

COV_MAT_SPHERICAL

A scaled identity matrix $\mu_k * I$ .

There is only one parameter $\mu_k$ to be estimated for each matrix. The option may be used in special cases, when the constraint is relevant, or as a first step in the optimization (for example, when the data is preprocessed with PCA). The results of such a preliminary estimation may be passed again to the optimization procedure, this time with covMatType=EM::COV_MAT_DIAGONAL.

COV_MAT_DIAGONAL

A diagonal matrix with positive diagonal elements.

The number of free parameters is d for each matrix. This is the most commonly used option, yielding good estimation results.

COV_MAT_GENERIC

A symmetric positive-definite matrix.

The number of free parameters in each matrix is about $d^2/2$ . This option is not recommended unless there is a fairly accurate initial estimate of the parameters and/or a huge number of training samples.

Definition at line 736 of file ml.hpp.
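
The parameter counts quoted for the three covariance types can be checked with a small sketch. The enum below is only a local mirror of the values listed for EM::Types, and covFreeParams is a hypothetical helper, not part of the library. For the generic case it returns the exact count for a symmetric matrix, d(d+1)/2, which is about $d^2/2$ for large d.

```cpp
#include <cstddef>

// Local mirror of the EM::Types values listed above (not the library enum itself).
enum CovType { COV_MAT_SPHERICAL = 0, COV_MAT_DIAGONAL = 1, COV_MAT_GENERIC = 2 };

// Number of free parameters in one d x d covariance matrix of the given type.
// Hypothetical helper written for illustration only.
std::size_t covFreeParams(CovType type, std::size_t d)
{
    switch (type) {
    case COV_MAT_SPHERICAL: return 1;               // single scale factor mu_k
    case COV_MAT_DIAGONAL:  return d;               // one variance per dimension
    case COV_MAT_GENERIC:   return d * (d + 1) / 2; // upper triangle including the diagonal
    }
    return 0; // not reached for valid enum values
}
```

For d = 10 this gives 1, 10 and 55 parameters per matrix, which illustrates why the generic type needs far more training samples per component.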

Member Function Documentation

virtual CV_WRAP float calcError	(	const Ptr< TrainData > &	data,
		bool	test,
		OutputArray	resp
	)		const `[virtual, inherited]`

Computes error on the training or test dataset.

Parameters:

data	the training data
test	if true, the error is computed over the test subset of the data; otherwise it is computed over the training subset. Note that if you load a completely different dataset to evaluate an already trained classifier, you will probably not want to set a test subset at all with TrainData::setTrainTestSplitRatio; instead, specify test=false so that the error is computed over the whole new set.
resp	the optional output responses.

The method uses StatModel::predict to compute the error. For regression models the error is computed as RMS; for classifiers, as the percentage of misclassified samples (0-100%).
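
The two error measures named above can be sketched in plain C++ as follows. This is a minimal illustration under the assumption that predictions and ground truth are already available as vectors; the function names are hypothetical and this is not the library's implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Percentage (0-100) of predicted labels that differ from the true labels.
// Returns 0 for empty or mismatched input (a simplification for this sketch).
double misclassifiedPercent(const std::vector<int>& predicted,
                            const std::vector<int>& truth)
{
    if (predicted.empty() || predicted.size() != truth.size()) return 0.0;
    std::size_t wrong = 0;
    for (std::size_t i = 0; i < predicted.size(); ++i)
        if (predicted[i] != truth[i]) ++wrong;
    return 100.0 * static_cast<double>(wrong) / static_cast<double>(predicted.size());
}

// Root-mean-square difference between predicted and true values.
// Returns 0 for empty or mismatched input (a simplification for this sketch).
double rmsError(const std::vector<double>& predicted,
                const std::vector<double>& truth)
{
    if (predicted.empty() || predicted.size() != truth.size()) return 0.0;
    double sumSq = 0.0;
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        const double diff = predicted[i] - truth[i];
        sumSq += diff * diff;
    }
    return std::sqrt(sumSq / static_cast<double>(predicted.size()));
}
```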

virtual CV_WRAP void clear ( ) [virtual, inherited]

Clears the algorithm state.

Reimplemented in DescriptorMatcher, and FlannBasedMatcher.

Definition at line 2984 of file core.hpp.

static CV_WRAP Ptr<EM> create ( ) [static]

Creates an empty EM model.

The model should then be trained using the StatModel::train(traindata, flags) method. Alternatively, you can use one of the EM::train* methods or load it from a file using Algorithm::load<EM>(filename).

virtual CV_WRAP bool empty ( ) const [virtual, inherited]

Returns true if the Algorithm is empty (e.g. in the very beginning or after unsuccessful read).

Reimplemented from Algorithm.

virtual CV_WRAP int getClustersNumber ( ) const [pure virtual]

The number of mixture components in the Gaussian mixture model.

The default value of the parameter is EM::DEFAULT_NCLUSTERS=5. Some EM implementations can determine the optimal number of mixtures within a specified value range, but that is not yet the case in ML.

See also: setClustersNumber

virtual CV_WRAP int getCovarianceMatrixType ( ) const [pure virtual]

Constraint on covariance matrices which defines type of matrices.

See EM::Types.

See also: setCovarianceMatrixType

virtual CV_WRAP void getCovs ( CV_OUT std::vector< Mat > & covs ) const [pure virtual]

Returns covariance matrices.

Returns a vector of covariance matrices. The number of matrices equals the number of Gaussian mixtures; each matrix is a square floating-point matrix NxN, where N is the space dimensionality.

virtual CV_WRAP String getDefaultName ( ) const [virtual, inherited]

Returns the algorithm string identifier.

This string is used as top level xml/yml node tag when the object is saved to a file or string.

virtual CV_WRAP Mat getMeans ( ) const [pure virtual]

Returns the cluster centers (means of the Gaussian mixture)

Returns a matrix with the number of rows equal to the number of mixtures and the number of columns equal to the space dimensionality.

virtual CV_WRAP TermCriteria getTermCriteria ( ) const [pure virtual]

The termination criteria of the EM algorithm.

The EM algorithm can be terminated by the number of iterations termCrit.maxCount (number of M-steps) or when relative change of likelihood logarithm is less than termCrit.epsilon. Default maximum number of iterations is EM::DEFAULT_MAX_ITERS=100.

See also: setTermCriteria
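
The two stopping conditions described above can be sketched as a single predicate. This is a hedged illustration only: emShouldStop is a hypothetical name, and the relative change is assumed here to be |current - previous| / |previous|; the library's exact test may differ.

```cpp
#include <cmath>

// Returns true when an EM loop should stop, under the assumptions stated above:
// either the iteration budget (maxCount) is used up, or the relative change of
// the log-likelihood between two consecutive iterations drops below epsilon.
bool emShouldStop(int iter, double prevLogLik, double curLogLik,
                  int maxCount, double epsilon)
{
    if (iter >= maxCount) return true;
    const double denom = std::fabs(prevLogLik);
    if (denom == 0.0) return false; // relative change undefined; keep iterating
    return std::fabs(curLogLik - prevLogLik) / denom < epsilon;
}
```

With maxCount set to the documented default of 100, a loop would call this once per M-step and break as soon as it returns true.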

virtual CV_WRAP int getVarCount ( ) const [pure virtual, inherited]

Returns the number of variables in training samples.

virtual CV_WRAP Mat getWeights ( ) const [pure virtual]

Returns weights of the mixtures.

Returns a vector with the number of elements equal to the number of mixtures.

virtual CV_WRAP bool isClassifier ( ) const [pure virtual, inherited]

Returns true if the model is a classifier.

virtual CV_WRAP bool isTrained ( ) const [pure virtual, inherited]

Returns true if the model is trained.

static Ptr<_Tp> load	(	const String &	filename,
		const String &	objname = `String()`
	)		`[static, inherited]`

Loads algorithm from the file.

Parameters:

filename	Name of the file to read.
objname	The optional name of the node to read (if empty, the first top-level node will be used)

This is a static template method of Algorithm. Its usage is as follows (in the case of SVM):

     Ptr<SVM> svm = Algorithm::load<SVM>("my_svm_model.xml");

In order to make this method work, the derived class must override Algorithm::read(const FileNode& fn).

Definition at line 3027 of file core.hpp.

static Ptr<_Tp> loadFromString	(	const String &	strModel,
		const String &	objname = `String()`
	)		`[static, inherited]`

Loads algorithm from a String.

Parameters:

strModel	The string variable containing the model you want to load.
objname	The optional name of the node to read (if empty, the first top-level node will be used)

This is a static template method of Algorithm. Its usage is as follows (in the case of SVM):

     Ptr<SVM> svm = Algorithm::loadFromString<SVM>(myStringModel);

Definition at line 3046 of file core.hpp.

virtual CV_WRAP float predict	(	InputArray	samples,
		OutputArray	results = `noArray()`,
		int	flags = `0`
	)		const `[pure virtual, inherited]`

Predicts response(s) for the provided sample(s)

Parameters:

samples	The input samples, floating-point matrix
results	The optional output matrix of results.
flags	The optional flags, model-dependent. See cv::ml::StatModel::Flags.

Implemented in LogisticRegression.

virtual CV_WRAP Vec2d predict2	(	InputArray	sample,
		OutputArray	probs
	)		const `[pure virtual]`

Returns a likelihood logarithm value and an index of the most probable mixture component for the given sample.

Parameters:

sample	A sample for classification. It should be a one-channel matrix of $1 \times dims$ or $dims \times 1$ size.
probs	Optional output matrix that contains posterior probabilities of each component given the sample. It has $1 \times nclusters$ size and CV_64FC1 type.

The method returns a two-element double vector. Element 0 is the likelihood logarithm value for the sample. Element 1 is the index of the most probable mixture component for the given sample.
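
To make the three outputs concrete, here is a minimal plain C++ sketch of the same computation for a mixture with diagonal covariance matrices. It is an illustration under stated assumptions, not the library's code: MixturePrediction and predictDiagonalMixture are hypothetical names, the model parameters are passed in explicitly, and the log-sum-exp trick is a choice made here for numerical stability.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct MixturePrediction {
    double logLikelihood;      // log of the mixture density at the sample
    int component;             // index of the most probable component
    std::vector<double> probs; // posterior probability of each component
};

// Evaluates a Gaussian mixture with diagonal covariances at one sample.
// means[k] and vars[k] hold the per-dimension mean and variance of component k.
MixturePrediction predictDiagonalMixture(
    const std::vector<double>& sample,
    const std::vector<double>& weights,
    const std::vector<std::vector<double>>& means,
    const std::vector<std::vector<double>>& vars)
{
    const double log2Pi = std::log(2.0 * std::acos(-1.0));
    const std::size_t K = weights.size();

    std::vector<double> logJoint(K); // log(pi_k * N(x | a_k, S_k))
    double maxLog = -std::numeric_limits<double>::infinity();
    int best = 0;
    for (std::size_t k = 0; k < K; ++k) {
        double lp = std::log(weights[k]);
        for (std::size_t j = 0; j < sample.size(); ++j) {
            const double diff = sample[j] - means[k][j];
            lp -= 0.5 * (log2Pi + std::log(vars[k][j]) + diff * diff / vars[k][j]);
        }
        logJoint[k] = lp;
        if (lp > maxLog) { maxLog = lp; best = static_cast<int>(k); }
    }

    // log-sum-exp: log(sum_k exp(logJoint[k])) without underflow
    double sum = 0.0;
    for (std::size_t k = 0; k < K; ++k) sum += std::exp(logJoint[k] - maxLog);

    MixturePrediction result;
    result.logLikelihood = maxLog + std::log(sum);
    result.component = best;
    result.probs.resize(K);
    for (std::size_t k = 0; k < K; ++k)
        result.probs[k] = std::exp(logJoint[k] - result.logLikelihood);
    return result;
}
```

The logLikelihood and component fields correspond to elements 0 and 1 of the returned vector, and probs corresponds to the optional output matrix of posterior probabilities.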

virtual void read ( const FileNode & fn ) [virtual, inherited]

Reads algorithm parameters from a file storage.

Reimplemented in DescriptorMatcher, and FlannBasedMatcher.

Definition at line 2992 of file core.hpp.

static Ptr<_Tp> read ( const FileNode & fn ) [static, inherited]

Reads algorithm from the file node.

This is a static template method of Algorithm. Its usage is as follows (in the case of SVM):

     Ptr<SVM> svm = Algorithm::read<SVM>(fn);

In order to make this method work, the derived class must override Algorithm::read(const FileNode& fn) and also have a static create() method without parameters (or with all parameters optional).

Reimplemented in DescriptorMatcher, and FlannBasedMatcher.

Definition at line 3008 of file core.hpp.

virtual CV_WRAP void save ( const String & filename ) const [virtual, inherited]

Saves the algorithm to a file.

In order to make this method work, the derived class must implement Algorithm::write(FileStorage& fs).

virtual CV_WRAP void setClustersNumber ( int val ) [pure virtual]

The number of mixture components in the Gaussian mixture model.

See also: getClustersNumber

virtual CV_WRAP void setCovarianceMatrixType ( int val ) [pure virtual]

Constraint on covariance matrices which defines type of matrices.

See also: getCovarianceMatrixType

virtual CV_WRAP void setTermCriteria ( const TermCriteria & val ) [pure virtual]

The termination criteria of the EM algorithm.

See also: getTermCriteria

virtual CV_WRAP bool train	(	const Ptr< TrainData > &	trainData,
		int	flags = `0`
	)		`[virtual, inherited]`

Trains the statistical model.

Parameters:

trainData	training data that can be loaded from file using TrainData::loadFromCSV or created with TrainData::create.
flags	optional flags, depending on the model. Some of the models can be updated with the new training samples, not completely overwritten (such as NormalBayesClassifier or ANN_MLP).

virtual CV_WRAP bool train	(	InputArray	samples,
		int	layout,
		InputArray	responses
	)		`[virtual, inherited]`

Trains the statistical model.

Parameters:

samples	training samples
layout	See ml::SampleTypes.
responses	vector of responses associated with the training samples.

static Ptr<_Tp> train	(	const Ptr< TrainData > &	data,
		int	flags = `0`
	)		`[static, inherited]`

Create and train model with default parameters.

The class must implement a static `create()` method with no parameters or with default values for all parameters.

Definition at line 357 of file ml.hpp.

virtual CV_WRAP bool trainE	(	InputArray	samples,
		InputArray	means0,
		InputArray	covs0 = `noArray()`,
		InputArray	weights0 = `noArray()`,
		OutputArray	logLikelihoods = `noArray()`,
		OutputArray	labels = `noArray()`,
		OutputArray	probs = `noArray()`
	)		`[pure virtual]`

Estimate the Gaussian mixture parameters from a samples set.

This variation starts with the Expectation step. You need to provide initial means $a_k$ of the mixture components. Optionally, you can pass initial weights $\pi_k$ and covariance matrices $S_k$ of the mixture components.

Parameters:

samples	Samples from which the Gaussian mixture model will be estimated. It should be a one-channel matrix, each row of which is a sample. If the matrix is not of type CV_64F, it is converted internally to a matrix of that type for further computation.
means0	Initial means of mixture components. It is a one-channel matrix of $nclusters \times dims$ size. If the matrix is not of type CV_64F, it is converted internally to a matrix of that type for further computation.
covs0	The vector of initial covariance matrices of mixture components. Each covariance matrix is a one-channel matrix of $dims \times dims$ size. If the matrices are not of type CV_64F, they are converted internally to matrices of that type for further computation.
weights0	Initial weights $\pi_k$ of mixture components. It should be a one-channel floating-point matrix with $1 \times nclusters$ or $nclusters \times 1$ size.
logLikelihoods	The optional output matrix that contains a likelihood logarithm value for each sample. It has $nsamples \times 1$ size and CV_64FC1 type.
labels	The optional output "class label" for each sample: $\texttt{labels}_i=\texttt{arg max}_k(p_{i,k}), i=1..N$ (indices of the most probable mixture component for each sample). It has $nsamples \times 1$ size and CV_32SC1 type.
probs	The optional output matrix that contains posterior probabilities of each Gaussian mixture component given each sample. It has $nsamples \times nclusters$ size and CV_64FC1 type.

virtual CV_WRAP bool trainEM	(	InputArray	samples,
		OutputArray	logLikelihoods = `noArray()`,
		OutputArray	labels = `noArray()`,
		OutputArray	probs = `noArray()`
	)		`[pure virtual]`

Estimate the Gaussian mixture parameters from a samples set.

This variation starts with the Expectation step. Initial values of the model parameters will be estimated by the k-means algorithm.

Unlike many of the ML models, EM is an unsupervised learning algorithm and it does not take responses (class labels or function values) as input. Instead, it computes the *Maximum Likelihood Estimate* of the Gaussian mixture parameters from an input sample set, stores all the parameters inside the structure: $p_{i,k}$ in probs, $a_k$ in means, $S_k$ in covs[k], $\pi_k$ in weights, and optionally computes the output "class label" for each sample: $\texttt{labels}_i=\texttt{arg max}_k(p_{i,k}), i=1..N$ (indices of the most probable mixture component for each sample).

The trained model can be used further for prediction, just like any other classifier. The trained model is similar to the NormalBayesClassifier.

Parameters:

samples	Samples from which the Gaussian mixture model will be estimated. It should be a one-channel matrix, each row of which is a sample. If the matrix is not of type CV_64F, it is converted internally to a matrix of that type for further computation.
logLikelihoods	The optional output matrix that contains a likelihood logarithm value for each sample. It has $nsamples \times 1$ size and CV_64FC1 type.
labels	The optional output "class label" for each sample: $\texttt{labels}_i=\texttt{arg max}_k(p_{i,k}), i=1..N$ (indices of the most probable mixture component for each sample). It has $nsamples \times 1$ size and CV_32SC1 type.
probs	The optional output matrix that contains posterior probabilities of each Gaussian mixture component given each sample. It has $nsamples \times nclusters$ size and CV_64FC1 type.
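
The relation between the probs and labels outputs, $\texttt{labels}_i=\texttt{arg max}_k(p_{i,k})$, can be sketched in plain C++. The function name labelsFromProbs and the nested-vector layout are assumptions made for this illustration; ties resolve to the lowest index here, and an empty row yields label 0.

```cpp
#include <cstddef>
#include <vector>

// probs[i][k] is the posterior probability of component k for sample i.
// Returns, for each sample, the index of its most probable component.
std::vector<int> labelsFromProbs(const std::vector<std::vector<double>>& probs)
{
    std::vector<int> labels;
    labels.reserve(probs.size());
    for (const std::vector<double>& row : probs) {
        std::size_t best = 0;
        for (std::size_t k = 1; k < row.size(); ++k)
            if (row[k] > row[best]) best = k; // strict '>' keeps the first maximum
        labels.push_back(static_cast<int>(best));
    }
    return labels;
}
```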

virtual CV_WRAP bool trainM	(	InputArray	samples,
		InputArray	probs0,
		OutputArray	logLikelihoods = `noArray()`,
		OutputArray	labels = `noArray()`,
		OutputArray	probs = `noArray()`
	)		`[pure virtual]`

Estimate the Gaussian mixture parameters from a samples set.

This variation starts with the Maximization step. You need to provide initial probabilities $p_{i,k}$ to use this option.

Parameters:

samples	Samples from which the Gaussian mixture model will be estimated. It should be a one-channel matrix, each row of which is a sample. If the matrix is not of type CV_64F, it is converted internally to a matrix of that type for further computation.
probs0	The initial probabilities $p_{i,k}$ described above.
logLikelihoods	The optional output matrix that contains a likelihood logarithm value for each sample. It has $nsamples \times 1$ size and CV_64FC1 type.
labels	The optional output "class label" for each sample: $\texttt{labels}_i=\texttt{arg max}_k(p_{i,k}), i=1..N$ (indices of the most probable mixture component for each sample). It has $nsamples \times 1$ size and CV_32SC1 type.
probs	The optional output matrix that contains posterior probabilities of each Gaussian mixture component given each sample. It has $nsamples \times nclusters$ size and CV_64FC1 type.
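
A Maximization step turns probabilities $p_{i,k}$ into mixture parameters using the standard weighted estimates: $\pi_k$ is the mean of $p_{i,k}$ over the samples, $a_k$ is the $p_{i,k}$-weighted mean of the samples, and $S_k$ is the $p_{i,k}$-weighted scatter around $a_k$. The sketch below shows this for diagonal covariances only. It is a simplified illustration with hypothetical names (DiagonalMixture, mStep), not the library's implementation, and it omits the variance regularization a production version would need.

```cpp
#include <cstddef>
#include <vector>

struct DiagonalMixture {
    std::vector<double> weights;            // pi_k
    std::vector<std::vector<double>> means; // a_k
    std::vector<std::vector<double>> vars;  // diagonal of S_k
};

// One M-step: samples[i] is a sample of dimension D, probs[i][k] is p_{i,k}.
DiagonalMixture mStep(const std::vector<std::vector<double>>& samples,
                      const std::vector<std::vector<double>>& probs)
{
    const std::size_t N = samples.size();
    const std::size_t K = N ? probs[0].size() : 0;
    const std::size_t D = N ? samples[0].size() : 0;

    DiagonalMixture m;
    m.weights.assign(K, 0.0);
    m.means.assign(K, std::vector<double>(D, 0.0));
    m.vars.assign(K, std::vector<double>(D, 0.0));

    for (std::size_t k = 0; k < K; ++k) {
        // Total responsibility of component k and weighted sum of samples.
        double total = 0.0;
        for (std::size_t i = 0; i < N; ++i) {
            total += probs[i][k];
            for (std::size_t j = 0; j < D; ++j)
                m.means[k][j] += probs[i][k] * samples[i][j];
        }
        if (total <= 0.0) continue; // empty component: leave its parameters at zero

        for (std::size_t j = 0; j < D; ++j) m.means[k][j] /= total;

        // Weighted per-dimension variance around the new mean.
        for (std::size_t i = 0; i < N; ++i)
            for (std::size_t j = 0; j < D; ++j) {
                const double diff = samples[i][j] - m.means[k][j];
                m.vars[k][j] += probs[i][k] * diff * diff;
            }
        for (std::size_t j = 0; j < D; ++j) m.vars[k][j] /= total;

        m.weights[k] = total / static_cast<double>(N);
    }
    return m;
}
```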

virtual void write ( FileStorage & fs ) const [virtual, inherited]

Stores algorithm parameters in a file storage.

Reimplemented in DescriptorMatcher, and FlannBasedMatcher.

Definition at line 2988 of file core.hpp.

Type:	Program
Mbed OS support:	Mbed 2 deprecated
Created:	31 Mar 2016

EM Class Reference
[Machine Learning]
