9.5.3. sklearn.mixture.VBGMM¶
- class sklearn.mixture.VBGMM(n_components=1, cvtype='diag', alpha=1.0, random_state=None, thresh=0.01, verbose=False, min_covar=None)¶
Variational Inference for the Gaussian Mixture Model
Variational inference for a Gaussian mixture model probability distribution. This class allows for easy and efficient inference of an approximate posterior distribution over the parameters of a gaussian mixture model with a fixed number of components.
The model is initialized with normally distributed means and identity covariances, which promotes proper convergence.
Parameters : n_components : int, optional
Number of mixture components. Defaults to 1.
cvtype : string (read-only), optional
String describing the type of covariance parameters to use. Must be one of ‘spherical’, ‘tied’, ‘diag’, ‘full’. Defaults to ‘diag’.
alpha : float, optional
Real number representing the concentration parameter of the Dirichlet distribution. Intuitively, the higher the value of alpha, the more likely the variational mixture of Gaussians model is to use all the components it can. Defaults to 1.
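The effect of alpha can be previewed with NumPy's Dirichlet sampler. This is only a sketch of the prior that VBGMM places over the mixing weights, not the variational fit itself: a small concentration parameter favors sparse weight vectors, a large one favors even spreading.

```python
import numpy as np

rng = np.random.RandomState(0)
n_components = 5

# A small concentration puts most prior mass on a few components...
w_small = rng.dirichlet(np.full(n_components, 0.1))
# ...while a large one spreads mass across all of them.
w_large = rng.dirichlet(np.full(n_components, 100.0))

# Both draws are valid mixing-weight vectors: non-negative, summing to 1.
```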
See also
- GMM
- Finite Gaussian mixture model, fit with EM
- DPGMM
- Infinite Gaussian mixture model, using the Dirichlet process, fit with a variational algorithm
Attributes
cvtype
Covariance type of the model.
weights
Mixing weights for each mixture component.
means
Mean parameters for each mixture component.
precisions
Return precisions as a full matrix.
n_features : int
Dimensionality of the Gaussians.
n_components : int (read-only)
Number of mixture components.
converged_ : bool
True when convergence was reached in fit(), False otherwise.
Methods
decode(X)
Find most likely mixture components for each point in X.
eval(X)
Compute a lower bound on the log likelihood of X under the model and an approximate posterior distribution over mixture components.
fit(X)
Estimate the posterior of the model parameters from X using the variational mean-field algorithm.
predict(X)
Like decode, find the most likely mixture component for each observation in X.
rvs(n=1)
Generate n samples from the posterior for the model.
score(X)
Compute the log likelihood of X under the model.
- __init__(n_components=1, cvtype='diag', alpha=1.0, random_state=None, thresh=0.01, verbose=False, min_covar=None)¶
- cvtype¶
Covariance type of the model.
Must be one of ‘spherical’, ‘tied’, ‘diag’, ‘full’.
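The relationship between the reduced covariance types and a full matrix can be sketched in plain NumPy. The shapes below follow the general mixture-model convention and are an assumption for illustration, not quoted from this page:

```python
import numpy as np

n_features = 3

# 'diag': one precision per feature; the full matrix is its diagonal embedding.
diag_prec = np.array([2.0, 0.5, 1.0])
full_from_diag = np.diag(diag_prec)

# 'spherical': a single precision shared by every feature.
spherical_prec = 4.0
full_from_spherical = spherical_prec * np.eye(n_features)

# 'tied': one full n_features x n_features matrix shared by all components;
# 'full': a separate full matrix per component.
```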
- decode(obs)¶
Find most likely mixture components for each point in obs.
Parameters : obs : array_like, shape (n, n_features)
List of n_features-dimensional data points. Each row corresponds to a single data point.
Returns : logprobs : array_like, shape (n_samples,)
Log probability of each point in obs under the model.
components : array_like, shape (n_samples,)
Index of the most likely mixture component for each observation.
- eval(obs=None)¶
Evaluate the model on data
Compute the bound on log probability of obs under the model and return the posterior distribution (responsibilities) of each mixture component for each element of obs.
This is done by computing the parameters for the mean-field of z for each observation.
Parameters : obs : array_like, shape (n_samples, n_features)
List of n_features-dimensional data points. Each row corresponds to a single data point.
Returns : logprob : array_like, shape (n_samples,)
Log probabilities of each data point in obs
posteriors : array_like, shape (n_samples, n_components)
Posterior probabilities of each mixture component for each observation
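The normalization that turns per-component log scores into the two return values of eval can be sketched in plain NumPy. This is a simplified illustration of the log-sum-exp step, not the actual mean-field computation; the log scores here are made-up numbers:

```python
import numpy as np

# Hypothetical unnormalized per-component log scores for 4 points
# and 2 components (in VBGMM these come from the mean-field parameters).
log_scores = np.array([[-1.0, -2.0],
                       [-0.5, -0.5],
                       [-3.0, -0.1],
                       [-2.0, -4.0]])

# Log-sum-exp over components gives the per-point log probability...
m = log_scores.max(axis=1, keepdims=True)
logprob = (m + np.log(np.exp(log_scores - m).sum(axis=1, keepdims=True))).ravel()

# ...and exponentiating the gap to it gives normalized posteriors,
# one row per observation, summing to 1 across components.
posteriors = np.exp(log_scores - logprob[:, None])
```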
- fit(X, n_iter=10, params='wmc', init_params='wmc')¶
Estimate model parameters with the variational algorithm.
For a full derivation and description of the algorithm see doc/dp-derivation/dp-derivation.tex
An initialization step is performed before entering the EM algorithm. If you want to avoid this step, set the keyword argument init_params to the empty string ‘’. Likewise, if you would like to do only the initialization, call this method with n_iter=0.
Parameters : X : array_like, shape (n, n_features)
List of n_features-dimensional data points. Each row corresponds to a single data point.
n_iter : int, optional
Maximum number of iterations to perform before convergence.
params : string, optional
Controls which parameters are updated in the training process. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars. Defaults to ‘wmc’.
init_params : string, optional
Controls which parameters are updated in the initialization process. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars. Defaults to ‘wmc’.
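The alternate-update structure that fit iterates can be sketched with a drastically simplified soft-assignment loop in NumPy. This is a hypothetical illustration of the general idea only: the real variational mean-field algorithm also maintains posteriors over weights and covariances, and uses different update equations.

```python
import numpy as np

rng = np.random.RandomState(1)
# Toy 1-D data drawn from two well-separated groups.
X = np.concatenate([rng.normal(-5, 1, 100), rng.normal(5, 1, 100)])

# Alternate between computing responsibilities (E-like step)
# and updating the means (M-like step), as fit does for the
# parameters selected via the params string.
means = np.array([-1.0, 1.0])
for _ in range(10):
    # Soft assignment under unit-variance Gaussians.
    logp = -0.5 * (X[:, None] - means[None, :]) ** 2
    resp = np.exp(logp - logp.max(axis=1, keepdims=True))
    resp /= resp.sum(axis=1, keepdims=True)
    # Update each mean as a responsibility-weighted average.
    means = (resp * X[:, None]).sum(axis=0) / resp.sum(axis=0)
```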
- means¶
Mean parameters for each mixture component.
- precisions¶
Return precisions as a full matrix.
- predict(X)¶
Predict label for data.
Parameters : X : array-like, shape = [n_samples, n_features]
Returns : C : array, shape = (n_samples,)
- predict_proba(X)¶
Predict posterior probability of data under each Gaussian in the model.
Parameters : X : array-like, shape = [n_samples, n_features]
Returns : T : array-like, shape = (n_samples, n_components)
Returns the probability of the sample for each Gaussian (state) in the model.
- rvs(n_samples=1, random_state=None)¶
Generate random samples from the model.
Parameters : n_samples : int, optional
Number of samples to generate. Defaults to 1.
Returns : obs : array_like, shape (n_samples, n_features)
List of samples
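Sampling from a fitted mixture follows a two-stage recipe that can be sketched in plain NumPy: pick a component according to the mixing weights, then draw from that component's Gaussian. The parameter values below are made up for illustration, and identity covariance is assumed:

```python
import numpy as np

rng = np.random.RandomState(0)

# Hypothetical fitted parameters: 2 components, 2 features.
weights = np.array([0.3, 0.7])
means = np.array([[0.0, 0.0],
                  [5.0, 5.0]])

n_samples = 1000
# Pick a component for each sample according to the mixing weights...
comp = rng.choice(len(weights), size=n_samples, p=weights)
# ...then draw from that component's Gaussian (identity covariance here).
obs = means[comp] + rng.randn(n_samples, 2)
```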
- score(obs)¶
Compute the log probability under the model.
Parameters : obs : array_like, shape (n_samples, n_features)
List of n_features-dimensional data points. Each row corresponds to a single data point.
Returns : logprob : array_like, shape (n_samples,)
Log probabilities of each data point in obs
- set_params(**params)¶
Set the parameters of the estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The former have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns : self
- weights¶
Mixing weights for each mixture component.