This documentation is for scikit-learn version 0.11-git.

8.5.15. sklearn.decomposition.dict_learning_online

sklearn.decomposition.dict_learning_online(X, n_atoms, alpha, n_iter=100, return_code=True, dict_init=None, callback=None, chunk_size=3, verbose=False, shuffle=True, n_jobs=1, method='lars', iter_offset=0, random_state=None)

Solves a dictionary learning matrix factorization problem online.

Finds the best dictionary and the corresponding sparse code for approximating the data matrix X by solving:

(U^*, V^*) = argmin 0.5 || X - U V ||_2^2 + alpha * || U ||_1
             (U,V)
             with || V_k ||_2 = 1 for all  0 <= k < n_atoms

where V is the dictionary and U is the sparse code. The problem is solved by repeatedly iterating over mini-batches obtained by slicing the input data.
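As an illustration of the objective above, the following minimal NumPy sketch evaluates it for a given code U and dictionary V. It assumes the squared norm is read as the sum of squared entries of the residual and the L1 term as the sum of absolute entries of U. The helper name `dict_learning_objective` and the toy data are hypothetical and not part of scikit-learn.

```python
import numpy as np


def dict_learning_objective(X, U, V, alpha):
    """Evaluate 0.5 * ||X - U V||^2 + alpha * ||U||_1 (hypothetical helper)."""
    residual = X - np.dot(U, V)
    data_fit = 0.5 * np.sum(residual ** 2)   # sum of squared residual entries
    sparsity = alpha * np.sum(np.abs(U))     # entrywise L1 norm of the code
    return data_fit + sparsity


rng = np.random.RandomState(0)
n_samples, n_features, n_atoms = 6, 4, 3

# Dictionary with unit-norm atoms (rows), matching the constraint ||V_k||_2 = 1.
V = rng.randn(n_atoms, n_features)
V /= np.linalg.norm(V, axis=1, keepdims=True)

# A very sparse code: each sample uses exactly one atom with weight 1.
U = np.zeros((n_samples, n_atoms))
U[np.arange(n_samples), rng.randint(n_atoms, size=n_samples)] = 1.0

# Data that the pair (U, V) reconstructs exactly, so only the penalty remains.
X = np.dot(U, V)
objective = dict_learning_objective(X, U, V, alpha=0.1)
```

Because X is built to equal U V exactly, the data-fit term vanishes and the objective reduces to alpha times the number of nonzero unit entries in U.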

Parameters :

X: array of shape (n_samples, n_features) :

data matrix

n_atoms: int, :

number of dictionary atoms to extract

alpha: float, :

sparsity-controlling parameter

n_iter: int, :

number of iterations to perform

return_code: boolean, :

whether to also return the code U or just the dictionary V

dict_init: array of shape (n_atoms, n_features), :

initial value for the dictionary for warm restart scenarios

callback: callable, :

callable that gets invoked every five iterations

chunk_size: int, :

the number of samples to take in each batch

verbose: :

degree of output the procedure will print

shuffle: boolean, :

whether to shuffle the data before splitting it in batches

n_jobs: int, :

number of parallel jobs to run, or -1 to autodetect.

method: {‘lars’, ‘cd’} :

lars: uses the least angle regression method to solve the lasso problem (linear_model.lars_path)

cd: uses the coordinate descent method to compute the Lasso solution (linear_model.Lasso)

lars will be faster if the estimated components are sparse.

iter_offset: int, default 0 :

number of previous iterations completed on the dictionary used for initialization

random_state: int or RandomState :

Pseudo-random number generator state used for random sampling.

Returns :

code: array of shape (n_samples, n_atoms), :

the sparse code (only returned if return_code=True)

dictionary: array of shape (n_atoms, n_features), :

the learned dictionary V, i.e. the solution to the dictionary learning problem
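The way n_iter, chunk_size and shuffle interact can be pictured with a small sketch. The code below is not the library's implementation. It only assumes one plausible reading of the parameter descriptions: rows are optionally shuffled, cut into consecutive slices of chunk_size samples, and the slices are cycled until n_iter batches have been produced. The function name `iter_minibatches` is hypothetical.

```python
import itertools

import numpy as np


def iter_minibatches(X, chunk_size, n_iter, shuffle=True, random_state=0):
    """Yield n_iter mini-batches of at most chunk_size rows (illustrative only)."""
    rng = np.random.RandomState(random_state)
    X_train = X.copy()
    if shuffle:
        rng.shuffle(X_train)  # shuffles the rows (first axis) in place
    starts = range(0, X_train.shape[0], chunk_size)
    slices = [X_train[s:s + chunk_size] for s in starts]
    # Cycle over the slices so that n_iter may exceed the number of slices.
    return itertools.islice(itertools.cycle(slices), n_iter)


X = np.arange(20, dtype=float).reshape(10, 2)
batches = list(iter_minibatches(X, chunk_size=3, n_iter=5))
batch_sizes = [b.shape[0] for b in batches]
```

With 10 samples and chunk_size=3, one pass yields four slices (the last holding the single leftover row), and the fifth batch wraps around to the first slice again.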