9.2.4. sklearn.linear_model.Lasso
- class sklearn.linear_model.Lasso(alpha=1.0, fit_intercept=True, normalize=False, precompute='auto', overwrite_X=False, max_iter=1000, tol=0.0001)
Linear model trained with an L1 prior as regularizer (also known as the Lasso).
Technically the Lasso model is optimizing the same objective function as the Elastic Net with rho=1.0 (no L2 penalty).
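For reference, the shared objective can be written out as below. This is a sketch rather than a statement taken from this page: it assumes the 1/(2 n_samples) scaling of the squared error used in later scikit-learn documentation. That scaling is consistent with the coefficients shown in the Examples section, but the exact form is an assumption here.

```latex
% Elastic Net objective (assumed scaling); w is the coefficient vector
\min_{w}\; \frac{1}{2\, n_{\mathrm{samples}}} \lVert y - Xw \rVert_2^2
  + \alpha \rho \lVert w \rVert_1
  + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2

% With \rho = 1 the L2 term vanishes, leaving the Lasso objective
\min_{w}\; \frac{1}{2\, n_{\mathrm{samples}}} \lVert y - Xw \rVert_2^2
  + \alpha \lVert w \rVert_1
```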
Parameters : alpha : float, optional
Constant that multiplies the L1 term. Defaults to 1.0
fit_intercept : boolean
Whether to calculate the intercept for this model. If set to False, no intercept is used in the calculations (e.g. the data is expected to be already centered).
normalize : boolean, optional
If True, the regressors X are normalized
overwrite_X : boolean, optional
If True, X will not be copied and may be overwritten. Default is False.
precompute : True | False | ‘auto’ | array-like
Whether to use a precomputed Gram matrix to speed up calculations. If set to ‘auto’, the estimator decides. The Gram matrix can also be passed as an argument.
max_iter : int, optional
The maximum number of iterations.
tol : float, optional
The tolerance for the optimization: if the updates are smaller than ‘tol’, the optimization code checks the dual gap for optimality and continues until the dual gap is smaller than ‘tol’.
See also
LassoLars, decomposition.sparse_encode, decomposition.sparse_encode_parallel
Notes
The algorithm used to fit the model is coordinate descent.
To avoid unnecessary memory duplication, the X argument of the fit method should be passed directly as a Fortran-contiguous numpy array.
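The coordinate descent idea mentioned above can be sketched in pure Python. This is a minimal illustration, not the library's implementation: the function names `soft_threshold` and `lasso_coordinate_descent` are hypothetical, the objective is assumed to be (1 / (2 * n_samples)) * ||y - Xw||^2 + alpha * ||w||_1, and the stopping rule is simplified (it does not check the dual gap).

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha=1.0, max_iter=1000, tol=1e-4):
    """Toy Lasso solver; X is a list of rows, y a list of targets."""
    n_samples, n_features = len(X), len(X[0])

    # Centre the data so the intercept can be recovered afterwards.
    x_mean = [sum(row[j] for row in X) / n_samples for j in range(n_features)]
    y_mean = sum(y) / n_samples
    columns = [[row[j] - x_mean[j] for row in X] for j in range(n_features)]
    residual = [target - y_mean for target in y]  # residual for w = 0

    coef = [0.0] * n_features
    for _ in range(max_iter):
        largest_update = 0.0
        for j, column in enumerate(columns):
            norm_sq = sum(v * v for v in column)
            if norm_sq == 0.0:
                continue  # constant feature: coefficient stays at zero
            # Correlation of column j with the residual that ignores feature j.
            corr_j = sum(v * r for v, r in zip(column, residual)) + coef[j] * norm_sq
            new_coef = soft_threshold(corr_j, alpha * n_samples) / norm_sq
            delta = new_coef - coef[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                residual = [r - delta * v for r, v in zip(residual, column)]
                coef[j] = new_coef
            largest_update = max(largest_update, abs(delta))
        if largest_update < tol:
            break  # simplified stopping rule (assumption)

    intercept = y_mean - sum(c * m for c, m in zip(coef, x_mean))
    return coef, intercept


coef, intercept = lasso_coordinate_descent(
    [[0, 0], [1, 1], [2, 2]], [0, 1, 2], alpha=0.1)
print([round(c, 2) for c in coef], round(intercept, 2))  # [0.85, 0.0] 0.15
```

Because the two columns are identical, the first coordinate visited absorbs the whole signal and the second is thresholded to zero, which is why only one coefficient is non-zero.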
Examples
>>> from sklearn import linear_model
>>> clf = linear_model.Lasso(alpha=0.1)
>>> clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
Lasso(alpha=0.1, fit_intercept=True, max_iter=1000, normalize=False,
   overwrite_X=False, precompute='auto', tol=0.0001)
>>> print clf.coef_
[ 0.85  0.  ]
>>> print clf.intercept_
0.15
Attributes
coef_ : array, shape = [n_features]
Parameter vector (w in the problem formulation).
intercept_ : float
Independent term in the decision function.
Methods
fit(X, y[, Xy, coef_init]) : Fit Elastic Net model with coordinate descent
predict(X) : Predict using the linear model
score(X, y) : Returns the coefficient of determination of the prediction
set_params(**params) : Set the parameters of the estimator.
- __init__(alpha=1.0, fit_intercept=True, normalize=False, precompute='auto', overwrite_X=False, max_iter=1000, tol=0.0001)
- fit(X, y, Xy=None, coef_init=None)
Fit Elastic Net model with coordinate descent
Parameters : X : ndarray, shape (n_samples, n_features)
Data
y : ndarray, shape (n_samples,)
Target
Xy : array-like, optional
Xy = np.dot(X.T, y) that can be precomputed. It is useful only when the Gram matrix is precomputed.
coef_init : ndarray, shape (n_features,)
The initial coefficients to warm-start the optimization
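To illustrate why a precomputed Gram matrix and Xy go together, the sketch below shows that the quantity a coordinate update needs, X_j^T (y - Xw), can be obtained from X^T X and X^T y alone, without touching X again. The helper names `gram_and_xy` and `feature_correlation` are hypothetical, and this is not a claim about how the library organizes its computation.

```python
def gram_and_xy(X, y):
    """Return (X^T X, X^T y) for X given as a list of rows."""
    n_features = len(X[0])
    gram = [[sum(row[i] * row[j] for row in X) for j in range(n_features)]
            for i in range(n_features)]
    xy = [sum(row[j] * target for row, target in zip(X, y))
          for j in range(n_features)]
    return gram, xy


def feature_correlation(gram, xy, coef, j):
    """X_j^T (y - X w), computed from the precomputed quantities only."""
    return xy[j] - sum(gram[j][k] * coef[k] for k in range(len(coef)))


X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
y = [0.0, 1.0, 2.0]
gram, xy = gram_and_xy(X, y)
print(gram)  # [[5.0, 5.0], [5.0, 5.0]]
print(xy)    # [5.0, 5.0]

# With w = [1, 0] the model fits y exactly, so the correlation is zero.
print(feature_correlation(gram, xy, [1.0, 0.0], 0))  # 0.0
```

The Gram matrix has shape (n_features, n_features), so this approach tends to pay off when n_samples is much larger than n_features.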
Notes
Coordinate descent considers one column of the data at a time, so the X input is automatically converted to a Fortran-contiguous numpy array if necessary.
To avoid memory re-allocation, it is advised to allocate the initial data in memory directly in that format.
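A short sketch of what "Fortran-contiguous" means in NumPy terms, using only standard NumPy calls (`np.asfortranarray`, the `order='F'` argument, and the `flags` attribute). The array names are illustrative.

```python
import numpy as np

# NumPy arrays are C-ordered (row-major) by default.
X_c = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
print(X_c.flags['F_CONTIGUOUS'])  # False

# Converting a C-ordered array produces a column-major copy.
X_f = np.asfortranarray(X_c)
print(X_f.flags['F_CONTIGUOUS'])     # True
print(np.shares_memory(X_c, X_f))    # False: the data was duplicated

# Allocating in Fortran order from the start avoids that copy.
X_direct = np.zeros((3, 2), order='F')
X_direct[:] = X_c
print(X_direct.flags['F_CONTIGUOUS'])  # True

# An array that is already Fortran-ordered is not copied again.
print(np.shares_memory(np.asfortranarray(X_f), X_f))  # True
```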
- predict(X)
Predict using the linear model
Parameters : X : numpy array of shape [n_samples, n_features]
Returns : C : array, shape = [n_samples]
Returns predicted values.
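As an illustration, prediction with a linear model amounts to a dot product with coef_ plus intercept_. The sketch below assumes exactly that and reuses the coefficient values from the Examples section; `predict_linear` is a hypothetical helper, not the library method.

```python
def predict_linear(X, coef, intercept):
    """Return X . coef + intercept for X given as a list of rows."""
    return [sum(c * v for c, v in zip(coef, row)) + intercept for row in X]


# Values taken from the fitted example: coef_ = [0.85, 0.], intercept_ = 0.15
predictions = predict_linear(
    [[0, 0], [1, 1], [2, 2], [3, 3]], [0.85, 0.0], 0.15)
print([round(p, 2) for p in predictions])  # [0.15, 1.0, 1.85, 2.7]
```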
- score(X, y)
Returns the coefficient of determination of the prediction
Parameters : X : array-like, shape = [n_samples, n_features]
Training set.
y : array-like, shape = [n_samples]
Returns : z : float
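The coefficient of determination (R^2) is conventionally defined as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes that standard definition; `coefficient_of_determination` is a hypothetical helper, and the predictions are the ones the fitted example model would give on its own training data.

```python
def coefficient_of_determination(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, using the mean of y_true as baseline."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [0.0, 1.0, 2.0]
y_pred = [0.15, 1.0, 1.85]  # 0.85 * x + 0.15 for x = 0, 1, 2
z = coefficient_of_determination(y_true, y_pred)
print(round(z, 4))  # 0.9775
```

A perfect prediction gives 1.0, and always predicting the mean of y gives 0.0.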
- set_params(**params)
Set the parameters of the estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns : self
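The <component>__<parameter> convention can be pictured with a toy class. This is only a sketch of the naming scheme: `ToyEstimator` is hypothetical and does not reproduce the library's actual implementation.

```python
class ToyEstimator:
    """Hypothetical stand-in for an estimator; not scikit-learn code."""

    def __init__(self, **params):
        self.params = dict(params)

    def set_params(self, **params):
        for key, value in params.items():
            name, separator, sub_key = key.partition('__')
            if name not in self.params:
                raise ValueError('Invalid parameter %r' % name)
            if separator:
                # '<component>__<parameter>': delegate to the nested object.
                self.params[name].set_params(**{sub_key: value})
            else:
                self.params[name] = value
        return self  # returning self allows chained calls


lasso = ToyEstimator(alpha=1.0, tol=0.0001)
pipeline = ToyEstimator(model=lasso)

# Plain name on a simple estimator, prefixed name on a nested one.
lasso.set_params(tol=0.001)
result = pipeline.set_params(model__alpha=0.1)
print(lasso.params)        # {'alpha': 0.1, 'tol': 0.001}
print(result is pipeline)  # True
```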