8.15.1.7. sklearn.linear_model.ElasticNet

class sklearn.linear_model.ElasticNet(alpha=1.0, rho=0.5, fit_intercept=True, normalize=False, precompute='auto', max_iter=1000, copy_X=True, tol=0.0001, warm_start=False)

Linear model trained with combined L1 and L2 priors as regularizer.

Minimizes the objective function:

1 / (2 * n_samples) * ||y - Xw||^2_2
+ alpha * rho * ||w||_1 + 0.5 * alpha * (1 - rho) * ||w||^2_2
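As a minimal sketch, the objective above can be evaluated directly with NumPy. The function name `elastic_net_objective` and the toy data are illustrative assumptions, not part of the library, and the intercept is left out for brevity.

```python
import numpy as np


def elastic_net_objective(X, y, w, alpha=1.0, rho=0.5):
    """Evaluate the elastic net objective for coefficients w (no intercept)."""
    n_samples = X.shape[0]
    residual = y - X @ w
    # Data-fit term: 1 / (2 * n_samples) * ||y - Xw||^2_2
    data_fit = residual @ residual / (2.0 * n_samples)
    # L1 part of the penalty: alpha * rho * ||w||_1
    l1_term = alpha * rho * np.abs(w).sum()
    # L2 part of the penalty: 0.5 * alpha * (1 - rho) * ||w||^2_2
    l2_term = 0.5 * alpha * (1.0 - rho) * (w @ w)
    return data_fit + l1_term + l2_term


# Toy data (hypothetical): w fits y exactly, so only the penalty remains.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 2.0])
value = elastic_net_objective(X, y, w, alpha=1.0, rho=0.5)
```

With a perfect fit the value reduces to the penalty alone, which makes it easy to check the two penalty terms by hand.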

If you are interested in controlling the L1 and L2 penalty separately, keep in mind that this is equivalent to:

a * L1 + b * L2

where:

alpha = a + b and rho = a / (a + b)
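The conversion can be sketched as two small helpers. The function names are hypothetical, and the sketch assumes that L2 stands for 0.5 * ||w||^2_2, which is the reading under which a * L1 + b * L2 matches the objective term by term (alpha * rho = a and alpha * (1 - rho) = b).

```python
def penalties_to_alpha_rho(a, b):
    """Map separate L1 weight a and L2 weight b to (alpha, rho)."""
    if a < 0 or b < 0 or a + b == 0:
        raise ValueError("a and b must be non-negative and not both zero")
    alpha = a + b
    rho = a / (a + b)
    return alpha, rho


def alpha_rho_to_penalties(alpha, rho):
    """Inverse mapping: recover the separate L1 and L2 weights."""
    a = alpha * rho
    b = alpha * (1.0 - rho)
    return a, b


# Example: weight 0.3 on the L1 term and 0.1 on the L2 term.
alpha, rho = penalties_to_alpha_rho(0.3, 0.1)
a, b = alpha_rho_to_penalties(alpha, rho)
```

The round trip returns the original weights up to floating-point rounding.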

The parameter rho corresponds to alpha in the glmnet R package while alpha corresponds to the lambda parameter in glmnet. Specifically, rho = 1 is the lasso penalty. Currently, rho <= 0.01 is not reliable, unless you supply your own sequence of alpha.

Parameters :

alpha : float

Constant that multiplies the penalty terms. Defaults to 1.0. See the notes for the exact mathematical meaning of this parameter.

rho : float

The ElasticNet mixing parameter, with 0 < rho <= 1. For rho = 1 the penalty is an L1 penalty (the lasso); in the limit rho = 0 it is an L2 penalty. For 0 < rho < 1, the penalty is a combination of L1 and L2.

fit_intercept : bool

Whether the intercept should be estimated or not. If False, the data is assumed to be already centered.

normalize : boolean, optional

If True, the regressors X are normalized before fitting.

precompute : True | False | ‘auto’ | array-like

Whether to use a precomputed Gram matrix to speed up calculations. If set to ‘auto’, the estimator decides. The Gram matrix can also be passed as an argument.

max_iter : int, optional

The maximum number of iterations.

copy_X : boolean, optional, default True

If True, X will be copied; else, it may be overwritten.

tol : float, optional

The tolerance for the optimization: if the updates are smaller than ‘tol’, the optimization code checks the dual gap for optimality and continues until the dual gap is smaller than ‘tol’.

warm_start : bool, optional

When set to True, reuse the solution of the previous call to fit as initialization; otherwise, erase the previous solution.

Notes

To avoid unnecessary memory duplication, the X argument of the fit method should be passed directly as a Fortran-contiguous numpy array.
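A short NumPy sketch of what that memory layout means in practice. The array names are illustrative; only standard NumPy calls are used.

```python
import numpy as np

# NumPy arrays are C-contiguous (row-major) by default.
X_c = np.arange(6, dtype=np.float64).reshape(3, 2)

# Converting afterwards makes a column-major copy of the data.
X_f = np.asfortranarray(X_c)

# Allocating in Fortran order from the start avoids that extra copy.
X_direct = np.zeros((3, 2), dtype=np.float64, order="F")

# An array that is already Fortran-contiguous is not copied again.
X_again = np.asfortranarray(X_f)
```

Checking `X.flags['F_CONTIGUOUS']` before calling fit is a cheap way to see whether a conversion copy is likely.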

Methods

`decision_function`(X)	Decision function of the linear model
`fit`(X, y[, Xy, coef_init])	Fit Elastic Net model with coordinate descent
`get_params`([deep])	Get parameters for the estimator
`predict`(X)	Predict using the linear model
`score`(X, y)	Returns the coefficient of determination R^2 of the prediction.
`set_params`(**params)	Set the parameters of the estimator.

__init__(alpha=1.0, rho=0.5, fit_intercept=True, normalize=False, precompute='auto', max_iter=1000, copy_X=True, tol=0.0001, warm_start=False)

decision_function(X)

Decision function of the linear model

Parameters :

X : numpy array of shape [n_samples, n_features]

Returns :

C : array, shape = [n_samples]

Returns predicted values.

fit(X, y, Xy=None, coef_init=None)

Fit Elastic Net model with coordinate descent

Parameters :

X : ndarray, (n_samples, n_features)

Data

y : ndarray, (n_samples)

Target

Xy : array-like, optional

Xy = np.dot(X.T, y) that can be precomputed. It is useful only when the Gram matrix is precomputed.
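The following sketch shows why Xy pairs naturally with a precomputed Gram matrix: together they are enough to form the gradient of the squared-error term without touching X again. The random data and variable names are assumptions made for illustration; this is not the library's internal code.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(50, 4)
y = rng.randn(50)

# Quantities that can be computed once and reused.
gram = np.dot(X.T, X)  # shape (n_features, n_features)
Xy = np.dot(X.T, y)    # shape (n_features,)

# For any coefficient vector w, X.T @ (X @ w - y) equals gram @ w - Xy,
# so the solver can work in feature space only.
w = rng.randn(4)
grad_direct = np.dot(X.T, np.dot(X, w) - y)
grad_from_gram = np.dot(gram, w) - Xy
```

This saves work mainly when n_samples is much larger than n_features, since the precomputed quantities no longer depend on n_samples.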

coef_init : ndarray of shape n_features

The initial coefficients to warm-start the optimization.

Notes

Coordinate descent is an algorithm that considers one column of the data at a time; hence it will automatically convert the X input to a Fortran-contiguous numpy array if necessary.

To avoid memory re-allocation it is advised to allocate the initial data in memory directly using that format.
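To illustrate the column-at-a-time access pattern, here is a minimal coordinate descent sketch for the elastic net objective. It is an assumption-laden illustration, not the library's solver: it omits the intercept, and it stops when the largest coefficient update falls below `tol` instead of checking the dual gap. The names `soft_threshold` and `elastic_net_cd` are hypothetical.

```python
import numpy as np


def soft_threshold(z, threshold):
    """Shrink z towards zero by threshold (the L1 proximal step)."""
    return np.sign(z) * max(abs(z) - threshold, 0.0)


def elastic_net_cd(X, y, alpha=1.0, rho=0.5, max_iter=1000, tol=1e-4):
    """Cyclic coordinate descent for the elastic net (no intercept)."""
    # Column-major storage makes each X[:, j] a contiguous slice.
    X = np.asfortranarray(X, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    n_samples, n_features = X.shape
    l1 = alpha * rho
    l2 = alpha * (1.0 - rho)
    w = np.zeros(n_features)
    residual = y.copy()  # equals y - X @ w for w = 0
    col_sq = (X ** 2).sum(axis=0) / n_samples
    for _ in range(max_iter):
        max_update = 0.0
        for j in range(n_features):
            if col_sq[j] == 0.0:
                continue  # an all-zero column keeps a zero coefficient
            w_old = w[j]
            # Correlation of column j with the residual that excludes it.
            z = X[:, j] @ residual / n_samples + col_sq[j] * w_old
            w[j] = soft_threshold(z, l1) / (col_sq[j] + l2)
            if w[j] != w_old:
                residual -= X[:, j] * (w[j] - w_old)
                max_update = max(max_update, abs(w[j] - w_old))
        if max_update < tol:  # simplified stopping rule (assumption)
            break
    return w


# Orthogonal toy design, so each coordinate has a closed-form solution.
X_demo = np.sqrt(2.0) * np.eye(2)
y_demo = np.sqrt(2.0) * np.array([3.0, 0.2])
w_demo = elastic_net_cd(X_demo, y_demo, alpha=1.0, rho=0.5)
```

In the toy design the second coefficient is set exactly to zero by the L1 part of the penalty, while the first is shrunk by both parts.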

get_params(deep=True)

Get parameters for the estimator

Parameters :

deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

predict(X)

Predict using the linear model

Parameters :

X : numpy array of shape [n_samples, n_features]

Returns :

C : array, shape = [n_samples]

Returns predicted values.
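For a linear model the prediction is a matrix-vector product plus an offset. The sketch below assumes hypothetical arrays `coef` and `intercept` standing in for whatever the fitted estimator stores; it shows the arithmetic only.

```python
import numpy as np


def linear_predict(X, coef, intercept=0.0):
    """Return X @ coef + intercept, one value per row of X."""
    X = np.asarray(X, dtype=np.float64)
    return X @ coef + intercept


# Hypothetical fitted values, chosen only for illustration.
X_new = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
coef = np.array([0.5, -1.0])
predictions = linear_predict(X_new, coef, intercept=2.0)
```

The result has shape [n_samples], matching the documented return value.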

score(X, y)

Returns the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0; lower values are worse.
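The definition can be sketched in a few lines of NumPy. The helper name `r2_score_sketch` is an assumption for illustration; u is computed from the prediction errors and v from the deviations of the true targets about their mean.

```python
import numpy as np


def r2_score_sketch(y_true, y_pred):
    """Coefficient of determination, 1 - u / v."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    u = ((y_true - y_pred) ** 2).sum()         # squared prediction errors
    v = ((y_true - y_true.mean()) ** 2).sum()  # spread of the true targets
    return 1.0 - u / v


y_true = np.array([1.0, 2.0, 3.0, 4.0])
perfect = r2_score_sketch(y_true, y_true)
mean_only = r2_score_sketch(y_true, np.full(4, y_true.mean()))
reversed_pred = r2_score_sketch(y_true, y_true[::-1])
```

Perfect predictions score 1.0, always predicting the mean scores 0.0, and predictions worse than the mean give a negative score.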

Parameters :

X : array-like, shape = [n_samples, n_features]

Training set.

y : array-like, shape = [n_samples]

Returns :

z : float

set_params(**params)

Set the parameters of the estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that it is possible to update each component of a nested object.

Returns : self
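The <component>__<parameter> naming convention can be illustrated with a small pure-Python sketch that groups parameter names by component. The helper `group_nested_params` and the component name `regressor` are hypothetical; this shows the naming scheme only and is not how the library implements set_params.

```python
def group_nested_params(params):
    """Group 'component__parameter' keys by component.

    Keys without a double underscore belong to the outer object and are
    collected under the key None.
    """
    grouped = {}
    for name, value in params.items():
        component, separator, sub_name = name.partition("__")
        if separator:
            grouped.setdefault(component, {})[sub_name] = value
        else:
            grouped.setdefault(None, {})[name] = value
    return grouped


grouped = group_nested_params(
    {"regressor__alpha": 0.1, "regressor__rho": 0.7, "verbose": True}
)
```

Here the two prefixed names end up under the `regressor` component, and the plain name stays with the outer object.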
