This documentation is for scikit-learn version 0.11-git.

8.18.4. sklearn.multiclass.OutputCodeClassifier

class sklearn.multiclass.OutputCodeClassifier(estimator, code_size=1.5, random_state=None)

(Error-Correcting) Output-Code multiclass strategy

Output-code based strategies represent each class with a binary code (an array of 0s and 1s). At fitting time, one binary classifier is fitted per bit in the code book. At prediction time, the classifiers are used to project new points into the class space, and the class whose code is closest to each projected point is chosen. The main advantage of these strategies is that the user controls the number of classifiers, either to compress the model (0 < code_size < 1) or to make the model more robust to errors (code_size > 1). See the documentation for more details.
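As a minimal usage sketch, the snippet below wraps a binary estimator in OutputCodeClassifier and fits it on the iris data. The choice of LogisticRegression, the iris dataset and code_size=2 are assumptions made for illustration, and the imports follow the layout of recent scikit-learn releases, which may differ from version 0.11.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OutputCodeClassifier

X, y = load_iris(return_X_y=True)  # 150 samples, 3 classes

# With 3 classes, code_size=2 asks for int(3 * 2) = 6 binary classifiers.
clf = OutputCodeClassifier(
    LogisticRegression(max_iter=1000), code_size=2, random_state=0
)
clf.fit(X, y)

pred = clf.predict(X)                    # one predicted label per sample
n_estimators = len(clf.estimators_)      # number of fitted binary classifiers
code_book_shape = clf.code_book_.shape   # one row per class, one column per bit
```

Fixing random_state makes the randomly drawn code book, and therefore the fitted model, reproducible across runs.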

Parameters :

estimator : estimator object

An estimator object implementing fit and one of decision_function or predict_proba.

code_size : float

Ratio of the number of binary classifiers (code book bits) to the number of classes: int(n_classes * code_size) classifiers are fitted. A value between 0 and 1 requires fewer classifiers than one-vs-the-rest. A value greater than 1 requires more classifiers than one-vs-the-rest.
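The arithmetic can be checked by hand. The sketch below uses the int(n_classes * code_size) count given for estimators_ in the Attributes section; the helper name n_binary_classifiers is hypothetical and is not part of scikit-learn. For comparison, one-vs-the-rest fits one classifier per class.

```python
def n_binary_classifiers(n_classes, code_size):
    """Hypothetical helper: number of code book bits for a given code_size."""
    return int(n_classes * code_size)


n_classes = 10  # one-vs-the-rest would fit 10 classifiers here

compressed = n_binary_classifiers(n_classes, 0.5)  # fewer than one-vs-the-rest
default = n_binary_classifiers(n_classes, 1.5)     # the default code_size
redundant = n_binary_classifiers(n_classes, 3.0)   # more than one-vs-the-rest
```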

random_state : numpy.random.RandomState, optional

The generator used to initialize the codebook. Defaults to numpy.random.
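This page does not describe how the code book is drawn, so the following is only one plausible scheme: threshold uniform random numbers at 0.5 to get a 0/1 matrix. The helper random_code_book is hypothetical. The sketch mainly shows why passing a seeded generator gives a reproducible code book.

```python
import numpy as np


def random_code_book(n_classes, code_size, random_state):
    """Hypothetical helper (an assumption, not scikit-learn's own routine)."""
    n_bits = int(n_classes * code_size)
    # Draw uniform numbers in [0, 1) and threshold them to obtain 0/1 codes.
    draws = random_state.uniform(size=(n_classes, n_bits))
    return (draws > 0.5).astype(int)


# Two generators seeded identically produce the same code book.
book_a = random_code_book(4, 1.5, np.random.RandomState(0))
book_b = random_code_book(4, 1.5, np.random.RandomState(0))
```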

Notes

References:

  • [1] “Solving multiclass learning problems via error-correcting output codes”, Dietterich T., Bakiri G., Journal of Artificial Intelligence Research 2, 1995.

  • [2] “The error coding method and PICTs”, James G., Hastie T., Journal of Computational and Graphical Statistics 7, 1998.

  • [3] “The Elements of Statistical Learning”, Hastie T., Tibshirani R., Friedman J., page 606 (second edition), 2008.

Attributes

estimators_ : list of int(n_classes * code_size) estimators

Estimators used for predictions.

classes_ : numpy array of shape [n_classes]

Array containing labels.

code_book_ : numpy array of shape [n_classes, int(n_classes * code_size)]

Binary array containing the code of each class.

Methods

fit(X, y)             Fit underlying estimators.
predict(X)            Predict multi-class targets using underlying estimators.
score(X, y)           Returns the mean accuracy on the given test data and labels.
set_params(**params)  Set the parameters of the estimator.

__init__(estimator, code_size=1.5, random_state=None)

fit(X, y)

Fit underlying estimators.

Parameters :

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Data.

y : numpy array of shape [n_samples]

Multi-class targets.

Returns :

self :
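To illustrate what "one binary classifier per bit" means at fitting time, the sketch below turns multi-class labels into one binary target column per code book bit. The helper binary_targets and the tiny hand-written code book are assumptions for illustration, not the library's internal code.

```python
import numpy as np


def binary_targets(y, classes, code_book):
    """Hypothetical helper: replace each label by its row of the code book.

    `classes` must be sorted so that searchsorted finds each label's index.
    """
    rows = np.searchsorted(classes, y)
    return code_book[rows]


classes = np.array([0, 1, 2])
code_book = np.array([[0, 1],   # code of class 0
                      [1, 1],   # code of class 1
                      [1, 0]])  # code of class 2
y = np.array([2, 0, 1, 2])

# Column j of Y_bits is the 0/1 target the j-th binary classifier would see.
Y_bits = binary_targets(y, classes, code_book)
```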

predict(X)

Predict multi-class targets using underlying estimators.

Parameters :

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Data.

Returns :

y : numpy array of shape [n_samples]

Predicted multi-class targets.
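The class description says that the class closest to the projected points is chosen. A minimal sketch of that decoding step follows. It assumes squared Euclidean distance and that the binary classifiers' outputs are already stacked into one row per sample; the helper name decode is hypothetical.

```python
import numpy as np


def decode(outputs, code_book, classes):
    """Hypothetical helper: pick, for each sample, the class with the nearest code."""
    # Pairwise squared distances, shape (n_samples, n_classes).
    diffs = outputs[:, np.newaxis, :] - code_book[np.newaxis, :, :]
    dists = (diffs ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]


classes = np.array(["a", "b", "c"])
code_book = np.array([[0, 0, 1],
                      [1, 0, 0],
                      [0, 1, 1]])
# One row per sample, one column per binary classifier (made-up scores).
outputs = np.array([[0.1, 0.2, 0.9],
                    [0.8, 0.1, 0.2],
                    [0.2, 0.7, 0.6]])

labels = decode(outputs, code_book, classes)
```

Because the decision uses the nearest code rather than an exact match, a sample can still be assigned correctly when some of the binary classifiers are wrong. This is why longer codes (code_size > 1) can make the model more robust.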

score(X, y)

Returns the mean accuracy on the given test data and labels.

Parameters :

X : array-like, shape = [n_samples, n_features]

Test samples.

y : array-like, shape = [n_samples]

Labels for X.

Returns :

z : float

Mean accuracy of the predictions for X with respect to y.
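Mean accuracy is the fraction of samples whose predicted label equals the true label. The sketch below computes it directly with NumPy on made-up labels. The helper mean_accuracy is hypothetical and only mirrors what score is described as returning.

```python
import numpy as np


def mean_accuracy(y_true, y_pred):
    """Hypothetical helper: fraction of predictions that match the labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))


# Three of the four predictions are correct.
acc = mean_accuracy([0, 1, 2, 2], [0, 1, 1, 2])
```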

set_params(**params)

Set the parameters of the estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that it is possible to update each component of a nested object.

Returns :

self :
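Because OutputCodeClassifier wraps another estimator, it is itself a nested object, so the <component>__<parameter> form applies to it. The sketch below assumes LogisticRegression as the wrapped estimator and uses its C parameter purely for illustration. The imports follow recent scikit-learn releases.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OutputCodeClassifier

clf = OutputCodeClassifier(LogisticRegression(), code_size=1.5)

# code_size is a parameter of the wrapper itself;
# estimator__C reaches into the wrapped estimator and sets its C parameter.
returned = clf.set_params(code_size=2.0, estimator__C=0.1)
```

Since set_params returns the estimator itself, calls can be chained, for example clf.set_params(...).fit(X, y).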