9.3.3. sklearn.naive_bayes.MultinomialNB

class sklearn.naive_bayes.MultinomialNB(alpha=1.0, fit_prior=True)

Naive Bayes classifier for multinomial models

The multinomial Naive Bayes classifier is suitable for classification with discrete features (e.g., word counts for text classification). The multinomial distribution normally requires integer feature counts. However, in practice, fractional counts such as tf-idf may also work.
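As a rough illustration of the remark about fractional counts, the sketch below feeds tf-idf weights instead of raw counts to the classifier. The count matrix and labels are made up, and the use of TfidfTransformer from sklearn.feature_extraction.text is an assumption about how the weights are produced; this page does not prescribe it.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical term counts: 4 documents, 3 vocabulary terms.
counts = np.array([[3, 0, 1],
                   [2, 0, 0],
                   [0, 4, 1],
                   [0, 3, 2]])
labels = np.array([0, 0, 1, 1])

# tf-idf turns the integer counts into non-negative fractional weights.
tfidf = TfidfTransformer().fit_transform(counts)  # sparse matrix

# The classifier accepts the fractional features without complaint.
clf = MultinomialNB().fit(tfidf, labels)
pred = clf.predict(tfidf)
print(pred)
```

Whether tf-idf actually helps depends on the data; the point here is only that non-integer, non-negative features are accepted.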

Parameters :

alpha: float, optional (default=1.0) :

Additive (Laplace/Lidstone) smoothing parameter (0 for no smoothing).

fit_prior: boolean :

Whether to learn class prior probabilities or not. If false, a uniform prior will be used.
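The effect of alpha can be reproduced by hand. The sketch below computes the smoothed estimates (N_ci + alpha) / (N_c + alpha * n_features) with NumPy and compares them with a fitted model. It assumes a recent scikit-learn release that exposes the fitted attribute feature_log_prob_, which is not listed on this page; the toy counts are made up.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical count data: 4 documents, 3 vocabulary terms, 2 classes.
X = np.array([[2, 1, 0],
              [3, 0, 0],
              [0, 1, 4],
              [0, 2, 3]])
y = np.array([0, 0, 1, 1])
alpha = 1.0

clf = MultinomialNB(alpha=alpha).fit(X, y)

# Per-class feature counts N_ci, one row per class.
counts = np.array([X[y == c].sum(axis=0) for c in np.unique(y)])

# Additive (Laplace/Lidstone) smoothing of the per-class term frequencies.
smoothed = (counts + alpha) / (counts.sum(axis=1, keepdims=True) + alpha * X.shape[1])

# Assumption: feature_log_prob_ stores log P(feature | class).
matches = np.allclose(np.log(smoothed), clf.feature_log_prob_)
print(matches)
```

Without smoothing, the third term would have probability zero in class 0, and any document containing it could never be assigned to that class.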

References

For the rationale behind the names coef_ and intercept_, i.e. naive Bayes as a linear classifier, see J. Rennie et al. (2003), Tackling the poor assumptions of naive Bayes text classifiers, ICML.
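The linear-classifier view can be checked numerically. The sketch below assumes the fitted attributes feature_log_prob_, class_log_prior_ and classes_ of recent scikit-learn releases, which play the role of the weights and biases here; the attribute names available in the release described on this page may differ.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

rng = np.random.RandomState(0)
X = rng.randint(5, size=(6, 20))
y = np.array([0, 0, 0, 1, 1, 1])

clf = MultinomialNB().fit(X, y)

# One weight vector and one bias per class (assumed attribute names).
W = clf.feature_log_prob_   # shape (n_classes, n_features)
b = clf.class_log_prior_    # shape (n_classes,)

# Joint log-likelihood up to a per-sample constant: a linear function of X.
scores = X @ W.T + b

linear_pred = clf.classes_[np.argmax(scores, axis=1)]
same = np.array_equal(linear_pred, clf.predict(X))
print(same)
```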

Examples

>>> import numpy as np
>>> X = np.random.randint(5, size=(6, 100))
>>> Y = np.array([1, 2, 3, 4, 5, 6])
>>> from sklearn.naive_bayes import MultinomialNB
>>> clf = MultinomialNB()
>>> clf.fit(X, Y)
MultinomialNB(alpha=1.0, fit_prior=True)
>>> print(clf.predict(X[2:3]))
[3]

Attributes

Methods

fit(X, y)             self   Fit the model.
predict(X)            array  Predict using the model.
predict_proba(X)      array  Predict the probability of each class using the model.
predict_log_proba(X)  array  Predict the log probability of each class using the model.

__init__(alpha=1.0, fit_prior=True)

fit(X, y, class_prior=None)

Fit Naive Bayes classifier according to X, y

Parameters :

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Training vectors, where n_samples is the number of samples and n_features is the number of features.

y : array-like, shape = [n_samples]

Target values.

class_prior : array, shape [n_classes]

Custom prior probability per class. Overrides the fit_prior parameter.

Returns :

self : object

Returns self.
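A minimal sketch of supplying a custom prior follows. Note the assumption: in recent scikit-learn releases class_prior is a constructor argument rather than an argument of fit as documented above, so the call below may need adapting to the installed version. The data and the prior values are made up.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

X = np.array([[2, 1, 0],
              [3, 0, 1],
              [0, 1, 4],
              [1, 2, 3]])
y = np.array([0, 0, 1, 1])

# Assumption: class_prior is accepted by the constructor (recent releases).
# The empirical prior would be [0.5, 0.5]; here it is overridden.
custom_prior = [0.8, 0.2]
clf = MultinomialNB(class_prior=custom_prior).fit(X, y)

# class_log_prior_ (assumed attribute name) holds the log of the prior in use.
used_prior = np.exp(clf.class_log_prior_)
print(used_prior)
```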

predict(X)

Perform classification on an array of test vectors X.

Parameters :

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Returns :

C : array, shape = [n_samples]

predict_log_proba(X)

Return log-probability estimates for the test vector X.

Parameters :

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Returns :

C : array-like, shape = [n_samples, n_classes]

Returns the log-probability of each sample for each class in the model, with columns ordered by ascending class label.

predict_proba(X)

Return probability estimates for the test vector X.

Parameters :

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Returns :

C : array-like, shape = [n_samples, n_classes]

Returns the probability of each sample for each class in the model, with columns ordered by ascending class label.
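The relationship between predict_proba and predict_log_proba, and the column ordering, can be illustrated on toy data. The sketch assumes the fitted attribute classes_ of recent scikit-learn releases for reading off the column labels; the data are made up.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

X = np.array([[2, 1, 0],
              [3, 0, 1],
              [0, 1, 4],
              [1, 2, 3]])
y = np.array([7, 7, 3, 3])  # labels deliberately given out of sorted order

clf = MultinomialNB().fit(X, y)
proba = clf.predict_proba(X)          # shape (n_samples, n_classes)
log_proba = clf.predict_log_proba(X)  # same shape, in log space

# Each row is a probability distribution over the classes.
rows_sum_to_one = np.allclose(proba.sum(axis=1), 1.0)

# The two methods agree up to the logarithm.
logs_agree = np.allclose(np.exp(log_proba), proba)

# Column j corresponds to classes_[j] (assumed attribute), sorted ascending.
column_labels = clf.classes_.tolist()
print(column_labels)
```

Working in log space is usually preferable when many small probabilities are multiplied, since it avoids numerical underflow.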

score(X, y)

Returns the mean accuracy on the given test data and labels.

Parameters :

X : array-like, shape = [n_samples, n_features]

Test samples.

y : array-like, shape = [n_samples]

Labels for X.

Returns :

z : float
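To see what the returned float measures, the sketch below compares score with the fraction of correctly classified samples computed by hand. The data are random and purely illustrative.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

rng = np.random.RandomState(0)
X = rng.randint(5, size=(20, 10))
y = rng.randint(2, size=20)

clf = MultinomialNB().fit(X, y)

# Value returned by the estimator.
z = clf.score(X, y)

# Fraction of samples whose predicted label equals the true label.
manual_accuracy = np.mean(clf.predict(X) == y)

print(z, manual_accuracy)
```

In the scikit-learn releases this sketch was written against, the two values coincide, so higher is better.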

set_params(**params)

Set the parameters of the estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that it is possible to update each component of a nested object.

Returns :

self :
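A short sketch of both uses follows: setting parameters directly on the estimator, and addressing it inside a pipeline with the <component>__<parameter> form. The step names "tfidf" and "nb" are hypothetical labels chosen for this example, and the pipeline composition is only an illustration.

```python
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Simple estimator: plain parameter names. set_params returns the estimator.
clf = MultinomialNB().set_params(alpha=0.1, fit_prior=False)

# Nested object: prefix each parameter with the step name and two underscores.
pipe = Pipeline([("tfidf", TfidfTransformer()), ("nb", MultinomialNB())])
pipe.set_params(nb__alpha=0.5, tfidf__use_idf=False)

print(clf.alpha, pipe.named_steps["nb"].alpha)
```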