This documentation is for scikit-learn version 0.10.

8.19.2. sklearn.naive_bayes.MultinomialNB

class sklearn.naive_bayes.MultinomialNB(alpha=1.0, fit_prior=True)

Naive Bayes classifier for multinomial models

The multinomial Naive Bayes classifier is suitable for classification with discrete features (e.g., word counts for text classification). The multinomial distribution normally requires integer feature counts. However, in practice, fractional counts such as tf-idf may also work.

Parameters :

alpha : float, optional (default=1.0)

Additive (Laplace/Lidstone) smoothing parameter (0 for no smoothing).

fit_prior : boolean

Whether to learn class prior probabilities or not. If false, a uniform prior will be used.
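
The effect of the two parameters can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the helper names `smoothed_feature_log_prob` and `empirical_class_log_prior` are hypothetical, the counts are made up, and the learned prior is computed here as an unsmoothed relative frequency for simplicity.

```python
import math

def smoothed_feature_log_prob(class_feature_counts, alpha=1.0):
    """Return log P(x_i | y) for one class from its per-feature counts.

    Additive smoothing: (count_i + alpha) / (total_count + alpha * n_features).
    """
    n_features = len(class_feature_counts)
    total = sum(class_feature_counts) + alpha * n_features
    return [math.log((count + alpha) / total) for count in class_feature_counts]

def empirical_class_log_prior(class_sample_counts, fit_prior=True):
    """Return a log prior per class (hypothetical helper, simplified)."""
    n_classes = len(class_sample_counts)
    if not fit_prior:
        # Uniform prior: every class gets probability 1 / n_classes.
        return [-math.log(n_classes)] * n_classes
    total = float(sum(class_sample_counts))
    return [math.log(count / total) for count in class_sample_counts]

counts = [3, 0, 1]  # made-up counts of three features within one class
log_probs = smoothed_feature_log_prob(counts, alpha=1.0)
probs = [math.exp(lp) for lp in log_probs]

learned = empirical_class_log_prior([10, 30], fit_prior=True)
uniform = empirical_class_log_prior([10, 30], fit_prior=False)
```

With alpha=1.0 the feature that never occurred in the class still receives a nonzero probability. With alpha=0 its probability would be zero and the logarithm undefined, which is the case smoothing is meant to avoid.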

Notes

For the rationale behind the names coef_ and intercept_, i.e. naive Bayes as a linear classifier, see J. Rennie et al. (2003), Tackling the poor assumptions of naive Bayes text classifiers, ICML.
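
The linear-classifier view can be illustrated with a small standard-library sketch. It assumes hypothetical fitted values for two classes and three features: the list `class_log_prior` plays the role of intercept_, and `feature_log_prob` plays the role of coef_. It is meant only to show the shape of the computation, not the library's code.

```python
import math

# Hypothetical fitted parameters for 2 classes and 3 features.
class_log_prior = [math.log(0.5), math.log(0.5)]  # role of intercept_
feature_log_prob = [
    [math.log(0.7), math.log(0.2), math.log(0.1)],
    [math.log(0.1), math.log(0.3), math.log(0.6)],
]  # role of coef_

def joint_log_likelihood(x):
    """Return one linear score per class: intercept + dot(coef, x)."""
    return [
        bias + sum(w * x_i for w, x_i in zip(weights, x))
        for bias, weights in zip(class_log_prior, feature_log_prob)
    ]

def predict_index(x):
    """Return the index of the class with the highest linear score."""
    scores = joint_log_likelihood(x)
    return max(range(len(scores)), key=scores.__getitem__)

first = predict_index([4, 1, 0])   # counts dominated by feature 0
second = predict_index([0, 1, 5])  # counts dominated by feature 2
```

Because each class score is an affine function of the count vector, the decision rule has the same form as any other linear classifier, which is why the attribute names match.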

Examples

>>> import numpy as np
>>> X = np.random.randint(5, size=(6, 100))
>>> Y = np.array([1, 2, 3, 4, 5, 6])
>>> from sklearn.naive_bayes import MultinomialNB
>>> clf = MultinomialNB()
>>> clf.fit(X, Y)
MultinomialNB(alpha=1.0, fit_prior=True)
>>> print(clf.predict(X[2:3]))
[3]

Attributes

intercept_, class_log_prior_ : array, shape = [n_classes]

Smoothed empirical log probability for each class.

feature_log_prob_, coef_ : array, shape = [n_classes, n_features]

Empirical log probability of features given a class, P(x_i|y).

(intercept_ and coef_ are properties referring to class_log_prior_ and feature_log_prob_, respectively.)

Methods

fit(X, y[, sample_weight, class_prior])    Fit Naive Bayes classifier according to X, y.
predict(X)                                 Perform classification on an array of test vectors X.
predict_log_proba(X)                       Return log-probability estimates for the test vector X.
predict_proba(X)                           Return probability estimates for the test vector X.
score(X, y)                                Returns the mean accuracy on the given test data and labels.
set_params(**params)                       Set the parameters of the estimator.

__init__(alpha=1.0, fit_prior=True)

fit(X, y, sample_weight=None, class_prior=None)

Fit Naive Bayes classifier according to X, y

Parameters :

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Training vectors, where n_samples is the number of samples and n_features is the number of features.

y : array-like, shape = [n_samples]

Target values.

sample_weight : array-like, shape = [n_samples], optional

Weights applied to individual samples (1. for unweighted).

class_prior : array, shape = [n_classes]

Custom prior probability per class. Overrides the fit_prior parameter.

Returns :

self : object

Returns self.
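
One plausible reading of sample_weight is that each sample's feature counts are scaled by its weight before being summed per class. The sketch below illustrates that reading only; the helper `weighted_feature_counts` and the data are hypothetical, and the library's internals may differ.

```python
def weighted_feature_counts(X, y, sample_weight=None):
    """Sum feature counts per class, scaling every sample by its weight."""
    if sample_weight is None:
        # Weight 1.0 for every sample corresponds to the unweighted case.
        sample_weight = [1.0] * len(X)
    counts = {}
    for row, label, weight in zip(X, y, sample_weight):
        totals = counts.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            totals[i] += weight * value
    return counts

X = [[2, 0], [1, 1], [0, 3]]  # made-up count vectors
y = ["a", "a", "b"]
unweighted = weighted_feature_counts(X, y)
weighted = weighted_feature_counts(X, y, sample_weight=[1.0, 2.0, 0.5])
```

Under this reading, a weight of 2.0 has the same effect on the counts as including the sample twice.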

predict(X)

Perform classification on an array of test vectors X.

Parameters :

X : array-like, shape = [n_samples, n_features]

Returns :

C : array, shape = [n_samples]

Predicted target values for X

predict_log_proba(X)

Return log-probability estimates for the test vector X.

Parameters :

X : array-like, shape = [n_samples, n_features]

Returns :

C : array-like, shape = [n_samples, n_classes]

Returns the log-probability of the sample for each class in the model, where the columns follow the class labels in ascending sorted order.

predict_proba(X)

Return probability estimates for the test vector X.

Parameters :

X : array-like, shape = [n_samples, n_features]

Returns :

C : array-like, shape = [n_samples, n_classes]

Returns the probability of the sample for each class in the model, where the columns follow the class labels in ascending sorted order.
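
The relationship between predict_log_proba and predict_proba can be sketched as a normalization of per-class log scores. The example below is an assumption-laden illustration: the scores are made up, and the helpers `log_sum_exp` and `normalize_log_scores` are hypothetical names, not part of the library.

```python
import math

def log_sum_exp(values):
    """Compute log(sum(exp(v))) stably by factoring out the maximum."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def normalize_log_scores(joint_log_likelihood):
    """Turn unnormalized per-class log scores into log-probabilities."""
    norm = log_sum_exp(joint_log_likelihood)
    return [v - norm for v in joint_log_likelihood]

# Made-up unnormalized log scores for three classes. Exponentiating them
# directly would underflow to 0.0, so normalization happens in log space.
scores = [-1050.0, -1052.0, -1049.0]
log_proba = normalize_log_scores(scores)
proba = [math.exp(v) for v in log_proba]
best = proba.index(max(proba))
```

Working in log space matters for long documents, where the product of many small per-feature probabilities is too small to represent as a float.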

score(X, y)

Returns the mean accuracy on the given test data and labels.

Parameters :

X : array-like, shape = [n_samples, n_features]

Test samples.

y : array-like, shape = [n_samples]

Labels for X.

Returns :

z : float

Mean accuracy of the predictions for X with respect to y.
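
Mean accuracy is the fraction of samples whose predicted label equals the true label. A minimal sketch, where `mean_accuracy` is a hypothetical helper and the labels are made up:

```python
def mean_accuracy(y_true, y_pred):
    """Return the fraction of positions where prediction and label agree."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / float(len(y_true))

# Three of the four predictions match the true labels.
accuracy = mean_accuracy([1, 2, 3, 3], [1, 2, 3, 1])
```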

set_params(**params)

Set the parameters of the estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that it’s possible to update each component of a nested object.

Returns :

self : object

Returns self.
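
The <component>__<parameter> naming form for nested objects can be illustrated with a toy helper that splits a name at the first double underscore. This is only a sketch of the naming convention, not how the library resolves parameters; `split_param_name` and the component name `clf` are hypothetical.

```python
def split_param_name(name):
    """Split 'component__parameter' into (component, parameter).

    A name without a double underscore refers to the estimator itself,
    which is signalled here by returning None as the component.
    """
    component, separator, parameter = name.partition("__")
    if not separator:
        return None, name
    return component, parameter

nested = split_param_name("clf__alpha")  # parameter of a nested component
simple = split_param_name("alpha")       # parameter of the estimator itself
```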