This documentation is for scikit-learn version 0.11-gitOther versions

Citing

If you use the software, please consider citing scikit-learn.

This page

8.16.3.3. sklearn.metrics.homogeneity_completeness_v_measure

sklearn.metrics.homogeneity_completeness_v_measure(labels_true, labels_pred)

Compute the homogeneity and completeness and V-measure scores at once

Those metrics are based on normalized conditional entropy measures of the clustering labeling to evaluate given the knowledge of a Ground Truth class labels of the same samples.

A clustering result satisfies homogeneity if all of its clusters contain only data points which are members of a single class.

A clustering result satisfies completeness if all the data points that are members of a given class are elements of the same cluster.

Both scores have positive values between 0.0 and 1.0, larger values being desirable.

Those 3 metrics are independent of the absolute values of the labels: a permutation of the class or cluster label values won’t change the score values in any way.

V-Measure is furthermore symmetric: swapping labels_true and label_pred will give the same score. This does not hold for homogeneity and completeness.

Parameters :

labels_true : int array, shape = [n_samples]

ground truth class labels to be used as a reference

labels_pred : array, shape = [n_samples]

cluster labels to evaluate

Returns :

homogeneity: float :

score between 0.0 and 1.0. 1.0 stands for perfectly homogeneous labeling

completeness: float :

score between 0.0 and 1.0. 1.0 stands for perfectly complete labeling

v_measure: float :

harmonic mean of the first two

See also

-, -, -