completeness_score

sklearn.metrics.completeness_score(labels_true, labels_pred)

Compute completeness metric of a cluster labeling given a ground truth.

A clustering result satisfies completeness if all the data points that are members of a given class are elements of the same cluster.
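The definition above can be made concrete with the conditional-entropy formulation from Rosenberg and Hirschberg: completeness is 1 - H(K|C) / H(K), where C is the class assignment and K is the cluster assignment, and it is taken to be 1.0 when H(K) is zero. The following is a minimal pure-Python sketch of that formula, not scikit-learn's actual implementation; the helper names `entropy` and `completeness` are hypothetical and exist only for illustration.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def completeness(labels_true, labels_pred):
    """Illustrative completeness: 1 - H(K|C) / H(K)."""
    n = len(labels_true)
    h_k = entropy(labels_pred)
    if h_k == 0.0:
        # A single cluster trivially keeps every class together.
        return 1.0
    class_sizes = Counter(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    # H(K|C) = -sum over (c, k) of (n_ck / n) * log(n_ck / n_c)
    h_k_given_c = -sum(
        (n_ck / n) * log(n_ck / class_sizes[c])
        for (c, _k), n_ck in joint.items()
    )
    return 1.0 - h_k_given_c / h_k


print(completeness([0, 0, 1, 1], [1, 1, 0, 0]))  # no class is split
print(completeness([0, 0, 1, 1], [0, 1, 0, 1]))  # every class is split evenly
```

Up to floating-point rounding, the first call should give a score of 1 and the second a score of 0, in line with the definition.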

This metric is independent of the absolute values of the labels: a permutation of the class or cluster label values won’t change the score value in any way.
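As a quick check of this invariance, the sketch below renames the predicted cluster labels with an arbitrary mapping and compares the two scores. The label arrays and the `relabel` mapping are made-up illustration data.

```python
from sklearn.metrics import completeness_score

labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [0, 0, 1, 2, 2, 2]

# Rename the clusters with an arbitrary mapping: 0 -> 7, 1 -> 3, 2 -> 5.
relabel = {0: 7, 1: 3, 2: 5}
labels_pred_renamed = [relabel[k] for k in labels_pred]

original = completeness_score(labels_true, labels_pred)
renamed = completeness_score(labels_true, labels_pred_renamed)

# The two scores are expected to agree up to floating-point rounding.
print(original, renamed)
```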

This metric is not symmetric: switching labels_true with labels_pred returns the homogeneity_score, which in general is different.
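The asymmetry can be illustrated with a labeling that puts every sample into a single cluster. This is a small sketch with made-up labels: the labeling is complete because no class is split, but it is not homogeneous because the one cluster mixes both classes.

```python
from sklearn.metrics import completeness_score, homogeneity_score

labels_true = [0, 0, 1, 1]
labels_pred = [0, 0, 0, 0]  # every sample in a single cluster

# Complete (no class is split) but not homogeneous (the cluster mixes classes).
c = completeness_score(labels_true, labels_pred)
h = homogeneity_score(labels_true, labels_pred)

# Swapping the arguments of completeness_score reproduces the homogeneity.
c_swapped = completeness_score(labels_pred, labels_true)

print(c, h, c_swapped)
```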

See also

homogeneity_score: Homogeneity metric of cluster labeling.
v_measure_score: V-Measure (NMI with arithmetic mean option).

References

[1] Andrew Rosenberg and Julia Hirschberg, 2007. V-Measure: A conditional entropy-based external cluster evaluation measure.

Examples

Perfect labelings are complete:

>>> from sklearn.metrics.cluster import completeness_score
>>> completeness_score([0, 0, 1, 1], [1, 1, 0, 0])
np.float64(1.0)

Non-perfect labelings that assign all members of each class to the same cluster are still complete:

>>> print(completeness_score([0, 0, 1, 1], [0, 0, 0, 0]))
1.0
>>> print(completeness_score([0, 1, 2, 3], [0, 0, 1, 1]))
0.999...

If class members are split across different clusters, the assignment cannot be complete:

>>> print(completeness_score([0, 0, 1, 1], [0, 1, 0, 1]))
0.0
>>> print(completeness_score([0, 0, 0, 0], [0, 1, 2, 3]))
0.0

Gallery examples

Release Highlights for scikit-learn 0.23

A demo of K-Means clustering on the handwritten digits data

Demo of DBSCAN clustering algorithm

Demo of affinity propagation clustering algorithm

Clustering text documents using k-means