cosine_similarity

sklearn.metrics.pairwise.cosine_similarity(X, Y=None, dense_output=True)

Compute cosine similarity between samples in X and Y.

Cosine similarity, or the cosine kernel, computes similarity as the normalized dot product of X and Y:

K(X, Y) = <X, Y> / (||X||*||Y||)

On L2-normalized data, this function is equivalent to linear_kernel.
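
As a minimal sketch of the two statements above, the formula can be evaluated directly with NumPy and compared against linear_kernel applied to L2-normalized rows. The arrays and the names manual and on_normalized are illustrative only, and all rows are chosen to be nonzero so the division is well defined.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel
from sklearn.preprocessing import normalize

# Illustrative data (not from the reference page); every row is nonzero.
X = np.array([[1.0, 1.0, 1.0], [2.0, 0.0, 1.0]])
Y = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])

# Direct evaluation of K(X, Y) = <X, Y> / (||X|| * ||Y||) for every pair of rows.
norms_X = np.linalg.norm(X, axis=1, keepdims=True)  # shape (n_samples_X, 1)
norms_Y = np.linalg.norm(Y, axis=1, keepdims=True)  # shape (n_samples_Y, 1)
manual = (X @ Y.T) / (norms_X * norms_Y.T)

# L2-normalize the rows first, then take the plain linear kernel.
on_normalized = linear_kernel(normalize(X), normalize(Y))

# Both routes should agree with cosine_similarity up to floating-point error.
assert np.allclose(cosine_similarity(X, Y), manual)
assert np.allclose(cosine_similarity(X, Y), on_normalized)
```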

Read more in the User Guide.

Parameters:
X : {array-like, sparse matrix} of shape (n_samples_X, n_features)

Input data.

Y : {array-like, sparse matrix} of shape (n_samples_Y, n_features), default=None

Input data. If None, the output will be the pairwise similarities between all samples in X.

dense_output : bool, default=True

Whether to return dense output even when the input is sparse. If False, the output is sparse if both input arrays are sparse.

Added in version 0.17: parameter dense_output for dense output.

Returns:
similarities : ndarray or sparse matrix of shape (n_samples_X, n_samples_Y)

Returns the cosine similarity between samples in X and Y.

Examples

>>> from sklearn.metrics.pairwise import cosine_similarity
>>> X = [[0, 0, 0], [1, 1, 1]]
>>> Y = [[1, 0, 0], [1, 1, 0]]
>>> cosine_similarity(X, Y)
array([[0.     , 0.     ],
       [0.57..., 0.81...]])
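
A further sketch, assuming SciPy is available, of the Y=None and dense_output=False behaviour described under Parameters. The matrix below is illustrative only. With Y omitted, the rows of X are compared with each other. With sparse input, the default returns an ndarray and dense_output=False keeps the result sparse.

```python
import numpy as np
from scipy import sparse
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative sparse input (not from the reference page); every row is nonzero.
X = sparse.csr_matrix(
    np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 2.0]])
)

# Y is omitted, so the result holds similarities between all pairs of rows of X.
dense = cosine_similarity(X)                           # default dense_output=True
sparse_out = cosine_similarity(X, dense_output=False)  # stays sparse

assert isinstance(dense, np.ndarray)
assert sparse.issparse(sparse_out)
assert np.allclose(dense, sparse_out.toarray())
# Each nonzero row has similarity 1 with itself.
assert np.allclose(np.diag(dense), 1.0)
```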