pairwise_kernels#

sklearn.metrics.pairwise.pairwise_kernels(X, Y=None, metric='linear', *, filter_params=False, n_jobs=None, **kwds)[source]#

Compute the kernel between arrays X and optional array Y.

This method takes either a vector array or a kernel matrix, and returns a kernel matrix. If the input is a vector array, the kernels are computed. If the input is a kernel matrix, it is returned instead.

This method provides a safe way to take a kernel matrix as input, while preserving compatibility with many other algorithms that take a vector array.

If Y is given (default is None), then the returned matrix is the pairwise kernel between the arrays from both X and Y.

Valid values for metric are:: [‘additive_chi2’, ‘chi2’, ‘linear’, ‘poly’, ‘polynomial’, ‘rbf’, ‘laplacian’, ‘sigmoid’, ‘cosine’]

Read more in the User Guide.

Parameters:

X{array-like, sparse matrix} of shape (n_samples_X, n_samples_X) or (n_samples_X, n_features)

Array of pairwise kernels between samples, or a feature array. The shape of the array should be (n_samples_X, n_samples_X) if metric == “precomputed” and (n_samples_X, n_features) otherwise.

Y{array-like, sparse matrix} of shape (n_samples_Y, n_features), default=None

A second feature array only if X has shape (n_samples_X, n_features).

metricstr or callable, default=”linear”

The metric to use when calculating kernel between instances in a feature array. If metric is a string, it must be one of the metrics in pairwise.PAIRWISE_KERNEL_FUNCTIONS. If metric is “precomputed”, X is assumed to be a kernel matrix. Alternatively, if metric is a callable function, it is called on each pair of instances (rows) and the resulting value recorded. The callable should take two rows from X as input and return the corresponding kernel value as a single number. This means that callables from sklearn.metrics.pairwise are not allowed, as they operate on matrices, not single samples. Use the string identifying the kernel instead.

filter_paramsbool, default=False

Whether to filter invalid parameters or not.

n_jobsint, default=None

The number of jobs to use for the computation. This works by breaking down the pairwise matrix into n_jobs even slices and computing them using multithreading.

None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.

**kwdsoptional keyword parameters

Any further parameters are passed directly to the kernel function.

Returns:

Kndarray of shape (n_samples_X, n_samples_X) or (n_samples_X, n_samples_Y): A kernel matrix K such that K_{i, j} is the kernel between the ith and jth vectors of the given matrix X, if Y is None. If Y is not None, then K_{i, j} is the kernel between the ith array from X and the jth array from Y.

Notes

If metric is ‘precomputed’, Y is ignored and X is returned.

Examples

>>> from sklearn.metrics.pairwise import pairwise_kernels
>>> X = [[0, 0, 0], [1, 1, 1]]
>>> Y = [[1, 0, 0], [1, 1, 0]]
>>> pairwise_kernels(X, Y, metric='linear')
array([[0., 0.],
       [1., 2.]])