KNeighborsTransformer

class sklearn.neighbors.KNeighborsTransformer(*, mode='distance', n_neighbors=5, algorithm='auto', leaf_size=30, metric='minkowski', p=2, metric_params=None, n_jobs=None)

Transform X into a (weighted) graph of k nearest neighbors.

The transformed data is a sparse graph as returned by kneighbors_graph.

See also

kneighbors_graph: Compute the weighted graph of k-neighbors for points in X.
RadiusNeighborsTransformer: Transform X into a weighted graph of neighbors nearer than a radius.

Notes

For an example of using KNeighborsTransformer in combination with TSNE see Approximate nearest neighbors in TSNE.

Examples

>>> from sklearn.datasets import load_wine
>>> from sklearn.neighbors import KNeighborsTransformer
>>> X, _ = load_wine(return_X_y=True)
>>> X.shape
(178, 13)
>>> transformer = KNeighborsTransformer(n_neighbors=5, mode='distance')
>>> X_dist_graph = transformer.fit_transform(X)
>>> X_dist_graph.shape
(178, 178)
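
The sparse graph can also be passed to a downstream estimator that accepts metric='precomputed'. The following is a minimal sketch of that pattern; the train/test split, the choice of KNeighborsClassifier, and the neighbor counts are illustrative assumptions rather than part of the example above. The transformer should compute at least as many neighbors as the downstream estimator requests.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, KNeighborsTransformer
from sklearn.pipeline import make_pipeline

X, y = load_wine(return_X_y=True)
# Illustrative split; random_state is fixed only for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipe = make_pipeline(
    # Step 1: build a sparse distance graph with 10 neighbors per sample.
    KNeighborsTransformer(n_neighbors=10, mode="distance"),
    # Step 2: classify from the precomputed graph using 5 of those neighbors.
    KNeighborsClassifier(n_neighbors=5, metric="precomputed"),
)
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)
print(y_pred.shape)
```

Because the neighbor search is isolated in the first step, it can be reused when only the second step's parameters change.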

fit(X, y=None)

Fit the k-nearest neighbors transformer from the training dataset.

Parameters:

X : {array-like, sparse matrix} of shape (n_samples, n_features) or (n_samples, n_samples) if metric='precomputed'
    Training data.
y : Ignored
    Not used, present for API consistency by convention.

Returns:

self : KNeighborsTransformer
    The fitted k-nearest neighbors transformer.

fit_transform(X, y=None)

Fit to data, then transform it.

Fits the transformer to X and returns a transformed version of X. The y argument is ignored.

Parameters:

X : array-like of shape (n_samples, n_features)
    Training set.
y : Ignored
    Not used, present for API consistency by convention.

Returns:

Xt : sparse matrix of shape (n_samples, n_samples)
    Xt[i, j] is assigned the weight of the edge that connects i to j. Only the neighbors have an explicit value. The diagonal is always explicit. The matrix is in CSR format.
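
The structure of the returned graph can be inspected directly. The sketch below uses small random data (an illustrative assumption) and checks the format, the number of stored entries per row, and whether each diagonal entry is stored explicitly. In mode='distance' each sample counts as its own neighbor, so one extra neighbor is stored per row.

```python
import numpy as np
from sklearn.neighbors import KNeighborsTransformer

rng = np.random.RandomState(0)
X = rng.rand(20, 3)  # illustrative random data, 20 samples with 3 features

Xt = KNeighborsTransformer(n_neighbors=3, mode="distance").fit_transform(X)

print(Xt.format)  # sparse storage format of the graph
print(Xt.shape)   # one row and one column per sample

# Number of explicitly stored entries in each row of the CSR matrix.
stored_per_row = np.diff(Xt.indptr)
print(stored_per_row)

# Check that every row stores its own column index (the diagonal entry).
diag_is_explicit = all(
    i in Xt.indices[Xt.indptr[i]:Xt.indptr[i + 1]] for i in range(X.shape[0])
)
print(diag_is_explicit)
```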

get_feature_names_out(input_features=None)

Get output feature names for transformation.

The feature names out will be prefixed by the lowercased class name. For example, if the transformer outputs 3 features, then the feature names out are: ["class_name0", "class_name1", "class_name2"].

Parameters:

input_features : array-like of str or None, default=None
    Only used to validate feature names with the names seen in fit.

Returns:

feature_names_out : ndarray of str objects
    Transformed feature names.

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest
    A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict
    Parameter names mapped to their values.
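
A brief sketch of reading constructor parameters back from an instance; the specific values passed here are illustrative:

```python
from sklearn.neighbors import KNeighborsTransformer

transformer = KNeighborsTransformer(n_neighbors=3, mode="connectivity")

# The returned dict is keyed by the constructor argument names.
params = transformer.get_params()
print(params["n_neighbors"], params["mode"], params["metric"])
```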

kneighbors(X=None, n_neighbors=None, return_distance=True)

Find the K-neighbors of a point.

Returns indices of and distances to the neighbors of each point.

Parameters:

X : {array-like, sparse matrix} of shape (n_queries, n_features), or (n_queries, n_indexed) if metric == 'precomputed', default=None
    The query point or points. If not provided, neighbors of each indexed point are returned. In this case, the query point is not considered its own neighbor.
n_neighbors : int, default=None
    Number of neighbors required for each sample. The default is the value passed to the constructor.
return_distance : bool, default=True
    Whether or not to return the distances.

Returns:

neigh_dist : ndarray of shape (n_queries, n_neighbors)
    Array representing the lengths to points, only present if return_distance=True.
neigh_ind : ndarray of shape (n_queries, n_neighbors)
    Indices of the nearest points in the population matrix.

Examples

In the following example, we construct a NearestNeighbors instance from an array representing our data set and ask which point is closest to [1, 1, 1]:

>>> samples = [[0., 0., 0.], [0., .5, 0.], [1., 1., .5]]
>>> from sklearn.neighbors import NearestNeighbors
>>> neigh = NearestNeighbors(n_neighbors=1)
>>> neigh.fit(samples)
NearestNeighbors(n_neighbors=1)
>>> print(neigh.kneighbors([[1., 1., 1.]]))
(array([[0.5]]), array([[2]]))

As you can see, it returns [[0.5]], and [[2]], which means that the element is at distance 0.5 and is the third element of samples (indexes start at 0). You can also query for multiple points:

>>> X = [[0., 1., 0.], [1., 0., 1.]]
>>> neigh.kneighbors(X, return_distance=False)
array([[1],
       [2]]...)
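
The difference between omitting X and passing the training data explicitly can be sketched as follows. This reuses the same three sample points with a KNeighborsTransformer instance, which is an illustrative choice. When X is omitted, a point is not counted as its own neighbor; when the training data is passed as the query, each point's nearest neighbor is itself.

```python
from sklearn.neighbors import KNeighborsTransformer

samples = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [1.0, 1.0, 0.5]]
neigh = KNeighborsTransformer(n_neighbors=1).fit(samples)

# X omitted: neighbors of each indexed point, excluding the point itself.
ind_without_self = neigh.kneighbors(return_distance=False)

# Training data passed as the query: each point is found as its own neighbor.
ind_with_self = neigh.kneighbors(samples, return_distance=False)

print(ind_without_self.tolist())
print(ind_with_self.tolist())
```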

kneighbors_graph(X=None, n_neighbors=None, mode='connectivity')

Compute the (weighted) graph of k-Neighbors for points in X.

Parameters:

X : {array-like, sparse matrix} of shape (n_queries, n_features), or (n_queries, n_indexed) if metric == 'precomputed', default=None
    The query point or points. If not provided, neighbors of each indexed point are returned. In this case, the query point is not considered its own neighbor. For metric='precomputed' the shape should be (n_queries, n_indexed). Otherwise the shape should be (n_queries, n_features).
n_neighbors : int, default=None
    Number of neighbors for each sample. The default is the value passed to the constructor.
mode : {'connectivity', 'distance'}, default='connectivity'
    Type of returned matrix. 'connectivity' returns the connectivity matrix with ones and zeros. In 'distance' mode the edges are distances between points, where the type of distance depends on the metric parameter passed to the constructor.

Returns:

A : sparse matrix of shape (n_queries, n_samples_fit)
    n_samples_fit is the number of samples in the fitted data. A[i, j] gives the weight of the edge connecting i to j. The matrix is in CSR format.
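
The two modes can be compared on a tiny one-dimensional data set. The data and the n_neighbors value below are illustrative assumptions. Because the training points are passed as the query, each point appears as one of its own two neighbors.

```python
from sklearn.neighbors import KNeighborsTransformer

X = [[0.0], [3.0], [1.0]]
neigh = KNeighborsTransformer(n_neighbors=2).fit(X)

# Connectivity graph: 1 where j is among the 2 nearest neighbors of i.
conn = neigh.kneighbors_graph(X, mode="connectivity")
print(conn.toarray())

# Distance graph: same sparsity pattern, with distances as edge weights.
dist = neigh.kneighbors_graph(X, mode="distance")
print(dist.toarray())
```

In the dense view of the distance graph, a zero can mean either a stored self-distance or a missing edge; the sparse structure is what distinguishes the two.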

Gallery examples

Release Highlights for scikit-learn 0.22

Approximate nearest neighbors in TSNE

Caching nearest neighbors