sklearn.utils#

Various utilities to help with development.

Developer guide. See the Utilities for Developers section for further details.

Bunch

Container object exposing keys as attributes.

_safe_indexing

Return rows, items or columns of X using indices.

as_float_array

Convert an array-like to an array of floats.

assert_all_finite

Throw a ValueError if X contains NaN or infinity.

deprecated

Decorator to mark a function or class as deprecated.

estimator_html_repr

Build an HTML representation of an estimator.

gen_batches

Generator to create slices containing batch_size elements from 0 to n.

gen_even_slices

Generator to create n_packs evenly spaced slices going up to n.

indexable

Make arrays indexable for cross-validation.

murmurhash3_32

Compute the 32-bit MurmurHash3 of key at seed.

resample

Resample arrays or sparse matrices in a consistent way.

safe_mask

Return a mask which is safe to use on X.

safe_sqr

Element-wise squaring of array-likes and sparse matrices.

shuffle

Shuffle arrays or sparse matrices in a consistent way.
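
As an illustration of a few of the general helpers listed above, the following is a minimal sketch using Bunch, gen_batches, gen_even_slices and shuffle. The toy data and the variable names are assumptions made for the example only.

```python
import numpy as np
from sklearn.utils import Bunch, gen_batches, gen_even_slices, shuffle

# Bunch: every key is also readable as an attribute.
data = Bunch(
    features=np.arange(10).reshape(5, 2),  # row i is [2*i, 2*i + 1]
    target=np.array([0, 1, 0, 1, 0]),      # target i is i % 2
)
same_object = data.features is data["features"]

# gen_batches: slices holding at most batch_size elements, covering 0..n.
batches = list(gen_batches(7, 3))
# [slice(0, 3, None), slice(3, 6, None), slice(6, 7, None)]

# gen_even_slices: n_packs slices of near-equal size, covering 0..n.
packs = list(gen_even_slices(10, 3))
# [slice(0, 4, None), slice(4, 7, None), slice(7, 10, None)]

# shuffle: one permutation is applied to all arrays, so rows stay aligned.
X_shuffled, y_shuffled = shuffle(data.features, data.target, random_state=0)
```

Because shuffle applies a single permutation to every array it receives, each row of X_shuffled still sits next to its original target value.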

Tags

Tags for the estimator.

InputTags

Tags for the input data.

TargetTags

Tags for the target data.

ClassifierTags

Tags for the classifier.

RegressorTags

Tags for the regressor.

TransformerTags

Tags for the transformer.

default_tags

Get the default tags for an estimator.

get_tags

Get estimator tags.

Input and parameter validation#

Functions to validate input and parameters within scikit-learn estimators.

check_X_y

Input validation for standard estimators.

check_array

Input validation on an array, list, sparse matrix or similar.

check_consistent_length

Check that all arrays have consistent first dimensions.

check_random_state

Turn seed into a np.random.RandomState instance.

check_scalar

Validate scalar parameters type and value.

validation.check_is_fitted

Perform is_fitted validation for estimator.

validation.check_memory

Check that memory is joblib.Memory-like.

validation.check_symmetric

Make sure that array is 2D, square and symmetric.

validation.column_or_1d

Ravel a column or 1D NumPy array; raise an error otherwise.

validation.has_fit_parameter

Check whether the estimator's fit method supports the given parameter.

validation.validate_data

Validate input data and set or check feature names and counts of the input.

Meta-estimators#

Utilities for meta-estimators.

metaestimators.available_if

An attribute that is available only if check returns a truthy value.

Weight handling based on class labels#

Utilities for handling weights based on class labels.

class_weight.compute_class_weight

Estimate class weights for unbalanced datasets.

class_weight.compute_sample_weight

Estimate sample weights by class for unbalanced datasets.
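
For the "balanced" mode, the class weight is n_samples / (n_classes * count of that class). The sketch below works this out on a toy label vector chosen for the example.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight, compute_sample_weight

y = np.array([0, 0, 0, 1])  # three samples of class 0, one of class 1

# Balanced class weights: 4 / (2 * 3) for class 0 and 4 / (2 * 1) for class 1.
class_weights = compute_class_weight("balanced", classes=np.array([0, 1]), y=y)

# Sample weights repeat the class weight of each sample's label.
sample_weights = compute_sample_weight("balanced", y)
```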

Dealing with multiclass target in classifiers#

Utilities to handle multiclass/multioutput target in classifiers.

multiclass.is_multilabel

Check if y is in a multilabel format.

multiclass.type_of_target

Determine the type of data indicated by the target.

multiclass.unique_labels

Extract an ordered array of unique labels.
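
A brief sketch of the three helpers above on small targets; the inputs are illustrative only.

```python
import numpy as np
from sklearn.utils.multiclass import is_multilabel, type_of_target, unique_labels

# type_of_target names the kind of target that y represents.
kind_binary = type_of_target([0, 1, 1, 0])
kind_multiclass = type_of_target([1, 0, 2])
kind_continuous = type_of_target([0.1, 0.6])
kind_multilabel = type_of_target(np.array([[0, 1], [1, 1]]))

# is_multilabel is True for a 2D binary indicator matrix, False for 1D labels.
indicator_is_multilabel = is_multilabel(np.array([[1, 0], [0, 0]]))
flat_is_multilabel = is_multilabel([0, 1, 0, 1])

# unique_labels merges the labels of several targets into one sorted array.
labels = unique_labels([1, 2, 3, 4], [2, 2, 3, 4])
```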

Optimal mathematical operations#

Utilities to perform optimal mathematical operations in scikit-learn.

extmath.density

Compute density of a sparse vector.

extmath.fast_logdet

Compute logarithm of determinant of a square matrix.

extmath.randomized_range_finder

Compute an orthonormal matrix whose range approximates the range of A.

extmath.randomized_svd

Compute a truncated randomized SVD.

extmath.safe_sparse_dot

Dot product that handles the sparse matrix case correctly.

extmath.weighted_mode

Return an array of the weighted modal (most common) value in the passed array.
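
The following sketch exercises four of the extmath helpers above on small, made-up inputs. The low-rank matrix is constructed so that a rank-2 truncated SVD can reconstruct it.

```python
import numpy as np
from scipy import sparse
from sklearn.utils.extmath import (
    fast_logdet,
    randomized_svd,
    safe_sparse_dot,
    weighted_mode,
)

# safe_sparse_dot: works whether the operands are sparse or dense.
A_dense = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]])
B = np.array([[1.0, 2.0], [3.0, 4.0]])
product = safe_sparse_dot(sparse.csr_matrix(A_dense), B)

# fast_logdet: log of the determinant; here det = 5 * 8 - 1 * 2 = 38.
logdet = fast_logdet(np.array([[5.0, 1.0], [2.0, 8.0]]))

# randomized_svd: truncated SVD of an exactly rank-2 matrix.
rng = np.random.RandomState(0)
M = rng.randn(20, 2) @ rng.randn(2, 10)
U, s, Vt = randomized_svd(M, n_components=2, random_state=0)
M_rebuilt = (U * s) @ Vt

# weighted_mode: value 2 wins with total weight 1.5 + 2 = 3.5.
mode, score = weighted_mode([4, 1, 4, 2, 4, 2], [1, 3, 0.5, 1.5, 1, 2])
```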

Working with sparse matrices and arrays#

A collection of utilities to work with sparse matrices and arrays.

sparsefuncs.incr_mean_variance_axis

Compute incremental mean and variance along an axis on a CSR or CSC matrix.

sparsefuncs.inplace_column_scale

In-place column scaling of a CSC/CSR matrix.

sparsefuncs.inplace_csr_column_scale

In-place column scaling of a CSR matrix.

sparsefuncs.inplace_row_scale

In-place row scaling of a CSR or CSC matrix.

sparsefuncs.inplace_swap_column

Swap two columns of a CSC/CSR matrix in-place.

sparsefuncs.inplace_swap_row

Swap two rows of a CSC/CSR matrix in-place.

sparsefuncs.mean_variance_axis

Compute mean and variance along an axis on a CSR or CSC matrix.

Cython implementations of utilities for working with sparse matrices and arrays.

sparsefuncs_fast.inplace_csr_row_normalize_l1

Normalize the rows of a CSR matrix or array in place by their L1 norm.

sparsefuncs_fast.inplace_csr_row_normalize_l2

Normalize the rows of a CSR matrix or array in place by their L2 norm.
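
A minimal sketch of three of the sparse helpers above on a small floating-point CSR matrix. The matrix and the scale vector are assumptions for the example. The in-place functions modify their argument, so copies are used here.

```python
import numpy as np
from scipy import sparse
from sklearn.utils.sparsefuncs import inplace_column_scale, mean_variance_axis
from sklearn.utils.sparsefuncs_fast import inplace_csr_row_normalize_l1

dense = np.array([[1.0, 0.0, 2.0], [0.0, 0.0, 3.0], [4.0, 0.0, 0.0]])
X = sparse.csr_matrix(dense)

# Per-column mean and variance, computed without densifying X.
means, variances = mean_variance_axis(X, axis=0)

# Multiply each column by its scale factor, in place.
X_scaled = X.copy()
inplace_column_scale(X_scaled, np.array([2.0, 1.0, 0.5]))

# Divide each row by its L1 norm, in place.
X_l1 = X.copy()
inplace_csr_row_normalize_l1(X_l1)
row_l1_norms = np.abs(X_l1.toarray()).sum(axis=1)
```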

Working with graphs#

Graph utilities and algorithms.

graph.single_source_shortest_path_length

Return the length of the shortest path from source to all reachable nodes.
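
As a sketch, the function above can be applied to a small adjacency matrix. The graph is a made-up path 0 - 1 - 2 with an isolated node 3.

```python
import numpy as np
from sklearn.utils.graph import single_source_shortest_path_length

# Adjacency matrix: edges 0-1 and 1-2; node 3 has no edges.
graph = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ]
)

# Maps each reachable node to its hop count from the source.
lengths = dict(single_source_shortest_path_length(graph, 0))

# cutoff limits the search depth.
lengths_within_one = dict(single_source_shortest_path_length(graph, 0, cutoff=1))
```

Node 3 is unreachable from node 0, so it does not appear in either result.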

Random sampling#

Utilities for random sampling.

random.sample_without_replacement

Sample integers without replacement.
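
A minimal usage sketch; the population size, sample size and seed are arbitrary choices for the example.

```python
from sklearn.utils.random import sample_without_replacement

# Draw 5 distinct integers from range(10).
sample = sample_without_replacement(10, 5, random_state=0)

# The same seed reproduces the same draw.
sample_again = sample_without_replacement(10, 5, random_state=0)
```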

Auxiliary functions that operate on arrays#

A small collection of auxiliary functions that operate on arrays.

arrayfuncs.min_pos

Find the minimum value of an array over positive values.
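
A small sketch of min_pos, assuming a one-dimensional floating-point array as input. Zero and negative entries are ignored.

```python
import numpy as np
from sklearn.utils.arrayfuncs import min_pos

values = np.array([0.0, -1.0, 2.0, 3.0, -4.0, 5.0])

# Smallest strictly positive entry of the array.
smallest_positive = min_pos(values)
```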

Metadata routing#

Utilities to route metadata within scikit-learn estimators.

User guide. See the Metadata Routing section for further details.

metadata_routing.MetadataRequest

Contains the metadata request info of a consumer.

metadata_routing.MetadataRouter

Stores and handles metadata routing for a router object.

metadata_routing.MethodMapping

Stores the mapping between caller and callee methods for a router.

metadata_routing.get_routing_for_object

Get a MetadataRouter or MetadataRequest instance from the given object.

metadata_routing.process_routing

Validate and route input parameters.

Discovering scikit-learn objects#

Utilities to discover scikit-learn objects.

discovery.all_displays

Get a list of all displays from sklearn.

discovery.all_estimators

Get a list of all estimators from sklearn.

discovery.all_functions

Get a list of all functions from sklearn.

API compatibility checkers#

Various utilities to check the compatibility of estimators with scikit-learn API.

estimator_checks.check_estimator

Check if estimator adheres to scikit-learn conventions.

estimator_checks.parametrize_with_checks

Pytest specific decorator for parametrizing estimator checks.

Parallel computing#

Customizations of joblib and threadpoolctl tools for scikit-learn usage.

parallel.Parallel

Tweak of joblib.Parallel that propagates the scikit-learn configuration.

parallel.delayed

Decorator used to capture the arguments of a function.