FixedThresholdClassifier

class sklearn.model_selection.FixedThresholdClassifier(estimator, *, threshold='auto', pos_label=None, response_method='auto')

Binary classifier that manually sets the decision threshold.

This classifier allows changing the default decision threshold used to convert posterior probability estimates (i.e. output of predict_proba) or decision scores (i.e. output of decision_function) into a class label.

Here, the threshold is not optimized and is set to a constant value.

Read more in the User Guide.

Added in version 1.5.

Parameters:
estimator : estimator instance

The binary classifier, fitted or not, whose decision threshold used during predict is set to a fixed value.

threshold : {"auto"} or float, default="auto"

The decision threshold to use when converting posterior probability estimates (i.e. output of predict_proba) or decision scores (i.e. output of decision_function) into a class label. When "auto", the threshold is set to 0.5 if predict_proba is used as response_method, otherwise it is set to 0 (i.e. the default threshold for decision_function).

pos_label : int, float, bool or str, default=None

The label of the positive class. Used to process the output of the response_method method. When pos_label=None, if y_true is in {-1, 1} or {0, 1}, pos_label is set to 1, otherwise an error will be raised.

response_method : {"auto", "decision_function", "predict_proba"}, default="auto"

Method of the classifier estimator that produces the scores to which the threshold is applied. It can be:

  • if "auto", it will try to invoke "predict_proba" or "decision_function" in that order.

  • otherwise, one of "predict_proba" or "decision_function". If the method is not implemented by the classifier, it will raise an error.

Attributes:
estimator_ : estimator instance

The fitted classifier used when predicting.

classes_ : ndarray of shape (n_classes,)

Class labels.

n_features_in_ : int

Number of features seen during fit. Only defined if the underlying estimator exposes such an attribute when fit.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Only defined if the underlying estimator exposes such an attribute when fit.

See also

sklearn.model_selection.TunedThresholdClassifierCV

Classifier that post-tunes the decision threshold based on some metrics and using cross-validation.

sklearn.calibration.CalibratedClassifierCV

Estimator that calibrates probabilities.

Examples

>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.metrics import confusion_matrix
>>> from sklearn.model_selection import FixedThresholdClassifier, train_test_split
>>> X, y = make_classification(
...     n_samples=1_000, weights=[0.9, 0.1], class_sep=0.8, random_state=42
... )
>>> X_train, X_test, y_train, y_test = train_test_split(
...     X, y, stratify=y, random_state=42
... )
>>> classifier = LogisticRegression(random_state=0).fit(X_train, y_train)
>>> print(confusion_matrix(y_test, classifier.predict(X_test)))
[[217   7]
 [ 19   7]]
>>> classifier_other_threshold = FixedThresholdClassifier(
...     classifier, threshold=0.1, response_method="predict_proba"
... ).fit(X_train, y_train)
>>> print(confusion_matrix(y_test, classifier_other_threshold.predict(X_test)))
[[184  40]
 [  6  20]]
decision_function(X)

Decision function for samples in X using the fitted estimator.

Parameters:
X : {array-like, sparse matrix} of shape (n_samples, n_features)

Input samples, where n_samples is the number of samples and n_features is the number of features.

Returns:
decisions : ndarray of shape (n_samples,)

The decision function computed by the fitted estimator.

fit(X, y, **params)

Fit the classifier.

Parameters:
X : {array-like, sparse matrix} of shape (n_samples, n_features)

Training data.

y : array-like of shape (n_samples,)

Target values.

**params : dict

Parameters to pass to the fit method of the underlying classifier.

Returns:
self : object

Returns an instance of self.

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:
routing : MetadataRouter

A MetadataRouter encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

predict(X)

Predict the target of new samples.

Parameters:
X : {array-like, sparse matrix} of shape (n_samples, n_features)

The samples, as accepted by estimator.predict.

Returns:
class_labels : ndarray of shape (n_samples,)

The predicted class.

predict_log_proba(X)

Predict the logarithm of class probabilities for X using the fitted estimator.

Parameters:
X : {array-like, sparse matrix} of shape (n_samples, n_features)

Input samples, where n_samples is the number of samples and n_features is the number of features.

Returns:
log_probabilities : ndarray of shape (n_samples, n_classes)

The logarithm of the class probabilities of the input samples.

predict_proba(X)

Predict class probabilities for X using the fitted estimator.

Parameters:
X : {array-like, sparse matrix} of shape (n_samples, n_features)

Input samples, where n_samples is the number of samples and n_features is the number of features.

Returns:
probabilities : ndarray of shape (n_samples, n_classes)

The class probabilities of the input samples.

score(X, y, sample_weight=None)

Return accuracy on provided data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:
X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:
score : float

Mean accuracy of self.predict(X) w.r.t. y.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → FixedThresholdClassifier

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns:
self : object

The updated object.