FixedThresholdClassifier
- class sklearn.model_selection.FixedThresholdClassifier(estimator, *, threshold='auto', pos_label=None, response_method='auto')
 Binary classifier that manually sets the decision threshold.
This classifier lets you change the default decision threshold used to convert posterior probability estimates (i.e. the output of predict_proba) or decision scores (i.e. the output of decision_function) into a class label.
Here, the threshold is not optimized; it is set to a constant value.
Read more in the User Guide.
Added in version 1.5.
- Parameters:
 - estimator : estimator instance
 The binary classifier, fitted or not, for which we want to set the decision threshold used during predict.
 - threshold : {"auto"} or float, default="auto"
 The decision threshold to use when converting posterior probability estimates (i.e. the output of predict_proba) or decision scores (i.e. the output of decision_function) into a class label. When "auto", the threshold is set to 0.5 if predict_proba is used as response_method; otherwise it is set to 0 (i.e. the default threshold for decision_function).
 - pos_label : int, float, bool or str, default=None
 The label of the positive class. Used to process the output of the response_method method. When pos_label=None, if y_true is in {-1, 1} or {0, 1}, pos_label is set to 1; otherwise an error will be raised.
 - response_method : {"auto", "decision_function", "predict_proba"}, default="auto"
 The method of the classifier estimator whose output the threshold is applied to. It can be:
 - if "auto", it will try to invoke "predict_proba" or "decision_function", in that order;
 - otherwise, one of "predict_proba" or "decision_function". If the method is not implemented by the classifier, it will raise an error.
- Attributes:
 - estimator_ : estimator instance
 The fitted classifier used when predicting.
 - classes_ : ndarray of shape (n_classes,)
 Class labels.
 - n_features_in_ : int
 Number of features seen during fit. Only defined if the underlying estimator exposes such an attribute when fit.
 - feature_names_in_ : ndarray of shape (n_features_in_,)
 Names of features seen during fit. Only defined if the underlying estimator exposes such an attribute when fit.
See also
sklearn.model_selection.TunedThresholdClassifierCV : Classifier that post-tunes the decision threshold based on some metrics and using cross-validation.
sklearn.calibration.CalibratedClassifierCV : Estimator that calibrates probabilities.
Examples
>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.metrics import confusion_matrix
>>> from sklearn.model_selection import FixedThresholdClassifier, train_test_split
>>> X, y = make_classification(
...     n_samples=1_000, weights=[0.9, 0.1], class_sep=0.8, random_state=42
... )
>>> X_train, X_test, y_train, y_test = train_test_split(
...     X, y, stratify=y, random_state=42
... )
>>> classifier = LogisticRegression(random_state=0).fit(X_train, y_train)
>>> print(confusion_matrix(y_test, classifier.predict(X_test)))
[[217   7]
 [ 19   7]]
>>> classifier_other_threshold = FixedThresholdClassifier(
...     classifier, threshold=0.1, response_method="predict_proba"
... ).fit(X_train, y_train)
>>> print(confusion_matrix(y_test, classifier_other_threshold.predict(X_test)))
[[184  40]
 [  6  20]]
- decision_function(X)
 Decision function for samples in X using the fitted estimator.
- Parameters:
 - X : {array-like, sparse matrix} of shape (n_samples, n_features)
 Input samples, where n_samples is the number of samples and n_features is the number of features.
- Returns:
 - decisions : ndarray of shape (n_samples,)
 The decision function computed by the fitted estimator.
- fit(X, y, **params)
 Fit the classifier.
- Parameters:
 - X : {array-like, sparse matrix} of shape (n_samples, n_features)
 Training data.
 - y : array-like of shape (n_samples,)
 Target values.
 - **params : dict
 Parameters to pass to the fit method of the underlying classifier.
- Returns:
 - self : object
 Returns an instance of self.
- get_metadata_routing()
 Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
 - routing : MetadataRouter
 A MetadataRouter encapsulating routing information.
- get_params(deep=True)
 Get parameters for this estimator.
- Parameters:
 - deep : bool, default=True
 If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
 - params : dict
 Parameter names mapped to their values.
- predict(X)
 Predict the target of new samples.
- Parameters:
 - X : {array-like, sparse matrix} of shape (n_samples, n_features)
 The samples, as accepted by estimator.predict.
- Returns:
 - class_labels : ndarray of shape (n_samples,)
 The predicted class.
- predict_log_proba(X)
 Predict logarithm class probabilities for X using the fitted estimator.
- Parameters:
 - X : {array-like, sparse matrix} of shape (n_samples, n_features)
 Input samples, where n_samples is the number of samples and n_features is the number of features.
- Returns:
 - log_probabilities : ndarray of shape (n_samples, n_classes)
 The logarithm class probabilities of the input samples.
- predict_proba(X)
 Predict class probabilities for X using the fitted estimator.
- Parameters:
 - X : {array-like, sparse matrix} of shape (n_samples, n_features)
 Input samples, where n_samples is the number of samples and n_features is the number of features.
- Returns:
 - probabilities : ndarray of shape (n_samples, n_classes)
 The class probabilities of the input samples.
- score(X, y, sample_weight=None)
 Return accuracy on provided data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires each sample's entire label set to be correctly predicted.
- Parameters:
 - X : array-like of shape (n_samples, n_features)
 Test samples.
 - y : array-like of shape (n_samples,) or (n_samples, n_outputs)
 True labels for X.
 - sample_weight : array-like of shape (n_samples,), default=None
 Sample weights.
- Returns:
 - score : float
 Mean accuracy of self.predict(X) w.r.t. y.
- set_params(**params)
 Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters:
 - **params : dict
 Estimator parameters.
- Returns:
 - self : estimator instance
 Estimator instance.
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> FixedThresholdClassifier
 Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
 - True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
 - False: metadata is not requested and the meta-estimator will not pass it to score.
 - None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
 - str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
 - sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
 Metadata routing for sample_weight parameter in score.
- Returns:
 - self : object
 The updated object.
Gallery examples
Visualizing the probabilistic predictions of a VotingClassifier
Post-tuning the decision threshold for cost-sensitive learning