log_loss
- sklearn.metrics.log_loss(y_true, y_pred, *, normalize=True, sample_weight=None, labels=None)
Log loss, aka logistic loss or cross-entropy loss.
This is the loss function used in (multinomial) logistic regression and extensions of it such as neural networks, defined as the negative log-likelihood of a logistic model that returns y_pred probabilities for its training data y_true. The log loss is only defined for two or more labels. For a single sample with true label \(y \in \{0,1\}\) and a probability estimate \(p = \operatorname{Pr}(y = 1)\), the log loss is:
\[L_{\log}(y, p) = -(y \log (p) + (1 - y) \log (1 - p))\]
Read more in the User Guide.
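As a minimal sketch of the formula above, the binary log loss can be computed by hand with NumPy and compared against log_loss. The toy labels and probabilities are illustrative assumptions, chosen to mirror the numbers in the Examples section; the variable names are not part of the library.

```python
import numpy as np
from sklearn.metrics import log_loss

# Illustrative data: true labels and estimated Pr(y = 1) for four samples.
y_true = np.array([1, 0, 0, 1])
p = np.array([0.9, 0.1, 0.2, 0.65])

# Per-sample loss: -(y log(p) + (1 - y) log(1 - p)), using the natural logarithm.
per_sample = -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# With the default normalize=True, log_loss returns the mean per-sample loss.
manual = per_sample.mean()
library = log_loss(y_true, p)

print(manual, library)
```

Both values should agree up to floating-point error, since a 1-D y_pred is interpreted as the probability of the positive class.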
- Parameters:
- y_true : array-like or label indicator matrix
Ground truth (correct) labels for n_samples samples.
- y_pred : array-like of float, shape = (n_samples, n_classes) or (n_samples,)
Predicted probabilities, as returned by a classifier’s predict_proba method. If y_pred.shape = (n_samples,), the probabilities provided are assumed to be those of the positive class. The labels in y_pred are assumed to be ordered alphabetically, as done by LabelBinarizer.
y_pred values are clipped to [eps, 1-eps], where eps is the machine precision for y_pred’s dtype.
- normalize : bool, default=True
If true, return the mean loss per sample. Otherwise, return the sum of the per-sample losses.
- sample_weight : array-like of shape (n_samples,), default=None
Sample weights.
- labels : array-like, default=None
If not provided, labels will be inferred from y_true. If labels is None and y_pred has shape (n_samples,), the labels are assumed to be binary and are inferred from y_true.
Added in version 0.18.
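The parameters above can be illustrated with a short sketch. The data, the weights, and the hand-computed per_sample array are assumptions made for illustration; the weighted result is expected to be a weighted average of the per-sample losses, which the comparison below checks rather than takes for granted.

```python
import numpy as np
from sklearn.metrics import log_loss

y_true = ["spam", "ham", "ham", "spam"]
# Columns follow alphabetical label order: column 0 is "ham", column 1 is "spam".
y_pred = [[0.1, 0.9], [0.9, 0.1], [0.8, 0.2], [0.35, 0.65]]

# Hand-computed reference: -log of the probability assigned to each true class.
per_sample = -np.log(np.array([0.9, 0.9, 0.8, 0.65]))

# normalize=True (default) gives the mean; normalize=False gives the sum.
mean_loss = log_loss(y_true, y_pred)
total_loss = log_loss(y_true, y_pred, normalize=False)

# sample_weight: the last sample counts three times as much (illustrative weights).
weights = [1.0, 1.0, 1.0, 3.0]
weighted_loss = log_loss(y_true, y_pred, sample_weight=weights)

# labels: needed when y_true does not contain every class, e.g. a "ham"-only batch.
ham_only_loss = log_loss(
    ["ham", "ham"], [[0.9, 0.1], [0.8, 0.2]], labels=["ham", "spam"]
)

print(mean_loss, total_loss, weighted_loss, ham_only_loss)
```

Passing labels explicitly is mainly useful when scoring small batches or folds in which some class happens to be absent from y_true.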
- Returns:
- loss : float
Log loss, aka logistic loss or cross-entropy loss.
Notes
The logarithm used is the natural logarithm (base-e).
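Because the natural logarithm is used, the returned loss is measured in nats. A small sketch, assuming a loss in bits is wanted, divides by ln 2; the conversion is standard arithmetic and not a feature of log_loss itself.

```python
import math
from sklearn.metrics import log_loss

# A maximally uncertain binary prediction: Pr(y = 1) = 0.5 for both samples.
loss_nats = log_loss([0, 1], [0.5, 0.5])

# Convert from nats (base e) to bits (base 2).
loss_bits = loss_nats / math.log(2)

print(loss_nats, loss_bits)
```

Here the loss in nats equals ln 2, so the converted value is one bit per sample.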
References
C.M. Bishop (2006). Pattern Recognition and Machine Learning. Springer, p. 209.
Examples
>>> from sklearn.metrics import log_loss
>>> log_loss(["spam", "ham", "ham", "spam"],
...          [[.1, .9], [.9, .1], [.8, .2], [.35, .65]])
0.21616...
Gallery examples
Probability Calibration curves
Probability Calibration for 3-class classification
Gradient Boosting Out-of-Bag estimates
Gradient Boosting regularization
Probabilistic predictions with Gaussian process classification (GPC)