make_gaussian_quantiles#
- sklearn.datasets.make_gaussian_quantiles(*, mean=None, cov=1.0, n_samples=100, n_features=2, n_classes=3, shuffle=True, random_state=None)[source]#
Generate isotropic Gaussian and label samples by quantile.
This classification dataset is constructed by taking a multi-dimensional standard normal distribution and defining classes separated by nested concentric multi-dimensional spheres such that roughly equal numbers of samples are in each class (quantiles of the \(\chi^2\) distribution).
For an example of usage, see Plot randomly generated classification dataset.
Read more in the User Guide.
- Parameters:
- meanarray-like of shape (n_features,), default=None
The mean of the multi-dimensional normal distribution. If None then use the origin (0, 0, …).
- covfloat, default=1.0
The covariance matrix will be this value times the unit matrix. This dataset only produces symmetric normal distributions.
- n_samplesint, default=100
The total number of points equally divided among classes.
- n_featuresint, default=2
The number of features for each sample.
- n_classesint, default=3
The number of classes.
- shufflebool, default=True
Shuffle the samples.
- random_stateint, RandomState instance or None, default=None
Determines random number generation for dataset creation. Pass an int for reproducible output across multiple function calls. See Glossary.
- Returns:
- Xndarray of shape (n_samples, n_features)
The generated samples.
- yndarray of shape (n_samples,)
The integer labels for quantile membership of each sample.
Notes
The dataset is from Zhu et al [1].
References
[1]Zhu, H. Zou, S. Rosset, T. Hastie, “Multi-class AdaBoost”, 2009.
Examples
>>> from sklearn.datasets import make_gaussian_quantiles >>> X, y = make_gaussian_quantiles(random_state=42) >>> X.shape (100, 2) >>> y.shape (100,) >>> list(y[:5]) [np.int64(2), np.int64(0), np.int64(1), np.int64(0), np.int64(2)]
Gallery examples#
Plot randomly generated classification dataset
Multi-class AdaBoosted Decision Trees