fastica

sklearn.decomposition.fastica(X, n_components=None, *, algorithm='parallel', whiten='unit-variance', fun='logcosh', fun_args=None, max_iter=200, tol=0.0001, w_init=None, whiten_solver='svd', random_state=None, return_X_mean=False, compute_sources=True, return_n_iter=False)

Perform Fast Independent Component Analysis.

The implementation is based on [1].

Read more in the User guide.

Parameters:

X : array-like of shape (n_samples, n_features)

Training vector, where n_samples is the number of samples and n_features is the number of features.

n_components : int, default=None

Number of components to use. If None is passed, all are used.

algorithm : {'parallel', 'deflation'}, default='parallel'

Specify which algorithm to use for FastICA.

whiten : str or bool, default='unit-variance'

Specify the whitening strategy to use.

If ‘arbitrary-variance’, a whitening with arbitrary variance is used.
If ‘unit-variance’, the whitening matrix is rescaled to ensure that each recovered source has unit variance.
If False, the data is already considered to be whitened, and no whitening is performed.

Changed in version 1.3: The default value of whiten changed to ‘unit-variance’ in 1.3.

fun : {'logcosh', 'exp', 'cube'} or callable, default='logcosh'

The functional form of the g function used in the approximation to negentropy. Can be 'logcosh', 'exp', or 'cube'. You can also provide your own function, which should return a tuple containing the value of the function and the value of its derivative at the given point. The derivative should be averaged along its last dimension. Example:

def my_g(x):
    return x ** 3, (3 * x ** 2).mean(axis=-1)
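
A minimal usage sketch of how such a callable might be passed to fastica. It assumes scikit-learn 1.3 or later; the synthetic data (S_true, A_true) is made up for illustration.

```python
import numpy as np
from sklearn.decomposition import fastica

def my_g(x):
    # Value of the function and its derivative averaged over the last axis.
    return x ** 3, (3 * x ** 2).mean(axis=-1)

# Synthetic data (illustrative assumption): independent Laplace sources.
rng = np.random.RandomState(0)
S_true = rng.laplace(size=(1000, 3))
A_true = np.array([[1.0, 0.5, 0.2],
                   [0.3, 1.0, 0.4],
                   [0.2, 0.6, 1.0]])
X = S_true @ A_true.T

# Pass the callable in place of one of the built-in names.
K, W, S = fastica(X, fun=my_g, random_state=0)
print(W.shape, S.shape)
```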

fun_args : dict, default=None

Arguments to send to the functional form. If empty or None and if fun=’logcosh’, fun_args will take value {‘alpha’ : 1.0}.
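
As a hedged illustration, the sketch below overrides the default alpha for fun='logcosh'. It assumes scikit-learn 1.3 or later; the synthetic data and the value 1.5 are arbitrary choices for the example (Hyvärinen and Oja suggest a constant between 1 and 2 for this nonlinearity).

```python
import numpy as np
from sklearn.decomposition import fastica

# Synthetic data (illustrative assumption): independent Laplace sources.
rng = np.random.RandomState(0)
S_true = rng.laplace(size=(1000, 3))
A_true = np.array([[1.0, 0.5, 0.2],
                   [0.3, 1.0, 0.4],
                   [0.2, 0.6, 1.0]])
X = S_true @ A_true.T

# Forward a non-default alpha to the 'logcosh' nonlinearity.
K, W, S = fastica(
    X, fun="logcosh", fun_args={"alpha": 1.5}, random_state=0
)
print(S.shape)
```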

max_iter : int, default=200

Maximum number of iterations to perform.

tol : float, default=1e-4

A positive scalar giving the tolerance at which the un-mixing matrix is considered to have converged.

w_init : ndarray of shape (n_components, n_components), default=None

Initial un-mixing array. If w_init=None, then an array of values drawn from a normal distribution is used.

whiten_solver : {"eigh", "svd"}, default="svd"

The solver to use for whitening.

“svd” is more stable numerically if the problem is degenerate, and often faster when n_samples <= n_features.
“eigh” is generally more memory efficient when n_samples >= n_features, and can be faster when n_samples >= 50 * n_features.

Added in version 1.2.

random_state : int, RandomState instance or None, default=None

Used to initialize w_init when not specified, with a normal distribution. Pass an int, for reproducible results across multiple function calls. See glossary.
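
The sketch below illustrates reproducibility, assuming scikit-learn 1.3 or later and synthetic data made up for the example. Two calls with the same integer seed should start from the same initial un-mixing array and therefore return the same W; an explicit w_init (here a hypothetical array named w0) plays the same role.

```python
import numpy as np
from sklearn.decomposition import fastica

# Synthetic data (illustrative assumption): independent Laplace sources.
rng = np.random.RandomState(0)
S_true = rng.laplace(size=(1000, 3))
A_true = np.array([[1.0, 0.5, 0.2],
                   [0.3, 1.0, 0.4],
                   [0.2, 0.6, 1.0]])
X = S_true @ A_true.T

# Same seed, same result.
_, W_a, _ = fastica(X, random_state=42)
_, W_b, _ = fastica(X, random_state=42)
print(np.allclose(W_a, W_b))

# An explicit initial un-mixing array of shape (n_components, n_components).
w0 = rng.normal(size=(3, 3))
_, W_c, _ = fastica(X, w_init=w0)
_, W_d, _ = fastica(X, w_init=w0)
print(np.allclose(W_c, W_d))
```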

return_X_mean : bool, default=False

If True, X_mean is returned too.

compute_sources : bool, default=True

If False, the sources are not computed and only the rotation matrix is returned. This can save memory when working with large datasets.

return_n_iter : bool, default=False

Whether or not to return the number of iterations.

Returns:

K : ndarray of shape (n_components, n_features) or None

If whiten is not False, K is the pre-whitening matrix that projects data onto the first n_components principal components. If whiten is False, K is None.

W : ndarray of shape (n_components, n_components)

The square matrix that unmixes the data after whitening. The mixing matrix is the pseudo-inverse of matrix W K if K is not None, else it is the inverse of W.

S : ndarray of shape (n_samples, n_components) or None

Estimated source matrix.

X_mean : ndarray of shape (n_features,)

The mean of each feature, computed over the samples. Returned only if return_X_mean is True.

n_iter : int

If the algorithm is "deflation", n_iter is the maximum number of iterations run across all components. Otherwise, it is the number of iterations taken to converge. Returned only if return_n_iter is True.
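
A minimal sketch of unpacking the optional return values, assuming scikit-learn 1.3 or later and synthetic data invented for the example. The optional values follow K, W and S in the order X_mean, then n_iter.

```python
import numpy as np
from sklearn.decomposition import fastica

# Synthetic data (illustrative assumption): three mixed Laplace sources.
rng = np.random.RandomState(0)
S_true = rng.laplace(size=(1000, 3))
A_true = np.array([[1.0, 0.5, 0.2],
                   [0.3, 1.0, 0.4],
                   [0.2, 0.6, 1.0]])
X = S_true @ A_true.T

# Request both optional outputs; keep two of the three components.
K, W, S, X_mean, n_iter = fastica(
    X,
    n_components=2,
    random_state=0,
    return_X_mean=True,
    return_n_iter=True,
)

print(K.shape)       # (n_components, n_features)
print(W.shape)       # (n_components, n_components)
print(S.shape)       # (n_samples, n_components)
print(X_mean.shape)  # (n_features,)
print(n_iter)
```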

Notes

The data matrix X is considered to be a linear combination of non-Gaussian (independent) components, i.e. X = AS, where the columns of S contain the independent components and A is a linear mixing matrix. In short, ICA attempts to "un-mix" the data by estimating an un-mixing matrix W where S = W K X. While FastICA was proposed to estimate as many sources as features, it is possible to estimate fewer by setting n_components < n_features. In this case K is not a square matrix and the estimated A is the pseudo-inverse of W K.
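
The relations in this note can be checked numerically. The sketch below assumes scikit-learn 1.3 or later and synthetic data made up for the example. Because the function takes X with shape (n_samples, n_features) and centers it, the relation S = W K X is written here on the centered, transposed data; A_est is a name chosen for this example.

```python
import numpy as np
from sklearn.decomposition import fastica

# Synthetic data (illustrative assumption): as many sources as features.
rng = np.random.RandomState(0)
S_true = rng.laplace(size=(1000, 3))
A_true = np.array([[1.0, 0.5, 0.2],
                   [0.3, 1.0, 0.4],
                   [0.2, 0.6, 1.0]])
X = S_true @ A_true.T

K, W, S, X_mean = fastica(X, random_state=0, return_X_mean=True)

# S = W K X, applied to the centered data in (n_samples, n_features) layout.
S_check = (X - X_mean) @ (W @ K).T
print(np.allclose(S, S_check))

# The estimated mixing matrix is the pseudo-inverse of W K.
A_est = np.linalg.pinv(W @ K)

# With n_components == n_features, the centered data is recovered from S.
X_rec = S @ A_est.T + X_mean
print(np.allclose(X, X_rec))
```

When n_components < n_features, the same reconstruction is only an approximation, since W K is no longer square.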

This implementation was originally made for data of shape [n_features, n_samples]. Now the input is transposed before the algorithm is applied. This makes it slightly faster for Fortran-ordered input.

References

[1] A. Hyvärinen and E. Oja, "Independent Component Analysis: Algorithms and Applications", Neural Networks, 13(4-5), 2000, pp. 411-430.

Examples

>>> from sklearn.datasets import load_digits
>>> from sklearn.decomposition import fastica
>>> X, _ = load_digits(return_X_y=True)
>>> K, W, S = fastica(X, n_components=7, random_state=0, whiten='unit-variance')
>>> K.shape
(7, 64)
>>> W.shape
(7, 7)
>>> S.shape
(1797, 7)