FunctionTransformer

class sklearn.preprocessing.FunctionTransformer(func=None, inverse_func=None, *, validate=False, accept_sparse=False, check_inverse=True, feature_names_out=None, kw_args=None, inv_kw_args=None)

Constructs a transformer from an arbitrary callable.

A FunctionTransformer forwards its X (and optionally y) arguments to a user-defined function or function object and returns the result of this function. This is useful for stateless transformations such as taking the log of frequencies, doing custom scaling, etc.

Note: If a lambda is used as the function, then the resulting transformer will not be pickleable.
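The pickling note above can be illustrated with a minimal sketch. It assumes the standard-library pickle module and a module-level function (here the NumPy ufunc np.log1p); the variable names are illustrative only.

```python
import pickle

import numpy as np
from sklearn.preprocessing import FunctionTransformer

# A function that can be looked up by name (here a NumPy ufunc) pickles fine.
named = FunctionTransformer(np.log1p)
restored = pickle.loads(pickle.dumps(named))

# A lambda has no importable name, so pickling the transformer fails.
anonymous = FunctionTransformer(lambda X: np.log1p(X))
try:
    pickle.dumps(anonymous)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False
```

Defining the function with def at module level, instead of using a lambda, is usually enough to keep the transformer serializable.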

Added in version 0.17.

See also

MaxAbsScaler: Scale each feature by its maximum absolute value.
StandardScaler: Standardize features by removing the mean and scaling to unit variance.
LabelBinarizer: Binarize labels in a one-vs-all fashion.
MultiLabelBinarizer: Transform between iterable of iterables and a multilabel format.

Notes

If func returns an output with a columns attribute, then the columns are enforced to be consistent with the output of get_feature_names_out.

Examples

>>> import numpy as np
>>> from sklearn.preprocessing import FunctionTransformer
>>> transformer = FunctionTransformer(np.log1p)
>>> X = np.array([[0, 1], [2, 3]])
>>> transformer.transform(X)
array([[0.       , 0.6931...],
       [1.0986..., 1.3862...]])

fit(X, y=None)

Fit transformer by checking X.

If validate is True, X will be checked.

Parameters:

X : {array-like, sparse-matrix} of shape (n_samples, n_features) if validate=True else any object that func can handle: Input array.
y : Ignored: Not used, present here for API consistency by convention.

Returns:

self : object: FunctionTransformer class instance.
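As a hedged illustration of what the validate flag changes during fit, the sketch below passes a 1-D array to two transformers. With validate=False the input is handed to func as-is; with validate=True the usual array checks are expected to reject non-2-D input. The variable names are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

X_1d = np.array([1.0, 4.0, 9.0])

# validate=False (the default): X is forwarded to func unchanged.
lenient = FunctionTransformer(np.sqrt)
lenient_out = lenient.fit(X_1d).transform(X_1d)

# validate=True: X is checked first, and a 1-D array is rejected.
strict = FunctionTransformer(np.sqrt, validate=True)
try:
    strict.fit(X_1d)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
```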

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:

X : array-like of shape (n_samples, n_features): Input samples.
y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None: Target values (None for unsupervised transformations).
**fit_params : dict: Additional fit parameters.

Returns:

X_new : ndarray of shape (n_samples, n_features_new): Transformed array.

get_feature_names_out(input_features=None)

Get output feature names for transformation.

This method is only defined if feature_names_out is not None.

Parameters:

input_features : array-like of str or None, default=None

Input feature names.

If input_features is None, then feature_names_in_ is used as the input feature names. If feature_names_in_ is not defined, then names are generated: [x0, x1, ..., x(n_features_in_ - 1)].
If input_features is array-like, then input_features must match feature_names_in_ if feature_names_in_ is defined.

Returns:

feature_names_out : ndarray of str objects

Transformed feature names.

If feature_names_out is "one-to-one", the input feature names are returned (see input_features above). This requires feature_names_in_ and/or n_features_in_ to be defined, which is done automatically if validate=True. Alternatively, you can set them in func.
If feature_names_out is a callable, then it is called with two arguments, self and input_features, and its return value is returned by this method.

get_metadata_routing()

Get metadata routing of this object.

See the User Guide for how the routing mechanism works.

Returns:

routing : MetadataRequest: A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict: Parameter names mapped to their values.
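A short sketch of what the returned dictionary looks like for this class. The keys are expected to match the constructor arguments listed in the class signature.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

transformer = FunctionTransformer(np.log1p, validate=True)

# Maps each constructor argument name to its current value.
params = transformer.get_params()
```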

inverse_transform(X)

Transform X using the inverse function.

Parameters:

X : {array-like, sparse-matrix} of shape (n_samples, n_features) if validate=True else any object that inverse_func can handle: Input array.

Returns:

X_out : array-like, shape (n_samples, n_features): Transformed input.
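A minimal round-trip sketch, assuming np.log1p as the forward function and np.expm1 as its inverse (the pairing is an illustrative choice, not something this page prescribes):

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

X = np.array([[0.0, 1.0], [2.0, 3.0]])

# Pair the forward function with its mathematical inverse.
log_transformer = FunctionTransformer(np.log1p, inverse_func=np.expm1)

X_log = log_transformer.fit_transform(X)
X_back = log_transformer.inverse_transform(X_log)  # approximately equal to X
```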

set_output(*, transform=None)

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:

transform : {"default", "pandas", "polars"}, default=None

Configure output of transform and fit_transform.

"default": Default output format of a transformer
"pandas": DataFrame output
"polars": Polars output
None: Transform configuration is unchanged

Added in version 1.4: "polars" option was added.

Returns:

self : estimator instance: Estimator instance.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict: Estimator parameters.

Returns:

self : estimator instance: Estimator instance.
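The <component>__<parameter> form can be sketched with a small Pipeline. The step names "log" and "scale" are arbitrary labels chosen for this example.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

pipe = Pipeline(
    [
        ("log", FunctionTransformer(np.log1p)),
        ("scale", StandardScaler()),
    ]
)

# Update parameters of nested steps through the pipeline itself.
pipe.set_params(log__validate=True, scale__with_mean=False)
```

The same double-underscore names are what get_params(deep=True) reports, which is why they also work as keys in a hyperparameter search grid.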

transform(X)

Transform X using the forward function.

Parameters:

X : {array-like, sparse-matrix} of shape (n_samples, n_features) if validate=True else any object that func can handle: Input array.

Returns:

X_out : array-like, shape (n_samples, n_features): Transformed input.
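The constructor signature also lists kw_args. A minimal sketch of how extra keyword arguments reach the forward function during transform is shown below; scale_by is a hypothetical helper written for this example.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer


def scale_by(X, factor=1.0):  # hypothetical helper, not part of scikit-learn
    """Multiply every entry of X by a constant factor."""
    return X * factor


# kw_args is forwarded to func as keyword arguments on each call.
scaler = FunctionTransformer(scale_by, kw_args={"factor": 10.0})

X = np.array([[0.0, 1.0], [2.0, 3.0]])
X_scaled = scaler.transform(X)
```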

Gallery examples

Feature transformations with ensembles of trees

Time-related feature engineering

Poisson regression and non-normal loss

Tweedie regression on insurance claims

Column Transformer with Heterogeneous Data Sources

Semi-supervised Classification on a Text Dataset