neuraxle.union

Union of Features

This module contains steps to perform various feature unions and model stacking, using parallelism where possible.

Classes

AddFeatures(steps_as_tuple, **kwargs)

Parallelize the union of many pipeline steps AND concatenate the new features to the received inputs using Identity.

FeatureUnion(steps_as_tuple[, joiner, n_jobs, backend])

Parallelize the union of many pipeline steps.

Identity(hyperparams, hyperparams_space)

A pipeline step that does nothing except return the received data unchanged.

ModelStacking(steps_as_tuple, judge, **kwargs)

Performs a FeatureUnion of steps, and then sends the joined result to the judge step above.

class neuraxle.union.AddFeatures(steps_as_tuple: List[Union[Tuple[str, BaseStep], BaseStep]], **kwargs)[source]

Parallelize the union of many pipeline steps AND concatenate the new features to the received inputs using Identity.
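
A minimal usage sketch follows; SomeFeatureExtractorA, SomeFeatureExtractorB and X_train are placeholders (in the same spirit as the SomePreprocessing() examples further down this page), not part of the library:

from neuraxle.union import AddFeatures

# Hypothetical BaseStep implementations that each compute new feature columns.
step = AddFeatures([
    SomeFeatureExtractorA(),
    SomeFeatureExtractorB(),
])

step = step.fit(X_train)
# The result keeps the received inputs (via an Identity branch) and
# concatenates the new features computed by both extractors.
out = step.transform(X_train)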

append(item: Tuple[str, BaseStep])[source]
fit(data_inputs, expected_outputs=None) → neuraxle.union.FeatureUnion[source]

Fit the parallel steps on the data. It will make use of some parallel processing.

Parameters
  • data_inputs – The input data to fit onto

  • expected_outputs – The output that should be obtained when fitting.

Returns

self

fit_one(data_input, expected_output=None) → neuraxle.base.BaseStep[source]
fit_transform(data_inputs, expected_outputs=None) -> ('BaseStep', typing.Any)[source]
fit_transform_one(data_input, expected_output=None) -> ('BaseStep', typing.Any)[source]
get_hyperparams(flat=False) → neuraxle.hyperparams.space.HyperparameterSamples[source]
get_hyperparams_space(flat=False)[source]
inverse_transform(processed_outputs)[source]
inverse_transform_one(data_output)[source]
items()[source]
keys()[source]
meta_fit(X_train, y_train, metastep: neuraxle.base.MetaStepMixin)[source]

Uses a meta optimization technique (AutoML) to find the best hyperparameters in the given hyperparameter space.

Usage: p = p.meta_fit(X_train, y_train, metastep=RandomSearch(n_iter=10, scoring_function=r2_score, higher_score_is_better=True))

Call .mutate(new_method="inverse_transform", method_to_assign_to="transform"), and the current estimator will become reversed: its transform method will then behave as its inverse_transform.

Parameters
  • X_train – data_inputs.

  • y_train – expected_outputs.

  • metastep – a metastep, that is, a step that can sift through the hyperparameter space of another estimator.

Returns

your best self.

mutate(new_method='inverse_transform', method_to_assign_to='transform', warn=True) → neuraxle.base.BaseStep[source]

Call mutate on every step that the present truncable step contains.

Parameters
  • new_method – the method to replace transform with.

  • method_to_assign_to – the method to which the new method will be assigned to.

  • warn – (verbose) whether or not to warn about the non-existence of the method.

Returns

self, a copy of self, or even perhaps a new or different BaseStep object.

patch_missing_names(steps_as_tuple: List) → List[Union[Tuple[str, neuraxle.base.BaseStep], neuraxle.base.BaseStep]][source]
pop() → neuraxle.base.BaseStep[source]
popfront() → neuraxle.base.BaseStep[source]
popfrontitem() → Tuple[str, neuraxle.base.BaseStep][source]
popitem(key=None) → Tuple[str, neuraxle.base.BaseStep][source]
predict(data_input)[source]
reverse() → neuraxle.base.BaseStep[source]

The object will mutate itself such that its .transform method (and that of all its underlying objects, if applicable) is replaced by the .inverse_transform method.

Note: the reverse may fail if there is a pending mutate that was set earlier with .will_mutate_to.

Returns

a copy of self, reversed. Each contained object will also have been reversed if self is a pipeline.

set_hyperparams(hyperparams: dict) → neuraxle.base.BaseStep[source]
set_hyperparams_space(hyperparams_space: dict) → neuraxle.base.BaseStep[source]
tosklearn() → NeuraxleToSKLearnPipelineWrapper[source]
transform(data_inputs)[source]

Transform the data with the unions. It will make use of some parallel processing.

Parameters

data_inputs – The input data to transform

Returns

the transformed data_inputs.

transform_one(data_input)[source]
values()[source]
will_mutate_to(new_base_step: Optional[neuraxle.base.BaseStep] = None, new_method: str = None, method_to_assign_to: str = None) → neuraxle.base.BaseStep[source]

This will change the behavior of self.mutate(<...>) such that, when mutating, it will return the provided new_base_step (which can be left as None to keep self). The .mutate call will also apply the new_method and method_to_assign_to arguments, if they are not None, after changing the object to new_base_step.

This can be useful if your pipeline requires unsupervised pretraining. For example:

X_pretrain = ...
X_train = ...
y_train = ...

p = Pipeline([
    SomePreprocessing(),
    SomePretrainingStep().will_mutate_to(new_base_step=SomeStepThatWillUseThePretrainingStep),
    Identity().will_mutate_to(new_base_step=ClassifierThatWillBeUsedOnlyAfterThePretraining)
])
# Pre-train the pipeline (unsupervised: no expected outputs)
p = p.fit(X_pretrain)

# This will leave `SomePreprocessing()` untouched and will affect the two other steps.
p = p.mutate(new_method="transform", method_to_assign_to="transform")

# Train the now-mutated pipeline
p = p.fit(X_train, y_train)  # Then fit the classifier and other new things
Parameters
  • new_base_step – if it is not None, upon calling mutate, the object it will mutate to will be this provided new_base_step.

  • method_to_assign_to – if it is not None, upon calling mutate, this is the method to which the new_method will be assigned on the provided new_base_step.

  • new_method – if it is not None, upon calling mutate, this is the method that will be assigned to method_to_assign_to on the provided new_base_step.

Returns

self

class neuraxle.union.FeatureUnion(steps_as_tuple: List[Union[Tuple[str, BaseStep], BaseStep]], joiner: neuraxle.base.NonFittableMixin = <neuraxle.steps.numpy.NumpyConcatenateInnerFeatures object>, n_jobs: int = None, backend: str = 'threading')[source]

Parallelize the union of many pipeline steps.
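
A hedged usage sketch, where the steps and data variables are placeholders; the joiner, n_jobs and backend arguments mirror the constructor signature above:

from neuraxle.union import FeatureUnion
from neuraxle.steps.numpy import NumpyConcatenateInnerFeatures

union = FeatureUnion(
    [SomeFeatureExtractorA(), SomeFeatureExtractorB()],  # placeholder steps, run in parallel
    joiner=NumpyConcatenateInnerFeatures(),  # the default joiner: concatenates the steps' outputs on the inner feature axis
    n_jobs=2,             # number of parallel jobs
    backend='threading'   # parallelism backend
)
union = union.fit(X_train, y_train)
features = union.transform(X_test)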

append(item: Tuple[str, BaseStep])[source]
fit(data_inputs, expected_outputs=None) → neuraxle.union.FeatureUnion[source]

Fit the parallel steps on the data. It will make use of some parallel processing.

Parameters
  • data_inputs – The input data to fit onto

  • expected_outputs – The output that should be obtained when fitting.

Returns

self

fit_one(data_input, expected_output=None) → neuraxle.base.BaseStep[source]
fit_transform(data_inputs, expected_outputs=None) -> ('BaseStep', typing.Any)[source]
fit_transform_one(data_input, expected_output=None) -> ('BaseStep', typing.Any)[source]
get_hyperparams(flat=False) → neuraxle.hyperparams.space.HyperparameterSamples[source]
get_hyperparams_space(flat=False)[source]
inverse_transform(processed_outputs)[source]
inverse_transform_one(data_output)[source]
items()[source]
keys()[source]
meta_fit(X_train, y_train, metastep: neuraxle.base.MetaStepMixin)[source]

Uses a meta optimization technique (AutoML) to find the best hyperparameters in the given hyperparameter space.

Usage: p = p.meta_fit(X_train, y_train, metastep=RandomSearch(n_iter=10, scoring_function=r2_score, higher_score_is_better=True))

Call .mutate(new_method="inverse_transform", method_to_assign_to="transform"), and the current estimator will become reversed: its transform method will then behave as its inverse_transform.

Parameters
  • X_train – data_inputs.

  • y_train – expected_outputs.

  • metastep – a metastep, that is, a step that can sift through the hyperparameter space of another estimator.

Returns

your best self.

mutate(new_method='inverse_transform', method_to_assign_to='transform', warn=True) → neuraxle.base.BaseStep[source]

Call mutate on every step that the present truncable step contains.

Parameters
  • new_method – the method to replace transform with.

  • method_to_assign_to – the method to which the new method will be assigned to.

  • warn – (verbose) whether or not to warn about the non-existence of the method.

Returns

self, a copy of self, or even perhaps a new or different BaseStep object.

patch_missing_names(steps_as_tuple: List) → List[Union[Tuple[str, neuraxle.base.BaseStep], neuraxle.base.BaseStep]][source]
pop() → neuraxle.base.BaseStep[source]
popfront() → neuraxle.base.BaseStep[source]
popfrontitem() → Tuple[str, neuraxle.base.BaseStep][source]
popitem(key=None) → Tuple[str, neuraxle.base.BaseStep][source]
predict(data_input)[source]
reverse() → neuraxle.base.BaseStep[source]

The object will mutate itself such that its .transform method (and that of all its underlying objects, if applicable) is replaced by the .inverse_transform method.

Note: the reverse may fail if there is a pending mutate that was set earlier with .will_mutate_to.

Returns

a copy of self, reversed. Each contained object will also have been reversed if self is a pipeline.

set_hyperparams(hyperparams: dict) → neuraxle.base.BaseStep[source]
set_hyperparams_space(hyperparams_space: dict) → neuraxle.base.BaseStep[source]
tosklearn() → NeuraxleToSKLearnPipelineWrapper[source]
transform(data_inputs)[source]

Transform the data with the unions. It will make use of some parallel processing.

Parameters

data_inputs – The input data to transform

Returns

the transformed data_inputs.

transform_one(data_input)[source]
values()[source]
will_mutate_to(new_base_step: Optional[neuraxle.base.BaseStep] = None, new_method: str = None, method_to_assign_to: str = None) → neuraxle.base.BaseStep[source]

This will change the behavior of self.mutate(<...>) such that, when mutating, it will return the provided new_base_step (which can be left as None to keep self). The .mutate call will also apply the new_method and method_to_assign_to arguments, if they are not None, after changing the object to new_base_step.

This can be useful if your pipeline requires unsupervised pretraining. For example:

X_pretrain = ...
X_train = ...
y_train = ...

p = Pipeline([
    SomePreprocessing(),
    SomePretrainingStep().will_mutate_to(new_base_step=SomeStepThatWillUseThePretrainingStep),
    Identity().will_mutate_to(new_base_step=ClassifierThatWillBeUsedOnlyAfterThePretraining)
])
# Pre-train the pipeline (unsupervised: no expected outputs)
p = p.fit(X_pretrain)

# This will leave `SomePreprocessing()` untouched and will affect the two other steps.
p = p.mutate(new_method="transform", method_to_assign_to="transform")

# Train the now-mutated pipeline
p = p.fit(X_train, y_train)  # Then fit the classifier and other new things
Parameters
  • new_base_step – if it is not None, upon calling mutate, the object it will mutate to will be this provided new_base_step.

  • method_to_assign_to – if it is not None, upon calling mutate, this is the method to which the new_method will be assigned on the provided new_base_step.

  • new_method – if it is not None, upon calling mutate, this is the method that will be assigned to method_to_assign_to on the provided new_base_step.

Returns

self

class neuraxle.union.Identity(hyperparams: neuraxle.hyperparams.space.HyperparameterSamples = None, hyperparams_space: neuraxle.hyperparams.space.HyperparameterSpace = None)[source]

A pipeline step that does nothing except return the received data unchanged.

This can be useful to concatenate new features to existing features, such as what AddFeatures does.

Identity inherits from NonTransformableMixin and from NonFittableMixin, which makes it a class that has no effect in the pipeline: it doesn't require fitting, and at transform-time, it returns the same data it received.
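
For instance, an Identity branch in a FeatureUnion keeps the original inputs alongside newly computed features, which is essentially what AddFeatures does. A small sketch with a placeholder extractor step:

from neuraxle.union import FeatureUnion, Identity

union = FeatureUnion([
    Identity(),               # passes the received data through unchanged
    SomeFeatureExtractorA(),  # placeholder step that computes new features
])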

fit(data_inputs, expected_outputs=None) → neuraxle.base.NonFittableMixin[source]

Don’t fit.

Parameters
  • data_inputs – the data that would normally be fitted on.

  • expected_outputs – the data that would normally be fitted on.

Returns

self

fit_one(data_input, expected_output=None) → neuraxle.base.NonFittableMixin[source]

Don’t fit.

Parameters
  • data_input – the data that would normally be fitted on.

  • expected_output – the data that would normally be fitted on.

Returns

self

fit_transform(data_inputs, expected_outputs=None) -> ('BaseStep', typing.Any)[source]
fit_transform_one(data_input, expected_output=None) -> ('BaseStep', typing.Any)[source]
get_hyperparams() → neuraxle.hyperparams.space.HyperparameterSamples[source]
get_hyperparams_space(flat=False) → neuraxle.hyperparams.space.HyperparameterSpace[source]
inverse_transform(processed_outputs)[source]

Do nothing - return the same data.

Parameters

processed_outputs – the data to process

Returns

the processed_outputs, unchanged.

inverse_transform_one(processed_output)[source]

Do nothing - return the same data.

Parameters

processed_output – the data to process

Returns

the data_output, unchanged.

meta_fit(X_train, y_train, metastep: neuraxle.base.MetaStepMixin)[source]

Uses a meta optimization technique (AutoML) to find the best hyperparameters in the given hyperparameter space.

Usage: p = p.meta_fit(X_train, y_train, metastep=RandomSearch(n_iter=10, scoring_function=r2_score, higher_score_is_better=True))

Call .mutate(new_method="inverse_transform", method_to_assign_to="transform"), and the current estimator will become reversed: its transform method will then behave as its inverse_transform.

Parameters
  • X_train – data_inputs.

  • y_train – expected_outputs.

  • metastep – a metastep, that is, a step that can sift through the hyperparameter space of another estimator.

Returns

your best self.

mutate(new_method='inverse_transform', method_to_assign_to='transform', warn=True) → neuraxle.base.BaseStep[source]

Replace the “method_to_assign_to” method by the “new_method” method, IF the present object has no pending calls to .will_mutate_to() waiting to be applied. If there is a pending call, the pending call will override the methods specified in the present call. If the change fails (such as if the new_method doesn’t exist), then a warning is printed (optional). By default, there is no pending will_mutate_to call.

This could for example be useful within a pipeline to apply inverse_transform to every pipeline step, to assign predict_proba to predict, or to assign "inverse_transform" to "transform" in a reversed pipeline, as in the short sketch below.

Parameters
  • new_method – the method to replace transform with, if there is no pending will_mutate_to call.

  • method_to_assign_to – the method to which the new method will be assigned to, if there is no pending will_mutate_to call.

  • warn – (verbose) whether or not to warn about the non-existence of the method.

Returns

self, a copy of self, or even perhaps a new or different BaseStep object.
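
A short sketch of the substitutions mentioned above; p is any already-built step or pipeline, and the predict_proba variant assumes the underlying step actually defines such a method:

# Make p.transform() behave as p.inverse_transform() (these are the default arguments):
p = p.mutate(new_method="inverse_transform", method_to_assign_to="transform")

# Or make p.predict() return probabilities, if the step defines predict_proba:
p = p.mutate(new_method="predict_proba", method_to_assign_to="predict")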

predict(data_input)[source]
reverse() → neuraxle.base.BaseStep[source]

The object will mutate itself such that its .transform method (and that of all its underlying objects, if applicable) is replaced by the .inverse_transform method.

Note: the reverse may fail if there is a pending mutate that was set earlier with .will_mutate_to.

Returns

a copy of self, reversed. Each contained object will also have been reversed if self is a pipeline.

set_hyperparams(hyperparams: neuraxle.hyperparams.space.HyperparameterSamples) → neuraxle.base.BaseStep[source]
set_hyperparams_space(hyperparams_space: neuraxle.hyperparams.space.HyperparameterSpace) → neuraxle.base.BaseStep[source]
tosklearn() → NeuraxleToSKLearnPipelineWrapper[source]
transform(data_inputs)[source]

Do nothing - return the same data.

Parameters

data_inputs – the data to process

Returns

the data_inputs, unchanged.

transform_one(data_input)[source]

Do nothing - return the same data.

Parameters

data_input – the data to process

Returns

the data_input, unchanged.

will_mutate_to(new_base_step: Optional[neuraxle.base.BaseStep] = None, new_method: str = None, method_to_assign_to: str = None) → neuraxle.base.BaseStep[source]

This will change the behavior of self.mutate(<...>) such that, when mutating, it will return the provided new_base_step (which can be left as None to keep self). The .mutate call will also apply the new_method and method_to_assign_to arguments, if they are not None, after changing the object to new_base_step.

This can be useful if your pipeline requires unsupervised pretraining. For example:

X_pretrain = ...
X_train = ...
y_train = ...

p = Pipeline([
    SomePreprocessing(),
    SomePretrainingStep().will_mutate_to(new_base_step=SomeStepThatWillUseThePretrainingStep),
    Identity().will_mutate_to(new_base_step=ClassifierThatWillBeUsedOnlyAfterThePretraining)
])
# Pre-train the pipeline (unsupervised: no expected outputs)
p = p.fit(X_pretrain)

# This will leave `SomePreprocessing()` untouched and will affect the two other steps.
p = p.mutate(new_method="transform", method_to_assign_to="transform")

# Train the now-mutated pipeline
p = p.fit(X_train, y_train)  # Then fit the classifier and other new things
Parameters
  • new_base_step – if it is not None, upon calling mutate, the object it will mutate to will be this provided new_base_step.

  • method_to_assign_to – if it is not None, upon calling mutate, this is the method to which the new_method will be assigned on the provided new_base_step.

  • new_method – if it is not None, upon calling mutate, this is the method that will be assigned to method_to_assign_to on the provided new_base_step.

Returns

self

class neuraxle.union.ModelStacking(steps_as_tuple: List[Union[Tuple[str, BaseStep], BaseStep]], judge: neuraxle.base.BaseStep, **kwargs)[source]

Performs a FeatureUnion of steps, and then sends the joined result to the judge step above.
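
A hedged usage sketch; SomeModelA, SomeModelB, SomeJudgeModel and the data variables are placeholders. The judge is typically a model fitted on the joined outputs of the stacked steps:

from neuraxle.union import ModelStacking

stacking = ModelStacking(
    [SomeModelA(), SomeModelB()],  # placeholder models, run in parallel as a FeatureUnion
    judge=SomeJudgeModel(),        # fitted on the joined outputs of the steps above
)
stacking = stacking.fit(X_train, y_train)
predictions = stacking.transform(X_test)  # the judge refines the joined outputs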

append(item: Tuple[str, BaseStep])[source]
fit(data_inputs, expected_outputs=None) → neuraxle.union.ModelStacking[source]

Fit the parallel steps on the data. It will make use of some parallel processing. Also, fit the judge on the result of the parallel steps.

Parameters
  • data_inputs – The input data to fit onto

  • expected_outputs – The output that should be obtained when fitting.

Returns

self

fit_one(data_input, expected_output=None) → neuraxle.base.BaseStep[source]
fit_transform(data_inputs, expected_outputs=None) -> ('BaseStep', typing.Any)[source]
fit_transform_one(data_input, expected_output=None) -> ('BaseStep', typing.Any)[source]
get_hyperparams(flat=False) → neuraxle.hyperparams.space.HyperparameterSamples[source]
get_hyperparams_space(flat=False)[source]
inverse_transform(processed_outputs)[source]
inverse_transform_one(data_output)[source]
items()[source]
keys()[source]
meta_fit(X_train, y_train, metastep: neuraxle.base.MetaStepMixin)[source]

Uses a meta optimization technique (AutoML) to find the best hyperparameters in the given hyperparameter space.

Usage: p = p.meta_fit(X_train, y_train, metastep=RandomSearch(n_iter=10, scoring_function=r2_score, higher_score_is_better=True))

Call .mutate(new_method="inverse_transform", method_to_assign_to="transform"), and the current estimator will become reversed: its transform method will then behave as its inverse_transform.

Parameters
  • X_train – data_inputs.

  • y_train – expected_outputs.

  • metastep – a metastep, that is, a step that can sift through the hyperparameter space of another estimator.

Returns

your best self.

mutate(new_method='inverse_transform', method_to_assign_to='transform', warn=True) → neuraxle.base.BaseStep[source]

Call mutate on every step that the present truncable step contains.

Parameters
  • new_method – the method to replace transform with.

  • method_to_assign_to – the method to which the new method will be assigned to.

  • warn – (verbose) whether or not to warn about the non-existence of the method.

Returns

self, a copy of self, or even perhaps a new or different BaseStep object.

patch_missing_names(steps_as_tuple: List) → List[Union[Tuple[str, neuraxle.base.BaseStep], neuraxle.base.BaseStep]][source]
pop() → neuraxle.base.BaseStep[source]
popfront() → neuraxle.base.BaseStep[source]
popfrontitem() → Tuple[str, neuraxle.base.BaseStep][source]
popitem(key=None) → Tuple[str, neuraxle.base.BaseStep][source]
predict(data_input)[source]
reverse() → neuraxle.base.BaseStep[source]

The object will mutate itself such that its .transform method (and that of all its underlying objects, if applicable) is replaced by the .inverse_transform method.

Note: the reverse may fail if there is a pending mutate that was set earlier with .will_mutate_to.

Returns

a copy of self, reversed. Each contained object will also have been reversed if self is a pipeline.

set_hyperparams(hyperparams: dict) → neuraxle.base.BaseStep[source]
set_hyperparams_space(hyperparams_space: dict) → neuraxle.base.BaseStep[source]
tosklearn() → NeuraxleToSKLearnPipelineWrapper[source]
transform(data_inputs)[source]

Transform the data with the unions. It will make use of some parallel processing. Then, use the judge to refine the transformations.

Parameters

data_inputs – The input data to transform

Returns

the transformed data_inputs.

transform_one(data_input)[source]
values()[source]
will_mutate_to(new_base_step: Optional[neuraxle.base.BaseStep] = None, new_method: str = None, method_to_assign_to: str = None) → neuraxle.base.BaseStep[source]

This will change the behavior of self.mutate(<...>) such that, when mutating, it will return the provided new_base_step (which can be left as None to keep self). The .mutate call will also apply the new_method and method_to_assign_to arguments, if they are not None, after changing the object to new_base_step.

This can be useful if your pipeline requires unsupervised pretraining. For example:

X_pretrain = ...
X_train = ...
y_train = ...

p = Pipeline([
    SomePreprocessing(),
    SomePretrainingStep().will_mutate_to(new_base_step=SomeStepThatWillUseThePretrainingStep),
    Identity().will_mutate_to(new_base_step=ClassifierThatWillBeUsedOnlyAfterThePretraining)
])
# Pre-train the pipeline (unsupervised: no expected outputs)
p = p.fit(X_pretrain)

# This will leave `SomePreprocessing()` untouched and will affect the two other steps.
p = p.mutate(new_method="transform", method_to_assign_to="transform")

# Train the now-mutated pipeline
p = p.fit(X_train, y_train)  # Then fit the classifier and other new things
Parameters
  • new_base_step – if it is not None, upon calling mutate, the object it will mutate to will be this provided new_base_step.

  • method_to_assign_to – if it is not None, upon calling mutate, this is the method to which the new_method will be assigned on the provided new_base_step.

  • new_method – if it is not None, upon calling mutate, this is the method that will be assigned to method_to_assign_to on the provided new_base_step.

Returns

self