neuraxle.steps.util

Util Pipeline Steps

This module provides miscellaneous pipeline steps, such as callback steps that are useful for debugging, and a step cloner.

Classes

BaseCallbackStep(callback_function, …)

Base class for callback steps.

DataShuffler

FitCallbackStep(callback_function, …)

Call a callback method on fit.

StepClonerForEachDataInput(wrapped[, copy_op])

TapeCallbackFunction()

This class’s purpose is to be sent to the callback to accumulate information.

TransformCallbackStep(callback_function, …)

Call a callback method on transform and inverse transform.

class neuraxle.steps.util.BaseCallbackStep(callback_function, more_arguments: List = ())[source]

Base class for callback steps.

fit(data_inputs, expected_outputs=None) → neuraxle.base.BaseStep[source]
fit_one(data_input, expected_output=None) → neuraxle.base.BaseStep[source]
fit_transform(data_inputs, expected_outputs=None) -> ('BaseStep', typing.Any)[source]
fit_transform_one(data_input, expected_output=None) -> ('BaseStep', typing.Any)[source]
get_hyperparams() → neuraxle.hyperparams.space.HyperparameterSamples[source]
get_hyperparams_space(flat=False) → neuraxle.hyperparams.space.HyperparameterSpace[source]
inverse_transform(processed_outputs)[source]
inverse_transform_one(data_output)[source]
meta_fit(X_train, y_train, metastep: neuraxle.base.MetaStepMixin)[source]

Uses a meta optimization technique (AutoML) to find the best hyperparameters in the given hyperparameter space.

Usage: p = p.meta_fit(X_train, y_train, metastep=RandomSearch(n_iter=10, scoring_function=r2_score, higher_score_is_better=True))

Call .mutate(new_method="inverse_transform", method_to_assign_to="transform"), and the current estimator will become

Parameters
  • X_train – data_inputs.

  • y_train – expected_outputs.

  • metastep – a metastep, that is, a step that can sift through the hyperparameter space of another estimator.

Returns

your best self.
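The usage line above names a random search with `n_iter`, `scoring_function` and `higher_score_is_better`. As a rough illustration of what such a search does, here is a minimal sketch in plain Python. It is not Neuraxle's `RandomSearch`: `toy_random_search`, `sample` and `score` are hypothetical names, and the one-parameter search space is an assumption made for the example.

```python
import random


def toy_random_search(score_fn, sample_fn, n_iter=10,
                      higher_score_is_better=True, seed=0):
    """Hypothetical sketch: try n_iter random candidates, keep the best one."""
    rng = random.Random(seed)
    best_params, best_score = None, None
    for _ in range(n_iter):
        params = sample_fn(rng)   # draw one candidate from the space
        score = score_fn(params)  # evaluate the candidate
        if best_score is None:
            is_better = True
        elif higher_score_is_better:
            is_better = score > best_score
        else:
            is_better = score < best_score
        if is_better:
            best_params, best_score = params, score
    return best_params, best_score


def sample(rng):
    # Assumed search space: a single float hyperparameter in [0, 1].
    return {"alpha": rng.uniform(0.0, 1.0)}


def score(params):
    # Assumed scoring function: the closer alpha is to 0.5, the higher the score.
    return -abs(params["alpha"] - 0.5)


best_params, best_score = toy_random_search(score, sample, n_iter=10)
```

In the real library the metastep would fit and score the wrapped estimator for each candidate; here the scoring function stands in for that whole step.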

mutate(new_method='inverse_transform', method_to_assign_to='transform', warn=True) → neuraxle.base.BaseStep[source]

Replace the “method_to_assign_to” method by the “new_method” method, IF the present object has no pending calls to .will_mutate_to() waiting to be applied. If there is a pending call, the pending call will override the methods specified in the present call. If the change fails (such as if the new_method doesn’t exist), then a warning is printed (optional). By default, there is no pending will_mutate_to call.

This could, for example, be useful within a pipeline to apply inverse_transform to every pipeline step, to assign predict_probas to predict, or to assign “inverse_transform” to “transform” in a reversed pipeline.

Parameters
  • new_method – the method to replace transform with, if there is no pending will_mutate_to call.

  • method_to_assign_to – the method to which the new method will be assigned, if there is no pending will_mutate_to call.

  • warn – (verbose) whether or not to warn if the method does not exist.

Returns

self, a copy of self, or even perhaps a new or different BaseStep object.
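The method replacement described for `mutate` can be pictured as rebinding an attribute by name. The following is a minimal sketch under that assumption, not Neuraxle's implementation: `ToyStep` is a hypothetical stand-in, and the sketch ignores pending `will_mutate_to` calls entirely.

```python
import copy
import warnings


class ToyStep:
    """Hypothetical stand-in for a step; not part of Neuraxle."""

    def transform(self, data_inputs):
        return [x * 2 for x in data_inputs]

    def inverse_transform(self, processed_outputs):
        return [x / 2 for x in processed_outputs]

    def mutate(self, new_method="inverse_transform",
               method_to_assign_to="transform", warn=True):
        new_self = copy.copy(self)  # leave the original step untouched
        replacement = getattr(new_self, new_method, None)
        if replacement is None:
            # The change failed: optionally warn, then return without changes.
            if warn:
                warnings.warn(f"No method named {new_method!r}; nothing changed.")
            return new_self
        # Rebind the target name to the replacement method on the copy.
        setattr(new_self, method_to_assign_to, replacement)
        return new_self


step = ToyStep()
mutated = step.mutate()
halved = mutated.transform([2, 4])   # now behaves like inverse_transform
doubled = step.transform([2, 4])     # the original step is unaffected
unchanged = step.mutate(new_method="does_not_exist", warn=False)
```

Returning a copy is one possible design choice here; the documented return value allows self, a copy, or a different step.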

predict(data_input)[source]
reverse() → neuraxle.base.BaseStep[source]

The object will mutate itself such that the .transform method (and that of all its underlying objects, if applicable) is replaced by the .inverse_transform method.

Note: the reverse may fail if there is a pending mutate that was set earlier with .will_mutate_to.

Returns

a copy of self, reversed. Each contained object will also have been reversed if self is a pipeline.

set_hyperparams(hyperparams: neuraxle.hyperparams.space.HyperparameterSamples) → neuraxle.base.BaseStep[source]
set_hyperparams_space(hyperparams_space: neuraxle.hyperparams.space.HyperparameterSpace) → neuraxle.base.BaseStep[source]
tosklearn() → NeuraxleToSKLearnPipelineWrapper[source]
transform(data_inputs)[source]
transform_one(data_input)[source]
will_mutate_to(new_base_step: Optional[neuraxle.base.BaseStep] = None, new_method: str = None, method_to_assign_to: str = None) → neuraxle.base.BaseStep[source]

This changes the behavior of self.mutate(<...>): when mutating, it will return the provided new_base_step (leave it as None to keep self). After switching to new_base_step, the .mutate method will also apply new_method and method_to_assign_to, if they are not None.

This can be useful if your pipeline requires unsupervised pretraining. For example:

X_pretrain = ...
X_train = ...

p = Pipeline([
    SomePreprocessing(),
    SomePretrainingStep().will_mutate_to(new_base_step=SomeStepThatWillUseThePretrainingStep),
    Identity().will_mutate_to(new_base_step=ClassifierThatWillBeUsedOnlyAfterThePretraining)
])
# Pre-train the pipeline (unsupervised: no expected outputs)
p = p.fit(X_pretrain)

# This will leave `SomePreprocessing()` untouched and will affect the two other steps.
p = p.mutate(new_method="transform", method_to_assign_to="transform")

# Train the mutated pipeline
p = p.fit(X_train, y_train)  # Then fit the classifier and other new things

Parameters
  • new_base_step – if it is not None, upon calling mutate, the object it will mutate to will be this provided new_base_step.

  • method_to_assign_to – if it is not None, upon calling mutate, this is the method that new_method will be assigned to on the provided new_base_step.

  • new_method – if it is not None, upon calling mutate, the new_method will be the one that is used on the provided new_base_step.

Returns

self
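The deferred behavior of `will_mutate_to` can be sketched as storing a replacement step that a later `mutate` call hands back. This is only an illustration under simplifying assumptions: `ToyStep` and its `pending_step` attribute are hypothetical, and the method-swapping part of `mutate` is left out.

```python
class ToyStep:
    """Hypothetical stand-in for a step; not part of Neuraxle."""

    def __init__(self, name):
        self.name = name
        self.pending_step = None  # filled in by will_mutate_to

    def will_mutate_to(self, new_base_step=None):
        # Only remember what to become; nothing changes until mutate().
        self.pending_step = new_base_step
        return self

    def mutate(self):
        # With a pending step, return that step instead of self.
        if self.pending_step is not None:
            return self.pending_step
        return self


pretrainer = ToyStep("pretrainer").will_mutate_to(ToyStep("classifier"))
untouched = ToyStep("preprocessing")

# Mutating every step swaps only the ones with a pending replacement.
steps_after = [s.mutate().name for s in (untouched, pretrainer)]
```

Because `will_mutate_to` returns self, it can be chained directly on a freshly built step, as in the pipeline example above.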

class neuraxle.steps.util.DataShuffler[source]
class neuraxle.steps.util.FitCallbackStep(callback_function, more_arguments: List = ())[source]

Call a callback method on fit.

fit(data_inputs, expected_outputs=None) → neuraxle.steps.util.FitCallbackStep[source]

Will call the self._callback() with the data being processed and the extra arguments specified. Note that here, the data to process is packed into a tuple of (data_inputs, expected_outputs). It has no other effect.

Parameters
  • data_inputs – the data to process

  • expected_outputs – the expected outputs that go with the data inputs

Returns

self
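To make the tuple packing concrete, here is a minimal sketch of a fit-time callback step. It is not Neuraxle's `FitCallbackStep`: `ToyFitCallbackStep` and `record` are hypothetical, and unpacking `more_arguments` positionally is an assumption that this page does not confirm.

```python
class ToyFitCallbackStep:
    """Hypothetical stand-in; not Neuraxle's FitCallbackStep."""

    def __init__(self, callback_function, more_arguments=()):
        self.callback_function = callback_function
        self.more_arguments = more_arguments

    def fit(self, data_inputs, expected_outputs=None):
        # Pack both arguments into one tuple, then pass the extra arguments
        # positionally (assumed behavior).
        self.callback_function((data_inputs, expected_outputs), *self.more_arguments)
        return self

    def transform(self, data_inputs):
        # Do nothing: return the same data.
        return data_inputs


calls = []


def record(data, name=""):
    calls.append((name, data))


step = ToyFitCallbackStep(record, ["fit-1"])
returned = step.fit([1, 2], [10, 20])
```

After the call, `calls` holds one entry whose data part is the `(data_inputs, expected_outputs)` tuple, which is handy for checking what reached a given point of a pipeline during fitting.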

fit_one(data_input, expected_output=None) → neuraxle.steps.util.FitCallbackStep[source]

Will call the self._callback() with the data being processed and the extra arguments specified. Note that here, the data to process is packed into a tuple of (data_input, expected_output). It has no other effect.

Parameters
  • data_input – the data to process

  • expected_output – the expected output that goes with the data input

Returns

self

fit_transform(data_inputs, expected_outputs=None) -> ('BaseStep', typing.Any)[source]
fit_transform_one(data_input, expected_output=None) -> ('BaseStep', typing.Any)[source]
get_hyperparams() → neuraxle.hyperparams.space.HyperparameterSamples[source]
get_hyperparams_space(flat=False) → neuraxle.hyperparams.space.HyperparameterSpace[source]
inverse_transform(processed_outputs)[source]

Do nothing - return the same data.

Parameters

processed_outputs – the data to process

Returns

the processed_outputs, unchanged.

inverse_transform_one(processed_output)[source]

Do nothing - return the same data.

Parameters

processed_output – the data to process

Returns

the processed_output, unchanged.

meta_fit(X_train, y_train, metastep: neuraxle.base.MetaStepMixin)[source]

Uses a meta optimization technique (AutoML) to find the best hyperparameters in the given hyperparameter space.

Usage: p = p.meta_fit(X_train, y_train, metastep=RandomSearch(n_iter=10, scoring_function=r2_score, higher_score_is_better=True))

Call .mutate(new_method="inverse_transform", method_to_assign_to="transform"), and the current estimator will become

Parameters
  • X_train – data_inputs.

  • y_train – expected_outputs.

  • metastep – a metastep, that is, a step that can sift through the hyperparameter space of another estimator.

Returns

your best self.

mutate(new_method='inverse_transform', method_to_assign_to='transform', warn=True) → neuraxle.base.BaseStep[source]

Replace the “method_to_assign_to” method by the “new_method” method, IF the present object has no pending calls to .will_mutate_to() waiting to be applied. If there is a pending call, the pending call will override the methods specified in the present call. If the change fails (such as if the new_method doesn’t exist), then a warning is printed (optional). By default, there is no pending will_mutate_to call.

This could, for example, be useful within a pipeline to apply inverse_transform to every pipeline step, to assign predict_probas to predict, or to assign “inverse_transform” to “transform” in a reversed pipeline.

Parameters
  • new_method – the method to replace transform with, if there is no pending will_mutate_to call.

  • method_to_assign_to – the method to which the new method will be assigned, if there is no pending will_mutate_to call.

  • warn – (verbose) whether or not to warn if the method does not exist.

Returns

self, a copy of self, or even perhaps a new or different BaseStep object.

predict(data_input)[source]
reverse() → neuraxle.base.BaseStep[source]

The object will mutate itself such that the .transform method (and that of all its underlying objects, if applicable) is replaced by the .inverse_transform method.

Note: the reverse may fail if there is a pending mutate that was set earlier with .will_mutate_to.

Returns

a copy of self, reversed. Each contained object will also have been reversed if self is a pipeline.

set_hyperparams(hyperparams: neuraxle.hyperparams.space.HyperparameterSamples) → neuraxle.base.BaseStep[source]
set_hyperparams_space(hyperparams_space: neuraxle.hyperparams.space.HyperparameterSpace) → neuraxle.base.BaseStep[source]
tosklearn() → NeuraxleToSKLearnPipelineWrapper[source]
transform(data_inputs)[source]

Do nothing - return the same data.

Parameters

data_inputs – the data to process

Returns

the data_inputs, unchanged.

transform_one(data_input)[source]

Do nothing - return the same data.

Parameters

data_input – the data to process

Returns

the data_input, unchanged.

will_mutate_to(new_base_step: Optional[neuraxle.base.BaseStep] = None, new_method: str = None, method_to_assign_to: str = None) → neuraxle.base.BaseStep[source]

This changes the behavior of self.mutate(<...>): when mutating, it will return the provided new_base_step (leave it as None to keep self). After switching to new_base_step, the .mutate method will also apply new_method and method_to_assign_to, if they are not None.

This can be useful if your pipeline requires unsupervised pretraining. For example:

X_pretrain = ...
X_train = ...

p = Pipeline([
    SomePreprocessing(),
    SomePretrainingStep().will_mutate_to(new_base_step=SomeStepThatWillUseThePretrainingStep),
    Identity().will_mutate_to(new_base_step=ClassifierThatWillBeUsedOnlyAfterThePretraining)
])
# Pre-train the pipeline (unsupervised: no expected outputs)
p = p.fit(X_pretrain)

# This will leave `SomePreprocessing()` untouched and will affect the two other steps.
p = p.mutate(new_method="transform", method_to_assign_to="transform")

# Train the mutated pipeline
p = p.fit(X_train, y_train)  # Then fit the classifier and other new things

Parameters
  • new_base_step – if it is not None, upon calling mutate, the object it will mutate to will be this provided new_base_step.

  • method_to_assign_to – if it is not None, upon calling mutate, this is the method that new_method will be assigned to on the provided new_base_step.

  • new_method – if it is not None, upon calling mutate, the new_method will be the one that is used on the provided new_base_step.

Returns

self

class neuraxle.steps.util.StepClonerForEachDataInput(wrapped: neuraxle.base.BaseStep, copy_op=<function deepcopy>)[source]
fit(data_inputs: List, expected_outputs: List = None) → neuraxle.steps.util.StepClonerForEachDataInput[source]
fit_one(data_input, expected_output=None) → neuraxle.base.BaseStep[source]
fit_transform(data_inputs, expected_outputs=None) -> ('BaseStep', typing.Any)[source]
fit_transform_one(data_input, expected_output=None) -> ('BaseStep', typing.Any)[source]
get_best_model() → neuraxle.base.BaseStep[source]
get_hyperparams() → neuraxle.hyperparams.space.HyperparameterSamples[source]
get_hyperparams_space(flat=False) → neuraxle.hyperparams.space.HyperparameterSpace[source]
inverse_transform(data_output)[source]
inverse_transform_one(data_output)[source]
meta_fit(X_train, y_train, metastep: neuraxle.base.MetaStepMixin)[source]

Uses a meta optimization technique (AutoML) to find the best hyperparameters in the given hyperparameter space.

Usage: p = p.meta_fit(X_train, y_train, metastep=RandomSearch(n_iter=10, scoring_function=r2_score, higher_score_is_better=True))

Call .mutate(new_method="inverse_transform", method_to_assign_to="transform"), and the current estimator will become

Parameters
  • X_train – data_inputs.

  • y_train – expected_outputs.

  • metastep – a metastep, that is, a step that can sift through the hyperparameter space of another estimator.

Returns

your best self.

mutate(new_method='inverse_transform', method_to_assign_to='transform', warn=True) → neuraxle.base.BaseStep[source]

Replace the “method_to_assign_to” method by the “new_method” method, IF the present object has no pending calls to .will_mutate_to() waiting to be applied. If there is a pending call, the pending call will override the methods specified in the present call. If the change fails (such as if the new_method doesn’t exist), then a warning is printed (optional). By default, there is no pending will_mutate_to call.

This could, for example, be useful within a pipeline to apply inverse_transform to every pipeline step, to assign predict_probas to predict, or to assign “inverse_transform” to “transform” in a reversed pipeline.

Parameters
  • new_method – the method to replace transform with, if there is no pending will_mutate_to call.

  • method_to_assign_to – the method to which the new method will be assigned, if there is no pending will_mutate_to call.

  • warn – (verbose) whether or not to warn if the method does not exist.

Returns

self, a copy of self, or even perhaps a new or different BaseStep object.

predict(data_input)[source]
reverse() → neuraxle.base.BaseStep[source]

The object will mutate itself such that the .transform method (and that of all its underlying objects, if applicable) is replaced by the .inverse_transform method.

Note: the reverse may fail if there is a pending mutate that was set earlier with .will_mutate_to.

Returns

a copy of self, reversed. Each contained object will also have been reversed if self is a pipeline.

set_hyperparams(hyperparams: neuraxle.hyperparams.space.HyperparameterSamples) → neuraxle.base.BaseStep[source]
set_hyperparams_space(hyperparams_space: neuraxle.hyperparams.space.HyperparameterSpace) → neuraxle.base.BaseStep[source]
set_step(step: neuraxle.base.BaseStep) → neuraxle.base.BaseStep[source]
tosklearn() → NeuraxleToSKLearnPipelineWrapper[source]
transform(data_inputs: List) → List[source]
transform_one(data_input)[source]
will_mutate_to(new_base_step: Optional[neuraxle.base.BaseStep] = None, new_method: str = None, method_to_assign_to: str = None) → neuraxle.base.BaseStep[source]

This changes the behavior of self.mutate(<...>): when mutating, it will return the provided new_base_step (leave it as None to keep self). After switching to new_base_step, the .mutate method will also apply new_method and method_to_assign_to, if they are not None.

This can be useful if your pipeline requires unsupervised pretraining. For example:

X_pretrain = ...
X_train = ...

p = Pipeline([
    SomePreprocessing(),
    SomePretrainingStep().will_mutate_to(new_base_step=SomeStepThatWillUseThePretrainingStep),
    Identity().will_mutate_to(new_base_step=ClassifierThatWillBeUsedOnlyAfterThePretraining)
])
# Pre-train the pipeline (unsupervised: no expected outputs)
p = p.fit(X_pretrain)

# This will leave `SomePreprocessing()` untouched and will affect the two other steps.
p = p.mutate(new_method="transform", method_to_assign_to="transform")

# Train the mutated pipeline
p = p.fit(X_train, y_train)  # Then fit the classifier and other new things

Parameters
  • new_base_step – if it is not None, upon calling mutate, the object it will mutate to will be this provided new_base_step.

  • method_to_assign_to – if it is not None, upon calling mutate, this is the method that new_method will be assigned to on the provided new_base_step.

  • new_method – if it is not None, upon calling mutate, the new_method will be the one that is used on the provided new_base_step.

Returns

self

class neuraxle.steps.util.TapeCallbackFunction[source]

This class’s purpose is to be sent to the callback to accumulate information.

Example usage:

expected_tape = ["1", "2", "3", "a", "b", "4"]
tape = TapeCallbackFunction()

p = Pipeline([
    Identity(),
    TransformCallbackStep(tape.callback, ["1"]),
    TransformCallbackStep(tape.callback, ["2"]),
    TransformCallbackStep(tape.callback, ["3"]),
    AddFeatures([
        TransformCallbackStep(tape.callback, ["a"]),
        TransformCallbackStep(tape.callback, ["b"]),
    ]),
    TransformCallbackStep(tape.callback, ["4"]),
    Identity()
])
p.fit_transform(np.ones((1, 1)))

assert expected_tape == tape.get_name_tape()
callback(data, name: str = '')[source]

Will stick the data and name to the tape.

Parameters
  • data – data to save

  • name – name to save (string)

Returns

None

get_data() → List[source]

Get the data tape

Returns

The list of data.

get_name_tape() → List[str][source]

Get the name tape

Returns

The list of names.
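The three methods above suggest two parallel lists, one for data and one for names. A minimal sketch of that idea follows; `ToyTape` is a hypothetical stand-in written for illustration and is not the library's `TapeCallbackFunction`.

```python
class ToyTape:
    """Hypothetical stand-in that records every callback invocation."""

    def __init__(self):
        self.data = []
        self.name_tape = []

    def callback(self, data, name=""):
        # Stick the data and the name to the tape, in call order.
        self.data.append(data)
        self.name_tape.append(name)

    def get_data(self):
        return self.data

    def get_name_tape(self):
        return self.name_tape


tape = ToyTape()
for name in ["1", "2", "3"]:
    tape.callback([[1.0]], name)

names = tape.get_name_tape()
recorded = tape.get_data()
```

Since entries are appended in call order, comparing the name tape with an expected list is a simple way to assert the order in which steps ran.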

class neuraxle.steps.util.TransformCallbackStep(callback_function, more_arguments: List = ())[source]

Call a callback method on transform and inverse transform.

fit(data_inputs, expected_outputs=None) → neuraxle.base.NonFittableMixin[source]

Don’t fit.

Parameters
  • data_inputs – the data that would normally be fitted on.

  • expected_outputs – the data that would normally be fitted on.

Returns

self

fit_one(data_input, expected_output=None) → neuraxle.base.NonFittableMixin[source]

Don’t fit.

Parameters
  • data_input – the data that would normally be fitted on.

  • expected_output – the data that would normally be fitted on.

Returns

self

fit_transform(data_inputs, expected_outputs=None) -> ('BaseStep', typing.Any)[source]
fit_transform_one(data_input, expected_output=None) -> ('BaseStep', typing.Any)[source]
get_hyperparams() → neuraxle.hyperparams.space.HyperparameterSamples[source]
get_hyperparams_space(flat=False) → neuraxle.hyperparams.space.HyperparameterSpace[source]
inverse_transform(processed_outputs)[source]

Will call the self._callback() with the data being processed and the extra arguments specified. It has no other effect.

Parameters

processed_outputs – the data to process

Returns

the same data as input, unchanged (like the Identity class).

inverse_transform_one(processed_output)[source]

Will call the self._callback() with the data being processed and the extra arguments specified. It has no other effect.

Parameters

processed_output – the data to process

Returns

the same data as input, unchanged (like the Identity class).

meta_fit(X_train, y_train, metastep: neuraxle.base.MetaStepMixin)[source]

Uses a meta optimization technique (AutoML) to find the best hyperparameters in the given hyperparameter space.

Usage: p = p.meta_fit(X_train, y_train, metastep=RandomSearch(n_iter=10, scoring_function=r2_score, higher_score_is_better=True))

Call .mutate(new_method="inverse_transform", method_to_assign_to="transform"), and the current estimator will become

Parameters
  • X_train – data_inputs.

  • y_train – expected_outputs.

  • metastep – a metastep, that is, a step that can sift through the hyperparameter space of another estimator.

Returns

your best self.

mutate(new_method='inverse_transform', method_to_assign_to='transform', warn=True) → neuraxle.base.BaseStep[source]

Replace the “method_to_assign_to” method by the “new_method” method, IF the present object has no pending calls to .will_mutate_to() waiting to be applied. If there is a pending call, the pending call will override the methods specified in the present call. If the change fails (such as if the new_method doesn’t exist), then a warning is printed (optional). By default, there is no pending will_mutate_to call.

This could, for example, be useful within a pipeline to apply inverse_transform to every pipeline step, to assign predict_probas to predict, or to assign “inverse_transform” to “transform” in a reversed pipeline.

Parameters
  • new_method – the method to replace transform with, if there is no pending will_mutate_to call.

  • method_to_assign_to – the method to which the new method will be assigned, if there is no pending will_mutate_to call.

  • warn – (verbose) whether or not to warn if the method does not exist.

Returns

self, a copy of self, or even perhaps a new or different BaseStep object.

predict(data_input)[source]
reverse() → neuraxle.base.BaseStep[source]

The object will mutate itself such that the .transform method (and that of all its underlying objects, if applicable) is replaced by the .inverse_transform method.

Note: the reverse may fail if there is a pending mutate that was set earlier with .will_mutate_to.

Returns

a copy of self, reversed. Each contained object will also have been reversed if self is a pipeline.

set_hyperparams(hyperparams: neuraxle.hyperparams.space.HyperparameterSamples) → neuraxle.base.BaseStep[source]
set_hyperparams_space(hyperparams_space: neuraxle.hyperparams.space.HyperparameterSpace) → neuraxle.base.BaseStep[source]
tosklearn() → NeuraxleToSKLearnPipelineWrapper[source]
transform(data_inputs)[source]

Will call the self._callback() with the data being processed and the extra arguments specified. It has no other effect.

Parameters

data_inputs – the data to process

Returns

the same data as input, unchanged (like the Identity class).

transform_one(data_input)[source]

Will call the self._callback() with the data being processed and the extra arguments specified. It has no other effect.

Parameters

data_input – the data to process

Returns

the same data as input, unchanged (like the Identity class).
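The pass-through behavior of the transform-time callback can be sketched as follows. This is an illustration only: `ToyTransformCallbackStep` is hypothetical, and passing `more_arguments` positionally to the callback is an assumption.

```python
class ToyTransformCallbackStep:
    """Hypothetical stand-in; not Neuraxle's TransformCallbackStep."""

    def __init__(self, callback_function, more_arguments=()):
        self.callback_function = callback_function
        self.more_arguments = more_arguments

    def fit(self, data_inputs, expected_outputs=None):
        # Don't fit.
        return self

    def transform(self, data_inputs):
        # Notify the callback, then return the data unchanged.
        self.callback_function(data_inputs, *self.more_arguments)
        return data_inputs

    def inverse_transform(self, processed_outputs):
        # Same behavior on the way back through the pipeline.
        self.callback_function(processed_outputs, *self.more_arguments)
        return processed_outputs


seen = []
step = ToyTransformCallbackStep(lambda data, name: seen.append(name), ["debug"])
data = [1, 2, 3]
out = step.transform(data)
back = step.inverse_transform(out)
```

The step returns the very same object it received, so inserting it in a pipeline only adds the side effect of the callback.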

will_mutate_to(new_base_step: Optional[neuraxle.base.BaseStep] = None, new_method: str = None, method_to_assign_to: str = None) → neuraxle.base.BaseStep[source]

This changes the behavior of self.mutate(<...>): when mutating, it will return the provided new_base_step (leave it as None to keep self). After switching to new_base_step, the .mutate method will also apply new_method and method_to_assign_to, if they are not None.

This can be useful if your pipeline requires unsupervised pretraining. For example:

X_pretrain = ...
X_train = ...

p = Pipeline([
    SomePreprocessing(),
    SomePretrainingStep().will_mutate_to(new_base_step=SomeStepThatWillUseThePretrainingStep),
    Identity().will_mutate_to(new_base_step=ClassifierThatWillBeUsedOnlyAfterThePretraining)
])
# Pre-train the pipeline (unsupervised: no expected outputs)
p = p.fit(X_pretrain)

# This will leave `SomePreprocessing()` untouched and will affect the two other steps.
p = p.mutate(new_method="transform", method_to_assign_to="transform")

# Train the mutated pipeline
p = p.fit(X_train, y_train)  # Then fit the classifier and other new things

Parameters
  • new_base_step – if it is not None, upon calling mutate, the object it will mutate to will be this provided new_base_step.

  • method_to_assign_to – if it is not None, upon calling mutate, this is the method that new_method will be assigned to on the provided new_base_step.

  • new_method – if it is not None, upon calling mutate, the new_method will be the one that is used on the provided new_base_step.

Returns

self