ScalerAndEncoder
- class falcon.tabular.processors.ScalerAndEncoder(mask: List[ColumnTypes])
Applies OneHotEncoder to low-cardinality categorical features, OrdinalEncoder to high-cardinality categorical features, and StandardScaler to numerical features.
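The column-wise behaviour described above can be illustrated with a minimal NumPy-only sketch. It is not the library's implementation: the `NUM`/`CAT` labels stand in for `ColumnTypes` members, and the `MAX_ONE_HOT` cut-off between low and high cardinality is an assumed value chosen for the example.

```python
import numpy as np

# Hypothetical stand-ins for ColumnTypes members; the real enum may differ.
NUM, CAT = "num", "cat"
# Assumed cardinality cut-off; the library's actual threshold is not stated here.
MAX_ONE_HOT = 3


def scale_and_encode(X, mask):
    """Column-wise sketch: scale numerical columns, encode categorical ones."""
    blocks = []
    for j, kind in enumerate(mask):
        col = X[:, j]
        if kind == NUM:
            vals = col.astype(np.float64)
            std = vals.std()
            # Standard scaling: zero mean, unit variance (guard constant columns).
            scaled = (vals - vals.mean()) / (std if std > 0 else 1.0)
            blocks.append(scaled[:, None])
        else:
            cats, codes = np.unique(col.astype(str), return_inverse=True)
            codes = codes.ravel()
            if len(cats) <= MAX_ONE_HOT:
                # Low cardinality: one indicator column per category.
                blocks.append(np.eye(len(cats))[codes])
            else:
                # High cardinality: a single integer-coded column.
                blocks.append(codes[:, None].astype(np.float64))
    return np.hstack(blocks).astype(np.float32)


X = np.array(
    [[1.0, "a", "p"], [2.0, "b", "q"], [3.0, "a", "r"], [4.0, "b", "s"]],
    dtype=object,
)
out = scale_and_encode(X, [NUM, CAT, CAT])
```

Here the second column has two categories and is one-hot encoded into two columns, while the third has four categories and is reduced to a single ordinal column, so `out` has four columns in total.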
- __init__(mask: List[ColumnTypes]) → None
- Parameters
mask (List[ColumnTypes]) – the type of each column, in column order (the entry at index i describes column i)
- fit(X: ndarray[Any, dtype[ScalarType]], y: Optional[Any] = None, *args: Any, **kwargs: Any) → None
Fits the encoder.
- Parameters
X (npt.NDArray) – data to encode
y (Any, optional) – unused dummy argument, kept for compatibility with pipeline training
- fit_pipe(X: Any, y: Any, *args: Any, **kwargs: Any) → Any
Equivalent of the fit method, used to chain elements inside a pipeline during training.
- Parameters
X (Any) – features
y (Any) – targets
- Returns
usually None
- Return type
Any
- forward(X: ndarray[Any, dtype[object_]], *args: Any, **kwargs: Any) → ndarray[Any, dtype[ScalarType]]
Equivalent of .predict() or .transform().
- Parameters
X (npt.NDArray[object]) – data to process
- Returns
processed data
- Return type
npt.NDArray
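As a rough illustration of how fit_pipe and forward could cooperate during pipeline training, the sketch below chains two toy elements. The `Center` and `Halve` classes and the `train_chain` helper are hypothetical and exist only for this example; the library's actual pipeline logic may differ.

```python
import numpy as np


class Center:
    """Toy element exposing the same fit_pipe/forward pair."""

    def fit_pipe(self, X, y, *args, **kwargs):
        # Learn the column means from the training data.
        self.mean_ = X.mean(axis=0)

    def forward(self, X, *args, **kwargs):
        return X - self.mean_


class Halve:
    """Toy stateless element: nothing to learn, only a transformation."""

    def fit_pipe(self, X, y, *args, **kwargs):
        return None

    def forward(self, X, *args, **kwargs):
        return X / 2.0


def train_chain(elements, X, y):
    """Fit each element on the output of the previous one (assumed scheme)."""
    for element in elements:
        element.fit_pipe(X, y)
        X = element.forward(X)
    return X


X = np.array([[1.0], [3.0]])
result = train_chain([Center(), Halve()], X, y=None)
```

Each element is fitted on the data as transformed by its predecessors, which is why a chaining-friendly fit_pipe is paired with forward.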
- get_input_type() → Type
- Returns
object
- Return type
Type
- get_output_type() → Type
- Returns
Float32Array
- Return type
Type
- predict(X: ndarray[Any, dtype[ScalarType]], *args: Any, **kwargs: Any) → ndarray[Any, dtype[ScalarType]]
Applies the encoder.
- Parameters
X (npt.NDArray) – input data
- Returns
encoded data
- Return type
npt.NDArray
- to_onnx() → SerializedModelRepr
Serializes the encoder to ONNX. Each feature in the original dataset is mapped to its own input node (float32 for numerical features, string for categorical features).
- Return type
SerializedModelRepr
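Because the serialized model takes one input node per feature, a 2-D array has to be split into per-column inputs before inference. The sketch below shows one way to do that with NumPy; the `build_feed` helper and the input names `age` and `city` are hypothetical, and the names actually used by the exported model may differ.

```python
import numpy as np


def build_feed(X, is_numeric, names):
    """Split a 2-D object array into one (n, 1) array per column."""
    feed = {}
    for j, name in enumerate(names):
        col = X[:, j].reshape(-1, 1)
        # Numerical columns become float32 arrays; categorical ones become strings.
        feed[name] = col.astype(np.float32) if is_numeric[j] else col.astype(str)
    return feed


X = np.array([[31, "Oslo"], [47, "Rome"]], dtype=object)
# Assumed input names, chosen only for illustration.
feed = build_feed(X, is_numeric=[True, False], names=["age", "city"])
```

A dictionary of this shape, keyed by input name, is the usual form in which multi-input ONNX models are fed.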
- transform(X: ndarray[Any, dtype[ScalarType]], *args: Any, **kwargs: Any) → ndarray[Any, dtype[ScalarType]]
Equivalent of self.predict(X).
- Parameters
X (npt.NDArray) – features
- Returns
transformed features
- Return type
npt.NDArray