SimpleTabularPipeline
- class falcon.tabular.pipelines.SimpleTabularPipeline(task: str, mask: typing.List[falcon.types.ColumnTypes], learner: typing.Type[falcon.abstract.learner.Learner] = falcon.tabular.learners.super_learner.SuperLearner, learner_kwargs: typing.Optional[typing.Dict] = None, preprocessor: str = 'MultiModalEncoder')
Default tabular pipeline.
- __init__(task: str, mask: typing.List[falcon.types.ColumnTypes], learner: typing.Type[falcon.abstract.learner.Learner] = falcon.tabular.learners.super_learner.SuperLearner, learner_kwargs: typing.Optional[typing.Dict] = None, preprocessor: str = 'MultiModalEncoder')
Default tabular pipeline. At a high level, it chains a preprocessor and a model learner (SuperLearner by default). For classification tasks, the labels are also encoded as integers, and predictions are decoded back to strings. Internally, all numerical features are scaled to zero mean and unit standard deviation, and all categorical features are one-hot encoded (this approach may not be suitable for features with very high cardinality).
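The transformations described above can be illustrated with a minimal plain-Python sketch. It is not falcon's implementation: the helper names (standardize, one_hot, encode_labels, decode_labels) and the sample data are hypothetical and only show the idea of scaling, one-hot encoding, and the label round trip.

```python
from statistics import mean, pstdev


def standardize(values):
    """Scale a numeric column to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """One-hot encode a categorical column (categories sorted for a stable order)."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return rows, categories


def encode_labels(labels):
    """Map string labels to integers and return the class list for decoding."""
    classes = sorted(set(labels))
    to_int = {label: i for i, label in enumerate(classes)}
    return [to_int[label] for label in labels], classes


def decode_labels(encoded, classes):
    """Map integer predictions back to the original string labels."""
    return [classes[i] for i in encoded]


# Hypothetical toy data: one numeric column, one categorical column, string targets.
ages = [20.0, 30.0, 40.0]
colors = ["red", "blue", "red"]
labels = ["yes", "no", "yes"]

scaled_ages = standardize(ages)
encoded_colors, color_categories = one_hot(colors)
encoded_labels, classes = encode_labels(labels)
decoded_labels = decode_labels(encoded_labels, classes)
```

The one-hot output has one column per distinct category, which is why the description warns about features with very high cardinality.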
- Parameters
task (str) – tabular_classification or tabular_regression
mask (List[int]) – list of ints, one per feature column: 0 marks a numerical feature, 1 a low-cardinality categorical feature, and 2 a high-cardinality categorical feature
learner (Learner, optional) – learner class to be used, by default SuperLearner
learner_kwargs (Optional[Dict], optional) – arguments to be passed to the learner, by default None
preprocessor (str) – which preprocessor to use, one of {‘MultiModalEncoder’, ‘ScalerAndEncoder’}, by default ‘MultiModalEncoder’
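As an illustration of the mask parameter, the sketch below derives the 0/1/2 codes from raw columns. The helper build_mask and its high_cardinality_threshold argument are hypothetical and not part of falcon; the cut-off between low and high cardinality is an assumption made for the example. Note that the signature annotates mask as a list of falcon.types.ColumnTypes, so the library may expect its own type rather than bare integers.

```python
from typing import List, Sequence

# Integer codes as given in the parameter description.
NUMERICAL = 0
CATEGORICAL_LOW = 1
CATEGORICAL_HIGH = 2


def build_mask(
    columns: Sequence[Sequence[object]],
    high_cardinality_threshold: int = 10,  # assumed cut-off, not a falcon setting
) -> List[int]:
    """Return one code per column: 0 numerical, 1/2 low/high-cardinality categorical."""
    mask = []
    for column in columns:
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in column
        )
        if is_numeric:
            mask.append(NUMERICAL)
        elif len(set(column)) > high_cardinality_threshold:
            mask.append(CATEGORICAL_HIGH)
        else:
            mask.append(CATEGORICAL_LOW)
    return mask


# Hypothetical columns: a numeric feature, a two-valued category, and an ID-like field.
columns = [
    [1.5, 2.0, 3.25, 4.0],
    ["a", "b", "a", "b"],
    ["id1", "id2", "id3", "id4"],
]
mask = build_mask(columns, high_cardinality_threshold=3)
```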
- add_element(element: PipelineElement) → None
Adds an element to the pipeline. The input type of the added element should match the output type of the last element in the pipeline.
- Parameters
element (PipelineElement) – element to be added to the end of the pipeline
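The type-matching requirement can be pictured with a toy model. ToyElement and ToyPipeline below are hypothetical stand-ins, not falcon classes, and raising TypeError on a mismatch is an assumption of this sketch; the documentation only states that the types should match.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyElement:
    """Hypothetical pipeline element that declares its input and output types."""
    name: str
    input_type: type
    output_type: type


@dataclass
class ToyPipeline:
    """Hypothetical pipeline that only accepts type-compatible elements."""
    elements: List[ToyElement] = field(default_factory=list)

    def add_element(self, element: ToyElement) -> None:
        # Compare against the output type of the current last element, if any.
        if self.elements and self.elements[-1].output_type is not element.input_type:
            last = self.elements[-1]
            raise TypeError(
                f"{element.name} expects {element.input_type.__name__}, "
                f"but {last.name} produces {last.output_type.__name__}"
            )
        self.elements.append(element)


pipeline = ToyPipeline()
pipeline.add_element(ToyElement("encoder", input_type=list, output_type=tuple))
pipeline.add_element(ToyElement("learner", input_type=tuple, output_type=int))

# An element whose input type does not match the last output type is rejected.
try:
    pipeline.add_element(ToyElement("mismatched", input_type=str, output_type=str))
    mismatch_rejected = False
except TypeError:
    mismatch_rejected = True
```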
- fit(X: ndarray[Any, dtype[ScalarType]], y: ndarray[Any, dtype[ScalarType]], *args: Any, **kwargs: Any) → None
Fits the pipeline by consecutively calling the .fit_pipe() method of each element in the pipeline. For tabular classification, LabelDecoder is applied to the targets before the actual training occurs.
- Parameters
X (npt.NDArray) – train features
y (npt.NDArray) – train targets
- predict(X: ndarray[Any, dtype[ScalarType]], *args: Any, **kwargs: Any) → ndarray[Any, dtype[ScalarType]]
Predicts the labels of the passed data points.
- Parameters
X (npt.NDArray) – features
- Returns
predicted label
- Return type
npt.NDArray
- save() → ModelProto
Exports the pipeline to an ONNX ModelProto.
- Returns
Pipeline as ONNX ModelProto
- Return type
ModelProto
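Putting the methods together, a typical workflow might look like the sketch below. It uses only the constructor arguments and methods documented on this page; the wrapper function fit_predict_export and its parameter names are hypothetical, and the data is expected to be supplied by the caller as NumPy arrays. SerializeToString() is the standard protobuf serialization method available on an ONNX ModelProto.

```python
def fit_predict_export(X_train, y_train, X_test, mask, onnx_path):
    """Train a SimpleTabularPipeline, predict on X_test, and write the ONNX model.

    Hypothetical helper: assumes falcon is installed and that X_train, y_train,
    and X_test are NumPy arrays as described in the fit/predict parameters.
    """
    # Imported inside the function so the sketch can be loaded without falcon installed.
    from falcon.tabular.pipelines import SimpleTabularPipeline

    # The default learner (SuperLearner) and preprocessor ('MultiModalEncoder') are used.
    pipeline = SimpleTabularPipeline(task="tabular_classification", mask=mask)
    pipeline.fit(X_train, y_train)
    predictions = pipeline.predict(X_test)

    # save() returns an ONNX ModelProto; serialize it to bytes to store it on disk.
    model_proto = pipeline.save()
    with open(onnx_path, "wb") as f:
        f.write(model_proto.SerializeToString())

    return predictions
```

For a regression problem, the same sketch would pass task="tabular_regression" instead.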