Available Configurations

The tables below list the main and additional configurations that can be used. Additional configurations should be used with caution, as they may not be suitable for certain datasets. It is recommended to always choose one of the main configurations.

Configurations for tabular_regression/tabular_classification tasks

Name

Extension

Description

SuperLearner

Uses SuperLearner to build a stacking ensemble of base estimators.
SuperLearner combines multiple individual estimators to make predictions with greater accuracy than any of the individual estimators alone.
Additionally, it learns to weigh the predictions of each individual model, optimizing the combination to maximize performance on the given task.
SuperLearner is more suitable for smaller datasets, but the produced models tend to be relatively large.

OptunaLearner

It builds a model and optimizes its hyperparameters using the Optuna framework; HistGradientBoostingClassifier/HistGradientBoostingRegressor is used as the default model.
Since OptunaLearner focuses on fine-tuning a single model, the produced model is not very large, but the optimization procedure can take a long time.

PlainLearner

It builds a model using default hyperparameters; HistGradientBoostingClassifier/HistGradientBoostingRegressor is used as the default model.
PlainLearner is very fast, so it is a good choice for building initial baselines or automating preprocessing steps.
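
The SuperLearner description above says that it learns to weigh the predictions of each base estimator. The following is a minimal, standard-library sketch of that idea only: it grid-searches blending weights that minimize the mean squared error on hold-out data. The function names (`blend`, `mse`, `fit_blend_weights`) and the toy data are hypothetical, and this is not how SuperLearner itself is implemented.

```python
from itertools import product


def blend(predictions, weights):
    """Weighted average of per-model predictions, one value per sample."""
    return [
        sum(w * p for w, p in zip(weights, sample))
        for sample in zip(*predictions)
    ]


def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_blend_weights(predictions, y_true, steps=10):
    """Grid-search non-negative weights that sum to 1 and minimize MSE.

    Hypothetical helper: a real stacking ensemble would typically fit a
    meta-model on out-of-fold predictions instead of a coarse grid.
    """
    n_models = len(predictions)
    candidates = (
        tuple(c / steps for c in combo)
        for combo in product(range(steps + 1), repeat=n_models)
        if sum(combo) == steps
    )
    return min(candidates, key=lambda w: mse(y_true, blend(predictions, w)))


# Toy hold-out targets and two base models with opposite biases.
y_holdout = [1.0, 2.0, 3.0, 4.0]
model_a = [1.5, 2.5, 3.5, 4.5]  # overshoots every target by 0.5
model_b = [0.5, 1.5, 2.5, 3.5]  # undershoots every target by 0.5

weights = fit_blend_weights([model_a, model_b], y_holdout)
blended = blend([model_a, model_b], weights)
print(weights, mse(y_holdout, blended))
```

In this toy case the two models err in opposite directions, so the blended prediction has a lower error than either model alone, which is the effect the description attributes to the ensemble.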

Additional configurations

Name

Extension

Description

SuperLearner.mini

Uses SuperLearner with a config for small datasets.
The dataset is considered small when the number of cells after preprocessing [n_rows*n_columns] is < 80k.

SuperLearner.mid

Uses SuperLearner with a config for mid-sized datasets.
The dataset is considered mid-sized when the number of cells after preprocessing [n_rows*n_columns] is >= 80k and < 4M.

SuperLearner.large

Uses SuperLearner with a config for large datasets.
The dataset is considered large when the number of cells after preprocessing [n_rows*n_columns] is >= 4M and < 16M.

SuperLearner.xlarge

Uses SuperLearner with a config for x-large datasets.
The dataset is considered x-large when the number of cells after preprocessing [n_rows*n_columns] is >= 16M.

OptunaLearner.hgbt

It builds a HistGradientBoostingClassifier/HistGradientBoostingRegressor model with hyperparameters optimized by the Optuna framework.

PlainLearner.hgbt

It builds a HistGradientBoostingClassifier/HistGradientBoostingRegressor model with default hyperparameters.
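
The size thresholds given for the SuperLearner variants above can be expressed as a small lookup. This is a sketch under two assumptions: "kk" in the thresholds means millions, and the size classes are contiguous (each class starts where the previous one ends). The function name `pick_superlearner_config` and the constants are hypothetical and are not part of the library.

```python
# Upper bounds on the number of cells (n_rows * n_columns) after preprocessing.
# Assumption: "4kk" and "16kk" mean 4 million and 16 million.
SMALL_MAX_CELLS = 80_000
MID_MAX_CELLS = 4_000_000
LARGE_MAX_CELLS = 16_000_000


def pick_superlearner_config(n_rows: int, n_columns: int) -> str:
    """Return the SuperLearner configuration name matching the dataset size.

    Hypothetical helper: it only mirrors the documented thresholds.
    """
    cells = n_rows * n_columns
    if cells < SMALL_MAX_CELLS:
        return "SuperLearner.mini"
    if cells < MID_MAX_CELLS:
        return "SuperLearner.mid"
    if cells < LARGE_MAX_CELLS:
        return "SuperLearner.large"
    return "SuperLearner.xlarge"


# Example: 20,000 rows and 30 columns give 600,000 cells after preprocessing.
print(pick_superlearner_config(20_000, 30))
```

Note that the cell count refers to the data after preprocessing, so steps that add or remove columns can move a dataset into a different size class.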