Available Configurations

The tables below list the main and additional configurations. Use the additional configurations with caution, as they may not be suitable for every dataset. It is recommended to choose one of the main configurations.

Configurations for tabular_regression/tabular_classification tasks

Name	Extension	Description
SuperLearner	–	Uses SuperLearner to build a stacking ensemble of base estimators. The ensemble combines multiple individual estimators to make predictions that are typically more accurate than those of any single estimator. Additionally, it learns how to weigh the predictions of each individual model, optimizing the combination for performance on the given task. SuperLearner is better suited to smaller datasets, and the models it produces tend to be relatively large.
OptunaLearner	–	Uses OptunaLearner. It builds a model and optimizes its hyperparameters using the Optuna framework; HistGradientBoostingClassifier/HistGradientBoostingRegressor is the default model. Because OptunaLearner focuses on fine-tuning a single model, the produced model is relatively small, but the optimization procedure can take a long time.
PlainLearner	–	Uses PlainLearner. It builds a model using default hyperparameters; HistGradientBoostingClassifier/HistGradientBoostingRegressor is the default model. PlainLearner is very fast, which makes it a good choice for building initial baselines or automating preprocessing steps.

Additional configurations

Name	Extension	Description
SuperLearner.mini	–	Uses SuperLearner with a config for small datasets. A dataset is considered small when the number of cells after preprocessing (n_rows * n_columns) is < 80k.
SuperLearner.mid	–	Uses SuperLearner with a config for mid-sized datasets. A dataset is considered mid-sized when the number of cells after preprocessing (n_rows * n_columns) is >= 80k and < 4M.
SuperLearner.large	–	Uses SuperLearner with a config for large datasets. A dataset is considered large when the number of cells after preprocessing (n_rows * n_columns) is >= 4M and < 16M.
SuperLearner.xlarge	–	Uses SuperLearner with a config for x-large datasets. A dataset is considered x-large when the number of cells after preprocessing (n_rows * n_columns) is >= 16M.
OptunaLearner.hgbt	–	Uses OptunaLearner. It builds a HistGradientBoostingClassifier/HistGradientBoostingRegressor model with hyperparameters optimized by the Optuna framework.
PlainLearner.hgbt	–	Uses PlainLearner. It builds a HistGradientBoostingClassifier/HistGradientBoostingRegressor model with default hyperparameters.
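
The size thresholds for the SuperLearner variants can be expressed as a small helper. This is a sketch under stated assumptions: the function name is hypothetical and not part of the library, the thresholds are read as 80,000, 4,000,000, and 16,000,000 cells, and each variant is assumed to start where the previous one ends.

```python
def pick_superlearner_config(n_rows: int, n_columns: int) -> str:
    """Map a preprocessed dataset shape to a SuperLearner variant name.

    Hypothetical helper: thresholds follow the table above, with each
    upper bound treated as exclusive.
    """
    n_cells = n_rows * n_columns
    if n_cells < 80_000:
        return "SuperLearner.mini"
    if n_cells < 4_000_000:
        return "SuperLearner.mid"
    if n_cells < 16_000_000:
        return "SuperLearner.large"
    return "SuperLearner.xlarge"
```

For example, a dataset with 50,000 rows and 20 columns after preprocessing has 1,000,000 cells, so pick_superlearner_config(50_000, 20) selects the mid variant.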