Add support for HistGradientBoostingRegressor #105

samuelefiorini · 2023-04-06T11:09:01Z

I use Greykite to forecast hourly time-series with years of historical data and fit_algorithm=gradient_boosting is very slow.

According to the documentation for sklearn.ensemble.HistGradientBoostingRegressor:

This estimator is much faster than GradientBoostingRegressor for big datasets (n_samples >= 10 000).

Have you considered adding support for this estimator? It looks straightforward from here, but I may be wrong.


amyfei2015 · 2023-04-28T21:09:37Z

Thanks for the suggestion! We haven't planned for this yet, but we have made a note of it. We will update you if we implement this feature. In the meantime, please feel free to submit a pull request for this change if you need it. Thanks!

samuelefiorini · 2023-05-02T13:22:54Z

Thanks, I did some experiments (here) and I've been able to make it run (it's far from being a PR, though). In my case (hourly forecast with 2+ years of historical data), HistGradientBoostingRegressor is much faster than GradientBoostingRegressor (around 4x), with roughly the same backtest performance.

However, there are also some points of discussion. For instance, due to its implementation, HistGradientBoostingRegressor does not offer a native feature importance measure, whereas both GradientBoostingRegressor and RandomForestRegressor do.

A possible approach would be to rely on something like sklearn.inspection.permutation_importance, but this of course comes with higher computational cost, and it's probably not ideal. Otherwise a dummy empty array may be used, maybe raising some warning to inform the user.

samuelefiorini · 2024-10-10T06:57:07Z

It’s been a while, but the issue about adding feature_importance to the HistGradientBoosting* estimators is still open on scikit-learn: 15132. I’m adding this here for future reference.
