Window and stride arguments are making it harder to use the package. feature_collection.reduce example #76

arturdaraujo · 2022-09-28T20:01:44Z

First of all, this package is awesome. The community that deals with time series data needed to up its game, and tsflex has everything it takes to become the main library.

However, here are a few specific suggestions:

Remove the "windows" and "strides" arguments from feature extraction altogether:
This may seem excessive, but hear me out. They are good arguments, but they are not fundamental to feature extraction. They could be handled during data preparation instead; Alteryx has a library called "compose" (https://github.com/alteryx/compose) built just for creating multiple time-frame windows. Once the window is ready, you just select the functions. I propose that tsflex's main function (feature_collection.calculate) take only time series data and a list of functions for feature extraction, with no window or stride.

Explaining further:
The way I see it, the essential implementation would be just this: feature_collection.calculate(time_series_df, functions).
If any column of the time series had a data type other than int or float, it could simply raise an error or ignore the column.
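To make the proposal concrete, here is a minimal sketch of what such a window-free call could look like. It is written with pandas and NumPy only and is not tsflex's API: the helper name calculate_whole_series and the "column__function" output naming are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


def calculate_whole_series(time_series_df, functions):
    """Hypothetical helper (not part of tsflex): apply every function to
    every full column, with no window or stride involved."""
    features = {}
    for column in time_series_df.columns:
        series = time_series_df[column]
        # Only int/float columns are accepted; anything else raises.
        if is_bool_dtype(series) or not is_numeric_dtype(series):
            raise TypeError(
                f"Column {column!r} has unsupported dtype {series.dtype}"
            )
        for func in functions:
            # Assumed naming scheme: "<column>__<function name>".
            features[f"{column}__{func.__name__}"] = func(series.to_numpy())
    # One row, because each feature summarises the whole series.
    return pd.DataFrame([features])


df = pd.DataFrame({"Open": [1.0, 2.0, 3.0], "Close": [2.0, 4.0, 6.0]})
out = calculate_whole_series(df, [np.mean, np.std])
```

Raising on unsupported columns is only one of the two options mentioned above; ignoring them would mean replacing the raise with a continue.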

Window and stride also make the feature_collection.reduce function hard to use:
After feature selection, once I have picked a few of the many columns created by tsflex, I use reduce, which gives me the functions for transformation/extraction. The problem is that the naming convention includes the window and stride (e.g. Open__mean__w=233500_s=233500), which means I need a time series with the same characteristics/size, and that often isn't the case. I use the windows and strides arguments as follows:

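from tsflex.features import FeatureCollection, MultipleFeatureDescriptors
from tsflex.features.integrations import tsfresh_settings_wrapper

# One window spanning (almost) the whole series, so each feature is computed once.
simple_feats = MultipleFeatureDescriptors(
    functions=tsfresh_settings_wrapper(settings),
    series_names="Open",
    windows=len(stock_data) - 1,
    strides=len(stock_data) - 1,
)
feature_collection = FeatureCollection(simple_feats)
features_df = feature_collection.calculate(
    stock_full, return_df=True, show_progress=True, approve_sparsity=True
)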

I use this because I need to process the whole dataset.
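As a stopgap for the naming problem described above, the window and stride suffix can be stripped from a selected feature name so that it no longer depends on the series length. The sketch below is plain Python, and its pattern is inferred only from the single example name in this post (Open__mean__w=233500_s=233500), so other tsflex output names may not match it. The helper name split_feature_name is hypothetical.

```python
import re

# Assumed suffix layout, taken from the one example above: "__w=<window>_s=<stride>".
_WINDOW_STRIDE_SUFFIX = re.compile(r"__w=(?P<window>[^_]+)_s=(?P<stride>[^_]+)$")


def split_feature_name(name):
    """Hypothetical helper: return (base_name, window, stride).

    window and stride are None when the name carries no such suffix.
    """
    match = _WINDOW_STRIDE_SUFFIX.search(name)
    if match is None:
        return name, None, None
    return name[: match.start()], match["window"], match["stride"]


base, window, stride = split_feature_name("Open__mean__w=233500_s=233500")
```

This only renames columns; it does not change which window the reduced feature collection expects when it is applied to new data.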

Anyway, I hope this is helpful.

jvdd · 2022-09-28T20:16:48Z

Hi, thanks for creating this issue @arturdaraujo! We are always happy to hear feedback from the community 😄

I'll discuss your remarks with @jonasvdd & @emield12 (the 2 other maintainers) and will keep you posted.

Cheers, Jeroen

P.S. In PR #71 I already decoupled the window & stride from the feature descriptors 😉

jvdd · 2022-10-11T21:44:31Z

tsflex v0.3 has just been released! 🎉

arturdaraujo changed the title ~~Window and stride arguments are making it harder to use the package. feature_collection.reduce doesn't make sense~~ Window and stride arguments are making it harder to use the package. feature_collection.reduce example Sep 28, 2022

jvdd added the enhancement (New feature or request) and question (Further information is requested) labels Sep 28, 2022
