Forecasting with missing value during inference but available when training #1133
Unanswered
runyournode asked this question in Q&A
Replies: 0 comments
Hello there,
First, let me thank you for the available open-source code and the extensive documentation 🥇.
I am only discovering time-series forecasting and I think I am not starting with the easiest task 😄.
So let me lay out my problem and how I think I will handle it, along with the early concerns I have.
Any insight is welcome before I deep-dive into implementing it.
I am not asking you to do my job, but if you think my strategy is very poor with a high chance of failure, I'd better revise it 😄. Also, if this is not the place for such a discussion, please excuse me and delete my post.
Thank you for your guidance!
Context
System A records every second some IT network metrics: $(ds, feat_0, feat_1, ... feat_n, y)$. Features may be static or dynamic; $y$ is just another feature, but it is the one I am eventually interested in forecasting.
System A df would look like:
System A data:
System A irregularly streams these data to system B.
System B df would for instance look like:
System B data:
I do not know the policy that dictates whether A streams to B or not, but I suspect that the streaming state is correlated with the value of $y$.
During the training phase, I may ask for both df, so the concatenated data would be:
Available data for training
I would have no other time series, but I will have several samples of this time series recorded in different situations and at different times.
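As a rough illustration of the concatenated training data described above, here is a minimal pandas sketch. The toy values, the `streamed` column, and the variable names `df_a`, `df_b`, and `train_df` are assumptions made up for this example; only the idea of joining both df on `ds` comes from the post.

```python
import pandas as pd

# Hypothetical toy data: System A records one row per second.
df_a = pd.DataFrame({
    "ds": pd.date_range("2024-01-01 00:00:00", periods=6, freq="s"),
    "feat_0": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    "y": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
})

# Hypothetical System B: it only received rows 0, 1 and 4 (irregular streaming).
df_b = df_a.iloc[[0, 1, 4]].reset_index(drop=True)

# Left-join A with B's timestamps. `indicator=True` adds a `_merge` column
# that says whether each timestamp of A is also present in B.
merged = df_a.merge(df_b[["ds"]], on="ds", how="left", indicator=True)
merged["streamed"] = merged["_merge"] == "both"
train_df = merged.drop(columns="_merge")

print(train_df)
```

Keeping the streaming state as an explicit boolean column makes it possible to check later whether it is correlated with $y$, and to reuse it as a mask.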
Aim
My aim is to forecast (in real time) the next values of $y$ from the data available at B.
Real-time forecasting from:
My Strategy
By definition, I cannot use future exogenous features, but I have access to some past exogenous features and some past $y$ values. I expect no trend (and maybe no seasonality) in my data. I will first try neuralforecast univariate models that can handle historical exogenous features, but I will also try multivariate models.
What do you think would be the smartest way to leverage the feature / $y$ data seen during training (but unavailable when forecasting)?
I can think of:
When training:
From A, create a masked copy of it by replacing the un-streamed feature values with interpolation, or with the last streamed values for extrapolation. I now have 2 time series (A and masked A) I can train my model on.
When forecasting:
Fill B with the same inter/extra-polation strategy. Missing [...]
What concerns me:
[...] (A) and a degraded case (masked A) that would be closer to data seen during forecasting.
[...] masked A are 100% accurate but B would result from interpolation or forecasting. Do you think I can easily code and train my model by filling in masked A the [...]
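The masking step described under "When training" could be sketched as follows. This is a minimal example under stated assumptions: the function name `make_masked_copy`, the `how` parameter, the boolean `streamed` mask, and the toy values are all hypothetical. Note that linear interpolation needs the next streamed point, so only the forward fill can be reproduced as-is in real time on B.

```python
import numpy as np
import pandas as pd

def make_masked_copy(df_a, streamed, cols, how="ffill"):
    """Hide the un-streamed values of `cols` in a copy of df_a, then refill them.

    `streamed` is a boolean Series aligned with df_a (True = row reached B).
    """
    masked = df_a.copy()
    # Hide what System B would not have received.
    masked.loc[~streamed, cols] = np.nan
    if how == "ffill":
        # Causal: carry the last streamed value forward.
        masked[cols] = masked[cols].ffill()
    elif how == "interpolate":
        # Not causal: uses the next streamed value as well.
        masked[cols] = masked[cols].interpolate(method="linear")
    else:
        raise ValueError(f"unknown fill strategy: {how!r}")
    return masked

# Hypothetical toy data for System A and its streaming state.
df_a = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=6, freq="s"),
    "feat_0": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    "y": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
})
streamed = pd.Series([True, True, False, False, True, False])

masked_a = make_masked_copy(df_a, streamed, cols=["feat_0", "y"])

# Stack both series in long format, one identifier per series.
train_df = pd.concat(
    [df_a.assign(unique_id="A"), masked_a.assign(unique_id="masked_A")],
    ignore_index=True,
)
print(train_df)
```

The `unique_id` / `ds` / `y` long layout is the one neuralforecast expects as input. If the first row of a series is un-streamed, the forward fill leaves it as NaN, so that case would need separate handling.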