[QUESTION] How to get the appropriate SHAP values for a particular time of interest #2566

Open
DataScientistET opened this issue Oct 21, 2024 · 0 comments
Labels
question Further information is requested triage Issue waiting for triaging

After fitting my model, I am interested in looking at the SHAP values of a particular point of interest. I am having trouble understanding how the output from force_plot_from_ts relates to that point of interest in terms of the horizon parameter.

Assuming that I fitted my model under these conditions:

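from darts.models import LightGBMModel

model_estimator = LightGBMModel(
    lags=None,  # no target lags
    lags_past_covariates=list(range(-8, 0)),  # lags -8 .. -1
    lags_future_covariates=list(range(-22, 21)),  # lags -22 .. 20
    output_chunk_length=3,  # predict 3 steps per call
)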

I then fit the model on the appropriate time series and create the ShapExplainer object (shap_explain):
from darts.explainability import ShapExplainer

shap_explain = ShapExplainer(model_estimator)

Now I am interested in looking at the SHAP values for a particular point in time. If I pass exactly the required length of past and future covariates to the predict function, the model performs as expected and predicts the next 3 points.

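import pandas as pd
from dateutil.relativedelta import relativedelta

timestamp_of_interest = pd.Timestamp('2024-01-01 00:45:00', tz="utc").tz_convert(None)

# last observed target point: one 15-minute step before the first prediction
target_end_date = timestamp_of_interest - relativedelta(minutes=15)
# past covariates: 8 lags * 15 min = 120 min back
past_cov_start_date = timestamp_of_interest - relativedelta(minutes=120)
past_cov_end_date = timestamp_of_interest
# future covariates: lags -22 .. 20 at 15 min each = 330 min back, 300 min ahead
future_cov_start_date = timestamp_of_interest - relativedelta(minutes=330)
future_cov_end_date = timestamp_of_interest + relativedelta(minutes=300)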

test1 = local_lgbm_hf_output.model_estimator.predict(
    n = 3,
    series = local_lgbm_hf_output.hf_data_dict['target_hf'][: target_end_date],
    past_covariates = local_lgbm_hf_output.hf_data_dict['past_cov_hf'][past_cov_start_date: past_cov_end_date],
    future_covariates = local_lgbm_hf_output.hf_data_dict['future_cov_hf'][future_cov_start_date: future_cov_end_date]
)
test1.pd_dataframe()

[screenshot: output of test1.pd_dataframe()]
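
For reference, the window boundaries used above come from multiplying the smallest and largest lag by the 15-minute step. Below is a minimal sketch of that arithmetic using only the standard library. covariate_window is a hypothetical helper (not part of Darts), and it assumes that lag 0 denotes the first predicted timestamp:

```python
from datetime import datetime, timedelta


def covariate_window(first_pred, lags, step):
    """Return (start, end) timestamps spanned by `lags`.

    Assumption: lags are counted in steps of `step` relative to
    `first_pred`, the first predicted timestamp (lag 0).
    """
    return first_pred + min(lags) * step, first_pred + max(lags) * step


step = timedelta(minutes=15)  # sampling interval of the data
first_pred = datetime(2024, 1, 1, 0, 45)  # timestamp of interest

# Same lag lists as in the model definition
past_window = covariate_window(first_pred, list(range(-8, 0)), step)
future_window = covariate_window(first_pred, list(range(-22, 21)), step)

print("past covariates:  ", past_window)
print("future covariates:", future_window)
```

Under these assumptions the past covariates are only needed up to 00:30, so slicing them up to 00:45 as I did above supplies one extra step, while the future covariate window matches the -330 / +300 minute offsets exactly.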

Now I pass the exact same dataset to shap_explain.explain and shap_explain.force_plot_from_ts:

shap_results_sample = shap_explain.explain(
    foreground_series = local_lgbm_hf_output.hf_data_dict['target_hf'][: target_end_date],
    foreground_past_covariates = local_lgbm_hf_output.hf_data_dict['past_cov_hf'][past_cov_start_date: past_cov_end_date],
    foreground_future_covariates = local_lgbm_hf_output.hf_data_dict['future_cov_hf'][future_cov_start_date: future_cov_end_date],
    horizons = [1,2,3]
)
shap_results_sample.get_explanation(horizon = 1).pd_dataframe()

shap_explain.force_plot_from_ts(    
    foreground_series = local_lgbm_hf_output.hf_data_dict['target_hf'][: target_end_date],
    foreground_past_covariates = local_lgbm_hf_output.hf_data_dict['past_cov_hf'][past_cov_start_date: past_cov_end_date],
    foreground_future_covariates = local_lgbm_hf_output.hf_data_dict['future_cov_hf'][future_cov_start_date: future_cov_end_date],
    horizon = 1
)

For the get_explanation function, no matter which horizon (1, 2 or 3) I pass to it, the result only ever contains the timestamp of the first predicted point, i.e. 2024-01-01 00:45:00.

horizon = 1:
[screenshot: output of get_explanation(horizon = 1)]

horizon = 2:
[screenshot: output of get_explanation(horizon = 2)]

Also, for force_plot_from_ts, how does the horizon parameter relate to the plot being shown? If horizon is 1, does the plot show the SHAP values contributing to the first predicted point (2024-01-01 00:45:00), and would horizon 2 then correspond to the second predicted point (2024-01-01 01:00:00)?

Shouldn't both functions return the same results, just visualized differently?
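
To make the question concrete, this is the mapping I would expect if horizon h refers to the h-th forecasted step. It is only a sketch of that assumption, not a description of what ShapExplainer actually does. horizon_timestamp is a hypothetical helper, and the 15-minute step is taken from the data above:

```python
from datetime import datetime, timedelta


def horizon_timestamp(first_pred, horizon, step):
    """Timestamp of the `horizon`-th forecasted step.

    Assumption: horizon is 1-based, so horizon 1 is `first_pred` itself.
    """
    if horizon < 1:
        raise ValueError("horizon must be >= 1")
    return first_pred + (horizon - 1) * step


first_pred = datetime(2024, 1, 1, 0, 45)  # first predicted point
step = timedelta(minutes=15)  # sampling interval of the data

# output_chunk_length=3, so horizons 1..3 are the only candidates
expected = {h: horizon_timestamp(first_pred, h, step) for h in (1, 2, 3)}
for h, ts in expected.items():
    print(f"horizon {h} -> {ts}")
```

If this reading is right, I would expect the explanation for horizon = 2 to describe the prediction at 01:00:00 even though the returned series is indexed at 00:45:00.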
