Thanks a lot for the effort to release the code base. I am trying to reproduce the results from the paper, but on most of the datasets I get lower performance than what was reported. Is this a variance problem related to seed selection? Were the reported results run with a single seed?
In particular, I am having trouble reproducing SeqCombMV, where the performance is significantly lower than reported (lower even than the IG and Dynamask baselines). I get the following results when running the model on this dataset:
Results for ours explainer on seqcomb_mv with split=1
auprc = 0.2960 +- 0.0023
aup = 0.7468 +- 0.0020
aur = 0.3036 +- 0.0021
iou = 0.1143 +- 0.0013
Results for ours explainer on seqcomb_mv with split=2
auprc = 0.1231 +- 0.0039
aup = 0.0888 +- 0.0022
aur = 0.5560 +- 0.0042
iou = 0.0584 +- 0.0028
Results for ours explainer on seqcomb_mv with split=3
auprc = 0.7016 +- 0.0038
aup = 0.7407 +- 0.0015
aur = 0.4463 +- 0.0020
iou = 0.3340 +- 0.0028
Results for ours explainer on seqcomb_mv with split=4
auprc = 0.2680 +- 0.0031
aup = 0.7546 +- 0.0034
aur = 0.1154 +- 0.0023
iou = 0.1375 +- 0.0020
Results for ours explainer on seqcomb_mv with split=5
auprc = 0.0812 +- 0.0021
aup = 0.0551 +- 0.0015
aur = 0.4215 +- 0.0067
iou = 0.0384 +- 0.0022
Results for ours explainer on all splits
auprc = 0.2940 +- 0.0039
aup = 0.4772 +- 0.0055
aur = 0.3685 +- 0.0030
iou = 0.1365 +- 0.0020
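As a rough sanity check on the variance question, the per-split means in the log above can be aggregated by hand. The sketch below is not part of the repository: the dictionary and the `summarize` helper are hypothetical names, the numbers are copied from the log, and it assumes the `+-` values in the log describe spread within a split rather than across splits.

```python
from statistics import mean, stdev

# Per-split means copied from the log above (splits 1 to 5).
per_split = {
    "auprc": [0.2960, 0.1231, 0.7016, 0.2680, 0.0812],
    "aup":   [0.7468, 0.0888, 0.7407, 0.7546, 0.0551],
    "aur":   [0.3036, 0.5560, 0.4463, 0.1154, 0.4215],
    "iou":   [0.1143, 0.0584, 0.3340, 0.1375, 0.0384],
}

def summarize(values):
    """Return (mean, sample standard deviation) across splits."""
    return mean(values), stdev(values)

# Hypothetical helper output: one (mean, std) pair per metric.
summary = {name: summarize(vals) for name, vals in per_split.items()}

for name, (m, s) in summary.items():
    print(f"{name}: mean = {m:.4f}, std across splits = {s:.4f}")
```

The means match the "all splits" block to within rounding, while the spread across splits (on the order of 0.2 for AUPRC) is far larger than the `+-` values reported within each split. That suggests the split-to-split differences dominate, though it does not by itself say whether seed selection is the cause.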
And this is what was reported in the paper:
I double-checked the hyperparameters as well. Is it possible that there is a problem with the generated data, or an error in one of the hyperparameters?
Thanks a lot for your help in advance!