Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new CaloCluster event class #1178

Merged
merged 1 commit into from
Aug 17, 2023

Conversation

therwig
Copy link
Contributor

@therwig therwig commented Jul 18, 2023

I'm adding a new CaloCluster class that EcalCluster and HcalCluster can inherit from, in order to reduce code duplication and increase usability of existing tools aiming at one class or the other.

What are the issues that this addresses?

This is the first step needed to resolve #1177

Check List

  • I successfully compiled ldmx-sw with my developments
  • I ran my developments and the following shows that they are successful.

I re-ran clustering locally with an EcalCluster prototype based on this base class. Results were unchanged wrt the previous implementation.

I leave updates to the submodules for future work/reviews, and will follow that progress in the issue reference above.

@therwig therwig linked an issue Jul 18, 2023 that may be closed by this pull request
3 tasks
@therwig
Copy link
Contributor Author

therwig commented Jul 18, 2023

FWIW even though many tests "failed" this appears to be due to the KS metric being too tight. Here's one random failing histogram from the PN sample, for example. (I guess since the events are generated with a different random seed they aren't identical?) My changes shouldn't affect any of these plots, so I think the failures are unrelated to the PR.

EcalShowerFeatures_EcalShowerFeatures_deepest_layer_hit.pdf

@EinarElen
Copy link
Contributor

EinarElen commented Jul 18, 2023

I can't comment directly on the failing ecal plot, but the seed should be identical for the two tests. But yeah it seems strange that something like the hcal sim energy would change otherwise

@therwig
Copy link
Contributor Author

therwig commented Jul 18, 2023

@EinarElen ok interesting. Does this pipeline test usually work? The PR literally just adds a new (unrelated) event class so I really don't think the failing plots can be related to the changes here.

@EinarElen
Copy link
Contributor

Yeah usually it works well. But yeah the failure here can't be from the changes here (unless there's something really obscure going on), so I think there's something happening on the CI side. Maybe the most recent gold histograms were mixed up with an earlier version? A very quick check for that would be to just run a small test manually once with current trunk and once with the branch.

If nobody has figured it out by tomorrow, I'll give that a try :)

@therwig
Copy link
Contributor Author

therwig commented Jul 18, 2023

Hm yeah I don't see the same pipelines run for the past 2 PRs (but maybe I'm not looking in the right place?) and the last commit from @tomeichlersmith updated the reference histograms so my bet is some kind of CI issue.

@tomeichlersmith
Copy link
Member

(I guess since the events are generated with a different random seed they aren't identical?)

The validation samples should be running with the same random seed and therefore should be generating identical events.1
The reason the KS check is so tight is because these events should be identical, but we can look into loosening that cut if some other bug isn't discovered.

Footnotes

  1. there do exist some breaks in this. The TS subsystem manages its own random seed in some processors. Also the ECal MIP tracking SVD algorithm depends on some randomness.

@tomeichlersmith
Copy link
Member

I was able to confirm that v3.2.6-2-5e05b82 (trunk) is different from the gold histograms in the inclusive validation sample. This is telling me that something fishy happened with the golden histogram files as expected. Will check against v3.2.6 to make sure.

Set Up

The git describe --tags is to make sure that I'm only 2 commits ahead of v3.2.6.

. scripts/ldmx-env.sh
ldmx pull dev latest
git describe --tags
ldmx clean src
ldmx cmake -B build -S .
ldmx cmake --build build --target install -- -j4
# set container env variables to match CI
ldmx setenv LDMX_RUN_NUMBER=1
ldmx setenv LDMX_NUM_EVENTS=10000

Running

Going to do inclusive since its the shortest.

cd .github/validation_samples/inclusive
ldmx fire config.py
ldmx python3 ${LDMX_BASE}/ldmx-sw/.github/actions/validate/compare.py \
  gold.root v3.2.6 \
  hist.root $(git describe --tags)

@tomeichlersmith
Copy link
Member

It does appear to be an issue with how the golden histograms were generated. They do not match the manually generated histograms for the same tag.
plots.tar.gz

@tomeichlersmith
Copy link
Member

The run which produced the golden histograms appears to have cloned the correct commit of ldmx-sw (deduced from looking at the logs).

https://github.com/LDMX-Software/ldmx-sw/actions/runs/5349505801

I fear that this is revealing an actual code bug but I have no idea what it could be...

@tomeichlersmith
Copy link
Member

Alright, I found a difference between the environment that produced the histograms and the environment producing them now.

The run producing the histograms is using ldmx/dev:v3.5.0. I was able to look this up by copying out the digest from the run logs

Digest: sha256:1bf3f3aa946eb91f3b1bf118ee7f5e74a61e64b59b32b11eb1878cd026ec68ab

and then looking at docker image ls --digests ldmx/dev on my computer to see which tag had this digest.

I am going to run the inclusive sample on trunk but in ldmx/dev:v3.5.0 and if they pass the cuts then I will make a new patch release to recreate the histograms with ldmx/dev:v4.0.0.

I forget where @EinarElen pointed this out, but I vaguely remember you mentioning that changing the compiler version may affect how the RNG behaves which seems to be the case.

@EinarElen
Copy link
Contributor

Yep, that can happen :) Great for reproducibility

@tomeichlersmith tomeichlersmith marked this pull request as draft July 19, 2023 16:06
@tomeichlersmith
Copy link
Member

Rerunning trunk with ldmx/dev:v3.5.0 did match the golden histograms, so I made a new patch release and we can mark this as "Ready for Review" after those new histograms are generated and merged so that these tests can pass.

@tomeichlersmith tomeichlersmith marked this pull request as ready for review July 19, 2023 18:38
@tomeichlersmith tomeichlersmith merged commit 8ff7c95 into trunk Aug 17, 2023
7 of 11 checks passed
@tomeichlersmith tomeichlersmith deleted the 1177-common-calorimeter-cluster-class branch August 17, 2023 17:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Common calorimeter cluster class
3 participants