Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TICLv5a : Improve skeleton linking and parameters tuning #46539

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

waredjeb
Copy link
Contributor

@waredjeb waredjeb commented Oct 29, 2024

This PR improves the current Skeleton Linking

  • New Parameters: tuned parameters for hadronic reconstruction.

  • Code refactoring: simply code by refactoring pieces

  • Modified TICLGraph: modified logic for traversing the graph and build components

  • Small changes to TICLDumper: allow to run TICLDumper with TICLv4. It will consume SuperClusters (new with TICLv5) only in TICLv5

New parameters have been included also at HLT

The performance against TICLv4 can be found in the DPS approval talk https://indico.cern.ch/event/1462666/contributions/6158255/attachments/2940573/5170734/TICLv5_DPS_version3.pdf

Should be tested with a .203 workflow

E.g.

test parameters:

    - workflow_opts= -w upgrade
    - workflow = 29896.203, 29691.203, 29691.204
    - enable = profiling, hlt_p2_timing

Edit 06/11/2024:

  • hltTiclTracksterLinks_cfi and hltTiclTracksterLinksUnseeded_cfi, Both of them were Unseeded with no differences between them, for consistency I removed hltTiclTracksterLinksUnseeded_cfi fixing the Egamma sequence accordingly
  • I 've added the L1Seeded module and sequence for the Recovery Tracksters, it is not used but I added it for consistency with hltTiclTracksterLinksL1Seeded_cfi

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 29, 2024

cms-bot internal usage

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @waredjeb for master.

It involves the following packages:

  • HLTrigger/Configuration (hlt)
  • RecoHGCal/TICL (upgrade, reconstruction)

@Martin-Grunewald, @Moanwar, @cmsbuild, @jfernan2, @mandrenguyen, @mmusich, @srimanob, @subirsarkar can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @SohamBhattacharya, @VourMa, @apsallid, @felicepantaleo, @forthommel, @hatakeyamak, @lecriste, @missirol, @mmusich, @rovere, @sameasy, @silviodonato, @sobhatta, @youyingli this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@felicepantaleo
Copy link
Contributor

test parameters:

  • workflow_opts= -w upgrade
  • workflow = 29896.203
  • enable = profiling, hlt_p2_timing

@felicepantaleo
Copy link
Contributor

@cmsbuild please test

@waredjeb
Copy link
Contributor Author

Hi,
I think the tests got stuck

@felicepantaleo
Copy link
Contributor

@smuzaffar do you have any idea why profiling tests are stuck?

@smuzaffar
Copy link
Contributor

@felicepantaleo , the node which runs the profiling tests is busy running IB profiling jobs. Let me check if I can improve the situation

@smuzaffar
Copy link
Contributor

please test

I can not find any trace of previse profiling jobs , so lets restart the tests

@cmsbuild
Copy link
Contributor

-1

Failed Tests: RelVals-INPUT
Size: This PR adds an extra 12KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d4760a/42504/summary.html
COMMIT: bdcc39c
CMSSW: CMSSW_14_2_X_2024-10-30-2300/el8_amd64_gcc12
Additional Tests: PROFILING,HLT_P2_TIMING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/46539/42504/install.sh to create a dev area with all the needed externals and cmssw changes.

  • DAS Queries: The DAS query tests failed, see the summary page for details.

RelVals-INPUT

  • 2024.0000012024.000001_RunJetMET02024D_10k/step1_dasquery.log
  • 2024.0010012024.001001_RunZeroBias2024D_10k/step1_dasquery.log
  • 2024.1000012024.100001_RunJetMET02024C_10k/step1_dasquery.log
Expand to see more relval errors ...
  • 2024.101001
  • 2024.000001
  • 2024.001001
  • 2024.100001
  • 2024.101001

Comparison Summary

Summary:

  • You potentially removed 3 lines from the logs
  • Reco comparison results: 1056 differences found in the comparisons
  • DQMHistoTests: Total files compared: 47
  • DQMHistoTests: Total histograms compared: 3668988
  • DQMHistoTests: Total failures: 789
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3668178
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 46 files compared)
  • DQMHistoSizes: changed ( 29896.203 ): -0.004 KiB MessageLogger/Warnings
  • Checked 205 log files, 175 edm output root files, 47 DQM output files
  • TriggerResults: no differences found

@jfernan2
Copy link
Contributor

@waredjeb
Copy link
Contributor Author

waredjeb commented Nov 1, 2024

ticlTrackstersMerge is doubling the time, do we understand why? Thanks

https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d4760a/42504/profiling/29834.21/diff-step3_cpu.resources.json.html

Hi @jfernan2 . I'll be honest, I am quite confused, I am not touching the TracksterMergeProducer with this PR (as you can also see from the changes). ticlTrackstersMerge is part of TICLv4, in this PR I am only touching TICLv5 (workflows .203), so I am not expecting any changes on all the other workflows.
Let me investigate

@felicepantaleo
Copy link
Contributor

I tried running wf 29834.21 (TICLv4) with CMSSW_14_2_X_2024-10-31-2300 and CMSSW_14_2_X_2024-10-31-2300+PR46539, then dumped the step3 in the two configuration and the only difference I see is in process.ticlTracksterLinks (TICLv5, not executed in this wf):

79481c79481,79482
<         cylinder_radius_sqr = cms.vdouble(9, 9),
---
>         cylinder_radius_sqr = cms.vdouble(9, 15),
>         deltaRxy = cms.double(4.0),
79483,79484c79484,79486
<         max_distance_projective_sqr = cms.vdouble(60, 60),
<         max_distance_projective_sqr_closest_points = cms.vdouble(60, 60),
---
>         lower_boundary = cms.vdouble(20, 10),
>         lower_distance_projective_sqr = cms.vdouble(30, 60),
>         lower_distance_projective_sqr_closest_points = cms.vdouble(10, 50),
79486,79488c79488,79490
<         min_distance_z = cms.vdouble(30, 30),
<         min_num_lcs = cms.uint32(7),
<         min_trackster_energy = cms.double(10),
---
>         min_distance_z = cms.vdouble(35, 35),
>         min_num_lcs = cms.uint32(15),
>         min_trackster_energy = cms.double(20),
79492c79494,79497
<         wind = cms.double(0.036)
---
>         upper_boundary = cms.vdouble(150, 100),
>         upper_distance_projective_sqr = cms.vdouble(30, 60),
>         upper_distance_projective_sqr_closest_points = cms.vdouble(5, 30),
>         wind = cms.double(0.072)
79519,79520c79524,79525
<         onnxEnergyModelPath = cms.FileInPath('RecoHGCal/TICL/data/ticlv5/onnx_models/linking/energy_v0.onnx'),
<         onnxPIDModelPath = cms.FileInPath('RecoHGCal/TICL/data/ticlv5/onnx_models/linking/id_v0.onnx'),
---
>         onnxEnergyModelPath = cms.FileInPath('RecoHGCal/TICL/data/ticlv5/onnx_models/patternrecognition/energy_v0.onnx'),
>         onnxPIDModelPath = cms.FileInPath('RecoHGCal/TICL/data/ticlv5/onnx_models/patternrecognition/id_v0.onnx'),

@mmusich
Copy link
Contributor

mmusich commented Nov 3, 2024

@cmsbuild, please test

@mmusich
Copy link
Contributor

mmusich commented Nov 12, 2024

@cmsbuild, please test

@cmsbuild
Copy link
Contributor

-1

Failed Tests: UnitTests
Size: This PR adds an extra 12KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d4760a/42775/summary.html
COMMIT: 75b3afc
CMSSW: CMSSW_14_2_X_2024-11-12-1100/el8_amd64_gcc12
Additional Tests: HLT_P2_INTEGRATION,HLT_P2_TIMING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/46539/42775/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 1 errors in the following unit tests:

---> test TestDQMGUIUpload had ERRORS

Comparison Summary

There are some workflows for which there are errors in the baseline:
2024.101001 step 1
2024.202001 step 1
2024.303001 step 1
The results for the comparisons for these workflows could be incomplete
This means most likely that the IB is having errors in the relvals.The error does NOT come from this pull request

Summary:

  • You potentially removed 4 lines from the logs
  • Reco comparison results: 1776 differences found in the comparisons
  • DQMHistoTests: Total files compared: 46
  • DQMHistoTests: Total histograms compared: 3369424
  • DQMHistoTests: Total failures: 2761
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3366642
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 45 files compared)
  • DQMHistoSizes: changed ( 29896.203 ): -0.004 KiB MessageLogger/Warnings
  • Checked 204 log files, 172 edm output root files, 46 DQM output files
  • TriggerResults: no differences found

@mmusich
Copy link
Contributor

mmusich commented Nov 13, 2024

I found 1 errors in the following unit tests:

errors are unrelated, see #46682

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

Pull request #46539 was updated. @Martin-Grunewald, @Moanwar, @cmsbuild, @jfernan2, @mandrenguyen, @mmusich, @srimanob, @subirsarkar can you please check and sign again.

@waredjeb waredjeb changed the title TICLv5.1 : Improve skeleton linking and parameters tuning TICLv5a : Improve skeleton linking and parameters tuning Nov 13, 2024
@mmusich
Copy link
Contributor

mmusich commented Nov 13, 2024

@cmsbuild, please test

@cmsbuild
Copy link
Contributor

-1

Failed Tests: UnitTests
Size: This PR adds an extra 40KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d4760a/42800/summary.html
COMMIT: d112936
CMSSW: CMSSW_14_2_X_2024-11-12-2300/el8_amd64_gcc12
Additional Tests: HLT_P2_INTEGRATION,HLT_P2_TIMING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/46539/42800/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 1 errors in the following unit tests:

---> test TestDQMGUIUpload had ERRORS

Comparison Summary

Summary:

@mmusich
Copy link
Contributor

mmusich commented Nov 15, 2024

@cmsbuild, please test

  • let's try again

@cmsbuild
Copy link
Contributor

+1

Size: This PR adds an extra 12KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d4760a/42875/summary.html
COMMIT: d112936
CMSSW: CMSSW_14_2_X_2024-11-14-2300/el8_amd64_gcc12
Additional Tests: HLT_P2_INTEGRATION,HLT_P2_TIMING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/46539/42875/install.sh to create a dev area with all the needed externals and cmssw changes.

  • DAS Queries: The DAS query tests failed, see the summary page for details.

  • HLT P2 Timing: chart

Comparison Summary

Summary:

  • You potentially added 4 lines to the logs
  • Reco comparison results: 1779 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3642837
  • DQMHistoTests: Total failures: 2729
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3640087
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 29896.203 ): 0.004 KiB MessageLogger/Warnings
  • Checked 216 log files, 184 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants