Update processor to run on NanoAOD with globalParT stage2 (#214)
* review xsecs with xsdb

* add v12v2 with glopart

* use xsdb number for last vbfhh4b sample

* fix typo in vbf sample for xsec

* add new nano version

* delete unused trigger weights

* reorder added fatjet variables and rename parT

* enable bdt inference, reduce jecs, add part variables, uniformize presel and signal cuts, lower mreg cut

* style: pre-commit fixes

* formatting

* style: pre-commit fixes

* fix format and remove duplicate xsec

* style: pre-commit fixes

* fix format

* add submit config and get rid of unused option

* gen match ak8 jets ordered by pt

* 2nd try submission

* style: pre-commit fixes

* fix str

* style: pre-commit fixes

* add part roc curve validation

* add pnet xbb option

* style: pre-commit fixes

* fix

* style: pre-commit fixes

* format

* style: pre-commit fixes

* format

* push latest files

* style: pre-commit fixes

* fix submit

* add matching for vv and vjets

* style: pre-commit fixes

* change pnet txbb option

* add assertion in genselection and switch pnet txbb str

* style: pre-commit fixes

* few more fixes

* add options for loading with different txbb versions

* move to run2 folder

* add bdt training config

* style: pre-commit fixes

* remove legacy options

* style: pre-commit fixes

* add bdt training command

* style: pre-commit fixes

* add all years

* style: pre-commit fixes

* fix logging

* temp fix to submit w/o trigger selection

* add logging

* style: pre-commit fixes

* sync ROC curve

* style: pre-commit fixes

* BDT trained and minor modifications to plotting (#216)

* Added notebook to retrieve eventlist

* style: pre-commit fixes

* better merge

* style: pre-commit fixes

* update

* style: pre-commit fixes

* saving loaded events in root, wip

* NB loads file, selects desired columns from hh4b, saves to root

* root files saved into a separate folder

* started making eventlist.py, made note of potential bug in EventList.ipynb

* minor touch ups

* make eventlist.py

* added README.md

* added README.md

* tested eventlist script, ready for merge

* training BDT with v12 events

* PNetPlots wip

* wip

* plotting ROC for PNET wip

* wip

* added matching masks

* added different way of handling signal exclusive columns

* wip

* fixed masks, wip

* added Legacy, values off, needs fix

* wip

* wip

* added 24Sep26 BDT Training config

* modified TrainBDT.py and postprocessing.py to fit retraining

* wip

* test BDT trained on partial data

* trained 24Sep27_v5_GloParTv2

* removed duplicate line

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Javier Duarte <[email protected]>

* revert trigger selection in place

* comparison script

* style: pre-commit fixes

* working version of ValidateBDT.py, next: evaluate with scale&smear (#217)

* Added notebook to retrieve eventlist

* style: pre-commit fixes

* better merge

* style: pre-commit fixes

* update

* style: pre-commit fixes

* saving loaded events in root, wip

* NB loads file, selects desired columns from hh4b, saves to root

* root files saved into a separate folder

* started making eventlist.py, made note of potential bug in EventList.ipynb

* minor touch ups

* make eventlist.py

* added README.md

* added README.md

* tested eventlist script, ready for merge

* training BDT with v12 events

* PNetPlots wip

* wip

* plotting ROC for PNET wip

* wip

* added matching masks

* added different way of handling signal exclusive columns

* wip

* fixed masks, wip

* added Legacy, values off, needs fix

* wip

* wip

* added 24Sep26 BDT Training config

* modified TrainBDT.py and postprocessing.py to fit retraining

* wip

* test BDT trained on partial data

* trained 24Sep27_v5_GloParTv2

* removed duplicate line

* fixed bugs and working on plotting function

* working version of ValidateBDT.py, next: evaluate with scale&smear

* fixed plotting of vertical lines for thresholds

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Javier Duarte <[email protected]>

* edit requirements

* edit legend

* style: pre-commit fixes

* performed validation on latest three BDTs, cleaned up code (#219)

* Added notebook to retrieve eventlist

* style: pre-commit fixes

* better merge

* style: pre-commit fixes

* update

* style: pre-commit fixes

* saving loaded events in root, wip

* NB loads file, selects desired columns from hh4b, saves to root

* root files saved into a separate folder

* started making eventlist.py, made note of potential bug in EventList.ipynb

* minor touch ups

* make eventlist.py

* added README.md

* added README.md

* tested eventlist script, ready for merge

* training BDT with v12 events

* PNetPlots wip

* wip

* plotting ROC for PNET wip

* wip

* added matching masks

* added different way of handling signal exclusive columns

* wip

* fixed masks, wip

* added Legacy, values off, needs fix

* wip

* wip

* added 24Sep26 BDT Training config

* modified TrainBDT.py and postprocessing.py to fit retraining

* wip

* test BDT trained on partial data

* trained 24Sep27_v5_GloParTv2

* removed duplicate line

* fixed bugs and working on plotting function

* working version of ValidateBDT.py, next: evaluate with scale&smear

* fixed plotting of vertical lines for thresholds

* wip

* performed validation on three most recent BDTs, code cleaned up

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Javier Duarte <[email protected]>
Co-authored-by: Cristina Mantilla Suarez <[email protected]>

* modify jms jmr to default to 10%

* style: pre-commit fixes

* remove year

* restructure yaml files

* revert jmsr for semilep-tt and implement in ttskimmer instead

* fix a couple of typos in ttskimmer

* rhalphalib update

* pre-commit

* run3 ttskimmer

* remove unused option

* style: pre-commit fixes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: dprim7 <[email protected]>
Co-authored-by: Javier Duarte <[email protected]>
4 people authored Oct 15, 2024
1 parent 8ac20c1 commit 5491a74
Showing 141 changed files with 96,843 additions and 3,237 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -178,7 +178,7 @@ src/HH4b/postprocessing/**/*.pdf
 src/HH4b/boosted/**/*.png
 src/HH4b/boosted/**/*.pdf
 src/HH4b/boosted/**/*roc_dict.pkl
-
+src/HH4b/boosted/**/*.pkl

 running_jobs.txt

40 changes: 5 additions & 35 deletions README.md
@@ -81,6 +81,8 @@ pip install -e .
 # for committing to the repository
 pip install pre-commit
 pre-commit install
+# install requirements
+pip3 install -r requirements.txt
 ```

 ### Troubleshooting
@@ -101,18 +103,10 @@ python3 -m pip install -e .
 For submitting to condor, all you need is python >= 3.7.

 For running locally, follow the same virtual environment setup instructions
-above and install `coffea`
+above

 ```bash
 micromamba activate hh4b
-pip install coffea
-```
-
-Clone the repository:
-
-```
-git clone https://github.com/LPC-HH/HH4b/
-pip install -e .
 ```

 ### Running locally
@@ -192,31 +186,6 @@ python -u -W ignore src/run.py --year 2022EE --yaml src/condor/submit_configs/sk

 ## Postprocessing

-### Setup
-
-Make sure to install the package (#installing-package) and install all the
-requirements in your conda environment:
-
-```bash
-pip3 install -r requirements.txt
-```
-
-### BDT Training
-
-Multi-class BDT training:
-
-```bash
-python -W ignore TrainBDT.py --data-path /ceph/cms/store/user/rkansal/bbbb/skimmer/24Apr19LegacyFixes_v12_private_signal/ --model-name 24Apr21_legacy_vbf_vars --legacy --sig-keys hh4b vbfhh4b-k2v0 --no-pnet-plots
-```
-
-### Creating templates / FOM Scan / BDT ROC curve
-
-From inside the src/HH4b/postprocessing directory:
-
-```bash
-python PostProcess.py --templates-tag 24Apr17pT300Cut --tag 24Mar31_v12_signal --legacy --mass H2PNetMass --bdt-model 24Apr21_legacy_vbf_vars --bdt-config 24Apr21_legacy_vbf_vars --txbb-wps 0.99 0.94 --bdt-wps 0.94 0.68 0.03 (--no-fom-scan) (--no-fom-scan-bin1) (--no-fom-scan-bin2) (--no-fom-scan-vbf) (--no-templates) (--bdt-roc)
-```
-
 ## Condor Scripts

 ### Check jobs
@@ -241,6 +210,7 @@ Combine all output pickles into one:
 for year in 2016APV 2016 2017 2018; do python src/condor/combine_pickles.py --tag $TAG --processor trigger --r --year $year; done
 ```

+
 ## Combine

 ### CMSSW + Combine Quickstart
@@ -269,7 +239,7 @@ and this repo:

 ```bash
 # rhalphalib
-git clone https://github.com/rkansal47/rhalphalib
+git clone https://github.com/nsmith-/rhalphalib
 cd rhalphalib
 pip3 install -e . --user # editable installation
 cd ..
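
The README's "Combine all output pickles into one" loop calls `combine_pickles.py` once per year. The same calls could be assembled from Python; the following is a minimal sketch, assuming it is run from the repository root. `build_combine_commands` and the `MY_TAG` placeholder are hypothetical names introduced here for illustration, and only the flags that appear in the README's bash loop are used.

```python
import shlex
import sys

# Years taken from the README's bash loop.
YEARS = ["2016APV", "2016", "2017", "2018"]


def build_combine_commands(tag, processor="trigger", years=YEARS):
    """Return one combine_pickles.py argument list per year (hypothetical helper)."""
    return [
        [
            sys.executable,
            "src/condor/combine_pickles.py",
            "--tag", tag,
            "--processor", processor,
            "--r",
            "--year", year,
        ]
        for year in years
    ]


if __name__ == "__main__":
    # Only print the commands here; to execute them, pass each list to
    # subprocess.run(cmd, check=True) from the repository root.
    for cmd in build_combine_commands("MY_TAG"):  # "MY_TAG" is a placeholder
        print(" ".join(shlex.quote(part) for part in cmd))
```

Building argument lists rather than a shell string avoids quoting problems if a tag ever contains spaces or shell metacharacters.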
485 changes: 483 additions & 2 deletions data/make_filelists.py

Large diffs are not rendered by default.

91,458 changes: 91,458 additions & 0 deletions data/nanoindex_v12v2_private.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions requirements.txt
@@ -12,6 +12,7 @@ pandas==2.0.3
 pyarrow==12.0.1
 PyYAML==6.0.1
 scikit-learn
+setuptools<71
 tabulate
 tqdm==4.65.0
 uproot==4.3.7
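
The new `setuptools<71` pin can be checked against the active environment before running anything that depends on it. This is a minimal sketch, assuming Python 3.8+ for `importlib.metadata`; `below_major` is a hypothetical helper that compares only the major version component, which is sufficient for a bound of the form `<71` but is not a general version-specifier parser.

```python
from importlib.metadata import PackageNotFoundError, version


def below_major(version_string, major_bound):
    """Return True if the major component of version_string is < major_bound."""
    major = int(version_string.split(".")[0])
    return major < major_bound


if __name__ == "__main__":
    try:
        installed = version("setuptools")
    except PackageNotFoundError:
        print("setuptools is not installed in this environment")
    else:
        status = "satisfies" if below_major(installed, 71) else "violates"
        print(f"setuptools {installed} {status} the <71 pin")
```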