Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hpstr cannot process more than one input file. #107

Open
mholtrop opened this issue Jun 15, 2020 · 8 comments
Open

hpstr cannot process more than one input file. #107

mholtrop opened this issue Jun 15, 2020 · 8 comments
Assignees

Comments

@mholtrop
Copy link
Collaborator

There is an issue with the TH1D histogram that counts events, which is destroyed when the first file is processed.

  • The event count should probably be stored in an TParameter and not a histogram
  • EventFile is created for each input file, but never deleted.
  • EventProcessor creates new EventHeader, TSData and VTPData, but never cleans them up.
@mholtrop mholtrop self-assigned this Jun 15, 2020
@mholtrop
Copy link
Collaborator Author

  • The EventFile class should not also open a TFile and do output stuff. Having this there really complicates the code, while opening a TFile is just a one liner. It makes EventFile way more complicated than it needs to be if you want to open LCIO files and NOT create a root output file, or if you want to process multiple LCIO files into a single ROOT file.
  • There is really no benefit to EventFile managing the output.
  • I do not see a reason for the "IEventFile" abstract class, but I will leave this as is.

@omar-moreno
Copy link
Collaborator

omar-moreno commented Jun 15, 2020

This isn't an issue with EventFile. It's an issue with how EventFile is used within process. Before making major changes here, please propose a solution at the software meeting. This could potentially end up breaking how the framework completely works so I want to be careful.

FYI, we have multiple file IO working on LDMX using the same framework i.e. EventFile. I can just port this code over.

@mholtrop
Copy link
Collaborator Author

It IS an issue with EventFile, at least how it is currently implemented in hpstr. Yes, of course we can modify EventFile to do whatever we want, that is not the point.

This is not the LDMX framework. I am sure you had a good reason for the more complicated EventFile that you have there, but I fail to see how this would help the hpstr code, if all we really need to do is open an LCIO file and read events from it.

Can you please explain the purpose of using EventFile in this way and how that would benefit hpstr, since I really see no reason why this needs to become some complicated class?

@cbravo135
Copy link
Collaborator

The intent is to run hpstr once per file and hadd after if you want to concatenate multiple files. I would categorize this as a fancy feature that would be nice, but is not necessary. If we have something we absolutely need to do this for, I will make it work, but I don't really think this is really a necessary feature.

@mholtrop
Copy link
Collaborator Author

It is nearly trivial to fix.
As far as I know, using hadd does not work on TTree objects. Also, for histograms, it is a pretty inefficient way to go if you just want to run over a few small root files to get a histogram.

@cbravo135
Copy link
Collaborator

hadd does work on TTree objects, I have been hadding them for probably nearly 10 years, as long as they have the same structure. hadding histograms takes seconds, not such a huge inefficiency that anyone notices. The only hurdle is typing one more command...

@cbravo135
Copy link
Collaborator

In the end, running on multiple files in parallel and then hadding the histograms output is FASTER than filling the histos serially. Parallel processing is tough to beat.

@cbravo135
Copy link
Collaborator

The only "inefficiency" is in the number of command line commands that need to be issued.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants