Fix file-read performance with parallel call #713
I started looking into this and have a possibly too invasive fix for it. I'm looking into some other less complex modifications that keep most of the speedup without the additional complexity: https://github.com/JoshCu/t-route/tree/optimise_parquet

The bottleneck

Currently there are number_of_catchments * number_of_timesteps file I/O operations, so ~51k file writes for the NGIAB demo data of ~220 catchments over 240 timesteps.

The required functionality

When building the qlat array before the routing calculations are run, the output nexus files from ngen are one file per catchment containing timestamped data. They need to be pivoted to one file per timestep containing the data for each catchment at that timestep.
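As a rough illustration of that pivot, here is a minimal pandas sketch that reads all per-catchment nexus outputs once and reshapes them into a single timestep-indexed table. The file glob, the headerless CSV layout, and the column names are assumptions for the example, not t-route's actual code.

```python
# Hypothetical sketch: pivot per-catchment nexus output (one timestamped
# series per file) into one wide table indexed by timestep, so the qlat
# array can be built without one I/O operation per catchment per timestep.
from pathlib import Path
import pandas as pd

def build_qlat_table(nexus_dir: str) -> pd.DataFrame:
    frames = []
    for f in sorted(Path(nexus_dir).glob("nex-*_output.csv")):
        catchment_id = f.stem.split("_")[0]  # e.g. "nex-1001" (assumed naming)
        df = pd.read_csv(f, names=["idx", "timestamp", "q_lateral"])
        df["catchment"] = catchment_id
        frames.append(df)
    combined = pd.concat(frames, ignore_index=True)
    # One row per timestamp, one column per catchment: a single in-memory
    # pivot replaces the per-timestep file writes described above.
    return combined.pivot(index="timestamp", columns="catchment",
                          values="q_lateral")
```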
Considerations for the fix

Room for optimization
The fix I have addresses both of those issues, but the former uses Dask and the latter adds a lot of extra code to handle grouped-up parquet files. I'm going to try out a lighter version of the change to get some performance improvements quickly without getting into the parquet-file weeds.

Parquet files

From some cursory research, it looks like optimal parquet files are anywhere from ~200 MB to 1 GB.
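For reference, a minimal sketch of the grouping idea (not the linked branch itself): read the many small per-catchment parquet files with Dask and rewrite them as a few larger files in that size range. The paths and the 256 MB target are illustrative assumptions.

```python
import dask.dataframe as dd

# Read all of the small per-catchment parquet files lazily.
ddf = dd.read_parquet("nexus_output/*.parquet")

# Repartition toward the ~200 MB-1 GB sweet spot, then write back out
# as a handful of larger parquet files instead of thousands of tiny ones.
ddf = ddf.repartition(partition_size="256MB")
ddf.to_parquet("nexus_output_grouped/", write_index=False)
```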
Running T-Route from Ngen inputs takes far too long due to the back-and-forth reading and writing of qlateral inputs.
There are reasonably sane explanations for how things got that way.
This line seems like a ripe option for improvement with a relatively focused fix:
https://github.com/NOAA-OWP/t-route/blame/07d511bb4f93aed23b9031d0b464b17bdf8f2060/src/troute-network/troute/AbstractNetwork.py#L870C9-L870C9
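A hedged sketch of the "parallel call" idea from the issue title: fan the per-catchment file reads out over a thread pool instead of reading them serially at the referenced line. The glob pattern, file format, and helper names here are hypothetical stand-ins, not the actual AbstractNetwork.py code.

```python
# Sketch: parallelize the many small, I/O-bound nexus-file reads.
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import pandas as pd

def read_nexus_file(path: Path) -> pd.DataFrame:
    # Assumed headerless CSV layout; adjust to the real nexus output format.
    df = pd.read_csv(path, names=["idx", "timestamp", "q_lateral"])
    df["catchment"] = path.stem.split("_")[0]
    return df

def read_all_nexus_files(nexus_dir: str, max_workers: int = 8) -> pd.DataFrame:
    paths = sorted(Path(nexus_dir).glob("nex-*_output.csv"))
    # The reads are I/O bound, so threads help despite the GIL.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        frames = list(pool.map(read_nexus_file, paths))
    return pd.concat(frames, ignore_index=True)
```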
PR forthcoming.