Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates for sharrow compatibility #2

Merged

Conversation

jpn--
Copy link
Collaborator

@jpn-- jpn-- commented Aug 5, 2024

First: when using non-sharrow, the trip_period values for the trip table are stored as strings “EA”, “AM”, “MD”, etc. When using sharrow, for efficiency these values get encoded as the integer position within the time dimension 0, 1, 2, etc.

These two lines in the preprocessor work only for the legacy code:

https://github.com/wsp-sag/client_arc_abm/blob/3b22eb234537e69d26b2fddc953e0d8f7c713390/configs/trip_mode_choice_annotate_trips_preprocessor.csv#L16-L17

But are easily modified for sharrow compatibility:

https://github.com/driftlesslabs/client_arc_abm/blob/1852f8a42fdaf9d639dea6e208c2331c4dcf4c6d/configs/trip_mode_choice_annotate_trips_preprocessor.csv#L16-L17

Second, I’ve found an inconsistency in the trip mode choice spec. The preprocessor defines “_in_period” as shown above using ['MD','PM'], and uses it in several places, but then the main trip mode spec has several lines that probably should use the same “_in_period” but instead recompute the isin with different operands (['PM','EV']):

https://github.com/wsp-sag/client_arc_abm/blob/3b22eb234537e69d26b2fddc953e0d8f7c713390/configs/trip_mode_choice.csv#L199-L206

I've change the code in this PR to emit is_in_period from the preprocessor with ['MD','PM'] and use it in the main spec. @i-am-sijia can you confirm (a) that we do want a consistent set of time periods used, and (b) whether the correct set of time periods is ['MD','PM'], ['PM','EV'], or something else?

@jpn--
Copy link
Collaborator Author

jpn-- commented Aug 5, 2024

Note I marked this as "draft" because I'm not done checking if these changes represent all the needed changes or just the ones I found so far...

@jpn-- jpn-- marked this pull request as ready for review August 5, 2024 20:24
@jpn--
Copy link
Collaborator Author

jpn-- commented Aug 5, 2024

@i-am-sijia I've confirmed this PR now has what's needed to run the ARC model with sharrow on, single process. It also requires ActivitySim/activitysim#884, which prevents a rare underflow from happening in the ARC model when run at scale. The final output results look reasonably close with sharrow on or off, just plus/minus a handful of trips on a 100K household sample. After a precompile, running that 100K household sample single-thread/single-process on my laptop:

  • sharrow: require & recode_pipeline_columns: true takes 26.8 minutes
  • sharrow: false & recode_pipeline_columns: false takes 64.5 minutes

@i-am-sijia
Copy link
Collaborator

i-am-sijia commented Aug 14, 2024

@jpn-- Thank you. I am able to run 100% sample with Sharrow on WSP's server with these changes. When running multiprocessing with 28 processors, the run time without vs with Sharrow are 200 minutes vs 113 minutes, a 44% reduction with Sharrow.

I've change the code in this PR to emit is_in_period from the preprocessor with ['MD','PM'] and use it in the main spec. @i-am-sijia can you confirm (a) that we do want a consistent set of time periods used, and (b) whether the correct set of time periods is ['MD','PM'], ['PM','EV'], or something else?

I checked ARC's CTRAMP UEC. Yes we want a consistent set of time periods used in the preprocessor and the main spec. The correct set of time periods for in_period is ["MD','PM']. So what you did was consistent with CTRAMP.

@i-am-sijia i-am-sijia merged commit bbfb781 into atlregional:asim-v1.3-integration Aug 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants