Add option to run multiple jobs in a row #71

nweires · 2024-03-14T18:00:51Z

Allows passing in multiple project files to buildstock_gcp. The projects are all validated upfront (to catch any issues quickly) then run sequentially.

Possible future improvements:

Start the second job while the first job is still running post-processing.
Allow overriding other variables in the project file. For example, allow specifying something like "Run project.yaml with upgrades A, B, C and output directories X, Y, Z."

mfathollahzadeh

Thanks, Natalie! Just had a few clarifying questions.

mfathollahzadeh · 2024-03-25T14:43:07Z

buildstockbatch/gcp/gcp.py

@@ -1186,12 +1186,13 @@ def main():
        GcpBatch.run_combine_results_on_cloud(gcs_bucket, gcs_prefix, results_dir, do_timeseries)
    else:
        parser = argparse.ArgumentParser()
-        parser.add_argument("project_filename")
+        parser.add_argument("project_filenames", help="Comma-separated list of project YAML files to run.")


Maybe remove the help string here, since it is already in the argument section?

I'm not sure what you mean... But I like the help string here because it shows up when you run buildstock_gcp --help:

...
positional arguments:
  project_filenames  Comma-separated list of project YAML files to run.
  job_identifiers    Comma-separated list of job IDs to use. Optional override of
                     gcp.job_identifier in your project file. Max 48 characters.
...
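For reference, here is a minimal, standalone sketch of how help strings on positional arguments end up in the `--help` output. Only the two argument names and their help text come from this thread; the `build_parser` helper, the `prog` name, and making `job_identifiers` an optional positional via `nargs="?"` are assumptions for illustration, not necessarily how gcp.py defines them.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the two positional arguments discussed above."""
    parser = argparse.ArgumentParser(prog="buildstock_gcp")
    parser.add_argument(
        "project_filenames",
        help="Comma-separated list of project YAML files to run.",
    )
    parser.add_argument(
        "job_identifiers",
        nargs="?",  # assumption: the second positional may be omitted
        default=None,
        help="Comma-separated list of job IDs to use.",
    )
    return parser


# format_help() returns the same text that `--help` prints.
help_text = build_parser().format_help()

# Omitting the optional positional leaves it as None.
args = build_parser().parse_args(["a.yml,b.yml"])
```

Dropping the `help=` keyword would leave the argument names listed under "positional arguments" with no description next to them.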

mfathollahzadeh · 2024-03-25T14:46:11Z

buildstockbatch/gcp/gcp.py

+        job_IDs = len(project_filenames) * [None]
+        if args.job_identifiers:
+            job_IDs = args.job_identifiers.split(",")
+            if len(job_IDs) != n_projects:


Is this meant to catch the project_id issue? I'm trying to understand how likely this error is to occur.

This is just checking that if you give a list of project files and a list of IDs, they're the same length.
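A rough sketch of that check as a standalone function, assuming both arguments arrive as comma-separated strings. The function name `pair_projects_with_job_ids` and the choice of `ValueError` are hypothetical; the diff hunk above does not show what the real code does when the lengths differ.

```python
def pair_projects_with_job_ids(project_filenames_arg, job_identifiers_arg=None):
    """Split both comma-separated arguments and pair them up (hypothetical helper)."""
    project_filenames = project_filenames_arg.split(",")
    n_projects = len(project_filenames)

    # Default: no override, so each project falls back to its own job identifier.
    job_ids = [None] * n_projects

    if job_identifiers_arg:
        job_ids = job_identifiers_arg.split(",")
        if len(job_ids) != n_projects:
            # Assumption: the real code may report this mismatch differently.
            raise ValueError(
                f"Got {n_projects} project file(s) but {len(job_ids)} job identifier(s); "
                "the two lists must be the same length."
            )

    return list(zip(project_filenames, job_ids))
```

For example, `pair_projects_with_job_ids("a.yml,b.yml", "job1")` raises because two project files were given but only one ID.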

mfathollahzadeh · 2024-03-25T14:48:02Z

buildstockbatch/gcp/gcp.py

@@ -1186,12 +1186,13 @@ def main():
        GcpBatch.run_combine_results_on_cloud(gcs_bucket, gcs_prefix, results_dir, do_timeseries)


Maybe we move this into develop at this point instead of gcp?

mfathollahzadeh · 2024-03-25T14:59:22Z

buildstockbatch/gcp/gcp.py

-            batch.push_image()
-            batch.run_batch()
-            batch.process_results()
+        for project_filename, job_ID in zip(project_filenames, job_IDs):


Does each project-job pair run all the way through post-processing before the next pair is picked up, or does the next job start as soon as the results are available in GCS?

Right now, this is just running one job completely (waiting for post-processing to finish), then starting the next one. We could potentially start the second job while the first is in post-processing, but I'm not doing that here.
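The overall flow (validate every project upfront, then run each one to completion in order) can be sketched like this. The `validate` and `run` callables are stand-ins supplied by the caller and `run_projects_sequentially` is a hypothetical name; in the actual code the run step corresponds to the `push_image`, `run_batch`, and `process_results` calls in the diff above.

```python
def run_projects_sequentially(pairs, validate, run):
    """Validate all projects first, then run them one at a time (illustrative sketch).

    pairs: iterable of (project_filename, job_id) tuples.
    validate, run: caller-supplied stand-ins for the real validation and batch steps.
    """
    pairs = list(pairs)

    # Pass 1: validate everything, so a problem in a later project
    # surfaces before any job has been started.
    for project_filename, _job_id in pairs:
        validate(project_filename)

    # Pass 2: run each job fully (including post-processing) before the next.
    for project_filename, job_id in pairs:
        run(project_filename, job_id)


# Record the call order with trivial stand-ins.
events = []
run_projects_sequentially(
    [("a.yml", "job1"), ("b.yml", None)],
    validate=lambda p: events.append(("validate", p)),
    run=lambda p, j: events.append(("run", p, j)),
)
```

Starting the second job while the first is still post-processing would mean splitting the `run` step in two, which this sketch (like the PR) does not attempt.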

github-actions · 2024-03-29T15:18:16Z

File	Coverage
All files	`87%`	✅
base.py	`91%`	✅
exc.py	`57%`	✅
hpc.py	`78%`	✅
local.py	`70%`	✅
postprocessing.py	`84%`	✅
utils.py	`92%`	✅
cloud/docker_base.py	`88%`	✅
sampler/base.py	`79%`	✅
sampler/downselect.py	`33%`	✅
sampler/precomputed.py	`93%`	✅
sampler/residential_quota.py	`61%`	✅
test/shared_testing_stuff.py	`85%`	✅
test/test_docker.py	`33%`	✅
test/test_local.py	`97%`	✅
test/test_validation.py	`97%`	✅
workflow_generator/base.py	`90%`	✅
workflow_generator/commercial.py	`53%`	✅
workflow_generator/residential_hpxml.py	`86%`	✅

Minimum allowed coverage is 33%

Generated by 🐒 cobertura-action against 07ccfae

Add option to run multiple jobs in a row

4d8fa80

Merge branch 'gcp' into natalie/run_multiple

064025b

Update docs

07ccfae
