Update how we start post-processing job #64

Merged
nweires merged 2 commits into gcp from natalie/pp_start
Feb 2, 2024

Conversation

nweires
Collaborator

@nweires nweires commented Jan 30, 2024

  • When creating the job, explicitly wait for that operation to finish (instead of retrying the next step).
  • Fix how we check whether the execution has started.

This should help with the errors we sometimes get when post-processing starts.

github-actions bot commented Jan 30, 2024

File Coverage
All files 86%
base.py 91%
exc.py 57%
hpc.py 78%
local.py 70%
postprocessing.py 84%
utils.py 91%
cloud/docker_base.py 79%
sampler/base.py 79%
sampler/downselect.py 33%
sampler/precomputed.py 93%
sampler/residential_quota.py 61%
test/shared_testing_stuff.py 85%
test/test_docker.py 33%
test/test_local.py 97%
test/test_validation.py 97%
workflow_generator/base.py 90%
workflow_generator/commercial.py 53%
workflow_generator/residential_hpxml.py 86%

Minimum allowed coverage is 33%

Generated by 🐒 cobertura-action against 720a933

@mfathollahzadeh mfathollahzadeh left a comment

Thanks, Natalie! Just added some questions

@@ -910,21 +910,23 @@ def start_combine_results_job_on_cloud(self, results_dir, do_timeseries=True):

 # Create the job
 jobs_client = run_v2.JobsClient()
-jobs_client.create_job(
+op = jobs_client.create_job(
     run_v2.CreateJobRequest(

Wondering whether we should also change this to request=run_v2.CreateJobRequest(?

Collaborator Author

Are you just referring to the variable name? I used op because create_job returns an Operation object.

My comment was more about readability. When creating a job with the JobsClient in Google Cloud Run, should you pass the CreateJobRequest object using the request keyword argument? For example:

op = jobs_client.create_job(
    request=run_v2.CreateJobRequest(

Collaborator Author

Oh, I see. Updated!

exc_info=True,
)
return
)

This makes it much simpler to follow, but I feel like we should probably keep the retry and change the wait from 1 second to something more robust. Curious to hear your thoughts on removing the retry.

Collaborator Author

The retries were only there to handle the case where the job took a few seconds to be created, causing the first attempt(s) to fail. But now the call to op.result() above explicitly waits for the job creation to finish. So it's unlikely that just retrying would help anymore.
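
To illustrate the reasoning above, here is a toy simulation, not the project's code. `FakeJobsClient`, `FakeOperation`, and `JobNotReady` are hypothetical stand-ins; in particular, the fake marks the job ready only when `result()` is called, a simplification of a real service, where creation finishes on its own after a delay.

```python
class JobNotReady(Exception):
    """Raised by the fake client when a job is run before creation finished."""


class FakeOperation:
    def __init__(self, client, job_id):
        self._client = client
        self._job_id = job_id

    def result(self, timeout=None):
        # Simplification: waiting on the operation is what makes the job usable.
        self._client.ready.add(self._job_id)
        return self._job_id


class FakeJobsClient:
    def __init__(self):
        self.ready = set()

    def create_job(self, request):
        return FakeOperation(self, request["job_id"])

    def run_job(self, name):
        if name not in self.ready:
            raise JobNotReady(name)
        return f"execution-of-{name}"


# Old pattern: run the job immediately; the first attempt can fail,
# which is what the retries were compensating for.
old_client = FakeJobsClient()
old_client.create_job(request={"job_id": "pp"})
try:
    old_client.run_job(name="pp")
    old_first_attempt_ok = True
except JobNotReady:
    old_first_attempt_ok = False

# New pattern: wait for creation to finish, then run once with no retry loop.
new_client = FakeJobsClient()
op = new_client.create_job(request={"job_id": "pp"})
op.result()
execution = new_client.run_job(name="pp")

print(old_first_attempt_ok, execution)
```

In this model, any failure that remains after `op.result()` returns is not caused by the job still being created, so a blind retry of the next step would not address it.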

@mfathollahzadeh mfathollahzadeh left a comment

Thanks, Natalie! Looks good to me!

@nweires nweires merged commit 75c951e into gcp Feb 2, 2024
6 checks passed
@nweires nweires deleted the natalie/pp_start branch February 2, 2024 15:14