-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix cost (for methods cromwell), add explicit example #278
Conversation
The test errors I'm getting appear to be #277 The github actions seem to be failing when installing the repo, unless I'm misunderstanding the logs. |
README.md
Outdated
{ | ||
"cromwell_server": "https://cromwell-v77.dsde-methods.broadinstitute.org/", | ||
"requests_timeout": 5, | ||
"bq_cost_table": "Methods_billing_dump.gcp_billing_export_v1_009C7D_923007_219A6F" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer we hide these values from the open internet, just in case...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so how about the following:
"<bq_dataset_hosting_the_billing_datatable>.<the_billing_datatable>"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do that if we need to yeah
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(But if there's no risk in exposing these, I'd like the example to be 100% explicit. Because I never would have foudn those values.
So... is there a risk?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or... better... if we included directions for how to find that table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I don't know what those directions are
@@ -182,15 +182,15 @@ def create_bq_query(detailed: bool, bq_cost_table: str) -> str: | |||
AND task.key LIKE "wdl-task-name" | |||
AND wfid.key LIKE "cromwell-workflow-id" | |||
AND wfid.value like @workflow_id | |||
AND partition_time BETWEEN @start_date AND @end_date | |||
AND export_time BETWEEN @start_date AND @end_date |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at our cost tabe and do see this key in it, instead of the partition_time
key. But I wonder why it was partition_time
at the beginning, starting from the shell version. Did Google change things underneath us?
export_time
is documented here, but not partition_time
https://cloud.google.com/billing/docs/how-to/export-data-bigquery-tables/detailed-usage
And confusing the situation, even more, are the two other *time
values: usage_start_time
and usage_end_time
. And now I wonder, should we use them instead of export_time
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good questions. I do remember some email about google changing things. I think this is what happened, but I can't verify it.
Not sure about usage_end_time
versus export_time
The unit/integration tests are failing for an unrelated reason and is being fixed here. Once that PR is merged, updating your branch from main should pass your tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pulled in the changes to fix the previous git action test fails.
Now, just need to update the unit test script that checks the for the expected query. Update it the "export_time"
https://github.com/broadinstitute/cromshell/blob/f1344e208a8a0a1375651f70ce8e0261ff6747b9/tests/unit/test_cost.py#L26C21-L26C35
cromshell/tests/unit/test_cost.py
Line 37 in f1344e2
WHERE value LIKE @workflow_id AND partition_time BETWEEN @start_date AND @end_date |
I think it's ready pending @SHuang-Broad 's comment above about hiding the values from the README. Not sure what the consensus is there. I see Steve's point, but also if the example is less explicit, I would never be able to figure out how to use |
@sjfleming
where you can see the table like displayed in the following figure (actual project hidden) Click on the table itself (in this case, |
Thanks @SHuang-Broad ! Alright how about this README? (Tried to just use one picture, but I think these are super helpful.) |
Looks good to me! Thanks for getting all these together! |
I didn't have a clue how to get
cromshell cost
to work. @SHuang-Broad helped me out.I added the information I needed to the README.md so others can get it to work.
There also seemed to be a bug in
cromshell cost
: the code expected a table column calledpartition_time
but the table here (Methods cromwell server v77 location)https://console.cloud.google.com/bigquery?project=broad-dsde-methods&organizationId=548622027621&j=bq:US:bquxjob_85a71e8_16fa747f47d&page=queryresults&ws=!1m10!1m4!4m3!1sbroad-dsde-methods!2sMethods_billing_dump!3sgcp_billing_export_v1_009C7D_923007_219A6F!1m4!1m3!1sbroad-dsde-methods!2sbquxjob_85a71e8_16fa747f47d!3sUS
has a column called
export_time
instead.When I changed the code from
partition_time
toexport_time
, I was able to get the command to work and see my job costs.