Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workflow for HDMF Dev compatibility #77

Merged
merged 34 commits into from
Sep 12, 2023
Merged

Workflow for HDMF Dev compatibility #77

merged 34 commits into from
Sep 12, 2023

Conversation

mavaylon1
Copy link
Contributor

@mavaylon1 mavaylon1 commented Aug 25, 2023

Summary of changes

Created new workflow to check for HDMf Dev compatibility.

The workflow will clone the dev branch of HDMF and run the tests for "common".

PR checklist for schema changes

  • Update the version string in docs/source/conf.py and common/namespace.yaml to the next version with the suffix "-alpha"
  • Add a new section in the release notes for the new version with the date "Upcoming"
  • Add release notes for the PR to docs/source/hdmf_common_release_notes.rst and/or
    docs/source/hdmf_experimental_release_notes.rst

@mavaylon1
Copy link
Contributor Author

@rly @oruebel I took a different route to what was on nwb-schema and similar to how we test pynwb on hdmf. The way nwb-schema validates is comparing the yaml to a json file, which we do not have. I thought it would be good enough to install hdmf and just test the "common" folder.

I do have a question :

name: Cancel non-latest runs
        uses: styfle/[email protected]
        with:
          all_but_latest: true
          access_token: ${{ github.token }}

I am a little confused about the github.token
Further digging, I found that GITHUB_TOKEN is an environment variable that is generated during the workflow for read and write access. (https://devopsjournal.io/blog/2022/01/03/GitHub-Tokens)

I don't fully understand the point of this.

@mavaylon1
Copy link
Contributor Author

Note I did not go through the check list for the PR and this is still a draft. @oruebel @rly

@mavaylon1
Copy link
Contributor Author

The token is used to authenticate in the workflow job. Does this mean, we need to give the workflow permission to read the repository or in this case to cancel the workflow job if it is running already.

If I as a third party person (not dev) want to cancel a workflow, I cannot because I don't have permission. Is this just giving the workflow permission to cancel to workflow if it exists as if the workflow was a third party person?

@mavaylon1 mavaylon1 self-assigned this Aug 28, 2023
@mavaylon1
Copy link
Contributor Author

mavaylon1 commented Aug 28, 2023

the release notes are not updated since this is a workflow update and no release will be made.

@mavaylon1 mavaylon1 marked this pull request as ready for review August 28, 2023 17:16
@mavaylon1 mavaylon1 requested a review from rly August 28, 2023 17:16
@rly
Copy link
Contributor

rly commented Aug 28, 2023

This workflow will checkout the dev branch of hdmf and the specific commit of hdmf-common-schema that is stored in hdmf, and then run the tests in tests/unit/common/ of hdmf. That does not test the current hdmf-common-schema version.

What I was looking for is a little more complicated than I initially imagined. Looking at how we do it in nwb-schema (https://github.com/NeurodataWithoutBorders/nwb-schema/blob/dev/.github/workflows/validate_schema.yml), we want to run the script validate_hdmf_spec core -m nwb.schema.json but replacing nwb.schema.json with a json schema for HDMF common. It looks like we never made that schema. But that file should be pretty much the same as https://github.com/NeurodataWithoutBorders/nwb-schema/blob/dev/nwb.schema.json except that 1) the top level metadata needs to be adjusted for HDMF common, 2) all instances of "neurodata_type" -> "data_type" and 3) rename it to hdmf-common.schema.json. I think that's it. Then because the schema lives in a directory "common" instead of "core", I think the command should be validate_hdmf_spec common -m hdmf-common.schema.json.

@mavaylon1
Copy link
Contributor Author

@rly what if I cd into hdmf_common_schema, git checkout PR branch for the schema, then ran the tests

@rly
Copy link
Contributor

rly commented Aug 31, 2023

Then when the branch is merged, you have to update the workflow again so that you check out the current main branch of the schema instead of a PR branch.

That is fine to do, but I think independently of whether the dev branch of HDMF is compatible with the current PR or main branch of the schema, it would be good to run validate_hdmf_spec common -m hdmf-common.schema.json just like what we have in NWB schema, to catch validation issues with the schema that may not be caught by the tests in HDMF.

@mavaylon1
Copy link
Contributor Author

mavaylon1 commented Aug 31, 2023

Then when the branch is merged, you have to update the workflow again so that you check out the current main branch of the schema instead of a PR branch.

That is fine to do, but I think independently of whether the dev branch of HDMF is compatible with the current PR or main branch of the schema, it would be good to run validate_hdmf_spec common -m hdmf-common.schema.json just like what we have in NWB schema, to catch validation issues with the schema that may not be caught by the tests in HDMF.

The reason why I went with this approach was because it would return an error if the schema wasn't compatible. With going the json route, does that record every instance where there's an issue? Because on second thought my way would only catch it one at a time. @rly

@rly
Copy link
Contributor

rly commented Aug 31, 2023

The compatibility test is good.

With JSON validation, I think it will error out at the first error. But critically, just because the schema is compatible with the current API does not mean it is valid according to the schema language. For example, if we add a new data type to the schema, there will not be corresponding code that uses that data type in the dev branch of the API (so that test will pass), but the new data type definition might be invalid according to the rules of the language. We might then write API code for this new data type only to determine later that we wrote the schema in a way that is technically not valid.

Can you please also implement a workflow that does validate_hdmf_spec common -m hdmf-common.schema.json?

@mavaylon1
Copy link
Contributor Author

The compatibility test is good.

With JSON validation, I think it will error out at the first error. But critically, just because the schema is compatible with the current API does not mean it is valid according to the schema language. For example, if we add a new data type to the schema, there will not be corresponding code that uses that data type in the dev branch of the API (so that test will pass), but the new data type definition might be invalid according to the rules of the language. We might then write API code for this new data type only to determine later that we wrote the schema in a way that is technically not valid.

Can you please also implement a workflow that does validate_hdmf_spec common -m hdmf-common.schema.json?

I'll do both in this PR.

@mavaylon1
Copy link
Contributor Author

@rly could you do a review?

@rly
Copy link
Contributor

rly commented Sep 12, 2023

Schema validation workflow looks good. Thanks.

The .github/workflows/hdmf_compatibility_schema.yml workflow confuses me. Currently it just checks out HDMF and runs a subset of tests from HDMF. It doesn't use the updated schema in the master branch or a pull request branch. I think it should.

The validate schema workflow was skipped. That's odd. Do you know why that happened?

@mavaylon1
Copy link
Contributor Author

I removed "push" as a trigger because "pull_request" should trigger on each update to the PR, i.e a commit. I unerstood the trigger push, pull_request, and pull request on forked repos. However, I see now that it actually ignores the pull request trigger on the current repo, hence the need for push.

As for the compatibility workflow, I forgot to add the checkout current branch part. I let that slip that's my bad. The idea is to make sure the changes in the schema on any PR shouldn't break HDMF and if it does then it is because an update needs to be made on HDMF

@mavaylon1
Copy link
Contributor Author

@rly I believe I made the needed changes! I need to slow down so this can be a quick check process.

@mavaylon1 mavaylon1 requested a review from rly September 12, 2023 04:28
Copy link
Contributor

@rly rly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks. Please squash and merge.

@mavaylon1 mavaylon1 merged commit c538bc5 into main Sep 12, 2023
4 checks passed
@rly rly deleted the workflow_val branch September 12, 2023 06:42
@rly rly mentioned this pull request Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants