Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-2140] [Feature] Parse versions with specific version standards #6999

Open
3 tasks done
emmyoop opened this issue Feb 17, 2023 · 2 comments
Open
3 tasks done

[CT-2140] [Feature] Parse versions with specific version standards #6999

emmyoop opened this issue Feb 17, 2023 · 2 comments
Labels
dependencies Changes to the version of dbt dependencies deps dbt's package manager tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality

Comments

@emmyoop
Copy link
Member

emmyoop commented Feb 17, 2023

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Currently, in semver.py we have a single function to parse both dbt version (PEP 440 compliant) and the package versions (SemVer 2.0 compliant). It's both confusing and broken.

The only versions that are complant for both standards do not contain any pre-release or dev info:

  • 1.5.2
  • 10.5.22

We also allow strictly PEP440 compliant versions:

  • 1.5.0rc1
  • 1.5.0rc1.dev1234

As well as SemVer 2.0 compliant versions

  • 1.5.0-rc1

Some PEP440 compliant but SemVer 2.0 uncomplianty versions fail

  • 1.5.0.dev1232

And some versions that are not compliant by either standard pass

  • 1.5.0-rc1b2

The first thing that should happen is to pull apart checking bopth PEP440 compliance and SemVer 2.0 compliance in the same place. Be explicit about which is expected and audit for it specifically. This will involve looking into everywhere this logic is called to be sure the right functionality get invoked.

Instead of using our own homegrown solution here, we should use packaging.version.Version (same thing we're using in the parse-semver action) to check the version of dbt projects against PEP440. Possibly use the semvver package for checking package versions against SemVer 2.0.

Note: Adapters also do some version parsing in setup.py. It's worth exploring if this solution could work for them as well to remove as much manual regex we have to maintain as possible.

Describe alternatives you've considered

Tweaking the current regex to work for all version patterns but this is undesirable since we're already in the code.

Who will this benefit?

Maintainers for ease of readability of what's happening as well as package maintainers and adapter maintainers as it would allow the use of full versioning instead of a restricted subset.

Are you interested in contributing this feature?

No response

Anything else?

Would allowing packages fuller versioning have an effect on the package hub?

@emmyoop emmyoop added enhancement New feature or request triage labels Feb 17, 2023
@github-actions github-actions bot changed the title [Feature] Parse versions with specific version standards [CT-2140] [Feature] Parse versions with specific version standards Feb 17, 2023
@dbeatty10 dbeatty10 removed the triage label Feb 17, 2023
@jtcohen6
Copy link
Contributor

Currently, in semver.py we have a single function to parse both dbt version (PEP 440 compliant) and the package versions (SemVer 2.0 compliant). It's both confusing and broken.

Agree that this is super confusing, and that this code is hand-rolled and really ought to be replaced with an off-the-shelf library. That said, do we still have any PEP 440-compliant logic in semver.py, now that we've removed packaging from that code? (#6838)

The only other place inside our codebase that we use packaging (PEP 440-compliant) is to detect the installed version of git and see if it supports subdirectory sparse checkouts:

out, _ = run_cmd(cwd, ["git", "--version"], env={"LC_ALL": "C"})
git_version = version.parse(re.search(r"\d+\.\d+\.\d+", out.decode("utf-8")).group(0))
if not git_version >= version.parse("2.25.0"):

But git isn't a Python package, so... shouldn't we be using semver for this...?

$ git --version
git version 2.37.1 (Apple Git-137.1)

Obviously, we need to keep packaging / PEP 440 code around for all our release code, because dbt-core is, after all, a Python package :)


Possibly use the semvver package for checking package versions against SemVer 2.0.

No question that we try to do should do this, and delete a ton of our hand-rolled code in semver.py. We originally wrote this code in Feb 2018; it's possible that the libraries were less stable then, or we were trying to limit unneeded dependencies. At this point, these packages seem pretty stable; we just shouldn't pin them too tightly, or at all :)

@colin-rogers-dbt apparently likes semantic-version.

@jtcohen6 jtcohen6 added Team:Language deps dbt's package manager tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality dependencies Changes to the version of dbt dependencies and removed enhancement New feature or request labels Feb 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Changes to the version of dbt dependencies deps dbt's package manager tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality
Projects
None yet
Development

No branches or pull requests

3 participants