Add and update commit frequency metrics #173

Open: 5 commits to merge into base branch main


d33bs
Member

@d33bs d33bs commented Nov 12, 2024

Description

This PR adds metrics for commits and commit frequency. While working on it I found that some existing functionality wasn't working as expected (or wasn't labeled clearly in the context of these changes), so I took time to fix those issues and make the work more consistent. This included modifying a few tests and the repo_setup function, which is becoming increasingly important for creating and implementing tests for these changes.

For commit frequency I used commits per day as the calculation, which avoids inconsistencies in the lengths of weeks, months and years. I could also see a different type of calculation working better here; let me know if something else seems more useful.
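
For readers following along, here is a minimal sketch of what a commits-per-day calculation could look like. The description does not spell out the exact formula, so the function name `commits_per_day` and the choice to count both the first and last calendar day are assumptions for illustration, not the PR's actual implementation.

```python
from datetime import datetime


def commits_per_day(commit_dates: list[datetime]) -> float:
    """Average number of commits per calendar day spanned by the history.

    Hypothetical helper: the inclusive day span is an assumption.
    """
    if not commit_dates:
        return 0.0
    first, last = min(commit_dates), max(commit_dates)
    # Count both endpoints so a single-day history spans one day, not zero.
    days_spanned = (last.date() - first.date()).days + 1
    return len(commit_dates) / days_spanned


example_dates = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 17, 30),
    datetime(2024, 1, 4, 12, 0),
]
# Three commits over four calendar days (Jan 1 through Jan 4).
frequency = commits_per_day(example_dates)
```

Using days as the unit sidesteps the varying lengths of months and years; a weekly or monthly figure could still be derived from this value if needed.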

Closes #157
Closes #158

What is the nature of your change?

  • Content additions or updates (adds or updates content)
  • Bug fix (fixes an issue).
  • Enhancement (adds functionality).
  • Breaking change (these changes would cause existing functionality to not work as expected).

Checklist

Please ensure that all boxes are checked before indicating that this pull request is ready for review.

  • I have read the CONTRIBUTING.md guidelines.
  • My code follows the style guidelines of this project.
  • I have performed a self-review of my own contributions.
  • I have commented my content, particularly in hard-to-understand areas.
  • I have made corresponding changes to related documentation (outside of book content).
  • My changes generate no new warnings.
  • New and existing tests pass locally with my changes.
  • I have added tests that prove my additions are effective or that my feature works.
  • I have deleted all non-relevant text in this pull request template.

@d33bs d33bs marked this pull request as ready for review November 12, 2024 22:28
src/almanack/git.py (resolved conversation)
Contributor

@falquaddoomi falquaddoomi left a comment

This all looks really neat; kudos! I left two comments, only one of which (the count_files() one) I think needs to be considered.

@@ -213,3 +204,32 @@ def find_and_read_file(repo: pygit2.Repository, filename: str) -> Optional[str]:

    # Decode and return content as a string
    return blob_data.decode(detect_encoding(blob_data))


def count_files(tree: Union[pygit2.Tree, pygit2.Blob]) -> int:

First off, I think this is a great use of recursion!

Now, why the Union type for the tree argument? It looks like it'll only ever be called recursively with pygit2.Tree objects, and if someone were to pass a pygit2.Blob as the initial element it looks like the function would attempt to iterate over it first, which I'm not sure is a defined operation for that type.

I could see a version of this function where you first check if tree is a Blob and return 1 (presumably because a blob could be thought of as a tree with 1 element), or a version that only takes a Tree and deals with Blobs non-recursively as you're doing now.
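
As a rough illustration of the first variant (return 1 when the argument is a Blob), here is a sketch that runs without a repository. The `Blob` and `Tree` classes below are small hypothetical stand-ins for `pygit2.Blob` and `pygit2.Tree`, assuming only that a tree can be iterated to yield its entries; this is not the PR's code.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Blob:
    """Stand-in for pygit2.Blob (a single file)."""

    name: str


@dataclass
class Tree:
    """Stand-in for pygit2.Tree (a directory of blobs and subtrees)."""

    name: str
    entries: list = field(default_factory=list)

    def __iter__(self):
        return iter(self.entries)


def count_files(node: Union[Tree, Blob]) -> int:
    """Count blobs reachable from node; a lone blob counts as one file."""
    if isinstance(node, Blob):
        # Base case: never try to iterate over a blob.
        return 1
    # Recursive case: sum the counts of every entry in the tree.
    return sum(count_files(entry) for entry in node)


repo_root = Tree(
    "root",
    [Blob("README.md"), Tree("src", [Blob("a.py"), Blob("b.py")])],
)
total_files = count_files(repo_root)
```

With the Blob check up front, the Union type in the signature is accurate because either kind of object is handled safely as the initial argument.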

Comment on lines +161 to +164
repo_path: pathlib.Path,
files: list[dict],
branch_name: str = "main",
dates: Optional[list[datetime]] = None,

Pretty neat that you can build up a commit history in a test git repo using just this function!

Since files is now a list of commits which contains a dict of files per commit, I wonder if it makes sense to change the name, e.g. to commits?

This one's debatable and may be more work than it's worth, but I see that dates is a parallel array that gets zipped with the files to produce dates for each commit. Perhaps it might make sense to have the commit dates be inlined in the files data structure, e.g. something like:

commits = [
  {'commit_date': Optional[datetime] = None, 'files': dict }, ...
]

where the current date is used for each if commit_date wasn't provided?

FWIW, I'm fine keeping this all as-is, since in your test cases it seems you're mostly adding files in a single commit and my suggestions above would make those test cases more verbose.
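
To make the inlined-date idea concrete, here is one possible shape for it. Everything named below (`CommitSpec`, `resolve_commit_dates`, the `now` parameter, and the assumption that `files` maps filenames to content) is hypothetical and only sketches the suggestion; it is not how repo_setup is written.

```python
from datetime import datetime
from typing import Optional, TypedDict


class CommitSpec(TypedDict, total=False):
    """One commit: an optional date plus the files it adds (assumed shape)."""

    commit_date: Optional[datetime]
    files: dict[str, str]


def resolve_commit_dates(
    commits: list[CommitSpec], now: Optional[datetime] = None
) -> list[tuple[datetime, dict[str, str]]]:
    """Pair each commit's files with its date, defaulting to the current time."""
    fallback = now or datetime.now()
    return [(c.get("commit_date") or fallback, c["files"]) for c in commits]


commits: list[CommitSpec] = [
    {"commit_date": datetime(2024, 1, 1), "files": {"README.md": "# Demo"}},
    {"files": {"LICENSE.txt": "MIT"}},  # no date given, so the fallback applies
]
fixed_now = datetime(2024, 6, 1)
resolved = resolve_commit_dates(commits, now=fixed_now)
```

Single-commit test cases stay short under this layout, since a commit without a date is just `{"files": {...}}`.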

3 participants