Add and update commit frequency metrics #173

d33bs · 2024-11-12T21:56:21Z

Description

This PR adds metrics for commits and commit frequency. While working on this, I found that some functionality wasn't working as expected (or perhaps wasn't labeled clearly in the context of these changes), so I took time to fix it and make the related code more consistent. This included modifying a few tests and the repo_setup function, which is becoming increasingly important for creating and implementing tests for these changes.

For commit frequency, I used commits per day as the calculation to help avoid inconsistencies across weeks, months, and years. A different type of calculation might work better here; let me know if something else seems more useful.
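As a rough illustration of the commits-per-day idea, here is a minimal sketch. The description does not state the exact formula, so the function name `commits_per_day` and the choice to count the span inclusively (first commit day through last commit day) are assumptions for illustration only.

```python
from datetime import datetime, timezone


def commits_per_day(commit_times: list[datetime]) -> float:
    """Average number of commits per day between the earliest and latest commit.

    Assumption: the span is counted inclusively, so a repository whose
    commits all land on one day has a span of 1 day.
    """
    if not commit_times:
        return 0.0
    # Whole days between first and last commit, plus one to include both ends.
    span_days = (max(commit_times) - min(commit_times)).days + 1
    return len(commit_times) / span_days


# Four commits spread over a ten-day span (Jan 1 through Jan 10).
example_times = [
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 3, tzinfo=timezone.utc),
    datetime(2024, 1, 7, tzinfo=timezone.utc),
    datetime(2024, 1, 10, tzinfo=timezone.utc),
]
print(commits_per_day(example_times))  # 0.4
```

Using days as the unit avoids having to decide how long a "month" or "year" is, which is the inconsistency mentioned above.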

Closes #157
Closes #158

What is the nature of your change?

Content additions or updates (adds or updates content)
Bug fix (fixes an issue).
Enhancement (adds functionality).
Breaking change (these changes would cause existing functionality to not work as expected).

Checklist

Please ensure that all boxes are checked before indicating that this pull request is ready for review.

I have read the CONTRIBUTING.md guidelines.
My code follows the style guidelines of this project.
I have performed a self-review of my own contributions.
I have commented my content, particularly in hard-to-understand areas.
I have made corresponding changes to related documentation (outside of book content).
My changes generate no new warnings.
New and existing tests pass locally with my changes.
I have added tests that prove my additions are effective or that my feature works.
I have deleted all non-relevant text in this pull request template.

src/almanack/git.py

falquaddoomi

This all looks really neat; kudos! I left two comments, only one of which (the count_files() one) I think needs to be considered.

falquaddoomi · 2024-11-13T20:28:38Z

src/almanack/git.py

@@ -213,3 +204,32 @@ def find_and_read_file(repo: pygit2.Repository, filename: str) -> Optional[str]:

    # Decode and return content as a string
    return blob_data.decode(detect_encoding(blob_data))
+
+
+def count_files(tree: Union[pygit2.Tree, pygit2.Blob]) -> int:


First off, I think this is a great use of recursion!

Now, why the Union type for the tree argument? It looks like it'll only ever be called recursively with pygit2.Tree objects, and if someone were to pass a pygit2.Blob as the initial element it looks like the function would attempt to iterate over it first, which I'm not sure is a defined operation for that type.

I could see a version of this function where you first check if tree is a Blob and return 1 (presumably because a blob could be thought of as a tree with 1 element), or a version that only takes a Tree and deals with Blobs non-recursively as you're doing now.
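To make the first option concrete, here is a minimal sketch of a version that checks for a blob before iterating. So that it runs without pygit2, `Tree` and `Blob` below are hypothetical stand-ins for pygit2.Tree and pygit2.Blob; the only behavior assumed is that iterating a tree yields its child trees and blobs.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Blob:
    """Hypothetical stand-in for pygit2.Blob (a single file)."""

    name: str


@dataclass
class Tree:
    """Hypothetical stand-in for pygit2.Tree (a directory of entries)."""

    name: str
    entries: list[Union["Tree", Blob]] = field(default_factory=list)

    def __iter__(self):
        return iter(self.entries)


def count_files(obj: Union[Tree, Blob]) -> int:
    """Count the files reachable from a tree or blob."""
    # Base case: a blob counts as a tree with exactly one file.
    if isinstance(obj, Blob):
        return 1
    # Recursive case: sum the counts of every entry in the tree.
    return sum(count_files(entry) for entry in obj)


repo_root = Tree(
    "root",
    [
        Blob("README.md"),
        Tree("src", [Blob("a.py"), Blob("b.py")]),
        Tree("empty"),
    ],
)
print(count_files(repo_root))  # 3
```

With the blob check at the top, the Union annotation is accurate, because either type is a valid initial argument.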

falquaddoomi · 2024-11-13T21:25:28Z

tests/data/almanack/repo_setup/create_repo.py

+    repo_path: pathlib.Path,
+    files: list[dict],
+    branch_name: str = "main",
+    dates: Optional[list[datetime]] = None,


Pretty neat that you can build up a commit history in a test git repo using just this function!

Since files is now a list of commits, each of which contains a dict of files, I wonder if it makes sense to change the name, e.g. to commits?

This one's debatable and may be more work than it's worth, but I see that dates is a parallel array that gets zipped with the files to produce dates for each commit. Perhaps it might make sense to have the commit dates be inlined in the files data structure, e.g. something like:

commits = [ {'commit_date': Optional[datetime] = None, 'files': dict }, ... ]

where the current date is used for each if commit_date wasn't provided?

FWIW, I'm fine keeping this all as-is, since in your test cases it seems you're mostly adding files in a single commit and my suggestions above would make those test cases more verbose.
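For reference, the inlined structure suggested above could be sketched roughly as follows. The names `CommitSpec` and `resolve_commit_dates` are hypothetical and not part of the PR, and the sketch assumes each files dict maps a file name to its text content.

```python
from datetime import datetime, timezone
from typing import Optional, TypedDict


class CommitSpec(TypedDict, total=False):
    """One commit: an optional date plus the files it adds or changes."""

    commit_date: Optional[datetime]
    files: dict[str, str]


def resolve_commit_dates(
    commits: list[CommitSpec],
) -> list[tuple[datetime, dict[str, str]]]:
    """Pair each commit's files with its date, defaulting to the current time."""
    now = datetime.now(timezone.utc)
    # A missing or None commit_date falls back to the current time.
    return [(spec.get("commit_date") or now, spec["files"]) for spec in commits]


commits: list[CommitSpec] = [
    {"files": {"README.md": "hello"}},
    {
        "commit_date": datetime(2024, 1, 1, tzinfo=timezone.utc),
        "files": {"a.py": "print('a')"},
    },
]
resolved = resolve_commit_dates(commits)
```

This removes the need to keep a parallel dates list in sync with the commits, at the cost of slightly more verbose single-commit test cases.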

d33bs added 2 commits November 12, 2024 14:50

add and update commit frequency metrics

cf4b269

docs formatting

a63d4c2

d33bs requested review from falquaddoomi, gwaybio and vincerubinetti November 12, 2024 22:28

d33bs marked this pull request as ready for review November 12, 2024 22:28

correct the counting of files

44c7f3b

gwaybio approved these changes Nov 13, 2024

d33bs added 2 commits November 12, 2024 21:44

Merge remote-tracking branch 'upstream/main' into commit-metrics

ae72ee3

fix cli test

3772905

falquaddoomi reviewed Nov 13, 2024
