-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent empty commits if files did not change #809
base: main
Are you sure you want to change the base?
Conversation
|
||
if (isFrontend) { | ||
yield 1; | ||
return null; // unsupported if we're here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to keep things simple, and it's not really a problem IMO, it just means that, in this case, we'll always upload non-LFS files
export async function* sha1(buffer: Blob, opts?: { abortSignal?: AbortSignal }): AsyncGenerator<number, string | null> { | ||
yield 0; | ||
|
||
if (globalThis.crypto?.subtle) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no threshold like for sha256 since non-LFS files are always < 10MB (which is the threshold for sha256 IIUC)
packages/hub/src/lib/commit.ts
Outdated
const allOpsToCommit = [...deleteOps, ...fileOpsToCommit]; | ||
if (allOpsToCommit.length === 0) { | ||
const res: Response = await (params.fetch ?? fetch)( | ||
`${params?.hubUrl || HUB_URL}/api/${repoId.type}s/${repoId.name}`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there doesn't seem to be any method already to call this endpoint, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes we should create one - in another PR - with expand
handling too (getModelInfo
, ...)
But here I think we should include the sha directly in the preupload
response. Given we give already oid
for individual files, and those are the oid
in the context a specific commit from the remote, we should add the sha in the preupload
response.
That way - no need for all this extra code here - cc @Wauplin as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should include the sha directly in the preupload response
That would simplifies things indeed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see 325d917
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a quick look at the PR but haven't tried it myself either 😄
can we maybe add a |
@coyotte508 that would work too, yes, but it would have been nice to hear from you when we discussed that internally haha; now we already have huggingface/huggingface_hub#2389 and https://github.com/huggingface-internal/moon-landing/pull/10547, but if we do that we won't need those anymore, correct? :) |
indeed 😬 I guess the backend solution would have been to add a gitaly hook conditionally skipped to prevent empty commits |
You can add a test to |
I actually prefer to enforce the A bit the same for raising errors. That would break some workflows that haven't asked for anything. Silently preventing empty commits (with a log message) is the best compromise I think. TL;DR: no one cares expect us, right? |
@coyotte508 I added a test, but it seems it's timing out, along with others, any tip regarding this? (does that mean I broke them? :)) |
well yes you probably broke them :) (they ran fine in https://github.com/huggingface/huggingface.js/actions/runs/9991710454/job/27615066628) |
it's greeeeeeeen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested nothing is broken by renaming this file - https://huggingface.co/spaces/coyotte508/hub-sha256 - and it seems fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks nice!
packages/hub/src/lib/commit.ts
Outdated
const allOpsToCommit = [...deleteOps, ...fileOpsToCommit]; | ||
if (allOpsToCommit.length === 0) { | ||
const res: Response = await (params.fetch ?? fetch)( | ||
`${params?.hubUrl || HUB_URL}/api/${repoId.type}s/${repoId.name}`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes we should create one - in another PR - with expand
handling too (getModelInfo
, ...)
But here I think we should include the sha directly in the preupload
response. Given we give already oid
for individual files, and those are the oid
in the context a specific commit from the remote, we should add the sha in the preupload
response.
That way - no need for all this extra code here - cc @Wauplin as well
Co-authored-by: Eliott C. <[email protected]>
@@ -459,6 +467,38 @@ export async function* commitIter(params: CommitParams): AsyncGenerator<CommitPr | |||
|
|||
yield { event: "phase", phase: "committing" }; | |||
|
|||
const allOpsToCommit = [...deleteOps, ...fileOpsToCommit]; | |||
if (allOpsToCommit.length === 0) { | |||
if (!currentCommitOid) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, we currently don't return early or throw an error in the case where operations
is empty, meaning it's possible to get there without calling the preupload
endpoint before, and thus currentCommitOid
can still be undefined
here.
So I kept this for now; the other alternative would be to return an object with a different different shape or throw an error if there are no operations, but it would be breaking change - I'll let you decide @coyotte508
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine to throw on empty commits IMO, we can do a major version change (and maybe do some of https://github.com/huggingface/huggingface.js/milestone/2 before publishing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually if we upgrade to 1.0 we can add the null
return type and return null
on empty commit
(And it's better than throwing an error, at least it will be caught by TS)
Cousin of huggingface/huggingface_hub#2389
Do I need to add tests somewhere? x)
(disclaimer: I obviously haven't tried it yet, but theoretically, it should work!)