-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify export_file_compute_checksum_md5 function #411
Simplify export_file_compute_checksum_md5 function #411
Conversation
usefile = Path(tmpdir.name) / "tmpfile" | ||
|
||
assert usefile is not None | ||
export_file(obj, usefile, extension, flag=flag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If export_file(...)
raises an Exception, and we are using a tmp file it wont get cleaned up. The new code will do so by usage of the context manger.
@@ -906,17 +906,14 @@ def export( | |||
|
|||
obj = self._check_obj_if_file(obj) | |||
logger.info("Export to file and compute MD5 sum, using flag: <%s>", useflag) | |||
toutfile, md5 = export_file_compute_checksum_md5( | |||
# inject md5 checksum in metadata | |||
metadata["file"]["checksum_md5"] = export_file_compute_checksum_md5( | |||
obj, | |||
outfile, | |||
outfile.suffix, | |||
flag=useflag, # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs more investigation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice. Is export_file_compute_checksum_md5
even needed as its own function now?
Atm. i think yes. But i think we should be able to get this hash from a in memory stream insted. |
A more thorough go-through of the whole checksum is perhaps needed. We are discussing need for enabling consumers (i.e. REP in this case) to use it to verify correct data in their end, which means they would have to be able to reproduce it regardless of storage (so checksum of data cannot depend on the specific file format is is stored in). Also I cannot find a single test for this (which means we never tested this?). I assume current changes will have no impact on output, but would be nice to confirm with tests (and in the process increase coverage). |
Addresses #410