Concatenate and merge info, preserving conflicts as lists. #691
Codecov Report
@@            Coverage Diff             @@
##           master     #691      +/-   ##
==========================================
+ Coverage   91.85%   91.86%   +0.01%
==========================================
  Files          60       60
  Lines        4222     4229       +7
==========================================
+ Hits         3878     3885       +7
  Misses        344      344
Continue to review full report at Codecov.
Hmm, I am worried that the conflict resolution would lead to inconsistent/confusing results. I wonder if we could combine it with #643 by always making a list, even when the values are identical?
This is a very nice feature, but I share @arcondello's concern about consistency.
I suggest adding an optional info_merge_strategy argument (or similar) which would control how info fields are combined:
- "squeeze" could be your current implementation,
- "list" would always form lists,
- "drop" would skip the merge altogether,
- "inplace" or "recursive" would recursively combine all info dicts,
- a custom callable would allow merge delegation to the user's custom function of n dict args.
An inplace/recursive merge might be overkill, so we can drop that one for now.
In addition, you'd probably want to make copies before merging them, due to dict's mutability.
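To make the suggestion concrete, here is a rough sketch of how such a dispatch could look. It is not dimod's API: `merge_info` and its `strategy` parameter are hypothetical names, "squeeze" reflects only one reading of the current PR behaviour (equal values collapse, conflicting values become a list), and it assumes info values compare with `==` to a plain bool (so NumPy arrays would need special handling).

```python
from copy import deepcopy


def merge_info(infos, strategy="squeeze"):
    """Hypothetical helper: combine a sequence of info dicts.

    strategy is "squeeze", "list", "drop", or a callable taking n dicts.
    """
    if callable(strategy):
        # Delegate to the user's function; pass copies so it cannot
        # mutate the original samplesets' info dicts.
        return strategy(*(deepcopy(info) for info in infos))

    if strategy == "drop":
        # Skip the merge altogether (matches the current "info is ignored").
        return {}

    if strategy not in ("squeeze", "list"):
        raise ValueError(f"unknown strategy: {strategy!r}")

    # Collect every value seen for each key, in sampleset order.
    collected = {}
    for info in infos:
        for key, value in info.items():
            collected.setdefault(key, []).append(deepcopy(value))

    if strategy == "list":
        return collected

    # "squeeze": keep a single value when all values agree,
    # otherwise keep the list of conflicting values.
    merged = {}
    for key, values in collected.items():
        first = values[0]
        if all(value == first for value in values[1:]):
            merged[key] = first
        else:
            merged[key] = values
    return merged
```

With this shape, "drop" keeps today's behaviour available, and a callable such as `lambda a, b: {**a, **b}` covers any case the named strategies do not.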
I was considering one option flag, e.g. I see now how "drop" would allow compatibility with the current implementation, but I'm not sure if adding more flexibility is necessary.
I prefer a categorical option to a boolean because it allows future expansion. Accepting a callable (in addition) is trivial and provides absolute flexibility.
One reason I didn't originally implement "list" is because I had to decide whether lists are always I think I'd go for the latter, even though it wouldn't allow tracking.
IMO, "list" should include fields from all samplesets. We might consider splitting "list" into "list-all" and "list-existing", although "squeeze" is conceptually already very similar to "list-existing". @arcondello, thoughts?
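For illustration, one possible reading of the two variants is sketched below. The function names and the `missing` placeholder are assumptions, not anything in dimod: "list-all" keeps one slot per sampleset so values can be traced back to their source, while "list-existing" keeps only the values that are actually present.

```python
def list_all(infos, missing=None):
    """Hypothetical "list-all": one entry per sampleset for every key.

    Samplesets lacking a key contribute the `missing` placeholder, so
    position i in each list always refers to sampleset i.
    """
    # dict.fromkeys preserves first-seen key order and removes duplicates.
    keys = dict.fromkeys(key for info in infos for key in info)
    return {key: [info.get(key, missing) for info in infos] for key in keys}


def list_existing(infos):
    """Hypothetical "list-existing": only values that are present.

    Lists can be shorter than the number of samplesets, so the source
    sampleset of each value cannot be recovered from its position.
    """
    merged = {}
    for info in infos:
        for key, value in info.items():
            merged.setdefault(key, []).append(value)
    return merged
```

The trade-off is the one raised above: the placeholder form allows tracking at the cost of padded lists, and it cannot distinguish a missing field from a field whose value equals the placeholder.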
I don't see how
In case there is interest in merging info when concatenating samplesets: right now all info is ignored.
This PR preserves conflicting values by listing them, but squeezes identical values into a single value.