Add NestedFrame.reduce #32
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #32      +/-   ##
==========================================
+ Coverage   95.47%   95.63%   +0.15%
==========================================
  Files          14       14
  Lines         597      618      +21
==========================================
+ Hits          570      591      +21
  Misses         27       27
Looks good! I don't have any technical blockers, just a couple of un-actionable thoughts.
Change Description
To address #14, here we provide an easy way for users to apply a custom function to a NestedFrame, where the user-provided function is applied to each row of the base layer of the NestedFrame along the specified columns (including the ability to specify nested columns packed into that row). The result of the function is then another nested frame, which can be worked with separately or joined back to the original frame.
We opted for the provisional name reduce since it passes "base" layer columns to the user function as scalars and hierarchical columns as collections, allowing for the "reduction" of the hierarchical data to a frame matching the indices of the base layer. For other cases of providing a custom function, the user can still use the apply API.
Solution Description
This solution wraps pandas.DataFrame.apply to apply the user function to each row of the nested frame. Columns of the "nested" frames are already packed as a dataframe within their respective cells of that row, so we do not need to unpack and flatten them when a user requests hierarchical columns.
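For illustration only, here is a minimal sketch of the row-wise pattern described above, written against plain pandas rather than the actual NestedFrame code. The helper name reduce_rows, the "layer.field" column syntax, and the example column names (a, nested, flux) are assumptions made for this sketch and are not taken from the PR.

```python
import numpy as np
import pandas as pd


def reduce_rows(frame, func, *columns):
    """Hypothetical helper: call func once per base-layer row.

    Base columns are passed as scalars; columns written as "layer.field"
    are read from the DataFrame packed in that row's cell and passed as arrays.
    """

    def _apply_row(row):
        args = []
        for col in columns:
            if "." in col:
                # Nested column: the cell already holds a DataFrame,
                # so no unpacking or flattening of the whole frame is needed.
                layer, field = col.split(".", 1)
                args.append(row[layer][field].to_numpy())
            else:
                # Base column: a single scalar for this row.
                args.append(row[col])
        # Returning a Series makes DataFrame.apply build a DataFrame result.
        return pd.Series(func(*args))

    return frame.apply(_apply_row, axis=1)


# Stand-in for a nested frame: one packed DataFrame per base row.
packed = np.empty(2, dtype=object)
packed[0] = pd.DataFrame({"flux": [1.0, 3.0]})
packed[1] = pd.DataFrame({"flux": [2.0, 4.0, 6.0]})
base = pd.DataFrame({"a": [1, 2], "nested": packed})


def scaled_mean(a, flux):
    # a is a scalar from the base layer, flux is an array from the nested layer.
    return {"scaled_mean": a * flux.mean()}


result = reduce_rows(base, scaled_mean, "a", "nested.flux")
```

The result has one row per base-layer row and shares the base index, so it can be joined back with base.join(result) if desired.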
Code Quality
New Feature Checklist