Add NestedFrame.reduce #32

Merged
merged 19 commits into from
Apr 18, 2024

Contributor

@wilsonbb wilsonbb commented Apr 18, 2024

Change Description

To address #14, here we provide an easy way for users to apply a custom function to a NestedFrame, where the user-provided function is applied to each row of the base layer of the NestedFrame along the specified columns (including the ability to specify nested columns packed into that row). The result of the function is then another nested frame which can be worked with separately or joined back to the original frame.

We opted for the provisional name reduce because the user function receives "base" layer columns as scalars and hierarchical columns as collections, allowing for the "reduction" of the hierarchical data to a frame matching the indices of the base layer. For other cases of providing a custom function, the user can still use the apply API.
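To make the calling convention concrete, here is a minimal pure-Python sketch of the idea, not the nested-pandas API. The data layout and the names `rows`, `scaled_mean_flux`, and `reduce_rows` are all hypothetical and chosen only for illustration.

```python
# Hypothetical stand-in data: each base row has a scalar "a" and a nested
# collection "flux". This is plain Python, not a NestedFrame.
rows = {
    0: {"a": 1.0, "flux": [10.0, 12.0, 14.0]},
    1: {"a": 2.0, "flux": [3.0, 5.0]},
}


def scaled_mean_flux(a, flux):
    """User function: `a` arrives as a scalar, `flux` as a collection."""
    return {"scaled_mean": a * sum(flux) / len(flux)}


def reduce_rows(rows, func, *columns):
    """Call `func` once per base row, passing the requested columns in order."""
    return {
        idx: func(*(row[col] for col in columns))
        for idx, row in rows.items()
    }


# One output record per base-row index: the nested data has been "reduced".
result = reduce_rows(rows, scaled_mean_flux, "a", "flux")
print(result)
```

The output has exactly one entry per base-layer index, which is the sense in which the hierarchical data is reduced.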

  • My PR includes a link to the issue that I am addressing

Solution Description

This solution wraps pandas.DataFrame.apply to apply the user function to each row of the nested frame. Note that columns in the "nested" frames are already packed as a dataframe within their respective cells of that row, so we do not need to unpack and flatten when a user requests hierarchical columns.
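The row-wise approach described above can be sketched with plain pandas. This is an illustration under stated assumptions, not the code merged in this PR: the helper `reduce_sketch`, the `"nested.flux"` dotted-name convention, and the example data are all hypothetical, and the nested cells are ordinary DataFrames stored in an object column.

```python
import numpy as np
import pandas as pd

# Assumption: emulate "packed" nested data by storing one small DataFrame
# per cell in an object array (filled element by element so NumPy keeps
# each DataFrame intact as a single object).
packed = np.empty(2, dtype=object)
packed[0] = pd.DataFrame({"t": [0, 1, 2], "flux": [10.0, 12.0, 14.0]})
packed[1] = pd.DataFrame({"t": [0, 1], "flux": [3.0, 5.0]})

base = pd.DataFrame(
    {"a": [1.0, 2.0], "nested": packed},
    index=pd.Index([0, 1], name="id"),
)


def reduce_sketch(frame, func, *columns):
    """Hypothetical helper: run `func` on each row via DataFrame.apply.

    A name such as "nested.flux" selects column "flux" from the DataFrame
    packed in the "nested" cell; any other name selects a base column.
    """

    def call_on_row(row):
        args = []
        for col in columns:
            if "." in col:
                layer, field = col.split(".", 1)
                # The cell already holds a DataFrame, so no flattening is needed.
                args.append(row[layer][field].to_numpy())
            else:
                args.append(row[col])
        return pd.Series(func(*args))

    return frame.apply(call_on_row, axis=1)


def flux_stats(a, flux):
    """Example user function: scalar `a`, array `flux`."""
    return {"scaled_mean": a * flux.mean(), "n_obs": len(flux)}


result = reduce_sketch(base, flux_stats, "a", "nested.flux")

# The result shares the base index, so it can be joined back if desired.
joined = base[["a"]].join(result)
```

Because each function call returns a `pd.Series`, `DataFrame.apply` assembles the per-row results into a frame indexed like the base layer.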

Code Quality

  • I have read the Contribution Guide
  • My code follows the code style of this project
  • My code builds (or compiles) cleanly without any errors or warnings
  • My code contains relevant comments and necessary documentation

New Feature Checklist

  • I have added or updated the docstrings associated with my feature using the NumPy docstring format
  • I have updated the tutorial to highlight my new feature (if appropriate)
  • I have added unit/End-to-End (E2E) test cases to cover my new feature
  • My change includes a breaking change
    • My change includes backwards compatibility and deprecation warnings (if possible)


codecov bot commented Apr 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.63%. Comparing base (ac4032f) to head (71594b5).

❗ Current head 71594b5 differs from pull request most recent head bbd14fd. Consider uploading reports for the commit bbd14fd to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #32      +/-   ##
==========================================
+ Coverage   95.47%   95.63%   +0.15%     
==========================================
  Files          14       14              
  Lines         597      618      +21     
==========================================
+ Hits          570      591      +21     
  Misses         27       27              



github-actions bot commented Apr 18, 2024

Before [ac4032f]   After [812bc2b]   Ratio   Benchmark (Parameter)
360                2.28k             6.33    benchmarks.mem_list
2.95±0.9s          2.84±1s           0.96    benchmarks.time_computation


@wilsonbb wilsonbb marked this pull request as ready for review April 18, 2024 22:46
@wilsonbb wilsonbb requested a review from dougbrn April 18, 2024 22:46
Collaborator

@dougbrn dougbrn left a comment

Looks good! I don't have any technical blockers, just a couple of un-actionable thoughts.

Review threads (all resolved):
src/nested_pandas/nestedframe/core.py (outdated)
src/nested_pandas/nestedframe/core.py
src/nested_pandas/nestedframe/core.py
@wilsonbb wilsonbb merged commit 025ad96 into main Apr 18, 2024
9 checks passed
@wilsonbb wilsonbb deleted the custom_func branch April 18, 2024 23:59
2 participants