Add Notebook for Loading Data to NestedPandas #85

Merged
wilsonbb merged 4 commits into main from data_notebook on May 17, 2024

wilsonbb (Contributor) commented May 16, 2024

Change Description

Adds a tutorial notebook that demonstrates data loading for NestedPandas, either from a dictionary or from one or more parquet files. It also demonstrates that an existing NestedFrame can be saved either to a single parquet file or to multiple parquet files (one for each layer).

Addresses #68

  • My PR includes a link to the issue that I am addressing

Documentation Change Checklist

codecov bot commented May 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.68%. Comparing base (3dea29f) to head (e59c6f9).
Report is 72 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #85   +/-   ##
=======================================
  Coverage   98.68%   98.68%           
=======================================
  Files          15       15           
  Lines         836      836           
=======================================
  Hits          825      825           
  Misses         11       11           

github-actions bot commented May 16, 2024

Before [3dea29f]   After [5f446f9]   Ratio   Benchmark (Parameter)
60.7±3ms           67.9±2ms          ~1.12   benchmarks.ReassignHalfOfNestedSeries.time_run
9.10±0.3ms         9.45±0.1ms        1.04    benchmarks.NestedFrameAddNested.time_run
6.53±0.1ms         6.75±0.2ms        1.03    benchmarks.NestedFrameQuery.time_run
272M               278M              1.02    benchmarks.ReassignHalfOfNestedSeries.peakmem_run
255M               258M              1.01    benchmarks.AssignSingleDfToNestedSeries.peakmem_run
33.6±4ms           33.8±2ms          1.01    benchmarks.AssignSingleDfToNestedSeries.time_run
89.6M              90.9M             1.01    benchmarks.NestedFrameQuery.peakmem_run
86.2M              86.4M             1.00    benchmarks.NestedFrameAddNested.peakmem_run
89.6M              89.6M             1.00    benchmarks.NestedFrameReduce.peakmem_run
5.36±0.2ms         5.36±0.05ms       1.00    benchmarks.NestedFrameReduce.time_run

wilsonbb marked this pull request as ready for review May 16, 2024 23:54
wilsonbb requested a review from dougbrn May 16, 2024 23:54

dougbrn (Collaborator) left a comment

Looks great, just found two minor nits

wilsonbb merged commit 33a3e6f into main May 17, 2024
11 checks passed
wilsonbb deleted the data_notebook branch May 17, 2024 22:01