From 935cdce8c6ff13302b773b8332df981c5dfb5d5e Mon Sep 17 00:00:00 2001
From: Doug Branton
Date: Mon, 13 May 2024 10:45:50 -0700
Subject: [PATCH 1/7] revise intro

---
 README.md                                     |  5 +-
 docs/index.rst                                | 51 ++++++++-----------
 docs/{notebooks.rst => tutorials.rst}         |  4 +-
 docs/{notebooks => tutorials}/README.md       |  0
 docs/{notebooks => tutorials}/low_level.ipynb |  0
 5 files changed, 25 insertions(+), 35 deletions(-)
 rename docs/{notebooks.rst => tutorials.rst} (61%)
 rename docs/{notebooks => tutorials}/README.md (100%)
 rename docs/{notebooks => tutorials}/low_level.ipynb (100%)

diff --git a/README.md b/README.md
index 05b70fd..8fbc77e 100644
--- a/README.md
+++ b/README.md
@@ -1,10 +1,11 @@
 # nested-pandas

-Efficient pandas representation for nested associated datasets.
+An extension of pandas for efficient representation of nested
+associated datasets.

 Nested-Pandas extends the [pandas](https://pandas.pydata.org/) package with
 tooling and support for nested dataframes packed into values of top-level
 dataframe columns. [Pyarrow](https://arrow.apache.org/docs/python/index.html)
-is used intrinsically to aid in scalability and performance.
+is used internally to aid in scalability and performance.

 ![image](./nestedframe.png)

diff --git a/docs/index.rst b/docs/index.rst
index aefa712..6fd1647 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -2,45 +2,34 @@
    You can adapt this file completely to your liking, but it should at least
    contain the root `toctree` directive.

-Welcome to nested-pandas's documentation!
-========================================================================================
+Nested-Pandas
+=============

-Dev Guide - Getting Started
----------------------------
+An extension of pandas for efficient representation of nested
+associated datasets.

-Before installing any dependencies or writing code, it's a great idea to create a
-virtual environment. LINCC-Frameworks engineers primarily use `conda` to manage virtual
-environments.
If you have conda installed locally, you can run the following to
-create and activate a new environment.
+Nested-Pandas extends the [pandas](https://pandas.pydata.org/) package with
+tooling and support for nested dataframes packed into values of top-level
+dataframe columns. [Pyarrow](https://arrow.apache.org/docs/python/index.html)
+is used internally to aid in scalability and performance.

-.. code-block:: console
+![image](../nestedframe.png)

-   >> conda create env -n python=3.10
-   >> conda activate
+How to Use This Guide
+=====================

+Begin with the :doc:`Getting Started `
+guide to learn the basics of installation and walkthrough a simple example of
+using nested-pandas.

-Once you have created a new environment, you can install this project for local
-development using the following commands:
+The :doc:`Tutorials `_
+section showcases the fundamental features of nested-pandas.

-.. code-block:: console
-
-   >> pip install -e .'[dev]'
-   >> pre-commit install
-   >> conda install pandoc
-
-
-Notes:
-
-1) The single quotes around ``'[dev]'`` may not be required for your operating system.
-2) ``pre-commit install`` will initialize pre-commit for this local repository, so
-   that a set of tests will be run prior to completing a local commit. For more
-   information, see the Python Project Template documentation on
-   `pre-commit `_.
-3) Installing ``pandoc`` allows you to verify that automatic rendering of Jupyter notebooks
-   into documentation for ReadTheDocs works as expected. For more information, see
-   the Python Project Template documentation on
-   `Sphinx and Python Notebooks `_.
+API-level information about nested-pandas is viewable in the
+:doc:`API Reference `
+section.

+Learn more about contributing to this repository in our :doc:`Contribution Guide `.

 ..
toctree::
    :hidden:

diff --git a/docs/notebooks.rst b/docs/tutorials.rst
similarity index 61%
rename from docs/notebooks.rst
rename to docs/tutorials.rst
index 7a3f9b3..606bd01 100644
--- a/docs/notebooks.rst
+++ b/docs/tutorials.rst
@@ -1,6 +1,6 @@
-Notebooks
+Tutorials
 ========================================================================================

 .. toctree::

-    Lower-level interfaces
+    Lower-level interfaces
diff --git a/docs/notebooks/README.md b/docs/tutorials/README.md
similarity index 100%
rename from docs/notebooks/README.md
rename to docs/tutorials/README.md
diff --git a/docs/notebooks/low_level.ipynb b/docs/tutorials/low_level.ipynb
similarity index 100%
rename from docs/notebooks/low_level.ipynb
rename to docs/tutorials/low_level.ipynb

From 25a1550cdb43a94dfdee299b107f4fddb77346fa Mon Sep 17 00:00:00 2001
From: Doug Branton
Date: Mon, 13 May 2024 10:50:18 -0700
Subject: [PATCH 2/7] fix image

---
 docs/index.rst | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/docs/index.rst b/docs/index.rst
index 6fd1647..8e1e9fe 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -13,7 +13,10 @@ tooling and support for nested dataframes packed into values of top-level
 dataframe columns. [Pyarrow](https://arrow.apache.org/docs/python/index.html)
 is used internally to aid in scalability and performance.

-![image](../nestedframe.png)
+.. image:: ../nestedframe.png
+  :width: 800
+  :alt: Example NestedFrame
+

 How to Use This Guide
 =====================
@@ -22,7 +25,7 @@ Begin with the :doc:`Getting Started `
 guide to learn the basics of installation and walkthrough a simple example of
 using nested-pandas.

-The :doc:`Tutorials `_
+The :doc:`Tutorials `
 section showcases the fundamental features of nested-pandas.
API-level information about nested-pandas is viewable in the

From 5937e8dac2b590e40a1d89160e98883c97339209 Mon Sep 17 00:00:00 2001
From: Doug Branton
Date: Mon, 13 May 2024 10:53:43 -0700
Subject: [PATCH 3/7] try smaller width

---
 docs/index.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/index.rst b/docs/index.rst
index 8e1e9fe..e1390ee 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -14,7 +14,7 @@ dataframe columns. [Pyarrow](https://arrow.apache.org/docs/python/index.html)
 is used internally to aid in scalability and performance.

 .. image:: ../nestedframe.png
-  :width: 800
+  :width: 600
   :alt: Example NestedFrame


From 5856de3b9c18cd357ee754a2806ce0d2ae814327 Mon Sep 17 00:00:00 2001
From: Doug Branton
Date: Mon, 13 May 2024 10:56:38 -0700
Subject: [PATCH 4/7] try centering

---
 docs/index.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/index.rst b/docs/index.rst
index e1390ee..284eaf8 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -15,6 +15,7 @@ is used internally to aid in scalability and performance.

 .. image:: ../nestedframe.png
   :width: 600
+  :align: center
   :alt: Example NestedFrame


From cdd01e3eb7798335139baf7dd39068422e7020ef Mon Sep 17 00:00:00 2001
From: Doug Branton
Date: Mon, 13 May 2024 11:01:25 -0700
Subject: [PATCH 5/7] fix external links

---
 docs/index.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/index.rst b/docs/index.rst
index 284eaf8..e26ba6a 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -8,9 +8,9 @@ Nested-Pandas
 An extension of pandas for efficient representation of nested
 associated datasets.

-Nested-Pandas extends the [pandas](https://pandas.pydata.org/) package with
+Nested-Pandas extends the `pandas `_ package with
 tooling and support for nested dataframes packed into values of top-level
-dataframe columns. [Pyarrow](https://arrow.apache.org/docs/python/index.html)
+dataframe columns. `Pyarrow `_
 is used internally to aid in scalability and performance.

 ..
image:: ../nestedframe.png

From 35c328099d824df00452071013246faf4b007c36 Mon Sep 17 00:00:00 2001
From: Doug Branton
Date: Mon, 13 May 2024 11:07:07 -0700
Subject: [PATCH 6/7] add additional readme section

---
 docs/index.rst | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/docs/index.rst b/docs/index.rst
index e26ba6a..ec18c6f 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -18,6 +18,16 @@ is used internally to aid in scalability and performance.
   :align: center
   :alt: Example NestedFrame

+Nested-Pandas is motivated by time-domain astronomy use cases, where we see
+typically two levels of information, information about astronomical objects and
+then an associated set of `N` measurements of those objects. Nested-Pandas offers
+a performant and memory-efficient package for working with these types of datasets.
+
+Core advantages being:
+* hierarchical column access
+* efficient packing of nested information into inputs to custom user functions
+* avoiding costly groupby operations
+

 How to Use This Guide
 =====================

From 9871e56a5ff0656d32dd1f78d179b2777c4d189d Mon Sep 17 00:00:00 2001
From: Doug Branton
Date: Mon, 13 May 2024 11:11:11 -0700
Subject: [PATCH 7/7] fix bullets

---
 docs/index.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/index.rst b/docs/index.rst
index ec18c6f..697e1ef 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -24,6 +24,7 @@ then an associated set of `N` measurements of those objects. Nested-Pandas offer
 a performant and memory-efficient package for working with these types of datasets.

 Core advantages being:
+
 * hierarchical column access
 * efficient packing of nested information into inputs to custom user functions
 * avoiding costly groupby operations