Merge pull request #56 from kthyng/update_intake
Update intake to v2
kthyng authored Jul 19, 2024
2 parents 75ff058 + 7ba8a34 commit e862aa0
Showing 29 changed files with 714 additions and 974 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/test.yaml
@@ -14,7 +14,7 @@ jobs:
fail-fast: false
matrix:
os: ["macos-latest", "ubuntu-latest", "windows-latest"]
python-version: ["3.8", "3.9", "3.10"]
python-version: ["3.9", "3.10", "3.11"]
steps:
- name: Checkout source
uses: actions/checkout@v2
28 changes: 11 additions & 17 deletions .pre-commit-config.yaml
@@ -28,18 +28,12 @@ repos:
exclude: docs/conf.py
args: [--max-line-length=105 ]

- repo: https://github.com/pre-commit/mirrors-isort
rev: v5.10.1
- repo: https://github.com/pycqa/isort
rev: 5.12.0
hooks:
- id: isort
additional_dependencies: [toml]
exclude: ^(docs|setup.py)
args: [--project=gcm_filters, --multi-line=3, --lines-after-imports=2, --lines-between-types=1, --trailing-comma, --force-grid-wrap=0, --use-parentheses, --line-width=88]

- repo: https://github.com/asottile/seed-isort-config
rev: v2.2.0
hooks:
- id: seed-isort-config
- id: isort
name: isort (python)
args: ["--profile", "black", "--filter-files", "--lines-after-imports=2", "--project=gcm_filters", "--multi-line=3", "--lines-between-types=1", "--trailing-comma", "--force-grid-wrap=0", "--use-parentheses", "--line-width=88"]

- repo: https://github.com/psf/black
rev: 22.10.0
@@ -56,9 +50,9 @@ repos:
exclude: docs/source/conf.py
args: [--ignore-missing-imports]

# - repo: https://github.com/codespell-project/codespell
# rev: v1.16.0
# hooks:
# - id: codespell
# args:
# - --quiet-level=2
- repo: https://github.com/codespell-project/codespell
rev: v2.1.0
hooks:
- id: codespell
args:
- --quiet-level=2
8 changes: 4 additions & 4 deletions .readthedocs.yaml
@@ -12,10 +12,10 @@ build:
# uncomment to build from this exact version of package
# the downside is the version listed in the docs will be a dev version
# if uncommenting this, comment out installing pypi version of package in docs/env file
# python:
# install:
# - method: pip
# path: ./
python:
install:
- method: pip
path: ./

conda:
environment: docs/environment.yml
9 changes: 0 additions & 9 deletions MANIFEST.in

This file was deleted.

142 changes: 44 additions & 98 deletions README.md
@@ -24,15 +24,13 @@ For changes prior to 2022-10-19, all contributions are Copyright James Munroe, s



Intake is a lightweight set of tools for loading and sharing data in data
science projects. Intake ERDDAP provides a set of integrations for ERDDAP.
Intake is a lightweight set of tools for loading and sharing data in data science projects. Intake ERDDAP provides a set of integrations for ERDDAP.

- Quickly identify all datasets from an ERDDAP service in a geographic region,
or containing certain variables.
- Quickly identify all datasets from an ERDDAP service in a geographic region, or containing certain variables.
- Produce a pandas DataFrame for a given dataset or query.
- Get an xarray Dataset for the Gridded datasets.

The Key features are:
The key features are:

- Pandas DataFrames for any TableDAP dataset.
- xarray Datasets for any GridDAP datasets.
@@ -59,7 +57,7 @@ project is available on PyPI, so it can be installed using `pip`
The following are prerequisites for a developer environment for this project:

- [conda](https://docs.conda.io/en/latest/miniconda.html)
- (optional but highly recommended) [mamba](https://mamba.readthedocs.io/en/latest/) Hint: `conda install -c conda-forge mamba`
- (optional but highly recommended) [mamba](https://mamba.readthedocs.io/en/latest/). Hint: `conda install -c conda-forge mamba`

Note: if `mamba` isn't installed, replace all instances of `mamba` in the following instructions with `conda`.

@@ -83,126 +81,74 @@ Note: if `mamba` isn't installed, replace all instances of `mamba` in the follow
pip install -e .
```

Note that you need to install with `pip install .` once to get the `entry_points` correct too.

## Examples

To create an intake catalog for all of the ERDDAP's TableDAP offerings use:
To create an `intake` catalog for all of the ERDDAP's TableDAP offerings use:

```python
import intake
catalog = intake.open_erddap_cat(
import intake_erddap
catalog = intake_erddap.ERDDAPCatalogReader(
server="https://erddap.sensors.ioos.us/erddap"
)
).read()
```


The catalog objects behave like a dictionary with the keys representing the
dataset's unique identifier within ERDDAP, and the values being the
`TableDAPSource` objects. To access a source object:
The catalog objects behave like a dictionary with the keys representing the dataset's unique identifier within ERDDAP, and the values being the `TableDAPReader` objects. To access a Reader object (for a single dataset, in this case for dataset_id "aoos_204"):

```python
source = catalog["datasetid"]
dataset = catalog["aoos_204"]
```

From the source object, a pandas DataFrame can be retrieved:
From the reader object, a pandas DataFrame can be retrieved:

```python
df = source.read()
df = dataset.read()
```

Find other dataset_ids available with

```python
list(catalog)
```
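
The dictionary-like access pattern described above can be tried without a network connection. This is only a sketch: `FakeReader` and `FakeCatalog` are hypothetical stand-ins, not classes from `intake_erddap`, and they mimic just the shape of the interface (keys are dataset IDs, values are objects with a `read()` method).

```python
from collections.abc import Mapping


class FakeReader:
    """Hypothetical stand-in for a per-dataset reader object."""

    def __init__(self, rows):
        self._rows = rows

    def read(self):
        # A real reader would request data from an ERDDAP server;
        # this stand-in just returns the canned rows it was given.
        return list(self._rows)


class FakeCatalog(Mapping):
    """Hypothetical stand-in for a catalog keyed by dataset ID."""

    def __init__(self, readers):
        self._readers = dict(readers)

    def __getitem__(self, dataset_id):
        return self._readers[dataset_id]

    def __iter__(self):
        return iter(self._readers)

    def __len__(self):
        return len(self._readers)


catalog = FakeCatalog({"aoos_204": FakeReader([{"time": "2022-01-01T00:00:00Z"}])})
dataset_ids = list(catalog)  # iterating a mapping yields its keys
rows = catalog[dataset_ids[0]].read()  # look up a reader by ID, then read it
```

Subclassing `Mapping` provides `keys()`, `values()`, and `items()` as well, so the stand-in supports the same lookups as a plain dictionary.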

Consider a case where you need to find all wind data near Florida:

```python
import intake
import intake_erddap
from datetime import datetime
bbox = (-87.84, 24.05, -77.11, 31.27)
catalog = intake.open_erddap_cat(
catalog = intake_erddap.ERDDAPCatalogReader(
server="https://erddap.sensors.ioos.us/erddap",
bbox=bbox,
intersection="union",
start_time=datetime(2022, 1, 1),
end_time=datetime(2023, 1, 1),
standard_names=["wind_speed", "wind_from_direction"],
)
variables=["wind_speed", "wind_from_direction"],
).read()

df = next(catalog.values()).read()
dataset_id = list(catalog)[0]
print(dataset_id)
df = catalog[dataset_id].read()
```

Using the `standard_names` input with `intersection="union"` searches for datasets that have both "wind_speed" and "wind_from_direction". Using the `variables` input subsequently narrows the dataset to only those columns, plus "time", "latitude", "longitude", and "z".
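
The column narrowing described above can be illustrated with plain Python. This is a sketch of the stated behavior, not the library's implementation: `narrow_columns` is a hypothetical helper, the column labels other than those in the sample output are made up, and the `name (units)` label format is an assumption drawn from that output.

```python
# Coordinate columns kept alongside the requested variables (per the text above).
ALWAYS_KEPT = ("time", "latitude", "longitude", "z")


def narrow_columns(columns, variables):
    """Keep columns whose base name is a coordinate or a requested variable.

    Labels are assumed to look like "wind_speed (m.s-1)": a variable name
    optionally followed by units in parentheses.
    """
    wanted = set(ALWAYS_KEPT) | set(variables)
    return [col for col in columns if col.split(" (")[0] in wanted]


# Hypothetical column labels for a single dataset.
columns = [
    "time (UTC)",
    "latitude (degrees_north)",
    "longitude (degrees_east)",
    "z (m)",
    "air_temperature (degree_Celsius)",
    "wind_speed (m.s-1)",
    "wind_from_direction (degrees)",
]
kept = narrow_columns(columns, ["wind_speed", "wind_from_direction"])
```

With two requested variables plus the four coordinate columns, `kept` holds six labels, which matches the six-column shape of the sample output.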

<table class="align-default">
<thead>
<tr style="text-align: right;">
<th></th>
<th>time (UTC)</th>
<th>wind_speed (m.s-1)</th>
<th>wind_from_direction (degrees)</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>2022-12-14T19:40:00Z</td>
<td>7.0</td>
<td>140.0</td>
</tr>
<tr>
<th>1</th>
<td>2022-12-14T19:20:00Z</td>
<td>7.0</td>
<td>120.0</td>
</tr>
<tr>
<th>2</th>
<td>2022-12-14T19:10:00Z</td>
<td>NaN</td>
<td>NaN</td>
</tr>
<tr>
<th>3</th>
<td>2022-12-14T19:00:00Z</td>
<td>9.0</td>
<td>130.0</td>
</tr>
<tr>
<th>4</th>
<td>2022-12-14T18:50:00Z</td>
<td>9.0</td>
<td>130.0</td>
</tr>
<tr>
<th>...</th>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
<tr>
<th>48296</th>
<td>2022-01-01T00:40:00Z</td>
<td>4.0</td>
<td>120.0</td>
</tr>
<tr>
<th>48297</th>
<td>2022-01-01T00:30:00Z</td>
<td>3.0</td>
<td>130.0</td>
</tr>
<tr>
<th>48298</th>
<td>2022-01-01T00:20:00Z</td>
<td>4.0</td>
<td>120.0</td>
</tr>
<tr>
<th>48299</th>
<td>2022-01-01T00:10:00Z</td>
<td>4.0</td>
<td>130.0</td>
</tr>
<tr>
<th>48300</th>
<td>2022-01-01T00:00:00Z</td>
<td>4.0</td>
<td>130.0</td>
</tr>
</tbody>
</table>
```python
time (UTC) latitude (degrees_north) ... wind_speed (m.s-1) wind_from_direction (degrees)
0 2022-01-01T00:00:00Z 28.508 ... 3.6 126.0
1 2022-01-01T00:10:00Z 28.508 ... 3.8 126.0
2 2022-01-01T00:20:00Z 28.508 ... 3.6 124.0
3 2022-01-01T00:30:00Z 28.508 ... 3.4 125.0
4 2022-01-01T00:40:00Z 28.508 ... 3.5 124.0
... ... ... ... ... ...
52524 2022-12-31T23:20:00Z 28.508 ... 5.9 176.0
52525 2022-12-31T23:30:00Z 28.508 ... 6.8 177.0
52526 2022-12-31T23:40:00Z 28.508 ... 7.2 175.0
52527 2022-12-31T23:50:00Z 28.508 ... 7.4 169.0
52528 2023-01-01T00:00:00Z 28.508 ... 8.1 171.0

[52529 rows x 6 columns]
```
5 changes: 4 additions & 1 deletion ci/environment-py3.10.yml
@@ -3,13 +3,14 @@ channels:
- conda-forge
dependencies:
- python=3.10
- appdirs
- fsspec
- numpy
- dask
- pandas
- erddapy
- panel
- intake
- intake-xarray>=0.6.1
- pytest
- pytest-cov
- isort
@@ -19,6 +20,8 @@ dependencies:
- mypy
- codecov
- coverage[toml]
- xarray
- pip
- pip:
- git+https://github.com/intake/intake
- cf-pandas
9 changes: 6 additions & 3 deletions ci/environment-py3.8.yml → ci/environment-py3.11.yml
@@ -2,14 +2,15 @@ name: test-env
channels:
- conda-forge
dependencies:
- python=3.8
- python=3.11
- appdirs
- fsspec
- numpy
- dask
- pandas
- erddapy
- panel
- intake
- intake-xarray>=0.6.1
# - intake
- pytest
- pytest-cov
- isort
@@ -19,6 +20,8 @@ dependencies:
- mypy
- codecov
- coverage[toml]
- xarray
- pip
- pip:
- git+https://github.com/intake/intake
- cf-pandas
7 changes: 5 additions & 2 deletions ci/environment-py3.9.yml
@@ -3,13 +3,14 @@ channels:
- conda-forge
dependencies:
- python=3.9
- appdirs
- numpy
- dask
- pandas
- erddapy
- fsspec
- panel
- intake
- intake-xarray>=0.6.1
# - intake
- pytest
- pytest-cov
- isort
@@ -19,6 +20,8 @@ dependencies:
- mypy
- codecov
- coverage[toml]
- xarray
- pip
- pip:
- git+https://github.com/intake/intake
- cf-pandas
29 changes: 7 additions & 22 deletions docs/api.rst
@@ -2,27 +2,12 @@
``intake-erddap`` Python API
=============================

.. toctree::
:maxdepth: 2
:caption: Documentation
.. currentmodule:: intake_erddap

.. autosummary::
:toctree: generated/
:recursive:

``intake-erddap`` catalog
-------------------------


.. autoclass:: intake_erddap.erddap_cat.ERDDAPCatalog
:members: get_client, get_search_urls

``intake-erddap`` source
------------------------


.. autoclass:: intake_erddap.erddap.ERDDAPSource
:members: get_client

.. autoclass:: intake_erddap.erddap.TableDAPSource
:members: read, read_partition, read_chunked

.. autoclass:: intake_erddap.erddap.GridDAPSource
:members: read_partition, read_chunked, to_dask, close
ERDDAPCatalogReader
TableDAPReader
GridDAPReader