
Design Document

Scott Henderson edited this page Oct 8, 2024 · 1 revision

Welcome to the coincident wiki!

Initial design plan and I/O

  • Generally, this package should follow a 'geopandas-centric' workflow for metadata: functions return GeoDataFrames and operate either on whole dataframes or on individual rows.

  • When possible, there should be a 1-to-1 mapping between a dataframe row and a pystac.Item. Using stac-geoparquet as a dependency should make this fairly easy, since STAC API search results can be read into a dataframe with all the STAC properties, links, assets, etc. After that it's just a matter of reading and writing with geopandas.read_parquet and GeoDataFrame.to_parquet. This should make it possible to take advantage of the many great tools in the STAC ecosystem (https://github.com/stac-utils). If a dataset doesn't already have STAC metadata, we could just add the minimum required properties (https://github.com/radiantearth/stac-spec/blob/master/item-spec/item-spec.md) or write a proper parser (https://stactools-packages.github.io).

  • It would be good to keep track of the original polygons as well as the spatial subsets produced by gpd.overlay. This could be done either with a separate dataframe or by saving the original geometry in a second column before the overlay (https://geopandas.org/en/stable/docs/user_guide/data_structures.html#geodataframe).

  • The initial I/O focus will be on returning data as GeoDataFrames (from sliderule) and as xarray objects (using rioxarray for simple rasters and odc.stac for larger datacubes). For working with multiple rasters in different projections, and for combined workflows, xvec will likely be useful. If things need to be saved to disk or object storage, go with geoparquet or zarr.
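
As a rough illustration of the "minimum required properties" idea for datasets without STAC metadata, here is a sketch that builds a bare-bones STAC Item as a plain dictionary using only the standard library. The helper name minimal_stac_item, the example id, and the footprint coordinates are hypothetical and not part of this package; the field list follows the Item spec linked above, and a real implementation would more likely go through pystac.

```python
import json
from datetime import datetime, timezone


def minimal_stac_item(item_id, geometry, when, stac_version="1.0.0"):
    """Return a dict with only the top-level fields the STAC Item spec requires.

    Hypothetical helper for illustration; assumes `geometry` is a GeoJSON Polygon
    and `when` is a timezone-aware datetime.
    """
    # Use the exterior ring of the polygon to derive the bounding box.
    ring = geometry["coordinates"][0]
    xs = [pt[0] for pt in ring]
    ys = [pt[1] for pt in ring]
    return {
        "type": "Feature",
        "stac_version": stac_version,
        "id": item_id,
        "geometry": geometry,
        "bbox": [min(xs), min(ys), max(xs), max(ys)],
        # `datetime` is the one required property; it is stored as UTC.
        "properties": {
            "datetime": when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        },
        "links": [],
        "assets": {},
    }


# Made-up footprint, used only to exercise the helper.
footprint = {
    "type": "Polygon",
    "coordinates": [
        [[-107.0, 38.0], [-106.0, 38.0], [-106.0, 39.0], [-107.0, 39.0], [-107.0, 38.0]]
    ],
}
item = minimal_stac_item(
    "example-footprint", footprint, datetime(2024, 10, 8, tzinfo=timezone.utc)
)
print(json.dumps(item, indent=2))
```

Each such dictionary would correspond to one dataframe row, which is what keeps the 1-to-1 row-to-Item mapping straightforward.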

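One possible way to keep the original polygons alongside the gpd.overlay subsets is sketched below. It assumes the original geometry is stored as a WKT string column before the overlay (one variant of the "second column" option); the column name original_wkt, the toy footprints, and the EPSG:32613 CRS are made up for illustration.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy footprints and an area of interest in a projected CRS (units are meters).
footprints = gpd.GeoDataFrame(
    {"id": ["a", "b"]},
    geometry=[box(0, 0, 2000, 2000), box(1000, 1000, 3000, 3000)],
    crs="EPSG:32613",
)
aoi = gpd.GeoDataFrame(
    {"aoi": ["site"]},
    geometry=[box(1000, 0, 2000, 3000)],
    crs="EPSG:32613",
)

# Save the original polygon as WKT so it survives the overlay as a regular column.
footprints["original_wkt"] = footprints.geometry.to_wkt()

# The active geometry column now holds only the part of each footprint inside the AOI.
subset = gpd.overlay(footprints, aoi, how="intersection")

# The original polygons can be recovered from the WKT column when needed.
originals = gpd.GeoSeries.from_wkt(subset["original_wkt"], crs=subset.crs)

print(subset[["id", "aoi"]])
print(subset.geometry.area / originals.area)  # fraction of each footprint kept
```

Storing WKT keeps a single active geometry per row, which may be simpler to write to geoparquet; a separate dataframe keyed on id would work just as well.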