Releases: Sydney-Informatics-Hub/geodata-harvester
v1.1.2
Release Notes: Geodata-Harvester v1.1.2
This is a minor release of Geodata-Harvester that includes only revisions to the JOSS paper. No changes have been made to the code base. This release is to be used in conjunction with the paper submitted to the Journal of Open Source Software.
Updates
Installation & Issue Reports
You can find the source code and installation instructions for Geodata-Harvester on our GitHub repository.
For any questions or issues, please refer to our issue tracker.
Thank you for using Geodata-Harvester!
v1.1.1
Release Notes: Geodata-Harvester v1.1.1
This minor release of Geodata-Harvester addresses two bugs identified in the previous version and updates the included notebooks to reflect the fixes. We recommend that all users upgrade to this version.
Bug Fixes
- Issue when importing an existing settings file using the widget notebook: This issue occurred while loading and executing a settings file via widgets in the notebook example_harvest_with_widgets.ipynb. The issue has been addressed and fixed (commit 5e5d98b).
- Problem in temporal aggregation of DEA data: The temporal aggregation of DEA data layers using the median resulted in -999 values due to missing data in the majority of the downloaded image frames. Since the missing data value is not listed in the data file header of the corresponding DEA sources, we had to include additional checks for masking missing values before aggregation. This problem has been fixed in temporal.py (commit 8edf44c).
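The effect of the DEA aggregation bug can be illustrated with a minimal sketch. This is not the code in temporal.py; the function name `masked_median` and the example values are hypothetical, and plain Python lists stand in for raster arrays. It only shows why a median over frames that are mostly -999 returns -999, and how masking the sentinel first avoids that.

```python
import math
from statistics import median

NODATA = -999  # sentinel from the release note; assumed here to mark missing pixels


def masked_median(values, nodata=NODATA):
    """Median of values, ignoring nodata sentinels and NaNs.

    Returns NaN when no valid observation remains.
    """
    valid = [v for v in values if v != nodata and not math.isnan(v)]
    return median(valid) if valid else math.nan


# One pixel observed in five image frames, three of them missing.
pixel_series = [-999, 0.42, -999, 0.38, -999]

naive = median(pixel_series)          # the sentinel dominates the result
masked = masked_median(pixel_series)  # only the two valid frames contribute
print(naive, masked)
```

With array data, the same idea is usually expressed by replacing the sentinel with NaN and applying a NaN-aware median.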
Notebook Updates
To ensure that users can fully benefit from the bug fixes, we've made necessary modifications to the following notebooks:
- example_harvest.ipynb (commit dc6cd56)
- example_harvest_withGEE.ipynb (commit ca8e820)
- example_harvest_with_widgets.ipynb (commit 5e5d98b)
Installation & Issue Reports
You can find the source code and installation instructions for Geodata-Harvester v1.1.1 on our GitHub repository.
For any questions or issues, please refer to our issue tracker.
Thank you for using Geodata-Harvester!
v1.1.0
Release Notes for Geodata-Harvester v1.1.0
We are excited to announce the new release of Geodata-Harvester v1.1.0. This update brings substantial improvements in documentation and notebooks, and bug fixes, largely inspired by the constructive feedback and suggestions from our users and the Journal of Open Source Software (JOSS) reviewers.
Code Improvements
- Slope and Aspect-Ratio computation: removed the gdal dependency for terrain calculations in getdata_dem.py and added new, efficient Python functions for calculating slope and aspect-ratio instead of relying on gdal wrappers (commit 5f9901e).
- Update of test functions: updated test functions and reduced download sizes to speed up automated tests (commits 74e8679, f2092a5, 6fc4084, 07bc8e5).
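As an illustration of terrain calculations without gdal, the following sketch derives slope and the downslope compass direction (terrain aspect) from a small elevation grid using central differences. It is an assumption-laden example rather than the implementation in getdata_dem.py: the function name `slope_aspect`, the list-of-rows grid layout, and the handling of border cells are all hypothetical.

```python
import math


def slope_aspect(dem, cellsize):
    """Return (slope, aspect) grids in degrees for the interior cells of a DEM.

    dem is a list of rows (row 0 is the northern edge); cellsize is the pixel
    size in the same unit as the elevations. Border cells are left as NaN.
    """
    nrows, ncols = len(dem), len(dem[0])
    slope = [[math.nan] * ncols for _ in range(nrows)]
    aspect = [[math.nan] * ncols for _ in range(nrows)]
    for r in range(1, nrows - 1):
        for c in range(1, ncols - 1):
            # Central differences: x increases eastwards, y increases northwards.
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cellsize)
            dz_dy = (dem[r - 1][c] - dem[r + 1][c]) / (2 * cellsize)
            slope[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
            if dz_dx != 0 or dz_dy != 0:
                # Downslope direction as a compass bearing (0 = north, 90 = east).
                aspect[r][c] = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360
            # Flat cells keep NaN because their aspect is undefined.
    return slope, aspect


# A plane that rises by one unit per cell towards the east.
dem = [[float(c) for c in range(4)] for _ in range(4)]
slope, aspect = slope_aspect(dem, cellsize=1.0)
print(slope[1][1], aspect[1][1])
```

For the test plane, the interior cells have a 45 degree slope and face west, since the surface falls towards the west.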
Notebook Updates
Our notebooks have been updated with more examples demonstrating the capabilities of the package. The existing notebooks have also been refined for better clarity and user-friendliness. The following key changes have been made:
- Notebook w/o Google Earth Engine authentication: removed Google Earth Engine (GEE) authorisation for basic notebook example settings_harvest.yaml (commit c2fc3e2).
- Notebook with Google Earth Engine demonstration: new notebook for demonstrating GEE including authentication check in settings_harvest_withGEE.ipynb (commit b36abaf).
- Step-by-step notebook: new notebook with a step-by-step walk-through of the processes and all download modules (commit cacd041). This notebook also showcases GEE initialisation and processing.
- Include settings display: add functionality to display the settings file within the notebook (commit 5c721a8).
- Improved auto run function documentation: added documentation for the harvest.run function and process steps in notebooks (commit c2fc3e2).
- Streamlined settings files: updated settings yaml files to provide less download-heavy datasets, simplify settings, and add more documentation to settings.
- Download-time warnings: added a disclaimer to the data-downloading parts where longer download times are expected (commit d9feb0c).
Documentation Updates
The documentation has been reworked and expanded:
- Updated README: key features are now linked to feature code snippets; added instructions for running test scripts.
- Updated API docs: updated the API reference to reflect the changes in the package's methods and classes (commit 537859f).
- Updated Paper: expanded on the Statement of Need and added a comparison to other packages (commit 6b122bd).
Updated Jupyter Widgets
The documentation and performance of the existing widgets have been improved with the following changes:
- Improved documentation: added descriptions in the settings widgets and updated the notebook for widget demonstration (commit 206df45).
- Bounding box calculation for empty widget setting: added automatic inference of the bounding box if it is left empty (commits 57021ce and 8fdb489).
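One plausible way to infer a bounding box when the widget field is left empty is to take the extent of the input point locations and pad it slightly. The sketch below assumes exactly that; the function name `infer_bbox`, the padding value, and the (lng_min, lat_min, lng_max, lat_max) ordering are hypothetical and may differ from what the package does.

```python
def infer_bbox(lats, lngs, pad_deg=0.05):
    """Return (lng_min, lat_min, lng_max, lat_max) enclosing all points.

    pad_deg widens the box on every side so that edge points are not cut off.
    """
    if not lats or len(lats) != len(lngs):
        raise ValueError("lats and lngs must be non-empty and of equal length")
    return (
        min(lngs) - pad_deg,
        min(lats) - pad_deg,
        max(lngs) + pad_deg,
        max(lats) + pad_deg,
    )


# Hypothetical sample locations and an empty user setting.
lats = [-33.9, -33.5]
lngs = [150.1, 151.0]
user_bbox = None

# Fall back to the inferred box only when the user supplied nothing.
bbox = user_bbox or infer_bbox(lats, lngs)
print(bbox)
```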
Bug Fixes
Several bugs related to the data collection process and the Jupyter widgets have been fixed. These fixes aim at ensuring a smoother and more accurate data collection and visualization experience.
- Fixed settings.temp_intervals in widget-created settings (commit 57021ce).
- Fixed utils.aggregate_rasters function (commit fa346bd).
- Fixed bug if no GEE image found for time interval (commit a096c65).
Installation & Issue Reports
You can find the source code and installation instructions for Geodata-Harvester v1.1.0 on our GitHub repository.
For any questions or issues, please refer to our issue tracker.
Happy data harvesting!
v1.0.0
Release Notes: Geodata-Harvester v1.0.0
We are excited to announce the new release of Geodata-Harvester v1.0.0! This release brings several new features, performance improvements, and bug fixes that enhance the overall experience of using the Geodata-Harvester.
🌟 New Features
- Time-Series Extraction: The long-awaited extraction of time-series data is now available and integrated in the auto harvest.run function! This enhancement provides users with the ability to process image collections for multiple time intervals and to automatically extract temporal aggregated data, including for climate data (SILO), Digital Earth Australia (DEA) post-processed satellite data, and Google Earth Engine data sources.
- Multi-band raster data queries: With this feature, multi-band data can now be extracted from raster images into data tables, including automatic generation and labeling of data columns for each band or channel per image.
- FAQ Chatbot: The new FAQ chatbot on the GitHub page provides users with a quick and easy way to access the Geodata-Harvester documentation. The FAQbot leverages OpenAI's GPT and uses vector embeddings of the Geodata-Harvester reference material and code documentation.
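The multi-band query described above can be pictured as sampling every band at each requested pixel and writing one labeled column per band. The following sketch shows that idea with nested lists; it is not the package's extraction code, and the function name `extract_bands` and the `<layer>_band<N>` column naming are assumptions made for illustration.

```python
def extract_bands(image, layer_name, pixels):
    """Sample all bands of an image at the given pixels.

    image is a list of bands, each a 2D list indexed as band[row][col];
    pixels is a list of (row, col) tuples. Returns one dict per pixel with
    one automatically labeled column per band.
    """
    columns = [f"{layer_name}_band{i + 1}" for i in range(len(image))]
    return [
        {col: band[row][col_idx] for col, band in zip(columns, image)}
        for row, col_idx in pixels
    ]


# A hypothetical two-band, 2x2 image and two query pixels.
image = [
    [[1, 2], [3, 4]],      # band 1
    [[10, 20], [30, 40]],  # band 2
]
table = extract_bands(image, "landsat", [(0, 1), (1, 0)])
print(table)
```

Each returned row maps directly onto a data table row, so the list of dicts can be passed to a dataframe constructor if one is available.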
🚀 Performance Improvements
- Raster query optimisation: Performance gains were achieved by leveraging (rio)xarray for data extraction from raster images, resulting in faster execution times for all users.
- Temporal processing optimisation: The temporal aggregation process for stats extraction has been optimized to reduce the time required to extract temporal aggregated data from large image collections.
🐛 Main Bug Fixes
- Fix missing data values for aggregation: Resolved an issue where temporal processing aggregated over missing data values. These values are now identified from the image header via a case-insensitive search for nodata value names and replaced with NaN values. This fix ensures that missing data values in images do not corrupt aggregated stats.
- Fix name objects: Fixed naming conventions in image labeling and data table generation to ensure that all objects are named correctly and consistently.
- Fix data table generation: Fixed an issue where data tables were not generated correctly for multi-band raster data queries.
- Fix datetime labeling: Fixed an issue where extracted image dates were not added to the metadata in images and labels.
- Fix potential issue in geopackage writing: Fixed a potential issue of duplicate column names in geopackage writing when identically named data already exists in the result folder.
- Cleanup results csv file: Reordered columns in CSV and removed geometry column from csv since Lat, Lng columns already exist.
- Fix xarray2tif function due to rioxarray upgrade: Fixed an issue in xarray2tif where writing multi-channel xarray data as GeoTIFF no longer worked after the rioxarray upgrade (from version '0.13.1' to '0.13.3').
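The case-insensitive nodata lookup mentioned in the first fix above could look roughly like the sketch below. The candidate key names in `NODATA_KEYS`, the dict-shaped header, and both function names are assumptions chosen for illustration; the package's actual search may use other names and operate on raster attributes instead.

```python
import math

# Assumed spellings of the nodata entry; compared in lower case.
NODATA_KEYS = ("nodata", "nodata_value", "_fillvalue", "missing_value")


def find_nodata(header):
    """Return the nodata value from a header dict, matching keys case-insensitively."""
    for key, value in header.items():
        if key.lower() in NODATA_KEYS:
            return value
    return None


def replace_nodata(values, header):
    """Replace the header's nodata value with NaN; leave values unchanged if none is declared."""
    nodata = find_nodata(header)
    if nodata is None:
        return list(values)
    return [math.nan if v == nodata else v for v in values]


# Hypothetical header with a mixed-case nodata key.
header = {"CRS": "EPSG:4326", "NoData": -999}
cleaned = replace_nodata([1.0, -999, 3.0], header)
print(cleaned)
```

Once missing values are NaN, NaN-aware statistics can skip them instead of averaging the sentinel into the result.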
📚 Documentation and Notebooks
- Added new notebooks to demonstrate the new temporal processing features
- Updated documentation to reflect the new features
- Added Geodata-Harvester summary paper
- Added contribution guidelines
📦 Download & Installation
You can find the source code and installation instructions for Geodata-Harvester v1.0.0 on our GitHub repository.
For any questions or issues, please refer to our issue tracker.
Happy coding! 🎉
v0.2.2
Release notes for v0.2.2
Description of work or change:
- add support for multiple collections and reduce options for GEE in settings
- add compatibility with latest eeharvest dependency updates
- update docs (README, settings, docstrings)
- add API code reference docs
- fix notebooks with updated settings
- clean up repo
- add JOSS paper
v0.2.0
First Geodata-Harvester release
This release of the Geodata-Harvester package includes pip/conda ready installation packages and a range of example workflows for automatic data extraction from a wide range of geodata sources including support for Google Earth Engine data layers.
The following data sources are currently integrated:
- Soil and Landscape Grid of Australia (SLGA)
- SILO Climate Database (Australia)
- National Digital Elevation Model (DEM)
- Digital Earth Australia (DEA) Geoscience Earth Observations
- Radiometric Data (Australia)
- Google Earth Engine Data (account needed)
Core features currently supported in the Geodata-Harvester:
- automatic data retrieval from geospatial APIs for given locations and dates
- data experimentation frontends via Jupyter and R notebooks
- reusable workflows via YAML files to save/load settings
- interactive Jupyter notebook widgets for selecting settings options
- automatic geospatial-temporal processing
- support for multiple temporal aggregation options
- automatic extraction of retrieved data into aligned maps and ready-made dataframes for ML
- preview of data map layers
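To make the temporal features above more concrete, the sketch below splits a date range into a number of intervals and aggregates dated observations per interval with a selectable method. It is a simplified illustration under stated assumptions: the names `split_intervals`, `aggregate`, and `AGGREGATORS`, the equal-length splitting, and the set of aggregation options are hypothetical and not taken from the package.

```python
from datetime import date, timedelta
from statistics import mean, median

# Assumed set of aggregation options, keyed by name.
AGGREGATORS = {"mean": mean, "median": median, "min": min, "max": max}


def split_intervals(date_min, date_max, n):
    """Split [date_min, date_max] into n contiguous intervals of nearly equal length."""
    total_days = (date_max - date_min).days
    edges = [date_min + timedelta(days=round(i * total_days / n)) for i in range(n + 1)]
    return list(zip(edges[:-1], edges[1:]))


def aggregate(observations, intervals, how="median"):
    """Aggregate (date, value) observations per interval; None marks an empty interval."""
    func = AGGREGATORS[how]
    results = []
    for i, (start, end) in enumerate(intervals):
        is_last = i == len(intervals) - 1
        # Intervals are half-open, except the last one, which includes its end date.
        values = [v for d, v in observations if start <= d < end or (is_last and d == end)]
        results.append(func(values) if values else None)
    return results


intervals = split_intervals(date(2022, 1, 1), date(2022, 12, 31), 2)
observations = [
    (date(2022, 2, 1), 1.0),
    (date(2022, 3, 1), 3.0),
    (date(2022, 9, 1), 10.0),
    (date(2022, 12, 31), 20.0),
]
print(aggregate(observations, intervals, how="mean"))
print(aggregate(observations, intervals, how="max"))
```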
Main contributors:
@sebhaan
@natbutter
@januarharianto