Releases: kedro-org/kedro
0.18.14
Release 0.18.14
Major features and improvements
- Allowed using custom cookiecutter templates for creating pipelines with the `--template` flag for `kedro pipeline create` or via the `template/pipeline` folder.
- Allowed overriding of configuration keys with runtime parameters using the `runtime_params` resolver with `OmegaConfigLoader`.
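
To make the runtime-parameter override concrete, here is a minimal, standard-library-only sketch of the idea: a config value written as `${runtime_params:key, default}` takes the value supplied at run time when one exists and falls back to the default otherwise. The placeholder syntax shown, the function name `resolve_runtime_params`, and the sample keys are assumptions for illustration; this is not how `OmegaConfigLoader` implements the resolver.

```python
import re

# Matches "${runtime_params:key}" or "${runtime_params:key, default}".
_PLACEHOLDER = re.compile(r"^\$\{runtime_params:\s*([\w.]+)\s*(?:,\s*(.*?)\s*)?\}$")


def resolve_runtime_params(config, runtime_params):
    """Return a copy of `config` with runtime_params placeholders filled in.

    Hypothetical helper: defaults are kept as plain strings, with no type parsing.
    """
    resolved = {}
    for key, value in config.items():
        if isinstance(value, dict):
            # Recurse into nested config sections.
            resolved[key] = resolve_runtime_params(value, runtime_params)
            continue
        match = _PLACEHOLDER.match(value) if isinstance(value, str) else None
        if match is None:
            resolved[key] = value
            continue
        name, default = match.group(1), match.group(2)
        if name in runtime_params:
            resolved[key] = runtime_params[name]  # runtime value wins
        elif default is not None:
            resolved[key] = default  # fall back to the inline default
        else:
            raise KeyError(f"No runtime parameter or default for '{name}'")
    return resolved


# Sample (made-up) configuration and runtime parameters.
config = {"model": {"random_state": "${runtime_params:random, 3}"}, "folder": "data"}
with_override = resolve_runtime_params(config, {"random": 42})
without_override = resolve_runtime_params(config, {})
print(with_override)
print(without_override)
```

The sketch treats the runtime value as authoritative and only consults the default when nothing was passed, which is the behaviour the release note describes.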
Bug fixes and other changes
- Updated dataset factories to resolve nested catalog config properly.
- Updated `OmegaConfigLoader` to handle paths containing dots outside of `conf_source`.
- Made `settings.py` optional.
Documentation changes
- Added documentation to clarify execution order of hooks.
- Added a notebook example for spaceflights to illustrate how to incrementally add Kedro features.
- Moved documentation for the `standalone-datacatalog` starter into its README file.
- Added new documentation about deploying a Kedro project with Amazon EMR.
- Added new documentation about how to publish a Kedro-Viz project to make it shareable.
- Added new TSC members to the page and listed each member's organisation.
- Plus some minor bug fixes and changes across the documentation.
Upcoming deprecations for Kedro 0.19.0
- All dataset classes will be removed from the core Kedro repository (`kedro.extras.datasets`). Install and import them from the `kedro-datasets` package instead.
- All dataset classes ending with `DataSet` are deprecated and will be removed in Kedro `0.19.0` and `kedro-datasets` `2.0.0`. Instead, use the updated class names ending with `Dataset`.
- The starters `pandas-iris`, `pyspark-iris`, `pyspark`, and `standalone-datacatalog` are deprecated and will be archived in Kedro 0.19.0.
- `PartitionedDataset` and `IncrementalDataset` have been moved to `kedro-datasets` and will be removed in Kedro `0.19.0`. Install and import them from the `kedro-datasets` package instead.
Community contributions
Many thanks to the following Kedroids for contributing PRs to this release:
0.18.13
Release 0.18.13
Major features and improvements
- Added support for Python 3.11. This includes tackling challenges like dependency pinning and test adjustments to ensure a smooth experience. Detailed migration tips are provided below for further context.
- Added new `OmegaConfigLoader` features:
  - Allowed registering of custom resolvers to `OmegaConfigLoader` through `CONFIG_LOADER_ARGS`.
  - Added support for global variables to `OmegaConfigLoader`.
- Added `kedro catalog resolve` CLI command that resolves dataset factories in the catalog with any explicit entries in the project pipeline.
- Implemented a flat `conf/` structure for modular pipelines, and accordingly, updated the `kedro pipeline create` and `kedro catalog create` commands.
- Updated new Kedro project template and Kedro starters:
  - Changed Kedro starters and new Kedro projects to use `OmegaConfigLoader`.
  - Converted `setup.py` in new Kedro project template and Kedro starters to `pyproject.toml` and moved flake8 configuration to the dedicated file `.flake8`.
  - Updated the spaceflights starter to use the new flat `conf/` structure.
Bug fixes and other changes
- Updated `OmegaConfigLoader` to ignore config from hidden directories like `.ipynb_checkpoints`.
Documentation changes
- Revised the `data` section to restructure beginner and advanced pages about the Data Catalog and datasets.
- Moved contributor documentation to the GitHub wiki.
- Updated example of using generator functions in nodes.
- Added migration guide from the `ConfigLoader` and the `TemplatedConfigLoader` to the `OmegaConfigLoader`. The `ConfigLoader` and the `TemplatedConfigLoader` are deprecated and will be removed in the `0.19.0` release.
Migration Tips for Python 3.11:
- PyTables on Windows: Users on Windows with Python >=3.8 should note we've pinned `pytables` to `3.8.0` due to compatibility issues.
- Spark Dependency: We've set an upper version limit for `pyspark` at <3.4 due to breaking changes in 3.4.
- Testing with Python 3.10: The latest `moto` version now supports parallel test execution for Python 3.10, resolving previous issues.
Breaking changes to the API
Upcoming deprecations for Kedro 0.19.0
- Renamed abstract dataset classes, in accordance with the Kedro lexicon. Dataset classes ending with "DataSet" are deprecated and will be removed in 0.19.0. Note that all of the below classes are also importable from `kedro.io`; only the module where they are defined is listed as the location.

Type | Deprecated Alias | Location |
---|---|---|
`AbstractDataset` | `AbstractDataSet` | `kedro.io.core` |
`AbstractVersionedDataset` | `AbstractVersionedDataSet` | `kedro.io.core` |

- Using the `layer` attribute at the top level is deprecated; it will be removed in Kedro version 0.19.0. Please move `layer` inside the `metadata` -> `kedro-viz` attributes.
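
As an illustration of the `layer` move described above, the following sketch rewrites a single catalog entry, represented as a plain Python dictionary, so that a top-level `layer` ends up under `metadata` -> `kedro-viz`. The helper name `move_layer_to_metadata` and the sample entry are hypothetical; a real catalog lives in YAML, and this only shows the shape of the change.

```python
import copy


def move_layer_to_metadata(entry):
    """Return a copy of a catalog entry with top-level `layer` moved under metadata -> kedro-viz."""
    migrated = copy.deepcopy(entry)
    if "layer" not in migrated:
        return migrated  # nothing to migrate
    layer = migrated.pop("layer")
    viz = migrated.setdefault("metadata", {}).setdefault("kedro-viz", {})
    # Keep an existing nested value if one is already set.
    viz.setdefault("layer", layer)
    return migrated


# Made-up catalog entry using the deprecated top-level attribute.
old_entry = {
    "type": "pandas.CSVDataSet",
    "filepath": "data/01_raw/companies.csv",
    "layer": "raw",
}
new_entry = move_layer_to_metadata(old_entry)
print(new_entry)
```

The function copies the entry rather than mutating it, so the original mapping can still be compared against the migrated one.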
Community contributions
Thanks to Laíza Milena Scheid Parizotto and Jonathan Cohen.
0.18.12
Release 0.18.12
Major features and improvements
- Added dataset factories feature which uses pattern matching to reduce the number of catalog entries.
- Activated all built-in resolvers by default for `OmegaConfigLoader` except for `oc.env`.
- Added `kedro catalog rank` CLI command that ranks dataset factories in the catalog by matching priority.
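
The sketch below illustrates the general idea behind pattern-based dataset factories and ranking: a pattern with `{placeholder}` fields is matched against a dataset name, and more specific patterns are tried first. The ranking rule used here (more literal characters first, ties broken alphabetically), the helper names, and the sample patterns are all assumptions for illustration, not Kedro's actual implementation.

```python
import re


def pattern_to_regex(pattern):
    """Turn a factory pattern such as '{name}_csv' into a regex with named groups."""
    # re.split with a capturing group alternates literal text and placeholders.
    parts = re.split(r"(\{\w+\})", pattern)
    regex = "".join(
        f"(?P<{part[1:-1]}>.+?)" if index % 2 else re.escape(part)
        for index, part in enumerate(parts)
    )
    return re.compile(f"^{regex}$")


def specificity(pattern):
    """Count characters outside placeholders (the assumed ranking key)."""
    return len(re.sub(r"\{\w+\}", "", pattern))


def rank_patterns(patterns):
    """Most specific pattern first; ties broken alphabetically."""
    return sorted(patterns, key=lambda p: (-specificity(p), p))


def match_dataset(name, patterns):
    """Return (pattern, captured fields) for the best-ranked match, or None."""
    for pattern in rank_patterns(patterns):
        found = pattern_to_regex(pattern).match(name)
        if found:
            return pattern, found.groupdict()
    return None


# Made-up patterns, from catch-all to most specific.
patterns = ["{name}", "{name}_csv", "{layer}.{name}_csv"]
ranked = rank_patterns(patterns)
best = match_dataset("raw.companies_csv", patterns)
print(ranked)
print(best)
```

Trying patterns in ranked order means a catch-all such as `{name}` only applies when nothing more specific matches.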
Bug fixes and other changes
- Consolidated dependencies and optional dependencies in `pyproject.toml`.
- Made validation of unique node outputs much faster.
- Updated `kedro catalog list` to show datasets generated with factories.
Documentation changes
- Recommended `ruff` as the linter and removed mentions of `pylint`, `isort`, `flake8`.
Community contributions
Thanks to Laíza Milena Scheid Parizotto and Chris Schopp.
Breaking changes to the API
Upcoming deprecations for Kedro 0.19.0
- `ConfigLoader` and `TemplatedConfigLoader` will be deprecated. Please use `OmegaConfigLoader` instead.
0.18.11
Release 0.18.11
Major features and improvements
- Added databricks-iris as an official starter.
Bug fixes and other changes
- Reworked micropackaging workflow to use standard Python packaging practices.
- Made `kedro micropkg package` accept `--verbose`.
Documentation changes
- Significant improvements to the documentation that covers working with Databricks and Kedro, including a new page for workspace-only development, and a guide to choosing the best workflow for your use case.
- Updated documentation for deploying with Prefect for version 2.0.
0.18.10
0.18.9
Major features and improvements
- `kedro run --params` now updates interpolated parameters correctly when using `OmegaConfigLoader`.
- Added `metadata` attribute to `kedro.io` datasets. This is ignored by Kedro, but may be consumed by users or external plugins.
- Added `kedro.logging.RichHandler`. This replaces the default `rich.logging.RichHandler` and is more flexible; users can turn off the `rich` traceback if needed.
Bug fixes and other changes
- `OmegaConfigLoader` will return a `dict` instead of `DictConfig`.
- `OmegaConfigLoader` does not show a `MissingConfigError` when the config files exist but are empty.
Documentation changes
- Added documentation for collaborative experiment tracking within Kedro-Viz.
- Revised section on deployment to better organise content and reflect how recently docs have been updated.
- Minor improvements to fix typos and revise docs to align with engineering changes.
Breaking changes to the API
- `kedro package` does not produce `.egg` files anymore, and now relies exclusively on `.whl` files.
Community contributions
Many thanks to the following Kedroids for contributing PRs to this release:
0.18.8
Major features and improvements
- Added `KEDRO_LOGGING_CONFIG` environment variable, which can be used to configure logging from the beginning of the `kedro` process.
- Removed logs folder from the kedro new project template. File-based logging remains, but only for level INFO and above, and log files now go to the project root instead.
Bug fixes and other changes
- Improvements to Jupyter E2E tests.
- Added full `kedro run` CLI command to session store to improve run reproducibility using `Kedro-Viz` experiment tracking.
Documentation changes
- Improvements to documentation about configuration.
- Improvements to Sphinx toolchain including incrementing to use a newer version.
- Improvements to documentation on visualising Kedro projects on Databricks, and additional documentation about the development workflow for Kedro projects on Databricks.
- Updated Technical Steering Committee membership documentation.
- Revised documentation section about linting and formatting and extended to give details of `flake8` configuration.
- Updated table of contents for documentation to reduce scrolling.
- Expanded FAQ documentation.
- Added a 404 page to documentation.
- Added deprecation warnings about the removal of `kedro.extras.datasets`.
0.18.7
Release 0.18.7
Major features and improvements
- Added new Kedro CLI `kedro jupyter setup` to set up Jupyter Kernel for Kedro.
- `kedro package` now includes the project configuration in a compressed `tar.gz` file.
- Added functionality to the `OmegaConfigLoader` to load configuration from compressed files of `zip` or `tar` format. This feature requires `fsspec>=2023.1.0`.
- Significant improvements to on-boarding documentation that covers setup for new Kedro users. Also some major changes to the spaceflights tutorial to make it faster to work through. We think it's a better read. Tell us if it's not.
Bug fixes and other changes
- Added a guide and tooling for developing Kedro for Databricks.
- Implemented missing dict-like interface for `_ProjectPipeline`.
0.18.6
Release 0.18.6
Bug fixes and other changes
- Fixed bug that prevented reading or writing datasets with `s3a` or `s3n` filepaths
- Fixed bug with overriding nested parameters using the `--params` flag
- Fixed bug that made session store incompatible with `Kedro-Viz` experiment tracking
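
For readers unfamiliar with nested parameter overrides, the sketch below shows one way a comma-separated override string could be applied to a nested parameters dictionary. The dotted-key notation, the helper name `apply_overrides`, and the sample parameters are assumptions for illustration; values are kept as strings, and this is not Kedro's own parsing code.

```python
import copy
import re


def apply_overrides(params, overrides):
    """Apply 'a.b=value,c:value' style overrides to a copy of `params`.

    Hypothetical helper: each item is split at the first ':' or '=',
    and dots in the key walk down into nested dictionaries.
    """
    updated = copy.deepcopy(params)
    for item in overrides.split(","):
        key, value = re.split(r"[:=]", item.strip(), maxsplit=1)
        node = updated
        *parents, leaf = key.strip().split(".")
        for part in parents:
            # Create (or replace a non-dict value with) an intermediate level.
            if not isinstance(node.get(part), dict):
                node[part] = {}
            node = node[part]
        node[leaf] = value.strip()
    return updated


# Made-up parameters and override string.
params = {"model": {"learning_rate": "0.01", "layers": "3"}, "seed": "1"}
overridden = apply_overrides(params, "model.learning_rate=0.1,seed:7")
print(overridden)
```

Only the targeted leaf changes; sibling keys such as `layers` are left untouched, which is the property a nested override needs to preserve.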
Migration guide from Kedro 0.18.5 to 0.18.6
A regression introduced in Kedro version `0.18.5` caused the `Kedro-Viz` console to fail to show experiment tracking correctly. If you experienced this issue, you will need to:
- upgrade to Kedro version `0.18.6`
- delete any erroneous session entries created with Kedro 0.18.5 from your session_store.db stored at `<project-path>/data/session_store.db`.
Thanks to Kedroids tomohiko kato, tsanikgr and maddataanalyst for very detailed reports about the bug.
0.18.5
Release 0.18.5
NOTE: This version of Kedro introduced a bug that causes the Kedro-Viz console to fail to show experiment tracking correctly. We recommend that you don't use it and use Kedro version `0.18.6` instead.
Major features and improvements
- Added new `OmegaConfigLoader` which uses `OmegaConf` for loading and merging configuration.
- Added the `--conf-source` option to `kedro run`, allowing users to specify a source for project configuration for the run.
- Added `omegaconf` syntax as option for `--params`. Keys and values can now be separated by colons or equals signs.
- Added support for generator functions as nodes, i.e. using `yield` instead of return.
  - Enable chunk-wise processing in nodes with generator functions.
  - Save node outputs after every `yield` before proceeding with next chunk.
- Fixed incorrect parsing of Azure Data Lake Storage Gen2 URIs used in datasets.
- Added support for loading credentials from environment variables using `OmegaConfigLoader`.
- Added new `--namespace` flag to `kedro run` to enable filtering by node namespace.
- Added a new argument `node` for all four dataset hooks.
- Added the `kedro run` flags `--nodes`, `--tags`, and `--load-versions` to replace `--node`, `--tag`, and `--load-version`.
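
The generator-node behaviour listed above (chunk-wise processing, with outputs saved after every `yield`) can be pictured with a small standalone sketch. The function `chunked_upper`, the `AppendingStore` class, and the `run_generator_node` loop are hypothetical stand-ins written for this example; they are not Kedro classes or the runner's real code.

```python
def chunked_upper(lines, chunk_size):
    """A generator 'node': yields processed chunks instead of returning one big result."""
    for start in range(0, len(lines), chunk_size):
        yield [line.upper() for line in lines[start:start + chunk_size]]


class AppendingStore:
    """Hypothetical stand-in for a dataset whose save() appends rather than overwrites."""

    def __init__(self):
        self.saved = []
        self.save_calls = 0

    def save(self, chunk):
        self.saved.extend(chunk)
        self.save_calls += 1


def run_generator_node(func, store, *args):
    """Save each yielded chunk before asking the generator for the next one."""
    for chunk in func(*args):
        store.save(chunk)


store = AppendingStore()
run_generator_node(chunked_upper, store, ["a", "b", "c", "d", "e"], 2)
print(store.saved, store.save_calls)
```

Because each chunk is saved before the next one is produced, only one chunk needs to be held in memory at a time, provided the target dataset appends on save.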
Bug fixes and other changes
- Commas surrounded by square brackets (only possible for nodes with default names) will no longer split the arguments to `kedro run` options which take a list of nodes as inputs (`--from-nodes` and `--to-nodes`).
- Fixed bug where `micropkg` manifest section in `pyproject.toml` isn't recognised as allowed configuration.
- Fixed bug causing `load_ipython_extension` not to register the `%reload_kedro` line magic when called in a directory that does not contain a Kedro project.
- Added `anyconfig`'s `ac_context` parameter to `kedro.config.commons` module functions for more flexible `ConfigLoader` customizations.
- Replaced references to the `kedro.pipeline.Pipeline` object throughout the test suite with the `kedro.modular_pipeline.pipeline` factory.
- Fixed bug causing the `after_dataset_saved` hook only to be called for one output dataset when multiple are saved in a single node and async saving is in use.
- Log level for "Credentials not found in your Kedro project config" was changed from `WARNING` to `DEBUG`.
- Added safe extraction of tar files in `micropkg pull` to fix vulnerability caused by CVE-2007-4559.
- Documentation improvements
  - Bug fix in table font size
  - Updated API docs links for datasets
  - Improved CLI docs for `kedro run`
  - Revised documentation for visualisation to build plots and for experiment tracking
  - Added example for loading external credentials to the Hooks documentation
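
The first fix in this list concerns commas that sit inside square brackets in default node names. A minimal sketch of bracket-aware splitting is shown below; the function name `split_outside_brackets` and the sample node names are made up for illustration, and this is not the code Kedro uses.

```python
def split_outside_brackets(text):
    """Split on commas, ignoring commas nested inside square brackets."""
    parts, current, depth = [], [], 0
    for char in text:
        if char == "[":
            depth += 1
        elif char == "]":
            depth = max(depth - 1, 0)
        if char == "," and depth == 0:
            # A top-level comma ends the current item.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    parts.append("".join(current).strip())
    return [part for part in parts if part]


# Made-up value for an option that takes a list of node names.
node_names = "split_data([features,target]) -> [X_train,X_test],train_model"
names = split_outside_brackets(node_names)
print(names)
```

A plain `str.split(",")` would cut the first name into several fragments; tracking bracket depth keeps each default node name intact.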
Breaking changes to the API
Community contributions
Many thanks to the following Kedroids for contributing PRs to this release:
Upcoming deprecations for Kedro 0.19.0
- `project_version` will be deprecated in `pyproject.toml`; please use `kedro_init_version` instead.
- Deprecated `kedro run` flags `--node`, `--tag`, and `--load-version` in favour of `--nodes`, `--tags`, and `--load-versions`.