Release CDAP 6.1.1 · cdapio/cdap

Summary

This release introduces a number of new features, improvements, and bug fixes to CDAP. Some of the main highlights of the release are:

Pipeline improvements
- Validation checks for plugins for early error detection and prevention
- New widgets for better pipeline configurability
- Wrangler ADLS connection
Field level lineage
- New, intuitive UI for field level lineage
- Field level lineage support for more plugins
Platform enhancements
- Performance improvements across the platform
- Migration of more UI components from Angular to React

New Features

Added field level lineage support for Error Transform. (CDAP-16102)
Added region support for Google Cloud plugins. (CDAP-16037)
Added a new UI landing page. (CDAP-15795)
Allowed plugin developers to define filters that show or hide properties based on custom plugin configuration logic. (CDAP-15789)
Introduced new FailureCollector APIs for a better user experience via contextual error messages. (CDAP-15787)
Added support for reading INT96 types in Parquet file sources. (CDAP-15767)
Added a new ConfigurationGroup component in the UI. (CDAP-15728)
Added support for running pipelines in a shared VPC network. (CDAP-15723)
Added stage level validation for plugin properties. (CDAP-15619)
Added a new REST endpoint that retrieves all field lineage information about a dataset. (CDAP-15482)
Added support for bytes types in the BigQuery sink. (CDAP-15342)
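
As background for the INT96 item (CDAP-15767): Parquet writers such as Hive and Impala have traditionally stored timestamps as INT96, a 12-byte value holding 8 little-endian bytes of nanoseconds within the day followed by a 4-byte little-endian Julian day number. The sketch below decodes one such value in Java. The class name `Int96Timestamps` is hypothetical and the code is not taken from CDAP; how the Parquet file source maps INT96 to a CDAP schema type is not stated in these notes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

/** Illustrative helper (hypothetical name), not CDAP's actual implementation. */
final class Int96Timestamps {
  // Julian day number of the Unix epoch, 1970-01-01.
  private static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;
  private static final long SECONDS_PER_DAY = 86_400L;

  private Int96Timestamps() { }

  /** Decodes a 12-byte INT96 timestamp: nanos-of-day (8 bytes), then Julian day (4 bytes). */
  static Instant toInstant(byte[] int96) {
    if (int96.length != 12) {
      throw new IllegalArgumentException("INT96 values must be exactly 12 bytes");
    }
    ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
    long nanosOfDay = buf.getLong();
    long julianDay = buf.getInt();
    long epochSeconds = (julianDay - JULIAN_DAY_OF_EPOCH) * SECONDS_PER_DAY;
    // The nanosecond adjustment is added on top of the whole-day offset.
    return Instant.ofEpochSecond(epochSeconds, nanosOfDay);
  }
}
```

The result is treated as UTC here; writers differ on whether INT96 values are normalized to UTC, so a real reader may need a configurable time zone.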

Deprecation

Removed the outdated Validator plugin. (CDAP-15917)

Bug Fixes

Fixed the preview run state after a JVM restart. (CDAP-16193)
Content type detection now uses case-insensitive file extensions. (CDAP-16146)
Fixed a bug that prevented users from navigating to the pipeline studio (the UI indicated that system artifacts were being loaded for a long time). (CDAP-16137)
Fixed the Dataproc provisioner to log the error message if the Dataproc creation operation fails. (CDAP-15973)
Fixed a bug that caused pipeline startup to take longer than needed for cloud runs. (CDAP-15899)
Fixed regex usage in GCS and S3 source plugins. (CDAP-15879)
Fixed a bug with the Datastore source that was overly restrictive when validating the user-provided schema. (CDAP-15878)
Fixed a bug that could cause a thread to spin in an infinite while loop when multiple threads consumed from a queue that allows only a single consumer. (CDAP-15809)
Fixed a bug that caused pipeline failures when writing nullable byte fields as JSON. (CDAP-15770)
Fixed a bug that caused MapReduce and Spark logs to be missing for remote pipeline runs. (CDAP-15757)
Fixed a race condition that could cause a program to get stuck in the pending state when stopped in the pending state. (CDAP-15747)
Added some safeguards to prevent cloud pipeline runs from getting stuck in certain edge cases. (CDAP-15742)
Fixed a bug where secure macros were not evaluated in preview mode. (CDAP-15726)
Fixed a bug in the BigQuery source that caused automatic bucket creation to fail if the dataset is in a different project. (CDAP-15617)
Fixed a bug in the new user tour on lower resolution screens. (CDAP-15583)
Fixed a bug where the wrong resolution was used if a time range was specified for a metrics query. (CDAP-15554)
Fixed an issue where the BigQuery multi sink did not work when using an Oracle database as a source. (CDAP-15535)
Fixed the Dataproc provisioner to disable YARN pre-emptive container killing and to disable Conscrypt. (CDAP-15498)
Fixed a bug in the MLPredictor plugin that caused an error when using a classification model. (CDAP-15445)
Fixed a bug that did not allow users to paste a schema as a runtime argument. (CDAP-15423)
Spark pipelines no longer try to run sinks in parallel unless the runtime argument 'pipeline.spark.parallel.sinks.enabled' is set to 'true'. This prevents pipeline sections from being re-processed in the majority of situations. (CDAP-15388)
Fixed the Dataproc provisioner to handle networks that do not use automatic subnet creation. (CDAP-15373)
Fixed a Wrangler bug where the wrong JDBC driver would be used in some situations and where required classes could be unavailable. (CDAP-15353)
Fixed a bug in artifact version comparison. (CDAP-15221)
Fixed a bug where the rollup of the workflow lineage did not remove the local datasets. (CDAP-15206)
Expanded the filename formats that the UI accepts when uploading artifacts. (CDAP-15097)
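
The Spark sink change above (CDAP-15388) makes parallel sinks opt-in through the runtime argument 'pipeline.spark.parallel.sinks.enabled'. Below is a minimal sketch of that opt-in behavior, assuming runtime arguments are available as a string-to-string map. The `ParallelSinksFlag` class is hypothetical, and CDAP's own parsing of the value may differ (for example, in case sensitivity).

```java
import java.util.Map;

/** Hypothetical helper illustrating an opt-in runtime argument; not CDAP code. */
final class ParallelSinksFlag {
  // Key taken from the release note above.
  static final String KEY = "pipeline.spark.parallel.sinks.enabled";

  private ParallelSinksFlag() { }

  /** Returns true only when the argument is present and equals "true" (ignoring case). */
  static boolean isEnabled(Map<String, String> runtimeArgs) {
    // Boolean.parseBoolean(null) is false, so a missing key leaves the feature off.
    return Boolean.parseBoolean(runtimeArgs.get(KEY));
  }
}
```

With a default of off, a pipeline that does not set the argument runs its sinks one after another, which matches the behavior described in the note.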

Improvements

Fixed batch pipeline preview to read only the preview records instead of the full input. (CDAP-16110)
Greatly reduced the time it takes to calculate field level lineage. (CDAP-16069)
Set Spark as the default execution engine for batch pipelines. (CDAP-15983)
Improved the error message for CSV, TSV, and delimited formats when the schema has fewer fields than the data. (CDAP-15794)
Added support for automatically filling in field level lineage for plugins that do not emit any. (CDAP-15782)
Upgraded the Node.js version from 8.x to 10.16.2. (CDAP-15738)
Added support for restoring preview status after a restart. (CDAP-15677)
Routed users directly to the pipeline's detail page from the pipeline card in Control Center. (CDAP-15659)
Added a new user experience for log level selection. (CDAP-15489)
Added image version as a configuration setting to the Dataproc provisioner. (CDAP-15265)
Improved the way pipelines with macros that are provided by intermediate stages run. (CDAP-16076)
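
To illustrate the kind of check behind the delimited-format error message item above (CDAP-15794), here is a small sketch that rejects a record carrying more values than the schema has fields. The `DelimitedRecordCheck` class and the message wording are hypothetical; the actual CDAP message and parsing logic are not given in these notes.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical validator for delimited records; not CDAP's actual format code. */
final class DelimitedRecordCheck {
  private DelimitedRecordCheck() { }

  /** Splits a line and fails with a descriptive message if it has more values than schema fields. */
  static String[] split(String line, String delimiter, List<String> schemaFields) {
    // Quote the delimiter so characters like '|' are not treated as regex syntax;
    // the -1 limit keeps trailing empty values.
    String[] values = line.split(Pattern.quote(delimiter), -1);
    if (values.length > schemaFields.size()) {
      throw new IllegalArgumentException(String.format(
          "Found %d fields in the record but the schema has only %d (%s). "
              + "Check that the schema and delimiter match the data.",
          values.length, schemaFields.size(), String.join(", ", schemaFields)));
    }
    return values;
  }
}
```

Naming both counts and the schema fields in the message lets a user see the mismatch without inspecting the raw input.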
