Test Spark 4.0 #297

Draft
wants to merge 1 commit into
base: develop

dolfinus (Member) commented Jul 25, 2024

Change Summary

  • Update the CI matrix to include Spark 4.x. By default it is not used for tests; it runs only when a related integration has changed (the same policy as for 2.x).
  • Spark 4.x also supports Java 22, so add it to the Readme and the CI matrix.
  • Update JDBCConnection methods to be compatible with the Spark 4.0 JDBCUtils method signatures.
  • Update the XML class get_packages() and check_if_supported() methods, as Spark 4.0 includes XML support.
  • Since Spark 4.0, DecimalType(38, 10).typeName() returns decimal(38, 10) instead of decimal, which breaks some Oracle tests. Update the SparkTypeToHWM implementation to map Spark data type classes directly to HWM classes.
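The last item can be illustrated with a minimal sketch. It is not the actual SparkTypeToHWM code: the Spark types below are stand-in classes that only mimic the `typeName()` behaviour described above (so the example runs without pyspark), and `NumericHWM`, `DateTimeHWM`, `hwm_class_by_name` and `hwm_class_by_type` are hypothetical names chosen for illustration.

```python
# Stand-ins for pyspark.sql.types classes (assumption: they only mimic typeName()).
class DataType:
    def typeName(self):
        # "IntegerType" -> "integer"
        return type(self).__name__[:-4].lower()


class IntegerType(DataType):
    pass


class TimestampType(DataType):
    pass


class DecimalType(DataType):
    def __init__(self, precision=10, scale=0):
        self.precision = precision
        self.scale = scale

    def typeName(self):
        # Mimics the Spark 4.0 behaviour reported in this PR.
        return f"decimal({self.precision}, {self.scale})"


# Hypothetical HWM classes, not the real ones used by the project.
class NumericHWM:
    pass


class DateTimeHWM:
    pass


# Lookup by type name: silently misses parametrized names like "decimal(38, 10)".
BY_NAME = {"integer": NumericHWM, "decimal": NumericHWM, "timestamp": DateTimeHWM}

# Lookup by class: independent of how typeName() is rendered.
BY_CLASS = {IntegerType: NumericHWM, DecimalType: NumericHWM, TimestampType: DateTimeHWM}


def hwm_class_by_name(spark_type):
    return BY_NAME.get(spark_type.typeName())


def hwm_class_by_type(spark_type):
    # Walk the MRO so subclasses of a mapped type are resolved too.
    for cls in type(spark_type).__mro__:
        if cls in BY_CLASS:
            return BY_CLASS[cls]
    return None
```

With these stand-ins, `hwm_class_by_name(DecimalType(38, 10))` finds nothing, while `hwm_class_by_type(DecimalType(38, 10))` still resolves to the numeric HWM class.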

TODO:

  • Fix MySQL tests using IncrementalStrategy, which fail with an obscure error
  • Fix S3 tests, which fail because a /data/data folder is somehow being created
  • Skip Excel tests on Spark 4.0
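One possible way to implement the last TODO item is a small version check. This is only a sketch: `spark_major_version` and `should_skip_excel` are hypothetical helpers, and the assumption that every 4.x release should be skipped is not stated in this PR.

```python
import re


def spark_major_version(version: str) -> int:
    """Extract the major version from a string such as '4.0.0' or '4.0.0-preview1'."""
    match = re.match(r"(\d+)\.", version)
    if match is None:
        raise ValueError(f"Unrecognized Spark version: {version!r}")
    return int(match.group(1))


def should_skip_excel(version: str) -> bool:
    # Assumption: Excel tests are skipped for Spark 4.x and anything newer.
    return spark_major_version(version) >= 4
```

In a pytest suite, the resulting boolean could be passed as the condition of `pytest.mark.skipif(..., reason=...)`.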

Related issue number

Checklist

  • Commit message and PR title are comprehensive
  • Keep the change as small as possible
  • Unit and integration tests for the changes exist
  • Tests pass on CI and coverage does not decrease
  • Documentation reflects the changes where applicable
  • docs/changelog/next_release/<pull request or issue id>.<change type>.rst file added describing change
    (see CONTRIBUTING.rst for details.)
  • My PR is ready to review.

dolfinus self-assigned this Jul 25, 2024

codecov bot commented Jul 25, 2024

Codecov Report

Attention: Patch coverage is 55.88235% with 15 lines in your changes missing coverage. Please review.

Project coverage is 94.70%. Comparing base (b657322) to head (0ef84bd).

| Files | Patch % | Lines |
|---|---|---|
| .../connection/db_connection/jdbc_mixin/connection.py | 69.23% | 2 Missing and 2 partials ⚠️ |
| onetl/file/format/xml.py | 55.55% | 2 Missing and 2 partials ⚠️ |
| onetl/connection/db_connection/kafka/connection.py | 33.33% | 1 Missing and 1 partial ⚠️ |
| ...nnection/file_df_connection/spark_s3/connection.py | 50.00% | 1 Missing and 1 partial ⚠️ |
| onetl/file/format/avro.py | 50.00% | 1 Missing and 1 partial ⚠️ |
| onetl/_util/scala.py | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #297      +/-   ##
===========================================
- Coverage    94.84%   94.70%   -0.14%     
===========================================
  Files          210      210              
  Lines         8200     8219      +19     
  Branches      1413     1420       +7     
===========================================
+ Hits          7777     7784       +7     
- Misses         299      305       +6     
- Partials       124      130       +6     
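
The percentages in this report can be re-derived from the raw counts. The sketch below assumes coverage is hits divided by lines, truncated to two decimals; that rule is inferred from the numbers above, not taken from Codecov documentation. The total of 34 changed lines is likewise inferred from the 55.88235% patch figure and the 15 uncovered lines.

```python
def coverage_basis_points(hits: int, lines: int) -> int:
    """Coverage in hundredths of a percent, truncated rather than rounded."""
    return hits * 10_000 // lines


base = coverage_basis_points(7777, 8200)  # develop
head = coverage_basis_points(7784, 8219)  # this PR

# Patch coverage: 15 changed lines lack coverage (misses plus partials).
patch_missing = 15
patch_lines = 34  # inferred, see the note above
patch_coverage = (patch_lines - patch_missing) / patch_lines * 100

print(f"base={base / 100:.2f}% head={head / 100:.2f}% diff={(head - base) / 100:+.2f}%")
print(f"patch={patch_coverage:.5f}%")
```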
