Upgrade services from iceberg 1.2.0 to iceberg 1.5.2 (#179)
## Summary
Upgrade `services` from Iceberg 1.2.0 to Iceberg 1.5.2. `apps`,
`integrations`, `libs`, and `tables-test-fixtures` will remain on Iceberg 1.2.0.

Notable changes between Iceberg 1.2.0 and Iceberg 1.5.2 include, but are
not limited to, the following:
- Add view support
- Add FileIO that supports ADLSv2 storage
- Support file and partition delete granularity
- Track partition statistics in TableMetadata
- Add last updated timestamp and snapshot ID to partitions metadata
table
- Add support for Spark 3.5 and remove support for Spark 3.2
- Add fast_forward procedure

## Changes
Added `openhouse.iceberg-conventions-1.2` to strictly pin the Iceberg
dependency versions to 1.2.0.

Excluded the Iceberg libraries from the `tables` and
`tables-test-fixtures` modules to remove the circular dependency.
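
For readers less familiar with the Gradle features involved, here is a minimal sketch of the two mechanisms, not the code in this commit. The `!!` suffix used in the convention plugin is Gradle's shorthand for a `strictly` version constraint. `exclude group:` drops a dependency's transitive Iceberg artifacts, so the consuming module resolves Iceberg from its own convention plugin instead. The project path `:some:module` is a hypothetical placeholder.

```groovy
dependencies {
    // Shorthand: '!!' after the version declares a strict version constraint.
    implementation 'org.apache.iceberg:iceberg-core:1.2.0!!'

    // Equivalent long form using a rich version declaration.
    implementation('org.apache.iceberg:iceberg-data') {
        version {
            strictly '1.2.0'
        }
    }

    // Hypothetical project dependency: keep its classes, but drop the Iceberg
    // artifacts it would otherwise bring in transitively.
    implementation(project(':some:module')) {
        exclude group: 'org.apache.iceberg'
    }
}
```

To check which Iceberg version a module actually resolves, Gradle's built-in `dependencyInsight` task can be run against the relevant configuration.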

**After the change, modules using iceberg 1.5.2 will be**:
- services
- cluster:storage
- internalcatalog
- htscatalog

**Modules using iceberg 1.2.0 will be**:
- apps
- integrations
- libs
- tables-test-fixtures

**Checkboxes**:
- [ ] Client-facing API Changes
- [ ] Internal API Changes
- [ ] Bug Fixes
- [ ] New Features
- [ ] Performance Improvements
- [ ] Code Style
- [x] Refactoring
- [ ] Documentation
- [ ] Tests

For all the boxes checked, please include additional details of the
changes made in this pull request.

## Testing Done
Comprehensive Spark compatibility tests were performed on a local cluster:
https://docs.google.com/document/d/1yXORH5ety5Gdr6Avsh7XEmIghiJ8wQrg0l4G4_9Soxo/edit#heading=h.h69v5xcbp1md

- [ ] Manually tested on local docker setup. Please include commands
run, and their output.
- [ ] Added new tests for the changes made.
- [x] Updated existing tests to reflect the changes made.
- [ ] No tests added or updated. Please explain why. If unsure, please
feel free to ask for help.
- [ ] Some other form of testing like staging or soak time in
production. Please explain.

For all the boxes checked, include a detailed description of the testing
done for the changes made in this pull request.

## Additional Information

- [ ] Breaking Changes
- [ ] Deprecations
- [ ] Large PR broken into smaller PRs, and PR plan linked in the
description.

For all the boxes checked, include additional details of the changes
made in this pull request.
jiang95-dev authored Oct 17, 2024
1 parent dee3a27 commit d49e636
Showing 15 changed files with 44 additions and 24 deletions.
2 changes: 1 addition & 1 deletion apps/spark/build.gradle
@@ -1,7 +1,7 @@
plugins {
id 'openhouse.java-conventions'
id 'openhouse.hadoop-conventions'
-id 'openhouse.iceberg-conventions'
+id 'openhouse.iceberg-conventions-1.2'
id 'openhouse.maven-publish'
id 'com.github.johnrengelman.shadow' version '7.1.2'
}
11 changes: 11 additions & 0 deletions buildSrc/src/main/groovy/openhouse.iceberg-conventions-1.2.gradle
@@ -0,0 +1,11 @@
+ext {
+icebergVersion = '1.2.0'
+}
+
+dependencies {
+implementation('org.apache.iceberg:iceberg-bundled-guava:' + icebergVersion + "!!")
+implementation('org.apache.iceberg:iceberg-data:' + icebergVersion + "!!")
+implementation('org.apache.iceberg:iceberg-core:' + icebergVersion + "!!")
+implementation('org.apache.iceberg:iceberg-common:' + icebergVersion + "!!")
+implementation('org.testcontainers:testcontainers:1.19.8')
+}
@@ -1,5 +1,5 @@
ext {
-icebergVersion = '1.2.0'
+icebergVersion = '1.5.2'
}

dependencies {
2 changes: 1 addition & 1 deletion cluster/storage/build.gradle
@@ -3,7 +3,7 @@ plugins {
id 'openhouse.hadoop-conventions'
id 'openhouse.iceberg-aws-conventions'
id 'openhouse.iceberg-azure-conventions'
-id 'openhouse.iceberg-conventions'
+id 'openhouse.iceberg-conventions-1.5.2'
id 'openhouse.maven-publish'
}

2 changes: 1 addition & 1 deletion iceberg/azure/build.gradle
@@ -1,6 +1,6 @@
plugins {
id 'openhouse.java-conventions'
-id 'openhouse.iceberg-conventions'
+id 'openhouse.iceberg-conventions-1.5.2'
id 'openhouse.iceberg-azure-conventions'
id 'openhouse.maven-publish'
}
2 changes: 1 addition & 1 deletion iceberg/openhouse/htscatalog/build.gradle
@@ -1,7 +1,7 @@
plugins {
id 'openhouse.springboot-conventions'
id 'openhouse.hadoop-conventions'
-id 'openhouse.iceberg-conventions'
+id 'openhouse.iceberg-conventions-1.5.2'
id 'openhouse.maven-publish'
}

2 changes: 1 addition & 1 deletion iceberg/openhouse/internalcatalog/build.gradle
@@ -1,7 +1,7 @@
plugins {
id 'openhouse.springboot-conventions'
id 'openhouse.client-codegen-convention'
-id 'openhouse.iceberg-conventions'
+id 'openhouse.iceberg-conventions-1.5.2'
id 'openhouse.iceberg-azure-conventions'
id 'openhouse.hadoop-conventions'
id 'openhouse.iceberg-aws-conventions'
@@ -114,8 +114,9 @@ void testDoCommitAppendSnapshotsInitialVersion() throws IOException {

Map<String, String> updatedProperties = tblMetadataCaptor.getValue().properties();
Assertions.assertEquals(
-4,
-updatedProperties.size()); /*location, lastModifiedTime, version and appended_snapshots*/
+5,
+updatedProperties
+.size()); /*write.parquet.compression-codec, location, lastModifiedTime, version and appended_snapshots*/
Assertions.assertEquals(
"INITIAL_VERSION", updatedProperties.get(getCanonicalFieldName("tableVersion")));
Assertions.assertEquals(
@@ -155,8 +156,9 @@ void testDoCommitAppendSnapshotsExistingVersion() throws IOException {

Map<String, String> updatedProperties = tblMetadataCaptor.getValue().properties();
Assertions.assertEquals(
-4,
-updatedProperties.size()); /*location, lastModifiedTime, version and deleted_snapshots*/
+5,
+updatedProperties
+.size()); /*write.parquet.compression-codec, location, lastModifiedTime, version and deleted_snapshots*/
Assertions.assertEquals(
TEST_LOCATION, updatedProperties.get(getCanonicalFieldName("tableVersion")));

@@ -205,8 +207,9 @@ void testDoCommitAppendAndDeleteSnapshots() throws IOException {

Map<String, String> updatedProperties = tblMetadataCaptor.getValue().properties();
Assertions.assertEquals(
-5,
-updatedProperties.size()); /*location, lastModifiedTime, version and deleted_snapshots*/
+6,
+updatedProperties
+.size()); /*write.parquet.compression-codec, location, lastModifiedTime, version, appended_snapshots and deleted_snapshots*/
Assertions.assertEquals(
TEST_LOCATION, updatedProperties.get(getCanonicalFieldName("tableVersion")));

@@ -259,8 +262,9 @@ void testDoCommitDeleteSnapshots() throws IOException {

Map<String, String> updatedProperties = tblMetadataCaptor.getValue().properties();
Assertions.assertEquals(
-4,
-updatedProperties.size()); /*location, lastModifiedTime, version and deleted_snapshots*/
+5,
+updatedProperties
+.size()); /*write.parquet.compression-codec, location, lastModifiedTime, version and deleted_snapshots*/
Assertions.assertEquals(
TEST_LOCATION, updatedProperties.get(getCanonicalFieldName("tableVersion")));

@@ -1,5 +1,5 @@
[
-"{ \"snapshot-id\" : 1151407017102313399, \"parent-snapshot-id\" : 1151407017102313398, \"timestamp-ms\" : 1669126937912, \"summary\" : { \"operation\" : \"append\", \"wap.id\" : \"wap1\", \"spark.app.id\" : \"local-1669126906634\", \"added-data-files\" : \"1\", \"added-records\" : \"1\", \"added-files-size\" : \"673\", \"changed-partition-count\" : \"1\", \"total-records\" : \"1\", \"total-files-size\" : \"673\", \"total-data-files\" : \"1\", \"total-delete-files\" : \"0\", \"total-position-deletes\" : \"0\", \"total-equality-deletes\" : \"0\" }, \"manifest-list\" : \"/data/openhouse/db/test-7a9e8c95-1a62-4d29-9621-d8784047fc6b/metadata/snap-2151407017102313398-1-aa0dcbb9-707f-4f53-9df8-394bad8563f2.avro\", \"schema-id\" : 0}",
-"{ \"snapshot-id\" : 2151407017102313399, \"parent-snapshot-id\" : 1151407017102313398, \"timestamp-ms\" : 1669126937912, \"summary\" : { \"operation\" : \"append\", \"wap.id\" : \"wap2\", \"spark.app.id\" : \"local-1669126906634\", \"added-data-files\" : \"1\", \"added-records\" : \"1\", \"added-files-size\" : \"673\", \"changed-partition-count\" : \"1\", \"total-records\" : \"1\", \"total-files-size\" : \"673\", \"total-data-files\" : \"1\", \"total-delete-files\" : \"0\", \"total-position-deletes\" : \"0\", \"total-equality-deletes\" : \"0\" }, \"manifest-list\" : \"/data/openhouse/db/test-7a9e8c95-1a62-4d29-9621-d8784047fc6b/metadata/snap-2151407017102313398-1-aa0dcbb9-707f-4f53-9df8-394bad8563f2.avro\", \"schema-id\" : 0}",
-"{ \"snapshot-id\" : 3151407017102313399, \"parent-snapshot-id\" : 1151407017102313399, \"timestamp-ms\" : 1669126937912, \"summary\" : { \"source-snapshot-id\" : \"2151407017102313399\", \"operation\" : \"append\", \"published.wap.id\" : \"wap2\", \"spark.app.id\" : \"local-1669126906634\", \"added-data-files\" : \"1\", \"added-records\" : \"1\", \"added-files-size\" : \"673\", \"changed-partition-count\" : \"1\", \"total-records\" : \"1\", \"total-files-size\" : \"673\", \"total-data-files\" : \"1\", \"total-delete-files\" : \"0\", \"total-position-deletes\" : \"0\", \"total-equality-deletes\" : \"0\" }, \"manifest-list\" : \"/data/openhouse/db/test-7a9e8c95-1a62-4d29-9621-d8784047fc6b/metadata/snap-2151407017102313398-1-aa0dcbb9-707f-4f53-9df8-394bad8563f2.avro\", \"schema-id\" : 0}"
+"{ \"snapshot-id\" : 1151407017102313399, \"parent-snapshot-id\" : 1151407017102313398, \"sequence-number\" : 1, \"timestamp-ms\" : 1669126937912, \"summary\" : { \"operation\" : \"append\", \"wap.id\" : \"wap1\", \"spark.app.id\" : \"local-1669126906634\", \"added-data-files\" : \"1\", \"added-records\" : \"1\", \"added-files-size\" : \"673\", \"changed-partition-count\" : \"1\", \"total-records\" : \"1\", \"total-files-size\" : \"673\", \"total-data-files\" : \"1\", \"total-delete-files\" : \"0\", \"total-position-deletes\" : \"0\", \"total-equality-deletes\" : \"0\" }, \"manifest-list\" : \"/data/openhouse/db/test-7a9e8c95-1a62-4d29-9621-d8784047fc6b/metadata/snap-2151407017102313398-1-aa0dcbb9-707f-4f53-9df8-394bad8563f2.avro\", \"schema-id\" : 0}",
+"{ \"snapshot-id\" : 2151407017102313399, \"parent-snapshot-id\" : 1151407017102313398, \"sequence-number\" : 2, \"timestamp-ms\" : 1669126937912, \"summary\" : { \"operation\" : \"append\", \"wap.id\" : \"wap2\", \"spark.app.id\" : \"local-1669126906634\", \"added-data-files\" : \"1\", \"added-records\" : \"1\", \"added-files-size\" : \"673\", \"changed-partition-count\" : \"1\", \"total-records\" : \"1\", \"total-files-size\" : \"673\", \"total-data-files\" : \"1\", \"total-delete-files\" : \"0\", \"total-position-deletes\" : \"0\", \"total-equality-deletes\" : \"0\" }, \"manifest-list\" : \"/data/openhouse/db/test-7a9e8c95-1a62-4d29-9621-d8784047fc6b/metadata/snap-2151407017102313398-1-aa0dcbb9-707f-4f53-9df8-394bad8563f2.avro\", \"schema-id\" : 0}",
+"{ \"snapshot-id\" : 3151407017102313399, \"parent-snapshot-id\" : 1151407017102313399, \"sequence-number\" : 3, \"timestamp-ms\" : 1669126937912, \"summary\" : { \"source-snapshot-id\" : \"2151407017102313399\", \"operation\" : \"append\", \"published.wap.id\" : \"wap2\", \"spark.app.id\" : \"local-1669126906634\", \"added-data-files\" : \"1\", \"added-records\" : \"1\", \"added-files-size\" : \"673\", \"changed-partition-count\" : \"1\", \"total-records\" : \"1\", \"total-files-size\" : \"673\", \"total-data-files\" : \"1\", \"total-delete-files\" : \"0\", \"total-position-deletes\" : \"0\", \"total-equality-deletes\" : \"0\" }, \"manifest-list\" : \"/data/openhouse/db/test-7a9e8c95-1a62-4d29-9621-d8784047fc6b/metadata/snap-2151407017102313398-1-aa0dcbb9-707f-4f53-9df8-394bad8563f2.avro\", \"schema-id\" : 0}"
]
2 changes: 1 addition & 1 deletion libs/datalayout/build.gradle
@@ -1,7 +1,7 @@
plugins {
id 'openhouse.java-conventions'
id 'openhouse.hadoop-conventions'
-id 'openhouse.iceberg-conventions'
+id 'openhouse.iceberg-conventions-1.2'
id 'openhouse.maven-publish'
}

2 changes: 1 addition & 1 deletion services/common/build.gradle
@@ -1,6 +1,6 @@
plugins {
id 'openhouse.java-conventions'
-id 'openhouse.iceberg-conventions'
+id 'openhouse.iceberg-conventions-1.5.2'
id 'openhouse.hadoop-conventions'
id 'openhouse.maven-publish'
id 'java-test-fixtures'
2 changes: 1 addition & 1 deletion services/housetables/build.gradle
@@ -1,7 +1,7 @@
plugins {
id 'openhouse.springboot-ext-conventions'
id 'openhouse.hadoop-conventions'
-id 'openhouse.iceberg-conventions'
+id 'openhouse.iceberg-conventions-1.5.2'
id 'openhouse.maven-publish'
/**
* FIXME: Ideally, the below line are also defined in shared buildSrc. But raises following error:
2 changes: 1 addition & 1 deletion services/jobs/build.gradle
@@ -1,7 +1,7 @@
plugins {
id 'openhouse.springboot-ext-conventions'
id 'openhouse.hadoop-conventions'
-id 'openhouse.iceberg-conventions'
+id 'openhouse.iceberg-conventions-1.5.2'
id 'openhouse.maven-publish'
/**
* FIXME: Ideally, the below line are also defined in shared buildSrc. But raises following error:
6 changes: 4 additions & 2 deletions services/tables/build.gradle
@@ -3,7 +3,7 @@ plugins {
id 'openhouse.hadoop-conventions'
id 'openhouse.iceberg-aws-conventions'
id 'openhouse.iceberg-azure-conventions'
-id 'openhouse.iceberg-conventions'
+id 'openhouse.iceberg-conventions-1.5.2'
id 'openhouse.maven-publish'
/**
* FIXME: Ideally, the below line are also defined in shared buildSrc. But raises following error:
@@ -34,5 +34,7 @@ dependencies {
testImplementation 'org.junit.jupiter:junit-jupiter-engine:' + junit_version
testImplementation 'org.springframework.security:spring-security-test:5.7.3'
testImplementation(testFixtures(project(':services:common')))
-testImplementation project(':tables-test-fixtures_2.12')
+testImplementation (project(':tables-test-fixtures_2.12')) {
+exclude group: 'org.apache.iceberg'
+}
}
5 changes: 4 additions & 1 deletion tables-test-fixtures/build.gradle
@@ -2,6 +2,7 @@ plugins {
id 'openhouse.java-minimal-conventions'
id 'com.github.johnrengelman.shadow' version '7.1.2'
id 'openhouse.maven-publish'
+id 'openhouse.iceberg-conventions-1.2'
}

import com.github.jengelman.gradle.plugins.shadow.transformers.PropertiesFileTransformer
@@ -16,7 +17,9 @@ configurations {
dependencies {
implementation 'org.junit.jupiter:junit-jupiter-engine:' + junit_version
implementation 'org.springframework.boot:spring-boot-starter-test:' + spring_web_version
-implementation(project(':services:tables'))
+implementation ((project(':services:tables'))) {
+exclude group: 'org.apache.iceberg'
+}
compileOnly 'org.springframework.boot:spring-boot-starter-tomcat:' + spring_web_version
compileOnly('org.apache.spark:spark-sql_2.12:' + spark_version){
// These classes are available from `client-codegen-convention.gradle`