-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce AllocateTableStorage
in Storage interface
#119
Conversation
cc: @sumedhsakdeo , @jainlavina , @ctrezzo |
cluster/storage/src/main/java/com/linkedin/openhouse/cluster/storage/BaseStorage.java
Outdated
Show resolved
Hide resolved
cluster/storage/src/main/java/com/linkedin/openhouse/cluster/storage/Storage.java
Outdated
Show resolved
Hide resolved
cluster/storage/src/main/java/com/linkedin/openhouse/cluster/storage/BaseStorage.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code looks good, had one question about skipProvisioning
cluster/storage/src/main/java/com/linkedin/openhouse/cluster/storage/BaseStorage.java
Outdated
Show resolved
Hide resolved
cluster/storage/src/main/java/com/linkedin/openhouse/cluster/storage/hdfs/HdfsStorage.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the work!
I left some comments mainly centered around how we encapsulate the table location manipulation logic.
I have scar tissue from the MapReduce code and converting between Paths, URIs, URLs and having that logic scattered everywhere, so sorry in advance if I am focused too much on this.
cluster/storage/src/main/java/com/linkedin/openhouse/cluster/storage/BaseStorage.java
Show resolved
Hide resolved
cluster/storage/src/main/java/com/linkedin/openhouse/cluster/storage/BaseStorage.java
Show resolved
Hide resolved
cluster/storage/src/main/java/com/linkedin/openhouse/cluster/storage/hdfs/HdfsStorage.java
Show resolved
Hide resolved
...les/src/main/java/com/linkedin/openhouse/tables/repository/impl/InternalRepositoryUtils.java
Show resolved
Hide resolved
cluster/storage/src/main/java/com/linkedin/openhouse/cluster/storage/BaseStorage.java
Outdated
Show resolved
Hide resolved
AllocateTableSpace
in Storage interface AllocateTableStorage
in Storage interface
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Thank you for providing the extra clarifying comments around issue #121 . I think it makes sense that the logic will be contained in BaseStorage once this issue is fixed and the refactor around allocateTableLocation is complete.
Summary
We introduce a new method in the storage interface as follows:
which returns the tablelocation where table objects will be created.
Implementations of this api should also execute one time operations such as creating buckets, assigning permissions etc.
Please note the comment:
this method should avoid creating TableFormat specific directories (ex: /data, /metadata for Iceberg or _delta_log for DeltaLake)
,it describes what shouldn't be done as part of this implementation.Layering with internal implementations
Internal implementations such as
LiHdfsStorage
will now have flexibiliity to do one-time operations such as table provisioning.Changes
For all the boxes checked, please include additional details of the changes made in this pull request.
Testing Done