Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: clarify locking mechanism requirement for S3 #2558

Merged
merged 8 commits into from
Jun 1, 2024

Conversation

inigohidalgo
Copy link
Contributor

@inigohidalgo inigohidalgo commented May 31, 2024

  • It was unclear to me that concurrent writing was available by default for non-S3 backends, so I am making the language clearer.
  • I have also added an extra section showing that R2 and maybe MinIO can enable concurrent writing
  • Fixed a couple of unrelated formatting issues in the page I edited

closes #2556

#2069 also had the same confusion

@github-actions github-actions bot added the binding/python Issues for the Python package label May 31, 2024
Copy link

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@inigohidalgo
Copy link
Contributor Author

ACTION NEEDED

delta-rs follows the [Conventional Commits specification]

I have updated the pr title

@inigohidalgo inigohidalgo changed the title Docs: clarify locking mechanism requirement for S3 dpcs: clarify locking mechanism requirement for S3 May 31, 2024
@inigohidalgo inigohidalgo changed the title dpcs: clarify locking mechanism requirement for S3 docs: clarify locking mechanism requirement for S3 May 31, 2024
@ion-elgreco
Copy link
Collaborator

Can you mention the exception for Cloudflare R2, see this PR: #2083

@ion-elgreco
Copy link
Collaborator

ion-elgreco commented May 31, 2024

#1356 (comment) here is an example for S3 Cloudflare

@inigohidalgo
Copy link
Contributor Author

To make the docstrings explicit: what is the current status of other backends like MinIO? Can concurrent writes be enabled for them with DynamoDB? Or does that only apply to AWS?

@ion-elgreco
Copy link
Collaborator

To make the docstrings explicit: what is the current status of other backends like MinIO? Can concurrent writes be enabled for them with DynamoDB? Or does that only apply to AWS?

I'm not sure about Minio, @wjones127 mentions it supports custom headers so it might work as well for Minio like R2.

@inigohidalgo
Copy link
Contributor Author

Ok. Should I add a small section at the end (or beginning) of the S3 section with the example from the linked issue and also indicate that the same might be possible in MinIO? Or should I just mention the known R2 compatibility?

@ion-elgreco
Copy link
Collaborator

You can mention that Minio might work in similar fashion, but needs to be double checked against their docs or something along those lines

@inigohidalgo
Copy link
Contributor Author

Just spotted this section in the readme https://github.com/delta-io/delta-rs/tree/b05d7e90dc0717b39c7fca35ec3c99c251aee839?tab=readme-ov-file#cloud-integrations

Should I change anything there for MinIO?

@inigohidalgo
Copy link
Contributor Author

I have added some cross-referencing links in the paragraphs I added so I wanted to check they work ok on my end before publishing the PR. I am having some trouble getting the docs to build but the rust + python docs builds are throwing me off a bit (I'm just used to Sphinx)

cd python
make develop  # crate and wheel build successfully
make build-docs  # installs all requirements correctly but then fails when building
INFO    -  [macros] - Macros arguments: {'module_name': 'docs/_build/macro', 'modules': [], 'render_by_default': True, 'include_dir': '', 'include_yaml': [], 'j2_block_start_string': '', 'j2_block_end_string': '',
           'j2_variable_start_string': '', 'j2_variable_end_string': '', 'on_undefined': 'keep', 'on_error_fail': False, 'verbose': False}
INFO    -  [macros] - Found local Python module 'docs/_build/macro' in: /Users/inigo/Programming/Repos/delta-rs
INFO    -  [macros] - Found external Python module 'docs/_build/macro' in: /Users/inigo/Programming/Repos/delta-rs
INFO    -  [macros] - Extra variables (config file): ['python_api_url', 'generator', 'social']
INFO    -  [macros] - Extra filters (module): ['pretty']
INFO    -  Cleaning site directory
INFO    -  Building documentation to directory: /Users/inigo/Programming/Repos/delta-rs/site
INFO    -  mkdocstrings_handlers: Formatting signatures requires Black to be installed.
ERROR   -  Error reading page 'api/exceptions.md': Could not resolve alias deltalake._internal.DeltaError pointing at _internal.DeltaError (in python/deltalake/_internal.abi3.so:None)
Traceback (most recent call last):
  File "/Users/inigo/Programming/Repos/delta-rs/.venv/lib/python3.10/site-packages/griffe/dataclasses.py", line 1373, in _resolve_target
    resolved = self.modules_collection.get_member(self.target_path)
  File "/Users/inigo/Programming/Repos/delta-rs/.venv/lib/python3.10/site-packages/griffe/mixins.py", line 78, in get_member
    return self.members[parts[0]].get_member(parts[1:])  # type: ignore[attr-defined]
KeyError: '_internal'

If I open python I am able to access deltalake.exceptions.DeltaError without issue so I'm a bit lost on what the problem could be.

Copy link
Contributor Author

@inigohidalgo inigohidalgo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cross-reference syntax

docs/usage/writing/writing-to-s3-with-locking-provider.md Outdated Show resolved Hide resolved
python/deltalake/writer.py Outdated Show resolved Hide resolved
@inigohidalgo inigohidalgo marked this pull request as ready for review June 1, 2024 18:07
@inigohidalgo
Copy link
Contributor Author

I have verified the documentation builds on my end, and the links I added work okay.

Copy link
Collaborator

@ion-elgreco ion-elgreco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thankss for writing this up!

@ion-elgreco ion-elgreco enabled auto-merge (squash) June 1, 2024 18:10
@inigohidalgo
Copy link
Contributor Author

https://github.com/delta-io/delta-rs/actions/runs/9332194054/job/25687819730?pr=2558

Oops. I forgot I had this same failure locally #2558 (comment), and I commented out the exceptions in docs/api/exceptions.md. I don't see how this can be related to my PR though.

@ion-elgreco ion-elgreco merged commit d42b68d into delta-io:main Jun 1, 2024
24 of 25 checks passed
@inigohidalgo
Copy link
Contributor Author

Okay I see that action has been failing for a while.

@inigohidalgo inigohidalgo deleted the doc/concurrent-writes branch June 3, 2024 12:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/python Issues for the Python package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

documentation: concurrent writes for non-S3 backends
2 participants