[Bug]: Getting SSL error while using gcsio.GcsIO #24333

Closed
lkbhitesh07 opened this issue Nov 23, 2022 · 3 comments

@lkbhitesh07

lkbhitesh07 commented Nov 23, 2022

What happened?

Context
We are adding support for communicating with GCS from our Beam jobs. To do that, we are using gcsio.GcsIO (Python). We are now hitting an unexpected SSL error whose cause we cannot identify.

Relevant links to the code
The relevant code is linked below:

  1. Relevant PR - here
  2. Our implementation that uses gcsio.GcsIO - here
  3. Our test file for this functionality - here

Stack trace

File "/workspace/core/jobs/io/gcs_io.py", line 95, in _read_file
    file = gcs.open(gcs_url, mode=self.mode)
  File "/usr/local/lib/python3.8/site-packages/apache_beam/io/gcp/gcsio.py", line 225, in open
    downloader = GcsDownloader(
  File "/usr/local/lib/python3.8/site-packages/apache_beam/io/gcp/gcsio.py", line 595, in __init__
    project_number = self._get_project_number(self._bucket)
  File "/usr/local/lib/python3.8/site-packages/apache_beam/io/gcp/gcsio.py", line 165, in get_project_number
    bucket_metadata = self.get_bucket(bucket_name=bucket)
  File "/usr/local/lib/python3.8/site-packages/apache_beam/io/gcp/gcsio.py", line 184, in get_bucket
    return self.client.buckets.Get(request)
  File "/usr/local/lib/python3.8/site-packages/apache_beam/io/gcp/internal/clients/storage/storage_v1_client.py", line 282, in Get
    return self._RunMethod(
  File "/usr/local/lib/python3.8/site-packages/apitools/base/py/base_api.py", line 728, in _RunMethod
    http_response = http_wrapper.MakeRequest(
  File "/usr/local/lib/python3.8/site-packages/apitools/base/py/http_wrapper.py", line 348, in MakeRequest
    return _MakeRequestNoRetry(
  File "/usr/local/lib/python3.8/site-packages/apitools/base/py/http_wrapper.py", line 397, in _MakeRequestNoRetry
    info, content = http.request(
  File "/usr/local/lib/python3.8/site-packages/oauth2client/transport.py", line 167, in new_request
    resp, content = orig_request_method(uri, method, body,
  File "/usr/local/lib/python3.8/site-packages/httplib2/__init__.py", line 1314, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/usr/local/lib/python3.8/site-packages/httplib2/__init__.py", line 1064, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/usr/local/lib/python3.8/site-packages/httplib2/__init__.py", line 987, in _conn_request
    conn.connect()
  File "/usr/local/lib/python3.8/http/client.py", line 1425, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/usr/local/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/local/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/usr/local/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
RuntimeError: ssl.SSLError: [SSL] internal error (_ssl.c:1131) [while running 'Read files from the GCS/Read the file-ptransform-72']

Please let me know what more information I can provide.
Thanks in advance.

Issue Priority

Priority: 2

Issue Component

Component: io-py-gcp

@Abacn
Contributor

Abacn commented Nov 23, 2022

Python GcsIO is based on apitools, which has been deprecated for a while. Migration to the Cloud API client is planned (#19073). If you just need to communicate with GCS, the Cloud client libraries are recommended.

@lkbhitesh07
Author

Thanks, @Abacn, I really appreciate your quick response. Just to confirm: can we use from google.cloud import storage instead? That is, would the storage library work fine for this?
Thanks

@Abacn
Contributor

Abacn commented Nov 26, 2022

@lkbhitesh07 google-cloud-storage is not yet a dependency of apache-beam[gcp]. The Py37, Py38, and Py39 base images we published do include this dependency, but the Py310 image does not, which is weird.
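Since the dependency may or may not be present depending on the image, one way to check a given environment is sketched below. It uses only the standard library (importlib.metadata, available from Python 3.8); the helper name installed_version is made up for this illustration and is not part of Beam.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent.

    `installed_version` is a hypothetical helper for this sketch.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("google-cloud-storage")
if version is None:
    print("google-cloud-storage is not installed; add it to your requirements")
else:
    print("google-cloud-storage", version)
```

Running this inside the worker image (not only on the machine that launches the job) shows what the pipeline will actually see at run time.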

Yes, using google-cloud-storage should work, as long as the dependency has been added to your project and environment.
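For illustration, a minimal sketch of reading a file with google-cloud-storage instead of gcsio.GcsIO might look like the following. The helper names split_gcs_url and read_gcs_file are hypothetical, the client is assumed to pick up Application Default Credentials, and whether this avoids the SSL error above has not been verified here.

```python
def split_gcs_url(gcs_url):
    """Split 'gs://bucket/path/to/object' into (bucket_name, object_name)."""
    prefix = "gs://"
    if not gcs_url.startswith(prefix):
        raise ValueError("expected a gs:// URL, got %r" % gcs_url)
    bucket_name, _, object_name = gcs_url[len(prefix):].partition("/")
    if not bucket_name or not object_name:
        raise ValueError("URL must name a bucket and an object: %r" % gcs_url)
    return bucket_name, object_name


def read_gcs_file(gcs_url, client=None):
    """Return the contents of the object at gcs_url as bytes.

    Hypothetical helper: if no client is passed, one is created with
    google-cloud-storage (assumed to be installed in the environment).
    """
    if client is None:
        # Imported lazily so this module still loads where the package is absent.
        from google.cloud import storage
        client = storage.Client()  # assumes Application Default Credentials
    bucket_name, object_name = split_gcs_url(gcs_url)
    blob = client.bucket(bucket_name).blob(object_name)
    return blob.download_as_bytes()
```

In a Beam job, a function like this would typically be called from inside a DoFn, with the client created once per worker (for example in the DoFn's setup method) rather than once per element.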
