FTP stream decompression - unable to transparently decompress #841

Open
3 tasks done
nickmtrendii opened this issue Oct 16, 2024 · 2 comments · May be fixed by #842

@nickmtrendii

Problem description

When attempting to decompress a file stream from an FTP source, I am getting

WARNING unable to transparently decompress <_io.BufferedReader name=4> because it seems to lack a string-like .name

This causes errors downstream, because smart_open skips decompression and returns the still-compressed file contents.
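To illustrate what the warning is complaining about, here is a minimal standard-library sketch. It assumes the extension is inferred from a string `.name` attribute on the file object; `infer_extension` and `demo` are hypothetical helpers written for this illustration, not smart_open's actual code. A file object opened from a file descriptor reports an integer `.name`, much like the `<_io.BufferedReader name=4>` in the warning, so there is no extension to infer.

```python
import gzip
import os
import tempfile


def infer_extension(fileobj):
    """Hypothetical helper: return the extension from a string .name, else None."""
    name = getattr(fileobj, "name", None)
    if not isinstance(name, str):
        # An integer name (a file descriptor) carries no filename to inspect.
        return None
    return os.path.splitext(name)[1].lower()


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "myfile.txt.gz")
        with gzip.open(path, "wb") as fout:
            fout.write(b"hello\n")

        # Opened by path: .name is the path string, so the extension is visible.
        with open(path, "rb") as by_path:
            ext_by_path = infer_extension(by_path)

        # Opened from a file descriptor: .name is the integer descriptor.
        fd = os.open(path, os.O_RDONLY)
        with os.fdopen(fd, "rb") as by_fd:
            ext_by_fd = infer_extension(by_fd)
            name_type = type(by_fd.name).__name__

    return ext_by_path, ext_by_fd, name_type


if __name__ == "__main__":
    print(demo())
```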

Steps/code to reproduce the problem

import smart_open
import logging
logging.basicConfig(
    level='DEBUG'
)

def main():
    streamreader = smart_open.open(uri='ftp://myftp.example//myfile.txt.gz', mode='r', errors='surrogateescape', encoding='utf-8')
    print(next(streamreader))

if __name__ == '__main__':
    main()

Output:

DEBUG:smart_open.smart_open_lib:{'uri': 'ftp://myftp.example//myfile.txt.gz', 'mode': 'r', 'buffering': -1, 'encoding': 'utf-8', 'errors': 'surrogateescape', 'newline': None, 'closefd': True, 'opener': None, 'compression': 'infer_from_extension', 'transport_params': None}
WARNING:smart_open.compression:unable to transparently decompress <_io.BufferedReader name=580> because it seems to lack a string-like .name
DEBUG:smart_open.smart_open_lib:encoding_wrapper: {'fileobj': <_io.BufferedReader name=580>, 'mode': 'r', 'encoding': 'utf-8', 'errors': 'surrogateescape', 'newline': None}
▼���r�8��y�O����ٰ\♦�����Gf�TWugMWTfw�\9(        �X�H5I��        ]���l►�-�i��G��g.&�2y�u@�~▬A|���ok?�A��ُe���9�%-n�bz�R6�n�:+♂V��☼⌂a�Vj��▬�}�▬���}��a������D�������<��t��♂�ؗ"�↨��Oy�`?��J������O�����fu�>�dEqϲ�������

I (crudely) patched smart_open_lib.py#L225 to pass the filename parameter, and with that change my file decodes correctly.

    decompressed = so_compression.compression_wrapper(binary, binary_mode, compression, filename=os.path.basename(uri))
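The idea behind that change can be sketched with the standard library alone. This is only an illustration under stated assumptions: `decompress_by_name` is a hypothetical stand-in, not smart_open's `compression_wrapper`, and it handles only `.gz`. It shows how an explicitly supplied filename lets the wrapper choose a decompressor even when the stream itself has no usable `.name`.

```python
import gzip
import io


def decompress_by_name(fileobj, filename=None):
    """Hypothetical wrapper: choose a decompressor from an explicit filename.

    Falls back to fileobj.name when no filename is given, and returns the
    stream unchanged when no string name ending in .gz is available.
    """
    name = filename if filename is not None else getattr(fileobj, "name", None)
    if isinstance(name, str) and name.lower().endswith(".gz"):
        return gzip.GzipFile(fileobj=fileobj, mode="rb")
    return fileobj


if __name__ == "__main__":
    payload = gzip.compress(b"hello\n")

    # io.BytesIO has no .name, so without a filename the stream is returned as-is.
    raw = io.BytesIO(payload)
    print(decompress_by_name(raw) is raw)

    # With the filename supplied, the gzip layer is applied.
    wrapped = decompress_by_name(io.BytesIO(payload), filename="myfile.txt.gz")
    print(wrapped.read())
```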

Versions

Windows-10-10.0.19045-SP0
Python 3.13.0 (tags/v3.13.0:60403a5, Oct  7 2024, 09:38:07) [MSC v.1941 64 bit (AMD64)]
smart_open 7.0.5

Checklist

Before you create the issue, please make sure you have:

  • Described the problem clearly
  • Provided a minimal reproducible example, including any required data
  • Provided the version numbers of the relevant software
@ddelange
Contributor

No need for the basename call: passing filename=uri should be sufficient, and it is safe because uri is always available at that point anyway. @mpenkov any reason not to add this?
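As a quick check of that point, assuming the extension is taken with something like `os.path.splitext` (an assumption for illustration; smart_open's own inference may differ), the full URI and its basename give the same extension for the URI in the report:

```python
import os

uri = "ftp://myftp.example//myfile.txt.gz"

# Extension taken from the whole URI versus from its basename only.
ext_from_uri = os.path.splitext(uri)[1]
ext_from_basename = os.path.splitext(os.path.basename(uri))[1]

if __name__ == "__main__":
    print(ext_from_uri, ext_from_basename)
```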

@mpenkov
Collaborator

mpenkov commented Oct 16, 2024

No, can't think of one.
