Incorrect handling of Unicode keys when creating npz files #49

Open
cerisola opened this issue Oct 31, 2021 · 1 comment

Hi, I am running into a problem when using NPZ.jl to create an npz file that uses Unicode strings as keys.

To be clear, everything works when the file is created with NumPy and read with NPZ. That is, this works in Python:

>>> import numpy as np

>>> np.savez("file.npz", α=1)

>>> D = np.load("file.npz")

>>> print(D["α"])
1

and reading the file in Julia using NPZ also works as expected:

julia> using NPZ

julia> D = npzread("file.npz")
Dict{String, Int64} with 1 entry:
  "α" => 1

julia> D["α"]
1

However, if I create the file with NPZ, NPZ reads it back as expected, but NumPy cannot read it properly.
Indeed, from the NPZ side:

julia> npzwrite("file.npz", Dict("α" => 1))

julia> D = npzread("file.npz")
Dict{String, Int64} with 1 entry:
  "α" => 1

julia> D["α"]
1

everything works fine. However, when I open the file with NumPy, it loads, but the keys are not what I would expect:

>>> D = np.load("file.npz")

>>> print(D["α"])
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-17-7d756a0b03cf> in <module>
----> 1 print(D["α"])

/usr/lib/python3.9/site-packages/numpy/lib/npyio.py in __getitem__(self, key)
    258                 return self.zip.read(key)
    259         else:
--> 260             raise KeyError("%s is not a file in the archive" % key)
    261 
    262 

KeyError: 'α is not a file in the archive'

Indeed, if I print the keys of the loaded file, I get a different Unicode string:

>>> list(D.keys())
['╬▒']
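
The garbled key matches what you would get if the UTF-8 bytes of "α" were decoded with the legacy CP437 code page. That is only an assumption about what happens on the reading side, not something confirmed above, but it can be checked with a minimal sketch:

```python
# Assumption: the archive stores the entry name as UTF-8 bytes, and the
# reader decodes those bytes as CP437 instead of UTF-8.
key = "α"
utf8_bytes = key.encode("utf-8")      # two bytes: 0xCE 0xB1
garbled = utf8_bytes.decode("cp437")  # what a CP437 reader would display
print(garbled)                        # prints ╬▒
```

The result is the same string that `D.keys()` returns, which suggests the key bytes themselves are intact and only the declared encoding is at fault.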

cerisola commented Nov 2, 2021

After digging into the library source to find the cause, I am now fairly sure the problem lies in the ZipFile.jl library that NPZ.jl uses to create the zip file. I have opened an issue on the ZipFile.jl project (see fhs/ZipFile.jl#84) to address it.
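
Until this is fixed upstream, a possible workaround on the Python side is to undo the mis-decoding of the keys. The sketch below assumes the cause is that the archive is written without the zip "language encoding" flag (general purpose bit 11, value 0x800), so Python's `zipfile` falls back to CP437 for entry names. That assumption is not confirmed here. `recover_key` is a hypothetical helper, not part of NumPy or NPZ.jl.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: entry name is encoded as UTF-8


def recover_key(name: str) -> str:
    """Undo a CP437 mis-decoding of UTF-8 bytes (hypothetical helper).

    Returns the name unchanged if it cannot be round-tripped.
    """
    try:
        return name.encode("cp437").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return name


# For comparison: an archive written by Python's own zipfile module sets the
# flag for non-ASCII names, which is why NumPy-written files read back fine.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("α.npy", b"")
with zipfile.ZipFile(buf) as zf:
    info = zf.infolist()[0]
    flag_set = bool(info.flag_bits & UTF8_FLAG)

print(recover_key("╬▒"), flag_set)
```

With a file loaded through `np.load`, the keys could then be remapped with something like `{recover_key(k): D[k] for k in D.files}`. Plain ASCII keys pass through unchanged.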
