KeyError: "'codedValues' not defined in section 7" when loading a possibly corrupted file #131

lkugler · 2018-07-22T20:30:22Z

Hi all,
I got a KeyError: "'codedValues' not defined in section 7" when loading a GRIB2 file. I suspect the file is somehow corrupt, because loading works for some files but not for others.

I uploaded a sample file here, which raises the error for me: http://homepage.univie.ac.at/a1254888/ICON_EU_single_level_elements_CLCM_2018050512.grib2
It originates from a GRIB2 file from opendata.dwd.de, which was loaded with Iris, reduced in size (area) and saved. The error is raised for about 10-20% of all files.

It is enough to run
import iris
cube = iris.load('ICON_EU_single_level_elements_CLCM_2018050512.grib2')[0]
cube.data  # accessing the data triggers the error
and the error is raised.

How should I load many files when I don't know whether, or which, files are corrupted?
For now I am iterating over all files and loading each one inside a try block. That does not seem to be much slower on ~300 files.

Would be great if you could share your comments on the problem.
Thanks!


DPeterK · 2018-08-20T13:13:20Z

Hi @loxn8773 - thanks for raising this, and apologies that it's taken a little while for anyone to get back to you...

As you may be aware, a GRIB2 file is stored as a series of messages, each containing data and metadata. Section 7 of each message stores the actual data values for that message: a 2D field of (latitude-like, longitude-like) values at a given height, time, or other coordinate. When all the messages within a GRIB2 file are loaded, the 2D fields are tiled together to make one or more n-dimensional data structures (such as Iris cubes). If the data values are missing from a message, one of the tiles has no data, so part of the cube cannot be filled.

Because the codedValues key stores the 2D data for each message, Iris cannot construct a cube from the data if this key is missing. The key must therefore be present in every message, and iris-grib raises an error if it isn't - this is the error you've encountered here.

In effect then, this GRIB2 file is corrupted. It might be worth contacting the data supplier to see if you can work out why some of the messages are missing their data element. You can see some of the missing codedValues keys by using the grib_dump command-line tool and grepping for both the message header and the presence of a codedValues key within each message:

grib_dump -O ICON_EU_single_level_elements_CLCM_2018050512.grib2 | grep "MESSAGE\|codedV"
...
#==============   MESSAGE 22 ( length=179 )                ==============
#==============   MESSAGE 23 ( length=2525 )               ==============
6-2351    codedValues = (782,2346) {
} # data_g2simple_packing codedValues 
#==============   MESSAGE 24 ( length=2525 )               ==============
6-2351    codedValues = (782,2346) {
...

You can see from this snippet that message 22 (among others) is missing a codedValues key.

In terms of working around this in Python, it sounds like your existing try-except is a very reasonable solution, especially as it isn't too slow in your case. One danger is that you're loading a lot of data into memory, as you only encounter the error on data loading. This means the try-except solution may break down with more significant data volumes.
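A minimal sketch of the try-except approach, written so that each file is checked on its own and its data is not kept afterwards. The names `split_readable`, `try_load` and `realise_first_cube` are hypothetical and chosen only for illustration; the only Iris calls used are the `iris.load(...)[0]` and `cube.data` accesses from the original report.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def split_readable(
    paths: Iterable[str],
    try_load: Callable[[str], object],
) -> Tuple[List[str], Dict[str, KeyError]]:
    """Call try_load on each path and sort the paths into good and failed.

    Only the path (and the exception, for failures) is kept, so the data
    of one file can be released before the next file is read.
    """
    good: List[str] = []
    failed: Dict[str, KeyError] = {}
    for path in paths:
        try:
            try_load(path)
        except KeyError as err:  # e.g. "'codedValues' not defined in section 7"
            failed[path] = err
        else:
            good.append(path)
    return good, failed


def realise_first_cube(path: str) -> None:
    """Hypothetical loader: realise the first cube's data, then discard it."""
    import iris  # imported lazily so split_readable works without Iris

    cube = iris.load(path)[0]
    cube.data  # the KeyError surfaces here, when the data is actually read
```

Calling `split_readable(filenames, realise_first_cube)` would then give a list of files that can be loaded safely and a mapping of the problem files to their errors.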

Another option is to make use of some of the underlying iris_grib functionality to check each message for the presence of a codedValues key, and only load the resultant cube if all messages contributing to the cube have a codedValues key. I've put together an example of this, which I've attached, but note that this does not take account of differing phenomena in the input fields, so if you have a single file that loads to more than one cube, this code may not behave correctly.
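The per-message check described above could look roughly like the sketch below. It is not the attached file, and it rests on two assumptions about iris_grib: that `GribMessage.messages_from_filename` yields one message object per GRIB message, and that `message.sections[7].keys()` lists the keys available in section 7. The function names are hypothetical.

```python
from typing import Iterable, List


def messages_missing_data(messages: Iterable) -> List[int]:
    """Return the 1-based positions of messages without a codedValues key.

    Each message is assumed to expose `sections[7].keys()`.
    """
    missing = []
    for index, message in enumerate(messages, start=1):
        if "codedValues" not in message.sections[7].keys():
            missing.append(index)
    return missing


def file_is_complete(filename: str) -> bool:
    """True if every message in the file carries a codedValues key."""
    # Lazy import: the helper above can be used without iris_grib installed.
    from iris_grib.message import GribMessage

    messages = GribMessage.messages_from_filename(filename)
    return not messages_missing_data(messages)
```

Like the attached example, this treats the file as a whole and does not separate messages by phenomenon, so a file that loads to more than one cube would be rejected if any one of its messages lacks data.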

Hope this helps!

codedValues_key_present.txt

lkugler · 2018-08-20T14:25:47Z

Great explanation, thanks!

trexfeathers · 2022-09-21T09:54:51Z

This looks like it was solved several years ago! We're gonna close it, feel free to re-open if you still need help.

greenlaw · 2023-02-07T17:52:22Z

For anyone else who comes across this - We have also encountered this with many messages, and according to people who are more knowledgeable about GRIB2 than I am, it is completely valid for codedValues to be missing in certain cases. In those cases, the array can be computed by inspecting the contents of Section 5 (Data Representation Section).

To quote the spec:

(4) The original data value Y (in the units of code table 4.2) can be recovered with the formula:

Y * 10**D = R + (X1 + X2) * 2**E

For simple packing and all spectral data
E = Binary scale factor,
D = Decimal scale factor
R = Reference value of the whole field,
X1 = 0,
X2 = Scaled (encoded) value.

For complex grid point packing schemes, E, D, and R are as above, but

X1 = Reference value (scaled integer) of the group the data value belongs to,
X2 = Scaled (encoded) value with the group reference value (X1) removed.

More information can be found here: https://apps.ecmwf.int/codes/grib/format/grib2/regulations/
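As a rough illustration of the formula quoted above, rearranged to Y = (R + (X1 + X2) * 2**E) / 10**D, here is a small sketch in plain Python. The function names are hypothetical, and `constant_field` assumes that a message without codedValues corresponds to every X2 being zero, as in a constant field; that is an assumption, not something the spec excerpt states.

```python
from typing import List, Sequence


def unpack_values(
    x2: Sequence[int],
    reference_value: float,   # R
    binary_scale: int,        # E
    decimal_scale: int,       # D
    x1: int = 0,              # 0 for simple packing
) -> List[float]:
    """Apply Y = (R + (X1 + X2) * 2**E) / 10**D to each encoded value."""
    scale = 2.0 ** binary_scale
    divisor = 10.0 ** decimal_scale
    return [(reference_value + (x1 + value) * scale) / divisor for value in x2]


def constant_field(
    n_points: int,
    reference_value: float,
    decimal_scale: int,
) -> List[float]:
    """Assumed constant-field case: X1 = X2 = 0, so Y = R / 10**D everywhere."""
    return [reference_value / 10.0 ** decimal_scale] * n_points


# Example: R = 2.5, E = 1, D = 0 maps the encoded values 0, 1, 2
# to 2.5, 4.5 and 6.5.
example = unpack_values([0, 1, 2], reference_value=2.5, binary_scale=1, decimal_scale=0)
```

The values of R, E and D would come from Section 5 of the message, and the number of grid points from the grid definition.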

pp-mo · 2023-11-06T17:40:45Z

This problem has recently been re-raised.
Whether or not this is "correct" encoding, it clearly is out there.

I'm doubting whether this was actually "fixed" as stated above -- perhaps we still need to add robustness for this case ??

greenlaw · 2023-11-06T18:02:44Z

@pp-mo Just FYI, I posted our own monkeypatch workaround in the issue linked above: #355 (comment)

If it would be helpful, I can submit this in a PR. We have some other internal patches that would also be good candidates for PRs, but haven't gotten around to packaging them up just yet.

larsbarring · 2023-11-20T08:52:18Z

@greenlaw I just saw your comment (above), and thanks for the monkeypatch in #355, it worked out nicely. As we are increasingly exploring Iris [also] as a tool for reading GRIB files, we would be interested in the other patches you have. At least from my perspective a PR would be useful.

pp-mo · 2023-11-21T11:08:14Z

This problem has recently been re-raised ... perhaps we still need to add robustness for this case ??

Should be fixed in latest v0.19 release, so I hope we can close this issue now.
@lkugler @larsbarring can you confirm this ?

larsbarring · 2023-11-22T10:57:27Z

Regarding the latest 0.19 release, please note this comment and this response. Unfortunately, I do not have a suitable test file at hand.

Regarding @greenlaw's kind and constructive offer to share their improvements, I think that would be very helpful because GRIB is a complicated format (as we have seen in this issue) and all insights are useful for avoiding problems.

Having said that, I am fine with closing this issue.

lkugler · 2023-11-22T11:20:44Z

I'm sorry, I can't test the patch. It's been a long time, and I don't work with GRIB anymore.

bjlittle · 2023-11-29T10:08:19Z

Given the recent iris-grib 0.19.0 release, we consider this issue addressed.

Please reopen if you want to discuss further and propose additional changes.

trexfeathers closed this as completed Sep 21, 2022

larsbarring mentioned this issue Oct 5, 2023

GriB file with constant field (all values = 0) throws an error #355

Open

pp-mo reopened this Nov 6, 2023

bjlittle closed this as completed Nov 29, 2023
