difficulty reading CF-compliant files #135

After loading one of my files via Panoply to verify that there was nothing wrong with it (see below), I tried the

    model = load(gcm_files, "tasmax", poly=poly_reg)

example and got

    ERROR: Manually verify x/lat dimension name

Taking a look at the code, I see that `getdim_lat` relies on a list of hard-coded names. I thought the more general approach was to rely on `long_name` + `units`. Not sure what to suggest -- adding to the hard-coded list would be a short-term fix just for me...
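For illustration, here is a rough sketch of the kind of fallback being suggested: try well-known names first, then fall back on CF attributes. This is not ClimateTools' actual `getdim_lat`; it assumes NCDatasets.jl, the name list is illustrative only, and it checks the controlled `standard_name` vocabulary plus `units` rather than the free-form `long_name`:

```julia
# Hypothetical getdim_lat-style lookup (not ClimateTools' implementation):
# try well-known names first, then fall back on CF attributes.
using NCDatasets

const LAT_NAMES = ("lat", "latitude", "rlat", "y")  # illustrative list only

function getdim_lat_sketch(ds::NCDataset)
    # 1. Fast path: well-known dimension names.
    for name in LAT_NAMES
        name in keys(ds.dim) && return name
    end
    # 2. Fallback: any variable whose CF attributes identify latitude.
    for name in keys(ds)
        a = ds[name].attrib
        if get(a, "standard_name", "") == "latitude" ||
           get(a, "units", "") == "degrees_north"
            return name
        end
    end
    error("Manually verify x/lat dimension name")  # same error as reported
end
```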
Comments
Thanks for the input! Indeed, this is certainly not an elegant function. From memory, this was coded for a project that involved regional climate models (your second case). Not sure if the extraction of … Open to suggestions though, as hardcoding this is not a robust solution either.
Cool. Will take a deeper look and might send a PR later if I find a way to improve the code.
Just to clarify, I use sets of these files that collectively add up to global model variables.
You mean like "tiles"?
Just for reference: http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#latitude-coordinate

From what I've seen with other tools, they detect dimensions using the units, which is what the CF Conventions seem to imply as well.
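As an aside, a minimal sketch of that units-based detection, assuming NCDatasets.jl; the file path is a placeholder, and the unit strings are the ones listed in the CF section linked above:

```julia
# Minimal sketch of units-based latitude detection (assumes NCDatasets.jl).
using NCDatasets

const LAT_UNITS = ("degrees_north", "degree_north", "degree_N",
                   "degrees_N", "degreeN", "degreesN")  # CF latitude units

function find_latitude(ds::NCDataset)
    for name in keys(ds)
        get(ds[name].attrib, "units", "") in LAT_UNITS && return name
    end
    return nothing  # no CF-identifiable latitude variable
end

NCDataset("some_file.nc") do ds   # placeholder path
    @show find_latitude(ds)
end
```

Note that on a rotated-pole dataset like the one dumped later in this thread, this would return the 2-D `lat` grid rather than the `rlat` dimension, since `rlat` only carries units of `degrees`.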
Thanks! I've seen that in RCMs, the latitude and longitude grids also have an official … I'm gonna rework this extraction part ASAP.
As highlighted by @lmilechin, it is the …
Great! Thanks
To effectively tackle this issue, having access to some problematic datasets would be welcome.
How about using the files I mentioned at the top of this thread? These get generated by running …

P.S. I just reran the notebook in Binder and regenerated these without problem.
Yes -- one tile = one file in this example.
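For concreteness, a minimal sketch of reading one variable from such a tile set with NCDatasets.jl; the directory, file extension, and variable name are placeholders:

```julia
# Hypothetical: read one variable from a set of per-tile files,
# one tile per file, together covering the global grid.
using NCDatasets

tile_files = readdir("tiles"; join = true)        # placeholder directory
tile_files = filter(f -> endswith(f, ".nc"), tile_files)

tiles = map(tile_files) do f
    NCDataset(f) do ds
        Array(ds["tasmax"])   # one tile of the global field
    end
end
# `tiles` now holds one array per file.
```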
Thanks, I was able to produce the files at home.
Also, I re-read the thread and wanted to clarify: when I spoke about "dimension" I was mostly referring to the dimensions of the datasets, not the units/measure of the variable itself. Hence the need to distinguish, for projected grids, between a rotated-latitude dimension and the latitude grid (a variable in the dataset, not the dimension itself). Anyway, I'll be forced to think about a more general solution to this!

edit: For example, this dataset has

    Dimensions
       rlat = 412
       rlon = 424
       time = 2920
       bnds = 2

    Variables
       lat (424 × 412)
          Datatype: Float64
          Dimensions: rlon × rlat
          Attributes:
             standard_name = latitude
             long_name = latitude
             units = degrees_north
       lon (424 × 412)
          Datatype: Float64
          Dimensions: rlon × rlat
          Attributes:
             standard_name = longitude
             long_name = longitude
             units = degrees_east
       pr (424 × 412 × 2920)
          Datatype: Float32
          Dimensions: rlon × rlat × time
          Attributes:
             grid_mapping = rotated_pole
             _FillValue = 1.0e20
             missing_value = 1.0e20
             standard_name = precipitation_flux
             long_name = Precipitation
             units = kg m-2 s-1
             coordinates = lon lat
             cell_methods = time: mean
       rlat (412)
          Datatype: Float64
          Dimensions: rlat
          Attributes:
             standard_name = grid_latitude
             long_name = latitude in rotated pole grid
             units = degrees
             axis = Y
       rlon (424)
          Datatype: Float64
          Dimensions: rlon
          Attributes:
             standard_name = grid_longitude
             long_name = longitude in rotated pole grid
             units = degrees
             axis = X
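For a dump like this, the data variable itself already carries the disambiguation: its `coordinates` attribute names the 2-D grids, while the `axis` attribute marks the dimension coordinates. A rough sketch of exploiting that, assuming NCDatasets.jl, with a placeholder file path and a hypothetical helper name:

```julia
# Sketch: given a data variable like `pr` above, recover both the 2-D
# lat/lon grids (from its "coordinates" attribute) and the dimension
# coordinates (variables sharing its dimensions that carry an "axis"
# attribute). Assumes NCDatasets.jl.
using NCDatasets

function grid_info(ds::NCDataset, varname::AbstractString)
    v = ds[varname]
    aux  = split(get(v.attrib, "coordinates", ""))  # e.g. ["lon", "lat"]
    dims = [d for d in dimnames(v)
            if d in keys(ds) && haskey(ds[d].attrib, "axis")]
    return (aux = aux, dims = dims)
end

ds = NCDataset("rotated_pole_file.nc")  # placeholder path
info = grid_info(ds, "pr")
# For the dump above: info.aux == ["lon", "lat"], info.dims == ["rlon", "rlat"]
```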
I've sketched some code in #137. It's pretty rough right now, but so far it works. Just not sure about the robustness though. I haven't had the time to test your files @gaelforget, but I'm pretty sure it does not work. I'm currently testing for …
@gaelforget In the files produced by the Notebook, both …