Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Xarray attributes (CRS, transform) missing #91

Closed
g2giovanni opened this issue Jun 8, 2021 · 6 comments
Closed

Xarray attributes (CRS, transform) missing #91

g2giovanni opened this issue Jun 8, 2021 · 6 comments

Comments

@g2giovanni
Copy link

g2giovanni commented Jun 8, 2021

I have issues opening some catalog on https://earth-search.aws.element84.com/v0.
When I try to make some operation with rioxarray (e.g., a clip operation) I get the error:

MissingCRS: CRS not found. Please set the CRS with 'rio.write_crs()'

That's because into the attributes of the xarray returned from the to_dask() method attributes transform, crs and res have not be set. I noticed this is happening only for catalogues of the beginning of 2018.
For this reason, rioxarry can't georeference data in the proper way.
Why those attributes have not been set?

A snippet of code to reproduce the error:

from shapely import wkt
import intake
import intake_stac
import satsearch

wkt_geometry = "Polygon ((9.59935220198788208 44.99328353692637705, 9.82295202809843637 44.99189113160298348, 9.82469113104509972 45.11301999240544092, 9.60061886478441018 45.11441825799879268, 9.59935220198788208 44.99328353692637705))"

start_date = "2018-01-01"
end_date = "2018-01-03"

results = satsearch.Search.search (
            url="https://earth-search.aws.element84.com/v0",
            intersects=mapping(wkt.loads(wkt_geometry)),
            datetime="%s/%s" % (start_date, end_date),
            query={"eo:cloud_cover": {"lte": 100}},
            collections=["sentinel-s2-l2a"],
        )

items = results.items()

for item in items:
        print("Processing item {} for asset {}".format(item, asset_key))
        single_item = intake.open_stac_item(item) # stac_item_collection[str(item_by_date)]

        asset_xarray = single_item['SCL'](chunks=dict(band=1, y=2048, x=2048)).to_dask()
        asset_xarray.rio.set_nodata(0)
        asset_clipped = asset_xarray.rio.clip([mapping(_wkt_bounds)], crs=4326, all_touched=True, drop=True)
        xarray_all_patches.append(asset_clipped)

    mosaic = rioxarray.merge.merge_arrays(xarray_all_patches, precision=50, nodata=nodataval)

The clip operation with rioxarray is the one that fails (asset_clipped = asset_xarray.rio.clip([mapping(_wkt_bounds)], crs=4326, all_touched=True, drop=True) )

@scottyhq
Copy link
Collaborator

@g2giovanni Currently intake-stac returns a standard xarray DataArray. Note that rioxarray is an extension of xarray, and the CRS is only added to the xarray object if you open a file via rioxarray or if you explicitly use the xarray_dataarray.rio.write_crs() method as the Error message indicates.

So before running methods like rio.clip() you can do something like asset_xarray.rio.write_crs(item.properties['proj:epsg'], inplace=True)

There is some discussion of using rioxarray behind the scenes in a future version of intake-stac (#75)

@matthewhanson
Copy link
Collaborator

@g2giovanni I'm also noticing you are using the 'sentinel-s2-l2a' collection rather than 'sentinel-s2-l2a-cogs' which is probably what you want since they are COGs rather than jpeg2k and not in a requester pays bucket

Also, have a look at pystac-client which is replacing sat-search

@scottyhq
Copy link
Collaborator

@g2giovanni I'm also noticing you are using the 'sentinel-s2-l2a' collection rather than 'sentinel-s2-l2a-cogs' which is probably what you want since they are COGs rather than jpeg2k and not in a requester pays bucket

@matthewhanson seems that not all L2a data is available as COGs. I actually tried this when looking into the issue here:

import intake
from pystac_client import Client # https://github.com/stac-utils/pystac-client

aoi = {'type': 'Polygon',
 'coordinates': [[[9.59, 44.99],
   [9.82, 44.99],
   [9.82, 45.11],
   [9.60, 45.11],
   [9.59, 44.99]]]}

catalog = Client.open("https://earth-search.aws.element84.com/v0")
results = catalog.search(#collections=['sentinel-s2-l2a'], #2 found
                         collections=['sentinel-s2-l2a-cogs'], #0 found
                         intersects=aoi,
                         datetime='2018-01-01/2018-01-03',
                         )

print(f"{results.matched()} items found")

@RichardScottOZ
Copy link
Contributor

Definitely some things missing here and there.

@g2giovanni
Copy link
Author

@scottyhq you're right. It's the reason why I use the "sentinel-s2-l2a" for older years.
Anyway, I think I found the problem: some geotiff file is not georeferenced properly in the collection. Actually, I bypassed the problem, georeferencing images with this kind of problem. Maybe could those files be fixed?

@scottyhq
Copy link
Collaborator

I think I found the problem: some geotiff file is not georeferenced properly in the collection. Actually, I bypassed the problem, georeferencing images with this kind of problem. Maybe could those files be fixed?

@g2giovanni if there is a metadata issue with the sentinel-s2-l2a-cogs you can open an issue here describing in detail https://github.com/cirrus-geo/cirrus-earth-search

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants