All notable changes to this project will be documented in this file. The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
2023-10-10

Summary

- Improved Finding Concurrent Data Notebook text/instructions
- Renamed contribute.md
- Added repo description

Added

- Repository description

2023-09-28

Summary

- Updated notebook ROI to Carpinteria Salt Marsh

Added

- Added landcover.geojson

2023-09-22

Summary

- Updated contribute.md and added user contributed directory

We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.
+
We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
+
+
+
Our Standards
+
Examples of behavior that contributes to a positive environment for our community include:
+
+
Demonstrating empathy and kindness toward other people
+
Being respectful of differing opinions, viewpoints, and experiences
+
Giving and gracefully accepting constructive feedback
+
Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
+
Focusing on what is best not just for us as individuals, but for the overall community
+
+
Examples of unacceptable behavior include:
+
+
The use of sexualized language or imagery, and sexual attention or advances of any kind
+
Trolling, insulting or derogatory comments, and personal or political attacks
+
Public or private harassment
+
Publishing others’ private information, such as a physical or email address, without their explicit permission
+
Other conduct which could reasonably be considered inappropriate in a professional setting
+
+
+
+
Enforcement Responsibilities
+
Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
+
Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
+
+
+
Scope
+
This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
+
+
+
Enforcement
+
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at LPDAAC@usgs.gov. All complaints will be reviewed and investigated promptly and fairly.
+
All community leaders are obligated to respect the privacy and security of the reporter of any incident.
+
+
+
Enforcement Guidelines
+
Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:
+
+
1. Correction
+
Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
+
Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.
+
+
+
2. Warning
+
Community Impact: A violation through a single incident or series of actions.
+
Consequence: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.
+
+
+
3. Temporary Ban
+
Community Impact: A serious violation of community standards, including sustained inappropriate behavior.
+
Consequence: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.
+
+
+
4. Permanent Ban
+
Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
+
Consequence: A permanent ban from any sort of public interaction within the community.
+
+
We want your help! Even if you’re not a coder, there are several ways you can contribute to this repository:

- Report an issue or make a recommendation
- Update code, documentation, notebooks, or other files (even fixing typos)
- Propose a new notebook
+
+
In the sections below we outline how to approach each of these types of contributions. If you’re new to GitHub, you can sign up here. There are a bunch of great resources on the GitHub Quickstart page. The GitHub Cheatsheet is also quite helpful, even for experienced users. Please reach out to lpdaac@usgs.gov with questions or concerns.
+
+
Report an Issue or Make a Recommendation
+
If you’ve found a problem with the repository, we want to know about it! Please submit an Issue. Before submitting, we would appreciate if you check to see if a similar issue already exists. If not, create a new issue, providing as much detail as possible. Things like screenshots and code excerpts demonstrating the problem are very helpful!
+
+
+
Updating Code, Documentation, Notebooks, or Other Files
+
To contribute a solution to an issue or make a change to files within the repository we’ve created a typical outline of how to do that below. If you want to make a simple change, like correcting a typo within a markdown document or other documentation, there’s a great video explaining how to do that without leaving the GitHub website here. To make a more complex change to a notebook, code, or other file follow the instructions below.
+
+
Please create an Issue or comment on an existing issue describing the changes you intend to make.
+
+
Create a fork of this repository. This will create your own copy of the repository. When working from your fork, you can do whatever you want without affecting anyone else’s work, so you’re safe to try things out. Worst case scenario, you can delete your fork and recreate it.
+
+
Clone your fork to your local computer or cloud workspace using your preferred command line interface after navigating to the directory you want to place the repository in:
+
git clone your-fork-repository-url
+
+
Change directories to the one you cloned
+
+
cd repository-name
+
+
Add the upstream repository. This is the original repository that you want to contribute to.
+
+
git remote add upstream original-repository-url
+
+
You can use the following to view the remote repositories:
+
+
git remote -v
+
+
- upstream, which refers to the original repository
- origin, which refers to your personal fork
+
+
Develop your contribution:
+
+
Create a new branch named appropriately for the feature you want to work on:
+
+
git checkout -b new-branch-name
+
+
Often, updates to an upstream repository will occur while you are developing changes on your personal fork. You can pull the latest changes from upstream:
+
+
git pull upstream dev
+
+
You can check the status of your local copy of the repository to see what changes have been made using:
+
+
git status
+
+
Commit locally as you progress using git add and git commit. For example, updating a readme.md file:
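A minimal sketch of that staging-and-commit step (the commit message here is just an example):

git add readme.md
git commit -m "Fix typo in readme.md"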
You can check the status of your local copy of the repository again to see what pending changes have not been added or committed using:
+
+
git status
+
+
After making some changes, push your changes back to your fork on GitHub:
+
+
git push origin branch-name
+
+
Enter your username and password. Depending on your settings, you may need to use a Personal Access Token instead of a password.
+
+
To submit your contribution, navigate to your forked repository GitHub page and make a pull request using the green Compare & pull request button. Make sure to select the base repository and its dev branch. Also select your forked repository as the head repository and make sure compare shows your branch name. You can add your comments and press the green Create pull request button. Our team will be notified and will review your suggested revisions.
+
+
Please submit a pull request early in the development phase, outlining the changes you intend to make or features you intend to add. This allows us to offer feedback early on, ensuring your contribution can be added to the repository before you invest a significant amount of time.
+
+
+
+
+
Adding New Notebooks or Example Workflows
+
In the spirit of open science, we want to minimize barriers to sharing code and examples. We have added a user_contributed directory to the repository for anyone to share examples of their work in notebook or code form. Documentation and descriptions do not need to be as thorough as the examples we’ve created, but we ask that you provide as much as possible. Follow the instructions above, placing your new notebook or module in a suitably named directory within the user_contributed directory. Be sure to remove any large datasets and indicate where users can retrieve them.
Space Station Synergies: Applying ECOSTRESS and EMIT to Ecological Problems for Scientific Insight
Description:
+
The International Space Station is a critical asset for the Earth science community – both for advancing critical science and applications priorities, and as a platform for technology demonstrations/pathfinders. These benefits have been particularly significant in recent years, with the installation and operation of instruments such as ECOSTRESS, a multispectral thermal instrument, and EMIT, a visible to shortwave infrared imaging spectrometer with best-in-class signal-to-noise – both acquiring data at field scale (<70 m). With both sensors mounted on the ISS, there is an unprecedented opportunity to demonstrate the compounded benefits of working with both datasets. In this workshop we highlight the power of these instruments when used together, using open source tools, services, and cloud compute resources to effectively combine data from ECOSTRESS and EMIT, perform scientific analyses, and apply the data to real-world issues.
+
+
+
Learning Outcomes:
+
- Imaging spectroscopy and thermal measurements 101: the electromagnetic spectrum and sensor-specific considerations
- How to access EMIT and ECOSTRESS data
- Data preprocessing and exploratory analysis
- How to manipulate, combine, and visualize EMIT and ECOSTRESS data

Participants will learn the basics of thermal and visible to shortwave infrared (VSWIR) imaging spectroscopy, including how the measurements are different and why they complement each other. Participants will learn how to find and access both EMIT and ECOSTRESS data through available tools and open data use resources managed by the LP DAAC using a cloud environment accessed with their own computer. We will provide examples of integrated analysis that have relevance to applied sciences. Tutorial examples may include vignettes related to agriculture, aquatic heat waves and algal blooms, methane emissions and flaring, and forest health, wildfire risk and stress.
If running this notebook locally, you will find instructions to set up a compatible environment in the setup folder. If running on the Openscapes 2i2c Cloud Instance for a Workshop, no additional setup is required.
+
+
Summary
+
Both the ECOsystem Spaceborne Thermal Radiometer Experiment on Space Station (ECOSTRESS) and the Earth surface Mineral dust source InvesTigation (EMIT) instruments are located on the International Space Station (ISS). Their overlapping fields of view provide an unprecedented opportunity to demonstrate the compounded benefits of working with both datasets. In this notebook we will show how to utilize the earthaccess Python library to find concurrent ECOSTRESS and EMIT data.
+
Background
+
The ECOSTRESS instrument is a multispectral thermal imaging radiometer designed to answer three overarching science questions:
+
+
How is the terrestrial biosphere responding to changes in water availability?
+
How do changes in diurnal vegetation water stress impact the global carbon cycle?
+
Can agricultural vulnerability be reduced through advanced monitoring of agricultural water consumptive use and improved drought estimation?
+
+
The ECOSTRESS mission is answering these questions by accurately measuring the temperature of plants. Plants regulate their temperature by releasing water through tiny pores on their leaves called stomata. If they have sufficient water they can maintain their temperature, but if there is insufficient water, their temperatures rise, and this temperature rise can be measured with ECOSTRESS. The images acquired by ECOSTRESS are the most detailed temperature images of the surface ever acquired from space and can be used to measure the temperature of an individual farmer’s field.
+
More details about ECOSTRESS and its associated products can be found on the ECOSTRESS website and ECOSTRESS product pages hosted by the Land Processes Distributed Active Archive Center (LP DAAC).
+
The EMIT instrument is an imaging spectrometer that measures light in visible and infrared wavelengths. These measurements display unique spectral signatures that correspond to the composition on the Earth’s surface. The EMIT mission focuses specifically on mapping the composition of minerals to better understand the effects of mineral dust throughout the Earth system and human populations now and in the future. In addition, the EMIT instrument can be used in other applications, such as mapping of greenhouse gases, snow properties, and water resources.
+
More details about EMIT and its associated products can be found on the EMIT website and EMIT product pages hosted by the LP DAAC.
+
Requirements:

- NASA Earthdata Account
- No Python setup requirements if connected to the workshop cloud instance!
- Set up Python Environment - See setup_instructions.md in the /setup/ folder
+
Learning Objectives
- How to use earthaccess to find concurrent EMIT and ECOSTRESS data.
- How to export a list of files and download them programmatically.

To download or stream NASA data you will need an Earthdata account; you can create one here. We will use the login function from the earthaccess library for authentication before downloading at the end of the notebook. This function can also be used to create a local .netrc file if it doesn’t exist, or add your login info to an existing .netrc file. If no Earthdata Login credentials are found in the .netrc you’ll be prompted for them. This step is not necessary to conduct searches, but is needed to download or stream data.
+
+
+
+
2. Search for ECOSTRESS and EMIT Data
+
Both EMIT and ECOSTRESS products are hosted by the Land Processes Distributed Active Archive Center (LP DAAC). In this example we will use the cloud-hosted EMIT_L2A_RFL and ECOSTRESS_L2T_LSTE products available from the LP DAAC to find data. Any results we find for these products should be available for other products within the EMIT and ECOSTRESS collections.
+
To find data we will use the earthaccess Python library. earthaccess searches NASA’s Common Metadata Repository (CMR), a metadata system that catalogs Earth Science data and associated metadata records. The results can then be used to download granules or generate lists of granule search result URLs.
+
Using earthaccess we can search based on the attributes of a granule, which can be thought of as a spatiotemporal scene from an instrument containing multiple assets (e.g., Reflectance, Reflectance Uncertainty, Masks for the EMIT L2A Reflectance Collection). We can search using attributes such as collection, acquisition time, and spatial footprint. This process can also be used with other EMIT or ECOSTRESS products, other collections, or different data providers, as well as across multiple catalogs with some modification.
+
+
2.1 Define Spatial Region of Interest
+
For this example, our spatial region of interest (ROI) will be the Carpinteria Salt Marsh. You can learn more about it here: https://ucnrs.org/reserves/carpinteria-salt-marsh-reserve/. If you want to create a geojson polygon for your own ROI, you can do so using this website: https://geojson.io/#map=2/20/0, or you can convert a shapefile to a geojson using some code in the Appendices.
+
In this example, we elect to search using a polygon rather than a standard bounding box because bounding boxes will have a larger spatial extent, capturing a lot of area we may not be interested in. This becomes more important for searches with larger ROIs than our example here. To search for intersections with a polygon using earthaccess, we need to format our ROI as a counter-clockwise list of coordinate pairs.
+
Open the geojson file containing a landcover classification of Carpinteria Salt Marsh as a geodataframe, and check the coordinate reference system (CRS) of the data.
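A minimal sketch of this step, assuming geopandas is imported as gpd and the landcover.geojson file added to the repository sits in a data directory (the exact path is an assumption):

# Open the landcover classification polygons as a geodataframe (path is an assumption)
polygon = gpd.read_file('../data/landcover.geojson')
# Check the coordinate reference system
polygon.crs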
The CRS is EPSG:4326 (WGS84), which is also the CRS we want the data in to submit for our search.
+
Next, let’s examine our polygon a bit closer.
+
+
polygon
+
+
We can see this geodataframe consists of multiple classes, each containing a multipolygon within our study site. We need to create an exterior boundary polygon containing these, and make sure the vertices are in counter-clockwise order to submit them in our query. To do this, create a polygon consisting of all the geometries, then calculate the convex hull of the union. This will give us a simple exterior polygon around our full ROI. After that, use the orient function to place our coordinate pairs in counter-clockwise order.
+
+
# Merge all Polygon geometries and create external boundary
roi_poly = polygon.unary_union.convex_hull
# Re-order vertices to counter-clockwise
roi_poly = orient(roi_poly, sign=1.0)
+
+
We can go ahead and visualize our region of interest and the original landcover polygon. First add a function to help reformat bounding box coordinates to work with leaflet notation.
+
+
# Function to convert a bounding box for use in leaflet notation

def convert_bounds(bbox, invert_y=False):
    """
    Helper method for changing bounding box representation to leaflet notation

    ``(lon1, lat1, lon2, lat2) -> ((lat1, lon1), (lat2, lon2))``
    """
    x1, y1, x2, y2 = bbox
    if invert_y:
        y1, y2 = y2, y1
    return ((y1, x1), (y2, x2))
Above we can see our region of interest (ROI) and the landcover classification polygon that we opened. We can hover over different areas to see the land cover class.
+
Lastly we need to convert our polygon to a list of coordinate pairs.
+
+
# Set ROI as list of exterior polygon vertices as coordinate pairs
roi = list(roi_poly.exterior.coords)
+
+
+
+
2.2 Define Collections of Interest
+
We need to specify which products we want to search for using their short-names. As mentioned above, we will conduct our search using the EMIT Level 2A Reflectance (EMITL2ARFL) and ECOSTRESS Level 2 Tiled Land Surface Temperature and Emissivity (ECO_L2T_LSTE).
+
+
Note: Here we use the Tiled ECOSTRESS LSTE Product. This will also work with the gridded LSTE and the swath; however, the swath product does not have a browse image for the visualization in section 4, and will require additional processing for subsequent analysis.
+
+
+
# Data Collections for our search
collections = ['EMITL2ARFL', 'ECO_L2T_LSTE']
+
+
+
+
2.3 Define Date Range
+
For our date range, we’ll look at data collected from January through August 2023. The date_range can be specified as a pair of dates, start and end (up to, not including).
+
+
# Define Date Range
date_range = ('2023-01-01','2023-09-01')
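The search step itself is not reproduced above. A sketch of what it likely looks like with earthaccess, assuming earthaccess is imported and using the roi, collections, and date_range defined earlier (treat this as an illustration rather than the notebook's exact call):

# Submit the query to NASA's CMR via earthaccess (sketch)
results = earthaccess.search_data(
    short_name=collections,   # list of product short names defined above
    polygon=roi,              # counter-clockwise list of (lon, lat) pairs
    temporal=date_range,      # (start, end) date strings
)
print(f'{len(results)} granules found')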
As we can see from above, the results object contains a list of objects with metadata and links. We can convert this to a more readable format, a dataframe. In addition, we can make it a geodataframe by taking the spatial metadata and creating a shapely polygon representing the spatial coverage, and further customize which information we want to use from other metadata fields.
+
First, we define some functions to help us create a shapely object for our geodataframe, and retrieve the specific browse image URLs that we want. By default the browse image selected by earthaccess is the first one in the list, but ECO_L2T_LSTE has several browse images and we want to make sure we retrieve the png file, which is a preview of the LSTE.
+
+
# Function to create shapely polygon of spatial coverage
def get_shapely_object(result: earthaccess.results.DataGranule):
    # Get Geometry Keys
    geo = result['umm']['SpatialExtent']['HorizontalSpatialDomain']['Geometry']
    keys = geo.keys()

    if 'BoundingRectangles' in keys:
        bounding_rectangle = geo['BoundingRectangles'][0]
        # Create bbox tuple
        bbox_coords = (bounding_rectangle['WestBoundingCoordinate'], bounding_rectangle['SouthBoundingCoordinate'],
                       bounding_rectangle['EastBoundingCoordinate'], bounding_rectangle['NorthBoundingCoordinate'])
        # Create shapely geometry from bbox
        shape = geometry.box(*bbox_coords, ccw=True)
    elif 'GPolygons' in keys:
        points = geo['GPolygons'][0]['Boundary']['Points']
        # Create shapely geometry from polygons
        shape = geometry.Polygon([[p['Longitude'], p['Latitude']] for p in points])
    else:
        raise ValueError('Provided result does not contain bounding boxes/polygons or is incompatible.')
    return shape

# Retrieve png browse image if it exists or first jpg in list of urls
def get_png(result: earthaccess.results.DataGranule):
    https_links = [link for link in result.dataviz_links() if 'https' in link]
    if len(https_links) == 1:
        browse = https_links[0]
    elif len(https_links) == 0:
        browse = 'no browse image'
        warnings.warn(f"There is no browse imagery for {result['umm']['GranuleUR']}.")
    else:
        browse = [png for png in https_links if '.png' in png][0]
    return browse
+
+
Now that we have our functions we can create a dataframe, then calculate and add our shapely geometries to make a geodataframe. After that, add a column for our browse image urls and print the number of granules in our results, so we can monitor the quantity we are working with as we winnow down to the data we want.
+
+
# Create Dataframe of Results Metadata
results_df = pd.json_normalize(results)
# Create shapely polygons for result
geometries = [get_shapely_object(results[index]) for index in results_df.index.to_list()]
# Convert to GeoDataframe
gdf = gpd.GeoDataFrame(results_df, geometry=geometries, crs="EPSG:4326")
# Remove results df, no longer needed
del results_df
# Add browse imagery links
gdf['browse'] = [get_png(granule) for granule in results]
gdf['shortname'] = [result['umm']['CollectionReference']['ShortName'] for result in results]
# Preview GeoDataframe
print(f'{gdf.shape[0]} granules total')
+
+
Preview our geodataframe to get an idea what it looks like.
+
+
gdf.head()
+
+
There are a lot of columns with data that is not relevant to our goal, so we can drop those. To do that, list the names of the columns.
+
+
# List Column Names
gdf.columns
+
+
Now create a list of columns to keep and use it to filter the dataframe.
+
+
# Create a list of columns to keep
keep_cols = ['meta.concept-id', 'meta.native-id', 'umm.TemporalExtent.RangeDateTime.BeginningDateTime', 'umm.TemporalExtent.RangeDateTime.EndingDateTime', 'umm.CloudCover', 'umm.DataGranule.DayNightFlag', 'geometry', 'browse', 'shortname']
# Remove unneeded columns
gdf = gdf[gdf.columns.intersection(keep_cols)]
gdf.head()
+
+
This is looking better, but we can make it more readable by renaming our columns.
Note: If querying on-premises (not cloud) LP DAAC datasets, the meta.concept-id will not show as xxxxxx-LPCLOUD. For these datasets, the granule name can be retrieved from the umm.DataGranule.Identifiers column.
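The renaming cell isn't shown above; a sketch of a mapping consistent with the shorter column names used later in this notebook (granule, start_datetime, cloud_cover, day_night) might look like this:

# Rename columns to shorter, more readable names (mapping is an assumption based on later usage)
gdf = gdf.rename(columns={
    'meta.native-id': 'granule',
    'umm.TemporalExtent.RangeDateTime.BeginningDateTime': 'start_datetime',
    'umm.TemporalExtent.RangeDateTime.EndingDateTime': 'end_datetime',
    'umm.CloudCover': 'cloud_cover',
    'umm.DataGranule.DayNightFlag': 'day_night',
})
gdf.head()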
+
+
We can filter using the day/night flag as well, but this step will be unnecessary as we check to ensure all results from ECOSTRESS fall within an hour of resulting EMIT granules.
+
+
# gdf = gdf[gdf['day_night'].str.contains('Day')]
+
+
Our first step toward filtering the datasets will be to add a column with a datetime.
+
+
You may have noticed that the date format is similar for ECOSTRESS and EMIT, but the ECOSTRESS data has additional fractional seconds. If using the recommended lpdaac_vitals Windows environment, you will need to pass the format='ISO8601' argument to the to_datetime function, like in the commented out line.
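The cell that adds the datetime column is not included above; a sketch, assuming the start_datetime column created earlier:

# Add a column of parsed datetime objects from the granule start times
gdf['datetime_obj'] = pd.to_datetime(gdf['start_datetime'])
# On environments that are strict about mixed fractional seconds, use the ISO8601 format instead:
# gdf['datetime_obj'] = pd.to_datetime(gdf['start_datetime'], format='ISO8601')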
Now we will separate the results into two dataframes, one for ECOSTRESS and one for EMIT, and print the number of results for each so we can monitor how many granules we’re filtering.
+
+
# Suppress Setting with Copy Warning - not applicable in this use case
pd.options.mode.chained_assignment = None  # default='warn'

# Split into two dataframes - ECO and EMIT
eco_gdf = gdf[gdf['granule'].str.contains('ECO')]
emit_gdf = gdf[gdf['granule'].str.contains('EMIT')]
print(f' ECOSTRESS Granules: {eco_gdf.shape[0]}\n EMIT Granules: {emit_gdf.shape[0]}')
+
+
+
emit_gdf.head()
+
+
We still haven’t filtered to the locations where EMIT and ECOSTRESS have data at the same spatial location and time-frame. The EMIT acquisition mask has been added to ECOSTRESS, so in most cases if EMIT is collecting data, ECOSTRESS will be as well, but there are edge cases where this is not true. To handle this we’ll use two filters to catch the edge cases, and provide an example that can be used with other datasets.
+
First, since EMIT has a smaller swath width, we can use a unary union of the EMIT spatial coverage present in our geodataframe to filter out ECOSTRESS granules that do not overlap with it.
+
+
# Subset ECOSTRESS Granules in Geodataframe by intersection with EMIT granules
## Create new column based on intersection with union of EMIT polygons.
eco_gdf['intersects'] = eco_gdf.intersects(emit_gdf.unary_union)
## Apply subsetting
eco_gdf = eco_gdf[eco_gdf['intersects'] == True]
print(f' ECOSTRESS Granules: {eco_gdf.shape[0]}\n EMIT Granules: {emit_gdf.shape[0]}')
+
+
In this instance, our results aren’t narrowed because our region of interest is smaller than a single EMIT scene. For a much larger spatial ROI, this step would be more likely to remove ECOSTRESS granules that have no overlapping EMIT coverage.
+
Additionally, we want to make sure that data in our results are collected at the same time. For EMIT and ECOSTRESS, the EMIT acquisition mask has been added to the ECOSTRESS mask, meaning that if there is an EMIT scene, there should also be an ECOSTRESS scene acquired at the same time. In practice, however, the timestamps on the scenes can vary slightly. In order to capture this slight variability, we need to use a range instead of a single timestamp to capture concurrent data. To do this, we’ll ensure all ECOSTRESS granule start times fall within 10 minutes of any of the EMIT granules in our results, and vice-versa.
+
Write a function to evaluate whether these datetime objects fall within 10 minutes of one another using the timedelta function.
+
+
# Function to Filter timestamps that do not fall within a time_delta of timestamps from the other acquisition time
def concurrent_match(gdf_a: pd.DataFrame, gdf_b: pd.DataFrame, col_name: str, time_delta: timedelta):
    """
    Cross references dataframes containing a datetime object column and keeps rows in
    each that fall within the provided timedelta of the other. Acceptable time_delta examples:

    months=1
    days=1
    hours=1
    minutes=1
    seconds=1

    """
    # Match Timestamps from Dataframe A with Time-range of entries in Dataframe B
    # Create empty list
    a_list = []
    # Iterate results for product b based on index values
    for _n in gdf_b.index.to_list():
        # Find where product a is within the window of each product b result
        a_matches = (gdf_a[col_name] > gdf_b[col_name][_n] - time_delta) & (gdf_a[col_name] < gdf_b[col_name][_n] + time_delta)
        # Append list with values
        a_list.append(a_matches)
    # Match Timestamps from Dataframe B with Time-range of entries in Dataframe A
    # Create empty list
    b_list = []
    for _m in gdf_a.index.to_list():
        # Find where product b is within the window of each product a result
        b_matches = (gdf_b[col_name] > gdf_a[col_name][_m] - time_delta) & (gdf_b[col_name] < gdf_a[col_name][_m] + time_delta)
        # Append list with values
        b_list.append(b_matches)
    # Filter Original Dataframes by summing list of bools, 0 = outside of all time-ranges
    a_filtered = gdf_a.loc[sum(a_list) > 0]
    b_filtered = gdf_b.loc[sum(b_list) > 0]
    return (a_filtered, b_filtered)
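The call to this function is not shown above; applying it with the 10 minute window described in the text (and assuming timedelta is imported from datetime) would look something like this sketch:

# Keep only granules whose start times fall within 10 minutes of a granule from the other sensor
eco_gdf, emit_gdf = concurrent_match(eco_gdf, emit_gdf, 'datetime_obj', timedelta(minutes=10))
print(f' ECOSTRESS Granules: {eco_gdf.shape[0]}\n EMIT Granules: {emit_gdf.shape[0]}')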
Now that we have geodataframes containing some concurrent data, we can visualize them on a map using folium. It’s often difficult to visualize a large time-series of scenes, so we’ve included an example in Appendix A1 on how to filter to a single day.
+
+
# Plot Using Folium

# Create Figure and Select Background Tiles
fig = Figure(width="1100px", height="550px")
map1 = folium.Map(tiles='https://mt1.google.com/vt/lyrs=y&x={x}&y={y}&z={z}', attr='Google')
fig.add_child(map1)

# Plot ECOSTRESS Results - note we must drop the datetime_obj column for this to work
eco_gdf.drop(columns=['datetime_obj']).explore(
    "granule",
    categorical=True,
    tooltip=[
        "granule",
        "start_datetime",
        "cloud_cover",
    ],
    popup=True,
    style_kwds=dict(fillOpacity=0.1, width=2),
    name="ECOSTRESS",
    m=map1,
)

# Plot EMITL2ARFL Results - note we must drop the datetime_obj column for this to work
emit_gdf.drop(columns=['datetime_obj']).explore(
    "granule",
    categorical=True,
    tooltip=[
        "granule",
        "start_datetime",
        "cloud_cover",
    ],
    popup=True,
    style_kwds=dict(fillOpacity=0.1, width=2),
    name="EMIT",
    m=map1,
)

# ECOSTRESS Browse Images - Comment out to remove
for _n in eco_gdf.index.to_list():
    folium.raster_layers.ImageOverlay(
        image=eco_gdf['browse'][_n],
        name=eco_gdf['granule'][_n],
        bounds=[[eco_gdf.bounds['miny'][_n], eco_gdf.bounds['minx'][_n]], [eco_gdf.bounds['maxy'][_n], eco_gdf.bounds['maxx'][_n]]],
        interactive=False,
        cross_origin=False,
        opacity=0.75,
        zindex=1,
    ).add_to(map1)

# Plot Region of Interest
polygon.explore(
    popup=False,
    style_kwds=dict(fillOpacity=0.1, width=2),
    name="Region of Interest",
    m=map1
)

map1.fit_bounds(bounds=convert_bounds(gdf.unary_union.bounds))
map1.add_child(folium.LayerControl())
display(fig)
+
+
In the figure above, you can zoom in and out, click and drag to reposition the legend, and add or remove layers using the layer control in the top right. Notice that since we’re using the tiled ECOSTRESS product, we have 2 overlapping tiles at our ROI. You can visualize the tiles by adding or removing the layers.
+
+
4.2 Previewing EMIT Browse Imagery
+
The EMIT browse imagery is not orthorectified, so it can’t be visualized on a plot like the ECOSTRESS browse imagery. To get an idea what scenes look like we can plot them in a grid using matplotlib.
+
+
Note: The black space is indicative of onboard cloud masking that occurs before data is downlinked from the ISS.
We can see that some of these granules likely won’t work because of the large amount of cloud cover, so we can use a list of these to filter them out. Make a list of indexes to filter out.
Creating a list of results URLs will include all of these assets, so if we only want a subset we need an additional filter to keep the specific assets we want.
+
If you look back, you can see we kept the same indexing throughout the notebook. This enables us to simply subset the earthaccess results object to retrieve the results we want.
filtered_results = [result for i, result in enumerate(results) if i in keep_granules]
+
+
Now we can download all of the associated assets, or retrieve the URLS and further filter them to specifically what we want.
+
First, log into Earthdata using the login function from the earthaccess library. The persist=True argument will create a local .netrc file if it doesn’t exist, or add your login info to an existing .netrc file. If no Earthdata Login credentials are found in the .netrc you’ll be prompted for them. As mentioned in section 1.2, this step is not necessary to conduct searches, but is needed to download or stream data.
+
+
earthaccess.login(persist=True)
+
+
Now we can download all assets using the following cell.
+
+
# # Download All Assets for Granules in Filtered Results
# earthaccess.download(filtered_results, '../data/')
+
+
Or we can create a list of URLs and use that to further refine which files we download.
+
+
# Retrieve URLS for Assets
results_urls = [granule.data_links() for granule in filtered_results]
+
+
Granules often have several assets associated with them. For example, ECO_L2T_LSTE has several assets:

- Water Mask (water)
- Cloud Mask (cloud)
- Quality (QC)
- Land Surface Temperature (LST)
- Land Surface Temperature Error (LST_err)
- Wide Band Emissivity (EmisWB)
- Height (height)
+
The results list we just generated contains URLs to all of these files. We can further filter our results list using string matching to remove unwanted assets.
+
Create a list of strings and enumerate through our results_url list to filter out unwanted assets.
+
+
filtered_asset_links = []
# Pick Desired Assets (leave _ on RFL to distinguish from RFLUNC, LST. to distinguish from LST_err)
desired_assets = ['RFL_', 'LST.']  # Add more or do individually for reflectance, reflectance uncertainty, or mask
# Step through each sublist (granule) and filter based on desired assets.
for n, granule in enumerate(results_urls):
    for url in granule:
        asset_name = url.split('/')[-1]
        if any(asset in asset_name for asset in desired_assets):
            filtered_asset_links.append(url)
filtered_asset_links
+
+
Uncomment the cell below (select all, then ctrl+/) and download the data that we’ve filtered.
+
+
# # Get requests https Session using Earthdata Login Info
# fs = earthaccess.get_requests_https_session()
# # Retrieve granule asset ID from URL (to maintain existing naming convention)
# for url in filtered_asset_links:
#     granule_asset_id = url.split('/')[-1]
#     # Define Local Filepath
#     fp = f'../data/{granule_asset_id}'
#     # Download the Granule Asset if it doesn't exist
#     if not os.path.isfile(fp):
#         with fs.get(url, stream=True) as src:
#             with open(fp, 'wb') as dst:
#                 for chunk in src.iter_content(chunk_size=64*1024*1024):
#                     dst.write(chunk)
+
+
Congratulations, now you have downloaded concurrent data from the ECOSTRESS and EMIT instruments on the ISS.
+
+
+
Contact Info:
+
Email: LPDAAC@usgs.gov
Voice: +1-866-573-3222
Organization: Land Processes Distributed Active Archive Center (LP DAAC)¹
Website: https://lpdaac.usgs.gov/
Date last modified: 10-12-2023
+
¹Work performed under USGS contract G15PD00467 for NASA contract NNG14HH33I.
+
+
+
Appendices
+
These contain some extra code that may be useful when performing a similar workflow.
+
+
A1. Further Limiting Search for Visualization Purposes
+
A large quantity of results may be difficult to understand when mapping with folium. We can create a subset that falls within a single day. First add another column of dates only, then find the unique dates.
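The corresponding cell is not included above; a sketch of that date-only column and single-day subset, assuming the eco_gdf and emit_gdf geodataframes with their datetime_obj columns:

# Add a date-only column and list the unique acquisition dates
eco_gdf['date'] = eco_gdf['datetime_obj'].dt.date
emit_gdf['date'] = emit_gdf['datetime_obj'].dt.date
print(emit_gdf['date'].unique())
# A single day could then be selected, e.g.:
# eco_day = eco_gdf[eco_gdf['date'] == emit_gdf['date'].unique()[0]]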
A2. Converting a Shapefile to a GeoJSON

We can convert a shapefile to a geojson using the following cell. Note that we need to reorder the polygon exterior vertices so we can submit them as a list of points for our search.
+
+
# # Use Sedgwick Reserve Shapefile
# # Open Shapefile
# polygon = gpd.read_file('../data/Sedgwick_Boundary/Sedgwick_Boundary.shp').to_crs("EPSG:4326")
# # Reorder vertices into Counter-clockwise order
# polygon.geometry[0] = orient(polygon.geometry[0], sign=1.0)
# # Save as a geojson (not necessary)
# polygon.to_file('../data/sedgwick_boundary_epsg4326.geojson', driver='GeoJSON')
ECOSTRESS-EMIT Workshop Fall 2023 - Carpinteria Salt Marsh Analysis
Gregory Halverson, Jet Propulsion Laboratory, California Institute of Technology

Claire Villanueva-Weeks, Jet Propulsion Laboratory, California Institute of Technology
+
This research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, and was sponsored by ECOSTRESS and the National Aeronautics and Space Administration (80NM0018D0004).
Summary

In this notebook we will open an EMIT L2A Reflectance product file in NetCDF4 format and an ECOSTRESS L2 Land Surface Temperature product file in GeoTIFF format. We will demonstrate using specific bands to calculate NDVI and use holoviews to plot the EMIT spectra and calculated NDVI. We will also demonstrate opening and mapping the ECOSTRESS L2 Land Surface Temperature product.
+
+
Importing Libraries
+
These are some built-in Python functions we need for this notebook, including functions for handling filenames and dates.
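The import cell itself is not shown; a sketch of typical built-in imports for filename and date handling (the exact set used in the original notebook is an assumption):

# Built-in modules for filename and date handling (assumed)
import os
from os.path import join
from datetime import datetime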
We’re using the rioxarray package for loading raster data from a GeoTIFF file, and we’re importing it as rxr. We’re using the numpy library to handle arrays, and we’re importing it as np. We’re using the rasterio package to subset the data.
+
+
import rioxarray as rxr
import numpy as np
import rasterio as rio
+
+
We’re using the geopandas library to load vector data from GeoJSON files, and we’re importing it as gpd. We’re using the shapely library to handle vector data and the pyproj library to handle projections.
+
+
import geopandas as gpd
from shapely.geometry import Point, box
from shapely.ops import transform
from pyproj import Transformer
+
+
We’re using the pandas library to handle tables, and we’re importing it as pd.
+
+
import pandas as pd
+
+
We’re using the seaborn library to produce our graphs, and we’re importing it as sns. We’re using the hvplot library to produce our maps.
+
+
import seaborn as sns
import hvplot.xarray
import hvplot.pandas
+
+
+
import earthaccess
from osgeo import gdal
import math
import netCDF4 as nc
+
+
Import the emit_xarray function from the emit_tools module and call the help function to see how it can be used.

from emit_tools import emit_xarray
help(emit_xarray)
+
+
Note: This function currently works with L1B Radiance and L2A Reflectance Data.
This is the location of the ECOSTRESS and EMIT product files:
+
+
DATA_DIRECTORY = "/home/jovyan/shared/ECOSTRESS-EMIT_data/data/"  # FIXME set this to the common path in OpenScapes
print(f"data directory: {DATA_DIRECTORY}")
+
+
+
+
Loading and Mapping an ECOSTRESS LST granule
+
First, let’s try opening a data layer from a product file.
We’re using rioxarray to open the surface temperature product on the 11SKU tile covering the Carpinteria Salt Marsh. We’re passing the filename of the GeoTIFF file directly into rioxarray.
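The loading cell isn't included above; a minimal sketch, assuming an ECOSTRESS L2T LSTE GeoTIFF in the data directory (the filename here is purely illustrative):

# Open the land surface temperature GeoTIFF with rioxarray (filename is a hypothetical example)
ECO_fp = f"{DATA_DIRECTORY}ECOv002_L2T_LSTE_28691_011_11SKU_20230729T204058_0710_01_LST.tif"
LST = rxr.open_rasterio(ECO_fp).squeeze('band', drop=True)
LST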
The hvplot package extends xarray to allow us to plot maps. We’re reprojecting the raster to the geographic projection EPSG:4326 to overlay on the basemap with a latitude and longitude graticule. We’re using the jet color scheme to render temperature with a rainbow of colors, with red meaning hot and blue meaning cool. We’re setting the alpha to make the raster semi-transparent on top of the basemap. We’re filtering out values lower than the 2nd percentile and higher than the 98th percentile to make the variation in the image more visible.
+
The temperatures in the L2T_LSTE product are given in Kelvin. To convert them to Celsius, we subtract 273.15.
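As a small illustration of that conversion, assuming the LST raster loaded above:

# Convert land surface temperature from Kelvin to Celsius
LST_C = LST - 273.15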
Opening the file with nc allows us to see the file information and the different groups: there’s reflectance, which we’re concerned with, sensor band parameters, and location.
+
+
ds_nc = nc.Dataset(EMIT_fp)
ds_nc
+
+
Now we will use the emit_xarray function from the emit_tools module with ortho set to True to orthorectify the L2A reflectance data and place it into an xarray.Dataset.
+
+
For a detailed walkthrough of the orthorectification process using the GLT see section 2 of the How_to_Orthorectify.ipynb in the how-tos folder.
+
+
+
ds = emit_xarray(EMIT_fp, ortho=True)
ds
+
+
+
+
Visualizing EMIT Spectral and Spatial Data
+
Here we picked out and mapped the wavelengths nearest to 800 nm and 675 nm using the .sel function from xarray.
These nearest-to-800 nm and 675 nm wavelengths can be used to calculate NDVI using the ratio of the difference between the wavelengths to the sum of the wavelengths. NDVI is a metric by which we can estimate vegetation greenness.
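A sketch of that selection and the NDVI calculation, assuming the orthorectified EMIT dataset ds from above with a reflectance variable and a wavelengths coordinate:

# Select the bands nearest 800 nm (NIR) and 675 nm (red)
nir = ds['reflectance'].sel(wavelengths=800, method='nearest')
red = ds['reflectance'].sel(wavelengths=675, method='nearest')
# Normalized Difference Vegetation Index
NDVI = (nir - red) / (nir + red)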
Our ROI is the Carpinteria Salt Marsh Habitat, a marsh reserve in Southern California that is home to many different species of plants and animals (more information: https://carpinteria.ucnrs.org/). Tomorrow we will take a field trip there, with a guided tour where we will learn more about the Carpinteria Salt Marsh and its ecology.
+
To clip the raster image to the extent of the vector dataset, we want to subset the raster to the bounds of the vector dataset. This dataset is included here in GeoJSON format, which we’ll load in as a geodataframe using the geopandas package.
This vector dataset contains 5 polygons classifying the surface of the Carpinteria Salt Marsh into channel, salt flat, upland, pan, and marsh.
Now we can use the clip function from rasterio to mask out a subset of the LST and NDVI datasets to the extent of the polygons from the vector dataset. Setting all_touched to True will include pixels that intersect with the edges of the polygons.
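A sketch of this clipping step using the rio accessor from rioxarray (which wraps rasterio), assuming LST_C and NDVI computed above and polygon holding the marsh GeoDataFrame:

# Clip the Celsius LST to the marsh polygons; geometries are interpreted in the GeoDataFrame's CRS
LST_clip = LST_C.rio.clip(polygon.geometry.values, polygon.crs, all_touched=True)
# The orthorectified EMIT data are in EPSG:4326; write the CRS before clipping the NDVI
NDVI_clip = NDVI.rio.write_crs("EPSG:4326").rio.clip(polygon.geometry.values, polygon.crs, all_touched=True)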
Here’s another way we can visualize them: we can map them side by side over a satellite basemap, setting the alpha to a lower value to increase the transparency of the raster datasets.
+
+
LSTmap1 = LST_clip.hvplot.image(
    tiles=BASEMAP,
    cmap='jet',
    alpha=.6,
    title="Carpinteria Salt Marsh Surface Temperature (Celsius)"
)

NDVImap1 = NDVI_clip.hvplot.image(
    tiles=BASEMAP,
    cmap='RdYlGn',
    alpha=.6,
    title="Carpinteria Salt Marsh Vegetation Index"
)

LSTmap1.options(xlabel="Longitude", ylabel="Latitude") + NDVImap1.options(xlabel="Longitude", ylabel="Latitude")
The Space Station Synergies AGU Workshop takes place on December 10, 2023.
+
This workshop is hosted by NASA’s Land Processes Distributed Active Archive Center (LP DAAC) and NASA’s Jet Propulsion Laboratory (JPL) with support from the NASA Openscapes project.
Hands-on exercises will be executed from a JupyterHub on the 2i2c cloud instance. Your GitHub username is used to enable access to the cloud instance during the workshop. Please pass along your GitHub username to get access if you have not already.
+
+
To open the JupyterHub and clone the VITALS repository automatically you can use the following link: https://tinyurl.com/yckery74.
+
If you have cloned the repository using this link before, make sure the older one is deleted and then click on the link to get the most updated content.
This repository uses a Python environment similar to the EMIT Data Resources Repository, with the addition of the scikit-image library. If you have previously installed the emit_tutorials environment, you can simply add this library and use that environment. If you plan to work with the EMIT-Data-Resources Repository, this lpdaac_vitals environment can be used.
+
+
The how-tos and tutorials in this repository require a NASA Earthdata account, an installation of Git, and a compatible Python Environment. We recommend mamba to manage Python packages, but if you are already using another package manager like conda, that will work as well. To install mamba, download mambaforge for your operating system. If using Windows, be sure to check the box to “Add mamba to my PATH environment variable” to enable use of mamba directly from your command line interface. Note that this may cause an issue if you have an existing mamba install through Anaconda.
+
+
Python Environment Setup
+
These Python Environments will work for all of the guides, how-to’s, and tutorials within this repository. A .yml file that can be used to set up the necessary environment has been included in the repository for both Windows and MacOS. Use the appropriate file in the steps below.
+
+
If you wish to use conda as your package manager you can simply substitute ‘conda’ for ‘mamba’ in the steps below.
+
+
+
Using your preferred command line interface (command prompt, terminal, cmder, etc.) navigate to your local copy of the repository, then type the following to create a compatible Python environment.
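The command itself is not reproduced above; it presumably points mamba at the repository's .yml file, along these lines (the file names and paths are assumptions, so check the setup folder for the actual files):

mamba env create -f setup/lpdaac_vitals_windows.yml    # Windows
mamba env create -f setup/lpdaac_vitals_macos.yml      # MacOS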
Next, activate the Python Environment that you just created.
+
mamba activate lpdaac_vitals
+
Now you can launch Jupyter Notebook to open the notebooks included.
+
jupyter notebook
+
+
If you’re having trouble creating a compatible Python Environment or having an issue with one of the above environments, you can also try to create one using the commands below. Using your preferred command line interface (command prompt, terminal, cmder, etc.) type the following to create a compatible Python environment.
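Those commands are not reproduced above; the following is an illustrative guess based on the libraries imported in these notebooks, not the official environment specification:

mamba create -n lpdaac_vitals -c conda-forge python=3.10 jupyterlab earthaccess gdal rioxarray rasterio geopandas shapely pyproj netcdf4 pandas numpy seaborn hvplot geoviews folium scikit-image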
After this, you should be able to do steps 2 and 3 above.
+
Still having trouble getting a compatible Python environment set up? Contact LP DAAC User Services.
+
+
+
+
Contact Info
+
Email: LPDAAC@usgs.gov
Voice: +1-866-573-3222
Organization: Land Processes Distributed Active Archive Center (LP DAAC)¹
Website: https://lpdaac.usgs.gov/
Date last modified: 09-29-2023
+
¹Work performed under USGS contract G15PD00467 for NASA contract NNG14HH33I.