
Merge pull request #48 from ASFHyP3/develop
Release v0.8.0
AndrewPlayer3 authored Sep 23, 2024
2 parents b73ad05 + f1ad019 commit 7a390fc
Showing 14 changed files with 562 additions and 54 deletions.
10 changes: 9 additions & 1 deletion CHANGELOG.md
@@ -6,6 +6,15 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [PEP 440](https://www.python.org/dev/peps/pep-0440/)
and uses [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.8.0]

### Added
* New `time_series` workflow for time series processing of GSLC stacks.

### Changed
* The `back_projection` workflow now accepts an optional `--bounds` parameter to specify the DEM extent.
* The back-projection product now includes the `elevation.dem.rsc` file.

## [0.7.0]

### Changed
@@ -74,4 +83,3 @@ and uses [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

### Added
* Initial version of repository.

1 change: 1 addition & 0 deletions Dockerfile
@@ -47,6 +47,7 @@ SHELL ["/bin/bash", "-l", "-c"]
USER ${CONDA_UID}
WORKDIR /home/conda/

COPY --chown=${CONDA_UID}:${CONDA_GID} --from=builder /srg/snaphu_v2.0b0.0.0/bin/snaphu /srg/bin/snaphu
COPY --chown=${CONDA_UID}:${CONDA_GID} --from=builder /srg /srg
COPY --chown=${CONDA_UID}:${CONDA_GID} --from=builder /hyp3-srg /hyp3-srg

1 change: 1 addition & 0 deletions Dockerfile.gpu
@@ -69,6 +69,7 @@ SHELL ["/bin/bash", "-l", "-c"]
USER ${CONDA_UID}
WORKDIR /home/conda/

COPY --chown=${CONDA_UID}:${CONDA_GID} --from=builder /srg/snaphu_v2.0b0.0.0/bin/snaphu /srg/bin/snaphu
COPY --chown=${CONDA_UID}:${CONDA_GID} --from=builder /srg /srg
COPY --chown=${CONDA_UID}:${CONDA_GID} --from=builder /hyp3-srg /hyp3-srg

62 changes: 51 additions & 11 deletions README.md
@@ -6,9 +6,10 @@ HyP3 plugin for Stanford Radar Group (SRG) SAR Processor
> [!WARNING]
> Running the workflows in this repository requires a compiled version of the [Stanford Radar Group Processor](https://github.com/asfhyp3/srg). For this reason, running this repository's workflows in a standard Python environment is not implemented yet. Instead, we recommend running the workflows from the docker container as outlined below.
The HyP3-SRG plugin provides a set of workflows (currently only accessible via the docker container) that can be used to process SAR data using the [Stanford Radar Group Processor](https://github.com/asfhyp3/srg). The workflows currently included in this plugin are:
The HyP3-SRG plugin provides a set of workflows (currently only accessible via the docker container) that can be used to process SAR data using the [Stanford Radar Group Processor](https://github.com/asfhyp3/srg). This set of workflows uses the [SRG algorithms](https://doi.org/10.1109/LGRS.2017.2753580) to process Level-0 Sentinel-1 (S1) data into geocoded, user-friendly products that can be used for time-series analysis. The workflows currently included in this plugin are:

- `back_projection`: A workflow for creating geocoded Sentinel-1 SLCs from Level-0 data using the [back-projection methodology](https://doi.org/10.1109/LGRS.2017.2753580).
- [`back_projection`](#back-projection): A workflow for creating geocoded Sentinel-1 SLCs.
- [`time_series`](#time-series): A workflow for creating a deformation time series from geocoded Sentinel-1 SLCs.

To run a workflow, you'll first need to build the docker container:
```bash
@@ -23,7 +24,10 @@ docker run -it --rm \
++process [WORKFLOW_NAME] \
[WORKFLOW_ARGS]
```
Here is an example command for the `back_projection` workflow:

### Back-projection
The `back_projection` workflow produces geocoded SLCs (GSLCs) from Level-0 Sentinel-1 data: it takes a list of Level-0 Sentinel-1 granule names and outputs a GSLC for each granule.
An example command for the `back_projection` workflow is:
```bash
docker run -it --rm \
-e EARTHDATA_USERNAME=[YOUR_USERNAME_HERE] \
@@ -34,7 +38,20 @@ docker run -it --rm \
S1A_IW_RAW__0SDV_20231229T134404_20231229T134436_051870_064437_5F38-RAW
```
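Per the CHANGELOG, this release also adds an optional `--bounds` argument to `back_projection` for specifying the DEM extent. The following is a sketch of how it might be passed, reusing the granule above; the coordinates are placeholders, not values from this repository:
```bash
docker run -it --rm \
    -e EARTHDATA_USERNAME=[YOUR_USERNAME_HERE] \
    -e EARTHDATA_PASSWORD=[YOUR_PASSWORD_HERE] \
    hyp3-srg:latest \
    ++process back_projection \
    --bounds 53.0 27.0 54.5 28.5 \
    S1A_IW_RAW__0SDV_20231229T134404_20231229T134436_051870_064437_5F38-RAW
```
Based on the argument parser added in this commit, the four values are space-separated floats in EPSG:4326 `[min_lon, min_lat, max_lon, max_lat]` order.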

## Earthdata Login
### Time-series
The `time_series` workflow takes a list of up to 50 Sentinel-1 GSLC granule names, along with a bounding box, and produces a deformation time series. **Note that all of the input GSLCs must have been generated with the provided bounding box.** Stacks are created with `10` range looks, `10` azimuth looks, and temporal and spatial baselines of `1000` and `1000`, respectively. Candidate reference points are chosen with a correlation threshold of `0.5`, meaning the correlation must be above `0.5` in all scenes at that point. A tropospheric correction is applied using an elevation-dependent regression.
The following command will run the `time_series` workflow:
```bash
docker run -it --rm \
-e EARTHDATA_USERNAME=[YOUR_USERNAME_HERE] \
-e EARTHDATA_PASSWORD=[YOUR_PASSWORD_HERE] \
hyp3-srg:latest \
++process time_series \
S1A_IW_RAW__0SDV_20240828T020812_20240828T020844_055407_06C206_6EA7 \
S1A_IW_RAW__0SDV_20240816T020812_20240816T020844_055232_06BB8A_C7CA \
S1A_IW_RAW__0SDV_20240804T020812_20240804T020844_055057_06B527_1346
```
### Earthdata Login

For all workflows, the user must provide their Earthdata Login credentials in order to download input data.

@@ -45,24 +62,47 @@ Your credentials can be passed to the workflows via environment variables (`EART
If you haven't set up a `.netrc` file
before, check out this [guide](https://harmony.earthdata.nasa.gov/docs#getting-started) to get started.

## GPU Setup:

## Developer setup
### GPU Setup
In order for Docker to be able to use the host's GPU, the host must have the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/index.html) installed and configured.
The process differs between operating systems and Linux distributions. Setup instructions for the most common distributions, including Ubuntu,
can be found [here](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#configuration). Make sure to follow the [Docker configuration steps](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#configuration) after installing the package.

The AWS ECS-optimized GPU AMI comes with this configuration already set up. You can find the latest version of this AMI by calling:
```bash
aws ssm get-parameters --names /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended --region us-west-2
```
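If you only need the AMI ID itself, the `image_id` sub-parameter can typically be queried directly; this is a convenience sketch rather than something documented in this repository:
```bash
aws ssm get-parameters \
    --names /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id \
    --region us-west-2 \
    --query 'Parameters[0].Value' \
    --output text
```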

### GPU Docker Container
Once you have a compute environment set up as described above, you can build the GPU version of the container by running:
```bash
docker build --build-arg="GPU_ARCH={YOUR_ARCH}" -t ghcr.io/asfhyp3/hyp3-srg:{RELEASE}.gpu -f Dockerfile.gpu .
```

You can get the correct compute capability version (the value for the `GPU_ARCH` build argument) by running `nvidia-smi` on the instance to obtain the GPU type, then cross-referencing it with NVIDIA's [GPU compute capability list](https://developer.nvidia.com/cuda-gpus). For a g6.2xlarge instance, this would be:
```bash
docker build --build-arg="GPU_ARCH=89" -t ghcr.io/asfhyp3/hyp3-srg:{RELEASE}.gpu -f Dockerfile.gpu .
```
The compute capability version will always be the same for a given instance type, so you will only need to look this up once per instance type.
The default value for this argument is `89` - the correct value for g6.2xlarge instances.
**THE COMPUTE CAPABILITY VERSION MUST MATCH ON BOTH THE BUILDING AND RUNNING MACHINE!**
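On reasonably recent NVIDIA drivers, `nvidia-smi` can also report the compute capability directly, which avoids the manual cross-reference; this is a convenience sketch, not a step documented in this repository:
```bash
# Prints, e.g., "NVIDIA L4, 8.9" on a g6.2xlarge; drop the dot to get GPU_ARCH=89
nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader
```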

The value of `RELEASE` can be obtained from the git tags.

You can push a manual container to HyP3-SRG's container repository by following [this guide](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry#pushing-container-images).

### EC2 Setup
> [!CAUTION]
> Running the docker container on an Amazon Linux 2023 Deep Learning AMI runs, but will result in all zero outputs. Work is ongoing to determine what is causing this issue. For now, we recommend using option 2.i.
> The docker container will run on an Amazon Linux 2023 Deep Learning AMI, but it will produce all-zero outputs. Work is ongoing to determine what is causing this issue. For now, we recommend using option 2.3.
When running on an EC2 instance, the following setup is recommended:
1. Create a [G6-family EC2 instance](https://aws.amazon.com/ec2/instance-types/g6/) that has **at least 32 GB of memory**.
2. Launch your instance with one of the following setups (**option i is recommended**):
1. Use the latest [Amazon Linux 2023 AMI](https://docs.aws.amazon.com/linux/al2023/ug/ec2.html) with `scripts/amazon_linux_setup.sh` as the [user script on launch](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html).
2. Use the latest [Ubuntu AMI](https://cloud-images.ubuntu.com/locator/ec2/) with the `scripts/ubuntu_setup.sh` as the [user script on launch](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html).
3. Use the [Ubuntu Deep Learning Base OSS Nvidia Driver GPU AMI](https://aws.amazon.com/releasenotes/aws-deep-learning-base-gpu-ami-ubuntu-22-04/) (no install script required).
3. Build the GPU docker container with the correct compute capability version. To determine this value, run `nvidia-smi` on the instance to obtain GPU type, then cross-reference this information with NVIDIA's [GPU type compute capability list](https://developer.nvidia.com/cuda-gpus). For a g6.2xlarge instance, this would be:
3. Use the latest AWS ECS-optimized GPU AMI (`aws ssm get-parameters --names /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended --region us-west-2`)
3. Build the GPU docker container with the correct compute capability version (see the section above). To determine this value, run `nvidia-smi` on the instance to obtain the GPU type, then cross-reference this information with NVIDIA's [GPU type compute capability list](https://developer.nvidia.com/cuda-gpus). For a g6.2xlarge instance, this would be:
```bash
docker --build-arg="GPU_ARCH=89" -t hyp3-srg:gpu-89 -f Dockerfile.gpu .
docker build --build-arg="GPU_ARCH=89" -t ghcr.io/asfhyp3/hyp3-srg:{RELEASE}.gpu -f Dockerfile.gpu .
```
The compute capability version will always be the same for a given instance type, so you will only need to look this up once per instance type.
The default value for this argument is `89` - the correct value for g6.2xlarge instances.
1 change: 1 addition & 0 deletions pyproject.toml
@@ -43,6 +43,7 @@ Documentation = "https://hyp3-docs.asf.alaska.edu"

[project.entry-points.hyp3]
back_projection = "hyp3_srg.back_projection:main"
time_series = "hyp3_srg.time_series:main"

[tool.pytest.ini_options]
testpaths = ["tests"]
2 changes: 1 addition & 1 deletion src/hyp3_srg/__main__.py
@@ -14,7 +14,7 @@ def main():
parser = argparse.ArgumentParser(prefix_chars='+', formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument(
'++process',
choices=['back_projection'],
choices=['back_projection', 'time_series'],
default='back_projection',
help='Select the HyP3 entrypoint to use', # HyP3 entrypoints are specified in `pyproject.toml`
)
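For context, the `++process` choice above selects one of the entry points registered under `[project.entry-points.hyp3]` in `pyproject.toml`. A minimal sketch of how such a dispatcher could work, assuming the standard `importlib.metadata` mechanism on Python 3.10+ (the actual `hyp3_srg.__main__` may differ):

```python
"""Hypothetical sketch of a HyP3-style entry-point dispatcher; not the repository's actual code."""
import argparse
import sys
from importlib.metadata import entry_points


def main():
    parser = argparse.ArgumentParser(prefix_chars='+')
    parser.add_argument('++process', choices=['back_projection', 'time_series'], default='back_projection')
    args, remaining = parser.parse_known_args()

    # Find the entry point registered under the 'hyp3' group with the requested name
    (workflow,) = entry_points(group='hyp3', name=args.process)

    # Hand the remaining CLI arguments to the selected workflow's own main()
    sys.argv = [args.process, *remaining]
    sys.exit(workflow.load()())


if __name__ == '__main__':
    main()
```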
37 changes: 21 additions & 16 deletions src/hyp3_srg/back_projection.py
@@ -18,19 +18,6 @@
log = logging.getLogger(__name__)


def create_param_file(dem_path: Path, dem_rsc_path: Path, output_dir: Path):
"""Create a parameter file for the processor.
Args:
dem_path: Path to the DEM file
dem_rsc_path: Path to the DEM RSC file
output_dir: Directory to save the parameter file in
"""
lines = [str(dem_path), str(dem_rsc_path)]
with open(output_dir / 'params', 'w') as f:
f.write('\n'.join(lines))


def check_required_files(required_files: Iterable, work_dir: Path) -> None:
for file in required_files:
if not (work_dir / file).exists():
@@ -77,6 +64,8 @@ def create_product(work_dir) -> Path:
gslc_path = list(work_dir.glob('S1*.geo'))[0]
product_name = gslc_path.with_suffix('').name
orbit_path = work_dir / f'{product_name}.orbtiming'
rsc_path = work_dir / 'elevation.dem.rsc'
bounds_path = work_dir / 'bounds'
zip_path = work_dir / f'{product_name}.zip'

parameter_file = work_dir / f'{product_name}.txt'
@@ -89,13 +78,16 @@
with zipfile.ZipFile(zip_path, 'w', compression=zipfile.ZIP_STORED) as z:
z.write(gslc_path, gslc_path.name)
z.write(orbit_path, orbit_path.name)
z.write(rsc_path, rsc_path.name)
z.write(bounds_path, bounds_path.name)
z.write(parameter_file, parameter_file.name)

return zip_path


def back_project(
granules: Iterable[str],
bounds: list[float] = None,
earthdata_username: str = None,
earthdata_password: str = None,
bucket: str = None,
@@ -107,6 +99,7 @@
Args:
granules: List of Sentinel-1 level-0 granules to back-project
bounds: DEM extent bounding box [min_lon, min_lat, max_lon, max_lat]
earthdata_username: Username for NASA's EarthData service
earthdata_password: Password for NASA's EarthData service
bucket: AWS S3 bucket for uploading the final product(s)
@@ -127,9 +120,10 @@
bboxs.append(granule_bbox)
granule_orbit_pairs.append((granule_path, orbit_path))

full_bbox = unary_union(bboxs).buffer(0.1)
dem_path = dem.download_dem_for_srg(full_bbox, work_dir)
create_param_file(dem_path, dem_path.with_suffix('.dem.rsc'), work_dir)
if bounds is None or bounds == [0, 0, 0, 0]:
bounds = unary_union(bboxs).buffer(0.1).bounds
dem_path = dem.download_dem_for_srg(bounds, work_dir)
utils.create_param_file(dem_path, dem_path.with_suffix('.dem.rsc'), work_dir)

back_project_granules(granule_orbit_pairs, work_dir=work_dir, gpu=gpu)

@@ -156,9 +150,20 @@ def main():
parser.add_argument('--bucket', help='AWS S3 bucket HyP3 for upload the final product(s)')
parser.add_argument('--bucket-prefix', default='', help='Add a bucket prefix to product(s)')
parser.add_argument('--gpu', default=False, action='store_true', help='Use the GPU-based version of the workflow.')
parser.add_argument(
'--bounds',
default=None,
type=str.split,
nargs='+',
help='DEM extent bbox in EPSG:4326: [min_lon, min_lat, max_lon, max_lat].'
)
parser.add_argument('granules', type=str.split, nargs='+', help='Level-0 S1 granule(s) to back-project.')
args = parser.parse_args()
args.granules = [item for sublist in args.granules for item in sublist]
if args.bounds is not None:
args.bounds = [float(item) for sublist in args.bounds for item in sublist]
if len(args.bounds) != 4:
parser.error('Bounds must have exactly 4 values: [min lon, min lat, max lon, max lat] in EPSG:4326.')
back_project(**args.__dict__)


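The `create_param_file` helper deleted from `back_projection.py` above is now invoked as `utils.create_param_file`, so it has presumably moved to `hyp3_srg.utils`. Assuming it kept the original signature and behavior, the moved helper would look roughly like this (a sketch, not the committed code):

```python
from pathlib import Path


def create_param_file(dem_path: Path, dem_rsc_path: Path, output_dir: Path) -> None:
    """Write the two-line SRG parameter file referencing the DEM and its RSC file.

    Args:
        dem_path: Path to the DEM file
        dem_rsc_path: Path to the DEM RSC file
        output_dir: Directory to save the parameter file in
    """
    lines = [str(dem_path), str(dem_rsc_path)]
    with open(output_dir / 'params', 'w') as f:
        f.write('\n'.join(lines))
```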
22 changes: 13 additions & 9 deletions src/hyp3_srg/dem.py
@@ -1,9 +1,9 @@
"""Prepare a Copernicus GLO-30 DEM virtual raster (VRT) covering a given geometry"""
import logging
from pathlib import Path
from typing import Optional

import requests
from shapely.geometry import Polygon

from hyp3_srg import utils

@@ -27,26 +27,30 @@ def ensure_egm_model_available():
f.write(chunk)


def download_dem_for_srg(
footprint: Polygon,
work_dir: Path,
) -> Path:
"""Download the given DEM for the given extent.
def download_dem_for_srg(bounds: list[float], work_dir: Optional[Path]):
"""Download the DEM for the given bounds - [min_lon, min_lat, max_lon, max_lat].
Args:
footprint: The footprint to download a DEM for
bounds: The bounds of the extent of the desired DEM - [min_lon, min_lat, max_lon, max_lat].
work_dir: The directory to create the DEM in
Returns:
The path to the downloaded DEM
"""
if (bounds[0] >= bounds[2] or bounds[1] >= bounds[3]):
raise ValueError(
"Improper bounding box formatting, should be [max latitude, min latitude, min longitude, max longitude]."
)

dem_path = work_dir / 'elevation.dem'
dem_rsc = work_dir / 'elevation.dem.rsc'

with open(work_dir / 'bounds', 'w') as bounds_file:
bounds_file.write(' '.join([str(bound) for bound in bounds]))

ensure_egm_model_available()

# bounds produces min x, min y, max x, max y; stanford wants toplat, botlat, leftlon, rightlon
stanford_bounds = [footprint.bounds[i] for i in [3, 1, 0, 2]]
stanford_bounds = [bounds[i] for i in [3, 1, 0, 2]]
args = [str(dem_path), str(dem_rsc), *stanford_bounds]
utils.call_stanford_module('DEM/createDEMcop.py', args, work_dir=work_dir)
return dem_path
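A minimal usage sketch of the reworked `download_dem_for_srg`, with placeholder bounds; it will only run inside the container, where the SRG `DEM/createDEMcop.py` script and the EGM model are available:

```python
from pathlib import Path

from hyp3_srg import dem

# Placeholder extent in EPSG:4326 [min_lon, min_lat, max_lon, max_lat] order
bounds = [53.0, 27.0, 54.5, 28.5]
work_dir = Path('.')

dem_path = dem.download_dem_for_srg(bounds, work_dir)
print(dem_path)  # expected: <work_dir>/elevation.dem, alongside elevation.dem.rsc and a 'bounds' file
```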
