-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for multiple time series datasets via glob and fix enso_diags
, streamflow
, tc_analysis
#866
base: cdat-migration-fy24
Are you sure you want to change the base?
Add support for multiple time series datasets via glob and fix enso_diags
, streamflow
, tc_analysis
#866
Conversation
enso_diags
and tc_analysis
enso_diags
and tc_analysis
enso_diags
, streamflow
, tc_analysis
- Required for faster downstream operations, which require in-memory NumPy arrays
@forsyth2 For the failing Can you fix this file? |
@tomvothecoder thank you for helping identify this problem. In this version of v3 Low Resolution datasets (documented here), much lower TC activity is found. It is likely that no TC is detected during the testing period (1987-1988). Therefore no file called "cyclones_stitch_v3.LR.historical_0051_1987_1988.dat" was generated from an upstream process. So the testing result that skipped tc_analysis figure is expected in case of v3.. |
Thank you for clarifying. It sounds like @forsyth2 should exclude |
@forsyth2 The run script in zppy specifies input dir paths that contain sub-directories. e3sm_diags cannot determine which Instead, the Why?I tried implementing functionality to parse the input data path for For example:
Fix -> set Input Path to Fixes for
|
@tomvothecoder Thank you for looking into all this.
I also added a failure if the stitch file is empty (included in E3SM-Project/zppy#628) to catch this in the future. But yes, for now, we'll just have to exclude tc_analysis from v3 testing.
I'm going to have to dive deeper into this. In theory,
|
Got it. I will do final testing then merge this PR soon for you to try in |
- Fix units and long name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @chengzhuzhang, I would appreciate a PR review to ensure my code changes look good to you as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main of file of interest to add support for glob of .nc
files.
new_var.attrs["units"] = "W/m2" | ||
new_var.attrs["long_name"] = "Surface latent heat flux" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addresses #818
var_start_year = int(filename.split("/")[-1].split("_")[-2][:4]) | ||
var_end_year = int(filename.split("/")[-1].split("_")[-1][:4]) | ||
start_yr_int, end_yr_int = int(self.start_yr), int(self.end_yr) | ||
var_start_year, var_end_year = self._parse_years_from_filepaths(filepaths) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tomvothecoder can we parse_years based on ds_subset, so that we don't need this _parse_years_from_filepaths
function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR looks good! I only have one minor comment.
Description
Summary of changes
glob
filepathopen_dataset()
withopen_mfdataset()
in_get_time_series_dataset_obj()
_get_time_series_filepaths
and_get_matching_time_series_filepaths
to return a list of filepath(s) or None if no filepaths found_get_time_slice()
to support parsing a list of filepath(s) for start and end years via_parse_years_from_filepaths()
enso_diags
(related errors)calculate_nino_index_model()
not catching the correctIOError
message before trying to get thesst
dataset using the"TS"
variable.load()
with time series dataset for downstream operationstc_analysis
/lcrc/group/e3sm/ac.forsyth2/zppy_min_case_e3sm_diags_cdat_migrated_output/test-diags-no-cdat-20240917/v3.LR.historical_0051/post/atm/tc-analysis_1987_1988/cyclones_stitch_v3.LR.historical_0051_1987_1988.dat
streamflow
.nc
files usingglob
and regex under nested directories -- not sure if this will work because how will e3sm_diags know which.nc
files to use if there are multiple sub-directories with the same matching files?zppy
to use exact filepath for root directory containing file/lcrc/group/e3sm/ac.forsyth2/zppy_min_case_e3sm_diags_cdat_migrated_output/test-diags-no-cdat-20240917/v3.LR.historical_0051/post/rof/native/ts/monthly/2yr/
Checklist
If applicable: