Transform vceregen renewable generation profiles #3898
Merged
Changes from 96 commits
106 commits
9fadfff
Add source metadata for vceregen
aesharpe 4036479
Add profiles to vceregen dataset name
aesharpe d8b992e
Remove blank line in description
aesharpe cac5509
Add blank Data Source template for vceregen
aesharpe a44a399
Add links to download docs section
aesharpe b8aa4e4
Add availability section
aesharpe 65ddf74
Add respondents section
aesharpe d8071ed
Add original data section
aesharpe 82be4d4
Stash WIP of extraction
e-belfer e044e93
Extract VCE tables to raw dask dfs
e-belfer 57ad9a7
Clean up warnings and restore EIA 176
e-belfer b922328
Revert to pandas concatenation
e-belfer 53934f2
Add latlonfips
e-belfer 3c8cab6
Add blank transform module for vceregen
aesharpe 9f7814b
Fill out the basic vceregen transforms
aesharpe 600b3e2
Add underscores back to function names
aesharpe e6242f2
Update time col calculation
aesharpe 130f1f0
Update docstrings and comments to reflect new time cols
aesharpe 339d791
Change merge to concat
aesharpe 3677f78
Remove dask, coerce dtypes on read-in
e-belfer c870b63
override load_column_maps behavior
e-belfer 4a4511d
Merge branch 'main' into extract-vceregen
e-belfer a91d073
Update addition of county and state name fields
aesharpe 076b113
Merge branch 'extract-vceregen' into transform-vceregen
aesharpe 0b3fa45
Add vceregen to init files and metadata so that it will run on dagste…
aesharpe 79c6016
Add resource metadata for vcregen
aesharpe 54fd155
Clean county strings more
aesharpe 6d14dae
Add release notes
aesharpe deae1e2
Add function to validate state_county_names and improve performance o…
aesharpe 63d2666
make for loops into dict comp, update loggers, and improve regex
aesharpe 24f4fb5
Add asset checks and remove inline checks
aesharpe da272f5
Change hour_utc to datetime_utc
aesharpe 7bc1741
Remove incorrect docstring
aesharpe ae85f64
Update dataset and field metadata
aesharpe d669c74
Rename county col to county_or_subregion
aesharpe 140a181
Merge branch 'transform-vceregen' into vceregen-docs
aesharpe 7dbe5d2
Update data_source docs page
aesharpe 98fd118
change axis=1 to axis=columns
aesharpe b6b5e6c
Merge branch 'main' into extract-vceregen
e-belfer 291ba7d
Update DOI to sandbox and temporarily xfail DOI test
e-belfer 44f3ae8
Merge branch 'extract-vceregen' into transform-vceregen
aesharpe 8e6d88a
Change county_or_subregion to county_or_lake_name
aesharpe 6a49f69
Change county_or_subregion to county_or_lake_name
aesharpe 1f3666c
Merge branch 'transform-vceregen' of https://github.com/catalyst-coop…
aesharpe 7319e7f
Update docs to explain solar cap fac
aesharpe 08d7341
Merge branch 'main' into transform-vceregen
aesharpe 3eaebe6
Update regen to rare
e-belfer b324123
Merge branch 'main' into extract-vceregen
e-belfer 120451d
Merge branch 'extract-vceregen' of https://github.com/catalyst-cooper…
e-belfer 5b98e60
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 9f6204e
Merge branch 'main' into extract-vceregen
e-belfer 77d47a4
Update gsutil in zenodo-cache-sync
e-belfer adfff81
Merge branch 'extract-vceregen' of https://github.com/catalyst-cooper…
e-belfer ece9dab
Merge branch 'extract-vceregen' into transform-vceregen
e-belfer 0e3792d
Merge branch 'extract-vceregen' into transform-vceregen
aesharpe 93e4487
Merge branch 'transform-vceregen' of https://github.com/catalyst-coop…
aesharpe 7e3c926
Rename vceregen to vcerare
aesharpe 6dea332
Add back user project
e-belfer 7554a36
Update project path
e-belfer 9ada9f5
Update project to billing project
e-belfer 4158afd
Update dockerfile to replace gsutil with gcloud storage
e-belfer 069c246
Merge branch 'extract-vceregen' into transform-vceregen
e-belfer 2ba9b16
Update docs/release_notes.rst
aesharpe e0f6524
Update docs/release_notes.rst
aesharpe f793776
Update docs/templates/vcerare_child.rst.jinja
aesharpe d41c44d
First batch of little docs fixes
aesharpe 98c2f69
Restructure _combine_city_county_records function
aesharpe d7b59d5
Add link to zenodo archive to data source page
aesharpe 178f0fb
Clarify 1 vs. 100 in data source page
aesharpe 7a1ebe1
Spread out comments in the _prep_lat_long_fips_df function
aesharpe 782e925
Update docstring for _prep_lat_long_fips_df
aesharpe 58aa99f
Switch order of add_time_cols and make_cap_frac functions
aesharpe e494482
Update _combine_city_county_records and move assertion to asset checks
aesharpe c2f3f75
Change all().all() to any().any()
aesharpe ccaa4ae
Add validations to merges
aesharpe 3913006
Resolve merge conflicts with main
aesharpe 865756a
docs cleanup tidbits
aesharpe c838e63
Turn _combine_city_county_records function into _drop_city_records an…
aesharpe 05376e8
Make fips columns categorical and narrow scope of regex
aesharpe ef1a243
data source docs updates
aesharpe 78fe904
Add downloadable docs to vcerare data source and fix data source file…
aesharpe 6becc52
Remove 1.34 from field description for capacity_factor_solar_pv
aesharpe c2d16ae
Add some logs and a function to null county_id_fips values from lakes…
aesharpe 0d13365
Update solar_pv metadata
aesharpe 6cde307
Update solar_pv metadata
aesharpe f356336
Merge branch 'main' into transform-vceregen
aesharpe d03eab3
Rename RARE dataset in the release notes
aesharpe 73b70d9
Add issue number to release notes
aesharpe 15e0f40
Merge branch 'transform-vceregen' of https://github.com/catalyst-coop…
aesharpe ff23cbe
Update field description for county_or_lake_name
aesharpe de63b12
Update docstring for transform module
aesharpe 710ead0
Make all references to FIPS uppercase in notes and comments
aesharpe 69b4f71
Correct inline comment in _null_non_county_fips_rows
aesharpe 71af223
Fix asset check
aesharpe d1074f8
Minor late-night PR fixes
zaneselvans ae5fd8c
Log during VCE RARE asset checks to see what's slow.
zaneselvans 06db957
Add simple notebook for processing vcerare data
aesharpe e6904ac
Re-enable Zenodo DOI validation unit test.
zaneselvans 6516d7c
Update docs to use gcloud storage not gsutil
zaneselvans 6f64a29
Try to reduce memory use & concurrency for VCE RARE dataset
zaneselvans a40b1f3
Retry policy for VCE + highmem use for VCE asset check.
zaneselvans 077a47f
Bump VM RAM and remove very-high memory tag & retry
zaneselvans 71256e4
Bump vCPUs to 16
zaneselvans cd78e56
Add fancy charts to notebook
aesharpe 81297c9
Merge branch 'transform-vceregen' of https://github.com/catalyst-coop…
aesharpe bd13830
Add link to VCE data in nightly build outputs. Other docs tweaks.
zaneselvans
Binary file added
BIN
+13.6 MB
docs/data_sources/vcerare/2018 VCE Study_Dataset Methods and Analysis.pdf
Binary file not shown.
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
{% extends "data_source_parent.rst.jinja" %} | ||
{% block background %} | ||
The data in the Resource Adequacy Renewable Energy (RARE) Power Dataset were produced by | ||
Vibrant Clean Energy based on outputs from the NOAA HRRR model and are licensed | ||
to the public under the Creative Commons Attribution 4.0 International license | ||
(CC-BY-4.0). | ||
|
||
See the `Zenodo archive README <https://doi.org/10.5281/zenodo.13937523>`__ for more | ||
detailed information. | ||
|
||
The cleaned PUDL table contains hundreds of millions of rows, so don't try to open it with Excel! | ||
Review comment: Ideally we'd link to the Kaggle notebook here, but we obviously don't have a link yet. |
||
{% endblock %} | ||
|
||
{% block download_docs %} | ||
{% for filename in download_paths %} | ||
* :download:`{{ filename.stem.replace("-", " ").replace("_", " ").title() }} ({{ filename.suffix.replace('.', '').upper() }}) <{{ filename }}>` | ||
{% endfor %} | ||
* `NOAA HRRR Model Overview <https://rapidrefresh.noaa.gov/hrrr/>`__ | ||
{% endblock %} | ||
|
||
|
||
{% block availability %} | ||
Hourly, county-level data from 2019 - 2023 is integrated into PUDL. There is a | ||
second release of data for the years 2014 - 2018 expected in Q1 of 2025, which will be | ||
integrated into PUDL pending funding availability. | ||
{% endblock %} | ||
|
||
{% block respondents %} | ||
This data does not come from a government agency, and is not the result of compulsory | ||
data reporting. | ||
{% endblock %} | ||
|
||
{% block original_data %} | ||
The contents of the original CSVs are formatted so that Excel can display the | ||
data without crashing. There's one file per year per generation type, and each | ||
file contains an index column for time (simply 1, 2, 3...8760 to | ||
represent the hours in a year) and columns for each county populated with capacity | ||
factor values as a percentage from 0-100. | ||
{% endblock %} | ||
|
||
{% block notable_irregularities %} | ||
Non-county regions | ||
------------------ | ||
|
||
The original data include capacity factors for some non-county areas including the Great | ||
Lakes and two small cities (Bedford City, VA and Clifton Forge City, VA). The data associated | ||
"county" FIPS IDs with those areas, meaning that there was not a 1:1 relationship | ||
between the FIPS IDs and the named areas, and the geographic region implied by the | ||
FIPS IDs did not correspond to the named area. We've dropped the cities -- one of which | ||
contained no data -- and set the FIPS codes for the Great Lakes to NA. Note that lakes | ||
bordering multiple states will appear more than once in the data. VCE used a nearest | ||
neighbor technique to assign the state waters to the counties (this pertains to coastal | ||
areas as well). | ||
|
||
Capacity factors > 1 | ||
-------------------- | ||
There are a couple of capacity factor values for the solar PV data that exceed | ||
the maximum value of 1 for capacity factor (or 100 for the raw data--PUDL converts the | ||
data from a percentage to a fraction to match other reported capacity factor data). This | ||
is due to power production performance being correlated with panel temperatures. During | ||
cold sunny periods, some solar capacity factor values are greater than 1 (but less than | ||
1.1). | ||
|
||
8760-hour years | ||
--------------- | ||
This data is primarily used for modeling purposes and conforms to the 8760 hour/year | ||
standard regardless of leap years. This means that 2020 is missing data for December | ||
31st. | ||
|
||
{% endblock %} |
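A minimal sketch of the 8760-hour convention described in the notable irregularities block above. It assumes that hour 1 is the hour beginning at 00:00 UTC on January 1 of the report year. That alignment and the helper name `hour_of_year_to_datetime_utc` are illustrative assumptions, not taken from the PUDL transform code.

```python
from datetime import datetime, timedelta, timezone

HOURS_PER_MODEL_YEAR = 8760  # 365 days * 24 hours, used even in leap years


def hour_of_year_to_datetime_utc(report_year: int, hour_of_year: int) -> datetime:
    """Convert a 1-based hour index into a UTC timestamp (hypothetical helper)."""
    if not 1 <= hour_of_year <= HOURS_PER_MODEL_YEAR:
        raise ValueError(f"hour_of_year must be in 1..8760, got {hour_of_year}")
    # Assumption: hour 1 is the hour beginning at 00:00 UTC on January 1.
    start = datetime(report_year, 1, 1, tzinfo=timezone.utc)
    return start + timedelta(hours=hour_of_year - 1)


last_2019 = hour_of_year_to_datetime_utc(2019, 8760)  # 2019-12-31 23:00 UTC
last_2020 = hour_of_year_to_datetime_utc(2020, 8760)  # 2020-12-30 23:00 UTC
```

Under this assumption the final hour of a leap year lands on December 30, which is why 2020 has no rows for December 31.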
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -466,6 +466,56 @@ | |
"type": "number", | ||
"description": "Fraction of potential generation that was actually reported for a plant part.", | ||
}, | ||
"capacity_factor_offshore_wind": { | ||
"type": "number", | ||
"description": ( | ||
"Estimated capacity factor (0-1) calculated for offshore wind " | ||
"assuming a 140m hub height and 120m rotor diameter. " | ||
"Based on outputs from the NOAA HRRR operational numerical " | ||
"weather prediction model. Capacity factors are normalized " | ||
"to unity for maximal power output. " | ||
"Vertical slices of the atmosphere are considered across the " | ||
"defined rotor swept area. Bringing together wind speed, density, " | ||
"temperature and icing information, a power capacity is estimated " | ||
"using a representative power coefficient (Cp) curve to determine " | ||
"the power from a given wind speed, atmospheric density and " | ||
"temperature. There is no wake modeling included in the dataset." | ||
), | ||
}, | ||
"capacity_factor_onshore_wind": { | ||
"type": "number", | ||
"description": ( | ||
"Estimated capacity factor (0-1) calculated for onshore wind " | ||
"assuming a 100m hub height and 120m rotor diameter. " | ||
"Based on outputs from the NOAA HRRR operational numerical " | ||
"weather prediction model. Capacity factors are normalized " | ||
"to unity for maximal power output. " | ||
"Vertical slices of the atmosphere are considered across the " | ||
"defined rotor swept area. Bringing together wind speed, density, " | ||
"temperature and icing information, a power capacity is estimated " | ||
"using a representative power coefficient (Cp) curve to determine " | ||
"the power from a given wind speed, atmospheric density and " | ||
"temperature. There is no wake modeling included in the dataset." | ||
), | ||
}, | ||
"capacity_factor_solar_pv": { | ||
"type": "number", | ||
"description": ( | ||
"Estimated capacity factor (0-1) calculated for solar PV " | ||
"assuming a fixed axis panel tilted at latitude and DC power " | ||
"outputs. Due to power production performance being correlated " | ||
"with panel temperatures, during cold sunny periods, some solar " | ||
"capacity factor values are greater than 1 (but less than 1.1). " | ||
"All values are based on outputs from the NOAA HRRR operational " | ||
"numerical weather prediction model. Capacity factors are " | ||
"normalized to unity for maximal power output. " | ||
"Pertinent surface weather variables are pulled such as " | ||
"incoming short wave radiation, direct normal irradiance " | ||
"(calculated in the HRRR 2016 forward), surface temperature " | ||
"and other parameters. These are used in a non-linear I-V curve " | ||
"translation to power capacity factors." | ||
), | ||
}, | ||
"capacity_mw": { | ||
"type": "number", | ||
"description": "Total installed (nameplate) capacity, in megawatts.", | ||
|
@@ -836,6 +886,13 @@ | |
"type": "string", | ||
"description": "County name as specified in Census DP1 Data.", | ||
}, | ||
"county_or_lake_name": { | ||
"type": "string", | ||
"description": ( | ||
"County or lake name. Lake names may also appear several times--once for " | ||
"each state it touches. FIPS ID values for lakes have been nulled." | ||
), | ||
}, | ||
"country_code": { | ||
"type": "string", | ||
"description": "Three letter ISO-3166 country code (e.g. USA or CAN).", | ||
|
@@ -1970,6 +2027,10 @@ | |
"description": "The energy contained in fuel burned, measured in million BTU.", | ||
"unit": "MMBtu", | ||
}, | ||
"hour_of_year": { | ||
"type": "integer", | ||
Review comment: I changed this to integer. |
||
"description": "Integer between 1 and 8760 representing the hour in a given year.", | ||
}, | ||
"unit_heat_rate_mmbtu_per_mwh": { | ||
"type": "number", | ||
"description": "Fuel content per unit of electricity generated. Coming from MCOE calculation.", | ||
|
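The field descriptions above imply value ranges that a reader may want to spot-check: wind capacity factors between 0 and 1, and solar PV values that can exceed 1 but stay below 1.1. The following is a rough sketch of such a check on a single record. The function name and the choice to skip missing values are assumptions for illustration, and this is not the asset check added in this PR.

```python
import math
from typing import Optional

# Assumed bounds, read off the field descriptions: wind values lie in [0, 1];
# solar PV may exceed 1 during cold sunny hours but stays below 1.1.
WIND_FIELDS = ("capacity_factor_onshore_wind", "capacity_factor_offshore_wind")
SOLAR_FIELD = "capacity_factor_solar_pv"


def out_of_range_fields(record: dict[str, Optional[float]]) -> list[str]:
    """Return the capacity factor fields whose values fall outside the documented range."""
    bad = []
    for field in (SOLAR_FIELD, *WIND_FIELDS):
        value = record.get(field)
        if value is None or math.isnan(value):
            continue  # missing values are skipped here, not flagged
        in_range = 0 <= value < 1.1 if field == SOLAR_FIELD else 0 <= value <= 1
        if not in_range:
            bad.append(field)
    return bad
```

For example, a record with a solar value of 1.05 passes, while a solar value of 1.2 or a negative wind value is reported by field name.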
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
"""Table definitions for data coming from the Vibrant Clean Energy Renewable Generation Profiles.""" | ||
|
||
from typing import Any | ||
|
||
RESOURCE_METADATA: dict[str, dict[str, Any]] = { | ||
"out_vcerare__hourly_available_capacity_factor": { | ||
"description": ( | ||
"The data in this table were produced by Vibrant Clean Energy, and are " | ||
"licensed to the public under the Creative Commons Attribution 4.0 International " | ||
"license (CC-BY-4.0). The table consists of estimated county-averaged hourly " | ||
"capacity factors for wind and solar generating facilities across the contiguous " | ||
"United States (US) to be used as a tool and input for resource adequacy modeling " | ||
"and planning. The hourly capacity factors are normalized to unity for maximal power " | ||
"output. To convert to units of power, the user must multiply by the installed capacity " | ||
"within the county.\n\n" | ||
"The technologies provided are:\n" | ||
"1) Onshore wind assuming a 100m hub height and 120m rotor diameter;\n" | ||
"2) Offshore wind assuming a 140m hub height and 120m rotor diameter;\n" | ||
"3) Utility solar assuming a fixed axis panel tilted at latitude.\n\n" | ||
"The foundation of the capacity factors provided here is the NOAA HRRR " | ||
"operational numerical weather prediction model. The HRRR covers the entire " | ||
"contiguous US at a horizontal resolution of 3 km. Forecasts are initialized each " | ||
"hour of the year. Forecast hour two (2) is used as the input data for the power " | ||
"algorithms. This forecast hour is chosen to trade off the impact of the measurement " | ||
"and data assimilation procedure of the HRRR with the physics of the model to derive " | ||
"the most complete picture of the atmosphere at the forecast time horizon. " | ||
"Hourly capacity factors are spatially averaged across each county over the contiguous " | ||
"USA. There are a handful of counties that are too small to pick up representation on " | ||
"the HRRR operational forecast grid. As such, these counties will have no wind or solar " | ||
"power production curves.\n\n" | ||
"For wind capacity factors: vertical slices of the atmosphere are considered across " | ||
"the defined rotor swept area. Bringing together wind speed, density, temperature and " | ||
"icing information, a power capacity is estimated using a representative power coefficient " | ||
"(Cp) curve to determine the power from a given wind speed, atmospheric density and " | ||
"temperature. There is no wake modeling included in the dataset.\n\n" | ||
"For solar capacity factors: pertinent surface weather variables are pulled such as " | ||
"incoming short wave radiation, direct normal irradiance (calculated in the HRRR 2016 " | ||
"forward), surface temperature and other parameters. These are used in a non-linear " | ||
"I-V curve translation to power capacity factors. Due to power production " | ||
"performance being correlated with panel temperatures, during cold sunny periods, " | ||
"some solar capacity factor values are greater than 1 (but less than 1.1)." | ||
), | ||
"schema": { | ||
"fields": [ | ||
"datetime_utc", | ||
"hour_of_year", | ||
"report_year", | ||
"county_id_fips", | ||
"county_or_lake_name", | ||
"state", | ||
"latitude", | ||
"longitude", | ||
"capacity_factor_solar_pv", | ||
"capacity_factor_onshore_wind", | ||
"capacity_factor_offshore_wind", | ||
], | ||
"primary_key": ["datetime_utc", "state", "county_or_lake_name"], | ||
}, | ||
"sources": ["vcerare"], | ||
"field_namespace": "vcerare", | ||
"etl_group": "vcerare", | ||
"create_database_schema": False, | ||
}, | ||
} |
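The table description says that capacity factors are normalized to unity and must be multiplied by the installed capacity within a county to obtain power. A small sketch of that conversion follows. The helper name and the 200 MW installed capacity are made-up values for illustration only.

```python
def available_power_mw(capacity_factor: float, installed_capacity_mw: float) -> float:
    """Scale a unitless capacity factor by installed capacity to get power in MW."""
    return capacity_factor * installed_capacity_mw


# Hypothetical county with 200 MW of installed solar PV and four hourly
# capacity factors, including one value slightly above 1.
hourly_capacity_factors = [0.0, 0.25, 0.5, 1.05]
hourly_power_mw = [available_power_mw(cf, 200.0) for cf in hourly_capacity_factors]
```

Because solar capacity factors can slightly exceed 1, the resulting power can slightly exceed the installed capacity in those hours.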
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -77,4 +77,5 @@ | |
gridpathratoolkit, | ||
nrelatb, | ||
params, | ||
vcerare, | ||
) |
We asked Chris which of the two reports ought to be used as a reference and he indicated the 2020 one was preferable. Is there a reason to include this one too?
My reasoning was that it has more detail about the methodology. When Chris said the other one was preferable I didn't necessarily take that to mean that this one wasn't accurate, rather that the other was more succinct. But we can remove it if you think it might not be what Chris intended!