-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ktokunaga/30/add marketingyears to distribs #60
base: main
Are you sure you want to change the base?
Conversation
Taking Eugene's restructure branch (from PR #59), this updates the following: 1. Added cleaned version of Rob's code to database.by --> creates an ingredient_rxcui_year table & a product_rxcui_year table, for their respective distributions in generate_module 2. added 'year' column to the generate_module dataframes/CSV files. 3. fixed default_probability typo in utils.py (if idx == 1 --> changed to if idx == 0)
del medication_ingredient_rxcui_years | ||
del medication_product_rxcui_years |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kristentaytok - I get the error below because you are deleting a df that doesn't exist yet. Maybe you meant a different df? Please take a look. This only errors out if you are initially loading the DB, but doesn't affect the actual DB load.
UnboundLocalError: local variable 'medication_ingredient_rxcui_years' referenced before assignment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agh, I had that error at first and oddly it was fine after running the exact same code once in a Jupyter nb. but I think I figured it out in case we decide to use this (or some variation of it): I was supposed to write it this way:
del med_marketing_year_dict['medication_ingredient_rxcui_years']
del med_marketing_year_dict['medication_product_rxcui_years']
because I created the dataframes as dictionary key-value pairs because that's the only option (I'm aware of) to dynamically create a variable name in the for loop (i.e., create a variable called "medication_ingredient_df" in the first loop, and another variable "medication_product_df" in the next loop).
Uh oh @kristentaytok - I'm getting the error when I run Synthea that I was worried about:
What this means is that I don't think we can build year into the CSV distribution because the current year isn't technically a patient attribute... which complicates things... I think this means I will have to build the conditional logic into the JSON to check current year and if it's too early, go back to the beginning of the submodule to find a different match that works... I dunno... |
I also tried replacing https://github.com/synthetichealth/synthea/wiki/Generic-Module-Framework%3A-Logic#date |
One hacky thing we could try would be to assign a |
@kristentaytok - I just emailed the Synthea devs asking whether either of my ideas are technically possible. For now, let's hold off on this. As we discussed before, not having the FDA year thing figured out should not impede validation testing. Thanks for doing all this - I did not expect you to turn this all around in a day, but other than Synthea not playing nice, it seems to work very well! I really hope I didn't waste your time with this one. |
@jrlegrand thanks for the update and testing out the year column! and no worries - I figured that like the validation/chi square testing, it was something we should "fail fast" to see if it was possible and more of the time spent on the issue would be figuring out what to do if it doesn't lol. |
Fixes #30
Explanation
Carrying the FDA marketing date steps over the finish line:
Rationale
See issue #30
Tests
testing logs