The Medication Diversification Tool (MDT) leverages publicly available, government-maintained datasets to enhance Synthea’s Synthetic Patient Generator. The synthetic health data generated by Synthea can be used by researchers, software developers, policymakers, and clinicians to develop healthcare solutions. Currently, the process for generating medications in Synthea is manual and limited to a small selection of medications in individual modules. The goal of the MDT is to create more diverse synthetic patient medication orders that accurately reflect the heterogeneity of medications prescribed in the US population.
The MDT automates the process for finding relevant medication codes and calculating a distribution of medications, using medication classification dictionaries from RxClass and population-level prescription data from the Medical Expenditure Panel Survey (MEPS). The medication distributions can be tailored to specific patient demographics (e.g., age, gender, state of residence) and combined with Synthea data to generate medication records for a sample patient population.
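As a rough illustration of what "calculating a distribution of medications" means, the sketch below turns prescription counts into per-group probabilities. The record layout, the function name `medication_distribution`, and the sample counts are invented for illustration; they are not MDT's internals or real MEPS figures.

```python
from collections import defaultdict


def medication_distribution(records):
    """Convert (group, medication, count) rows into per-group probabilities.

    `records` uses a hypothetical layout, not the MEPS schema.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for group, medication, count in records:
        counts[group][medication] += count

    distribution = {}
    for group, medications in counts.items():
        total = sum(medications.values())
        if total > 0:  # skip groups with no recorded prescriptions
            distribution[group] = {
                name: value / total for name, value in medications.items()
            }
    return distribution


# Made-up counts for two age groups (illustrative only).
sample = [
    ("0-17", "albuterol", 30),
    ("0-17", "fluticasone", 10),
    ("18-64", "albuterol", 20),
    ("18-64", "fluticasone", 20),
]
dist = medication_distribution(sample)
print(dist["0-17"])
```

Each group's probabilities sum to 1, which is the property a transition table needs when one medication is drawn per patient.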
- Clone the repo.
git clone https://github.com/coderxio/medication-diversification.git
cd medication-diversification
- Create and activate a venv.
python -m venv venv
source venv/bin/activate
Or on Windows (using Git Bash):
py -m venv venv
source venv/Scripts/activate
If you are using VSCode on Windows and get the error "Activate.ps1 is not digitally signed. You cannot run this script on the current system.", you may need to temporarily change the PowerShell execution policy to allow scripts to run. If this is the case, try
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process
and then repeat step 2 (creating and activating the venv).
- Install MDT as an editable package (note the `.` after `-e`).
pip install -e .
- Change to a new directory outside of the `medication-diversification/` project folder to test out MDT.
cd ..
mkdir mdt-test
cd mdt-test
- Initialize MDT. This only needs to be done once. It creates a `data/` directory and loads the `MDT.db` database.
mdt init
- Create a new module. This creates a `<<module_name>>/` directory, which is empty except for an initial `settings.yaml` file.
mdt module -n <<module_name>> create
- Edit the `settings.yaml` file in the newly created `<<module_name>>/` directory, following the directions in this README.
- Build the module.
mdt module -n <<module_name>> build
This will create:
- A `<<module_name>>.json` file, which is the Synthea module itself
- A `lookup_tables/` directory with all transition table CSVs
- A `log/` directory with helpful output logs and debugging CSVs
Repeat steps 7 and 8 (editing `settings.yaml` and building the module) until MDT is producing medications that align with what you would expect. Use the `log <<timestamp>>.txt` files in the `log/` directory as a quick and easy way to validate the output of the module with a clinical subject matter expert.
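Between build iterations, it can help to confirm that the three outputs listed above are actually present. The following is a minimal sketch; it assumes the outputs are written inside the module directory, and `missing_build_outputs` is a hypothetical helper, not part of the `mdt` command line.

```python
from pathlib import Path


def missing_build_outputs(module_dir, module_name):
    """Return the expected build outputs that are absent from module_dir."""
    root = Path(module_dir)
    expected = [
        root / f"{module_name}.json",  # the Synthea module itself
        root / "lookup_tables",        # transition table CSVs
        root / "log",                  # output logs and debugging CSVs
    ]
    return [str(path.relative_to(root)) for path in expected if not path.exists()]


if __name__ == "__main__":
    # "maintenance_inhaler" is a placeholder module name.
    print(missing_build_outputs("maintenance_inhaler", "maintenance_inhaler"))
```

An empty list means all three outputs were found.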
To create another module, start again at step 6 (creating a new module).
Pre-built module settings file examples are available in the `docs/examples` folder.
Setting | Type | Description |
---|---|---|
name | string | (optional) The name of your module. Defaults to the camel_case name of the module folder. Also used as the `assign_to_attribute` property by default. |
assign_to_attribute | string | (optional) The name of the "attribute" to assign this state to. Defaults to `<<module_name>>`. |
reason | string | (optional) Either an "attribute" or a "State_Name" referencing a previous `ConditionOnset` state. |
chronic | boolean | (optional) If `true`, a medication is considered a chronic medication for a chronic condition. This will cause Synthea to reissue the same medication as a new prescription AND discontinue the old prescription at each wellness encounter. Defaults to `false`. |
as_needed | boolean | (optional) If `true`, the medication may be taken as needed instead of on a specific schedule. Defaults to `false`. |
refills | integer | (optional) The number of refills to allow. Defaults to `0`. |
NOTE: At least one RxClass `include` or RXCUI `include` is required to run MDT.
Setting | Type | Description |
---|---|---|
include | list of objects | `class_id` / `relationship` pairs of RxClass classes to include. See RxClass for valid options. |
exclude | list of objects | `class_id` / `relationship` pairs of RxClass classes to exclude. See RxClass for valid options. |
Examples:
NOTE: All YAML keys in the default generated settings file must be present, even if a key's value is empty. This will be adjusted in a future version of MDT to set appropriate default values if a key is omitted.
rxclass:
  include:
    - class_id: R01AD
      relationship: ATC
  exclude:  # required key; an empty value is read as an empty array
    # -
Corticosteroid medications
rxclass:
  include:
    - class_id: R01AD
      relationship: ATC
  exclude:
Medications that may treat hypothyroidism
rxclass:
  include:
    - class_id: D007037
      relationship: may_treat
  exclude:
HMG CoA reductase inhibitor medications AND medications that may prevent stroke
rxclass:
  include:
    - class_id: C10AA
      relationship: ATC
    - class_id: D020521
      relationship: may_prevent
  exclude:
Medications that may prevent stroke EXCLUDING P2Y12 platelet inhibitors
rxclass:
  include:
    - class_id: D020521
      relationship: may_prevent
  exclude:
    - class_id: N0000182142
      relationship: has_EPC
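One way to read the `include` and `exclude` lists in these examples is as set operations: the union of everything matched by the `include` pairs, minus anything matched by an `exclude` pair. The sketch below assumes that reading. The `class_members` lookup and its member values are placeholders, not RxClass data or MDT's actual implementation.

```python
def resolve_classes(include, exclude, class_members):
    """Union the members of included classes, then drop excluded members."""

    def members(pairs):
        found = set()
        for pair in pairs or []:  # an empty YAML key loads as None
            key = (pair["class_id"], pair["relationship"])
            found |= class_members.get(key, set())
        return found

    return members(include) - members(exclude)


# Placeholder membership table; the member strings are dummies, not RXCUIs.
class_members = {
    ("D020521", "may_prevent"): {"rxcui-a", "rxcui-b", "rxcui-c"},
    ("N0000182142", "has_EPC"): {"rxcui-b"},
}
selected = resolve_classes(
    include=[{"class_id": "D020521", "relationship": "may_prevent"}],
    exclude=[{"class_id": "N0000182142", "relationship": "has_EPC"}],
    class_members=class_members,
)
print(sorted(selected))  # ['rxcui-a', 'rxcui-c']
```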
NOTE: At least one RxClass `include` or RXCUI `include` is required to run MDT. RXCUIs in the `include` and `exclude` sections must be surrounded by single quotation marks.
Setting | Type | Description |
---|---|---|
include | list of strings | RXCUIs to include. See ingredients section of RxNav for valid options. |
exclude | list of strings | RXCUIs to exclude. See ingredients section of RxNav for valid options. |
ingredient_tty_filter | string | (optional) `IN` to only return single ingredient products or `MIN` to only return multiple ingredient products. |
dose_form_filter | list of strings | (optional) A list of dose forms or dose form group names to filter products by. See this RxNorm dose form reference for valid options. |
Examples:
Prednisone medications
rxcui:
  include:
    - '8640'
  exclude:
Albuterol AND levalbuterol medications
rxcui:
  include:
    - '435'
    - '237159'
  exclude:
Fluticasone / salmeterol (TTY = MIN, multiple ingredient) medications
rxcui:
  include:
    - '284635'
  exclude:
Single ingredient inhalant product fluticasone medications only
rxcui:
  include:
    - '41126'
  exclude:
  ingredient_tty_filter: IN
  dose_form_filter:
    - Inhalant Product
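The rules in the two NOTEs (required keys, at least one `include`, quoted RXCUIs) can be checked before running a build. The sketch below works on a settings mapping that has already been parsed from YAML. `validate_settings` is a hypothetical helper and only covers the `rxclass` and `rxcui` keys described here, not every key in the generated settings file.

```python
def validate_settings(settings):
    """Return a list of problems found in a parsed settings mapping."""
    problems = []
    for section in ("rxclass", "rxcui"):
        block = settings.get(section)
        if not isinstance(block, dict):
            problems.append(f"missing section: {section}")
            continue
        for key in ("include", "exclude"):
            if key not in block:
                problems.append(f"missing key: {section}.{key}")

    def items(section, key):
        # An empty YAML key loads as None, so treat it as an empty list.
        block = settings.get(section)
        return (block.get(key) or []) if isinstance(block, dict) else []

    if not items("rxclass", "include") and not items("rxcui", "include"):
        problems.append("need at least one rxclass or rxcui include")

    for key in ("include", "exclude"):
        for rxcui in items("rxcui", key):
            # An unquoted RXCUI such as 8640 loads as an int, not a string.
            if not isinstance(rxcui, str):
                problems.append(f"rxcui.{key} entry {rxcui!r} is not a quoted string")
    return problems


prednisone = {
    "rxclass": {"include": None, "exclude": None},
    "rxcui": {"include": ["8640"], "exclude": None},
}
print(validate_settings(prednisone))  # []
```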
Setting | Type | Description |
---|---|---|
age_ranges | list of strings | Age ranges to break up distributions by. Defaults to MDT system defaults. |
demographic_distribution_flags | object | Whether to break up distributions by `age`, `gender`, and `state`. All three default to `true`. |
Examples:
Custom age ranges for pediatric patients only
meps:
  age_ranges:
    - 0-5
    - 6-12
    - 13-17
Split population under and over 65 years old
meps:
  age_ranges:
    - 0-64
    - 65-103
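To make the `age_ranges` format concrete, the sketch below parses range strings and finds the range a given age falls into. It assumes both bounds are inclusive, which is an interpretation of the examples rather than documented MDT behavior; both function names are hypothetical.

```python
def parse_age_ranges(age_ranges):
    """Turn strings like "0-5" into (low, high) integer pairs."""
    parsed = []
    for text in age_ranges:
        low, high = (int(part) for part in str(text).split("-"))
        if low > high:
            raise ValueError(f"invalid age range: {text}")
        parsed.append((low, high))
    return parsed


def find_age_range(age, age_ranges):
    """Return the range string containing `age`, or None if none matches.

    Bounds are treated as inclusive (an assumption).
    """
    for text, (low, high) in zip(age_ranges, parse_age_ranges(age_ranges)):
        if low <= age <= high:
            return text
    return None


pediatric = ["0-5", "6-12", "13-17"]
print(find_age_range(10, pediatric))  # 6-12
print(find_age_range(40, pediatric))  # None
```

Under this reading, the pediatric example leaves adults without a matching range, while the second example covers ages 0 through 103.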
To replace a MedicationOrder with one of our MDT submodules, replace the MedicationOrder state with a CallSubmodule state.
"Medication_Submodule": {
  "type": "CallSubmodule",
  "submodule": "medications/<<name_of_your_mdt_submodule_here_without_json_file_extension>>"
}
Put the submodule JSON file in the `synthea/src/main/resources/modules/medications` folder.
Put your transition table CSV files in the `synthea/src/main/resources/modules/lookup_tables` folder.
Example for asthma module:
Using the existing asthma module as an example...
Change this...
...
"Prescribe_Maintenance_Inhaler": {
  "type": "MedicationOrder",
  "reason": "asthma_condition",
  "codes": [
    {
      "system": "RxNorm",
      "code": "895994",
      "display": "120 ACTUAT Fluticasone propionate 0.044 MG/ACTUAT Metered Dose Inhaler"
    }
  ],
  "prescription": {
    "as_needed": true
  },
  "direct_transition": "Prescribe_Emergency_Inhaler",
  "chronic": true
},
...
To this...
...
"Prescribe_Maintenance_Inhaler": {
  "type": "CallSubmodule",
  "submodule": "medications/maintenance_inhaler",
  "direct_transition": "Prescribe_Emergency_Inhaler"
},
...
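The same change can be scripted when several states need converting. The sketch below assumes the calling module keeps its states under a top-level `"states"` key and that transition keys end in `_transition`; `swap_for_submodule` is a hypothetical helper, not part of MDT or Synthea.

```python
import json


def swap_for_submodule(module, state_name, submodule):
    """Replace a MedicationOrder state with a CallSubmodule state in place."""
    old = module["states"][state_name]
    if old.get("type") != "MedicationOrder":
        raise ValueError(f"{state_name} is not a MedicationOrder state")
    new = {"type": "CallSubmodule", "submodule": submodule}
    # Carry over transition keys so the module's flow is unchanged.
    for key, value in old.items():
        if key.endswith("_transition"):
            new[key] = value
    module["states"][state_name] = new
    return module


# Trimmed-down stand-in for the asthma module shown above.
asthma = {
    "states": {
        "Prescribe_Maintenance_Inhaler": {
            "type": "MedicationOrder",
            "reason": "asthma_condition",
            "direct_transition": "Prescribe_Emergency_Inhaler",
            "chronic": True,
        }
    }
}
swap_for_submodule(asthma, "Prescribe_Maintenance_Inhaler", "medications/maintenance_inhaler")
print(json.dumps(asthma["states"]["Prescribe_Maintenance_Inhaler"], indent=2))
```

Medication-specific keys such as `codes`, `reason`, and `chronic` are dropped from the calling state, matching the hand-edited example.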
And make sure your submodule JSON and transition table CSVs are in the folder locations specified above.
- Put a `maintenance_inhaler.json` file in the `synthea/src/main/resources/modules/medications` folder.
- Put all the transition table CSV files in the `synthea/src/main/resources/modules/lookup_tables` folder.
See below for example file structure:
synthea/
├─ src/
│  ├─ main/
│  │  ├─ resources/
│  │  │  ├─ modules/
│  │  │  │  ├─ medications/
│  │  │  │  │  ├─ maintenance_inhaler.json
│  │  │  │  │  ├─ ...
│  │  │  │  ├─ lookup_tables/
│  │  │  │  │  ├─ maintenance_inhaler_ingredient_distribution.csv
│  │  │  │  │  ├─ maintenance_inhaler_fluticasone_product_distribution.csv
│  │  │  │  │  ├─ maintenance_inhaler_budesonide_product_distribution.csv
│  │  │  │  │  ├─ maintenance_inhaler_beclomethasone_product_distribution.csv
│  │  │  │  │  ├─ maintenance_inhaler_mometasone_product_distribution.csv
│  │  │  │  │  ├─ ...
│  │  │  │  ├─ asthma.json
│  │  │  │  ├─ ...
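Copying the build outputs into place can also be scripted. The sketch below assumes the module JSON and a `lookup_tables/` directory sit together in one build directory, and it uses the `medications/` folder named in the `submodule` path. `install_module` and the paths in the usage comment are hypothetical.

```python
import shutil
from pathlib import Path


def install_module(build_dir, module_name, synthea_root):
    """Copy a built module and its lookup tables into a Synthea checkout."""
    modules = Path(synthea_root) / "src/main/resources/modules"
    medications = modules / "medications"
    lookup_tables = modules / "lookup_tables"
    medications.mkdir(parents=True, exist_ok=True)
    lookup_tables.mkdir(parents=True, exist_ok=True)

    build = Path(build_dir)
    # The submodule JSON goes under modules/medications/.
    copied = [shutil.copy2(build / f"{module_name}.json", medications)]
    # Every transition table CSV goes under modules/lookup_tables/.
    for csv_file in sorted((build / "lookup_tables").glob("*.csv")):
        copied.append(shutil.copy2(csv_file, lookup_tables))
    return [Path(path) for path in copied]


# Example call with placeholder paths:
# install_module("mdt-test/maintenance_inhaler", "maintenance_inhaler", "synthea")
```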
Lastly, if the calling module (in this case, `asthma.json`) ends medications by a specific `State_Name` of a previous `MedicationOrder` state, you will need to change that `MedicationEnd` state to instead end a medication by `attribute`. The reason for this is that our MDT JSON module generates different `MedicationOrder` state names for each potential prescribed product, but they all have the same `attribute`.
Change this...
...
"Maintenance_Medication_End": {
  "type": "MedicationEnd",
  "medication_order": "Prescribe_Maintenance_Inhaler",
  "direct_transition": "Emergency_Medication_End"
},
...
To this...
...
"Maintenance_Medication_End": {
  "type": "MedicationEnd",
  "referenced_by_attribute": "maintenance_inhaler",
  "direct_transition": "Emergency_Medication_End"
},
...
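For modules with several `MedicationEnd` states, the edit can be applied programmatically. This is a minimal sketch that assumes states live under a top-level `"states"` key; `end_by_attribute` is a hypothetical helper.

```python
def end_by_attribute(module, state_name, attribute):
    """Make a MedicationEnd state end the medication by attribute."""
    state = module["states"][state_name]
    if state.get("type") != "MedicationEnd":
        raise ValueError(f"{state_name} is not a MedicationEnd state")
    # Drop the reference to a specific MedicationOrder state name.
    state.pop("medication_order", None)
    state["referenced_by_attribute"] = attribute
    return module


# Trimmed-down stand-in for the calling module.
calling_module = {
    "states": {
        "Maintenance_Medication_End": {
            "type": "MedicationEnd",
            "medication_order": "Prescribe_Maintenance_Inhaler",
            "direct_transition": "Emergency_Medication_End",
        }
    }
}
end_by_attribute(calling_module, "Maintenance_Medication_End", "maintenance_inhaler")
print(calling_module["states"]["Maintenance_Medication_End"])
```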
- In Synthea, change the settings in `synthea/src/main/resources/synthea.properties` to disable FHIR exporting and enable CSV exporting.
...
exporter.fhir.export = false
...
exporter.csv.export = true
...
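If you switch these flags often, the edit can be automated. The sketch below rewrites `key = value` lines in properties text and appends any key that is missing. `set_properties` is a hypothetical helper that handles only simple single-line entries.

```python
def set_properties(text, updates):
    """Return properties text with the given keys set, appending missing ones."""
    remaining = dict(updates)
    lines = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_comment = line.lstrip().startswith(("#", "!"))
        if "=" in line and not is_comment and key in remaining:
            lines.append(f"{key} = {remaining.pop(key)}")
        else:
            lines.append(line)  # leave comments and other settings untouched
    lines.extend(f"{key} = {value}" for key, value in remaining.items())
    return "\n".join(lines) + "\n"


original = "exporter.fhir.export = true\nexporter.csv.export = false\n"
updated = set_properties(
    original,
    {"exporter.fhir.export": "false", "exporter.csv.export": "true"},
)
print(updated)
```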
- Each time you run Synthea, make sure you have all Synthea CSV output files closed, or it will error out with a non-specific error message.
- Run Synthea with a large enough sample size (at least 1000) to see a noticeable impact from MDT.
- Check the `medications.csv` output file for medications produced by your MDT-generated module.
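A quick way to inspect the output is to tally how often each medication appears. The sketch below assumes the CSV has a `DESCRIPTION` column and that Synthea wrote it to `output/csv/medications.csv`; adjust both if your export differs.

```python
import csv
from collections import Counter
from pathlib import Path


def count_medications(csv_path, column="DESCRIPTION"):
    """Count rows per medication in a Synthea medications CSV."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return Counter(row[column] for row in csv.DictReader(handle))


if __name__ == "__main__":
    path = Path("output/csv/medications.csv")  # assumed default location
    if path.exists():
        for description, count in count_medications(path).most_common(10):
            print(count, description)
```

With MDT in place, the tally should show several products where the original module produced only one.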
Please see `docs/validation` for a Python notebook that can be used to validate Synthea + MDT patient populations against MEPS patient populations.