This code implements the large language model (LLM) assertion pipeline. Detailed instructions coming soon!
- Run `run_umls_synonym_ner.py` and `run_dataset_ner.py` to build NER datasets (targeted NER prompts are recommended over broad NER prompts for the NER dataset pull)
- (Optional, highly recommended) Run `run_ner_cosine_similarity.py` followed by `run_llm_filter_cosine_sim_ner_output.py` to filter the NER outputs and remove low-yield named entities (it also helps to review the filtered NER outputs and remove any that are not related to your target entity)
- Run `run_extraction.py` to build the target-matcher and extract high-yield text from clinical notes
- Run `run_llm_assertion.py` to generate LLM assertions
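
Until detailed instructions are available, the order of the steps above can be sketched as a small driver script. This is only an illustration: the helper names `build_commands` and `run_pipeline` are hypothetical, and it assumes that each script lives in one directory and can be launched without command-line arguments, which this README does not confirm. Check each script for required arguments or configuration before running it this way.

```python
import subprocess
import sys
from pathlib import Path

# Script names and their order are taken from the steps listed above.
NER_DATASET_STEPS = [
    "run_umls_synonym_ner.py",
    "run_dataset_ner.py",
]
# Optional (highly recommended) filtering of the NER outputs.
OPTIONAL_FILTER_STEPS = [
    "run_ner_cosine_similarity.py",
    "run_llm_filter_cosine_sim_ner_output.py",
]
EXTRACTION_AND_ASSERTION_STEPS = [
    "run_extraction.py",
    "run_llm_assertion.py",
]


def build_commands(include_filter=True, script_dir="."):
    """Return one command per step, in pipeline order (hypothetical helper).

    Assumption: the scripts take no command-line arguments.
    """
    steps = list(NER_DATASET_STEPS)
    if include_filter:
        steps += OPTIONAL_FILTER_STEPS
    steps += EXTRACTION_AND_ASSERTION_STEPS
    return [[sys.executable, str(Path(script_dir) / name)] for name in steps]


def run_pipeline(include_filter=True, script_dir=".", dry_run=False):
    """Run each step in order, stopping at the first script that fails."""
    for cmd in build_commands(include_filter, script_dir):
        print("Running:", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: print the commands without executing anything.
    run_pipeline(dry_run=True)
```

Because reviewing the filtered NER outputs is a manual step, it may be more practical to run the pipeline in stages (for example, stop after the filter scripts, review the output, then run extraction and assertion) rather than end to end.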