kaushikacharya/clinical_occupation_recognition

MEDDOPROF: MEDical DOcuments PROFessions recognition shared task

Tasks

  • MEDDOPROF-NER: Named Entity Recognition to extract entities related to occupation and employment status.
  • MEDDOPROF-NORM: Normalize the entities to codes.
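To make the NER task concrete, here is a minimal sketch of how character-level entity annotations can be turned into per-token BIO tags, a common input format for sequence labelers. The whitespace tokenizer, the helper names, and the label name `PROFESION` are illustrative assumptions, not taken from this repository.

```python
from typing import List, Tuple

Token = Tuple[str, int, int]   # (text, start offset, end offset)
Span = Tuple[int, int, str]    # (start offset, end offset, label)


def whitespace_tokenize(text: str) -> List[Token]:
    """Split on whitespace and keep character offsets for each token."""
    tokens = []
    pos = 0
    for piece in text.split():
        start = text.index(piece, pos)
        end = start + len(piece)
        tokens.append((piece, start, end))
        pos = end
    return tokens


def spans_to_bio(tokens: List[Token], spans: List[Span]) -> List[str]:
    """Assign B-/I-/O tags to tokens that fall inside annotated spans."""
    tags = []
    for _, tok_start, tok_end in tokens:
        tag = "O"
        for span_start, span_end, label in spans:
            if tok_start >= span_start and tok_end <= span_end:
                prefix = "B" if tok_start == span_start else "I"
                tag = f"{prefix}-{label}"
                break
        tags.append(tag)
    return tags


# Hypothetical example sentence and annotation (not from the corpus).
text = "Trabaja como auxiliar administrativo"
mention = "auxiliar administrativo"
start = text.index(mention)
spans = [(start, start + len(mention), "PROFESION")]

tokens = whitespace_tokenize(text)
for (word, _, _), tag in zip(tokens, spans_to_bio(tokens, spans)):
    print(word, tag)
```

A real pipeline would likely need a tokenizer that also handles punctuation attached to words, since a token that only partially overlaps a span is tagged `O` in this sketch.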

Algorithms

  • NER: Conditional Random Fields (CRF) using hand-crafted features.
  • NORM: Vector embedding similarity.
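The two approaches above can be sketched roughly as follows. This is not the repository's implementation: the feature set, the function names, the codes `CODE_A`/`CODE_B`, and the toy vectors are all assumptions chosen for illustration. The per-token feature dictionaries follow the shape that CRF toolkits such as sklearn-crfsuite accept, and normalization is shown as a nearest-neighbour lookup by cosine similarity.

```python
import math
from typing import Dict, List


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Hand-crafted features for token i, using the word and its neighbours."""
    word = tokens[i]
    feats: Dict[str, object] = {
        "bias": 1.0,
        "lower": word.lower(),
        "suffix3": word[-3:].lower(),
        "is_title": word.istitle(),
        "is_digit": word.isdigit(),
    }
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_code(mention_vec: List[float], code_vecs: Dict[str, List[float]]) -> str:
    """Return the code whose embedding is most similar to the mention."""
    return max(code_vecs, key=lambda code: cosine(mention_vec, code_vecs[code]))


# Toy data: in practice the vectors would come from an embedding model.
sentence = ["Trabaja", "como", "enfermera"]
print(token_features(sentence, 2))

code_vecs = {"CODE_A": [1.0, 0.0, 0.0], "CODE_B": [0.0, 1.0, 0.2]}
print(nearest_code([0.1, 0.9, 0.1], code_vecs))
```

One feature dictionary per token, grouped by sentence, is what a CRF trainer would consume alongside the gold tag sequence.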

Shared Task Webpage

https://temu.bsc.es/meddoprof/

Data

  • Genre: Medical documents
  • Language: Spanish

Sample

[Image: Sample]

Dependencies

  $ pip install -r requirements.txt

How to run?

Example commands:

  • Train model:
        python -u -m src.crf --train_model <path_trained_model> --flag_train
  • Predict:
        python -u -m src.crf --train_model <path_trained_model> --flag_predict
  • Evaluate

Results

MEDDOPROF-NER Micro-average metrics

Metrics/Split   Train   Test
Precision       0.953   0.807
Recall          0.839   0.524
F-score         0.892   0.635

MEDDOPROF-NORM Micro-average metrics

Metrics/Split   Train   Test
Precision       0.956   0.720
Recall          0.840   0.467
F-score         0.894   0.566
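For readers unfamiliar with micro-averaging, the sketch below shows the usual computation: true positives, false positives, and false negatives are summed over all documents before precision, recall, and F-score are derived. The per-document counts are made up for illustration, and this is not the shared task's official evaluation script.

```python
from typing import Iterable, Tuple


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def micro_prf(counts: Iterable[Tuple[int, int, int]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, F1 from per-document (tp, fp, fn)."""
    tp = fp = fn = 0
    for doc_tp, doc_fp, doc_fn in counts:
        tp += doc_tp
        fp += doc_fp
        fn += doc_fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_score(precision, recall)


# Hypothetical counts for two documents: (tp, fp, fn).
p, r, f = micro_prf([(8, 1, 2), (4, 2, 5)])
print(f"precision={p:.3f} recall={r:.3f} f-score={f:.3f}")
```

As a sanity check, `f_score(0.953, 0.839)` rounds to 0.892, matching the NER train F-score reported above.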

Leaderboard

[Image: Leaderboard]

Task Description Paper

NLP applied to occupational health: MEDDOPROF shared task at IberLEF 2021 on automatic recognition, classification and normalization of professions and occupations from medical texts by Salvador Lima Lopez et al.

Publication

Occupation Recognition and Normalization in Clinical Notes by Kaushik Acharya

Related Work

Spanish to English translation using Neural Machine Translation
