ChemNER: Fine-Grained Chemistry Named Entity Recognition with Ontology-Guided Distant Supervision

Code

The code for generating the distant supervision is in corpus.ipynb. Once the distantly labeled data has been generated, the next step is to train a standard sequence labeling model (for example Bi-LSTM, RoBERTa, or ChemBERTa) on it.
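To make the idea of distant supervision concrete, here is a minimal sketch of dictionary-based labeling that produces BIO tags for a sequence labeling model. It is a plain longest-match lookup and is not the ontology-guided procedure implemented in corpus.ipynb. The function name `distant_bio_labels`, the `LEXICON` entries, and the type names `CATALYST` and `REACTION` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def distant_bio_labels(
    tokens: List[str], lexicon: Dict[Tuple[str, ...], str]
) -> List[str]:
    """Tag tokens with BIO labels by greedy longest-match lookup in a lexicon."""
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    max_len = max((len(key) for key in lexicon), default=0)

    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so multi-word names win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(lowered[i:i + n])
            if span in lexicon:
                entity_type = lexicon[span]
                labels[i] = f"B-{entity_type}"
                for j in range(i + 1, i + n):
                    labels[j] = f"I-{entity_type}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels


# Hypothetical lexicon: lowercased token tuples mapped to illustrative types.
LEXICON = {
    ("palladium",): "CATALYST",
    ("suzuki", "coupling"): "REACTION",
}

tokens = "The Suzuki coupling uses a palladium catalyst".split()
labels = distant_bio_labels(tokens, LEXICON)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

The resulting token and label pairs are in the shape that most sequence labeling trainers expect, so the same format could feed any of the model families listed above.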

Data

The data is in the /data folder. The training data is too large to upload to this repository and is available as a separate download: CHEM_train.json. The human-annotated test data is in /data/CHEM_test_annotations.jsonl.
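A quick way to inspect the test annotations is to read the file one JSON object per line, as the .jsonl extension suggests. The sketch below makes no assumption about the field names inside each record and only prints the keys of the first one. The helper name `read_jsonl` is hypothetical, and the relative path assumes the script runs from the repository root.

```python
import json
import os
from typing import Any, Dict, Iterator


def read_jsonl(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


# Assumed location relative to the repository root.
TEST_PATH = os.path.join("data", "CHEM_test_annotations.jsonl")

if os.path.exists(TEST_PATH):
    first_record = next(read_jsonl(TEST_PATH), None)
    if first_record is not None:
        # Print the field names rather than assuming a schema.
        print(sorted(first_record.keys()))
```

Reading lazily with a generator keeps memory use flat, which matters more for the larger training file if it turns out to use the same line-oriented layout.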

Citation

@inproceedings{wang2021chemner,
  title={ChemNER: Fine-grained chemistry named entity recognition with ontology-guided distant supervision},
  author={Wang, Xuan and Hu, Vivian and Song, Xiangchen and Garg, Shweta and Xiao, Jinfeng and Han, Jiawei},
  booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
  year={2021}
}
