Offensive-comments

Implementation of data selection with multi-task learning to detect offensive comments. The idea is taken from the paper "Domain Adaptation with BERT-based Domain Classification and Data Selection".

The scikit-learn classification report:

               precision    recall  f1-score   support

non-offensive       0.81      0.95      0.88       212
    offensive       0.95      0.83      0.89       268

     accuracy                           0.88       480
    macro avg       0.88      0.89      0.88       480
 weighted avg       0.89      0.88      0.88       480
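
As a sanity check on how the summary rows relate to the per-class rows, the sketch below recomputes the macro and weighted averages from the two class rows of the report. It is only an illustration: the helper names `macro_avg` and `weighted_avg` are made up here and are not part of the repository, and the inputs are the two-decimal values printed above, so a recomputed figure can differ from the report in the last digit.

```python
# Per-class rows copied from the report above:
# label -> (precision, recall, f1, support)
rows = {
    "non-offensive": (0.81, 0.95, 0.88, 212),
    "offensive": (0.95, 0.83, 0.89, 268),
}

SUPPORT = 3  # index of the support column in each row


def macro_avg(rows, col):
    """Unweighted mean of one metric column over all classes."""
    return sum(r[col] for r in rows.values()) / len(rows)


def weighted_avg(rows, col):
    """Mean of one metric column, weighted by each class's support."""
    total = sum(r[SUPPORT] for r in rows.values())
    return sum(r[col] * r[SUPPORT] for r in rows.values()) / total


total_support = sum(r[SUPPORT] for r in rows.values())

for name, col in (("precision", 0), ("recall", 1), ("f1-score", 2)):
    print(f"{name:<10} macro={macro_avg(rows, col):.3f} "
          f"weighted={weighted_avg(rows, col):.3f}")
print("total support:", total_support)
```

The support-weighted recall is the same quantity as the overall accuracy, which is why both appear as 0.88 in the report.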

I cannot share the target dataset as it is owned by Eternio GmbH.

Branches:

domain-classifier : Trains BERT to estimate the probability that a comment from a source dataset (say, GermEval 2017) belongs to the Eternio dataset (the target dataset).
domain-adaptation-single-task : Supports fine-tuning the BERT model for a single task.
mtl : Supports fine-tuning the BERT model for multiple tasks. I referred to MT-DNN for this.
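
The domain-classifier branch yields, for each source comment, a probability of belonging to the target dataset. One way such scores could drive data selection is to keep only the source comments that look most like target data. The sketch below is an assumption about that step, not code from this repository: the function name `select_most_target_like`, the `top_k` parameter, and the example scores are all hypothetical.

```python
from typing import List, Tuple


def select_most_target_like(
    scored_comments: List[Tuple[str, float]], top_k: int
) -> List[str]:
    """Return the top_k comments ranked by domain-classifier probability.

    scored_comments holds (comment_text, p_target) pairs, where p_target is
    the probability that the comment belongs to the target dataset.
    """
    if top_k < 0:
        raise ValueError("top_k must be non-negative")
    # Sort by probability, highest first; sorted() is stable, so ties keep
    # their original order.
    ranked = sorted(scored_comments, key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in ranked[:top_k]]


# Hypothetical scores for three source-dataset comments.
scored = [
    ("source comment A", 0.10),
    ("source comment B", 0.90),
    ("source comment C", 0.50),
]
selected = select_most_target_like(scored, top_k=2)
print(selected)
```

A fixed probability threshold would be an alternative to `top_k`; which of the two the thesis uses is not stated in this README.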

The thesis document is available in this repository as msc_chilwant_nikhil.pdf.



Repository: nikhilbchilwant/Offensive-comments
