socialcomplab/semeval23-mcpt

mCPT at SemEval-2023 Task 3: Multilingual Label-Aware Contrastive Pre-Training of Transformers for Few- and Zero-shot Framing Detection

Alexander Ertl
Markus Reiter-Haas
Kevin Innerebner
Elisabeth Lex

This repository contains the code for the paper: mCPT at SemEval-2023 Task 3: Multilingual Label-Aware Contrastive Pre-Training of Transformers for Few- and Zero-shot Framing Detection (ACL Anthology, arXiv preprint).

TLDR: Our system (mCPT) employs a pre-training procedure for multilingual Transformers with a label-aware contrastive loss function to tackle the framing detection subtask of SemEval-2023 Task 3.
The challenge of the shared task lies in identifying a set of 14 frames when only few or zero samples are available, i.e., a multilingual multi-label few- and zero-shot setting.
To this end, we exploit two features of the task: (i) multi-label information and (ii) multilingual data for pre-training.
Our system placed first on the Spanish framing detection leaderboard.
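For illustration, the multi-label setting boils down to 14 independent binary decisions per text. The following is a hedged sketch of such a decision rule; the linear head, the 384-dimensional embedding size (the MiniLM base model's hidden size), and the 0.5 threshold are assumptions for illustration, not the released model or the paper's exact classifier:

import torch
import torch.nn as nn

NUM_FRAMES = 14  # frame inventory of the shared task

# Hypothetical linear head on top of pooled sentence embeddings.
head = nn.Linear(384, NUM_FRAMES)

embedding = torch.randn(1, 384)                       # a pooled sentence embedding
probs = torch.sigmoid(head(embedding))                # independent per-frame probabilities
predicted = (probs > 0.5).nonzero(as_tuple=True)[1]   # every frame above the threshold

Because the frames are not mutually exclusive, a sigmoid per label (rather than a softmax over labels) is the natural multi-label formulation.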

(Figure: mCPT overview)

Our contributions are:

  • C1: We adopt a multi-label contrastive loss function for natural language processing to optimize the embeddings of textual data (a loss sketch follows this list).
  • C2: We describe a two-phase multi-stage training procedure for multilingual scenarios with limited data, i.e., few- and zero-shot predictions.
  • C3: We demonstrate the effectiveness of our winning system for framing detection, supported by embedding and ablation studies.
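As referenced in C1, here is a minimal sketch of a label-aware contrastive loss in the spirit of supervised contrastive learning generalized to multi-label data. The positive-pair definition (sharing at least one frame label), the temperature value, and the function name are illustrative assumptions; the exact formulation is given in the paper:

import torch
import torch.nn.functional as F

def multilabel_contrastive_loss(embeddings, labels, temperature=0.1):
    """Sketch of a label-aware contrastive loss over a batch.

    embeddings: (N, D) float tensor of sentence embeddings
    labels:     (N, L) multi-hot tensor of frame labels
    """
    z = F.normalize(embeddings, dim=1)
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    # Temperature-scaled cosine similarities; exclude self-pairs.
    sim = (z @ z.T / temperature).masked_fill(self_mask, float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)  # keep the diagonal finite
    # Positives: pairs sharing at least one frame label (an assumption).
    pos = (labels.float() @ labels.float().T > 0) & ~self_mask
    per_anchor = -(log_prob * pos).sum(dim=1) / pos.sum(dim=1).clamp(min=1)
    # Average over anchors that have at least one positive.
    return per_anchor[pos.any(dim=1)].mean()

Pulling embeddings of texts with overlapping frame labels together (and pushing others apart) is what makes the pre-training label-aware rather than purely self-supervised.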

The pre-trained model is available on the Hugging Face organization page.

Model usage:

from transformers import AutoModel, AutoTokenizer

# Tokenizer of the multilingual base model; the encoder weights are our fine-tuned model.
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
model = AutoModel.from_pretrained("socialcomplab/mcpt-body-semeval2023task3")
print(model(**tokenizer("Test sentence.", return_tensors="pt")))
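The call above returns raw token-level outputs. To obtain a single sentence embedding, a common choice for sentence-transformers models is mean pooling over the attention mask. Below is a minimal sketch; the pooling strategy and the embed helper are assumptions based on the base model's usual usage, not necessarily the paper's exact pipeline:

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
model = AutoModel.from_pretrained("socialcomplab/mcpt-body-semeval2023task3")

def embed(sentences):
    # Tokenize a batch of sentences with padding so they share one tensor.
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        token_embeddings = model(**batch).last_hidden_state  # (B, T, D)
    # Mean-pool over tokens, ignoring padding via the attention mask.
    mask = batch["attention_mask"].unsqueeze(-1).float()
    return (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

embeddings = embed(["Test sentence.", "Another sentence."])
print(embeddings.shape)  # e.g. torch.Size([2, 384])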

The repository structure mirrors the paper's sections.
The mcpt package provides the basic components for reuse.
