Home

Welcome to the Wiki of the amazing Nala pipeline.

You can take a quick peek at why Nala was created in the first place: it started as a two-thesis project to build up tagtog.net.

Introductory talk:
Online presentation Thesis

Pipeline diagram: view the pipeline visualization externally at a higher resolution here. An editable version of the online diagram can be found here (requires logging in to, or creating, a Lucidchart account).

Goals of 2 theses and this method:

Study the significance of natural language (NL) mentions in mutation mention recognition

Ratio of standard vs. NL mentions in abstracts and full text
Percentage of novel mutations not present in SwissProt (would require manual annotation of protein)
Percentage of mutation mentions in natural language that do not also appear as a standard mention

Define/extend corpus of NLs

size depends on significance of NLs

Method for mutation mention extraction grounded to their genes/proteins

Mutation mention recognizer better than tmVar for standard mentions
If NLs are relevant, demonstrate good F1 performance (> 70-80%)
Simple or optionally advanced normalization method
Easy to use program:
- Good documentation:
  - code
  - end-user (biology researcher level, how to call from the command line, ...)
- Accepted inputs: programmatic call (string), text file, corpus formats
- Produced outputs: ann.json (suitable for tagtog)
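
The F1 target mentioned above can be made concrete with a small calculation. The following is only a minimal sketch, not Nala's evaluation code: the function name and the counts are hypothetical, and it assumes that predicted mentions have already been matched against gold annotations (the wiki does not say whether matching is exact or overlapping).

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall and F1 from mention-level counts.

    Hypothetical helper for illustration; the matching criterion that
    produces these counts is an assumption and is not defined here.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    # Guard against division by zero when nothing was predicted or annotated.
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    denominator = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Made-up counts: 80 correct mentions, 20 spurious, 30 missed.
p, r, f1 = precision_recall_f1(80, 20, 30)
print(f"P={p:.2%} R={r:.2%} F1={f1:.2%}")
```

With these made-up counts the F1 score lands inside the 70-80% band that the goal names as the threshold.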

Paper

Full draft (1 or 2 papers?) by the end of August, ready to submit to Burkhard Rost
Submit by September-October
