❗ Most of the functionality in this project is now available in the library clinlp: production-ready NLP pipelines for Dutch clinical text. Although the code here might still benefit some projects, this project itself is no longer maintained (and has therefore been archived).
This package bundles some functionality for applying NLP (preprocessing) techniques to clinical text in psychiatry. Specifically, it contains the following submodules:

- `preprocessing` -- Preprocessing text
- `spelling` -- Spelling correction
- `entity` -- Entity matching
- `context` -- Detecting properties of entities (e.g. negation, plausibility) based on context
These submodules are further documented in their respective READMEs, which you can find by following the links above.
Since some paths need to be initialized, installation is most easily done by downloading the source, modifying the paths in `psynlp/utils.py` (see Requirements below), and running:

```
pip install -r requirements.txt
python setup.py install
```
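As an illustration, the path configuration could look something like the snippet below. The variable names here are hypothetical placeholders, not necessarily the names `psynlp/utils.py` actually defines; check the file itself for the exact names it expects.

```python
# Hypothetical excerpt of psynlp/utils.py -- the variable names below are
# placeholders for illustration only; use the names the actual file defines.
SPACY_MODEL_NAME = "nl_core_news_sm"                       # spacy model (see Requirements)
WORD2VEC_PATH = "/path/to/word2vec.model"                  # gensim Word2Vec model
TOKEN_FREQUENCIES_PATH = "/path/to/token_frequencies.csv"  # ;-separated token/frequency csv
```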
The `psynlp` package has the following dependencies (automatically installed when using the commands above):

- `doublemetaphone`
- `gensim`
- `nltk`
- `pandas`
- `spacy`
Some functionality requires specific models, which are not included in the repository because of their privacy-sensitive nature. Their paths should be specified in `psynlp/utils.py`:

- A `spacy` model, which can be obtained here (e.g. `python -m spacy download nl_core_news_sm` for the standard Dutch model)
- A `gensim`-trained Word2Vec model, used for the `EmbeddingRanker` in the `spelling` module
- Token frequencies in the specific corpus, required for the `NoisyRanker`, in a `csv` file (`;`-separated, with a `token` and a `frequency` column); see the sketch below for one way to produce these resources
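For readers who need to produce the latter two resources themselves, the sketch below shows one way to do so with `gensim` and `pandas`. The corpus file name, tokenization, and model parameters are assumptions for illustration, not prescribed by `psynlp`.

```python
# Minimal sketch of preparing the corpus-specific resources listed above.
# Assumes a plain-text corpus at "corpus.txt" (one document per line); the
# file names and tokenization are placeholders, not part of psynlp itself.
from collections import Counter

import pandas as pd
from gensim.models import Word2Vec

# Tokenize the corpus (a whitespace split stands in for a real tokenizer).
with open("corpus.txt", encoding="utf-8") as f:
    sentences = [line.split() for line in f if line.strip()]

# Train and save a Word2Vec model, as used by the EmbeddingRanker.
w2v = Word2Vec(sentences=sentences, vector_size=100, min_count=2)
w2v.save("word2vec.model")

# Write the ;-separated token/frequency csv required by the NoisyRanker.
counts = Counter(token for sentence in sentences for token in sentence)
freqs = pd.DataFrame(sorted(counts.items()), columns=["token", "frequency"])
freqs.to_csv("token_frequencies.csv", sep=";", index=False)
```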
`psynlp` follows an object-oriented paradigm, much like the `sklearn` library for machine learning. To use the spelling correction from the `spelling` submodule, for instance, the following code can be used:
```python
from psynlp.spelling import SpellChecker

c = SpellChecker(spacy_model="your_spacy_model_name")
c.correct("Dit is een tekst met daarin een splefout")
>>> "Dit is een tekst met daarin een spelfout"
```
Usage is further documented in detail in the respective submodule READMEs.
Basic usage and the API of each submodule are documented in the submodule README. Additionally, some use cases are documented in the following notebooks (also referenced in the relevant submodule READMEs):

- `preprocessing.ipynb` -- Example code for preprocessing
- `spelling.ipynb` -- Example code for spelling correction
- `entity.ipynb` -- Example code for entity recognition
- `context.ipynb` -- Example code for context matching
- `example_pipeline.ipynb` -- Example code for extracting variables from text, using all four submodules
- Vincent Menger -- Conceptualization, developing code
- Nick Ermers -- Improving context detection