aiThanet/FNC-1

FNC-1 COMP9417 project

Fake News Challenge: FakeNewsChallenge.org.

This project starts from the baseline provided on GitHub.

Requirements

    python >= 3.7.0 (tested with 3.7.2)

Installation

  1. Install the required Python packages.

    pip install -r requirements.txt --upgrade
    
  2. Parts of the Natural Language Toolkit (NLTK) might need to be installed manually.

    python3 -c "import nltk; nltk.download('stopwords'); nltk.download('punkt'); nltk.download('wordnet')"
    
  3. To reproduce our results, use the features and models we provide, which have already been generated. If you keep them, skip to step 6. To regenerate them yourself, delete all files in the features and models directories and continue with steps 4 and 5.

  4. To generate the named-entity feature, you need a running CoreNLP server, version 3.9.2: download Stanford CoreNLP, extract it anywhere, and execute the following command in the corenlp directory (generating the feature takes about 5 hours on the dev environment):

    java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9020
    
  5. To generate the doc2vec feature, you need two paragraph-vector models in the models directory, named h_d2v.model and b_d2v.model. You can generate them with doc2vecModelGenerator.py or use the ones we have already generated.

  6. Run the final classifier to generate the model (if the features or models do not exist, the script generates them automatically).

    python3 FinalClassifier.py
    
  7. XGBoostClassifier.py is an older version of the project that classifies all 4 classes with a single XGBoost model.
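
Before running FinalClassifier.py, it can help to check which of the steps above still apply. The following is a minimal sketch, not part of the repository: the helper names (`has_files`, `missing_d2v_models`, `port_is_open`) are hypothetical, and it assumes the features and models directories sit in the current working directory and that the CoreNLP server runs on localhost.

```python
import socket
from pathlib import Path

# Directory names, model file names, and the port come from the steps above.
# Resolving the directories relative to the current directory is an assumption.
FEATURES_DIR = Path("features")
MODELS_DIR = Path("models")
D2V_MODELS = ("h_d2v.model", "b_d2v.model")
CORENLP_PORT = 9020


def has_files(directory: Path) -> bool:
    """Return True if the directory exists and contains at least one file."""
    return directory.is_dir() and any(p.is_file() for p in directory.iterdir())


def missing_d2v_models(models_dir: Path) -> list:
    """Return the doc2vec model file names that are absent from models_dir."""
    return [name for name in D2V_MODELS if not (models_dir / name).is_file()]


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if has_files(FEATURES_DIR) and has_files(MODELS_DIR):
        print("Cached features and models found: you can skip to step 6.")
    else:
        print("Features or models missing: they will need to be generated.")
        for name in missing_d2v_models(MODELS_DIR):
            print(f"doc2vec model not found: {name} (see step 5)")
        # Assumes the server was started locally as in step 4.
        if not port_is_open("localhost", CORENLP_PORT):
            print("CoreNLP server not reachable on port 9020 (see step 4).")
```

The port check only confirms that something is listening on port 9020; it does not verify that the listener is a CoreNLP server or which version it is.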

Report

Report.
