A web application developed as part of coursework for CSC532 (Machine Learning), capable of automatically detecting Personally Identifiable Information (PII) in student writing

jedipw/PIIDataDetector

PII Data Detector

The Personally Identifiable Information (PII) Data Detector is an individual machine learning project developed for the CSC532 Machine Learning course. Its goal is to detect PII in student writing. In the web application, users enter text in a text editor; the application then highlights the words it considers PII and suggests removing them. Users can also save the text to view later.
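The highlighting step described above can be illustrated with a minimal sketch. It assumes the detector produces one label per token, with "O" meaning "not PII"; this README does not describe the model's actual output format, so the label names, the `[[ ]]` markers, and the helper functions `group_pii_spans` and `highlight` are all hypothetical.

```python
from typing import List, Tuple


def group_pii_spans(labels: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) token ranges, end exclusive, for runs of non-"O" labels.

    Assumption: any label other than "O" marks PII. Adjacent PII tokens are
    merged into one span even if their labels differ.
    """
    spans = []
    start = None
    for i, label in enumerate(labels):
        if label != "O" and start is None:
            start = i  # a PII run begins here
        elif label == "O" and start is not None:
            spans.append((start, i))  # the run ended at the previous token
            start = None
    if start is not None:
        spans.append((start, len(labels)))  # run reaches the end of the text
    return spans


def highlight(tokens: List[str], labels: List[str],
              open_mark: str = "[[", close_mark: str = "]]") -> str:
    """Join tokens with spaces, wrapping each PII span in marker strings."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    span_end = dict(group_pii_spans(labels))  # maps span start -> span end
    parts = []
    i = 0
    while i < len(tokens):
        if i in span_end:
            end = span_end[i]
            parts.append(open_mark + " ".join(tokens[i:end]) + close_mark)
            i = end
        else:
            parts.append(tokens[i])
            i += 1
    return " ".join(parts)


if __name__ == "__main__":
    tokens = ["My", "name", "is", "Jane", "Doe", "."]
    labels = ["O", "O", "O", "NAME", "NAME", "O"]  # hypothetical label set
    print(highlight(tokens, labels))
```

A real front end would likely map spans back to character offsets rather than rejoin tokens with spaces; the sketch only shows the grouping idea.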

Technology Stack

Programming Languages:

  • Python
  • TypeScript
  • Go

AI/Data Science Tools:

  • PEFT
  • spaCy
  • Transformers
  • Gemma
  • Faker
  • NumPy
  • Pandas
  • Matplotlib
  • Seaborn

Development Tools:

  • Web Application: Next.js
  • Backend APIs: Go Fiber, Flask
  • Database: PostgreSQL
  • Database ORM: Prisma
  • 3rd Party API: Firebase Authentication
  • Container Management: Docker
  • Hosting: Google Cloud Run
  • CI/CD: GitHub Actions

Screenshot

[Image: PIIDataDetector_Screenshot]

First Time Setup

PII Data Detector

  1. Download model.safetensors from https://drive.google.com/file/d/19gw8qc6TlHQb5Ag2Ke_e2vEPfVGCRrW3/view?usp=sharing and place it in /pii_data_detector/model.
  2. In the terminal, navigate to the /pii_data_detector directory.
  3. Run pip install -r requirements.txt in the terminal.
  4. Run python main.py in the terminal to start the server.
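Before starting the server, it can help to confirm that the downloaded weights are where step 1 says they should be. The following is a minimal sketch, not part of the repository: the helper name `missing_model_files` is hypothetical, and it assumes the script is given the repository root as its starting directory.

```python
from pathlib import Path
from typing import Iterable, List


def missing_model_files(repo_root: str,
                        required: Iterable[str] = ("model.safetensors",)) -> List[str]:
    """Return the names of required files absent from pii_data_detector/model.

    Only model.safetensors is named in the setup steps; anything else passed
    in `required` is the caller's assumption.
    """
    model_dir = Path(repo_root) / "pii_data_detector" / "model"
    return [name for name in required if not (model_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files(".")  # assumes the current directory is the repo root
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("Model files found.")
```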

Backend

  1. In the terminal, navigate to the /backend directory.
  2. Create a .env file in that directory (see .env.example for the expected variables).
  3. Run go run github.com/steebchen/prisma-client-go db push in the terminal.
  4. Run go run server.go in the terminal to start the server.

Frontend

  1. In the terminal, navigate to the /frontend directory.
  2. Create a .env.local file in that directory (see .env.example for the expected variables).
  3. Run npm i in the terminal.
  4. Run npm run dev in the terminal to start the server.

Note 1: You may need to install additional packages or libraries.
Note 2: All three servers must be running for the web application to be fully functional.
Note 3: You must set up the PostgreSQL database and Firebase Authentication before running the application.
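Since all three servers must be up at once, a quick reachability check can save some debugging. This is a minimal sketch under stated assumptions: the port numbers are placeholders, because this README does not say which ports the servers listen on, and the helper names `is_listening` and `down_services` are hypothetical. Replace the ports with the values from your own configuration.

```python
import socket
from typing import Dict, List, Tuple


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def down_services(services: Dict[str, Tuple[str, int]]) -> List[str]:
    """Return the names of services that do not accept TCP connections."""
    return [name for name, (host, port) in services.items()
            if not is_listening(host, port)]


if __name__ == "__main__":
    # Placeholder ports: assumptions for illustration, not taken from the repo.
    services = {
        "pii_data_detector": ("127.0.0.1", 5000),
        "backend": ("127.0.0.1", 8080),
        "frontend": ("127.0.0.1", 3000),
    }
    down = down_services(services)
    print("Not reachable:", ", ".join(down) if down else "none")
```

An open port only shows that something is listening there; it does not confirm that the database or Firebase Authentication is configured correctly.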

More Information

For more information, please refer to the "Wiki" section at https://github.com/jedipw/PIIDataDetector/wiki.
