Skip to content
This repository has been archived by the owner on Aug 15, 2024. It is now read-only.

Office-of-Digital-Services/ca-jobs-schema

 
 

Repository files navigation

California Jobs Schema

###Exploring ALL California Job Classification Schemas!

Exploratory project intended to make CalHR job classification data useful in parsed form; this will enable enhanced sorting, in-depth analysis, ladder analysis, cross-referencing, clustering and more.

All collaboration is welcome, specifically re: pdf conversion to txt (see known issues), front end, sqlite db (tiny, ~10,000 rows), content copy

####Currently:

  • Uses wget to download the most recent version of the alphabetic jobs classification schema
  • Python parsing of all fields using string manipulation and export to txt, csv, etc.

####Known Issues:

  • Requires consistent, automated way to convert PDF to raw text
  • Have considered xpdf, pdftotxt, ghostscript, ps2ascii; all work.. but text recognition formats are inconsistent, even without considering OCR
  • This is still a manual process as of this posting

####Future Plans:

  • Cluster job classifications by schema, in order to demonstrate promotional opportunities by strict ladder classification
  • Provide visual representation of job clusters and potential lateral promotions between distinct ladders
  • Potentially via python scikit-learn or scipy.cluster, GNU gephi
  • Integrate schematic arrangement of classes via http://www.calhr.ca.gov/Pay%20Scales%20Library/PS_Sec_16.pdf

About

Exploring ALL California Job Classification Schemas

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Jupyter Notebook 98.5%
  • CoffeeScript 0.6%
  • Python 0.2%
  • CSS 0.2%
  • Ruby 0.2%
  • HTML 0.2%
  • JavaScript 0.1%