geooff/cia_world_factbook_api

 
 

CIA World Factbook API

Converts the CIA World Factbook into a JSON data structure.

Data

  • Latest - approx. 3 MB - updated weekly
  • Historical - approx. 30 MB - last updated 2017-08-31
  • HTML Archives - approx. 240 MB - last updated 2017-08-31

Usage

If you just want the latest data, get it using the Latest link above.

If you also want to get the full historical data set, use the Historical link above.
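
Once a data file is downloaded, a quick way to see what it contains is to list its top-level keys. The following is a minimal sketch, not part of this repository: it assumes the file's top level is a JSON object, and the file name factbook.json is a placeholder for wherever you saved the download. Values are kept as raw JSON, so no schema is assumed.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"sort"
)

// topLevelKeys decodes a JSON object and returns its top-level keys, sorted.
// Values are left as json.RawMessage, so no particular schema is assumed.
// It returns an error if the input is not valid JSON or not a JSON object.
func topLevelKeys(data []byte) ([]string, error) {
	var doc map[string]json.RawMessage
	if err := json.Unmarshal(data, &doc); err != nil {
		return nil, fmt.Errorf("decode factbook JSON: %w", err)
	}
	keys := make([]string, 0, len(doc))
	for k := range doc {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	// "factbook.json" is a placeholder path, not a name defined by this project.
	data, err := os.ReadFile("factbook.json")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	keys, err := topLevelKeys(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, k := range keys {
		fmt.Println(k)
	}
}
```

From there, any key of interest can be decoded further with a second json.Unmarshal call on its raw value.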

If you want to parse the Factbook HTML into JSON yourself:

  • clone this repository to your local machine.
  • download the HTML Archives above.
  • edit config.json with the paths to your downloaded HTML archives.
  • execute go run parse_html_to_json.go to convert each country's HTML page to a JSON structure.
  • execute go run create_weekly_json_files.go to combine the individual country files into a week-by-week data file.

If you want to fetch the HTML files yourself and then parse them:

  • clone this repository to your local machine.
  • edit config.json with the paths to use for the downloaded HTML archives.
  • execute python fetch.py to fetch the historical HTML files from archive.org (this will take several days).
  • execute go run parse_html_to_json.go to convert each country's HTML page to a JSON structure.
  • execute go run create_weekly_json_files.go to combine the individual country files into a week-by-week data file.
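
The command sequences above can also be driven from a small wrapper that stops at the first failing step. This is only a sketch under assumptions: it is not part of this repository, it assumes it is started from the directory containing the scripts with config.json already edited, and the "fetch" command-line argument is a hypothetical switch introduced here for illustration.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// step is one external command in the pipeline.
type step struct {
	name string
	args []string
}

// pipelineSteps returns the commands listed in this README, in order.
// When fetch is true, the archive.org download step is run first.
func pipelineSteps(fetch bool) []step {
	steps := []step{
		{"go", []string{"run", "parse_html_to_json.go"}},
		{"go", []string{"run", "create_weekly_json_files.go"}},
	}
	if fetch {
		steps = append([]step{{"python", []string{"fetch.py"}}}, steps...)
	}
	return steps
}

// runAll runs each step in order and stops at the first error.
func runAll(steps []step, run func(step) error) error {
	for _, s := range steps {
		if err := run(s); err != nil {
			return fmt.Errorf("%s %v: %w", s.name, s.args, err)
		}
	}
	return nil
}

// execStep runs a step as a child process, forwarding its output.
func execStep(s step) error {
	cmd := exec.Command(s.name, s.args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	// "fetch" is a hypothetical argument of this sketch, not a project flag.
	fetch := len(os.Args) > 1 && os.Args[1] == "fetch"
	if err := runAll(pipelineSteps(fetch), execStep); err != nil {
		fmt.Fprintln(os.Stderr, "pipeline failed:", err)
		os.Exit(1)
	}
}
```

Passing the runner as a function keeps the ordering logic separate from process execution, so the sequence can be checked without actually starting the multi-day fetch.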

Tests

  • clone this repository to your local machine.
  • cd cwf/src/country
  • go test

Contributing

Contributions are most welcome.

Reporting Issues

Please report issues using the Issues tab at the top of this page.

Pull Requests

If you modify the code please submit a pull request for review.

Most of the parsing logic is in src/country in the files page.go and string_conversions.go.

If the parser is modified, please update the VERSION constant in country/page.go.
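
For contributors new to the parsing code, here is an illustration of the kind of string conversion such a parser might need: turning a formatted text value into a number plus its units. It is a hypothetical sketch and is not taken from string_conversions.go; the function name parseQuantity, its signature, and the sample input are all invented for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseQuantity is a hypothetical helper (not from this repository).
// It splits a value such as "1,234.5 sq km" into a number and a units string:
// everything before the first space is treated as the number, with thousands
// separators removed; everything after it is returned as the units.
func parseQuantity(s string) (float64, string, error) {
	s = strings.TrimSpace(s)
	num, units, _ := strings.Cut(s, " ")
	num = strings.ReplaceAll(num, ",", "")
	value, err := strconv.ParseFloat(num, 64)
	if err != nil {
		return 0, "", fmt.Errorf("parse quantity %q: %w", s, err)
	}
	return value, strings.TrimSpace(units), nil
}

func main() {
	// The input below is a made-up sample, not real Factbook data.
	value, units, err := parseQuantity("1,234.5 sq km")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(value, units)
}
```

A change of this kind alters the generated JSON, which is why the VERSION constant should be updated alongside it.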

License

MIT - see LICENSE
