This is a simple Python client for the Media Cloud CLIFF-CLAVIN geocoder.
To use this library with a CLIFF server you already have running, first install it:
pip install mediacloud-cliff
Then instantiate and use it like this:
from cliff.api import Cliff
my_cliff = Cliff('http://myserver.com:8080')
my_cliff.parse_text("This is about Einstien at the IIT in New Delhi.")
This will return results like this:
{
  "results": {
    "organizations": [
      {
        "count": 1,
        "name": "IIT"
      }
    ],
    "places": {
      "focus": {
        "cities": [
          {
            "id": 1261481,
            "lon": 77.22445,
            "name": "New Delhi",
            "score": 1,
            "countryGeoNameId": "1269750",
            "countryCode": "IN",
            "featureCode": "PPLC",
            "featureClass": "P",
            "stateCode": "07",
            "lat": 28.63576,
            "stateGeoNameId": "1273293",
            "population": 317797
          }
        ],
        "states": [
          {
            "id": 1273293,
            "lon": 77.1,
            "name": "National Capital Territory of Delhi",
            "score": 1,
            "countryGeoNameId": "1269750",
            "countryCode": "IN",
            "featureCode": "ADM1",
            "featureClass": "A",
            "stateCode": "07",
            "lat": 28.6667,
            "stateGeoNameId": "1273293",
            "population": 16787941
          }
        ],
        "countries": [
          {
            "id": 1269750,
            "lon": 79,
            "name": "Republic of India",
            "score": 1,
            "countryGeoNameId": "1269750",
            "countryCode": "IN",
            "featureCode": "PCLI",
            "featureClass": "A",
            "stateCode": "00",
            "lat": 22,
            "stateGeoNameId": "",
            "population": 1173108018
          }
        ]
      },
      "mentions": [
        {
          "id": 1261481,
          "lon": 77.22445,
          "source": {
            "charIndex": 37,
            "string": "New Delhi"
          },
          "name": "New Delhi",
          "countryGeoNameId": "1269750",
          "countryCode": "IN",
          "featureCode": "PPLC",
          "featureClass": "P",
          "stateCode": "07",
          "confidence": 1,
          "lat": 28.63576,
          "stateGeoNameId": "1273293",
          "population": 317797
        }
      ]
    },
    "people": [
      {
        "count": 1,
        "name": "Einstien"
      }
    ]
  },
  "status": "ok",
  "milliseconds": 22,
  "version": "2.6.1"
}
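To pull specific fields out of a response shaped like the one above, a small helper can walk the places section. This is only a sketch: mention_coordinates is a hypothetical helper, not part of this client, and the sample dictionary is a trimmed copy of the example output so the snippet runs without a CLIFF server.

```python
def mention_coordinates(response):
    """Return (name, lat, lon) tuples for each place mention in a CLIFF response.

    Hypothetical helper, not part of mediacloud-cliff. It assumes the
    response layout shown in the example output above.
    """
    if response.get("status") != "ok":
        raise ValueError("CLIFF reported a non-ok status: %r" % response.get("status"))
    mentions = response.get("results", {}).get("places", {}).get("mentions", [])
    return [(m["name"], m["lat"], m["lon"]) for m in mentions]


# Trimmed copy of the example output above, used as stand-in data
sample = {
    "results": {
        "places": {
            "mentions": [
                {"id": 1261481, "name": "New Delhi", "lat": 28.63576, "lon": 77.22445}
            ]
        }
    },
    "status": "ok",
}

print(mention_coordinates(sample))
```

With a live server, the dictionary returned by parse_text could be passed in place of sample, assuming it has the same shape as the output above.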
You can also look up a single record in the GeoNames database inside CLIFF by its GeoNames ID:
from cliff.api import Cliff
my_cliff = Cliff('http://myserver.com:8080')
my_cliff.geonames_lookup(4943351)
This will give you results like this:
{
  "results": {
    "id": 4943351,
    "lon": -71.09172,
    "name": "Massachusetts Institute of Technology",
    "countryGeoNameId": "6252001",
    "countryCode": "US",
    "featureCode": "SCH",
    "featureClass": "S",
    "parent": {
      "id": 4943909,
      "lon": -71.39184,
      "name": "Middlesex County",
      "countryGeoNameId": "6252001",
      "countryCode": "US",
      "featureCode": "ADM2",
      "featureClass": "A",
      "parent": {
        "id": 6254926,
        "lon": -71.10832,
        "name": "Massachusetts",
        "countryGeoNameId": "6252001",
        "countryCode": "US",
        "featureCode": "ADM1",
        "featureClass": "A",
        "parent": {
          "id": 6252001,
          "lon": -98.5,
          "name": "United States",
          "countryGeoNameId": "6252001",
          "countryCode": "US",
          "featureCode": "PCLI",
          "featureClass": "A",
          "stateCode": "00",
          "lat": 39.76,
          "stateGeoNameId": "",
          "population": 310232863
        },
        "stateCode": "MA",
        "lat": 42.36565,
        "stateGeoNameId": "6254926",
        "population": 6433422
      },
      "stateCode": "MA",
      "lat": 42.48555,
      "stateGeoNameId": "6254926",
      "population": 1503085
    },
    "stateCode": "MA",
    "lat": 42.35954,
    "stateGeoNameId": "6254926",
    "population": 0
  },
  "status": "ok",
  "version": "2.6.1"
}
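The nested parent records in a lookup result form a chain from the place up to its country. A minimal sketch of flattening that chain follows. The ancestry function is a hypothetical helper, not part of this client, and the sample data keeps only the name and parent keys from the output above.

```python
def ancestry(place):
    """Return place names from the given record up through its parents.

    Hypothetical helper. It assumes each record may carry a nested
    "parent" record, as in the example output above.
    """
    names = []
    while place is not None:
        names.append(place["name"])
        place = place.get("parent")  # None once the top of the chain is reached
    return names


# Trimmed copy of the "results" value from the example output above
sample_result = {
    "name": "Massachusetts Institute of Technology",
    "parent": {
        "name": "Middlesex County",
        "parent": {
            "name": "Massachusetts",
            "parent": {"name": "United States"},
        },
    },
}

print(" > ".join(ancestry(sample_result)))
```

With a live server, the "results" value returned by geonames_lookup could be passed in place of sample_result, assuming it matches the shape shown above.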
To work on this API client, first clone the source repo from GitHub and install the dependencies:
make install
Then make a .env file in this directory and put the URL to your CLIFF server in it:
CLIFF_URL=http://localhost:8080
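How the project's own tests read this file is not shown here. As one possible approach, the sketch below parses simple KEY=VALUE lines with the standard library only. Both read_env_file and cliff_url are hypothetical names introduced for illustration.

```python
import os


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines into a dict, skipping blanks and # comments."""
    values = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values


def cliff_url(path=".env"):
    """Prefer a CLIFF_URL environment variable, then fall back to the .env file."""
    env_value = os.environ.get("CLIFF_URL")
    if env_value:
        return env_value
    if os.path.exists(path):
        return read_env_file(path).get("CLIFF_URL")
    return None
```

The value returned by cliff_url() could then be passed to Cliff(...) in the same way as the hard-coded URL in the earlier examples.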
To publish a new release:
- Run make test to make sure all the tests pass
- Update the version number in cliff/__init__.py
- Make a brief note in the version history section below about the changes
- Run make build-release to create an install package
- Run make release-test to upload it to PyPI's test platform
- Run make release to upload it to PyPI
Version history:
- v2.6.1: upgrade to CLIFF v2.6.1 (internal build changes)
- v2.6.0: upgrade to CLIFF v2.6.0 (adds multi-lingual support at query level and upgrades NER models)
- v2.5.0: upgrade to CLIFF v2.5.0 (and keep version numbers roughly in sync)
- v2.1.0: upgrade to CLIFF v2.4.2
- v2.0.2: update examples in readme file
- v2.0.1: init with url instead of host/port
- v2.0.0: move to mediacloud naming, underscored method names, remove deprecated NLP endpoint
- v1.4.0: upgrade to CLIFF v2.4.1, add support for extractContent endpoint
- v1.3.1: updates for python3
- v1.3.0: updates for python3, support for client-side text replacements
- v1.2.0: points at CLIFF v2.3.0 (updates Stanford NER & has new plugin architecture)
- v1.1.0: points at CLIFF v2.2.0 (adds ancestry to geonamesLookup helper)
- v1.0.2: first release to PyPI