
Geospatial Data APIs

Jeffrey K Gillan edited this page Jan 19, 2024 · 97 revisions

Session Summary

For our inaugural session in the NextGen Geospatial Data Science workshop series we will start with Data APIs. We will discuss:

  1. What they are
  2. Why you would use one, and
  3. How to use one, with a hands-on example using satellite imagery from Planet. We will be using the Python scripting language in a Google Colab Jupyter Notebook to access and visualize Planet imagery. Open Colab Notebook

This session, as with the whole workshop series, is targeted to researchers who have some prior geospatial experience (GIS/remote sensing) but may have limited experience with scripting, coding, or 'data science'. Our goal is to demystify vocabulary and show you how to use these tools with straightforward examples. If you are an experienced data scientist or developer looking to learn about geospatial, we are happy to have you, but this may feel a little slow to you. These workshop sessions are only 1 hour in length, so we can only scratch the surface and get you introduced to these tools. We won't be able to dive deep into any one topic.

Watch a Recording of the Workshop

Example Image

Prerequisites

  • If you come to the session in person, please bring your laptop if you want to code along together.

  • You need a Google account in order to access and use Colab. Anyone with an 'arizona.edu' email address should have direct access.

  • Please have a Planet account prior to the session. You will need this to find your API key.

The UA Institute for Computation and Data-Enabled Insight (ICDI) has purchased a campus-wide license so all UofA students, staff, and faculty can access tons of satellite data products for free! Please click here to learn how to get your account and start getting imagery.

Finding Your API Key

  1. Log on to your Planet account at https://www.planet.com/login. You should land in your user dashboard.


  2. Click on 'My Settings'. On this page, you should see your API Key. Copy this as you will use it in Colab.

...and now to the lesson.




Geospatial Data is Everywhere Online





GIS and remotely sensed data have proliferated online. From drone imagery, to stream networks, to land cover classifications, to wildfire, to wildlife, to human population, there is spatial data to represent every conceivable phenomenon that can be put on a map. In the earth observation space, there are around 1600 satellites orbiting Earth and snapping images around the clock in a variety of spectral bands.

The imagery company Planet operates more than 200 satellites, which together cover most of the Earth's landmass every single day. Their flagship imagery product, called PlanetScope, consists of 4 bands (blue, green, red, near-infrared) and has 3-meter spatial resolution.




Data Delivery

The Typical Way

The vast majority of online geospatial data is downloaded through a graphical website. And for good reason: it's just easy! Data repositories such as Earth Explorer, and many others, have powerful and intuitive tools for searching and downloading data. Many of them combine a map interface with filters and search terms. For most geospatial data seekers, these tools work perfectly.



The Data Science Way

If you are reading this material or attending this workshop, you may be in a select group that wants to do things the hard way. But with great challenge comes great reward! Many geospatial datasets can be searched for and downloaded through scripting languages. This allows you to automate repetitive tasks and scale your imagery analysis well beyond what is possible with point-and-click computing.



Application Programming Interfaces (APIs)

Increasingly, geospatial data are available through web application programming interfaces (APIs). The term API can be used in several different contexts, but for our purposes here let's define it as:

a set of protocols for communicating between computers over the web.

It's just like a website but specifically for data transfer. An API is served on a computer and has a specific web address (URL) just like a website. But instead of interacting with the web address through a graphical browser, you interact with the API through scripting languages. Both websites and APIs use HyperText Transfer Protocol (HTTP).
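To make the website comparison concrete, here is a minimal sketch that takes an API address apart using only Python's standard library. The address is the Planet one listed in the Vocab section of this page. Nothing is sent over the network; the sketch only shows that an API address has the same pieces as any website address.

```python
from urllib.parse import urlsplit

# Address of the Planet Data API (from the Vocab section of this page).
API_URL = "https://api.planet.com/data/v1"

parts = urlsplit(API_URL)

print(parts.scheme)  # the protocol: HTTPS is HTTP over an encrypted connection
print(parts.netloc)  # the computer (host) that serves the API
print(parts.path)    # where the API lives on that host
```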

For our hands-on example today, we will be using a RESTful API. REST, which stands for Representational State Transfer, is a common API architecture that is favored for its scalability and ease of integration. REST APIs typically use the JSON format for data exchange. They use standard commands for requests (from the user to the API) and responses (from the API back to the user). For example:

  • GET: Retrieves data from a server. It’s used for reading information.
  • POST: Sends data to a server to create or update a resource. It’s often used for creating new resources.

We will see examples of both of these commands in our Python example.
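As a rough illustration of the two commands, the sketch below builds one GET request and one POST request with Python's standard library, without sending either. The request body and the example response are made-up placeholders, and the `PSScene` item type is an assumption; the Colab notebook may build its requests differently.

```python
import json
from urllib.request import Request

BASE_URL = "https://api.planet.com/data/v1"

# GET: ask the server to send back information. No request body is needed.
get_request = Request(BASE_URL, method="GET")

# POST: send data (here, a small JSON document) to the server.
# This body is a hypothetical placeholder, not a complete Planet search.
body = {"item_types": ["PSScene"]}
post_request = Request(
    BASE_URL + "/quick-search",
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url)

# Responses usually arrive as JSON text, which json.loads turns into a dict.
# This string is an invented response, for illustration only.
example_response = '{"type": "FeatureCollection", "features": []}'
parsed = json.loads(example_response)
print(parsed["type"])
```

Actually sending either request would take a call such as `urllib.request.urlopen(post_request)` plus a valid API key, which is why the sketch stops at building them.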



Why use APIs to get Data when Web Browsing is so much Easier?

Using scripting languages to download geospatial data through APIs makes you a more powerful data science wizard.

Here are the top reasons to use APIs:

  1. Automation and Scalability: Accessing APIs from Python allows for the automation of data retrieval and processing. For instance, a researcher monitoring deforestation could set up a script to automatically download satellite images of a specific region at regular intervals. This is much more efficient than manually downloading each image through a web interface. In that same script, the data can be analyzed with Python's rich ecosystem of libraries.

  2. Reproducibility and Sharing of Research: Python scripts can be shared and reproduced by other researchers, ensuring transparency and reproducibility of the research. For example, a script used to analyze urban expansion using satellite images can be shared with the research community, allowing others to replicate the study or build upon it.

  3. Handling Large Datasets: For large-scale projects that involve huge amounts of data, accessing an API from Python can be much more practical. A researcher studying global water resources could write a script to handle and analyze terabytes of satellite data, a task that would be impractical through a website explorer.

  4. Real-time Data Processing: Some research might require real-time or near-real-time data processing, which can be facilitated by scripted access to APIs. For instance, a script could monitor natural disasters like wildfires or floods in near real time to support a quick response.
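The automation point can be sketched in a few lines. The example below builds one search request body per month of a year for a fixed area of interest, which would be tedious to repeat by hand in a web interface. The filter layout (`AndFilter`, `GeometryFilter`, `DateRangeFilter`), the `PSScene` item type, and the area coordinates are assumptions made for illustration and should be checked against Planet's API documentation. Nothing is sent to the API here.

```python
from datetime import date


def month_starts(year):
    """Return the first day of each month in `year`, plus Jan 1 of the next year."""
    return [date(year, m, 1) for m in range(1, 13)] + [date(year + 1, 1, 1)]


def build_search_body(aoi, start, end, item_type="PSScene"):
    """Build one search body for images acquired from `start` up to (not including) `end`."""
    return {
        "item_types": [item_type],
        "filter": {
            "type": "AndFilter",
            "config": [
                {"type": "GeometryFilter", "field_name": "geometry", "config": aoi},
                {
                    "type": "DateRangeFilter",
                    "field_name": "acquired",
                    "config": {
                        "gte": f"{start.isoformat()}T00:00:00Z",
                        "lt": f"{end.isoformat()}T00:00:00Z",
                    },
                },
            ],
        },
    }


# Hypothetical area of interest: a small box as a GeoJSON polygon (lon, lat).
AOI = {
    "type": "Polygon",
    "coordinates": [[
        [-110.95, 32.20], [-110.90, 32.20], [-110.90, 32.25],
        [-110.95, 32.25], [-110.95, 32.20],
    ]],
}

starts = month_starts(2023)
bodies = [build_search_body(AOI, s, e) for s, e in zip(starts[:-1], starts[1:])]

print(len(bodies))  # one request body per month
print(bodies[0]["filter"]["config"][1]["config"])  # date range for January
```

Each body could then be sent as the JSON payload of a POST request in a loop, so changing the year or the area means editing one line rather than repeating a manual search.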



Hands-on Exercise

For our hands-on exercise, we will be tapping into an API from the satellite imagery company Planet. We will be using the Python scripting language in a Google Colab Jupyter Notebook to access and visualize Planet imagery. Open Colab Notebook



Planet Resources

Planet tutorials on APIs

Planet API and python analysis tutorials using Jupyter Notebooks

Planet API Documentation

Other API tutorials

Carpentries STAC API tutorial

CU-Boulder EarthLab tutorial using R

Tutorial to get Sentinel data

Another Sentinel Tutorial

Using NEON API in R

Other Geospatial APIs

USGS Earth Explorer JSON API

Google Earth Engine

National Ecological Observatory Network (NEON)

STAC APIs

USGS Landsat Collection

Earth Search public datasets on AWS

Geospatial Data Sources

Awesome list of Free GIS Data

NOAA

NEON

NASA Earthdata

Earth Explorer

National Map

US Census Data

Data.gov

3DEP

National Hydrography

Open Topo

Geonadir

Open Aerial

AZGEO Data Hub

Pima County Geospatial Data Portal

https://opensourcegisdata.com/state/arizona.html

Vocab

API - An Application Programming Interface. For the session's purposes, this is a way to get data over the web using programming languages like Python instead of using websites. APIs often have a web address just like any other website. For example, "https://api.planet.com/data/v1" is the address of the Planet API.

API Endpoint - A specific API web address that has a specific function. For example, Planet has an endpoint "https://api.planet.com/data/v1/stats" that is used to return statistics on imagery assets in their catalog. Planet has a different endpoint, "https://api.planet.com/data/v1/quick-search", that is used for getting the imagery names and downloading them.
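As a small sketch, the two endpoints named above can be written as the base API address plus a short path. The dictionary name `ENDPOINTS` is just a convenient label for this example.

```python
BASE_URL = "https://api.planet.com/data/v1"

# Each endpoint is the base address plus a path that names one function.
ENDPOINTS = {
    "stats": f"{BASE_URL}/stats",                # statistics on imagery in the catalog
    "quick-search": f"{BASE_URL}/quick-search",  # imagery names for downloading
}

for name, url in ENDPOINTS.items():
    print(f"{name}: {url}")
```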

API Key - A password-like string of letters and numbers that is issued by the host of the API. For today's session, Planet has issued a unique API key to each user that allows them to access data from the API.
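One common way to present a key to an API is HTTP Basic authentication, with the key as the username and an empty password. The sketch below assumes that scheme applies to Planet, which should be confirmed in Planet's documentation. The environment variable name `PL_API_KEY` and the placeholder string are also assumptions for illustration. Reading the key from the environment keeps it out of any notebook you share.

```python
import base64
import os


def basic_auth_header(api_key):
    """Build an HTTP Basic Authorization header: key as username, empty password.

    Assumed scheme; confirm it against the API provider's documentation.
    """
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# PL_API_KEY is a hypothetical environment variable name; the fallback
# string is a placeholder, not a real key.
api_key = os.environ.get("PL_API_KEY", "PASTE_YOUR_KEY_HERE")
headers = basic_auth_header(api_key)

print(list(headers))  # the header names that would accompany each request
```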

Jupyter Notebook - A browser-based environment for writing computer code in Python, R, or Julia. A Notebook can be served on your local machine or on a remote machine that you access over the web. Notebooks are nice tools for sharing and collaborating on code. They are perfect for a classroom of students to code together.

Google Colab - A Jupyter Notebook environment that is hosted on a Google virtual machine (VM). The free tier gives each user a relatively small allocation of RAM (memory), disk storage, and CPU. This VM allocation is temporary and lasts less than a day, after which it disappears forever. Colab is not a place to store your data. But Colab is a great place to do some coding analysis without having to install software on your local machine.