
CyVerse Geospatial

Jeffrey K Gillan edited this page Jun 17, 2024 · 59 revisions


Housed at the University of Arizona, CyVerse is a one-of-a-kind cloud computing system for the academic and research communities. Its mission is to design, deploy, and expand a national cyberinfrastructure for scientific research and to train scientists to use it. CyVerse is an excellent platform to make your geospatial research open and reproducible!

CyVerse (originally called iPlant) has been in existence for 16 years, has spent $120M in research funds, has 135,000 registered users, and has facilitated 1,700 peer-reviewed publications across many scientific fields such as plant genetics, genomics, astronomy, geosciences, health, and agriculture.

CyVerse is completely free for University of Arizona students, staff, and faculty.




Data Storage and Sharing

CyVerse Data Store is the ideal cloud storage to host your large geospatial datasets, share data with colleagues, and meet publication/grant archival requirements.

Utilizing cloud storage infrastructure eliminates the need for researchers to maintain their own servers (and APIs) and allows them to focus more on their mission. It empowers individuals to easily share their data on the web while eliminating costly local storage. Cloud storage also protects your data against loss from hardware failure.

  • CyVerse Data Store is object cloud storage similar to Azure Blob or Amazon S3
  • A Pro account has a 3 TB limit
  • Data I/O is available through the website and multiple command line tools
  • Share your data with your colleagues and the world with a URL
  • Data can be public or private, and you set the permission level for anyone you share it with
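
As a rough illustration of sharing by URL, the sketch below turns a Data Store path into a web link. It assumes that publicly readable data is served over anonymous WebDAV at `https://data.cyverse.org/dav-anon`; that base URL, the `public_url` helper, and the example path are illustrative assumptions rather than anything stated on this page, so check the current CyVerse documentation before relying on them.

```python
from urllib.parse import quote

# Assumed base URL for anonymous (public) WebDAV access to the Data Store.
# This is an assumption for illustration; confirm it in the CyVerse docs.
ANON_WEBDAV_BASE = "https://data.cyverse.org/dav-anon"


def public_url(data_store_path: str) -> str:
    """Return a shareable HTTPS URL for a publicly readable Data Store path."""
    if not data_store_path.startswith("/"):
        raise ValueError("expected an absolute Data Store path, e.g. /iplant/...")
    # Percent-encode spaces and other unsafe characters, but keep the slashes.
    return ANON_WEBDAV_BASE + quote(data_store_path, safe="/")


if __name__ == "__main__":
    # Hypothetical path, used only to show the encoding.
    print(public_url("/iplant/home/shared/example_project/ortho mosaic.tif"))
```

The link only works for data whose permissions allow public read access; private data still requires logging in.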



Share Your Cloud Native Geospatial Formats

Sharing geospatial data from cloud storage can be greatly improved by using cloud native formats. These formats are designed for the cloud and built for HTTP streaming, which means users can view and analyze data without downloading the entire dataset. By analogy, this is like going from the original Napster model of downloading music to the Spotify model of streaming music.
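
The streaming idea rests on HTTP range requests: a client asks the server for a specific span of bytes instead of the whole file. The sketch below is a minimal, network-free illustration of that mechanism. `FakeRemoteFile` and `range_header` are hypothetical names invented for this example, and the "server" is just an in-memory byte string.

```python
from dataclasses import dataclass


@dataclass
class FakeRemoteFile:
    """In-memory stand-in for a file on a server that honors HTTP Range requests."""

    content: bytes
    bytes_served: int = 0

    def get(self, range_value: str) -> bytes:
        # Parse a single-range header value of the form "bytes=start-end"
        # (the end offset is inclusive, as in the HTTP specification).
        unit, _, spec = range_value.partition("=")
        if unit != "bytes":
            raise ValueError("only byte ranges are supported")
        start_text, _, end_text = spec.partition("-")
        start, end = int(start_text), int(end_text)
        chunk = self.content[start:end + 1]
        self.bytes_served += len(chunk)
        return chunk


def range_header(offset: int, length: int) -> str:
    """Build the Range header value for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return f"bytes={offset}-{offset + length - 1}"


if __name__ == "__main__":
    remote = FakeRemoteFile(content=bytes(range(256)) * 4096)  # 1 MiB "dataset"
    window = remote.get(range_header(offset=1000, length=16))
    print(f"file size: {len(remote.content)} bytes, transferred: {remote.bytes_served} bytes")
```

A real client would send the same header value to a web server (for example with the `Range` request header) and receive only the requested bytes in a `206 Partial Content` response.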

There is a cloud native format to fit almost any geospatial data type. For example, FlatGeobuf is a cloud native format for vector data, Cloud Optimized GeoTIFF (COG) is a cloud native format for raster data, and Cloud Optimized Point Cloud (COPC) is a cloud native format for point cloud data. Zarr is a cloud native format that can be used for multi-dimensional raster data.
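
To see why a tiled raster format such as COG allows partial reads, consider which internal tiles a viewer needs for a given pixel window. The sketch below is plain arithmetic under stated assumptions: the `tiles_for_window` helper is hypothetical, and the 512-pixel tile size is a common choice rather than something fixed by the format.

```python
def tiles_for_window(col_off, row_off, width, height, tile_size=512):
    """Return (tile_col, tile_row) indices of the tiles overlapping a pixel window.

    The window is given as a column/row offset plus a width/height in pixels.
    tile_size=512 is an assumed value; real files declare their own tile size.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    first_col = col_off // tile_size
    last_col = (col_off + width - 1) // tile_size
    first_row = row_off // tile_size
    last_row = (row_off + height - 1) // tile_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


if __name__ == "__main__":
    # A 600 x 600 pixel window starting at pixel (1000, 1000).
    needed = tiles_for_window(col_off=1000, row_off=1000, width=600, height=600)
    print(len(needed), "tiles needed:", needed)
```

For a 20,000 x 20,000 pixel raster with 512-pixel tiles there are 40 x 40 = 1,600 tiles in total, while the window above touches only 9 of them, so a client can fetch those tiles and skip the rest.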



The above image shows an example of a Cloud Optimized Point Cloud (COPC) streamed from CyVerse Data Store into a web application.







Permanent Archival

CyVerse Data Commons is our public-facing data storage interface. It enables you to share data with people outside of CyVerse.

Community Released: A folder of data you want to share with colleagues, stakeholders, or the entire world. You control read/write/own permissions and make it public or private.

Curated: Data that is tied to a peer-reviewed publication and needs permanent archival. You can apply many types of metadata templates to your data as well as receive a permanent DOI.




CyVerse Discovery Environment

The CyVerse Discovery Environment is a place where you can run cloud instances of the most popular and powerful data analysis programs. It is backed by robust computing resources that can help scale your analysis beyond what is possible on your laptop.

  • Launch instances of QGIS, RStudio, Jupyter Notebooks, VS Code, and more
  • Launch existing applications or create and launch your own analysis (containerized software)

  • GPU Resources for machine learning and other high-computation data processing

  • Easily access your data in the CyVerse Data Store

  • Get access to a Linux desktop


Advantages of Cloud Computing

  • Avoid the upfront cost and complexity of owning and maintaining your own IT infrastructure

  • Allows groups or individuals to scale up (or down) their operations quickly as their computing needs change

  • Allows users to access their data and applications from anywhere, on any device, at any time






Planet Satellite Imagery

The UA Institute for Computation and Data-Enabled Insight (ICDI) has purchased a campus-wide license so all University of Arizona students, staff, and faculty can access a wide range of satellite data products for free! Please click here to learn how to get your account and start getting imagery.

The imagery company Planet operates more than 200 satellites, which together cover most of the Earth's landmass every single day. Their flagship imagery product, PlanetScope, consists of 4 bands (blue, green, red, near-infrared) and has 3-meter spatial resolution.






Agisoft Metashape Licenses

Agisoft Metashape is an industry-leading structure-from-motion photogrammetry software application that is ideal for drone and aerial imagery mapping. It is used heavily in forestry, mining, hydrology, construction, agriculture, and many other fields. CyVerse and the University of Arizona ICDI have 20 educational floating licenses for the latest version of Metashape. We are happy to provide these licenses to University of Arizona personnel for use on university computers and the HPC. More details on the licenses can be found here. Please contact Jeff Gillan ([email protected]) or Tyson Swetnam ([email protected]) to learn how to get access to the Metashape licenses.






Jetstream2

For researchers with larger computing needs, we can help you get access to the Jetstream2 cloud computing system. CyVerse is a partner in Jetstream2, a national cyberinfrastructure funded by the National Science Foundation. Housed primarily at Indiana University, Jetstream2 is cloud computing on a grand scale! Any funded research project in the USA could get a computing allocation on Jetstream2.



Cacao

CyVerse has developed the Cloud Automation & Continuous Analysis Orchestration (CACAO) platform to make it easier to deploy and provision virtual machines in a cloud environment. Using an approach known as infrastructure-as-code, a user can launch virtual machines of any size for their scientific analysis with a few clicks. CACAO is cloud agnostic, which means it can run on Jetstream2 as well as AWS, Azure, and Google Cloud.



Resources

Abernathey, R. P., et al. (2021). “Cloud-Native Repositories for Big Scientific Data”. Computing in Science & Engineering, 23(2), 26-35. https://doi.org/10.1109/MCSE.2021.3059437

Chris Holmes's blog on Cloud Native

Cloud-Native Geospatial Outreach Event - April 2022 - from Open Geospatial Consortium (OGC)

Gentemann, C. L., et al. (2021). “Science Storms the Cloud”. AGU Advances, 2, e2020AV000354. https://doi.org/10.1029/2020AV000354

Mapscaping Podcast on Cloud Native Geospatial