Tanagra

Tanagra is a project to build a configurable cohort builder and data explorer. Our goal is to make it easy to set up a new dataset for exploration with little or no custom code, so everything we've built is configuration-driven.

Project overview

The project has three main pieces: indexer, service, UI. All three pieces are highly interconnected and are not intended to be used or deployed separately. Everything lives in this single GitHub repository.

The indexer takes the source dataset and produces a logical copy that's better suited to the types of queries the UI needs to run. It denormalizes some data, precomputes some things, and reorganizes tables. The goal is not to meet a particular query benchmark, only to keep UI queries from timing out.
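As a rough illustration of the kind of denormalization and precomputation described above, here is a minimal sketch. It is not Tanagra's indexer: the table names (`person`, `condition_occurrence`, `idx_person`), the columns, and the use of SQLite are all assumptions chosen only to keep the example self-contained.

```python
import sqlite3


def build_index(conn: sqlite3.Connection) -> None:
    """Build a flat, query-friendly copy of two hypothetical source tables.

    The join is done once at indexing time and a per-person count is
    precomputed, so later reads need no join or aggregation.
    """
    conn.executescript(
        """
        DROP TABLE IF EXISTS idx_person;
        CREATE TABLE idx_person AS
        SELECT p.id,
               p.year_of_birth,
               COUNT(c.person_id) AS condition_count
        FROM person AS p
        LEFT JOIN condition_occurrence AS c ON c.person_id = p.id
        GROUP BY p.id, p.year_of_birth;
        """
    )


def demo() -> list:
    """Create a tiny hypothetical source dataset, index it, and read it back."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (id INTEGER PRIMARY KEY, year_of_birth INTEGER);
        CREATE TABLE condition_occurrence (person_id INTEGER, concept TEXT);
        INSERT INTO person VALUES (1, 1980), (2, 1975);
        INSERT INTO condition_occurrence VALUES (1, 'asthma'), (1, 'flu');
        """
    )
    build_index(conn)
    rows = conn.execute(
        "SELECT id, year_of_birth, condition_count FROM idx_person ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo())
```

The trade-off in this sketch is the usual one for an index: extra storage and an indexing step in exchange for simpler, faster reads.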

The service processes queries for the UI and manages the application database, which stores user-managed artifacts like cohorts and data feature sets.

The UI includes the cohort builder, data feature set builder, export, and cohort review interfaces.

Configure a new dataset

Tanagra supports data patterns, rather than specific SQL schemas. Check the list of currently supported patterns to see how they map to your dataset.

Tanagra defines a custom object model on top of the underlying relational data. The dataset configuration language is based on this object model, so it's helpful to be familiar with the main concepts.

A dataset configuration is spread across multiple files, to improve readability and allow easier sharing across datasets. See an overview of the different files and directory structure, as well as pointers to example files. Check the full dataset configuration schema documentation to look up specific properties. Documentation for protocol buffers used for visualizations and criteria plugins is here.

Set up a new deployment

Choose a deployment pattern and configure the GCP project(s).

Once you've defined the configuration files for a dataset, run the indexer. Check the full indexer CLI documentation to look up specific commands.

Tanagra does not provide an API for managing access control for a population of users. Instead, we provide an interface for calling an external access control service. For example, the VUMC admin service serves as the external access control service for the SD deployment. Either reuse an existing access control implementation, or add your own.
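The general shape of delegating authorization to an external service can be sketched as follows. This is not Tanagra's actual interface (see the access control documentation for that); the names `AccessControl`, `ExternalServiceAccessControl`, `is_authorized`, and `require_access` are hypothetical, and the external call is represented by a plain callable so the example runs without a network.

```python
from typing import Callable, Protocol


class AccessControl(Protocol):
    """Hypothetical interface: the application asks, an implementation answers."""

    def is_authorized(self, user_email: str, action: str, resource_id: str) -> bool:
        ...


class ExternalServiceAccessControl:
    """Delegates every decision to an external service.

    `lookup` stands in for the call to that service (for example an HTTP
    request); it is injected so the class holds no access rules of its own.
    """

    def __init__(self, lookup: Callable[[str, str, str], bool]) -> None:
        self._lookup = lookup

    def is_authorized(self, user_email: str, action: str, resource_id: str) -> bool:
        return bool(self._lookup(user_email, action, resource_id))


def require_access(
    access_control: AccessControl, user_email: str, action: str, resource_id: str
) -> None:
    """Raise PermissionError unless the implementation allows the action."""
    if not access_control.is_authorized(user_email, action, resource_id):
        raise PermissionError(f"{user_email} may not {action} {resource_id}")


if __name__ == "__main__":
    # A fake external service: only one user may read one cohort.
    allowed = {("alice@example.com", "read", "cohort-1")}
    access_control = ExternalServiceAccessControl(lambda u, a, r: (u, a, r) in allowed)
    require_access(access_control, "alice@example.com", "read", "cohort-1")
    print("access granted")
```

Keeping the decision behind one small interface is what makes it possible to reuse an existing implementation or swap in your own.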

We expect deployments to require varied methods of exporting data. Either reuse an existing export implementation, or add your own.

Check the full application configuration documentation to look up specific deployment properties.

Once your deployment is up and running, create a regression test suite and run it regularly to detect unexpected changes caused by config or underlying data changes.
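The core idea of such a regression check is to compare current results against a saved baseline and report anything that moved. The sketch below assumes a hypothetical setup in which each test is a named cohort with an expected count; it is not the format or tooling of Tanagra's own regression tests, which the linked documentation describes.

```python
from typing import Dict, Optional, Tuple


def diff_counts(
    expected: Dict[str, int], actual: Dict[str, int]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return {name: (expected, actual)} for every count that differs.

    A name present on only one side is reported with None for the other,
    so added and removed cohorts show up as well as changed counts.
    """
    changed = {}
    for name in sorted(expected.keys() | actual.keys()):
        before, after = expected.get(name), actual.get(name)
        if before != after:
            changed[name] = (before, after)
    return changed


if __name__ == "__main__":
    # Hypothetical baseline saved from an earlier run, and a fresh run.
    baseline = {"adults_with_asthma": 120, "all_participants": 5000}
    current = {"adults_with_asthma": 118, "all_participants": 5000}
    for name, (before, after) in diff_counts(baseline, current).items():
        print(f"{name}: expected {before}, got {after}")
```

An empty result means nothing changed; anything else is a prompt to check whether the change was intended.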

Manage releases

Tanagra supports multiple deployments, all with different release cadences. See more details about the codebase versioning and release process, and how you can manage the version for a specific deployment.

Use this tool to diff two release tags when you're planning to bump a deployment to a newer version of this codebase.

Contribute to the codebase

Check the guidelines for developers, including instructions for getting things running locally on your machine.

See an overview of the codebase structure, and information specifically about the UI.

All documentation links

These are all linked in the sections above. This is just in list format if you already know what you're looking for.

Project overview

Service Artifacts

Configure a new dataset

Set up a new deployment

Manage releases

Contribute to the codebase

