Always know what to expect from your data.
We're making some major revisions to the project right now, so expect a BIG update to documentation by the end of June.
In the meantime, the Great Expectations Slack channel is the best place to get up-to-date information:
https://tinyurl.com/great-expectations-slack
Teaser: the next round of revisions doesn't change the existing behavior of Great Expectations at all, but it does add tons of new support for profiling, documenting, and deploying Expectations. It significantly raises the bar for making Great Expectations fully production-ready.
Great Expectations helps teams save time and promote analytic integrity by offering a unique approach to automated testing: pipeline tests. Pipeline tests are applied to data (instead of code) and at batch time (instead of compile or deploy time). Pipeline tests are like unit tests for datasets: they help you guard against upstream data changes and monitor data quality.
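To make "at batch time" concrete, here is a minimal plain-Python sketch of a pipeline test acting as a gate on each incoming batch. It does not use the great_expectations API: `check_batch`, `load_batch`, the `checks` dictionary, and the sample rows are all hypothetical names invented for this illustration.

```python
# Hypothetical sketch: none of these names come from great_expectations.

def check_batch(rows, checks):
    """Run every check against one batch of rows; return the names of failed checks."""
    return [name for name, check in checks.items() if not check(rows)]


def load_batch(rows, checks):
    """Pass a batch downstream only if every pipeline test succeeds."""
    failures = check_batch(rows, checks)
    if failures:
        raise ValueError(f"batch rejected, failed checks: {failures}")
    return rows


# Assumptions about the data, written down as checks instead of living in someone's head.
checks = {
    "has_rows": lambda rows: len(rows) > 0,
    "status_is_known": lambda rows: all(r["status"] in {"new", "paid"} for r in rows),
}

good_batch = [{"status": "new"}, {"status": "paid"}]
bad_batch = [{"status": "new"}, {"status": "refunded"}]  # upstream introduced a new value
```

The code itself never changes between the two batches; only the data does, which is why the test has to run each time a batch arrives.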
Software developers have long known that automated testing is essential for managing complex codebases. Great Expectations brings the same discipline, confidence, and acceleration to data science and engineering teams.
To get more done with data, faster. Teams use great_expectations to:
- Save time during data cleaning and munging.
- Accelerate ETL and data normalization.
- Streamline analyst-to-engineer handoffs.
- Monitor data quality in production data pipelines and data products.
- Simplify debugging data pipelines if (when) they break.
- Codify assumptions used to build models when sharing with distributed teams or other analysts.
It's easy! Just use pip install:
$ pip install great_expectations
You can also clone the repository, which includes examples of using great_expectations.
$ git clone https://github.com/great-expectations/great_expectations.git
$ pip install great_expectations/
Expectations include:
- expect_table_row_count_to_equal
- expect_column_values_to_be_unique
- expect_column_values_to_be_in_set
- expect_column_mean_to_be_between
- ...and many more
Visit the glossary of expectations for a complete list of the expectations currently in the Great Expectations vocabulary.
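As a rough illustration of what the four expectations named above assert, the sketch below reimplements their meaning in plain Python over a list of row dictionaries. These are standalone stand-ins, not the library's implementation: the function signatures, the `{"success": ...}` return shape, and the sample `rows` are assumptions made for this example.

```python
from statistics import mean

# Stand-in implementations for illustration only; not the great_expectations API.

def expect_table_row_count_to_equal(rows, value):
    """The batch has exactly `value` rows."""
    return {"success": len(rows) == value}


def expect_column_values_to_be_unique(rows, column):
    """No value appears more than once in `column`."""
    values = [row[column] for row in rows]
    return {"success": len(set(values)) == len(values)}


def expect_column_values_to_be_in_set(rows, column, value_set):
    """Every value in `column` is a member of `value_set`."""
    return {"success": all(row[column] in value_set for row in rows)}


def expect_column_mean_to_be_between(rows, column, min_value, max_value):
    """The mean of `column` lies in the closed interval [min_value, max_value]."""
    column_mean = mean(row[column] for row in rows)
    return {"success": min_value <= column_mean <= max_value}


# Hypothetical sample data.
rows = [
    {"id": 1, "status": "new", "amount": 10.0},
    {"id": 2, "status": "paid", "amount": 20.0},
    {"id": 3, "status": "paid", "amount": 30.0},
]

results = [
    expect_table_row_count_to_equal(rows, 3),
    expect_column_values_to_be_unique(rows, "id"),
    expect_column_values_to_be_in_set(rows, "status", {"new", "paid"}),
    expect_column_mean_to_be_between(rows, "amount", 15, 25),
]
```

Each call describes a property of the data rather than a step in the code, so the same set of calls can be reused against every new batch.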
Absolutely. Yes, please. Start here, and don't be shy with questions!
For full documentation, visit Great Expectations on readthedocs.io.
Down with Pipeline Debt! explains the core philosophy behind Great Expectations. Please give it a read, and clap, follow, and share while you're at it.
For quick, hands-on introductions to Great Expectations' key features, check out our walkthrough videos:
If you have questions, comments, feature requests, etc., opening an issue is definitely the best path forward.
We also have a Slack channel, which you can join here: https://tinyurl.com/great-expectations-slack
It depends. If you have needs that the library doesn't meet yet, please upvote an existing issue or open a new one, and we'll see what we can do. Great Expectations is under active development, so your use case might be supported soon.