An Open Source Project from the Data to AI Lab, at MIT

Overview

Website: https://sdv.dev
Documentation: https://sdv.dev/Copulas
Repository: https://github.com/sdv-dev/Copulas
License: MIT
Development Status: Pre-Alpha

Copulas is a Python library for modeling multivariate distributions and sampling from them using copula functions. Given a table containing numerical data, we can use Copulas to learn the distribution and later on generate new synthetic rows following the same statistical properties.
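To make the idea concrete, here is a minimal standard-library sketch of what "learn the distribution, then generate synthetic rows" can mean for a two-column table under a Gaussian copula. It is an illustration of the technique, not the library's implementation: the helper names (`normal_scores`, `pearson`, `empirical_quantile`, `fit_and_sample`) are hypothetical, and the marginals are modeled with plain empirical quantiles rather than fitted parametric distributions.

```python
import random
from bisect import bisect_right
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()  # standard normal used for the copula's latent space


def normal_scores(values):
    """Map a column to standard-normal scores through its empirical ranks."""
    n = len(values)
    ordered = sorted(values)
    # rank / (n + 1) keeps every probability strictly inside (0, 1)
    return [STD_NORMAL.inv_cdf(bisect_right(ordered, v) / (n + 1)) for v in values]


def pearson(a, b):
    """Plain Pearson correlation of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def empirical_quantile(ordered, p):
    """Inverse empirical CDF: the observed value at probability p."""
    index = min(int(p * len(ordered)), len(ordered) - 1)
    return ordered[index]


def fit_and_sample(xs, ys, n_rows, seed=0):
    """Fit a two-column Gaussian copula and draw n_rows synthetic rows."""
    rng = random.Random(seed)
    # "Fit": the only dependence parameter is the correlation of the scores
    rho = pearson(normal_scores(xs), normal_scores(ys))
    spread = max(0.0, 1.0 - rho ** 2) ** 0.5
    sorted_x, sorted_y = sorted(xs), sorted(ys)
    rows = []
    for _ in range(n_rows):
        # Draw two standard normals with correlation rho
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + spread * rng.gauss(0, 1)
        # Map back to the original scale through each column's quantiles
        rows.append((
            empirical_quantile(sorted_x, STD_NORMAL.cdf(z1)),
            empirical_quantile(sorted_y, STD_NORMAL.cdf(z2)),
        ))
    return rho, rows
```

The synthetic rows reuse observed values per column, so each marginal is preserved by construction, while the dependence between columns comes only from `rho`.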

Some of the features provided by this library include:

A variety of distributions for modeling univariate data.
Multiple Archimedean copulas for modeling bivariate data.
Gaussian and Vine copulas for modeling multivariate data.
Automatic selection of univariate distributions and bivariate copulas.
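As a rough illustration of what automatic selection of a univariate distribution can look like, the sketch below fits two candidate distributions and keeps the one with the smallest Kolmogorov-Smirnov gap to the data. The criterion, the candidate set, and the names `ks_statistic`, `CANDIDATES` and `pick_best_fit` are assumptions made for this example; they are not taken from the Copulas source. The data is assumed to be non-constant.

```python
import random
from statistics import NormalDist, mean, pstdev


def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF of data and a candidate CDF."""
    ordered = sorted(data)
    n = len(ordered)
    gap = 0.0
    for i, value in enumerate(ordered):
        model = cdf(value)
        # The empirical CDF jumps from i/n to (i + 1)/n at this value
        gap = max(gap, abs(model - i / n), abs((i + 1) / n - model))
    return gap


def fit_gaussian(data):
    """Return the CDF of a Gaussian with the sample mean and deviation."""
    return NormalDist(mean(data), pstdev(data)).cdf


def fit_uniform(data):
    """Return the CDF of a uniform distribution over the observed range."""
    low, high = min(data), max(data)
    return lambda x: (x - low) / (high - low)


# Hypothetical candidate set; a real selector would include more families
CANDIDATES = {"gaussian": fit_gaussian, "uniform": fit_uniform}


def pick_best_fit(data):
    """Fit every candidate and return the best name plus all KS scores."""
    scores = {name: ks_statistic(data, fit(data)) for name, fit in CANDIDATES.items()}
    return min(scores, key=scores.get), scores
```

Calling `pick_best_fit(column)` returns the winning candidate's name together with the score of each candidate, which makes it easy to see how close the alternatives were.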

Supported Distributions

Univariate

Beta
Gamma
Gaussian
Gaussian KDE
Log-Laplace
Student T
Truncated Gaussian
Uniform
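Each of these univariate models ultimately provides a CDF and its inverse, which is what a copula needs to move between the data scale and the unit interval. As a small, hedged example of that mechanism, the sketch below samples a truncated Gaussian by inverse-CDF sampling with the standard library only. The function name `sample_truncated_gaussian` and its parameters are hypothetical and do not mirror the library's classes.

```python
import random
from statistics import NormalDist


def sample_truncated_gaussian(mu, sigma, low, high, size, seed=0):
    """Draw values from N(mu, sigma) restricted to the interval [low, high]."""
    rng = random.Random(seed)
    dist = NormalDist(mu, sigma)
    # Probability mass of the untruncated Gaussian below each bound
    p_low, p_high = dist.cdf(low), dist.cdf(high)
    samples = []
    for _ in range(size):
        # Uniform draw restricted to the probability band [p_low, p_high]
        p = p_low + rng.random() * (p_high - p_low)
        # inv_cdf requires 0 < p < 1, so keep p away from the endpoints
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        value = dist.inv_cdf(p)
        # Guard against rounding pushing a value just past a bound
        samples.append(min(max(value, low), high))
    return samples
```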

Archimedean Copulas (Bivariate)

Clayton
Frank
Gumbel
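For intuition about how an Archimedean copula produces dependent pairs, the sketch below samples from a Clayton copula by conditional inversion and checks the result against the known relation between its parameter and Kendall's tau, tau = theta / (theta + 2). This is a standard-library illustration with hypothetical function names (`sample_clayton`, `kendall_tau`), not the library's own sampler, and it assumes a moderate positive `theta`.

```python
import random


def sample_clayton(theta, size, seed=0):
    """Sample (u, v) pairs from a Clayton copula with parameter theta > 0."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    rng = random.Random(seed)
    pairs = []
    for _ in range(size):
        # 1 - random() lies in (0, 1], which avoids raising 0 to a negative power
        u = 1.0 - rng.random()
        w = 1.0 - rng.random()
        # Solve C(v | u) = w for v (conditional inversion)
        inner = u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0
        pairs.append((u, inner ** (-1.0 / theta)))
    return pairs


def kendall_tau(pairs):
    """Kendall's tau from concordant and discordant pairs (O(n^2))."""
    concordant = discordant = 0
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            sign = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (concordant + discordant)
```

With `theta = 2.0` the theoretical tau is 0.5, so `kendall_tau(sample_clayton(2.0, 600))` should land close to that value.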

Multivariate

Gaussian Copula
D-Vine
C-Vine
R-Vine

Install

Requirements

Copulas is part of the SDV project and is automatically installed alongside it. For details about this process, please visit the SDV Installation Guide.

Optionally, Copulas can also be installed as a standalone library using the following commands:

Using pip:

pip install copulas

Using conda:

conda install -c sdv-dev -c conda-forge copulas

For more installation options, please visit the Copulas Installation Guide.

Quickstart

In this short quickstart, we show how to model a multivariate dataset and then generate synthetic data that resembles it.

import warnings
warnings.filterwarnings('ignore')

from copulas.datasets import sample_trivariate_xyz
from copulas.multivariate import GaussianMultivariate
from copulas.visualization import compare_3d

# Load a dataset with 3 columns that are not independent
real_data = sample_trivariate_xyz()

# Fit a gaussian copula to the data
copula = GaussianMultivariate()
copula.fit(real_data)

# Sample synthetic data
synthetic_data = copula.sample(len(real_data))

# Plot the real and the synthetic data to compare
compare_3d(real_data, synthetic_data)

The output will be a figure with two plots, one showing the real data and one showing the synthetic data that you just generated.
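A visual comparison can be complemented with a simple numeric check. The sketch below computes, per column, the two-sample Kolmogorov-Smirnov gap between real and synthetic values, where 0 means the empirical distributions coincide and 1 means they do not overlap at all. The helper names `two_sample_ks` and `compare_columns` are hypothetical and not part of Copulas. Tables are assumed to be plain dictionaries that map column names to lists of numbers.

```python
from bisect import bisect_right


def two_sample_ks(real, synthetic):
    """Largest gap between the empirical CDFs of two samples."""
    real_sorted, synth_sorted = sorted(real), sorted(synthetic)
    gap = 0.0
    # The gap can only change at observed values, so check each of them
    for value in real_sorted + synth_sorted:
        cdf_real = bisect_right(real_sorted, value) / len(real_sorted)
        cdf_synth = bisect_right(synth_sorted, value) / len(synth_sorted)
        gap = max(gap, abs(cdf_real - cdf_synth))
    return gap


def compare_columns(real_table, synthetic_table):
    """Per-column KS gap for two tables given as {column name: values}."""
    return {
        name: two_sample_ks(real_table[name], synthetic_table[name])
        for name in real_table
    }
```

If the quickstart objects are pandas DataFrames, `real_data.to_dict('list')` and `synthetic_data.to_dict('list')` should give inputs in the expected shape. This check only looks at one column at a time, so it says nothing about whether the dependence between columns was reproduced.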

What's next?

For more details about Copulas and all its possibilities and features, please check the documentation site.

There you can also learn how to contribute to Copulas and help us develop new features and ideas.

Credits

Copulas is an open source project from the Data to AI Lab at MIT which has been built and maintained over the years by the following team:

Manuel Alvarez
Carles Sala
(Alicia) Yi Sun
José David Pérez
Kevin Alex Zhang
Andrew Montanez
Gabriele Bonomi
Kalyan Veeramachaneni
Iván Ramírez
Felipe Alex Hofmann
paulolimac
nazar-ivantsiv

The Synthetic Data Vault

This repository is part of The Synthetic Data Vault Project.

Website: https://sdv.dev
Documentation: https://sdv.dev/SDV
