A full pipeline AutoML tool for tabular data
Updated Jun 23, 2024 - Python
Evaluation Tool for Anomaly Detection Algorithms on Time Series
AGATHA: Automatic Graph-mining And Transformer based Hypothesis generation Approach
Unified Distributed Execution
Notes on Data Engineering with Pandas, PySpark, Dask, Ray, Arrow DataFusion, Polars etc.
Parallel LAMMPS Python interface - control an mpi4py parallel LAMMPS instance from a serial Python process or a Jupyter notebook
Loop like a pro, make parameter studies fun.
Perform I/O intensive workloads on high-volume data sparsely located across multiple AWS regions through the use of Dask.
Test LightGBM's Dask integration on different cluster types
Code for "Training models when data doesn't fit in memory" post
Scalable Cytometry Image Processing (SCIP) is an open-source tool that implements an image processing pipeline on top of Dask, a distributed computing framework written in Python. SCIP performs projection, illumination correction, image segmentation and masking, and feature extraction.
Open data profiling, quality and analysis on the NYC OpenData dataset, with semantic profiling using fuzzy ratio, Levenshtein distance and regex
Launch a Dask cluster from a Poetry environment
Dask tutorial; Chinese-localized Dask tutorial
Procurement: Dask Cluster as a Process.
Python library to query and transform genomic data from indexed files
HPC cluster deployment and management for the Hetzner Cloud
Fraud detection ML pipeline and serving POC using Dask and hopeit.engine. Project created with nbdev: https://www.fast.ai/2019/12/02/nbdev/
Scale up concurrent requests to Earth Engine interactive endpoints with Dask
Efficiently read climate/meteorology data into Xarray using Dask for parallelization. Transform the data for your modelling needs.