[RubyNLP | RubyML | RubyInterop]
Links and Resources for Data Processing and Analysis in Ruby
Data Science is a new "sexy" buzzword without a specific meaning, often used as a substitute for Statistics, Scientific Computing, Text and Data Mining, Visualization, Machine Learning, Data Processing and Warehousing, as well as Retrieval Algorithms of any kind.
This curated list comprises awesome tutorials, libraries, and information sources about various Data Science applications using the Ruby programming language.
Many of the resources on this list come from development work by The Ruby Science Foundation, from our contributors, and from our own day-to-day work on various data-intensive applications. Read why this list is awesome.
✨ Every contribution is welcome! Add links through pull requests or create an issue to start a discussion.
Follow us on Twitter and please spread the word using the #RubyDataScience hashtag!
- Ruby vs. Python vs. Julia vs. R
- Standing on the shoulders of giants
- Data Manipulation
- Distributed Computing
- Data Structures
- Statistics
- Numeric and Symbolic Computation
- Visualization
- Interactive Computing
- Input and Output
- Provisioning Infrastructure
- Machine Learning
- Articles, Posts, Talks, and Presentations
- Related resources
- Wait but why?
- License
Ruby | Python | Julia | R |
---|---|---|---|
Daru | Pandas | | |
NArray | NumPy | | |
Ruby is not (yet) a Data Science-centric language with a large, established library ecosystem. Leveraging libraries from R, Python, and Julia helps Ruby solve your tasks!
- pycall - Bridge into the Python world.
- rserve-client - Ruby connector for Rserve, R's binary server.
- kiba - lightweight Ruby ETL (Extract-Transform-Load) framework.
- ruby-spark - Ruby Interface to Apache Spark 1.x.x.
- jruby-spark - JRuby based bindings for Apache Spark.
- daru - Data Frame and Vector structures with comprehensive manipulation and visualization methods.
- numo-narray - n-dimensional Numerical Array for Ruby.
- nmatrix - dense and sparse linear algebra library for Ruby via SciRuby.
- kdtree - blazingly fast native 2d k-d tree.
- mdarray - Array structure for JRuby.
- spreadsheet - manipulation library for MS Excel spreadsheets.
- networkx - Ruby-based NetworkX clone that handles various use cases of the Graph Data Structure.
- rb-gsl - Ruby interface to the GNU Scientific Library. [dep: GSL]
- simple_stats - Enumerable patches for descriptive statistics.
- enumerable-statistics - fast implementation of descriptive statistics for the Enumerable module.
- statsample - basic and advanced statistics for Ruby. [dep: GSL]
- statsample-glm - extension of statsample with Generalized Linear Models.
- statsample-bivariate-extension - extension of statsample with Bivariate Correlations.
- statsample-timeseries - extension of statsample with Time Series estimators.
- pca - Principal Component Analysis (PCA) in Ruby.
- descriptive-statistics - descriptive extensions for the Enumerable module or standalone usage.
- distribution - probability distributions and descriptive measures for them.
- statistics2 - Normal, Chi-square, t- and F- probability distributions for Ruby.
- numo-linalg - linear algebraic operations for NArray.
- numo-gsl - Math and Statistics for NArray using GSL. [dep: GSL]
- symengine - Symbolic Computation with SymEngine.
- numo-ffte - Fast Fourier Transform for NArray using the FFTE package. [dep: FFTE]
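Several of the statistics libraries above are described as extending Enumerable with descriptive statistics. To give a feel for that idea, here is a minimal sketch in plain Ruby with no gems. The module name `DescriptiveStats` and its method names are hypothetical and chosen for illustration only; they are not the API of simple_stats, enumerable-statistics, or descriptive-statistics, so check each gem's documentation for the real method names.

```ruby
# Hypothetical module for illustration; NOT the API of any gem listed above.
module DescriptiveStats
  # Arithmetic mean; nil for an empty collection.
  def mean
    values = to_a
    return nil if values.empty?
    values.sum.to_f / values.size
  end

  # Sample variance (divides by n - 1); nil for fewer than two values.
  def variance
    values = to_a
    return nil if values.size < 2
    m = values.sum.to_f / values.size
    values.sum { |x| (x - m)**2 } / (values.size - 1)
  end

  # Sample standard deviation; nil when the variance is undefined.
  def standard_deviation
    v = variance
    v && Math.sqrt(v)
  end

  # Median; averages the two middle values for even-sized collections.
  def median
    sorted = to_a.sort
    return nil if sorted.empty?
    mid = sorted.size / 2
    sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
  end
end

# Extend a single object instead of patching Enumerable globally.
data = [2, 4, 4, 4, 5, 5, 7, 9].extend(DescriptiveStats)

puts data.mean                        # => 5.0
puts data.median                      # => 4.5
puts data.standard_deviation.round(3) # => 2.138
```

Extending one object keeps the sketch free of global side effects. The gems listed above are generally the better choice for real work, since a naive implementation like this one makes several passes over the data and takes no special care with numerical stability.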
Comprehensive tools for Data Visualization.
- matplotlib - Ruby based wrapper around matplotlib. [dep: matplotlib]
- mathematical - PNG and MathML renderings for your equations.
- daru-plotly - Plotly based visualization for Daru.
- https://github.com/v0dro/benchmark-plot
- https://github.com/domitry/Nyaplotjs
- https://github.com/domitry/nyaplot
- https://github.com/SciRuby/gnuplotrb
- ruby-graphviz [dep: Graphviz]
- gnuplot [dep: gnuplot]
- https://github.com/zuhao/plotrb
- https://github.com/brasten/scruffy
- https://github.com/zverok/worldize
- https://github.com/masa16/ruby-mathgl
- numo-gnuplot - gnuplot interface for the Numo package.
- iruby - Ruby kernel for Jupyter.
- iruby-rails - Integration library for IRuby and Rails.
- https://github.com/fiksu/rcsv
- ox - speed-optimized XML parser and object marshaller.
- oj - High-speed JSON parser.
- Markdown
- Nokogiri
- CSV
- pg
- Mongo
- MySQL
- BibTeX
- inih - fast C based INI parser for Ruby.
- bolognese - conversion tool for citation formats like BibTeX, RIS, or Crossref XML.
- https://github.com/mrkn/gpu-instance
- https://github.com/mrkn/computing_node
- https://github.com/k1LoW/awspec
Please look at our extensive Awesome ML with Ruby list.
- 2017
- Progress of Ruby-Numo: Numerical Computing for Ruby by Masahiro Tanaka [slides]
- Chartkick: data visualization made easy with Ruby by Govind Unnikrishnan [post]
- Development of Data Science Ecosystem for Ruby by Kenta Murata [slides | video | page]
- 2016
- Scientific Computation and Data Visualization with Ruby by Sameer Deshmukh [slides | video]
- 2015
- 2014
- 2013
- Seeing the Big Picture: Quick and Dirty Data Visualization with Ruby by Aja Hammerly [video | slides | code]
- 2012
- 2011
- 2010
- NArray and scientific computing with Ruby by Masahiro Tanaka [video | slides]
- https://gitter.im/red-data-tools/en
- https://gitter.im/red-data-tools/ja
- http://ruby-data.org/
- https://twitter.com/RubyData
- https://discourse.ruby-data.org/
- ImageMagick
- GSL
- FFTE
- SymEngine
- Awesome Big Data - curated list covering all things Big Data.
- Awesome Spark - awesome list on Apache Spark goodies.
There are many software lists with tools related to Data Science, and a couple of lists with Ruby-related projects. There are no lists that contain only working, tested software with a documented scope. We'll try to make one!
What is awesome? Awesome tools are documented, maintained, and focused.
Can something stop being awesome at some point? Yes! Abandoned projects with broken dependencies aren't awesome anymore, and they leave this list.
Awesome Data Science with Ruby by Andrei Beliankou and Contributors.
To the extent possible under law, the person who associated CC0 with Awesome Data Science with Ruby has waived all copyright and related or neighboring rights to Awesome Data Science with Ruby.
You should have received a copy of the CC0 legalcode along with this work. If not, see https://creativecommons.org/publicdomain/zero/1.0/.