Distributed table structures and data manipulation operations built on top of Dagger.jl.
The package is registered in the General registry, so you can add it by typing:
julia> ]add DTables
Below is a quick example of how to get started with DTables.
There's a lot more you can do though, so please refer to the documentation!
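The session below expects Julia to have more than one thread or worker available. As a minimal sketch using only the standard library (the worker count of 2 is an arbitrary choice for illustration), you can check what the current session has and add workers like this:

```julia
using Distributed

# The thread count is fixed when Julia starts (for example `julia --threads=4`).
nthreads = Threads.nthreads()

# Worker processes can be added from inside a running session.
# Two workers is an arbitrary number chosen for this sketch.
addprocs(2)

println("threads: ", nthreads, ", workers: ", nworkers())
```

If you add workers this way, packages generally have to be loaded on each of them as well, which is usually done with `@everywhere using ...`.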
# launch a Julia session with multiple threads or workers, e.g. `julia --threads=4`
julia> using DTables
julia> dt = DTable((a=rand(100), b=rand(100)), 10)
DTable with 10 partitions
Tabletype: NamedTuple
julia> m = map(r -> (x=sum(r), id=Threads.threadid(),), dt)
DTable with 10 partitions
Tabletype: NamedTuple
julia> xsum = reduce((x, y) -> x + y, m, init=0, cols=[:x])
EagerThunk (running)
julia> threads_used = reduce((acc, el) -> union(acc, el), m, init=Set(), cols=[:id])
EagerThunk (running)
julia> fetch(xsum)
(x = 95.71209812014976,)
julia> fetch(threads_used)
(id = Set(Any[5, 4, 6, 13, 2, 10, 9, 12, 8, 3]),)
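To make the map and reduce steps above more concrete, here is a plain-Julia sketch of the same pattern with no DTables involved. It is not how DTables implements these operations; it only assumes that the table is split into partitions of 10 rows, that `map` runs row by row, and that the reduction combines one partial result per partition. The names `rows`, `partitions`, `mapped`, `partials`, and `total` are illustrative.

```julia
# Plain-Julia sketch of a partitioned map/reduce; this does not use DTables.
a = rand(100)
b = rand(100)
rows = [(a = a[i], b = b[i]) for i in eachindex(a)]

# Split the 100 rows into partitions of 10 rows each.
partitions = collect(Iterators.partition(rows, 10))

# Map: turn every row into a new row holding the sum of its fields.
mapped = [map(r -> (x = sum(r),), p) for p in partitions]

# Reduce: sum column `x` within each partition, then combine the partial sums.
partials = [mapreduce(r -> r.x, +, p) for p in mapped]
total = reduce(+, partials; init = 0.0)

println(length(partitions), " partitions, total = ", total)
```

Because each partition is mapped and reduced on its own, the per-partition work can be handed to different threads or workers, which is why `threads_used` in the session above contains several thread ids.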