Support Apache Arrow tables in database clients #320
Comments
Retitled this issue to describe the more generic problem: we want to support Apache Arrow tables as a tabular data representation throughout database clients, SQL cells, and data table cells.
apache/arrow#34939 adds an indexed access proxy for Arrow, but the performance isn't great compared to properly adopting Arrow. It would be great to have Arrow support throughout the different clients and cells.
Now that Arrow is used in a lot more places, I think it may be a good time to revisit this issue. The extra copies add overhead in many places, and I think it would be super awesome if we could just pass Arrow columns directly into Plot (observablehq/plot#191) without making extra copies.
FWIW, Framework’s DuckDBClient (as of 1.3) returns Apache Arrow tables without materializing array-of-objects. So there’s that.
Oh nice. I guess you can't just remove the toArray call here for backwards compatibility? How good is Arrow/columnar data support in Plot these days?
That’s correct, it wouldn’t be backwards-compatible, so I don’t think we are likely to change the behavior in Observable notebooks any time soon. (But eventually we’ll have a way to version control the Observable standard library, and port improvements from Observable Framework back to notebooks.) Plot uses columnar data internally, so I would rate support as excellent, but we don’t yet have the shorthand syntax, so it’s cumbersome to avoid materializing the array-of-objects: you have to pass the column vectors in yourself for each channel. observablehq/plot#191 covers making the syntax more convenient.
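To make the trade-off above concrete, here is a minimal sketch contrasting the two paths: materializing an array of objects versus handing each channel its column vector. It assumes a hypothetical stand-in for an Arrow table (a plain object of typed arrays); `columns`, `materializeRows`, and `channelsFromColumns` are illustrative names, not part of Apache Arrow, Plot, or the Observable standard library.

```javascript
// Hypothetical stand-in for a columnar table: one typed array per column.
// This is NOT the Apache Arrow API; it only mimics the columnar layout.
const columns = {
  x: Float64Array.of(1, 2, 3),
  y: Float64Array.of(10, 20, 30),
};
const numRows = columns.x.length;

// Row-oriented path: allocates one object per row, i.e. the extra copy
// discussed in this thread.
function materializeRows(cols, n) {
  const names = Object.keys(cols);
  return Array.from({ length: n }, (_, i) => {
    const row = {};
    for (const name of names) row[name] = cols[name][i];
    return row;
  });
}

// Columnar path: map each channel name to its column vector by reference.
// No per-row objects are created and no column data is copied.
function channelsFromColumns(cols, mapping) {
  const channels = {};
  for (const [channel, name] of Object.entries(mapping)) {
    channels[channel] = cols[name];
  }
  return channels;
}

const rows = materializeRows(columns, numRows);
const channels = channelsFromColumns(columns, { x: "x", y: "y" });
```

With a real Arrow table, the column vectors would presumably come from something like `table.getChild("x")`, and the resulting channels would be passed in a mark's options (for example, `Plot.dot({length: numRows}, channels)`). Treat both calls as assumptions to check against the Arrow and Plot documentation rather than as the approach this issue settles on.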
stdlib/src/duckdb.js
Line 58 in 6058924