Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Arrow Flight SQL frontend #478

Merged
merged 5 commits into from
Dec 13, 2023
Merged

Add Arrow Flight SQL frontend #478

merged 5 commits into from
Dec 13, 2023

Conversation

gruuya
Copy link
Contributor

@gruuya gruuya commented Dec 7, 2023

Closes #477.

This PR introduces an optional Arrow Flight SQL interface to Seafowl.

  1. Only ad-hoc queries supported for now
  2. No prepared statements, no transactions, no write statements
  3. No authentication

@gruuya gruuya requested a review from mildbyte December 7, 2023 11:33
@gruuya gruuya self-assigned this Dec 7, 2023
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking that a better test here would be to have a randomly generated batch/table with more field types, and more rows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we'd ideally want JDBC/ODBC integration tests as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, or reuse some of the datasets we have in the HTTP tests? IIRC there's one that tests JSON serialization for everything. Probably want to test some gnarlier datatypes like Parquet structs here too.

Comment on lines +69 to +70
// Use a new UUID to fingerprint a query
// TODO: Should we use something else here (and keep that in the results map)?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this how the caller references the query when asking us for a result? I think this should be a new random UUID every time (instead of being deterministic based on the original query), we won't care about caching at this level and I guess we also don't want to hold on to old query results in Seafowl for too long?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct, the client goes back with that "ticket" to collect the results.

Keeping it as UUID's then.

@@ -12,6 +12,7 @@ use clap::Parser;
use futures::{future::join_all, Future, FutureExt};

use pretty_env_logger::env_logger;
use seafowl::frontend::flight::run_flight_server;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be behind a cfg(feature = ...)?

src/frontend/flight/sql.rs Show resolved Hide resolved

let flight_info = FlightInfo::new()
.try_with_schema(&schema)
.expect("encoding schema")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this an expect instead of propagating the error up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh good catch, forgot to transform that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, or reuse some of the datasets we have in the HTTP tests? IIRC there's one that tests JSON serialization for everything. Probably want to test some gnarlier datatypes like Parquet structs here too.


// Try out the actual query
let cmd = CommandStatementQuery {
query: "SELECT * FROM flight_table".to_string(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd test some basic DDL here (CREATE TABLE / CREATE EXTERNAL TABLE maybe) and a slightly more complex SELECT. Maybe also test two queries getting executed at the same time, with interleaving? i.e. start Q1, start Q2, get results for Q2, get results for Q1.

@gruuya gruuya merged commit 91d95d9 into main Dec 13, 2023
1 check passed
@gruuya gruuya deleted the arrow-flight-frontend branch December 13, 2023 08:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Arrow Flight frontend
2 participants