Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tool for capturing query parameters as if they were string interpolations #213

Open
c42f opened this issue Jan 8, 2021 · 5 comments
Open

Comments

@c42f
Copy link
Contributor

c42f commented Jan 8, 2021

I've recently written a macro tool for conveniently capturing query parameters as part of a private project. I've found it handy, so I wondered whether people would be interested in having this as part of LibPQ. Or if not, whether you have suggestions for a better home for it.

The general idea is that we should be able to build up queries by pasting together fragments of SQL and interpolated parameters like one might do with strings or Cmd backtick interpolation. But with all interpolations turned into SQL parameters for safety and consistency in converting those to SQL types.

To just paste the readme in here:

The readme

The main thing provided here is the @query macro to allow queries to be
constructed by normal-looking string interpolation but without danger of SQL
injection attacks.

Note that @query does not parse or understand the SQL source text as this
would be a lot of work. Instead, it keeps any literal SQL text you write as-is
and only treats the Julia-level string interpolations specially.

Use runquery to execute queries generated by @query.

Simple usage

Creating a table and inserting some values

conn = LibPQ.connection(your_connection_string)

runquery(conn, @query "create table foo (email text, userid integer)")

for (email,id) in [ ("[email protected]", 1)
                    ("[email protected]",   2)]
    runquery(conn, @query "insert into foo values ($email, $id)")
end

Thence:

julia> runquery(conn, @query "select * from foo") |> DataFrame
2×2 DataFrame
 Row │ email              userid
     │ String?            Int32?
─────┼───────────────────────────
   1 │ [email protected]       1
   2 │ [email protected]         2

Howto: Inserting values from a Julia array into a row

In some circumstances it can be useful to use splatting syntax to interpolate a
Julia collection into a comma-separated list of values. Generally simple scalar
parameters should be preferred for simplicity, but splatting can be useful on
occasion:

email_and_id = ("[email protected]", 3)
runquery(conn, @query "insert into foo values ($(email_and_id...))")

Howto: Using the in operator with a Julia collection

There's two ways to do this. First, using in and splatting syntax

julia> ids = (1,2)
       runquery(conn, @query "select * from foo where userid in ($(ids...))") |> DataFrame
       2×2 DataFrame
        Row │ email              userid
            │ String?            Int32?
       ─────┼───────────────────────────
          1 │ admin@example.com       1
          2 │ foo@example.com         2

Second, using the SQL any operator and simply passing a single SQL array parameter:

julia> ids = [1,2]
       runquery(conn, @query "select * from foo where userid = any($ids)") |> DataFrame
       2×2 DataFrame
        Row │ email              userid
            │ String?            Int32?
       ─────┼───────────────────────────
          1 │ [email protected]       1
          2 │ [email protected]         2

Howto: Building up a query from fragments

conn = LibPQ.connection(your_connection_string)

some_condition = true

x = 100
x = 20
# Example of an optional clauses - use empty @query() to disable it.
and_clause = some_condition ? @query("and y=$y") : @query()

# Interpolation of values produces SQL parameters; interpolating @query
# fragments adds them to the query.
q = @query "select * from table where x=$x $and_clause"
runquery(conn, q)
@iamed2
Copy link
Collaborator

iamed2 commented Jan 8, 2021

Okay this is pretty great! And best of all it's optional so if someone's using a database which uses the PostgreSQL wire protocol but not the parameter syntax (i.e., $1, $2) they can just not use the macro.

I'm presuming this functions something like this:

julia> @query "insert into foo values ($(email_and_id...))"
Query("insert into foo values (\$1, \$2)", ("[email protected]", 3))

Would this make sense as a string macro?

@c42f
Copy link
Contributor Author

c42f commented Jan 11, 2021

I'm presuming this functions something like this

Very similar (though the interpolation syntax is constructed lazily — this allows multiple such strings to be pasted together dynamically, as in the "Buliding up a query from fragments" section. (It would also allow non-PostgreSQL parameter syntax to be supported for other connection types.)

julia> email_and_id = ("[email protected]", 3)
("[email protected]", 3)

julia> @query "insert into foo values ($(email_and_id...))"
insert into foo values ($1,$2)
  $1 = "[email protected]"
  $2 = 3

julia> dump(@query "insert into foo values ($(email_and_id...))")
JHubDbUtils.SafeQuery
  args: Array{Any}((5,))
    1: QueryFragment
      fragment: String "insert into foo values ("
    2: String "[email protected]"
    3: QueryFragment
      fragment: String ","
    4: Int64 3
    5: QueryFragment
      fragment: String ")"

Would this make sense as a string macro?

Yes I tried this, but I found it has a big downside: interpolations aren't syntax-highlighted. In practice I felt this quite outweighed the benefits of a string macro for visually parsing complex queries. (The benefits of a string macro being a separate namespace and no need for parentheses in larger expressions with trailing parts.)

@c42f
Copy link
Contributor Author

c42f commented Jan 11, 2021

By the way, I found it helpful to add a runquery function which was separate from LibPQ.execute() and only accepted SafeQuery in an effort to discourage use of string interpolation in constructing queries. I'm not sure how this would make sense if we had @query in LibPQ itself; maybe it's something application authors could decide to do separately from LibPQ.

By the way, regarding naming I'm not married to using @query; we could call this @sql or @psql, or something else if it made sense. Some people might find it annoying if this clashed with, eg, Query.@query though that's not an issue for me personally.

In the end, @query is really just a way of capturing interpolations for processing later. Actually it's somewhat similar to an old PR in Julia Base https://github.com/JuliaLang/julia/pull/33711/files#diff-281780f4de49ab611364d45135bc3d40067c69924b7548e5e73cfec655fb356a. So another option might be to start a new package — LazyStrings, or some such.

@iamed2
Copy link
Collaborator

iamed2 commented Jan 11, 2021

Yes I tried this, but I found it has a big downside: interpolations aren't syntax-highlighted. In practice I felt this quite outweighed the benefits of a string macro for visually parsing complex queries. (The benefits of a string macro being a separate namespace and no need for parentheses in larger expressions with trailing parts.)

If it was a string macro you could choose to use the @macro_str form if you wanted, whereas it doesn't work the other way around.

The laziness actually presents a nice argument for this being a generic separate package, since the same syntax could be used for many different database systems.

@c42f
Copy link
Contributor Author

c42f commented Feb 6, 2021

The laziness actually presents a nice argument for this being a generic separate package, since the same syntax could be used for many different database systems.

Yes that's a good point. I've taken a stab at refactoring this a bit and put the code into a new package SqlStrings.jl. I'll probably register a first version of this shortly:

https://github.com/JuliaComputing/SQLStrings.jl

If it was a string macro you could choose to use the @macro_str form if you wanted, whereas it doesn't work the other way around.

In SqlStrings, I've made it @sql_cmd because:

  • It's important to have syntax highlighting for interpolations, but editors typically disable this within normal string macros.
  • @sql_cmd is very conceptually similar to the builtin backticks and Base.Cmd: it's a lightweight layer which deals only with preserving the structure of tokens in the source text.

At the moment it requires a small amount of glue code to integrate with LibPQ, but it would be great to make that unnecessary.

I'm not 100% sure of the best way forward for integration. We could make SqlStrings a dependency of LibPQ and make LibPQ.execute optionally take an SqlString?

A longer term goal might be to make the publicly exported execute() not take a normal string which would heavily discourage people from using plain string interpolation to construct queries. But obviously that would be quite breaking. Thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants