Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compile time query support #801

Open
Timmmm opened this issue Sep 14, 2020 · 19 comments
Open

Compile time query support #801

Timmmm opened this issue Sep 14, 2020 · 19 comments

Comments

@Timmmm
Copy link
Contributor

Timmmm commented Sep 14, 2020

It would be cool if this crate supported some of the compile time query features that SQLx has.

@gwenn
Copy link
Collaborator

gwenn commented Sep 14, 2020

We should be able to validate query like sql!.
But due to dynamic typing, we may not be able to deduce the correct type...

@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 14, 2020

Syntax checking would be a great start. I would love it if Rusqlite could also deduce the correct type from the schema too (SQLite's "it's a feature" is a feeble excuse at best, downright idiotic at worst). But I'll settle for compile time syntax checking!

@thomcc
Copy link
Member

thomcc commented Oct 4, 2020

https://github.com/gwenn/lemon-rs/ contains a SQLite parser. I don't know how up to date it is though. @gwenn any opinions on integrating it vs going some other approach?

One thing I'll note is: sqlx expects you to have some sort of "live development database", which it uses to get type information. This is kind of unidiomatic/uncommon for SQLite in my experience. I'd love us to have some smarts around schema for this sort of thing. (Edit: I've heard that sqlx might not require this anymore but I don't know).

I agree that the dynamic typing is a concern, but I think as long as we don't force checking (and instead just offer a checked API) it wouldn't be a problem.

Anyway, beyond basic syntax checking, if we could verify the correct parameter/result names/counts that would be huge.

I do think this should all be optional though, since proc macros add quite a bit to compile time, and currently we don't use any.

@Timmmm
Copy link
Contributor Author

Timmmm commented Oct 5, 2020

Yes I think having a live database is rather overkill. I would rather specify the schema I expect in the source code. For example in SQLite you can do this:

SELECT 
  m.name as table_name,
  p.cid as column_id,
  p.name as column_name,
  p.type as column_type,
  p.`notnull` as column_notnull
FROM 
  sqlite_master AS m
JOIN 
  pragma_table_info(m.name) AS p
ORDER BY 
  m.name, 
  p.cid

Which gives you a table something like this:

image

How hard would it be to allow copy/pasting into a macro something like this?

define_schema!(myschema,
compilation_trace|0|id|INTEGER|0
compilation_trace|1|cat|TEXT|1
compilation_trace|2|dur|INTEGER|1
compilation_trace|3|name|TEXT|1
compilation_trace|4|ph|TEXT|1
compilation_trace|5|pid|INTEGER|1
compilation_trace|6|tid|INTEGER|1
compilation_trace|7|ts|INTEGER|1
compute_sets|0|compute_set|INTEGER|0
compute_sets|1|name|TEXT|1
);

let statement = sql!(myschema, "SELECT cat FROM compilation_trace");

@gwenn
Copy link
Collaborator

gwenn commented Oct 5, 2020

I am afraid that currently lemon-rs is not a good candidate mainly because it may panic while parsing but it should be up to date with last stable version (ie 3.33.0).
And pragma_table_info is not useful when you have expressions.
Finally, we don't need a database but only a SQL schema (like Jetbrains Database plugin) except for virtual table and pragma.

So we need:

@thomcc
Copy link
Member

thomcc commented Oct 5, 2020

lemon-rs is not a good candidate mainly because it may panic

This would be in a proc-macro crate I think, and so we could just catch the panic, assuming it just means invalid input.

(That said I still think this all would be a lot of work, and it only answers possibly the first part anyway)

@gwenn
Copy link
Collaborator

gwenn commented Oct 5, 2020

sqlx parses EXPLAIN output.

@Timmmm
Copy link
Contributor Author

Timmmm commented Oct 9, 2020

By the way, even if there isn't support for validating queries against a schema, just checking the SQL syntax would be good. I keep getting extra commas by writing queries like this:

query("SELECT
  foo,
  bar,
  baz
FROM table");

And because SQL doesn't support trailing commas if you delete baz it results in a runtime syntax error, which could presumably be checked at compile time with nothing other than a SQL parser.

@gwenn
Copy link
Collaborator

gwenn commented Nov 8, 2020

See https://github.com/gwenn/rusqlite/tree/rusqlite-macros/rusqlite-macros

Todo:

  • publish lemon-rs on crates.io
  • find a way to transform parse error location into Span
  • check SQL statements against database schema.
  • should rusqlite-macros be in a dedicated repository ?
  • ...

@thomcc
Copy link
Member

thomcc commented Nov 8, 2020

Wow!!! Haven't looked yet but that's awesome!

should rusqlite-macros be in a dedicated repository?

I don't think so, since it's possible we may want to change an API in rusqlite, which also impacts the macros.

@gwenn
Copy link
Collaborator

gwenn commented Nov 9, 2020

The (easy) next step would be to to validate the number of place holders / parameters / variables (and their optional names).
But this is possible only if they are specified inline.
And SQLite is fine when any parameter is not specified (NULL is used instead).
Are you ok if we raise an error only when we are sure (it doesn't seem possible to raise warning in stable, see proc_macro_diagnostic feature) ?

@gwenn
Copy link
Collaborator

gwenn commented Feb 2, 2021

Ok,
I am going to publish lemon-rs.
But I want to give the crate a different name because lemon is only the parser generator.
And I can't use sqlite-parser anymore.
Are you okay with sqlite3-parser or lemon-sql ?

@thomcc
Copy link
Member

thomcc commented Feb 3, 2021

https://crates.io/crates/sqlite_parser

Oof, missed by just a bit.

Are you okay with sqlite3-parser or lemon-sql ?

sqlite3-parser seems clearer to me.

That said it's totally your call. I agree that lemon-rs would be a confusing name, though.

@gwenn
Copy link
Collaborator

gwenn commented Feb 14, 2021

@gwenn
Copy link
Collaborator

gwenn commented Apr 2, 2021

Just for reference: https://github.com/google/cpp-from-the-sky-down/tree/master/rust (full SQLite parser ?)

@Cloudef
Copy link

Cloudef commented Dec 11, 2023

Here is my build.rs which sanity checks all queries for syntax errors.

use anyhow::Result;

fn sanity_check_sql_queries() -> Result<()> {
    use syn::visit::{self, Visit};

    struct Visitor {
        db: rusqlite::Connection,
        in_sql_call: bool,
        errors: Vec<anyhow::Error>,
        queries: u64,
    }

    impl<'ast> Visit<'ast> for Visitor {
        fn visit_lit_str(&mut self, node: &'ast syn::LitStr) {
            if self.in_sql_call {
                self.queries += 1;
                let query = node.token().to_string();
                self.db.execute_batch(&format!("EXPLAIN QUERY PLAN {}", &query[1..query.len()-1])).unwrap_or_else(|err| {
                    match err {
                        rusqlite::Error::ExecuteReturnedResults => {},
                        _ => {
                            self.queries -= 1;
                            self.errors.push(anyhow::anyhow!(err))
                        },
                    }
                });
            }
            visit::visit_lit_str(self, node);
        }

        fn visit_expr_method_call(&mut self, node: &'ast syn::ExprMethodCall) {
            match node.method.to_string().as_str() {
                "execute" | "execute_batch" |
                "prepare" | "prepare_cached" |
                "query_row" => {
                    self.in_sql_call = true;
                    for arg in &node.args {
                        visit::visit_expr(self, &arg);
                        break; // only care about first arg
                    }
                    self.in_sql_call = false;
                },
                _ => {},
            }
            visit::visit_expr_method_call(self, node);
        }
    }

    let mut visitor = Visitor {
        db: rusqlite::Connection::open_in_memory()?,
        in_sql_call: false,
        errors: Vec::new(),
        queries: 0,
    };

    visitor.db.execute_batch(&String::from_utf8_lossy(include_bytes!("src/db.sql")))?;

    for entry_ in walkdir::WalkDir::new("src") {
        if let Ok(entry) = &entry_  {
            if entry.path().file_name().unwrap().to_str().unwrap().ends_with(".rs") {
                let contents = std::fs::read_to_string(entry.path())?;
                let syntax_tree = syn::parse_file(&contents)?;
                let last_queries = visitor.queries;
                visitor.visit_file(&syntax_tree);
                if visitor.queries != last_queries {
                    println!("cargo:warning=Found {} SQLite queries in {}", visitor.queries - last_queries, entry.path().display());
                    println!("cargo:rerun-if-changed={}", entry.path().display());
                }
            }
        }
    }

    println!("cargo:warning=Checked {} SQLite queries from which {} failed", visitor.queries, visitor.errors.len());

    if !visitor.errors.is_empty() {
        let mut merged: String = "".to_string();
        for err in visitor.errors {
            merged += &format!("{err:?}");
        }
        anyhow::bail!(merged);
    }

    Ok(())
}

fn main() -> Result<()> {
    sanity_check_sql_queries()?;
    let output = std::process::Command::new("git").args(&["rev-parse", "HEAD"]).output()?;
    let git_hash = String::from_utf8(output.stdout)?;
    println!("cargo:rustc-env=GIT_HASH={}", git_hash);
    pkg_config::Config::new().probe("donutdb")?;
    println!("cargo:rerun-if-changed=build.rs");
    Ok(())
}

@gwenn
Copy link
Collaborator

gwenn commented Dec 11, 2023

@Cloudef see #1346 for syntax error (but not against a DB schema) and placeholders names / count.

@Cloudef
Copy link

Cloudef commented Dec 12, 2023

@gwenn nice, if it was possible for it to also do EXPLAIN QUERY PLAN {} on a memory db which you can fill with schema as I do above with execute_batch(&String::from_utf8_lossy(include_bytes!("src/db.sql")))?; then that would also check typos in table names etc.

@gwenn
Copy link
Collaborator

gwenn commented Dec 12, 2023

As far as I know, EXPLAIN QUERY PLAN doesn't check placeholders.
But indeed EXPLAIN QUERY PLAN checks query against DB schema.

@gwenn gwenn pinned this issue Mar 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants