--- title: "Introduction to sqlq package" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Introduction to sqlq package} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` *sqlq* helps writing SQL queries, using factory functions or an explicit syntax tree. With *sqlq*, identifiers and strings are safely quoted when needed. Complex conditional expressions can be created for WHERE clauses, and they can be built dynamically. The original motivation for this package was to help building complex WHERE clauses, especially when the conditions are not known in advance. Here is a set of conditional parameters that may be used to construct a WHERE clause: ```{r} author <- NULL year_range <- list(min=2010, max=2020) price_ranges <- list(list(min=5, max=9), list(min=20,max=29), list(min=50,max=100)) ``` In the following example the WHERE clause is built dynamically, not knowing the values of the different conditions in advance (the list price ranges may be empty or contain 1, 10 or more elements, the author name may be provided or not, etc.): ```{r} library(sqlq) expr <- ExprCommOp$new("and") if (! is.null(author)) expr$add(ExprBinOp$new(ExprField$new('author')), '=', ExprValue$new(author)) if (! is.null(year_range)) expr$add(make_between('year', year_range$min, year_range$max)) if (length(price_ranges) > 0) { or <- ExprCommOp$new("or") for (rng in price_ranges) or$add(make_between('price', rng$min, rng$max)) expr$add(or) } where <- make_where(expr) ``` The final WHERE expression constructed this way is then used to build a SELECT query: ```{r} make_select_all("books", where = where)$toString() ``` Another motivation was to automatically generate the correct quoting for identifiers and strings, depending on the database management system (DBMS) used, but avoid quoting when unnecessary. Identifiers are only quoted when they contain special characters (spaces, punctuation, etc.) or when they are reserved keywords. ## SELECT query Using the factory function `make_select_all()`, we can create a simple `select *` query on a table: ```{r} make_select_all("books")$toString() ``` We may add the DISTINCT keyword to remove duplicates: ```{r} make_select_all("books", distinct = TRUE)$toString() ``` ## Selecting specific fields Using the `make_select()` factory function, we choose the fields we want to retrieve: ```{r} make_select("books", fields = c("title", "author"))$toString() ``` ## LIMIT keyword We may add the `LIMIT` keyword by specifying the limit inside the `make_select*()` functions: ```{r} make_select_all("books", limit = 16)$toString() ``` ## WHERE clause The `make_select*()` functions accept also a `WHERE` clause: ```{r} where <- StmtWhere$new(ExprBinOp$new( ExprField$new("year"), ">", ExprValue$new(2015) )) make_select_all("books", where = where)$toString() ``` ## Complex where clause Expression objects can be combined to form more complex expressions. In the following example we want to retrieve all books whose author's name starts with an "A" and whose publication year is after 2015. We first define the expressions to check author en year: ```{r} year_cond <- ExprBinOp$new(ExprField$new("year"), ">", ExprValue$new(2015)) author_cond <- ExprBinOp$new( ExprField$new("author"), "like", ExprValue$new("A%") ) ``` Then we build the WHERE statement: ```{r} where <- StmtWhere$new(ExprBinOp$new(year_cond, "and", author_cond)) ``` Finally we construct the SELECT query: ```{r} make_select_all("books", where = where)$toString() ``` ## Dynamic where clause Arbitrarily complex expressions can be built using the various `Expr*` classes and used inside a WHERE clause. In this example we use a vector of patterns to build a condition that tests a field on the different patterns. From the following list of patterns: ```{r} patterns <- c("A%", "D%", "Z%") ``` And the author field: ```{r} author <- ExprField$new("author") ``` We build the list of expressions that checks all the patterns: ```{r} expressions <- lapply( patterns, function(x) { ExprBinOp$new( author, "like", ExprValue$new(x) ) } ) ``` Then we link all these expressions with OR operators: ```{r} cond <- ExprCommOp$new("or", expressions) ``` Finally, we build the SELECT query: ```{r} make_select_all("books", where = StmtWhere$new(cond))$toString() ``` ## JOIN statement In this example, we make a join between tables *books* and *authors*. First we define the fields we want to retrieve, from table *books*: ```{r} fields <- list(ExprField$new("title"), ExprField$new("name", "authors")) ``` Then we define the join statement, that operates on *authors.id* and *books.author_id*, using the `make_join()` function: ```{r} join <- make_join("id", "authors", "author_id", "books") ``` Finally we create the SELECT query: ```{r} make_select(tabl = "books", fields = fields, join = join)$toString() ``` ### Setting the type of JOIN By default, the `make_join()` function creates an `INNER JOIN`. If you want to use a different type of join, you can specify it using the `type` argument: ```{r} make_join("id", "authors", "author_id", "books", type = "LEFT")$toString() ``` ### Join on multiple tables You can also join multiple tables by using the `add()` method of the `SelectQuery` class, which is returned by the `make_select_all()` or `make_select()` functions: ```{r} x <- make_select_all(tabl = "books") x$add(make_join("book_id", "bookcat", "id", "books")) x$add(make_join("id", "categories", "cat_id", "bookcat")) x$toString() ``` ## INSERT INTO query To generate an `INSERT INTO` query, use the `make_insert()`: ```{r} values <- list(list('John Smith', 'Memories', 1999), list('Barbara', 'My Life', 2010)) make_insert(tabl = 'books', fields = c('author', 'title', 'year'), values = values)$toString() ``` ## CREATE TABLE query To generate a `CREATE TABLE` query, use the `make_create_table()` function. We must first define the fields/columns of the table: ```{r} fields_def <- list(ExprFieldDef$new('id', 'integer', primary=TRUE), ExprFieldDef$new('title', 'varchar(200)', nullable=FALSE), ExprFieldDef$new('author', 'varchar(80)', nullable=FALSE)) ``` Then we can call the `make_create_table()` function: ```{r} make_create_table(tabl = 'books', fields_def = fields_def)$toString() ``` ## DELETE FROM query To generate a `DELETE FROM` query, use the `make_delete()` function. We first define the WHERE clause: ```{r} where <- StmtWhere$new(ExprBinOp$new( ExprField$new("year"), "<", ExprValue$new(2015) )) ``` Then we call the `make_delete()` function: ```{r} make_delete(tabl = "books", where = where)$toString() ``` ## UPDATE query To generate a `UPDATE` query, use the `make_update()` function, along with the `make_set()` function. We first define the WHERE clause: ```{r} where <- StmtWhere$new(ExprBinOp$new( ExprField$new("year"), "<", ExprValue$new(2010) )) ``` Then we create the `SET` statement using the `make_set()` function: ```{r} set <- make_set(price = 9.50, old = TRUE) ``` Finally we create the query using the `make_update()` function: ```{r} make_update('books', set = set, where = where)$toString() ``` ## Noticeable expressions ### Binary operators Binary operators are used to compare two expressions, such as fields or values. The `ExprBinOp` class is used to create such expressions. It accepts two operands and an operator as arguments. The operands can be `ExprField`, `ExprValue`, or other `Expr` instances. ```{r} comp1 <- ExprBinOp$new(ExprField$new("year"), ">", ExprValue$new(1994)) comp2 <- ExprBinOp$new(ExprField$new("author"), "=", ExprValue$new("John Doe")) where <- StmtWhere$new(ExprBinOp$new(comp1, "or", comp2)) make_select_all("books", distinct = TRUE, where = where)$toString() ``` ### Commutative operators Commutative operators are used to combine multiple expressions with the same operator, such as `AND` or `OR`. The `ExprCommOp` class is used to create such expressions. It accepts an operator and a list of expressions as arguments. ```{r} or <- ExprCommOp$new("or", list( ExprBinOp$new(ExprField$new("year"), ">", ExprValue$new(1994)), ExprBinOp$new(ExprField$new("author"), "=", ExprValue$new("John Doe")), ExprBinOp$new(ExprField$new("title"), "like", ExprValue$new("A%")) )) where <- StmtWhere$new(or) make_select_all("books", distinct = TRUE, where = where)$toString() ``` ### BETWEEN Using the `make_between()` function, we can create a `BETWEEN` expression: ```{r} make_between('i', 1, 10)$toString() ``` `make_between()` accepts both atomic values and `ExprValue` instances for low and high limits. ### IS NULL and IS NOT NULL To test if a field is NULL use the `ExprIsNull` class: ```{r} StmtWhere$new(ExprIsNull$new(ExprField$new("name")))$toString() ``` To test if a field is NULL use the `ExprIsNotNull` class: ```{r} StmtWhere$new(ExprIsNotNull$new(ExprField$new("name")))$toString() ``` ## Style options ### Using proper quoting for a specific database In order to generate correct SQL requests for a DBMS, it is recommended to inform the *sqlq* package about the connector we are using. This is done with the `sqlq_conn` global option. First we create the database connector (here a connector to an In-Memory instance of SQLite DB): ```{r} mydb <- DBI::dbConnect(RSQLite::SQLite(), ":memory:") ``` Then we declare the connector to *sqlq*: ```{r} options(sqlq_conn = mydb) ``` Finally we build the request: ```{r} fields <- c("The Title", "author") where <- StmtWhere$new(ExprBinOp$new(ExprField$new("author"), "=", ExprValue$new("John Doe"))) make_select("books", fields = fields, where = where)$toString() ``` We can see that now backticks (\`) are used instead of regular quotes ("). This is indeed the official mean of quoting identifiers in SQLite. This is also the case for MariaDB. ```{r, echo=FALSE, results=FALSE} DBI::dbDisconnect(mydb) ``` ### Keywords uppercase/lowercase By default keywords and alphabetical operators (OR, AND ,etc.) are written uppercase. You can force lowercase by setting the global option `sqlq_uppercase` to `FALSE`: ```{r} options(sqlq_uppercase = FALSE) comp1 <- ExprBinOp$new(ExprField$new("year"), ">", ExprValue$new(1994)) comp2 <- ExprBinOp$new(ExprField$new("author"), "=", ExprValue$new("John Doe")) where <- StmtWhere$new(ExprBinOp$new(comp1, "or", comp2)) make_select_all("books", distinct = TRUE, where = where)$toString() ``` ### Spaces Unnecessary spaces may be removed by setting the `sqlq_spaces` global option to `FALSE`. Without setting the option, spaces are set around non-alphabetical operators (here `>`): ```{r} where <- StmtWhere$new(ExprBinOp$new( ExprField$new("year"), ">", ExprValue$new(1994) )) make_select_all("books", distinct = TRUE, where = where)$toString() ``` When setting the option to `FALSE`, such space characters are removed: ```{r} options(sqlq_spaces = FALSE) where <- StmtWhere$new(ExprBinOp$new( ExprField$new("year"), ">", ExprValue$new(1994) )) make_select_all("books", distinct = TRUE, where = where)$toString() ```