Instant Gratification
Learning Goal
We'll start by quickly running through the following DB operations, which should give you some "instant gratification" (as the title says!) However, do not start writing apps with Opaleye immediately after reading this one chapter. As they say, a little knowledge is a dangerous thing! You are strongly encouraged to read all the chapters in this tutorial before using Opaleye in any serious project.
- Connecting to the Postgres DB
- Selecting multiple rows
- Selecting a single row - and getting a sneak-peek into the Opaleye query DSL
- Selecting a row
- Inserting a row
- Updating a row
Danger
This tutorial is for opaleye-0.7 onwards. Opaleye went through a lot of breaking changes in 0.6.7 and then again in 0.7. If you're going through this tutorial, make sure you're using the correct version of Opaleye, else none of the examples will compile.
Setting up the Postgres schema
Opaleye works only with Postgres. While there is an opaleye-sqlite package, it is rather unmaintained. So, the first order of business is to get our sample Postgres DB in place for this tutorial.
First, let's create the
opaleye_tutorial
user + database, which we'll use throughout this tutorial.-- Run the following SQL snippet as a DB superuser create role opaleye_tutorial with createdb login password 'opaleye_tutorial'; create database opaleye_tutorial with owner=opaleye_tutorial;
Next, connect to the newly created
opaleye_tutorial
DB using theopaleye_tutorial
user, and create theusers
table, with a few rows:-- run this script using the opaleye_tutorial DB user create table users id serial primary key ( not null , name text not null , email text date , dob ); insert into users(name, email) values ('John', 'john@mail.com'); insert into users(name, email) values ('Bob', 'bob@mail.com', '2001-01-01'); insert into users(name, email) values ('Alice', 'alice@mail.com', '2002-02-02');
The Complete Code
Here's the complete code in a single glance. The rest of this chapter walks through this code and introduces various Opaleye concepts in a simplified manner.
{-# LANGUAGE Arrows #-}
{-# LANGUAGE FlexibleContexts #-}
module InstantGratification where
import Opaleye as O
import Database.PostgreSQL.Simple (Connection, connect, ConnectInfo(..))
import Data.Profunctor.Product (p4)
import Control.Arrow (returnA)
import Data.Time (Day)
import Control.Monad (void)
getDbConn :: IO Connection
= connect ConnectInfo
getDbConn = "localhost"
{ connectHost = 5432
, connectPort = "opaleye_tutorial"
, connectDatabase = "opaleye_tutorial"
, connectUser = "opaleye_tutorial"
, connectPassword
}
selectAllUsers :: Connection -> IO [(Int, String, String, Maybe Day)]
=
selectAllUsers conn $ -- execute a SELECT query
O.runSelect conn -- the actual SELECT query to execute
O.selectTable userTable
selectByEmail :: Connection -> String -> IO [(Int, String, String, Maybe Day)]
= runSelect conn $ proc () -> do
selectByEmail conn email @(_, em, _, _) <- selectTable userTable -< ()
row-< (em .== toFields email)
restrict -< row
returnA
insertUser :: Connection -> (Int, String, String, Maybe Day) -> IO ()
=
insertUser conn row $ runInsert_ conn ins
void where
= Insert
ins = userTable
{ iTable = [toFields row]
, iRows = rCount
, iReturning = Nothing
, iOnConflict
}
updateUser :: Connection -> (Int, String, String, Maybe Day) -> IO ()
=
updateUser conn (i, n, e, d) $ runUpdate_ conn u
void where
= Update
u = userTable
{ uTable = (\(iDb, _, _, _) -> (iDb, toFields n, toFields e, toFields d))
, uUpdateWith = (\(iDb, _, _, _) -> iDb .== toFields i)
, uWhere = rCount
, uReturning }
Now, let's walk through what this code is doing...
Haskell code preliminaries
Usually, the module
definition and import
s warrant no comment, but in the case of Opaleye they do! Take special note of the Arrows
language extension. Without enabling Arrows
, you will be unable to write SELECT
queries in Opaleye's special (type-safe) DSL. Similarly, take note of the Control.Arrow
module and the returnA
function (among a few other) that will be used whenver you want to write SELECT
queries in Opaleye.
{-# LANGUAGE Arrows #-}
{-# LANGUAGE FlexibleContexts #-}
module InstantGratification where
import Opaleye as O
import Database.PostgreSQL.Simple (Connection, connect, ConnectInfo(..))
import Data.Profunctor.Product (p4)
import Control.Arrow (returnA)
import Data.Time (Day)
import Control.Monad (void)
Connecting to the Postgres DB
There isn't anything specific to Opaleye while connecting to the Postgres DB. This is standard usage of the postgeresql-simple
library.
Opaleye does not deal with managing connections to Postgres - it is left up to you. Every function in Opaleye that interacts with the Postgres database, like runSelect
, runInsert
, runUpdate
, etc, expects a Connection
(from the postgresql-simple
package) as one of the arguments.
Tip
Link-off to connection management
- As a result of the first point, there is no overarching custom monad in
Opaleye
. All functions that interact with the Postgres database are directly in standardIO
monad. - As another result of the first point, in all probability you will end-up writing wrappers for the core Opaleye function, because you will need to extract a
Connection
from aPool Connection
to be able to call the Opaleye functions (basically implement database connection pooling on your own).
getDbConn :: IO Connection
= connect ConnectInfo
getDbConn = "localhost"
{ connectHost = 5432
, connectPort = "opaleye_tutorial"
, connectDatabase = "opaleye_tutorial"
, connectUser = "opaleye_tutorial"
, connectPassword }
Teaching Opaleye your table structure
If you're coming from the Rails world (as I was), this will be especially surprising and frustrating for you. Unlike ActiveRecord (and a few other ORMs / database-libraries), Opaleye does not automagically figure-out your DB schema and give you a Haskell interface to query/manipulate your DB tables. Instead, you have to painstakingly list out the types + names of each column in each table of your Postgres schema, to make Opaleye aware of them.
Tip
TODO: Link-off to chapter describing autogeneration of this boilerplate
Let's do this for our users
table, to give us userTable
, which will then act as our type-safe interface to it (to issue SQL statements).
userTable :: Table (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable SqlDate)
Field SqlInt4, Field SqlText, Field SqlText, FieldNullable SqlDate)
(= Table "users" (p4 ( tableField "id"
userTable "name"
, tableField "email"
, tableField "dob")) , tableField
Here are the four things this function is telling Opaleye:
First: we will be using a 4-tuple to represent the table's rows in Haskell. The same Haskell type will be used to read and write rows to the table, hence the type (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable SqlDate)
is repeated twice in the userTable :: Table ...
annotation, first for the the type used to write rows, and next for the type used to read rows.
Field x
is the type that Opaleye uses to represent Postgres columns in Haskell-land. Sointeger
columns becomeField SqlInt4
;varchar
columns becomeField SqlText
; and so on.- Similarly,
FieldNullable x
is the type used by Opaleye to represent nullable Postgres columns in Haskell. This is the reason thedob
column corresponds toFieldNullable SqlDate
instead ofField SqlDate
Tip
Opaleye allows you to read and write rows from Haskell data-structures. It is possible to have separate data-structures for reading vs writing for very valid reasons. Read Basic DB mappings for more details on this.
Second: the table's name (i.e. users
), names of the four columns in the table (i.e. id, name, email, dob
), and how each column corresponds to the types we will be using to represent the table's row in Haskell. That is, id
will correspond to the first value in the 4-tuple; name
will correspond to the second value in the 4-tuple; and so on.
Tip
It is not necessary that a table's rows be represented only as tuples. You can also represent them as Haskell records, and this is discussed in detail in Advanced DB Mappings
Third: Whether each column is nullable, or NOT NULL
, and consequently, whether it can be omitted during write operations, or not. In this case, the first three columns may NOT be omitted during INSERT
operations (which usually, not always, is the result of having NOT NULL
constraints on columns), whereas the fourth column may be omitted. In this particular code snippet, this information is being inferred by Opaleye on the basis of the types corresponding to each DB field, i.e. Field x
vs FieldNullable x
. But there are advanced scenarios possible discussed in Advanced DB Mappings
Tip
To learn more about mapping different types of DB schemas to Opaleye's Table
types, please read Basic DB mappings and Advanced DB Mappings chapters.
Fourth: there is a strange p4
function that is telling something very important to Opaleye. It is related to how Opaleye is able to map DB rows to tuples, records, or any other data-structure for that matter. Unfortunately, getting into these details is too much for an introductory chapter, so we'll skim this for now. You can read more about this in Advanced DB Mappings. For now it will suffice to know about the existence of p1
through p62
which can be used depending upon the number of columns in your table.
Selecting all rows
Now that we have taught Opaleye about the users
table, we will write a function that can read all rows from this table and return them as a list of Haskell 4-tuples.
-- If you have loaded this tutorial in your GHCi session, you can
-- run this function via:
--
-- getDbConn >>= selectAllUsers
--
selectAllUsers :: Connection -> IO [(Int, String, String, Maybe Day)]
=
selectAllUsers conn $ -- execute a SELECT query
O.runSelect conn -- the actual SELECT query to execute O.selectTable userTable
This uses runSelect
, which is used to execute a SELECT
query in Opaleye. The actual select query being executed is selectTable userTable
. This is the simplest query possible, which essentially corresponds to a SELECT * FROM users
in SQL. For a slightly more complicated SQL query take a look at Searching by email (and introducing the Opaleye DSL).
Note about type signatures
Please take special note of the type signature of this function. It returns the rows as IO [(Int, String, String, Maybe Day)]
, whereas we clearly told Opaleye that we will be reading rows of type (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable SqlDate)
. Shouldn't this function return tuples of this type instead?
For a detailed explanation of what's going on, please read Basic DB mappings, but here's a super-simplified explanation: there is no advantage in returning the final values as Field SqlInt4
, or Field SqlText
, because your code will not be able to process them using any standard Haskell library. So, Opaleye knows how to convert most data types from DB => Haskell (eg. Field SqlInt4
=> Int
, Field SqlText
=> Text
). And vice versa.
Danger
following para required?
However, here's a gotcha! Try compiling ths function without the type signature. The compiler will fail to infer the types. This is also due to the underlying infrastructure that Opaleye uses to convert DB => Haskell types. To understand this further, please read Advanced DB Mappings.
Searching by email (and introducing the Opaleye DSL)
Let's make our SELECT
query a little more interesting by using a bit of the Arrow-based DSL that Opaleye provides. The DSL is what (a) provides a type-safe version of SQL, (b) makes it hard (if not impossible) to write invalid SQL, and (c) also makes SQL easy to refactor/maintain (especially as your table structure changes with time).
Tip
The DSL (and the way it translates to actual SQL) is the defining feature of Opaleye. You can read more about it at Opaleye Query Tutorial
selectByEmail :: Connection -> String -> IO [(Int, String, String, Maybe Day)]
= O.runSelect conn $ proc () -> do
selectByEmail conn email @(_, em, _, _) <- O.selectTable userTable -< ()
row-< (em .== O.toFields email)
restrict -< row returnA
Now the fun begins! The proc
keyword, and its related operators, such as -<
, work only when you enable the Arrows
language extension. Please note, proc
is a keyword, not a function, which means you will not find it any package, or Hackage docs, just like you won't find case
or if-then-else
in any package.
Everything after proc () -> do
is, loosely speaking. not regular Haskell code. It is Haskell for sure, since GHC type-checks and compiles it, but conceptually it's a representation of SQL in Haskell-land. Keep this in mind for future - in case you struggle to get some Haskell code to type-check in a proc
block, especially something in the IO
monad. You are essentially constructing an SQL query-string using extremely type-safe and special-purpose operators/functions.
Here's how that DSL corresponds to the underlying SQL query:
SELECT id, name, email, dob -- returnA -< row
FROM users -- selectTable userTable
WHERE email = "whatever@example.com" -- restrict -< (em .== toFields email)
The selectTable userTable
expression makes DB rows available to us using the same 4-tuple type that we ourselves configured in Teaching Opaleye your table structure. This means that row
is of type (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable Day)
.
We pattern-match on these 4-tuples and capture the entire tuple in row
and the tuple's second value in em
, and use this for further processing.
Using the restrict
function, we select only those rows which match the condition - em .== toFields email
. This raises two questions: (a) What is the .== operator, and (b) What is the toFields function, both of which are explain in their respective sub-sections.
Finally returnA
does some magic to return the entire row back to Haskell-land. Notice, that we don't have to do a DB => Haskell conversion here, because runSelect
does that conversion automagically.
What is the .==
operator
The restrict
expression gets directly converted to the WHERE ...
clause in SQL. Therefore, it needs to be an expression that takes a Postgres row-type and returns a Postgres boolean (as opposed to a regular Haskell boolean). This is why we cannot use the standard Haskell equality operator ==
, since it represents equality in Haskell-land and evaluates to a Bool
. Instead we use .==
, which represents equality in Postgres-land and evaluates to a Field SqlBool
. You will come across a lot of such special operators that make sense only when used on Postgres values and will be finally evaluated/executed by Postgres, instead of Haskell.
So, if we add explicit type-annotations for the purpose of explaining what's going on, here's what .==
is comparing...
em :: Field SqlText) .== ((toFields (email :: String)) :: Field SqlText) (
...which brings us to the String -> Field SqlText
conversion being done by toFields
.
What is the toFields
function
So, what does toFields em
do? It converts a value from Haskell-land to its suitable SQL representation. In case of strings that would mean single-quoting the string (as opposed to double-quoting in Haskell), and dealing with single quotes within the string itself. In case of dates and timestamps, it would mean converting the Haskell value to a string representation that Postgres understands. Similarly, for lists it would convert to a string representation that Postgres understands. A few examples:
Haskell value | Haskell type | Postgres value | Postgres type | Notes |
---|---|---|---|---|
"Hello world" |
String |
'Hello world' |
Field SqlText |
Notice the difference in single vs double quotes |
"Where's Waldo?" |
String |
'Where''s Waldo?' |
Field SqlText |
Notice the sanitisation of the single-quote within the string |
fromGregorian 2020 1 1 |
Day |
'2020-01-01' |
Field SqlDate |
Date format natively understood by Postgres |
[1, 2, 3] |
[Int] |
'{1,2,3}' |
Field (SqlArray Int) |
Array format natively understood by Postgres |
Danger
Remember, toField
converts from Day -> Field SqlDate
, not Day -> String
(as an example). This means you will not have access to the raw SQL string that a Haskell value is converted to. This is to ensure you don't interchange native-Haskell and native-SQL representations of values.
Inserting a row
insertUser :: Connection -> (Int, String, String, Maybe Day) -> IO ()
=
insertUser conn row $ O.runInsert_ conn ins
void where
= O.Insert
ins = userTable
{ iTable = [O.toFields row]
, iRows = O.rCount
, iReturning = Nothing
, iOnConflict }
Opaleye's API tries to mimic the underlying SQL as much as possible (while providing a strong cover of type-safety over it):
- This is why the
runInsert_
function always accepts a list of rows to insert into a table (and not a single row) via theiRows
argument. (This is howINSERT
in Postgres actually works). - Further, depending upon how you configure the
iReturning
argument it either returns the count of rows inserted (rCount
), or an SQL expression evaluated over the newly inserted rows (rReturning
). Again, if you userRetuning
it will always return a list of rows, because that's how Postgres returns them.
Our INSERT
statement is configured to return COUNT (*)
over the inserted rows, (i.e. the number of rows inserted), but we aren't really interested in it. This the reason we've stuck a void
before runInsert_
.
Another note about the toFields
function
Take note of iRows = [toFields row]
in the code above.
In What is the toFields function we saw toFields
converting a String -> Field SqlText
. However, here we are using toFields
to convert the entire row from Haskell => DB. That is, in this case it is doing the following conversion:
Int, String, String, Maybe Day) -> (Field SqlInt, Field SqlText, Field SqlText, FieldNullable SqlDate) (
Couple this with the fact that, in Teaching Opaleye your table structure we said that Opaleye can work with any arbitrary type to represent rows. You might wonder how toFields
knows to "navigate" the structure of any arbitrary Haskell type? This is where the stange p1
through p62
(and related) functions come in. We conventiently glossed over them in Teaching Opaleye your table structure, but you can find more details at Advanced DB Mappings.
Updating a row
updateUser :: Connection -> (Int, String, String, Maybe Day) -> IO ()
=
updateUser conn (i, n, e, d) $ O.runUpdate_ conn u
void where
= O.Update
u = userTable
{ uTable = (\(iDb, _, _, _) -> (iDb, O.toFields n, O.toFields e, O.toFields d))
, uUpdateWith = (\(iDb, _, _, _) -> iDb .== O.toFields i)
, uWhere = O.rCount
, uReturning }
Again, Opaleye tries to correspond as closely to the underlying SQL as possible:
UPDATE users
SET ... -- corresponds to uUpdateWith
WHERE ... -- corresponds to uWhere
RETURNING ... -- corresponds to uReturning
uUpdateWith
expects a function that works with a Postgres row-type, not a Haskell row-type. This is because it will be directly converted to the SET ...
clause (of the UPDATE
query). Hence, it will be dealing with native Postgres values, not Haskell values. Further, the function needs to convert values from the read-type to the write-type. Fortunately, since we've setup the row-types for reading and writing to be the same (ref: Teaching Opaleye your table structure), this doesn't pose a problem for us. The the type of our uUpdateWith
function needs to be:
uUpdateWith :: (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable Date) -- row-type for reads -> (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable Date) -- row-type for writes
Can you now guess why we used toFields
to convert n
, e
, and d
in the code snippet above, but left iDb
as-is?
Danger
TODO - collapsible answer
Similarly, the uWhere
function gets directly converted to the WHERE ...
clause. It needs to be a function that takes a Postgres row-type and returns a Postgres boolean (as opposed to a regular Haskell boolean). This is why we use .==
as opposed to ==
(explained in What is the .== operator)
uWhere :: (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable Date) -> Field SqlBool
Tip
As you can see from the updateUser
function, updating rows in Opaleye is not very pretty! The biggest pain is that you cannot specify/update only a few columns in a row -- you are forced to provide the new value of every single column in a row, even if it doesn't need to change. An example is the iDb
column (the table's primary-key) in our example above. We do not need to change it, but the uUpdateWith
function still needs to put in the 4-tuple it returns. More about this in Updating rows.