Instant Gratification

Learning Goal

We'll start by quickly running through the following DB operations, which should give you some "instant gratification" (as the title says!) However, do not start writing apps with Opaleye immediately after reading this one chapter. As they say, a little knowledge is a dangerous thing! You are strongly encouraged to read all the chapters in this tutorial before using Opaleye in any serious project.

Connecting to the Postgres DB
Selecting multiple rows
Selecting a single row - and getting a sneak-peek into the Opaleye query DSL
Selecting a row
Inserting a row
Updating a row

Danger

This tutorial is for opaleye-0.7 onwards. Opaleye went through a lot of breaking changes in 0.6.7 and then again in 0.7. If you're going through this tutorial, make sure you're using the correct version of Opaleye, else none of the examples will compile.

Setting up the Postgres schema

Opaleye works only with Postgres. While there is an opaleye-sqlite package, it is rather unmaintained. So, the first order of business is to get our sample Postgres DB in place for this tutorial.

First, let's create the opaleye_tutorial user + database, which we'll use throughout this tutorial.

-- Run the following SQL snippet as a DB superuser

create role opaleye_tutorial with createdb login password 'opaleye_tutorial';
create database opaleye_tutorial with owner=opaleye_tutorial;

Next, connect to the newly created opaleye_tutorial DB using the opaleye_tutorial user, and create the users table, with a few rows:

-- run this script using the opaleye_tutorial DB user

create table users
( id serial primary key
, name text not null
, email text not null
, dob date
);

insert into users(name, email) values ('John', 'john@mail.com');
insert into users(name, email) values ('Bob', 'bob@mail.com', '2001-01-01');
insert into users(name, email) values ('Alice', 'alice@mail.com', '2002-02-02');

The Complete Code

Here's the complete code in a single glance. The rest of this chapter walks through this code and introduces various Opaleye concepts in a simplified manner.

{-# LANGUAGE Arrows #-}
{-# LANGUAGE FlexibleContexts #-}
module InstantGratification where

import Opaleye as O
import Database.PostgreSQL.Simple (Connection, connect, ConnectInfo(..))
import Data.Profunctor.Product (p4)
import Control.Arrow (returnA)
import Data.Time (Day)
import Control.Monad (void)

getDbConn :: IO Connection
getDbConn = connect ConnectInfo
  { connectHost = "localhost"
  , connectPort = 5432
  , connectDatabase = "opaleye_tutorial"
  , connectUser = "opaleye_tutorial"
  , connectPassword = "opaleye_tutorial"
  }

selectAllUsers :: Connection -> IO [(Int, String, String, Maybe Day)]
selectAllUsers conn = 
  O.runSelect conn $        -- execute a SELECT query
  O.selectTable userTable   -- the actual SELECT query to execute

selectByEmail :: Connection -> String -> IO [(Int, String, String, Maybe Day)]
selectByEmail conn email = runSelect conn $ proc () -> do
  row@(_, em, _, _) <- selectTable userTable -< ()
  restrict -< (em .== toFields email)
  returnA -< row

insertUser :: Connection -> (Int, String, String, Maybe Day) -> IO ()
insertUser conn row = 
  void $ runInsert_ conn ins
  where
    ins = Insert 
      { iTable = userTable  
      , iRows = [toFields row]
      , iReturning = rCount
      , iOnConflict = Nothing
      }

updateUser :: Connection -> (Int, String, String, Maybe Day) -> IO ()
updateUser conn (i, n, e, d) = 
  void $ runUpdate_ conn u
  where
    u = Update 
      { uTable = userTable  
      , uUpdateWith = (\(iDb, _, _, _) -> (iDb, toFields n, toFields e, toFields d))
      , uWhere = (\(iDb, _, _, _) -> iDb .== toFields i)
      , uReturning = rCount
      }

Now, let's walk through what this code is doing...

Haskell code preliminaries

Usually, the module definition and imports warrant no comment, but in the case of Opaleye they do! Take special note of the Arrows language extension. Without enabling Arrows, you will be unable to write SELECT queries in Opaleye's special (type-safe) DSL. Similarly, take note of the Control.Arrow module and the returnA function (among a few other) that will be used whenver you want to write SELECT queries in Opaleye.

{-# LANGUAGE Arrows #-}
{-# LANGUAGE FlexibleContexts #-}
module InstantGratification where

import Opaleye as O
import Database.PostgreSQL.Simple (Connection, connect, ConnectInfo(..))
import Data.Profunctor.Product (p4)
import Control.Arrow (returnA)
import Data.Time (Day)
import Control.Monad (void)

Connecting to the Postgres DB

There isn't anything specific to Opaleye while connecting to the Postgres DB. This is standard usage of the postgeresql-simple library.

Opaleye does not deal with managing connections to Postgres - it is left up to you. Every function in Opaleye that interacts with the Postgres database, like runSelect, runInsert, runUpdate, etc, expects a Connection (from the postgresql-simple package) as one of the arguments.

Tip

Link-off to connection management

As a result of the first point, there is no overarching custom monad in Opaleye. All functions that interact with the Postgres database are directly in standard IO monad.
As another result of the first point, in all probability you will end-up writing wrappers for the core Opaleye function, because you will need to extract a Connection from a Pool Connection to be able to call the Opaleye functions (basically implement database connection pooling on your own).

getDbConn :: IO Connection
getDbConn = connect ConnectInfo
  { connectHost = "localhost"
  , connectPort = 5432
  , connectDatabase = "opaleye_tutorial"
  , connectUser = "opaleye_tutorial"
  , connectPassword = "opaleye_tutorial"
  }

Teaching Opaleye your table structure

If you're coming from the Rails world (as I was), this will be especially surprising and frustrating for you. Unlike ActiveRecord (and a few other ORMs / database-libraries), Opaleye does not automagically figure-out your DB schema and give you a Haskell interface to query/manipulate your DB tables. Instead, you have to painstakingly list out the types + names of each column in each table of your Postgres schema, to make Opaleye aware of them.

Tip

TODO: Link-off to chapter describing autogeneration of this boilerplate

Let's do this for our users table, to give us userTable, which will then act as our type-safe interface to it (to issue SQL statements).

userTable :: Table (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable SqlDate)
                   (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable SqlDate)
userTable = Table "users" (p4 ( tableField "id"
                              , tableField "name"
                              , tableField "email"
                              , tableField "dob"))

Here are the four things this function is telling Opaleye:

First: we will be using a 4-tuple to represent the table's rows in Haskell. The same Haskell type will be used to read and write rows to the table, hence the type (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable SqlDate) is repeated twice in the userTable :: Table ... annotation, first for the the type used to write rows, and next for the type used to read rows.

Field x is the type that Opaleye uses to represent Postgres columns in Haskell-land. So integer columns become Field SqlInt4 ; varchar columns become Field SqlText ; and so on.

Similarly, FieldNullable x is the type used by Opaleye to represent nullable Postgres columns in Haskell. This is the reason the dob column corresponds to FieldNullable SqlDate instead of Field SqlDate

Tip

Opaleye allows you to read and write rows from Haskell data-structures. It is possible to have separate data-structures for reading vs writing for very valid reasons. Read Basic DB mappings for more details on this.

Second: the table's name (i.e. users), names of the four columns in the table (i.e. id, name, email, dob), and how each column corresponds to the types we will be using to represent the table's row in Haskell. That is, id will correspond to the first value in the 4-tuple; name will correspond to the second value in the 4-tuple; and so on.

Tip

It is not necessary that a table's rows be represented only as tuples. You can also represent them as Haskell records, and this is discussed in detail in Advanced DB Mappings

Third: Whether each column is nullable, or NOT NULL, and consequently, whether it can be omitted during write operations, or not. In this case, the first three columns may NOT be omitted during INSERT operations (which usually, not always, is the result of having NOT NULL constraints on columns), whereas the fourth column may be omitted. In this particular code snippet, this information is being inferred by Opaleye on the basis of the types corresponding to each DB field, i.e. Field x vs FieldNullable x. But there are advanced scenarios possible discussed in Advanced DB Mappings

Tip

To learn more about mapping different types of DB schemas to Opaleye's Table types, please read Basic DB mappings and Advanced DB Mappings chapters.

Fourth: there is a strange p4 function that is telling something very important to Opaleye. It is related to how Opaleye is able to map DB rows to tuples, records, or any other data-structure for that matter. Unfortunately, getting into these details is too much for an introductory chapter, so we'll skim this for now. You can read more about this in Advanced DB Mappings. For now it will suffice to know about the existence of p1 through p62 which can be used depending upon the number of columns in your table.

Selecting all rows

Now that we have taught Opaleye about the users table, we will write a function that can read all rows from this table and return them as a list of Haskell 4-tuples.

-- If you have loaded this tutorial in your GHCi session, you can 
-- run this function via:
-- 
-- getDbConn >>= selectAllUsers
-- 
selectAllUsers :: Connection -> IO [(Int, String, String, Maybe Day)]
selectAllUsers conn = 
  O.runSelect conn $        -- execute a SELECT query
  O.selectTable userTable   -- the actual SELECT query to execute

This uses runSelect, which is used to execute a SELECT query in Opaleye. The actual select query being executed is selectTable userTable. This is the simplest query possible, which essentially corresponds to a SELECT * FROM users in SQL. For a slightly more complicated SQL query take a look at Searching by email (and introducing the Opaleye DSL).

Note about type signatures

Please take special note of the type signature of this function. It returns the rows as IO [(Int, String, String, Maybe Day)], whereas we clearly told Opaleye that we will be reading rows of type (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable SqlDate). Shouldn't this function return tuples of this type instead?

For a detailed explanation of what's going on, please read Basic DB mappings, but here's a super-simplified explanation: there is no advantage in returning the final values as Field SqlInt4, or Field SqlText, because your code will not be able to process them using any standard Haskell library. So, Opaleye knows how to convert most data types from DB => Haskell (eg. Field SqlInt4 => Int, Field SqlText => Text). And vice versa.

Danger

following para required?

However, here's a gotcha! Try compiling ths function without the type signature. The compiler will fail to infer the types. This is also due to the underlying infrastructure that Opaleye uses to convert DB => Haskell types. To understand this further, please read Advanced DB Mappings.

Searching by email (and introducing the Opaleye DSL)

Let's make our SELECT query a little more interesting by using a bit of the Arrow-based DSL that Opaleye provides. The DSL is what (a) provides a type-safe version of SQL, (b) makes it hard (if not impossible) to write invalid SQL, and (c) also makes SQL easy to refactor/maintain (especially as your table structure changes with time).

Tip

The DSL (and the way it translates to actual SQL) is the defining feature of Opaleye. You can read more about it at Opaleye Query Tutorial

selectByEmail :: Connection -> String -> IO [(Int, String, String, Maybe Day)]
selectByEmail conn email = O.runSelect conn $ proc () -> do
  row@(_, em, _, _) <- O.selectTable userTable -< ()
  restrict -< (em .== O.toFields email)
  returnA -< row

Now the fun begins! The proc keyword, and its related operators, such as -<, work only when you enable the Arrows language extension. Please note, proc is a keyword, not a function, which means you will not find it any package, or Hackage docs, just like you won't find case or if-then-else in any package.

Everything after proc () -> do is, loosely speaking. not regular Haskell code. It is Haskell for sure, since GHC type-checks and compiles it, but conceptually it's a representation of SQL in Haskell-land. Keep this in mind for future - in case you struggle to get some Haskell code to type-check in a proc block, especially something in the IO monad. You are essentially constructing an SQL query-string using extremely type-safe and special-purpose operators/functions.

Here's how that DSL corresponds to the underlying SQL query:

SELECT id, name, email, dob              -- returnA -< row
FROM users                               -- selectTable userTable
WHERE email = "whatever@example.com"     -- restrict -< (em .== toFields email)

The selectTable userTable expression makes DB rows available to us using the same 4-tuple type that we ourselves configured in Teaching Opaleye your table structure. This means that row is of type (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable Day).

We pattern-match on these 4-tuples and capture the entire tuple in row and the tuple's second value in em, and use this for further processing.

Using the restrict function, we select only those rows which match the condition - em .== toFields email. This raises two questions: (a) What is the .== operator, and (b) What is the toFields function, both of which are explain in their respective sub-sections.

Finally returnA does some magic to return the entire row back to Haskell-land. Notice, that we don't have to do a DB => Haskell conversion here, because runSelect does that conversion automagically.

What is the `.==` operator

The restrict expression gets directly converted to the WHERE ... clause in SQL. Therefore, it needs to be an expression that takes a Postgres row-type and returns a Postgres boolean (as opposed to a regular Haskell boolean). This is why we cannot use the standard Haskell equality operator ==, since it represents equality in Haskell-land and evaluates to a Bool. Instead we use .==, which represents equality in Postgres-land and evaluates to a Field SqlBool. You will come across a lot of such special operators that make sense only when used on Postgres values and will be finally evaluated/executed by Postgres, instead of Haskell.

So, if we add explicit type-annotations for the purpose of explaining what's going on, here's what .== is comparing...

(em :: Field SqlText) .== ((toFields (email :: String)) :: Field SqlText)

...which brings us to the String -> Field SqlText conversion being done by toFields.

What is the `toFields` function

So, what does toFields em do? It converts a value from Haskell-land to its suitable SQL representation. In case of strings that would mean single-quoting the string (as opposed to double-quoting in Haskell), and dealing with single quotes within the string itself. In case of dates and timestamps, it would mean converting the Haskell value to a string representation that Postgres understands. Similarly, for lists it would convert to a string representation that Postgres understands. A few examples:

Haskell value	Haskell type	Postgres value	Postgres type	Notes
`"Hello world"`	`String`	`'Hello world'`	`Field SqlText`	Notice the difference in single vs double quotes
`"Where's Waldo?"`	`String`	`'Where''s Waldo?'`	`Field SqlText`	Notice the sanitisation of the single-quote within the string
`fromGregorian 2020 1 1`	`Day`	`'2020-01-01'`	`Field SqlDate`	Date format natively understood by Postgres
`[1, 2, 3]`	`[Int]`	`'{1,2,3}'`	`Field (SqlArray Int)`	Array format natively understood by Postgres

Danger

Remember, toField converts from Day -> Field SqlDate, not Day -> String (as an example). This means you will not have access to the raw SQL string that a Haskell value is converted to. This is to ensure you don't interchange native-Haskell and native-SQL representations of values.

Inserting a row

insertUser :: Connection -> (Int, String, String, Maybe Day) -> IO ()
insertUser conn row = 
  void $ O.runInsert_ conn ins
  where
    ins = O.Insert 
      { iTable = userTable  
      , iRows = [O.toFields row]
      , iReturning = O.rCount
      , iOnConflict = Nothing
      }

Opaleye's API tries to mimic the underlying SQL as much as possible (while providing a strong cover of type-safety over it):

This is why the runInsert_ function always accepts a list of rows to insert into a table (and not a single row) via the iRows argument. (This is how INSERT in Postgres actually works).
Further, depending upon how you configure the iReturning argument it either returns the count of rows inserted (rCount), or an SQL expression evaluated over the newly inserted rows (rReturning). Again, if you use rRetuning it will always return a list of rows, because that's how Postgres returns them.

Our INSERT statement is configured to return COUNT (*) over the inserted rows, (i.e. the number of rows inserted), but we aren't really interested in it. This the reason we've stuck a void before runInsert_.

Another note about the `toFields` function

Take note of iRows = [toFields row] in the code above.

In What is the toFields function we saw toFields converting a String -> Field SqlText. However, here we are using toFields to convert the entire row from Haskell => DB. That is, in this case it is doing the following conversion:

(Int, String, String, Maybe Day) -> (Field SqlInt, Field SqlText, Field SqlText, FieldNullable SqlDate)

Couple this with the fact that, in Teaching Opaleye your table structure we said that Opaleye can work with any arbitrary type to represent rows. You might wonder how toFields knows to "navigate" the structure of any arbitrary Haskell type? This is where the stange p1 through p62 (and related) functions come in. We conventiently glossed over them in Teaching Opaleye your table structure, but you can find more details at Advanced DB Mappings.

Updating a row

updateUser :: Connection -> (Int, String, String, Maybe Day) -> IO ()
updateUser conn (i, n, e, d) = 
  void $ O.runUpdate_ conn u
  where
    u = O.Update 
      { uTable = userTable  
      , uUpdateWith = (\(iDb, _, _, _) -> (iDb, O.toFields n, O.toFields e, O.toFields d))
      , uWhere = (\(iDb, _, _, _) -> iDb .== O.toFields i)
      , uReturning = O.rCount
      }

Again, Opaleye tries to correspond as closely to the underlying SQL as possible:

UPDATE users    
SET ...        -- corresponds to uUpdateWith
WHERE ...      -- corresponds to uWhere
RETURNING ...  -- corresponds to uReturning

uUpdateWith expects a function that works with a Postgres row-type, not a Haskell row-type. This is because it will be directly converted to the SET ... clause (of the UPDATE query). Hence, it will be dealing with native Postgres values, not Haskell values. Further, the function needs to convert values from the read-type to the write-type. Fortunately, since we've setup the row-types for reading and writing to be the same (ref: Teaching Opaleye your table structure), this doesn't pose a problem for us. The the type of our uUpdateWith function needs to be:

uUpdateWith :: (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable Date) -- row-type for reads
            -> (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable Date) -- row-type for writes

Can you now guess why we used toFields to convert n, e, and d in the code snippet above, but left iDb as-is?

Danger

TODO - collapsible answer

Similarly, the uWhere function gets directly converted to the WHERE ... clause. It needs to be a function that takes a Postgres row-type and returns a Postgres boolean (as opposed to a regular Haskell boolean). This is why we use .== as opposed to == (explained in What is the .== operator)

uWhere :: (Field SqlInt4, Field SqlText, Field SqlText, FieldNullable Date)
       -> Field SqlBool

Tip

As you can see from the updateUser function, updating rows in Opaleye is not very pretty! The biggest pain is that you cannot specify/update only a few columns in a row -- you are forced to provide the new value of every single column in a row, even if it doesn't need to change. An example is the iDb column (the table's primary-key) in our example above. We do not need to change it, but the uUpdateWith function still needs to put in the 4-tuple it returns. More about this in Updating rows.

Instant Gratification

Learning Goal

Setting up the Postgres schema

The Complete Code

Haskell code preliminaries

Connecting to the Postgres DB

Teaching Opaleye your table structure

Selecting all rows

Note about type signatures

Searching by email (and introducing the Opaleye DSL)

What is the .== operator

What is the toFields function

Inserting a row

Another note about the toFields function

Updating a row

What is the `.==` operator

What is the `toFields` function

Another note about the `toFields` function