User-defined types, polymorphism, and pattern-matching

Tip

Learning goal

In this section you will be introduced to one of the many super-powers of the Haskell type-system: [A]lgebraic [D]ata [T]ypes. You will also learn about how to build you own complex data types using the building blocks of built-in types, ADTs, records & tuples.

This super-power, called ADT, can be practically put to use only with type-polymorphism and pattern-matching. So, we’ll cover those concepts, as well.

Reading material

Exercises

Warning

Before jumping into the exercises

You will almost immediately hit the problem mentioned in Displaying values of your custom data-types. Read that section before you start-off with the exercises.

  • Go back and re-read some type-signatures that you were struggling with earlier (say, some API docs) . Do they make more sense now? For example, without reading the accompanying documentation, can you take an educated guess about what the following functions do?

    someFunc1 :: [Maybe a] -> [a]
    
    someFunc2 :: [Either a b] -> [b]
    
    someFunc3 :: [Either a b] -> ([a], [b])
    

    Stumped? Go to your LTS listing page and search for these type signatures, eg. search for [Maybe a] -> [a] and now read the API docs. Can you now appreciate how types are very self-documenting? Not yet? Don’t worry. Keep ploughing through, and these type-signatures will gradually starting “speaking” to you, as well.

    Note

    Btw, this is one of the reasons why a lot of Haskell libraries don’t have very explanatory API docs. Advanced Haskellers can understand a lot from the type-signatures themselves. And most libraries are written by advanced Haskellers. So, there.

  • If you have worked with Java or C#, try implementing Maybe and Either data-types using Generics in those languages. Also, if you understand Generics from Java/C# ecosystems, you’ll find it easier to grasp type-level polymorphism in Haskell.

  • Take some time to understand the difference and relationship between if-then-else, pattern-matching, and function-guards. Revisit your solutions to the previous chapter’s Before you start the exercises and rewrite each solution using all three - (a) if-then-else, (b) pattern-matching, and (c) function-guards. Are they completely interchangeable? Did you get stuck while expressing an operation/logic with any of the three techniques?

  • http://exercism.io/exercises/haskell/run-length-encoding/readme - Implement a solution without any custom-types. Then, implement a solution which uses the following custom-types at its core. Which one did you find easier to implement?

    data RLEUnit = Repetition Char Int | NoRepetition Char
    type RLE = [RLEUnit]
    
    computeRLE = String -> RLE
    computeRLE = _todo
    
    showCompressed :: RLE -> String
    showCompressed = _todo
    
    showOriginal :: RLE -> String
    showOriginal = _todo
    
  • http://exercism.io/exercises/haskell/simple-linked-list/readme

  • Write the core data-structures & functions of a hypothetical library management app (as in, physical library with books and all). For example:

    data Book = { _defineTheRecordStructure }
    data Member = { _defineTheRecordStructure }
    
    
    -- This data-structure represents a list of all the books in the library,
    -- along with, who has issued them.
    --
    type IssueDate = (Int, Int, Int) -- NOTE: (year, month, date)
    type Library = [(Book, Maybe (Member, IssueDate))]
    
    
    -- NOTE: It is not necessary that all the functions take two arguments. The
    -- function signatures are just placeholders. You have to figure them out.
    --
    addBook :: ?? -> ?? -> ??
    removeBook :: ?? -> ?? -> ??
    issueBook :: ?? -> ?? -> ??
    returnBook :: ?? -> ?? -> ??
    
    -- Return a list of books that have been issued for more than 7 days
    overdueBooks :: ?? -> ?? -> ??
    

    You should be able to interact with you library-management app using GHCi itself, by calling the functions defined above, passing in the required arguments, and letting GHCi print the resulting data-structures to your terminal.

    • Can you think of at least two more ways of defining the core data-structures? Think about the pros & cons of each.
    • What happens to your library after you call addBook? How do you call addBook to add 5 new books, and then issueBook for one of the newly added books?
    • How are you handling the edge-case when issueBook is called on a book that is already issued? If you haven’t handled that, modify your code to handle that now? What does this do to the function signature of issueBook?
    • Were you able to write the overdueBooks function? How did you handle the case where adding 7 days to the issueDate was causing the date to overflow into the next month/year? And, how did you get the current date of the system? (Don’t beat yourself up if you weren’t able to write this function. We’ll revisit this part when we come to “Monads”)

Todo

add more exercises

Stuff that you might struggle with

Records

Trying to create multiple records where fields share the same name will result in a compile error. Example:

data User = User
  { name :: String
  , email :: String
  , login :: String
  }

data Company = Company
  { name :: String      -- Same field-name as User
  , email :: String     -- Same field name as User
  , address :: String
  }

-- ERROR

-- /private/tmp/first-project/app/Main.hs:10:5: error:
--   Multiple declarations of ‘name’
--  Declared at: /private/tmp/first-project/app/Main.hs:4:5
--                /private/tmp/first-project/app/Main.hs:10:5

-- /private/tmp/first-project/app/Main.hs:11:5: error:
--  Multiple declarations of ‘email’
--  Declared at: /private/tmp/first-project/app/Main.hs:5:5
--                /private/tmp/first-project/app/Main.hs:11:5

Note

Yes, this was a huge WTF-moment for me when I first encountered this, as well. All the great features of Haskell’s type-system are neutralized by one of the worst record systems of any typed language.

There are a few attempts, like DuplicateRecordFields, OverloadedLabels, and generic-lens, that attempt to rectify this problem, but they aren’t proven in a real-world setting, yet. I’d recommend steering clear of them, for now, and solving this problem using an easy and battle-tested way (given below).

Two ways to create records with the same field names:

  • recommended Prefix record names, eg.

    data User = User { userName :: String, userEmail :: String }
    data Company = Company { companName :: String, companyEmail :: String }
    
  • Or, define these data types in separate modules, and qualify their imports. eg.

    -- FILE: User.hs
    
    module User where
    
    data User = User { name :: String, email :: String }
    
    -- FILE: Company.hs
    
    module Company where
    
    data Company = Company { name :: String, email :: String }
    
    -- FILE: Main.hs
    
    module Main where
    import qualified User as U
    import qualified Company as C
    
    u :: U.User
    u = U.User { U.name = "Saurabh", U.email = "saurabh@vacationlabs.com" }
    
    c :: C.Company
    c = C.Company { C.name = "Vacation Labs", C.email = "support@vacationlabs.com" }
    

Type-constructors vs Values

You might start thinking that Maybe, Just and Nothing in the following type definition are all types. Dispel that notion right-away, or you’ll be left scratching your head as you progress with more advanced topics.

data Maybe a = Nothing | Just a

In the type definition given above, Maybe a is the type, whereas Nothing and Just a are the values of that possible type. Re-read the last sentence, and then read the table below to appreciate the difference:

Type Possible values
Int 0, 1, 2, 3, etc.
Float 0.0, 1.2, 4.5, 8.92, etc.
Bool True, False (nothing else – a Bool can have only one of two values)
Maybe a Nothing or Just a
Maybe Int Nothing, Just 1, Just 2, Just 100 and so on…
Maybe String Nothing, Just "hello", Just "world" and so on

Questions:

  • How many values can an expression of type Maybe Bool have? Can you write a case expression to pattern-match all the values of that type?

  • Look-up the how Either a b is defined. What is the type and what are the values? How may values can an Either Bool Bool have? Can you pattern-match all of them?

  • Can you write a function with the following type signature? Why / why not?

    someFunc :: Nothing -> Int
    
    someFunc2 :: Just Int -> Int
    
  • What is the type of Nothing? Try checking the type of Nothing in GHCi. Do you spot something odd? Stash this away in your head for a while. We’ll come across this oddity and will have to deal with it head-on as we progress to more advanced Haskell.

Tip

If you have some experience with TypeScript, please understand that this is very different from the following in TypeScript:

type Shape = Square | Rectangle | Circle;

In TypeScript, this is called a Discriminated Union and in that language it is indeed possible to express the following idea: “this variable can either be of the type Rectangle or Shape or Circle”. However, it not possible to express this in Haskell easily [1] In Haskell, a similar syntax would result in a single type, called Shape with three values, called Rectangle, Circle and Square.

Todo

Is there any existing article that explains this difference in a newbie-friendly manner?

Displaying values of your custom data-types

Right now, you might be wondering why the the following works…

GHCi> let x = 10
GHCi> x
10

…but the following doesn’t:

GHCi> data User = User { userName :: String, userEmail :: String }
GHCi> let u = User { userName = "Saurabh", userEmail = "saurabh@vacationlabs.com" }
GHCi> u

<interactive>:11:1: error:
  • No instance for (Show User) arising from a use of ‘print’
  • In a stmt of an interactive GHCi command: print it

Don’t worry. Once you read about type-classes you’ll understand why. Right now, for every custom record-type that you define, just put a deriving (Eq, Show) at the end, as shown below. This will ensure that GHCi is able to print/show values of your custom types AND that you can use the == operator to compare values of your custom types.

data User = User { userName :: String, userEmail :: String } deriving (Eq, Show)

Polymorphism

If you’re familiar with Java, you probably know the term “polymorphism” in context with how it’s done in Java. This is very different from what polymorphism means in Haskell. So, there is a bit of unlearning that you’ll have to do here. For example, the following in Java, is almost impossible to translate to Haskell as-is.

public int add(int x, int y);
public int add(int x, int y, int z);
public int add(double x, double y)
// and so on..

If you are familiar with Ruby (or Javascript), you have probably written code that looks like this. Again, this is almost impossible to do in Haskell (and for very good reasons, btw). If you ever find yourself trying to express the following idea while writing Haskell - “I need a value of type Int or type String” then take a step back and read the material on type-signatures, ADTs and functions again.

def add(x, y)
  if x.is_a?(Numeric) && y.is_a?(Numeric)
    return (x + y)
  elsif x.is_a?(String) && y.is_a?(String)
    return (x.upcase + y.upcase)
  else
    throw "Both values should either be numbers or string"
  end
end
[1]If you are really determined, I am sure you will come across some type-level hackery that allows you to do this in Haskell as well. However, it almost certainly a bad idea - don’t waste your time with it.