Haskell Lists: The Ultimate Guide
Work In Progress
Danger
This guide is "Work in progress"
Haskell Lists: Two big Caveats
There are two major differences in Haskell lists, compared to other languages, especially dynamically typed languages, like Python, Ruby, PHP, and Javascript.
- First, lists in Haskell are homogenous. This means that a Haskell list can only hold elements of the same type
- Second, lists in Haskell are (internally) implemented as linked lists. This is different from many other languages, where the word "list" and "array" is used interchangably. Linked lists and arrays have very different performance characterstics when operating on large amounts of data. Keep this in mind for the future. If you're dealing with very large lists, where you want to randomly access the
i
-th element multiple times, use aVector
instead.
Constructing lists in Haskell
There are five different ways to construct lists in Haskell:
Square-bracket syntax: This is the simplest and most recognisable way.
-- A list of numbers let a = [1, 5, 7, 12, 56] -- A list of booleans let b = [True, False, False, True]
Colon operator: This is very similar to the
cons
function from Lisp-like languages. It adds a single element to the beginning of a list (and returns a new list).-- A list containing a single element -- This is the same as saying `[99]` let a = 99:[]
-- Keep adding single elements to the beginning of the list -- to progressively get a larger list let a = 1:5:6:12:[] -- A list of booleans let b = True:False:False:True:[]
Using ranges: This is short-hand for defining a list where the elements TODO
List comprehension: If you are starting out with Haskell, I would strongly recommend against using list comprehensions to construct lists. They seem like cool feature, but I find them very opaque and unmaintable. Similar to complex regular expressions - write once, read never! Nevertheless, there is a section dedicated to list comprehensions in Haskell for the sake of completeness.
Monoid interface: The most "complicated", but often used way of defining a list is via its Monoid interface. There is a section dedicated to the Monoid interface of lists if you'd like to know more.
mempty :: [Int] == []
Pattern-matching of Haskell lists
There are two ways to pattern-match over a list in Haskell, and there's a subtle difference between them.
- Using the
:
constructor:
-- Return the first element of a list, taking care of the edge-case where -- the list may be empty. Which is why the result is a (Maybe a) firstElem :: [a] -> Maybe a = case xs of firstElem xs -> Nothing [] -- Remember to put parantheses around this pattern-match else -- the compiler will throw a parse-error :_) -> Just x (x
-- Return the second element of a list secondElem :: [a] -> Maybe a = case xs of secondElem xs -- Remember the parantheses :y:_) -> Just y (_:[]) -> Nothing (_-> Nothing []
- Using the
[]
constructor:
-- Return the first element of a list firstElem :: [a] -> Maybe a = case xs of firstElem xs -- Remember to put parantheses around this pattern-match else -- the compiler will throw a parse-error :_) -> Just x (x-> Nothing []
Subtle difference between :
and []
when pattern-matching
With :
you can pattern-match a list with any number of elements. This is because the last :
matches the remainder of the list. Whereas, with []
, you can only pattern match a list with an exact number of elements.
For example, to pattern-match a list into (a) first element, (b) second element, and (c) everything else, you can use the :
operator as demonstrated below...
-- in the following expression: -- x = 1 -- y = 5 -- z = [7, 12, 45] let (x:y:z) = [1, 5, 7, 12, 45]
... however, there is no way to write a similar expression using []
. If you try, you'll get an error:
-- the following will always throw an error... let [x, y, z] = [1, 5, 7, 12, 45]
-- ...while the following will work let [x, y, z] = [1, 5, 7]
If you need to, you can also use :
to match a list with an exact number of elements. In fact, in the secondElem example above, we've used it to match a list with exactly one element. Here's an example of how to use it to pattern-match on a list with exactly two elements:
let x:y:[] = [1, 2]
Be careful how you use this. The following will always throw an error because you are forcing the last :
to match with a []
(empty list), but instead it gets a [3]
(list with single element 3
). If you want this to work, you'll have to go back to the first example in this section.
-- the following will always throw an error... let x:y:[] = [1, 2, 3]
Here's a complex example using both kinds of pattern matching. This converts a given list into a English phrase, such as "x, y, and z". For the four special cases (where the length has three, or fewer, elements) we use []
, whereas for the most general case, we use :
-- Complex example using multiple list-related functions let x = [1, 5, 20, 77, 45, 67] in case x of -> "(none)" [] -> show a [a] -> show a ++ " and " ++ show b [a, b] -> show a ++ ", " ++ show b ++ ", and " ++ show c [a, b, c] :b:c) -> show a ++ ", " ++ show b ++ ", and (" <> (show $ length c) <> ") more" (a
Iterating over a Haskell list
If you're starting out, you'd be surprised to know that there is no way to "iterate" over a list in Haskell, in a way that you might already be familiar with. To be specific, there's no way to do the following in Haskell:
// Familiar for-loops are NOT possible in Haskell!
var lst = [1, 5, 7, 2, 8, 10];
var sum = 0;
for(i = 0; i < lst.length; i++) {
= sum + lst[i];
sum }
If your thought-process requires you to iterate over a list, step back and think about why you need to it. Merely iterating over a list is not interesting; what you do in each iteration is the interesting part. And the Data.List
module has a rich set of functions which help you visit and do something with each element in a list, without having to write a for(i=0; i<l; i++)
boilerplate each time.
Keep this in mind when you're reading about the various operations you can do with lists. Haskell almost forces you to express your solution using a higher-level API, instead of dropping down to a for
-loop every time. Get familiar with the Data.List
API - you will be using it a lot when writing real-world Haskell code.
Tip
The closest that you can get to a for
-loop in Haskell, is the foldl
(or foldr
) function. Almost every other function in Data.List
can be written using this function. Or, you always have the option of implementing any iteration as a recursion - that's really the "lowest level" of getting this done - but it is not the idiomatic way of doing simple data transformations in Haskell.
Appending / Joining / Growing Haskell lists
There are four ways to join / concatentate / append / grow Haskell lists:
(++)
::
list1 -> list2 -> joined-list
When you have a few known lists that you want to join, you can use the ++
operator:
True, False] ++ [False, False] [
1, 2, 3] ++ [4, 5, 6] ++ [10, 20, 30, 50, 90] [
You can also use the ++
operator in it "prefixed function" form. This is useful short-cut when you want to pass it to another function, such as a foldl
, and don't want to write the verbose (\x y -> x ++ y)
-- you need to put parantheses around the operator otherwise Haskell
-- will throw a parse-error
++) [1, 2, 3] [4, 5, 6] (
concat
::
list-of-lists -> joined-list
While ++
is useful to join a fixed/known number of lists, sometimes you're dealing with an unknown/varying number of lists. In all probability you will represent them as a "list of lists". To join them together, use the concat
function:
concat [[1, 2, 3, 4, 5], [10, 20, 30, 40], [100, 200, 300]]
(:)
::
element -> list -> consed-list
The :
operator is also known as a the cons
operation, is actually a constructor of the []
type (it's a subtle fact that you don't need to bother with for most use-cases). Use it when you want to add a single element to the beginning of a list
1:[3,4,5]
Be careful, that the single element comes first, and the list comes next.
let x = [10, 20, 30]
= 99
y in y:x
You can also cons on top of an empty list. The example given below is the same as saying [999]
999:[]
intercalate
::
delimeter -> list -> joined-list
This function is typically used with a list of String
s where you want to join them together with a comma, or some other delimiter. Remember that a String is a type-synonym for [Char], so when intercalate
is used with strings the type-signature specializes to: [Char] -> [[Char]] -> [Char]
, which is the same thing as String -> [String] -> String
", " ["Haskell", "Tutorials", "Are", "Awesome"] intercalate
Tip
Do not confuse intercalate
with the similarly named intersperse
. The latter does not join lists.
Determining the length of a Haskell list
TODO
Finding a single element in a Haskell list
There are four commonly used ways to find a single element in a list, which vary slightly.
find
::
condition -> list -> Maybe element
The most general function for finding an element in a list that matches a given condition. Two things to note about this function:
- It returns a
Maybe a
, because it is possible that no element matches the given condition. - It return the first matching element
-- Find the first element greater than 10
-> x > 10) [5, 8, 7, 12, 11, 10, 99] find (\x
The following example is the same as the previous one, just written in a point free syntax.
-- Find the first element greater than 10
> 10) [5, 8, 7, 12, 11, 10, 99] find (
A slightly more complex example where we do something on the basis of whether an element exists in a list, or not (remember, the result is not a Bool
, but a Maybe a
):
-- Find the first user that has an incorrect age (you can possibly
-- use this to build some sort of validation in an API)
let users = [("Saurabh", 35), ("John", 45), ("Doe", -5)]
in case (find (\(_, age) -> age < 1 || age > 100) users) of
Nothing -> Right users
Just (name, age) -> Left $ name <> " seems to have an incorrect age: " <> show age
elem
::
element -> list -> Bool
Use elem
if you want to check whether a given element exists within a list. Two important differences with find
:
- The first argument to
elem
is the actual element you're searching for, whereas forfind
it is a condition to be applied to each element.elem
implicitly uses the==
operator to check for equality when searching for the given element, which is why there is anEq a
type-class constraint in its type-signature. - In the case of
elem
the result is aBool
, since you've already specified the exact element you're searching for. Whereas in the case offind
the result is aMaybe a
because you don't know beforehand which element will match the given condition.
elem 5 [1, 2, 5, 10]
Usually, elem
is used in its infix form, because it is easier to verbalize mentally.
5 `elem` [1, 2, 5, 10]
"Saurabh", 35) `elem` [("Saurabh", 35), ("John", 45), ("Doe", -5)] (
notElem
::
element -> list -> Bool
notElem
is the negation of elem
any
::
condition -> list -> Bool
any
lies in the "middle" of find
and elem
. It allows you to specify your own condition (like find
), but simply returns a True
/False
(like elem
) depending upon whether a match was found, or not.
let users = [("Saurabh", 35), ("John", 45), ("Doe", -5)]
in if (any (\(_, age) -> age < 1 || age > 100) users)
then Left "Some user has an incorrect age. Please fix the input data"
else Right users
Filtering / Rejecting / Selecting multiple elements from a Haskell list
There are three general ways to filter / reject / select multiple elements from a Haskell list:
- You want to go through the entire list and decide whether the element should be present in the resultant list, or not. You'd want to use filter for this.
- You want to extract the first (or last)
N
elements of a list (andN
is independent of the contents of the list). You'd want to usetake
ortakeEnd
in this case. - You want to stop selecting elements (basically terminate the iteration) as soon as a condition is met. For example:
- Select all elements from the beginning of a list of
Char
till you reach a delimiter, say,
. This could be the core of a home-grown CSV parser (although you shouldn't have to write it -- there's a library for that!) - You'd use one of
takeWhile
,dropWhile
, ordropWhileEnd
in this case.
- Select all elements from the beginning of a list of
Tip
Why is there no takeWhileEnd
?
filter
::
condition -> list -> filtered-list
The filter
function selects all elements from a list which satisfy a given condition (predicate).
Tip
This function is unfortunately named, because filter could mean either the act of selecting, or the act of removing elements based on a condition. I still get confused about which it is!
-- select all even elements from a list
filter (\x -> x `mod` 2 == 0) [1..20]
-- A more complex example that uses `filter` as well as `null`
let users = [("Saurabh", 35), ("John", 45), ("Trump", 105), ("Biden", 88), ("Doe", -5)]
= filter (\(_, age) -> age < 1 || age > 100) users
incorrectAge in if (null incorrectAge)
then Right users
else Left $ "Multiple users seem to have an incorrect age: " <> show incorrectAge
take
::
number-of-elements-to-take -> list -> shorter-list
take
is used to take the first N
elements from the beginning of a list. If N
is greater than the list's length, this function will NOT throw an error. It will simply return the entire list.
-- Simple example
let x = [1, 5, 20, 77, 45, 67]
in take 3 x
-- `N` is greater than the list length
let x = [1, 5, 20, 77, 45, 67]
in take 10 x
Tip
If you'd like to look at just the first element of the list, use one of the following methods instead:
- Pattern match on the first element
null
- if you just want to check if the list is non-empty.uncons
- if you want to split the list into it's "head" (first element) and "tail" (remaining elements)head
- Caution: Use this if, and only if, you are 100% sure that the list is non-empty. This function is infamously unsafe, and will throw an error on empty lists.
drop
::
number-of-elements-to-drop -> list -> shorter-list
drop
removes the first N
elements from a given list. If N
is greater that the list's length, an empty list will be returned.
takeWhile
::
condition -> list -> shorter-list
Keep taking (selecting) elements from the beginning of a list as long as the given condition holds true.
Here's how you can keep selecting Char
s till you encounter a ,
:
-- keep selecting elements from a [Char] till we encounter a comma
takeWhile (\x -> x /= ',') ['H', 'e', 'l', 'l', 'o', ',', 'W', 'o', 'r', 'l', 'd']
Same example, but using the familar syntax of writing a String
, which is a type-synonm for [Char]
-- keep selecting elements from a [Char] till we encounter a comma
takeWhile (\x -> x /= ',') "Hello,World"
Same example, but in point-free syntax:
-- keep selecting elements from a [Char] till we encounter a comma
takeWhile (/= ',') "Hello,World"
dropWhile
::
condition -> list -> shorter-list
dropWhile
is similar to takeWhile
, but instead of selecting elements based on the given condition, it removes them from the beginning of the list instead.
dropWhileEnd
::
condition -> list -> shorter-list
dropWhileEnd
is similar to dropWhile
, but instead of removing elements from the beginning of the list, it removes them from the end instead.