Lambdas and higher-order functions ================================== Learning goal ------------- By the end of this chapter you should understand what lambdas (also called anonymous functions) are, what are they used for, and how they are the basic building block of writing real-world Haskell code. We'll also revisit all the exercises of the previous chapter and solve them using higher-order functions like ``map``, ``filter``, etc and simplify our life! You have probably used lambdas before ------------------------------------- If you have made any AJAX calls using jQuery, you have already used lambdas: .. code-block:: javascript $.getJSON("http://something.com/file.json", function(data) { console.log("received JSON data", data); }); If you have written a ``click`` handler in jQuery, you have already used lambdas: .. code-block:: javascript $('#myButton').click(function() { alert("You clicked the button!"); }); If you have ever printed each line of a CSV file in Ruby, you have already used lambdas: .. code-block:: ruby CSV.foreach("file.csv") do |fields| puts fields.inspect end If you have ever filtered a list in Python, you have already used lambdas: .. code-block:: python >>> print filter(lambda x: x % 3 == 0, [2, 18, 9, 22, 17, 24, 8, 12, 27]) [18, 9, 24, 12, 27] Now that we have *almost* established that you have already used lambdas before let's start by building intuition of *why* they are needed in the first place. Why are lambdas needed ---------------------- Let's take a step back. Why are *functions* needed? You probably know intuitively why functions are needed and when you would write your own function, but let's articulate it nevertheless. We will build upon it to demonstrate why *lambdas* are needed. Functions are used to define common piece of logic/computation/steps that need to be used at multiple places. Is it the *exact same* logic/computation/steps? No. That would be very limiting. The logic/computation is *parameterized* somehow, so that every time it is needed, the behaviour can be slightly altered. This is achieved by passing different *arguments* to various function parameters. For example the ``sqrt`` function, depicted with the awesome ASCII art below, is fairly simple. It does one thing and does it well. It give us one parameter with to control its output - the number we want to find the square-root of. .. code:: +------+ 4 (input) --> | sqrt | --> 2 (output) +------+ +------+ 16 (input) --> | sqrt | --> 4 (output) +------+ Similarly, an exponentiation functions gives us *two* parameters - the base and the power. Now, let's take this up a notch. What if we were to look at another very common task in real-world apps: open a file => read data *line by line* => do something with each line => close the file. If you have written code in Java, there is a very high chance that you have written the following code over and over and over again: .. code-block:: java BufferedReader br = null; try { // open the file... br = new BufferedReader(new FileReader("somefile.txt")); // read each line till you reach EOF (end-of-file) String line = br.readLine(); while (line != null) { // ... finally do something with each line } } finally { // Let's close the file... // if (br!=null) { br.close() } } Processing a file *line by line* is a series of steps that need to be peformed at multiple places in your app. Yet we find ourselves writing similarly "shaped" code over and over again all over our app. The only difference would be the action that is being done with each line once it is read from the file. Can we use regular functions, like ``sqrt``, to reduce this code duplication? What parameters will we pass our hypothetical ``processFileLineByLine`` function? One parameter will obviously be for the filename that we want to process, but what about *what we want to do* with each line after it is read? To elaborate: 1. open file 2. read a line 3. do something with the line **<== This is what we want to change in different places in our app** 4. Go back to step #2 if we haven't reached the end of the file 5. Close the file At one place we might need to print each line to the terminal. In another place we might want to print only those lines which contain the term ``ERROR`` (say we are filtering a log file). In yet another place, we might want to write each line to a separate file masking out any email addresses to ``something@redacted.com`` (say, for data anonymization purposes). What we need is the ability to pass this hypothetical ``processFileLineByLine`` function a *step* (not a *value*, but a *step*). A *step* which contains the logic/computation of what to do with each line after it has been read. This is exactly what lambdas (and higher-order functions) allow us to do. They allow us to write functions, where we can pass-in a *step*, a piece of *logic*, or a *computation* that changes the behaviour of the original function. And, how can we write a *step*, a *computation*, or a piece of *logic* in any language? Why, functions, of course! So, we end-up with functions that can accept other functions as arguments. **Thus, they are called "higher order" functions.** Examples: CSV processing in Ruby ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Let's consider the Ruby example from earlier. This is very similar to the Java example of reading files. Here are the steps that ``CSV.foreach`` must be performing internally: - Opens ``file.csv`` - Reads each row from the file taking care of detecting the newlines (there is a difference in how Unix represents newlines and how Windows represents newlines) - For each row it separates the data into fields by splitting at the commas `,` taking care that commas may be present within a fields value itself! - Finally, it has access to all the data in the CSV neatly split into rows & fields. - Now what? What needs to be done with the fields in each row? the CSV library doesn't really know about that. That's what you, the application developer, has to write. .. code-block:: ruby CSV.foreach("file.csv") do |fields| puts row.inspect end This is where the lambda comes in: .. code-block:: ruby do |fields| puts fields.inspect end You are telling the CSV library, to run this piece of code for each row in the CSV file. The fields in each row are passed to your lambda (anonymous function) neatly in the form of an array. **You have, thankfully, been isolated from all the boilerplate of reading and parsing a CSV file** -- you are concentrating only on the core application logic of *what to do* with each row in the CSV. Example: AJAX call in Javascript ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Let's look at the example of making an AJAX call using jQuery: .. code-block:: javascript $.getJSON("http://something.com/file.json", function(data) { console.log("received JSON data", data); }); Here, the lambda is the following anonymous function: .. code-block:: javascript function(data) { console.log("received JSON data", data); } Notice, that this function doesn't have a name! It's just passed as any other argument to the ``getJSON`` function. In fact even the following is valid Javascript and will work: .. code-block:: javascript var dataProcessingFunction = function(data) { console.log("received JSON data", data); }; $.getJSON("http://something.com/file.json", dataProcessingFunction); Why is the lambda really being used here? Why can't this be written in the following manner: .. code-block:: javascript var data = $.getJSON("http://something.com/file.json"); console.log("received JSON data", data); It's written using a lambda because AJAX calls in Javascript are *asynchronous*. That is, if you write the following code the second ``console.log`` will be executed *before* the first! .. code-block:: javascript $.getJSON("http://something.com/file.json", function(data) { console.log("received JSON data", data); }); // Note the following line console.log("This will be printed before JSON data is received"); *Usually*, whenever an AJAX call is made in Javascript, the system doesn't wait for the AJAX call to complete. It makes the request in a parallel thread and continues executing the next line of code. Now, when the AJAX call completes, the ``getJSON`` function needs a way to do something with that data. That something is the lambda. The lambda is executed and the JSON data received from the URL is passed to it. This is known as an "event handler" or a "callback". Any asynchronous callback or event-handler will usually be using some form of higher-order functions / lambdas to achieve this. Example: Filtering a list in Python ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ In the examples we discussed till now, we had lambdas that represented very "real" steps. The lambda was "doing" something, eg. processing a row from a file, processing fields from a CSV, processing JSON data, etc. Now, we'll discuss an example where the lambda doesn't represent a "step" in a series of steps, but just a piece of *abstract logic* Consider the process of filtering a list based on some condition: 1. Start at the top of the list: ``i=0`` 2. Initialise an empty list: ``result=[]`` 3. Check the condition on the i-th element of the list **<== this is what we want to change** 4. If the condition holds true, add the eleemnt to the ``result`` 5. Increment the loop index: ``i=i+1`` 6. Go back to step 3 if we haven't reach the end of the list 7. Return the filtered list: ``return result;`` Now, this may seem like a very trivial thing to do and may be something that you have already committed to your muscle memory. In any real-world app there would be dozens of places where you would be copy-pasting the same procedure over, and over, and over again. Each time simply changing step #3 (the condition on which each element needs to be checked against). Some real-life examples: - User gives you a list of email addresses. You need to filter out all blank strings. - User gives you a list of email addresses. You need to filter out all *invalid* emails. - You need to filter out all punctuation from user-input before storing in DB. Let's use lambdas to simplify our life. Consider the Python example from earlier: .. code-block:: python >>> print filter(lambda x: x % 3 == 0, [2, 18, 9, 22, 17, 24, 8, 12, 27]) [18, 9, 24, 12, 27] Here, ``filter`` is a higher-order function that has "abstracted away" step #1-#7 and given you the ability to pass-in step #3. Note, that this "step" is really just a boolean condition. It is passed an element of the list and it needs to evaluate to a boolean true/false. If the result is ``true``, the element is included in the final list, else it is omitted. .. code-block:: python lambda x: x % 3 == 0 Here's how you would filter out all blank strings from a list of email addresses: .. code-block:: python >>> print filter(lambda x: x != "", ["saurabh@vacationlabs.com", "", "support@vacationlabs.com"]) ["saurabh@vacationlabs.com", "support@vacationlabs.com"] We just passed a different lambda with a different condition. That's it. Lambdas in Haskell ------------------ Now, moving on to what lambdas look like in Haskell. Here's a lambda which accepts a single ``Char`` argument and checks if it is equal to ``'a'``. Remember, lambdas are just anonymous *functions*. So, the type of the following lambda, ``Char -> Bool`` is the same as any other function that accepts a single ``Char`` argument and returns a ``Bool`` .. code-block:: Haskell \x -> x == 'a' -- GHCi> :t (\x -> x == 'a') -- (\x -> x == 'a') :: Char -> Bool Here's a lambda which accepts **two arguments** and checks if the first is a multiple of the second. Notice that the two parameters of the lambda ``x`` & ``y`` are separated by spaces, not commas - just like regular functions. .. code-block:: Haskell \x y -> (mod x y == 0) The three main differences between lambdas and regular *named* functions are: - The parameter list of lambdas start with a backslash: ``\`` - The parameter list of lambdas is separated from the function definition by a stabby arrow ``->`` instead of an ``=``, i.e. .. code-block:: Haskell \x -> (mod x 2) == 0 -- vs checkEven x = (mod x 2) == 0 -- this is not a lambda, we need to give it a name! - Lambdas, *by default,* do not have a type-signature. However, towards the end of this chapter, we will demonstrate a technique using which you *can* add type-signatures to lambdas for the purpose of clarity and debugging. Commonly used higher-order functions in Haskell ----------------------------------------------- In this section, let's revisit all the mental gymnastics that we did with tail-recursion in the previous chapter, and make our lives simpler by using higher-order functions. map ^^^ This higher-order function abstracts out the procedure of going through every element in a list and *transforming* it to a new value. That's it. You cannot change the order of elements. You cannot skip elements. You cannot add elements. The lambda that you pass, will be run on every single eleemnt of the list and the output values will be collected in a list and returned. Which means, that the length of the input list and output list will always be the same. Example use-cases: - Convert all characters in a ``String`` to uppercase (remember, a ``String`` is just ``[Char]``): .. code-block:: Haskell GHCi> import Data.Char (toUpper) GHCi> map (\x -> toUpper x) "abc" "ABC" - Double every element in an ``Int`` list: .. code-block:: Haskell GHCi> map (\i -> i * 2) [1, 2, 3] [2, 4, 6] - Convert every character in a string to an ``Int``: .. code-block:: Haskell GHCi> import Data.Char (digitToInt) GHCi> map (\c -> digitToInt c) "3345" [3, 3, 4, 5] Detour: Understanding the type-signature of "map" ************************************************* Take a look at the signature of ``map`` in GHCi and compare it with, say, ``Data.Char.toUpper`` .. code-block:: Haskell GHCi> :t map map :: (a -> b) -> [a] -> [b] GHCi> :t toUpper toUpper :: Char -> Char ``toUpper`` seems to have types that you recognize. It takes a ``Char`` and gives you back a ``Char``. On the other hand, ``map`` doesn't seem to have any recognizable type at all - just ``a`` and ``b``! What are these ``a`` and ``b``? .. note:: Legend has it, that if you can understand how to read such type signatures you don't need API docs at all! Just the type signatures will tell you how to use the function :) If you notice in the examples above, once we passed a ``Char`` list to ``map`` (conversion to uppercase AND converting all characters to integers). The other time we passed an ``Int`` list to ``map`` (doubling every number in a list). And all three times it worked! Same function, different types. That is, once we passed a lambda of type ``Char -> Char``, another time we passed an ``Int -> Int``, and yet another time we passed an ``Char -> Int``. Again, it worked without a compile error. Do you notice a pattern here? .. code-block:: haskell (Char -> Char) -> [Char] -> [Char] (Int -> Int) -> [Int] -> [Int] (Char -> Int) -> [Char] -> [Int] (a -> b) -> [a] -> [b] This is what is called a **polymorphic function** in Haskell. Any type that is represented by **lowercase** identifiers in the type-signature, like ``a``, ``b``, ``server``, ``file``, instead of ``Server`` or ``File``, will work if you pass values of *any* type to it. However, these "type-variables" will specify a relationship between the various arguments of the function. In the case of ``map`` the type-signature is telling us that the lambda (first argument) should accept elements of the same type that are passed in the second argument, because one is represented by ``a`` and the other by ``[a]``. It's also telling us that the ``map`` function will return a list containing elements of the same type as the type returned by the lambda (one is ``b`` and the other is ``[b]``). Now, try violating this relationship between the types of the arguments that you are passing to ``map``. Try calling ``map`` with arguments that **do not** follow this pattern, and stare at the big-fat compile error that it gives (note, we are using ``intToDigit``, the reverse of ``digitToInt`` in the following example): .. code-block:: haskell GHCi> import Data.Char (intToDigit) GHCi> :t intToDigit intToDigit :: Int -> Char GHCi> map (\x -> intToDigit x) "123" -- :81:26: error: -- • Couldn't match type ‘Char’ with ‘Int’ -- Expected type: [Int] -- Actual type: [Char] -- • In the second argument of ‘map’, namely ‘"123"’ -- In the expression: map (\ x -> intToDigit x) "123" -- In an equation for ‘it’: it = map (\ x -> intToDigit x) "123" You intuitively know why this wouldn't work. You want to transform a list of ``Char``, but the lambda that you have supplied is accepting an ``Int``. You can't pass a ``Char`` to a function that is expecting an ``Int``. .. note:: Even though the lambda does not have an explicit type-signature, Haskell intelligently guessed that the type of ``x`` in the lambda **must be** an ``Int`` because ``x`` is being passed to ``intToDigit``, which accepts only ``Int`` arguments. But, linking this back to the type-signature of the ``map`` function, here's why this is giving a compile error: .. code-block:: Haskell (a -> b) -> [a] -> [b] (Int -> Char) -> [Char] -> [Int] For **this particular function call**, if ``a`` is ``Char`` (because the second argument is a ``[Char]``), then it cannot simultaneously be an ``Int`` (becuase that's what the type-signature of the lambda says). **In dynamically typed languages, like Ruby, Python, or Javascript, the language would have happily accepted this erroneous code, only to blow-up once you run it.** .. important:: **A note about parentheses in type-signatures** Understand the different between the following type signatures: .. code-block:: Haskell (a -> b) -> [a] -> [b] -- vs a -> b -> [a] -> [b] The first is a function which accepts **two arguments**. The first argument is a lambda/function of the type ``a -> b`` and the second argument is a list of type ``[a]``. The second is a function which accepts **three arguments**. The first is ``a``, the second is ``b`` and the third is ``[b]`` Whenever you see parentheses in a type-signature, remember that you need to pass a function/lambda of that shape/type to that argument. filter ^^^^^^ Filter allows us to *select* elements, which satisy a given condition, from a list. It is abstracts away the repetitive procedure of going through every element in an input list, checking each element against a condition, and collecting all elements which pass that condition into a **new list.** .. note:: The ``filter`` function does not change the list that you pass it. The original list is untouched. Instead a new list is returned. In fact, this is a common theme across Haskell. Functions do NOT modify the values that you pass them. They operate on the values and return a new value. This is why there is no "pass by value" or "pass by reference" in Haskell (like it is in C / C++). Let's take a look at the type of ``filter``: .. code-block:: haskell GHCi> :t filter filter :: (a -> Bool) -> [a] -> [a] Here's what the type tells us: - Because of the parentheses in the signature, the first argument needs to be a lambda/function that accepts a value of a type ``a`` and return a ``Bool`` - The second argument must be a list containing elements of the *same type* ``a``. - Finally, the function will return a list of type ``[a]`` (the same type that we passed). Note, the *type* of the list is same, the values will be different. Let's see it in action. Filter out only even numbers from a list: .. code-block:: Haskell GHCi> filter (\x -> (mod x 2) == 0) [1, 2, 3, 4, 5, 6] [2, 4, 6] Filter out only alphabets from a given string (i.e. strip out all punctuation characters): .. code-block:: Haskell GHCi> import Data.Char (isAlpha) GHCi> filter (\x -> isAlpha x) "Hello world!" "Helloworld" Remove blank strings from a list of strings: .. code-block:: Haskell GHCi> filter (\x -> x /= "") ["hello", "", "world"] ["hello", "world"] .. _foldl: foldl' ^^^^^^ .. warning:: There is another function called ``foldl`` (notice the missing ``'``) at the end. While ``foldl`` and ``foldl'`` are very similar, there is a very subtle difference, which makes ``foldl`` unfit for *most* use-cases. It still exists in the standard library due to historical reasons and backward compatibility. **Do not** use ``foldl``. Use ``foldl'`` instead. The strangely named ``foldl'`` is an extremely important higher-order function. In some senses, it is the "mother function" of all higher-order functions that deal with processing lists. Using ``foldl'`` you can implement, ``map``, ``filter``, and every other such function that visits each element and do something with it. To understand why ``foldl'`` is so important, consider the procedure that ``map`` abstracts away for you: 1. Initialize an empty list and the loop index: ``result=[], i=0`` 2. Process the i-th element of the list **<== This is what we change with the lambda** 3. Append the output of this processing to the ``result`` list 4. Increment the loop index: ``i=i+1`` 5. Repeat from step #2, is we haven't reached the end of the list 6. Return the ``result`` list Next, consider the procedure that ``filter`` abstracts away for you: 1. Initialize an empty list and the loop index: ``result=[], i=0`` 2. Check the condition for the i-th element of the list **<== This is what we change with the lambda** 3. If the condition is true, append the output of this processing to the ``result`` list, else do not change the ``result`` list 4. Increment the loop index: ``i=i+1`` 5. Repeat from step #2, if we haven't reached the end of the list 6. Return the ``result`` list Notice, how your lambda (step #2), has no idea of the ``result`` being accumulated in both the cases? What if your filtering logic depends on the elements that have already been filtered? For example, try writing the following filtering logic using only the ``filter`` function: Given a list of integers, select *at most* one occurence of each integer, i.e. if the input is ``[1, 2, 3, 3, 2, 4]`` the output should be ``[1, 2, 3, 4]``. You will be unable to write this function using only the ``filter`` function. This is because the condition in step 2, the lambda that you are passing, needs access to the ``result`` of each step. This is exactly what ``foldl'`` gives you. The lambda that you pass to ``foldl'`` is of the shape ``b -> a -> b``, where the first argument is the accumulated result at the end of each iteration, the second argument is the current element being processed, and the lambda should return the new value of the accumulated result (this will be passed again to the lambda in the next iteration). Here's the **simplified** type-signature of ``foldl'``. Please note, if you check the type of ``foldl'`` in GHCi, you will see something else. Ingore that *for now*. We will revisit the actual type-signature once we cover typeclasses in a later chapter. .. code-block:: Haskell foldl' :: (b -> a -> b) -> b -> [a] -> b Let's look at ``foldl'`` in action. Summing up a list of integers: .. code-block:: Haskell GHCi> foldl' (\total x -> total + x) 0 [1, 2, 3, 4] 10 Notice that we are passing the second argument as ``0``. This is the initial value of the "result" that is being accumulated at every step. This is the value that will be passed to your lambda along with the first element of the list. .. tip:: Please take a pen & paper and work out the mechanics of how ``foldl'`` is working using the following table: .. code:: +----------------+---------------------+-------------+ | total | x (current element) | new total | +----------------+---------------------+-------------+ | 0 | 1 | 1 | | 1 | 2 | 3 | | ... | ... | ... | +----------------+---------------------+-------------+ Another example: Filtering out unique elements in an integer list. Note the following: - We are using the ``elem`` function to check if an element exists in a list - The accumulated value itself is a list, in this case. - We are using ``if-then-else`` in a single line to define the lambda in a terse manner. This is allowed by the many rules of indentation in Haskell. **Please use a pen & paper and draw a table to understand how this example is working** .. code-block:: Haskell GHCi> foldl' (\result x -> if (x `elem` result) then result else (x:result)) [] [1, 2, 3, 3, 2, 3, 1] [3, 2, 1] Another example: Let us defend the initial claim that ``foldl'`` is the "mother function". Let us implement ``map`` in terms of ``foldl'``: .. code-block:: Haskell -- Doubling all elements of a list... GHCi> foldl' (\result x -> result ++ [x * 2]) [] [1, 2, 3] [2, 4, 6] -- ... is _functionally_ equivalent to... GHCi> map (\x -> x * 2) [1, 2, 3] [2, 4, 6] foldr ^^^^^ ``foldl'`` processes the input list from left-to-right (hence the ``l`` in the function), and ``foldr`` processes the list from right-to-left. Here's the **simplified** type signature: .. code-block:: Haskell (a -> b -> b) -> b -> [a] -> b Notice how the order of arguments in the lambda are swapped (compared to ``foldl'`` Detour: Tuples ^^^^^^^^^^^^^^ In the previous chapter, we skipped a basic data-type of Haskell - tuples. Now that you have some understanding of how to read polymorphic type signatures, i.e. ``(a -> b) -> [a] -> [b]``, let's discuss tuples and how they are frequently used in higher-order functions. - A tuple is a collection of **fixed** number of items (as opposed to a list, which can contain any number of items - or even zero items). - Items in a tuple may be of different types. (as opposed to a list, where every item in the list *must* be of the same type). - A tuple with 2 elements is a **completely different type** from a tuple with 3 elements. We will elaborate on this below. What does a tuple look like? Tuples are written within parentheses ``()`` and the elements of a tuple are separated by commas. For example: .. code-block:: haskell GHCi> :t ('a', 'b') ('a', 'b') :: (Char, Char) GHCi> :t ('a', True) ('a', True) :: (Char, Bool) GHCi> :t ('a', 'b', True, False) ('a', 'b', True, False) :: (Char, Char, Bool, Bool) If you go back to the :ref:`defining_functions` section in the previous chapter, and see the code-snippet where we were erroneously passing ``(function, arguments, in, parentheses)`` you will now be able to appreciate the compiler's error message: .. code-block:: text Couldn't match expected type ‘Int’ with actual type ‘(Integer, Integer)’ The compiler is wondering why we are passing an ``(Integer, Integer)`` tuple for an ``Int`` argument! Now, let's see when you would reach out for a tuple in real code... Returning multiple values from functions **************************************** Whenever you want to return multiple values from a function where the type *and* number of return-values is known at the time of writing the code (compile time, as opposed to run-time). For example, take a look at the ``divMod`` function, which divides two integers, and returns the quotient **and** the remainder. The return type is a 2-tuple of type ``(Int, Int)``. Again, here's the **simplified** tyep-signature of ``divmod`` (what you'll see in GHCi will be different): .. code-block:: haskell divMod :: Int -> Int -> (Int, Int) -- GHCi> divMod 10 2 -- (5, 0) -- GHCi> divMod 10 3 -- (3, 1) How do we use the return value of this function? Let's understand by using ``divMod`` in a real-world example of time arithemetic. Say, you want to write a function that calculates the end-time of an event, given the start-time and the duration of the event (in minutes): .. code-block:: haskell calculateEndTime :: Int -> Int -> Int -> (Int, Int) calculateEndTime hr mn durationInMins = let (addHour, finalMins) = divMod (mn + durationInMins) 60 in (hr + addHour, finalMins) -- GHCi> calculateEndTime 10 30 45 -- (11, 15) -- GHCi> calculateEndTime 10 30 115 -- (12, 25) .. note:: **Quick note about let..in** **Note:** We suddenly introduced a new ``let...in`` syntax in this function. We will discuss this in greater details in later chatpers, but for now you can intuitively understand what it means. The structure of a ``let..in`` expression is the following: .. code-block:: haskell let binding1 = expression1 binding2 = expression2 binding3 = expression3 in The "return value" of a ``let...in`` expression is the whatever the expression after ``in`` evaluates to. So, in our example, the ``let`` is creating *two* bindings -- ``addHour`` and ``finalMins``, corresponding to the values of the 2-tuple returned by ``divMod``. These are then used in building another tuple ``(hr + addHour, finalMins)``, which is the "return value" of the ``let...in`` expression, which, in turn, is the "return value" of the ``calculateEndTime`` function. At the time of writing the code, we know that this function will *always* return exactly two values - the quotient and the remainder. Now, could the ``divMod`` function have returned a ``[Int]``? Yes it could have. In fact, that's exactly how the `divmod function in Ruby's standard library `_ is written. Why is, then, a tuple preferred over a list in this case? **Because of correctness.** When you call a function that returns a list, you can never be sure of how many elements it has. It can have two, three, 100, or even zero. If at the time of writing the code (compile-time) you *know* that a particular function will always return two values, you can encode that guarantee by picking the right type, i.e. a 2-tuple. This way every user of the function can stop worrying about the edge-case of getting a list which does not contain exactly two values. In the future, if you have to evolve your function and it now returns a 3-tuple, the compiler is going to *help you* by pointing out all the places where you were expecting the function to return a 2-tuple, but it now returning a 3-tuple instead. There would be a very good reason why the function is returning 3 values intead of 2 now, right? All the places where you have used the function now also need to deal with the third value. If you had used a list instead, the compiler wouldn't be able to help you, and you would have to actually run the code and give it various inputs to see where it is breaking. **This is the power of type-safety.** PS: In fact, `Python's divmod function `_ also returns a 2-tuple instead of a list. Ad-hoc way to "associate" elements ********************************** A lot of times you need a way to temporarily "associate" one piece of data with another piece to achieve the larger goal of the app. For example, you are writing a phone-book (or contact-list) app and have to build a UI which looks like the following: .. code-block:: text -- A ------------------- Adam Anuarg Ashton -- B ------------------- Bhushan Bimal Banksy Beth -- C ------------------- Charlie Chaman Chilgoza (and so on) In all probability you will query the "Contacts API" on the phone, which will return a flat list of all the contacts: .. code-block:: haskell -- The list may not necessarily be sorted alphabetically -- [ "Adam", "Bhushan", "Bimal", "Chilgoza", "Beth", "Anurag" , "Ashton", "Chaman", "Charlie", "Banksky", -- and more ] Now, on the UI, if you notice there are *two* lists being displayed. One list is the alphabets - A, B, C, D... - and the other list is the contacts "associated" with that alphabet. How will you process the flat list into a simpler data-structure that your UI can easily loop over and display the final output? If you have spent some time with dynamically typed languages (like Javascript, Ruby, Python), you may be tempted to reach out for something that looks like a nested list, i.e.: .. code-block:: haskell [ "a", ["Adam", "Anurag", "Ashtong"] , "b", ["Bhushan", "Bimal", "Banksky", "Beth" , "c", ["Charlie", "Chaman", "Chilgoza"] ] While, this would be an *acceptable* solution in many dynamic languages (though still not the best), it won't work in Haskell at all. Remember, that, in Haskell all elements in a list should be of the same type. However, in the data-structure given above, the list elements are alternating between ``String`` and ``[String]``. Aha, let's fix this problem, by making it a ``[[String]]``. What about the following: .. code-block:: haskell [ ["a", "Adam", "Anurag", "Ashtong"] , ["b", "Bhushan", "Bimal", "Banksky", "Beth" , ["c", "Charlie", "Chaman", "Chilgoza"] ] Now this is a valid list in Haskell of the type ``[[String]]``, **but** this has an **implicit assumption** that the first element of every inner list is going to the alphabet, and the other elements are going to be the names. This is the same problem as the ``divmod`` example given above. At the time of writing the code (compile time), we *know* that the first element *must* be an alphabet and the others *must* be names. However, our data-structure is not enforcing this. This is bound to cause bugs -- if not today, then three weeks later when someone else makes some changes to the code (say, the UI needs to change to display the name *and* the phone). Using Haskell's type-system to our advantage, we can make this *implicit assumption* completely *explicit* and use the compiler to help us prevent bugs. A *better* data structure for something like this would be of the type: ``[(Char, [String]]`` -- a list of 2-tuples. For example: .. code-block:: haskell [ ('a', ["Adam", "Anurag", "Ashtong"]) , ('b', ["Bhushan", "Bimal", "Banksky", "Beth"]) , ('c', ["Charlie", "Chaman", "Chilgoza"]) ] Notice, that because we are now using a tuple, we get the added advantage of using a ``Char`` to represent the alphabet. This makes another *implicit assumption* absolutely explicit -- it will no longer permit us to push a multi-letter string in place of what should have been a single alphabet. This will actually help the UI as well -- while displaying each header in the contact list, the UI can stop worrying about the edge-case that it *may* get a mulit-letter string instead of a single alphabet -- the types wont allow it! Passing multiple values to a lambda which accept only a single parameter ************************************************************************ Now, connected to the previous example (contact list / phone book), how do we actually transform our flat list of contacts to a list of tuples? That is, we want to build the following data transformation: .. code-block:: text Transform this => [ "Adam", "Bhushan", "Bimal", "Chilgoza", "Beth", "Anurag", ...] To this => [ ('a', ["Adam", "Anurag", ...] , ('b', ["Beth", "Bhushan", ...] , ('c', ["Chaman, "Charlie", ...] ] This is where we will use tuples to pass multiple values to a lambda which accepts only a single value. Let's elaborate this with an example of ``foldl'``: .. code-block:: haskell foldl' :: (b -> a -> b) -> b -> [a] -> b Here, ``foldl'`` is giving you access to a *single* accumulator variable, of *any* type ``b``. What if you want to write logic where you need access to *two* accumulator variables (which seems to be what we need in our case)? Tuples to the rescue! Remember, that ``b`` cany be *any* type. ``foldl'`` doesn't care. All it says is that, the lambda's first argument, the lambda's return value, and the initial value passed to the ``foldl'`` (which is technically the second argument) should be of the *same* type. Well, in that case ``b`` can very well be an n-tuple, if we need it to be! Let's see... .. code-block:: haskell module ContactList where import Data.Char (toLower) import Data.List (foldl') groupNamesByAlphabet :: [String] -> (Char, [String]) groupNamesByAlphabet names = foldl' -- the lambda (\ (alphabet, collectedNames) name -> if (length name > 0) && (toLower (head name) == (toLower alphabet)) then (alphabet, collectedNames ++ [name]) else (alphabet, collectedNames) ) -- the initial value ('a', []) -- the list names -- GHCi> :l ContactList -- GHCi> groupNamesByAlphabet [ "Adam", "Bhushan", "Bimal", "Chilgoza", "Beth", "Anurag", "Ashton", "Chaman", "Charlie", "Banksky"] - We've written a function that gives us a ``(Char, [String])`` back, but only for the names that start with ``a`` - Notice how we have split the arguments to ``foldl'`` across multiple lines for better readability. This is allowed in Haskell's many indentation rules. - Notice, how the lambda itself is split across multiple line - again allowed by Haskell's indentation rules. - The lambda itself is defined using a new-ish syntax, called "pattern matching", discussed below. - The alphabet that we want names for, is actually being passed as the first element of the tuple to the lambda *along with* the names collected so far (second element of the tuple). .. note:: **Quick note about pattern matching** If you notice, in the example we discussed above, the lambda was defined using a new syntax: .. code-block:: haskell \ (alphabet, collectedNames) name -> ... This is an example of "pattern matching" which we will cover in greater detail in the next chapter. For now, you can intuitively understand what it is doing. The two parameters of the function are separated by spaces, as usual (Note: The lambda is still accepting just two parameters, don't let the 2-tuple confuse you). On top of that, we are further *destructuring* the first parameter, which is a 2-tuple, into ``alphabet`` and ``collectedNames``. Try changing that to a 4-tuple, like ``(x, y, z, w)`` or a 3-tuple, liek ``(x, y, z)`` and notice how the compiler won't allow you to do that. Again, type-safety. However, we still haven't built the complete data transformation that we originally wanted. Let's build upon this idea. First, let's change our function to work for any alphabet that we pass-in: .. code-block:: haskell :emphasize-lines: 1,2,13, 19, 22 groupNamesByAlphabet :: Char -> [String] -> (Char, [String]) groupNamesByAlphabet alpha names = foldl' -- the lambda (\ (alphabet, collectedNames) name -> if (length name > 0) && (toLower (head name) == (toLower alphabet)) then (alphabet, collectedNames ++ [name]) else (alphabet, collectedNames) ) -- the initial value (alpha, []) -- the list names -- GHCi> :r -- GHCi> groupNamesByAlphabet 'b' [ "Adam", "Bhushan", "Bimal", "Chilgoza", "Beth", "Anurag", "Ashton", "Chaman", "Charlie", "Banksky"] -- ('b',["Bhushan","Bimal","Beth","Banksky"]) -- -- GHCi> groupNamesByAlphabet 'c' [ "Adam", "Bhushan", "Bimal", "Chilgoza", "Beth", "Anurag", "Ashton", "Chaman", "Charlie", "Banksky"] -- ('c',["Chilgoza","Chaman","Charlie"]) Finally, let's wrap this up in a ``map`` to make it work for all alphabets: .. code-block:: haskell :emphasize-lines: 23-25, 29-31 module ContactList where import Data.Char (toLower) import Data.List (foldl') groupNamesByAlphabet :: Char -> [String] -> (Char, [String]) groupNamesByAlphabet alpha names = foldl' -- the lambda (\ (alphabet, collectedNames) name -> if (length name > 0) && (toLower (head name) == (toLower alphabet)) then (alphabet, collectedNames ++ [name]) else (alphabet, collectedNames) ) -- the initial value (alpha, []) -- the list names groupNamesByAllAlphabets :: [String] -> [(Char, [String])] groupNamesByAllAlphabets names = map (\alphabet -> groupNamesByAlphabet alphabet names) "abcdefghijklmnopqwxyz" -- GHCi> :r -- GHCi> groupNamesByAllAlphabets [ "Adam", "Bhushan", "Bimal", "Chilgoza", "Beth", "Anurag", "Ashton", "Chaman", "Charlie", "Banksky"] -- [('a',["Adam","Anurag","Ashton"]),('b',["Bhushan","Bimal","Beth","Banksky"]),('c',["Chilgoza","Chaman","Charlie"]), -- ('d',[]),('e',[]),('f',[]),('g',[]),('h',[]),('i',[]),('j',[]),('k',[]),('l',[]),('m',[]),('n',[]),('o',[]), -- ('p',[]),('q',[]),('w',[]),('x',[]),('y',[]),('z',[])] Now, what if we **dont want** alphabets that don't have any names? Well, it's just a ``filter`` away: .. code-block:: haskell groupNamesByAllAlphabets :: [String] -> [(Char, [String])] groupNamesByAllAlphabets names = filter -- the filter condition -- notice how it's destructuring the tuple... (\ (alpha, collectedName) -> length (collectedNames) > 0) -- the list that we want to filter... (map (\alphabet -> groupNamesByAlphabet alphabet names) "abcdefghijklmnopqwxyz") -- GHCi> :r -- GHCi> groupNamesByAllAlphabets [ "Adam", "Bhushan", "Bimal", "Chilgoza", "Beth", "Anurag", "Ashton", "Chaman", "Charlie", "Banksky"] -- [('a',["Adam","Anurag","Ashton"]),('b',["Bhushan","Bimal","Beth","Banksky"]),('c',["Chilgoza","Chaman","Charlie"])] **Let's look at another example:** For example, what if we want to use ``foldl'`` to compute the **first 10 terms** of the Fibonacci sequence? To calculate the n-th term we'll need access to the (n-1)-th term, and the (n-2)-th term at each step. But, ``foldl'`` allows us to retain only one variable/value between steps? No problem -- let's use a 3-tuple... .. code-block:: haskell GHCi> foldl' (\ (x, y, fiboList) i -> (y, x + y, fiboList ++ [x+y])) (0, 1, []) [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] (55,89,[1,2,3,5,8,13,21,34,55,89]) Let's align the call to ``foldl'`` with it's type-signature to understand what's going on: .. code-block:: haskell foldl' :: ( b -> a -> b) -> b -> [a] foldl' (\ (x, y, fiboList) i -> (y, x + y, fiboList ++ [x+y])) (0, 1, []) [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] - In this function call, *type* ``a`` is ``Int`` and *type* ``b`` is ``(Int, Int, [Int])``. The latter is our 3-tuple saviour! - What do the 3 elements in the tuple represent: - (n-1)the term of the Fibo series - n-th term of te Fibo series - Fibo series containing n-terms - The lambda evaluates to ``(y, x + y, fiboList ++ [x+y])``, which is of the same *type* ``(Int, Int, [Int])``. It calculates the n-th element of the Fibo series **and** also appends it to the 3rd element of the tuple. - The initial value being passed to ``foldl'`` is ``(0, 1, [])``, which is of the same type ``(Int, Int, [Int])``. It's the first two terms of the Fibo series, along with an empty-list (which indicates that we have not accumulated any terms of the Fibo series, yet). - The third argument to ``foldl'`` is ``[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]`` which is of type ``[Int]``. Each element of this list is passed as the *second argument* to the lambda (the first argument is our 3-tuple accumulator). But the lambda doesn't seem to be doing anything with this value! That is correct. The actual numbers in this list don't make any difference; only that the number of elements should be 10 (because we want the first 10 terms of the Fibo series). Try the following in GHCi, and verify this for yourself: .. code-block:: haskell GHCi> foldl' (\ (x, y, fiboList) i -> (y, x + y, fiboList ++ [x+y])) (0, 1, []) [14, 19, 2, 8, 8, 9, 20, 12, 22, 29] zip, zipWith, and friends ^^^^^^^^^^^^^^^^^^^^^^^^^ Now, with the detour about tuples out of the way, let's look at ``zip`` and friends. First let's understand *why* they are required. Using only ``map``, ``filter`` and ``foldl'``, try solving the following problem. Suppose, for some government certification, required for your business, you had to find out the days on which a government office is going to be functioning AND the governemnt secretary is going to be present. Now, as it is the case with most government departments, these two pieces of data are made available separately by separate people who don't really talk to each other. Thankfully, they have provided this data in the form of ``Bool`` lists: .. code-block:: haskell -- 31 elements to represent 31 days. Each Bool value -- represents whether the governemnt office is open, or not. [ True, True, True, True, True, False, False , True, True, True, True, True, False, False , True, True, True, True, True, False, False , True, True, True, True, True, False, False , True, True, True ] -- 31 elements to represent 31 days. Each Bool value -- represents whether the governemnt secretary is working, -- or, on leave. [ True, True, True, True, True, False, False , False, False, False, False, False, False, False , False, False, False, False, False, False, False , True, True, True, True, True, False, False , False, False, False ] So, basically you need to write a function that takes these lists as an input and returns a "unified" list: .. code-block:: haskell daysOfActualWork :: [Bool] -> [Bool] -> [Bool] daysOfActualWork officeOpenDays secretaryPresentDays = _todo Seriously! Don't read ahead. Try solving the problem only with ``map``, ``filter`` or ``foldl'`` to really appreciate why ``zipWith`` is required. Here's a possible solution with ``foldl'`` - .. container:: toggle .. container:: header **Show/hide code** .. code-block:: haskell -- Here's a possible, but, ugly, solution daysOfActualWork :: [Bool] -> [Bool] -> (Int, [Bool]) daysOfActualWork officeOpenDays secretaryPresentDays = foldl' (\(i, finalList) isOfficeOpen -> (i + 1, finalList ++ [isOfficeOpen && (secretaryPresentDays !! i)]) (0, []) officeOpenDays We had to resort to the ``!!`` operator to lookup the ``secretaryPresentDays`` list by index AND we had to resort to manually preserving a loop counter. If you really think about it, this isn't very different from writing a ``for`` loop directly. Writing code like this doesn't really fulfil the reason of *why* we wanted to use lambdas and higher-order functions in the first place. Now, look at how you can solve this with ``zipWith``: .. code-block:: haskell zipWith (\isOfficeOpen isSecretaryPresent -> isOfficeOpen && isSecretaryPresent) officeOpenDays secretaryPresentDays That's it! Check the type-signature of ``Data.List.zipWith`` and see if you can make sense of it. Next, check the type-signature of ``Data.List.zip`` and see if you can make sense of it. Make sure you play around with ``zip`` and ``zipWith`` and understand how they behave in the edge-case that you pass lists of **different lengths** to them. There is a also a ``zip3``, ``zipWith3``, ``zip4``, ``zipWith4``, and so on, till ``zipWith7``, if you ever need them. There is a also a series of ``unzip`` functions that exist. Lots & lots of other functions ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ There are many, many higher-order functions in Haskell libraries. The abstraction being provided by (almost) every higher-order fucntion can be re-implemented by force-fitting a ``foldl'``. But it is not a good idea to do that. Most of these higher-order funcions "abstract away" common procedures. The choice of higher-order function should immediately communicate your *intent* to the reader (the common procedure that you have abstracted away). Using a ``foldl'`` doesn't always do that. Take a look at the ``zipWith`` example above. What kind of code would you rather take responsibility of? Please browse around the documentation of ``Data.List``. There are a lot of useful higher-order functions given there. To name a few: ``takeWhile``, ``dropWhile``, ``dropWhileEnd``, ``break``, ``groupBy``, ``partition``, ``sortOn``, ``deleteBy``, ``deleteFirstBy``, and many more. There are also a host of useful functions in the ``Data.List`` module, which aren't necessarily higher-order functions, but you will find yourself reaching out for them every now & then when writing real-world code. For example: ``reverse``, ``intercalate``, ``replicate``, ``take``, ``drop``, ``splitAt``, etc. **Remember: since String is just a list of Char, all of these functions can be used with String as well.** They can obviously be used with a list of *any* type. Exercises --------- Reimplementing old problems in a more Haskell-y way ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ We used tail-recursion to solve all the problems of the previous chapter to completely unlearn our muscle-memory of writing ``for`` loops and changing values of variables (called "mutation" in the FP world). Now, we'll solve them again using a more real-world Haskell approach: (Note: In all the exercises given below, the indicative functions are only the higher-order functions. You are free to use other utiliity functions like ``mod``, ``toUpper``, ``digitToInt``, etc. as required by the solution) 1. Sum a list of integers using ``foldl'`` 2. Sum a list of integers using: (a) just ``foldl'``, and (b) combination of ``foldl'`` and ``filter`` 3. Double a list of integers using ``map`` 4. Check if all characters in a ``String`` are contained within another ``String``: Use ``foldl'`` & ``elem`` 5. Generating a list of all even numbers till “N”: (a) Use ``foldl'`` alone, (b) Use ``filter`` and the short-hand of generating a list containing all number within a range:``[1..n]`` 6. Generate a list of first “N” even numbers: (a) Use ``foldl'`` alone, (b) Use ``map`` and ``[1..x]`` short-hand syntax 7. Sum of first “N” multiples of 3 or 5: (a) Use ``foldl'`` alone 8. Sum square difference: Use ``foldl'`` 9. Even Fibonacci numbers: Modify the solution already given above using ``filter`` 10. ISBN Verifier: (a) Build your solution using only ``foldl'``, (b) modify you solution to use ``zip`` or ``zipWith`` at the appropriate place. 11. Pangram: (a) write using only ``foldl'``, (b) write it using ``map`` and ``filter``, (c) write it using ``any`` 12. Reimplement the contact list (phone book) example from above using a higher order function that seems more approriate. (There is no single right answer to this question). Add line numbers to source code ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Write a function that accepts lines-of-code, represented as a list of Strings, and return a new list where each line-of-code is pre-fixed with the line number? .. code-block:: haskell prefixLineNumbers :: [String] -> [String] prefixLineNumbers = _todo -- prefixLineNumbers -- [ "module Main where" -- , "" -- , "increment :: Int -> Int" -- , "increment x = x + 1" -- ] -- -- OUTPUT: -- [ "1: module Main where" -- , "2: " -- , "3: increment :: Int -> Int" -- , "4: increment x = x + 1" ] - First write it using only functions and recursion and ``if-then-else``. - Write it using ``map``. Is it possible? If not, why? - Write it using ``foldl'`` - Write it using ``zip`` - Write it using ``zipWith`` Do you still miss your loop indices -- ``i``, ``j`` or ``idx``? Generalized Fibonacci selector ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Write a function that can select the first ``N`` terms of the Fibonacci series that match a given condition. Take note: - The result of the function itself should contain ``N`` terms, i.e. if the first 3 even terms are requested, the result should be: ``[2, 8, 34]``. Similarly, if the first two multiples-of-3 are requested, the result should be: ``[3, 21]`` - You should consider writing your own higher-order function (write a function that accepts the condition as a lambda) .. _pagination: Pagination ^^^^^^^^^^ Write the core logic for displaying page-numbers at the bottom of a typical list-view in any web-app. The entire list is divided into ``p`` pages of ``r`` items each. .. code-block:: haskell -- Note the use of type-synonyms to clearly indicate what each `Int` -- in the function signature actually _means_ type ItemsPerPage = Int type TotalItems = Int type CurrentPage = Int displayPagination :: TotalItems -> ItemsPerPage -> CurrentPage -> String Usually, when the total page-count is less than ``8`` all the page-numbers are shown at the bottom of the list, as such (``*`` next to a page number denotes that the user is currently on that page): .. code-block:: haskell displayPagination 55 10 4 -- OUTPUT: -- < Prev | 1 | 2 | 3 | 4* | 5 | Next > However, when the total page-count is equal-to/higher-than ``8``, then all the page numbers are not shown (it would make the pagination too long). The current page is kept towards the "center" and 3 numbers on either side are shown, as demonstrated below: .. code-block:: haskell displayPagination 545 10 16 -- OUTPUT: -- However, there is an edge-case when we don't have enough page numbers to display on either the left, or right, side of the "center" page. Examples: .. code-block:: haskell displayPagination 545 10 3 -- OUTPUT: -- < Prev | 1 | 2 | 3* | 4 | 5 | 6 | 7 | ... | Next > displayPagination 545 10 53 -- OUTPUT: -- < Prev | ... | 49 | 50 | 51 | 52 | 53* | 54 | 55 | Next > **Follow-up:** The total number of pages to be display was "hard-coded" (i.e. pre-decided) in the problem given above. Modify your function to additionally take the total number of pages to display, as well. Example: .. code-block:: haskell type ItemsPerPage = Int type TotalItems = Int type CurrentPage = Int type NumOfPagesToDisplay = Int displayPagination :: NumOfPagesToDisplay -> TotalItems -> ItemsPerPage -> CurrentPage -> String displayPagination 13 321 10 3 -- -- OUTPUT: -- displayPagination 13 321 10 15 -- -- OUTPUT: -- **Another follow-up:** Have you handled the case where ``NumOfPagesToDisplay`` is an even number? .. _exam_score_processing: Exam-score processing ^^^^^^^^^^^^^^^^^^^^^ Say, you have parsed the exam-scores of all students from a CSV, and have it in the following data-structure: .. code-block:: haskell [ ("Saurabh Nanda", [("English", 84), ("Chemistry", 80), ("Physics", 95), ("Geography", 75)]) , ("John Doe", [("Chemistry", 80), ("Physics", 95), ("Geography", 75)]) , ("Jane Doe", [("Chemistry", 66), ("Phsyics", 33), ("Geography", 56)]) -- Note the spelling error in "Physics" , ("John Doe", [("Chemistry", 90), ("Economics", 45), ("Geography", 56)]) -- Note that "John Doe" is a repeat entry , ("Bahubali", [("Hindi", 45), ("Biology", -90), ("Geography", -75)]) -- Note the negative marks in "Biology" & "Geography" , ("Rajnikant", [("Tamil", 110), ("Biology", 100), ("Geography", 100)]) -- Note that marks in "Tamil" are greater than 100 ... ] (Can you work out the type of the data-structure given above without inspecting it in GHCi?) Further, the list of known subjects is also available to you in the following data-structure: .. code-block:: haskell [ "English" , "Geography" , "Physics" , "Chemistry" , "Economics" , "Computer Science" ... ] Write the following functions: 1. Calculate the average marks for a given subject taking care that invalid entries are not included in the calculation. - All *scores* that are negative should be considered invalid - All *scores* that are greater than 100 should be considered invalid - All *scores* where the subject name is not known should be considered invalid - All *entries* where the student's name is duplicated should be considered invalid, i.e. none of the scores for those entries should be considered. 2. Calculate the standard deviation for a given subject taking care of invalid entries. 3. List all the students whose names are duplicated 4. List all students, who have invalid scores in any subject, along with the subjects in which they have invalid scores and the reason why the score is considered invalid. For the sample data given above, the result should be: .. code-block:: haskell [ ("Jane Doe", [("Phsyics", 33, "Unknown subject")] , ("Bahubali", [("Biology", -90, "Negative score"), ("Geography", -75, "Negative score")]) , ("Rajnikant", [("Tamil", 110, "Score is greater than 100")] ] 5. Give a list of all subjects and the students who took an exam in that subject, taking care of invalid entries & scores. For example: .. code-block:: haskell [ ("English", ["Saurabh Nanda"]) , ("Geography", ["Saurabh Nanda", "Jane Doe", "Rajnikant"]) , ("Physics", ["Saurabh Nanda"]) , ("Chemistry", ["Saurabh Nanda", "Jane Doe"] , ("Economics", []) , ("Computer Science", []) ] 6. Group together all students that have taken the exact same exams, taking care of invalid entries & scores. Invalid scores can be considered as exams not taken. The result should be of the type ``[([String], [String])]`` where the first element of each tuple is the list of exams, and the second element is the list of students who have taken those exams. **Follow-up:** Generalize your solution for 1, 2, & 4 by writing your own higher-order function. Generalize your solution for 5 & 6 by writing your own higher-order function. Stuff that you might struggle with ---------------------------------- Thinking "functionally" vs "imperatively" ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ You will probably have the logic for most of the exercises in your head, but you might struggle with actually writing the logic using only ``map``, ``fold``, or ``zip``. Do not worry - this is expected. There is a bit of *re-wiring* that your brain needs to undergo, to start thinking functionally. That's the whole point of this chapter. **Do NOT skip the exercises.** Is there a function to do X ? ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If you've been spending time on the API-documentation (which you **should** be doing), you would have noticed that ``map``, ``fold`` and ``zip`` are not the only higher-order functions that are available. There seem to be dozens of variatons of ``map``, ``fold`` and ``zip`` for common use-cases. Initially, you might end-up re-implementing a higher-order function that is already present in the standard library (most probably using a ``foldl'``). However, as your familiarity with the standard library increases, you will realise that a lot of code can be reduced to a one-liner by using the correct higher-order function. **Do not be afraid** to go back and refactor your code when you learn of something new. Refactoring is a super-power in Haskell - if you mess-up something while refactoring, the compiler is going to shout at you till you fix it. So, **don't be afraid to refactor.** What the hell is going on? Can I print the value of this variable? ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ You have been explicitly forbidden to write output to the terminal - GHCi will automtically print the return value of your function. There is a reason for doing this, which we will cover later. However, if, for **debugging** you *need* to print values to the terminal use various functions from the ``Debug.Trace`` module. For example, here's how you can use ``traceShow`` to print the values passed to your lambda, while summing a list of itnegers: .. code-block:: haskell GHCi> import Debug.Trace (traceShow) GHCi> import Data.List (foldl') GHCi> foldl' (\total x -> traceShow (total, x) (total + x)) 0 [1, 2, 3, 4, 5] The **simplified** type-signature of ``traceShow`` is: .. code-block:: haskell traceShow :: a -> b -> b You can pass it two arguments of **any type**, it will print the first argument to the console, and will return the second argument as it is. Which means that ``traceShow`` will always evaluate to the second argument passed to it. So, in our example above ``(total, x)`` is being printed, but the function call to ``traceShow`` is evaluating to ``total + x``; and, as a result, the entire lambda is evaluating to ``total + x`` -- thus our summation logic is not changing. In other words, ``(total, x)``, which is being printed has absolutely no effect on the "return value" of ``traceShow``. The "return value" of ``traceShow`` will always be the second argument you pass to it.