A touch of Haskell for the curious
Hello!, today we will write about things a newcomer to Haskell would need to ease the journey into the pure world, as for my self I could have certainly avoided the struggle in the paradigm shift if I were given something like this. I honestly hope this helps the typical oop/imperative oriented developer coming from C#, Python, Ruby, PHP, Java, JavaScript to Haskell.
WARNING!
If you are just about to begin the journey into Haskell, or you are not too deep in the woods to have noticed yet, let me warn you one little thing about Haskell and the FP world behind it… it’s a trap!.
Honestly, no one told me this, I just wanted to become good at Scala and what better introduction to functional programming that start with Haskell and then transition into Scala, then it would be a piece of cake since you will grasp the functional thinking with a fp-only language first. Up to this day I don’t know if this was a curse or a gift, I like to think is the latter, it must be, right?.
When I realized, it was too late, it had absorbed me, soon I began to read about similar stories, people wanting to try Haskell just because they wanted to prove other colleges their assumptions, something along the lines of “Haskell is too difficult and doesn’t bring anything new to the table”, or “why Haskell? JavaScript is by far the most widely adopted language today and it’s because simply put, it is the best.", they too are now trapped.
As Aditya Siram (in his talk “A (Not So Gentle) Introduction To Systems Programming In ATS”) put it:
People keep saying that learning programming languages makes you a better programmer, it really doesn’t, it makes you a better programmer up to a point… then it makes you bitter and dissatisfied…
That’s so true! π …
because you’ll never be able port those ideas to your day job, even if you could get past of the technical hurdle of your programming language not supporting those constructs you’ll never get there.
π§ -> π -> π -> π -> π«… Sad but kind of true, if you recognize the value that lies within functional programming you’ll see that in most organizations you simply can not use it (practicality, team, management, etc), however if you are reading this, it must mean you are not a “9 to 5”-er, you care deeply about all aspects of software development and as such it is a moral duty to spread the word, so that some day interstellar infrastructure is well constructed and powered by using mathematical principles not only at the physical/hardware level, but at the software level too (I’m being sarcastic btw).
With that, let’s begin.
The basics
Usually when learning other programming languages we tend to “map” what we already know about our language to the new language we’re learning, some of us even try at any point to defend our beliefs internalizing expressions like “X can also do this”, “This is easier to do in X”, I suggest avoid doing this, not because it might be true or false, but because once you grok Haskell, it will simply not make sense, we are at different leagues here, forget all you know about programming, we’re not in Kansas anymore (and I’m in Mexico).
Think everything is an immutable expression, once you state something is X then that identifier is X for ever, and that is an expression written in stone… in code, I mean. There are different types of expressions and expressions have types. Let’s begin with something easy:
awesome = True
Ahh, a pure expression π, awesome
will always be True
it doesn’t matter what you think, it simply is. From this moment awesome
can’t be other thing:
awesome = "great" -- Errr, will not compile, 'awesome' is already defined
awesome = False -- Errr, will not compile, 'awesome' is already defined
Not only we can’t assign values of other type, but also, we can’t give it a value of the same type, is just immutable. awesome
is a Bool
, the compiler knows this because both True
and False
are values of this type. If we want to be clearer about our intent or state our assumptions about the type of an expression, we can use type annotations when defining an expression like this:
great :: Bool
great = True
We can read ::
as “is a” or “has the type”, so here “great is a Bool”. By adding type annotations to definitions we can’t accidentally assign values of other types to an expression:
canNot :: Int
canNot = "hello" -- Errr, will not compile, types don't match
Yes, this it may seem to basic, but when we’re dealing with more complex expressions this is awesome.
Functions
Now let’s write a function scream
that takes no arguments and gives a good old terror movie scream back:
scream = "Aaaah!"
You see? values and functions without arguments are the same thing! we might just tell everything is a function! More over we don’t need to add an explicit return
statement, the last expression is returned (“as in Ruby” I hear you scream with desperation π). Let’s define a function that takes a name to salute:
-- Define the function:
hello name = "Hello " ++ name
-- Use the function
hello "Joe" -- Outputs: "Hello Joe"
The hello
function takes one argument name
, then it just appends that name to our "Hello "
message, ++
is the concatenation operator, but is not an operator but a function!. To use the hello
function we just specify the function name and the arguments will follow, no function-call-parenthesis required.
Functions defined with symbols like the ++
function are known as infix functions, because they are used in infix position, that is in between values, like mathematical operators. hello
is not infix because its name is not comprised of symbols.
What happens if why change the type of the parameter:
hello True -- Errr, will not compile, types don't match
Our hello
function expects a String
and only a String
, why is that? Here the compiler infers the type of our function by inspecting what it does, or more precisely, by inspecting the types of the expressions that form our hello
expression (because everything, even functions are expressions). Here the compiler sees we use the ++
function, and it so happens that the type of this function looks like this:
(++) :: String -> String -> String
Note: For our pedantic friend out there, I know this is not the type of the ++
function, but for simplicity we’ll assume it is.
If the type annotation contains any arrows (->
) we can easily tell this is a function taking some arguments. The ->
in the type definition is how type arguments of the function are separated in the type annotation, the last ->
indicates the return type of the whole function. So we can say /"++
is an infix function that takes a String
and another String
to result in a final String
.
We can see in our hello
function definition, that we are giving "Hello "
and name
as the parameters to this function, and the final return value is then a value of type String
, and thus our hello
function has the following inferred type:
hello :: String -> String
That’s why when we pass anything to our function but a String
we get an error back, because types do not match.
Expressions are not statements
This means we can’t write more than one expression in a function, and this may have the effect that we can’t create a sequence of instructions in a function, for example, this will not work at all:
myAgeNextYear :: Int -> Int -> Int
myAgeNextYear birthYear currentYear =
nextYear = currentYear + 1 -- Attempt to declare a 'variable'
nextYear - birthYear -- Implicitly return the difference
That is not valid Haskell code, it will not compile, because the concept of “sequence” as in statements do not exists, but it can be “emulated” if one can says so with the let ... in ...
expression:
myAgeNextYear :: Int -> Int -> Int
myAgeNextYear birthYear currentYear =
let
nextYear = currentYear + 1 -- Add a local definition
in
nextYear - birthYear -- Implicitly return the difference
Note this pattern can be abused to “sequence” expressions, please don’t do that, that’s nasty π
Now, this will compile with no issues, but take a closer look, there are no statements here and no mutation, everything is an expression, including the let ... in ...
, this means we can assign “its result” as a value:
message :: String
message = let word = "World" in "Hello " ++ word
And this can also be expressed using where
as follows:
message' :: String
message' = "Hello " ++ word
where word = "World"
The tick mark '
is just part of the function name, it doesn’t mean anything special here (in other contexts it could have meaning but that’s more advanced and not relevant for this introduction), other than that both message
and message'
work in the same way.
Currying and Function composition
Let’s say we have defined the following functions:
add a b = a + b
mul a b = a * b
square a = a * a
As this functions use arithmetic functions in their definition, it means these functions will work on numbers, let’s assume add
and mul
both have the type signature Int -> Int -> Int
(although this is exactly not true it will make the examples easier), meanwhile the square
function will have the type Int -> Int
because it takes a single Int
value and yields another one back. From now on, I’ll include type annotations just to be clear, so let’s add the type signatures to the functions:
add :: Int -> Int -> Int
add a b = a + b
mul :: Int -> Int -> Int
mul a b = a * b
square :: Int -> Int
square a = a * a
Now, on to currying.
Currying
Even though it looks that both add
and mul
take two parameters in fact they just take a single argument and both return a function. We can group the type annotation with parenthesis to actually see what’s happening:
add :: Int -> (Int -> Int)
add a b = a + b
mul :: Int -> (Int -> Int)
mul a b = a * b
This means that, add
takes a value of type Int
and yields a function that takes an Int
value to yield another Int
value back. What this means is that we can do partial function application and save the resulting function to another value:
add5 :: Int -> Int
add5 = add 5
-- Using 'add5':
add5 10 -- Outputs: 15
“It can be done in JavaScript…” I hear you mumble. The point I’m making is that we can pass functions as values at any time, they are just expressions, and expressions can compose with other expressions.
Function composition
Now if we want to create a function that adds two numbers, then multiplies the result by 5, and finally squares the result we can do this:
compute :: Int -> Int -> Int
compute x y = square (mul 5 (add x y))
compute 3 4 -- Outputs: 1225
Note that, the parenthesis are not due to function calls as in many other programming languages, here they’re used just to group expressions, this is needed because in Haskell function application has the higher precedence and thus without the parenthesis we will be saying that the square
function takes multiple parameters (the mul
function being one of them) and this is not true. In Haskell we can use the $
function make it look cleaner:
compute :: Int -> Int -> Int
compute x y = square $ mul 5 $ add x y
It means the same thing, first apply add
function, then mul 5
and finally square
. Or read otherwise, apply square
to everything to the right of $
, for this we will first need to apply mul 5
to everything to the right of the second $
.
You’ll see this starts to look like a pipeline with water flowing from right to left, but Haskellers prefer function composition, using the .
function composes functions:
compute :: Int -> Int -> Int
compute x y = square . mul 5 $ add x y
But see where partially applying mul
so we get a composable function, that means we can do the same with add
and compose that one too:
compute :: Int -> Int -> Int
compute x y = square . mul 5 . add x $ y
The .
is just a function that effectively combines functions, this means that on the fly we define a single function that is the same as applying square
to the result of applying mul 5
to the result of the add x
function. As this is just a function without a name (is being composed on the fly) we use the $
operator, to apply the function on the left to everything on the right.
This means the same thing:
compute :: Int -> Int -> Int
compute x y = (square . mul 5 . add x) y
Then, see how on both sides of the =
sign we have y
values? you can think of it as y
is multiplying at both sides of the equation and just as in math equations, we can drop them off because the equality will hold, so this is the same function:
compute :: Int -> Int -> Int
compute x = square . mul 5 . add x
Reducing expressions in this way when there are implicit parameters being passed around is called point free style.
One thing it may trick you on functions defined as the composition of other functions (such as compute
) is that computing steps are written backwards, other languages such as Elm or Elixir have a pipe operator that allows basically the same but with a sequential order, doing that is not well seen in Haskell as “it’s not the mathematical way” but if you’re stuborn you can achieve the same thing with the &
function, is more like $
but reversed:
compute :: Int -> Int -> Int
compute x y = add x y & mul 5 & square
Note: This
&
function needs to be imported explicitly fromData.Function
module.
This does not compose the functions but it acts as a pipeline, the result of the add
function is then passed as the input to the mul
function and its result then goes to the square
function as its input. We can write it in multiple lines as well:
compute :: Int -> Int -> Int
compute x y
= add x y
& mul 5
& square
But as I mentioned, this is not “the Haskell way” of doing things and one should try to look for composition when possible. But how can we know when it’s possible to compose functions? Haskell makes this very easy thanks to the type system, that is, by adding type annotations to our expressions we can clearly see the edges of our Lego pieces and see how they can fit one with another, effectively composing them. Take a look to our functions above, they all take and produce Int
s, here it’s easy to see that what one function produces and what other needs are compatible types, so these Lego pieces will fit perfectly. When writing more complex expressions and abstract expressions in Haskell the same idea will remain, but unless you don’t learn how to use the type system to your favor, you won’t be able to see this, and you’ll always feel restricted when using Haskell and finally quit without knowing why others share so much respect and excitement about this language.
Lambdas
This is a good point to introduce lambdas which are function literals, the syntax to define one is \
followed by named arguments and ->
to signal the end of parameter list and beginning of function body. The following compute example is 100% equivalent to this one using a lambda:
compute :: Int -> Int -> Int
compute = \x -> square . mul 5 . add x
Notice that since we’re binding compute
to a lambda we no longer need to capture the x
parameter to the left side of the =
, but instead the lambda captures the argument that is needed for the add
function, if we think about it, the lambda exists just to fulfill that purpose, so it means we can do this instead:
compute :: Int -> Int -> Int
compute = square . mul 5 . (\x -> add x)
Lambdas are just functions, and we can compose them as such.
Data types
In Haskell we have “primitive” data types, that is, data types that are very common in software, for example Int
that represents a bounded integer, whereas Integer
is unbounded and can represent any number no matter how big it is (up to your memory capacity limit). Float
for floats and Double
for double precision floating point number. There are more types used to describe numbers, I suggest checking out this gist about how numbers work in Haskell.
Other types you’re already familiar with is the Bool
type which its only values can be either True
or False
. The []
(list) type to hold several values of any other type, this is known as a parameterized type or more formally a type constructor, that is, a type expression that needs other type to be provided to serve as a type.
Lastly a good type to know is the Maybe
type, this is very similar to a list, whereas the list can hold no value (empty list), a single value (single element list) or multiple values of a type, the Maybe
type can either have no value (in which case the value will be Nothing
) or it has a single value (in which the value is Just value
). For example we can define a function safeDiv
that will not fail even if we divide by zero, in which case the value to be returned will be Nothing
:
safeDiv :: Double -> Double -> Maybe Double
safeDiv a b =
if b == 0 then
Nothing
else
Just (a / b)
And by the way, that’s how if
, then
, else
works in Haskell, but it isn’t used as much, we can achieve the same using guards:
safeDiv :: Double -> Double -> Maybe Double
safeDiv a b
| b == 0 = Nothing
| otherwise = Just (a / b)
By using guards notation we list out conditions after each |
the first one to evaluate to True
will be executed, that last one otherwise
is just an alias to True
, so we could put True
directly instead and would achieve the same thing.
Custom data types
We can’t define classes or objects in functional programming as in object oriented programming, but we can define our own data types, this is a different way to model structures in our computer programs and in my opinion is a simpler yet more expressive approach. For example, if your application handle user’s email accounts its common to have the Email
type defined, this can be done like this:
newtype Email = Email String
With this we can construct an email like this:
myEmail = Email "[email protected]"
The compiler will see the type of myEmail
to be Email
and we can define functions that expect values of this type to provided to them, then our code will only compile if these functions receive Email
s and not any other type.
Of course, anyone could create an Email
out of gibberish and it will be
valid!:
gibberishEmail = Email "what's up?"
In order to avoid use we can define a function String -> Maybe Email
, if the String
is a valid email then we can give a Just email
value back, otherwise Nothing
will be returned. Then our module can hide (or make private) the Email
value constructor so no one can use it directly, this will turn impossible to construct invalid emails.
But how to access the inner String
in case I need it? Let’s suppose we just want to check weather an Email
is on Gmail like this:
isOnGmail $ Email "[email protected]" -- Outputs: False
isOnGmail $ Email "[email protected]" -- Outputs: True
So isOnGmail
is a function that should be able to receive an Email
value but also check the suffix of the email String
, and we already know we can’t treat an Email
as a String
because those are different types now. For that we can use pattern matching:
isOnGmail :: Email -> Bool
isOnGmail (Email address) = "gmail.com" `isSuffixOf` address
Note to the left of the =
we can use the data constructor Email
as if we were constructing one but here it means to deconstruct because it is a pattern match, the value of inner String
will be bound to the address
identifier. Also here we use the isSuffixOf
function (to use it we will need to import it from Data.List
module) in a prefix style, by surrounding the function name with `
we turn a non-infix function into a infix function. So previously when calling our add 4 3
function we could as well call it like 4 `add` 3
and it means the same thing. So this is equivalent:
isOnGmail :: Email -> Bool
isOnGmail (Email address) = isSuffixOf "gmail.com" address
If we can turn a non-infix function into a infix function, could we turn a infix function into non-infix? Yes we can, recall how add
is defined:
add :: Int -> Int -> Int
add a b = a + b
The +
there is an infix function, we can surround it with parenthesis (
)
to make it into a prefix (non-infix) function:
add :: Int -> Int -> Int
add a b = (+) a b
And we can apply that old trick, b
is at both sides of the equation:
add :: Int -> Int -> Int
add a = (+) a
And a
is too at both sides of the equation:
add :: Int -> Int -> Int
add = (+)
And now we see clearly add
and +
are the same function, they’re equal (ignoring sameness and equality distinctions).
Type aliases
We can also alias some types, that is give it several names to the same type, for example in Haskell a String
is just an alias to a list of characters [Char]
and is defined like this:
type String = [Char]
Or suppose that in your sales application want to track order numbers as if it was just a number:
type OrderId = Int
Unlike newtype
, this would make Haskell accept an OrderId
whenever an Int
is expected and vice versa, so it wouldn’t warn you if you try to add up order numbers.
Algebraic data types
There’s a class of more interesting types that can be defined, these are types with more than one value or constructor, we’ve already met Bool
, an algebraic data type roughly defined like this:
data Bool = True | False
This are called sum types and it means a value of type Bool
can either be True
or False
but not both at the same time. In contrast to product types whose values can include multiple values at once, for example a Person
has a Name
and an Age
:
type Name = String
type Age = Int
data Person = Person Name Age
The Person
to the left side of the =
is the type we’re creating, and the Person
to the right-hand side is the name of the data constructor to create a value of such type, they don’t need to be called the same but is common practice to use the same name for both the type and data constructor. So this is how we create a value of type Person
:
jane :: Person
jane = Person "Jane Doe" 21
Now you could create a function to check if Jane is allowed to drink.
Recall Maybe
? we said is like a list that could hold one value or nothing at all inside it, when a type can hold a value of other type and we want to make this inner type “configurable” then we have an algebraic parameterized data type:
data Maybe a = Nothing | Just a
“Generics!" you may say. Yes and we can also define recursive algebraic parameterized recursive data types, as List
:
data List a = Nil | Cons a (List a)
That is, a linked List
can be either empty (Nil
) or be a Cons
that holds an element a
of the list and references another List
. We could create a list of lucky numbers like so:
luckyNumbers :: List Int
luckyNumbers = Cons 7 (Cons 21 (Cons 99 Nil))
Or we could also use $
instead:
luckyNumbers :: List Int
luckyNumbers = Cons 7 $ Cons 21 $ Cons 99 Nil
In Haskell lists are denoted with []
as already noted, so a more truthful definition for the list type in Haskell will be:
data [] a = [] | a : [a]
This is harder to read because it uses symbols as both the type and constructor names but overall is the same structure, Cons
is :
(as is a symbol this is infix) and Nil
is []
, so a Haskell list can be created like this:
luckyNumbers :: [] Int
luckyNumbers = 7 : 21 : 99 : []
Thanks to the associativity defined for these data constructor we don’t need to use parenthesis here, but will still to be ugly to write like that, so Haskell gives some syntax sugar for lists:
luckyNumbers :: [Int]
luckyNumbers = [7, 21, 99]
Now, product types can also be polymorphic and take more than one type variable (sum types can also take multiple type variables):
data Tuple a b = Tuble a b
So our Tuple
holds values of different types:
tupleOne :: Tuple Int String
tupleOne = Tuple 3 "hello"
tupleTwo :: Tuple Bool Float
tupleTwo = Tuple False 3.14
In reality Haskell already provides tuples, these are denoted with parenthesis:
tupleOne :: (Int, String)
tupleOne = (3, "hello")
tupleTwo :: (Bool, Float)
tupleTwo = (False, 3.14)
Type classes
Type classes are a mechanism to decouple behavior from data, as you may have noticed when defining our own data types such as Person
we didn’t define any behavior for values of such types, they just “are”, pure data. However under some circumstances it may be desirable to associate some behavior to values of some type, as an example let’s define both a circle and a rectangle:
data Circle = Circle { radius :: Double }
data Rectangle = Rectangle
{ base :: Double
, height :: Double
}
Here we see a different syntax for data
definitions, we can give names to the components of the data type, we only need to surround the declarations within braces { .. }
, other than that it should look pretty straight forward, Circle
has a radius
of type Double
, whereas Rectangle
as a base
and a height
, both of type Double
, these are called attribute fields or just fields.
Moving on, this is just data without any behavior associated to it, but we know they have something in common, both are shapes and we can compute the area
on them, we can use type classes to express this behavior, this is how a type class is defined:
class Shape a where
area :: a -> Double
What this means is that we want to introduce some behavior Shape
for any value of type a
, values for whom this behavior is defined we expect to be able to call the area
function on them to get their area computed. At this point we have behavior in the type class and data in our data types, but they’re not linked, we do that by instantiating the type class on the data types we want:
instance Shape Circle where
area (Circle r) = pi * r ^ 2
instance Shape Rectangle where
area rectangle = height rectangle * base rectangle
We have to implement the behaviors dictated by the Shape
type class, that is area
in this case. Now, this may look like interfaces from OOP but with a subtle difference, the type knows nothing about the behavior associated to it, nowhere in the definition of Circle
or Rectangle
we mentioned they had to had this behavior, these are totally decoupled from each other and thus a behavior can be introduced at a later point or in a different place altogether.
Also notice, we accessed the inner data for Rectangle
in a different way, for Circle
we used pattern matching, whereas for Rectangle
we used the same attribute fields we used when defined the Rectangle
data type, this means that attribute fields are just functions that work on a value of the same type they’re defined on, that is height :: Rectangle -> Double
and base :: Rectangle -> Double
.
Let’s close this section by creating a function that just tells weather a shape is big or small:
isBigShape :: a -> Bool
isBigShape shape = area shape > 100
This will return true
if the area of the given shape is greater than 100, however we’re missing something, the type a
we receive could be anything, we know nothing about it, maybe is a number, maybe is a String
!, we want isBigShape
to work on member of the Shape
type class, and for that we specify a constraint on the function type:
isBigShape :: Shape a => a -> Bool
isBigShape shape = area shape > 100
Now it’s all good, isBigShape
is a function defined for any a, given that a
is a member of the Shape
type class:
isBigShape (Circle 20) -- Outputs: True
isBigShape (Rectangle { height = 5, base = 10 }) -- Outputs: False
isBigShape "Hola" -- Can not do it! 'String' type is not member of 'Shape'!
This last function, also let’s see something else about Haskell, we can have type variables in function signatures, so a function may work with values of any type, and when they do, it means they’re just working in some higher level structure, for example, recall our friends $
and .
we used before? this is how they’re defined:
($) :: (a -> b) -> a -> b
($) f x = f x
(.) :: (b -> c) -> (a -> b) -> a -> c
(.) f g x = f (g x)
Because the ability to call a function on a value does not depend on the type of the value it means that $
will have to work with values of any type, like wise, composition function should be able to compose functions no matter their types, so these are polymorphic functions because they can work on values of any type, not just Int
or just String
, but any type.
Conclusion
Maybe this post was overwhelming, or maybe not, but you can’t deny it is long.
Here I wanted to provide a general overview about the basics of Haskell syntax and semantics for already software developers using other mainstream languages. However, you can see I did not write about how to get started with Haskell, how to compile a program, what should we install, where to search documentation and packages, or in generally how the ecosystem is set up. We will see that later because this is already too loooong (many o’s there), and this was just a touch of Haskell.