TSM - Functional Programming in Haskell (III)

Mihai Maruseac - IxNovation

In the last article we offered a short review of Haskell types, both the existing ones and the ones which can be defined by the programmer. We continue to uncover features of this language in this article by giving a short insight into another type system characteristic: the special way in which polymorphism is implemented in Haskell: typeclasses.

Though the notions of class and polymorphism bring into mind the object oriented programming, remember that Haskell is different. By the end of this article you would know that the similarities with OOP are very few and squinted. Here, there are totally different concepts having the same names.

Let"s start with a simple function (+). Last time we used it as having type Integer -> Integer -> Integer but it is self-evident that in this way we could not add two real numbers together. Changing type to a -> b -> c gets us into a situation with 2 problems: we cannot construct type c from types a and b (if it were a constant type we could have chosen a null constructor and use a constant function - not a correct implementation for (+) anyway) and we can also have invalid formulas like True + "ana has apples". Restricting the signature to a -> a -> a in order to solve the above 2 problems will still allow us to write the semantically invalid True + False. We need a way to restrict the types of values of type variable a to the numeric ones.

Let"s consider another example: converting values to strings. In Java we can use toString from class Object by overriding it in each of our classes. However, unlike the majority of languages, in Haskell functions are first-class values. Taking into account that we cannot convert a function to a string (excluding the lexical obvious example of "function at line 42") we arrive at the real problem: we need a function of type a -> String where values of type variable a are restricted to those types for which the conversion has a valid semantics.

The same is true for the reverse case of a function of type String -> a to convert a string to another type.

We can also look at functions (==) :: a -> a -> Bool, (<) :: a -> a -> Bool, etc. In each of these cases, we need a supplementary restriction on the types of variables in the signature. These restrictions apply to the type, not to the values of the type.

Finally, let"s consider a more contrived example: in a way, it is useful to define functions like map:: (a -> b) -> [a] -> [b] but for other container-like data types (trees, stacks, graphs, etc.). We"d rather generalize the function instead of defining different functions for different containers. That is, instead of having mapTree :: (a -> b) -> Tree a -> Tree b and mapStack :: (a -> b) -> Stack a -> Stack b we would prefer to have fmap :: (a -> b) -> f a -> f b where f is an unary type constructor (needs a single proper type). Of course, we need to restrict f to those type constructors representing a container of values.

In the above fmap signature f cannot be any proper type (for example, it cannot be Bool). Practically, we need a type system for the types of Haskell. This is the kinds system, each type has a kind. The proper types (Bool, Integer, functions) have kind *, while type constructors have kinds along the lines of * -> *, * -> * -> *, etc. In GHCi, one can find the kind of a type expression by using :k.

Going back to type variables restrictions, we consider the way they are implemented in OOP languages: to force a class to implement some methods we use inheritance, especially interfaces. An interface in OOP world is just a contract: if this class implements that interface, then this class has usable definitions for all methods of that interface.

Exactly the same idea resides behind the Haskell"s typeclasses. A typeclass is only a specification for the functions which are associated with a type enrolled in this typeclass. The definitions for these functions can be written by the programmer or generated by the compiler. The second case is when the programmer uses the deriving clause when defining a new type.

So, the equivalent for a typeclass is the OOP interface, not the OOP class. That is as long as we neglect the fact that an OOP interface doesn"t allow implicit definitions for methods or automatically deriving of method"s implementations.

Let"s see now how the typeclasses are used to solve the problems with which we started this article. For the numeric types we make use of the Num typeclass (along with a typeclass hierarchy starting from it: Real, Rational, Integral, etc):

 
class Num a where
(+), (-), (*) :: a -> a -> a
negate :: a -> a
abs :: a -> a
signum :: a -> a
fromInteger :: Integer -> a
x - y = x + negate y
negate x = 0 - x

It is evident that (-) and negate are defined one in terms of the other. Thus, for enrolling a type in Num typeclass we only need to define all methods but one of the 2.

To convert to strings we have the Show typeclass which contains the function show :: a -> String. Similarly, for the inverse conversion we have a typeclass named Read with read :: String -> a. For the equality tests we have the Eq typeclass from which we need to implement one of (==) and (/=). Finally, to have an order relation between two elements we need to make use of a typeclass named Ord

class Eq a => Ord a where
compare :: a -> a -> Ordering
(<), (<=), (>), (>=) :: a -> a -> Bool
max, min :: a -> a -> a

For enrolling a type into this typeclass we only need to implement compare. It is possible to also define the 4 inequality tests. Observe that there is one more restriction into play: a type cannot be enrolled in Ord without being a member of Eq as well. Otherwise, the operators (<=) and (>=) would lack sense.

Finally, the fmap function can be applied to all types enrolled into the Functor typeclass. The name comes from a concept in Category Theory and the class has more power than it is visible at the moment, power which will be unleashed in a future article.

class Functor f where
fmap :: (a -> b) -> f a -> f b

Now it is time to see how we can enroll a type to a typeclass without using the deriving clause. Here is an example for doing this for lists and typeclass Functor:

instance Functor [] where
fmap = map

Let"s continue the example from the last article: we have 3 data tables with information about persons. Last time we defined 3 different search functions but we want to define a single one and let the compiler specialize it to the proper one. We will need several GHC extensions activated by using the LANGUAGE directive.

{-# LANGUAGE MultiParamTypeClasses #-} 
{-# LANGUAGE FunctionalDependencies #-} 
{-# LANGUAGE TypeSynonymInstances #-} 
{-# LANGUAGE FlexibleInstances #-}

The types from the last article:

type Name = String
type Age = Int
type Address = String
type PhoneNumber = Integer
newtype NameAgeTable = NAgT [(Name, Age)] deriving Show
newtype NameAddressTable = NAdT [(Name, Address)] deriving Show
newtype NamePhoneTable = NPT [(Name, PhoneNumber)] deriving Show
 

We still use deriving Show to get a textual representation of the tables.

Let"s construct some values for the 3 types:

nameAge = NAgT [("Ana", 24), ("Gabriela", 21), ("Mihai", 25), ("Radu", 24)]
nameAddress = NAdT [("Mihai", "a random address"), ("Ion", "another address")]
namePhone = NPT [("Ana", 2472788), ("Mihai", 24828542)]
 

Now it is time to define a typeclass for searching by name in one of these tables:

class SearchableByName t a | t -> a where 
search :: Name -> t -> Maybe a 

Because we need 2 type variables in the signature of function search we need two type variables in defining the typeclass: t for the table in which we search and a for the type of the returned values. Since the table defines the type of the returned values we use fundeps (between | and where). Because of this we needed to activate GHC"s extensions above.

Now, let"s enroll our types to our typeclass:

instance SearchableByName NameAgeTable Age where 
search name (NAgT l) = lookup name l 

instance SearchableByName NameAddressTable Address where 
search name (NAdT l) = lookup name l 

instance SearchableByName NamePhoneTable PhoneNumber where 
search name (NPT l) = lookup name l 

Searching is simple now:

*Main> search "Ion" nameAge 
Nothing
*Main> search "Mihai" nameAge 
Just 25
*Main> search "Mihai" nameAddress 
Just "a random address"
*Main> search "Gabriela" nameAddress 
Nothing
*Main> search "Gabriela" namePhone 
Nothing
*Main> search "Mihai" namePhone 
Just 24828542

The type of the datatable in which we search completely specifies the type of the result. We use a single function for searching in several tables and the compiles specializes it to the proper implementation.

Next time we will extend this code to implement join operations.