Jump to content
Main menu
Main menu
move to sidebar
hide
Navigation
Haskell
Wiki community
Recent changes
Random page
HaskellWiki
Search
Search
Create account
Log in
Personal tools
Create account
Log in
Pages for logged out editors
learn more
Contributions
Talk
Editing
Haskellへのヒッチハイカーガイド
(section)
Page
Discussion
English
Read
Edit
View history
Tools
Tools
move to sidebar
hide
Actions
Read
Edit
View history
General
What links here
Related changes
Special pages
Page information
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
== 第3章:ナップサックを荷造りしてclassも使ってテストしよう(タオルも忘れないように!) == Enough preliminaries already. let's go pack some CDs. As you might already have recognized, our problem is a classical one. It is called a "knapsack problem" ([http://www.google.com/search?q=knapsack+problem google it up], if you don't know already what it is. There are more than 100000 links). let's start from the greedy solution, but first let's slightly modify our "Dir" datatype to allow easy extraction of its components: <haskell> -- Taken from 'cd-fit-3-1.hs' data Dir = Dir {dir_size::Int, dir_name::String} deriving Show </haskell> ---- '''Exercise:''' examine types of "Dir", "dir_size" and "dir_name" ---- From now on, we could use "dir_size d" to get a size of directory, and "dir_name d" to get its name, provided that "d" is of type "Dir". The Greedy algorithm sorts directories from the biggest down, and tries to put them on CD one by one, until there is no room for more. We will need to track which directories we added to CD, so let's add another datatype, and code this simple packing algorithm: <haskell> -- Taken from 'cd-fit-3-1.hs' import Data.List (sortBy) -- DirPack holds a set of directories which are to be stored on single CD. -- 'pack_size' could be calculated, but we will store it separately to reduce -- amount of calculation data DirPack = DirPack {pack_size::Int, dirs::[Dir]} deriving Show -- For simplicity, let's assume that we deal with standard 700 Mb CDs for now media_size = 700*1024*1024 -- Greedy packer tries to add directories one by one to initially empty 'DirPack' greedy_pack dirs = foldl maybe_add_dir (DirPack 0 []) $ sortBy cmpSize dirs where cmpSize d1 d2 = compare (dir_size d1) (dir_size d2) -- Helper function, which only adds directory "d" to the pack "p" when new -- total size does not exceed media_size maybe_add_dir p d = let new_size = pack_size p + dir_size d new_dirs = d:(dirs p) in if new_size > media_size then p else DirPack new_size new_dirs </haskell> ---- I'll highlight the areas which you could explore on your own (using other nice tutorials out there, of which I especially recommend "Yet Another Haskell Tutorial" by Hal Daume): * We choose to import a single function "sortBy" from a module [[Data.List]], not the whole thing. * Instead of coding case-by-case recursive definition of "greedy_pack", we go with higher-order approach, choosing "foldl" as a vehicle for list traversal. Examine its type. Other useful function from the same category are "map", "foldr", "scanl" and "scanr". Look them up! * To sort list of "Dir" by size only, we use custom sort function and parametrized sort - "sortBy". This sort of setup where the user may provide a custom "modifier" for a generic library function is quite common: look up "deleteBy", "deleteFirstsBy", "groupBy", "insertBy", "intersectBy", "maximumBy", "minimumBy", "sortBy", "unionBy". * To code the quite complex function "maybe_add_dir", we introduced several '''local definitions''' in the "let" clause, which we can reuse within the function body. We used a "where" clause in the "greedy_pack" function to achieve the same effect. Read about "let" and "where" clauses and the differences between them. * Note that in order to construct a new value of type "DirPack" (in function "maybe_add_dir") we haven't used the helper accessor functions "pack_size" and "dirs" ---- In order to actually use our greedy packer we must call it from our "main" function, so let's add a lines: <haskell> -- Taken from 'cd-fit-3-1.hs' main = do ... -- compute solution and print it putStrLn "Solution:" ; print (greedy_pack dirs) </haskell> Verify integrity of our definitions by (re)loading our code in ghci. Compiles? Thought so :) Now, do "darcs record" and add some sensible commit message. Now it is time to test our creation. We could do it by actually running it in the wild like this: $ du -sb ~/DOWNLOADS/* | runhaskell ./cd-fit.hs This will prove that our code seems to be working. At least, this once. How about establishing with reasonable degree of certainty that our code, parts and the whole, works properly, and doing so in re-usable manner? In other words, how about writing some test? Java programmers used to JUnit probably thought about screens of boiler-plate code and hand-coded method invocations. Never fear, we will not do anything as silly :) Enter '''[[QuickCheck]]'''. [[QuickCheck]] is a tool to do automated testing of your functions using (semi)random input data. In the spirit of "100b of code examples is worth 1kb of praise" let's show the code for testing the following ''property'': An attempt to pack directories returned by "greedy_pack" should return "DirPack" of exactly the same pack: <haskell> -- Taken from 'cd-fit-3-2.hs' import Test.QuickCheck import Control.Monad (liftM2) -- We must teach QuickCheck how to generate arbitrary "Dir"s instance Arbitrary Dir where -- Let's just skip "coarbitrary" for now, ok? -- I promise, we will get back to it later :) coarbitrary = undefined -- We generate arbitrary "Dir" by generating random size and random name -- and stuffing them inside "Dir" arbitrary = liftM2 Dir gen_size gen_name -- Generate random size between 10 and 1400 Mb where gen_size = do s <- choose (10,1400) return (s*1024*1024) -- Generate random name 1 to 300 chars long, consisting of symbols "fubar/" gen_name = do n <- choose (1,300) sequence $ take n $ repeat (elements "fubar/") -- For convenience and by tradition, all QuickCheck tests begin with prefix "prop_". -- Assume that "ds" will be a random list of "Dir"s and code your test. prop_greedy_pack_is_fixpoint ds = let pack = greedy_pack ds in pack_size pack == pack_size (greedy_pack (dirs pack)) </haskell> let's run the test, after which I'll explain how it all works: Prelude> :r Compiling Main ( ./cd-fit.hs, interpreted ) Ok, modules loaded: Main. *Main> quickCheck prop_greedy_pack_is_fixpoint [numbers spinning] OK, passed 100 tests. *Main> We've just seen our "greedy_pack" run on a 100 completely (well, almost completely) random lists of "Dir"s, and it seems that property indeed holds. let's dissect the code. The most intriguing part is "instance Arbitrary Dir where", which declares that "Dir" is an '''[[instance]]''' of '''type[[class]]''' "Arbitrary". Whoa, that's a whole lot of unknown words! :) Let's slow down a bit. What is a '''type[[class]]'''? A typeclass is a Haskell way of dealing with the following situation: suppose that you are writing a library of useful functions and you don't know in advance how exactly they will be used, so you want to make them generic. Now, on one hand you don't want to restrict your users to certain type (e.g. String). On the other hand, you want to enforce the convention that arguments for your function must satisfy a certain set of constraints. That is where '''typeclass''' comes in handy. Think of typeclass as a '''contract''' (or "interface", in Java terms) that your type must fulfill in order to be admitted as an argument to certain functions. Let's examine the typeclass "Arbitrary": *Main> :i Arbitrary class Arbitrary a where arbitrary :: Gen a coarbitrary :: a -> Gen b -> Gen b -- Imported from Test.QuickCheck instance Arbitrary Dir -- Defined at ./cd-fit.hs:61:0 instance Arbitrary Bool -- Imported from Test.QuickCheck instance Arbitrary Double -- Imported from Test.QuickCheck instance Arbitrary Float -- Imported from Test.QuickCheck instance Arbitrary Int -- Imported from Test.QuickCheck instance Arbitrary Integer -- Imported from Test.QuickCheck -- rest skipped -- It could be read this way: "Any [[type]] (let's name it 'a') could be a member of the [[class]] Arbitrary as soon as we define two functions for it: "arbitrary" and "coarbitrary", with signatures shown. For types Dir, Bool, Double, Float, Int and Integer such definitions were provided, so all those types are instance of class Arbitrary". Now, if you write a function which operates on its arguments solely by means of "arbitrary" and "coarbitrary", you can be sure that this function will work on any type which is an instance of "Arbitrary"! let's say it again. Someone (maybe even you) writes the code (API or library), which requires that input values implement certain ''interfaces'', which is described in terms of functions. Once you show how your type implements this ''interface'' you are free to use API or library. Consider the function "sort" from standard library: *Main> :t Data.List.sort Data.List.sort :: (Ord a) => [a] -> [a] We see that it sorts lists of any values which are instance of typeclass "Ord". Let's examine that class: *Main> :i Ord class Eq a => Ord a where compare :: a -> a -> Ordering (<) :: a -> a -> Bool (>=) :: a -> a -> Bool (>) :: a -> a -> Bool (<=) :: a -> a -> Bool max :: a -> a -> a min :: a -> a -> a -- skip instance Ord Double -- Imported from GHC.Float instance Ord Float -- Imported from GHC.Float instance Ord Bool -- Imported from GHC.Base instance Ord Char -- Imported from GHC.Base instance Ord Integer -- Imported from GHC.Num instance Ord Int -- Imported from GHC.Base -- skip *Main> We see a couple of interesting things: first, there is an additional requirement listed: in order to be an instance of "Ord", type must first be an instance of typeclass "Eq". Then, we see that there is an awful lot of functions to define in order to be an instance of "Ord". Wait a second, isn't it silly to define both (<) and (>) when one could be expressed via another? Right you are! Usually, typeclass contains several "default" implementation for its functions, when it is possible to express them through each other (as it is with "Ord"). In this case it is possible to supply only a minimal definition (which in case of "Ord" consists of any single function) and others will be automatically derived. If you supplied fewer functions than are required for minimal implementation, the compiler/interpreter will say so and explain which functions you still have to define. Once again, we see that a lot of [[type]]s are already instances of typeclass Ord, and thus we are able to sort them. Now, let's take a look back to the definition of "Dir": <haskell> -- Taken from 'cd-fit-3-2.hs' data Dir = Dir {dir_size::Int, dir_name::String} deriving Show </haskell> See that "[[deriving]]" clause? It instructs the compiler to automatically derive code to make "Dir" an instance of typeclass Show. The compiler knows about a bunch of standard typeclasses (Eq, Ord, Show, Enum, Bound, Typeable to name a few) and knows how to make a type into a "suitably good" instance of any of them. If you want to derive instances of more than one typeclass, say it this way: "deriving (Eq,Ord,Show)". Voila! Now we can compare, sort and print data of that type! Side note for Java programmers: just imagine java compiler which derives code for "implements Storable" for you... Side note for C++ programmers: just imagine that deep copy constructors are being written for you by compiler.... ---- '''Exercises:''' * Examine typeclasses Eq and Show * Examine types of (==) and "print" * Try to make "Dir" instance of "Eq" ---- OK, back to our tests. So, what we have had to do in order to make "Dir" an instance of "Arbitrary"? Minimal definition consists of "arbitrary". Let's examine it up close: *Main> :t arbitrary arbitrary :: (Arbitrary a) => Gen a See that "Gen a"? Reminds you of something? Right! Think of "IO a" and "Parser a" which we've seen already. This is yet another example of action-returning function, which could be used inside "do"-notation. (You might ask yourself, wouldn't it be useful to generalize that convenient concept of actions and "do"? Of course! It is already done, the concept is called "[[Monad]]" and we will talk about it in Chapter 400 :) ) Since 'a' here is a [[type variable]] which is an instance of "Arbitrary", we could substitute "Dir" here. So, how we can make and return an action of type "Gen Dir"? Let's look at the code: <haskell> -- Taken from 'cd-fit-3-2.hs' arbitrary = liftM2 Dir gen_size gen_name -- Generate random size between 10 and 1400 Mb where gen_size = do s <- choose (10,1400) return (s*1024*1024) -- Generate random name 1 to 300 chars long, consisting of symbols "fubar/" gen_name = do n <- choose (1,300) sequence $ take n $ repeat (elements "fubar/") </haskell> We have used the library-provided functions "choose" and "elements" to build up "gen_size :: Gen Int" and "gen_name :: Gen String" (exercise: don't take my word on that. Find a way to check types of "gen_name" and "gen_size"). Since "Int" and "String" are components of "Dir", we sure must be able to use "Gen Int" and "Gen String" to build "Gen Dir". But where is the "do" block for that? There is none, and there is only single call to "liftM2". Let's examine it: *Main> :t liftM2 liftM2 :: (Monad m) => (a1 -> a2 -> r) -> m a1 -> m a2 -> m r Kind of scary, right? Let's provide typechecker with more context: *Main> :t liftM2 Dir liftM2 Dir :: (Monad m) => m Int -> m String -> m Dir Since you already heard that "Gen" is a "Monad", you could substitute "Gen" for "m" here, obtaining "liftM2 Dir :: (Monad Gen) => Gen Int -> Gen String -> Gen Dir". Exactly what we wanted! Consider "liftM2" to be "advanced topic" of this chapter (which we will cover later) and just note for now that: * "2" is a number of arguments for data constructor "Dir" and we have used "liftM2" to construct "Gen Dir" out of "Dir" * There are also "liftM", "liftM3", "liftM4", "liftM5" * "liftM2" is defined as "liftM2 f a1 a2 = do x<-a1; y<-a2; return (f x y)" Hopefully, this will all make sense after you read it for the third time ;) Oh, by the way - don't forget to "darcs record" your changes!
Summary:
Please note that all contributions to HaskellWiki are considered to be released under simple permissive license (see
HaskellWiki:Copyrights
for details). If you don't want your writing to be edited mercilessly and redistributed at will, then don't submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource.
DO NOT SUBMIT COPYRIGHTED WORK WITHOUT PERMISSION!
Cancel
Editing help
(opens in new window)
Toggle limited content width