(outline, TEST, INTEST, and some intro material)
(more discussion of I/O)
Revision as of 00:07, 21 June 2007
SPOJ is Sphere Online Judge, a automated programming contest website with lots of problems and solutions ranked.
1 Solution to TEST
TEST description: the input consists of one number per line, the
program should echo each number until
42 is reached, at
which point the program should exit.
main = mapM_ putStrLn . takeWhile (/="42") . lines =<< getContents
2 Solution to INTEST
INTEST description: the first line of input contains two numbers: n and k. The input then consists of n lines of numbers. The output of the program should be the count of the numbers which are divisible by k.
Prior to the installation of GHC 6.6.1, it was quite difficult, if not impossible, to pass this demonstration. This solution shows a simple, reasonably efficient manner of using the new ByteString library to achieve acceptable times.
import Data.List (unfoldr) import qualified Data.ByteString.Char8 as SS divisibleBy :: Int -> Int -> Bool a `divisibleBy` n = a `rem` n == 0 readInt1 :: SS.ByteString -> Maybe (Int, SS.ByteString) readInt1 cs = do (n, cs') <- SS.readInt cs return (n, SS.tail cs') main = do cs <- SS.getContents -- This is the only line of I/O let n:k:ns = unfoldr readInt1 cs count = length $ filter (`divisibleBy` k) ns print count
3 Techniques for dealing with problems efficiently
This section accumulates some earned wisdom about writing Haskell programs which overcome some of the technical obstacles in SPOJ. These are not spoilers, hopefully, but useful tips in general about auxiliary issues.
SPOJ has finally installed the GHC version 6.6.1. This makes the new, alternative I/O module available: Data.ByteString. Hopefully, it will be possible to find solutions for many more problems -- which could not be solved efficiently in the past because of mundane issues like I/O.
3.1.1 Brief Details
ByteStrings come in two varieties: Strict (default), and Lazy. Strict ByteStrings, in brief, are implemented by a foreign pointer to a block of memory which is filled by low level operations. The ByteString data type points to this memory region and also contains two Ints which track relative position and length. This means that many operations on ByteStrings can re-use memory and merely manipulate the two Ints.
Data.ByteString.Lazy is a layer on top which provides the ByteString API, but now the data is only read in chunks of Strict ByteStrings when necessary.
Don Stewart and others have put a great deal of excellent work into the library to ensure that the high-level interface will be optimized into efficient code. However, it may behoove you to inspect the source code for yourself, which is quite accessible and available in the GHC repository under libraries/base/Data/.
For SPOJ purposes, the most common operation is reading some kind of list of Ints. The code for INTEST above demonstrates one simple way to take advantage of ByteString. Using a similar technique may suffice for many problems. However, there are more advanced possibilities which could offer improvements in some cases.
The module is normally imported qualified because it shares many names
with standard Prelude functions. I use the prefix
my examples, for Strict byteString.
3.1.2 Span and Break
break functions available
from ByteString can be used to parse the occasional non-numeric value
in input. For example, the following code skips a line:
import qualified Data.ByteString.Char8 as SS -- ... skipLine cs = SS.span (== '\n') cs' where (_, cs') = SS.break (== '\n') cs
The nice part about
break is that
when it is found in the form
span (== c) for all
characters c, it is optimized down to a simple
3.1.3 Lazy ByteStrings
In general, I find that the Strict ByteStrings are more than adequate,
and sometimes more complicated usage does not result in better
performance. There are cases where Lazy ByteStrings do seem handle
better. You can use Lazy ByteStrings as a drop-in replacement for
Strict, mostly. But that does not generally confer any advantage.
The best part about Lazy ByteStrings is the function called
toChunks. This function provides an interface to the
lower-level portions which actually operate by reading chunks of data
into Strict ByteStrings acting as buffers. The chunks are kept in a
lazy list, so each chunk is only read on-demand. However, you are
getting the data as efficiently as the library deems possible.
3.3 Garbage Collection