Difference between revisions of "SPOJ"
(outline, TEST, INTEST, and some intro material) |
(more discussion of I/O) |
||
Line 1: | Line 1: | ||
⚫ | |||
− | There is a website that provides an online programming contest called |
||
+ | programming contest website with lots of problems and solutions |
||
⚫ | |||
+ | ranked. |
||
− | together a bunch of haskellers and wipe other languages off the map :) |
||
== Solution to TEST == |
== Solution to TEST == |
||
− | TEST description: the input consists of one number per line, the |
+ | TEST description: the input consists of one number per line, the |
− | should echo each number until |
+ | program should echo each number until <code>42</code> is reached, at |
− | should exit. |
+ | which point the program should exit. |
<haskell> |
<haskell> |
||
Line 56: | Line 56: | ||
possible to find solutions for many more problems -- which could not be solved |
possible to find solutions for many more problems -- which could not be solved |
||
efficiently in the past because of mundane issues like I/O. |
efficiently in the past because of mundane issues like I/O. |
||
+ | |||
+ | ==== Brief Details ==== |
||
ByteStrings come in two varieties: Strict (default), and Lazy. Strict |
ByteStrings come in two varieties: Strict (default), and Lazy. Strict |
||
Line 81: | Line 83: | ||
could offer improvements in some cases. |
could offer improvements in some cases. |
||
+ | The module is normally imported qualified because it shares many names |
||
− | todo |
||
+ | with standard Prelude functions. I use the prefix <code>SS</code> in |
||
+ | my examples, for Strict byteString. |
||
+ | |||
+ | ==== Span and Break ==== |
||
+ | |||
+ | The <code>span</code> and <code>break</code> functions available |
||
+ | from ByteString can be used to parse the occasional non-numeric value |
||
+ | in input. For example, the following code skips a line: |
||
+ | |||
+ | <haskell> |
||
+ | import qualified Data.ByteString.Char8 as SS |
||
+ | -- ... |
||
+ | skipLine cs = SS.span (== '\n') cs' |
||
+ | where (_, cs') = SS.break (== '\n') cs |
||
+ | </haskell> |
||
+ | |||
+ | The nice part about <code>span</code> and <code>break</code> is that |
||
+ | when it is found in the form <code>span (== c)</code> for all |
||
+ | characters c, it is optimized down to a simple <code>memchr</code>. |
||
+ | |||
+ | ==== Lazy ByteStrings ==== |
||
+ | |||
+ | In general, I find that the Strict ByteStrings are more than adequate, |
||
+ | and sometimes more complicated usage does not result in better |
||
+ | performance. There are cases where Lazy ByteStrings do seem handle |
||
+ | better. You can use Lazy ByteStrings as a drop-in replacement for |
||
+ | Strict, mostly. But that does not generally confer any advantage. |
||
+ | The best part about Lazy ByteStrings is the function called |
||
+ | <code>toChunks</code>. This function provides an interface to the |
||
+ | lower-level portions which actually operate by reading chunks of data |
||
+ | into Strict ByteStrings acting as buffers. The chunks are kept in a |
||
+ | lazy list, so each chunk is only read on-demand. However, you are |
||
+ | getting the data as efficiently as the library deems possible. |
||
+ | |||
+ | [http://www.haskell.org/pipermail/haskell-cafe/2007-June/026654.html Don Stewart Discusses Chunked ByteString Input] |
||
=== Arrays === |
=== Arrays === |
Revision as of 00:07, 21 June 2007
SPOJ is Sphere Online Judge, a automated programming contest website with lots of problems and solutions ranked.
Solution to TEST
TEST description: the input consists of one number per line, the
program should echo each number until 42
is reached, at
which point the program should exit.
main = mapM_ putStrLn . takeWhile (/="42") . lines =<< getContents
Solution to INTEST
INTEST description: the first line of input contains two numbers: n and k. The input then consists of n lines of numbers. The output of the program should be the count of the numbers which are divisible by k.
Prior to the installation of GHC 6.6.1, it was quite difficult, if not impossible, to pass this demonstration. This solution shows a simple, reasonably efficient manner of using the new ByteString library to achieve acceptable times.
import Data.List (unfoldr)
import qualified Data.ByteString.Char8 as SS
divisibleBy :: Int -> Int -> Bool
a `divisibleBy` n = a `rem` n == 0
readInt1 :: SS.ByteString -> Maybe (Int, SS.ByteString)
readInt1 cs = do
(n, cs') <- SS.readInt cs
return (n, SS.tail cs')
main = do
cs <- SS.getContents -- This is the only line of I/O
let n:k:ns = unfoldr readInt1 cs
count = length $ filter (`divisibleBy` k) ns
print count
Techniques for dealing with problems efficiently
This section accumulates some earned wisdom about writing Haskell programs which overcome some of the technical obstacles in SPOJ. These are not spoilers, hopefully, but useful tips in general about auxiliary issues.
I/O
SPOJ has finally installed the GHC version 6.6.1. This makes the new, alternative I/O module available: Data.ByteString. Hopefully, it will be possible to find solutions for many more problems -- which could not be solved efficiently in the past because of mundane issues like I/O.
Brief Details
ByteStrings come in two varieties: Strict (default), and Lazy. Strict ByteStrings, in brief, are implemented by a foreign pointer to a block of memory which is filled by low level operations. The ByteString data type points to this memory region and also contains two Ints which track relative position and length. This means that many operations on ByteStrings can re-use memory and merely manipulate the two Ints.
Data.ByteString.Lazy is a layer on top which provides the ByteString API, but now the data is only read in chunks of Strict ByteStrings when necessary.
Don Stewart and others have put a great deal of excellent work into the library to ensure that the high-level interface will be optimized into efficient code. However, it may behoove you to inspect the source code for yourself, which is quite accessible and available in the GHC repository under libraries/base/Data/.
For SPOJ purposes, the most common operation is reading some kind of list of Ints. The code for INTEST above demonstrates one simple way to take advantage of ByteString. Using a similar technique may suffice for many problems. However, there are more advanced possibilities which could offer improvements in some cases.
The module is normally imported qualified because it shares many names
with standard Prelude functions. I use the prefix SS
in
my examples, for Strict byteString.
Span and Break
The span
and break
functions available
from ByteString can be used to parse the occasional non-numeric value
in input. For example, the following code skips a line:
import qualified Data.ByteString.Char8 as SS
-- ...
skipLine cs = SS.span (== '\n') cs'
where (_, cs') = SS.break (== '\n') cs
The nice part about span
and break
is that
when it is found in the form span (== c)
for all
characters c, it is optimized down to a simple memchr
.
Lazy ByteStrings
In general, I find that the Strict ByteStrings are more than adequate,
and sometimes more complicated usage does not result in better
performance. There are cases where Lazy ByteStrings do seem handle
better. You can use Lazy ByteStrings as a drop-in replacement for
Strict, mostly. But that does not generally confer any advantage.
The best part about Lazy ByteStrings is the function called
toChunks
. This function provides an interface to the
lower-level portions which actually operate by reading chunks of data
into Strict ByteStrings acting as buffers. The chunks are kept in a
lazy list, so each chunk is only read on-demand. However, you are
getting the data as efficiently as the library deems possible.
Don Stewart Discusses Chunked ByteString Input
Arrays
todo
Garbage Collection
todo