Difference between revisions of "Seq"
Benmachine (talk | contribs) |
m |
||
(11 intermediate revisions by 3 users not shown) | |||
Line 1: | Line 1: | ||
<span>{{DISPLAYTITLE:seq}}</span> |
<span>{{DISPLAYTITLE:seq}}</span> |
||
+ | While its name might suggest otherwise, the <code>seq</code> function's purpose is to introduce ''strictness'' to a Haskell program. As indicated by its type signature: |
||
− | The <tt>seq</tt> function is the most basic method of introducing strictness to a Haskell program. <tt>seq :: a -> b -> b</tt> takes two arguments of any type, and returns the second. However, it also has the important property that it is magically strict in its first argument. In essence, <tt>seq</tt> is defined by the following two equations: |
||
<haskell> |
<haskell> |
||
− | + | seq :: a -> b -> b |
|
− | a `seq` b = b |
||
</haskell> |
</haskell> |
||
+ | it takes two arguments of any type, and returns the second. However, it also has the important property that it is always strict in its first argument. In essence, <code>seq</code> is defined by the following two equations: |
||
− | See [[Bottom]] for an explanation of the ⊥ symbol. |
||
+ | <haskell> |
||
− | A common misconception regarding <tt>seq</tt> is that <tt>seq x</tt> "evaluates" <tt>x</tt>. Well, sort of. <tt>seq</tt> doesn't evaluate anything just by virtue of existing in the source file, all it does is introduce an artificial data dependency of one value on another: when the result of <tt>seq</tt> is evaluated, the first argument must also be evaluated. As an example, suppose <tt>x :: Integer</tt>, then <tt>seq x b</tt> behaves essentially like <tt>if x == 0 then b else b</tt> – unconditionally equal to <tt>b</tt>, but forcing <tt>x</tt> along the way. In particular, the expression <tt>x `seq` x</tt> is completely redundant, and always has exactly the same effect as just writing <tt>x</tt>. |
||
+ | ⊥ `seq` b = ⊥ |
||
+ | a `seq` b | a ≠ ⊥ = b |
||
+ | </haskell> |
||
+ | |||
+ | (See [[Bottom]] for an explanation of the <code>⊥</code> symbol.) |
||
+ | __NOTOC__ |
||
+ | === History === |
||
− | Strictly speaking, the two equations of <tt>seq</tt> are all it must satisfy, and if the compiler can statically prove that the first argument is not ⊥, it doesn't have to evaluate it to meet its obligations. In practice, this almost never happens, and would probably be considered highly counterintuitive behaviour on the part of GHC (or whatever else you use to run your code). However, it ''is'' the case that evaluating <tt>b</tt> and ''then'' <tt>a</tt>, then returning <tt>b</tt> is a perfectly legitimate thing to do; it is to prevent this ambiguity that <tt>pseq</tt> was invented, but that's another story. |
||
+ | The need for a primitive sequencing definition has been known since at least 1996: |
||
− | === Common uses of <tt>seq</tt> === |
||
+ | <blockquote> |
||
− | <tt>seq</tt> is typically used in the semantic interpretation of other strictness techniques, like strictness annotations in data types, or GHC's <tt>BangPatterns</tt> extension. For example, the meaning of this: |
||
+ | The <code>seq</code> combinator implements sequential composition. When the expression <code>e1 ‘seq‘ e2</code> is evaluated, <code>e1</code> |
||
+ | is evaluated to weak head normal form first, and then the value of <code>e2</code> is returned. In the following parallel <code>nfib</code> |
||
+ | function, <code>seq</code> is used to force the evaluation of <code>n2</code> before the addition takes place. This is because Haskell does not |
||
+ | specify which operand is evaluated first, and if <code>n1</code> was evaluated before <code>n2</code>, there would be no parallelism. |
||
+ | |||
+ | <haskell> |
||
+ | nfib :: Int -> Int |
||
+ | nfib n | n <= 1 = 1 |
||
+ | | otherwise = n1 ‘par‘ (n2 ‘seq‘ n1 + n2 + 1) |
||
+ | where |
||
+ | n1 = nfib (n-1) |
||
+ | n2 = nfib (n-2) |
||
+ | </haskell> |
||
+ | |||
+ | :<small>[https://web.archive.org/web/20040228202402/http://www.dcs.gla.ac.uk/fp/workshops/fpw96/Trinder.pdf Accidents always Come in Threes: A Case Study of Data-intensive Programs in Parallel Haskell] (page 2 of 14).</small> |
||
+ | </blockquote> |
||
+ | |||
+ | ...the same year <code>seq</code> was introduced in Haskell 1.3 as a method of the (now-abandonded) <code>Eval</code> type class: |
||
+ | |||
+ | <haskell> |
||
+ | class Eval a where |
||
+ | strict :: (a -> b) -> a -> b |
||
+ | seq :: a -> b -> b |
||
+ | |||
+ | strict f x = x ‘seq‘ f x |
||
+ | </haskell> |
||
+ | |||
+ | However, despite that need by the time Haskell 98 was released <code>seq</code> had been reduced to a primitive strictness definition. But in 2009, all doubts about the need for a primitive sequencing definition were vanquished: |
||
+ | |||
+ | <blockquote> |
||
+ | <b>2.1 The need for</b><code>pseq</code> |
||
+ | |||
+ | The <code>pseq</code> combinator is used for sequencing; informally, it |
||
+ | evaluates its first argument to weak-head normal form, and then evaluates |
||
+ | its second argument, returning the value of its second argument. |
||
+ | Consider this definition of <code>parMap</code>: |
||
+ | |||
+ | <haskell> |
||
+ | parMap f [] = [] |
||
+ | parMap f (x:xs) = y ‘par‘ (ys ‘pseq‘ y:ys) |
||
+ | where y = f x |
||
+ | ys = parMap f xs |
||
+ | </haskell> |
||
+ | |||
+ | The intention here is to spark the evaluation of <code>f x</code>, and then |
||
+ | evaluate <code>parMap f xs</code>, before returning the new list <code>y:ys</code>. The |
||
+ | programmer is hoping to express an <i>ordering</i> of the evaluation: <i>first</i> |
||
+ | spark <code>y</code>, <i>then</i> evaluate <code>ys</code>. |
||
+ | |||
+ | :<small>[https://www.cs.tufts.edu/~nr/cs257/archive/simon-peyton-jones/multicore-ghc.pdf Runtime Support for Multicore Haskell] (page 2 of 12).</small> |
||
+ | </blockquote> |
||
+ | |||
+ | Alas, this confirmation failed to influence Haskell 2010 - to this day, <code>seq</code> remains just a primitive strictness definition. So for enhanced confusion the only Haskell implementation still in widespread use now provides both <code>seq</code> and <code>pseq</code>. |
||
+ | |||
+ | === Demystifying <code>seq</code> === |
||
+ | |||
+ | A common misconception regarding <code>seq</code> is that <code>seq x</code> "evaluates" <code>x</code>. Well, sort of. <code>seq</code> doesn't evaluate anything just by virtue of existing in the source file, all it does is introduce an artificial data dependency of one value on another: when the result of <code>seq</code> is evaluated, the first argument must also (sort of; see below) be evaluated. As an example, suppose <code>x :: Integer</code>, then <code>seq x b</code> behaves essentially like <code>if x == 0 then b else b</code> – unconditionally equal to <tt>b</tt>, but forcing <code>x</code> along the way. In particular, the expression <code>x `seq` x</code> is completely redundant, and always has exactly the same effect as just writing <code>x</code>. |
||
+ | |||
+ | Strictly speaking, the two equations of <code>seq</code> are all it must satisfy, and if the compiler can statically prove that the first argument is not ⊥, or that its second argument ''is'', it doesn't have to evaluate anything to meet its obligations. In practice, this almost never happens, and would probably be considered highly counterintuitive behaviour on the part of GHC (or whatever else you use to run your code). So for example, in <code>seq a b</code> it is perfectly legitimate for <code>seq</code> to: |
||
+ | |||
+ | :1. evaluate <code>b</code> - its ''second'' argument, |
||
+ | |||
+ | :2. before evaluating <code>a</code> - its first argument, |
||
+ | |||
+ | :3. then returning <code>b</code>. |
||
+ | |||
+ | In this larger example: |
||
+ | |||
+ | <haskell> |
||
+ | let x = ... in |
||
+ | let y = sum [0..47] in |
||
+ | x `seq` 3 + y + y^2 |
||
+ | </haskell> |
||
+ | |||
+ | <code>seq</code> immediately evaluating its second argument (<code>3 + y + y^2</code>) avoids having to allocate space to store <code>y</code>: |
||
+ | |||
+ | <haskell> |
||
+ | let x = ... in |
||
+ | case sum [0..47] of |
||
+ | y -> x `seq` 3 + y + y^2 |
||
+ | </haskell> |
||
+ | |||
+ | However, sometimes this ambiguity is undesirable, hence the need for <code>pseq</code>. |
||
+ | |||
+ | === Common uses of <code>seq</code> === |
||
+ | |||
+ | <code>seq</code> is typically used in the semantic interpretation of other strictness techniques, like strictness annotations in data types, or GHC's <tt>BangPatterns</tt> extension. For example, the meaning of this: |
||
<haskell> |
<haskell> |
||
Line 26: | Line 118: | ||
<haskell> |
<haskell> |
||
f x y | x `seq` y `seq` False = undefined |
f x y | x `seq` y `seq` False = undefined |
||
− | | otherwise = z |
+ | | otherwise = z |
</haskell> |
</haskell> |
||
although that literal translation may not actually take place. |
although that literal translation may not actually take place. |
||
− | < |
+ | <code>seq</code> is frequently used with accumulating parameters to ensure that they don't become huge thunks, which will be forced at the end anyway. For example, strict <code>foldl</code>: |
<haskell> |
<haskell> |
||
Line 50: | Line 142: | ||
=== Controversy! === |
=== Controversy! === |
||
− | Note that < |
+ | Note that <code>seq</code> is the ''only'' way to force evaluation of a value with a function type (except by applying it, which is liable to cause other problems). As such, it is the only reason why Haskell programs are able to distinguish between the following two values: |
<haskell> |
<haskell> |
||
Line 57: | Line 149: | ||
</haskell> |
</haskell> |
||
− | This violates the principle from lambda calculus of extensionality of functions, or eta-conversion, because < |
+ | This violates the principle from lambda calculus of extensionality of functions, or eta-conversion, because <code>f</code> and <code>\x -> f x</code> are distinct functions, even though they return the same output for ''every'' input. For this reason, <code>seq</code>, and this distinction, is sometimes ignored e.g. when assessing the correctness of [[Correctness of short cut fusion|optimisation techniques]] or type class instances. |
+ | |||
+ | == See also == |
||
+ | |||
+ | * [http://stackoverflow.com/questions/12687392/why-is-seq-bad Why is seq bad?] |
||
+ | |||
+ | [[Category:Glossary]] |
Latest revision as of 07:44, 5 September 2024
While its name might suggest otherwise, the seq
function's purpose is to introduce strictness to a Haskell program. As indicated by its type signature:
seq :: a -> b -> b
it takes two arguments of any type, and returns the second. However, it also has the important property that it is always strict in its first argument. In essence, seq
is defined by the following two equations:
⊥ `seq` b = ⊥
a `seq` b | a ≠ ⊥ = b
(See Bottom for an explanation of the ⊥
symbol.)
History
The need for a primitive sequencing definition has been known since at least 1996:
The
seq
combinator implements sequential composition. When the expressione1 ‘seq‘ e2
is evaluated,e1
is evaluated to weak head normal form first, and then the value ofe2
is returned. In the following parallelnfib
function,seq
is used to force the evaluation ofn2
before the addition takes place. This is because Haskell does not specify which operand is evaluated first, and ifn1
was evaluated beforen2
, there would be no parallelism. nfib :: Int -> Int nfib n | n <= 1 = 1 | otherwise = n1 ‘par‘ (n2 ‘seq‘ n1 + n2 + 1) where n1 = nfib (n-1) n2 = nfib (n-2)
...the same year seq
was introduced in Haskell 1.3 as a method of the (now-abandonded) Eval
type class:
class Eval a where
strict :: (a -> b) -> a -> b
seq :: a -> b -> b
strict f x = x ‘seq‘ f x
However, despite that need by the time Haskell 98 was released seq
had been reduced to a primitive strictness definition. But in 2009, all doubts about the need for a primitive sequencing definition were vanquished:
2.1 The need for
pseq
The
pseq
combinator is used for sequencing; informally, it evaluates its first argument to weak-head normal form, and then evaluates its second argument, returning the value of its second argument. Consider this definition ofparMap
: parMap f [] = [] parMap f (x:xs) = y ‘par‘ (ys ‘pseq‘ y:ys) where y = f x ys = parMap f xsThe intention here is to spark the evaluation of
f x
, and then evaluateparMap f xs
, before returning the new listy:ys
. The programmer is hoping to express an ordering of the evaluation: first sparky
, then evaluateys
.
- Runtime Support for Multicore Haskell (page 2 of 12).
Alas, this confirmation failed to influence Haskell 2010 - to this day, seq
remains just a primitive strictness definition. So for enhanced confusion the only Haskell implementation still in widespread use now provides both seq
and pseq
.
Demystifying seq
A common misconception regarding seq
is that seq x
"evaluates" x
. Well, sort of. seq
doesn't evaluate anything just by virtue of existing in the source file, all it does is introduce an artificial data dependency of one value on another: when the result of seq
is evaluated, the first argument must also (sort of; see below) be evaluated. As an example, suppose x :: Integer
, then seq x b
behaves essentially like if x == 0 then b else b
– unconditionally equal to b, but forcing x
along the way. In particular, the expression x `seq` x
is completely redundant, and always has exactly the same effect as just writing x
.
Strictly speaking, the two equations of seq
are all it must satisfy, and if the compiler can statically prove that the first argument is not ⊥, or that its second argument is, it doesn't have to evaluate anything to meet its obligations. In practice, this almost never happens, and would probably be considered highly counterintuitive behaviour on the part of GHC (or whatever else you use to run your code). So for example, in seq a b
it is perfectly legitimate for seq
to:
- 1. evaluate
b
- its second argument,
- 2. before evaluating
a
- its first argument,
- 3. then returning
b
.
In this larger example:
let x = ... in
let y = sum [0..47] in
x `seq` 3 + y + y^2
seq
immediately evaluating its second argument (3 + y + y^2
) avoids having to allocate space to store y
:
let x = ... in
case sum [0..47] of
y -> x `seq` 3 + y + y^2
However, sometimes this ambiguity is undesirable, hence the need for pseq
.
Common uses of seq
seq
is typically used in the semantic interpretation of other strictness techniques, like strictness annotations in data types, or GHC's BangPatterns extension. For example, the meaning of this:
f !x !y = z
is this:
f x y | x `seq` y `seq` False = undefined
| otherwise = z
although that literal translation may not actually take place.
seq
is frequently used with accumulating parameters to ensure that they don't become huge thunks, which will be forced at the end anyway. For example, strict foldl
:
foldl' :: (a -> b -> a) -> a -> [b] -> a
foldl' _ z [] = z
foldl' f z (x:xs) = let z' = f z x in z' `seq` foldl' f z' xs
It's also used to define strict application:
($!) :: (a -> b) -> a -> b
f $! x = x `seq` f x
which is useful for some of the same reasons.
Controversy!
Note that seq
is the only way to force evaluation of a value with a function type (except by applying it, which is liable to cause other problems). As such, it is the only reason why Haskell programs are able to distinguish between the following two values:
undefined :: a -> b
const undefined :: a -> b
This violates the principle from lambda calculus of extensionality of functions, or eta-conversion, because f
and \x -> f x
are distinct functions, even though they return the same output for every input. For this reason, seq
, and this distinction, is sometimes ignored e.g. when assessing the correctness of optimisation techniques or type class instances.