Difference between revisions of "DataDriven"
m (Filled in Sources & Ephemeral) 

Line 1:  Line 1:  
[[Category:Applicative]] 
[[Category:Applicative]] 

+  [[Category:Events]] 

[[Category:Libraries]] 
[[Category:Libraries]] 

[[Category:Packages]] 
[[Category:Packages]] 

Line 5:  Line 6:  
== Abstract == 
== Abstract == 

⚫  '''DataDriven''' is a library for functional events and timevarying values. The ideas and interface 

+  ''Warning'': The Haddock docs are not ready yet. I'm trying to get a working haddock 2.0 running (on my windows machine). 

⚫  '''DataDriven''' is a library for functional events and timevarying values ("sources"). The ideas and interface come mainly from [[Research_papers/Functional_reactive_programmingfunctional reactive programming]] (FRP). Most FRP implementations I'm aware of have a demanddriven implementation, while the implementation of DataDriven is datadriven (surprise). This library is a resurrection of some ideas from an old, incomplete Fran reimplementation that also became the basis of Meurig Sage's [http://www.dcs.gla.ac.uk/~meurig/research/icfp.pdf FranTk]. This time around, I've been particularly interested in using standard classes as much as possible, most centrally [http://www.haskell.org/ghc/docs/latest/html/libraries/base/ControlApplicative.html <hask>Applicative</hask>] and [http://www.haskell.org/ghc/docs/latest/html/libraries/base/DataMonoid.html <hask>Monoid</hask>]. 

+  
+  Besides this wiki page, here are more ways to find out about DataDriven. 

* Read [http://darcs.haskell.org/packages/DataDriven/doc/html the Haddock docs] (with source code, additional examples, and Comment/Talk links). 
* Read [http://darcs.haskell.org/packages/DataDriven/doc/html the Haddock docs] (with source code, additional examples, and Comment/Talk links). 

* Get the code repository: '''<tt>darcs get http://darcs.haskell.org/packages/DataDriven</tt>''', or 
* Get the code repository: '''<tt>darcs get http://darcs.haskell.org/packages/DataDriven</tt>''', or 

* Grab a [http://darcs.haskell.org/packages/DataDriven/dist distribution tarball]. 
* Grab a [http://darcs.haskell.org/packages/DataDriven/dist distribution tarball]. 

* See the [[DataDriven/Versions version history]]. 
* See the [[DataDriven/Versions version history]]. 

+  * See the use of events and sources in [[Phooey]] and [[Eros]]. 

+  
+  Please leave comments at the [[Talk:DataDrivenTalk page]]. 

== Events == 
== Events == 

⚫  The 

+  === Background === 

+  
⚫  The heart of the library is a notion of functional, composable ''events'', with a datadriven implementation. Most of the ideas and vocabulary are borrowed from Fran, when Fran's events came to mean multiple occurrences (see [http://conal.net/papers/ppdp00 Declarative EventOriented Programming], rather than the initial ICFP '97 publication). As in Fran, you can think of an event as a stream of "occurrences", each of which has a time and a value. The implementation, however, is radically different from Fran's, being datadriven rather than demanddriven. And in some cases, the functions are not pure. There are also several eventrelated functions, to create timevarying values. 

+  
+  === A first look at the interface === 

Some of the useful event operations come through standard classes. 
Some of the useful event operations come through standard classes. 

−  * <hask>Functor</hask>: <hask>fmap f e</hask> is the event that occurs whenever <hask>e</hask> occurs, but whose occurrence values come from applying <hask>f</hask> to the values from <hask>e</hask>. (Fran's <hask>(==>)</hask>.) 
+  * '''<hask>Functor</hask>''': <hask>fmap f e</hask> is the event that occurs whenever <hask>e</hask> occurs, but whose occurrence values come from applying <hask>f</hask> to the values from <hask>e</hask>. (Fran's <hask>(==>)</hask>.) 
−  * <hask>Monoid</hask>: <hask>mempty</hask> is the event that never occurs, and <hask>e `mappend` e'</hask> is the event that combines occurrences from <hask>e</hask> and <hask>e'</hask>. (Fran's <hask>neverE</hask> and <hask>(..)</hask>.) 
+  * '''<hask>Monoid</hask>''': <hask>mempty</hask> is the event that never occurs, and <hask>e `mappend` e'</hask> is the event that combines occurrences from <hask>e</hask> and <hask>e'</hask>. (Fran's <hask>neverE</hask> and <hask>(..)</hask>.) 
−  * <hask>Monad</hask>: <hask>return a</hask> is an event with a single occurrence. This one doesn't quite fit the original semantics, as the occurrence is delivered immediately on "listening" to an event (discussed later). In <hask>e >>= f</hask>, each occurrence of <hask>e</hask> leads, through <hask>f</hask> to a new event. Similarly for <hask>join</hask>, which is somehow simpler for me to think about. The occurrences of <hask>e >>= f</hask> correspond to the union of the occurrences of all such events. For example, suppose we're playing Asteroids and tracking collisions. Each collision can break an asteroid into more of them, each of which has to be tracked for more collisions. Another example: A chat room has an "enter" event, whose occurrences contain new events like "speak". 
+  * '''<hask>Monad</hask>''': <hask>return a</hask> is an event with a single occurrence. This one doesn't quite fit the original semantics, as the occurrence is delivered immediately on "listening" to an event (discussed later). In <hask>e >>= f</hask>, each occurrence of <hask>e</hask> leads, through <hask>f</hask>, to a new event. Similarly for <hask>join ee</hask>, which is somehow simpler for me to think about. The occurrences of <hask>e >>= f</hask> (or <hask>join ee</hask>) correspond to the union of the occurrences of all such events. For example, suppose we're playing Asteroids and tracking collisions. Each collision can break an asteroid into more of them, each of which has to be tracked for more collisions. Another example: A chat room has an "enter" event, whose occurrences contain new events like "speak". 
As a simple example, the following function transforms and combines two events: 
As a simple example, the following function transforms and combines two events: 

Line 29:  Line 40:  
</haskell> 
</haskell> 

⚫  
+  === Events as continuations === 

+  
⚫  
<haskell> 
<haskell> 

newtype Cont o a = Cont { runCont :: (a > o) > o } 
newtype Cont o a = Cont { runCont :: (a > o) > o } 

</haskell> 
</haskell> 

−  The <hask>Functor</hask> and <hask>Monad</hask> instances come from <hask>Cont</hask>. The <hask>Monoid</hask> instance for <hask>Cont</hask> is missing (as of 20070908), so it is defined in this module (and thus is an "orphan"). The more specialized event type is simply 
+  The <hask>Functor</hask> and <hask>Monad</hask> instances come from <hask>Cont</hask>. The <hask>Monoid</hask> instance for <hask>Cont</hask> is missing (as of 20070908), so it is defined in this module (and thus is an "orphan") simply by <hask>deriving</hask>. The more specialized event type is simply 
<haskell> 
<haskell> 

−  type Event = Cont 
+  type Event = Cont (IO ()) 
−  </haskell> 

−  where, to save typing later, 

−  <haskell> 

−  type Action = IO () 

</haskell> 
</haskell> 

−  Why does it make sense to think of continuationbased computations as events? Because an event is something that one can subscribe to. Subscription provides a "listener" (a continuation) to be invoked on every occurrence of the event. 
+  Why does it make sense to think of continuationbased computations as events? Because an event is something that one can subscribe to. Subscription provides a "listener" (a continuation) to be invoked on every occurrence of the event. If the occurrence value has type <hask>a</hask>, and the result of the listener and of registration has type <hask>o</hask>, then subscribing has type <hask>(a > o) > o</hask>, which is the type wrapped by <hask>Cont</hask>. 
+  
+  The <hask>Monoid</hask>, <hask>Functor</hask>, and <hask>Monad</hask> operations are simple. Given a listener <hask>l :: a > o</hask>, 

* Subscribing <hask>l</hask> to <hask>mempty</hask> has no effect, since the <hask>mempty</hask> is guaranteed never to occur. 
* Subscribing <hask>l</hask> to <hask>mempty</hask> has no effect, since the <hask>mempty</hask> is guaranteed never to occur. 

* Subscribing <hask>l</hask> to <hask>ea `mappend` eb</hask> subscribes <hask>l</hask> to each of <hask>ea</hask> and <hask>eb</hask>. 
* Subscribing <hask>l</hask> to <hask>ea `mappend` eb</hask> subscribes <hask>l</hask> to each of <hask>ea</hask> and <hask>eb</hask>. 

Line 48:  Line 61:  
== Sources == 
== Sources == 

⚫  
+  ''Sources'' are timevarying values, akin to Fran's "behaviors". They are built up mainly from constant values and application (via the <hask>Applicative</hask> interface), as well as reaction to events. 

+  
+  === Composing Sources === 

+  
+  Like events, sources have a more general, and surprisingly simple, form: 

+  <haskell> 

+  type SourceG change m = ((,) change) `O` m 

+  </haskell> 

+  The <hask>change</hask> type parameter provides a description of everything that can affect a source (cause it to change). The <hask>m</hask> parameter is a way to sample the value when changed. Here <hask>g `O` h</hask> means the composition of two type constructors, and is defined in [[TypeCompose]]. Without the fancy type constructors, 

+  \begin{code} 

+  type SourceG' change m a = (change, m a) 

+  \end{code} 

+  
+  One of the delightful properties of functors and of applicative functors is that they compose. That is, two functors compose to a functor and two AFs compose to form an AF. For any monoid <hask>o</hask>, <hask>((,) o)</hask> is an AF (corresponding to the writer monad). So, when <hask>change</hask> is a monoid and <hask>m</hask> is an AF, <hask>SourceG change m</hask> is an AF. 

+  
+  There are many possible monoid choices for <hask>change</hask>. One especially useful one is a continuation/event: 

+  <haskell> 

+  type SourceC o m = SourceG (Cont (m o) ()) m 

+  </haskell> 

+  Still more specifically 

+  <haskell> 

+  type Source = SourceC () IO 

+  </haskell> 

+  A source, then, is simply a change event together with a sampler <hask>IO</hask>. Given an AF application <hask>f <*> a</hask> for AFs <hask>f</hask> and <hask>a</hask>, the change event combines (<hask>mappend</hask>) change events for <hask>f</hask> and <hask>a</hask>, and sampling just applies a sampling of <hask>f</hask> to a sampling of <hask>a</hask>. 

+  
+  === Sources and events === 

+  
+  As an example of eventbased sources, the following function makes a source with an initial value and changing at every occurrence of an event. The resulting source remembers the event's most recent occurrence value. 

+  <haskell> 

+  mkStepper :: a > Event a > IO (Source a) 

+  </haskell> 

+  The result of <hask>mkStepper a e</hask> is an <hask>IO</hask> because it starts reacting to occurrences of <hask>e</hask> only after it is executed. The semantic difference is clearer with the following function, which ''accumulates'' event occurrence values: 

+  <haskell> 

+  mkAccumS :: a > Event (a > a) > IO (Source a) 

+  </haskell> 

+  
+  Sources are also used to make events. For instance, the <hask>snapshot</hask> function samples a source whenever an event occurs, and pairs the occurrence and source values. 

+  <haskell> 

+  snapshot :: Event a > Source b > Event (a,b) 

+  </haskell> 

+  
⚫  
+  
+  === GC favors demanddriven computation === 

+  
+  The purpose of garbage collection is to keep services alive as long as they are useful to clients and then free up the services' computational resources (effort and memory). Conventional garbage collection works very well for demanddriven (pullbased) computation, but not for datadriven (pushbased) computation. 

+  
+  Consider a piece of information supplied by a service and used by a client. In a demanddriven scenario, the client has a pointer to the service and uses that pointer to get more of the information. The client keeps the serice alive. When the client get GC'd, its pointer to the service goes away. If there are no more pointers to the service, then it will also get GC'd. Both the computational effort and the memory are freed up for other uses. GC did its job. 

+  
+  The situation is reversed for datadriven computation. Here, the service pushes information to the client, so the service has a pointer to the client. This pointer means that the service keeps the client alive and keeps computing even when the client is no longer of any use. GC fails to satisfy its purpose. 

+  
+  === Caching and weak pointers === 

+  
+  Various forms of ''caching'' have this same problem. Suppose we use a hash table to memoize and expensive function. Even though the hash table is in service to function's arguments, the table's key/value entries keep the key values (function arguments) from ever getting reclaimed. Ideally, the situation would be reversed: the key would keep its table entry alive, and when the key was GC'd, they entry would shortly follow. Unfortunately, the direction of pointers, from entry to key, means that the entries keep the values alive, wasting space and slowing down search for useful entries. 

+  
+  The classic solution to this problem is to use "weak" pointers, i.e., pointers that the GC disregards in its decision about whether to retain a previously allocated object. For more discussion of these issues, see the [http://www.haskell.org/ghc/docs/latest/html/libraries/base/SystemMemWeak.html <hask>System.Mem.Weak</hask> documentation] and "[http://citeseer.ist.psu.edu/peytonjones99stretching.html Stretching the storage manager]". (These ideas were a motivating application for the "Stretching" paper, but the description got cut for lack of space.) 

+  
+  === Ephemeral listener === 

+  
+  What does all this have to events and sources? Recall that an event is simply a means of registering a "listener" (continuation) to be invoked on every occurrence. The event must have some kind of reference to the listener in order to invoke it. It must not keep the listener alive, however, since the event is the service and the listener is the client. The solution is for events to hold their clients weakly, i.e., point to them via weak references. Once a listener's strong pointers all disappear, the GC nulls out ("tombstones") the event's weak pointer. The next time the event occurs, it finds that it no longer has a live pointer, so it stops notifying the listener. 

+  
+  One of the (currently four) functions in <hask>Data.Ephemeral</hask> converts a value into an ephemeral one: 

+  <haskell> 

+  ephemeral :: (WeakM m, Monoid o) => o > m o 

+  </haskell> 

+  <hask>WeakM</hask> refers to monads having weak pointers, currently just <hask>IO</hask>. The result of <hask>ephemeral o</hask> is an "ephemeral" monadic version <hask>mo'</hask>. Initially, <hask>mo'</hask> just returns <hask>o</hask>. Once <hask>o</hask> is GC'd, <hask>mo'</hask> instead returns <hask>mempty</hask>. The <hask>ephemeral</hask> function is a special case of the more general <hask>ephemeralWith</hask>, which drops the <hask>Monoid</hask> constraint and takes an explicit fallback value. 

+  
+  The implementation of DataDriven takes care of ephemerality automatically, so client code doesn't have to worry about it. The only sign of this issue is the <hask>WeakM</hask> monad constraint in the most general forms of <hask>Event</hask> and <hask>Source</hask> functions. In fact, in the implementation of DataDriven, only one primitive worries about ephemerality, and all of the others inherit the benefits. 
Revision as of 05:32, 10 September 2007
Contents
Abstract
Warning: The Haddock docs are not ready yet. I'm trying to get a working haddock 2.0 running (on my windows machine).
DataDriven is a library for functional events and timevarying values ("sources"). The ideas and interface come mainly from functional reactive programming (FRP). Most FRP implementations I'm aware of have a demanddriven implementation, while the implementation of DataDriven is datadriven (surprise). This library is a resurrection of some ideas from an old, incomplete Fran reimplementation that also became the basis of Meurig Sage's FranTk. This time around, I've been particularly interested in using standard classes as much as possible, most centrally Applicative
and Monoid
.
Besides this wiki page, here are more ways to find out about DataDriven.
 Read the Haddock docs (with source code, additional examples, and Comment/Talk links).
 Get the code repository: darcs get http://darcs.haskell.org/packages/DataDriven, or
 Grab a distribution tarball.
 See the version history.
 See the use of events and sources in Phooey and Eros.
Please leave comments at the Talk page.
Events
Background
The heart of the library is a notion of functional, composable events, with a datadriven implementation. Most of the ideas and vocabulary are borrowed from Fran, when Fran's events came to mean multiple occurrences (see Declarative EventOriented Programming, rather than the initial ICFP '97 publication). As in Fran, you can think of an event as a stream of "occurrences", each of which has a time and a value. The implementation, however, is radically different from Fran's, being datadriven rather than demanddriven. And in some cases, the functions are not pure. There are also several eventrelated functions, to create timevarying values.
A first look at the interface
Some of the useful event operations come through standard classes.

Functor
:fmap f e
is the event that occurs whenevere
occurs, but whose occurrence values come from applyingf
to the values frome
. (Fran's(==>)
.) 
Monoid
:mempty
is the event that never occurs, ande `mappend` e'
is the event that combines occurrences frome
ande'
. (Fran'sneverE
and(..)
.) 
Monad
:return a
is an event with a single occurrence. This one doesn't quite fit the original semantics, as the occurrence is delivered immediately on "listening" to an event (discussed later). Ine >>= f
, each occurrence ofe
leads, throughf
, to a new event. Similarly forjoin ee
, which is somehow simpler for me to think about. The occurrences ofe >>= f
(orjoin ee
) correspond to the union of the occurrences of all such events. For example, suppose we're playing Asteroids and tracking collisions. Each collision can break an asteroid into more of them, each of which has to be tracked for more collisions. Another example: A chat room has an "enter" event, whose occurrences contain new events like "speak".
As a simple example, the following function transforms and combines two events:
show2 :: (Show a, Show b) => Event a > Event b > Event String
show2 ea eb = showE ea `mappend` showE eb
where
showE e = fmap show e
Events as continuations
The Event
type is not actually a new type, but merely a specialization of the familiar type of continuationbased computations, Cont
:
newtype Cont o a = Cont { runCont :: (a > o) > o }
The Functor
and Monad
instances come from Cont
. The Monoid
instance for Cont
is missing (as of 20070908), so it is defined in this module (and thus is an "orphan") simply by deriving
. The more specialized event type is simply
type Event = Cont (IO ())
Why does it make sense to think of continuationbased computations as events? Because an event is something that one can subscribe to. Subscription provides a "listener" (a continuation) to be invoked on every occurrence of the event. If the occurrence value has type a
, and the result of the listener and of registration has type o
, then subscribing has type (a > o) > o
, which is the type wrapped by Cont
.
The Monoid
, Functor
, and Monad
operations are simple. Given a listener l :: a > o
,
 Subscribing
l
tomempty
has no effect, since themempty
is guaranteed never to occur.  Subscribing
l
toea `mappend` eb
subscribesl
to each ofea
andeb
.  Subscribing
l
tofmap f e
subscribesl . f
toe
.  Subscribing
l
tojoin e
subscribes toe
a listener that subscribes to every event generated bye
. (Similarly fore >>= f == join (fmap f e)
.)
The functions in the Event
module operate on this general notion of events (Cont
) or something more specialized. I expect the most common use to be the Event
(IO
) specialization, and the types are often much easier to read for that type. General functions are given general signatures, with the Event
specializations as comments.
Sources
Sources are timevarying values, akin to Fran's "behaviors". They are built up mainly from constant values and application (via the Applicative
interface), as well as reaction to events.
Composing Sources
Like events, sources have a more general, and surprisingly simple, form:
type SourceG change m = ((,) change) `O` m
The change
type parameter provides a description of everything that can affect a source (cause it to change). The m
parameter is a way to sample the value when changed. Here g `O` h
means the composition of two type constructors, and is defined in TypeCompose. Without the fancy type constructors,
\begin{code}
type SourceG' change m a = (change, m a)
\end{code}
One of the delightful properties of functors and of applicative functors is that they compose. That is, two functors compose to a functor and two AFs compose to form an AF. For any monoid o
, ((,) o)
is an AF (corresponding to the writer monad). So, when change
is a monoid and m
is an AF, SourceG change m
is an AF.
There are many possible monoid choices for change
. One especially useful one is a continuation/event:
type SourceC o m = SourceG (Cont (m o) ()) m
Still more specifically
type Source = SourceC () IO
A source, then, is simply a change event together with a sampler IO
. Given an AF application f <*> a
for AFs f
and a
, the change event combines (mappend
) change events for f
and a
, and sampling just applies a sampling of f
to a sampling of a
.
Sources and events
As an example of eventbased sources, the following function makes a source with an initial value and changing at every occurrence of an event. The resulting source remembers the event's most recent occurrence value.
mkStepper :: a > Event a > IO (Source a)
The result of mkStepper a e
is an IO
because it starts reacting to occurrences of e
only after it is executed. The semantic difference is clearer with the following function, which accumulates event occurrence values:
mkAccumS :: a > Event (a > a) > IO (Source a)
Sources are also used to make events. For instance, the snapshot
function samples a source whenever an event occurs, and pairs the occurrence and source values.
snapshot :: Event a > Source b > Event (a,b)
Ephemeral values
GC favors demanddriven computation
The purpose of garbage collection is to keep services alive as long as they are useful to clients and then free up the services' computational resources (effort and memory). Conventional garbage collection works very well for demanddriven (pullbased) computation, but not for datadriven (pushbased) computation.
Consider a piece of information supplied by a service and used by a client. In a demanddriven scenario, the client has a pointer to the service and uses that pointer to get more of the information. The client keeps the serice alive. When the client get GC'd, its pointer to the service goes away. If there are no more pointers to the service, then it will also get GC'd. Both the computational effort and the memory are freed up for other uses. GC did its job.
The situation is reversed for datadriven computation. Here, the service pushes information to the client, so the service has a pointer to the client. This pointer means that the service keeps the client alive and keeps computing even when the client is no longer of any use. GC fails to satisfy its purpose.
Caching and weak pointers
Various forms of caching have this same problem. Suppose we use a hash table to memoize and expensive function. Even though the hash table is in service to function's arguments, the table's key/value entries keep the key values (function arguments) from ever getting reclaimed. Ideally, the situation would be reversed: the key would keep its table entry alive, and when the key was GC'd, they entry would shortly follow. Unfortunately, the direction of pointers, from entry to key, means that the entries keep the values alive, wasting space and slowing down search for useful entries.
The classic solution to this problem is to use "weak" pointers, i.e., pointers that the GC disregards in its decision about whether to retain a previously allocated object. For more discussion of these issues, see the System.Mem.Weak
documentation and "Stretching the storage manager". (These ideas were a motivating application for the "Stretching" paper, but the description got cut for lack of space.)
Ephemeral listener
What does all this have to events and sources? Recall that an event is simply a means of registering a "listener" (continuation) to be invoked on every occurrence. The event must have some kind of reference to the listener in order to invoke it. It must not keep the listener alive, however, since the event is the service and the listener is the client. The solution is for events to hold their clients weakly, i.e., point to them via weak references. Once a listener's strong pointers all disappear, the GC nulls out ("tombstones") the event's weak pointer. The next time the event occurs, it finds that it no longer has a live pointer, so it stops notifying the listener.
One of the (currently four) functions in Data.Ephemeral
converts a value into an ephemeral one:
ephemeral :: (WeakM m, Monoid o) => o > m o
WeakM
refers to monads having weak pointers, currently just IO
. The result of ephemeral o
is an "ephemeral" monadic version mo'
. Initially, mo'
just returns o
. Once o
is GC'd, mo'
instead returns mempty
. The ephemeral
function is a special case of the more general ephemeralWith
, which drops the Monoid
constraint and takes an explicit fallback value.
The implementation of DataDriven takes care of ephemerality automatically, so client code doesn't have to worry about it. The only sign of this issue is the WeakM
monad constraint in the most general forms of Event
and Source
functions. In fact, in the implementation of DataDriven, only one primitive worries about ephemerality, and all of the others inherit the benefits.