Opting for oracles

== Oracles, defined ==

An ''oracle'' is a value that can be viewed as having the ability to predict what another value will be. The predicted value is usually the result of a computation involving information which cannot be represented as regular values in the computation. The oracle, by seeming to contain the prediction, preserves the [[Referential transparency|referential transparency]] of the language being used to define the computation.

This makes the use of oracles highly relevant to languages intended to preserve referential transparency, such as Haskell.
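
For a purely illustrative sketch (none of these names are from an actual library), picture an oracle which predicts which of two external events will occur first. The oracle is handed to the program as an ordinary value; because it already "contains" the prediction, consulting it is an ordinary, referentially transparent function call - the same oracle value always yields the same answer:

<haskell>
-- Hypothetical sketch: in practice the constructor would be kept private
-- and oracle values would be supplied to the program from outside it.
newtype Oracle = Oracle Bool   -- True: the first event is predicted to occur first

-- Consulting the oracle: for a given oracle value, equal arguments
-- always produce equal results.
firstToOccur :: Oracle -> a -> a -> a
firstToOccur (Oracle firstWins) x y = if firstWins then x else y
</haskell>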


== Oracles in use ==

The early use of oracles in programming languages focussed on coping with the <span class="plainlinks">[https://wiki.haskell.org/Category:Nondeterminism nondeterminism]</span> arising from the occurrence of events outside the language - one classic example being ''concurrency''. For a language to support it, it must provide some means of allowing multiple computations to each progress at their own rate, depending on the current availability of suitable resources, which can be either:

* ''inside'' the language e.g. definitions which the computations share, or
* ''outside'' of it e.g. hardware devices for storage, audiovisual or networking.

Clearly the language cannot predict e.g. when users of a computer system will be active, so concurrency is generally nondeterministic. For information on how oracles have helped to support various forms of concurrency, see [[Concurrency with oracles]].

(Of course, the use of oracles goes beyond programming languages e.g. Jennifer Hackett and Graham Hutton use them to alleviate some of the tedium associated with the classic state-centric semantics used to examine the operational behaviour of lazy programs - see [https://www.cs.nott.ac.uk/~pszgmh/clairvoyant.pdf Call-by-Need Is Clairvoyant Call-by-Value].)

== An adapted example ==

Based on page 42 of 76 in [http://www.people.cs.uchicago.edu/~soare/Turing/shagrir.pdf Turing-Post Relativized Computability and Interactive Computing]:

<blockquote>
<b>The Limit Computable or Approximation Model</b>

There exists a sequence of Turing programs <math>\{P_t : t \in T\}</math> so that <math>P_t</math> computes function <math>g_t</math> at time <math>t \in T</math>. There is not necessarily any connection between different programs and computing may have to start all over again with a new program as the time changes from <math>t</math> to <math>t+1</math>.

Suppose a meteorologist receives data every second <math>t \in T</math> from weather stations scattered across the country. The configuration at the meteorologist's desk may be described using the Shoenfield Limit Lemma by a computable function where <math>g_t</math> is the computable characteristic function of <math>B_t</math>, the configuration of the meteorological computation at the end of time <math>t</math>. The computable function <math>g_t</math> gives an algorithm to compute the condition <math>B_t</math> at time <math>t</math> but it gives no relationship between <math>B_t</math> and <math>B_{t+1}</math>. It will not be possible for the meteorologist to run, let alone write a new program every second. How will the meteorologist write a program to uniformly compute the index <math>g_t</math> for <math>t \in T</math>?

<b>The Online Model With an Oracle Machine</b>

By the Shoenfield Limit Lemma there is a computably enumerable set <math>A</math> (or even a <math>\Delta_2^0</math> set) and oracle machine <math>\Phi_e</math> such that <math>B = \Phi_e^A</math>. Now the meteorologist can program the algorithm <math>\Phi_e</math> into a computer once and for all at the start of the day. Every second <math>t \in T</math> the meteorologist receives from the weather stations the latest readings <math>A_t</math> which enter directly into that computer by a network connection. The meteorologist does not (and cannot) change the program <math>\Phi_e</math> every second. The algorithm simply receives the “oracle” information <math>A_t</math> from the weather-station network as it is continually updated, and computes the approximation <math>B_t(x) = \Phi_e^{A_t}(x)</math>. The meteorologist's program then produces the next scheduled weather forecast for the day from the algorithm's result. It is difficult to see how this meteorologist could have carried out that activity using a batch processing, automatic machine model, instead of an online model.
</blockquote>

...the [[oracle]] is a function of time <math>t \in T</math> with result
<math>A_t</math> that provides input to the meteorologist's program.
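
A small Haskell rendering of that online model may help make the shape of the idea concrete. It is only a sketch - the type names, the <code>Readings</code> and <code>Forecast</code> stand-ins, and the trivial summary function are assumptions made here for illustration, not part of the quoted text:

<haskell>
type Time     = Integer
type Readings = [Double]   -- the latest data A_t from the stations
type Forecast = String     -- whatever the program is meant to produce

-- The oracle: a function of time supplying the readings A_t.
type WeatherOracle = Time -> Readings

-- The fixed algorithm, written once at the start of the day: it consults
-- the oracle for the current readings and computes the approximation B_t.
forecastAt :: WeatherOracle -> Time -> Forecast
forecastAt readingsAt t =
    case readingsAt t of
      [] -> "no readings yet"
      rs -> "mean reading: " ++ show (sum rs / fromIntegral (length rs))

-- One forecast for every second of the day, without ever rewriting the program.
forecasts :: WeatherOracle -> [Forecast]
forecasts oracle = map (forecastAt oracle) [0 ..]
</haskell>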


=== The problem of choice ===

But for the program's output - the forecasts - an element of choice is often involved:

* the choice of time to send the forecasts to media outlets broadcasting to listeners or viewers - forecasts may need to be sent more frequently to regions experiencing more severe weather conditions.
* the choice of device - the meteorologist may need a hard copy of the latest forecast (or some part thereof) for research purposes.

Choices can also affect the program itself - the meteorologist may for some reason decide to intervene while the program is running (perhaps restarting it to provide more accurate forecasts after software upgrades).

In contrast, output is a simple matter for a function - if there is any, it all goes to its caller! Therefore interactive choice requires something more than functions alone; it usually appears in the form of an abstract type:

* the various auxiliary types used in dialogues[https://hackage.haskell.org/package/dialogue <span></span>] - text and binary files, channels, <i>etc</i>.
* the type of the final result returned by a nested sequence of CPS calls[https://dl.acm.org/doi/pdf/10.1145/165180.165199 <span></span>];
* the type of the state passed through all the calls which require it[https://www.cambridge.org/core/services/aop-cambridge-core/content/view/2EFAEBBE3A19EA03A8D6D75A5348E194/S0956796800001258a.pdf/the-ins-and-outs-of-clean-io.pdf <span></span>];
* the monadic, comonadic or arrow types of online actions[https://web.archive.org/web/20051212060309/http://www.cse.ogi.edu/PacSoft/publications/phaseiiiq10papers/codata.pdf <span></span>][https://hackage.haskell.org/package/io-machine <span></span>];

Another option is to use a structured value of unbounded size:

* lazy lists, for incremental streaming[https://arxiv.org/pdf/1907.11354.pdf <span></span>];
* lazy trees, for providing subtrees to various parts of the program[https://dspace.stir.ac.uk/bitstream/1893/12116/1/Harrison%20-%20thesis.pdf <span></span>].

== From oracles to ''pseudodata'' ==

In his paper [https://academic.oup.com/comjnl/article-pdf/31/3/243/1157325/310243.pdf Nondeterminism with Referential Transparency in Functional Programming Languages], F. Warren Burton illustrates how oracles can be repurposed to make use of other outside information - starting with <tt>decisions</tt> for supporting nondeterministic choice:

<haskell>
-- section 2
data Tree a = Node { contents :: a,
                     left     :: Tree a,
                     right    :: Tree a }

-- section 3
data Decision  -- abstract, builtin
choice :: Decision -> a -> a -> a
</haskell>

(the details of which can be found in [[Burton-style nondeterminism]].)

Burton then shows how the concept of oracles can be extended to access the current time, using the example of <tt>timestamps</tt>:

<haskell>
-- section 7
data Timestamp  -- abstract, possibly builtin
stamp :: !Timestamp -> Timestamp
compare :: Timestamp -> Timestamp -> Integer -> Bool
</haskell>

all while preserving referential transparency. He also hints at how the current size of available storage could be made available in the same way - see his paper for more details.

== A simpler interface ==

Since its advent, an alternate interface for working with Burton's ''pseudodata'' has appeared (sometimes as a result of being inspired by it, or by similar entities):

* In Simon Peyton Jones's book [[Books|The implementation of functional programming languages]] (section 9.6 on page 188 of 458), Peter Hancock provides a simple interface for generating new type variables (of type <tt>tvname</tt>) for a type checker, using the type <tt>name_supply</tt>:

<tt>
::|| page 188 of 458
::next_name :: name_supply -> tvname
::deplete :: name_supply -> name_supply
::split :: name_supply -> (name_supply, name_supply)
</tt>

* In his paper [https://www.iro.umontreal.ca/~lecuyer/myftp/papers/cacm88.pdf Efficient and Portable Combined Random Number Generators], Pierre L'Ecuyer suggests the ''disjoint'' splitting of random numbers into independent subsequences as needed. Burton and Rex L. Page follow this up in [https://www.cs.ou.edu/~rlpage/burtonpagerngjfp.pdf Distributed Random Number Generation] - from page 9 of 10:

:<haskell>
data Random = Seed Int

seed     :: Int -> Random
split    :: Random -> (Random, Random)
generate :: Random -> [Float]
</haskell>


:(An assessment of the applicability of Burton's technique for simplifying the provision of random numbers is also given).
 
* Lennart Augustsson, Mikael Rittri and Dan Synek use Burton's technique to reimplement Hancock's <tt>name_supply</tt> in their functional pearl [[Research papers/Functional pearls|On generating unique names]], making it practical for regular use. An implementation can be found in Simon Peyton Jones and Jon Launchbury's [https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.52.3656&rep=rep1&type=pdf State in Haskell] - using more-contemporary syntax:
 
:<haskell>
-- imports needed for the definitions below
import Data.IORef (newIORef, atomicModifyIORef)
import System.IO.Unsafe (unsafeInterleaveIO)
import Control.Monad (liftM3)

-- page 39
newUniqueSupply   :: IO UniqueSupply
splitUniqueSupply :: UniqueSupply -> (UniqueSupply, UniqueSupply)
getUnique         :: UniqueSupply -> Unique

data UniqueSupply = US Unique UniqueSupply UniqueSupply

-- page 40
type Unique = Int

newUniqueSupply =
  do uvar <- newIORef 0
     let incr :: Int -> (Int, Unique)
         incr u = (u+1, u)

         next :: IO Unique
         next = unsafeInterleaveIO (atomicModifyIORef uvar incr)

         supply :: IO UniqueSupply
         supply = unsafeInterleaveIO (liftM3 US next supply supply)

     supply

splitUniqueSupply (US _ s1 s2) = (s1, s2)
getUnique (US u _ _) = u
</haskell>
 
:The crucial point here is that <code>US</code> - the single data constructor for <code>UniqueSupply</code> - can now be kept private. The use of trees has been reduced to an implementation detail, invisible to users of <code>UniqueSupply</code> (and <code>Unique</code>) values.
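
:As a usage sketch (purely illustrative - the <code>BTree</code> type and <code>label</code> function below are not from the paper), a supply can be split at each node of a structure so that independent parts of a program obtain their uniques without any visible shared state:

:<haskell>
-- Hypothetical example: label every leaf of a tree with a unique value,
-- splitting the supply at each fork so the two halves never interfere.
data BTree a = Leaf a | Fork (BTree a) (BTree a)

label :: UniqueSupply -> BTree a -> BTree (a, Unique)
label us (Leaf x)   = Leaf (x, getUnique us)
label us (Fork l r) = Fork (label usL l) (label usR r)
  where (usL, usR) = splitUniqueSupply us
</haskell>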
 
== A simpler implementation ==
 
Augustsson, Rittri and Synek provide other possible implementations of Hancock's supply in their paper - of particular interest is the ''monousal'' one: to preserve referential transparency, each <code>UniqueSupply</code> should only be used ''once'' (if at all). This makes their conceptual implementation quite compact, if a little peculiar - reusing parts of Launchbury and Peyton Jones's example:
 
<tt>
:abstype uniquesupply
:with
::new_uniquesupply  ::  uniquesupply
::split_uniquesupply ::  uniquesupply -> (uniquesupply, uniquesupply)
::get_unique        ::  uniquesupply -> unique
 
:uniquesupply ::= US
 
:new_uniquesupply = US
:split_uniquesupply US = (US, US)
:get_unique s = gensym(s)
 
:unique == int
 
:|| Not a regular definition!
:gensym :: * -> unique
</tt>
 
It should be obvious that an actual implementation in ''regular'' Haskell isn't possible - non-standard, possibly implementation-specific extensions are required. To provide an implementation here would be more distracting than useful.
 
== A matter of nomenclature ==
 
As mentioned earlier, L'Ecuyer suggests the splitting of random numbers be ''disjoint''. However, as this can complicate matters, many RNGs (including Burton and Page's) opt for simpler forms of splitting where the duplication of random numbers is statistically unlikely. It only requires a little imagination to realise the consequences of using a similarly-relaxed approach for supplying fresh identifiers!
 
To avoid having to repeatedly spell out that requirement, an alternate terminology is needed - one which clearly indicates that for some types of pseudodata, the ''disjointness'' of their splitting is '''mandatory''', instead of just being very convenient.
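
To make the distinction concrete, here is one necessary (though not sufficient) consequence of disjoint splitting, written against the <code>UniqueSupply</code> interface shown earlier - an illustrative property, not something taken from any of the cited papers:

<haskell>
-- Disjoint splitting: no unique obtainable from one half of a split may
-- ever be obtained from the other half.  At the very least the two roots
-- must differ - something that splitting which merely makes overlap
-- "statistically unlikely", as tolerated for random numbers, cannot guarantee.
distinctRoots :: UniqueSupply -> Bool
distinctRoots us = getUnique s1 /= getUnique s2
  where (s1, s2) = splitUniqueSupply us
</haskell>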
 
[[Category:Code]]
[[Category:Mathematics]]
[[Category:Nondeterminism]]
