Referential transparency: Difference between revisions

From HaskellWiki
(Various sections rewritten or reorganised)
mNo edit summary
 
(11 intermediate revisions by the same user not shown)
Line 1: Line 1:
<span>{{DISPLAYTITLE:referential transparency}}</span>
[[Category:Glossary]]
[[Category:Glossary]]
Referential transparency is an oft-touted property of (pure) functional languages, which makes it easier to reason about the behavior of programs. While there is no single formal definition[[#notes|[1]]], it usually means that an expression always evaluates to the same result in any context. Side effects like (uncontrolled) imperative update break this desirable property. C and ML are languages with constructs that are not referentially transparent.


As an example, consider the following program[[#notes|[2][3]]] in Standard ML:
<blockquote>
A key advantage to pure functional programming is the property of referential
transparency, first enunciated by Quine[[#notes|[1]]] and defined by him as follows:
 
:“I call a mode of containment <math>\phi</math> referentially transparent if, whenever an occurrence of a singular term <math>t</math> is purely referential in a term or sentence <math>\psi(t)</math>, it is purely referential also in the containing term or sentence <math>\phi(\psi(t))</math>”.
 
In formalizing this idea for programming languages, Søndergaard and Sestoft[[#notes|[2]]] conclude by saying informally,
 
:“[A]n operator is referentially transparent if it preserves applicability of Leibniz’s law, or substitutivity of identity: the principle that any subexpression can be replaced by any other equal in value.”
 
:<small>Derek Bronish[[#notes|[3][4]]].</small>
</blockquote>
 
...with the term first being introduced to the study of programming languages by Christopher Strachey[[#notes|[5]]]:
 
<blockquote>
One of the most useful properties of expressions is that called by Quine[[#notes|[1]]] <i>referential transparency</i>. In essence this means that if we wish to find the value of an expression which contains a sub-expression, the only thing we need to know about the sub-expression is its value. Any other features of the sub-expression, such as its internal structure, the number and nature of its components, the order in which they are evaluated or the colour of the ink in which they are written, are irrelevant to the value of the main expression.
</blockquote>
 
Side effects like (uncontrolled) imperative update break this desirable property. C and ML are languages with constructs that are not referentially transparent.
 
As an example, consider the following program[[#notes|[6][7]]] in ML:
<pre>
<pre>
puts "h"; puts "a"; puts "h"; puts "a"
puts "h"; puts "a"; puts "h"; puts "a"
</pre>
</pre>
which prints <code>"haha"</code>. In an attempt to factor out the repetition, we write
which prints <code>"haha"</code>. In an attempt to factor out the repetition, we write
Line 16: Line 37:
in  x (); x () end
in  x (); x () end
</pre>
</pre>
Haskell's monadic I/O system distinguishes between ''values'' and
Haskell's monadic[[#notes|[8]]] I/O system distinguishes between ''values'' and
''actions'' like the <code>puts</code> procedure above. So we do indeed have that
''actions'' like the <code>puts</code> procedure above. So we do indeed have that
<haskell>
<haskell>
Line 31: Line 52:
<span id="notes">Notes</span>:
<span id="notes">Notes</span>:


[1] Wolfram Kahl provides a [https://www.cas.mcmaster.ca/~kahl/reftrans.html USENET post by Tom DeBoni] containing a summary of various definitions for referential transparency.
[1] [https://archive.org/details/in.ernet.dli.2015.52908 Word and Object], by Willard Van Ormond Quine (page 163 of 314).
 
[2] [https://www.cs.tufts.edu/~nr/cs257/archive/peter-sestoft/ref-trans.pdf Referential transparency, definiteness and unfoldability], by Harald Søndergaard and Peter Sestoft (page 11 of 13).
 
[3] [https://etd.ohiolink.edu/acprod/odb_etd/ws/send_file/send?accession=osu1338317549&disposition=inline Abstraction as the Key to Programming, with Issues for Software Verification in Functional Languages], by Derek Bronish (pages 113-114 of 211).
 
[4] Wolfram Kahl provides a [https://www.cas.mcmaster.ca/~kahl/reftrans.html USENET post by Tom DeBoni] containing a summary of various definitions for referential transparency.
 
[5] [https://classes.cs.uoregon.edu/14S/cis607pl/Papers/fundamental-1967.pdf Fundamental Concepts in Programming Languages], Christopher Strachey, 1967 (page 9 of 39).
 
[6] This example is based on one from pages 4-5 of 33, in [https://homepages.inf.ed.ac.uk/wadler/ Philip Wadler]'s  [https://web.archive.org/web/20130430091603/https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.3579&rep=rep1&type=pdf How to Declare an Imperative], ''ACM Computing Surveys'', 29(3):240--263, September 1997.


[2] This example is based on one from pages 4-5 of 33, in [https://homepages.inf.ed.ac.uk/wadler/ Philip Wadler]'s [https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.3579&rep=rep1&type=pdf How to Declare an Imperative], ''ACM Computing Surveys'', 29(3):240--263, September 1997.
[7] where <code>puts</code> can be defined as <code>fun puts s = output(std_out,s);</code>


[3] where <code>puts</code> can be defined as <code>fun puts s = TextIO.output(TextIO.stdOut, s);</code>
[8] This style of I/O can also be implemented on top of side-effecting I/O: table 9.1 (page 148 of 168) in Andrew Gordon's [https://web.archive.org/web/20240618013631/https://www.audentia-gestion.fr/research.microsoft/fpio.pdf Functional Programming and Input/Output] (the revised 1992 PhD dissertation) provides an example of monadic teletype I/O for ML.


[4] There is some debate about whether the imprecisely-defined semantics of <code>Int</code> breaks referential transparency. For instance, <code>even (maxBound :: Int)</code> may be <code>True</code> in some contexts and <code>False</code> in others. Another example is <code>System.Info.os :: String</code>.
[9] There is some debate about whether the imprecisely-defined semantics of <code>Int</code> breaks referential transparency. For instance, <code>even (maxBound :: Int)</code> may be <code>True</code> in some contexts and <code>False</code> in others. Another example is <code>System.Info.os :: String</code>.


One perspective is that Haskell is not just one language (plus <code>Prelude</code>), but a family of languages, parameterized by a collection of implementation-dependent parameters.
[10] One perspective is that Haskell is not just one language (plus <code>Prelude</code>), but a family of languages, parameterized by a collection of implementation-dependent parameters.
Each such language is referentially transparent, even if the collection as a whole might not be.
Each such language is referentially transparent, even if the collection as a whole might not be.
Some people are satisfied with this situation and others are not.
Some people are satisfied with this situation and others are not.

Latest revision as of 10:48, 17 February 2025

A key advantage to pure functional programming is the property of referential transparency, first enunciated by Quine[1] and defined by him as follows:

“I call a mode of containment ϕ referentially transparent if, whenever an occurrence of a singular term t is purely referential in a term or sentence ψ(t), it is purely referential also in the containing term or sentence ϕ(ψ(t))”.

In formalizing this idea for programming languages, Søndergaard and Sestoft[2] conclude by saying informally,

“[A]n operator is referentially transparent if it preserves applicability of Leibniz’s law, or substitutivity of identity: the principle that any subexpression can be replaced by any other equal in value.”
Derek Bronish[3][4].

...with the term first being introduced to the study of programming languages by Christopher Strachey[5]:

One of the most useful properties of expressions is that called by Quine[1] referential transparency. In essence this means that if we wish to find the value of an expression which contains a sub-expression, the only thing we need to know about the sub-expression is its value. Any other features of the sub-expression, such as its internal structure, the number and nature of its components, the order in which they are evaluated or the colour of the ink in which they are written, are irrelevant to the value of the main expression.

Side effects like (uncontrolled) imperative update break this desirable property. C and ML are languages with constructs that are not referentially transparent.

As an example, consider the following program[6][7] in ML:

puts "h"; puts "a"; puts "h"; puts "a"

which prints "haha". In an attempt to factor out the repetition, we write

let val x = (puts "h"; puts "a")
in  x; x end

but now the laugh is on us, because "ha" is only printed once. The reason is that puts's side effect is only realized when x gets bound, so we should have written

let fun x () = (puts "h"; puts "a")
in  x (); x () end

Haskell's monadic[8] I/O system distinguishes between values and actions like the puts procedure above. So we do indeed have that

putStr "h" >> putStr "a" >> putStr "h" >> putStr "a"

is equivalent to

let x = putStr "h" >> putStr "a"
in  x >> x

Notes:

[1] Word and Object, by Willard Van Ormond Quine (page 163 of 314).

[2] Referential transparency, definiteness and unfoldability, by Harald Søndergaard and Peter Sestoft (page 11 of 13).

[3] Abstraction as the Key to Programming, with Issues for Software Verification in Functional Languages, by Derek Bronish (pages 113-114 of 211).

[4] Wolfram Kahl provides a USENET post by Tom DeBoni containing a summary of various definitions for referential transparency.

[5] Fundamental Concepts in Programming Languages, Christopher Strachey, 1967 (page 9 of 39).

[6] This example is based on one from pages 4-5 of 33, in Philip Wadler's How to Declare an Imperative, ACM Computing Surveys, 29(3):240--263, September 1997.

[7] where puts can be defined as fun puts s = output(std_out,s);

[8] This style of I/O can also be implemented on top of side-effecting I/O: table 9.1 (page 148 of 168) in Andrew Gordon's Functional Programming and Input/Output (the revised 1992 PhD dissertation) provides an example of monadic teletype I/O for ML.

[9] There is some debate about whether the imprecisely-defined semantics of Int breaks referential transparency. For instance, even (maxBound :: Int) may be True in some contexts and False in others. Another example is System.Info.os :: String.

[10] One perspective is that Haskell is not just one language (plus Prelude), but a family of languages, parameterized by a collection of implementation-dependent parameters. Each such language is referentially transparent, even if the collection as a whole might not be. Some people are satisfied with this situation and others are not.