Cn/Introduction

From HaskellWiki
Jump to navigation Jump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

Haskell是一种计算机程序语言。特别是,它是一种 多态的 静态类型的惰性的纯函数式 语言, 与大多数其他语言有很大的不同之处。 这门语言以为函数式语言建立数学逻辑基础的Haskell Brooks Curry命名。 Haskell基于lambda calculus, 因此也使用lambda作为logo。


为什么使用Haskell?

编写大型软件系统是一件困难而昂贵的事情,而维护那些系统尤为如此。函数式编程语言,例如Haskell,能够使这些变得更简单更便宜。 例如,一个用Haskell写了一个下行关系数据库管理系统的新用户会这样说:

WOW! 我基本上编完这些都不用测试,只用想着我的程序在类型之间的转换方式。我编写了测试和示例代码而几乎没有出现任何实现错误! 编译器或者类型系统真的真的很擅于阻止你发生编码错误!我一生中从来没有在初次试用的时候就写出这样一段功能强大的代码。我被震撼了。

即使你并不打算在你的项目中试用Haskell,学习Haskell能够使你使用任何语言都会变成更好的程序员。

我很多年以前开始学习Haskell, 之前曾使用Pyton和很多其他语言。最近,我在一个项目中使用Python(这种选择是由技术的和非技术的因素决定的),发现我的Python编程风格现在受到了Haskell编程经验的极大的影响(变得更好了,我希望;-)

Graham Klyne


Haskell为你提供:

  • 大大的增加程序员的产能(Ericsson统计过,在对电信软件开发的一系列实验中,通过使用一种类似于Haskell的函数式编程语言Erlang能够提高9到25倍)
  • 更短小,更整洁,和更具备可维护性的代码
  • 更少的错误,更高的可信度
  • 编程语言与自然语言之间更小的"语义界限"
  • 更短的lead times。

Haskell是一种应用广泛的语言,适合于不同的应用程序。特别适合于高度可变和可维护的程序。

很多软件产品的生命周期都消耗在规格设计维护上,而并不是编程。 函数式语言可以极好的用于书写实际可运行(同时测试和调试)的规格。继而这样一种规格就是最终程序的第一个原型。

函数式程序维护起来也相对简单,因为代码更短,更整洁,而且对副作用的严格控制消除了大量的不可预见的交互。


函数式编程时什么?

C,Java,Pascal,Ada,等等,都是命令式语言。它们是"命令式"的;特点是它们有一组命令序列组成,严格的按照先后顺序执行。 Haskell是一种函数式语言。函数式程序是一个单一的表达式,通过对表达式求值进行执行。

只要使用过电子表格的人都有过函数式编程的经历。在一个电子表格中,用户以其他单元格的方式指定一个单元格中的值。把注意力集中在什么要被计算,而不是如何去计算。例如:

  • 我们无需指定被计算的单元格的顺序;去而代之的我们让电子表格去根据内部的依赖以某种顺序去计算那些单元格。
  • 我们无需告诉电子表格如何分配它的内存;而是我们希望它为我们提供一个有无限多单元格的平板,同时只为那些实际使用到的单元格分配内存。
  • 大多数情况下,我们通过一个表达式来指定单元格的值(表达式中的各个部分可以以任何顺序进行求值计算),而不是通过命令序列来计算它们的值

电子表格的无需指定重新计算顺序的特点,一个有趣的推论就是赋值的概念不是很重要。毕竟,如果你都不是很了解要何时赋值,你也不能指望它有多大的用处! 这和传统的编程语言形成了鲜明的对比。例如C语言由一组本质上被小心指定的赋值序列组成,又或者Java,方法的调用顺序对一个程序有至关重要的意义。

这种聚焦在高层次的"做什么"上,而不是低层次的"怎么做"的方式就是函数式编程语言的显著特点。

另一个有名的很接近函数式语言的语言就是标准数据库查询语言SQL。一个SQL查询就是一个跟创建,选择,连接等相关的表达式。查询表明了什么关系要被计算,而没有表明该怎么计算。 事实上,查询能够以任何方便的顺序被执行。SQL语言的具体实现通常会进行大量的查询优化来指出表达式求值的最佳顺序。


函数式编程有什么好处?

电子表格和SQL都是很专门化的语言。函数式编程语言汲取了这些相同的想法并带入到用于通用目的编程的领域。想要了解函数式编程到底是什么样的,以及函数式语言丰富的表现力,请看看下面的快速排序程序。 它们都是对一组数据尽享升序排序并使用一个标准的方法,命名为"quicksort"。第一个程序用Haskell编写,第二个用C语言编写。

C语言描述了机器进行排序的具体步骤,大部分的代码都是低级别的出力数据的操作。与之对照的是,Haskell程序在一个更高的层次实现了排序算法的编码,结果是提高了简洁性和清楚性(以效率作为代价,如果不是以非常好的编译器进行编译):


Quicksort in Haskell

qsort []     = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)

Quicksort in C

True quicksort in C sorts in-place:

// To sort array a[] of size n: qsort(a,0,n-1)

void qsort(int a[], int lo, int hi) 
{
  int h, l, p, t;

  if (lo < hi) {
    l = lo;
    h = hi;
    p = a[hi];

    do {
      while ((l < h) && (a[l] <= p)) 
          l = l+1;
      while ((h > l) && (a[h] >= p))
          h = h-1;
      if (l < h) {
          t = a[l];
          a[l] = a[h];
          a[h] = t;
      }
    } while (l < h);

    a[hi] = a[l];
    a[l] = p;

    qsort( a, lo, l-1 );
    qsort( a, l+1, hi );
  }
}

这里是C语言代码的 半直接的翻译

让我们检验一下Haskell和函数式编程的诸多好处。 一个更详细的函数式编程的例子在

Why Functional Programming Matters by John Hughes, The Computer Journal, Vol. 32, No. 2, 1989, pp. 98 - 107. Also in: David A. Turner (ed.): Research Topics in Functional Programming, Addison-Wesley, 1990, pp. 17 - 42.

一个由上面的报告灵感激发的稍微不那么正式的论文在

Why Haskell Matters originally by Sebastian Sylvan

简洁

函数式编程倾向于比那些不够简洁的同伴们更简洁。快速排序是一个相当极端的例子,但是通常函数式程序会更短(2到10倍)。


便于理解

Functional programs are often easier to understand. You should be able to understand the program without any previous knowledge of either Haskell or quicksort. The same certainly cannot be said of the C program. It takes quite a while to understand, and even when you do understand it, it is extremely easy to make a small slip and end up with an incorrect program. Here is a detailed explanation of the Haskell quicksort:

qsort []     = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)

The first line reads: "When you sort an empty list ([]), the result is another empty list". The second line reads: "To sort a list whose first element is named x and the rest of which is named xs, sort the elements of xs that are less than x, sort the elements of xs that are greater than or equal to x, and concatenate (++) the results, with x sandwiched in the middle."

No core dumps

Most functional languages, and Haskell in particular, are strongly typed, eliminating a huge class of easy-to-make errors at compile time. In particular, strong typing means no core dumps! There is simply no possibility of treating an integer as a pointer, or following a null pointer.

Code re-use

Of course, strong typing is available in many imperative languages, such as Ada or Pascal. However, Haskell's type system is much less restrictive than, say, Pascal's, because it uses polymorphism.

For example, the qsort program given in Figure 1 will not only sort lists of integers, but also lists of floating point numbers, lists of characters, lists of lists; indeed, it will sort lists of anything for which it is meaningful to have "less-than" and "greater-than" operations. In contrast, the C version can only sort an array of integers, and nothing else.

Polymorphism enhances re-usability.

Strong glue

"Non-strict" functional languages, such as Haskell, have another powerful feature: they only evaluate as much of the program as is required to get the answer - this is called lazy evaluation. This feature is rather like Unix pipes. For example, the Unix command

grep printf Foo.c | wc

counts the number of lines in the file Foo.c that include the string printf. The command

grep printf Foo.c

produces all lines which contain the string "printf", while the "wc" command counts them. The pipe, written "|", takes the output from the first command and delivers it to the second. The two commands execute together, so that the output of the first is consumed more-or-less immediately by the second. In this way, no large intermediate files need be produced. You can think of wc "demanding" lines from the grep.

If the second command only needs some of the output of the first, then execution of the first command might never need to be completed. For example,

grep printf Foo.c | head 5

just prints the first 5 lines which contain "printf". There is no need to modify the grep command to take account of the fact that its execution might be abandoned.

Non-strict languages provide exactly this kind of demand-driven evaluation. Data structures are evaluated just enough to deliver the answer, and parts of them may not be evaluated at all. As in the case of Unix commands, this provides powerful "glue" with which to compose existing programs together. What this means is that it is possible to re-use programs, or pieces of programs, much more often than can be done in an imperative setting. Lazy evaluation allows us to write more modular programs.

Powerful abstractions

In general, functional languages offer powerful new ways to encapsulate abstractions. An abstraction allows you to define an object whose internal workings are hidden; a C procedure, for example, is an abstraction. Abstractions are the key to building modular, maintainable programs, so much so that a good question to ask of any new language is "what mechanisms for abstraction does it provide?".

One powerful abstraction mechanism available in functional languages is the higher order function. In Haskell a function is a first-class citizen: it can freely be passed to other functions, returned as the result of a function, stored in a data structure, and so on. It turns out that the judicious use of higher order functions can substantially improve the structure and modularity of many programs.

Built-in memory management

Very many sophisticated programs need to allocate dynamic memory from a heap. In C this is done with a call to malloc, followed by code to initialize the store just allocated. The programmer is responsible for returning the store to the free pool when it isn't needed any more, a notorious source of "dangling-pointer" errors. To make matters worse, malloc is fairly expensive performance-wise, so programmers often malloc a single large chunk of store, and then allocate "by hand" out of this.

Every functional language relieves the programmer of this storage management burden. Store is allocated and initialized implicitly, and recovered automatically by the garbage collector. The technology of storage allocation and garbage collection is now well developed, and the performance costs are rather slight.

When C is better

It isn't all roses, of course. The C quicksort uses an extremely ingenious technique, invented by Hoare, whereby it sorts the array in place; that is, without using any extra storage. As a result, it runs quickly, and in a small amount of memory. In contrast, the Haskell program allocates quite a lot of extra memory behind the scenes, and runs rather slower than the C program.

In effect, the C quicksort does some very ingenious storage management, trading this algorithmic complexity for a reduction in run-time storage management costs.

In applications where performance is required at any cost, or when the goal is detailed tuning of a low-level algorithm, an imperative language like C would probably be a better choice than Haskell, exactly because it provides more intimate control over the exact way in which the computation is carried out.

Functional vs imperative

But few programs require performance at any cost! After all, we all stopped writing assembly-language programs, except perhaps for key inner loops, long ago. The benefits of having a more supportive programming model (an arbitrary number of named, local variables instead of a fixed number of registers, for example) far outweigh the modest run-time costs.

Similarly, we willingly accept the costs of a virtual memory paging system, in exchange for the more supportive programming model of an infinite virtual address space. The days of explicit memory overlays are over.

Functional languages take another large step towards a higher-level programing model. Programs are easier to design, write and maintain, but the language offers the programmer less control over the machine. For most programs the result is perfectly acceptable.

What is Haskell?

Haskell is a modern, standard, non-strict, purely-functional programming language. It provides all the features sketched above, including polymorphic typing, lazy evaluation and higher-order functions. It also has an innovative type system which supports a systematic form of overloading and a module system.

It is specifically designed to handle a wide range of applications, from numerical through to symbolic. To this end, Haskell has an expressive syntax, and a rich variety of built-in data types, including arbitrary-precision integers and rationals, as well as the more conventional integer, floating-point and boolean types.

There are a number of compilers and interpreters available. All are free. First-time users may want to start with Hugs, a small, portable Haskell interpreter.

See also the History of Haskell

Does anyone use functional programming?

Functional programming languages are used in substantial applications. For example:

  • Software AG, a major German software company, market an expert system (Natural Expert) which is programmed in a functional language. Their users find it easy to develop their applications in this language, through which they gain access to an underlying database system. It all runs on an IBM mainframe.
  • Ericsson have developed a new functional language, Erlang, to use in their future telephony applications. They have already written 130k-line Erlang applications, and find them very much shorter and faster to develop.
  • Amoco ran an experiment in which they re-coded in Miranda, a lazy functional language, a substantial fraction of their main oil-reservoir simulation code, a critical application. The resulting program was vastly shorter, and its production revealed a number of errors in the existing software. Amoco subsequently transcribed the functional program into C++ with encouraging results.
  • A researcher at the MITRE corporation is using Haskell to prototype his digital signal-processing applications.
  • Researchers at Durham University used Miranda, and later Haskell, in a seven-year project to build LOLITA, a 30,000-line program for natural-language understanding.
  • Query is the query language of the O2 object-oriented database system. O2Query is probably the most sophisticated commercially-available object-oriented database query language and it is a functional language.
  • ICAD Inc market a CAD system for mechanical and aeronautical engineers. The language in which the engineers describe their design is functional, and it uses lazy evaluation extensively to avoid recomputing parts of the design which are not currently visible on the screen. This results in substantial performance improvements.
  • An incestuous example: the Glasgow Haskell compiler is written in Haskell: a 100,000-line application.
  • Pugs, the leading perl6 implementation is written in Haskell
  • As is Darcs, a cutting edge distributed revision control system

Some other examples of Haskell in practice.

Clifford Beshers, of Linspire Inc., describes their experience with Haskell, and functional programming:

Linspire, Inc. has used functional programming since its inception in 2001, beginning with extensive use of O'Caml, with a steady shift to Haskell as its implementations and libraries have matured. Hardware detection, software packaging and CGI web page generation are all areas where we have used functional programming extensively.

Haskell's feature set lets us replace much of our use of little languages (e.g., bash or awk) and two-level languages (C or C++ bound to an interpreted language), allowing for faster development, better code sharing and ultimately faster implementations. Above all, we value static type checking for minimizing runtime errors in applications that run in unknown environments and for wrapping legacy programs in strongly typed functions to ensure that we pass valid arguments.

Other frequently-asked questions

Is functional programming hard to learn?

Functional programming does require a change in perspective, which some programmers find hard. But Ericsson's experience in training programmers in Erlang is that most find the transition easy - provided they take the training need seriously rather than assuming that they can "pick it up on the day".

Aren't functional programs very slow?

They used to be, perhaps 20 years ago. But the compilers

have long since caught up. Haskell programs run fast for all but the most performance-demanding applications. At the time of writing, Haskell compiled via GHC is in 2nd place (behind C) in the Great Language Shootout, with other functional languages also ranked highly.

I already have a large application in C or C++.

Also worded as: Can I benefit from functional programming without rewriting my whole system?

Haskell has been successfully integrated into existing applications in a number of ways. HaskellDirect is an IDL (Interface Description Language) based tool that allows Haskell programs to work with software components. Low level C/C++ interfaces can be generated with Green Card or C->Haskell, allowing tight integration between Haskell and C. These tools have been used to build a large number of successful, mixed language systems.

What libraries does Haskell support?

Many software libraries have been developed for Haskell. See the list of Haskell libraries for a list of much of what is available.

What other software tools for Haskell are there?

Glasgow Haskell comes with a profiler which allows you to find which parts of your program are consuming most time and space. Chalmers Haskell has a space-profiling tool, and a quasi-parallel simulator which allows you to experiment with running your program in parallel. Hugs also has some similar tools. For a complete list, check the tools page.

Can I get a support contract or a help-line?

It used to be the case that if you wanted help, you had to persuade a Haskell research group that your problem was interesting enough or important enough that they should spend time helping you for free.
Whilst that is still an option, there is now a directory of Haskell Consultants who provide:

  • Support for compilers, tools and libraries.
  • Help with improving code quality (time, space, robustness, maintainability, etc.) using code reviews and tools.
  • Help with using libraries, tools and advanced Haskell features such as type system extensions, exception handling, the foreign function interface, test harnesses, and concurrency.
  • Library and application development.
  • Staff training.

These companies and individuals tend to work closely with those developing Haskell (indeed, they have usually made major contributions to Haskell themselves).

How can I learn Haskell?

The easiest way to learn Haskell is with a textbook. There are a lot of online tutorials, but you'll have a much easier time to learn the basics from a book. After all, Haskell is very different from traditional mainstream languages, it's like learning programming anew.

Comparisons to other languages

Click to see a table comparing features of Haskell to similar languages

Based on a paper by Simon Peyton Jones.