Internationalization of Haskell programs
Revision as of 10:10, 10 December 2011
1 Approaches to internationalization in Haskell
There are several different approaches you can use to internationalize your Haskell program.
1.1 Using GNU gettext
Set up your translations and integrate them into your application using these instructions.
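The lookup model behind gettext can be sketched in a few lines. The code below does not use gettext or any Haskell binding to it; it only imitates the idea of a runtime message catalog keyed by the original text, with the original text returned when no translation is found. The names Catalog, germanCatalog and translate are hypothetical, and the in-memory list stands in for the compiled catalog file that gettext would load at runtime.

```haskell
import Data.Maybe (fromMaybe)

-- A catalog maps original message strings to their translations.
-- (Assumption for illustration: real gettext reads this from a
-- compiled catalog file at runtime instead of a Haskell list.)
type Catalog = [(String, String)]

-- Hypothetical German catalog.
germanCatalog :: Catalog
germanCatalog =
  [ ("Hello, world!", "Hallo, Welt!")
  , ("Goodbye", "Auf Wiedersehen")
  ]

-- Look up a message; fall back to the original text when the
-- catalog has no entry for it.
translate :: Catalog -> String -> String
translate catalog msgid = fromMaybe msgid (lookup msgid catalog)

main :: IO ()
main = do
  putStrLn (translate germanCatalog "Hello, world!")
  putStrLn (translate germanCatalog "Not translated yet")
```

Note that the keys are plain strings: a typo in a key is not caught by the compiler, it simply results in the untranslated text being shown.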
1.2 Using native Haskell data types
You can internationalize your program using native Haskell data types.
Represent the individual texts to be translated as constructors of a Haskell data type. Then provide a function that automatically renders the texts appropriately in the current language context.
See this description of a simple example of using Haskell data types for internationalization.
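As a minimal sketch of this approach (all type, constructor and function names here are made up for illustration, and the language is passed explicitly rather than taken from some ambient context), a program might look like this:

```haskell
-- Supported languages (a hypothetical selection).
data Lang = English | German
  deriving (Show, Eq)

-- Each translatable text is a constructor; constructor arguments
-- carry the dynamic parts of the text.
data Msg
  = Greeting String     -- greet a user by name
  | UnreadMessages Int  -- report the number of unread messages
  deriving (Show, Eq)

-- Render a message in the given language. Because this is an ordinary
-- function, each language can handle plurals in its own way.
render :: Lang -> Msg -> String
render English (Greeting name)    = "Hello, " ++ name ++ "!"
render German  (Greeting name)    = "Hallo, " ++ name ++ "!"
render English (UnreadMessages 1) = "You have 1 unread message."
render English (UnreadMessages n) = "You have " ++ show n ++ " unread messages."
render German  (UnreadMessages 1) = "Sie haben 1 ungelesene Nachricht."
render German  (UnreadMessages n) = "Sie haben " ++ show n ++ " ungelesene Nachrichten."

main :: IO ()
main = do
  putStrLn (render English (Greeting "Alice"))
  putStrLn (render German (UnreadMessages 3))
```

If a new constructor is added to Msg and a language is forgotten in render, GHC can report the missing case when incomplete-pattern warnings are enabled.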
1.3 Using the Grammatical Framework
You can internationalize your program using the Grammatical Framework (GF).
GF provides a way to define human-language-independent syntax for expressing texts in Haskell. GF can then render the texts automatically in any human language for which an appropriate GF grammar exists.
There is a very simple example of an application internationalized using GF. It is based on the "Foods" grammar (included in the example), so it's quite contrived, but it should be enough to get you started. Usage instructions are in a file included with the example.
See the GF download page for information about how to install GF. The standard installation of GF currently includes grammars for at least 26 languages.
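The split between language-independent syntax and per-language rendering can be illustrated without GF itself. The following hand-written sketch is only loosely modeled on the "Foods" grammar; it does not use GF, its grammars, or its Haskell API, and every name in it is an assumption made for illustration. It shows the kind of work a GF grammar does for you, such as making the German determiner agree with the gender of the noun.

```haskell
-- Abstract syntax: a language-independent description of what to say.
data Kind    = Wine | Fish | Pizza
data Quality = Fresh | Expensive | Very Quality
data Comment = Pred Kind Quality  -- "this <kind> is <quality>"

-- Concrete syntax for English.
english :: Comment -> String
english (Pred k q) = unwords ["this", kindEn k, "is", qualEn q]

kindEn :: Kind -> String
kindEn Wine  = "wine"
kindEn Fish  = "fish"
kindEn Pizza = "pizza"

qualEn :: Quality -> String
qualEn Fresh     = "fresh"
qualEn Expensive = "expensive"
qualEn (Very q)  = "very " ++ qualEn q

-- Concrete syntax for German: the determiner depends on the noun's gender.
german :: Comment -> String
german (Pred k q) = unwords [detDe k, kindDe k, "ist", qualDe q]

detDe :: Kind -> String
detDe Pizza = "diese"   -- feminine noun
detDe _     = "dieser"  -- masculine nouns: Wein, Fisch

kindDe :: Kind -> String
kindDe Wine  = "Wein"
kindDe Fish  = "Fisch"
kindDe Pizza = "Pizza"

qualDe :: Quality -> String
qualDe Fresh     = "frisch"
qualDe Expensive = "teuer"
qualDe (Very q)  = "sehr " ++ qualDe q

main :: IO ()
main = do
  let comment = Pred Pizza (Very Expensive)
  putStrLn (english comment)
  putStrLn (german comment)
```

With GF, the application would build only the abstract value, and the concrete renderings would come from existing grammars rather than from hand-written functions like english and german.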
2 Comparison of approaches to internationalization
2.1 GNU gettext
Advantages:
- Easiest integration with other tools and programming languages
- Little or no specialized knowledge required of translators
- Little or no interaction needed between programmers and translators
- Well-known and well-documented
Disadvantages:
- Inflexible use of static text literals creates awkwardness when there are complex differences between how the same idea is expressed in different languages
- Translation selection happens at runtime, so there is no type safety
- Texts are loaded from external files at runtime, which creates overhead and deployment issues
- Requires a moderate amount of work to set up and integrate
- Not well supported on MS Windows
2.2 Native Haskell data types
Advantages:
- Easy to implement in Haskell
- Compile-time type safety
- Flexible in handling complex differences between languages
- Flexible in implementation: e.g., use a type class if you don't want one big data type, use Text or Builder instead of String
- Platform independent
Disadvantages:
- May require some training of translators and/or cooperative integration work between translators and programmers, depending on the level of sophistication needed in the rendering functions
- May lead to many string literals in a Haskell source file. This requires a work-around for a current limitation of the GHC compiler; see file-embed, below.
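The type class variant mentioned in the list above might look roughly like the following sketch. The class name Translatable and the message types are hypothetical, and String is used only to keep the example free of extra dependencies; Text or Builder would work the same way.

```haskell
data Lang = English | German

-- One class instead of one big data type: each part of the program
-- can define its own message type and give it an instance.
class Translatable msg where
  render :: Lang -> msg -> String

-- Messages for a menu (hypothetical).
data MenuMsg = File | Quit

instance Translatable MenuMsg where
  render English File = "File"
  render German  File = "Datei"
  render English Quit = "Quit"
  render German  Quit = "Beenden"

-- Messages for error reporting (hypothetical).
data ErrorMsg = FileNotFound FilePath

instance Translatable ErrorMsg where
  render English (FileNotFound p) = "File not found: " ++ p
  render German  (FileNotFound p) = "Datei nicht gefunden: " ++ p

main :: IO ()
main = do
  putStrLn (render German Quit)
  putStrLn (render English (FileNotFound "notes.txt"))
```

Each message type can live in its own module, so no single source file has to hold every text in the program.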
2.3 Yesod's native data type approach
Yesod's approach provides a translator-friendly veneer that gets rid of the disadvantages of the native data type approach listed above. It is integrated only with the Yesod web framework and its Hamlet template language, but it could be abstracted out for use with other projects.
2.4 Grammatical Framework
Advantages:
- Translations automatically generated in all languages for which GF grammars exist, without the need for human translators
- High quality translations
- Platform independent
Disadvantages:
- Learning curve for the programmer to express texts using existing GF grammars, and to extend GF grammars as needed
- Extra work needed if you must support languages that do not yet have a GF grammar
- Installing GF is not quite as simple as installing the usual Haskell package
- A human translator may still be needed for domain-specific words and expressions not included in the standard GF grammars
3 Other tools
The following tools may also be useful when internationalizing your Haskell program:
Internationalization often leads to a large number of literal strings in a Haskell source file. This creates a technical problem because of a current limitation of GHC: it does not behave well when compiling a source file with a large number of literal strings.
One classic work-around to this problem is to place the literal strings in a small C library and import them via the FFI.
Another work-around, based on Template Haskell, is provided by the file-embed package.
The numerals package renders numbers (currently only cardinal numbers) as text in many different languages.