Difference between revisions of "Internationalization of Haskell programs using gettext"

From HaskellWiki
(First version of text, copied from http://progandprog.blogspot.com/2009/03/i18n-and-haskell.html)
 
m (grammar and spelling corrections)
Revision as of 18:23, 28 March 2009

The most common approach to internationalization (i18n) of software in the GNU world is to use the GNU gettext utilities. In this tutorial we will create a simple "Hello world" program with multilingual support.

Prepare the program for internationalization

Suppose we want to make the following program multilingual (file Main.hs):

module Main where
    
import System.IO

main = do
  putStrLn "Please enter your name:"
  name <- getLine
  putStrLn $ "Hello, " ++ name ++ ", how are you?"

First of all, wrap every string you want to translate in the function __:

module Main where
    
import System.IO

__ = id

main = do
  putStrLn (__ "Please enter your name:")
  name <- getLine
  putStrLn $ (__ "Hello, ") ++ name ++ (__ ", how are you?")


We will return to the definition of __ a bit later; for now, leave this function as a no-op (id).
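The program above splits the greeting into two fragments, which forces every translation to keep the same word order. A common alternative is to mark one complete message with a placeholder. The following is only a sketch under that assumption: the function name greeting is hypothetical, and __ is still the no-op marker from above.

```haskell
import Text.Printf (printf)

-- No-op marker, as in the tutorial; it is replaced by a real lookup later.
__ :: String -> String
__ = id

-- One translatable message with a %s placeholder instead of two fragments,
-- so a translator can move the name to wherever the target language needs it.
greeting :: String -> String
greeting name = printf (__ "Hello, %s, how are you?") name
```

With this variant the catalogue would contain a single msgid for the whole sentence rather than the two fragments used in the rest of this page.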

Translate

The next step is to generate a POT file (a template which contains all strings needing translation). For Python, C, C++ and Scheme there is the xgettext utility, but it doesn't support Haskell. On Hackage you can download the hgettext library and utilities, which process Haskell source files in the same way as xgettext does for C/C++ files:

cabal install --global hgettext

Now run this from the directory where your project is:

hgettext -k __ -o messages.pot Main.hs

This gathers all strings marked by the function __ from the file Main.hs and writes everything to messages.pot.
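To make the shape of the generated entries concrete, here is a minimal sketch of how a single template entry could be rendered from a marked string. The names escapePo and potEntry are hypothetical, and this is not hgettext's actual implementation; it only mirrors the entry format used in a POT file.

```haskell
-- Hypothetical helpers for illustration; not part of hgettext.

-- | Escape a string so it can be placed inside a PO string literal.
escapePo :: String -> String
escapePo = concatMap esc
  where
    esc '"'  = "\\\""
    esc '\\' = "\\\\"
    esc '\n' = "\\n"
    esc '\t' = "\\t"
    esc c    = [c]

-- | Render one untranslated entry: a source reference comment, the
-- original text as msgid, and an empty msgstr for the translator.
potEntry :: FilePath -> Int -> String -> String
potEntry file line msg = unlines
  [ "#: " ++ file ++ ":" ++ show line
  , "msgid \"" ++ escapePo msg ++ "\""
  , "msgstr \"\""
  ]
```

For example, putStr (potEntry "Main.hs" 0 "Hello, ") prints a three-line entry with the reference comment, the msgid and an empty msgstr.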

Now look at the resulting pot file:

# Translation file
 
msgid ""
msgstr ""
 
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2009-01-13 06:05-0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME \n"
"Language-Team: LANGUAGE \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

#: Main.hs:0
msgid "Please enter your name:"
msgstr ""

#: Main.hs:0
msgid "Hello, "
msgstr ""

#: Main.hs:0
msgid ", how are you?"
msgstr ""


We are interested in the bottom part of this file (starting from '#: Main.hs:...'). It consists of pairs of lines, msgid and msgstr: msgid is the original text from the code, and msgstr is the translated string. Each language should have its own translation file. I will create two translations, German and English.
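The relationship between msgid and msgstr can be modelled as a simple lookup. The sketch below is an illustration only: the Entry type and the translate function are hypothetical names, and real catalogues are read from compiled files rather than from a list. It does reflect one property of gettext worth remembering: when no translation is available, the original text is returned.

```haskell
-- | One catalogue entry (hypothetical model of a msgid/msgstr pair).
data Entry = Entry
  { msgid  :: String  -- text as it appears in the source code
  , msgstr :: String  -- translated text; empty when not yet translated
  } deriving (Show, Eq)

-- | Look up a message; fall back to the original text when there is no
-- entry for it or its translation is still empty.
translate :: [Entry] -> String -> String
translate catalogue original =
  case [ msgstr e
       | e <- catalogue
       , msgid e == original
       , not (null (msgstr e))
       ] of
    (t:_) -> t
    []    -> original
```

This is why a partially translated catalogue is still usable: untranslated messages simply appear in the source language.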

To create a PO file for a specific locale, we use the msginit utility.

To generate the German translation file template, run:

msginit --input=messages.pot --locale=de.UTF-8

And for the English translation run:

msginit --input=messages.pot --locale=en.UTF-8

If we look at the generated files (en.po and de.po), we will see that the English translation is already completely filled in, so we only have to edit the German PO file. Fill it with the following strings:

#: Main.hs:0
msgid "Please enter your name:"
msgstr "Wie heißen Sie?"

#: Main.hs:0
msgid "Hello, "
msgstr "Hallo, "

#: Main.hs:0
msgid ", how are you?"
msgstr ", wie geht es Ihnen?"

Install translation files

Now we have to create the directories where these translations should be placed. By default, all translation files are placed in the /usr/share/locale/ folder, but we are free to choose a different location. Run:

mkdir -p {de,en}/LC_MESSAGES

This creates two directories in the current directory, de and en, each containing an LC_MESSAGES subdirectory. Now use the msgfmt tool to compile our po files into mo files (binary translation files):

msgfmt --output-file=en/LC_MESSAGES/hello.mo en.po
msgfmt --output-file=de/LC_MESSAGES/hello.mo de.po
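The output paths in these commands follow the layout gettext expects: locale directory, then language, then LC_MESSAGES, then the text domain with the extension mo. As a small sketch (the function name moPath is hypothetical), the path can be assembled like this:

```haskell
import System.FilePath ((</>), (<.>))

-- | Path of the compiled catalogue for a given locale directory,
-- language and text domain, following the layout used above.
moPath :: FilePath -> String -> String -> FilePath
moPath localeDir lang domain =
  localeDir </> lang </> "LC_MESSAGES" </> domain <.> "mo"
```

On a POSIX system, moPath "." "de" "hello" gives ./de/LC_MESSAGES/hello.mo, which matches the second msgfmt command.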

Enable internationalization in the code

As the final step, we have to modify the code to support internationalization:

module Main where
    
import System.IO
import Text.I18N.GetText
import System.Locale.SetLocale
import System.IO.Unsafe

__ :: String -> String
__ = unsafePerformIO . getText

main = do
  setLocale LC_ALL (Just "") 
  bindTextDomain "hello" "." 
  textDomain "hello" 

  putStrLn (__ "Please enter your name:")
  name <- getLine
  putStrLn $ (__ "Hello, ") ++ name ++ (__ ", how are you?")


Here we added three initialization commands:

setLocale LC_ALL (Just "") 
bindTextDomain "hello" "." 
textDomain "hello"


The first one (you will need to install the setlocale package to use this function) sets the current locale to the default taken from the user's environment. The next two functions tell gettext to load the "hello.mo" message file from the locale directory (I set it to ".", but in the general case this directory should come from the package configuration).

The final step is to define the function __. It simply calls getText from the module Text.I18N.GetText, but since the type of getText is String -> IO String, we use unsafePerformIO to make it simpler to call.
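Because __ hides an IO action behind unsafePerformIO, code that uses it depends on the installed catalogues and the current locale. One possible way to keep message-building code easy to check in isolation is to pass the translation function in as an argument. This is a sketch of that design choice, not something the tutorial's program does; Translator, greeting and fakeGerman are hypothetical names, and the argument is deliberately called __ so that the strings stay marked in the same way.

```haskell
-- | A translator maps an original message to its localized form.
type Translator = String -> String

-- | Build the greeting from translated fragments. The translator is an
-- ordinary argument (named __ on purpose), so this function can be
-- exercised without any catalogue installed.
greeting :: Translator -> String -> String
greeting __ name = __ "Hello, " ++ name ++ __ ", how are you?"

-- | Stand-in translator with a hard-coded table, for experiments only.
fakeGerman :: Translator
fakeGerman s = maybe s id (lookup s table)
  where
    table =
      [ ("Hello, ", "Hallo, ")
      , (", how are you?", ", wie geht es Ihnen?")
      ]
```

In the real program the function passed in would be the gettext-backed __ defined above, while greeting id "Bond" or greeting fakeGerman "Bond" can be tried directly in GHCi.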

Run the program

Now you can build and try this program in different locales:

user> ghc --make Main.hs
[1 of 1] Compiling Main             ( Main.hs, Main.o )
Linking Main ...

user> LANG=en_US.UTF-8 ./Main
Please enter your name:
Bond
Hello, Bond, how are you?

user> LANG=de_DE.UTF-8 ./Main
Wie heißen Sie?
Bond
Hallo, Bond, wie geht es Ihnen?

user>
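The German catalogue was installed under de, yet it is found for the de_DE.UTF-8 locale. That works because gettext falls back from the full locale name to less specific ones. The sketch below is a simplified model of that fallback, with hypothetical function names; the real lookup also tries variants that keep the codeset or a modifier.

```haskell
-- | Strip the codeset (".UTF-8") and modifier ("@euro") from a locale name.
baseLocale :: String -> String
baseLocale = takeWhile (`notElem` ".@")

-- | Directory names to try, most specific first, e.g. "de_DE" then "de".
localeCandidates :: String -> [String]
localeCandidates locale
  | null territory = [lang]
  | otherwise      = [base, lang]
  where
    base              = baseLocale locale
    (lang, territory) = break (== '_') base
```

Under this model, de_DE.UTF-8 yields the candidates de_DE and de, and the second one matches the directory created earlier.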

Distribute internationalized cabal package

TBD