Unicode-symbols
From HaskellWiki
Latest revision as of 22:40, 15 July 2016
1 Overview
An overview of the packages that provide Unicode symbols.
Naming: A package X-unicode-symbols defines new symbols for functions and operators from the package X.
All symbols are documented with their actual definition and information regarding their Unicode code point. They should be completely interchangeable with their definitions.
Alternatives for existing operators have the same fixity. New operators will have a suitable fixity defined.
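1.1 UnicodeSyntax
GHC offers the UnicodeSyntax language extension (https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#unicode-syntax). It lets you write Unicode characters such as ∷, ⇒, →, ← and ∀ in place of the ASCII syntax ::, =>, ->, <- and forall. If you decide to use Unicode in your Haskell source, this extension can greatly improve how it looks.
Simply put the following at the top of a module to enable Unicode syntax: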
{-# LANGUAGE UnicodeSyntax #-}
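As a rough illustration of what the extension permits, here is a minimal sketch. The function names swapPair and evenSquares are made up for this example, and ExplicitForAll is only enabled so that ∀ can appear in a signature.

```haskell
{-# LANGUAGE UnicodeSyntax #-}
{-# LANGUAGE ExplicitForAll #-}

-- Hypothetical example functions; only the syntax matters here.

-- ∷ stands for ::, ∀ for forall, → for ->.
swapPair ∷ ∀ a b. (a, b) → (b, a)
swapPair (x, y) = (y, x)

-- ⇒ stands for =>, and ← for <- (here in a list comprehension).
evenSquares ∷ Integral a ⇒ [a] → [a]
evenSquares xs = [x * x | x ← xs, even x]
```

The ASCII forms stay valid when the extension is on, so both styles can be mixed in one module.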
2 base-unicode-symbols
Extra symbols for the base package.
API docs: http://hackage.haskell.org/package/base-unicode-symbols
github: https://github.com/roelvandijk/base-unicode-symbols
checkout: git clone git://github.com/roelvandijk/base-unicode-symbols.git
2.1 Problematic symbols
Original | Symbol | Code point | Name |
---|---|---|---|
not | ¬ | U+00AC | NOT SIGN |
lambda | λ | U+03BB | GREEK SMALL LETTER LAMDA |
The problem with the NOT symbol is that you would like to use it as a unary prefix operator:
¬(¬x) ≡ x
Unfortunately this is not valid Haskell. The following is:
(¬)((¬)x) ≡ x
But you can hardly call that an improvement over the simple:
not (not x) ≡ x
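The following sketch shows the valid spelling in a compilable form. It defines its own (¬) as a stand-in for the operator that base-unicode-symbols is assumed to export from Data.Bool.Unicode, so that it can be loaded without the package. The name doubleNegation is hypothetical.

```haskell
{-# LANGUAGE UnicodeSyntax #-}

-- Local stand-in (an assumption for this sketch) for the package's (¬).
(¬) ∷ Bool → Bool
(¬) = not

-- ¬ is a symbol character, so it forms an operator name and has to be
-- parenthesised before it can be applied like an ordinary function.
doubleNegation ∷ Bool → Bool
doubleNegation x = (¬) ((¬) x)

-- The prefix form is rejected by the parser:
-- doubleNegation x = ¬(¬x)
```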
The problem with the LAMBDA symbol is that it is classified as an alphabetic (lowercase) character, so it can be used as part of a name and cannot also serve as lambda syntax. See the discussion for GHC.
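A small sketch of the consequence, assuming a GHC that classifies identifier characters by their Unicode category. All names here are invented for the illustration.

```haskell
{-# LANGUAGE UnicodeSyntax #-}

-- λ is a lowercase letter, so this declares an ordinary variable named λ.
λ ∷ Int
λ = 42

-- Likewise, λx is one identifier, not "lambda x".
λx ∷ Int
λx = λ + 1

-- A real lambda still needs the backslash; only the arrow has a Unicode form.
increment ∷ Int → Int
increment = \n → n + 1
```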
2.2 New symbol ideas
(please add your own)
I'm thinking of adding the following symbol as another alternative for (*).
Original | Symbol | Code point | Name |
---|---|---|---|
(*) | × | U+00D7 | MULTIPLICATION SIGN |
2 * 3 ≡ 6
2 ⋅ 3 ≡ 6
2 × 3 ≡ 6
A disadvantage of this symbol is its similarity to the letter x:
sqr x = x × x
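If the symbol were added, its definition might look like the sketch below. This is a hypothetical local definition, not code from the package. It copies the fixity of (*), which the Prelude declares as infixl 7, in line with the rule that alternatives keep the fixity of the original.

```haskell
{-# LANGUAGE UnicodeSyntax #-}

-- Same fixity as (*) so that mixed expressions parse identically.
infixl 7 ×

-- Hypothetical alternative spelling of (*).
(×) ∷ Num a ⇒ a → a → a
(×) = (*)

sqr ∷ Num a ⇒ a → a
sqr x = x × x
```

With this fixity, 1 + 2 × 3 groups as 1 + (2 × 3), just as it does with (*).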
Original | Symbol | Code point | Name |
---|---|---|---|
Bool | 𝔹 | U+1D539 | MATHEMATICAL DOUBLE-STRUCK CAPITAL B |
This idea is an extension of
type ℤ = Integer
and
type ℚ = Ratio ℤ
The advantage is that it looks nice and that it is a logical extension of ℤ, ℚ and ℝ. The disadvantage is that there is no documented prior use of this character to denote boolean values. This could be detrimental to the readability of code.
Example:
(∧) ∷ 𝔹 → 𝔹 → 𝔹
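Put together, the idea could be sketched as follows. The synonyms and the operator are defined locally so that the snippet stands alone. The 𝔹 synonym is the proposal under discussion, not an existing export, and half is a hypothetical value used only to exercise ℚ.

```haskell
{-# LANGUAGE UnicodeSyntax #-}

import Data.Ratio (Ratio, (%))

-- Double-struck capitals are uppercase letters, so they are valid type names.
type ℤ = Integer
type ℚ = Ratio ℤ
type 𝔹 = Bool   -- the proposed synonym

-- Same fixity as (&&).
infixr 3 ∧

(∧) ∷ 𝔹 → 𝔹 → 𝔹
(∧) = (&&)

half ∷ ℚ
half = 1 % 2
```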
3 containers-unicode-symbols
Extra symbols for the containers package.
API docs: http://hackage.haskell.org/package/containers-unicode-symbols
github: https://github.com/roelvandijk/containers-unicode-symbols
checkout: git clone git://github.com/roelvandijk/containers-unicode-symbols.git
3.1 New symbol ideas
(please add your own)
4 Input methods
These symbols are all very nice but how do you type them?
Wikipedia has a helpful article: http://en.wikipedia.org/wiki/Unicode_input
(please add info for other editors)
4.1 Emacs
Direct
Enter symbols directly: C-x 8 RET (insert-char, called ucs-insert in older Emacs versions), then type either the character's name or its hexadecimal code point.
TeX input method
The TeX input method, invoked with M-x set-input-method and entering TeX, allows you to enter Unicode characters by typing TeX-like sequences. For example, typing \lambda inserts a λ.
This is probably the most convenient input method for casual use.
A list of available sequences may be viewed with M-x describe-input-method
Custom input method
I wrote my own input method:
github: https://github.com/roelvandijk/emacs-haskell-unicode-input-method
checkout: git clone git://github.com/roelvandijk/emacs-haskell-unicode-input-method.git
To load it automatically in haskell-mode, put the following code in your .emacs file:
(require 'haskell-unicode-input-method)
(add-hook 'haskell-mode-hook
  (lambda () (set-input-method "haskell-unicode")))
Make sure the directory containing the input method's .el file is in your load-path, for example:
(add-to-list 'load-path "~/.elisp/emacs-haskell-unicode-input-method")
To enable it manually, use M-x set-input-method or C-x RET C-\ and enter haskell-unicode. Note that the elisp file must have been evaluated for this to work.
Now you can simply type -> and it is immediately replaced with →. Use C-\ to toggle the input method. To see a table of all key sequences use M-x describe-input-method haskell-unicode. A sequence like <= is ambiguous and can mean either ⇐ or ≤. Typing it presents you with a choice. Type 1 or 2 to select an option or keep typing to use the default option.
If you don't like the highlighting of partially matching tokens you can turn it off:
(setq input-method-highlight-flag nil)
Abbrev mode
The Abbrev mode is not suitable since it only deals with words, not operators.
Agda
Use Agda's input method.
4.2 Vim
(real Vim users might want to expand this section)
Direct
- Decimal value: type C-Vnnn where 0 ≤ nnn ≤ 255.
- Octal value: type C-VOnnn or C-Vonnn where 0 ≤ nnn ≤ 377.
- Hex value: type C-VXnn or C-Vxnn where 0 ≤ nn ≤ FF.
- Hex value for BMP codepoints: type C-Vunnnn where 0 ≤ nnnn ≤ FFFF.
- Hex value for any codepoint: type C-VUnnnnnnnn where 0 ≤ nnnnnnnn ≤ FFFFFFFF.
Digraphs
Vim's digraphs can be used to type many Unicode symbols with somewhat memorable key combinations.
Digraphs are entered with C-K plus two keystrokes in insert mode. Many of the simplest symbols are entered with the same two characters they'd substitute for:
- C-K :: inserts ∷
- C-K => inserts ⇒
- C-K -> inserts →
- C-K <- inserts ←
Other typeable symbols include:
- C-K FA inserts ∀
- C-K AN inserts ∧
- C-K OR inserts ∨
- C-K NO inserts ¬
- C-K (- inserts ∈
- C-K != inserts ≠
- C-K =< inserts ≤
- C-K >= inserts ≥
A complete list of default digraphs is available in the documentation under :help digraphs-default.
Custom digraphs can be defined in one's .vimrc. For example, if one wants to type C-K ZZ to insert ℤ:
digraph ZZ 8484
Here, 8484 is the decimal value of the character ℤ.
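The numbers needed for a digraph definition or for the C-V u input above can be computed in Haskell itself. This is a small sketch using only Data.Char and Numeric from base. The helper names digraphLine and hexCodePoint are made up for this example.

```haskell
import Data.Char (ord)
import Numeric (showHex)

-- Builds a .vimrc line defining a digraph for the given two keys,
-- using the decimal code point of the character.
digraphLine :: String -> Char -> String
digraphLine keys c = "digraph " ++ keys ++ " " ++ show (ord c)

-- Hexadecimal code point, as typed after C-V u or C-V U.
hexCodePoint :: Char -> String
hexCodePoint c = showHex (ord c) ""
```

For example, digraphLine "ZZ" 'ℤ' gives "digraph ZZ 8484" and hexCodePoint 'ℤ' gives "2124", since ℤ is U+2124.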
Automatic Unicode Transformation
The Vim conceal definitions in haskellmode-vim display most of the usual ASCII symbols as their Unicode equivalents without changing the actual source code. In normal mode, concealed characters on the current line are displayed as ASCII. In insert mode, and on all lines other than the current one in normal mode, the Unicode characters are displayed.
4.3 SciTE
See Tips_for_using_SciTE_with_Haskell
4.4 Sublime Text 2
Syntax highlighting for the GHC Unicode syntax is not supported in the default configuration as of version 2.0.1. However, the following patch, when applied to Packages/Haskell/Haskell.tmLanguage, does enable this: https://gist.github.com/3744568
Insert the following snippet into user key bindings to conveniently type unicode operators in Haskell code: https://gist.github.com/3766192 . For example, typing "->" will automatically insert "→".
4.5 System wide
m17n input methods
A set of input methods has been written by Urs Holzer for the m17n library (http://www.m17n.org). Their main goal is to support mathematical characters, but most of the symbols used in the *-unicode-symbols packages can be typed with them as well. More information is available on the Input Methods for Mathematics page (http://www.andonyar.com/rec/2008-03/mathinput/). For most Linux distributions, just download the tarball (http://www.andonyar.com/rec/2008-03/mathinput/methods.tar.gz), extract the *.mim files to /usr/share/m17n and enable iBus for input methods.
X11 Mode_switch and Multi_key inputs
Modern X11 systems can use the Mode_switch key and xmodmap to assign another pair of symbols to input keys, and the Multi_key key to compose multiple keys into yet more symbols. Documentation on how to set this up, along with configurations designed with Haskell in mind, can be found on Mike Meyer's blog (http://blog.mired.org/2015/08/unicode-input-with-x11.html).
5 Fonts
The following free fonts have good Unicode coverage: