Difference between revisions of "Simple Unix tools"
(can be improved a bit more) |
(Remove dead link.) |
||
(20 intermediate revisions by 11 users not shown) | |||
Line 1: | Line 1: | ||
Simple Unix commandline tools written in Haskell. |
Simple Unix commandline tools written in Haskell. |
||
− | This is intended as a beginner's tutorial for learning Haskell from a |
+ | This is intended as a beginner's tutorial for learning Haskell from a |
"Let's just solve things already!" point of view. The examples should |
"Let's just solve things already!" point of view. The examples should |
||
help give a flavor of the beauty and expressiveness of Haskell |
help give a flavor of the beauty and expressiveness of Haskell |
||
programming. |
programming. |
||
− | These functions can be executed as one liners from a shell. For example, to use the Haskell version of 'wc': |
||
− | |||
− | $ cat file.txt | ghc -e 'wc_l' UnixTools.hs |
||
− | |||
− | Or, one could define 'main' to be a chosen tool/function (add a line to the effect that "main = wc_l") and then compile the tool with |
||
− | $ ghc --make UnixTools.hs |
||
− | |||
− | The given Haskell codes presents yet a third way of doing things: much like the BusyBox[http://en.wikipedia.org/wiki/BusyBox] suite of Unix tools, it is possible to compile a single monolithic binary and have it detect what name it is run by and then act appropriately. This is the approach the following code takes: you can compile it and then make symbolic links (like "ln -s UnixTools echo") and then run those commands ("echo foo | ./echo" would produce output of "foo"). |
||
<haskell> |
<haskell> |
||
− | -- Some Unix-like tools written in simple, clean Haskell |
||
− | module Main |
||
− | where |
||
− | import Data.List ( sort, nub, intersperse, delete, isPrefixOf, foldl' ) |
||
− | import Data.Char ( toUpper, isSpace, ord ) |
||
− | import Text.Printf ( printf ) |
||
− | import System.Environment ( getProgName ) |
||
+ | import Control.Monad.Instances |
||
− | main = do |
||
+ | import Data.List |
||
− | calledBy <- getProgName |
||
+ | import Data.Char |
||
− | case calledBy of |
||
+ | import Data.Maybe |
||
− | "blank" -> io blank |
||
+ | import Text.Printf |
||
− | "cksum" -> interact (showln . cksum) |
||
+ | import System.Environment |
||
− | "clean" -> io clean'' |
||
+ | import Text.Regex.Posix |
||
− | "echo" -> interact id -- not perfect. what if echoed thing is an argument and not on stdin? |
||
+ | |||
− | "drop" -> interact drop' |
||
+ | -- First, three helpers |
||
− | "grep" -> io grep |
||
− | "grep -v" -> io grep_v |
||
− | "head" -> io (return . head') |
||
− | "join" -> io join |
||
− | "num" -> io num |
||
− | "remove" -> io (remove "str") |
||
− | "revw" -> io revw |
||
− | "reverse" -> io rev |
||
− | "reverseword" -> io revw |
||
− | "rpt" -> io rpt |
||
− | "sort" -> interact sort |
||
− | "space" -> io space |
||
− | "tac" -> interact tac |
||
− | "take" -> io take' |
||
− | "tail" -> io (return . tail') |
||
− | --"tr" -> interact tr |
||
− | --"tr -d" -> interact (trd . unwords) |
||
− | "unspace" -> io unspace |
||
− | "upper" -> interact upper |
||
− | "uniq" -> interact uniq |
||
− | "wc_c" -> interact wc_c |
||
− | "wc_l" -> interact wc_l |
||
− | "wc_w" -> interact wc_w |
||
− | |||
− | -- First, two helpers |
||
− | io :: ([String] -> [String]) -> IO () |
||
io f = interact (unlines . f . lines) |
io f = interact (unlines . f . lines) |
||
+ | |||
− | |||
− | showln :: Int -> String |
||
showln = (++ "\n") . show |
showln = (++ "\n") . show |
||
+ | |||
− | |||
+ | regexBool r l = l =~ r :: Bool -- simple boolean regex matching |
||
− | -- Sort a file |
||
+ | |||
− | sort' :: String -> String |
||
− | sort' = sort |
||
− | |||
-- remove duplicate lines from a file (like uniq) |
-- remove duplicate lines from a file (like uniq) |
||
+ | uniq = nub -- Warning: Unix uniq discards *consecutive* dupes, |
||
− | uniq :: String -> String |
||
+ | -- but 'nub' discards all dupes. |
||
− | uniq = nub |
||
+ | |||
− | |||
-- repeat the input file infinitely |
-- repeat the input file infinitely |
||
− | rpt |
+ | rpt = cycle |
+ | |||
− | rpt = cycle |
||
− | |||
-- Return the head -10 line of a file |
-- Return the head -10 line of a file |
||
− | take' :: [String] -> [String] |
||
take' = take 10 |
take' = take 10 |
||
+ | |||
− | |||
-- Remove the first 10 lines of a file |
-- Remove the first 10 lines of a file |
||
− | drop' :: String -> String |
||
drop' = drop 10 |
drop' = drop 10 |
||
+ | |||
− | |||
-- Return the head -1 line of a file |
-- Return the head -1 line of a file |
||
− | head' :: [String] -> String |
||
head' = head |
head' = head |
||
+ | |||
− | |||
-- Return the tail -1 line of a file |
-- Return the tail -1 line of a file |
||
− | tail' :: [String] -> String |
||
tail' = last |
tail' = last |
||
+ | |||
− | |||
+ | -- return the last ten lines of a file |
||
+ | tail10 = drop =<< subtract 10 . length |
||
+ | |||
-- Reverse lines in a file (tac) |
-- Reverse lines in a file (tac) |
||
− | tac |
+ | tac = reverse |
+ | |||
− | tac = reverse |
||
− | |||
-- Reverse characters on each line (rev) |
-- Reverse characters on each line (rev) |
||
− | rev |
+ | rev = map reverse |
+ | |||
− | rev = map reverse |
||
− | |||
-- Reverse words on each line |
-- Reverse words on each line |
||
+ | rev_w = map (unwords . reverse . words) |
||
− | revw :: [String] -> [String] |
||
+ | |||
− | revw = map (unwords. reverse . words) |
||
− | |||
-- Count number of characters in a file (like wc -c) |
-- Count number of characters in a file (like wc -c) |
||
− | wc_c |
+ | wc_c = showln . length |
+ | |||
− | wc_c = showln . length |
||
− | |||
-- Count number of lines in a file, like wc -l |
-- Count number of lines in a file, like wc -l |
||
+ | wc_l = showln . length . lines |
||
− | wc_l :: String -> String |
||
+ | |||
− | wc_l = showln . length . lines |
||
− | |||
-- Count number of words in a file (like wc -w) |
-- Count number of words in a file (like wc -w) |
||
+ | wc_w = showln . length . words |
||
− | wc_w :: String -> String |
||
+ | |||
− | wc_w = showln . length . words |
||
− | |||
-- double space a file |
-- double space a file |
||
− | space |
+ | space = intersperse "" |
+ | |||
− | space = intersperse "" |
||
− | |||
-- undo double space |
-- undo double space |
||
− | unspace :: [String] -> [String] |
||
unspace = filter (not.null) |
unspace = filter (not.null) |
||
+ | |||
− | |||
-- remove the first occurrence of the line "str" |
-- remove the first occurrence of the line "str" |
||
− | remove |
+ | remove = delete |
+ | |||
− | remove a = delete a |
||
− | |||
-- make a string all upper case |
-- make a string all upper case |
||
− | upper :: String -> String |
||
upper = map toUpper |
upper = map toUpper |
||
+ | |||
− | |||
-- remove leading space from each line |
-- remove leading space from each line |
||
− | clean :: [String] -> [String] |
||
clean = map (dropWhile isSpace) |
clean = map (dropWhile isSpace) |
||
+ | |||
− | |||
-- remove trailing whitespace |
-- remove trailing whitespace |
||
+ | clean' = map (reverse . dropWhile isSpace . reverse) |
||
− | clean' :: [String] -> [String] |
||
+ | |||
− | clean' = (map f) |
||
− | where f = reverse . dropWhile isSpace . reverse |
||
− | |||
-- delete leading and trailing whitespace |
-- delete leading and trailing whitespace |
||
− | clean'' :: [String] -> [String] |
||
clean'' = map (f . f) |
clean'' = map (f . f) |
||
− | + | where f = reverse . dropWhile isSpace |
|
+ | |||
− | |||
-- insert blank space at beginning of each line |
-- insert blank space at beginning of each line |
||
− | blank :: [String] -> [String] |
||
blank = map (s ++) |
blank = map (s ++) |
||
− | + | where s = replicate 8 ' ' |
|
+ | |||
− | |||
-- join lines of a file |
-- join lines of a file |
||
− | join :: [String] -> [String] |
||
join = return . concat |
join = return . concat |
||
+ | |||
− | |||
-- Translate the letter 'e' to '*', like tr 'e' '*' (or y// in sed) |
-- Translate the letter 'e' to '*', like tr 'e' '*' (or y// in sed) |
||
− | tr :: Char -> Char -> IO () |
||
tr a b = interact (map f) |
tr a b = interact (map f) |
||
− | + | where f c = if c == a then b else c |
|
+ | |||
− | |||
-- Delete characters from a string. |
-- Delete characters from a string. |
||
− | + | tr_d a = tr a ' ' |
|
− | trd a = tr a ' ' |
||
− | |||
− | -- grep lines matching "^foo" from a file |
||
− | grep :: [String] -> [String] |
||
− | grep = filter (isPrefixOf "foo") |
||
− | |||
− | -- grep lines that don't match "^foo" (grep -v) |
||
− | grep_v :: [String] -> [String] |
||
− | grep_v = filter (not . isPrefixOf "foo") |
||
+ | -- lines matching the regular expression "[bf]oo" from a file |
||
+ | grep = filter (regexBool "[bf]oo") |
||
+ | |||
+ | -- lines not matching the regular expression "[bf]oo" from a file |
||
+ | grep_v = filter (not . regexBool "[bf]oo") |
||
+ | |||
-- number each line of a file |
-- number each line of a file |
||
− | num :: [String] -> [String] |
||
num = zipWith (printf "%3d %s") [(1::Int)..] |
num = zipWith (printf "%3d %s") [(1::Int)..] |
||
+ | |||
− | |||
-- Compute a simple cksum of a file |
-- Compute a simple cksum of a file |
||
− | cksum :: [Char] -> Int |
||
cksum = foldl' k 5381 |
cksum = foldl' k 5381 |
||
where k h c = h * 33 + ord c |
where k h c = h * 33 + ord c |
||
+ | |||
+ | -- And our main wrapper |
||
+ | main = do |
||
+ | who <- getProgName |
||
+ | maybe (return ()) id $ lookup who $ |
||
+ | [("blank", io blank ) |
||
+ | ,("cksum", interact (showln . cksum) ) |
||
+ | ,("clean", io clean'' ) |
||
+ | ,("echo" , interact id ) |
||
+ | ,("drop", interact drop' ) |
||
+ | ,("grep", io grep ) |
||
+ | ,("grep -v", io grep_v ) |
||
+ | ,("head", io (return . head') ) |
||
+ | ,("join", io join ) |
||
+ | ,("num", io num ) |
||
+ | ,("remove", io (remove "str") ) |
||
+ | ,("revw", io rev_w ) |
||
+ | ,("reverse", io rev ) |
||
+ | ,("reverseword", io rev_w ) |
||
+ | ,("rpt", io rpt ) |
||
+ | ,("sort", interact sort ) |
||
+ | ,("space", io space ) |
||
+ | ,("tac", interact tac ) |
||
+ | ,("take", io take' ) |
||
+ | ,("tail", io (return . tail') ) |
||
+ | -- ,( "tr" , interact tr) |
||
+ | -- ,( "tr -d", interact (tr_d . unwords)) |
||
+ | ,("unspace", io unspace ) |
||
+ | ,("upper", interact upper ) |
||
+ | ,("uniq", interact uniq ) |
||
+ | ,("wc_c", interact wc_c ) |
||
+ | ,("wc_l", interact wc_l ) |
||
+ | ,("wc_w", interact wc_w ) |
||
+ | ] |
||
+ | |||
</haskell> |
</haskell> |
||
+ | |||
+ | ==How to run== |
||
+ | These functions can be executed as one liners from a shell. For example, |
||
+ | to use the Haskell version of 'wc': |
||
+ | |||
+ | $ cat file.txt | ghc -e 'wc_l' UnixTools.hs |
||
+ | |||
+ | Or, one could define 'main' to be a chosen tool/function (add a line to |
||
+ | the effect that "main = wc_l") and then compile the tool with |
||
+ | |||
+ | $ ghc --make UnixTools.hs |
||
+ | |||
+ | The given Haskell codes presents yet a third way of doing things: much |
||
+ | like the [http://en.wikipedia.org/wiki/BusyBox BusyBox] suite of Unix |
||
+ | tools, it is possible to compile a single monolithic binary and have it |
||
+ | detect what name it is run by and then act appropriately. This is the |
||
+ | approach the following code takes: you can compile it and then make |
||
+ | symbolic links (like <code>"ln -s UnixTools echo; ln -s UnixTools cat" |
||
+ | </code>) and then run those commands (<code>"./echo foo | ./cat"</code> |
||
+ | would produce output of "foo"). |
||
+ | |||
Line 186: | Line 175: | ||
* The Haskell standard [http://www.cse.unsw.edu.au/~dons/data/List.html list library], with [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-List.html docs] |
* The Haskell standard [http://www.cse.unsw.edu.au/~dons/data/List.html list library], with [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-List.html docs] |
||
* Alternative [[Wc|implementations]] of the wc program |
* Alternative [[Wc|implementations]] of the wc program |
||
− | * Learn how to [[ |
+ | * Learn how to [[Introduction to QuickCheck|test Haskell code]] |
− | * [[ |
+ | * [[Example code|More]] Haskell code |
− | * Haskell for [[ |
+ | * Haskell for [[Applications and libraries/Operating system#Shell|shell scripting]] |
* Export list functions to the shell with [http://www.cse.unsw.edu.au/~dons/h4sh.html h4sh] |
* Export list functions to the shell with [http://www.cse.unsw.edu.au/~dons/h4sh.html h4sh] |
||
+ | * [[Checking for correct invocation of a command line haskell program]] |
||
− | |||
+ | * [[Poor man's here document]] |
||
[[Category:Tutorials]] |
[[Category:Tutorials]] |
||
[[Category:Code]] |
[[Category:Code]] |
Latest revision as of 05:27, 8 February 2016
Simple Unix commandline tools written in Haskell.
This is intended as a beginner's tutorial for learning Haskell from a "Let's just solve things already!" point of view. The examples should help give a flavor of the beauty and expressiveness of Haskell programming.
import Control.Monad.Instances
import Data.List
import Data.Char
import Data.Maybe
import Text.Printf
import System.Environment
import Text.Regex.Posix
-- First, three helpers
io f = interact (unlines . f . lines)
showln = (++ "\n") . show
regexBool r l = l =~ r :: Bool -- simple boolean regex matching
-- remove duplicate lines from a file (like uniq)
uniq = nub -- Warning: Unix uniq discards *consecutive* dupes,
-- but 'nub' discards all dupes.
-- repeat the input file infinitely
rpt = cycle
-- Return the head -10 line of a file
take' = take 10
-- Remove the first 10 lines of a file
drop' = drop 10
-- Return the head -1 line of a file
head' = head
-- Return the tail -1 line of a file
tail' = last
-- return the last ten lines of a file
tail10 = drop =<< subtract 10 . length
-- Reverse lines in a file (tac)
tac = reverse
-- Reverse characters on each line (rev)
rev = map reverse
-- Reverse words on each line
rev_w = map (unwords . reverse . words)
-- Count number of characters in a file (like wc -c)
wc_c = showln . length
-- Count number of lines in a file, like wc -l
wc_l = showln . length . lines
-- Count number of words in a file (like wc -w)
wc_w = showln . length . words
-- double space a file
space = intersperse ""
-- undo double space
unspace = filter (not.null)
-- remove the first occurrence of the line "str"
remove = delete
-- make a string all upper case
upper = map toUpper
-- remove leading space from each line
clean = map (dropWhile isSpace)
-- remove trailing whitespace
clean' = map (reverse . dropWhile isSpace . reverse)
-- delete leading and trailing whitespace
clean'' = map (f . f)
where f = reverse . dropWhile isSpace
-- insert blank space at beginning of each line
blank = map (s ++)
where s = replicate 8 ' '
-- join lines of a file
join = return . concat
-- Translate the letter 'e' to '*', like tr 'e' '*' (or y// in sed)
tr a b = interact (map f)
where f c = if c == a then b else c
-- Delete characters from a string.
tr_d a = tr a ' '
-- lines matching the regular expression "[bf]oo" from a file
grep = filter (regexBool "[bf]oo")
-- lines not matching the regular expression "[bf]oo" from a file
grep_v = filter (not . regexBool "[bf]oo")
-- number each line of a file
num = zipWith (printf "%3d %s") [(1::Int)..]
-- Compute a simple cksum of a file
cksum = foldl' k 5381
where k h c = h * 33 + ord c
-- And our main wrapper
main = do
who <- getProgName
maybe (return ()) id $ lookup who $
[("blank", io blank )
,("cksum", interact (showln . cksum) )
,("clean", io clean'' )
,("echo" , interact id )
,("drop", interact drop' )
,("grep", io grep )
,("grep -v", io grep_v )
,("head", io (return . head') )
,("join", io join )
,("num", io num )
,("remove", io (remove "str") )
,("revw", io rev_w )
,("reverse", io rev )
,("reverseword", io rev_w )
,("rpt", io rpt )
,("sort", interact sort )
,("space", io space )
,("tac", interact tac )
,("take", io take' )
,("tail", io (return . tail') )
-- ,( "tr" , interact tr)
-- ,( "tr -d", interact (tr_d . unwords))
,("unspace", io unspace )
,("upper", interact upper )
,("uniq", interact uniq )
,("wc_c", interact wc_c )
,("wc_l", interact wc_l )
,("wc_w", interact wc_w )
]
How to run
These functions can be executed as one liners from a shell. For example, to use the Haskell version of 'wc':
$ cat file.txt | ghc -e 'wc_l' UnixTools.hs
Or, one could define 'main' to be a chosen tool/function (add a line to the effect that "main = wc_l") and then compile the tool with
$ ghc --make UnixTools.hs
The given Haskell codes presents yet a third way of doing things: much
like the BusyBox suite of Unix
tools, it is possible to compile a single monolithic binary and have it
detect what name it is run by and then act appropriately. This is the
approach the following code takes: you can compile it and then make
symbolic links (like "ln -s UnixTools echo; ln -s UnixTools cat"
) and then run those commands ("./echo foo | ./cat"
would produce output of "foo").
Where to now?
- Haskell.org
- The Haskell standard list library, with docs
- Alternative implementations of the wc program
- Learn how to test Haskell code
- More Haskell code
- Haskell for shell scripting
- Export list functions to the shell with h4sh
- Checking for correct invocation of a command line haskell program
- Poor man's here document