Difference between revisions of "Simple Unix tools"

From HaskellWiki
Jump to navigation Jump to search
(some follow up links)
(Remove dead link.)
 
(32 intermediate revisions by 12 users not shown)
Line 1: Line 1:
Simple unix tools written in Haskell.
+
Simple Unix commandline tools written in Haskell.
   
This is intended as a beginners tutorial for learning Haskell from a
+
This is intended as a beginner's tutorial for learning Haskell from a
"Lets just solve things already!" point of view. The examples should
+
"Let's just solve things already!" point of view. The examples should
help give a flavour of the beauty and expressiveness of Haskell
+
help give a flavor of the beauty and expressiveness of Haskell
 
programming.
 
programming.
  +
   
 
<haskell>
 
<haskell>
--
 
-- Some unix-like tools written in elegant Haskell
 
--
 
   
  +
import Control.Monad.Instances
 
import Data.List
 
import Data.List
 
import Data.Char
 
import Data.Char
import System.IO
+
import Data.Maybe
 
import Text.Printf
 
import Text.Printf
  +
import System.Environment
 
  +
import Text.Regex.Posix
--
 
  +
-- First, a useful helper
 
  +
-- First, three helpers
--
 
input f = interact (unlines . f . lines)
+
io f = interact (unlines . f . lines)
  +
 
  +
showln = (++ "\n") . show
--
 
  +
-- The 'cat' program
 
  +
regexBool r l = l =~ r :: Bool -- simple boolean regex matching
--
 
  +
cat = input id
 
  +
-- remove duplicate lines from a file (like uniq)
 
  +
uniq = nub -- Warning: Unix uniq discards *consecutive* dupes,
--
 
  +
-- but 'nub' discards all dupes.
-- Sort a file
 
  +
--
 
sort' = input sort
+
-- repeat the input file infinitely
  +
rpt = cycle
 
  +
--
 
-- Reverse a file (tac)
 
--
 
tac = input reverse
 
 
--
 
 
-- Return the head -10 line of a file
 
-- Return the head -10 line of a file
  +
take' = take 10
--
 
  +
head' = input $ take 10
 
 
--
 
 
-- Remove the first 10 lines of a file
 
-- Remove the first 10 lines of a file
  +
drop' = drop 10
--
 
  +
drop' = input $ drop 10
 
  +
-- Return the head -1 line of a file
 
  +
head' = head
--
 
  +
 
-- Return the tail -1 line of a file
 
-- Return the tail -1 line of a file
  +
tail' = last
--
 
  +
tail' = input $ (:[]) . last
 
  +
-- return the last ten lines of a file
 
  +
tail10 = drop =<< subtract 10 . length
--
 
  +
-- remove duplicate lines from a file (like uniq)
 
  +
-- Reverse lines in a file (tac)
--
 
uniq = input nub
+
tac = reverse
  +
 
  +
-- Reverse characters on each line (rev)
--
 
  +
rev = map reverse
  +
  +
-- Reverse words on each line
  +
rev_w = map (unwords . reverse . words)
  +
  +
-- Count number of characters in a file (like wc -c)
  +
wc_c = showln . length
  +
  +
-- Count number of lines in a file, like wc -l
  +
wc_l = showln . length . lines
  +
  +
-- Count number of words in a file (like wc -w)
  +
wc_w = showln . length . words
  +
 
-- double space a file
 
-- double space a file
  +
space = intersperse ""
--
 
  +
space = input $ intersperse ""
 
  +
-- undo double space
 
  +
unspace = filter (not.null)
--
 
  +
-- repeat the input file infinitely
 
  +
-- remove the first occurrence of the line "str"
--
 
  +
remove = delete
rpt = interact cycle
 
  +
 
  +
-- make a string all upper case
--
 
  +
upper = map toUpper
-- remove the first occurence of the line "str"
 
  +
--
 
remove = input $ delete "str"
 
 
--
 
-- make a file all upper case
 
--
 
upper = interact $ map toUpper
 
 
--
 
 
-- remove leading space from each line
 
-- remove leading space from each line
  +
clean = map (dropWhile isSpace)
--
 
  +
clean = input $ map (dropWhile isSpace)
 
  +
-- remove trailing whitespace
 
  +
clean' = map (reverse . dropWhile isSpace . reverse)
--
 
  +
  +
-- delete leading and trailing whitespace
  +
clean'' = map (f . f)
  +
where f = reverse . dropWhile isSpace
  +
  +
-- insert blank space at beginning of each line
  +
blank = map (s ++)
  +
where s = replicate 8 ' '
  +
 
-- join lines of a file
 
-- join lines of a file
  +
join = return . concat
--
 
  +
join = input $ (:[]) . concat
 
 
--
 
 
-- Translate the letter 'e' to '*', like tr 'e' '*' (or y// in sed)
 
-- Translate the letter 'e' to '*', like tr 'e' '*' (or y// in sed)
  +
tr a b = interact (map f)
--
 
y = interact $ map f
+
where f c = if c == a then b else c
  +
where f 'e' = '*'
 
  +
-- Delete characters from a string.
f c = c
 
  +
tr_d a = tr a ' '
--
 
-- Filter the letter 'e' from a file, like tr -d 'e'
 
--
 
tr = interact $ filter (/= 'e')
 
   
  +
-- lines matching the regular expression "[bf]oo" from a file
--
 
  +
grep = filter (regexBool "[bf]oo")
-- Count number of characters in a file (like wc -c)
 
  +
--
 
  +
-- lines not matching the regular expression "[bf]oo" from a file
wc_c = interact $ show . length
 
  +
grep_v = filter (not . regexBool "[bf]oo")
  +
  +
-- number each line of a file
  +
num = zipWith (printf "%3d %s") [(1::Int)..]
  +
  +
-- Compute a simple cksum of a file
  +
cksum = foldl' k 5381
  +
where k h c = h * 33 + ord c
  +
  +
-- And our main wrapper
  +
main = do
  +
who <- getProgName
  +
maybe (return ()) id $ lookup who $
  +
[("blank", io blank )
  +
,("cksum", interact (showln . cksum) )
  +
,("clean", io clean'' )
  +
,("echo" , interact id )
  +
,("drop", interact drop' )
  +
,("grep", io grep )
  +
,("grep -v", io grep_v )
  +
,("head", io (return . head') )
  +
,("join", io join )
  +
,("num", io num )
  +
,("remove", io (remove "str") )
  +
,("revw", io rev_w )
  +
,("reverse", io rev )
  +
,("reverseword", io rev_w )
  +
,("rpt", io rpt )
  +
,("sort", interact sort )
  +
,("space", io space )
  +
,("tac", interact tac )
  +
,("take", io take' )
  +
,("tail", io (return . tail') )
  +
-- ,( "tr" , interact tr)
  +
-- ,( "tr -d", interact (tr_d . unwords))
  +
,("unspace", io unspace )
  +
,("upper", interact upper )
  +
,("uniq", interact uniq )
  +
,("wc_c", interact wc_c )
  +
,("wc_l", interact wc_l )
  +
,("wc_w", interact wc_w )
  +
]
   
  +
</haskell>
--
 
-- Count number of lines in a file, like wc -l
 
--
 
wc_l = interact $ show . length . lines
 
   
  +
==How to run==
--
 
  +
These functions can be executed as one liners from a shell. For example,
-- Count number of words in a file (like wc -w)
 
  +
to use the Haskell version of 'wc':
--
 
wc_w = interact $ show . length . words
 
   
  +
$ cat file.txt | ghc -e 'wc_l' UnixTools.hs
--
 
-- grep lines matching "^foo" from a file
 
--
 
grep = input $ filter (isPrefixOf "foo")
 
   
  +
Or, one could define 'main' to be a chosen tool/function (add a line to
--
 
  +
the effect that "main = wc_l") and then compile the tool with
-- grep lines that don't match "^foo" (grep -v)
 
--
 
grep_v = input $ filter (not . isPrefixOf "foo")
 
   
  +
$ ghc --make UnixTools.hs
--
 
-- number each line of a file
 
--
 
num = input $ zipWith (printf "%3d %s") [1::Int..]
 
   
  +
The given Haskell codes presents yet a third way of doing things: much
--
 
  +
like the [http://en.wikipedia.org/wiki/BusyBox BusyBox] suite of Unix
-- Compute a simple hash of a file
 
  +
tools, it is possible to compile a single monolithic binary and have it
--
 
  +
detect what name it is run by and then act appropriately. This is the
cksum = interact $ printf "%u\n" . foldl' k 5381
 
  +
approach the following code takes: you can compile it and then make
where k h c = h * 33 + fromIntegral (ord c) :: Int
 
  +
symbolic links (like <code>"ln -s UnixTools echo; ln -s UnixTools cat"
</haskell>
 
  +
</code>) and then run those commands (<code>"./echo foo | ./cat"</code>
  +
would produce output of "foo").
   
'''Where to now?'''
 
   
  +
  +
==Where to now?==
  +
* [[Haskell|Haskell.org]]
 
* The Haskell standard [http://www.cse.unsw.edu.au/~dons/data/List.html list library], with [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-List.html docs]
 
* The Haskell standard [http://www.cse.unsw.edu.au/~dons/data/List.html list library], with [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-List.html docs]
 
* Alternative [[Wc|implementations]] of the wc program
 
* Alternative [[Wc|implementations]] of the wc program
* [[Example_code|More]] Haskell code
+
* Learn how to [[Introduction to QuickCheck|test Haskell code]]
  +
* [[Example code|More]] Haskell code
* Haskell for [[Libraries_and_tools/Operating_system#Shell|shell scripting]]
 
  +
* Haskell for [[Applications and libraries/Operating system#Shell|shell scripting]]
 
* Export list functions to the shell with [http://www.cse.unsw.edu.au/~dons/h4sh.html h4sh]
 
* Export list functions to the shell with [http://www.cse.unsw.edu.au/~dons/h4sh.html h4sh]
  +
* [[Checking for correct invocation of a command line haskell program]]
 
  +
* [[Poor man's here document]]
 
[[Category:Tutorials]]
 
[[Category:Tutorials]]
  +
[[Category:Code]]

Latest revision as of 05:27, 8 February 2016

Simple Unix commandline tools written in Haskell.

This is intended as a beginner's tutorial for learning Haskell from a "Let's just solve things already!" point of view. The examples should help give a flavor of the beauty and expressiveness of Haskell programming.


import Control.Monad.Instances
import Data.List
import Data.Char
import Data.Maybe
import Text.Printf
import System.Environment
import Text.Regex.Posix
 
-- First, three helpers
io f = interact (unlines . f . lines)
 
showln  = (++ "\n") . show
 
regexBool r l = l =~ r :: Bool -- simple boolean regex matching
 
-- remove duplicate lines from a file (like uniq)
uniq    = nub   -- Warning: Unix uniq discards *consecutive* dupes,
                -- but 'nub' discards all dupes.
 
-- repeat the input file infinitely
rpt     = cycle
 
-- Return the head -10 line of a file
take'   = take 10
 
-- Remove the first 10 lines of a file
drop'   = drop 10
 
-- Return the head -1 line of a file
head'   = head
 
-- Return the tail -1 line of a file
tail'   = last
 
-- return the last ten lines of a file
tail10  = drop =<< subtract 10 . length
 
-- Reverse lines in a file (tac)
tac     = reverse
 
-- Reverse characters on each line (rev)
rev     = map reverse
 
-- Reverse words on each line
rev_w   = map (unwords . reverse . words)
 
-- Count number of characters in a file (like wc -c)
wc_c    = showln . length
 
-- Count number of lines in a file, like wc -l
wc_l    = showln . length . lines
 
-- Count number of words in a file (like wc -w)
wc_w    = showln . length . words
 
-- double space a file
space   = intersperse ""
 
-- undo double space
unspace = filter (not.null)
 
-- remove the first occurrence of the line "str"
remove  = delete
 
-- make a string all upper case
upper   = map toUpper
 
-- remove leading space from each line
clean   = map (dropWhile isSpace)
 
-- remove trailing whitespace
clean'  = map (reverse . dropWhile isSpace . reverse)
 
-- delete leading and trailing whitespace
clean'' = map (f . f)
    where f = reverse . dropWhile isSpace
 
-- insert blank space at beginning of each line
blank   = map (s ++)
     where s = replicate 8 ' '
 
-- join lines of a file
join = return . concat
 
-- Translate the letter 'e' to '*', like tr 'e' '*' (or y// in sed)
tr a b = interact (map f)
    where f c = if c == a then b else c
 
-- Delete characters from a string.
tr_d a = tr a ' '

-- lines matching the regular expression "[bf]oo" from a file
grep = filter (regexBool "[bf]oo")
 
-- lines not matching the regular expression "[bf]oo" from a file
grep_v  = filter (not . regexBool "[bf]oo")
 
-- number each line of a file
num  = zipWith (printf "%3d %s") [(1::Int)..]
 
-- Compute a simple cksum of a file
cksum   =  foldl' k 5381
   where k h c = h * 33 + ord c
 
-- And our main wrapper
main = do
    who <- getProgName
    maybe (return ()) id $ lookup who $
        [("blank",       io blank                  )
        ,("cksum",       interact (showln . cksum) )
        ,("clean",       io clean''                )
        ,("echo" ,       interact id               )
        ,("drop",        interact drop'            )
        ,("grep",        io grep                   )
        ,("grep -v",     io grep_v                 )
        ,("head",        io (return . head')       )
        ,("join",        io join                   )
        ,("num",         io num                    )
        ,("remove",      io (remove "str")         )
        ,("revw",        io rev_w                  )
        ,("reverse",     io rev                    )
        ,("reverseword", io rev_w                  )
        ,("rpt",         io rpt                    )
        ,("sort",        interact sort             )
        ,("space",       io space                  )
        ,("tac",         interact tac              )
        ,("take",        io take'                  )
        ,("tail",        io (return . tail')       )
    --  ,( "tr"  ,    interact tr)
    --  ,( "tr -d",   interact (tr_d . unwords))
        ,("unspace",     io unspace                )
        ,("upper",       interact upper            )
        ,("uniq",        interact uniq             )
        ,("wc_c",        interact wc_c             )
        ,("wc_l",        interact wc_l             )
        ,("wc_w",        interact wc_w             )
        ]

How to run

These functions can be executed as one liners from a shell. For example, to use the Haskell version of 'wc':

   $ cat file.txt | ghc -e 'wc_l' UnixTools.hs

Or, one could define 'main' to be a chosen tool/function (add a line to the effect that "main = wc_l") and then compile the tool with

   $ ghc --make UnixTools.hs

The given Haskell codes presents yet a third way of doing things: much like the BusyBox suite of Unix tools, it is possible to compile a single monolithic binary and have it detect what name it is run by and then act appropriately. This is the approach the following code takes: you can compile it and then make symbolic links (like "ln -s UnixTools echo; ln -s UnixTools cat" ) and then run those commands ("./echo foo | ./cat" would produce output of "foo").


Where to now?