Library/IO
This page describes my proposal for the development of a new standard low-level I/O library -- Bulatz 09:29, 13 March 2007 (UTC)
Motivation
The existing GHC I/O library (based on Handles) is very feature-rich, but it cannot be extended any further. The reason is that the library has a non-modular design in which all features are tightly coupled with each other and with the GHC RTS. Yet we need to extend it further, adding the following facilities:
- More models for async I/O (support for kqueue, epoll, AIO)
- Unicode filenames on Windows and Unix
- Using ByteString/UTF8String/UTF16String for filenames
- Various encodings (UTF8, UTF16, ...) for text files
- Files larger than 4 GB on Windows
- Memory-mapped files
- ByteString i/o
- Binary i/o and binary serialization
Although the additional libraries ([1]-[5]) solve almost every problem mentioned here, they do not work together: you can't combine async I/O from network-alt with ByteString I/O from FPS and Char encoding routines from Streams. Moreover, most of these features are simply not available for other Haskell compilers.
On the other hand, there are competing designs for the implementation of higher-level features such as buffering and text encoding (at least Streams vs SSC). In addition, a higher-level implementation greatly depends on language extensions (such as MPTC+FD) whose support varies between Haskell compilers. As a result, I propose to develop a standard *low-level* I/O library that hides the details of interaction with the OS but does not provide any higher-level interfaces for working with files; those would be the business of other libraries.
Proposal
So, my proposal includes the following:
Prerequisites: an implementation, in FPS or some other library, of common operations on ByteString/UTF8String/UTF16String, together with a Stringable class that provides a type-independent interface to these operations:
class Stringable a where
    length :: a -> Int
    concat :: [a] -> a
    ....
instance Stringable String
instance Stringable ByteString
instance Stringable UTF8String
instance Stringable UTF16String
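To make the idea concrete, here is a minimal sketch of how such a class might look when written out for two of the four types. It covers only the two methods named above, since the rest of the method list is elided in the proposal. The helper totalLength is hypothetical and exists only to show code written once against the class. UTF8String and UTF16String are left out because no concrete library for them is assumed here.

```haskell
{-# LANGUAGE FlexibleInstances #-}
-- Sketch only: the proposal does not spell out the full method list,
-- so this class carries just the two methods it names.
import Prelude hiding (length, concat)
import qualified Prelude as P
import qualified Data.ByteString.Char8 as B

class Stringable a where
    length :: a -> Int
    concat :: [a] -> a

-- String is a type synonym for [Char], hence FlexibleInstances.
instance Stringable String where
    length = P.length
    concat = P.concat

-- Note: for ByteString, length counts bytes, not characters.
instance Stringable B.ByteString where
    length = B.length
    concat = B.concat

-- Hypothetical helper: written once, usable with every instance.
totalLength :: Stringable a => [a] -> Int
totalLength = length . concat
```

With these definitions, totalLength can be applied to a list of Strings or to a list of ByteStrings without any change to the calling code, which is the property the FilePath and Directory modules would rely on.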
The library itself should include:
- System.FilePath modules as developed by Neil Mitchell - changed to work via operations of Stringable
- System.Directory modules from Base - but changed to work via Stringable and to support Unicode filenames (on Win32 this means using one set of functions on NT and another on Win9x)
- System.File module from Streams/SSC, extended with support of Stringable, Unicode filenames and files>4GB for Win32
- System.MappedFile from Streams/SSC, ditto
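The requirement for large files comes down to the width of the offset type: a 32-bit offset cannot address positions past 4 GB (2 GB if it is signed). The following small sketch illustrates the arithmetic. The names FileOffset, needs64Bit and truncateTo32 are hypothetical and are not taken from any of the libraries listed on this page.

```haskell
import Data.Int (Int32, Int64)

-- Hypothetical offset type for the sketch: 64 bits wide on every platform.
type FileOffset = Int64

-- Largest position a signed 32-bit offset can address (2 GiB - 1).
maxOffset32 :: FileOffset
maxOffset32 = fromIntegral (maxBound :: Int32)

-- True when a position cannot be represented in a signed 32-bit offset.
needs64Bit :: FileOffset -> Bool
needs64Bit pos = pos > maxOffset32

-- What a 32-bit interface would see: the value wraps modulo 2^32.
truncateTo32 :: FileOffset -> Int32
truncateTo32 = fromIntegral

-- A position 5 GiB into a file.
fiveGiB :: FileOffset
fiveGiB = 5 * 1024 * 1024 * 1024
```

Here truncateTo32 fiveGiB yields 1073741824, that is, a seek to 5 GiB would silently land at 1 GiB. This is why a low-level file module would probably want a 64-bit (or unbounded) offset type in its interface on all platforms.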
Plus, the library should support async I/O and networking (sockets) facilities, but I'm not sure how this should be accomplished. Please edit the proposal if you know :)
Ultimately, the library would provide OS- and compiler-independent access to file I/O and file system management, with support for all modern OS facilities (Unicode filenames, async I/O...). Libraries on top of it (such as Streams and SSC) can then provide a higher-level interface with support for buffering, text encoding, locking and other OS-independent features.
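The layering described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the design of Streams or SSC: RawReader stands for the low-level layer (raw, unbuffered reads), LineReader for a higher-level library that adds buffering on top, and memoryReader is an in-memory stand-in for an OS file so that the sketch runs without touching the disk. All of these names are hypothetical.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import qualified Data.ByteString.Char8 as B

-- Hypothetical low-level interface: raw, unbuffered reads only.
newtype RawReader = RawReader
    { rawRead :: Int -> IO B.ByteString  -- read up to n bytes; empty at EOF
    }

-- In-memory stand-in for an OS file.
memoryReader :: B.ByteString -> IO RawReader
memoryReader contents = do
    ref <- newIORef contents
    return $ RawReader $ \n -> do
        rest <- readIORef ref
        let (chunk, rest') = B.splitAt n rest
        writeIORef ref rest'
        return chunk

-- Higher-level layer: line-oriented reading with its own buffer.
data LineReader = LineReader RawReader (IORef B.ByteString)

newLineReader :: RawReader -> IO LineReader
newLineReader raw = LineReader raw <$> newIORef B.empty

-- Read one line (without the trailing newline); Nothing at end of input.
readLine :: Int -> LineReader -> IO (Maybe B.ByteString)
readLine chunkSize (LineReader raw bufRef) = loop
  where
    loop = do
        buf <- readIORef bufRef
        case B.elemIndex '\n' buf of
            Just i -> do
                writeIORef bufRef (B.drop (i + 1) buf)
                return (Just (B.take i buf))
            Nothing -> do
                chunk <- rawRead raw chunkSize
                if B.null chunk
                    then do
                        -- EOF: hand back whatever is left, if anything.
                        writeIORef bufRef B.empty
                        return (if B.null buf then Nothing else Just buf)
                    else do
                        writeIORef bufRef (B.append buf chunk)
                        loop
```

The point of the split is that readLine knows nothing about the OS: any RawReader will do, whether it is backed by a file, a socket or a memory buffer, so alternative buffering or encoding libraries can share the same low-level layer.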
Additional information
List of various libraries extending I/O or providing features required for this library:
- http://www.cse.unsw.edu.au/~dons/fps.html
- Library/Streams
- http://yogimo.sakura.ne.jp/ssc/
- http://www.cs.helsinki.fi/u/ekarttun/network-alt/
- Win32Files module from http://www.haskell.org/bz/FreeArc-sources.tar.gz
- FilePath
- http://www.seas.upenn.edu/~lipeng/homepage/unify.html
- http://www.haskell.org/pipermail/libraries/2005-July/004189.html