Difference between revisions of "NIO"
JohanTibell (talk | contribs) |
Latest revision as of 21:13, 10 August 2008
New I/O
This is a new I/O library for Haskell that is intended to provide a high-performance API that makes good use of advanced operating system facilities for I/O.
Rationale and Goals
Haskell 98 specifies a number of I/O actions. All these actions accept and return Strings. However, Strings perform badly and waste space. They are also conceptually the wrong type for many operations. For example, sockets send and receive bytes, while file I/O often deals in characters, and yet both use String. Sockets should instead use a data type that represents binary data, such as ByteString.
Furthermore, the Haskell 98 I/O actions do not make use of efficient operating system APIs for asynchronous I/O like epoll.
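To make the String versus ByteString distinction concrete, here is a minimal sketch using the existing bytestring package. The names payload and asString are made up for this illustration and are not part of the proposed library.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- A binary payload as a socket might deliver it: raw bytes, not characters.
-- (Illustrative value only.)
payload :: B.ByteString
payload = B.pack [0x48, 0x54, 0x54, 0x50, 0x00, 0xff]

-- The same data forced into a String becomes a linked list of Char,
-- one heap-allocated list cell per byte.
asString :: String
asString = BC.unpack payload

main :: IO ()
main = do
  print (B.length payload)   -- number of bytes in the packed buffer
  print (B.unpack payload)   -- the bytes as a list of Word8
  print (length asString)    -- same element count, but stored as a list
```

The ByteString keeps the bytes in one contiguous buffer, which is the representation the operating system's read and write calls work with.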
Background Study
To get a good idea of the different possible trade-offs in designing an I/O library, here is an overview of what I/O libraries look like in other programming languages.
Java
While Java's first I/O library was built using streams, the new I/O library, dubbed NIO, uses a similar concept called channels. The two basic channels, ReadableByteChannel and WritableByteChannel, have a very narrow interface, providing only a single read and a single write function respectively. These two functions operate on ByteBuffers. ByteBuffers are mutable buffers that keep track of the next position available for writing and reading. Since the buffers can be allocated in a memory region used by the operating system for its native I/O operations, additional copying can be avoided and the CPU might not have to be involved in the data transfer at all.
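As a rough illustration of how such a narrow channel interface might look in Haskell, the sketch below defines two hypothetical type classes with one operation each, plus an in-memory channel to exercise them. None of these names (ReadableChannel, WritableChannel, MemChannel, chanRead, chanWrite) come from the proposal or from Java; they are assumptions made for this example, and an immutable ByteString stands in for Java's mutable ByteBuffer.

```haskell
import qualified Data.ByteString as B
import Data.IORef (IORef, newIORef, readIORef, writeIORef, modifyIORef')

-- Hypothetical analogue of ReadableByteChannel: a single read operation
-- that returns at most the requested number of bytes.
class ReadableChannel c where
  chanRead :: c -> Int -> IO B.ByteString

-- Hypothetical analogue of WritableByteChannel: a single write operation
-- that returns the number of bytes written.
class WritableChannel c where
  chanWrite :: c -> B.ByteString -> IO Int

-- An in-memory channel, used only to exercise the two classes.
newtype MemChannel = MemChannel (IORef B.ByteString)

newMemChannel :: IO MemChannel
newMemChannel = MemChannel <$> newIORef B.empty

instance ReadableChannel MemChannel where
  chanRead (MemChannel ref) n = do
    buf <- readIORef ref
    let (out, rest) = B.splitAt n buf
    writeIORef ref rest          -- consume the bytes that were read
    return out

instance WritableChannel MemChannel where
  chanWrite (MemChannel ref) bs = do
    modifyIORef' ref (`B.append` bs)  -- append to the pending data
    return (B.length bs)

main :: IO ()
main = do
  ch <- newMemChannel
  written <- chanWrite ch (B.pack [1, 2, 3, 4, 5])
  print written
  firstTwo <- chanRead ch 2
  print (B.unpack firstTwo)
  rest <- chanRead ch 10
  print (B.unpack rest)
```

Keeping each class to one operation mirrors the narrowness of the Java interfaces: anything that can produce or consume bytes can be an instance.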
Available OS APIs for asynchronous I/O
epoll
Linux has provided epoll, a more efficient version of the older poll API, since kernel version 2.5.44. The man page describes epoll:
"An epoll set is connected to a file descriptor created by epoll_create(2). Interest for certain file descriptors is then registered via epoll_ctl(2). Finally, the actual wait is started by epoll_wait(2)."
The API provides the following functions:
#include <sys/epoll.h>

int epoll_create(int size);
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
int epoll_wait(int epfd, struct epoll_event *events,
               int maxevents, int timeout);
int epoll_pwait(int epfd, struct epoll_event *events,
                int maxevents, int timeout,
                const sigset_t *sigmask);
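A Haskell library would reach these functions through the foreign function interface. The following Linux-only sketch binds just epoll_create and close, which is enough to show the pattern. The wrapper names EPollFd, epollCreate and epollClose are hypothetical, and binding epoll_ctl and epoll_wait would additionally need a Storable instance for struct epoll_event, which is left out here.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CInt(..))
import Foreign.C.Error (throwErrnoIfMinus1, throwErrnoIfMinus1_)

-- Raw bindings to the C functions (Linux only).
foreign import ccall unsafe "sys/epoll.h epoll_create"
  c_epoll_create :: CInt -> IO CInt

foreign import ccall unsafe "unistd.h close"
  c_close :: CInt -> IO CInt

-- Hypothetical wrapper type so an epoll descriptor is not confused
-- with an ordinary file descriptor.
newtype EPollFd = EPollFd CInt

-- Create an epoll instance; the size argument is only a hint but must
-- be greater than zero. Throws an IOError built from errno on failure.
epollCreate :: Int -> IO EPollFd
epollCreate sizeHint =
  EPollFd <$> throwErrnoIfMinus1 "epoll_create"
                (c_epoll_create (fromIntegral sizeHint))

-- Release the epoll instance.
epollClose :: EPollFd -> IO ()
epollClose (EPollFd fd) = throwErrnoIfMinus1_ "close" (c_close fd)

main :: IO ()
main = do
  ep <- epollCreate 1
  putStrLn "created an epoll instance"
  epollClose ep
```

Converting a return value of -1 into an exception at the binding layer keeps errno handling out of the higher-level API.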
Raw I/O
The new I/O library resides in the New I/O (NIO) module.
module System.Nio
All I/O actions deal in terms of ByteStrings.
import Data.ByteString
read :: Handle -> Int -> IO ByteString
write :: Handle -> ByteString -> IO Int
tell :: Handle -> IO Integer
seek :: Handle -> SeekMode -> Integer -> IO ()
close :: Handle -> IO ()
truncate :: Handle -> Integer -> IO () -- should throw some kind of exception
isReadable :: Handle -> IO Bool
isWritable :: Handle -> IO Bool
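One way to experiment with these signatures today is to define them on top of the existing System.IO and Data.ByteString functions. The sketch below does that. It assumes that Handle is the standard System.IO handle, which the proposal does not state, and the temporary file name is arbitrary.

```haskell
module Main where

import Prelude hiding (read, truncate)
import qualified Data.ByteString as B
import System.Directory (removeFile)
import System.IO
  ( Handle, SeekMode(..), hClose, hIsReadable, hIsWritable
  , hSeek, hSetFileSize, hTell, openBinaryTempFile )

-- Read at most n bytes from the handle.
read :: Handle -> Int -> IO B.ByteString
read = B.hGet

-- Write all bytes and report how many were written.
write :: Handle -> B.ByteString -> IO Int
write h bs = B.hPut h bs >> return (B.length bs)

tell :: Handle -> IO Integer
tell = hTell

seek :: Handle -> SeekMode -> Integer -> IO ()
seek = hSeek

close :: Handle -> IO ()
close = hClose

-- Resize the underlying file; hSetFileSize raises an IOError on failure.
truncate :: Handle -> Integer -> IO ()
truncate = hSetFileSize

isReadable :: Handle -> IO Bool
isReadable = hIsReadable

isWritable :: Handle -> IO Bool
isWritable = hIsWritable

main :: IO ()
main = do
  (path, h) <- openBinaryTempFile "." "nio-demo.bin"
  n <- write h (B.pack [1, 2, 3, 4])
  print n
  seek h AbsoluteSeek 1
  bs <- read h 2
  print (B.unpack bs)
  pos <- tell h
  print pos
  truncate h 2
  close h
  removeFile path
```

Because read and truncate clash with Prelude names, a real module with these exports would probably be imported qualified.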