User:Benmachine/New network package

From HaskellWiki

Origin

This concept originally arose as "fix up the network package" to address the following perceived flaws:

  • Timeout socket options are unusable because setSocketOption takes an Int, but these options expect a struct timeval. This can't be worked around, because the underlying C import is not exposed and is in any case imported with the wrong type.
  • Only Network.URI uses parsec, and most users of network probably don't use that module, so the extra dependency is probably unnecessary. [Note added by MichaelT: The case for making Network.URI separate may be overdetermined: many Hackage packages use, or have used, Network.URI but nothing else from network, which saddles them with unnecessary dependencies (e.g. on a C compiler!).]
  • The strange behaviour of UnixSocket with connectTo and accept leads me to believe that perhaps the address datatypes aren't well thought-out, as the API design allows for nonsensical requests to be made.
  • The API is conditionally exposed based on what symbols are or are not defined on the compilation platform: in a sense this is good because unavailable APIs are caught at compile time, but the error message you get in this case is awkward and has led to spurious bug reports and confusion. It also doesn't seem possible to account for these API differences without yourself using CPP, which seems clumsy to me.
  • The PortNumber newtype defines a Num instance, even though arithmetic on ports (subtracting one port from another, say) rarely makes sense. It also lacks a Read instance.
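To make the first flaw in the list concrete: a timeout option such as SO_RCVTIMEO expects a struct timeval, which holds seconds and microseconds as two fields, so a single Int cannot describe it. The following is only a minimal sketch of what a Haskell-side value might look like. The names TimeVal and fromMicros are hypothetical, and the Storable instance is left out on purpose, because the field offsets are platform-specific and would normally come from hsc2hs.

```haskell
import Data.Int (Int64)

-- Hypothetical Haskell-side mirror of C's struct timeval.
data TimeVal = TimeVal
  { tvSec  :: Int64  -- whole seconds
  , tvUsec :: Int64  -- remaining microseconds, 0 <= tvUsec < 1000000
  } deriving (Eq, Show)

-- Split a non-negative duration in microseconds into seconds and
-- microseconds; negative durations are rejected.
fromMicros :: Int64 -> Maybe TimeVal
fromMicros us
  | us < 0    = Nothing
  | otherwise = Just (TimeVal s u)
  where
    (s, u) = us `divMod` 1000000
```

A setter for timeout options could then accept a TimeVal (or a plain duration converted this way) instead of an Int.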

The last few points especially started to suggest that a change to the existing package would probably be quite sweeping and API-breaking, so the idea mutated into "design a new network package minus the above flaws", since that was likely to be less troublesome for upgraders.
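As an illustration of why such fixes break the API, here is one possible shape for a port type that drops the Num instance and offers parsing instead. It is a sketch under assumptions: Port, mkPort and parsePort are hypothetical names, and parsePort merely stands in for the missing Read instance.

```haskell
import Data.Word (Word16)
import Text.Read (readMaybe)

-- Hypothetical replacement for PortNumber: there is no Num instance,
-- so ports cannot be added or subtracted by accident.
newtype Port = Port Word16
  deriving (Eq, Ord, Show)

-- Smart constructor: rejects values outside 0..65535.
mkPort :: Int -> Maybe Port
mkPort n
  | n >= 0 && n <= 65535 = Just (Port (fromIntegral n))
  | otherwise            = Nothing

-- Parse a decimal port such as "8080".
parsePort :: String -> Maybe Port
parsePort s = readMaybe s >>= mkPort
```

Existing code that writes a port as a numeric literal relies on the Num instance, so it would have to be changed to go through a constructor like mkPort.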

New ideas

  • Essentially a three-tiered API:
    • the lowest-level FFI interface to the C API, exposing foreign imports and datatypes directly, along with their Storable instances, etc.
    • something akin to the current Network package, that is organised to be equivalent in expressivity to the FFI interface but using more idiomatic types and signatures, and taking care of all the marshalling under the hood.
    • nicer Haskell wrappers around the above that capture common or encouraged usage patterns. For example, many socket options only need to be set once, when the socket is created, so we might just add a [SocketOption] (or whatever) parameter to the socket creation function. This makes things less stateful and neater. Depending on how far we want to extend this idea, we could potentially make it a separate package so that we can take advantage of a wider range of dependencies (e.g. iteratee, safer-file-handles, etc.)
  • Socket options:
    • Instead of dispatching on a SocketOption ADT, use multiple functions: instead of setSocketOption sock Debug 1 we have setSocketDebug sock True or something similar.
    • Or how about a class-based approach?
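import Foreign -- Ptr, Storable, alloca, with, peek, poke
import Foreign.C.Types (CInt)

-- retrieves the Fd from the Socket and calls the foreign import
setSockOptWrapper :: Socket
  -> CInt -- ^ level
  -> CInt -- ^ option name
  -> Ptr a -- ^ option value
  -> CSockLen -- ^ option size
  -> IO ()

-- the matching getter; the size is passed by pointer because getsockopt updates it
getSockOptWrapper :: Socket
  -> CInt -- ^ level
  -> CInt -- ^ option name
  -> Ptr a -- ^ option value
  -> Ptr CSockLen -- ^ option size (in/out)
  -> IO ()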

class SocketOption a where
  setSocketOption :: Socket -> a -> IO ()
  getSocketOption :: Socket -> IO a

newtype SocketDebug = SocketDebug Bool

ptrSizeOf :: Storable a => Ptr a -> Int
ptrSizeOf p = sizeOf ((undefined :: Ptr a -> a) p)

instance SocketOption SocketDebug where
  setSocketOption s (SocketDebug b) =
    with opt $ \ptr ->
      setSockOptWrapper s
        (#const SOL_SOCKET) -- SO_DEBUG is a socket-level option
        (#const SO_DEBUG)
        (ptr :: Ptr CInt)
        (fromIntegral (sizeOf opt))
   where
    opt :: CInt
    opt = if b then 1 else 0
  getSocketOption s = alloca $ \vp -> alloca $ \sz -> do
    -- tell getsockopt how large the value buffer is
    poke sz (fromIntegral (ptrSizeOf vp))
    getSockOptWrapper s
      (#const SOL_SOCKET)
      (#const SO_DEBUG)
      (vp :: Ptr CInt)
      sz
    fmap (SocketDebug . (/= 0)) $ peek vp

-- And maybe…

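-- (needs ExistentialQuantification; in real code this wrapper type would also need
-- a name different from the class, since the two cannot share one in a module)
data SocketOption = forall so. SocketOption so => SocketOption so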

setSocketOptions  :: [SocketOption] -> Socket -> IO ()
withSocketOptions :: [SocketOption] -> Socket -> (Socket -> IO r) -> IO r

-- with safer-file-handles we could make this region-y
withSocket :: Family -> SocketType -> ProtocolNumber -> [SocketOption]
  -> (Socket -> IO r) -> IO r

-- the idea is that the type, and therefore class instance, is inferred from use of the newtype constructor:
twiddleDebug :: Socket -> IO ()
twiddleDebug s = do
  SocketDebug b <- getSocketOption s
  putStrLn $ "Socket debug is: " ++ show b
  setSocketOption s $ SocketDebug True
    • Both of these are more extensible with "pseudo-options" than the ADT approach, which is nice. The latter may aid encapsulation of the Socket type:
<Twey> The obvious [advantage] is that SocketOption is then an open class
<Twey> So you can add e.g. platform-specific options in other packages
<benmachine> ...you could just write more functions? :P
<Twey> Say the SocketOption class contains methods to transform it to the ints C 
       uses
<Twey> Not without knowing the internal details of the Socket type
<benmachine> ah but the C API uses void*
<Twey> void*, whatever
<benmachine> the point being that how you translate things to void* might depend 
             a little on which socket option it is
<Twey> Yeah
<benmachine> and hence can't necessarily be done in a uniform way
<Twey> Which is why it's in the class
<benmachine> you mean in the instance methods?
<Twey> Yes
<benmachine> so you're suggesting that there's a function which takes a socket 
             and a void* and does the FFI call
<benmachine> and the class instances take care of accepting an Integer and 
             turning it into a void*
<Twey> Not necessarily an Integer
<Twey> Whatever argument(s) they accept
<benmachine> sure
<benmachine> just an example
<Twey> *nod*
  • I don't actually think the class-based approach has this property in any way that the multiple-function approach lacks. We can still define and use setSockOptWrapper, so the difference just seems to be between
twiddleDebug sock = do
  SocketDebug b <- getSocketOption sock

and

twiddleDebug sock = do
  b <- getSocketDebug sock
  • I think the latter solution is simpler. We still need an analogue of the existential SocketOption type, but I think in practical terms we'd not use SocketOption for *getting* socket options, just setting them, so the purpose is pretty much served by type SocketOption = Socket -> IO (). Admittedly that's a bigger type, but it's also a simpler one, and to me it seems easier to manipulate.
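A rough sketch of the function-based representation follows. It makes several assumptions: Socket here is a mock that only records which options were applied, setSocketDebug and setSocketKeepAlive are hypothetical setters (real ones would go through setSockOptWrapper), and the setters take the option value first so that a partially applied setter is itself a SocketOption.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Stand-in for the real Socket type: it only records which options were set.
newtype Socket = Socket (IORef [String])

-- The proposed representation: an option is just an action on a socket.
type SocketOption = Socket -> IO ()

-- Hypothetical setters; real ones would call setSockOptWrapper.
setSocketDebug :: Bool -> SocketOption
setSocketDebug b (Socket ref) = modifyIORef ref (++ ["debug=" ++ show b])

setSocketKeepAlive :: Bool -> SocketOption
setSocketKeepAlive b (Socket ref) = modifyIORef ref (++ ["keepalive=" ++ show b])

-- Applying a list of options is a plain traversal, in list order.
setSocketOptions :: [SocketOption] -> Socket -> IO ()
setSocketOptions opts sock = mapM_ ($ sock) opts

-- Create a mock socket, apply two options, and return what was recorded.
demo :: IO [String]
demo = do
  ref <- newIORef []
  setSocketOptions [setSocketDebug True, setSocketKeepAlive False] (Socket ref)
  readIORef ref
```

With this representation no existential wrapper is needed to put different options in one list, at the cost of not being able to inspect an option once it has been built.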
  • String and ByteString need to be given equal consideration. Whether this is done by module boundaries (as in Network.Socket.ByteString) or a type-class approach is still under discussion.
    • Why does String need equal consideration? It's only a legacy option.
      • At least equal? :) but strings are actually quite convenient to pattern match on and stuff.
        • Only for prefixes, and that case works fine with view patterns.
        • Hmm. Okay, in principle I'll buy this. ByteString only?
    • I think that it encourages people to think about encoding and stuff if we use a concrete ByteString type everywhere. But maybe we want to let people not think about encoding sometimes? We don't want encoding-boilerplate all over the place.
      • How about abstracting Socket into a typeclass and having separate instances for encoded/raw sockets?
      • Or, having a filter mechanism in the Socket type
      • Or, exposing a Handle somehow and letting the clever new GHC7 IO stuff take care of it.
        • I'm not sure what any of these mean and would appreciate exposition :P
  • Use [1] where appropriate
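The remark in the String/ByteString discussion that prefix matching "works fine with view patterns" can be illustrated as follows. This is only a sketch: the Command type and parseCommand are hypothetical, and it assumes a bytestring version that provides stripPrefix.

```haskell
{-# LANGUAGE OverloadedStrings, ViewPatterns #-}
import qualified Data.ByteString.Char8 as B

-- Hypothetical command type for a toy line-based protocol.
data Command = Get B.ByteString | Quit | Unknown
  deriving (Eq, Show)

-- Match on a ByteString prefix with a view pattern, where String code
-- would have used a pattern such as ('G':'E':'T':' ':path).
parseCommand :: B.ByteString -> Command
parseCommand (B.stripPrefix "GET " -> Just path) = Get path
parseCommand "QUIT"                              = Quit
parseCommand _                                   = Unknown
```

Whole-string matches such as "QUIT" work directly as literal patterns under OverloadedStrings, so only the prefix case needs the view pattern.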

See also

network-fancy on Hackage