GHC/CloudAndHPCHaskell/Transport

From HaskellWiki

Networking interfaces for Haskell

Purpose of this page

The purpose of this page is to collect information about networking interfaces to enable us to decide what would work best for Haskell in cloud and HPC applications.

Use cases

The networking interface should address the following use cases:

  • Networked versions of DPH
  • CloudHaskell Networking
  • Client-Server development
  • Internet Services development


Existing C Networking APIs

IP

The most well known network interfaces are those in the IP stack:

  • UDP - datagram delivery
  • TCP - reliable, ordered byte streams

The IP interfaces have been extended in every programming language with higher-level interfaces, particularly for remote procedure calls (e.g. Sun RPC and Java RMI) and for the implementation of internet protocols (HTTP, IMAP).

The maturity of IP is unmatched, and it is quite possibly the only protocol suitable for offering services on the internet.

HPC Network Interfaces

HPC applications have typically been programmed for much higher-performance hardware, such as InfiniBand and numerous less well-known networks (Myrinet, Elan, Cray SeaStar, IBM ...). All of these ship with one or more low-level interfaces, which generally differ from IP interfaces in four areas:

  1. there is extensive support for RDMA (remote DMA) transport, which follows the exchange of DMA handles among nodes and results in zero-copy transport.
  2. completion of delivery is signalled through events, and applications either poll (spin) for the appearance of an event or sleep until it arrives.
  3. there is minimal CPU involvement in the data transport, and user-level applications can typically interface with the hardware directly.
  4. the transport protocol is almost always message oriented, so the re-assembly of complete messages that is common in TCP-based applications is not needed.
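
To illustrate point 4, the following is a minimal sketch of the re-assembly work that a TCP-based application typically does itself and that a message-oriented transport makes unnecessary. It assumes a 4-byte big-endian length prefix in front of every message. The names frame and unframe are hypothetical and are not taken from any existing library; only the bytestring package that ships with GHC is used.

```haskell
import qualified Data.ByteString as B
import Data.Bits (shiftL, shiftR, (.|.))
import Data.Word (Word32)

-- | Prefix a message with its length as a 4-byte big-endian integer.
-- (Assumed wire format, chosen only for illustration.)
frame :: B.ByteString -> B.ByteString
frame msg = encodeLen (fromIntegral (B.length msg)) `B.append` msg

-- | Encode a length as 4 bytes, most significant byte first.
encodeLen :: Word32 -> B.ByteString
encodeLen n = B.pack [fromIntegral (n `shiftR` s) | s <- [24, 16, 8, 0]]

-- | Decode a big-endian length from a header.
decodeLen :: B.ByteString -> Word32
decodeLen = B.foldl' (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- | Split a receive buffer into the complete messages it contains and the
-- leftover bytes of a message that has not fully arrived yet.
unframe :: B.ByteString -> ([B.ByteString], B.ByteString)
unframe buf
  | B.length buf < 4    = ([], buf)   -- header incomplete
  | B.length rest < len = ([], buf)   -- body incomplete
  | otherwise =
      let (msg, rest')     = B.splitAt len rest
          (msgs, leftover) = unframe rest'
      in  (msg : msgs, leftover)
  where
    (hdr, rest) = B.splitAt 4 buf
    len         = fromIntegral (decodeLen hdr) :: Int
```

A receiver on a byte stream would append each chunk it reads to the leftover bytes and call unframe again. On a message-oriented transport each receive already yields exactly one message, so this buffering step disappears.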

One-way latencies of HPC networks are now in the range of 300 ns to 5 us, and bandwidths approach 10 GB/s.
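
To show how these two figures interact, here is a rough sketch that assumes the simplest possible cost model: delivery time equals latency plus message size divided by bandwidth. The model and the function names are assumptions made for illustration, not measurements of any particular network.

```haskell
-- | Estimated one-way delivery time in seconds for a message of the given
-- size in bytes, assuming time = latency + size / bandwidth.
transferTime :: Double -> Double -> Double -> Double
transferTime latency bandwidth size = latency + size / bandwidth

-- | Message size in bytes at which latency and data transfer each account
-- for half of the delivery time under the model above.
breakEvenSize :: Double -> Double -> Double
breakEvenSize latency bandwidth = latency * bandwidth

-- | Fraction of the peak bandwidth that a message of the given size achieves.
efficiency :: Double -> Double -> Double -> Double
efficiency latency bandwidth size =
  (size / bandwidth) / transferTime latency bandwidth size
```

Taking 1 us latency and 10 GB/s as illustrative values, breakEvenSize 1e-6 1e10 is about 10 kB. Under this model, messages much smaller than that are dominated by latency and larger ones by bandwidth, which is why both numbers matter for an interface design.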

A number of higher-level APIs have been developed for HPC networking, but only a few of them address more than a single hardware platform. Examples of these APIs are:

  • OFED (the OpenFabrics Alliance stack). This is an extremely large networking stack with APIs at many different levels, which runs on all IB networks and on some Ethernet interfaces. Its Sockets Direct Protocol (SDP) is most similar to TCP, yet performs considerably worse than low-level APIs such as the verbs API (VAPI).
  • Portals. Portals is a simple, elegant API developed at Sandia with very rich delivery and event semantics. Portals was used (and forked) by the Lustre project (who call it LNET). LNET runs on some 10 different physical networks and can route between them.
  • CCI - a lightweight API, also with a driver model. Very promising, with excellent performance, and still at an early stage of design, which leaves room to incorporate new features.
  • GASNet - a transport API tuned to provide the PGAS features for UPC. It has many drivers. The performance section of the GASNet website shows how high the overhead of MPI can be in some cases.
  • MPI - really a large C programming interface for writing distributed parallel programs. It has many features outside networking and a rather involved communication group mechanism that (almost) precludes it from being used in environments where nodes come and go.


Requirements for different network applications

  • High Performance Computing
    • low latency
    • high throughput
    • one sided communication
    • possibility to leverage hardware-supported collectives and matching
    • support for multiple networks
    • possible NoC and SoC implementations for communication with GPUs etc. (see e.g. AMD FSA)
    • iovs (scatter/gather I/O vectors)
  • Cloud applications
    • XXX please assist here
  • Client Server development (e.g. a file service)
    • High throughput, medium low latency
    • Authentication
    • Routing with resilience
    • Channel bonding for load balancing and failover
    • Kernel and user level bindings
    • Secure remote DMA
    • Congestion control
    • iovs (scatter/gather I/O vectors)
  • Internet applications
    • large TCP connection counts
    • effective handling of DoS attacks
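
The iov requirement listed above for HPC and client-server use refers to scatter/gather I/O: a message is described as a vector of separately allocated buffers and the transport sends them without first copying them into one contiguous block. As a rough sketch of how this could look on the Haskell side, a lazy ByteString from the bytestring package is already a list of strict chunks, so it can play the role of an I/O vector. The helper names gather and ioVector are hypothetical.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL

-- | Present separately allocated buffers (e.g. a header and a payload) as
-- one logical message. fromChunks links the chunks together and does not
-- copy their contents.
gather :: [B.ByteString] -> BL.ByteString
gather = BL.fromChunks

-- | The buffers a transport could hand to a vectored send, in order.
ioVector :: BL.ByteString -> [B.ByteString]
ioVector = BL.toChunks
```

A transport binding could accept such a chunk list directly. The network package's sendMany, for instance, takes a list of strict ByteStrings, although whether a given HPC transport can map the chunks onto its own vectored send is an assumption that would need checking per backend.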