SafeConcurrent

From HaskellWiki
Revision as of 13:06, 12 April 2011 by ChrisKuklewicz (talk | contribs) (Update to 0.5.0 with MSampleVar)


Motivation

The base package (versions 3 and 4) implementations of Control.Concurrent.QSem and QSemN (and perhaps SampleVar) are not exception safe. The proposed replacement code is on Hackage as SafeSemaphore, and the source from version 0.4.1 is also included on this page.

Exception correctness means that the semaphore does not lose any of its quantity if the waiter or signaler is interrupted before the operation finishes. QSem and QSemN violate this safety property.

SafeSemaphore defines MSem as the proposed replacement for QSem, and MSemN as the proposed replacement for QSemN.

The SampleVar module in base has the same kind of bug, but with SampleVar the runtime error is worse because it can cause writeSampleVar to block indefinitely. The SafeSemaphore package as of version 0.5.0 has an MSampleVar module that does not have this bug.

The GHC ticket is #3160.

TestKillSem

The problem with QSem and QSemN is that a blocked waiter might be killed. This does not prevent a later signal from trying to pass quantity to a dead thread. This quantity is thus thrown out, a blatantly leaky abstraction. This is illustrated by the tests/TestKillSem.hs program in the SafeSemaphore package (run with cabal test, please read the log generated).

The program, preceded by its output as a comment, is:

{- output demonstrating fate of thread 3:

Cases: 4  Tried: 0  Errors: 0  Failures: 0
Test QSem
0: forkIO wait thread 1
0: stop thread 1
1: wait interrupted
0: signal q #1
0: forkIO wait thread 2
0: forkIO wait thread 3
0: signal q #2
2: wait done
0: stop thread 2
0: stop thread 3
3: wait interrupted (QUANTITY LOST) FAIL
### Failure in: 0                         

Cases: 4  Tried: 1  Errors: 0  Failures: 1
Test QSemN
0: forkIO wait thread 1
0: stop thread 1
1: wait interrupted
0: signal q #1
0: forkIO wait thread 2
0: forkIO wait thread 3
0: signal q #2
2: wait done
0: stop thread 2
0: stop thread 3
3: wait interrupted (QUANTITY LOST) FAIL
### Failure in: 1                         

Cases: 4  Tried: 2  Errors: 0  Failures: 2
Test MSem
0: forkIO wait thread 1
0: stop thread 1
1: wait interrupted
0: signal q #1
0: forkIO wait thread 2
2: wait done
0: forkIO wait thread 3
0: signal q #2
3: wait done (QUANTITY CONSERVED) PASS
0: stop thread 2
0: stop thread 3
Cases: 4  Tried: 3  Errors: 0  Failures: 2
Test MSemN
0: forkIO wait thread 1
0: stop thread 1
1: wait interrupted
0: signal q #1
0: forkIO wait thread 2
2: wait done
0: forkIO wait thread 3
0: signal q #2
3: wait done (QUANTITY CONSERVED) PASS
0: stop thread 2
0: stop thread 3
Cases: 4  Tried: 4  Errors: 0  Failures: 2

-}
module Main where

import Control.Concurrent
import Control.Exception
import Control.Concurrent.QSem
import Control.Concurrent.QSemN
import qualified Control.Concurrent.MSem as MSem
import qualified Control.Concurrent.MSemN as MSemN
import Control.Concurrent.MVar
import Test.HUnit
import System.Exit

delay = threadDelay (1000*100)

fork x = do m <- newEmptyMVar
            t <- forkIO (finally x (putMVar m ()))
            delay
            return (t,m)

stop (t,m) = do killThread t
                delay
                takeMVar m

-- True if test passed, False if test failed
testSem :: Integral n 
        => String
        -> (n -> IO a) 
        -> (a->IO ()) 
        -> (a -> IO ()) 
        -> IO Bool
testSem name new wait signal = do
  putStrLn ("\nTest "++ name)
  q <- new 0
  putStrLn "0: forkIO wait thread 1"
  (t1,m1) <- fork $ do
    wait q `onException` (putStrLn "1: wait interrupted")
    putStrLn "1: wait done UNEXPECTED"
  putStrLn "0: stop thread 1"
  stop (t1,m1)
  putStrLn "0: signal q #1"
  signal q
  delay
  putStrLn "0: forkIO wait thread 2"
  (t2,m2) <- fork $ do
    wait q `onException` (putStrLn "2: wait interrupted UNEXPECTED")
    putStrLn "2: wait done"
  putStrLn "0: forkIO wait thread 3"
  result <- newEmptyMVar
  (t3,m3) <- fork $ do
    wait q `onException` (putStrLn "3: wait interrupted (QUANTITY LOST) FAIL" >> putMVar result False)
    putStrLn "3: wait done (QUANTITY CONSERVED) PASS"
    putMVar result True
  putStrLn "0: signal q #2"
  signal q
  delay
  putStrLn "0: stop thread 2"
  stop (t2,m2)
  putStrLn "0: stop thread 3"
  stop (t3,m3)
  takeMVar result

testsQ = TestList . map test $
  [ testSem "QSem" newQSem waitQSem signalQSem
  , testSem "QSemN" newQSemN (flip waitQSemN 1) (flip signalQSemN 1)
  ]

testsM = TestList . map test $
  [ testSem "MSem" MSem.new MSem.wait MSem.signal
  , testSem "MSemN" MSemN.new (flip MSemN.wait 1) (flip MSemN.signal 1)
  ]

-- This is run by "cabal test"
main = do
  runTestTT testsQ
  c <- runTestTT testsM
  if failures c == 0 then exitSuccess else exitFailure

This shows that quantity can be easily lost when using a QSem or QSemN, and shows that MSem and MSemN do not have this problem.

MSem

This code should be exception safe and exception correct. The QSem API is slightly extended with peekAvail, which queries the quantity currently in the semaphore. The QSem semantics are slightly extended so that a new MSem can be initialized with a negative, zero, or positive quantity. Int has been replaced with Integer. The with operation has been added to encourage safely bracketing wait and signal.

Note that wait and signal do not allocate any MVars to manage the waiting queue; only MSem.new allocates them. This should be more efficient than QSem.
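
The bracketing that with encourages can be sketched using only base. The sketch below takes nothing from SafeSemaphore: withQSem and unitRestored are hypothetical names, and QSem stands in for MSem purely so that the example needs no extra package. It only shows that a unit taken inside the bracket is handed back when the guarded action throws. It does not exercise the killed-waiter case that TestKillSem covers.

```haskell
import Control.Concurrent.QSem (QSem, newQSem, waitQSem, signalQSem)
import Control.Exception (ErrorCall (..), bracket_, throwIO, try)
import System.Timeout (timeout)

-- Hypothetical helper: same shape as MSem.with, but built on base's QSem
-- so that the sketch needs no extra packages.
withQSem :: QSem -> IO a -> IO a
withQSem q = bracket_ (waitQSem q) (signalQSem q)

-- Run a failing action under the semaphore, then try to take the unit again.
-- The result is True when the exception propagated and the unit came back.
unitRestored :: IO Bool
unitRestored = do
  q <- newQSem 1
  r <- try (withQSem q (throwIO (ErrorCall "boom"))) :: IO (Either ErrorCall ())
  got <- timeout 1000000 (waitQSem q) -- give up after one second
  return (either (const True) (const False) r && got == Just ())
```

Evaluating unitRestored in GHCi should give True: bracket_ runs signalQSem even though the guarded action threw, so the second waitQSem does not block.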

{-# LANGUAGE DeriveDataTypeable #-}
-- | 
-- Module      :  Control.Concurrent.MSem
-- Copyright   :  (c) Chris Kuklewicz 2011
-- License     :  3 clause BSD-style (see the file LICENSE)
-- 
-- Maintainer  :  haskell@list.mightyreason.com
-- Stability   :  experimental
-- Portability :  non-portable (concurrency)
--
-- A semaphore in which operations may 'wait' for or 'signal' single units of value.  This module
-- is intended to improve on "Control.Concurrent.QSem".
-- 
-- This semaphore gracefully handles threads which die while blocked waiting.  The fairness
-- guarantee is that blocked threads are FIFO.
--
-- If 'with' is used to guard a critical section then no quantity of the semaphore will be lost if
-- the activity throws an exception. 'new' can initialize the semaphore to negative, zero, or
-- positive quantity. 'wait' always leaves the 'MSem' with non-negative quantity.
module Control.Concurrent.MSem
    (MSem
    ,new
    ,with
    ,wait
    ,signal
    ,peekAvail
    ) where

import Control.Concurrent.MVar(MVar,withMVar,modifyMVar,modifyMVar_,newMVar,newEmptyMVar,putMVar,takeMVar,tryTakeMVar,tryPutMVar)
import Control.Exception(bracket_,uninterruptibleMask_,evaluate,mask_)
import Data.Typeable(Typeable)

{- design notes are in MSemN.hs -}

data MS = MS { avail :: !Integer     -- ^ This is the quantity available to be taken from the semaphore. Often updated.
             , headWait :: MVar ()   -- ^ The head of the waiter queue blocks on headWait. Never updated.
             }
  deriving (Eq,Typeable)

-- | A 'MSem' is a semaphore in which the available quantity can be added and removed in single
--  units, and which can start with positive, zero, or negative value.
data MSem = MSem { mSem :: !(MVar MS)      -- ^ Used to lock access to state of semaphore quantity. Never updated.
                 , queueWait :: !(MVar ()) -- ^ Used as FIFO queue for waiter, held by head of queue.  Never updated.
                 }
  deriving (Eq,Typeable)

-- |'new' allows positive, zero, and negative initial values.  The initial value is forced here to
-- better localize errors.
new :: Integer -> IO MSem
new initial = do
  newHeadWait <- newEmptyMVar
  newQueueWait <- newMVar ()
  newMS <- newMVar $! (MS { avail = initial
                          , headWait = newHeadWait })
  return (MSem { mSem = newMS
               , queueWait = newQueueWait })

-- | 'with' takes a unit of value from the semaphore to hold while performing the provided
-- operation.  'with' ensures the quantity of the semaphore cannot be lost if there are exceptions.
--
-- 'with' uses 'bracket_' to ensure 'wait' and 'signal' get called correctly.
with :: MSem -> IO a -> IO a
with m = bracket_ (wait m)  (signal m)

-- |'wait' will take one unit of value from the semaphore, but will block if the quantity available
-- is not positive.
--
-- If 'wait' returns without interruption then it left the 'MSem' with a remaining quantity that was
-- greater than or equal to zero.  If 'wait' is interrupted then no quantity is lost.  If 'wait'
-- returns without interruption then it is known that each earlier waiter has definitely either been
-- interrupted or has returned without interruption.
wait :: MSem -> IO ()
wait (MSem sem advance) = mask_ $ withMVar advance $ \ () -> do
  todo <- mask_ $ modifyMVar sem $ \ m -> do
    mayGrab <- tryTakeMVar (headWait m)
    case mayGrab of
      Just () -> return (m,Nothing)
      Nothing -> if 1 <= avail m
                   then do
                     m' <- evaluate $ m { avail = avail m - 1 }
                     return (m', Nothing)
                   else do
                     return (m, Just (headWait m))
  -- mask_ is needed above because we may have just decremented 'avail' and we must finish 'wait'
  -- without being interrupted so that a 'bracket' can ensure a matching 'signal'.
  case todo of
    Nothing -> return ()
    Just hw -> takeMVar hw -- actually may or may not block, a 'signal' could have already arrived.

-- | 'signal' adds one unit to the semaphore.
--
-- 'signal' may block, but it cannot be interrupted, which allows it to dependably restore value to
-- the 'MSem'.  All 'signal', 'peekAvail', and the head waiter may momentarily block in a fair FIFO
-- manner.
signal :: MSem -> IO ()
signal (MSem sem _) = uninterruptibleMask_ $ modifyMVar_ sem $ \ m -> do
  -- mask_ might be as good as uninterruptibleMask_ since nothing below can block
  if avail m < 0
    then evaluate m { avail = avail m + 1 }
    else do
      didPlace <- tryPutMVar (headWait m) ()
      if didPlace
        then return m
        else evaluate m { avail = avail m + 1 }

-- | 'peekAvail' skips the queue of any blocked 'wait' threads, but may momentarily block on
-- 'signal', other 'peekAvail', and the head waiter. This returns the amount of value available to
-- be taken.  Using this value without producing unwanted race conditions is left up to the
-- programmer.
--
-- Note that "Control.Concurrent.MSemN" offers a more powerful API for making decisions based on the available amount.
peekAvail :: MSem -> IO Integer
peekAvail (MSem sem _) = mask_ $ withMVar sem $  \ m -> do
  extraFlag <- tryTakeMVar (headWait m)
  case extraFlag of
    Nothing -> return (avail m)
    Just () -> do putMVar (headWait m) () -- cannot block
                  return (1 + avail m)
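
The listing above can be read as a small state machine over avail and the contents of headWait. The following is a pure model of that reading, not part of the package: Model, signalM, waitM, peekM and fromMinusTwo are illustrative names, and the model ignores blocking, masking and the FIFO queue entirely. It is meant only to show how a negative initial quantity behaves.

```haskell
-- A pure model of the MSem state: the counter plus whether headWait is full.
-- All names here are illustrative and do not exist in SafeSemaphore.
data Model = Model { mAvail :: Integer, mFlag :: Bool }
  deriving (Eq, Show)

-- Mirrors 'signal': pay off a negative balance first, then fill the flag,
-- and only then add to the counter.
signalM :: Model -> Model
signalM (Model a f)
  | a < 0     = Model (a + 1) f
  | not f     = Model a True
  | otherwise = Model (a + 1) f

-- Mirrors the locked part of 'wait': Nothing means the caller would block.
waitM :: Model -> Maybe Model
waitM (Model a f)
  | f         = Just (Model a False)
  | a >= 1    = Just (Model (a - 1) f)
  | otherwise = Nothing

-- Mirrors 'peekAvail': a full flag counts as one extra unit.
peekM :: Model -> Integer
peekM (Model a f) = a + (if f then 1 else 0)

-- Starting from new (-2): the states after 0, 1, 2 and 3 signals.
fromMinusTwo :: [Model]
fromMinusTwo = take 4 (iterate signalM (Model (-2) False))
```

In this model, map peekM fromMinusTwo is [-2,-1,0,1] and waitM succeeds only on the last state, so a semaphore created with new (-2) needs three signals before the first wait can proceed.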

MSemN

The API for MSemN follows QSemN with several more complicated additions. All quantity arguments may be negative, zero, or positive. The waitF, signalF, and withF operations take a pure function that computes the quantity change from the current quantity in the semaphore, and peekAvail was added to query the semaphore's quantity.
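
To make the shape of those pure functions concrete, here is a sketch of two policies that fit the type Integer -> (Integer,b) used by waitF and signalF. The names takeUpTo, refillTo and decide are hypothetical and are not part of SafeSemaphore. decide restates, as a pure function, the comparison that the waitF listing on this page performs after applying the function.

```haskell
-- Hypothetical policy for waitF: take whatever is available, but never more
-- than 'cap' and never a negative amount.  The second component reports the
-- amount taken back to the caller.
takeUpTo :: Integer -> Integer -> (Integer, Integer)
takeUpTo cap available = (n, n)
  where n = max 0 (min cap available)

-- Hypothetical policy for signalF: top the semaphore back up to 'level',
-- adding nothing if it is already at or above that level.
refillTo :: Integer -> Integer -> (Integer, ())
refillTo level available = (max 0 (level - available), ())

-- The decision waitF makes once the function has been applied, as read from
-- the listing: grant immediately when wanted <= available, otherwise the
-- caller becomes the blocked head waiter.  Right carries the new available
-- quantity; Left carries the amount being waited for.
decide :: (Integer -> (Integer, b)) -> Integer -> Either Integer Integer
decide f available
  | wanted <= available = Right (available - wanted)
  | otherwise           = Left wanted
  where (wanted, _) = f available
```

With the real package one would presumably pass takeUpTo 5 to waitF and refillTo 10 to signalF. Because the function is applied while the semaphore's state is locked, it should be cheap to evaluate.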

{-# LANGUAGE DeriveDataTypeable #-}
-- | 
-- Module      :  Control.Concurrent.MSemN
-- Copyright   :  (c) Chris Kuklewicz 2011
-- License     :  3 clause BSD-style (see the file LICENSE)
-- 
-- Maintainer  :  haskell@list.mightyreason.com
-- Stability   :  experimental
-- Portability :  non-portable (concurrency)
--
-- Quantity semaphores in which each thread may wait for an arbitrary amount.  This module is
-- intended to improve on "Control.Concurrent.QSemN".
-- 
-- This semaphore gracefully handles threads which die while blocked waiting for quantity.  The
-- fairness guarantee is that blocked threads are FIFO.  An early thread waiting for a large
-- quantity will prevent a later thread waiting for a small quantity from jumping the queue.
--
-- If 'with' is used to guard a critical section then no quantity of the semaphore will be lost
-- if the activity throws an exception.
--
module Control.Concurrent.MSemN
    (MSemN
    ,new
    ,with
    ,wait
    ,signal
    ,withF
    ,waitF
    ,signalF
    ,peekAvail
    ) where

import Control.Concurrent.MVar(MVar,withMVar,modifyMVar,modifyMVar_,newMVar,newEmptyMVar,putMVar,takeMVar,tryTakeMVar)
import Control.Exception(bracket,uninterruptibleMask_,onException,evaluate,mask_)
import Data.Typeable(Typeable)

{- 

The only MVars allocated are the three created by 'new'.  Their three roles are
1) to have a FIFO queue of waiters
2) for the head waiter to block on
3) to protect the quantity state of the semaphore and the head waiter

subtle design notes:

with, wait, and signal pattern match the quantity against 0, which has two effects: it avoids locking
in the easy case and it ensures strict evaluation of the quantity before any locks are taken.

Originally withF, waitF, and signalF did not strictly evaluate the function they are passed before
locks are taken because there is no real point since the function may throw an error when computing
the size.  But then I realized forcing 'f' might run forever with the locks held and I could move
this particular hang outside the locks by first evaluating 'f'.

-}

-- MS has an invariant that "maybe True (> avail) headWants" is always True.
data MS = MS { avail :: !Integer             -- ^ This is the quantity available to be taken from the semaphore. Often updated.
             , headWants :: !(Maybe Integer) -- ^ If there is waiter then this is Just the amount being waited for. Often updated.
             , headWait :: MVar ()           -- ^ The head of the waiter queue blocks on headWait. Never updated.
             }
  deriving (Eq,Typeable)

-- | A 'MSemN' is a quantity semaphore, in which the available quantity may be signalled or
-- waited for in arbitrary amounts.
data MSemN = MSemN { mSem :: !(MVar MS)      -- ^ Used to lock access to state of semaphore quantity. Never updated.
                   , queueWait :: !(MVar ()) -- ^ Used as FIFO queue for waiter, held by head of queue.  Never updated.
                   }
  deriving (Eq,Typeable)

-- |'new' allows positive, zero, and negative initial values.  The initial value is forced here to
-- better localize errors.
new :: Integer -> IO MSemN
new initial = do
  newHeadWait <- newEmptyMVar
  newQueueWait <- newMVar ()
  newMS <- newMVar $! (MS { avail = initial
                          , headWants = Nothing
                          , headWait = newHeadWait })
  return (MSemN { mSem = newMS
                , queueWait = newQueueWait })

-- | 'with' takes a quantity of the semaphore to take and hold while performing the provided
-- operation.  'with' ensures the quantity of the semaphore cannot be lost if there are exceptions.
-- This uses 'bracket' to ensure 'wait' and 'signal' get called correctly.
with :: MSemN -> Integer -> IO a -> IO a
with _ 0 = id
with m wanted = bracket (wait m wanted)  (\() -> signal m wanted) . const

-- | 'withF' takes a pure function and an operation.  The pure function converts the available
-- quantity to a pair of the wanted quantity and a returned value.  The operation takes the result
-- of the pure function.  'withF' ensures the quantity of the semaphore cannot be lost if there
-- are exceptions.  This uses 'bracket' to ensure 'waitF' and 'signal' get called correctly.
--
-- Note: A long running pure function will block all other access to the 'MSemN' while it is
-- evaluated.
withF :: MSemN -> (Integer -> (Integer,b)) -> ((Integer,b) -> IO a) -> IO a
withF m f = seq f $ bracket (waitF m f)  (\(wanted,_) -> signal m wanted)

-- |'wait' allows positive, zero, and negative wanted values.  Waiters may block, and will be handled
-- fairly in FIFO order.
--
-- If 'wait' returns without interruption then it left the 'MSemN' with a remaining quantity that was
-- greater than or equal to zero.  If 'wait' is interrupted then no quantity is lost.  If 'wait'
-- returns without interruption then it is known that each earlier waiter has definitely either been
-- interrupted or has returned without interruption.
wait :: MSemN -> Integer -> IO ()
wait _ 0 = return ()
wait m wanted = fmap snd $ waitF m (const (wanted,()))

-- | 'waitF' takes the 'MSemN' and a pure function that takes the available quantity and computes the
-- amount wanted and a second value.  The value wanted is strictly evaluated but the second value is
-- returned lazily.
--
-- 'waitF' allows positive, zero, and negative wanted values.  Waiters may block, and will be handled
-- fairly in FIFO order.
--
-- If 'waitF' returns without interruption then it left the 'MSemN' with a remaining quantity that was
-- greater than or equal to zero.  If 'waitF' or the provided function are interrupted then no
-- quantity is lost.  If 'waitF' returns without interruption then it is known that each previous
-- waiter has definitely either been interrupted or has returned without interruption.
--
-- Note: A long running pure function will block all other access to the 'MSemN' while it is
-- evaluated.
waitF :: MSemN -> (Integer -> (Integer,b)) -> IO (Integer,b)
waitF (MSemN sem advance) f = seq f $ mask_ $ withMVar advance $ \ () -> do
  (out@(wanted,_),todo) <- modifyMVar sem $ \ m -> do
    let outVal@(wantedVal,_) = f (avail m)
    -- assert that headWants is Nothing via new or signal or cleanup
    -- wantedVal gets forced by the (<=) condition here:
    if wantedVal <= avail m
      then do
        let avail'down = avail m - wantedVal
        m' <- evaluate $ m { avail = avail'down }
        return (m', (outVal,Nothing))
      else do
        m' <- evaluate $ m { headWants = Just wantedVal }
        return (m', (outVal,Just (headWait m)))
  -- mask_ is needed above for two reasons.  If (Just wantedVal) was set then the `onException`
  -- handler below must be installed without being interrupted.  If avail'down was set then 'waitF'
  -- must finish without being interrupted so that a 'bracket' can ensure a matching 'signal'
  -- protects the returned quantity.
  case todo of
    Nothing -> return ()
    Just hw -> do
      let cleanup = uninterruptibleMask_ $ modifyMVar_ sem $ \m -> do
            mStale <- tryTakeMVar (headWait  m)
            let avail' = avail m + maybe 0 (const wanted) mStale
            evaluate $ m {avail = avail', headWants = Nothing}
      takeMVar hw `onException` cleanup -- may not block if a 'signal' has already arrived.
  return out

-- |'signal' allows positive, zero, and negative values, thus this is also a way to remove quantity
-- that skips any threads in the 'wait'/'waitF' queue.  If the new total is greater than or equal to
-- the next value being waited for (if present) then the first waiter is woken.  If there are queued waiters
-- then the next one will wake after a waiter has proceeded and notice the remaining value; thus a
-- single 'signal' may result in several waiters obtaining values.  Waking waiting threads is
-- asynchronous.
--
-- 'signal' may block, but it cannot be interrupted, which allows it to dependably restore value to
-- the 'MSemN'.  All 'signal', 'signalF', 'peekAvail', and the head waiter may momentarily block in a
-- fair FIFO manner.
signal :: MSemN -> Integer -> IO ()
signal _ 0 = return ()
signal m size = uninterruptibleMask_ $ fmap snd $ signalF m (const (size,()))

-- | Instead of providing a fixed change to the available quantity, 'signalF' applies a provided
-- pure function to the available quantity to compute the change and a second value.  The
-- requested change is strictly evaluated but the second value is returned lazily.  If the new total is
-- greater than or equal to the next value being waited for then the first waiter is woken.  If there are queued
-- waiters then the next one will wake after a waiter has proceeded and notice the remaining value;
-- thus a single 'signalF' may result in several waiters obtaining values.  Waking waiting threads
-- is asynchronous.
--
-- 'signalF' may block, and it can be safely interrupted.  If the provided function throws an error
-- or is interrupted then it leaves the 'MSemN' unchanged.  All 'signal', 'signalF', 'peekAvail', and
-- the head waiter may momentarily block in a fair FIFO manner.
--
-- Note: A long running pure function will block all other access to the 'MSemN' while it is
-- evaluated.
signalF :: MSemN -> (Integer -> (Integer,b)) -> IO (Integer,b)
signalF (MSemN sem _) f = seq f $ modifyMVar sem $ \ m -> do
  let out@(size,_) = f (avail m)
  avail' <- evaluate $ avail m + size -- this forces 'size'
  case headWants m of
    Just wanted | wanted <= avail' -> do
      let avail'down = avail' - wanted
      m' <- evaluate $ m { avail = avail'down, headWants = Nothing }
      putMVar (headWait m') () -- will always succeed without blocking
      return (m',out)
    _ -> do
      m' <- evaluate $ m { avail = avail' }
      return (m',out)

-- | 'peekAvail' skips the queue of any blocked 'wait' and 'waitF' threads, but may momentarily
-- block on 'signal', 'signalF', other 'peekAvail', and the head waiter. This returns the amount of
-- value available to be taken.  Using this value without producing unwanted race conditions is left
-- up to the programmer.
--
-- 'peekAvail' is an optimized form of \"signalF m (\x -> (0,x))\".
--
-- A version of 'peekAvail' that joins the FIFO queue of 'wait' and 'waitF' can be achieved by
-- \"waitF m (\x -> (0,x))\"
peekAvail :: MSemN -> IO Integer
peekAvail (MSemN sem _) = withMVar sem (return . avail)
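
The interaction between a blocked head waiter and later signals can be traced with a pure model of the MS state from the listing above. This is a sketch under simplifying assumptions: Model, signalM, waitM and scenario are hypothetical names, requests are fixed amounts rather than functions, and blocking, masking, cleanup and the FIFO queue behind the head waiter are all left out.

```haskell
-- Illustrative pure model of the MSemN state.  The field names follow the MS
-- record in the listing, but the type and functions here are hypothetical.
data Model = Model { avail :: Integer, headWants :: Maybe Integer }
  deriving (Eq, Show)

-- Mirrors signalF with a fixed size: add the quantity, then wake the head
-- waiter if its request can now be met.  The Bool reports whether it woke.
signalM :: Integer -> Model -> (Model, Bool)
signalM size (Model a hw) =
  case hw of
    Just wanted | wanted <= a' -> (Model (a' - wanted) Nothing, True)
    _                          -> (Model a' hw, False)
  where a' = a + size

-- Mirrors the locked part of waitF with a fixed request: either take the
-- quantity at once (True) or register as the head waiter (False).
waitM :: Integer -> Model -> (Model, Bool)
waitM wanted (Model a hw)
  | wanted <= a = (Model (a - wanted) hw, True)
  | otherwise   = (Model a (Just wanted), False)

-- A waiter asks for 5 from an empty semaphore, then two signals arrive.
scenario :: [(Model, Bool)]
scenario = [s1, s2, s3]
  where
    s1 = waitM 5 (Model 0 Nothing)
    s2 = signalM 3 (fst s1)
    s3 = signalM 2 (fst s2)
```

In scenario the waiter registers with headWants = Just 5, the first signal raises avail to 3 without waking it, and the second signal reaches 5, hands the whole amount to the waiter and leaves avail at 0.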