FFI cook book
Revision as of 20:59, 29 November 2006
This is a guide/tutorial/cookbook approach to writing a library that uses external (FFI) functions. Some people complain that cookbook approaches encourage a lack of thinking; that may be so, but they also help novices get started faster. Being a little hard of thinking myself, I would have been grateful for something like this when I was getting started. The FFI spec, while valuable, is not a tutorial.
This guide contains examples and lessons accumulated writing an FFI binding to the Oracle DBMS OCI (Oracle Call Interface), a low-level C library.
My FFI library code tends to look like imperative code written in Haskell. I guess we should expect this to some extent when dealing with external libraries, although it might be better (for me) to explore more functional alternatives. (However, Haskell also seems to be quite a good language for writing imperative code in.)
These libraries are useful for memory management and for working with C pointers.
Foreign.Storable contains peek, poke, peekByteOff, pokeByteOff, etc.
Foreign.Marshal.Alloc contains alloca, malloc, free, etc.
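As a minimal sketch of how these two modules fit together (the function names roundTrip and swapPair are made up for illustration and are not part of any library), the following stores values through a pointer and reads them back, once with alloca and once with explicitly managed memory:

```haskell
import Control.Exception (bracket)
import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (alloca, free, mallocBytes)
import Foreign.Storable (peek, peekByteOff, poke, pokeByteOff, sizeOf)

-- Temporary storage: alloca releases the memory when the action returns.
roundTrip :: CInt -> IO CInt
roundTrip n = alloca $ \p -> do
  poke p n
  peek p

-- Explicit storage: two CInts side by side in a malloc'd buffer,
-- addressed by byte offset. bracket makes sure free is always called.
swapPair :: (CInt, CInt) -> IO (CInt, CInt)
swapPair (a, b) = bracket (mallocBytes (2 * size)) free $ \buf -> do
  pokeByteOff buf 0 a
  pokeByteOff buf size b
  x <- peekByteOff buf size   -- read the second slot first
  y <- peekByteOff buf 0
  return (x, y)
  where
    size = sizeOf (undefined :: CInt)
```

alloca is usually the simpler choice; malloc and free are only needed when the memory has to outlive the enclosing action.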
Calling C functions
Passing opaque structures/types
Problem: A C function creates an opaque structure, which I must later pass to other C functions. What type should I use?
Solution: Create a datatype to represent the opaque structure. Note that the C functions expect a pointer to the structure, so I've created a type synonym called OCIHandle for these.
> data OCIStruct = OCIStruct
> type OCIHandle = Ptr OCIStruct

GHC allows this constructor-less version (with -fglasgow-exts):

> data OCIStruct
This would also work, and it saves a few lines of typing:
 -- no need for the data declaration
 type OCIStruct = Ptr ()
I don't know if there are any side effects to this, but it works fine for me so far. -- eyan at eyan dot org
The side-effect I wanted to avoid was using the wrong pointer at the wrong time. Consider:
 type EnvStruct = Ptr ()
 type EnvHandle = Ptr EnvStruct
 type ErrorStruct = Ptr ()
 type ErrorHandle = Ptr ErrorStruct
ErrorHandle and EnvHandle have the same type i.e. you can use one where you would use the other. I would rather use different datatypes so the compiler can help me catch these type errors. Better would be:
 data EnvStruct = EnvStruct
 type EnvHandle = Ptr EnvStruct
 data ErrorStruct = ErrorStruct
 type ErrorHandle = Ptr ErrorStruct
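A small sketch of what the distinct types buy you. The helpers isNullEnv and toGeneric are hypothetical names used only for illustration. Empty data declarations are part of Haskell 2010, so the constructor-less form should need no extension flag on a reasonably recent compiler.

```haskell
import Foreign.Ptr (Ptr, castPtr, nullPtr)

-- Opaque tags: no constructors, so no values can be built on the Haskell side.
data EnvStruct
data ErrorStruct

type EnvHandle = Ptr EnvStruct
type ErrorHandle = Ptr ErrorStruct

-- Accepts only environment handles.
isNullEnv :: EnvHandle -> Bool
isNullEnv h = h == nullPtr

-- When a C function really does take a generic handle, cast explicitly.
toGeneric :: Ptr a -> Ptr ()
toGeneric = castPtr

-- Rejected by the type checker, which is the whole point:
--   bad :: ErrorHandle -> Bool
--   bad e = isNullEnv e
```

The cast is visible in the source wherever the type distinction is deliberately dropped, which makes such places easy to audit.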
Problem: C function takes a pointer-to-a-pointer argument, which is modified to point to some newly allocated structure or value. The return value of the C function is a success-or-failure code (int). So we effectively have parameters which are in-out. How do you wrap these in Haskell functions that return the actual structure (and raise an exception on failure)?
Single argument case
If the function only modifies one of its arguments, then use code like this:
OCIHandle is a synonym for Ptr OCIStruct, so the second argument to ociHandleAlloc has type Ptr (Ptr OCIStruct). The C signature for OCIHandleAlloc declares the second argument as void **, i.e. a pointer to a pointer to something.

> foreign import ccall "oci.h OCIHandleAlloc" ociHandleAlloc ::
>   OCIHandle -> Ptr OCIHandle -> CInt -> CInt -> Ptr a -> IO CInt
>
> handleAlloc :: CInt -> OCIHandle -> IO OCIHandle
> handleAlloc handleType env = alloca $ \ptr -> do
>   rc <- ociHandleAlloc env ptr handleType 0 nullPtr
>   if rc < 0
>     then throwOCI (OCIException rc "allocate handle")
>     else peek ptr
(Shouldn't that be "... else peek ptr"?)
(yes, it should. Fixed)
Here, alloca allocates memory for ptr, which is then passed to the foreign function. alloca is preferred because it frees the memory for ptr when the function exits or when an exception is raised. Use peek to get at the value returned. alloca takes a function of a single argument, the newly allocated ptr, which returns an IO action. We use a lambda expression here to create that function anonymously.
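The same pattern can be tried without an Oracle client installed. The sketch below uses the C library's strtol as a stand-in, since its second argument (char **endptr) is also a pointer-to-pointer that the callee fills in; parseLeadingInt is a name invented for this example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.String (CString, peekCString, withCString)
import Foreign.C.Types (CInt (..), CLong (..))
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek)

-- long strtol(const char *str, char **endptr, int base);
foreign import ccall unsafe "stdlib.h strtol"
  c_strtol :: CString -> Ptr CString -> CInt -> IO CLong

-- Parse a leading decimal integer and return it with the unparsed remainder.
parseLeadingInt :: String -> IO (Integer, String)
parseLeadingInt s =
  withCString s $ \cstr ->
    alloca $ \endPtr -> do
      n <- c_strtol cstr endPtr 10
      -- endPtr now holds a pointer into cstr, so it must be read
      -- before withCString releases the buffer.
      rest <- peek endPtr >>= peekCString
      return (toInteger n, rest)
```

For example, parseLeadingInt "42abc" yields (42, "abc"). Note that the out-parameter is read while the memory it points into is still alive; the OCI wrapper above does not have that concern because the handle it returns is owned by the C library.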
Multiple argument case
If the function modifies more than one of its arguments, then things get a little more complex. In this case we have to allocate the memory for the arguments (again, using the alloca* family of functions), call the C function, and extract the values. In this example the ociErrorGet function modifies its fourth and fifth arguments (an int and a string respectively). I've chosen an arbitrary size for the string buffer: 1000 bytes.
> getOCIErrorMsg2 :: OCIHandle -> CInt -> Ptr CInt -> CString -> CInt -> IO (CInt, String)
> getOCIErrorMsg2 ocihandle handleType errCodePtr errMsgBuf maxErrMsgLen = do
>   rc <- ociErrorGet ocihandle 1 0 errCodePtr errMsgBuf maxErrMsgLen handleType
>   if rc < 0
>     then return (0, "Error message not available.")
>     else do
>       msg <- peekCString errMsgBuf
>       e <- peek errCodePtr
>       return (e, msg)
>
> getOCIErrorMsg :: OCIHandle -> CInt -> IO (CInt, String)
> getOCIErrorMsg ocihandle handleType = do
>   let stringBufferLen = 1000
>   allocaBytes stringBufferLen $ \errMsg ->
>     alloca $ \errCode ->
>     getOCIErrorMsg2 ocihandle handleType errCode errMsg (mkCInt stringBufferLen)
(Thanks to Udo Stenzel for tips for avoiding memory leaks.)
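To experiment with the nested-allocation pattern without linking against OCI, here is a rough sketch in which a Haskell function, fakeErrorGet, plays the role of the C function. It is purely a stand-in invented for this example (as are the error code and message it writes); a real binding would use a foreign import instead.

```haskell
import Foreign.C.String (CString, castCharToCChar, peekCString)
import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (alloca, allocaBytes)
import Foreign.Marshal.Array (pokeArray0)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- Hypothetical stand-in for a C function with two out-parameters:
-- it writes an error code and a NUL-terminated message, and returns a status.
fakeErrorGet :: Ptr CInt -> CString -> CInt -> IO CInt
fakeErrorGet codePtr buf bufLen = do
  poke codePtr 942
  -- Leave room for the terminating NUL.
  let msg = take (fromIntegral bufLen - 1) "table or view does not exist"
  pokeArray0 0 buf (map castCharToCChar msg)
  return 0

-- Allocate both out-parameters, call the function, then extract the values
-- while the buffers are still in scope.
getErrorMsg :: IO (CInt, String)
getErrorMsg =
  allocaBytes bufLen $ \buf ->
    alloca $ \codePtr -> do
      rc <- fakeErrorGet codePtr buf (fromIntegral bufLen)
      if rc < 0
        then return (0, "Error message not available.")
        else do
          code <- peek codePtr
          msg <- peekCString buf
          return (code, msg)
  where
    bufLen = 1000 :: Int
```

Because both allocations are made with the alloca family, nothing leaks even if the call in the middle throws.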
The C function expects strings with lengths, where each string argument is followed by an int stating how long it is. Convert Haskell Strings to CStringLens, and pull the CStringLens apart with utility functions. CStringLen is just a (CString, Int) pair. (Would it have been better to make it a (CString, CInt) pair?)
> mkCInt :: Int -> CInt
> mkCInt n = fromIntegral n
> cStrLen :: CStringLen -> CInt
> cStrLen = mkCInt . snd
> cStr :: CStringLen -> CString
> cStr = fst
>
> dbLogon :: String -> String -> String -> EnvHandle -> ErrorHandle -> IO ConnHandle
> dbLogon user pswd db env err =
>   withCStringLen user $ \userC ->
>   withCStringLen pswd $ \pswdC ->
>   withCStringLen db $ \dbC ->
>   alloca $ \conn -> do
>     rc <- ociLogon env err conn (cStr userC) (cStrLen userC) (cStr pswdC) (cStrLen pswdC) (cStr dbC) (cStrLen dbC)
>     case () of
>       _ | rc == oci_SUCCESS_WITH_INFO -> testForErrorWithPtr oci_ERROR "logon" conn
>         | otherwise -> testForErrorWithPtr rc "logon" conn
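A self-contained way to try the string-plus-length convention is to call a standard C function that takes explicit lengths. The sketch below uses memcmp from the C library as a stand-in for ociLogon; sameBytes is a made-up name. The lengths reported by withCStringLen count bytes in the marshalled string, not Haskell characters.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.String (withCStringLen)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Ptr (Ptr)

-- int memcmp(const void *a, const void *b, size_t n);
foreign import ccall unsafe "string.h memcmp"
  c_memcmp :: Ptr a -> Ptr b -> CSize -> IO CInt

-- Compare two strings byte for byte, passing pointer and length separately.
sameBytes :: String -> String -> IO Bool
sameBytes a b =
  withCStringLen a $ \(pa, la) ->
    withCStringLen b $ \(pb, lb) ->
      if la /= lb
        then return False
        else do
          -- The Int length must be converted to the C type explicitly.
          rc <- c_memcmp pa pb (fromIntegral la)
          return (rc == 0)
```

Pattern matching on the pair, as done here, is an alternative to the cStr and cStrLen helpers; which reads better is largely a matter of taste.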
Raising and handling exceptions
Follow the advice for Dynamic Exceptions, in: http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control.Exception.html#10
Create your own exceptions, and your own throw and catch functions. This makes it easier to trap only exceptions raised by your code.
> data OCIException = OCIException CInt String deriving (Typeable, Show)
>
> catchOCI :: IO a -> (OCIException -> IO a) -> IO a
> catchOCI = catchDyn
> throwOCI :: OCIException -> a
> throwOCI = throwDyn

If we can't derive Typeable then the following code should do the trick:

> -- replaces:
> data OCIException = OCIException CInt String deriving (Show)
> ociExceptionTc :: TyCon
> ociExceptionTc = mkTyCon "Database.Oracle.OciFunctions.OCIException"
> instance Typeable OCIException where typeOf _ = mkAppTy ociExceptionTc []
Use the catch functions like this: (Here convertAndRethrow converts the low-level FFI exceptions from one module into higher (application-level) exceptions.)
> commit :: Session -> IO ()
> commit (Session env err conn) = catchOCI ( do
>     OCI.commitTrans err conn
>   ) (\exc -> convertAndRethrow err exc nullAction)
>
> nullAction :: IO ()
> nullAction = return ()
>
> convertAndRethrow :: ErrorHandle -> OCIException -> IO () -> IO ()
> convertAndRethrow err exc cleanupAction = do
>   (e, m) <- OCI.formatErrorMsg exc err
>   cleanupAction
>   throwDB (DBError e m)
Or, an example that must do some cleanup when the exception is thrown: (Note also that the exception handler must return a value of the same type as the main action.)
> logon :: String -> String -> String -> EnvHandle -> ErrorHandle -> IO ConnHandle
> logon user pswd dbname env err = catchOCI ( do
>     connection <- OCI.dbLogon user pswd dbname env err
>     return connection
>   ) (\ociexc -> do
>     convertAndRethrow err ociexc $ do
>       freeHandle (castPtr err) oci_HTYPE_ERROR
>       freeHandle (castPtr env) oci_HTYPE_ENV
>     return undefined
>   )
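Later versions of the base library replaced dynamic exceptions (catchDyn, throwDyn) with the Exception class. As a rough sketch of the same convert-and-rethrow idea in that style, assuming GHC 7.10 or later (where Typeable instances are derived automatically), and with simplified exception types plus the hypothetical names failingAction and withConversion:

```haskell
import Control.Exception (Exception, catch, throwIO)

-- Low-level (FFI layer) exception; simplified to carry an Int code.
data OCIException = OCIException Int String deriving Show
instance Exception OCIException

-- Application-level exception.
data DBError = DBError Int String deriving Show
instance Exception DBError

-- Hypothetical action that fails the way a low-level call might.
failingAction :: IO a
failingAction = throwIO (OCIException 1017 "invalid username/password")

-- Run an action; if it raises an OCIException, run the cleanup and
-- rethrow as a DBError. Other exceptions pass through untouched.
withConversion :: IO () -> IO a -> IO a
withConversion cleanup action =
  action `catch` \(OCIException code msg) -> do
    cleanup
    throwIO (DBError code msg)
```

Since throwIO has type IO a for any a, the handler already has the right result type and there is no need for a trailing return undefined.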
Suppose I've got a pointer-to-function, a FunPtr. How do I call the pointed-to function from Haskell? (This is a real problem: when I tried to create a binding to Libdb 4, I found that all its functions are actually FunPtrs contained in structs. I really don't want to write a C function that extracts and dereferences the pointer for every single one of them.) -- UdoStenzel
I haven't done this before, so I can only suggest looking at the docs and experimenting: http://www.haskell.org/ghc/docs/latest/html/libraries/base/Foreign.Ptr.html#t%3AFunPtr
This comment (from that Foreign.Ptr page) might help: "To convert FunPtr values to corresponding Haskell functions, one can define a dynamic stub for the specific foreign type, e.g.
 type IntFunction = CInt -> IO ()
 foreign import ccall "dynamic"
   mkFun :: FunPtr IntFunction -> IntFunction"
Thanks, I somehow missed that note. Now it seems that for every FunPtr in some structure I need to define a separate dynamic import? This is annoying: I'd have to spell out the type of every such function at least twice (three times when counting the convenient Haskell wrapper)! Is there a way around it? Maybe a preprocessor (c2hs comes close, but doesn't seem to handle FunPtrs)? -- UdoStenzel
That is what HSFFIG (see the HsffigTutorial page) tries to address, especially for function pointers held in structure fields and the parsing of their type signatures. The problems with BerkeleyDB described above more or less inspired the creation of HSFFIG. See also the HsffigExamples page.
What HSFFIG does not do well yet is the automatic creation of dynamic wrappers for FunPtrs passed as other functions' parameters and/or return values: this is available only in part and is not always done in a consistent way. -- DimitryGolubovsky
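For readers who want to see a dynamic stub in action, here is a small sketch that needs nothing beyond the C library. It takes the address of the standard abs function (standing in for a pointer you would normally read out of a struct) and calls through it. The names IntFun, absPtr, mkIntFun and callThroughPointer are chosen for this example only.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CInt (..))
import Foreign.Ptr (FunPtr)

-- The C type int (*)(int), written once and reused below.
type IntFun = CInt -> CInt

-- Address of the C library's abs; in a real binding the FunPtr
-- would typically be peeked out of a C struct instead.
foreign import ccall "stdlib.h &abs"
  absPtr :: FunPtr IntFun

-- Dynamic stub: turns any function pointer of this type into a Haskell function.
foreign import ccall "dynamic"
  mkIntFun :: FunPtr IntFun -> IntFun

callThroughPointer :: CInt -> CInt
callThroughPointer = mkIntFun absPtr
```

One stub per distinct C function type is enough, so struct fields that share a signature can share a stub; a type synonym such as IntFun keeps the repetition down.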