GHC/Using the FFI

From HaskellWiki
< GHC
Revision as of 20:33, 8 February 2007 by Brianh (talk | contribs) (More visible warning about dangers of DllMain())
Jump to navigation Jump to search

Preliminary note

The examples below show how to call C++ functions in a Microsoft Windows Visual Studio DLL from Haskell, and Haskell functions from the C++ DLL. This illustrates that it is possible to link Haskell built with GHC to code produced by a different C compiler (GHC uses GCC), and also illustrates some workarounds to the specific problems encountered when building a standalone library (in this case a Windows DLL) that is to be called from Haskell. The examples are also relevant for the simpler case where you just want to statically link against C that is compiled with GCC.

Setting up your build environment (Visual Studio Specific)

  • If you haven't already done so, create a new windows environment variable called GHC_HOME and set it to c:\ghc\ghc-6.4.2 or wherever you've installed GHC
  • Create a directory eg c:\dll to store all your DLLs
  • Add this to the PATH environment variable so that Windows will find your DLL at runtime (Also your path should contain %GHC_HOME%\bin;%GHC_HOME%\gcc-lib if you also want to compile from the command line)
  • Create a new Visual Studio DLL project. For the purposes of this example, we will call this Duma.
  • In the Solution pane, right click on Duma then select Properties → Build Events → Post-Build Event and add the following event:
       Command Line:        copy "$(TargetDir)Duma.dll" c:\dll
       Description:         Copying duma.dll
       Excluded From Build: No
  • In the Solution Pane, rename the folder Source Files to Code Files and adjust the properties of this folder so that .h files are also filtered into it. (This is a good idea for any C++ project because it keeps the .cpp and .h together in the same list which makes editing a C++ unit a lot easier.)
  • Also rename the folder Header Files to Haskell Files and change its filter to select .hs files.
  • In the Code Files folder, create a text file called Duma.def, then select Properties → Custom Build Step → General and enter the following:
       Command Line: %GHC_HOME%\gcc-lib\dlltool -d "$(InputDir)Duma.def" -l c:\dll\libDuma.a
       Description:  Creating libDuma.a from .def
       Outputs:      c:\dll\libDuma.a
       Additional Dependencies:

This makes the functions you will export from your DLL visible to GHC and GCC. (The path to dlltool given above will need to be changed if you have installed GHC in a different place.)

  • Also in the Code Files folder, create a text file called ghc.bat with the following content:
       echo off
       REM Optimized build
       %GHC_HOME%\bin\ghc.exe -fglasgow-exts -fffi -I. -#include Duma.h --make main.hs -O2 -optl-lDuma -optl-L"c:\dll"
       if errorlevel 1 goto failed
       echo Success :-)
       goto end
       :failed
       echo Failure :-(
       :end

(The above batch file is based on that kindly supplied by Neil Mitchell on the GHC mailing list.)

You may also want to use -optl-mwindows if you don't want Haskell to create a console window.

  • In Tools → External Tools, click Add and specify the following:
        Title: &ghc
        Command: ghc.bat
        Arguments:
        Initial directory: $(ProjectDir)

Remember to tick the Use Output Window box so you can see what's happening on the output pane of Visual Studio.

After you've done all this, you can build your DLL as usual by hitting Control-Shift-B and then make your Haskell program that uses the DLL by using Alt-t-g (assuming that your main.hs is in the same directory as the C files). When you just make changes to your DLL, you can use F5 as normal (the first time you do this you will have to type main.exe into the box where it asks you for the main executable to use).

If you later install a newer version of GHC, you only need to modify the GHC_HOME windows environment variable.

Including the FFI header

The typedefs needed by foreign C functions are in HsFFI.h. However, this file includes various other files, which in turn include other files, and makes use of various definitions which would need to be defined somewhere etc, and doesn't compile under Visual C++ (for example).

All in all, the simplest solution is to just make your own header with just the typedefs that are actually needed, and put this in your project directory, for example as FFI.h:

#ifndef FFI_H
#define FFI_H
typedef unsigned int HsChar;  // on 32 bit machine
typedef int HsInt;
typedef unsigned int HsWord;
typedef void *HsPtr;
typedef void (*HsFunPtr)(void);
typedef void *HsForeignPtr;
typedef void *HsStablePtr;
#define HS_BOOL_FALSE 0
#define HS_BOOL_TRUE 1
#endif // FFI_H

Note that there is no point including prototypes for the functions hs_free_fun_ptr etc, because there is no way to link a standalone C library (such as a DLL) to them. However this is not a problem because your Haskell code can explicitly pass the Haskell versions of these functions (wrapped in a FunPtr) to your C library if you need them.

Example of aquiring and using a foreign resource

Suppose you are writing a DLL to provide a nice windowing environment for Haskell, and you want to let Haskell obtain a text font. Suppose a font is implemented by the C++ class Font, defined by:

// Duma_Font.h
namespace Duma {
   class Font {
   public:
      Font(const char *name) : refcount(0){/* details omitted! */}
      ~Font(){}
      void AddRef() const{++refcount;}
      void Release() const{if (!--refcount) delete this;}
   private:
      mutable unsigned int refcount;
   };
} // Duma

Continuing with our example DLL called Duma, we now edit Duma.h to add the following prototypes:

 // Duma.h
 #ifndef __DUMA_H  // be careful to use #ifndef (!)
 #define __DUMA_H
 // This header needs to be understood by VC++ and GCC
 // When included in the source for building the DLL,
 // __DUMA_DLL_EXPORT must be defined
 #ifdef __DUMA_DLL_EXPORT
 #define DUMA_API __declspec(dllexport)
 #else
 #define DUMA_API
 #endif
 #ifdef __cplusplus
   extern "C" {
 #endif
 DUMA_API HsPtr duma_createFont(const char *name);
 DUMA_API void duma_releaseFont(HsPtr fontRaw);
 #ifdef __cplusplus
   }
 #endif
 #endif // __DUMA_H

We need to tell GHC about these, so we edit Duma.def:

 LIBRARY Duma
 EXPORTS
   duma_createFont
   duma_releaseFont

Then we implement them in Duma.cpp:

 // Duma.cpp
 #include "stdafx.h"
 #include "FFI.h"
 // Note we have to define __DUMA_DLL_EXPORT and also
 // make sure FFI.h is included before we inlcude Duma.h
 #define __DUMA_DLL_EXPORT
 #include "Duma.h"
 #include "Duma_Font.h"
 BOOL APIENTRY DllMain( HANDLE hModule, 
                        DWORD  ul_reason_for_call, 
                        LPVOID lpReserved
                       ){
   // Note this is a *very* dangerous function so do nothing at all in here
   // It is not known which DLLs are currently loaded
   // If you need to do initialization/deinitialization, you should do
   // this explicitly by exporting init/deinit functions to be called from
   // Haskell
   return TRUE;
 }
 using namespace Duma;
 #ifdef __cplusplus
   extern "C" {
 #endif
 DUMA_API HsPtr duma_createFont(const char *name){
   Font *fontRaw = new Font(name);
   fontRaw->AddRef();
   return fontRaw;
 }
 DUMA_API void duma_releaseFont(HsPtr fontRaw){
   reinterpret_cast<Font *>(fontRaw)->Release();
 }
 #ifdef __cplusplus
   }
 #endif

The header Duma.h is used both within our DLL and externally, by GCC, to verify that the foreign function prototypes match those expected by the FFI. When used within the DLL, we make DUMA_API expand to the relevant linkage declaration for Visual C++ by defining __DUMA_DLL_EXPORT. Since (we hope) this symbol will not be defined when GCC sees the header, GCC will see a view of the header suitable for it.

Also in regards to the header, we have made our own local FFI typedefs to avoid the problems of trying to get HsFFI.h and all that it includes, to compile. However GCC will see the real HsFFI.h so when we include Duma.h in our DLL, we have to remember to include FFI.h before it. Of course we could have used some other symbol and another #ifdef in Duma.h to cause FFI.h to be included when needed but since Duma.h is only used in one place in the DLL ie by Duma.cpp there does not seem much point in going to this extra trouble.

The next thing is to create a Haskell module which will allow us to treat fonts as values which are automatically released when they are no longer needed, so for this we edit a module eg Duma.Font by creating a directory called Duma in the same directory as your C++ files, then creating a file Font.hs in this directory:

 module Duma.Font
    ( createFont
    , Font
    ) where
 import Foreign.C.String
 import Foreign.Ptr
 import Foreign.ForeignPtr
 import Control.Exception (block)
 data RawFont -- corresponds to the C++ Font
 newtype Font = Font (ForeignPtr RawFont) deriving (Eq, Ord)
 foreign import ccall duma_createFont :: CString -> IO (Ptr RawFont)
 foreign import ccall
    "&duma_releaseFont"   -- note the ampersand
    duma_releaseFont :: FunPtr (Ptr RawFont -> IO ())
 createFont :: String -> IO Font
 createFont name = block $ do
    f <- (withCString name $ \cname -> duma_createFont cname) >>=
                newForeignPtr duma_releaseFont
    return $ Font f

There are several points to note about this code:

  1. We are using three types of pointers:
    Ptr
    Corresponds to a plain C pointer
    FunPtr
    Points to a Haskell or (in this case) C function, and encapsulates marshalling details necessary both before entry and after leaving the function. For this reason a FunPtr must never be destroyed while the function it points to is still being executed
    ForeignPtr
    Associates a plain C pointer with a finalizer function which will be invoked when the ForeignPtr value is garbage collected
  2. It is important to realise that the type declaration of a foreign import shows the type of the Haskell value, not the type of the corresponding C entity that the Haskell value is bound to. When the specification of the C entity in quotes is omitted, the FFI binds a Haskell value of type A -> B (or IO B) to a C function that takes a single argument of type A and returns a value of type B. If the Haskell value just has type B (or IO B), the FFI expects to find a C function that takes no arguments and returns a value of type B. However in the case of duma_releaseFont, the Haskell value is a function pointer, and we want this to point to the C function duma_releaseFont, thus we need to explicitly specify the C entity being bound and put an ampersand before the C name, to indicate that we are binding to the address of the C function not the C function itself. If we omitted the ampersand, the FFI would think that the C entity to use was a function taking no arguments that returned the function pointer rather than actually being the function pointer itself. It is worth taking the time to achieve clarity about the difference between a function and a pointer to a function, especially since this distinction is blurred in C by the implicit "casting" of a function name (which is considered to be the function itself, and which is not a first class value in C) to the address of a function. Luckily, if you are building with an optimized build, or use the -fviaC option, GCC will give an error message if you forget the ampersand, but beware: GHC native compilation will silently succeed and your app will mysteriously crash.
  3. block is used to make the creation of the font atomic with respect to asynchronous exceptions
  4. withCString marshalls a Haskell String to a null terminated C string, and newForeignPtr takes the Ptr RawFont returned by duma_createFont and returns a ForeignPtr RawFont so that duma_releaseFont will be called on the underlying Ptr RawFont when the ForeignPtr RawFont value is no longer needed. Finally we wrap the ForeignPtr RawFont in a newtype to hide all this from the end user.

Calling Haskell from C

As mentioned above, when building a standalone C library, we can't link our C code to the C functions like hs_free_fun_ptr. However we can solve this problem by passing a FunPtr to the Haskell function freeHaskellFunPtr to our library at runtime.

In the following code, we provide a Haskell function to try to initialize our C library (eg the DLL), run some Haskell code which would set up some call backs, enter some kind of message loop (eg for a Windows app), then deinitialize the C library:

module Duma
     ( run
     , module Duma.Font
     ) where
import Foreign.Ptr
import Foreign.ForeignPtr
import Control.Exception (bracket)
import Duma.Font
foreign import ccall duma_begin :: FunPtr (FunPtr a -> IO ()) -> IO Bool
foreign import ccall duma_end :: IO ()
foreign import ccall duma_run :: IO () -- implements a Windows message loop
foreign import ccall "wrapper" mkFreeFunPtr :: (FunPtr a -> IO ()) -> IO (FunPtr (FunPtr a -> IO ()))
run :: IO a -> IO ()
run f = bracket
          (mkFreeFunPtr freeHaskellFunPtr
          )
          (\freeFunPtrFn -> do
             duma_end
             freeHaskellFunPtr freeFunPtrFn
          )
          (\freeFunPtrFn -> do
             initialized <- duma_begin freeFunPtrFn
             if (initialized)
               then f >> duma_run
               else return ()
          )

An example of C code would be:

 typedef void (*FunPtrFn)(HsFunPtr fn);
 FunPtrFn freeFunPtrFn = NULL;
 __declspec(dllexport) HsBool duma_begin(HsFunPtr freeFunPtrFnRaw){
    freeFunPtrFn = reinterpret_cast<FunPtrFn>(freeFunPtrFnRaw);
    return HS_BOOL_TRUE;
 }

Then within some C function which needs to release a FunPtr, you can just write:

 (*freeFunPtrFn)(the_fun_ptr_to_free);

Note that it is not safe to use the freeFunPtrFn to free itself, because some implementations of FunPtr store exit code (as well as entry code) in the FunPtr thus a FunPtr needs to be thought of as a function holder rather than just a function pointer (this is why the definition of run above frees the pointer from the Haskell side once we've already deinitialized the C library).

A scavenger pattern ([1]) can be used to avoid the danger of a FunPtr being freed while the function it points to (actually holds) is still being executed.

Note that a direct pointer to the hs_free_fun_ptr function can be obtained this way:

 foreign import ccall "&" hs_free_fun_ptr :: FunPtr (FunPtr a -> IO ())

Improving efficiency

There are two types of foreign call:

  1. those where the foreign function may call back into Haskell (includes anything which makes use of the Haskell runtime) before the original foreign call returns
  2. those where the function will always return without first calling back into Haskell.

In the first case, a significant amount of book-keeping is required to ensure that Haskell is ready to accept "incomming" calls from the foreign function before it returns, whereas in the second, since there is no danger of a call back into the Haskell runtime, nothing special needs to be done except to marshal the arguments and result as usual. Therefore the second type of call can be made a lot faster. To specify which type of call to use with a given foreign function, the FFI provides the keywords safe and unsafe. The default is safe (calls which might call back into Haskell before returning), so we only need to annotate those functions which we know will not call back into Haskell eg in our example above:

 foreign import ccall unsafe duma_createFont :: CString -> IO (Ptr RawFont)
 foreign import ccall unsafe
    "&duma_releaseFont"   -- note the ampersand
    duma_releaseFont :: FunPtr (Ptr RawFont -> IO ())

Note that duma_run still needs to be safe because it will execute Haskell IO actions in response to Windows messages (the actual code for doing this is not included in the example), and duma_end also needs to be safe because it will make use of the Haskell function freeHaskellFunPtr (or hs_free_fun_ptr) to free any callbacks that were registered before returning. The annotation you choose for duma_begin would depend on whether or not you need to call back into the Haskell runtime during initialization of your DLL.

Caution! If you are not responsible for the source of the foreign function you should not mark the import as unsafe unless you are absolutely certain it doesn't (and won't ever) call back into Haskell... If in doubt, consider posting a question to the Haskell Cafe or GHC User's mailing list.

Debugging Haskell DLLs

Some folks tend to use print, putStrLn, hPutStrLn and friends to emit debug messages to stdout or stderr. This works fine, if you are developing console application, but if you are writing DLL then you can't be sure that it will be embeded in a process with console interface. The problem is that when the process is with graphical interface, then it doesn't have associated stdin, stdout and stderr at all. Any attempt to use putStrLn for example will raise an exception and the process will terminate. The solution is to use Debug.Trace.trace or Debug.Trace.putTraceMsg instead. When they are called from a process with console interface then the messages are redirected to stderr. When the process is with graphical interface then the debugger console is used instead. If you want to debug your DLL, then launch the embedding process from your favorite debugger (gdb, WinDbg or Visual Studio for example) and you will see all debug messages in its output console.

Beware of DllMain()!

The body of a DllMain() function is an extremely dangerous place! This is because the order in which DLLs are unloaded when a process is terminating is unspecified. This means that the DllMain() for your DLL may be called when other DLLs containing functions that you call when de-initializing your DLL have already been unloaded. In other words, you can't put shutdown code inside DllMain(), unless your shutdown code only requires use of certain functions which are guaranteed to be available (see the Platform SDK docs for more info).

In particular, if you are writing a DLL that's statically linked with Haskell, it is not safe to call hs_exit() from DllMain(), since hs_exit() may make use of other DLLs. (For example it causes finalizers to be run so the finalizer may need to make use of a function in some other DLL.)

A solution is to always export Begin() and End() functions from your DLL, and call these from the application that uses the DLL, so that you can be sure that all DLLs needed by any shutdown code in your End() function are available when it is called.

Random Questions

Are FunPtr's stable?

Yes. Once you've obtained a FunPtr the FunPtr remains valid until it is explicitly freed (ie the garbage collector will not break it). See Tackling the Awkward Squad section 6.4.2

Can I free a FunPtr from inside the function itself?

Yes (unless you are on IA64, right now). See this thread (the later messages in particular) for more details.

Further reading