From ctypes to cython for C library wrapping

Since the cython presentation by R. Bradshaw at Scipy08, I wanted to give cython a shot to wrap existing C libraries. Up to now, my method of choice has been ctypes, because it is relatively simple, and can be done in python directly.

The problem with ctypes

I was not entirely satisfied with ctypes, in particular because it is sometimes difficult to control some platform dependant details, like type size and so on; ctypes has of course the notion of platform-independant type with a given size (int32_t, etc…), but some libraries define their own type, with underlying implementation depending on the platform. Also, making sure the function declarations match the real ones is awckward; ctypes’ uthor Thomas Heller developed a code generator to generate those declarations from headers, but they are dependent on the header you are using; some libraries unfortunately have platform-dependant headers, so in heory you should generate the declarations at installation, but this is awckward because the code generator uses gccxml, which is not widely available.

Here comes cython

One of the advantage of Cython for low leve C wrapping is that cython declarations need not be exact: in theory, you can’t pass an invalid pointer for example, because even if the cython declaration is wrong, the C compiler will complain on the C file generated by cython. Since the generated C file uses the actual header file, you are also pretty sure to avoid any mismatch between declarations and usage; at worse, the failure will happen at compilation time.

Unfortunately, cython does not have a code generator like ctypes. For a long time, I wanted to add sound output capabilities to audiolab, in particular for mac os X and ALSA (linux). Unfortunately, those API are fairly low levels. For example, here is an extract of AudioHardware (the HAL of CoreAudio) usage:

<br />
AudioHardwareGetProperty(kAudioHardwarePropertyDefaultOutputDevice,<br />
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &amp;count, (void *) &amp;(audio_data.device))</p>
<p>AudioDeviceGetProperty(audio_data.device, 0, false,<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; kAudioDevicePropertyBufferSize,<br />
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &amp;count, &amp;buffer_size)<br />

Mac OS X conventions is that variables starting with k are enums, defined like:

<br />
kAudioDevicePropertyDeviceName = 'name',<br />
kAudioDevicePropertyDeviceNameCFString = kAudioObjectPropertyName, kAudioDevicePropertyDeviceManufacturer = 'makr',<br />
kAudioDevicePropertyDeviceManufacturerCFString = kAudioObjectPropertyManufacturer,<br />
kAudioDevicePropertyRegisterBufferList = 'rbuf',<br />
kAudioDevicePropertyBufferSize = 'bsiz',<br />
kAudioDevicePropertyBufferSizeRange = 'bsz#',<br />
kAudioDevicePropertyChannelName = 'chnm',<br />
kAudioDevicePropertyChannelNameCFString = kAudioObjectPropertyElementName,<br />
kAudioDevicePropertyChannelCategoryName = 'ccnm',<br />
kAudioDevicePropertyChannelNominalLineLevelNameForID = 'cnlv'<br />
...<br />

Using the implicit conversion char[4] to int – which is not supported by cython AFAIK. With thousand of enums defined this way, any process which is not mostly automatic will be painful.

During Scipy08 cython’s presentation, I asked whether there was any plan toward automatic generation of cython ‘headers’, and Robert fairly answered please feel free to do so. As announced a couple of days ago, I have taken the idea of ctypes code generator, and ‘ported’ it to cython; I have used on scikits.audiolab to write a basic ALSA and CoreAudio player, and used it to convert my old ctypes-based wrapper to sndfile (a C library for audio file IO). This has worked really well: the optional typing in cython makes some part of the wrapper easier to implement than in ctypes (I don’t need to check whether an int-like argument won’t overflow, for example). Kudos to cython developers !

Usage on alsa

For completness, I added a simple example on how to use xml2cython codegen with ALSA, as used in scikits.audiolab. Hopefully, it should show how it can be used for other libraries. First, I parse the headers with gccxml; I use the ctypes codegenlib helper:

h2xml /usr/include/alsa/asoundlib.h -o asoundlib.xml

Now, I use the xml2cython script to parse the xml file and generate some .pxd files. By default, the sript will pull out almost everything from the xml file, which will generate a big cython file. xml2cython has a couple of basic filters, though, so that I only pull out what I want; in the alsa case, I was mostly interested by a couple of functions, so I used the input file filter: -i input -o alsa.pxd alsa/asoundlib.h asoundlib.xml

Which will generates alsa.pxd with declarations of functions whose name matches the list in input, plus all the typedefs/structures used as arguments (they are recursively pulled out, so if one argument is a function pointer, the types in the function pointer should hopefully be pulled out as well). The exception is enums: every enums defined in the parsed tree from the xml are put out automatically in the cython file, because ‘anonymous’ enums are usually not part of function declarations in C (enums are not typed in C, so it is not so useful). This means every enum coming from standard header files will be included as well, and this is ugly – as well as making cython compilation much slower. So I used a location filter as well, which tells xml2cython to pull out only enums which are defined in some files match by the filter: -l alsa -i input -o alsa.pxd alsa/asoundlib.h asoundlib.xml

This works since every alsa header on my system is of the form /usr/include/alsa/*.h. I used something very similar on AudioHardware.h header in CoreAudio. The generated cython can be seen in scikits trunk here. Doing this kind of things by hand would have been particularly error-prone…

cython-codegen: cython code generator from gccxml files

I have enjoyed using cython to wrap from C libraries recently. Unfortunately, some libraries I was interested in (Alsa, CoreAudio) are quite big. In particular, they have a lot of structures, typedefs and enumerations which are easy to get wrong by doing it manually. Since the problem is quite similar to wrapping with ctypes (my former method of choice), I thought it would be interesting to do something like ctypeslib code generator for cython – hencecython-codegen “project”, available on github:

Basic usage goes like this to generate a .pyx file for the foo.h header:

gccxml -I. foo.h -o foo.xml -l 'foo' foo.h foo.xml

I can’t stress enough that this is little more than a throw-away script, and is likely to fail on many header files, or generate invalid cython code. I could use it successfully on non trivial headers though, like alsa or CoreAudio on Mac OS X. Your mileage may vary.