Notes on building NumPy/SciPy with OpenBlas

This is a quick post to show how to build NumPy/SciPy with OpenBlas on Mac OS X. OpenBlas is a recently open-sourced version of Blas/Lapack that is competitive with the proprietary implementations, without being as hard to build as Atlas.

Note: this is experimental, largely untested, and I would not recommend using this for anything worthwhile at the moment.

Building OpenBlas

After checking out the sources from GitHub, I had the most luck building OpenBlas with a custom-built clang (I used llvm 3.1). With the Apple-provided clang, I got some errors related to unsupported opcodes (fsubp).

With the correct version of clang, building is a simple matter of running make (CPU is automatically detected).

Building NumPy/SciPy

I have just added initial support for customizable blas/lapack in the bento build of NumPy (and SciPy). You will need a very recent clone of the NumPy git repo, and a recent bento. The single file distribution of bento is the simplest way to make this work:

./ configure --with-blas-lapack-libdir=$OPENBLAS_DIRECTORY --blas-lapack-type=openblas ..
./ build -j4 # build with 4 processes in parallel

Same for SciPy. The code for bento’s blas/lapack detection is not very robust nor well tested, so it will likely not work on most platforms.

Why setuptools does not matter to me

It is that time of the year when packaging questions resurface in the open (on python-dev and by Armin).

Armin wrote an article on why he loves setuptools, and one of the main takeaways of his text is that one should not replace X with Y without understanding why X was created in the first place. There is another takeaway, though: none of the features Armin mentioned matter much to me. This is not to say they are not important: given the success of setuptools and pip, it would be stupid not to recognize that they fill an important gap for a lot of people.

About tradeoffs

But while those solutions provide a useful set of features, it is important to realize what they prevent as well. Nick touches on this topic a bit on python-dev, but I mean something a bit different here. Some examples:

  • First, the way setuptools installs eggs by adding entries to sys.path causes a lot of additional stat calls on the filesystem. In the scientific community (and in corporate environments as well), people often have to use NFS, where this can make imports very slow (more than 1 minute is not unheard of).
  • Setuptools monkey patches distutils. This has a serious consequence for people who have their own distutils extensions, since you essentially have to deal with two code paths for anything that setuptools monkey patches.

As mentioned by Armin, setuptools had to do the things it did to support multi-versioning. But this means that it has a significant cost for people who do not care about having multiple versions of the same package. This matters less today than it used to, though, thanks to virtualenv, and to pip, which installs things as non-eggs.

A similar argument can be made about monkey-patching: distutils is not designed to be extensible, especially because of how tightly its commands are coupled together. You effectively can NOT extend distutils without monkey-patching it significantly.
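
To make the "two code paths" problem concrete, here is a minimal sketch. The classes are made-up stand-ins, not actual distutils or setuptools code; the point is only that an extension written against the original command behaves differently once another library has patched that command in place.

```python
class BuildExt:
    """Hypothetical stand-in for a distutils command class."""

    def run(self):
        return ["compile"]


class MyBuildExt(BuildExt):
    """Hypothetical third-party extension written against the original."""

    def run(self):
        return ["generate sources"] + BuildExt.run(self)


# Code path 1: the command as originally shipped.
before = MyBuildExt().run()

# A second library patches the base class in place (monkey-patching).
_original_run = BuildExt.run


def _patched_run(self):
    return ["scan for extra files"] + _original_run(self)


BuildExt.run = _patched_run

# Code path 2: the same extension, unchanged, now behaves differently.
after = MyBuildExt().run()

print(before)
print(after)
```

The extension author has to test and support both results, depending on whether the patching library happens to be imported.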

Hackable solutions

A couple of years ago, I decided that I could not put up with numpy.distutils extensions and the aforementioned distutils issues anymore. I started working on Bento sometime around fall 2009, with the intent to bootstrap it by reusing the low-level distutils code, and getting rid of commands and distribution. I also wanted to experiment with simpler solutions to some of the more questionable setuptools designs, such as data resources with pkg_resources.

I think hackable solutions are the key to helping people solve their packaging problem(s). There is no solution that will work for everyone, because the use cases are so different and clash with each other. Personally, having a system that works like apt-get (reliable and fast metadata search, reliable install/uninstall, etc…) is the holy grail, but I understand that that’s not what other people are after.

What matters the most is to only put in the stdlib what is uncontroversial and battle-tested in the wild. The efforts of Tarek and the rest of the packaging team to specify and write PEPs around the metadata are a very good step in that direction. The PEP for metadata works well because it essentially specifies things that have been used successfully (and are relatively uncontroversial).

But an elusive PEP around compilers, as has been suggested, is not that interesting IMO: I could write something to point out every API issue with how compilation works in distutils, but that sounds pointless without a proposal for a better system. And I don’t want to design a better system, I want to be able to use one (waf, scons, fbuild, gyp, whatever). Writing bento is my way of discovering a good design to do just that.

Adding a distutils compatibility layer to bento

From the beginning, it was clear that one of the major hurdles for bento would be the transition from distutils. This is a hard issue for any tool trying to improve existing ones, but even more so for distribution/packaging tools, as it impacts everyone (developers and users of the tools).

Since almost day one, bento has had some basic facilities to convert existing distutils projects into bento packages. I have now added something to do the exact opposite, that is, maintaining some distutils extensions which are driven by the bento package description. Concretely, it means that if you have a bento package, you can write something like:

import setuptools # this comes first so that setuptools does its monkey dance
import bento.distutils # this monkey patches on top of setuptools

as your setup.py, and it will give the “illusion” of a distutils package. Of course, it won’t give you all the goodies given by bento (if it could, I would not have written bento in the first place), but it is good enough to enable the following:

  • installing through the usual “python setup.py install”
  • building source distributions
  • more significantly: it will make your package easy_install-able/pip-able

This feature will be in bento 0.0.5, which will be released very soon (before pycon 2011, where I will present bento). More details may be found in bento’s documentation.

What’s coming in for bento 0.0.4

I initially intended to release the new version (0.0.4) of bento around mid to
late August, but it now seems like the end of September is more likely. The
reason is that I have been working pretty hard on bento and yaku for complex
builds in the last few weeks, where complex means numpy/scipy.

The main reason for making scipy buildable with bento is to get a “feel” of how
extensible bento really is. I think that since 0.0.3, bento is fairly usable,
but extensibility is really what bento is about. The only way that I know to
get an extensible design is to actually extend it in as many scenarios as
possible, and as far as complex distutils-based builds go, scipy is a pretty
good scenario.

The bottom line: I expect a fully working bento build of scipy within a few
(most hairy fortran stuff now builds and runs the tests ok).

Major changes from 0.0.3

No backward-incompatible changes are required for the format. The
major change is recursive package support, which ended up being more complex
than anticipated. I already described this feature in a previous post: it
mostly boils down to splitting a big bento file into several “sub-bento” files
in subdirectories.

Implementation-wise, it required a redesign of the internal representation for
files. The issue is how to know that two file names represent the same file: I
quickly realized that using filenames is too complex and too fragile, and I
decided to re-use the Node class from waf, which builds an internal
representation of the filesystem. The conversion is still going on, but it has
simplified a lot of hairy code that I used to write in bento (and distutils
previously). It particularly helps to compute the relative path between two
nodes:

relpos = node.path_from(othernode)

If node is /foo/bar and othernode is /foo, relpos will be bar, and .. if node
and othernode are inverted. Doing this from the filenames alone has many
corner cases, and path name computations are surprisingly slow in python (the
waf Node class caches things like absolute path name computation).
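
As a rough illustration of the idea, here is a minimal sketch of a node tree with a path_from method and a cached absolute path. It is a simplified, hypothetical stand-in written for this post, not waf's actual Node implementation; only the path_from name comes from the example above.

```python
class Node:
    """Simplified, hypothetical stand-in for a waf-style filesystem node."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        self._abspath = None  # cached absolute path, computed on first use

    def make_node(self, path):
        """Return the node for a '/'-separated path below this node."""
        cur = self
        for part in path.split("/"):
            if not part:
                continue
            if part not in cur.children:
                cur.children[part] = Node(part, cur)
            cur = cur.children[part]
        return cur

    def ancestors(self):
        """Return [self, parent, ..., root]."""
        chain = []
        cur = self
        while cur is not None:
            chain.append(cur)
            cur = cur.parent
        return chain

    def abspath(self):
        if self._abspath is None:
            parts = [n.name for n in reversed(self.ancestors()) if n.name]
            self._abspath = "/" + "/".join(parts)
        return self._abspath

    def path_from(self, other):
        """Relative path leading from `other` to this node."""
        mine = list(reversed(self.ancestors()))      # root ... self
        theirs = list(reversed(other.ancestors()))   # root ... other
        common = 0
        for a, b in zip(mine, theirs):
            if a is not b:
                break
            common += 1
        ups = [".."] * (len(theirs) - common)
        downs = [n.name for n in mine[common:]]
        return "/".join(ups + downs) or "."


root = Node("")
foo = root.make_node("foo")
bar = root.make_node("foo/bar")
print(bar.path_from(foo))
print(foo.path_from(bar))
```

Because every path maps to exactly one node object, "do these two names refer to the same file?" becomes an identity check instead of string manipulation.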

Thanks to the waf Node class, I can now easily list the packages, extensions,
etc… specific to one sub bento, relative to the sub bento directory, and
translate packages, extensions, etc… as seen from the top directory.

I am happy with the internals, but the “API” for recursive build description is
not good, to put it mildly. To add a subpackage description with associated
bscript (hook file), you need to:

  • add the sub-bento file to the Subento field in the parent bento file
  • add the bscript file into the list of the recursive decorator inside
    the parent bscript file. Even though the decorator may be put on e.g.
    the configure hook, the build command will also look there for sub
    bscript files, which is not intuitive at all.

You can see some examples
I am still looking for a good solution to this issue.

Yaku enhancements

Except for recursive package description, not much has changed in bento, and
most of the work has happened in yaku. The first big change is that yaku itself
also uses a waf-like Node class: although I resisted this at first, I think it
is for the best, and it also simplified a lot of hairy corner cases inside yaku.

The other big change in yaku is overriding/extending it. I am interested in the
following cases:

  • adding a new tool (clang, intel compiler, etc…)
  • adding a new process in the chain (say building extensions from .c.src
    instead of .c without monkey-patching original code)
  • overriding flags for some extensions (say building one extension with -Os
    instead of -O2)
  • overriding the extension hook for some extensions. For example, fortran
    source files are in general compiled into .o directly for “pure” (not
    using the python C API) libraries, but f2py can build a python extension
    from the .f directly. Yaku now allows temporarily overriding the command
    associated with .f files
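
The last case can be pictured with a small sketch. Everything here is hypothetical (the registry, the hook functions, and override_hook are names invented for illustration and are not yaku's API): a table maps file extensions to hooks, and a context manager swaps one entry for the duration of a block.

```python
import os
from contextlib import contextmanager


def compile_fortran(src):
    """Hypothetical default hook: .f -> object file."""
    return os.path.splitext(src)[0] + ".o"


def f2py_extension(src):
    """Hypothetical alternative hook: .f -> python extension."""
    return os.path.splitext(src)[0] + ".so"


# Hypothetical registry mapping a file extension to its build hook.
hooks = {".f": compile_fortran}


@contextmanager
def override_hook(registry, ext, hook):
    """Temporarily replace the hook for `ext`, restoring it afterwards."""
    missing = object()
    old = registry.get(ext, missing)
    registry[ext] = hook
    try:
        yield registry
    finally:
        if old is missing:
            del registry[ext]
        else:
            registry[ext] = old


with override_hook(hooks, ".f", f2py_extension):
    inside = hooks[".f"]("fftpack.f")
outside = hooks[".f"]("fftpack.f")
print(inside, outside)
```

The try/finally ensures the default hook comes back even if the build inside the block fails.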

Now, all those four cases are implemented. Chaining a templating system to
cython (for -> .pyx -> .c -> .o -> .so/.pyd) is now very simple,
supporting new compilers can be done easily, and playing with compiling options
straightforward internally. There are a few issues, though. Besides what the
API should look like, a thorny situation is dealing with dictionaries of
configurations. In yaku, each task has an environment attached to it, which is
a simple dictionary containing things like CFLAGS, CC, etc… Most of the
time, you want to share those dictionaries across tasks. Unfortunately, python
semantics for dictionaries don’t make that easy, and deepcopy is too expensive.
A Copy-On-Write dictionary, which internally shares common parts between
dictionaries, would be ideal, but I am afraid implementing one in python would
be very difficult.
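
A cheap approximation of this sharing, assuming a Python version that ships collections.ChainMap (3.3 or later), is a layered lookup: each task gets its own small overlay on top of a shared base dictionary. This is only a sketch and not a true copy-on-write dictionary: list values such as CFLAGS are still shared objects, so an override has to assign a new list rather than mutate the existing one in place. The environment contents below are made up.

```python
from collections import ChainMap

# Shared base environment (made-up values for illustration).
base_env = {"CC": "clang", "CFLAGS": ["-O2", "-Wall"]}

# Each task gets an empty overlay; reads fall through to base_env.
task_env = ChainMap({}, base_env)

# Override one flag for this task only: build a NEW list, do not mutate
# the shared one in place.
task_env["CFLAGS"] = [f for f in task_env["CFLAGS"] if f != "-O2"] + ["-Os"]

print(task_env["CC"], task_env["CFLAGS"])
print(base_env["CFLAGS"])
```

Writes only ever land in the first mapping, so the base environment stays untouched and creating a per-task environment costs one empty dict.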

I am also still not entirely convinced that yaku is warranted:
fbuild would be nearly the ideal system if it were
not limited to python 3, and the new waf 1.6 looks great (T. Nagy, the waf
maintainer, recently updated fortran support for 1.6). Fortunately, bento has
been build-tool agnostic from the start, and trying waf inside bento for a real
project is on the TODO list.

Other bento features

I have put the other features planned for 0.0.4 on hold. The main missing
features for bento are:

  • distutils compatibility mode (so that bento may be used within distutils)
  • wininst <-> egg conversion
  • good documentation
  • python 3 compatibility
  • virtualenv and pip support
  • automatic command dependency (e.g. automatically re-run configure before
    build if necessary)

Python 3 support will definitely not go into 0.0.4. Virtualenv/pip support
should not be difficult, and automatic dependency for commands is badly needed.

All that being said, I think bento is shaping up quite ok. At my work, I
constantly have to deal with distutils idiosyncrasies for the most trivial
things, and I am looking forward to seeing it replaced with something saner.

Recent progress on bento – build numpy !

I have spent the last few days on a relatively big feature for bento: recursive package description. The idea is to be able to simply describe packages in a deeply nested hierarchy without having to write long paths, and to split complicated package descriptions into several files.

At the package description level, the addition is easy:

Subento: numpy/core, numpy/lib ...

It took me more time to figure out a way to do it in the hook file. I ended up with a recurse decorator:

@recurse(["numpy/core/bscript", "numpy/lib/bscript"])
def some_func(ctx):
    ...  # hook body goes here

I am not sure it is the right solution yet, but it works for now. My first idea was to simply use a recurse function attached to hook contexts (the ctx argument), but I did not find a good way to guarantee an execution order (declaration order == execution order), and it was a bit unintuitive to integrate both hook decorator and the recurse together.
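
To show why a decorator makes the ordering easy to guarantee, here is a minimal sketch of how such a decorator could work. It is not bento's implementation: run_hook and fake_loader are invented for the example, and running the parent hook before its sub-bscripts is an assumption.

```python
def recurse(bscripts):
    """Hypothetical decorator: remember sub-bscripts in declaration order."""
    def decorator(func):
        func.sub_bscripts = list(bscripts)
        return func
    return decorator


def run_hook(func, ctx, load_hook):
    """Run a hook, then the hooks of its sub-bscripts, in declared order."""
    func(ctx)
    for path in getattr(func, "sub_bscripts", []):
        run_hook(load_hook(path), ctx, load_hook)


@recurse(["numpy/core/bscript", "numpy/lib/bscript"])
def some_func(ctx):
    ctx.append("top")


def fake_loader(path):
    """Stand-in loader: a real one would import the bscript file at `path`."""
    def hook(ctx):
        ctx.append(path)
    return hook


visited = []
run_hook(some_func, visited, fake_loader)
print(visited)
```

Since the list is attached to the function when it is defined, the driver knows the full set of sub-bscripts and their order before any hook runs.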

The reason why I am tackling this now is that bento is at a stage where it needs to be used on “real” builds to get a feeling of what works and what does not. The target is numpy and hopefully later scipy. Although I still hope to integrate waf or scons in bento as the canonical way of building numpy/scipy with bento, this also gives a good test for yaku (my simple build system).

It took me less than half a day to port the scons scripts to bento/yaku. A full, unoptimized build of numpy with clang takes less than 10 seconds. A no-op build is ~ 150 ms, but as yaku does not have all the infrastructure for header dependency tracking yet, the number for the no-op build is rather meaningless.