Welcome to Cython’s Documentation

Also see the Cython project homepage.

Getting Started

Cython - an overview

[Cython] is a programming language that makes writing C extensions for the Python language as easy as Python itself. It aims to become a superset of the [Python] language which gives it high-level, object-oriented, functional, and dynamic programming. Its main feature on top of these is support for optional static type declarations as part of the language. The source code gets translated into optimized C/C++ code and compiled as Python extension modules. This allows for both very fast program execution and tight integration with external C libraries, while keeping up the high programmer productivity for which the Python language is well known.

The primary Python execution environment is commonly referred to as CPython, as it is written in C. Other major implementations use Java (Jython [Jython]), C# (IronPython [IronPython]) and Python itself (PyPy [PyPy]). Written in C, CPython has been conducive to wrapping many external libraries that interface through the C language. It has, however, remained non-trivial to write the necessary glue code in C, especially for programmers who are more fluent in a high-level language like Python than in a close-to-the-metal language like C.

Originally based on the well-known Pyrex [Pyrex], the Cython project has approached this problem by means of a source code compiler that translates Python code to equivalent C code. This code is executed within the CPython runtime environment, but at the speed of compiled C and with the ability to call directly into C libraries. At the same time, it keeps the original interface of the Python source code, which makes it directly usable from Python code. These two-fold characteristics enable Cython’s two major use cases: extending the CPython interpreter with fast binary modules, and interfacing Python code with external C libraries.

While Cython can compile (most) regular Python code, the generated C code usually gains major (and sometimes impressive) speed improvements from optional static type declarations for both Python and C types. These allow Cython to assign C semantics to parts of the code, and to translate them into very efficient C code. Type declarations can therefore be used for two purposes: for moving code sections from dynamic Python semantics into static-and-fast C semantics, but also for directly manipulating types defined in external libraries. Cython thus merges the two worlds into a very broadly applicable programming language.

[Cython]G. Ewing, R. W. Bradshaw, S. Behnel, D. S. Seljebotn et al., The Cython compiler, http://cython.org.
[IronPython]Jim Hugunin et al., https://archive.codeplex.com/?p=IronPython.
[Jython]J. Huginin, B. Warsaw, F. Bock, et al., Jython: Python for the Java platform, http://www.jython.org.
[PyPy]The PyPy Group, PyPy: a Python implementation written in Python, http://pypy.org.
[Pyrex]G. Ewing, Pyrex: C-Extensions for Python, http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/
[Python]G. van Rossum et al., The Python programming language, https://www.python.org/.

Installing Cython

Many scientific Python distributions, such as Anaconda [Anaconda], Enthought Canopy [Canopy], and Sage [Sage], bundle Cython and no setup is needed. Note however that if your distribution ships a version of Cython which is too old you can still use the instructions below to update Cython. Everything in this tutorial should work with Cython 0.11.2 and newer, unless a footnote says otherwise.

Unlike most Python software, Cython requires a C compiler to be present on the system. The details of getting a C compiler vary according to the system used:

  • Linux The GNU C Compiler (gcc) is usually present, or easily available through the package system. On Ubuntu or Debian, for instance, the command sudo apt-get install build-essential will fetch everything you need.
  • Mac OS X To retrieve gcc, one option is to install Apple’s XCode, which can be retrieved from the Mac OS X’s install DVDs or from https://developer.apple.com/.
  • Windows A popular option is to use the open source MinGW (a Windows distribution of gcc). See the appendix for instructions for setting up MinGW manually. Enthought Canopy and Python(x,y) bundle MinGW, but some of the configuration steps in the appendix might still be necessary. Another option is to use Microsoft’s Visual C. One must then use the same version which the installed Python was compiled with.

The simplest way of installing Cython is by using pip:

pip install Cython

The newest Cython release can always be downloaded from http://cython.org. Unpack the tarball or zip file, enter the directory, and then run:

python setup.py install

For one-time builds, e.g. for CI/testing, on platforms that are not covered by one of the wheel packages provided on PyPI, it is substantially faster than a full source build to install an uncompiled (slower) version of Cython with

pip install Cython --install-option="--no-cython-compile"
[Anaconda]https://docs.anaconda.com/anaconda/
[Canopy]https://www.enthought.com/product/canopy/
[Sage]W. Stein et al., Sage Mathematics Software, http://www.sagemath.org/

Building Cython code

Cython code must, unlike Python, be compiled. This happens in two stages:

  • A .pyx file is compiled by Cython to a .c file, containing the code of a Python extension module.
  • The .c file is compiled by a C compiler to a .so file (or .pyd on Windows) which can be import-ed directly into a Python session. Distutils or setuptools take care of this part, although Cython can call them for you in certain cases.

To understand fully the Cython + distutils/setuptools build process, one may want to read more about distributing Python modules.

There are several ways to build Cython code:

  • Write a distutils/setuptools setup.py. This is the normal and recommended way.
  • Use Pyximport, importing Cython .pyx files as if they were .py files (using distutils to compile and build in the background). This method is easier than writing a setup.py, but is not very flexible. So you’ll need to write a setup.py if, for example, you need certain compilation options.
  • Run the cython command-line utility manually to produce the .c file from the .pyx file, then manually compile the .c file into a shared object library or DLL suitable for import from Python. (These manual steps are mostly for debugging and experimentation.)
  • Use the [Jupyter] notebook or the [Sage] notebook, both of which allow Cython code inline. This is the easiest way to get started writing Cython code and running it.

Currently, using distutils or setuptools is the most common way Cython files are built and distributed. The other methods are described in more detail in the Source Files and Compilation section of the reference manual.

Building a Cython module using distutils

Imagine a simple “hello world” script in a file hello.pyx:

def say_hello_to(name):
    print("Hello %s!" % name)

The following could be a corresponding setup.py script:

from distutils.core import setup
from Cython.Build import cythonize

setup(name='Hello world app',
      ext_modules=cythonize("hello.pyx"))

To build, run python setup.py build_ext --inplace. Then simply start a Python session and do from hello import say_hello_to and use the imported function as you see fit.

One caveat: if you use setuptools instead of distutils, the default action when running python setup.py install is to create a zipped egg file, which will not work with cimport for pxd files when you try to use them from a dependent package. To prevent this, include zip_safe=False in the arguments to setup().

Using the Jupyter notebook

Cython can be used conveniently and interactively from a web browser through the Jupyter notebook. To install Jupyter notebook, e.g. into a virtualenv, use pip:

(venv)$ pip install jupyter
(venv)$ jupyter notebook

To enable support for Cython compilation, install Cython as described in the installation guide and load the Cython extension from within the Jupyter notebook:

%load_ext Cython

Then, prefix a cell with the %%cython marker to compile it:

%%cython

cdef int a = 0
for i in range(10):
    a += i
print(a)

You can show Cython’s code analysis by passing the --annotate option:

%%cython --annotate
...
[Image: annotated Cython cell in a Jupyter notebook]

For more information about the arguments of the %%cython magic, see Compiling with a Jupyter Notebook.

Using the Sage notebook

[Image: Cython cell in the Sage notebook]

For users of the Sage math distribution, the Sage notebook allows transparently editing and compiling Cython code simply by typing %cython at the top of a cell and evaluating it. Variables and functions defined in a Cython cell are imported into the running session.

[Jupyter]http://jupyter.org/
[Sage]W. Stein et al., Sage Mathematics Software, http://www.sagemath.org/

Faster code via static typing

Cython is a Python compiler. This means that it can compile normal Python code without changes (with a few obvious exceptions of some as-yet unsupported language features, see Cython limitations). However, for performance critical code, it is often helpful to add static type declarations, as they will allow Cython to step out of the dynamic nature of the Python code and generate simpler and faster C code - sometimes faster by orders of magnitude.

It must be noted, however, that type declarations can make the source code more verbose and thus less readable. It is therefore discouraged to use them without good reason, such as where benchmarks prove that they really make the code substantially faster in a performance critical section. Typically a few types in the right spots go a long way.

All C types are available for type declarations: integer and floating point types, complex numbers, structs, unions and pointer types. Cython can automatically and correctly convert between the types on assignment. This also includes Python’s arbitrary size integer types, where value overflows on conversion to a C type will raise a Python OverflowError at runtime. (It does not, however, check for overflow when doing arithmetic.) The generated C code will handle the platform dependent sizes of C types correctly and safely in this case.
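The same kind of C-int conversion overflow can be observed from pure Python with the standard array module, which also stores its elements as C values. This is only an illustration of the behaviour, not Cython itself:

```python
from array import array

a = array('i')       # typecode 'i' stores C signed ints
a.append(123)        # fits in a C int, fine

overflowed = False
try:
    a.append(2 ** 40)  # too large for a C int
except OverflowError as exc:
    overflowed = True
    print("OverflowError:", exc)
```

Just as in Cython, the conversion is checked on assignment, so the oversized value is rejected with an OverflowError rather than silently truncated.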

Types are declared via the cdef keyword.

Typing Variables

Consider the following pure Python code:

def f(x):
    return x ** 2 - x


def integrate_f(a, b, N):
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx
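Since this version is plain Python, it can be checked directly in the interpreter before any Cython work starts. The exact value of the integral of x**2 - x over [0, 1] is -1/6, so the left Riemann sum should approach that as N grows (the code below simply repeats the two functions so the snippet is self-contained):

```python
def f(x):
    return x ** 2 - x

def integrate_f(a, b, N):
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx

# The left Riemann sum approaches the exact value -1/6 as N grows.
result = integrate_f(0.0, 1.0, 100_000)
print(result)
```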

Simply compiling this in Cython gives a 35% speedup. This is better than nothing, but adding some static types can make a much larger difference.

With additional type declarations, this might look like:

def f(double x):
    return x ** 2 - x


def integrate_f(double a, double b, int N):
    cdef int i
    cdef double s, dx
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx

Since the iterator variable i is typed with C semantics, the for-loop will be compiled to pure C code. Typing a, s and dx is important as they are involved in arithmetic within the for-loop; typing b and N makes less of a difference, but in this case it is not much extra work to be consistent and type the entire function.

This results in a 4 times speedup over the pure Python version.

Typing Functions

Python function calls can be expensive – in Cython doubly so because one might need to convert to and from Python objects to do the call. In our example above, the argument is assumed to be a C double both inside f() and in the call to it, yet a Python float object must be constructed around the argument in order to pass it.

Therefore Cython provides a syntax for declaring a C-style function, the cdef keyword:

cdef double f(double x) except? -2:
    return x ** 2 - x

Some form of except-modifier should usually be added, otherwise Cython will not be able to propagate exceptions raised in the function (or a function it calls). The except? -2 means that an error will be checked for if -2 is returned (though the ? indicates that -2 may also be used as a valid return value). Alternatively, the slower except * is always safe. An except clause can be left out if the function returns a Python object or if it is guaranteed that an exception will not be raised within the function call.

A side-effect of cdef is that the function is no longer available from Python-space, as Python wouldn’t know how to call it. It is also no longer possible to change f() at runtime.

Using the cpdef keyword instead of cdef, a Python wrapper is also created, so that the function is available both from Cython (fast, passing typed values directly) and from Python (wrapping values in Python objects). In fact, cpdef does not just provide a Python wrapper, it also installs logic to allow the method to be overridden by Python methods, even when called from within Cython. This does add a tiny overhead compared to cdef methods.

Speedup: 150 times over pure Python.

Determining where to add types

Because static typing is often the key to large speed gains, beginners often have a tendency to type everything in sight. This cuts down on both readability and flexibility, and can even slow things down (e.g. by adding unnecessary type checks, conversions, or slow buffer unpacking). On the other hand, it is easy to kill performance by forgetting to type a critical loop variable. Two essential tools to help with this task are profiling and annotation. Profiling should be the first step of any optimization effort, and can tell you where you are spending your time. Cython’s annotation can then tell you why your code is taking time.
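As a minimal sketch of the profiling step, using only the standard library and the pure-Python integrate_f from earlier as the target (repeated here so the snippet runs on its own), cProfile can tell you which function dominates the runtime:

```python
import cProfile
import io
import pstats

def f(x):
    return x ** 2 - x

def integrate_f(a, b, N):
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx

profiler = cProfile.Profile()
profiler.enable()
integrate_f(0.0, 1.0, 100_000)
profiler.disable()

# Print the five most expensive entries; the many calls to f()
# show up at the top, making it the first candidate for typing.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("tottime").print_stats(5)
print(stream.getvalue())
```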

Using the -a switch to the cython command line program (or following a link from the Sage notebook) results in an HTML report of Cython code interleaved with the generated C code. Lines are colored according to the level of “typedness” – white lines translate to pure C, while lines that require the Python C-API are yellow (darker as they translate to more C-API interaction). Lines that translate to C code have a plus (+) in front and can be clicked to show the generated code.

This report is invaluable when optimizing a function for speed, and for determining when to release the GIL: in general, a nogil block may contain only “white” code.

[Image: HTML annotation report produced by cython -a]

Note that Cython deduces the type of local variables based on their assignments (including as loop variable targets), which can also cut down on the need to explicitly specify types everywhere. For example, declaring dx to be of type double above is unnecessary, as is declaring the type of s in the last version (where the return type of f is known to be a C double). A notable exception, however, is integer types used in arithmetic expressions, as Cython is unable to ensure that an overflow would not occur (and so falls back to object in case Python’s bignums are needed). To allow inference of C integer types, set the infer_types directive to True. This directive does work similar to the auto keyword in C++, for readers familiar with that language feature. It can be of great help to cut down on the need to type everything, but it can also lead to surprises, especially if one isn’t familiar with arithmetic expressions involving C types. A quick overview of those can be found here.

Tutorials

Basic Tutorial

The Basics of Cython

The fundamental nature of Cython can be summed up as follows: Cython is Python with C data types.

Cython is Python: Almost any piece of Python code is also valid Cython code. (There are a few Limitations, but this approximation will serve for now.) The Cython compiler will convert it into C code which makes equivalent calls to the Python/C API.

But Cython is much more than that, because parameters and variables can be declared to have C data types. Code which manipulates Python values and C values can be freely intermixed, with conversions occurring automatically wherever possible. Reference count maintenance and error checking of Python operations is also automatic, and the full power of Python’s exception handling facilities, including the try-except and try-finally statements, is available to you – even in the midst of manipulating C data.

Cython Hello World

As Cython can accept almost any valid python source file, one of the hardest things in getting started is just figuring out how to compile your extension.

So let’s start with the canonical python hello world:

print("Hello World")

Save this code in a file named helloworld.pyx. Now we need to create the setup.py, which is like a python Makefile (for more information see Source Files and Compilation). Your setup.py should look like:

from distutils.core import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("helloworld.pyx")
)

To use this to build your Cython file use the commandline options:

$ python setup.py build_ext --inplace

This will leave a file in your local directory called helloworld.so on unix or helloworld.pyd on Windows. Now to use this file: start the python interpreter and simply import it as if it was a regular python module:

>>> import helloworld
Hello World

Congratulations! You now know how to build a Cython extension. But so far this example doesn’t really give a feeling why one would ever want to use Cython, so let’s create a more realistic example.

pyximport: Cython Compilation for Developers

If your module doesn’t require any extra C libraries or a special build setup, then you can use the pyximport module, originally developed by Paul Prescod, to load .pyx files directly on import, without having to run your setup.py file each time you change your code. It is shipped and installed with Cython and can be used like this:

>>> import pyximport; pyximport.install()
>>> import helloworld
Hello World

The Pyximport module also has experimental compilation support for normal Python modules. This allows you to automatically run Cython on every .pyx and .py module that Python imports, including the standard library and installed packages. Cython will still fail to compile a lot of Python modules, in which case the import mechanism will fall back to loading the Python source modules instead. The .py import mechanism is installed like this:

>>> pyximport.install(pyimport=True)

Note that it is not recommended to let Pyximport build code on the end user’s side as it hooks into their import system. The best way to cater for end users is to provide pre-built binary packages in the wheel packaging format.

Fibonacci Fun

From the official Python tutorial, a simple Fibonacci function is defined as:

from __future__ import print_function

def fib(n):
    """Print the Fibonacci series up to n."""
    a, b = 0, 1
    while b < n:
        print(b, end=' ')
        a, b = b, a + b

    print()

Now following the steps for the Hello World example we first rename the file to have a .pyx extension, let’s say fib.pyx, then we create the setup.py file. Using the file created for the Hello World example, all that you need to change is the name of the Cython file, and the resulting module name; doing this we have:

from distutils.core import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize("fib.pyx"),
)

Build the extension with the same command used for the helloworld.pyx:

$ python setup.py build_ext --inplace

And use the new extension with:

>>> import fib
>>> fib.fib(2000)
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597

Primes

Here’s a small example showing some of what can be done. It’s a routine for finding prime numbers. You tell it how many primes you want, and it returns them as a Python list.

primes.pyx:

 1  def primes(int nb_primes):
 2      cdef int n, i, len_p
 3      cdef int p[1000]
 4      if nb_primes > 1000:
 5          nb_primes = 1000
 6
 7      len_p = 0  # The current number of elements in p.
 8      n = 2
 9      while len_p < nb_primes:
10          # Is n prime?
11          for i in p[:len_p]:
12              if n % i == 0:
13                  break
14
15          # If no break occurred in the loop, we have a prime.
16          else:
17              p[len_p] = n
18              len_p += 1
19          n += 1
20
21      # Let's return the result in a python list:
22      result_as_list = [prime for prime in p[:len_p]]
23      return result_as_list

You’ll see that it starts out just like a normal Python function definition, except that the parameter nb_primes is declared to be of type int. This means that the object passed will be converted to a C integer (or a TypeError will be raised if it can’t be).

Now, let’s dig into the core of the function:

cdef int n, i, len_p
cdef int p[1000]

Lines 2 and 3 use the cdef statement to define some local C variables. The result is stored in the C array p during processing, and will be copied into a Python list at the end (line 22).

Note

You cannot create very large arrays in this manner, because they are allocated on the C function call stack, which is a rather precious and scarce resource. To request larger arrays, or even arrays with a length only known at runtime, you can learn how to make efficient use of C memory allocation, Python arrays or NumPy arrays with Cython.

if nb_primes > 1000:
    nb_primes = 1000

As in C, declaring a static array requires knowing the size at compile time. We make sure the user doesn’t set a value above 1000 (or we would have a segmentation fault, just like in C).

len_p = 0  # The number of elements in p
n = 2
while len_p < nb_primes:

Lines 7-9 set up for a loop which will test candidate numbers for primeness until the required number of primes has been found.

# Is n prime?
for i in p[:len_p]:
    if n % i == 0:
        break

Lines 11-12, which try dividing a candidate by all the primes found so far, are of particular interest. Because no Python objects are referred to, the loop is translated entirely into C code, and thus runs very fast. You will notice the way we iterate over the p C array.

for i in p[:len_p]:

The loop gets translated into a fast C loop and works just like iterating over a Python list or NumPy array. If you don’t slice the C array with [:len_p], then Cython will loop over the 1000 elements of the array.

# If no break occurred in the loop
else:
    p[len_p] = n
    len_p += 1
n += 1

If no break occurred, it means that we found a prime, and the block of code after the else on line 16 will be executed. We add the prime found to p. If you find having an else after a for-loop strange, just know that it’s a lesser known feature of the Python language, and that Cython executes it at C speed for you. If the for-else syntax confuses you, see this excellent blog post.
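The for...else construct is ordinary Python and can be tried outside Cython. This pure-Python sketch mirrors the structure of the loop above:

```python
found = []  # primes found so far
for n in range(2, 12):
    for d in found:
        if n % d == 0:
            break  # n is divisible, so the else block is skipped
    else:
        # Runs only when the inner for loop finished without a break.
        found.append(n)

print(found)  # [2, 3, 5, 7, 11]
```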

# Let's put the result in a python list:
result_as_list  = [prime for prime in p[:len_p]]
return result_as_list

In line 22, before returning the result, we need to copy our C array into a Python list, because Python can’t read C arrays. Cython can automatically convert many C types from and to Python types, as described in the documentation on type conversion, so we can use a simple list comprehension here to copy the C int values into a Python list of Python int objects, which Cython creates automatically along the way. You could also have iterated manually over the C array and used result_as_list.append(prime), the result would have been the same.

You’ll notice we declare a Python list exactly the same way it would be in Python. Because the variable result_as_list hasn’t been explicitly declared with a type, it is assumed to hold a Python object, and from the assignment, Cython also knows that the exact type is a Python list.

Finally, at line 23, a normal Python return statement returns the result list.

Compiling primes.pyx with the Cython compiler produces an extension module which we can try out in the interactive interpreter as follows:

>>> import primes
>>> primes.primes(10)
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

See, it works! And if you’re curious about how much work Cython has saved you, take a look at the C code generated for this module.

Cython has a way to visualise where interaction with Python objects and Python’s C-API is taking place. For this, pass the annotate=True parameter to cythonize(). It produces an HTML file. Let’s see:

[Image: HTML annotation report for primes.pyx]

If a line is white, it means that the code generated doesn’t interact with Python, so it will run as fast as normal C code. The darker the yellow, the more Python interaction there is in that line. Those yellow lines will usually operate on Python objects, raise exceptions, or do other kinds of higher-level operations than can easily be translated into simple and fast C code. The function declaration and return use the Python interpreter, so it makes sense for those lines to be yellow. The same goes for the list comprehension, because it involves the creation of a Python object. But why is the line if n % i == 0: yellow? We can examine the generated C code to understand:

[Image: generated C code for the n % i == 0 line, showing the division checks]

We can see that some checks happen. Because Cython defaults to the Python behavior, the language will perform division checks at runtime, just like Python does. You can deactivate those checks by using the compiler directives.

Now let’s see if, even if we have division checks, we obtained a boost in speed. Let’s write the same program, but Python-style:

def primes_python(nb_primes):
    p = []
    n = 2
    while len(p) < nb_primes:
        # Is n prime?
        for i in p:
            if n % i == 0:
                break

        # If no break occurred in the loop
        else:
            p.append(n)
        n += 1
    return p

It is also possible to take a plain .py file and to compile it with Cython. Let’s take primes_python, change the function name to primes_python_compiled and compile it with Cython (without changing the code). We will also change the name of the file to example_py_cy.py to differentiate it from the others. Now the setup.py looks like this:

from distutils.core import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize(['example.pyx',        # Cython code file with primes() function
                           'example_py_cy.py'],  # Python code file with primes_python_compiled() function
                          annotate=True),        # enables generation of the html annotation file
)

Now we can ensure that those two programs output the same values:

>>> primes_python(1000) == primes(1000)
True
>>> primes_python_compiled(1000) == primes(1000)
True

It’s possible to compare the speed now:

python -m timeit -s 'from example_py import primes_python' 'primes_python(1000)'
10 loops, best of 3: 23 msec per loop

python -m timeit -s 'from example_py_cy import primes_python_compiled' 'primes_python_compiled(1000)'
100 loops, best of 3: 11.9 msec per loop

python -m timeit -s 'from example import primes' 'primes(1000)'
1000 loops, best of 3: 1.65 msec per loop

The cythonized version of primes_python is 2 times faster than the Python one, without changing a single line of code. The Cython version is 13 times faster than the Python version! What could explain this?

Multiple things:
  • In this program, very little computation happens at each line, so the overhead of the Python interpreter is significant. It would be very different if you did a lot of computation at each line, using NumPy for example.
  • Data locality. It’s likely that a lot more can fit in the CPU cache when using C than when using Python. Because everything in Python is an object, and every object is implemented as a dictionary, this is not very cache friendly.

Usually the speedups are between 2x and 1000x, depending on how much you call the Python interpreter. As always, remember to profile before adding types everywhere. Adding types makes your code less readable, so use them in moderation.

Primes with C++

With Cython, it is also possible to take advantage of the C++ language, notably, part of the C++ standard library is directly importable from Cython code.

Let’s see what our primes.pyx becomes when using vector from the C++ standard library.

Note

Vector in C++ is a data structure which implements a list or stack based on a resizeable C array. It is similar to the Python array type in the array standard library module. There is a method reserve available which will avoid copies if you know in advance how many elements you are going to put in the vector. For more details see this page from cppreference.

# distutils: language=c++

from libcpp.vector cimport vector

def primes(unsigned int nb_primes):
    cdef int n, i
    cdef vector[int] p
    p.reserve(nb_primes)  # allocate memory for 'nb_primes' elements.

    n = 2
    while p.size() < nb_primes:  # size() for vectors is similar to len()
        for i in p:
            if n % i == 0:
                break
        else:
            p.push_back(n)  # push_back is similar to append()
        n += 1

    # Vectors are automatically converted to Python
    # lists when converted to Python objects.
    return p

The first line is a compiler directive. It tells Cython to compile your code to C++. This will enable the use of C++ language features and the C++ standard library. Note that it isn’t possible to compile Cython code to C++ with pyximport. You should use a setup.py or a notebook to run this example.

You can see that the API of a vector is similar to the API of a Python list, and can sometimes be used as a drop-in replacement in Cython.

For more details about using C++ with Cython, see Using C++ in Cython.

Language Details

For more about the Cython language, see Language Basics. To dive right in to using Cython in a numerical computation context, see Typed Memoryviews.

Calling C functions

This tutorial briefly describes what you need to know in order to call C library functions from Cython code. For a longer and more comprehensive tutorial about using external C libraries, wrapping them and handling errors, see Using C libraries.

For simplicity, let’s start with a function from the standard C library. This does not add any dependencies to your code, and it has the additional advantage that Cython already defines many such functions for you. So you can just cimport and use them.

For example, let’s say you need a low-level way to parse a number from a char* value. You could use the atoi() function, as defined by the stdlib.h header file. This can be done as follows:

from libc.stdlib cimport atoi

cdef parse_charptr_to_py_int(char* s):
    assert s is not NULL, "byte string value is NULL"
    return atoi(s)  # note: atoi() has no error detection!

You can find a complete list of these standard cimport files in Cython’s source package Cython/Includes/. They are stored in .pxd files, the standard way to provide reusable Cython declarations that can be shared across modules (see Sharing Declarations Between Cython Modules).

Cython also has a complete set of declarations for CPython’s C-API. For example, to test at C compilation time which CPython version your code is being compiled with, you can do this:

from cpython.version cimport PY_VERSION_HEX

# Python version >= 3.2 final ?
print(PY_VERSION_HEX >= 0x030200F0)
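The hex layout used by PY_VERSION_HEX (major byte, minor byte, micro byte, then release level and serial) is the same one CPython exposes at runtime as sys.hexversion, so the encoding can be explored in pure Python:

```python
import sys

version_hex = 0x030200F0  # encodes 3.2.0 "final"
major = (version_hex >> 24) & 0xFF
minor = (version_hex >> 16) & 0xFF
micro = (version_hex >> 8) & 0xFF
print(major, minor, micro)  # 3 2 0

# The same comparison as the Cython snippet above, but at runtime:
print(sys.hexversion >= 0x030200F0)
```

The difference is that the Cython version of the check happens at C compilation time, against the headers of the interpreter being compiled for, rather than at runtime.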

Cython also provides declarations for the C math library:

from libc.math cimport sin

cdef double f(double x):
    return sin(x * x)

Dynamic linking

The libc math library is special in that it is not linked by default on some Unix-like systems, such as Linux. In addition to cimporting the declarations, you must configure your build system to link against the shared library m. For distutils, it is enough to add it to the libraries parameter of the Extension() setup:

from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize

ext_modules = [
    Extension("demo",
              sources=["demo.pyx"],
              libraries=["m"]  # Unix-like specific
              )
]

setup(name="Demos",
      ext_modules=cythonize(ext_modules))

External declarations

If you want to access C code for which Cython does not provide a ready-to-use declaration, you must declare it yourself. For example, the above sin() function is declared as follows:

cdef extern from "math.h":
    double sin(double x)

This declares the sin() function in a way that makes it available to Cython code and instructs Cython to generate C code that includes the math.h header file. The C compiler will see the original declaration in math.h at compile time, but Cython does not parse “math.h” and requires a separate definition.

Just like the sin() function from the math library, it is possible to declare and call into any C library as long as the module that Cython generates is properly linked against the shared or static library.

Note that you can easily export an external C function from your Cython module by declaring it as cpdef. This generates a Python wrapper for it and adds it to the module dict. Here is a Cython module that provides direct access to the C sin() function for Python code:

"""
>>> sin(0)
0.0
"""

cdef extern from "math.h":
    cpdef double sin(double x)

You get the same result when this declaration appears in the .pxd file that belongs to the Cython module (i.e. that has the same name, see Sharing Declarations Between Cython Modules). This allows the C declaration to be reused in other Cython modules, while still providing an automatically generated Python wrapper in this specific module.
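As a small sketch of this split (the module name here is made up for illustration):

```cython
# cmath_sin.pxd  (hypothetical module name)
cdef extern from "math.h":
    cpdef double sin(double x)

# The matching cmath_sin.pyx needs no further code for this:
# cimporting Cython modules get the fast C-level call, while
# "import cmath_sin" from Python sees the generated wrapper.
```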

Naming parameters

Both C and Cython support signature declarations without parameter names like this:

cdef extern from "string.h":
    char* strstr(const char*, const char*)

However, this prevents Cython code from calling it with keyword arguments. It is therefore preferable to write the declaration like this instead:

cdef extern from "string.h":
    char* strstr(const char *haystack, const char *needle)

You can now make it clear which of the two arguments does what in your call, thus avoiding any ambiguities and often making your code more readable:

cdef extern from "string.h":
    char* strstr(const char *haystack, const char *needle)

cdef char* data = "hfvcakdfagbcffvschvxcdfgccbcfhvgcsnfxjh"

cdef char* pos = strstr(needle='akd', haystack=data)
print(pos is not NULL)

Note that changing existing parameter names later is a backwards incompatible API modification, just as for Python code. Thus, if you provide your own declarations for external C or C++ functions, it is usually worth the additional bit of effort to choose the names of their arguments well.

Using C libraries

Apart from writing fast code, one of the main use cases of Cython is to call external C libraries from Python code. As Cython code compiles down to C code itself, it is actually trivial to call C functions directly in the code. The following gives a complete example for using (and wrapping) an external C library in Cython code, including appropriate error handling and considerations about designing a suitable API for Python and Cython code.

Imagine you need an efficient way to store integer values in a FIFO queue. Since memory really matters, and the values are actually coming from C code, you cannot afford to create and store Python int objects in a list or deque. So you look out for a queue implementation in C.

After some web search, you find the C-algorithms library [CAlg] and decide to use its double ended queue implementation. To make the handling easier, however, you decide to wrap it in a Python extension type that can encapsulate all memory management.

[CAlg]Simon Howard, C Algorithms library, http://c-algorithms.sourceforge.net/

Defining external declarations

You can download CAlg from its project homepage (see [CAlg] above).

The C API of the queue implementation, which is defined in the header file c-algorithms/src/queue.h, essentially looks like this:

/* queue.h */

typedef struct _Queue Queue;
typedef void *QueueValue;

Queue *queue_new(void);
void queue_free(Queue *queue);

int queue_push_head(Queue *queue, QueueValue data);
QueueValue queue_pop_head(Queue *queue);
QueueValue queue_peek_head(Queue *queue);

int queue_push_tail(Queue *queue, QueueValue data);
QueueValue queue_pop_tail(Queue *queue);
QueueValue queue_peek_tail(Queue *queue);

int queue_is_empty(Queue *queue);

To get started, the first step is to redefine the C API in a .pxd file, say, cqueue.pxd:

# cqueue.pxd

cdef extern from "c-algorithms/src/queue.h":
    ctypedef struct Queue:
        pass
    ctypedef void* QueueValue

    Queue* queue_new()
    void queue_free(Queue* queue)

    int queue_push_head(Queue* queue, QueueValue data)
    QueueValue queue_pop_head(Queue* queue)
    QueueValue queue_peek_head(Queue* queue)

    int queue_push_tail(Queue* queue, QueueValue data)
    QueueValue queue_pop_tail(Queue* queue)
    QueueValue queue_peek_tail(Queue* queue)

    bint queue_is_empty(Queue* queue)

Note how these declarations are almost identical to the header file declarations, so you can often just copy them over. However, you do not need to provide all declarations as above, just those that you use in your code or in other declarations, so that Cython gets to see a sufficient and consistent subset of them. Then, consider adapting them somewhat to make them more comfortable to work with in Cython.

Specifically, you should take care of choosing good argument names for the C functions, as Cython allows you to pass them as keyword arguments. Changing them later on is a backwards incompatible API modification. Choosing good names right away will make these functions more pleasant to work with from Cython code.

One noteworthy difference from the header file that we use above is the declaration of the Queue struct in the first line. Queue is in this case used as an opaque handle; only the library that is called knows what is really inside. Since no Cython code needs to know the contents of the struct, we do not need to declare its contents, so we simply provide an empty definition (as we do not want to declare the _Queue type which is referenced in the C header) [1].

[1]There’s a subtle difference between cdef struct Queue: pass and ctypedef struct Queue: pass. The former declares a type which is referenced in C code as struct Queue, while the latter is referenced in C as Queue. This is a C language quirk that Cython is not able to hide. Most modern C libraries use the ctypedef kind of struct.

Another noteworthy difference is the last line. The integer return value of the queue_is_empty() function is actually a C boolean value, i.e. the only interesting thing about it is whether it is non-zero or zero, indicating if the queue is empty or not. This is best expressed by Cython’s bint type, which is a normal int type when used in C but maps to Python’s boolean values True and False when converted to a Python object. This way of tightening declarations in a .pxd file can often simplify the code that uses them.

It is good practice to define one .pxd file for each library that you use, and sometimes even for each header file (or functional group) if the API is large. That simplifies their reuse in other projects. Sometimes, you may need to use C functions from the standard C library, or want to call C-API functions from CPython directly. For common needs like this, Cython ships with a set of standard .pxd files that provide these declarations in a readily usable way that is adapted to their use in Cython. The main packages are cpython, libc and libcpp. The NumPy library also has a standard .pxd file numpy, as it is often used in Cython code. See Cython’s Cython/Includes/ source package for a complete list of provided .pxd files.
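For instance, declarations from the shipped libc package can be cimported and used directly; a small sketch (the helper function is made up here for illustration):

```cython
from libc.string cimport strlen
from libc.stdlib cimport atoi

def describe(bytes data):
    # strlen() and atoi() come straight from the .pxd files
    # that Cython ships in Cython/Includes/libc/
    cdef char* s = data
    return strlen(s), atoi(s)
```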

Writing a wrapper class

After declaring our C library’s API, we can start to design the Queue class that should wrap the C queue. It will live in a file called queue.pyx. [2]

[2]Note that the name of the .pyx file must be different from the cqueue.pxd file with declarations from the C library, as both do not describe the same code. A .pxd file next to a .pyx file with the same name defines exported declarations for code in the .pyx file. As the cqueue.pxd file contains declarations of a regular C library, there must not be a .pyx file with the same name that Cython associates with it.

Here is a first start for the Queue class:

# queue.pyx

cimport cqueue

cdef class Queue:
    cdef cqueue.Queue* _c_queue

    def __cinit__(self):
        self._c_queue = cqueue.queue_new()

Note that it says __cinit__ rather than __init__. While __init__ is available as well, it is not guaranteed to be run (for instance, one could create a subclass and forget to call the ancestor’s constructor). Because not initializing C pointers often leads to hard crashes of the Python interpreter, Cython provides __cinit__ which is always called immediately on construction, before CPython even considers calling __init__, and which therefore is the right place to initialise cdef fields of the new instance. However, as __cinit__ is called during object construction, self is not fully constructed yet, and one must avoid doing anything with self but assigning to cdef fields.

Note also that the above method takes no parameters, although subtypes may want to accept some. A no-arguments __cinit__() method is a special case here that simply does not receive any parameters that were passed to a constructor, so it does not prevent subclasses from adding parameters. If parameters are used in the signature of __cinit__(), they must match those of any declared __init__ method of classes in the class hierarchy that are used to instantiate the type.
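A sketch of this rule (the class and field names are invented here): when __cinit__() declares parameters, it receives the same arguments that are passed to the constructor, so its signature must be compatible with __init__():

```cython
cdef class NamedQueue:
    cdef object name

    def __cinit__(self, name):
        # called before __init__, guaranteed to run exactly once;
        # only assign to cdef fields here
        self.name = name

    def __init__(self, name):
        # signature must match __cinit__ above
        pass
```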

Memory management

Before we continue implementing the other methods, it is important to understand that the above implementation is not safe. In case anything goes wrong in the call to queue_new(), this code will simply swallow the error, so we will likely run into a crash later on. According to the documentation of the queue_new() function, the only reason why the above can fail is due to insufficient memory. In that case, it will return NULL, whereas it would normally return a pointer to the new queue.

The Python way to get out of this is to raise a MemoryError [3]. We can thus change the init function as follows:

# queue.pyx

cimport cqueue

cdef class Queue:
    cdef cqueue.Queue* _c_queue

    def __cinit__(self):
        self._c_queue = cqueue.queue_new()
        if self._c_queue is NULL:
            raise MemoryError()

[3]In the specific case of a MemoryError, creating a new exception instance in order to raise it may actually fail because we are running out of memory. Luckily, CPython provides a C-API function PyErr_NoMemory() that safely raises the right exception for us. Cython automatically substitutes this C-API call whenever you write raise MemoryError or raise MemoryError(). If you use an older Cython version, you have to cimport the C-API function from the standard package cpython.exc and call it directly.

The next thing to do is to clean up when the Queue instance is no longer used (i.e. all references to it have been deleted). To this end, CPython provides a callback that Cython makes available as a special method __dealloc__(). In our case, all we have to do is to free the C Queue, but only if we succeeded in initialising it in the init method:

def __dealloc__(self):
    if self._c_queue is not NULL:
        cqueue.queue_free(self._c_queue)

Compiling and linking

At this point, we have a working Cython module that we can test. To compile it, we need to configure a setup.py script for distutils. Here is the most basic script for compiling a Cython module:

from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize

setup(
    ext_modules = cythonize([Extension("queue", ["queue.pyx"])])
)

To build against the external C library, we need to make sure Cython finds the necessary libraries. There are two ways to achieve this. First, we can tell distutils where to find the C source and have it compile the queue.c implementation automatically. Alternatively, we can build and install C-Alg as a system library and link against it dynamically. The latter is useful if other applications also use C-Alg.

Static Linking

To compile the C code automatically, we need to include compiler directives in queue.pyx:

# distutils: sources = c-algorithms/src/queue.c
# distutils: include_dirs = c-algorithms/src/

cimport cqueue

cdef class Queue:
    cdef cqueue.Queue* _c_queue
    def __cinit__(self):
        self._c_queue = cqueue.queue_new()
        if self._c_queue is NULL:
            raise MemoryError()

    def __dealloc__(self):
        if self._c_queue is not NULL:
            cqueue.queue_free(self._c_queue)

The sources compiler directive gives the path of the C files that distutils is going to compile and link (statically) into the resulting extension module. In general, all relevant header files should be found via include_dirs. Now we can build the project using:

$ python setup.py build_ext -i

And test whether our build was successful:

$ python -c 'import queue; Q = queue.Queue()'

Dynamic Linking

Dynamic linking is useful if the library we are going to wrap is already installed on the system. To perform dynamic linking, we first need to build and install C-Alg.

To build c-algorithms on your system:

$ cd c-algorithms
$ sh autogen.sh
$ ./configure
$ make

To install CAlg, run:

$ make install

Afterwards the file /usr/local/lib/libcalg.so should exist.

Note

This path applies to Linux systems and may be different on other platforms, so you will need to adapt the rest of the tutorial depending on the path where libcalg.so or libcalg.dll is on your system.

In this approach, we need to tell the setup script to link with an external library. To do so, we change the extension setup from

ext_modules = cythonize([Extension("queue", ["queue.pyx"])])

to

ext_modules = cythonize([
    Extension("queue", ["queue.pyx"],
              libraries=["calg"])
    ])

Now we should be able to build the project using:

$ python setup.py build_ext -i

If libcalg is not installed in a ‘normal’ location, users can provide the required parameters externally by passing appropriate C compiler flags, such as:

CFLAGS="-I/usr/local/otherdir/calg/include"  \
LDFLAGS="-L/usr/local/otherdir/calg/lib"     \
    python setup.py build_ext -i

Before we run the module, we also need to make sure that libcalg is in the LD_LIBRARY_PATH environment variable, e.g. by setting:

$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

Once we have compiled the module, we can import it and instantiate a new Queue:

$ export PYTHONPATH=.
$ python -c 'import queue; Q = queue.Queue()'

However, this is all our Queue class can do so far, so let’s make it more usable.

Mapping functionality

Before implementing the public interface of this class, it is good practice to look at what interfaces Python offers, e.g. in its list or collections.deque classes. Since we only need a FIFO queue, it’s enough to provide the methods append(), peek() and pop(), and additionally an extend() method to add multiple values at once. Also, since we already know that all values will be coming from C, it’s best to provide only cdef methods for now, and to give them a straight C interface.

In C, it is common for data structures to store data as a void* to whatever data item type. Since we only want to store int values, which usually fit into the size of a pointer type, we can avoid additional memory allocations through a trick: we cast our int values to void* and vice versa, and store the value directly as the pointer value.

Here is a simple implementation for the append() method:

cdef append(self, int value):
    cqueue.queue_push_tail(self._c_queue, <void*>value)

Again, the same error handling considerations as for the __cinit__() method apply, so that we end up with this implementation instead:

cdef append(self, int value):
    if not cqueue.queue_push_tail(self._c_queue,
                                  <void*>value):
        raise MemoryError()

Adding an extend() method should now be straightforward:

cdef extend(self, int* values, size_t count):
    """Append all ints to the queue.
    """
    cdef int value
    for value in values[:count]:  # Slicing pointer to limit the iteration boundaries.
        self.append(value)

This becomes handy when reading values from a C array, for example.

So far, we can only add data to the queue. The next step is to write the two methods to get the first element: peek() and pop(), which provide read-only and destructive read access respectively. To avoid compiler warnings when casting void* to int directly, we use an intermediate data type that is big enough to hold a void*. Here, Py_ssize_t:

cdef int peek(self):
    return <Py_ssize_t>cqueue.queue_peek_head(self._c_queue)

cdef int pop(self):
    return <Py_ssize_t>cqueue.queue_pop_head(self._c_queue)

Normally, in C, we risk losing data when we convert a larger integer type to a smaller integer type without checking the boundaries, and Py_ssize_t may be a larger type than int. But since we control how values are added to the queue, we already know that all values that are in the queue fit into an int, so the above conversion from void* to Py_ssize_t to int (the return type) is safe by design.

Handling errors

Now, what happens when the queue is empty? According to the documentation, the functions return a NULL pointer, which is typically not a valid value. But since we are simply casting to and from ints, we cannot distinguish anymore if the return value was NULL because the queue was empty or because the value stored in the queue was 0. In Cython code, we want the first case to raise an exception, whereas the second case should simply return 0. To deal with this, we need to special case this value, and check if the queue really is empty or not:

cdef int peek(self) except? -1:
    cdef int value = <Py_ssize_t>cqueue.queue_peek_head(self._c_queue)
    if value == 0:
        # this may mean that the queue is empty, or
        # that it happens to contain a 0 value
        if cqueue.queue_is_empty(self._c_queue):
            raise IndexError("Queue is empty")
    return value

Note how we have effectively created a fast path through the method in the hopefully common cases that the return value is not 0. Only that specific case needs an additional check if the queue is empty.

The except? -1 declaration in the method signature falls into the same category. If the function was a Python function returning a Python object value, CPython would simply return NULL internally instead of a Python object to indicate an exception, which would immediately be propagated by the surrounding code. The problem is that the return type is int and any int value is a valid queue item value, so there is no way to explicitly signal an error to the calling code. In fact, without such a declaration, there is no obvious way for Cython to know what to return on exceptions and for calling code to even know that this method may exit with an exception.

The only way calling code can deal with this situation is to call PyErr_Occurred() when returning from a function to check if an exception was raised, and if so, propagate the exception. This obviously has a performance penalty. Cython therefore allows you to declare which value it should implicitly return in the case of an exception, so that the surrounding code only needs to check for an exception when receiving this exact value.

We chose to use -1 as the exception return value as we expect it to be an unlikely value to be put into the queue. The question mark in the except? -1 declaration indicates that the return value is ambiguous (there may be a -1 value in the queue, after all) and that an additional exception check using PyErr_Occurred() is needed in calling code. Without it, Cython code that calls this method and receives the exception return value would silently (and sometimes incorrectly) assume that an exception has been raised. In any case, all other return values will be passed through almost without a penalty, thus again creating a fast path for ‘normal’ values.
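The available flavours of exception value declarations can be summarised in a sketch (the function names are illustrative only):

```cython
cdef int impossible_value(self) except -1:
    # -1 can never be a valid result here, so callers only need to
    # propagate an exception when they actually receive -1
    ...

cdef int ambiguous_value(self) except? -1:
    # -1 may also be a normal result; on receiving it, callers
    # additionally call PyErr_Occurred() to disambiguate
    ...

cdef int any_value(self) except *:
    # every int is a valid result; callers always check
    # PyErr_Occurred(), which is safest but slowest
    ...
```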

Now that the peek() method is implemented, the pop() method also needs adaptation. Since it removes a value from the queue, however, it is not enough to test if the queue is empty after the removal. Instead, we must test it on entry:

cdef int pop(self) except? -1:
    if cqueue.queue_is_empty(self._c_queue):
        raise IndexError("Queue is empty")
    return <Py_ssize_t>cqueue.queue_pop_head(self._c_queue)

The return value for exception propagation is declared exactly as for peek().

Lastly, we can provide the Queue with an emptiness indicator in the normal Python way by implementing the __bool__() special method (note that Python 2 calls this method __nonzero__, whereas Cython code can use either name):

def __bool__(self):
    return not cqueue.queue_is_empty(self._c_queue)

Note that this method returns either True or False as we declared the return type of the queue_is_empty() function as bint in cqueue.pxd.

Testing the result

Now that the implementation is complete, you may want to write some tests for it to make sure it works correctly. Doctests in particular are very nice for this purpose, as they provide documentation at the same time. To enable doctests, however, you need a Python API that you can call. C methods are not visible from Python code, and thus not callable from doctests.

A quick way to provide a Python API for the class is to change the methods from cdef to cpdef. This will let Cython generate two entry points, one that is callable from normal Python code using the Python call semantics and Python objects as arguments, and one that is callable from C code with fast C semantics and without requiring intermediate argument conversion from or to Python types. Note that cpdef methods ensure that they can be appropriately overridden by Python methods even when they are called from Cython. This adds a tiny overhead compared to cdef methods.

Now that we have both a C-interface and a Python interface for our class, we should make sure that both interfaces are consistent. Python users would expect an extend() method that accepts arbitrary iterables, whereas C users would like to have one that allows passing C arrays and C memory. Both signatures are incompatible.

We will solve this issue by considering that, in C, the API could also want to support other input types, e.g. arrays of long or char, which is usually supported with differently named C API functions such as extend_ints(), extend_longs(), extend_chars(), etc. This allows us to free up the method name extend() for the duck-typed Python method, which can accept arbitrary iterables.

The following listing shows the complete implementation that uses cpdef methods where possible:

# queue.pyx

cimport cqueue

cdef class Queue:
    """A queue class for C integer values.

    >>> q = Queue()
    >>> q.append(5)
    >>> q.peek()
    5
    >>> q.pop()
    5
    """
    cdef cqueue.Queue* _c_queue
    def __cinit__(self):
        self._c_queue = cqueue.queue_new()
        if self._c_queue is NULL:
            raise MemoryError()

    def __dealloc__(self):
        if self._c_queue is not NULL:
            cqueue.queue_free(self._c_queue)

    cpdef append(self, int value):
        if not cqueue.queue_push_tail(self._c_queue,
                                      <void*> value):
            raise MemoryError()

    # The `cpdef` feature is obviously not available for the original "extend()"
    # method, as the method signature is incompatible with Python argument
    # types (Python does not have pointers).  However, we can rename
    # the C-ish "extend()" method to e.g. "extend_ints()", and write
    # a new "extend()" method that provides a suitable Python interface by
    # accepting an arbitrary Python iterable.
    cpdef extend(self, values):
        for value in values:
            self.append(value)

    cdef extend_ints(self, int* values, size_t count):
        cdef int value
        for value in values[:count]:  # Slicing pointer to limit the iteration boundaries.
            self.append(value)

    cpdef int peek(self) except? -1:
        cdef int value = <Py_ssize_t> cqueue.queue_peek_head(self._c_queue)

        if value == 0:
            # this may mean that the queue is empty,
            # or that it happens to contain a 0 value
            if cqueue.queue_is_empty(self._c_queue):
                raise IndexError("Queue is empty")
        return value

    cpdef int pop(self) except? -1:
        if cqueue.queue_is_empty(self._c_queue):
            raise IndexError("Queue is empty")
        return <Py_ssize_t> cqueue.queue_pop_head(self._c_queue)

    def __bool__(self):
        return not cqueue.queue_is_empty(self._c_queue)

Now we can test our Queue implementation using a Python script, for example this test_queue.py:

from __future__ import print_function

import time

import queue

Q = queue.Queue()

Q.append(10)
Q.append(20)
print(Q.peek())
print(Q.pop())
print(Q.pop())
try:
    print(Q.pop())
except IndexError as e:
    print("Error message:", e)  # Prints "Queue is empty"

i = 10000

values = range(i)

start_time = time.time()

Q.extend(values)

end_time = time.time() - start_time

print("Adding {} items took {:1.3f} msecs.".format(i, 1000 * end_time))

for i in range(41):
    Q.pop()

Q.pop()
print("The answer is:")
print(Q.pop())

As a quick test with 10000 numbers on the author’s machine indicates, using this Queue from Cython code with C int values is about five times as fast as using it from Cython code with Python object values, almost eight times faster than using it from Python code in a Python loop, and still more than twice as fast as using Python’s highly optimised collections.deque type from Cython code with Python integers.

Callbacks

Let’s say you want to provide a way for users to pop values from the queue until a certain user-defined event occurs. To this end, you want to allow them to pass a predicate function that determines when to stop, e.g.:

def pop_until(self, predicate):
    while not predicate(self.peek()):
        self.pop()

Now, let us assume for the sake of argument that the C queue provides such a function that takes a C callback function as predicate. The API could look as follows:

/* C type of a predicate function that takes a queue value and returns
 * -1 for errors
 *  0 for reject
 *  1 for accept
 */
typedef int (*predicate_func)(void* user_context, QueueValue data);

/* Pop values as long as the predicate evaluates to true for them,
 * returns -1 if the predicate failed with an error and 0 otherwise.
 */
int queue_pop_head_until(Queue *queue, predicate_func predicate,
                         void* user_context);

It is normal for C callback functions to have a generic void* argument that allows passing any kind of context or state through the C-API into the callback function. We will use this to pass our Python predicate function.

First, we have to define a callback function with the expected signature that we can pass into the C-API function:

cdef int evaluate_predicate(void* context, cqueue.QueueValue value):
    "Callback function that can be passed as predicate_func"
    try:
        # recover Python function object from void* argument
        func = <object>context
        # call function, convert result into 0/1 for True/False
        return bool(func(<int>value))
    except:
        # catch any Python errors and return error indicator
        return -1

The main idea is to pass a pointer (a.k.a. borrowed reference) to the function object as the user context argument. We will call the C-API function as follows:

def pop_until(self, python_predicate_function):
    result = cqueue.queue_pop_head_until(
        self._c_queue, evaluate_predicate,
        <void*>python_predicate_function)
    if result == -1:
        raise RuntimeError("an error occurred")

The usual pattern is to first cast the Python object reference into a void* to pass it into the C-API function, and then cast it back into a Python object in the C predicate callback function. The cast to void* creates a borrowed reference. On the cast to <object>, Cython increments the reference count of the object and thus converts the borrowed reference back into an owned reference. At the end of the predicate function, the owned reference goes out of scope again and Cython discards it.

The error handling in the code above is a bit simplistic. Specifically, any exceptions that the predicate function raises will essentially be discarded and only result in a plain RuntimeError() being raised after the fact. This can be improved by storing away the exception in an object passed through the context parameter and re-raising it after the C-API function has returned -1 to indicate the error.
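One way to sketch that improvement (the context class and all names below are invented for illustration and not part of the original example):

```cython
cdef class _CallbackContext:
    cdef object function
    cdef object exception   # set if the predicate raises

cdef int evaluate_predicate_safe(void* context, cqueue.QueueValue value):
    cdef _CallbackContext ctx = <_CallbackContext>context
    try:
        return bool(ctx.function(<int>value))
    except BaseException as exc:
        ctx.exception = exc   # store the error for re-raising later
        return -1

def pop_until(self, python_predicate_function):
    ctx = _CallbackContext()
    ctx.function = python_predicate_function
    result = cqueue.queue_pop_head_until(
        self._c_queue, evaluate_predicate_safe, <void*>ctx)
    if result == -1:
        if ctx.exception is not None:
            raise ctx.exception       # re-raise the original error
        raise RuntimeError("an error occurred")
```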

Extension types (aka. cdef classes)

To support object-oriented programming, Cython supports writing normal Python classes exactly as in Python:

class MathFunction(object):
    def __init__(self, name, operator):
        self.name = name
        self.operator = operator

    def __call__(self, *operands):
        return self.operator(*operands)

Based on what Python calls a “built-in type”, however, Cython supports a second kind of class: extension types, sometimes referred to as “cdef classes” due to the keywords used for their declaration. They are somewhat restricted compared to Python classes, but are generally more memory efficient and faster than generic Python classes. The main difference is that they use a C struct to store their fields and methods instead of a Python dict. This allows them to store arbitrary C types in their fields without requiring a Python wrapper for them, and to access fields and methods directly at the C level without passing through a Python dictionary lookup.

Normal Python classes can inherit from cdef classes, but not the other way around. Cython needs to know the complete inheritance hierarchy in order to lay out their C structs, and restricts it to single inheritance. Normal Python classes, on the other hand, can inherit from any number of Python classes and extension types, both in Cython code and pure Python code.
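These rules can be sketched as follows (class names invented here):

```cython
cdef class Base:               # an extension type
    cdef int value

cdef class CSubclass(Base):    # OK: single inheritance from a cdef class
    pass

class PySubclass(Base):        # OK: a Python class may inherit from
    pass                       # a cdef class (and from several bases)

# cdef class Invalid(PySubclass):   # not allowed: a cdef class cannot
#     pass                          # inherit from a Python class
```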

So far our integration example has not been very useful, as it only integrates a single hard-coded function. In order to remedy this without sacrificing much speed, we will use a cdef class to represent a function on floating point numbers:

cdef class Function:
    cpdef double evaluate(self, double x) except *:
        return 0

The directive cpdef makes two versions of the method available; one fast for use from Cython and one slower for use from Python. Then:

from libc.math cimport sin

cdef class Function:
    cpdef double evaluate(self, double x) except *:
        return 0

cdef class SinOfSquareFunction(Function):
    cpdef double evaluate(self, double x) except *:
        return sin(x ** 2)

This does slightly more than providing a Python wrapper for a cdef method: unlike a cdef method, a cpdef method is fully overridable by methods and instance attributes in Python subclasses. It adds a little calling overhead compared to a cdef method.

To make the class definitions visible to other modules, and thus allow for efficient C-level usage and inheritance outside of the module that implements them, we define them in a sin_of_square.pxd file:

cdef class Function:
    cpdef double evaluate(self, double x) except *

cdef class SinOfSquareFunction(Function):
    cpdef double evaluate(self, double x) except *

Using this, we can now change our integration example:

from sin_of_square cimport Function, SinOfSquareFunction

def integrate(Function f, double a, double b, int N):
    cdef int i
    cdef double s, dx
    if f is None:
        raise ValueError("f cannot be None")
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f.evaluate(a + i * dx)
    return s * dx

print(integrate(SinOfSquareFunction(), 0, 1, 10000))

This is almost as fast as the previous code, but it is much more flexible, as the function to integrate can be changed. We can even pass in a new function defined in Python-space:

>>> import integrate
>>> class MyPolynomial(integrate.Function):
...     def evaluate(self, x):
...         return 2*x*x + 3*x - 10
...
>>> integrate.integrate(MyPolynomial(), 0, 1, 10000)
-7.8335833300000077

This is about 20 times slower, but still about 10 times faster than the original Python-only integration code. This shows how large the speed-ups can easily be when whole loops are moved from Python code into a Cython module.
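
For comparison, a pure-Python version of the same integration loop (a hypothetical reconstruction of the baseline referred to above, not the exact code that was benchmarked) could look like this:

```python
# Pure-Python baseline: same left-Riemann-sum integration of sin(x**2),
# but without any static typing or C-level dispatch.
from math import sin

def f(x):
    return sin(x ** 2)

def integrate_f(a, b, N):
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx
```

Every call to f here goes through Python's dynamic dispatch, which is exactly the overhead the typed cdef class hierarchy avoids.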

Some notes on our new implementation of evaluate:

  • The fast method dispatch here only works because evaluate was declared in Function. Had evaluate been introduced in SinOfSquareFunction, the code would still work, but Cython would have used the slower Python method dispatch mechanism instead.
  • In the same way, had the argument f not been typed, but only been passed as a Python object, the slower Python dispatch would be used.
  • Since the argument is typed, we need to check whether it is None. In Python, this would have resulted in an AttributeError when the evaluate method was looked up, but Cython would instead try to access the (incompatible) internal structure of None as if it were a Function, leading to a crash or data corruption.

There is a compiler directive nonecheck which turns on checks for this, at the cost of decreased speed. Here is how compiler directives are used to switch nonecheck on or off dynamically:

# cython: nonecheck=True
#        ^^^ Turns on nonecheck globally

import cython

cdef class MyClass:
    pass

# Turn off nonecheck locally for the function
@cython.nonecheck(False)
def func():
    cdef MyClass obj = None
    try:
        # Turn nonecheck on again for a block
        with cython.nonecheck(True):
            print(obj.myfunc())  # Raises exception
    except AttributeError:
        pass
    print(obj.myfunc())  # Hope for a crash!

Attributes in cdef classes behave differently from attributes in regular classes:

  • All attributes must be pre-declared at compile-time
  • Attributes are by default only accessible from Cython (typed access)
  • Properties can be declared to expose dynamic attributes to Python-space

from sin_of_square cimport Function

cdef class WaveFunction(Function):

    # Not available in Python-space:
    cdef double offset

    # Available in Python-space:
    cdef public double freq

    # Available in Python-space, but only for reading:
    cdef readonly double scale

    # Available in Python-space:
    @property
    def period(self):
        return 1.0 / self.freq

    @period.setter
    def period(self, value):
        self.freq = 1.0 / value

pxd files

In addition to the .pyx source files, Cython uses .pxd files which work like C header files – they contain Cython declarations (and sometimes code sections) which are only meant for inclusion by Cython modules. A pxd file is imported into a pyx module by using the cimport keyword.

pxd files have many use-cases:

  1. They can be used for sharing external C declarations.

  2. They can contain functions which are well suited for inlining by the C compiler. Such functions should be marked inline, for example:

    cdef inline int int_min(int a, int b):
        return b if b < a else a
    
  3. When accompanying an equally named pyx file, they provide a Cython interface to the Cython module so that other Cython modules can communicate with it using a more efficient protocol than the Python one.

In our integration example, we might break it up into pxd files like this:

  1. Add a cmath.pxd file which declares the C functions available from the C math.h header file, like sin. Then one would simply do from cmath cimport sin in integrate.pyx.

  2. Add an integrate.pxd so that other modules written in Cython can define fast custom functions to integrate.

    cdef class Function:
        cpdef evaluate(self, double x)
    cpdef integrate(Function f, double a,
                    double b, int N)
    

    Note that if you have a cdef class with attributes, the attributes must be declared in the pxd file that declares the class (if you use one), not in the pyx file. The compiler will tell you about this.

Caveats

Since Cython mixes C and Python semantics, some things may be a bit surprising or unintuitive. Work always goes on to make Cython more natural for Python users, so this list may change in the future.

  • 10**-2 == 0, instead of 0.01 like in Python.
  • Given two typed int variables a and b, a % b has the same sign as the second argument (following Python semantics) rather than having the same sign as the first (as in C). The C behavior can be obtained, at some speed gain, by enabling the cdivision directive (versions prior to Cython 0.12 always followed C semantics).
  • Care is needed with unsigned types. cdef unsigned n = 10; print(range(-n, n)) will print an empty list, since -n wraps around to a large positive integer prior to being passed to the range function.
  • Python’s float type actually wraps C double values, and the int type in Python 2.x wraps C long values.
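
The Python side of these caveats can be verified in plain Python, where the same expressions follow Python semantics:

```python
# Plain-Python behaviour that C-typed Cython code can diverge from.

# ** with a negative exponent yields a float in Python:
assert 10 ** -2 == 0.01

# % follows the sign of the second operand in Python:
assert -7 % 3 == 2    # C's integer remainder would be -1
assert 7 % -3 == -2   # C's integer remainder would be 1
```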

Profiling

This part describes the profiling abilities of Cython. If you are familiar with profiling pure Python code, you only need to read the first section (Cython Profiling Basics). If you are not familiar with Python profiling, you should also read the tutorial (Profiling Tutorial), which takes you through a complete example step by step.

Cython Profiling Basics

Profiling in Cython is controlled by a compiler directive. It can be set either for an entire file or on a per function basis via a Cython decorator.

Enabling profiling for a complete source file

Profiling is enabled for a complete source file via a global directive to the Cython compiler at the top of a file:

# cython: profile=True

Note that profiling adds a slight overhead to each function call, therefore making your program a little slower (or a lot, if you call some small functions very often).

Once enabled, your Cython code will behave just like Python code when called from the cProfile module. This means you can just profile your Cython code together with your Python code using the same tools as for Python code alone.

Disabling profiling function wise

If your profile is cluttered by the call overhead of some small functions that you would rather not see in it, either because you plan to inline them anyway or because you are sure that you cannot make them any faster, you can use a special decorator to disable profiling for one function only (regardless of whether profiling is globally enabled or not):

cimport cython

@cython.profile(False)
def my_often_called_function():
    pass

Enabling line tracing

To get more detailed trace information (for tools that can make use of it), you can enable line tracing:

# cython: linetrace=True

This will also enable profiling support, so the above profile=True option is not needed. Line tracing is needed for coverage analysis, for example.

Note that even if line tracing is enabled via the compiler directive, it is not used by default. As the runtime slowdown can be substantial, it must additionally be compiled in by the C compiler by setting the C macro definition CYTHON_TRACE=1. To include nogil functions in the trace, set CYTHON_TRACE_NOGIL=1 (which implies CYTHON_TRACE=1). C macros can be defined either in the extension definition of the setup.py script or by setting the respective distutils options in the source file with the following file header comment (if cythonize() is used for compilation):

# distutils: define_macros=CYTHON_TRACE_NOGIL=1

Enabling coverage analysis

Since Cython 0.23, line tracing (see above) also enables support for coverage reporting with the coverage.py tool. To make the coverage analysis understand Cython modules, you also need to enable Cython’s coverage plugin in your .coveragerc file as follows:

[run]
plugins = Cython.Coverage

With this plugin, your Cython source files should show up normally in the coverage reports.

To include the coverage report in the Cython annotated HTML file, you need to first run the coverage.py tool to generate an XML result file. Pass this file into the cython command as follows:

$ cython  --annotate-coverage coverage.xml  package/mymodule.pyx

This will recompile the Cython module and generate one HTML output file next to each Cython source file it processes, containing colour markers for lines that were contained in the coverage report.

Profiling Tutorial

This will be a complete tutorial, start to finish, of profiling Python code, turning it into Cython code, and continuing to profile until it is fast enough.

As a toy example, we would like to evaluate the summation of the reciprocals of squares up to a certain integer n in order to approximate \pi. The relation we want to use was proven by Euler in 1735 and is known as the Basel problem.

\pi^2 = 6 \sum_{k=1}^{\infty} \frac{1}{k^2} =
6 \lim_{k \to \infty} \big( \frac{1}{1^2} +
      \frac{1}{2^2} + \dots + \frac{1}{k^2}  \big) \approx
6 \big( \frac{1}{1^2} + \frac{1}{2^2} + \dots + \frac{1}{n^2}  \big)

A simple Python code for evaluating the truncated sum looks like this:

# calc_pi.py

def recip_square(i):
    return 1. / i ** 2

def approx_pi(n=10000000):
    val = 0.
    for k in range(1, n + 1):
        val += recip_square(k)
    return (6 * val) ** .5
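
Before profiling anything, a quick sanity check (not part of the tutorial itself) confirms that the truncated sum really converges toward \pi:

```python
import math

# Truncated Basel sum, condensed to a one-liner for checking only.
def approx_pi(n):
    return (6 * sum(1.0 / k ** 2 for k in range(1, n + 1))) ** 0.5

# Even a modest n gets within roughly 1/n of pi.
error = abs(approx_pi(1000) - math.pi)
print(error)
```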

On my box, this needs approximately 4 seconds to run the function with the default n. The higher we choose n, the better the approximation of \pi. An experienced Python programmer will already see plenty of places to optimize this code. But remember the golden rule of optimization: Never optimize without having profiled. Let me repeat this: Never optimize without having profiled your code. Your thoughts about which part of your code takes too much time are wrong. At least, mine are always wrong. So let’s write a short script to profile our code:

# profile.py

import pstats, cProfile

import calc_pi

cProfile.runctx("calc_pi.approx_pi()", globals(), locals(), "Profile.prof")

s = pstats.Stats("Profile.prof")
s.strip_dirs().sort_stats("time").print_stats()

Running this on my box gives the following output:

Sat Nov  7 17:40:54 2009    Profile.prof

         10000004 function calls in 6.211 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    3.243    3.243    6.211    6.211 calc_pi.py:7(approx_pi)
 10000000    2.526    0.000    2.526    0.000 calc_pi.py:4(recip_square)
        1    0.442    0.442    0.442    0.442 {range}
        1    0.000    0.000    6.211    6.211 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

This tells us that the code ran for 6.2 CPU seconds. Note that the code got slower by about 2 seconds because it ran inside the cProfile module. The table contains the really valuable information. You might want to check the Python profiling documentation for the nitty-gritty details. The most important columns here are tottime (total time spent in this function, not counting functions that were called by this function) and cumtime (total time spent in this function, also counting the functions called by this function). Looking at the tottime column, we see that approximately half the time is spent in approx_pi and the other half is spent in recip_square. Also, half a second is spent in range … of course we should have used xrange for such a big iteration (in Python 2, range builds a full list). And in fact, just changing range to xrange makes the code run in 5.8 seconds.

We could optimize a lot in the pure Python version, but since we are interested in Cython, let’s move forward and bring this module to Cython. We would do this anyway at some time to get the loop run faster. Here is our first Cython version:

# cython: profile=True

# calc_pi.pyx

def recip_square(int i):
    return 1. / i ** 2

def approx_pi(int n=10000000):
    cdef double val = 0.
    cdef int k
    for k in range(1, n + 1):
        val += recip_square(k)
    return (6 * val) ** .5

Note the first line: We have to tell Cython that profiling should be enabled. This makes the Cython code slightly slower, but without this we would not get meaningful output from the cProfile module. The rest of the code is mostly unchanged, I only typed some variables which will likely speed things up a bit.

We also need to modify our profiling script to import the Cython module directly. Here is the complete version, adding the import of the pyximport module:

# profile.py

import pstats, cProfile

import pyximport
pyximport.install()

import calc_pi

cProfile.runctx("calc_pi.approx_pi()", globals(), locals(), "Profile.prof")

s = pstats.Stats("Profile.prof")
s.strip_dirs().sort_stats("time").print_stats()

We only added two lines, the rest stays completely the same. Alternatively, we could also manually compile our code into an extension; we wouldn’t need to change the profile script then at all. The script now outputs the following:

Sat Nov  7 18:02:33 2009    Profile.prof

         10000004 function calls in 4.406 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    3.305    3.305    4.406    4.406 calc_pi.pyx:7(approx_pi)
 10000000    1.101    0.000    1.101    0.000 calc_pi.pyx:4(recip_square)
        1    0.000    0.000    4.406    4.406 {calc_pi.approx_pi}
        1    0.000    0.000    4.406    4.406 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

We gained 1.8 seconds. Not too shabby. Comparing the output to the previous run, we see that the recip_square function got faster while the approx_pi function has not changed much. Let’s concentrate on the recip_square function a bit more. First note that this function is not meant to be called from code outside of our module, so it would be wise to turn it into a cdef function to reduce call overhead. We should also get rid of the power operator: it is turned into a pow(i, 2) function call by Cython, whereas we could instead just write i * i, which can be faster. The whole function is also a good candidate for inlining. Let’s look at the necessary changes for these ideas:

# cython: profile=True

# calc_pi.pyx

cdef inline double recip_square(int i):
    return 1. / (i * i)

def approx_pi(int n=10000000):
    cdef double val = 0.
    cdef int k
    for k in range(1, n + 1):
        val += recip_square(k)
    return (6 * val) ** .5

Now running the profile script yields:

Sat Nov  7 18:10:11 2009    Profile.prof

         10000004 function calls in 2.622 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.782    1.782    2.622    2.622 calc_pi.pyx:7(approx_pi)
 10000000    0.840    0.000    0.840    0.000 calc_pi.pyx:4(recip_square)
        1    0.000    0.000    2.622    2.622 {calc_pi.approx_pi}
        1    0.000    0.000    2.622    2.622 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

That bought us another 1.8 seconds. Not the dramatic change we could have expected. And why is recip_square still in this table; it is supposed to be inlined, isn’t it? The reason for this is that Cython still generates profiling code even if the function call is eliminated. Let’s tell it to not profile recip_square any more; we couldn’t get the function to be much faster anyway:

# cython: profile=True

# calc_pi.pyx

cimport cython

@cython.profile(False)
cdef inline double recip_square(int i):
    return 1. / (i * i)

def approx_pi(int n=10000000):
    cdef double val = 0.
    cdef int k
    for k in range(1, n + 1):
        val += recip_square(k)
    return (6 * val) ** .5

Running this shows an interesting result:

Sat Nov  7 18:15:02 2009    Profile.prof

         4 function calls in 0.089 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.089    0.089    0.089    0.089 calc_pi.pyx:10(approx_pi)
        1    0.000    0.000    0.089    0.089 {calc_pi.approx_pi}
        1    0.000    0.000    0.089    0.089 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

First note the tremendous speed gain: this version only takes 1/50 of the time of our first Cython version. Also note that recip_square has vanished from the table like we wanted. But the most peculiar and important change is that approx_pi also got much faster. This is a problem with all profiling: calling a function in a profile run adds a certain overhead to the function call. This overhead is not added to the time spent in the called function, but to the time spent in the calling function. In this example, approx_pi didn’t need 2.622 seconds in the last run; but it called recip_square 10000000 times, each time taking a little time to set up profiling for it. This adds up to a massive time loss of around 2.6 seconds. Having disabled profiling for the often-called function now reveals realistic timings for approx_pi; we could continue optimizing it now if needed.

This concludes this profiling tutorial. There is still some room for improvement in this code. We could try to replace the power operator in approx_pi with a call to sqrt from the C stdlib; but this is not necessarily faster than calling pow(x,0.5).
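
In plain Python terms the two spellings compute the same value (up to floating-point rounding), so the choice between them is purely a question of speed:

```python
import math

x = 9.8696  # arbitrary positive value, roughly pi squared
# x ** 0.5 and math.sqrt(x) agree to within floating-point rounding.
print(x ** 0.5, math.sqrt(x))
assert math.isclose(x ** 0.5, math.sqrt(x))
```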

Even so, the result we achieved here is quite satisfactory: we came up with a solution that is much faster than our original Python version while retaining functionality and readability.

Unicode and passing strings

Similar to the string semantics in Python 3, Cython strictly separates byte strings and unicode strings. Above all, this means that by default there is no automatic conversion between byte strings and unicode strings (except for what Python 2 does in string operations). All encoding and decoding must pass through an explicit encoding/decoding step. To ease conversion between Python and C strings in simple cases, the module-level c_string_type and c_string_encoding directives can be used to implicitly insert these encoding/decoding steps.

Python string types in Cython code

Cython supports four Python string types: bytes, str, unicode and basestring. The bytes and unicode types are the specific types known from normal Python 2.x (named bytes and str in Python 3). Additionally, Cython also supports the bytearray type which behaves like the bytes type, except that it is mutable.

The str type is special in that it is the byte string in Python 2 and the Unicode string in Python 3 (for Cython code compiled with language level 2, i.e. the default). Meaning, it always corresponds exactly with the type that the Python runtime itself calls str. Thus, in Python 2, both bytes and str represent the byte string type, whereas in Python 3, both str and unicode represent the Python Unicode string type. The switch is made at C compile time, the Python version that is used to run Cython is not relevant.

When compiling Cython code with language level 3, the str type is identified with exactly the Unicode string type at Cython compile time, i.e. it does not identify with bytes when running in Python 2.

Note that the str type is not compatible with the unicode type in Python 2, i.e. you cannot assign a Unicode string to a variable or argument that is typed str. The attempt will result in either a compile time error (if detectable) or a TypeError exception at runtime. You should therefore be careful when you statically type a string variable in code that must be compatible with Python 2, as this Python version allows a mix of byte strings and unicode strings for data and users normally expect code to be able to work with both. Code that only targets Python 3 can safely type variables and arguments as either bytes or unicode.

The basestring type represents both the types str and unicode, i.e. all Python text string types in Python 2 and Python 3. This can be used for typing text variables that normally contain Unicode text (at least in Python 3) but must additionally accept the str type in Python 2 for backwards compatibility reasons. It is not compatible with the bytes type. Its usage should be rare in normal Cython code as the generic object type (i.e. untyped code) will normally be good enough and has the additional advantage of supporting the assignment of string subtypes. Support for the basestring type was added in Cython 0.20.

String literals

Cython understands all Python string type prefixes:

  • b'bytes' for byte strings
  • u'text' for Unicode strings
  • f'formatted {value}' for formatted Unicode string literals as defined by PEP 498 (added in Cython 0.24)

Unprefixed string literals become str objects when compiling with language level 2 and unicode objects (i.e. Python 3 str) with language level 3.

General notes about C strings

In many use cases, C strings (a.k.a. character pointers) are slow and cumbersome. For one, they usually require manual memory management in one way or another, which makes it more likely to introduce bugs into your code.

Then, Python string objects cache their length, so requesting it (e.g. to validate the bounds of index access or when concatenating two strings into one) is an efficient constant time operation. In contrast, calling strlen() to get this information from a C string takes linear time, which makes many operations on C strings rather costly.

Regarding text processing, Python has built-in support for Unicode, which C lacks completely. If you are dealing with Unicode text, you are usually better off using Python Unicode string objects than trying to work with encoded data in C strings. Cython makes this quite easy and efficient.

Generally speaking: unless you know what you are doing, avoid using C strings where possible and use Python string objects instead. The obvious exception to this is when passing them back and forth from and to external C code. Also, C++ strings remember their length as well, so they can provide a suitable alternative to Python bytes objects in some cases, e.g. when reference counting is not needed within a well defined context.

Passing byte strings

Here, we have dummy C functions declared in a file called c_func.pyx that we are going to reuse throughout this tutorial:

from libc.stdlib cimport malloc
from libc.string cimport strcpy, strlen

cdef char* hello_world = 'hello world'
cdef Py_ssize_t n = strlen(hello_world)


cdef char* c_call_returning_a_c_string():
    cdef char* c_string = <char *> malloc((n + 1) * sizeof(char))
    if not c_string:
        raise MemoryError()
    strcpy(c_string, hello_world)
    return c_string


cdef void get_a_c_string(char** c_string_ptr, Py_ssize_t *length):
    c_string_ptr[0] = <char *> malloc((n + 1) * sizeof(char))
    if not c_string_ptr[0]:
        raise MemoryError()

    strcpy(c_string_ptr[0], hello_world)
    length[0] = n

We make a corresponding c_func.pxd to be able to cimport those functions:

cdef char* c_call_returning_a_c_string()
cdef void get_a_c_string(char** c_string, Py_ssize_t *length)

It is very easy to pass byte strings between C code and Python. When receiving a byte string from a C library, you can let Cython convert it into a Python byte string by simply assigning it to a Python variable:

from c_func cimport c_call_returning_a_c_string

cdef char* c_string = c_call_returning_a_c_string()
cdef bytes py_string = c_string

A type cast to object or bytes will do the same thing:

py_string = <bytes> c_string

This creates a Python byte string object that holds a copy of the original C string. It can be safely passed around in Python code, and will be garbage collected when the last reference to it goes out of scope. It is important to remember that null bytes in the string act as terminator characters, as is generally known from C. The above will therefore only work correctly for C strings that do not contain null bytes.

Besides not working for null bytes, the above is also very inefficient for long strings, since Cython has to call strlen() on the C string first to find out the length by counting the bytes up to the terminating null byte. In many cases, the user code will know the length already, e.g. because a C function returned it. In this case, it is much more efficient to tell Cython the exact number of bytes by slicing the C string. Here is an example:

from libc.stdlib cimport free
from c_func cimport get_a_c_string


def main():
    cdef char* c_string = NULL
    cdef Py_ssize_t length = 0

    # get pointer and length from a C function
    get_a_c_string(&c_string, &length)

    try:
        py_bytes_string = c_string[:length]  # Performs a copy of the data
    finally:
        free(c_string)

Here, no additional byte counting is required and length bytes from the c_string will be copied into the Python bytes object, including any null bytes. Keep in mind that the slice indices are assumed to be accurate in this case and no bounds checking is done, so incorrect slice indices will lead to data corruption and crashes.

Note that the creation of the Python bytes string can fail with an exception, e.g. due to insufficient memory. If you need to free() the string after the conversion, you should wrap the assignment in a try-finally construct:

from libc.stdlib cimport free
from c_func cimport c_call_returning_a_c_string

cdef bytes py_string
cdef char* c_string = c_call_returning_a_c_string()
try:
    py_string = c_string
finally:
    free(c_string)

To convert the byte string back into a C char*, use the opposite assignment:

cdef char* other_c_string = py_string  # other_c_string is a 0-terminated string.

This is a very fast operation after which other_c_string points to the byte string buffer of the Python string itself. It is tied to the life time of the Python string. When the Python string is garbage collected, the pointer becomes invalid. It is therefore important to keep a reference to the Python string as long as the char* is in use. Often enough, this only spans the call to a C function that receives the pointer as parameter. Special care must be taken, however, when the C function stores the pointer for later use. Apart from keeping a Python reference to the string object, no manual memory management is required.

Starting with Cython 0.20, the bytearray type is supported and coerces in the same way as the bytes type. However, when using it in a C context, special care must be taken not to grow or shrink the object buffer after converting it to a C string pointer. These modifications can change the internal buffer address, which will make the pointer invalid.

Accepting strings from Python code

The other side, receiving input from Python code, may appear simple at first sight, as it only deals with objects. However, getting this right without making the API too narrow or too unsafe may not be entirely obvious.

In the case that the API only deals with byte strings, i.e. binary data or encoded text, it is best not to type the input argument as something like bytes, because that would restrict the allowed input to exactly that type and exclude both subtypes and other kinds of byte containers, e.g. bytearray objects or memory views.

Depending on how (and where) the data is being processed, it may be a good idea to instead receive a 1-dimensional memory view, e.g.

def process_byte_data(unsigned char[:] data):
    length = data.shape[0]
    first_byte = data[0]
    slice_view = data[1:-1]
    # ...

Cython’s memory views are described in more detail in Typed Memoryviews, but the above example already shows most of the relevant functionality for 1-dimensional byte views. They allow for efficient processing of arrays and accept anything that can unpack itself into a byte buffer, without intermediate copying. The processed content can finally be returned in the memory view itself (or a slice of it), but it is often better to copy the data back into a flat and simple bytes or bytearray object, especially when only a small slice is returned. Since memoryviews do not copy the data, they would otherwise keep the entire original buffer alive. The general idea here is to be liberal with input by accepting any kind of byte buffer, but strict with output by returning a simple, well adapted object. This can simply be done as follows:

def process_byte_data(unsigned char[:] data):
    # ... process the data, here, dummy processing.
    cdef bint return_all = (data[0] == 108)

    if return_all:
        return bytes(data)
    else:
        # example for returning a slice
        return bytes(data[5:7])
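
The accept-any-byte-buffer behaviour can be mimicked in plain Python with the built-in memoryview type, which is a reasonable mental model for what the typed memoryview argument does (a hypothetical helper, not Cython code):

```python
def process_byte_data(data):
    # Accepts bytes, bytearray, or anything else exposing a byte buffer,
    # without copying the input.
    view = memoryview(data)
    length = view.shape[0]
    # Return a flat bytes copy of a slice rather than the view itself,
    # so the original buffer is not kept alive by the result.
    return bytes(view[1:length - 1])

print(process_byte_data(b"hello"))
print(process_byte_data(bytearray(b"hello")))
```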

If the byte input is actually encoded text, and the further processing should happen at the Unicode level, then the right thing to do is to decode the input straight away. This is almost only a problem in Python 2.x, where Python code expects that it can pass a byte string (str) with encoded text into a text API. Since this usually happens in more than one place in the module’s API, a helper function is almost always the way to go, since it allows for easy adaptation of the input normalisation process later.

This kind of input normalisation function will commonly look similar to the following:

# to_unicode.pyx

from cpython.version cimport PY_MAJOR_VERSION

cdef unicode _text(s):
    if type(s) is unicode:
        # Fast path for most common case(s).
        return <unicode>s

    elif PY_MAJOR_VERSION < 3 and isinstance(s, bytes):
        # Only accept byte strings as text input in Python 2.x, not in Py3.
        return (<bytes>s).decode('ascii')

    elif isinstance(s, unicode):
        # We know from the fast path above that 's' can only be a subtype here.
        # An evil cast to <unicode> might still work in some(!) cases,
        # depending on what the further processing does.  To be safe,
        # we can always create a copy instead.
        return unicode(s)

    else:
        raise TypeError("Could not convert to unicode.")

And should then be used like this:

from to_unicode cimport _text

def api_func(s):
    text_input = _text(s)
    # ...

Similarly, if the further processing happens at the byte level, but Unicode string input should be accepted, then the following might work, if you are using memory views:

# define a global name for whatever char type is used in the module
ctypedef unsigned char char_type

cdef char_type[:] _chars(s):
    if isinstance(s, unicode):
        # encode to the specific encoding used inside of the module
        s = (<unicode>s).encode('utf8')
    return s

In this case, you might want to additionally ensure that byte string input really uses the correct encoding, e.g. if you require pure ASCII input data, you can run over the buffer in a loop and check the highest bit of each byte. This should then also be done in the input normalisation function.
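
The ASCII check mentioned here can be sketched in plain Python by testing the high bit of each byte; in real Cython code, the same loop would run over the typed buffer:

```python
def is_ascii(data):
    # Pure ASCII bytes always have the most significant bit cleared.
    for byte in data:
        if byte & 0x80:
            return False
    return True

print(is_ascii(b"hello"))                     # ASCII-only input
print(is_ascii("héllo".encode("utf8")))       # contains bytes >= 0x80
```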

Dealing with “const”

Many C libraries use the const modifier in their API to declare that they will not modify a string, or to require that users must not modify a string they return, for example:

typedef const char specialChar;
int process_string(const char* s);
const unsigned char* look_up_cached_string(const unsigned char* key);

Cython has support for the const modifier in the language, so you can declare the above functions straight away as follows:

cdef extern from "someheader.h":
    ctypedef const char specialChar
    int process_string(const char* s)
    const unsigned char* look_up_cached_string(const unsigned char* key)

Decoding bytes to text

The initially presented way of passing and receiving C strings is sufficient if your code only deals with binary data in the strings. When we deal with encoded text, however, it is best practice to decode the C byte strings to Python Unicode strings on reception, and to encode Python Unicode strings to C byte strings on the way out.

With a Python byte string object, you would normally just call the bytes.decode() method to decode it into a Unicode string:

ustring = byte_string.decode('UTF-8')

Cython allows you to do the same for a C string, as long as it contains no null bytes:

from c_func cimport c_call_returning_a_c_string

cdef char* some_c_string = c_call_returning_a_c_string()
ustring = some_c_string.decode('UTF-8')

And, more efficiently, for strings where the length is known:

from c_func cimport get_a_c_string

cdef char* c_string = NULL
cdef Py_ssize_t length = 0

# get pointer and length from a C function
get_a_c_string(&c_string, &length)

ustring = c_string[:length].decode('UTF-8')

The same slicing notation must also be used when the string contains null bytes, e.g. when it uses an encoding like UCS-4, where each character is encoded in four bytes, most of which tend to be 0.

Again, no bounds checking is done if slice indices are provided, so incorrect indices lead to data corruption and crashes. However, using negative indices is possible and will inject a call to strlen() in order to determine the string length. Obviously, this only works for 0-terminated strings without internal null bytes. Text encoded in UTF-8 or one of the ISO-8859 encodings is usually a good candidate. If in doubt, it’s better to pass indices that are ‘obviously’ correct than to rely on the data to be as expected.

It is common practice to wrap string conversions (and non-trivial type conversions in general) in dedicated functions, as this needs to be done in exactly the same way whenever receiving text from C. This could look as follows:

from libc.stdlib cimport free

cdef unicode tounicode(char* s):
    return s.decode('UTF-8', 'strict')

cdef unicode tounicode_with_length(
        char* s, size_t length):
    return s[:length].decode('UTF-8', 'strict')

cdef unicode tounicode_with_length_and_free(
        char* s, size_t length):
    try:
        return s[:length].decode('UTF-8', 'strict')
    finally:
        free(s)

Most likely, you will prefer shorter function names in your code based on the kind of string being handled. Different types of content often imply different ways of handling them on reception. To make the code more readable and to anticipate future changes, it is good practice to use separate conversion functions for different types of strings.

Encoding text to bytes

The reverse way, converting a Python unicode string to a C char*, is pretty efficient by itself, assuming that what you actually want is a memory managed byte string:

py_byte_string = py_unicode_string.encode('UTF-8')
cdef char* c_string = py_byte_string

As noted before, this takes the pointer to the byte buffer of the Python byte string. Trying to do the same without keeping a reference to the Python byte string will fail with a compile error:

# this will not compile !
cdef char* c_string = py_unicode_string.encode('UTF-8')

Here, the Cython compiler notices that the code takes a pointer to a temporary string result that will be garbage collected after the assignment. Later access to the invalidated pointer will read invalid memory and likely result in a segfault. Cython will therefore refuse to compile this code.

C++ strings

When wrapping a C++ library, strings will usually come in the form of the std::string class. As with C strings, Python byte strings automatically coerce from and to C++ strings:

# distutils: language = c++

from libcpp.string cimport string

def get_bytes():
    py_bytes_object = b'hello world'
    cdef string s = py_bytes_object

    s.append(b'abc')
    py_bytes_object = s
    return py_bytes_object

The memory management situation is different than in C because the creation of a C++ string makes an independent copy of the string buffer which the string object then owns. It is therefore possible to convert temporarily created Python objects directly into C++ strings. A common way to make use of this is when encoding a Python unicode string into a C++ string:

cdef string cpp_string = py_unicode_string.encode('UTF-8')

Note that this involves a bit of overhead because it first encodes the Unicode string into a temporarily created Python bytes object and then copies its buffer into a new C++ string.

For the other direction, efficient decoding support is available in Cython 0.17 and later:

# distutils: language = c++

from libcpp.string cimport string

def get_ustrings():
    cdef string s = string(b'abcdefg')

    ustring1 = s.decode('UTF-8')
    ustring2 = s[2:-2].decode('UTF-8')
    return ustring1, ustring2

For C++ strings, decoding slices will always take the proper length of the string into account and apply Python slicing semantics (e.g. return empty strings for out-of-bounds indices).
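The slicing semantics applied here are the usual Python ones, which can be illustrated with a plain Python bytes object:

```python
s = b"abcdefg"

# Negative indices count from the end:
assert s[2:-2] == b"cde"

# Out-of-bounds slice indices are clipped rather than raising an error:
assert s[10:20] == b""
assert s[:100] == b"abcdefg"
```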

Auto encoding and decoding

Cython 0.19 comes with two new directives: c_string_type and c_string_encoding. They can be used to change the Python string types that C/C++ strings coerce from and to. By default, they only coerce from and to the bytes type, and encoding or decoding must be done explicitly, as described above.

There are two use cases where this is inconvenient. First, if all C strings that are being processed (or the large majority) contain text, automatic encoding and decoding from and to Python unicode objects can reduce the code overhead a little. In this case, you can set the c_string_type directive in your module to unicode and the c_string_encoding to the encoding that your C code uses, for example:

# cython: c_string_type=unicode, c_string_encoding=utf8

cdef char* c_string = 'abcdefg'

# implicit decoding:
cdef object py_unicode_object = c_string

# explicit conversion to Python bytes:
py_bytes_object = <bytes>c_string

The second use case is when all C strings that are being processed only contain ASCII encodable characters (e.g. numbers) and you want your code to use the native legacy string type in Python 2 for them, instead of always using Unicode. In this case, you can set the string type to str:

# cython: c_string_type=str, c_string_encoding=ascii

cdef char* c_string = 'abcdefg'

# implicit decoding in Py3, bytes conversion in Py2:
cdef object py_str_object = c_string

# explicit conversion to Python bytes:
py_bytes_object = <bytes>c_string

# explicit conversion to Python unicode:
py_unicode_object = <unicode>c_string

The other direction, i.e. automatic encoding to C strings, is only supported for ASCII and the “default encoding”, which is usually UTF-8 in Python 3 and usually ASCII in Python 2. CPython handles the memory management in this case by keeping an encoded copy of the string alive together with the original unicode string. Otherwise, there would be no way to limit the lifetime of the encoded string in any sensible way, thus rendering any attempt to extract a C string pointer from it a dangerous endeavour. The following safely converts a Unicode string to ASCII (change c_string_encoding to default to use the default encoding instead):

# cython: c_string_type=unicode, c_string_encoding=ascii

def func():
    ustring = u'abc'
    cdef char* s = ustring
    return s[0]    # returns u'a'

(This example uses a function context in order to safely control the lifetime of the Unicode string. Global Python variables can be modified from the outside, which makes it dangerous to rely on the lifetime of their values.)

Source code encoding

When string literals appear in the code, the source code encoding is important. It determines the byte sequence that Cython will store in the C code for bytes literals, and the Unicode code points that Cython builds for unicode literals when parsing the byte encoded source file. Following PEP 263, Cython supports the explicit declaration of source file encodings. For example, putting the following comment at the top of an ISO-8859-15 (Latin-9) encoded source file (into the first or second line) is required to enable ISO-8859-15 decoding in the parser:

# -*- coding: ISO-8859-15 -*-

When no explicit encoding declaration is provided, the source code is parsed as UTF-8 encoded text, as specified by PEP 3120. UTF-8 is a very common encoding that can represent the entire Unicode character set and is compatible with plain ASCII encoded text, which it also encodes efficiently. This makes it a very good choice for source code files, which usually consist mostly of ASCII characters.

As an example, putting the following line into a UTF-8 encoded source file will print 5, as UTF-8 encodes the letter 'ö' in the two byte sequence '\xc3\xb6':

print( len(b'abcö') )

whereas the following ISO-8859-15 encoded source file will print 4, as the encoding uses only 1 byte for this letter:

# -*- coding: ISO-8859-15 -*-
print( len(b'abcö') )

Note that the unicode literal u'abcö' is a correctly decoded four character Unicode string in both cases, whereas the unprefixed Python str literal 'abcö' will become a byte string in Python 2 (thus having length 4 or 5 in the examples above), and a 4 character Unicode string in Python 3. If you are not familiar with encodings, this may not appear obvious at first read. See CEP 108 for details.
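These length differences are easy to verify by encoding the same text explicitly in plain Python (the string is written with an escape sequence here so that the example is independent of the source file encoding):

```python
text = "abc\xf6"  # 'abcö' as a 4 character Unicode string

assert len(text) == 4                        # 4 code points
assert len(text.encode("utf-8")) == 5        # 'ö' takes two bytes in UTF-8
assert len(text.encode("iso-8859-15")) == 4  # one byte in ISO-8859-15
```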

As a rule of thumb, it is best to avoid unprefixed non-ASCII str literals and to use unicode string literals for all text. Cython also supports the __future__ import unicode_literals that instructs the parser to read all unprefixed str literals in a source file as unicode string literals, just like Python 3.

Single bytes and characters

The Python C-API uses the normal C char type to represent a byte value, but it has two special integer types for a Unicode code point value, i.e. a single Unicode character: Py_UNICODE and Py_UCS4. Cython supports the first natively; support for Py_UCS4 is new in Cython 0.15. Py_UNICODE is either defined as an unsigned 2-byte or 4-byte integer, or as wchar_t, depending on the platform. The exact type is a compile time option in the build of the CPython interpreter, and extension modules inherit this definition at C compile time. The advantage of Py_UCS4 is that it is guaranteed to be large enough for any Unicode code point value, regardless of the platform. It is defined as a 32bit unsigned int or long.

In Cython, the char type behaves differently from the Py_UNICODE and Py_UCS4 types when coercing to Python objects. Similar to the behaviour of the bytes type in Python 3, the char type coerces to a Python integer value by default, so that the following prints 65 and not A:

# -*- coding: ASCII -*-

cdef char char_val = 'A'
assert char_val == 65   # ASCII encoded byte value of 'A'
print( char_val )

If you want a Python bytes string instead, you have to request it explicitly, and the following will print A (or b'A' in Python 3):

print( <bytes>char_val )

The explicit coercion works for any C integer type. Values outside of the range of a char or unsigned char will raise an OverflowError at runtime. Coercion will also happen automatically when assigning to a typed variable, e.g.:

cdef bytes py_byte_string
py_byte_string = char_val
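As a rough plain-Python analogue of this range check, the bytes constructor enforces the same 0..255 value range that a C unsigned char has (note that Python raises a ValueError here rather than an OverflowError):

```python
assert bytes([65]) == b"A"  # a value in range becomes a single byte

try:
    bytes([300])  # out of the 0..255 range of a byte
except ValueError:
    pass  # out-of-range byte values are rejected
else:
    raise AssertionError("expected a ValueError")
```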

On the other hand, the Py_UNICODE and Py_UCS4 types are rarely used outside of the context of a Python unicode string, so their default behaviour is to coerce to a Python unicode object. The following will therefore print the character A, as would the same code with the Py_UNICODE type:

cdef Py_UCS4 uchar_val = u'A'
assert uchar_val == 65 # character point value of u'A'
print( uchar_val )

Again, explicit casting will allow users to override this behaviour. The following will print 65:

cdef Py_UCS4 uchar_val = u'A'
print( <long>uchar_val )

Note that casting to a C long (or unsigned long) will work just fine, as the maximum code point value that a Unicode character can have is 1114111 (0x10FFFF). On platforms with 32bit or more, int is just as good.
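This limit is easy to check in plain Python:

```python
import sys

# The highest valid Unicode code point is 0x10FFFF (1114111),
# which fits comfortably into a 32-bit integer.
assert ord(chr(0x10FFFF)) == 1114111
assert sys.maxunicode == 0x10FFFF  # on wide (and all PEP 393) builds
```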

Narrow Unicode builds

In narrow Unicode builds of CPython before version 3.3, i.e. builds where sys.maxunicode is 65535 (such as all Windows builds, as opposed to 1114111 in wide builds), it is still possible to use Unicode character code points that do not fit into the 16 bit wide Py_UNICODE type. For example, such a CPython build will accept the unicode literal u'\U00012345'. However, the underlying system level encoding leaks into Python space in this case, so that the length of this literal becomes 2 instead of 1. This also shows when iterating over it or when indexing into it. The visible substrings are u'\uD808' and u'\uDF45' in this example. They form a so-called surrogate pair that represents the above character.

For more information on this topic, it is worth reading the Wikipedia article about the UTF-16 encoding.

The same properties apply to Cython code that gets compiled for a narrow CPython runtime environment. In most cases, e.g. when searching for a substring, this difference can be ignored as both the text and the substring will contain the surrogates. So most Unicode processing code will work correctly also on narrow builds. Encoding, decoding and printing will work as expected, so that the above literal turns into exactly the same byte sequence on both narrow and wide Unicode platforms.

However, programmers should be aware that a single Py_UNICODE value (or single ‘character’ unicode string in CPython) may not be enough to represent a complete Unicode character on narrow platforms. For example, if an independent search for u'\uD808' and u'\uDF45' in a unicode string succeeds, this does not necessarily mean that the character u'\U00012345' is part of that string. It may well be that two different characters are in the string that just happen to share a code unit with the surrogate pair of the character in question. Looking for substrings works correctly because the two code units in the surrogate pair use distinct value ranges, so the pair is always identifiable in a sequence of code points.
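The surrogate values quoted above can be derived with a few lines of plain Python (`surrogate_pair` is a hypothetical helper written for illustration):

```python
def surrogate_pair(code_point):
    # Split a code point above 0xFFFF into its UTF-16 surrogate pair:
    # the high surrogate carries the upper 10 bits of (code_point - 0x10000),
    # the low surrogate carries the lower 10 bits.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return high, low

assert surrogate_pair(0x12345) == (0xD808, 0xDF45)
```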

As of version 0.15, Cython has extended support for surrogate pairs so that you can safely use an in test to search character values from the full Py_UCS4 range even on narrow platforms:

cdef Py_UCS4 uchar = 0x12345
print( uchar in some_unicode_string )

Similarly, it can coerce a one character string with a high Unicode code point value to a Py_UCS4 value on both narrow and wide Unicode platforms:

cdef Py_UCS4 uchar = u'\U00012345'
assert uchar == 0x12345

In CPython 3.3 and later, the Py_UNICODE type is an alias for the system specific wchar_t type and is no longer tied to the internal representation of the Unicode string. Instead, any Unicode character can be represented on all platforms without resorting to surrogate pairs. This implies that narrow builds no longer exist from that version on, regardless of the size of Py_UNICODE. See PEP 393 for details.

Cython 0.16 and later handles this change internally and does the right thing also for single character values as long as either type inference is applied to untyped variables or the portable Py_UCS4 type is explicitly used in the source code instead of the platform specific Py_UNICODE type. Optimisations that Cython applies to the Python unicode type will automatically adapt to PEP 393 at C compile time, as usual.

Iteration

Cython 0.13 supports efficient iteration over char*, bytes and unicode strings, as long as the loop variable is appropriately typed. So the following will generate the expected C code:

cdef char* c_string = "Hello to A C-string's world"

cdef char c
for c in c_string[:11]:
    if c == 'A':
        print("Found the letter A")

The same applies to bytes objects:

cdef bytes bytes_string = b"hello to A bytes' world"

cdef char c
for c in bytes_string:
    if c == 'A':
        print("Found the letter A")

For unicode objects, Cython will automatically infer the type of the loop variable as Py_UCS4:

cdef unicode ustring = u'Hello world'

# NOTE: no typing required for 'uchar' !
for uchar in ustring:
    if uchar == u'A':
        print("Found the letter A")

The automatic type inference usually leads to much more efficient code here. However, note that some unicode operations still require the value to be a Python object, so Cython may end up generating redundant conversion code for the loop variable value inside of the loop. If this leads to a performance degradation for a specific piece of code, you can either type the loop variable as a Python object explicitly, or assign its value to a Python typed variable somewhere inside of the loop to enforce one-time coercion before running Python operations on it.

There are also optimisations for in tests, so that the following code will run in plain C code, (actually using a switch statement):

cpdef void is_in(Py_UCS4 uchar_val):
    if uchar_val in u'abcABCxY':
        print("The character is in the string.")
    else:
        print("The character is not in the string")

Combined with the looping optimisation above, this can result in very efficient character switching code, e.g. in unicode parsers.

Windows and wide character APIs

Windows system APIs natively support Unicode in the form of zero-terminated UTF-16 encoded wchar_t* strings, so called “wide strings”.

By default, Windows builds of CPython define Py_UNICODE as a synonym for wchar_t. This makes the internal unicode representation compatible with UTF-16 and allows for efficient zero-copy conversions. It also means that Windows builds are always narrow Unicode builds, with all the caveats that this implies.

To aid interoperation with Windows APIs, Cython 0.19 supports wide strings (in the form of Py_UNICODE*) and implicitly converts them to and from unicode string objects. These conversions behave the same way as they do for char* and bytes as described in Passing byte strings.

In addition to automatic conversion, unicode literals that appear in a C context become C-level wide string literals, and the len() built-in function is specialised to compute the length of a zero-terminated Py_UNICODE* string or array.

Here is an example of how one would call a Unicode API on Windows:

cdef extern from "Windows.h":

    ctypedef Py_UNICODE WCHAR
    ctypedef const WCHAR* LPCWSTR
    ctypedef void* HWND

    int MessageBoxW(HWND hWnd, LPCWSTR lpText, LPCWSTR lpCaption, int uType)

import sys

title = u"Windows Interop Demo - Python %d.%d.%d" % sys.version_info[:3]
MessageBoxW(NULL, u"Hello Cython \u263a", title, 0)

Warning

The use of Py_UNICODE* strings outside of Windows is strongly discouraged. Py_UNICODE is inherently not portable between different platforms and Python versions.

CPython 3.3 has moved to a flexible internal representation of unicode strings (PEP 393), making all Py_UNICODE related APIs deprecated and inefficient.

One consequence of the CPython 3.3 changes is that len() of unicode strings is always measured in code points (“characters”), while Windows APIs expect the number of UTF-16 code units (where each surrogate is counted individually). To always get the number of code units, call PyUnicode_GetSize() directly.
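The difference between code points and UTF-16 code units can be illustrated in plain Python:

```python
text = "\U00012345"

# One code point ...
assert len(text) == 1

# ... but UTF-16 needs a surrogate pair, i.e. two 16-bit code units:
assert len(text.encode("utf-16-le")) // 2 == 2
```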

Memory Allocation

Dynamic memory allocation is mostly a non-issue in Python. Everything is an object, and the reference counting system and garbage collector automatically return memory to the system when it is no longer being used.

When it comes to more low-level data buffers, Cython has special support for (multi-dimensional) arrays of simple types via NumPy, memory views or Python’s stdlib array type. They are full featured, garbage collected and much easier to work with than bare pointers in C, while still retaining the speed and static typing benefits. See Working with Python arrays and Typed Memoryviews.

In some situations, however, these objects can still incur an unacceptable amount of overhead, which can then make a case for doing manual memory management in C.

Simple C values and structs (such as a local variable cdef double x) are usually allocated on the stack and passed by value, but for larger and more complicated objects (e.g. a dynamically-sized list of doubles), the memory must be manually requested and released. C provides the functions malloc(), realloc(), and free() for this purpose, which can be cimported in Cython from libc.stdlib. Their signatures are:

void* malloc(size_t size)
void* realloc(void* ptr, size_t size)
void free(void* ptr)

A very simple example of malloc usage is the following:

import random
from libc.stdlib cimport malloc, free

def random_noise(int number=1):
    cdef int i
    # allocate number * sizeof(double) bytes of memory
    cdef double *my_array = <double *> malloc(number * sizeof(double))
    if not my_array:
        raise MemoryError()

    try:
        ran = random.normalvariate
        for i in range(number):
            my_array[i] = ran(0, 1)

        # ... let's just assume we do some more heavy C calculations here to make up
        # for the work that it takes to pack the C double values into Python float
        # objects below, right after throwing away the existing objects above.

        return [x for x in my_array[:number]]
    finally:
        # return the previously allocated memory to the system
        free(my_array)

Note that the C-API functions for allocating memory on the Python heap are generally preferred over the low-level C functions above as the memory they provide is actually accounted for in Python’s internal memory management system. They also have special optimisations for smaller memory blocks, which speeds up their allocation by avoiding costly operating system calls.

The C-API functions can be found in the cpython.mem standard declarations file:

from cpython.mem cimport PyMem_Malloc, PyMem_Realloc, PyMem_Free

Their interface and usage is identical to that of the corresponding low-level C functions.

One important thing to remember is that blocks of memory obtained with malloc() or PyMem_Malloc() must be manually released with a corresponding call to free() or PyMem_Free() when they are no longer used (and must always use the matching type of free function). Otherwise, they won’t be reclaimed until the python process exits. This is called a memory leak.

If a chunk of memory needs a larger lifetime than can be managed by a try..finally block, another helpful idiom is to tie its lifetime to a Python object to leverage the Python runtime’s memory management, e.g.:

from cpython.mem cimport PyMem_Malloc, PyMem_Realloc, PyMem_Free

cdef class SomeMemory:

    cdef double* data

    def __cinit__(self, size_t number):
        # allocate some memory (uninitialised, may contain arbitrary data)
        self.data = <double*> PyMem_Malloc(number * sizeof(double))
        if not self.data:
            raise MemoryError()

    def resize(self, size_t new_number):
        # Allocates new_number * sizeof(double) bytes,
        # preserving the current content and making a best-effort to
        # re-use the original data location.
        mem = <double*> PyMem_Realloc(self.data, new_number * sizeof(double))
        if not mem:
            raise MemoryError()
        # Only overwrite the pointer if the memory was really reallocated.
        # On error (mem is NULL), the original memory has not been freed.
        self.data = mem

    def __dealloc__(self):
        PyMem_Free(self.data)  # no-op if self.data is NULL

Pure Python Mode

In some cases, it’s desirable to speed up Python code without losing the ability to run it with the Python interpreter. While pure Python scripts can be compiled with Cython, it usually results only in a speed gain of about 20%-50%.

To go beyond that, Cython provides language constructs to add static typing and cythonic functionalities to a Python module to make it run much faster when compiled, while still allowing it to be interpreted. This is accomplished via an augmenting .pxd file, via Python type annotations (following PEP 484 and PEP 526), and/or via special functions and decorators available after importing the magic cython module. All three ways can be combined at need, although projects would commonly decide on a specific way to keep the static type information easy to manage.

Although it is not typically recommended over writing straight Cython code in a .pyx file, there are legitimate reasons to do this - easier testing and debugging, collaboration with pure Python developers, etc. In pure mode, you are more or less restricted to code that can be expressed (or at least emulated) in Python, plus static type declarations. Anything beyond that can only be done in .pyx files with extended language syntax, because it depends on features of the Cython compiler.

Augmenting .pxd

Using an augmenting .pxd allows one to leave the original .py file completely untouched. On the other hand, one needs to maintain both the .pxd and the .py files to keep them in sync.

While declarations in a .pyx file must correspond exactly with those of a .pxd file with the same name (and any contradiction results in a compile time error, see pxd files), the untyped definitions in a .py file can be overridden and augmented with static types by the more specific ones present in a .pxd.

If a .pxd file is found with the same name as the .py file being compiled, it will be searched for cdef classes and cdef/cpdef functions and methods. The compiler will then convert the corresponding classes/functions/methods in the .py file to be of the declared type. Thus if one has a file A.py:

def myfunction(x, y=2):
    a = x - y
    return a + x * y

def _helper(a):
    return a + 1

class A:
    def __init__(self, b=0):
        self.a = 3
        self.b = b

    def foo(self, x):
        print(x + _helper(1.0))

and adds A.pxd:

cpdef int myfunction(int x, int y=*)
cdef double _helper(double a)

cdef class A:
    cdef public int a, b
    cpdef foo(self, double x)

then Cython will compile the A.py as if it had been written as follows:

cpdef int myfunction(int x, int y=2):
    a = x - y
    return a + x * y

cdef double _helper(double a):
    return a + 1

cdef class A:
    cdef public int a, b
    def __init__(self, b=0):
        self.a = 3
        self.b = b

    cpdef foo(self, double x):
        print(x + _helper(1.0))

Notice how, in order to provide the Python wrappers to the definitions in the .pxd, that is, to make them accessible from Python,

  • Python visible function signatures must be declared as cpdef (with default arguments replaced by a * to avoid repetition):

    cpdef int myfunction(int x, int y=*)
    
  • C function signatures of internal functions can be declared as cdef:

    cdef double _helper(double a)
    
  • cdef classes (extension types) are declared as cdef class;

  • cdef class attributes must be declared as cdef public if read/write Python access is needed, cdef readonly for read-only Python access, or plain cdef for internal C level attributes;

  • cdef class methods must be declared as cpdef for Python visible methods or cdef for internal C methods.

In the example above, the type of the local variable a in myfunction() is not fixed and will thus be a Python object. To statically type it, one can use Cython’s @cython.locals decorator (see Magic Attributes, and Magic Attributes within the .pxd).

Normal Python (def) functions cannot be declared in .pxd files. It is therefore currently impossible to override the types of plain Python functions in .pxd files, e.g. to override types of their local variables. In most cases, declaring them as cpdef will work as expected.

Magic Attributes

Special decorators are available from the magic cython module that can be used to add static typing within the Python file, while being ignored by the interpreter.

This option adds the cython module dependency to the original code, but does not require to maintain a supplementary .pxd file. Cython provides a fake version of this module as Cython.Shadow, which is available as cython.py when Cython is installed, but can be copied to be used by other modules when Cython is not installed.

“Compiled” switch
  • compiled is a special variable which is set to True when the compiler runs, and False in the interpreter. Thus, the code

    import cython
    
    if cython.compiled:
        print("Yep, I'm compiled.")
    else:
        print("Just a lowly interpreted script.")
    

    will behave differently depending on whether the code is executed as a compiled extension module (.so/.pyd) or as a plain .py file.

Static typing
  • cython.declare declares a typed variable in the current scope, which can be used in place of the cdef type var [= value] construct. This has two forms, the first as an assignment (useful as it creates a declaration in interpreted mode as well):

    import cython
    
    x = cython.declare(cython.int)              # cdef int x
    y = cython.declare(cython.double, 0.57721)  # cdef double y = 0.57721
    

    and the second mode as a simple function call:

    import cython
    
    cython.declare(x=cython.int, y=cython.double)  # cdef int x; cdef double y
    

    It can also be used to define extension type private, readonly and public attributes:

    import cython
    
    
    @cython.cclass
    class A:
        cython.declare(a=cython.int, b=cython.int)
        c = cython.declare(cython.int, visibility='public')
        d = cython.declare(cython.int)  # private by default.
        e = cython.declare(cython.int, visibility='readonly')
    
        def __init__(self, a, b, c, d=5, e=3):
            self.a = a
            self.b = b
            self.c = c
            self.d = d
            self.e = e
    
  • @cython.locals is a decorator that is used to specify the types of local variables in the function body (including the arguments):

    import cython
    
    @cython.locals(a=cython.long, b=cython.long, n=cython.longlong)
    def foo(a, b, x, y):
        n = a * b
        # ...
    
  • @cython.returns(<type>) specifies the function’s return type.

  • @cython.exceptval(value=None, *, check=False) specifies the function’s exception return value and exception check semantics as follows:

    @exceptval(-1)               # cdef int func() except -1:
    @exceptval(-1, check=False)  # cdef int func() except -1:
    @exceptval(check=True)       # cdef int func() except *:
    @exceptval(-1, check=True)   # cdef int func() except? -1:
    
  • Python annotations can be used to declare argument types, as shown in the following example. To avoid conflicts with other kinds of annotation usages, this can be disabled with the directive annotation_typing=False.

    import cython
    
    def func(foo: dict, bar: cython.int) -> tuple:
        foo["hello world"] = 3 + bar
        return foo, 5
    

    This can be combined with the @cython.exceptval() decorator for non-Python return types:

    import cython
    
    @cython.exceptval(-1)
    def func(x: cython.int) -> cython.int:
        if x < 0:
            raise ValueError("need integer >= 0")
        return x + 1
    

    Since version 0.27, Cython also supports the variable annotations defined in PEP 526. This allows declaring the types of variables in a Python 3.6 compatible way as follows:

    import cython
    
    def func():
        # Cython types are evaluated as for cdef declarations
        x: cython.int               # cdef int x
        y: cython.double = 0.57721  # cdef double y = 0.57721
        z: cython.float = 0.57721   # cdef float z  = 0.57721
    
        # Python types shadow Cython types for compatibility reasons
        a: float = 0.54321          # cdef double a = 0.54321
        b: int = 5                  # cdef object b = 5
        c: long = 6                 # cdef object c = 6
        pass
    
    @cython.cclass
    class A:
        a: cython.int
        b: cython.int
    
        def __init__(self, b=0):
            self.a = 3
            self.b = b
    

    There is currently no way to express the visibility of object attributes.

C types

There are numerous types built into the Cython module. It provides all the standard C types, namely char, short, int, long, longlong as well as their unsigned versions uchar, ushort, uint, ulong, ulonglong. The special bint type is used for C boolean values and Py_ssize_t for (signed) sizes of Python containers.

For each type, there are pointer types p_int, pp_int, etc., up to three levels deep in interpreted mode, and infinitely deep in compiled mode. Further pointer types can be constructed with cython.pointer(cython.int), and arrays as cython.int[10]. A limited attempt is made to emulate these more complex types, but only so much can be done from the Python language.

The Python types int, long and bool are interpreted as C int, long and bint respectively. Also, the Python builtin types list, dict, tuple, etc. may be used, as well as any user defined types.

Typed C-tuples can be declared as a tuple of C types.

Extension types and cdef functions
  • The class decorator @cython.cclass creates a cdef class.
  • The function/method decorator @cython.cfunc creates a cdef function.
  • @cython.ccall creates a cpdef function, i.e. one that Cython code can call at the C level.
  • @cython.locals declares local variables (see above). It can also be used to declare types for arguments, i.e. the local variables that are used in the signature.
  • @cython.inline is the equivalent of the C inline modifier.
  • @cython.final terminates the inheritance chain by preventing a type from being used as a base class, or a method from being overridden in subtypes. This enables certain optimisations such as inlined method calls.

Here is an example of a cdef function:

@cython.cfunc
@cython.returns(cython.bint)
@cython.locals(a=cython.int, b=cython.int)
def c_compare(a,b):
    return a == b
Further Cython functions and declarations
  • address is used in place of the & operator:

    cython.declare(x=cython.int, x_ptr=cython.p_int)
    x_ptr = cython.address(x)
    
  • sizeof emulates the sizeof operator. It can take both types and expressions.

    cython.declare(n=cython.longlong)
    print(cython.sizeof(cython.longlong))
    print(cython.sizeof(n))
    
  • struct can be used to create struct types:

    MyStruct = cython.struct(x=cython.int, y=cython.int, data=cython.double)
    a = cython.declare(MyStruct)
    

    is equivalent to the code:

    cdef struct MyStruct:
        int x
        int y
        double data
    
    cdef MyStruct a
    
  • union creates union types with exactly the same syntax as struct.

  • typedef defines a type under a given name:

    T = cython.typedef(cython.p_int)   # ctypedef int* T
    
  • cast will (unsafely) reinterpret an expression type. cython.cast(T, t) is equivalent to <T>t. The first argument must be a type, the second the expression to cast. Specifying the optional keyword argument typecheck=True has the semantics of <T?>t.

    t1 = cython.cast(T, t)
    t2 = cython.cast(T, t, typecheck=True)
    
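For readers running uncompiled code, the stdlib ctypes module offers a rough interpreted analogue of the sizeof emulation above; this is a sketch for illustration, not a Cython feature:

```python
import ctypes

# ctypes.sizeof reports the size of a C type, similar in spirit to
# cython.sizeof applied to plain C types
print(ctypes.sizeof(ctypes.c_int))       # typically 4
print(ctypes.sizeof(ctypes.c_longlong))  # 8 on common platforms
```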
Magic Attributes within the .pxd

The special cython module can also be imported and used within the augmenting .pxd file. For example, the following Python file dostuff.py:

def dostuff(n):
    t = 0
    for i in range(n):
        t += i
    return t

can be augmented with the following .pxd file dostuff.pxd:

import cython

@cython.locals(t=cython.int, i=cython.int)
cpdef int dostuff(int n)

The cython.declare() function can be used to specify types for global variables in the augmenting .pxd file.

Tips and Tricks

Calling C functions

Normally, it isn’t possible to call C functions in pure Python mode as there is no general way to support it in normal (uncompiled) Python. However, in cases where an equivalent Python function exists, this can be achieved by combining C function coercion with a conditional import as follows:

# mymodule.pxd

# declare a C function as "cpdef" to export it to the module
cdef extern from "math.h":
    cpdef double sin(double x)
# mymodule.py

import cython

# override with Python import if not in compiled code
if not cython.compiled:
    from math import sin

# calls sin() from math.h when compiled with Cython and math.sin() in Python
print(sin(0))

Note that the “sin” function will show up in the module namespace of “mymodule” here (i.e. there will be a mymodule.sin() function). You can mark it as an internal name according to Python conventions by renaming it to “_sin” in the .pxd file as follows:

cdef extern from "math.h":
    cpdef double _sin "sin" (double x)

You would then also change the Python import to from math import sin as _sin to make the names match again.
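
The conditional-import pattern itself can be exercised as plain Python: cython.compiled is False when the code runs uncompiled, and the try/except below additionally covers the case where the cython module is not installed at all (an assumption added for illustration):

```python
try:
    import cython
    compiled = cython.compiled
except ImportError:
    compiled = False  # no cython module available: certainly not compiled

# fall back to the Python implementation when not compiled
if not compiled:
    from math import sin

print(sin(0.0))  # 0.0
```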

Using C arrays for fixed size lists

C arrays can automatically coerce to Python lists or tuples. This can be exploited to replace fixed size Python lists in Python code by C arrays when compiled. An example:

import cython


@cython.locals(counts=cython.int[10], digit=cython.int)
def count_digits(digits):
    """
    >>> digits = '01112222333334445667788899'
    >>> count_digits(map(int, digits))
    [1, 3, 4, 5, 3, 1, 2, 2, 3, 2]
    """
    counts = [0] * 10
    for digit in digits:
        assert 0 <= digit <= 9
        counts[digit] += 1
    return counts

In normal Python, this will use a Python list to collect the counts, whereas Cython will generate C code that uses a C array of C ints.

Working with NumPy

Note

Cython 0.16 introduced typed memoryviews as a successor to the NumPy integration described here. They are easier to use than the buffer syntax below, have less overhead, and can be passed around without requiring the GIL. They should be preferred to the syntax presented in this page. See Cython for NumPy users.

You can use NumPy from Cython exactly the same as in regular Python, but by doing so you are losing potentially high speedups because Cython has support for fast access to NumPy arrays. Let’s see how this works with a simple example.

The code below does 2D discrete convolution of an image with a filter (and I’m sure you can do better; let it serve for demonstration purposes). It is both valid Python and valid Cython code. I’ll refer to it as both convolve_py.py for the Python version and convolve1.pyx for the Cython version – Cython uses “.pyx” as its file suffix.

import numpy as np


def naive_convolve(f, g):
    # f is an image and is indexed by (v, w)
    # g is a filter kernel and is indexed by (s, t),
    #   it needs odd dimensions
    # h is the output image and is indexed by (x, y),
    #   it is not cropped
    if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1:
        raise ValueError("Only odd dimensions on filter supported")
    # smid and tmid are number of pixels between the center pixel
    # and the edge, ie for a 5x5 filter they will be 2.
    #
    # The output size is calculated by adding smid, tmid to each
    # side of the dimensions of the input image.
    vmax = f.shape[0]
    wmax = f.shape[1]
    smax = g.shape[0]
    tmax = g.shape[1]
    smid = smax // 2
    tmid = tmax // 2
    xmax = vmax + 2 * smid
    ymax = wmax + 2 * tmid
    # Allocate result image.
    h = np.zeros([xmax, ymax], dtype=f.dtype)
    # Do convolution
    for x in range(xmax):
        for y in range(ymax):
            # Calculate pixel value for h at (x,y). Sum one component
            # for each pixel (s, t) of the filter g.
            s_from = max(smid - x, -smid)
            s_to = min((xmax - x) - smid, smid + 1)
            t_from = max(tmid - y, -tmid)
            t_to = min((ymax - y) - tmid, tmid + 1)
            value = 0
            for s in range(s_from, s_to):
                for t in range(t_from, t_to):
                    v = x - smid + s
                    w = y - tmid + t
                    value += g[smid - s, tmid - t] * f[v, w]
            h[x, y] = value
    return h

This should be compiled to produce yourmod.so (on Linux systems; on Windows systems it will be yourmod.pyd). We run a Python session to test both the Python version (imported from the .py file) and the compiled Cython module.

In [1]: import numpy as np
In [2]: import convolve_py
In [3]: convolve_py.naive_convolve(np.array([[1, 1, 1]], dtype=np.int),
...     np.array([[1],[2],[1]], dtype=np.int))
Out [3]:
array([[1, 1, 1],
    [2, 2, 2],
    [1, 1, 1]])
In [4]: import convolve1
In [5]: convolve1.naive_convolve(np.array([[1, 1, 1]], dtype=np.int),
...     np.array([[1],[2],[1]], dtype=np.int))
Out [5]:
array([[1, 1, 1],
    [2, 2, 2],
    [1, 1, 1]])
In [11]: N = 100
In [12]: f = np.arange(N*N, dtype=np.int).reshape((N,N))
In [13]: g = np.arange(81, dtype=np.int).reshape((9, 9))
In [19]: %timeit -n2 -r3 convolve_py.naive_convolve(f, g)
2 loops, best of 3: 1.86 s per loop
In [20]: %timeit -n2 -r3 convolve1.naive_convolve(f, g)
2 loops, best of 3: 1.41 s per loop

There’s not such a huge difference yet, because the C code still does exactly what the Python interpreter does (meaning, for instance, that a new object is allocated for each number used). Look at the generated HTML file to see what is needed for even the simplest statements; you get the point quickly. We need to give Cython more information; we need to add types.

Adding types

To add types we use custom Cython syntax, so we are now breaking Python source compatibility. Consider this code (read the comments!) :

# tag: numpy_old
# You can ignore the previous line.
# It's for internal testing of the cython documentation.

import numpy as np

# "cimport" is used to import special compile-time information
# about the numpy module (this is stored in a file numpy.pxd which is
# currently part of the Cython distribution).
cimport numpy as np

# We now need to fix a datatype for our arrays. I've used the variable
# DTYPE for this, which is assigned to the usual NumPy runtime
# type info object.
DTYPE = np.int

# "ctypedef" assigns a corresponding compile-time type to DTYPE_t. For
# every type in the numpy module there's a corresponding compile-time
# type with a _t-suffix.
ctypedef np.int_t DTYPE_t

# "def" can type its arguments but not have a return type. The type of the
# arguments for a "def" function is checked at run-time when entering the
# function.
#
# The arrays f, g and h are typed as "np.ndarray" instances. The only effect
# this has is to a) insert checks that the function arguments really are
# NumPy arrays, and b) make some attribute access like f.shape[0] much
# more efficient. (In this example this doesn't matter though.)
def naive_convolve(np.ndarray f, np.ndarray g):
    if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1:
        raise ValueError("Only odd dimensions on filter supported")
    assert f.dtype == DTYPE and g.dtype == DTYPE

    # The "cdef" keyword is also used within functions to type variables. It
    # can only be used at the top indentation level (there are non-trivial
    # problems with allowing them in other places, though we'd love to see
    # good and thought out proposals for it).
    #
    # For the indices, the "int" type is used. This corresponds to a C int,
    # other C types (like "unsigned int") could have been used instead.
    # Purists could use "Py_ssize_t" which is the proper Python type for
    # array indices.
    cdef int vmax = f.shape[0]
    cdef int wmax = f.shape[1]
    cdef int smax = g.shape[0]
    cdef int tmax = g.shape[1]
    cdef int smid = smax // 2
    cdef int tmid = tmax // 2
    cdef int xmax = vmax + 2 * smid
    cdef int ymax = wmax + 2 * tmid
    cdef np.ndarray h = np.zeros([xmax, ymax], dtype=DTYPE)
    cdef int x, y, s, t, v, w

    # It is very important to type ALL your variables. You do not get any
    # warnings if not, only much slower code (they are implicitly typed as
    # Python objects).
    cdef int s_from, s_to, t_from, t_to

    # For the value variable, we want to use the same data type as is
    # stored in the array, so we use "DTYPE_t" as defined above.
    # NB! An important side-effect of this is that if "value" overflows its
    # datatype size, it will simply wrap around like in C, rather than raise
    # an error like in Python.
    cdef DTYPE_t value
    for x in range(xmax):
        for y in range(ymax):
            s_from = max(smid - x, -smid)
            s_to = min((xmax - x) - smid, smid + 1)
            t_from = max(tmid - y, -tmid)
            t_to = min((ymax - y) - tmid, tmid + 1)
            value = 0
            for s in range(s_from, s_to):
                for t in range(t_from, t_to):
                    v = x - smid + s
                    w = y - tmid + t
                    value += g[smid - s, tmid - t] * f[v, w]
            h[x, y] = value
    return h

After building this and continuing my (very informal) benchmarks, I get:

In [21]: import convolve2
In [22]: %timeit -n2 -r3 convolve2.naive_convolve(f, g)
2 loops, best of 3: 828 ms per loop

Efficient indexing

There’s still a bottleneck killing performance, and that is the array lookups and assignments. The []-operator still uses full Python operations – what we would like to do instead is to access the data buffer directly at C speed.

What we need to do then is to type the contents of the ndarray objects. We do this with a special “buffer” syntax which must be told the datatype (first argument) and number of dimensions (“ndim” keyword-only argument, if not provided then one-dimensional is assumed).

These are the needed changes:

...
def naive_convolve(np.ndarray[DTYPE_t, ndim=2] f, np.ndarray[DTYPE_t, ndim=2] g):
...
cdef np.ndarray[DTYPE_t, ndim=2] h = ...

Usage:

In [18]: import convolve3
In [19]: %timeit -n3 -r100 convolve3.naive_convolve(f, g)
3 loops, best of 100: 11.6 ms per loop

Note the importance of this change.

Gotcha: This efficient indexing only affects certain index operations, namely those with exactly ndim number of typed integer indices. So if v for instance isn’t typed, then the lookup f[v, w] isn’t optimized. On the other hand this means that you can continue using Python objects for sophisticated dynamic slicing etc. just as when the array is not typed.

Tuning indexing further

The array lookups are still slowed down by two factors:

  1. Bounds checking is performed.

  2. Negative indices are checked for and handled correctly. The code above is explicitly coded so that it doesn’t use negative indices, and it (hopefully) always accesses within bounds. We can add a decorator to disable bounds checking:

    ...
    cimport cython
    @cython.boundscheck(False) # turn off bounds-checking for entire function
    @cython.wraparound(False)  # turn off negative index wrapping for entire function
    def naive_convolve(np.ndarray[DTYPE_t, ndim=2] f, np.ndarray[DTYPE_t, ndim=2] g):
    ...
    

Now bounds checking is not performed (and, as a side-effect, if you *do* happen to access out of bounds you will in the best case crash your program and in the worst case corrupt data). It is possible to switch bounds-checking mode in many ways, see Compiler directives for more information.

Also, we’ve disabled the check to wrap negative indices (e.g. g[-1] giving the last value). As with disabling bounds checking, bad things will happen if we try to actually use negative indices with this disabled.

The function call overhead now starts to play a role, so we compare the latter two examples with larger N:

In [11]: %timeit -n3 -r100 convolve4.naive_convolve(f, g)
3 loops, best of 100: 5.97 ms per loop
In [12]: N = 1000
In [13]: f = np.arange(N*N, dtype=np.int).reshape((N,N))
In [14]: g = np.arange(81, dtype=np.int).reshape((9, 9))
In [17]: %timeit -n1 -r10 convolve3.naive_convolve(f, g)
1 loops, best of 10: 1.16 s per loop
In [18]: %timeit -n1 -r10 convolve4.naive_convolve(f, g)
1 loops, best of 10: 597 ms per loop

(Also this is a mixed benchmark as the result array is allocated within the function call.)

Warning

Speed comes with some cost. Especially it can be dangerous to set typed objects (like f, g and h in our sample code) to None. Setting such objects to None is entirely legal, but all you can do with them is check whether they are None. All other use (attribute lookup or indexing) can potentially segfault or corrupt data (rather than raising exceptions as they would in Python).

The actual rules are a bit more complicated but the main message is clear: Do not use typed objects without knowing that they are not set to None.

More generic code

It would be possible to do:

def naive_convolve(object[DTYPE_t, ndim=2] f, ...):

i.e. use object rather than np.ndarray. Under Python 3 this allows your algorithm to work with any library supporting the buffer interface; under Python 2.x, support for e.g. the Python Imaging Library could easily be added if someone is interested.

There is some speed penalty to this though (as one makes more assumptions compile-time if the type is set to np.ndarray, specifically it is assumed that the data is stored in pure strided mode and not in indirect mode).

Working with Python arrays

Python has a builtin array module supporting dynamic 1-dimensional arrays of primitive types. It is possible to access the underlying C array of a Python array from within Cython. At the same time they are ordinary Python objects which can be stored in lists and serialized between processes when using multiprocessing.

Compared to the manual approach with malloc() and free(), this gives the safe and automatic memory management of Python, and compared to a Numpy array there is no need to install a dependency, as the array module is built into both Python and Cython.
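
For reference, the array module behaves the same way in plain Python; a quick interpreted sketch of constructing and manipulating one:

```python
from array import array

# 'i' is the type code for a signed C int; see the array module docs
a = array('i', [1, 2, 3])
a.append(4)        # an ordinary Python object with list-like methods
print(a.tolist())  # [1, 2, 3, 4]
print(a.itemsize)  # size in bytes of one element (typically 4 for 'i')
```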

Safe usage with memory views

from cpython cimport array
import array
cdef array.array a = array.array('i', [1, 2, 3])
cdef int[:] ca = a

print(ca[0])

NB: the import brings the regular Python array object into the namespace while the cimport adds functions accessible from Cython.

A Python array is constructed with a type signature and sequence of initial values. For the possible type signatures, refer to the Python documentation for the array module.

Notice that when a Python array is assigned to a variable typed as memory view, there will be a slight overhead to construct the memory view. However, from that point on the variable can be passed to other functions without overhead, so long as it is typed:

from cpython cimport array
import array

cdef array.array a = array.array('i', [1, 2, 3])
cdef int[:] ca = a

cdef int overhead(object a):
    cdef int[:] ca = a
    return ca[0]

cdef int no_overhead(int[:] ca):
    return ca[0]

print(overhead(a))  # new memory view will be constructed, overhead
print(no_overhead(ca))  # ca is already a memory view, so no overhead

Zero-overhead, unsafe access to raw C pointer

To avoid any overhead and to be able to pass a C pointer to other functions, it is possible to access the underlying contiguous array as a pointer. There is no type or bounds checking, so be careful to use the right type and signedness.

from cpython cimport array
import array

cdef array.array a = array.array('i', [1, 2, 3])

# access underlying pointer:
print(a.data.as_ints[0])

from libc.string cimport memset

memset(a.data.as_voidptr, 0, len(a) * sizeof(int))

Note that any length-changing operation on the array object may invalidate the pointer.

Cloning, extending arrays

To avoid having to use the array constructor from the Python module, it is possible to create a new array with the same type as a template, and preallocate a given number of elements. The array is initialized to zero when requested.

from cpython cimport array
import array

cdef array.array int_array_template = array.array('i', [])
cdef array.array newarray

# create an array with 3 elements with same type as template
newarray = array.clone(int_array_template, 3, zero=False)

An array can also be extended and resized; this avoids repeated memory reallocation which would occur if elements would be appended or removed one by one.

from cpython cimport array
import array

cdef array.array a = array.array('i', [1, 2, 3])
cdef array.array b = array.array('i', [4, 5, 6])

# extend a with b, resize as needed
array.extend(a, b)
# resize a, leaving just original three elements
array.resize(a, len(a) - len(b))
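
When the same module runs uncompiled, the plain array object offers equivalent (if less efficient) operations; a sketch of the interpreted counterpart:

```python
from array import array

a = array('i', [1, 2, 3])
b = array('i', [4, 5, 6])

a.extend(b)              # plain-Python counterpart of array.extend(a, b)
print(a.tolist())        # [1, 2, 3, 4, 5, 6]

del a[len(a) - len(b):]  # shrink back, similar in effect to array.resize
print(a.tolist())        # [1, 2, 3]
```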

API reference

Data fields
data.as_voidptr
data.as_chars
data.as_schars
data.as_uchars
data.as_shorts
data.as_ushorts
data.as_ints
data.as_uints
data.as_longs
data.as_ulongs
data.as_longlongs  # requires Python >=3
data.as_ulonglongs  # requires Python >=3
data.as_floats
data.as_doubles
data.as_pyunicodes

Direct access to the underlying contiguous C array, with given type; e.g., myarray.data.as_ints.
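
In interpreted Python there is no data attribute, but a memoryview over the same buffer gives comparable element access (a rough analogue for running uncompiled, not a zero-overhead one):

```python
from array import array

a = array('i', [1, 2, 3])
mv = memoryview(a)  # views the same underlying C buffer
mv[0] = 42          # writes through to the array
print(a.tolist())   # [42, 2, 3]
```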

Functions

The following functions are available to Cython from the array module:

int resize(array self, Py_ssize_t n) except -1

Fast resize / realloc. Not suitable for repeated, small increments; resizes the underlying array to exactly the requested amount.

int resize_smart(array self, Py_ssize_t n) except -1

Efficient for small increments; uses a growth pattern that delivers amortized linear-time appends.

cdef inline array clone(array template, Py_ssize_t length, bint zero)

Fast creation of a new array, given a template array. The type will be the same as the template’s. If zero is True, the new array will be initialized with zeroes.

cdef inline array copy(array self)

Make a copy of an array.

cdef inline int extend_buffer(array self, char* stuff, Py_ssize_t n) except -1

Efficient appending of new data of the same type (e.g. of the same array type). n: number of elements (not number of bytes!)

cdef inline int extend(array self, array other) except -1

Extend array with data from another array; types must match.

cdef inline void zero(array self)

Set all elements of array to zero.

Further reading

The main documentation is located at http://docs.cython.org/. Some recent features might not have documentation written yet, in such cases some notes can usually be found in the form of a Cython Enhancement Proposal (CEP) on https://github.com/cython/cython/wiki/enhancements.

[Seljebotn09] contains more information about Cython and NumPy arrays. If you intend to use Cython code in a multi-threaded setting, it is essential to read up on Cython’s features for managing the Global Interpreter Lock (the GIL). The same paper contains an explanation of the GIL, and the main documentation explains the Cython features for managing it.

Finally, don’t hesitate to ask questions (or post reports on successes!) on the Cython users mailing list [UserList]. The Cython developer mailing list, [DevList], is also open to everybody, but focusses on core development issues. Feel free to use it to report a clear bug, to ask for guidance if you have time to spare to develop Cython, or if you have suggestions for future development.

[DevList]Cython developer mailing list: https://mail.python.org/mailman/listinfo/cython-devel
[Seljebotn09]D. S. Seljebotn, Fast numerical computations with Cython, Proceedings of the 8th Python in Science Conference, 2009.
[UserList]Cython users mailing list: https://groups.google.com/group/cython-users

Appendix: Installing MinGW on Windows

  1. Download the MinGW installer from http://www.mingw.org/wiki/HOWTO_Install_the_MinGW_GCC_Compiler_Suite. (As of this writing, the download link is a bit difficult to find; it’s under “About” in the menu on the left-hand side). You want the file entitled “Automated MinGW Installer” (currently version 5.1.4).

  2. Run it and install MinGW. Only the basic package is strictly needed for Cython, although you might want to grab at least the C++ compiler as well.

  3. You need to set up Windows’ “PATH” environment variable so that it includes e.g. “c:\mingw\bin” (if you installed MinGW to “c:\mingw”). The following web-page describes the procedure in Windows XP (the Vista procedure is similar): https://support.microsoft.com/kb/310519

  4. Finally, tell Python to use MinGW as the default compiler (otherwise it will try for Visual C). If Python is installed to “c:\Python27”, create a file named “c:\Python27\Lib\distutils\distutils.cfg” containing:

    [build]
    compiler = mingw32
    

The [WinInst] wiki page contains updated information about this procedure. Any contributions towards making the Windows install process smoother are welcome; it is an unfortunate fact that none of the regular Cython developers have convenient access to Windows.

[WinInst]https://github.com/cython/cython/wiki/CythonExtensionsOnWindows

Users Guide

Contents:

Language Basics

Declaring Data Types

As a dynamic language, Python encourages a programming style of considering classes and objects in terms of their methods and attributes, more than where they fit into the class hierarchy.

This can make Python a very relaxed and comfortable language for rapid development, but with a price - the ‘red tape’ of managing data types is dumped onto the interpreter. At run time, the interpreter does a lot of work searching namespaces, fetching attributes and parsing argument and keyword tuples. This run-time ‘late binding’ is a major cause of Python’s relative slowness compared to ‘early binding’ languages such as C++.

However with Cython it is possible to gain significant speed-ups through the use of ‘early binding’ programming techniques.

Note

Typing is not a necessity

Providing static typing to parameters and variables is a convenience that can speed up your code, but it is not a necessity. Optimize where and when needed. In fact, typing can slow down your code in the case where the typing does not allow optimizations but where Cython still needs to check that the type of some object matches the declared type.

C variable and type definitions

The cdef statement is used to declare C variables, either local or module-level:

cdef int i, j, k
cdef float f, g[42], *h

and C struct, union or enum types:

cdef struct Grail:
    int age
    float volume

cdef union Food:
    char *spam
    float *eggs

cdef enum CheeseType:
    cheddar, edam,
    camembert

cdef enum CheeseState:
    hard = 1
    soft = 2
    runny = 3

See also Styles of struct, union and enum declaration

Note

Structs can be declared as cdef packed struct, which has the same effect as the C directive #pragma pack(1).

Declaring an enum as cpdef will create a PEP 435-style Python wrapper:

cpdef enum CheeseState:
    hard = 1
    soft = 2
    runny = 3
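
The resulting wrapper behaves like a standard Python enum. As a plain-Python sketch of what PEP 435-style members look like (using the stdlib enum module for illustration, not the actual generated Cython type):

```python
from enum import IntEnum

# interpreted stand-in for the wrapper a `cpdef enum` produces
class CheeseState(IntEnum):
    hard = 1
    soft = 2
    runny = 3

print(CheeseState.soft.name)   # soft
print(int(CheeseState.runny))  # 3
```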

There is currently no special syntax for defining a constant, but you can use an anonymous enum declaration for this purpose, for example,:

cdef enum:
    tons_of_spam = 3

Note

the words struct, union and enum are used only when defining a type, not when referring to it. For example, to declare a variable pointing to a Grail you would write:

cdef Grail *gp

and not:

cdef struct Grail *gp # WRONG

There is also a ctypedef statement for giving names to types, e.g.:

ctypedef unsigned long ULong

ctypedef int* IntPtr

It is also possible to declare functions with cdef, making them C functions.

cdef int eggs(unsigned long l, float f):
    ...

You can read more about them in Python functions vs. C functions.

You can declare classes with cdef, making them Extension Types. Those will have a behavior very close to Python classes, but they are faster because they use a struct internally to store attributes.

Here is a simple example:

from __future__ import print_function

cdef class Shrubbery:
    cdef int width, height

    def __init__(self, w, h):
        self.width = w
        self.height = h

    def describe(self):
        print("This shrubbery is", self.width,
              "by", self.height, "cubits.")

You can read more about them in Extension Types.

Types

Cython uses the normal C syntax for C types, including pointers. It provides all the standard C types, namely char, short, int, long, long long as well as their unsigned versions, e.g. unsigned int. The special bint type is used for C boolean values (int with 0/non-0 values for False/True) and Py_ssize_t for (signed) sizes of Python containers.

Pointer types are constructed as in C, by appending a * to the base type they point to, e.g. int** for a pointer to a pointer to a C int. Arrays use the normal C array syntax, e.g. int[10], and the size must be known at compile time for stack allocated arrays. Cython doesn’t support variable length arrays from C99. Note that Cython uses array access for pointer dereferencing, as *x is not valid Python syntax, whereas x[0] is.

Also, the Python types list, dict, tuple, etc. may be used for static typing, as well as any user defined Extension Types. For example:

cdef list foo = []

This requires an exact match of the class; it does not allow subclasses. This allows Cython to optimize code by accessing internals of the builtin class. For this kind of typing, Cython internally uses a C variable of type PyObject*. The Python types int, long, and float are not available for static typing and are instead interpreted as C int, long, and float respectively, as statically typing variables with these Python types has zero advantages.

Cython provides an accelerated and typed equivalent of a Python tuple, the ctuple. A ctuple is assembled from any valid C types. For example:

cdef (double, int) bar

They compile down to C-structures and can be used as efficient alternatives to Python tuples.

While these C types can be vastly faster, they have C semantics. Specifically, the integer types overflow, and the C float type only has 32 bits of precision (as opposed to the 64-bit C double, which Python floats wrap and which is typically what one wants). If you want to use these numeric Python types, simply omit the type declaration and let them be objects.
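
These C semantics can be previewed from plain Python with the stdlib struct and ctypes modules (a sketch for illustration; compiled Cython code exhibits the behavior directly):

```python
import struct
import ctypes

def as_c_float(x):
    # round-trip a Python float (a C double) through a 32-bit C float
    return struct.unpack('f', struct.pack('f', x))[0]

print(as_c_float(0.1) == 0.1)       # False: 32 bits lose precision

# a C int32 wraps around on overflow instead of raising like Python
print(ctypes.c_int32(2**31).value)  # -2147483648
```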

It is also possible to declare Extension Types (declared with cdef class). This does allow subclasses. This typing is mostly used to access cdef methods and attributes of the extension type. The C code uses a variable which is a pointer to a structure of the specific type, something like struct MyExtensionTypeObject*.

Grouping multiple C declarations

If you have a series of declarations that all begin with cdef, you can group them into a cdef block like this:

from __future__ import print_function

cdef:
    struct Spam:
        int tons

    int i
    float a
    Spam *p

    void f(Spam *s):
        print(s.tons, "Tons of spam")

Python functions vs. C functions

There are two kinds of function definition in Cython:

Python functions are defined using the def statement, as in Python. They take Python objects as parameters and return Python objects.

C functions are defined using the cdef statement. They take either Python objects or C values as parameters, and can return either Python objects or C values.

Within a Cython module, Python functions and C functions can call each other freely, but only Python functions can be called from outside the module by interpreted Python code. So, any functions that you want to “export” from your Cython module must be declared as Python functions using def. There is also a hybrid function, called cpdef. A cpdef can be called from anywhere, but uses the faster C calling conventions when being called from other Cython code. A cpdef can also be overridden by a Python method on a subclass or an instance attribute, even when called from Cython. If this happens, most performance gains are of course lost and even if it does not, there is a tiny overhead in calling a cpdef method from Cython compared to calling a cdef method.

Parameters of either type of function can be declared to have C data types, using normal C declaration syntax. For example,:

def spam(int i, char *s):
    ...

cdef int eggs(unsigned long l, float f):
    ...

ctuples may also be used:

cdef (int, float) chips((long, long, double) t):
    ...

When a parameter of a Python function is declared to have a C data type, it is passed in as a Python object and automatically converted to a C value, if possible. In other words, the definition of spam above is equivalent to writing:

def spam(python_i, python_s):
    cdef int i = python_i
    cdef char* s = python_s
    ...

Automatic conversion is currently only possible for numeric types, string types and structs (composed recursively of any of these types); attempting to use any other type for the parameter of a Python function will result in a compile-time error. Care must be taken with strings to ensure a reference if the pointer is to be used after the call. Structs can be obtained from Python mappings, and again care must be taken with string attributes if they are to be used after the function returns.

C functions, on the other hand, can have parameters of any type, since they’re passed in directly using a normal C function call.

Functions declared using cdef with Python object return type, like Python functions, will return a None value when execution leaves the function body without an explicit return value. This is in contrast to C/C++, which leaves the return value undefined. In the case of non-Python object return types, the equivalent of zero is returned, for example, 0 for int, False for bint and NULL for pointer types.
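For instance, the following illustrative functions (a sketch, not from the original text) each return their type's "zero" when execution falls off the end of the body:

```cython
cdef object returns_none():
    pass        # returns None, like a Python function

cdef int returns_zero():
    pass        # returns 0

cdef bint returns_false():
    pass        # returns False

cdef char* returns_null():
    pass        # returns NULL
```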

A more complete comparison of the pros and cons of these different method types can be found at Early Binding for Speed.

Python objects as parameters and return values

If no type is specified for a parameter or return value, it is assumed to be a Python object. (Note that this is different from the C convention, where it would default to int.) For example, the following defines a C function that takes two Python objects as parameters and returns a Python object:

cdef spamobjs(x, y):
    ...

Reference counting for these objects is performed automatically according to the standard Python/C API rules (i.e. borrowed references are taken as parameters and a new reference is returned).

Warning

This only applies to Cython code. Other Python packages which are implemented in C like NumPy may not follow these conventions.

The name object can also be used to explicitly declare something as a Python object. This can be useful if the name being declared would otherwise be taken as the name of a type, for example:

cdef ftang(object int):
    ...

declares a parameter called int which is a Python object. You can also use object as the explicit return type of a function, e.g.:

cdef object ftang(object int):
    ...

In the interests of clarity, it is probably a good idea to always be explicit about object parameters in C functions.

Optional Arguments

Unlike C, it is possible to use optional arguments in cdef and cpdef functions. There are differences, though, depending on whether you declare them in a .pyx file or in the corresponding .pxd file.

To avoid repetition (and potential future inconsistencies), default argument values are not visible in the declaration (in .pxd files) but only in the implementation (in .pyx files).

When in a .pyx file, the signature is the same as it is in Python itself:

from __future__ import print_function

cdef class A:
    cdef foo(self):
        print("A")

cdef class B(A):
    cdef foo(self, x=None):
        print("B", x)

cdef class C(B):
    cpdef foo(self, x=True, int k=3):
        print("C", x, k)

When in a .pxd file, the signature is written differently, as in this example: cdef foo(x=*). This is because the program calling the function just needs to know which signatures are possible in C, but doesn’t need to know the values of the default arguments:

cdef class A:
    cdef foo(self)

cdef class B(A):
    cdef foo(self, x=*)

cdef class C(B):
    cpdef foo(self, x=*, int k=*)

Note

The number of arguments may increase when subclassing, but the argument types and order must be the same, as shown in the example above.

There may be a slight performance penalty when an optional argument is overridden with one that does not have a default value.

Keyword-only Arguments

As in Python 3, def functions can have keyword-only arguments listed after a "*" parameter and before a "**" parameter if any:

def f(a, b, *args, c, d = 42, e, **kwds):
    ...


# We cannot call f with less verbosity than this.
foo = f(4, "bar", c=68, e=1.0)

As shown above, the c, d and e arguments can not be passed as positional arguments and must be passed as keyword arguments. Furthermore, c and e are required keyword arguments since they do not have a default value.
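These rules can be exercised in plain Python, which is also valid Cython:

```python
def f(a, b, *args, c, d=42, e, **kwds):
    return (a, b, args, c, d, e, kwds)

# c and e must be passed by keyword; d falls back to its default.
print(f(4, "bar", c=68, e=1.0))  # → (4, 'bar', (), 68, 42, 1.0, {})

# Omitting a required keyword-only argument raises TypeError.
try:
    f(4, "bar", c=68)
except TypeError as exc:
    print(type(exc).__name__)    # → TypeError
```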

A single "*" without argument name can be used to terminate the list of positional arguments:

def g(a, b, *, c, d):
    ...

# We cannot call g with less verbosity than this.
foo = g(4.0, "something", c=68, d="other")

As shown above, the signature takes exactly two positional parameters and has two required keyword parameters.

Function Pointers

Functions declared in a struct are automatically converted to function pointers.

For using error return values with function pointers, see the note at the bottom of Error return values.

Error return values

If you don’t do anything special, a function declared with cdef that does not return a Python object has no way of reporting Python exceptions to its caller. If an exception is detected in such a function, a warning message is printed and the exception is ignored.

If you want a C function that does not return a Python object to be able to propagate exceptions to its caller, you need to declare an exception value for it. Here is an example:

cdef int spam() except -1:
    ...

With this declaration, whenever an exception occurs inside spam, it will immediately return with the value -1. Furthermore, whenever a call to spam returns -1, an exception will be assumed to have occurred and will be propagated.

When you declare an exception value for a function, you should never explicitly or implicitly return that value. In particular, if the exceptional return value is a False value, then you should ensure the function will never terminate via an implicit or empty return.

If all possible return values are legal and you can’t reserve one entirely for signalling errors, you can use an alternative form of exception value declaration:

cdef int spam() except? -1:
    ...

The “?” indicates that the value -1 only signals a possible error. In this case, Cython generates a call to PyErr_Occurred() if the exception value is returned, to make sure it really is an error.

There is also a third form of exception value declaration:

cdef int spam() except *:
    ...

This form causes Cython to generate a call to PyErr_Occurred() after every call to spam, regardless of what value it returns. If you have a function returning void that needs to propagate errors, you will have to use this form, since there isn’t any return value to test. Otherwise there is little use for this form.
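Putting the three forms together (an illustrative sketch; the function names and bodies are not from the original text):

```cython
# Plain form: -1 is reserved exclusively for signalling an error,
# so the function must never legitimately return -1.
cdef int spam_strict() except -1:
    raise RuntimeError("something went wrong")

# Checked form: -1 may also be a legitimate result, so Cython calls
# PyErr_Occurred() whenever -1 is returned.
cdef int divide(int a, int b) except? -1:
    if b == 0:
        raise ZeroDivisionError("b must not be zero")
    return a // b

# void functions must use the * form, since there is no value to test.
cdef void check_positive(int x) except *:
    if x < 0:
        raise ValueError("negative value")
```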

An external C++ function that may raise an exception can be declared with:

cdef int spam() except +

See Using C++ in Cython for more details.

Some things to note:

  • Exception values can only be declared for functions returning an integer, enum, float or pointer type, and the value must be a constant expression. Void functions can only use the except * form.

  • The exception value specification is part of the signature of the function. If you’re passing a pointer to a function as a parameter or assigning it to a variable, the declared type of the parameter or variable must have the same exception value specification (or lack thereof). Here is an example of a pointer-to-function declaration with an exception value:

    int (*grail)(int, char*) except -1
    
  • You don’t need to (and shouldn’t) declare exception values for functions which return Python objects. Remember that a function with no declared return type implicitly returns a Python object. (Exceptions on such functions are implicitly propagated by returning NULL.)

Checking return values of non-Cython functions

It’s important to understand that the except clause does not cause an error to be raised when the specified value is returned. For example, you can’t write something like:

cdef extern FILE *fopen(char *filename, char *mode) except NULL # WRONG!

and expect an exception to be automatically raised if a call to fopen() returns NULL. The except clause doesn’t work that way; its only purpose is for propagating Python exceptions that have already been raised, either by a Cython function or a C function that calls Python/C API routines. To get an exception from a non-Python-aware function such as fopen(), you will have to check the return value and raise an exception yourself, for example:

from libc.stdio cimport FILE, fopen
from libc.stdlib cimport malloc, free
from cpython.exc cimport PyErr_SetFromErrnoWithFilenameObject

def open_file():
    cdef FILE* p
    p = fopen("spam.txt", "r")
    if p is NULL:
        PyErr_SetFromErrnoWithFilenameObject(OSError, "spam.txt")
    ...


def allocating_memory(number=10):
    cdef double *my_array = <double *> malloc(number * sizeof(double))
    if not my_array:  # same as 'is NULL' above
        raise MemoryError()
    ...
    free(my_array)
Overriding in extension types

cpdef methods can override cdef methods:

from __future__ import print_function

cdef class A:
    cdef foo(self):
        print("A")

cdef class B(A):
    cdef foo(self, x=None):
        print("B", x)

cdef class C(B):
    cpdef foo(self, x=True, int k=3):
        print("C", x, k)

When subclassing an extension type with a Python class, def methods can override cpdef methods but not cdef methods:

from __future__ import print_function

cdef class A:
    cdef foo(self):
        print("A")

cdef class B(A):
    cpdef foo(self):
        print("B")

class C(B):  # NOTE: not cdef class
    def foo(self):
        print("C")

If C above were an extension type (cdef class), this would not work correctly. The Cython compiler will give a warning in that case.

Automatic type conversions

In most situations, automatic conversions will be performed for the basic numeric and string types when a Python object is used in a context requiring a C value, or vice versa. The following table summarises the conversion possibilities.

C types                                           | From Python types | To Python types
[unsigned] char, [unsigned] short, int, long      | int, long         | int
unsigned int, unsigned long, [unsigned] long long | int, long         | long
float, double, long double                        | int, long, float  | float
char*                                             | str/bytes         | str/bytes [3]
C array                                           | iterable          | list [5]
struct, union                                     |                   | dict [4]

[3] The conversion is to/from str for Python 2.x, and bytes for Python 3.x.
[4] The conversion from a C union type to a Python dict will add a value for each of the union fields. Cython 0.23 and later, however, will refuse to automatically convert a union with unsafe type combinations. An example is a union of an int and a char*, in which case the pointer value may or may not be a valid pointer.
[5] Other than signed/unsigned char[]. The conversion will fail if the length of the C array is not known at compile time, and when using a slice of a C array.

Caveats when using a Python string in a C context

You need to be careful when using a Python string in a context expecting a char*. In this situation, a pointer to the contents of the Python string is used, which is only valid as long as the Python string exists. So you need to make sure that a reference to the original Python string is held for as long as the C string is needed. If you can’t guarantee that the Python string will live long enough, you will need to copy the C string.

Cython detects and prevents some mistakes of this kind. For instance, if you attempt something like:

cdef char *s
s = pystring1 + pystring2

then Cython will produce the error message Obtaining char* from temporary Python value. The reason is that concatenating the two Python strings produces a new Python string object that is referenced only by a temporary internal variable that Cython generates. As soon as the statement has finished, the temporary variable will be decrefed and the Python string deallocated, leaving s dangling. Since this code could not possibly work, Cython refuses to compile it.

The solution is to assign the result of the concatenation to a Python variable, and then obtain the char* from that, i.e.:

cdef char *s
p = pystring1 + pystring2
s = p

It is then your responsibility to hold the reference p for as long as necessary.

Keep in mind that the rules used to detect such errors are only heuristics. Sometimes Cython will complain unnecessarily, and sometimes it will fail to detect a problem that exists. Ultimately, you need to understand the issue and be careful what you do.

Type Casting

Where C uses "(" and ")", Cython uses "<" and ">". For example:

cdef char *p
cdef float *q
p = <char*>q

When casting a C value to a Python object type or vice versa, Cython will attempt a coercion. Simple examples are casts like <int>pyobj, which converts a Python number to a plain C int value, or <bytes>charptr, which copies a C char* string into a new Python bytes object.

Note

Cython will not prevent a redundant cast, but emits a warning for it.

To get the address of some Python object, use a cast to a pointer type like <void*> or <PyObject*>. You can also cast a C pointer back to a Python object reference with <object>, or a more specific builtin or extension type (e.g. <MyExtType>ptr). This will increase the reference count of the object by one, i.e. the cast returns an owned reference. Here is an example:

from cpython.ref cimport PyObject

cdef extern from *:
    ctypedef Py_ssize_t Py_intptr_t

python_string = "foo"

cdef void* ptr = <void*>python_string
cdef Py_intptr_t address_in_c = <Py_intptr_t>ptr
address_from_void = address_in_c       # address_from_void is a python int

cdef PyObject* ptr2 = <PyObject*>python_string
cdef Py_intptr_t address_in_c2 = <Py_intptr_t>ptr2
address_from_PyObject = address_in_c2  # address_from_PyObject is a python int

assert address_from_void == address_from_PyObject == id(python_string)

print(<object>ptr)                     # Prints "foo"
print(<object>ptr2)                    # prints "foo"

The precedence of <...> is such that <type>a.b.c is interpreted as <type>(a.b.c).

Checked Type Casts

A cast like <MyExtensionType>x will cast x to the class MyExtensionType without any checking at all.

To have a cast checked, use the syntax like: <MyExtensionType?>x. In this case, Cython will apply a runtime check that raises a TypeError if x is not an instance of MyExtensionType. This tests for the exact class for builtin types, but allows subclasses for Extension Types.
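For instance, with a hypothetical extension type (an illustrative sketch, not from the original text):

```cython
cdef class Shrubbery:
    cdef int width

def get_width(obj):
    # A plain <Shrubbery>obj cast would skip the check entirely; the
    # ? form raises TypeError if obj is not a Shrubbery (or a subclass).
    return (<Shrubbery?>obj).width
```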

Statements and expressions

Control structures and expressions follow Python syntax for the most part. When applied to Python objects, they have the same semantics as in Python (unless otherwise noted). Most of the Python operators can also be applied to C values, with the obvious semantics.

If Python objects and C values are mixed in an expression, conversions are performed automatically between Python objects and C numeric or string types.

Reference counts are maintained automatically for all Python objects, and all Python operations are automatically checked for errors, with appropriate action taken.

Differences between C and Cython expressions

There are some differences in syntax and semantics between C expressions and Cython expressions, particularly in the area of C constructs which have no direct equivalent in Python.

  • An integer literal is treated as a C constant, and will be truncated to whatever size your C compiler thinks appropriate. To get a Python integer (of arbitrary precision), cast immediately to an object (e.g. <object>100000000000000000000). The L, LL, and U suffixes have the same meaning as in C.

  • There is no -> operator in Cython. Instead of p->x, use p.x

  • There is no unary * operator in Cython. Instead of *p, use p[0]

  • There is an & operator, with the same semantics as in C.

  • The null C pointer is called NULL, not 0 (and NULL is a reserved word).

  • Type casts are written <type>value, for example:

    cdef char* p, float* q
    p = <char*>q
    
Scope rules

Cython determines whether a variable belongs to a local scope, the module scope, or the built-in scope completely statically. As with Python, assigning to a variable which is not otherwise declared implicitly declares it to be a variable residing in the scope where it is assigned. The type of the variable depends on type inference, except for the global module scope, where it is always a Python object.
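In plain Python terms (also valid Cython), the scoping behaves as follows:

```python
x = 1          # module scope; in Cython this is always a Python object

def f():
    x = 2      # assignment makes x local to f; the module-level x is untouched
    return x

result = f()
print(result, x)   # → 2 1
```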

Built-in Functions

Cython compiles calls to most built-in functions into direct calls to the corresponding Python/C API routines, making them particularly fast.

Only direct function calls using these names are optimised. If you do something else with one of these names that assumes it’s a Python object, such as assign it to a Python variable, and later call it, the call will be made as a Python function call.

Function and arguments                 | Return type          | Python/C API Equivalent
abs(obj)                               | object, double, …    | PyNumber_Absolute, fabs, fabsf, …
callable(obj)                          | bint                 | PyObject_Callable
delattr(obj, name)                     | None                 | PyObject_DelAttr
exec(code, [glob, [loc]])              | object               |
dir(obj)                               | list                 | PyObject_Dir
divmod(a, b)                           | tuple                | PyNumber_Divmod
getattr(obj, name, [default]) (Note 1) | object               | PyObject_GetAttr
hasattr(obj, name)                     | bint                 | PyObject_HasAttr
hash(obj)                              | int / long           | PyObject_Hash
intern(obj)                            | object               | Py*_InternFromString
isinstance(obj, type)                  | bint                 | PyObject_IsInstance
issubclass(obj, type)                  | bint                 | PyObject_IsSubclass
iter(obj, [sentinel])                  | object               | PyObject_GetIter
len(obj)                               | Py_ssize_t           | PyObject_Length
pow(x, y, [z])                         | object               | PyNumber_Power
reload(obj)                            | object               | PyImport_ReloadModule
repr(obj)                              | object               | PyObject_Repr
setattr(obj, name, value)              | void                 | PyObject_SetAttr

Note 1: Pyrex originally provided a function getattr3(obj, name, default) corresponding to the three-argument form of the Python builtin getattr(). Cython still supports this function, but the usage is deprecated in favour of the normal builtin, which Cython can optimise in both forms.

Operator Precedence

Keep in mind that there are some differences in operator precedence between Python and C, and that Cython uses the Python precedences, not the C ones.

Integer for-loops

Cython recognises the usual Python for-in-range integer loop pattern:

for i in range(n):
    ...

If i is declared as a cdef integer type, Cython will optimise this into a pure C loop. This restriction is required, as otherwise the generated code wouldn’t be correct due to potential integer overflows on the target architecture. If you are worried that the loop is not being converted correctly, use the annotate feature of the cython commandline (-a) to easily see the generated C code. See Automatic range conversion.
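A sketch of a loop that Cython can convert this way (the function name is illustrative):

```cython
def sum_squares(int n):
    cdef int i
    cdef long total = 0
    # Because i is a C int, this loop compiles to a plain C for-loop.
    for i in range(n):
        total += i * i
    return total
```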

For backwards compatibility to Pyrex, Cython also supports a more verbose form of for-loop which you might find in legacy code:

for i from 0 <= i < n:
    ...

or:

for i from 0 <= i < n by s:
    ...

where s is some integer step size.

Note

This syntax is deprecated and should not be used in new code. Use the normal Python for-loop instead.

Some things to note about the for-from loop:

  • The target expression must be a plain variable name.
  • The name between the lower and upper bounds must be the same as the target name.
  • The direction of iteration is determined by the relations. If they are both from the set {<, <=} then it is upwards; if they are both from the set {>, >=} then it is downwards. (Any other combination is disallowed.)

Like other Python looping statements, break and continue may be used in the body, and the loop may have an else clause.

Cython file types

There are three file types in Cython:

  • The implementation files, carrying a .py or .pyx suffix.
  • The definition files, carrying a .pxd suffix.
  • The include files, carrying a .pxi suffix.
The implementation file

The implementation file, as the name suggests, contains the implementation of your functions, classes, extension types, etc. Nearly all Python syntax is supported in this file. Most of the time, a .py file can be renamed into a .pyx file without changing any code, and Cython will retain the Python behaviour.

It is possible for Cython to compile both .py and .pyx files. The name of the file isn’t important if one only wants to use the Python syntax, and Cython won’t change the generated code depending on the suffix used. However, to use the Cython syntax, a .pyx file is necessary.

In addition to the Python syntax, the user can also leverage Cython syntax (such as cdef) to use C variables, can declare functions as cdef or cpdef and can import C definitions with cimport. Many other Cython features usable in implementation files can be found throughout this page and the rest of the Cython documentation.

There are some restrictions on the implementation part of some Extension Types if the corresponding definition file also defines that type.

Note

When a .pyx file is compiled, Cython first checks to see if a corresponding .pxd file exists and processes it first. It acts like a header file for a Cython .pyx file. You can put declarations in it that will be used by other Cython modules. This allows different Cython modules to use functions and classes from each other without the Python overhead. To read more about how to do that, see pxd files.

The definition file

A definition file is used to declare various things.

Any C declaration can be made, including declarations of C variables and functions implemented in a C/C++ file. This can be done with cdef extern from. Sometimes, .pxd files are used as a translation of C/C++ header files into a syntax that Cython can understand. This then allows the C/C++ variables and functions to be used directly in implementation files with cimport. You can read more about it in Interfacing with External C Code and Using C++ in Cython.

It can also contain the definition part of an extension type and the declarations of functions for an external library.

It cannot contain the implementations of any C or Python functions, or any Python class definitions, or any executable statements. It is needed when one wants to access cdef attributes and methods, or to inherit from cdef classes defined in this module.

Note

You don’t need to (and shouldn’t) declare anything in a declaration file public in order to make it available to other Cython modules; its mere presence in a definition file does that. You only need a public declaration if you want to make something available to external C code.

The include statement and include files

Warning

Historically the include statement was used for sharing declarations. Use Sharing Declarations Between Cython Modules instead.

A Cython source file can include material from other files using the include statement, for example:

include "spamstuff.pxi"

The contents of the named file are textually included at that point. The included file can contain any complete statements or declarations that are valid in the context where the include statement appears, including other include statements. The contents of the included file should begin at an indentation level of zero, and will be treated as though they were indented to the level of the include statement that is including the file. The include statement cannot, however, be used outside of the module scope, such as inside of functions or class bodies.

Note

There are other mechanisms available for splitting Cython code into separate parts that may be more appropriate in many cases. See Sharing Declarations Between Cython Modules.

Conditional Compilation

Some features are available for conditional compilation and compile-time constants within a Cython source file.

Compile-Time Definitions

A compile-time constant can be defined using the DEF statement:

DEF FavouriteFood = u"spam"
DEF ArraySize = 42
DEF OtherArraySize = 2 * ArraySize + 17

The right-hand side of the DEF must be a valid compile-time expression. Such expressions are made up of literal values and names defined using DEF statements, combined using any of the Python expression syntax.

The following compile-time names are predefined, corresponding to the values returned by os.uname().

UNAME_SYSNAME, UNAME_NODENAME, UNAME_RELEASE, UNAME_VERSION, UNAME_MACHINE

The following selection of builtin constants and functions are also available:

None, True, False, abs, all, any, ascii, bin, bool, bytearray, bytes, chr, cmp, complex, dict, divmod, enumerate, filter, float, format, frozenset, hash, hex, int, len, list, long, map, max, min, oct, ord, pow, range, reduce, repr, reversed, round, set, slice, sorted, str, sum, tuple, xrange, zip

Note that some of these builtins may not be available when compiling under Python 2.x or 3.x, or may behave differently in both.

A name defined using DEF can be used anywhere an identifier can appear, and it is replaced with its compile-time value as though it were written into the source at that point as a literal. For this to work, the compile-time expression must evaluate to a Python value of type int, long, float, bytes or unicode (str in Py3).

from __future__ import print_function

DEF FavouriteFood = u"spam"
DEF ArraySize = 42
DEF OtherArraySize = 2 * ArraySize + 17

cdef int a1[ArraySize]
cdef int a2[OtherArraySize]
print("I like", FavouriteFood)
Conditional Statements

The IF statement can be used to conditionally include or exclude sections of code at compile time. It works in a similar way to the #if preprocessor directive in C:

IF UNAME_SYSNAME == "Windows":
    include "icky_definitions.pxi"
ELIF UNAME_SYSNAME == "Darwin":
    include "nice_definitions.pxi"
ELIF UNAME_SYSNAME == "Linux":
    include "penguin_definitions.pxi"
ELSE:
    include "other_definitions.pxi"

The ELIF and ELSE clauses are optional. An IF statement can appear anywhere that a normal statement or declaration can appear, and it can contain any statements or declarations that would be valid in that context, including DEF statements and other IF statements.

The expressions in the IF and ELIF clauses must be valid compile-time expressions as for the DEF statement, although they can evaluate to any Python value, and the truth of the result is determined in the usual Python way.

Extension Types

Introduction

As well as creating normal user-defined classes with the Python class statement, Cython also lets you create new built-in Python types, known as extension types. You define an extension type using the cdef class statement. Here’s an example:

from __future__ import print_function

cdef class Shrubbery:
    cdef int width, height

    def __init__(self, w, h):
        self.width = w
        self.height = h

    def describe(self):
        print("This shrubbery is", self.width,
              "by", self.height, "cubits.")

As you can see, a Cython extension type definition looks a lot like a Python class definition. Within it, you use the def statement to define methods that can be called from Python code. You can even define many of the special methods such as __init__() as you would in Python.

The main difference is that you can use the cdef statement to define attributes. The attributes may be Python objects (either generic or of a particular extension type), or they may be of any C data type. So you can use extension types to wrap arbitrary C data structures and provide a Python-like interface to them.

Static Attributes

Attributes of an extension type are stored directly in the object’s C struct. The set of attributes is fixed at compile time; you can’t add attributes to an extension type instance at run time simply by assigning to them, as you could with a Python class instance. However, you can explicitly enable support for dynamically assigned attributes, or subclass the extension type with a normal Python class, which then supports arbitrary attribute assignments. See Dynamic Attributes.

There are two ways that attributes of an extension type can be accessed: by Python attribute lookup, or by direct access to the C struct from Cython code. Python code is only able to access attributes of an extension type by the first method, but Cython code can use either method.

By default, extension type attributes are only accessible by direct access, not Python access, which means that they are not accessible from Python code. To make them accessible from Python code, you need to declare them as public or readonly. For example:

cdef class Shrubbery:
    cdef public int width, height
    cdef readonly float depth

makes the width and height attributes readable and writable from Python code, and the depth attribute readable but not writable.

Note

You can only expose simple C types, such as ints, floats, and strings, for Python access. You can also expose Python-valued attributes.

Note

Also the public and readonly options apply only to Python access, not direct access. All the attributes of an extension type are always readable and writable by C-level access.

Dynamic Attributes

It is not possible to add attributes to an extension type at runtime by default. There are two ways around this limitation; both add an overhead when a method is called from Python code, especially when calling cpdef methods.

The first approach is to create a Python subclass:

cdef class Animal:

    cdef int number_of_legs

    def __cinit__(self, int number_of_legs):
        self.number_of_legs = number_of_legs


class ExtendableAnimal(Animal):  # Note that we use class, not cdef class
    pass


dog = ExtendableAnimal(4)
dog.has_tail = True

Declaring a __dict__ attribute is the second way of enabling dynamic attributes:

cdef class Animal:

    cdef int number_of_legs
    cdef dict __dict__

    def __cinit__(self, int number_of_legs):
        self.number_of_legs = number_of_legs


dog = Animal(4)
dog.has_tail = True

Type declarations

Before you can directly access the attributes of an extension type, the Cython compiler must know that you have an instance of that type, and not just a generic Python object. It knows this already in the case of the self parameter of the methods of that type, but in other cases you will have to use a type declaration.

For example, in the following function:

cdef widen_shrubbery(sh, extra_width): # BAD
    sh.width = sh.width + extra_width

because the sh parameter hasn’t been given a type, the width attribute will be accessed by a Python attribute lookup. If the attribute has been declared public or readonly then this will work, but it will be very inefficient. If the attribute is private, it will not work at all – the code will compile, but an attribute error will be raised at run time.

The solution is to declare sh as being of type Shrubbery, as follows:

from my_module cimport Shrubbery

cdef widen_shrubbery(Shrubbery sh, extra_width):
    sh.width = sh.width + extra_width

Now the Cython compiler knows that sh has a C attribute called width and will generate code to access it directly and efficiently. The same consideration applies to local variables, for example:

from my_module cimport Shrubbery

cdef Shrubbery another_shrubbery(Shrubbery sh1):
    cdef Shrubbery sh2
    sh2 = Shrubbery()
    sh2.width = sh1.width
    sh2.height = sh1.height
    return sh2

Note

We here cimport the class Shrubbery, and this is necessary to declare the type at compile time. To be able to cimport an extension type, we split the class definition into two parts, one in a definition file and the other in the corresponding implementation file. You should read Sharing Extension Types to learn to do that.

Type Testing and Casting

Suppose I have a method quest() which returns an object of type Shrubbery. To access its width I could write:

cdef Shrubbery sh = quest()
print(sh.width)

which requires the use of a local variable and performs a type test on assignment. If you know the return value of quest() will be of type Shrubbery you can use a cast to write:

print( (<Shrubbery>quest()).width )

This may be dangerous if quest() is not actually a Shrubbery, as it will try to access width as a C struct member which may not exist. At the C level, rather than raising an AttributeError, either a nonsensical result will be returned (interpreting whatever data is at that address as an int) or a segfault may result from trying to access invalid memory. Instead, one can write:

print( (<Shrubbery?>quest()).width )

which performs a type check (possibly raising a TypeError) before making the cast and allowing the code to proceed.

To explicitly test the type of an object, use the isinstance() builtin function. For known builtin or extension types, Cython translates these into a fast and safe type check that ignores changes to the object’s __class__ attribute etc., so that after a successful isinstance() test, code can rely on the expected C structure of the extension type and its cdef attributes and methods.
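A sketch of this pattern, using a hypothetical extension type (not from the original text):

```cython
cdef class Shrubbery:
    cdef int width

def width_or_none(obj):
    # Compiled to a fast C-level type check for the extension type.
    if isinstance(obj, Shrubbery):
        return (<Shrubbery>obj).width   # safe after the isinstance test
    return None
```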

Extension types and None

When you declare a parameter or C variable as being of an extension type, Cython will allow it to take on the value None as well as values of its declared type. This is analogous to the way a C pointer can take on the value NULL, and you need to exercise the same caution because of it. There is no problem as long as you are performing Python operations on it, because full dynamic type checking will be applied. However, when you access C attributes of an extension type (as in the widen_shrubbery function above), it’s up to you to make sure the reference you’re using is not None – in the interests of efficiency, Cython does not check this.

You need to be particularly careful when exposing Python functions which take extension types as arguments. For example, if we wanted to make widen_shrubbery() a Python function and simply wrote:

def widen_shrubbery(Shrubbery sh, extra_width): # This is
    sh.width = sh.width + extra_width           # dangerous!

then users of our module could crash it by passing None for the sh parameter.

One way to fix this would be:

def widen_shrubbery(Shrubbery sh, extra_width):
    if sh is None:
        raise TypeError
    sh.width = sh.width + extra_width

but since this is anticipated to be such a frequent requirement, Cython provides a more convenient way. Parameters of a Python function declared as an extension type can have a not None clause:

def widen_shrubbery(Shrubbery sh not None, extra_width):
    sh.width = sh.width + extra_width

Now the function will automatically check that sh is not None along with checking that it has the right type.

Note

The not None clause can only be used in Python functions (defined with def), not in C functions (defined with cdef). If you need to check whether a parameter to a C function is None, you will need to do it yourself.

Note

Some more things:

  • The self parameter of a method of an extension type is guaranteed never to be None.
  • When comparing a value with None, keep in mind that, if x is a Python object, x is None and x is not None are very efficient because they translate directly to C pointer comparisons, whereas x == None and x != None, or simply using x as a boolean value (as in if x: ...) will invoke Python operations and therefore be much slower.

Special methods

Although the principles are similar, there are substantial differences between many of the __xxx__() special methods of extension types and their Python counterparts. There is a separate page devoted to this subject, and you should read it carefully before attempting to use any special methods in your extension types.

Properties

You can declare properties in an extension class using the same syntax as in ordinary Python code:

cdef class Spam:

    @property
    def cheese(self):
        # This is called when the property is read.
        ...

    @cheese.setter
    def cheese(self, value):
        # This is called when the property is written.
        ...

    @cheese.deleter
    def cheese(self):
        # This is called when the property is deleted.
        ...

There is also a special (deprecated) legacy syntax for defining properties in an extension class:

cdef class Spam:

    property cheese:

        "A doc string can go here."

        def __get__(self):
            # This is called when the property is read.
            ...

        def __set__(self, value):
            # This is called when the property is written.
            ...

        def __del__(self):
            # This is called when the property is deleted.
            ...

The __get__(), __set__() and __del__() methods are all optional; if they are omitted, an exception will be raised when the corresponding operation is attempted.

Here’s a complete example. It defines a property which adds to a list each time it is written to, returns the list when it is read, and empties the list when it is deleted:

# cheesy.pyx
cdef class CheeseShop:

    cdef object cheeses

    def __cinit__(self):
        self.cheeses = []

    @property
    def cheese(self):
        return "We don't have: %s" % self.cheeses

    @cheese.setter
    def cheese(self, value):
        self.cheeses.append(value)

    @cheese.deleter
    def cheese(self):
        del self.cheeses[:]

# Test input
from cheesy import CheeseShop

shop = CheeseShop()
print(shop.cheese)

shop.cheese = "camembert"
print(shop.cheese)

shop.cheese = "cheddar"
print(shop.cheese)

del shop.cheese
print(shop.cheese)
# Test output
We don't have: []
We don't have: ['camembert']
We don't have: ['camembert', 'cheddar']
We don't have: []

Subclassing

An extension type may inherit from a built-in type or another extension type:

cdef class Parrot:
    ...

cdef class Norwegian(Parrot):
    ...

A complete definition of the base type must be available to Cython, so if the base type is a built-in type, it must have been previously declared as an extern extension type. If the base type is defined in another Cython module, it must either be declared as an extern extension type or imported using the cimport statement.

An extension type can only have one base class (no multiple inheritance).

Cython extension types can also be subclassed in Python. A Python class can inherit from multiple extension types provided that the usual Python rules for multiple inheritance are followed (i.e. the C layouts of all the base classes must be compatible).

There is a way to prevent extension types from being subtyped in Python. This is done via the final directive, usually set on an extension type using a decorator:

cimport cython

@cython.final
cdef class Parrot:
    def done(self): pass

Trying to create a Python subclass from this type will raise a TypeError at runtime. Cython will also prevent subtyping a final type inside of the same module, i.e. creating an extension type that uses a final type as its base type will fail at compile time. Note, however, that this restriction does not currently propagate to other extension modules, so even final extension types can still be subtyped at the C level by foreign code.

C methods

Extension types can have C methods as well as Python methods. Like C functions, C methods are declared using cdef or cpdef instead of def. C methods are “virtual”, and may be overridden in derived extension types. In addition, cpdef methods can even be overridden by Python methods when called as C methods. This adds a little to their calling overhead compared to a cdef method:

# pets.pyx
cdef class Parrot:

    cdef void describe(self):
        print("This parrot is resting.")

cdef class Norwegian(Parrot):

    cdef void describe(self):
        Parrot.describe(self)
        print("Lovely plumage!")


cdef Parrot p1, p2
p1 = Parrot()
p2 = Norwegian()
print("p1:")
p1.describe()
print("p2:")
p2.describe()
# Output
p1:
This parrot is resting.
p2:
This parrot is resting.
Lovely plumage!

The above example also illustrates that a C method can call an inherited C method using the usual Python technique, i.e.:

Parrot.describe(self)

cdef methods can be declared static by using the @staticmethod decorator. This can be especially useful for constructing classes that take non-Python-compatible types:

cdef class OwnedPointer:
    cdef void* ptr

    def __dealloc__(self):
        if self.ptr is not NULL:
            free(self.ptr)

    @staticmethod
    cdef create(void* ptr):
        p = OwnedPointer()
        p.ptr = ptr
        return p

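A sketch of how such a static factory might be used from other Cython code, assuming the OwnedPointer class above (with free imported from libc.stdlib for its __dealloc__):

```cython
from libc.stdlib cimport malloc

cdef int *buf = <int *> malloc(100 * sizeof(int))
if buf is NULL:
    raise MemoryError()
# The wrapper takes ownership; __dealloc__ will free the buffer
# when the OwnedPointer object is destroyed.
cdef OwnedPointer owner = OwnedPointer.create(buf)
```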
Forward-declaring extension types

Extension types can be forward-declared, like struct and union types. This is usually not necessary and violates the DRY principle (Don’t Repeat Yourself).

If you are forward-declaring an extension type that has a base class, you must specify the base class in both the forward declaration and its subsequent definition, for example:

cdef class A(B)

...

cdef class A(B):
    # attributes and methods

Fast instantiation

Cython provides two ways to speed up the instantiation of extension types. The first one is a direct call to the __new__() special static method, as known from Python. For an extension type Penguin, you could use the following code:

cdef class Penguin:
    cdef object food

    def __cinit__(self, food):
        self.food = food

    def __init__(self, food):
        print("eating!")

normal_penguin = Penguin('fish')
fast_penguin = Penguin.__new__(Penguin, 'wheat')  # note: not calling __init__() !

Note that the path through __new__() will not call the type’s __init__() method (again, as known from Python). Thus, in the example above, the first instantiation will print eating!, but the second will not. This is only one of the reasons why the __cinit__() method is safer and preferable over the normal __init__() method for extension types.
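The same __new__() behaviour can be observed in plain Python; a minimal sketch with a hypothetical class:

```python
class Penguin:
    def __init__(self, food):
        self.food = food

normal = Penguin("fish")           # __init__ runs
fast = Penguin.__new__(Penguin)    # __init__ is bypassed
assert hasattr(normal, "food")
assert not hasattr(fast, "food")   # the instance was never initialised
```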

The second performance improvement applies to types that are often created and deleted in a row, so that they can benefit from a freelist. Cython provides the decorator @cython.freelist(N) for this, which creates a statically sized freelist of N instances for a given type. Example:

cimport cython

@cython.freelist(8)
cdef class Penguin:
    cdef object food
    def __cinit__(self, food):
        self.food = food

penguin = Penguin('fish 1')
penguin = None
penguin = Penguin('fish 2')  # does not need to allocate memory!

Instantiation from existing C/C++ pointers

It is quite common to want to instantiate an extension class from an existing (pointer to a) data structure, often as returned by external C/C++ functions.

As extension classes can only accept Python objects as arguments in their constructors, this necessitates the use of factory functions. For example:

from libc.stdlib cimport malloc, free

# Example C struct
ctypedef struct my_c_struct:
    int a
    int b


cdef class WrapperClass:
    """A wrapper class for a C/C++ data structure"""
    cdef my_c_struct *_ptr
    cdef bint ptr_owner

    def __cinit__(self):
        self.ptr_owner = False

    def __dealloc__(self):
        # De-allocate if not null and flag is set
        if self._ptr is not NULL and self.ptr_owner is True:
            free(self._ptr)
            self._ptr = NULL

    # Extension class properties
    @property
    def a(self):
        return self._ptr.a if self._ptr is not NULL else None

    @property
    def b(self):
        return self._ptr.b if self._ptr is not NULL else None

    @staticmethod
    cdef WrapperClass from_ptr(my_c_struct *_ptr, bint owner=False):
        """Factory function to create WrapperClass objects from
        given my_c_struct pointer.

        Setting ``owner`` flag to ``True`` causes
        the extension type to ``free`` the structure pointed to by ``_ptr``
        when the wrapper object is deallocated."""
        # Call to __new__ bypasses __init__ constructor
        cdef WrapperClass wrapper = WrapperClass.__new__(WrapperClass)
        wrapper._ptr = _ptr
        wrapper.ptr_owner = owner
        return wrapper

    @staticmethod
    cdef WrapperClass new_struct():
        """Factory function to create WrapperClass objects with
        newly allocated my_c_struct"""
        cdef my_c_struct *_ptr = <my_c_struct *>malloc(sizeof(my_c_struct))
        if _ptr is NULL:
            raise MemoryError
        _ptr.a = 0
        _ptr.b = 0
        return WrapperClass.from_ptr(_ptr, owner=True)

To create a WrapperClass object from an existing my_c_struct pointer, WrapperClass.from_ptr(ptr) can be used in Cython code. To allocate a new structure and wrap it at the same time, WrapperClass.new_struct() can be used instead.

It is possible to create multiple Python objects all from the same pointer which point to the same in-memory data, if that is wanted, though care must be taken when de-allocating as can be seen above. Additionally, the ptr_owner flag can be used to control which WrapperClass object owns the pointer and is responsible for de-allocation - this is set to False by default in the example and can be enabled by calling from_ptr(ptr, owner=True).

In such cases, the GIL must not be released in __dealloc__ (or another lock must be used if it is), or race conditions can occur with multiple de-allocations.

Being a part of the object constructor, the __cinit__ method has a Python signature, which makes it unable to accept a my_c_struct pointer as an argument.

Attempts to use pointers in a Python signature will result in errors like:

Cannot convert 'my_c_struct *' to Python object

This is because Cython cannot automatically convert a pointer to a Python object, unlike with native types like int.

Note that for native types, Cython will copy the value and create a new Python object while in the above case, data is not copied and deallocating memory is a responsibility of the extension class.

Making extension types weak-referenceable

By default, extension types do not support having weak references made to them. You can enable weak referencing by declaring a C attribute of type object called __weakref__. For example:

cdef class ExplodingAnimal:
    """This animal will self-destruct when it is
    no longer strongly referenced."""

    cdef object __weakref__
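The effect is analogous to __slots__ in plain Python, where omitting a __weakref__ slot likewise disables weak references; a small illustration with hypothetical classes:

```python
import weakref

class Slotted:
    __slots__ = ("x",)                # no __weakref__ slot: like a default cdef class

class Referenceable:
    __slots__ = ("x", "__weakref__")  # opting back in, as in the Cython example

try:
    weakref.ref(Slotted())
    raised = False
except TypeError:
    raised = True

obj = Referenceable()
ref = weakref.ref(obj)
assert raised          # weak references to Slotted are rejected
assert ref() is obj    # ...but Referenceable supports them
```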

Controlling cyclic garbage collection in CPython

By default each extension type will support the cyclic garbage collector of CPython. If any Python objects can be referenced, Cython will automatically generate the tp_traverse and tp_clear slots. This is usually what you want.

There is at least one reason why this might not be what you want: if you need to clean up external resources in the __dealloc__ special method and your object happens to be part of a reference cycle, the garbage collector may have triggered a call to tp_clear to drop references. This is how reference cycles are broken so that the garbage can actually be reclaimed.

In that case, any object references have vanished by the time __dealloc__ is called, and your cleanup code has lost access to the objects it needs to clean up. You can disable the cycle breaker tp_clear by using the no_gc_clear decorator:

@cython.no_gc_clear
cdef class DBCursor:
    cdef DBConnection conn
    cdef DBAPI_Cursor *raw_cursor
    # ...
    def __dealloc__(self):
        DBAPI_close_cursor(self.conn.raw_conn, self.raw_cursor)

This example tries to close a cursor via a database connection when the Python object is destroyed. The DBConnection object is kept alive by the reference from DBCursor. But if a cursor happens to be in a reference cycle, the garbage collector may effectively “steal” the database connection reference, which makes it impossible to clean up the cursor.

With the no_gc_clear decorator this can no longer happen, because the references of a cursor object will not be cleared.

In rare cases, extension types can be guaranteed not to participate in cycles, but the compiler won’t be able to prove this. This would be the case if the class can never reference itself, even indirectly. In that case, you can manually disable cycle collection by using the no_gc decorator, but beware that doing so when the extension type can in fact participate in cycles could cause memory leaks:

@cython.no_gc
cdef class UserInfo:
    cdef str name
    cdef tuple addresses

If you can be sure addresses will contain only references to strings, the above would be safe, and it may yield a significant speedup, depending on your usage pattern.

Controlling pickling

By default, Cython will generate a __reduce__() method to allow pickling an extension type if and only if all of its members are convertible to Python objects and it has no __cinit__ method. To require this behavior (i.e. throw an error at compile time if a class cannot be pickled), decorate the class with @cython.auto_pickle(True). One can also annotate with @cython.auto_pickle(False) to get the old behavior of not generating a __reduce__ method in any case.

Manually implementing a __reduce__ or __reduce_ex__ method will also disable this auto-generation and can be used to support pickling of more complicated types.
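As an illustration of the latter, in plain Python syntax (the same protocol applies to extension types), a hypothetical class implementing __reduce__:

```python
import pickle

class Interval:
    def __init__(self, lo, hi):
        self.lo = lo
        self.hi = hi

    def __reduce__(self):
        # Return a (callable, args) pair: unpickling calls Interval(lo, hi).
        return (Interval, (self.lo, self.hi))

restored = pickle.loads(pickle.dumps(Interval(1, 5)))
assert (restored.lo, restored.hi) == (1, 5)
```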

Public and external extension types

Extension types can be declared extern or public. An extern extension type declaration makes an extension type defined in external C code available to a Cython module. A public extension type declaration makes an extension type defined in a Cython module available to external C code.

External extension types

An extern extension type allows you to gain access to the internals of Python objects defined in the Python core or in a non-Cython extension module.

Note

In previous versions of Pyrex, extern extension types were also used to reference extension types defined in another Pyrex module. While you can still do that, Cython provides a better mechanism for this. See Sharing Declarations Between Cython Modules.

Here is an example which will let you get at the C-level members of the built-in complex object:

from __future__ import print_function

cdef extern from "complexobject.h":

    struct Py_complex:
        double real
        double imag

    ctypedef class __builtin__.complex [object PyComplexObject]:
        cdef Py_complex cval

# A function which uses the above type
def spam(complex c):
    print("Real:", c.cval.real)
    print("Imag:", c.cval.imag)

Note

Some important things:

  1. In this example, ctypedef class has been used. This is because, in the Python header files, the PyComplexObject struct is declared with:

    typedef struct {
        ...
    } PyComplexObject;
    

    At runtime, when the Cython C-extension module is imported, a check is performed that __builtin__.complex’s tp_basicsize matches sizeof(PyComplexObject). This check can fail if the Cython C-extension module was compiled with one version of the complexobject.h header but imported into a Python whose header has changed. The check can be tweaked by using check_size in the name specification clause.

  2. As well as the name of the extension type, the module in which its type object can be found is also specified. See the implicit importing section below.

  3. When declaring an external extension type, you don’t declare any methods. Declaration of methods is not required in order to call them, because the calls are Python method calls. Also, as with struct and union, if your extension class declaration is inside a cdef extern from block, you only need to declare those C members which you wish to access.

Name specification clause

The part of the class declaration in square brackets is a special feature only available for extern or public extension types. The full form of this clause is:

[object object_struct_name, type type_object_name, check_size cs_option]

Where:

  • object_struct_name is the name to assume for the type’s C struct.
  • type_object_name is the name to assume for the type’s statically declared type object.
  • cs_option is warn (the default), error, or ignore and is only used for external extension types. If error, the sizeof(object_struct) that was found at compile time must match the type’s runtime tp_basicsize exactly, otherwise the module import will fail with an error. If warn or ignore, the object_struct is allowed to be smaller than the type’s tp_basicsize, which indicates the runtime type may be part of an updated module, and that the external module’s developers extended the object in a backward-compatible fashion (only adding new fields to the end of the object). If warn, a warning will be emitted in this case.

The clauses can be written in any order.

If the extension type declaration is inside a cdef extern from block, the object clause is required, because Cython must be able to generate code that is compatible with the declarations in the header file. Otherwise, for extern extension types, the object clause is optional.

For public extension types, the object and type clauses are both required, because Cython must be able to generate code that is compatible with external C code.

Attribute name matching and aliasing

Sometimes the type’s C struct as specified in object_struct_name may use different labels for the fields than those exposed to Python. This can easily happen in hand-coded C extensions where a getter method is provided, but its name does not match the field name in the underlying object struct. In NumPy, for instance, the Python-level dtype.itemsize is a getter for the C struct field elsize. Cython supports aliasing field names so that one can write dtype.itemsize in Cython code which will be compiled into direct access of the C struct field, without going through a C-API equivalent of dtype.__getattr__('itemsize').

For example we may have an extension module foo_extension:

cdef class Foo:
    cdef public int field0, field1, field2

    def __init__(self, f0, f1, f2):
        self.field0 = f0
        self.field1 = f1
        self.field2 = f2

but a C struct in a file foo_nominal.h:

typedef struct {
     PyObject_HEAD
     int f0;
     int f1;
     int f2;
 } FooStructNominal;

Note that the struct uses f0, f1, f2 but they are field0, field1, and field2 in Foo. We are given this situation, including a header file with that struct, and we wish to write a function to sum the values. If we write an extension module wrapper:

cdef extern from "foo_nominal.h":

    ctypedef class foo_extension.Foo [object FooStructNominal]:
        cdef:
            int field0
            int field1
            int field2

def sum(Foo f):
    return f.field0 + f.field1 + f.field2

then wrapper.sum(f) (where f = foo_extension.Foo(1, 2, 3)) will still use the C-API equivalent of:

return f.__getattr__('field0') +
       f.__getattr__('field1') +
       f.__getattr__('field2')

instead of the desired C equivalent of return f->f0 + f->f1 + f->f2. We can alias the fields by using:

cdef extern from "foo_nominal.h":

    ctypedef class foo_extension.Foo [object FooStructNominal]:
        cdef:
            int field0 "f0"
            int field1 "f1"
            int field2 "f2"

def sum(Foo f):
    return f.field0 + f.field1 + f.field2

and now Cython will replace the slow __getattr__ calls with direct C access to the FooStructNominal fields. No changes to the Python code need be made to achieve significant speedups, even though the field names in Python and C are different. Of course, one should make sure the fields really are equivalent.

Implicit importing

Cython requires you to include a module name in an extern extension class declaration, for example:

cdef extern class MyModule.Spam:
    ...

The type object will be implicitly imported from the specified module and bound to the corresponding name in this module. In other words, in this example an implicit:

from MyModule import Spam

statement will be executed at module load time.

The module name can be a dotted name to refer to a module inside a package hierarchy, for example:

cdef extern class My.Nested.Package.Spam:
    ...

You can also specify an alternative name under which to import the type using an as clause, for example:

cdef extern class My.Nested.Package.Spam as Yummy:
    ...

which corresponds to the implicit import statement:

from My.Nested.Package import Spam as Yummy

Type names vs. constructor names

Inside a Cython module, the name of an extension type serves two distinct purposes. When used in an expression, it refers to a module-level global variable holding the type’s constructor (i.e. its type-object). However, it can also be used as a C type name to declare variables, arguments and return values of that type.

When you declare:

cdef extern class MyModule.Spam:
    ...

the name Spam serves both these roles. There may be other names by which you can refer to the constructor, but only Spam can be used as a type name. For example, if you were to explicitly import MyModule, you could use MyModule.Spam() to create a Spam instance, but you wouldn’t be able to use MyModule.Spam as a type name.

When an as clause is used, the name specified in the as clause also takes over both roles. So if you declare:

cdef extern class MyModule.Spam as Yummy:
    ...

then Yummy becomes both the type name and a name for the constructor. Again, there are other ways that you could get hold of the constructor, but only Yummy is usable as a type name.

Public extension types

An extension type can be declared public, in which case a .h file is generated containing declarations for its object struct and type object. By including the .h file in external C code that you write, that code can access the attributes of the extension type.
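A sketch of a public extension type, using a hypothetical Vector class; both the object and type clauses are required in this case:

```cython
cdef public class Vector [object VectorObject, type VectorType]:
    cdef public double x, y
```

External C code can then include the generated .h file and access the fields directly, e.g. as ((VectorObject *) obj)->x.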

Special Methods of Extension Types

This page describes the special methods currently supported by Cython extension types. A complete list of all the special methods appears in the table at the bottom. Some of these methods behave differently from their Python counterparts or have no direct Python counterparts, and require special mention.

Note

Everything said on this page applies only to extension types, defined with the cdef class statement. It doesn’t apply to classes defined with the Python class statement, where the normal Python rules apply.

Declaration

Special methods of extension types must be declared with def, not cdef. This does not impact their performance; Python uses different calling conventions to invoke these special methods.

Docstrings

Currently, docstrings are not fully supported in some special methods of extension types. You can place a docstring in the source to serve as a comment, but it won’t show up in the corresponding __doc__ attribute at run time. (This seems to be a Python limitation: there’s nowhere in the PyTypeObject data structure to put such docstrings.)

Initialisation methods: __cinit__() and __init__()

There are two methods concerned with initialising the object.

The __cinit__() method is where you should perform basic C-level initialisation of the object, including allocation of any C data structures that your object will own. You need to be careful what you do in the __cinit__() method, because the object may not yet be a fully valid Python object when it is called. Therefore, you should be careful invoking any Python operations which might touch the object; in particular, its methods.

By the time your __cinit__() method is called, memory has been allocated for the object and any C attributes it has have been initialised to 0 or null. (Any Python attributes have also been initialised to None, but you probably shouldn’t rely on that.) Your __cinit__() method is guaranteed to be called exactly once.

If your extension type has a base type, the __cinit__() method of the base type is automatically called before your __cinit__() method is called; you cannot explicitly call the inherited __cinit__() method. If you need to pass a modified argument list to the base type, you will have to do the relevant part of the initialisation in the __init__() method instead (where the normal rules for calling inherited methods apply).

Any initialisation which cannot safely be done in the __cinit__() method should be done in the __init__() method. By the time __init__() is called, the object is a fully valid Python object and all operations are safe. Under some circumstances it is possible for __init__() to be called more than once or not to be called at all, so your other methods should be designed to be robust in such situations.

Any arguments passed to the constructor will be passed to both the __cinit__() method and the __init__() method. If you anticipate subclassing your extension type in Python, you may find it useful to give the __cinit__() method * and ** arguments so that it can accept and ignore extra arguments. Otherwise, any Python subclass which has an __init__() with a different signature will have to override __new__() [1] as well as __init__(), which the writer of a Python class wouldn’t expect to have to do. Alternatively, as a convenience, if you declare your __cinit__() method to take no arguments (other than self), it will simply ignore any extra arguments passed to the constructor without complaining about the signature mismatch.
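A sketch of a base type whose __cinit__() accepts and ignores extra arguments, so that Python subclasses with different __init__() signatures work unchanged (the names are hypothetical):

```cython
cdef class Base:
    cdef list log

    def __cinit__(self, *args, **kwargs):
        # Extra positional and keyword arguments are accepted and ignored,
        # so Python subclasses need not override __new__().
        self.log = []
```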

Note

All constructor arguments will be passed as Python objects. This implies that non-convertible C types such as pointers or C++ objects cannot be passed into the constructor from Cython code. If this is needed, use a factory function instead that handles the object initialisation. It often helps to directly call __new__() in this function to bypass the call to the __init__() constructor.

See Instantiation from existing C/C++ pointers for an example.

[1]https://docs.python.org/reference/datamodel.html#object.__new__

Finalization method: __dealloc__()

The counterpart to the __cinit__() method is the __dealloc__() method, which should perform the inverse of the __cinit__() method. Any C data that you explicitly allocated (e.g. via malloc) in your __cinit__() method should be freed in your __dealloc__() method.

You need to be careful what you do in a __dealloc__() method. By the time your __dealloc__() method is called, the object may already have been partially destroyed and may not be in a valid state as far as Python is concerned, so you should avoid invoking any Python operations which might touch the object. In particular, don’t call any other methods of the object or do anything which might cause the object to be resurrected. It’s best if you stick to just deallocating C data.

You don’t need to worry about deallocating Python attributes of your object, because that will be done for you by Cython after your __dealloc__() method returns.

When subclassing extension types, be aware that the __dealloc__() method of the superclass will always be called, even if it is overridden. This is in contrast to typical Python behavior where superclass methods will not be executed unless they are explicitly called by the subclass.

Note

There is no __del__() method for extension types.

Arithmetic methods

Arithmetic operator methods, such as __add__(), behave differently from their Python counterparts. There are no separate “reversed” versions of these methods (__radd__(), etc.). Instead, if the first operand cannot perform the operation, the same method of the second operand is called, with the operands in the same order.

This means that you can’t rely on the first parameter of these methods being “self” or being the right type, and you should test the types of both operands before deciding what to do. If you can’t handle the combination of types you’ve been given, you should return NotImplemented.

This also applies to the in-place arithmetic method __ipow__(). It doesn’t apply to any of the other in-place methods (__iadd__(), etc.) which always take self as the first argument.
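A sketch following the rules above, with a hypothetical Length type; note that the first argument is deliberately not called self, since it may be either operand:

```cython
cdef class Length:
    cdef double metres

    def __init__(self, metres):
        self.metres = metres

    def __add__(x, y):
        # Either operand may be the Length instance: test both.
        if isinstance(x, Length) and isinstance(y, Length):
            return Length((<Length> x).metres + (<Length> y).metres)
        # Unsupported combination of types: let Python try other options.
        return NotImplemented
```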

Rich comparisons

There are two ways to implement comparison methods. Depending on the application, one way or the other may be better:

  • The first way uses the six Python special methods __eq__(), __lt__(), etc. This is new since Cython 0.27 and works exactly as in plain Python classes.

  • The second way uses a single special method __richcmp__(). This implements all rich comparison operations in one method. The signature is def __richcmp__(self, other, int op). The integer argument op indicates which operation is to be performed as shown in the table below:

    <    Py_LT
    ==   Py_EQ
    >    Py_GT
    <=   Py_LE
    !=   Py_NE
    >=   Py_GE

    These constants can be cimported from the cpython.object module.
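A sketch of a __richcmp__() implementation for a hypothetical Version type, dispatching on the op constants cimported from cpython.object:

```cython
from cpython.object cimport Py_EQ, Py_NE, Py_LT, Py_LE, Py_GT, Py_GE

cdef class Version:
    cdef int number

    def __init__(self, number):
        self.number = number

    def __richcmp__(self, other, int op):
        if not isinstance(other, Version):
            return NotImplemented
        cdef int a = self.number
        cdef int b = (<Version> other).number
        if op == Py_EQ: return a == b
        if op == Py_NE: return a != b
        if op == Py_LT: return a < b
        if op == Py_LE: return a <= b
        if op == Py_GT: return a > b
        return a >= b   # Py_GE
```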

The __next__() method

Extension types wishing to implement the iterator interface should define a method called __next__(), not next. The Python system will automatically supply a next method which calls your __next__(). Do NOT explicitly give your type a next() method, or bad things could happen.
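A minimal iterator sketch with a hypothetical Countdown type, defining __iter__() and __next__():

```cython
cdef class Countdown:
    cdef int remaining

    def __init__(self, n):
        self.remaining = n

    def __iter__(self):
        # The object is its own iterator.
        return self

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        self.remaining -= 1
        return self.remaining + 1

# list(Countdown(3)) would yield [3, 2, 1]
```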

Special Method Table

This table lists all of the special methods together with their parameter and return types. In the table below, a parameter name of self is used to indicate that the parameter has the type that the method belongs to. Other parameters with no type specified in the table are generic Python objects.

You don’t have to declare your method as taking these parameter types. If you declare different types, conversions will be performed as necessary.

General

https://docs.python.org/3/reference/datamodel.html#special-method-names

Name Parameters Return type Description
__cinit__ self, …   Basic initialisation (no direct Python equivalent)
__init__ self, …   Further initialisation
__dealloc__ self   Basic deallocation (no direct Python equivalent)
__cmp__ x, y int 3-way comparison
__str__ self object str(self)
__repr__ self object repr(self)
__hash__ self int Hash function
__call__ self, … object self(…)
__iter__ self object Return iterator for sequence
__getattr__ self, name object Get attribute
__getattribute__ self, name object Get attribute, unconditionally
__setattr__ self, name, val   Set attribute
__delattr__ self, name   Delete attribute
Rich comparison operators

https://docs.python.org/3/reference/datamodel.html#basic-customization

You can choose to either implement the standard Python special methods like __eq__() or the single special method __richcmp__(). Depending on the application, one way or the other may be better.

Name Parameters Return type Description
__eq__ self, y object self == y
__ne__ self, y object self != y (falls back to __eq__ if not available)
__lt__ self, y object self < y
__gt__ self, y object self > y
__le__ self, y object self <= y
__ge__ self, y object self >= y
__richcmp__ self, y, int op object Joined rich comparison method for all of the above (no direct Python equivalent)
Arithmetic operators

https://docs.python.org/3/reference/datamodel.html#emulating-numeric-types

Name Parameters Return type Description
__add__ x, y object binary + operator
__sub__ x, y object binary - operator
__mul__ x, y object * operator
__div__ x, y object / operator for old-style division
__floordiv__ x, y object // operator
__truediv__ x, y object / operator for new-style division
__mod__ x, y object % operator
__divmod__ x, y object combined div and mod
__pow__ x, y, z object ** operator or pow(x, y, z)
__neg__ self object unary - operator
__pos__ self object unary + operator
__abs__ self object absolute value
__nonzero__ self int convert to boolean
__invert__ self object ~ operator
__lshift__ x, y object << operator
__rshift__ x, y object >> operator
__and__ x, y object & operator
__or__ x, y object | operator
__xor__ x, y object ^ operator
Numeric conversions

https://docs.python.org/3/reference/datamodel.html#emulating-numeric-types

Name Parameters Return type Description
__int__ self object Convert to integer
__long__ self object Convert to long integer
__float__ self object Convert to float
__oct__ self object Convert to octal
__hex__ self object Convert to hexadecimal
__index__ (2.5+ only) self object Convert to sequence index
In-place arithmetic operators

https://docs.python.org/3/reference/datamodel.html#emulating-numeric-types

Name Parameters Return type Description
__iadd__ self, x object += operator
__isub__ self, x object -= operator
__imul__ self, x object *= operator
__idiv__ self, x object /= operator for old-style division
__ifloordiv__ self, x object //= operator
__itruediv__ self, x object /= operator for new-style division
__imod__ self, x object %= operator
__ipow__ x, y, z object **= operator
__ilshift__ self, x object <<= operator
__irshift__ self, x object >>= operator
__iand__ self, x object &= operator
__ior__ self, x object |= operator
__ixor__ self, x object ^= operator
Sequences and mappings

https://docs.python.org/3/reference/datamodel.html#emulating-container-types

Name Parameters Return type Description
__len__ self int   len(self)
__getitem__ self, x object self[x]
__setitem__ self, x, y   self[x] = y
__delitem__ self, x   del self[x]
__getslice__ self, Py_ssize_t i, Py_ssize_t j object self[i:j]
__setslice__ self, Py_ssize_t i, Py_ssize_t j, x   self[i:j] = x
__delslice__ self, Py_ssize_t i, Py_ssize_t j   del self[i:j]
__contains__ self, x int x in self
Iterators

https://docs.python.org/3/reference/datamodel.html#emulating-container-types

Name Parameters Return type Description
__next__ self object Get next item (called next in Python)
Buffer interface [PEP 3118] (no Python equivalents - see note 1)
Name Parameters Return type Description
__getbuffer__ self, Py_buffer *view, int flags    
__releasebuffer__ self, Py_buffer *view    
Buffer interface [legacy] (no Python equivalents - see note 1)
Name Parameters Return type Description
__getreadbuffer__ self, Py_ssize_t i, void **p    
__getwritebuffer__ self, Py_ssize_t i, void **p    
__getsegcount__ self, Py_ssize_t *p    
__getcharbuffer__ self, Py_ssize_t i, char **p    
Descriptor objects (see note 2)

https://docs.python.org/3/reference/datamodel.html#emulating-container-types

Name Parameters Return type Description
__get__ self, instance, class object Get value of attribute
__set__ self, instance, value   Set value of attribute
__delete__ self, instance   Delete attribute

Note

(1) The buffer interface was intended for use by C code and is not directly accessible from Python. It is described in the Python/C API Reference Manual of Python 2.x under sections 6.6 and 10.6. It was superseded by the new PEP 3118 buffer protocol in Python 2.6 and is no longer available in Python 3. For a how-to guide to the new API, see Implementing the buffer protocol.

Note

(2) Descriptor objects are part of the support mechanism for new-style Python classes. See the discussion of descriptors in the Python documentation. See also PEP 252, “Making Types Look More Like Classes”, and PEP 253, “Subtyping Built-In Types”.
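
As a brief sketch (the Constant type is hypothetical), an extension type can act as a data descriptor by implementing these methods:

```cython
cdef class Constant:
    # A minimal data descriptor that always returns a fixed value.
    cdef object value

    def __init__(self, value):
        self.value = value

    def __get__(self, instance, owner):
        return self.value

    def __set__(self, instance, value):
        raise AttributeError("read-only attribute")
```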

Sharing Declarations Between Cython Modules

This section describes how to make C declarations, functions and extension types in one Cython module available for use in another Cython module. These facilities are closely modeled on the Python import mechanism, and can be thought of as a compile-time version of it.

Definition and Implementation files

A Cython module can be split into two parts: a definition file with a .pxd suffix, containing C declarations that are to be available to other Cython modules, and an implementation file with a .pyx suffix, containing everything else. When a module wants to use something declared in another module’s definition file, it imports it using the cimport statement.

A .pxd file that consists solely of extern declarations does not need to correspond to an actual .pyx file or Python module. This can make it a convenient place to put common declarations, for example declarations of functions from an external library that one wants to use in several modules.

What a Definition File contains

A definition file can contain:

  • Any kind of C type declaration.
  • extern C function or variable declarations.
  • Declarations of C functions defined in the module.
  • The definition part of an extension type (see below).

It cannot contain the implementations of any C or Python functions, or any Python class definitions, or any executable statements. It is needed when one wants to access cdef attributes and methods, or to inherit from cdef classes defined in this module.

Note

You don’t need to (and shouldn’t) declare anything in a declaration file public in order to make it available to other Cython modules; its mere presence in a definition file does that. You only need a public declaration if you want to make something available to external C code.

What an Implementation File contains

An implementation file can contain any kind of Cython statement, although there are some restrictions on the implementation part of an extension type if the corresponding definition file also defines that type (see below). If one doesn’t need to cimport anything from this module, then this is the only file one needs.

The cimport statement

The cimport statement is used in a definition or implementation file to gain access to names declared in another definition file. Its syntax exactly parallels that of the normal Python import statement:

cimport module [, module...]

from module cimport name [as name] [, name [as name] ...]

Here is an example. dishes.pxd is a definition file which exports a C data type. restaurant.pyx is an implementation file which imports and uses it.

dishes.pxd:

cdef enum otherstuff:
    sausage, eggs, lettuce

cdef struct spamdish:
    int oz_of_spam
    otherstuff filler

restaurant.pyx:

from __future__ import print_function
cimport dishes
from dishes cimport spamdish

cdef void prepare(spamdish *d):
    d.oz_of_spam = 42
    d.filler = dishes.sausage

def serve():
    cdef spamdish d
    prepare(&d)
    print(f'{d.oz_of_spam} oz spam, filler no. {d.filler}')

It is important to understand that the cimport statement can only be used to import C data types, C functions and variables, and extension types. It cannot be used to import any Python objects, and (with one exception) it doesn’t imply any Python import at run time. If you want to refer to any Python names from a module that you have cimported, you will have to include a regular import statement for it as well.

The exception is that when you use cimport to import an extension type, its type object is imported at run time and made available by the name under which you imported it. Using cimport to import extension types is covered in more detail below.

If a .pxd file changes, any modules that cimport from it may need to be recompiled. The Cython.Build.cythonize utility can take care of this for you.

Search paths for definition files

When you cimport a module called modulename, the Cython compiler searches for a file called modulename.pxd. It searches for this file along the path for include files (as specified by -I command line options or the include_path option to cythonize()), as well as sys.path.

Using package_data to install .pxd files in your setup.py script allows other packages to cimport items from your module as a dependency.
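
For instance, a hedged setup.py sketch (the package name mypkg is hypothetical) that installs the .pxd files alongside the compiled modules might look like:

```python
from setuptools import setup
from Cython.Build import cythonize

setup(
    name="mypkg",
    packages=["mypkg"],
    ext_modules=cythonize(["mypkg/*.pyx"]),
    # Install the .pxd files so dependent packages can cimport from them.
    package_data={"mypkg": ["*.pxd"]},
    zip_safe=False,
)
```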

Also, whenever you compile a file modulename.pyx, the corresponding definition file modulename.pxd is first searched for along the include path (but not sys.path), and if found, it is processed before processing the .pyx file.

Using cimport to resolve naming conflicts

The cimport mechanism provides a clean and simple way to solve the problem of wrapping external C functions with Python functions of the same name. All you need to do is put the extern C declarations into a .pxd file for an imaginary module, and cimport that module. You can then refer to the C functions by qualifying them with the name of the module. Here’s an example:

c_lunch.pxd:

cdef extern from "lunch.h":
    void eject_tomato(float)

lunch.pyx:

cimport c_lunch

def eject_tomato(float speed):
    c_lunch.eject_tomato(speed)

You don’t need any c_lunch.pyx file, because the only things defined in c_lunch.pxd are extern C entities. There won’t be any actual c_lunch module at run time, but that doesn’t matter; the c_lunch.pxd file has done its job of providing an additional namespace at compile time.

Sharing C Functions

C functions defined at the top level of a module can be made available via cimport by putting headers for them in the .pxd file, for example:

volume.pxd:

cdef float cube(float)

volume.pyx:

cdef float cube(float x):
    return x * x * x

spammery.pyx:

from __future__ import print_function

from volume cimport cube

def menu(description, size):
    print(description, ":", cube(size),
          "cubic metres of spam")

menu("Entree", 1)
menu("Main course", 3)
menu("Dessert", 2)

Note

When a module exports a C function in this way, an object appears in the module dictionary under the function’s name. However, you can’t make use of this object from Python, nor can you use it from Cython using a normal import statement; you have to use cimport.

Sharing Extension Types

An extension type can be made available via cimport by splitting its definition into two parts, one in a definition file and the other in the corresponding implementation file.

The definition part of the extension type can only declare C attributes and C methods, not Python methods, and it must declare all of that type’s C attributes and C methods.

The implementation part must implement all of the C methods declared in the definition part, and may not add any further C attributes. It may also define Python methods.

Here is an example of a module which defines and exports an extension type, and another module which uses it:

shrubbing.pxd:

cdef class Shrubbery:
    cdef int width
    cdef int length

shrubbing.pyx:

cdef class Shrubbery:
    def __cinit__(self, int w, int l):
        self.width = w
        self.length = l

def standard_shrubbery():
    return Shrubbery(3, 7)

landscaping.pyx:

cimport shrubbing
import shrubbing

def main():
    cdef shrubbing.Shrubbery sh
    sh = shrubbing.standard_shrubbery()
    print("Shrubbery size is", sh.width, 'x', sh.length)

One would then need to compile both of these modules, e.g. using

setup.py:

from distutils.core import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize(["landscaping.pyx", "shrubbing.pyx"]))

Some things to note about this example:

  • There is a cdef class Shrubbery declaration in both shrubbing.pxd and shrubbing.pyx. When the shrubbing module is compiled, these two declarations are combined into one.
  • In landscaping.pyx, the cimport shrubbing declaration allows us to refer to the Shrubbery type as shrubbing.Shrubbery. But it doesn’t bind the name shrubbing in landscaping’s module namespace at run time, so to access shrubbing.standard_shrubbery() we also need the regular import shrubbing statement.
  • One caveat: if you use setuptools instead of distutils, the default action when running python setup.py install is to create a zipped egg file, which will not work with cimport for pxd files when you try to use them from a dependent package. To prevent this, include zip_safe=False in the arguments to setup().

Interfacing with External C Code

One of the main uses of Cython is wrapping existing libraries of C code. This is achieved by using external declarations to declare the C functions and variables from the library that you want to use.

You can also use public declarations to make C functions and variables defined in a Cython module available to external C code. The need for this is expected to be less frequent, but you might want to do it, for example, if you are embedding Python in another application as a scripting language. Just as a Cython module can be used as a bridge to allow Python code to call C code, it can also be used to allow C code to call Python code.

External declarations

By default, C functions and variables declared at the module level are local to the module (i.e. they have the C static storage class). They can also be declared extern to specify that they are defined elsewhere, for example:

cdef extern int spam_counter

cdef extern void order_spam(int tons)
Referencing C header files

When you use an extern definition on its own as in the examples above, Cython includes a declaration for it in the generated C file. This can cause problems if the declaration doesn’t exactly match the declaration that will be seen by other C code. If you’re wrapping an existing C library, for example, it’s important that the generated C code is compiled with exactly the same declarations as the rest of the library.

To achieve this, you can tell Cython that the declarations are to be found in a C header file, like this:

cdef extern from "spam.h":

    int spam_counter

    void order_spam(int tons)

The cdef extern from clause does three things:

  1. It directs Cython to place a #include statement for the named header file in the generated C code.
  2. It prevents Cython from generating any C code for the declarations found in the associated block.
  3. It treats all declarations within the block as though they started with cdef extern.

It’s important to understand that Cython does not itself read the C header file, so you still need to provide Cython versions of any declarations from it that you use. However, the Cython declarations don’t always have to exactly match the C ones, and in some cases they shouldn’t or can’t. In particular:

  1. Leave out any platform-specific extensions to C declarations such as __declspec().

  2. If the header file declares a big struct and you only want to use a few members, you only need to declare the members you’re interested in. Leaving the rest out doesn’t do any harm, because the C compiler will use the full definition from the header file.

    In some cases, you might not need any of the struct’s members, in which case you can just put pass in the body of the struct declaration, e.g.:

    cdef extern from "foo.h":
        struct spam:
            pass
    

    Note

    you can only do this inside a cdef extern from block; struct declarations anywhere else must be non-empty.

  3. If the header file uses typedef names such as word to refer to platform-dependent flavours of numeric types, you will need a corresponding ctypedef statement, but you don’t need to match the type exactly; just use something of the right general kind (int, float, etc.). For example:

    ctypedef int word
    

    will work okay whatever the actual size of a word is (provided the header file defines it correctly). Conversion to and from Python types, if any, will also be used for this new type.

  4. If the header file uses macros to define constants, translate them into a normal external variable declaration. You can also declare them as an enum if they contain normal int values. Note that Cython considers enum to be equivalent to int, so do not do this for non-int values.

  5. If the header file defines a function using a macro, declare it as though it were an ordinary function, with appropriate argument and result types.

  6. For archaic reasons C uses the keyword void to declare a function taking no parameters. In Cython as in Python, simply declare such functions as foo().
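
As a hedged sketch of points 4–6 (the header file and all names are hypothetical):

```cython
cdef extern from "spam.h":
    # Point 4: an int-valued macro constant declared as an anonymous enum.
    enum: SPAM_MAX

    # Point 4: a non-int macro constant declared as an external variable.
    double SPAM_EPSILON

    # Point 5: a function-like macro declared as an ordinary function.
    int spam_clamp(int value)

    # Point 6: a C function declared as reset_spam(void) is written
    # with empty parentheses.
    void reset_spam()
```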

A few more tricks and tips:

  • If you want to include a C header because it’s needed by another header, but don’t want to use any declarations from it, put pass in the extern-from block:

    cdef extern from "spam.h":
        pass
    
  • If you want to include a system header, put angle brackets inside the quotes:

    cdef extern from "<sysheader.h>":
        ...
    
  • If you want to include some external declarations, but don’t want to specify a header file (because it’s included by some other header that you’ve already included) you can put * in place of the header file name:

    cdef extern from *:
        ...
    
  • If a cdef extern from "inc.h" block is not empty and contains only function or variable declarations (and no type declarations of any kind), Cython will put the #include "inc.h" statement after all declarations generated by Cython. This means that the included file has access to the variables, functions, structures, … which are declared by Cython.

Implementing functions in C

When you want to call C code from a Cython module, usually that code will be in some external library that you link your extension against. However, you can also directly compile C (or C++) code as part of your Cython module. In the .pyx file, you can put something like:

cdef extern from "spam.c":
    void order_spam(int tons)

Cython will assume that the function order_spam() is defined in the file spam.c. If you also want to cimport this function from another module, it must be declared (not extern!) in the .pxd file:

cdef void order_spam(int tons)

For this to work, the signature of order_spam() in spam.c must match the signature that Cython uses, in particular the function must be static:

static void order_spam(int tons)
{
    printf("Ordered %i tons of spam!\n", tons);
}
Styles of struct, union and enum declaration

There are two main ways that structs, unions and enums can be declared in C header files: using a tag name, or using a typedef. There are also some variations based on various combinations of these.

It’s important to make the Cython declarations match the style used in the header file, so that Cython can emit the right sort of references to the type in the code it generates. To make this possible, Cython provides two different syntaxes for declaring a struct, union or enum type. The style introduced above corresponds to the use of a tag name. To get the other style, you prefix the declaration with ctypedef, as illustrated below.

The following table shows the various possible styles that can be found in a header file, and the corresponding Cython declaration that you should put in the cdef extern from block. Struct declarations are used as an example; the same applies equally to union and enum declarations.

C code Possibilities for corresponding Cython Code Comments
struct Foo {
  ...
};
cdef struct Foo:
  ...
Cython will refer to the type as struct Foo in the generated C code.
typedef struct {
  ...
} Foo;
ctypedef struct Foo:
  ...
Cython will refer to the type simply as Foo in the generated C code.
typedef struct foo {
  ...
} Foo;
cdef struct foo:
  ...
ctypedef foo Foo #optional

or:

ctypedef struct Foo:
  ...
If the C header uses both a tag and a typedef with different names, you can use either form of declaration in Cython (although if you need to forward reference the type, you’ll have to use the first form).
typedef struct Foo {
  ...
} Foo;
cdef struct Foo:
  ...
If the header uses the same name for the tag and typedef, you won’t be able to include a ctypedef for it – but then, it’s not necessary.

See also the use of External extension types. Note that in all the cases above, you refer to the type in Cython code simply as Foo, not struct Foo.

Accessing Python/C API routines

One particular use of the cdef extern from statement is for gaining access to routines in the Python/C API. For example:

cdef extern from "Python.h":

    object PyString_FromStringAndSize(char *s, Py_ssize_t len)

will allow you to create Python strings containing null bytes.

Special Types

Cython predefines the name Py_ssize_t for use with Python/C API routines. To make your extensions compatible with 64-bit systems, you should always use this type where it is specified in the documentation of Python/C API routines.

Windows Calling Conventions

The __stdcall and __cdecl calling convention specifiers can be used in Cython, with the same syntax as used by C compilers on Windows, for example:

cdef extern int __stdcall FrobnicateWindow(long handle)

cdef void (__stdcall *callback)(void *)

If __stdcall is used, the function is only considered compatible with other __stdcall functions of the same signature.

Resolving naming conflicts - C name specifications

Each Cython module has a single module-level namespace for both Python and C names. This can be inconvenient if you want to wrap some external C functions and provide the Python user with Python functions of the same names.

Cython provides a couple of different ways of solving this problem. The best way, especially if you have many C functions to wrap, is to put the extern C function declarations into a .pxd file and thus a different namespace, using the facilities described in sharing declarations between Cython modules. Writing them into a .pxd file allows their reuse across modules, avoids naming collisions in the normal Python way and even makes it easy to rename them on cimport. For example, if your decl.pxd file declared a C function eject_tomato:

cdef extern from "myheader.h":
    void eject_tomato(float speed)

then you can cimport and wrap it in a .pyx file as follows:

from decl cimport eject_tomato as c_eject_tomato

def eject_tomato(speed):
    c_eject_tomato(speed)

or simply cimport the .pxd file and use it as prefix:

cimport decl

def eject_tomato(speed):
    decl.eject_tomato(speed)

Note that this has no runtime lookup overhead, as it would in Python. Cython resolves the names in the .pxd file at compile time.

For special cases where namespacing or renaming on import is not enough, e.g. when a name in C conflicts with a Python keyword, you can use a C name specification to give different Cython and C names to the C function at declaration time. Suppose, for example, that you want to wrap an external C function called yield(). If you declare it as:

cdef extern from "myheader.h":
    void c_yield "yield" (float speed)

then its Cython visible name will be c_yield, whereas its name in C will be yield. You can then wrap it with:

def call_yield(speed):
    c_yield(speed)

As for functions, C names can be specified for variables, structs, unions, enums, struct and union members, and enum values. For example:

cdef extern int one "eins", two "zwei"
cdef extern float three "drei"

cdef struct spam "SPAM":
    int i "eye"

cdef enum surprise "inquisition":
    first "alpha"
    second "beta" = 3

Note that Cython will not do any validation or name mangling on the string you provide. It will inject the bare text into the C code unmodified, so you are entirely on your own with this feature. If you want to declare a name xyz and have Cython inject the text “make the C compiler fail here” into the C file for it, you can do this using a C name declaration. Consider this an advanced feature, only for the rare cases where everything else fails.

Including verbatim C code

For advanced use cases, Cython allows you to directly write C code as the “docstring” of a cdef extern from block:

cdef extern from *:
    """
    /* This is C code which will be put
     * in the .c file output by Cython */
    static long square(long x) {return x * x;}
    #define assign(x, y) ((x) = (y))
    """
    long square(long x)
    void assign(long& x, long y)

The above is essentially equivalent to having the C code in a file header.h and writing

cdef extern from "header.h":
    long square(long x)
    void assign(long& x, long y)

It is also possible to combine a header file and verbatim C code:

cdef extern from "badheader.h":
    """
    /* This macro breaks stuff */
    #undef int
    """
    # Stuff from badheader.h

In this case, the C code #undef int is put right after #include "badheader.h" in the C code generated by Cython.

Note that the string is parsed like any other docstring in Python. If you require character escapes to be passed into the C code file, use a raw docstring, i.e. r""" ... """.

Using Cython Declarations from C

Cython provides two methods for making C declarations from a Cython module available for use by external C code—public declarations and C API declarations.

Note

You do not need to use either of these to make declarations from one Cython module available to another Cython module – you should use the cimport statement for that; see Sharing Declarations Between Cython Modules.

Public Declarations

You can make C types, variables and functions defined in a Cython module accessible to C code that is linked together with the Cython-generated C file, by declaring them with the public keyword:

cdef public struct Bunny: # public type declaration
    int vorpalness

cdef public int spam # public variable declaration

cdef public void grail(Bunny *) # public function declaration

If there are any public declarations in a Cython module, a header file called modulename.h is generated containing equivalent C declarations for inclusion in other C code.

A typical use case for this is building an extension module from multiple C sources, one of them being Cython generated (i.e. with something like Extension("grail", sources=["grail.pyx", "grail_helper.c"]) in setup.py). In this case, the file grail_helper.c just needs to add #include "grail.h" in order to access the public Cython variables.

A more advanced use case is embedding Python in C using Cython. In this case, make sure to call Py_Initialize() and Py_Finalize(). For example, in the following snippet that includes grail.h:

#include <Python.h>
#include "grail.h"

int main() {
    Py_Initialize();
    initgrail();  /* Python 2.x only! */
    Bunny b;
    grail(&b);
    Py_Finalize();
}

This C code can then be built together with the Cython-generated C code in a single program (or library).

In Python 3.x, calling the module init function directly should be avoided. Instead, use the inittab mechanism to link Cython modules into a single shared library or program.

err = PyImport_AppendInittab("grail", PyInit_grail);
Py_Initialize();
grail_module = PyImport_ImportModule("grail");

If the Cython module resides within a package, then the name of the .h file consists of the full dotted name of the module, e.g. a module called foo.spam would have a header file called foo.spam.h.

Note

On some operating systems like Linux, it is also possible to first build the Cython extension in the usual way and then link against the resulting .so file like a dynamic library. Beware that this is not portable, so it should be avoided.

C API Declarations

The other way of making declarations available to C code is to declare them with the api keyword. You can use this keyword with C functions and extension types. A header file called modulename_api.h is produced containing declarations of the functions and extension types, and a function called import_modulename().

C code wanting to use these functions or extension types needs to include the header and call the import_modulename() function. The other functions can then be called and the extension types used as usual.

If the C code wanting to use these functions is part of more than one shared library or executable, then the import_modulename() function needs to be called in each shared library that uses them. A segmentation fault (SIGSEGV on Linux) when calling one of these api functions is likely an indication that the shared library containing the call did not invoke import_modulename() beforehand.

Any public C type or extension type declarations in the Cython module are also made available when you include modulename_api.h:

# delorean.pyx

cdef public struct Vehicle:
    int speed
    float power

cdef api void activate(Vehicle *v):
    if v.speed >= 88 and v.power >= 1.21:
        print("Time travel achieved")

/* marty.c */
#include "delorean_api.h"

Vehicle car;

int main(int argc, char *argv[]) {
	Py_Initialize();
	import_delorean();
	car.speed = atoi(argv[1]);
	car.power = atof(argv[2]);
	activate(&car);
	Py_Finalize();
}

Note

Any types defined in the Cython module that are used as argument or return types of the exported functions will need to be declared public, otherwise they won’t be included in the generated header file, and you will get errors when you try to compile a C file that uses the header.

Using the api method does not require the C code using the declarations to be linked with the extension module in any way, as the Python import machinery is used to make the connection dynamically. However, only functions can be accessed this way, not variables. Note also that for the module import mechanism to be set up correctly, the user must call Py_Initialize() and Py_Finalize(); if you experience a segmentation fault in the call to import_modulename(), it is likely that this wasn’t done.

You can use both public and api on the same function to make it available by both methods, e.g.:

cdef public api void belt_and_braces():
    ...

However, note that you should include either modulename.h or modulename_api.h in a given C file, not both, otherwise you may get conflicting dual definitions.

If the Cython module resides within a package, then:

  • The name of the header file consists of the full dotted name of the module.
  • The name of the importing function contains the full name with dots replaced by double underscores.

E.g. a module called foo.spam would have an API header file called foo.spam_api.h and an importing function called import_foo__spam().

Multiple public and API declarations

You can declare a whole group of items as public and/or api all at once by enclosing them in a cdef block, for example:

cdef public api:
    void order_spam(int tons)
    char *get_lunch(float tomato_size)

This can be a useful thing to do in a .pxd file (see Sharing Declarations Between Cython Modules) to make the module’s public interface available by all three methods.

Acquiring and Releasing the GIL

Cython provides facilities for acquiring and releasing the Global Interpreter Lock (GIL). This may be useful when calling from multi-threaded code into (external C) code that may block, or when wanting to use Python from a (native) C thread callback. Releasing the GIL should obviously only be done for thread-safe code or for code that uses other means of protection against race conditions and concurrency issues.

Note that acquiring the GIL is a blocking thread-synchronising operation, and therefore potentially costly. It might not be worth releasing the GIL for minor calculations. Usually, I/O operations and substantial computations in parallel code will benefit from it.

Releasing the GIL

You can release the GIL around a section of code using the with nogil statement:

with nogil:
    <code to be executed with the GIL released>

Code in the body of the with-statement must not raise exceptions or manipulate Python objects in any way, and must not call anything that manipulates Python objects without first re-acquiring the GIL. Cython validates these operations at compile time, but cannot look into external C functions, for example. They must be correctly declared as requiring or not requiring the GIL (see below) in order to make Cython’s checks effective.
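For example, a minimal sketch that releases the GIL around a pure-C loop (the nogil declaration on sin is what lets Cython accept the call inside the block):

```cython
cdef extern from "math.h":
    double sin(double x) nogil

def integrate_sin(int n):
    cdef double total = 0
    cdef int i
    with nogil:                      # GIL released: no Python objects below
        for i in range(n):
            total += sin(i * 0.001)
    return total                     # GIL re-acquired on block exit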

Acquiring the GIL

A C function that is to be used as a callback from C code that is executed without the GIL needs to acquire the GIL before it can manipulate Python objects. This can be done by specifying with gil in the function header:

cdef void my_callback(void *data) with gil:
    ...

If the callback may be called from another non-Python thread, care must be taken to initialize the GIL first, through a call to PyEval_InitThreads(). If you’re already using cython.parallel in your module, this will already have been taken care of.

The GIL may also be acquired through the with gil statement:

with gil:
    <execute this block with the GIL acquired>

Declaring a function as callable without the GIL

You can specify nogil in a C function header or function type to declare that it is safe to call without the GIL:

cdef void my_gil_free_func(int spam) nogil:
    ...

When you implement such a function in Cython, it cannot have any Python arguments or Python object return type. Furthermore, any operation that involves Python objects (including calling Python functions) must explicitly acquire the GIL first, e.g. by using a with gil block or by calling a function that has been defined with gil. These restrictions are checked by Cython and you will get a compile error if it finds any Python interaction inside of a nogil code section.
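As a sketch, a nogil function can still interact with Python by re-acquiring the GIL explicitly for the duration of a with gil block:

```cython
cdef void crunch(double *data, int n) nogil:
    cdef int i
    for i in range(n):
        data[i] *= 2.0               # pure C work, no GIL needed
    with gil:
        print("crunch finished")     # Python call: GIL must be held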

Note

The nogil function annotation declares that it is safe to call the function without the GIL. It is perfectly allowed to execute it while holding the GIL. The function does not in itself release the GIL if it is held by the caller.

Declaring a function with gil (i.e. as acquiring the GIL on entry) also implicitly makes its signature nogil.

Source Files and Compilation

Cython source file names consist of the name of the module followed by a .pyx extension, for example a module called primes would have a source file named primes.pyx.

Cython code, unlike Python, must be compiled. This happens in two stages:

  • A .pyx file is compiled by Cython to a .c file.
  • The .c file is compiled by a C compiler to a .so file (or a .pyd file on Windows)

Once you have written your .pyx file, there are a couple of ways of turning it into an extension module.

The following sub-sections describe several ways to build your extension modules, and how to pass directives to the Cython compiler.

Compiling from the command line

There are two ways of compiling from the command line.

  • The cython command takes a .py or .pyx file and compiles it into a C/C++ file.
  • The cythonize command takes a .py or .pyx file and compiles it into a C/C++ file. It then compiles the C/C++ file into an extension module which is directly importable from Python.

Compiling with the cython command

One way is to compile it manually with the Cython compiler, e.g.:

$ cython primes.pyx

This will produce a file called primes.c, which then needs to be compiled with the C compiler using whatever options are appropriate on your platform for generating an extension module. For these options look at the official Python documentation.

The other, and probably better, way is to use the distutils extension provided with Cython. The benefit of this method is that it determines the platform-specific compilation options for you, acting like a stripped-down autotools.

Compiling with the cythonize command

Run the cythonize compiler command with your options and list of .pyx files to generate an extension module. For example:

$ cythonize -a -i yourmod.pyx

This creates a yourmod.c file (or yourmod.cpp in C++ mode), compiles it, and puts the resulting extension module (.so or .pyd, depending on your platform) next to the source file for direct import (-i builds “in place”). The -a switch additionally produces an annotated html file of the source code.

The cythonize command accepts multiple source files and glob patterns like **/*.pyx as arguments and also understands the common -j option for running multiple parallel build jobs. When called without further options, it will only translate the source files to .c or .cpp files. Pass the -h flag for a complete list of supported options.

The simpler command line tool cython only invokes the source code translator.

In the case of manual compilation, how to compile your .c files will vary depending on your operating system and compiler. The Python documentation for writing extension modules should have some details for your system. On a Linux system, for example, it might look similar to this:

$ gcc -shared -pthread -fPIC -fwrapv -O2 -Wall -fno-strict-aliasing \
      -I/usr/include/python3.5 -o yourmod.so yourmod.c

(gcc will need to have paths to your included header files and paths to libraries you want to link with.)

After compilation, a yourmod.so (yourmod.pyd for Windows) file is written into the target directory and your module, yourmod, is available for you to import as with any other Python module. Note that if you are not relying on cythonize or distutils, you will not automatically benefit from the platform specific file extension that CPython generates for disambiguation, such as yourmod.cpython-35m-x86_64-linux-gnu.so on a regular 64bit Linux installation of CPython 3.5.

Basic setup.py

The distutils extension provided with Cython allows you to pass .pyx files directly to the Extension constructor in your setup file.

If you have a single Cython file that you want to turn into a compiled extension, say with filename example.pyx the associated setup.py would be:

from distutils.core import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("example.pyx")
)

To understand the setup.py more fully look at the official distutils documentation. To compile the extension for use in the current directory use:

$ python setup.py build_ext --inplace

Configuring the C-Build

If you have include files in non-standard places you can pass an include_path parameter to cythonize:

from distutils.core import setup
from Cython.Build import cythonize

setup(
    name="My hello app",
    ext_modules=cythonize("src/*.pyx", include_path=[...]),
)

Often, Python packages that offer a C-level API provide a way to find the necessary include files, e.g. for NumPy:

include_path = [numpy.get_include()]

Note

Using memoryviews or importing NumPy with import numpy does not mean that you have to add the path to NumPy include files. You need to add this path only if you use cimport numpy.

Despite this, you will still get warnings like the following from the compiler, because Cython uses a deprecated NumPy API:

.../include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]

For the time being, it is just a warning that you can ignore.

If you need to specify compiler options, libraries to link with or other linker options you will need to create Extension instances manually (note that glob syntax can still be used to specify multiple extensions in one line):

from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize

extensions = [
    Extension("primes", ["primes.pyx"],
        include_dirs=[...],
        libraries=[...],
        library_dirs=[...]),
    # Everything but primes.pyx is included here.
    Extension("*", ["*.pyx"],
        include_dirs=[...],
        libraries=[...],
        library_dirs=[...]),
]
setup(
    name="My hello app",
    ext_modules=cythonize(extensions),
)

Note that when using setuptools, you should import it before Cython as setuptools may replace the Extension class in distutils. Otherwise, both might disagree about the class to use here.

Note also that if you use setuptools instead of distutils, the default action when running python setup.py install is to create a zipped egg file which will not work with cimport for pxd files when you try to use them from a dependent package. To prevent this, include zip_safe=False in the arguments to setup().

If your options are static (for example you do not need to call a tool like pkg-config to determine them) you can also provide them directly in your .pyx or .pxd source file using a special comment block at the start of the file:

# distutils: libraries = spam eggs
# distutils: include_dirs = /opt/food/include

If you cimport multiple .pxd files defining libraries, then Cython merges the list of libraries, so this works as expected (similarly with other options, like include_dirs above).

If you have some C files that have been wrapped with Cython and you want to compile them into your extension, you can define the distutils sources parameter:

# distutils: sources = helper.c, another_helper.c

Note that these sources are added to the list of sources of the current extension module. Spelling this out in the setup.py file looks as follows:

from distutils.core import setup
from Cython.Build import cythonize
from distutils.extension import Extension

sourcefiles = ['example.pyx', 'helper.c', 'another_helper.c']

extensions = [Extension("example", sourcefiles)]

setup(
    ext_modules=cythonize(extensions)
)

The Extension class takes many options, and a fuller explanation can be found in the distutils documentation. Some useful options to know about are include_dirs, libraries, and library_dirs which specify where to find the .h and library files when linking to external libraries.

Sometimes this is not enough and you need finer customization of the distutils Extension. To do this, you can provide a custom function create_extension to create the final Extension object after Cython has processed the sources, dependencies and # distutils directives but before the file is actually Cythonized. This function takes 2 arguments template and kwds, where template is the Extension object given as input to Cython and kwds is a dict with all keywords which should be used to create the Extension. The function create_extension must return a 2-tuple (extension, metadata), where extension is the created Extension and metadata is metadata which will be written as JSON at the top of the generated C files. This metadata is only used for debugging purposes, so you can put whatever you want in there (as long as it can be converted to JSON). The default function (defined in Cython.Build.Dependencies) is:

def default_create_extension(template, kwds):
    if 'depends' in kwds:
        include_dirs = kwds.get('include_dirs', []) + ["."]
        depends = resolve_depends(kwds['depends'], include_dirs)
        kwds['depends'] = sorted(set(depends + template.depends))

    t = template.__class__
    ext = t(**kwds)
    metadata = dict(distutils=kwds, module_name=kwds['name'])
    return ext, metadata

If you pass a string instead of an Extension to cythonize(), the template will be an Extension without sources. For example, if you do cythonize("*.pyx"), the template will be Extension(name="*.pyx", sources=[]).

Just as an example, this adds mylib as library to every extension:

from Cython.Build.Dependencies import default_create_extension

def my_create_extension(template, kwds):
    libs = kwds.get('libraries', []) + ["mylib"]
    kwds['libraries'] = libs
    return default_create_extension(template, kwds)

ext_modules = cythonize(..., create_extension=my_create_extension)

Note

If you Cythonize in parallel (using the nthreads argument), then the argument to create_extension must be pickleable. In particular, it cannot be a lambda function.

Cythonize arguments

The function cythonize() can take extra arguments which will allow you to customize your build.

Cython.Build.cythonize(module_list, exclude=None, nthreads=0, aliases=None, quiet=False, force=False, language=None, exclude_failures=False, **options)

Compile a set of source modules into C/C++ files and return a list of distutils Extension objects for them.

Parameters:
  • module_list – As module list, pass either a glob pattern, a list of glob patterns or a list of Extension objects. The latter allows you to configure the extensions separately through the normal distutils options. You can also pass Extension objects that have glob patterns as their sources. Then, cythonize will resolve the pattern and create a copy of the Extension for every matching file.
  • exclude – When passing glob patterns as module_list, you can exclude certain module names explicitly by passing them into the exclude option.
  • nthreads – The number of concurrent builds for parallel compilation (requires the multiprocessing module).
  • aliases – If you want to use compiler directives like # distutils: ... but can only know at compile time (when running the setup.py) which values to use, you can use aliases and pass a dictionary mapping those aliases to Python strings when calling cythonize(). As an example, say you want to use the compiler directive # distutils: include_dirs = ../static_libs/include/ but this path isn’t always fixed and you want to find it when running the setup.py. You can then do # distutils: include_dirs = MY_HEADERS, find the value of MY_HEADERS in the setup.py, put it in a python variable called foo as a string, and then call cythonize(..., aliases={'MY_HEADERS': foo}).
  • quiet – If True, Cython won’t print error and warning messages during the compilation.
  • force – Forces the recompilation of the Cython modules, even if the timestamps don’t indicate that a recompilation is necessary.
  • language – To globally enable C++ mode, you can pass language='c++'. Otherwise, this will be determined at a per-file level based on compiler directives. This affects only modules found based on file names. Extension instances passed into cythonize() will not be changed. It is recommended to rather use the compiler directive # distutils: language = c++ than this option.
  • exclude_failures – For a broad ‘try to compile’ mode that ignores compilation failures and simply excludes the failed extensions, pass exclude_failures=True. Note that this only really makes sense for compiling .py files which can also be used without compilation.
  • annotate – If True, will produce a HTML file for each of the .pyx or .py files compiled. The HTML file gives an indication of how much Python interaction there is in each of the source code lines, compared to plain C code. It also allows you to see the C/C++ code generated for each line of Cython code. This report is invaluable when optimizing a function for speed, and for determining when to release the GIL: in general, a nogil block may contain only “white” code. See examples in Determining where to add types or Primes.
  • compiler_directives – Allow to set compiler directives in the setup.py like this: compiler_directives={'embedsignature': True}. See Compiler directives.

Multiple Cython Files in a Package

To automatically compile multiple Cython files without listing all of them explicitly, you can use glob patterns:

setup(
    ext_modules = cythonize("package/*.pyx")
)

You can also use glob patterns in Extension objects if you pass them through cythonize():

extensions = [Extension("*", ["*.pyx"])]

setup(
    ext_modules = cythonize(extensions)
)

Distributing Cython modules

It is strongly recommended that you distribute the generated .c files as well as your Cython sources, so that users can install your module without needing to have Cython available.

It is also recommended that Cython compilation not be enabled by default in the version you distribute. Even if the user has Cython installed, he/she probably doesn’t want to use it just to install your module. Also, the installed version may not be the same one you used, and may not compile your sources correctly.

This simply means that the setup.py file that you ship with will just be a normal distutils file on the generated .c files, for the basic example we would have instead:

from distutils.core import setup
from distutils.extension import Extension

setup(
    ext_modules = [Extension("example", ["example.c"])]
)

This is easy to combine with cythonize() by changing the file extension of the extension module sources:

from distutils.core import setup
from distutils.extension import Extension

USE_CYTHON = ...   # command line option, try-import, ...

ext = '.pyx' if USE_CYTHON else '.c'

extensions = [Extension("example", ["example"+ext])]

if USE_CYTHON:
    from Cython.Build import cythonize
    extensions = cythonize(extensions)

setup(
    ext_modules = extensions
)

If you have many extensions and want to avoid the additional complexity in the declarations, you can declare them with their normal Cython sources and then call the following function instead of cythonize() to adapt the sources list in the Extensions when not using Cython:

import os.path

def no_cythonize(extensions, **_ignore):
    for extension in extensions:
        sources = []
        for sfile in extension.sources:
            path, ext = os.path.splitext(sfile)
            if ext in ('.pyx', '.py'):
                if extension.language == 'c++':
                    ext = '.cpp'
                else:
                    ext = '.c'
                sfile = path + ext
            sources.append(sfile)
        extension.sources[:] = sources
    return extensions
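To illustrate what this function does to the sources list, here is a self-contained sketch (FakeExtension is a hypothetical stand-in for distutils' Extension class, used only to avoid a build dependency in this example):

```python
import os.path

def no_cythonize(extensions, **_ignore):
    # identical to the function above
    for extension in extensions:
        sources = []
        for sfile in extension.sources:
            path, ext = os.path.splitext(sfile)
            if ext in ('.pyx', '.py'):
                ext = '.cpp' if extension.language == 'c++' else '.c'
                sfile = path + ext
            sources.append(sfile)
        extension.sources[:] = sources
    return extensions

class FakeExtension:
    # hypothetical stand-in for distutils.extension.Extension
    def __init__(self, name, sources, language=None):
        self.name, self.sources, self.language = name, sources, language

ext = FakeExtension("example", ["example.pyx", "helper.c"])
no_cythonize([ext])
print(ext.sources)
# -> ['example.c', 'helper.c']
```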

Another option is to make Cython a setup dependency of your system and use Cython’s build_ext module which runs cythonize as part of the build process:

setup(
    setup_requires=[
        'cython>=0.x',
    ],
    extensions = [Extension("*", ["*.pyx"])],
    cmdclass={'build_ext': Cython.Build.build_ext},
    ...
)

If you want to expose the C-level interface of your library for other libraries to cimport from, use package_data to install the .pxd files, e.g.:

setup(
    package_data = {
        'my_package': ['*.pxd'],
        'my_package/sub_package': ['*.pxd'],
    },
    ...
)

These .pxd files need not have corresponding .pyx modules if they contain purely declarations of external libraries.

Remember that if you use setuptools instead of distutils, the default action when running python setup.py install is to create a zipped egg file which will not work with cimport for pxd files when you try to use them from a dependent package. To prevent this, include zip_safe=False in the arguments to setup().

Integrating multiple modules

In some scenarios, it can be useful to link multiple Cython modules (or other extension modules) into a single binary, e.g. when embedding Python in another application. This can be done through the inittab import mechanism of CPython.

Create a new C file to integrate the extension modules and add this macro to it:

#if PY_MAJOR_VERSION < 3
# define MODINIT(name)  init ## name
#else
# define MODINIT(name)  PyInit_ ## name
#endif

If you are only targeting Python 3.x, just use PyInit_ as prefix.

Then, for each of the modules, declare its module init function as follows, replacing some_module_name with the name of the module:

PyMODINIT_FUNC  MODINIT(some_module_name) (void);

In C++, declare them as extern "C".

If you are not sure of the name of the module init function, refer to your generated module source file and look for a function name starting with PyInit_.

Next, before you start the Python runtime from your application code with Py_Initialize(), you need to initialise the modules at runtime using the PyImport_AppendInittab() C-API function, again inserting the name of each of the modules:

PyImport_AppendInittab("some_module_name", MODINIT(some_module_name));

This enables normal imports for the embedded extension modules.

In order to prevent the joined binary from exporting all of the module init functions as public symbols, Cython 0.28 and later can hide these symbols if the macro CYTHON_NO_PYINIT_EXPORT is defined while C-compiling the module C files.

Also take a look at the cython_freeze tool. It can generate the necessary boilerplate code for linking one or more modules into a single Python executable.

Compiling with pyximport

For building Cython modules during development without explicitly running setup.py after each change, you can use pyximport:

>>> import pyximport; pyximport.install()
>>> import helloworld
Hello World

This allows you to automatically run Cython on every .pyx file that Python is trying to import. You should use this only for simple Cython builds where no extra C libraries and no special build setup are needed.

It is also possible to compile new .py modules that are being imported (including the standard library and installed packages). To use this feature, just tell pyximport:

>>> pyximport.install(pyimport=True)

If Cython fails to compile a Python module, pyximport falls back to loading the source module instead.

Note that it is not recommended to let pyximport build code on end user side as it hooks into their import system. The best way to cater for end users is to provide pre-built binary packages in the wheel packaging format.

Arguments

The function pyximport.install() can take several arguments to influence the compilation of Cython or Python files.

pyximport.install(pyximport=True, pyimport=False, build_dir=None, build_in_temp=True, setup_args=None, reload_support=False, load_py_module_on_import_failure=False, inplace=False, language_level=None)

Main entry point for pyxinstall.

Call this to install the .pyx import hook in your meta-path for a single Python process. If you want it to be installed whenever you use Python, add it to your sitecustomize (as described above).

Parameters:
  • pyximport – If set to False, does not try to import .pyx files.
  • pyimport – You can pass pyimport=True to also install the .py import hook in your meta-path. Note, however, that it is rather experimental, will not work at all for some .py files and packages, and will heavily slow down your imports due to search and compilation. Use at your own risk.
  • build_dir – By default, compiled modules will end up in a .pyxbld directory in the user’s home directory. Passing a different path as build_dir will override this.
  • build_in_temp – If False, the C files are produced locally, which makes working with complex dependencies and debugging easier. In principle, this can interfere with existing files of the same name.
  • setup_args – Dict of arguments for Distribution. See distutils.core.setup().
  • reload_support – Enables support for dynamic reload(my_module), e.g. after a change in the Cython code. Additional files <so_path>.reloadNN may arise on that account, when the previously loaded module file cannot be overwritten.
  • load_py_module_on_import_failure – If the compilation of a .py file succeeds, but the subsequent import fails for some reason, retry the import with the normal .py module instead of the compiled module. Note that this may lead to unpredictable results for modules that change the system state during their import, as the second import will rerun these modifications in whatever state the system was left after the import of the compiled module failed.
  • inplace – Install the compiled module (.so for Linux and Mac / .pyd for Windows) next to the source file.
  • language_level – The source language level to use: 2 or 3. The default is to use the language level of the current Python runtime for .py files and Py2 for .pyx files.

Dependency Handling

Since pyximport does not use cythonize() internally, it currently requires a different setup for dependencies. It is possible to declare that your module depends on multiple files (likely .h and .pxd files). If your Cython module is named foo and thus has the filename foo.pyx, then you should create another file in the same directory called foo.pyxdep. The modname.pyxdep file can be a list of filenames or globs (like *.pxd or include/*.h). Each filename or glob must be on a separate line. Pyximport will check the file date for each of those files before deciding whether to rebuild the module. In order to keep track of the fact that the dependency has been handled, Pyximport updates the modification time of your .pyx source file. Future versions may do something more sophisticated like informing distutils of the dependencies directly.
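For instance, a hypothetical foo.pyxdep sitting next to foo.pyx might contain:

```
foo.pxd
include/*.h
```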

Limitations

pyximport does not use cythonize(). Thus it is not possible to do things like using compiler directives at the top of Cython files or compiling Cython code to C++.

Pyximport does not give you any control over how your Cython file is compiled. Usually the defaults are fine. You might run into problems if you wanted to write your program in half-C, half-Cython and build them into a single library.

Pyximport does not hide the Distutils/GCC warnings and errors generated by the import process. Arguably this will give you better feedback if something went wrong and why. And if nothing went wrong it will give you the warm fuzzy feeling that pyximport really did rebuild your module as it was supposed to.

Basic module reloading support is available with the option reload_support=True. Note that this will generate a new module filename for each build and thus end up loading multiple shared libraries into memory over time. CPython has limited support for reloading shared libraries as such, see PEP 489.

Pyximport puts both your .c file and the platform-specific binary into a separate build directory, usually $HOME/.pyxbld/. To copy it back into the package hierarchy (usually next to the source file) for manual reuse, you can pass the option inplace=True.

Compiling with cython.inline

One can also compile Cython in a fashion similar to SciPy’s weave.inline. For example:

>>> import cython
>>> def f(a):
...     ret = cython.inline("return a+b", b=3)
...

Unbound variables are automatically pulled from the surrounding local and global scopes, and the result of the compilation is cached for efficient re-use.

Compiling with Sage

The Sage notebook allows transparently editing and compiling Cython code simply by typing %cython at the top of a cell and evaluating it. Variables and functions defined in a Cython cell are imported into the running session. Please check the Sage documentation for details.

You can tailor the behavior of the Cython compiler by specifying the directives below.

Compiling with a Jupyter Notebook

It’s possible to compile code in a notebook cell with Cython. For this you need to load the Cython magic:

%load_ext cython

Then you can define a Cython cell by writing %%cython on top of it. Like this:

%%cython

cdef int a = 0
for i in range(10):
    a += i
print(a)

Note that each cell will be compiled into a separate extension module. So if you use a package in a Cython cell, you will have to import this package in the same cell. It’s not enough to have imported the package in a previous cell. Cython will tell you that there are “undefined global names” at compilation time if you don’t comply.

The global names (top level functions, classes, variables and modules) of the cell are then loaded into the global namespace of the notebook. So in the end, it behaves as if you executed a Python cell.

Additional allowable arguments to the Cython magic are listed below. You can also see them by typing %%cython? in IPython or a Jupyter notebook.

-a, --annotate Produce a colorized HTML version of the source.
-+, --cplus Output a C++ rather than C file.
-f, --force Force the compilation of a new module, even if the source has been previously compiled.
-3 Select Python 3 syntax
-2 Select Python 2 syntax
-c=COMPILE_ARGS, --compile-args=COMPILE_ARGS Extra flags to pass to the compiler via extra_compile_args.
--link-args LINK_ARGS Extra flags to pass to the linker via extra_link_args.
-l LIB, --lib LIB Add a library to link the extension against (can be specified multiple times).
-L dir Add a path to the list of library directories (can be specified multiple times).
-I INCLUDE, --include INCLUDE Add a path to the list of include directories (can be specified multiple times).
-S, --src Add a path to the list of src files (can be specified multiple times).
-n NAME, --name NAME Specify a name for the Cython module.
--pgo Enable profile guided optimisation in the C compiler. Compiles the cell twice and executes it in between to generate a runtime profile.
--verbose Print debug information like generated .c/.cpp file location and exact gcc/g++ command invoked.

Compiler options

Compiler options can be set in the setup.py, before calling cythonize(), like this:

from distutils.core import setup

from Cython.Build import cythonize
from Cython.Compiler import Options

Options.docstrings = False

setup(
    name = "hello",
    ext_modules = cythonize("lib.pyx"),
)

Here are the options that are available:

Cython.Compiler.Options.docstrings = True

Whether or not to include docstring in the Python extension. If False, the binary size will be smaller, but the __doc__ attribute of any class or function will be an empty string.

Cython.Compiler.Options.embed_pos_in_docstring = False

Embed the source code position in the docstrings of functions and classes.

Cython.Compiler.Options.emit_code_comments = True

Copy the original source code line by line into C code comments in the generated code file to help with understanding the output. This is also required for coverage analysis.

Cython.Compiler.Options.generate_cleanup_code = False

Decref global variables in each module on exit for garbage collection. 0: None, 1+: interned objects, 2+: cdef globals, 3+: type objects. Mostly for reducing noise in Valgrind; it only executes at process exit (when all memory will be reclaimed anyway).

Cython.Compiler.Options.clear_to_none = True

Should tp_clear() set object fields to None instead of clearing them to NULL?

Cython.Compiler.Options.annotate = False

Generate an annotated HTML version of the input source files for debugging and optimisation purposes. This has the same effect as the annotate argument in cythonize().

Cython.Compiler.Options.fast_fail = False

This will abort the compilation on the first error encountered, rather than trying to keep going and printing further error messages.

Cython.Compiler.Options.warning_errors = False

Turn all warnings into errors.

Cython.Compiler.Options.error_on_unknown_names = True

Make unknown names an error. Python raises a NameError when encountering unknown names at runtime, whereas this option makes them a compile time error. If you want full Python compatibility, you should disable this option and also ‘cache_builtins’.

Cython.Compiler.Options.error_on_uninitialized = True

Make uninitialized local variable reference a compile time error. Python raises UnboundLocalError at runtime, whereas this option makes them a compile time error. Note that this option affects only variables of “python object” type.

Cython.Compiler.Options.convert_range = True

This will convert statements of the form for i in range(...) to for i from ... when i is a C integer type, and the direction (i.e. sign of step) can be determined. WARNING: This may change the semantics if the range causes assignment to i to overflow. Specifically, if this option is set, an error will be raised before the loop is entered, whereas without this option the loop will execute until an overflowing value is encountered.

Cython.Compiler.Options.cache_builtins = True

Perform lookups on builtin names only once, at module initialisation time. This will prevent the module from getting imported if a builtin name that it uses cannot be found during initialisation. Default is True. Note that some legacy builtins are automatically remapped from their Python 2 names to their Python 3 names by Cython when building in Python 3.x, so that they do not get in the way even if this option is enabled.

Cython.Compiler.Options.gcc_branch_hints = True

Generate branch prediction hints to speed up error handling etc.

Cython.Compiler.Options.lookup_module_cpdef = False

Enable this to allow one to write your_module.foo = ... to overwrite the definition of the cpdef function foo, at the cost of an extra dictionary lookup on every call. If this is False, Cython generates only the Python wrapper and no override check.

Cython.Compiler.Options.embed = None

Whether or not to embed the Python interpreter, for use in making a standalone executable or calling from external libraries. This will provide a C function which initialises the interpreter and executes the body of this module. See this demo for a concrete example. If true, the initialisation function is the C main() function, but this option can also be set to a non-empty string to provide a function name explicitly. Default is None (no embedding).

Cython.Compiler.Options.cimport_from_pyx = False

Allows cimporting from a pyx file without a pxd file.

Cython.Compiler.Options.buffer_max_dims = 8

Maximum number of dimensions for buffers – set lower than number of dimensions in numpy, as slices are passed by value and involve a lot of copying.

Cython.Compiler.Options.closure_freelist_size = 8

Number of function closure instances to keep in a freelist (0: no freelists)

Compiler directives

Compiler directives are instructions which affect the behavior of Cython code. Here is the list of currently supported directives:

binding (True / False)
Controls whether free functions behave more like Python’s CFunctions (e.g. len()) or, when set to True, more like Python’s functions. When enabled, functions will bind to an instance when looked up as a class attribute (hence the name) and will emulate the attributes of Python functions, including introspections like argument names and annotations. Default is False.
boundscheck (True / False)
If set to False, Cython is free to assume that indexing operations ([]-operator) in the code will not cause any IndexErrors to be raised. Lists, tuples, and strings are affected only if the index can be determined to be non-negative (or if wraparound is False). Conditions which would normally trigger an IndexError may instead cause segfaults or data corruption if this is set to False. Default is True.
wraparound (True / False)
In Python, arrays and sequences can be indexed relative to the end. For example, A[-1] indexes the last value of a list. In C, negative indexing is not supported. If set to False, Cython neither checks for nor correctly handles negative indices, possibly causing segfaults or data corruption. If bounds checks are enabled (the default, see boundscheck above), negative indexing will usually raise an IndexError for indices that Cython evaluates itself. However, it can be difficult to distinguish these cases in user code from indexing or slicing that is evaluated by the underlying Python array or sequence object, which continues to support wrap-around indices. It is therefore safest to apply this option only to code that does not process negative indices at all. Default is True.
initializedcheck (True / False)
If set to True, Cython checks that a memoryview is initialized whenever its elements are accessed or assigned to. Setting this to False disables these checks. Default is True.
nonecheck (True / False)
If set to False, Cython is free to assume that native field accesses on variables typed as an extension type, or buffer accesses on a buffer variable, never occur when the variable is set to None. Otherwise a check is inserted and the appropriate exception is raised. This is off by default for performance reasons. Default is False.
overflowcheck (True / False)
If set to True, raise errors on overflowing C integer arithmetic operations. Incurs a modest runtime penalty, but is much faster than using Python ints. Default is False.
overflowcheck.fold (True / False)
If set to True, and overflowcheck is True, check the overflow bit for nested, side-effect-free arithmetic expressions once rather than at every step. Depending on the compiler, architecture, and optimization settings, this may help or hurt performance. A simple suite of benchmarks can be found in Demos/overflow_perf.pyx. Default is True.
embedsignature (True / False)
If set to True, Cython will embed a textual copy of the call signature in the docstring of all Python visible functions and classes. Tools like IPython and epydoc can thus display the signature, which cannot otherwise be retrieved after compilation. Default is False.
cdivision (True / False)
If set to False, Cython will adjust the remainder and quotient operators on C integer types to match those of Python ints (which differ when the operands have opposite signs) and raise a ZeroDivisionError when the right operand is 0. This has up to a 35% speed penalty. If set to True, no checks are performed. See CEP 516. Default is False.
cdivision_warnings (True / False)
If set to True, Cython will emit a runtime warning whenever division is performed with negative operands. See CEP 516. Default is False.
always_allow_keywords (True / False)
Avoid the METH_NOARGS and METH_O calling conventions when constructing functions/methods which take zero or one argument. Has no effect on special methods and functions with more than one argument. The METH_NOARGS and METH_O signatures provide faster calling conventions but disallow the use of keywords.
profile (True / False)
Write hooks for Python profilers into the compiled C code. Default is False.
linetrace (True / False)
Write line tracing hooks for Python profilers or coverage reporting into the compiled C code. This also enables profiling. Default is False. Note that the generated module will not actually use line tracing, unless you additionally pass the C macro definition CYTHON_TRACE=1 to the C compiler (e.g. using the distutils option define_macros). Define CYTHON_TRACE_NOGIL=1 to also include nogil functions and sections.
infer_types (True / False)
Infer types of untyped variables in function bodies. Default is None, indicating that only safe (semantically-unchanging) inferences are allowed. In particular, inferring integral types for variables used in arithmetic expressions is considered unsafe (due to possible overflow) and must be explicitly requested.
language_level (2/3/3str)
Globally set the Python language level to be used for module compilation. Default is compatibility with Python 2. To enable Python 3 source code semantics, set this to 3 (or 3str) at the start of a module or pass the “-3” or “--3str” command line options to the compiler. The 3str option enables Python 3 semantics but does not change the str type and unprefixed string literals to unicode when the compiled code runs in Python 2.x. Note that cimported files inherit this setting from the module being compiled, unless they explicitly set their own language level. Included source files always inherit this setting.
c_string_type (bytes / str / unicode)
Globally set the type of an implicit coercion from char* or std::string.
c_string_encoding (ascii, default, utf-8, etc.)
Globally set the encoding to use when implicitly coercing char* or std::string to a unicode object. Coercion from a unicode object to a C type is only allowed when set to ascii or default, the latter being utf-8 in Python 3 and nearly-always ascii in Python 2.
type_version_tag (True / False)
Enables the attribute cache for extension types in CPython by setting the type flag Py_TPFLAGS_HAVE_VERSION_TAG. Default is True, meaning that the cache is enabled for Cython implemented types. To disable it explicitly in the rare cases where a type needs to juggle with its tp_dict internally without paying attention to cache consistency, this option can be set to False.
unraisable_tracebacks (True / False)
Whether to print tracebacks when suppressing unraisable exceptions.
iterable_coroutine (True / False)
PEP 492 specifies that async-def coroutines must not be iterable, in order to prevent accidental misuse in non-async contexts. However, this makes it difficult and inefficient to write backwards compatible code that uses async-def coroutines in Cython but needs to interact with async Python code that uses the older yield-from syntax, such as asyncio before Python 3.5. This directive can be applied in modules or selectively as decorator on an async-def coroutine to make the affected coroutine(s) iterable and thus directly interoperable with yield-from.
Configurable optimisations
optimize.use_switch (True / False)
Whether to expand chained if-else statements (including statements like if x == 1 or x == 2:) into C switch statements. This can have performance benefits if there are lots of values but cause compiler errors if there are any duplicate values (which may not be detectable at Cython compile time for all C constants). Default is True.
optimize.unpack_method_calls (True / False)
Cython can generate code that optimistically checks for Python method objects at call time and unpacks the underlying function to call it directly. This can substantially speed up method calls, especially for builtins, but may also have a slight negative performance impact in some cases where the guess goes completely wrong. Disabling this option can also reduce the code size. Default is True.
Warnings

All warning directives take True / False as options to turn the warning on / off.

warn.undeclared (default False)
Warns about any variables that are implicitly declared without a cdef declaration
warn.unreachable (default True)
Warns about code paths that are statically determined to be unreachable, e.g. returning twice unconditionally.
warn.maybe_uninitialized (default False)
Warns about use of variables that are conditionally uninitialized.
warn.unused (default False)
Warns about unused variables and declarations
warn.unused_arg (default False)
Warns about unused function arguments
warn.unused_result (default False)
Warns about unused assignment to the same name, such as r = 2; r = 1 + 2
warn.multiple_declarators (default True)
Warns about multiple variables declared on the same line with at least one pointer type. For example cdef double* a, b - which, as in C, declares a as a pointer and b as a value type, but could be misinterpreted as declaring two pointers.
How to set directives
Globally

One can set compiler directives through a special header comment near the top of the file, like this:

# cython: language_level=3, boundscheck=False

The comment must appear before any code (but can appear after other comments or whitespace).

One can also pass a directive on the command line by using the -X switch:

$ cython -X boundscheck=True ...

Directives passed on the command line will override directives set in header comments.

Locally

For local blocks, you need to cimport the special builtin cython module:

cimport cython

Then you can use the directives either as decorators or in a with statement, like this:

@cython.boundscheck(False) # turn off boundscheck for this function
def f():
    ...
    # turn it temporarily on again for this block
    with cython.boundscheck(True):
        ...

Warning

These two methods of setting directives are not affected by overriding the directive on the command-line using the -X option.

In setup.py

Compiler directives can also be set in the setup.py file by passing a keyword argument to cythonize:

from distutils.core import setup
from Cython.Build import cythonize

setup(
    name="My hello app",
    ext_modules=cythonize('hello.pyx', compiler_directives={'embedsignature': True}),
)

This will override the default directives as specified in the compiler_directives dictionary. Note that explicit per-file or local directives as explained above take precedence over the values passed to cythonize.

Early Binding for Speed

As a dynamic language, Python encourages a programming style of considering classes and objects in terms of their methods and attributes, more than where they fit into the class hierarchy.

This can make Python a very relaxed and comfortable language for rapid development, but with a price - the ‘red tape’ of managing data types is dumped onto the interpreter. At run time, the interpreter does a lot of work searching namespaces, fetching attributes and parsing argument and keyword tuples. This run-time ‘late binding’ is a major cause of Python’s relative slowness compared to ‘early binding’ languages such as C++.

However with Cython it is possible to gain significant speed-ups through the use of ‘early binding’ programming techniques.

For example, consider the following (silly) code example:

cdef class Rectangle:
    cdef int x0, y0
    cdef int x1, y1

    def __init__(self, int x0, int y0, int x1, int y1):
        self.x0 = x0
        self.y0 = y0
        self.x1 = x1
        self.y1 = y1

    def area(self):
        area = (self.x1 - self.x0) * (self.y1 - self.y0)
        if area < 0:
            area = -area
        return area

def rectArea(x0, y0, x1, y1):
    rect = Rectangle(x0, y0, x1, y1)
    return rect.area()

In the rectArea() function, the call to rect.area() and the area() method itself incur a lot of Python overhead.

However, in Cython, it is possible to eliminate a lot of this overhead in cases where calls occur within Cython code. For example:

cdef class Rectangle:
    cdef int x0, y0
    cdef int x1, y1

    def __init__(self, int x0, int y0, int x1, int y1):
        self.x0 = x0
        self.y0 = y0
        self.x1 = x1
        self.y1 = y1

    cdef int _area(self):
        area = (self.x1 - self.x0) * (self.y1 - self.y0)
        if area < 0:
            area = -area
        return area

    def area(self):
        return self._area()

def rectArea(x0, y0, x1, y1):
    cdef Rectangle rect = Rectangle(x0, y0, x1, y1)
    return rect._area()

Here, in the Rectangle extension class, we have defined two different area calculation methods, the efficient _area() C method, and the Python-callable area() method which serves as a thin wrapper around _area(). Note also in the function rectArea() how we ‘early bind’ by declaring the local variable rect which is explicitly given the type Rectangle. By using this declaration, instead of just dynamically assigning to rect, we gain the ability to access the much more efficient C-callable _area() method.

But Cython offers us more simplicity again, by allowing us to declare dual-access methods - methods that can be efficiently called at C level, but can also be accessed from pure Python code at the cost of the Python access overheads. Consider this code:

cdef class Rectangle:
    cdef int x0, y0
    cdef int x1, y1

    def __init__(self, int x0, int y0, int x1, int y1):
        self.x0 = x0
        self.y0 = y0
        self.x1 = x1
        self.y1 = y1

    cpdef int area(self):
        area = (self.x1 - self.x0) * (self.y1 - self.y0)
        if area < 0:
            area = -area
        return area

def rectArea(x0, y0, x1, y1):
    rect = Rectangle(x0, y0, x1, y1)
    return rect.area()

Here, we just have a single area method, declared as cpdef to make it efficiently callable as a C function, but still accessible from pure Python (or late-binding Cython) code.

If within Cython code we have a variable already ‘early-bound’ (i.e. declared explicitly as type Rectangle, or cast to type Rectangle), then invoking its area method will use the efficient C code path and skip the Python overhead. But if in regular Python code (or untyped Cython code) we have an ordinary object variable storing a Rectangle object, then invoking the area method will require:

  • an attribute lookup for the area method
  • packing a tuple for arguments and a dict for keywords (both empty in this case)
  • using the Python API to call the method

and within the area method itself:

  • parsing the tuple and keywords
  • executing the calculation code
  • converting the result to a Python object and returning it

So within Cython, it is possible to achieve massive optimisations by using strong typing in declaration and casting of variables. For tight loops which use method calls, and where these methods are pure C, the difference can be huge.
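To see what the typed versions buy you, it helps to keep a pure-Python reference implementation at hand. The sketch below mirrors the arithmetic of the Rectangle classes above in plain Python (not Cython), where every attribute access and method call goes through the dynamic machinery described in the bullet points:

```python
class Rectangle:
    """Pure-Python mirror of the cdef Rectangle class above."""
    def __init__(self, x0, y0, x1, y1):
        self.x0, self.y0 = x0, y0
        self.x1, self.y1 = x1, y1

    def area(self):
        # Same calculation as the cpdef area() method, but each
        # self.* lookup is a dynamic attribute fetch at run time.
        area = (self.x1 - self.x0) * (self.y1 - self.y0)
        return -area if area < 0 else area

def rect_area(x0, y0, x1, y1):
    # Late-bound call: attribute lookup + argument packing on every call.
    return Rectangle(x0, y0, x1, y1).area()

print(rect_area(1, 2, 3, 4))  # 4
```

Timing this against the compiled cpdef version in a tight loop is a good way to observe the cost of late binding directly.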

Using C++ in Cython

Overview

Cython has native support for most of the C++ language. Specifically:

  • C++ objects can be dynamically allocated and deallocated with the new and del keywords.
  • C++ objects can be stack-allocated.
  • C++ classes can be declared with the Cython-specific cppclass keyword.
  • Templated classes and functions are supported.
  • Overloaded functions are supported.
  • Overloading of C++ operators (such as operator+, operator[],…) is supported.
Procedure Overview

The general procedure for wrapping a C++ file can now be described as follows:

  • Specify C++ language in a setup.py script or locally in a source file.
  • Create one or more .pxd files with cdef extern from blocks and (if existing) the C++ namespace name. In these blocks:
    • declare classes as cdef cppclass blocks
    • declare public names (variables, methods and constructors)
  • cimport them in one or more extension modules (.pyx files).

A simple Tutorial

An example C++ API

Here is a tiny C++ API which we will use as an example throughout this document. Let’s assume it will be in a header file called Rectangle.h:

#ifndef RECTANGLE_H
#define RECTANGLE_H

namespace shapes {
    class Rectangle {
        public:
            int x0, y0, x1, y1;
            Rectangle();
            Rectangle(int x0, int y0, int x1, int y1);
            ~Rectangle();
            int getArea();
            void getSize(int* width, int* height);
            void move(int dx, int dy);
    };
}

#endif

and the implementation in the file called Rectangle.cpp:

#include <iostream>
#include "Rectangle.h"

namespace shapes {

    // Default constructor
    Rectangle::Rectangle () {}

    // Overloaded constructor
    Rectangle::Rectangle (int x0, int y0, int x1, int y1) {
        this->x0 = x0;
        this->y0 = y0;
        this->x1 = x1;
        this->y1 = y1;
    }

    // Destructor
    Rectangle::~Rectangle () {}

    // Return the area of the rectangle
    int Rectangle::getArea () {
        return (this->x1 - this->x0) * (this->y1 - this->y0);
    }

    // Get the size of the rectangle.
    // Put the size in the pointer args
    void Rectangle::getSize (int *width, int *height) {
        (*width) = x1 - x0;
        (*height) = y1 - y0;
    }

    // Move the rectangle by dx dy
    void Rectangle::move (int dx, int dy) {
        this->x0 += dx;
        this->y0 += dy;
        this->x1 += dx;
        this->y1 += dy;
    }
}

This is pretty dumb, but should suffice to demonstrate the steps involved.

Declaring a C++ class interface

The procedure for wrapping a C++ class is quite similar to that for wrapping normal C structs, with a couple of additions. Let’s start here by creating the basic cdef extern from block:

cdef extern from "Rectangle.h" namespace "shapes":

This will make the C++ class definition for Rectangle available. Note the namespace declaration. Namespaces are simply used to form the fully qualified name of the object, and can be nested (e.g. "outer::inner") or even refer to classes (e.g. "namespace::MyClass" to declare static members on MyClass).

Declare class with cdef cppclass

Now, let’s add the Rectangle class to this extern from block - just copy the class name from Rectangle.h and adjust for Cython syntax, so now it becomes:

cdef extern from "Rectangle.h" namespace "shapes":
    cdef cppclass Rectangle:
Add public attributes

We now need to declare the attributes and methods for use in Cython. We put those declarations in a file called Rectangle.pxd. You can see it as a header file which is readable by Cython:

cdef extern from "Rectangle.cpp":
    pass

# Declare the class with cdef
cdef extern from "Rectangle.h" namespace "shapes":
    cdef cppclass Rectangle:
        Rectangle() except +
        Rectangle(int, int, int, int) except +
        int x0, y0, x1, y1
        int getArea()
        void getSize(int* width, int* height)
        void move(int, int)

Note that the constructor is declared as “except +”. If the C++ code or the initial memory allocation raises an exception due to a failure, this will let Cython safely raise an appropriate Python exception instead (see below). Without this declaration, C++ exceptions originating from the constructor will not be handled by Cython.

We use the lines:

cdef extern from "Rectangle.cpp":
    pass

to include the C++ code from Rectangle.cpp. It is also possible to specify to distutils that Rectangle.cpp is a source. To do that, you can add this directive at the top of the .pyx (not .pxd) file:

# distutils: sources = Rectangle.cpp

Note that when you use cdef extern from, the path that you specify is relative to the current file, but if you use the distutils directive, the path is relative to the setup.py. If you want to discover the path of the sources when running the setup.py, you can use the aliases argument of the cythonize() function.

Declare a var with the wrapped C++ class

We’ll create a .pyx file named rect.pyx to build our wrapper. We’re using a name other than Rectangle, but if you prefer giving the same name to the wrapper as the C++ class, see the section on resolving naming conflicts.

Within, we use cdef to declare a var of the class with the C++ new statement:

# distutils: language = c++

from Rectangle cimport Rectangle

def main():
    rec_ptr = new Rectangle(1, 2, 3, 4)  # Instantiate a Rectangle object on the heap
    try:
        rec_area = rec_ptr.getArea()
    finally:
        del rec_ptr  # delete heap allocated object

    cdef Rectangle rec_stack  # Instantiate a Rectangle object on the stack

The line:

# distutils: language = c++

is to indicate to Cython that this .pyx file has to be compiled to C++.

It’s also possible to declare a stack allocated object, as long as it has a “default” constructor:

cdef extern from "Foo.h":
    cdef cppclass Foo:
        Foo()

def func():
    cdef Foo foo
    ...

Note that, like C++, if the class has only one constructor and it is a nullary one, it’s not necessary to declare it.

Create Cython wrapper class

At this point, we have exposed into our pyx file’s namespace the interface of the C++ Rectangle type. Now, we need to make this accessible from external Python code (which is our whole point).

Common programming practice is to create a Cython extension type which holds a C++ instance as an attribute and create a bunch of forwarding methods. So we can implement the Python extension type as:

# distutils: language = c++

from Rectangle cimport Rectangle

# Create a Cython extension type which holds a C++ instance
# as an attribute and create a bunch of forwarding methods
# Python extension type.
cdef class PyRectangle:
    cdef Rectangle c_rect  # Hold a C++ instance which we're wrapping

    def __cinit__(self, int x0, int y0, int x1, int y1):
        self.c_rect = Rectangle(x0, y0, x1, y1)

    def get_area(self):
        return self.c_rect.getArea()

    def get_size(self):
        cdef int width, height
        self.c_rect.getSize(&width, &height)
        return width, height

    def move(self, dx, dy):
        self.c_rect.move(dx, dy)

And there we have it. From a Python perspective, this extension type will look and feel just like a natively defined Rectangle class. It should be noted that if you want to give attribute access, you could just implement some properties:

# distutils: language = c++

from Rectangle cimport Rectangle

cdef class PyRectangle:
    cdef Rectangle c_rect

    def __cinit__(self, int x0, int y0, int x1, int y1):
        self.c_rect = Rectangle(x0, y0, x1, y1)

    def get_area(self):
        return self.c_rect.getArea()

    def get_size(self):
        cdef int width, height
        self.c_rect.getSize(&width, &height)
        return width, height

    def move(self, dx, dy):
        self.c_rect.move(dx, dy)

    # Attribute access
    @property
    def x0(self):
        return self.c_rect.x0
    @x0.setter
    def x0(self, x0):
        self.c_rect.x0 = x0

    # Attribute access
    @property
    def x1(self):
        return self.c_rect.x1
    @x1.setter
    def x1(self, x1):
        self.c_rect.x1 = x1

    # Attribute access
    @property
    def y0(self):
        return self.c_rect.y0
    @y0.setter
    def y0(self, y0):
        self.c_rect.y0 = y0

    # Attribute access
    @property
    def y1(self):
        return self.c_rect.y1
    @y1.setter
    def y1(self, y1):
        self.c_rect.y1 = y1

Cython initializes C++ class attributes of a cdef class using the nullary constructor. If the class you’re wrapping does not have a nullary constructor, you must store a pointer to the wrapped class and manually allocate and deallocate it. A convenient and safe place to do so is in the __cinit__ and __dealloc__ methods which are guaranteed to be called exactly once upon creation and deletion of the Python instance.

# distutils: language = c++

from Rectangle cimport Rectangle

cdef class PyRectangle:
    cdef Rectangle*c_rect  # hold a pointer to the C++ instance which we're wrapping

    def __cinit__(self, int x0, int y0, int x1, int y1):
        self.c_rect = new Rectangle(x0, y0, x1, y1)

    def __dealloc__(self):
        del self.c_rect

Compilation and Importing

To compile a Cython module, it is necessary to have a setup.py file:

from distutils.core import setup

from Cython.Build import cythonize

setup(ext_modules=cythonize("rect.pyx"))

Run:

$ python setup.py build_ext --inplace

To test it, open the Python interpreter:

>>> import rect
>>> x0, y0, x1, y1 = 1, 2, 3, 4
>>> rect_obj = rect.PyRectangle(x0, y0, x1, y1)
>>> print(dir(rect_obj))
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__',
 '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__',
 '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
 '__setstate__', '__sizeof__', '__str__', '__subclasshook__', 'get_area', 'get_size', 'move']

Advanced C++ features

We describe here all the C++ features that were not discussed in the above tutorial.

Overloading

Overloading is very simple. Just declare the method with different parameters and use any of them:

cdef extern from "Foo.h":
    cdef cppclass Foo:
        Foo(int)
        Foo(bool)
        Foo(int, bool)
        Foo(int, int)
Overloading operators

Cython uses C++ naming for overloading operators:

cdef extern from "foo.h":
    cdef cppclass Foo:
        Foo()
        Foo operator+(Foo)
        Foo operator-(Foo)
        int operator*(Foo)
        int operator/(int)
        int operator*(int, Foo) # allows 1*Foo()
    # nonmember operators can also be specified outside the class
    double operator/(double, Foo)


cdef Foo foo, foo2

foo2 = foo + foo
foo2 = foo - foo

x = foo * foo2
x = foo / 1

x = 1 * foo  # uses the nonmember operator*(int, Foo)

cdef double y
y = 2.0 / foo  # uses the nonmember operator/(double, Foo)

Note that if one has pointers to C++ objects, dereferencing must be done to avoid doing pointer arithmetic rather than arithmetic on the objects themselves:

cdef Foo* foo_ptr = new Foo()
foo = foo_ptr[0] + foo_ptr[0]
x = foo_ptr[0] / 2

del foo_ptr
Nested class declarations

C++ allows nested class declaration. Class declarations can also be nested in Cython:

# distutils: language = c++

cdef extern from "<vector>" namespace "std":
    cdef cppclass vector[T]:
        cppclass iterator:
            T operator*()
            iterator operator++()
            bint operator==(iterator)
            bint operator!=(iterator)
        vector()
        void push_back(T&)
        T& operator[](int)
        T& at(int)
        iterator begin()
        iterator end()

cdef vector[int].iterator iter  # iter is declared as being of type vector<int>::iterator

Note that the nested class is declared with a cppclass but without a cdef, as it is already part of a cdef declaration section.

C++ operators not compatible with Python syntax

Cython tries to keep its syntax as close as possible to standard Python. Because of this, certain C++ operators, like the preincrement ++foo or the dereferencing operator *foo cannot be used with the same syntax as C++. Cython provides functions replacing these operators in a special module cython.operator. The functions provided are:

  • cython.operator.dereference for dereferencing. dereference(foo) will produce the C++ code *(foo)
  • cython.operator.preincrement for pre-incrementation. preincrement(foo) will produce the C++ code ++(foo). Similarly for predecrement, postincrement and postdecrement.
  • cython.operator.comma for the comma operator. comma(a, b) will produce the C++ code ((a), (b)).

These functions need to be cimported. Of course, one can use a from ... cimport ... as to have shorter and more readable functions. For example: from cython.operator cimport dereference as deref.

For completeness, it’s also worth mentioning cython.operator.address which can also be written &foo.
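These helpers work on plain C pointers as well as C++ objects. A tiny illustrative sketch (the names x and p are hypothetical):

```cython
from cython.operator cimport dereference as deref, address

def demo():
    cdef int x = 42
    cdef int* p = address(x)  # generates &x in the C code
    return deref(p)           # generates *(p) in the C code
```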

Templates

Cython uses a bracket syntax for templating. A simple example for wrapping C++ vector:

# distutils: language = c++

# import dereference and increment operators
from cython.operator cimport dereference as deref, preincrement as inc

cdef extern from "<vector>" namespace "std":
    cdef cppclass vector[T]:
        cppclass iterator:
            T operator*()
            iterator operator++()
            bint operator==(iterator)
            bint operator!=(iterator)
        vector()
        void push_back(T&)
        T& operator[](int)
        T& at(int)
        iterator begin()
        iterator end()

cdef vector[int] *v = new vector[int]()
cdef int i
for i in range(10):
    v.push_back(i)

cdef vector[int].iterator it = v.begin()
while it != v.end():
    print(deref(it))
    inc(it)

del v

Multiple template parameters can be defined as a list, such as [T, U, V] or [int, bool, char]. Optional template parameters can be indicated by writing [T, U, V=*]. In the event that Cython needs to explicitly reference the type of a default template parameter for an incomplete template instantiation, it will write MyClass<T, U>::V, so if the class provides a typedef for its template parameters it is preferable to use that name here.

Template functions are defined similarly to class templates, with the template parameter list following the function name:

# distutils: language = c++

cdef extern from "<algorithm>" namespace "std":
    T max[T](T a, T b)

print(max[long](3, 4))
print(max(1.5, 2.5))  # simple template argument deduction
Standard library

Most of the containers of the C++ Standard Library have been declared in pxd files located in /Cython/Includes/libcpp. These containers are: deque, list, map, pair, queue, set, stack, vector.

For example:

# distutils: language = c++

from libcpp.vector cimport vector

cdef vector[int] vect
cdef int i, x

for i in range(10):
    vect.push_back(i)

for i in range(10):
    print(vect[i])

for x in vect:
    print(x)

The pxd files in /Cython/Includes/libcpp also serve as good examples of how to declare C++ classes.

The STL containers coerce from and to the corresponding Python builtin types. The conversion is triggered either by an assignment to a typed variable (including typed function arguments) or by an explicit cast, e.g.:

# distutils: language = c++

from libcpp.string cimport string
from libcpp.vector cimport vector

py_bytes_object = b'The knights who say ni'
py_unicode_object = u'Those who hear them seldom live to tell the tale.'

cdef string s = py_bytes_object
print(s)  # b'The knights who say ni'

cdef string cpp_string = <string> py_unicode_object.encode('utf-8')
print(cpp_string)  # b'Those who hear them seldom live to tell the tale.'

cdef vector[int] vect = range(1, 10, 2)
print(vect)  # [1, 3, 5, 7, 9]

cdef vector[string] cpp_strings = b'It is a good shrubbery'.split()
print(cpp_strings[1])   # b'is'

The following coercions are available:

Python type       =>  C++ type           =>  Python type
bytes                 std::string            bytes
iterable              std::vector            list
iterable              std::list              list
iterable              std::set               set
iterable (len 2)      std::pair              tuple (len 2)

All conversions create a new container and copy the data into it. The items in the containers are converted to a corresponding type automatically, which includes recursively converting containers inside of containers, e.g. a C++ vector of maps of strings.
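As a sketch of such a recursive conversion, using the standard libcpp declarations (the data values are illustrative):

```cython
# distutils: language = c++

from libcpp.vector cimport vector
from libcpp.map cimport map
from libcpp.string cimport string

# A Python list of dicts converts recursively into a C++ vector of maps;
# each dict becomes a std::map and each bytes key a std::string.
cdef vector[map[string, int]] data = [{b'a': 1}, {b'b': 2, b'c': 3}]
```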

Iteration over STL containers (or indeed any class with begin() and end() methods returning an object supporting incrementing, dereferencing, and comparison) is supported via the for .. in syntax (including in list comprehensions). For example, one can write:

# distutils: language = c++

from libcpp.vector cimport vector

def main():
    cdef vector[int] v = [4, 6, 5, 10, 3]

    cdef int value
    for value in v:
        print(value)

    return [x*x for x in v if x % 2 == 0]

If the loop target variable is unspecified, an assignment from type *container.begin() is used for type inference.

Note

Slicing STL containers is supported: you can write for x in my_vector[:5]: .... However, unlike slices of pointers, this creates a temporary Python object and iterates over it, which makes the iteration very slow. You may want to avoid slicing C++ containers for performance reasons.

Simplified wrapping with default constructor

If your extension type instantiates a wrapped C++ class using the default constructor (not passing any arguments), you may be able to simplify the lifecycle handling by tying it directly to the lifetime of the Python wrapper object. Instead of a pointer attribute, you can declare an instance:

# distutils: language = c++

from libcpp.vector cimport vector


cdef class VectorStack:
    cdef vector[int] v

    def push(self, x):
        self.v.push_back(x)

    def pop(self):
        if self.v.empty():
            raise IndexError()
        x = self.v.back()
        self.v.pop_back()
        return x

Cython will automatically generate code that instantiates the C++ object instance when the Python object is created and deletes it when the Python object is garbage collected.

Exceptions

Cython cannot throw C++ exceptions or catch them with a try-except statement, but it is possible to declare a function as potentially raising a C++ exception and have it converted into a Python exception. For example,

cdef extern from "some_file.h":
    cdef int foo() except +

This declaration makes Cython wrap the call in a C++ try block and translate any C++ error into an appropriate Python exception. The translation is performed according to the following table (the std:: prefix is omitted from the C++ identifiers):

C++                  Python
bad_alloc            MemoryError
bad_cast             TypeError
bad_typeid           TypeError
domain_error         ValueError
invalid_argument     ValueError
ios_base::failure    IOError
out_of_range         IndexError
overflow_error       OverflowError
range_error          ArithmeticError
underflow_error      ArithmeticError
(all others)         RuntimeError

The what() message, if any, is preserved. Note that a C++ ios_base::failure can denote EOF, but does not carry enough information for Cython to discern that, so watch out with exception masks on IO streams.

cdef int bar() except +MemoryError

This will catch any C++ error and raise a Python MemoryError in its place. (Any Python exception is valid here.)

cdef int raise_py_error()
cdef int something_dangerous() except +raise_py_error

If something_dangerous raises a C++ exception then raise_py_error will be called, which allows one to do custom C++ to Python error “translations.” If raise_py_error does not actually raise an exception a RuntimeError will be raised.

There is also the special form:

cdef int raise_py_or_cpp() except +*

for those functions that may raise either a Python or a C++ exception.

Static member method

If the Rectangle class has a static member:

namespace shapes {
    class Rectangle {
    ...
    public:
        static void do_something();

    };
}

you can declare it using the Python @staticmethod decorator, i.e.:

cdef extern from "Rectangle.h" namespace "shapes":
    cdef cppclass Rectangle:
        ...
        @staticmethod
        void do_something()
Declaring/Using References

Cython supports declaring lvalue references using the standard Type& syntax. Note, however, that it is unnecessary to declare the arguments of extern functions as references (const or otherwise) as it has no impact on the caller’s syntax.
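A minimal sketch (the header and function names here are hypothetical): a reference parameter in an extern declaration does not change how the function is called from Cython:

```cython
# distutils: language = c++

cdef extern from "util.h":               # hypothetical header
    void scale(double& value, double factor)

cdef double x = 2.0
scale(x, 3.0)   # called with a plain variable; no & needed on the Cython side
```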

auto Keyword

Though Cython does not have an auto keyword, Cython local variables not explicitly typed with cdef are deduced from the types of the right hand side of all their assignments (see the infer_types compiler directive). This is particularly handy when dealing with functions that return complicated, nested, templated types, e.g.:

cdef vector[int] v = ...
it = v.begin()

(Though of course the for .. in syntax is preferred for objects supporting the iteration protocol.)

RTTI and typeid()

Cython has support for the typeid(...) operator.

from cython.operator cimport typeid

The typeid(...) operator returns an object of the type const type_info &.

If you want to store a type_info value in a C variable, you will need to store it as a pointer rather than a reference:

from libcpp.typeinfo cimport type_info
cdef const type_info* info = &typeid(MyClass)

If an invalid type is passed to typeid, it will throw an std::bad_typeid exception which is converted into a TypeError exception in Python.

An additional C++11-only RTTI-related class, std::type_index, is available in libcpp.typeindex.

Specify C++ language in setup.py

Instead of specifying the language and the sources in the source files, it is possible to declare them in the setup.py file:

from distutils.core import setup
from Cython.Build import cythonize

setup(ext_modules = cythonize(
           "rect.pyx",                 # our Cython source
           sources=["Rectangle.cpp"],  # additional source file(s)
           language="c++",             # generate C++ code
      ))

Cython will generate and compile the rect.cpp file (from rect.pyx), then compile Rectangle.cpp (the implementation of the Rectangle class) and link both object files together into rect.so on Linux or rect.pyd on Windows, which you can then import in Python using import rect. (If you forget to link Rectangle.o, you will get missing symbols when importing the library in Python.)

Note that the language option has no effect on user provided Extension objects that are passed into cythonize(). It is only used for modules found by file name (as in the example above).

The cythonize() function in Cython versions up to 0.21 does not recognize the language option and it needs to be specified as an option to an Extension that describes your extension and that is then handled by cythonize() as follows:

from distutils.core import setup, Extension
from Cython.Build import cythonize

setup(ext_modules = cythonize(Extension(
           "rect",                                # the extension name
           sources=["rect.pyx", "Rectangle.cpp"], # the Cython source and
                                                  # additional C++ source files
           language="c++",                        # generate and compile C++ code
      )))

The options can also be passed directly from the source file, which is often preferable (and overrides any global option). Starting with version 0.17, Cython also allows passing external source files into the cythonize() command this way. Here is a simplified setup.py file:

from distutils.core import setup
from Cython.Build import cythonize

setup(
    name = "rectangleapp",
    ext_modules = cythonize('*.pyx'),
)

And in the .pyx source file, write this into the first comment block, before any source code, to compile it in C++ mode and link it statically against the Rectangle.cpp code file:

# distutils: language = c++
# distutils: sources = Rectangle.cpp

Note

When using distutils directives, the paths are relative to the working directory of the distutils run (which is usually the project root where the setup.py resides).

To compile manually (e.g. using make), the cython command-line utility can be used to generate a C++ .cpp file, which can then be compiled into a Python extension. C++ mode for the cython command is turned on with the --cplus option.
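A sketch of such a manual build on Linux (compiler flags and the python3-config helper will vary with platform and Python version):

```shell
cython --cplus rect.pyx                    # produces rect.cpp
g++ -shared -fPIC $(python3-config --includes) \
    rect.cpp Rectangle.cpp -o rect.so      # compile and link the extension
```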

Caveats and Limitations

Access to C-only functions

Whenever generating C++ code, Cython generates declarations of and calls to functions assuming these functions are C++ (i.e., not declared as extern "C" {...}). This is fine if the C functions have C++ entry points, but if they’re C-only, you will hit a roadblock. If you have a C++ Cython module that needs to call pure-C functions, you will need to write a small C++ shim module which:

  • includes the needed C headers in an extern “C” block
  • contains minimal forwarding functions in C++, each of which calls the respective pure-C function
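A minimal shim of this kind might look as follows (the file, header and function names are purely illustrative):

```cpp
// c_shim.hpp -- hypothetical shim exposing a pure-C function to C++ code

extern "C" {
    #include "pure_c_lib.h"   // declares e.g.: int c_only_compute(int);
}

// minimal forwarding function with a C++ entry point
inline int compute_shim(int x) {
    return c_only_compute(x);
}
```

The Cython module then declares and calls compute_shim() instead of the pure-C function.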
C++ left-values

C++ allows functions returning a reference to be used as left-values. This is currently not supported in Cython. cython.operator.dereference(foo) is also not considered a left-value.

Fused Types (Templates)

Fused types allow you to have one type definition that can refer to multiple types. This allows you to write a single static-typed cython algorithm that can operate on values of multiple types. Thus fused types allow generic programming and are akin to templates in C++ or generics in languages like Java / C#.

Note

Fused types are not currently supported as attributes of extension types. Only variables and function/method arguments can be declared with fused types.

Quickstart

from __future__ import print_function

ctypedef fused char_or_float:
    char
    float


cpdef char_or_float plus_one(char_or_float var):
    return var + 1


def show_me():
    cdef:
        char a = 127
        float b = 127
    print('char', plus_one(a))
    print('float', plus_one(b))

This gives:

>>> show_me()
char -128
float 128.0

plus_one(a) “specializes” the fused type char_or_float as a char, whereas plus_one(b) specializes char_or_float as a float.

Declaring Fused Types

Fused types may be declared as follows:

cimport cython

ctypedef fused my_fused_type:
    cython.int
    cython.double

This declares a new type called my_fused_type which can be either an int or a double. Alternatively, the declaration may be written as:

my_fused_type = cython.fused_type(cython.int, cython.double)

Only names may be used for the constituent types, but they may be any (non-fused) type, including a typedef. i.e. one may write:

ctypedef double my_double
my_fused_type = cython.fused_type(cython.int, my_double)

Using Fused Types

Fused types can be used to declare parameters of functions or methods:

cdef cfunc(my_fused_type arg):
    return arg + 1

If you use the same fused type more than once in an argument list, then each specialization of the fused type must be the same:

cdef cfunc(my_fused_type arg1, my_fused_type arg2):
    return cython.typeof(arg1) == cython.typeof(arg2)

In this case, the type of both parameters is either an int, or a double (according to the previous examples). However, because these arguments use the same fused type my_fused_type, both arg1 and arg2 are specialized to the same type. Therefore this function returns True for every possible valid invocation. You are allowed to mix fused types however:

def func(A x, B y):
    ...

where A and B are different fused types. This will result in specialized code paths for all combinations of types contained in A and B.

Fused types and arrays

Note that a fused type containing only numeric types may not be very useful, as one can usually rely on promotion of types. This is not true for arrays, pointers and typed views of memory, however. Indeed, one may write:

def myfunc(A[:, :] x):
    ...

# and

cdef otherfunc(A *x):
    ...

Note that in Cython 0.20.x and earlier, the compiler generated the full cross product of all type combinations when a fused type was used by more than one memory view in a type signature, e.g.

def myfunc(A[:] a, A[:] b):
    # a and b had independent item types in Cython 0.20.x and earlier.
    ...

This was unexpected for most users, unlikely to be desired, and also inconsistent with other structured type declarations like C arrays of fused types, which were considered the same type. It was thus changed in Cython 0.21 to use the same type for all memory views of a fused type. In order to get the original behaviour, it suffices to declare the same fused type under different names, and then use these in the declarations:

ctypedef fused A:
    int
    long

ctypedef fused B:
    int
    long

def myfunc(A[:] a, B[:] b):
    # a and b are independent types here and may have different item types
    ...

To get only identical types also in older Cython versions (pre-0.21), a ctypedef can be used:

ctypedef A[:] A_1d

def myfunc(A_1d a, A_1d b):
    # a and b have identical item types here, also in older Cython versions
    ...

Selecting Specializations

You can select a specialization (an instance of the function with specific or specialized (i.e., non-fused) argument types) in two ways: either by indexing or by calling.

Indexing

You can index functions with types to get certain specializations, i.e.:

cfunc[cython.p_double](p1, p2)

# From Cython space
func[float, double](myfloat, mydouble)

# From Python space
func[cython.float, cython.double](myfloat, mydouble)

If a fused type is used as the base type of a composite type (such as a pointer to a fused type), it is the base type that needs to be specialized:

cdef myfunc(A *x):
    ...

# Specialize using int, not int *
myfunc[int](myint)
Calling

A fused function can also be called with arguments, where the dispatch is figured out automatically:

cfunc(p1, p2)
func(myfloat, mydouble)

For a cdef or cpdef function called from Cython this means that the specialization is figured out at compile time. For def functions the arguments are typechecked at runtime, and a best-effort attempt is made to figure out which specialization is needed. This may result in a runtime TypeError if no specialization is found. A cpdef function is treated the same way as a def function if the type of the function is unknown (e.g. if it is external and there is no cimport for it).

The automatic dispatching rules are typically as follows, in order of preference:

  • try to find an exact match
  • choose the biggest corresponding numerical type (biggest float, biggest complex, biggest int)

Built-in Fused Types

There are some built-in fused types available for convenience, these are:

cython.integral # short, int, long
cython.floating # float, double
cython.numeric  # short, int, long, float, double, float complex, double complex
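For example, a single definition using one of these built-in fused types yields a specialization for each constituent type (a sketch; the function name is illustrative):

```cython
cimport cython

# one definition covers the short, int and long specializations
cdef cython.integral imax(cython.integral a, cython.integral b):
    return a if a > b else b
```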

Casting Fused Functions

Fused cdef and cpdef functions may be cast or assigned to C function pointers as follows:

cdef myfunc(cython.floating, cython.integral):
    ...

# assign directly
cdef object (*funcp)(float, int)
funcp = myfunc
funcp(f, i)

# alternatively, cast it
(<object (*)(float, int)> myfunc)(f, i)

# This is also valid
funcp = myfunc[float, int]
funcp(f, i)

Type Checking Specializations

Decisions can be made based on the specializations of the fused parameters, and false conditions are pruned to avoid generating invalid code. One may check with is, is not, == and != whether a fused type is equal to a certain non-fused type (to check the specialization), or use in and not in to figure out whether a specialization is part of another set of types (specified as a fused type). For example:

ctypedef fused bunch_of_types:
    ...

ctypedef fused string_t:
    cython.p_char
    bytes
    unicode

cdef cython.integral myfunc(cython.integral i, bunch_of_types s):
    cdef int *int_pointer
    cdef long *long_pointer

    # Only one of these branches will be compiled for each specialization!
    if cython.integral is int:
        int_pointer = &i
    else:
        long_pointer = &i

    if bunch_of_types in string_t:
        print("s is a string!")

__signatures__

Finally, function objects from def or cpdef functions have an attribute __signatures__, which maps the signature strings to the actual specialized functions. This may be useful for inspection. Listed signature strings may also be used as indices to the fused function, but the index format may change between Cython versions:

specialized_function = fused_function["MyExtensionClass|int|float"]

It would usually be preferred to index like this, however:

specialized_function = fused_function[MyExtensionClass, int, float]

Note that from Python space the latter will select the biggest types for int and float, as they are builtin Python types there rather than type identifiers. Passing cython.int and cython.float instead resolves that.

For memoryview indexing from python space we can do the following:

ctypedef fused my_fused_type:
    int[:, ::1]
    float[:, ::1]

def func(my_fused_type array):
    ...

func[cython.int[:, ::1]](myarray)

The same goes for when using e.g. cython.numeric[:, :].

Porting Cython code to PyPy

Cython has basic support for cpyext, the layer in PyPy that emulates CPython’s C-API. This is achieved by making the generated C code adapt at C compile time, so the generated code will compile in both CPython and PyPy unchanged.

However, beyond what Cython can cover and adapt internally, the cpyext C-API emulation involves some differences to the real C-API in CPython that have a visible impact on user code. This page lists major differences and ways to deal with them in order to write Cython code that works in both CPython and PyPy.

Reference counts

A general design difference in PyPy is that the runtime does not use reference counting internally but always a garbage collector. Reference counting is only emulated at the cpyext layer by counting references being held in C space. This implies that the reference count in PyPy is generally different from that in CPython because it does not count any references held in Python space.

Object lifetime

As a direct consequence of the different garbage collection characteristics, objects may see the end of their lifetime at other points than in CPython. Special care therefore has to be taken when objects are expected to have died in CPython but may not in PyPy. Specifically, a deallocator method of an extension type (__dealloc__()) may get called at a much later point than in CPython, triggered rather by memory getting tighter than by objects dying.

If the point in the code is known when an object is supposed to die (e.g. when it is tied to another object or to the execution time of a function), it is worth considering if it can be invalidated and cleaned up manually at that point, rather than relying on a deallocator.

As a side effect, this can sometimes even lead to a better code design, e.g. when context managers can be used together with the with statement.
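The pattern can be sketched in plain Python (the Resource class is hypothetical): tying cleanup to a with block gives a deterministic point of release, instead of waiting for a deallocator that PyPy's garbage collector may run much later.

```python
# Sketch: deterministic cleanup via a context manager rather than a
# deallocator whose timing depends on the garbage collector.
class Resource:
    def __init__(self):
        self.closed = False

    def close(self):
        # release the underlying C-level state here (sketch)
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions

with Resource() as r:
    pass  # use the resource
# at this point r.close() has run, regardless of garbage collection timing
```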

Borrowed references and data pointers

The memory management in PyPy is allowed to move objects around in memory. The C-API layer is only an indirect view on PyPy objects and often replicates data or state into C space that is then tied to the lifetime of a C-API object rather than the underlying PyPy object. It is important to understand that these two objects are separate things in cpyext.

The effect can be that when data pointers or borrowed references are used, and the owning object is no longer directly referenced from C space, the reference or data pointer may become invalid at some point, even if the object itself is still alive. As opposed to CPython, it is not enough to keep the reference to the object alive in a list (or other Python container), because the contents of such containers are managed only in Python space and thus reference only the PyPy object. A reference in a Python container will not keep the C-API view on it alive. Entries in a Python class dict will obviously not work either.

One of the more visible places where this may happen is when accessing the char* buffer of a byte string. In PyPy, this will only work as long as the Cython code holds a direct reference to the byte string object itself.
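A sketch of the safe pattern (the function is illustrative): keeping an owned reference in a typed variable ties the pointer's validity to a name that is provably alive in C space.

```cython
def checksum(data):
    cdef bytes owned = data        # owned reference kept in a typed variable
    cdef char* buf = owned         # pointer is tied to 'owned', not to 'data'
    cdef int total = 0, i
    for i in range(len(owned)):
        total += buf[i]
    return total                   # 'owned' stays alive for the whole loop
```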

Another point is when CPython C-API functions are used directly that return borrowed references, e.g. PyTuple_GET_ITEM() and similar functions, but also some functions that return borrowed references to built-in modules or low-level objects of the runtime environment. The GIL in PyPy only guarantees that the borrowed reference stays valid up to the next call into PyPy (or its C-API), but not necessarily longer.

When accessing the internals of Python objects or using borrowed references longer than up to the next call into PyPy, including reference counting or anything that frees the GIL, it is therefore required to additionally keep direct owned references to these objects alive in C space, e.g. in local variables in a function or in the attributes of an extension type.

When in doubt, avoid using C-API functions that return borrowed references, or surround the usage of a borrowed reference explicitly by a pair of calls to Py_INCREF() when getting the reference and Py_DECREF() when done with it to convert it into an owned reference.
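The Py_INCREF()/Py_DECREF() pairing can be sketched like this (the function body is illustrative):

```cython
from cpython.ref cimport Py_INCREF, Py_DECREF

cdef void use_borrowed(object obj):
    # turn the borrowed reference into an owned one for the duration of use
    Py_INCREF(obj)
    try:
        pass  # ... work with obj, possibly calling back into the runtime ...
    finally:
        Py_DECREF(obj)
```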

Builtin types, slots and fields

The following builtin types are not currently available in cpyext in form of their C level representation: PyComplexObject, PyFloatObject and PyBoolObject.

Many of the type slot functions of builtin types are not initialised in cpyext and can therefore not be used directly.

Similarly, almost none of the implementation-specific struct fields of builtin types are exposed at the C level, such as the ob_digit field of PyLongObject or the allocated field of the PyListObject struct. Although the ob_size field of containers (used by the Py_SIZE() macro) is available, it is not guaranteed to be accurate.

It is best not to access any of these struct fields and slots and to use the normal Python types instead as well as the normal Python protocols for object operations. Cython will map them to an appropriate usage of the C-API in both CPython and cpyext.

GIL handling

Currently, the GIL handling function PyGILState_Ensure() is not re-entrant in PyPy and deadlocks when called twice. This means that code that tries to acquire the GIL “just in case”, because it might be called with or without the GIL, will not work as expected in PyPy. See PyGILState_Ensure should not deadlock if GIL already held.

Efficiency

Simple functions and especially macros that are used for speed in CPython may exhibit substantially different performance characteristics in cpyext.

Functions returning borrowed references were already mentioned as requiring special care, but they also induce substantially more runtime overhead because they often create weak references in PyPy where they only return a plain pointer in CPython. A visible example is PyTuple_GET_ITEM().

Some more high-level functions may also show entirely different performance characteristics, e.g. PyDict_Next() for dict iteration. While being the fastest way to iterate over a dict in CPython, having linear time complexity and a low overhead, it currently has quadratic runtime in PyPy because it maps to normal dict iteration, which cannot keep track of the current position between two calls and thus needs to restart the iteration on each call.

The general advice applies here even more than in CPython: it is always better to rely on Cython generating appropriately adapted C-API handling code for you than to use the C-API directly - unless you really know what you are doing. And if you find a better way of doing something in PyPy and cpyext than Cython currently does, it’s best to fix Cython for everyone’s benefit.

Known problems

  • As of PyPy 1.9, subtyping builtin types can result in infinite recursion on method calls in some rare cases.
  • Docstrings of special methods are not propagated to Python space.
  • The Python 3.x adaptations in pypy3 only slowly start to include the C-API, so more incompatibilities can be expected there.

Bugs and crashes

The cpyext implementation in PyPy is much younger and substantially less mature than the well tested C-API and its underlying native implementation in CPython. This should be remembered when running into crashes, as the problem may not always be in your code or in Cython. Also, PyPy and its cpyext implementation are less easy to debug at the C level than CPython and Cython, simply because they were not designed for it.

Limitations

This page used to list bugs in Cython that made the semantics of compiled code differ from that in Python. Most of the missing features have been fixed in Cython 0.15. Note that a future version 1.0 of Cython is planned to provide full Python language compatibility.

Below is a list of differences that we will probably not be addressing. Most of these fall more into the category of implementation details than semantics, and we may decide not to fix them (or to require a --pedantic flag to get the Python behaviour).

Nested tuple argument unpacking

def f((a,b), c):
    pass

This was removed in Python 3.

Inspect support

While it is quite possible to emulate the interface of functions in Cython’s own function type, and recent Cython releases have seen several improvements here, the “inspect” module does not consider a Cython implemented function a “function”, because it tests the object type explicitly instead of comparing an abstract interface or an abstract base class. This has a negative impact on code that uses inspect to inspect function objects, but would require a change to Python itself.

Stack frames

Currently we generate fake tracebacks as part of exception propagation, but don’t fill in locals and can’t fill in co_code. To be fully compatible, we would have to generate these stack frame objects at function call time (with a potential performance penalty). We may have an option to enable this for debugging.

Identity vs. equality for inferred literals

a = 1.0          # a inferred to be C type 'double'
b = c = None     # b and c inferred to be type 'object'
if some_runtime_expression:
    b = a        # creates a new Python float object
    c = a        # creates a new Python float object
print(b is c)     # most likely not the same object

Differences between Cython and Pyrex

Warning

Both Cython and Pyrex are moving targets. It has come to the point that an explicit list of all the differences between the two projects would be laborious to compile and track, but hopefully this high-level list gives an idea of the differences that are present. It should be noted that both projects make an effort at mutual compatibility, but Cython’s goal is to stay as close to and as compatible with Python as is reasonable.

Python 3 Support

Cython creates .c files that can be built and used with both Python 2.x and Python 3.x. In fact, compiling your module with Cython may very well be an easy way to port code to Python 3.

Cython also supports various syntax additions that came with Python 3.0 and later major Python releases. If they do not conflict with existing Python 2.x syntax or semantics, they are usually just accepted by the compiler. Everything else depends on the compiler directive language_level=3 (see compiler directives).

List/Set/Dict Comprehensions

Cython supports the different comprehensions defined by Python 3 for lists, sets and dicts:

[expr(x) for x in A]             # list
{expr(x) for x in A}             # set
{key(x) : value(x) for x in A}   # dict

Looping is optimized if A is a list, tuple or dict. You can use the for ... from syntax, too, but it is generally preferred to use the usual for ... in range(...) syntax with a C run variable (e.g. cdef int i).

Note that Cython also supports set literals starting from Python 2.4.

Keyword-only arguments

Python functions can have keyword-only arguments listed after the * parameter and before the ** parameter if any, e.g.:

def f(a, b, *args, c, d = 42, e, **kwds):
    ...

Here c, d and e cannot be passed as positional arguments and must be passed as keyword arguments. Furthermore, c and e are required keyword arguments, since they do not have a default value.

If the parameter name after the * is omitted, the function will not accept any extra positional arguments, e.g.:

def g(a, b, *, c, d):
    ...

takes exactly two positional parameters and has two required keyword parameters.

Conditional expressions “x if b else y”

Conditional expressions as described in https://www.python.org/dev/peps/pep-0308/:

X if C else Y

Only one of X and Y is evaluated (depending on the value of C).

cdef inline

Module-level functions can now be declared inline, with the inline keyword passed on to the C compiler. These can be as fast as macros:

cdef inline int something_fast(int a, int b):
    return a*a + b

Note that class-level cdef functions are handled via a virtual function table, so the compiler won’t be able to inline them in almost all cases.

Assignment on declaration (e.g. “cdef int spam = 5”)

In Pyrex, one must write:

cdef int i, j, k
i = 2
j = 5
k = 7

Now, with cython, one can write:

cdef int i = 2, j = 5, k = 7

The expression on the right hand side can be arbitrarily complicated, e.g.:

cdef int n = python_call(foo(x,y), a + b + c) - 32

‘by’ expression in for loop (e.g. “for i from 0 <= i < 10 by 2”)

for i from 0 <= i < 10 by 2:
    print i

yields:

0
2
4
6
8

Note

Usage of this syntax is discouraged as it is redundant with the normal Python for loop. See Automatic range conversion.

Boolean int type (i.e. it acts like a C int, but coerces to/from Python as a boolean)

In C, ints are used for truth values. In Python, any object can be used as a truth value (via the __nonzero__() method), but the canonical choices are the two boolean objects True and False. The bint (for “boolean int”) type is compiled to a C int, but coerces to and from Python as booleans. The return type of comparisons and several builtins is a bint as well. This reduces the need for wrapping things in bool(). For example, one can write:

def is_equal(x, y):
    return x == y

which would return 1 or 0 in Pyrex, but returns True or False in Cython. One can declare variables and return values for functions to be of the bint type. For example:

cdef int i = x
cdef bint b = x

The first conversion would happen via x.__int__() whereas the second would happen via x.__bool__() (a.k.a. __nonzero__()), with appropriate optimisations for known builtin types.

Executable class bodies

Including a working classmethod():

cdef class Blah:
    def some_method(self):
        print self
    some_method = classmethod(some_method)
    a = 2*3
    print "hi", a

cpdef functions

Cython adds a third function type on top of the usual def and cdef. If a function is declared cpdef it can be called from and overridden by both extension and normal python subclasses. You can essentially think of a cpdef method as a cdef method + some extras. (That’s how it’s implemented at least.) First, it creates a def method that does nothing but call the underlying cdef method (and does argument unpacking/coercion if needed). At the top of the cdef method a little bit of code is added to see if it’s overridden, similar to the following pseudocode:

if hasattr(type(self), '__dict__'):
    foo = self.foo
    if foo is not wrapper_foo:
        return foo(args)
[cdef method body]

To detect whether or not a type has a dictionary, it just checks the tp_dictoffset slot, which is NULL (by default) for extension types, but non-null for instance classes. If the dictionary exists, it does a single attribute lookup and can tell (by comparing pointers) whether or not the returned result is actually a new function. If, and only if, it is a new function, then the arguments are packed into a tuple and the method is called. This is all very fast. A flag is set so this lookup does not occur if one calls the method on the class directly, e.g.:

cdef class A:
    cpdef foo(self):
        pass

x = A()
x.foo()  # will check to see if overridden
A.foo(x) # will call A's implementation whether overridden or not

See Early Binding for Speed for explanation and usage tips.
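A rough pure-Python model of this dispatch can make the mechanism concrete; the names _foo_c and call_foo_fast are illustrative stand-ins for the generated cdef body and a cdef-level call site, not anything Cython actually emits:

```python
class A:
    def _foo_c(self):
        # stands in for the fast cdef implementation of foo()
        return "A"

    def foo(self):
        # the generated "def" wrapper: forwards to the cdef body
        return self._foo_c()

    def call_foo_fast(self):
        # a cdef-level call site: check for an override before
        # taking the fast path, mirroring the pseudocode above
        if type(self).foo is not A.foo:
            return self.foo()      # overridden: go through Python dispatch
        return self._foo_c()       # not overridden: direct (fast) call

class B(A):
    def foo(self):                 # Python-level override of the cpdef method
        return "B"
```

With this model, A().call_foo_fast() takes the fast path while B().call_foo_fast() detects the override by pointer comparison and dispatches through Python, just as described above.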

Automatic range conversion

This will convert statements of the form for i in range(...) to for i from ... when i is any cdef’d integer type, and the direction (i.e. sign of step) can be determined.

Warning

This may change the semantics if the range causes assignment to i to overflow. Specifically, if this option is set, an error will be raised before the loop is entered, whereas without this option the loop will execute until an overflowing value is encountered. If this affects you, change Cython/Compiler/Options.py (eventually there will be a better way to set this).

More friendly type casting

In Pyrex, if one types <int>x where x is a Python object, one will get the memory address of x. Likewise, if one types <object>i where i is a C int, one will get an “object” at location i in memory. This leads to confusing results and segfaults.

In Cython <type>x will try to do a coercion (as would happen on assignment of x to a variable of type type) if exactly one of the types is a Python object. It does not stop one from casting where there is no conversion (though it will emit a warning). If one really wants the address, cast to a void * first.

As in Pyrex <MyExtensionType>x will cast x to type MyExtensionType without any type checking. Cython supports the syntax <MyExtensionType?> to do the cast with type checking (i.e. it will throw an error if x is not a (subclass of) MyExtensionType).

Optional arguments in cdef/cpdef functions

Cython now supports optional arguments for cdef and cpdef functions.

The syntax in the .pyx file remains as in Python, but one declares such functions in the .pxd file by writing cdef foo(x=*). The number of arguments may increase on subclassing, but the argument types and order must remain the same. There is a slight performance penalty in some cases when a cdef/cpdef function without any optional arguments is overridden with one that does have default argument values.

For example, one can have the .pxd file:

cdef class A:
    cdef foo(self)

cdef class B(A):
    cdef foo(self, x=*)

cdef class C(B):
    cpdef foo(self, x=*, int k=*)

with corresponding .pyx file:

from __future__ import print_function

cdef class A:
    cdef foo(self):
        print("A")

cdef class B(A):
    cdef foo(self, x=None):
        print("B", x)

cdef class C(B):
    cpdef foo(self, x=True, int k=3):
        print("C", x, k)

Note

this also demonstrates how cpdef functions can override cdef functions.

Function pointers in structs

Functions declared in a struct are automatically converted to function pointers for convenience.

C++ Exception handling

cdef functions can now be declared as:

cdef int foo(...) except +
cdef int foo(...) except +TypeError
cdef int foo(...) except +python_error_raising_function

in which case a Python exception will be raised when a C++ error is caught. See Using C++ in Cython for more details.

Synonyms

cdef import from means the same thing as cdef extern from

Source code encoding

Cython supports PEP 3120 and PEP 263, i.e. you can start your Cython source file with an encoding comment and generally write your source code in UTF-8. This impacts the encoding of byte strings and the conversion of unicode string literals like u'abcd' to unicode objects.

Automatic typecheck

Rather than introducing a new keyword typecheck as explained in the Pyrex docs, Cython emits a (non-spoofable and faster) typecheck whenever isinstance() is used with an extension type as the second parameter.

From __future__ directives

Cython supports several from __future__ import ... directives, namely absolute_import, unicode_literals, print_function and division.

With statements are always enabled.

Pure Python mode

Cython has support for compiling .py files, and accepting type annotations using decorators and other valid Python syntax. This allows the same source to be interpreted as straight Python, or compiled for optimized results. See Pure Python Mode for more details.

Typed Memoryviews

Typed memoryviews allow efficient access to memory buffers, such as those underlying NumPy arrays, without incurring any Python overhead. Memoryviews are similar to the current NumPy array buffer support (np.ndarray[np.float64_t, ndim=2]), but they have more features and cleaner syntax.

Memoryviews are more general than the old NumPy array buffer support, because they can handle a wider variety of sources of array data. For example, they can handle C arrays and the Cython array type (Cython arrays).

A memoryview can be used in any context (function parameters, module-level, cdef class attribute, etc.) and can be obtained from nearly any object that exposes a writable buffer through the PEP 3118 buffer interface.

Quickstart

If you are used to working with NumPy, the following examples should get you started with Cython memory views.

from cython.view cimport array as cvarray
import numpy as np

# Memoryview on a NumPy array
narr = np.arange(27, dtype=np.dtype("i")).reshape((3, 3, 3))
cdef int [:, :, :] narr_view = narr

# Memoryview on a C array
cdef int carr[3][3][3]
cdef int [:, :, :] carr_view = carr

# Memoryview on a Cython array
cyarr = cvarray(shape=(3, 3, 3), itemsize=sizeof(int), format="i")
cdef int [:, :, :] cyarr_view = cyarr

# Show the sum of all the arrays before altering it
print("NumPy sum of the NumPy array before assignments: %s" % narr.sum())

# We can copy the values from one memoryview into another using a single
# statement, by either indexing with ... or (NumPy-style) with a colon.
carr_view[...] = narr_view
cyarr_view[:] = narr_view
# NumPy-style syntax for assigning a single value to all elements.
narr_view[:, :, :] = 3

# Just to distinguish the arrays
carr_view[0, 0, 0] = 100
cyarr_view[0, 0, 0] = 1000

# Assigning into the memoryview on the NumPy array alters the latter
print("NumPy sum of NumPy array after assignments: %s" % narr.sum())

# A function using a memoryview does not usually need the GIL
cpdef int sum3d(int[:, :, :] arr) nogil:
    cdef size_t i, j, k
    cdef int total = 0
    I = arr.shape[0]
    J = arr.shape[1]
    K = arr.shape[2]
    for i in range(I):
        for j in range(J):
            for k in range(K):
                total += arr[i, j, k]
    return total

# A function accepting a memoryview knows how to use a NumPy array,
# a C array, a Cython array...
print("Memoryview sum of NumPy array is %s" % sum3d(narr))
print("Memoryview sum of C array is %s" % sum3d(carr))
print("Memoryview sum of Cython array is %s" % sum3d(cyarr))
# ... and of course, a memoryview.
print("Memoryview sum of C memoryview is %s" % sum3d(carr_view))

This code should give the following output:

NumPy sum of the NumPy array before assignments: 351
NumPy sum of NumPy array after assignments: 81
Memoryview sum of NumPy array is 81
Memoryview sum of C array is 451
Memoryview sum of Cython array is 1351
Memoryview sum of C memoryview is 451

Using memoryviews

Syntax

Memory views use Python slicing syntax in a similar way as NumPy.

To create a complete view on a one-dimensional int buffer:

cdef int[:] view1D = exporting_object

A complete 3D view:

cdef int[:,:,:] view3D = exporting_object

A 2D view that restricts the first dimension of a buffer to 100 rows starting at the second (index 1) and then skips every second (odd) row:

cdef int[1:102:2,:] partial_view = exporting_object

This also works conveniently as function arguments:

def process_3d_buffer(int[1:102:2,:] view not None):
    ...

The not None declaration for the argument automatically rejects None values as input, which would otherwise be allowed. The reason why None is allowed by default is that it is conveniently used for return arguments:

import numpy as np

def process_buffer(int[:,:] input_view not None,
                   int[:,:] output_view=None):

   if output_view is None:
       # Creating a default view, e.g.
       output_view = np.empty_like(input_view)

   # process 'input_view' into 'output_view'
   return output_view

Cython will reject incompatible buffers automatically, e.g. passing a three dimensional buffer into a function that requires a two dimensional buffer will raise a ValueError.
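The dimensionality check can be mimicked from pure Python, since CPython’s built-in memoryview exposes the same ndim metadata that Cython inspects; require_2d is a hypothetical helper, not a Cython API:

```python
def require_2d(buf):
    """Hypothetical helper mirroring Cython's runtime dimensionality check."""
    mv = memoryview(buf)
    if mv.ndim != 2:
        raise ValueError(
            "Buffer has wrong number of dimensions (expected 2, got %d)" % mv.ndim)
    return mv

flat = bytes(8)
ok = require_2d(memoryview(flat).cast('b', (2, 4)))   # 2D buffer: accepted
bad = memoryview(flat).cast('b', (2, 2, 2))           # 3D buffer: rejected below
```

Calling require_2d(bad) raises a ValueError, just as passing a three-dimensional buffer to a function declared with a two-dimensional memoryview does in Cython.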

Indexing

In Cython, index access on memory views is automatically translated into memory addresses. The following code requests a two-dimensional memory view of C int typed items and indexes into it:

cdef int[:,:] buf = exporting_object

print(buf[1,2])

Negative indices work as well, counting from the end of the respective dimension:

print(buf[-1,-2])

The following function loops over each dimension of a 2D array and adds 1 to each item:

import numpy as np

def add_one(int[:,:] buf):
    for x in range(buf.shape[0]):
        for y in range(buf.shape[1]):
            buf[x, y] += 1

# exporting_object must be a Python object
# implementing the buffer interface, e.g. a numpy array.
exporting_object = np.zeros((10, 20), dtype=np.intc)

add_one(exporting_object)

Indexing and slicing can be done with or without the GIL. It basically works like NumPy. If indices are specified for every dimension you will get an element of the base type (e.g. int). Otherwise, you will get a new view. An Ellipsis means you get consecutive slices for every unspecified dimension:

import numpy as np

exporting_object = np.arange(0, 15 * 10 * 20, dtype=np.intc).reshape((15, 10, 20))

cdef int[:, :, :] my_view = exporting_object

# These are all equivalent
my_view[10]
my_view[10, :, :]
my_view[10, ...]
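The equivalence can be sketched in pure Python; expand_index is a hypothetical helper showing how an Ellipsis (or missing trailing indices) expands into full slices:

```python
def expand_index(index, ndim):
    """Expand an index against an ndim-dimensional view, so that
    view[10], view[10, :, :] and view[10, ...] all normalize to
    the same full index tuple."""
    if not isinstance(index, tuple):
        index = (index,)
    if Ellipsis in index:
        # replace the Ellipsis with as many full slices as needed
        pos = index.index(Ellipsis)
        fill = ndim - (len(index) - 1)
        index = index[:pos] + (slice(None),) * fill + index[pos + 1:]
    else:
        # pad missing trailing dimensions with full slices
        index = index + (slice(None),) * (ndim - len(index))
    return index
```

For a 3D view, expand_index(10, 3), expand_index((10, Ellipsis), 3) and expand_index((10, slice(None), slice(None)), 3) all produce the same tuple.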

Copying

Memory views can be copied in place:

import numpy as np

cdef int[:, :, :] to_view, from_view
to_view = np.empty((20, 15, 30), dtype=np.intc)
from_view = np.ones((20, 15, 30), dtype=np.intc)

# copy the elements in from_view to to_view
to_view[...] = from_view
# or
to_view[:] = from_view
# or
to_view[:, :, :] = from_view

They can also be copied with the copy() and copy_fortran() methods; see C and Fortran contiguous copies.

Transposing

In most cases (see below), the memoryview can be transposed in the same way that NumPy slices can be transposed:

import numpy as np

array = np.arange(20, dtype=np.intc).reshape((2, 10))

cdef int[:, ::1] c_contig = array
cdef int[::1, :] f_contig = c_contig.T

This gives a new, transposed, view on the data.

Transposing requires that all dimensions of the memoryview have a direct access memory layout (i.e., there are no indirections through pointers). See Specifying more general memory layouts for details.

Newaxis

As for NumPy, new axes can be introduced by indexing an array with None

cdef double[:] myslice = np.linspace(0, 10, num=50)

# 2D array with shape (1, 50)
myslice[None] # or
myslice[None, :]

# 2D array with shape (50, 1)
myslice[:, None]

# 3D array with shape (1, 10, 1)
myslice[None, 10:-20:2, None]

One may mix new axis indexing with all other forms of indexing and slicing.

Read-only views

Since Cython 0.28, the memoryview item type can be declared as const to support read-only buffers as input:

import numpy as np

cdef const double[:] myslice   # const item type => read-only view

a = np.linspace(0, 10, num=50)
a.setflags(write=False)
myslice = a

Using a non-const memoryview with a binary Python string produces a runtime error. You can solve this issue with a const memoryview:

cdef bint is_y_in(const unsigned char[:] string_view):
    cdef int i
    for i in range(string_view.shape[0]):
        if string_view[i] == b'y':
            return True
    return False

print(is_y_in(b'hello world'))   # False
print(is_y_in(b'hello Cython'))  # True

Note that this does not require the input buffer to be read-only:

a = np.linspace(0, 10, num=50)
myslice = a   # read-only view of a writable buffer

Writable buffers are still accepted by const views, but read-only buffers are not accepted for non-const, writable views:

cdef double[:] myslice   # a normal read/write memory view

a = np.linspace(0, 10, num=50)
a.setflags(write=False)
myslice = a   # ERROR: requesting writable memory view from read-only buffer!
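The same read-only/writable distinction exists for CPython’s built-in memoryview, which can serve as a quick pure-Python illustration of the rules above:

```python
data = bytes(range(5))      # a bytes object exports a read-only buffer
mv = memoryview(data)
assert mv.readonly          # analogous to a "const" typed memoryview

try:
    mv[0] = 99              # writing through a read-only view fails
except TypeError:
    print("write rejected")

# a writable buffer can still be viewed read-only on the Python side
ro = memoryview(bytearray(5)).toreadonly()
assert ro.readonly
```

As with Cython’s const memoryviews, the read-only view accepts writable sources, while a writable view of a read-only buffer is refused at acquisition time.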

Comparison to the old buffer support

You will probably prefer memoryviews to the older syntax because:

  • The syntax is cleaner
  • Memoryviews do not usually need the GIL (see Memoryviews and the GIL)
  • Memoryviews are considerably faster

For example, this is the old syntax equivalent of the sum3d function above:

cpdef int old_sum3d(object[int, ndim=3, mode='strided'] arr):
    cdef int I, J, K, total = 0
    I = arr.shape[0]
    J = arr.shape[1]
    K = arr.shape[2]
    for i in range(I):
        for j in range(J):
            for k in range(K):
                total += arr[i, j, k]
    return total

Note that we can’t use nogil for the buffer version of the function as we could for the memoryview version of sum3d above, because buffer objects are Python objects. However, even if we don’t use nogil with the memoryview, it is significantly faster. This is output from an IPython session after importing both versions:

In [2]: import numpy as np

In [3]: arr = np.zeros((40, 40, 40), dtype=int)

In [4]: timeit -r15 old_sum3d(arr)
1000 loops, best of 15: 298 us per loop

In [5]: timeit -r15 sum3d(arr)
1000 loops, best of 15: 219 us per loop

Python buffer support

Cython memoryviews support nearly all objects exporting the interface of Python new-style buffers. This is the buffer interface described in PEP 3118. NumPy arrays support this interface, as do Cython arrays. The “nearly all” is because the Python buffer interface allows the elements in the data array to themselves be pointers; Cython memoryviews do not yet support this.

Memory layout

The buffer interface allows objects to identify the underlying memory in a variety of ways. With the exception of pointers for data elements, Cython memoryviews support all Python new-type buffer layouts. It can be useful to know or specify memory layout if the memory has to be in a particular format for an external routine, or for code optimization.

Background

The concepts are as follows: there is data access and data packing. Data access means either direct (no pointer) or indirect (pointer). Data packing means your data may be contiguous or not contiguous in memory, and may use strides to identify the jumps in memory consecutive indices need to take for each dimension.

NumPy arrays provide a good model of strided direct data access, so we’ll use them for a refresher on the concepts of C and Fortran contiguous arrays, and data strides.

Brief recap on C, Fortran and strided memory layouts

The simplest data layout might be a C contiguous array. This is the default layout in NumPy and Cython arrays. C contiguous means that the array data is continuous in memory (see below) and that neighboring elements in the first dimension of the array are furthest apart in memory, whereas neighboring elements in the last dimension are closest together. For example, in NumPy:

In [2]: arr = np.array([['0', '1', '2'], ['3', '4', '5']], dtype='S1')

Here, arr[0, 0] and arr[0, 1] are one byte apart in memory, whereas arr[0, 0] and arr[1, 0] are 3 bytes apart. This leads us to the idea of strides. Each axis of the array has a stride length, which is the number of bytes needed to go from one element on this axis to the next element. In the case above, the strides for axes 0 and 1 will obviously be:

In [3]: arr.strides
Out[3]: (3, 1)

For a 3D C contiguous array:

In [5]: c_contig = np.arange(24, dtype=np.int8).reshape((2,3,4))
In [6]: c_contig.strides
Out[6]: (12, 4, 1)

A Fortran contiguous array has the opposite memory ordering, with the elements on the first axis closest together in memory:

In [7]: f_contig = np.array(c_contig, order='F')
In [8]: np.all(f_contig == c_contig)
Out[8]: True
In [9]: f_contig.strides
Out[9]: (1, 2, 6)

A contiguous array is one for which a single continuous block of memory contains all the data for the elements of the array, and therefore the memory block length is the product of number of elements in the array and the size of the elements in bytes. In the example above, the memory block is 2 * 3 * 4 * 1 bytes long, where 1 is the length of an int8.
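The stride rules above can be written down as a small pure-Python model (a sketch of the layout arithmetic, not NumPy’s implementation):

```python
def c_strides(shape, itemsize):
    # C order: the last axis has stride == itemsize; each earlier
    # axis multiplies by the length of the axis after it
    strides = [itemsize]
    for dim in reversed(shape[1:]):
        strides.insert(0, strides[0] * dim)
    return tuple(strides)

def f_strides(shape, itemsize):
    # Fortran order: the first axis has the smallest stride
    strides = [itemsize]
    for dim in shape[:-1]:
        strides.append(strides[-1] * dim)
    return tuple(strides)

# reproduces the int8 examples above
print(c_strides((2, 3, 4), 1))  # (12, 4, 1)
print(f_strides((2, 3, 4), 1))  # (1, 2, 6)
```

The model matches the NumPy outputs shown earlier for the (2, 3, 4) int8 arrays.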

An array can be contiguous without being C or Fortran order:

In [10]: c_contig.transpose((1, 0, 2)).strides
Out[10]: (4, 12, 1)

Slicing a NumPy array can easily make it not contiguous:

In [11]: sliced = c_contig[:,1,:]
In [12]: sliced.strides
Out[12]: (12, 1)
In [13]: sliced.flags
Out[13]:
C_CONTIGUOUS : False
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False

Default behavior for memoryview layouts

As you’ll see in Specifying more general memory layouts, you can specify memory layout for any dimension of a memoryview. For any dimension for which you don’t specify a layout, the data access is assumed to be direct, and the data packing assumed to be strided. For example, that will be the assumption for memoryviews like:

int [:, :, :] my_memoryview = obj

C and Fortran contiguous memoryviews

You can specify C and Fortran contiguous layouts for the memoryview by using the ::1 step syntax at definition. For example, if you know for sure your memoryview will be on top of a 3D C contiguous layout, you could write:

cdef int[:, :, ::1] c_contiguous = c_contig

where c_contig could be a C contiguous NumPy array. The ::1 at the 3rd position means that the elements in this 3rd dimension will be one element apart in memory. If you know you will have a 3D Fortran contiguous array:

cdef int[::1, :, :] f_contiguous = f_contig

If you pass a non-contiguous buffer, for example

# This array is C contiguous
c_contig = np.arange(24).reshape((2,3,4))
cdef int[:, :, ::1] c_contiguous = c_contig

# But this isn't
c_contiguous = np.array(c_contig, order='F')

you will get a ValueError at runtime:

/Users/mb312/dev_trees/minimal-cython/mincy.pyx in init mincy (mincy.c:17267)()
    69
    70 # But this isn't
---> 71 c_contiguous = np.array(c_contig, order='F')
    72
    73 # Show the sum of all the arrays before altering it

/Users/mb312/dev_trees/minimal-cython/stringsource in View.MemoryView.memoryview_cwrapper (mincy.c:9995)()

/Users/mb312/dev_trees/minimal-cython/stringsource in View.MemoryView.memoryview.__cinit__ (mincy.c:6799)()

ValueError: ndarray is not C-contiguous

Thus the ::1 in the slice type specification indicates in which dimension the data is contiguous. It can only be used to specify full C or Fortran contiguity.

C and Fortran contiguous copies

Copies can be made C or Fortran contiguous using the .copy() and .copy_fortran() methods:

# This view is C contiguous
cdef int[:, :, ::1] c_contiguous = myview.copy()

# This view is Fortran contiguous
cdef int[::1, :] f_contiguous_slice = myview.copy_fortran()

Specifying more general memory layouts

Data layout can be specified using the previously seen ::1 slice syntax, or by using any of the constants in cython.view. If no specifier is given in any dimension, then the data access is assumed to be direct, and the data packing assumed to be strided. If you don’t know whether a dimension will be direct or indirect (because you’re getting an object with a buffer interface from some library perhaps), then you can specify the generic flag, in which case it will be determined at runtime.

The flags are as follows:

  • generic - strided and direct or indirect
  • strided - strided and direct (this is the default)
  • indirect - strided and indirect
  • contiguous - contiguous and direct
  • indirect_contiguous - the list of pointers is contiguous

and they can be used like this:

from cython cimport view

# direct access in both dimensions, strided in the first dimension, contiguous in the last
cdef int[:, ::view.contiguous] a

# contiguous list of pointers to contiguous lists of ints
cdef int[::view.indirect_contiguous, ::1] b

# direct or indirect in the first dimension, direct in the second dimension
# strided in both dimensions
cdef int[::view.generic, :] c

Only the first, last or the dimension following an indirect dimension may be specified contiguous:

from cython cimport view

# VALID
cdef int[::view.indirect, ::1, :] a
cdef int[::view.indirect, :, ::1] b
cdef int[::view.indirect_contiguous, ::1, :] c
# INVALID
cdef int[::view.contiguous, ::view.indirect, :] d
cdef int[::1, ::view.indirect, :] e

The difference between the contiguous flag and the ::1 specifier is that the former specifies contiguity for only one dimension, whereas the latter specifies contiguity for all following (Fortran) or preceding (C) dimensions:

cdef int[:, ::1] c_contig = ...

# VALID
cdef int[:, ::view.contiguous] myslice = c_contig[::2]

# INVALID
cdef int[:, ::1] myslice = c_contig[::2]

The former case is valid because the last dimension remains contiguous, but the first dimension does not “follow” the last one anymore (meaning, it was strided already, but it is not C or Fortran contiguous any longer), since it was sliced.
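A small pure-Python model (illustrative only) shows why slicing the first dimension breaks full C contiguity while the last dimension stays contiguous:

```python
def is_c_contiguous(shape, strides, itemsize):
    """Check full C contiguity: walking from the last axis inward,
    each stride must equal the product of itemsize and all later
    dimension lengths."""
    expected = itemsize
    for dim, stride in zip(reversed(shape), reversed(strides)):
        if dim > 1 and stride != expected:
            return False
        expected *= dim
    return True

# a (4, 5) C-contiguous int array (itemsize 4) has strides (20, 4)
print(is_c_contiguous((4, 5), (20, 4), 4))    # True
# taking every second row doubles the first stride to 40: the last
# dimension is still contiguous, but the whole buffer no longer is
print(is_c_contiguous((2, 5), (40, 4), 4))    # False
```

The sliced buffer in the second case still satisfies ::view.contiguous for its last dimension, but not the all-dimensions requirement implied by ::1.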

Memoryviews and the GIL

As you will see from the Quickstart section, memoryviews often do not need the GIL:

cpdef int sum3d(int[:, :, :] arr) nogil:
    ...

In particular, you do not need the GIL for memoryview indexing, slicing or transposing. Memoryviews require the GIL for the copy methods (C and Fortran contiguous copies), or when the dtype is object and an object element is read or written.

Memoryview Objects and Cython Arrays

These typed memoryviews can be converted to Python memoryview objects (cython.view.memoryview). These Python objects are indexable, slicable and transposable in the same way that the original memoryviews are. They can also be converted back to Cython-space memoryviews at any time.

They have the following attributes:

  • shape: size in each dimension, as a tuple.
  • strides: stride along each dimension, in bytes.
  • suboffsets
  • ndim: number of dimensions.
  • size: total number of items in the view (product of the shape).
  • itemsize: size, in bytes, of the items in the view.
  • nbytes: equal to size times itemsize.
  • base

And of course the aforementioned T attribute (Transposing). These attributes have the same semantics as in NumPy. For instance, to retrieve the original object:

import numpy
cimport numpy as cnp

cdef cnp.int32_t[:] a = numpy.arange(10, dtype=numpy.int32)
a = a[::2]

print(a)
print(numpy.asarray(a))
print(a.base)

# this prints:
#    <MemoryView of 'ndarray' object>
#    [0 2 4 6 8]
#    [0 1 2 3 4 5 6 7 8 9]

Note that this example returns the original object from which the view was obtained, and that the view was resliced in the meantime.

Cython arrays

Whenever a Cython memoryview is copied (using any of the copy or copy_fortran methods), you get a new memoryview slice of a newly created cython.view.array object. This array can also be used manually, and will automatically allocate a block of data. It can later be assigned to a C or Fortran contiguous slice (or a strided slice). It can be used like:

from cython cimport view

my_array = view.array(shape=(10, 2), itemsize=sizeof(int), format="i")
cdef int[:, :] my_slice = my_array

It also takes an optional argument mode (‘c’ or ‘fortran’) and a boolean allocate_buffer, that indicates whether a buffer should be allocated and freed when it goes out of scope:

cdef view.array my_array = view.array(..., mode="fortran", allocate_buffer=False)
my_array.data = <char *> my_data_pointer

# define a function that can deallocate the data (if needed)
my_array.callback_free_data = free

You can also cast pointers to array, or C arrays to arrays:

cdef view.array my_array = <int[:10, :2]> my_data_pointer
cdef view.array my_array = <int[:, :]> my_c_array

Of course, you can also immediately assign a cython.view.array to a typed memoryview slice. A C array may be assigned directly to a memoryview slice:

cdef int[:, ::1] myslice = my_2d_c_array

The arrays are indexable and slicable from Python space just like memoryview objects, and have the same attributes as memoryview objects.

CPython array module

An alternative to cython.view.array is the array module in the Python standard library. In Python 3, the array.array type supports the buffer interface natively, so memoryviews work on top of it without additional setup.
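In Python 3 this works out of the box, as a quick stdlib-only check shows (using the built-in memoryview rather than a compiled typed memoryview):

```python
from array import array

a = array('i', [1, 2, 3])
mv = memoryview(a)          # array.array exports the buffer interface
print(mv.format, mv.itemsize, mv.shape)

# sum the elements through the view, as the Cython example below does
total = 0
for i in range(mv.shape[0]):
    total += mv[i]
print(total)
```

A compiled int[:] memoryview acquires the same buffer, with typed (and much faster) element access.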

Starting with Cython 0.17, however, it is possible to use these arrays as buffer providers also in Python 2. This is done through explicitly cimporting the cpython.array module as follows:

cimport cpython.array

def sum_array(int[:] view):
    """
    >>> from array import array
    >>> sum_array( array('i', [1,2,3]) )
    6
    """
    cdef int i, total = 0
    for i in range(view.shape[0]):
        total += view[i]
    return total

Note that the cimport also enables the old buffer syntax for the array type. Therefore, the following also works:

from cpython cimport array

def sum_array(array.array[int] arr):  # using old buffer syntax
    ...

Coercion to NumPy

Memoryview (and array) objects can be coerced to a NumPy ndarray, without having to copy the data. You can e.g. do:

cimport numpy as np
import numpy as np

numpy_array = np.asarray(<np.int32_t[:10, :10]> my_pointer)

Of course, you are not restricted to NumPy’s types (such as np.int32_t here); you can use any usable type.

None Slices

Although memoryview slices are not objects they can be set to None and they can be checked for being None as well:

def func(double[:] myarray = None):
    print(myarray is None)

If the function requires real memory views as input, it is therefore best to reject None input straight away in the signature, which is supported in Cython 0.17 and later as follows:

def func(double[:] myarray not None):
    ...

Unlike object attributes of extension classes, memoryview slices are not initialized to None.

Pass data from a C function via pointer

Since use of pointers in C is ubiquitous, here we give a quick example of how to call C functions whose arguments contain pointers. Let’s suppose you want to manage an array (allocate and deallocate) with NumPy (it can also be Python arrays, or anything that supports the buffer interface), but you want to perform computation on this array with an external C function implemented in C_func_file.c:

#include "C_func_file.h"

void multiply_by_10_in_C(double arr[], unsigned int n)
{
    unsigned int i;
    for (i = 0; i < n; i++) {
        arr[i] *= 10;
    }
}

This file comes with a header file called C_func_file.h containing:

#ifndef C_FUNC_FILE_H
#define C_FUNC_FILE_H

void multiply_by_10_in_C(double arr[], unsigned int n);

#endif

where arr points to the array and n is its size.

You can call the function in a Cython file in the following way:

cdef extern from "C_func_file.c":
    # C is included here so that it doesn't need to be compiled externally
    pass

cdef extern from "C_func_file.h":
    void multiply_by_10_in_C(double *, unsigned int)

import numpy as np

def multiply_by_10(arr): # 'arr' is a one-dimensional numpy array

    if not arr.flags['C_CONTIGUOUS']:
        arr = np.ascontiguousarray(arr) # Makes a contiguous copy of the numpy array.

    cdef double[::1] arr_memview = arr

    multiply_by_10_in_C(&arr_memview[0], arr_memview.shape[0])

    return arr


a = np.ones(5, dtype=np.double)
print(multiply_by_10(a))

b = np.ones(10, dtype=np.double)
b = b[::2]  # b is not contiguous.

print(multiply_by_10(b))  # but our function still works as expected.

Several things to note:
  • ::1 requests a C contiguous view, and fails if the buffer is not C contiguous. See C and Fortran contiguous memoryviews.
  • &arr_memview[0] can be understood as ‘the address of the first element of the memoryview’. For contiguous arrays, this is equivalent to the start address of the flat memory buffer.
  • arr_memview.shape[0] could have been replaced by arr_memview.size, arr.shape[0] or arr.size. But arr_memview.shape[0] is more efficient because it doesn’t require any Python interaction.
  • multiply_by_10 will perform computation in-place if the array passed is contiguous, and will return a new numpy array if arr is not contiguous.
  • If you are using Python arrays instead of numpy arrays, you don’t need to check if the data is stored contiguously as this is always the case. See Working with Python arrays.

This way, you can call the C function similar to a normal Python function, and leave all the memory management and cleanup to NumPy arrays and Python’s object handling. For the details of how to compile and call functions in C files, see Using C libraries.

Implementing the buffer protocol

Cython objects can expose memory buffers to Python code by implementing the “buffer protocol”. This chapter shows how to implement the protocol and make use of the memory managed by an extension type from NumPy.

A matrix class

The following Cython/C++ code implements a matrix of floats, where the number of columns is fixed at construction time but rows can be added dynamically.

# distutils: language = c++

# matrix.pyx

from libcpp.vector cimport vector

cdef class Matrix:
    cdef unsigned ncols
    cdef vector[float] v

    def __cinit__(self, unsigned ncols):
        self.ncols = ncols

    def add_row(self):
        """Adds a row, initially zero-filled."""
        self.v.resize(self.v.size() + self.ncols)

There are no methods to do anything productive with the matrices’ contents. We could implement custom __getitem__, __setitem__, etc. for this, but instead we’ll use the buffer protocol to expose the matrix’s data to Python so we can use NumPy to do useful work.

Implementing the buffer protocol requires adding two methods, __getbuffer__ and __releasebuffer__, which Cython handles specially.

# distutils: language = c++

from cpython cimport Py_buffer
from libcpp.vector cimport vector

cdef class Matrix:
    cdef Py_ssize_t ncols
    cdef Py_ssize_t shape[2]
    cdef Py_ssize_t strides[2]
    cdef vector[float] v

    def __cinit__(self, Py_ssize_t ncols):
        self.ncols = ncols

    def add_row(self):
        """Adds a row, initially zero-filled."""
        self.v.resize(self.v.size() + self.ncols)

    def __getbuffer__(self, Py_buffer *buffer, int flags):
        cdef Py_ssize_t itemsize = sizeof(self.v[0])

        self.shape[0] = self.v.size() // self.ncols
        self.shape[1] = self.ncols

        # Stride 1 is the distance, in bytes, between two items in a row;
        # this is the distance between two adjacent items in the vector.
        # Stride 0 is the distance between the first elements of adjacent rows.
        self.strides[1] = <Py_ssize_t>(  <char *>&(self.v[1])
                                       - <char *>&(self.v[0]))
        self.strides[0] = self.ncols * self.strides[1]

        buffer.buf = <char *>&(self.v[0])
        buffer.format = 'f'                     # float
        buffer.internal = NULL                  # see References
        buffer.itemsize = itemsize
        buffer.len = self.v.size() * itemsize   # product(shape) * itemsize
        buffer.ndim = 2
        buffer.obj = self
        buffer.readonly = 0
        buffer.shape = self.shape
        buffer.strides = self.strides
        buffer.suboffsets = NULL                # for pointer arrays only

    def __releasebuffer__(self, Py_buffer *buffer):
        pass

The method Matrix.__getbuffer__ fills a descriptor structure, called a Py_buffer, that is defined by the Python C-API. It contains a pointer to the actual buffer in memory, as well as metadata about the shape of the array and the strides (step sizes to get from one element or row to the next). Its shape and strides members are pointers that must point to arrays of type and size Py_ssize_t[ndim]. These arrays have to stay alive as long as any buffer views the data, so we store them on the Matrix object as members.

The code is not yet complete, but we can already compile it and test the basic functionality.

>>> from matrix import Matrix
>>> import numpy as np
>>> m = Matrix(10)
>>> np.asarray(m)
array([], shape=(0, 10), dtype=float32)
>>> m.add_row()
>>> a = np.asarray(m)
>>> a[:] = 1
>>> m.add_row()
>>> a = np.asarray(m)
>>> a
array([[ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]], dtype=float32)

Now we can view the Matrix as a NumPy ndarray, and modify its contents using standard NumPy operations.
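
To see what a consumer such as np.asarray actually reads from the Py_buffer descriptor, here is a pure-Python sketch using the stdlib array module, which exports the same PEP 3118 buffer of C floats as our Matrix (array and struct are stand-ins here, not part of the tutorial's code):

```python
from array import array
import struct

# A stdlib stand-in for our Matrix: array('f') also exports a PEP 3118
# buffer of C floats, so memoryview sees the same kind of metadata that
# np.asarray reads from Matrix.__getbuffer__.
data = array('f', [1.0, 2.0, 3.0, 4.0])
view = memoryview(data)

print(view.format)                            # 'f', like buffer.format above
print(view.itemsize == struct.calcsize('f'))  # True: sizeof(float)
print(view.shape, view.strides)               # (4,) and (itemsize,)

# Writing through the view mutates the exporter's memory: no copy is made.
view[0] = 42.0
print(data[0])                                # 42.0
```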

Memory safety and reference counting

The Matrix class as implemented so far is unsafe. The add_row operation can move the underlying buffer, which invalidates any NumPy (or other) view on the data. If you try to access values after an add_row call, you’ll get outdated values or a segfault.

This is where __releasebuffer__ comes in. We can add a reference count to each matrix, and lock it for mutation whenever a view exists.

# distutils: language = c++

from cpython cimport Py_buffer
from libcpp.vector cimport vector

cdef class Matrix:

    cdef int view_count

    cdef Py_ssize_t ncols
    cdef vector[float] v
    # ...

    def __cinit__(self, Py_ssize_t ncols):
        self.ncols = ncols
        self.view_count = 0

    def add_row(self):
        if self.view_count > 0:
            raise ValueError("can't add row while being viewed")
        self.v.resize(self.v.size() + self.ncols)

    def __getbuffer__(self, Py_buffer *buffer, int flags):
        # ... as before

        self.view_count += 1

    def __releasebuffer__(self, Py_buffer *buffer):
        self.view_count -= 1
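
CPython's own bytearray uses the same guard internally: while buffer exports exist, any operation that could reallocate the underlying memory is refused. The snippet below (pure Python, an illustration only) shows the behaviour that our view_count check reproduces for Matrix:

```python
# bytearray refuses to resize while a buffer view exists, just as the
# guarded add_row() above refuses to grow the vector.
ba = bytearray(b"0123456789")
view = memoryview(ba)

try:
    ba.extend(b"x")          # would move the buffer, like add_row()
except BufferError as exc:
    print("refused:", exc)

view.release()               # drop the view (decrements the export count)
ba.extend(b"x")              # now resizing succeeds
print(len(ba))               # 11
```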

Flags

We skipped some input validation in the code. The flags argument to __getbuffer__ comes from np.asarray (and other clients) and is an OR of boolean flags that describe the kind of array that is requested. Strictly speaking, if the flags contain PyBUF_ND, PyBUF_SIMPLE, or PyBUF_F_CONTIGUOUS, __getbuffer__ must raise a BufferError. These macros can be cimport’d from cpython.buffer.

(The matrix-in-vector structure actually conforms to PyBUF_ND, but that would prohibit __getbuffer__ from filling in the strides. A single-row matrix is F-contiguous, but a larger matrix is not.)
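
A sketch of that validation in plain Python (the PyBUF_* values below are copied from CPython's object.h and labelled here as an assumption; in real Cython code you would cimport the macros from cpython.buffer instead of redefining them):

```python
# Flag values as defined in CPython's object.h (assumed constants here;
# cimport them from cpython.buffer in actual Cython code).
PyBUF_ND           = 0x0008
PyBUF_STRIDES      = 0x0010 | PyBUF_ND
PyBUF_F_CONTIGUOUS = 0x0040 | PyBUF_STRIDES

def validate_matrix_request(flags):
    """Sketch of the validation __getbuffer__ could perform for a
    C-ordered, strided 2D matrix."""
    if flags & PyBUF_STRIDES != PyBUF_STRIDES:
        # Covers PyBUF_SIMPLE (0) and plain PyBUF_ND requests: without
        # strides we cannot describe a general 2D layout.
        raise BufferError("matrix requires a strided buffer request")
    if flags & PyBUF_F_CONTIGUOUS == PyBUF_F_CONTIGUOUS:
        raise BufferError("matrix is not Fortran-contiguous")

validate_matrix_request(PyBUF_STRIDES)   # a strided request is accepted
```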

References

The buffer interface used here is set out in PEP 3118, Revising the buffer protocol.

A tutorial for using this API from C is on Jake Vanderplas’s blog, An Introduction to the Python Buffer Protocol.

Reference documentation is available for Python 3 and Python 2. The Py2 documentation also describes an older buffer protocol that is no longer in use; since Python 2.6, the PEP 3118 protocol has been implemented, and the older protocol is only relevant for legacy code.

Using Parallelism

Cython supports native parallelism through the cython.parallel module. To use this kind of parallelism, the GIL must be released (see Releasing the GIL). It currently supports OpenMP, but later on more backends might be supported.

Note

Functionality in this module may only be used from the main thread or parallel regions due to OpenMP restrictions.

cython.parallel.prange([start,] stop[, step][, nogil=False][, schedule=None[, chunksize=None]][, num_threads=None])

This function can be used for parallel loops. OpenMP automatically starts a thread pool and distributes the work according to the schedule used.

Thread-locality and reductions are automatically inferred for variables.

If you assign to a variable in a prange block, it becomes lastprivate, meaning that the variable will contain the value from the last iteration. If you use an inplace operator on a variable, it becomes a reduction, meaning that the values from the thread-local copies of the variable will be reduced with the operator and assigned to the original variable after the loop. The index variable is always lastprivate. Variables assigned to in a parallel with block will be private and unusable after the block, as there is no concept of a sequentially last value.

Parameters:
  • start – The index indicating the start of the loop (same as the start argument in range).
  • stop – The index indicating when to stop the loop (same as the stop argument in range).
  • step – An integer giving the step of the sequence (same as the step argument in range). It must not be 0.
  • nogil – This function can only be used with the GIL released. If nogil is true, the loop will be wrapped in a nogil section.
  • schedule

    The schedule is passed to OpenMP and can be one of the following:

    static:
    If a chunksize is provided, iterations are distributed to all threads ahead of time in blocks of the given chunksize. If no chunksize is given, the iteration space is divided into chunks that are approximately equal in size, and at most one chunk is assigned to each thread in advance.

    This is most appropriate when the scheduling overhead matters and the problem can be cut down into equally sized chunks that are known to have approximately the same runtime.

    dynamic:
    The iterations are distributed to threads as they request them, with a default chunk size of 1.

    This is suitable when the runtime of each chunk differs and is not known in advance and therefore a larger number of smaller chunks is used in order to keep all threads busy.

    guided:
    As with dynamic scheduling, the iterations are distributed to threads as they request them, but with decreasing chunk size. The size of each chunk is proportional to the number of unassigned iterations divided by the number of participating threads, decreasing to 1 (or the chunksize if provided).

    This has an advantage over pure dynamic scheduling when it turns out that the last chunks take more time than expected or are otherwise being badly scheduled, so that most threads start running idle while the last chunks are being worked on by only a smaller number of threads.

    runtime:
    The schedule and chunk size are taken from the runtime scheduling variable, which can be set through the openmp.omp_set_schedule() function call, or the OMP_SCHEDULE environment variable. Note that this essentially disables any static compile time optimisations of the scheduling code itself and may therefore show a slightly worse performance than when the same scheduling policy is statically configured at compile time. The default schedule is implementation defined. For more information consult the OpenMP specification [1].
  • num_threads – The num_threads argument indicates how many threads the team should consist of. If not given, OpenMP will decide how many threads to use. Typically this is the number of cores available on the machine. However, this may be controlled through the omp_set_num_threads() function, or through the OMP_NUM_THREADS environment variable.
  • chunksize – The chunksize argument indicates the chunksize to be used for dividing the iterations among threads. This is only valid for static, dynamic and guided scheduling, and is optional. Different chunksizes may give substantially different performance results, depending on the schedule, the load balance it provides, the scheduling overhead and the amount of false sharing (if any).

Example with a reduction:

from cython.parallel import prange

cdef int i
cdef int n = 30
cdef int sum = 0

for i in prange(n, nogil=True):
    sum += i

print(sum)
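
The reduction above is sequentially equivalent to the following pure-Python sketch (an illustration of the inferred semantics, not Cython code):

```python
def prange_semantics(n):
    # Sequentially equivalent result of "for i in prange(n): total += i":
    # each thread accumulates into a private copy of `total`, and the
    # copies are combined with + after the loop. The loop index is
    # lastprivate, so after the loop it holds the final iteration's value.
    total = 0
    for i in range(n):
        total += i
    return total, i

print(prange_semantics(30))   # (435, 29)
```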

Example with a typed memoryview (e.g. a NumPy array):

from cython.parallel import prange

def func(double[:] x, double alpha):
    cdef Py_ssize_t i

    for i in prange(x.shape[0]):
        x[i] = alpha * x[i]

cython.parallel.parallel(num_threads=None)

This directive can be used as part of a with statement to execute code sequences in parallel. This is currently useful to set up thread-local buffers used by a prange. A contained prange will be a worksharing loop that is not parallel, so any variable assigned to in the parallel section is also private to the prange. Variables that are private in the parallel block are unavailable after the parallel block.

Example with thread-local buffers:

from cython.parallel import parallel, prange
from libc.stdlib cimport abort, malloc, free

cdef Py_ssize_t idx, i, n = 100
cdef int * local_buf
cdef size_t size = 10

with nogil, parallel():
    local_buf = <int *> malloc(sizeof(int) * size)
    if local_buf is NULL:
        abort()

    # populate our local buffer in a sequential loop
    for i in range(size):
        local_buf[i] = i * 2

    # share the work using the thread-local buffer(s)
    for i in prange(n, schedule='guided'):
        func(local_buf)

    free(local_buf)

Later on, OpenMP sections might be supported in parallel blocks, to distribute sections of work among threads.

cython.parallel.threadid()

Returns the id of the thread. For n threads, the ids will range from 0 to n-1.

Compiling

To actually use the OpenMP support, you need to tell the C or C++ compiler to enable OpenMP. For gcc this can be done as follows in a setup.py:

from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize

ext_modules = [
    Extension(
        "hello",
        ["hello.pyx"],
        extra_compile_args=['-fopenmp'],
        extra_link_args=['-fopenmp'],
    )
]

setup(
    name='hello-parallel-world',
    ext_modules=cythonize(ext_modules),
)

For Microsoft Visual C++ compiler, use '/openmp' instead of '-fopenmp'.
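
One way to handle both compilers in a single setup.py is a small platform switch. The helper below is a hypothetical sketch (openmp_args is not a Cython API) and assumes MSVC on Windows and GCC/Clang elsewhere:

```python
import sys

def openmp_args():
    # Hypothetical helper (not part of Cython): pick the OpenMP options
    # for the compiler distutils will typically use on this platform.
    if sys.platform == "win32":
        # MSVC spells the option /openmp and needs no extra link flag.
        return {"extra_compile_args": ["/openmp"], "extra_link_args": []}
    # GCC and Clang use -fopenmp for both compiling and linking.
    return {"extra_compile_args": ["-fopenmp"],
            "extra_link_args": ["-fopenmp"]}

print(openmp_args())
```

The Extension above could then be written as Extension("hello", ["hello.pyx"], **openmp_args()). Note that this sketch guesses wrong for MinGW gcc on Windows, which also wants -fopenmp.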

Breaking out of loops

The parallel with and prange blocks support the statements break, continue and return in nogil mode. Additionally, it is valid to use a with gil block inside these blocks, and to have exceptions propagate from them. However, because the blocks use OpenMP, they cannot simply be left, so the exiting procedure is best-effort. For prange() this means that the loop body is skipped after the first break, return or exception, for any subsequent iteration in any thread. It is undefined which value shall be returned if multiple different values may be returned, as the iterations are executed in no particular order:

from cython.parallel import prange

cdef int func(Py_ssize_t n):
    cdef Py_ssize_t i

    for i in prange(n, nogil=True):
        if i == 8:
            with gil:
                raise Exception()
        elif i == 4:
            break
        elif i == 2:
            return i

In the example above it is undefined whether an exception shall be raised, whether it will simply break or whether it will return 2.
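
A sequential sketch may make this concrete. The function below (pure Python, an illustration only) executes the same loop body in a caller-chosen order, modelling the fact that the thread schedule determines which exit happens first:

```python
def body(order):
    # Sequential sketch of the prange body above, executed in some order.
    # Different orders model different thread schedules, which is why the
    # overall outcome of the parallel loop is undefined.
    for i in order:
        if i == 8:
            raise Exception("i == 8")
        elif i == 4:
            return "break"   # stands in for the break statement
        elif i == 2:
            return 2

print(body([2, 4, 8]))   # 2
print(body([4, 8, 2]))   # 'break'
```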

Using OpenMP Functions

OpenMP functions can be used by cimporting openmp:

# tag: openmp
# You can ignore the previous line.
# It's for internal testing of the Cython documentation.

from cython.parallel cimport parallel
cimport openmp

cdef int num_threads

openmp.omp_set_dynamic(1)
with nogil, parallel():
    num_threads = openmp.omp_get_num_threads()
    # ...

References

[1]https://www.openmp.org/mp-documents/spec30.pdf

Debugging your Cython program

Cython comes with an extension for the GNU Debugger that helps users debug Cython code. To use this functionality, you will need to install gdb 7.2 or higher, built with Python support (linked to Python 2.6 or higher). The debugger supports debuggees with Python versions 2.6 and higher. For Python 3, code should be built with Python 3, while the debugger itself should be run with Python 2 (or at least be able to find the Python 2 Cython installation). Note that in recent versions of Ubuntu, for instance, gdb installed with apt-get is configured with Python 3. On such systems, a properly configured gdb can be obtained by downloading the gdb source and running:

./configure --with-python=python2
make
sudo make install

The debugger will need debug information that the Cython compiler can export. This can be achieved from within the setup script by passing gdb_debug=True to cythonize():

from distutils.core import setup
from distutils.extension import Extension

extensions = [Extension('source', ['source.pyx'])]

setup(..., ext_modules=cythonize(extensions, gdb_debug=True))

For development it’s often helpful to pass the --inplace flag to the setup.py script, which makes distutils build your project “in place”, i.e., not in a separate build directory.

When invoking Cython from the command line directly you can have it write debug information using the --gdb flag:

cython --gdb myfile.pyx

Running the Debugger

To run the Cython debugger and have it import the debug information exported by Cython, run cygdb in the build directory:

$ python setup.py build_ext --inplace
$ cygdb
GNU gdb (GDB) 7.2
...
(gdb)

When using the Cython debugger, it’s preferable that you build and run your code with an interpreter that is compiled with debugging symbols (i.e. configured with --with-pydebug or compiled with the -g CFLAG). If your Python is installed and managed by your package manager, you probably need to install debug support separately. If you use NumPy, you also need to install the NumPy debug package, or you’ll see an import error for multiarray. E.g., on Ubuntu:

$ sudo apt-get install python-dbg python-numpy-dbg
$ python-dbg setup.py build_ext --inplace

Then you need to run your script with python-dbg as well. Ensure that, when building your package with debug symbols, Cython extensions are recompiled even if they had been previously compiled. If your package is version controlled, you might want to perform git clean -fxd or hg purge --all before building.

You can also pass additional arguments to gdb:

$ cygdb /path/to/build/directory/ GDBARGS

i.e.:

$ cygdb . -- --args python-dbg mainscript.py

To tell cygdb not to import any debug information, supply -- as the first argument:

$ cygdb --

Using the Debugger

The Cython debugger comes with a set of commands that support breakpoints, stack inspection, source code listing, stepping, stepping over, etc. Most of these commands are analogous to their respective gdb command.

cy break breakpoints...

Break in a Python, Cython or C function. First it will look for a Cython function with that name; if cygdb doesn’t know about a function (or method) with that name, it will set a (pending) C breakpoint. The -p option can be used to specify a Python breakpoint.

Breakpoints can be set for either the function or method name, or they can be fully “qualified”, which means that the entire “path” to a function is given:

(gdb) cy break cython_function_or_method
(gdb) cy break packagename.cython_module.cython_function
(gdb) cy break packagename.cython_module.ClassName.cython_method
(gdb) cy break c_function

You can also break on Cython line numbers:

(gdb) cy break :14
(gdb) cy break cython_module:14
(gdb) cy break packagename.cython_module:14

Python breakpoints currently support names of the module (not the entire package path) and the function or method:

(gdb) cy break -p python_module.python_function_or_method
(gdb) cy break -p python_function_or_method

Note

Python breakpoints only work in Python builds where the Python frame information can be read from the debugger. To ensure this, use a Python debug build or a non-stripped build compiled with debug support.

cy step

Step through Python, Cython or C code. Python, Cython and C functions called directly from Cython code are considered relevant and will be stepped into.

cy next

Step over Python, Cython or C code.

cy run

Run the program. The default interpreter is the interpreter that was used to build your extensions with, or the interpreter cygdb is run with in case the “don’t import debug information” option was in effect. The interpreter can be overridden using gdb’s file command.

cy cont

Continue the program.

cy up
cy down

Go up and down the stack to what is considered a relevant frame.

cy finish

Execute until an upward relevant frame is met or something halts execution.

cy bt
cy backtrace

Print a traceback of all frames considered relevant. The -a option makes it print the full traceback (all C frames).

cy select

Select a stack frame by number as listed by cy backtrace. This command is introduced because cy backtrace prints a reversed stack trace, so frame numbers differ from gdb’s bt.

cy print varname

Print a local or global Cython, Python or C variable (depending on the context). Variables may also be dereferenced:

(gdb) cy print x
x = 1
(gdb) cy print *x
*x = (PyObject) {
    _ob_next = 0x93efd8,
    _ob_prev = 0x93ef88,
    ob_refcnt = 65,
    ob_type = 0x83a3e0
}
cy set cython_variable = value

Set a Cython variable on the Cython stack to value.

cy list

List the source code surrounding the current line.

cy locals
cy globals

Print all the local and global variables and their values.

cy import FILE...

Import debug information from files given as arguments. The easiest way to import debug information is to use the cygdb command line tool.

cy exec code

Execute code in the current Python or Cython frame. This works like Python’s interactive interpreter.

For Python frames it uses the globals and locals from the Python frame, for Cython frames it uses the dict of globals used on the Cython module and a new dict filled with the local Cython variables.

Note

cy exec modifies state and executes code in the debuggee and is therefore potentially dangerous.

Example:

(gdb) cy exec x + 1
2
(gdb) cy exec import sys; print sys.version_info
(2, 6, 5, 'final', 0)
(gdb) cy exec
>global foo
>
>foo = 'something'
>end

Convenience functions

The following functions are gdb functions, which means they can be used in a gdb expression.

cy_cname(varname)

Returns the C variable name of a Cython variable. For global variables, this may not actually be valid.

cy_cvalue(varname)

Returns the value of a Cython variable.

cy_eval(expression)

Evaluates Python code in the nearest Python or Cython frame and returns the result of the expression as a gdb value. This gives a new reference if successful, NULL on error.

cy_lineno()

Returns the current line number in the selected Cython frame.

Example:

(gdb) print $cy_cname("x")
$1 = "__pyx_v_x"
(gdb) watch $cy_cvalue("x")
Hardware watchpoint 13: $cy_cvalue("x")
(gdb) cy set my_cython_variable = $cy_eval("{'spam': 'ham'}")
(gdb) print $cy_lineno()
$2 = 12

Configuring the Debugger

A few aspects of the debugger are configurable with gdb parameters. For instance, colors can be disabled, the terminal background color and breakpoint autocompletion can be configured.

cy_complete_unqualified

Tells the Cython debugger whether cy break should also complete plain function names, i.e. not prefixed by their module name. E.g. if you have a function named spam, in module M, it tells whether to only complete M.spam or also just spam.

The default is true.

cy_colorize_code

Tells the debugger whether to colorize source code. The default is true.

cy_terminal_background_color

Tells the debugger about the terminal background color, which affects source code coloring. The default is “dark”, another valid option is “light”.

This is how these parameters can be used:

(gdb) set cy_complete_unqualified off
(gdb) set cy_terminal_background_color light
(gdb) show cy_colorize_code

Cython for NumPy users

This tutorial is aimed at NumPy users who have no experience with Cython at all. If you have some knowledge of Cython you may want to skip to the “Efficient indexing” section.

The main scenario considered is NumPy end-use rather than NumPy/SciPy development. The reason is that Cython is not (yet) able to support functions that are generic with respect to the number of dimensions in a high-level fashion. This restriction is much more severe for SciPy development than more specific, “end-user” functions. See the last section for more information on this.

The style of this tutorial will not fit everybody, so you can also consider:

Cython at a glance

Cython is a compiler which compiles Python-like code files to C code. Still, “Cython is not a Python to C translator”. That is, it doesn’t take your full program and “turn it into C” – rather, the result makes full use of the Python runtime environment. A way of looking at it may be that your code is still Python in that it runs within the Python runtime environment, but rather than compiling to interpreted Python bytecode, one compiles to native machine code (with the addition of extra syntax for easy embedding of faster C-like code).

This has two important consequences:

  • Speed. How much depends very much on the program involved. Typical Python numerical programs would tend to gain very little, as most time is spent in lower-level C that is used in a high-level fashion. However, for-loop-style programs can gain many orders of magnitude when typing information is added (which is what makes this a realistic alternative).
  • Easy calling into C code. One of Cython’s purposes is to allow easy wrapping of C libraries. When writing code in Cython you can call into C code as easily as into Python code.

Very few Python constructs are not yet supported, and making Cython compile all Python code is a stated goal. You can see the differences with Python in limitations.

Your Cython environment

Using Cython consists of these steps:

  1. Write a .pyx source file
  2. Run the Cython compiler to generate a C file
  3. Run a C compiler to generate a compiled library
  4. Run the Python interpreter and ask it to import the module

However there are several options to automate these steps:

  1. The SAGE mathematics software system provides excellent support for using Cython and NumPy from an interactive command line or through a notebook interface (like Maple/Mathematica). See this documentation.
  2. Cython can be used as an extension within a Jupyter notebook, making it easy to compile and use Cython code with just a %%cython at the top of a cell. For more information see Using the Jupyter Notebook.
  3. A version of pyximport is shipped with Cython, so that you can import pyx-files dynamically into Python and have them compiled automatically (See Compiling with pyximport).
  4. Cython supports distutils so that you can very easily create build scripts which automate the process; this is the preferred method for Cython implemented libraries and packages. See Basic setup.py.
  5. Manual compilation (see below)

Note

If you use an interactive command line environment other than SAGE, such as IPython or Python itself, it is important that you restart the process when you recompile the module. It is not enough to issue an “import” statement again.

Installation

If you already have a C compiler, just do:

pip install Cython

otherwise, see the installation page.

As of this writing SAGE comes with an older release of Cython than required for this tutorial. So if using SAGE you should download the newest Cython and then execute

$ cd path/to/cython-distro
$ path-to-sage/sage -python setup.py install

This will install the newest Cython into SAGE.

Manual compilation

As it is always important to know what is going on, I’ll describe the manual method here. First Cython is run:

$ cython yourmod.pyx

This creates yourmod.c, which is the C source for a Python extension module. A useful additional switch is -a, which will generate a document (yourmod.html) that shows which Cython code translates to which C code line by line.

Then we compile the C file. This may vary according to your system, but the C file should be built like Python was built. Python documentation for writing extensions should have some details. On Linux this often means something like:

$ gcc -shared -pthread -fPIC -fwrapv -O2 -Wall -fno-strict-aliasing -I/usr/include/python2.7 -o yourmod.so yourmod.c

gcc should have access to the NumPy C header files, so if they are not installed at /usr/include/numpy or similar, you may need to pass another include option for them. You only need to provide the NumPy headers if you write:

cimport numpy

in your Cython code.

This creates yourmod.so in the same directory, which is importable by Python by using a normal import yourmod statement.

The first Cython program

You can easily execute the code of this tutorial by downloading the Jupyter notebook.

The code below does the equivalent of this function in numpy:

def compute_np(array_1, array_2, a, b, c):
    return np.clip(array_1, 2, 10) * a + array_2 * b + c

We’ll say that array_1 and array_2 are 2D NumPy arrays of integer type and a, b and c are three Python integers.

This function uses NumPy and is already really fast, so it might be a bit overkill to do it again with Cython. This is for demonstration purposes. Nonetheless, we will show that we achieve a better speed and memory efficiency than NumPy at the cost of more verbosity.

This code computes the function with the loops over the two dimensions being unrolled. It is both valid Python and valid Cython code. I’ll refer to it as both compute_py.py for the Python version and compute_cy.pyx for the Cython version – Cython uses .pyx as its file suffix (but it can also compile .py files).

import numpy as np


def clip(a, min_value, max_value):
    return min(max(a, min_value), max_value)


def compute(array_1, array_2, a, b, c):
    """
    This function must implement the formula
    np.clip(array_1, 2, 10) * a + array_2 * b + c

    array_1 and array_2 are 2D.
    """
    x_max = array_1.shape[0]
    y_max = array_1.shape[1]

    assert array_1.shape == array_2.shape

    result = np.zeros((x_max, y_max), dtype=array_1.dtype)

    for x in range(x_max):
        for y in range(y_max):
            tmp = clip(array_1[x, y], 2, 10)
            tmp = tmp * a + array_2[x, y] * b
            result[x, y] = tmp + c

    return result
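
As a quick sanity check of the formula itself (a pure-Python illustration, independent of NumPy and Cython), the per-element arithmetic can be verified on scalars:

```python
def clip(a, min_value, max_value):
    return min(max(a, min_value), max_value)

def compute_scalar(v1, v2, a, b, c):
    # The per-element formula that every version of compute() implements.
    return clip(v1, 2, 10) * a + v2 * b + c

print(compute_scalar(1, 5, 4, 3, 9))    # 2*4 + 5*3 + 9 = 32
print(compute_scalar(50, 0, 4, 3, 9))   # 10*4 + 0 + 9 = 49
```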

This should be compiled to produce compute_cy.so (on Linux systems; on Windows systems, this will be a .pyd file). We then run a Python session to test both the Python version (imported from the .py file) and the compiled Cython module.

In [1]: import numpy as np
In [2]: array_1 = np.random.uniform(0, 1000, size=(3000, 2000)).astype(np.intc)
In [3]: array_2 = np.random.uniform(0, 1000, size=(3000, 2000)).astype(np.intc)
In [4]: a = 4
In [5]: b = 3
In [6]: c = 9
In [7]: def compute_np(array_1, array_2, a, b, c):
   ...:     return np.clip(array_1, 2, 10) * a + array_2 * b + c
In [8]: %timeit compute_np(array_1, array_2, a, b, c)
103 ms ± 4.16 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [9]: import compute_py
In [10]: compute_py.compute(array_1, array_2, a, b, c)
1min 10s ± 844 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [11]: import compute_cy
In [12]: compute_cy.compute(array_1, array_2, a, b, c)
56.5 s ± 587 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

There’s not such a huge difference yet, because the C code still does exactly what the Python interpreter does (meaning, for instance, that a new object is allocated for each number used).

You can look at the Python interaction and the generated C code by using -a when calling Cython from the command line, %%cython -a when using a Jupyter Notebook, or by using cythonize('compute_cy.pyx', annotate=True) when using a setup.py. Look at the generated html file and see what is needed for even the simplest statements. You get the point quickly. We need to give Cython more information; we need to add types.

Adding types

To add types we use custom Cython syntax, so we are now breaking Python source compatibility. Here’s compute_typed.pyx. Read the comments!

import numpy as np

# We now need to fix a datatype for our arrays. I've used the variable
# DTYPE for this, which is assigned to the usual NumPy runtime
# type info object.
DTYPE = np.intc

# cdef means here that this function is a plain C function (so faster).
# To get all the benefits, we type the arguments and the return value.
cdef int clip(int a, int min_value, int max_value):
    return min(max(a, min_value), max_value)


def compute(array_1, array_2, int a, int b, int c):
    
    # The "cdef" keyword is also used within functions to type variables. It
    # can only be used at the top indentation level (there are non-trivial
    # problems with allowing them in other places, though we'd love to see
    # good and thought out proposals for it).
    cdef Py_ssize_t x_max = array_1.shape[0]
    cdef Py_ssize_t y_max = array_1.shape[1]
    
    assert array_1.shape == array_2.shape
    assert array_1.dtype == DTYPE
    assert array_2.dtype == DTYPE

    result = np.zeros((x_max, y_max), dtype=DTYPE)
    
    # It is very important to type ALL your variables. You do not get any
    # warnings if not, only much slower code (they are implicitly typed as
    # Python objects).
    # For the "tmp" variable, we want to use the same data type as is
    # stored in the array, so we use int because it corresponds to np.intc.
    # NB! An important side-effect of this is that if "tmp" overflows its
    # datatype size, it will simply wrap around like in C, rather than raise
    # an error like in Python.

    cdef int tmp

    # Py_ssize_t is the proper C type for Python array indices.
    cdef Py_ssize_t x, y

    for x in range(x_max):
        for y in range(y_max):

            tmp = clip(array_1[x, y], 2, 10)
            tmp = tmp * a + array_2[x, y] * b
            result[x, y] = tmp + c

    return result

At this point, have a look at the generated C code for compute_cy.pyx and compute_typed.pyx. Click on the lines to expand them and see corresponding C.

Especially have a look at the for-loops: In compute_cy.c, these are ~20 lines of C code to set up while in compute_typed.c a normal C for loop is used.

After building this and continuing my (very informal) benchmarks, I get:

In [13]: %timeit compute_typed.compute(array_1, array_2, a, b, c)
26.5 s ± 422 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

So adding types does make the code faster, but it is nowhere near the speed of NumPy. Why?

Most of the time spent in this code is spent in the following lines, and those lines are slower to execute than in pure Python:

tmp = clip(array_1[x, y], 2, 10)
tmp = tmp * a + array_2[x, y] * b
result[x, y] = tmp + c

So what made those lines so much slower than in the pure Python version?

array_1 and array_2 are still NumPy arrays, hence Python objects, and expect Python integers as indexes. Here we pass C int values. So every time Cython reaches this line, it has to convert all the C integers to Python int objects. Since this line is called very often, the conversion cost outweighs the speed benefits of the pure C loops that were created from the range() earlier.

Furthermore, tmp * a + array_2[x, y] * b returns a Python integer and tmp is a C integer, so Cython has to do type conversions again. In the end, those type conversions add up and make our computation really slow. But this problem can be solved easily by using memoryviews.

Efficient indexing with memoryviews

There are still two bottlenecks that degrade performance: the array lookups and assignments, and the C/Python type conversions. The []-operator still uses full Python operations – what we would like to do instead is to access the data buffer directly at C speed.

What we need to do then is to type the contents of the ndarray objects. We do this with a memoryview. There is a page in the Cython documentation dedicated to it.

In short, memoryviews are C structures that can hold a pointer to the data of a NumPy array and all the necessary buffer metadata to provide efficient and safe access: dimensions, strides, item size, item type information, etc… They also support slices, so they work even if the NumPy array isn’t contiguous in memory. They can be indexed by C integers, thus allowing fast access to the NumPy array data.

Here is how to declare a memoryview of integers:

cdef int [:] foo         # 1D memoryview
cdef int [:, :] foo      # 2D memoryview
cdef int [:, :, :] foo   # 3D memoryview
...                      # You get the idea.

No data is copied from the NumPy array to the memoryview in our example. As the name implies, it is only a “view” of the memory. So we can use the view result_view for efficient indexing and at the end return the real NumPy array result that holds the data that we operated on.
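The same view-not-copy relationship can be checked in plain NumPy, where basic slicing also returns a view:

```python
import numpy as np

result = np.zeros((3, 3), dtype=np.intc)
result_view = result[:, :]          # a view: shares result's buffer

result_view[0, 0] = 7
assert result[0, 0] == 7            # the change is visible in result
assert result_view.base is result   # no copy was made
```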

Here is how to use them in our code:

compute_memview.pyx

import numpy as np

DTYPE = np.intc


cdef int clip(int a, int min_value, int max_value):
    return min(max(a, min_value), max_value)


def compute(int[:, :] array_1, int[:, :] array_2, int a, int b, int c):
     
    cdef Py_ssize_t x_max = array_1.shape[0]
    cdef Py_ssize_t y_max = array_1.shape[1]

    # array_1.shape is now a C array, so it's not possible
    # to compare it simply by using == without a for-loop.
    # To be able to compare it to array_2.shape easily,
    # we convert them both to Python tuples.
    assert tuple(array_1.shape) == tuple(array_2.shape)

    result = np.zeros((x_max, y_max), dtype=DTYPE)
    cdef int[:, :] result_view = result

    cdef int tmp
    cdef Py_ssize_t x, y

    for x in range(x_max):
        for y in range(y_max):

            tmp = clip(array_1[x, y], 2, 10)
            tmp = tmp * a + array_2[x, y] * b
            result_view[x, y] = tmp + c

    return result

Let’s see how much faster accessing is now.

In [22]: %timeit compute_memview.compute(array_1, array_2, a, b, c)
22.9 ms ± 197 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Note the importance of this change. We’re now 3081 times faster than an interpreted version of Python and 4.5 times faster than NumPy.

Memoryviews can be used with slices too, or even with Python arrays. Check out the memoryview page to see what they can do for you.
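For instance, a Python array.array exposes the buffer protocol just like an ndarray, so in plain Python it can also be wrapped and sliced without copying (here with the built-in memoryview as a stand-in for a typed one):

```python
from array import array

data = array('i', [1, 2, 3, 4, 5])   # a Python array of C ints
view = memoryview(data)              # no copy

# Slicing a memoryview yields another zero-copy view.
part = view[1:4]
part[0] = 99

assert data[1] == 99                 # the write went through to data
assert part.tolist() == [99, 3, 4]
```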

Tuning indexing further

The array lookups are still slowed down by two factors:

  1. Bounds checking is performed.
  2. Negative indices are checked for and handled correctly. The code above is explicitly coded so that it doesn’t use negative indices, and it (hopefully) always accesses within bounds.

With decorators, we can deactivate those checks:

...
cimport cython
@cython.boundscheck(False)  # Deactivate bounds checking
@cython.wraparound(False)   # Deactivate negative indexing.
def compute(int[:, :] array_1, int[:, :] array_2, int a, int b, int c):
...

Now bounds checking is not performed (and, as a side-effect, if you *do* happen to access out of bounds, you will in the best case crash your program and in the worst case corrupt data). It is possible to switch bounds-checking mode in many ways; see Compiler directives for more information.
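To make concrete what the wraparound check buys (and what you give up when disabling it), here is a sketch in plain Python of the index translation Cython normally inserts; the helper name is made up:

```python
import numpy as np

arr = np.array([10, 20, 30], dtype=np.intc)

def wrap_index(i, n):
    # A sketch of the check Cython inserts per index when
    # wraparound is enabled: negative indices count from the end.
    return i + n if i < 0 else i

assert arr[-1] == 30                          # Python semantics
assert wrap_index(-1, len(arr)) == 2          # what the check computes
assert arr[wrap_index(-1, len(arr))] == 30

# With @cython.wraparound(False), the raw index is used directly, so
# an accidental -1 would read memory before the buffer instead.
```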

In [23]: %timeit compute_index.compute(array_1, array_2, a, b, c)
16.8 ms ± 25.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

We’re faster than the NumPy version (6.2x). NumPy is really well written, but it does not perform operations lazily, which results in a lot of intermediate copy operations in memory. Our version is very memory efficient and cache friendly because we can execute the operations in a single run over the data.

Warning

Speed comes with some cost. In particular, it can be dangerous to set typed objects (like array_1, array_2 and result_view in our sample code) to None. Setting such objects to None is entirely legal, but all you can do with them is check whether they are None. All other use (attribute lookup or indexing) can potentially segfault or corrupt data (rather than raising exceptions as they would in Python).

The actual rules are a bit more complicated but the main message is clear: Do not use typed objects without knowing that they are not set to None.

Declaring the NumPy arrays as contiguous

For extra speed gains, if you know that the NumPy arrays you are providing are contiguous in memory, you can declare the memoryview as contiguous.

We give an example on an array that has 3 dimensions. If you want to give Cython the information that the data is C-contiguous you have to declare the memoryview like this:

cdef int [:,:,::1] a

If you want to give Cython the information that the data is Fortran-contiguous you have to declare the memoryview like this:

cdef int [::1, :, :] a

If all this makes no sense to you, you can skip this part; just know that declaring arrays as contiguous constrains the usage of your functions, as it rejects array slices as input. If you still want to understand what contiguous arrays are all about, you can see this answer on StackOverflow.
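In plain NumPy you can check contiguity through the flags attribute, which is exactly the property a ::1 declaration enforces at call time:

```python
import numpy as np

a = np.zeros((3, 4, 5), dtype=np.intc)   # C-contiguous by default
assert a.flags['C_CONTIGUOUS']

f = np.asfortranarray(a)                 # Fortran-ordered copy
assert f.flags['F_CONTIGUOUS']

# Slicing with a step breaks contiguity; such an array would be
# rejected by an int[:, :, ::1] parameter.
s = a[:, ::2, :]
assert not s.flags['C_CONTIGUOUS']

# np.ascontiguousarray() makes a C-contiguous copy when needed.
assert np.ascontiguousarray(s).flags['C_CONTIGUOUS']
```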

For the sake of giving numbers, here are the speed gains that you should get by declaring the memoryviews as contiguous:

In [23]: %timeit compute_contiguous.compute(array_1, array_2, a, b, c)
11.1 ms ± 30.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

We’re now around nine times faster than the NumPy version, and 6300 times faster than the pure Python version!

Making the function cleaner

Declaring types can make your code quite verbose. If you don’t mind Cython inferring the C types of your variables, you can use the infer_types=True compiler directive at the top of the file. It will save you quite a bit of typing.

Note that since type declarations must happen at the top indentation level, Cython won’t infer the type of variables declared for the first time at other indentation levels; doing so would change the meaning of our code too much. This is why we must still manually declare the types of the tmp, x and y variables.

And actually, manually giving the type of the tmp variable will be useful when using fused types.

# cython: infer_types=True
import numpy as np
cimport cython

DTYPE = np.intc


cdef int clip(int a, int min_value, int max_value):
    return min(max(a, min_value), max_value)


@cython.boundscheck(False)
@cython.wraparound(False)
def compute(int[:, ::1] array_1, int[:, ::1] array_2, int a, int b, int c):
     
    x_max = array_1.shape[0]
    y_max = array_1.shape[1]
    
    assert tuple(array_1.shape) == tuple(array_2.shape)

    result = np.zeros((x_max, y_max), dtype=DTYPE)
    cdef int[:, ::1] result_view = result

    cdef int tmp
    cdef Py_ssize_t x, y

    for x in range(x_max):
        for y in range(y_max):

            tmp = clip(array_1[x, y], 2, 10)
            tmp = tmp * a + array_2[x, y] * b
            result_view[x, y] = tmp + c

    return result

We now do a speed test:

In [24]: %timeit compute_infer_types.compute(array_1, array_2, a, b, c)
11.5 ms ± 261 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Lo and behold, the speed has not changed.

More generic code

All those speed gains are nice, but adding types constrains our code. At the moment, it would mean that our function can only work with NumPy arrays with the np.intc type. Is it possible to make our code work for multiple NumPy data types?

Yes, with the help of a new feature called fused types. You can learn more about it at this section of the documentation. It is similar to C++ templates. It generates multiple function declarations at compile time, and then chooses the right one at run time based on the types of the arguments provided. By comparing types in if-conditions, it is also possible to execute entirely different code paths depending on the specific data type.

In our example, since we no longer have access to the NumPy dtype of our input arrays, we use those if-else statements to choose the right NumPy data type for our output array.

In this case, our function now works for ints, doubles and long longs.
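A rough pure-Python analogue of this dispatch may help: the compiled code selects the branch per specialization at compile time, while here we inspect the runtime dtype instead (the helper name is made up):

```python
import numpy as np

def output_dtype(arr):
    # Mirrors the fused-type if/elif chain in the Cython code below.
    if arr.dtype == np.intc:
        return np.intc
    elif arr.dtype == np.double:
        return np.double
    elif arr.dtype == np.longlong:
        return np.longlong
    raise TypeError(f"unsupported dtype: {arr.dtype}")

assert output_dtype(np.zeros(3, dtype=np.intc)) is np.intc
assert output_dtype(np.zeros(3, dtype=np.double)) is np.double
assert output_dtype(np.zeros(3, dtype=np.longlong)) is np.longlong
```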

# cython: infer_types=True
import numpy as np
cimport cython

ctypedef fused my_type:
    int
    double
    long long


cdef my_type clip(my_type a, my_type min_value, my_type max_value):
    return min(max(a, min_value), max_value)


@cython.boundscheck(False)
@cython.wraparound(False)
def compute(my_type[:, ::1] array_1, my_type[:, ::1] array_2, my_type a, my_type b, my_type c):
     
    x_max = array_1.shape[0]
    y_max = array_1.shape[1]
    
    assert tuple(array_1.shape) == tuple(array_2.shape)

    if my_type is int:
        dtype = np.intc
    elif my_type is double:
        dtype = np.double
    elif my_type is cython.longlong:
        dtype = np.longlong

    result = np.zeros((x_max, y_max), dtype=dtype)
    cdef my_type[:, ::1] result_view = result

    cdef my_type tmp
    cdef Py_ssize_t x, y

    for x in range(x_max):
        for y in range(y_max):

            tmp = clip(array_1[x, y], 2, 10)
            tmp = tmp * a + array_2[x, y] * b
            result_view[x, y] = tmp + c

    return result

We can check that the output type is the right one:

>>> compute(array_1, array_2, a, b, c).dtype
dtype('int32')
>>> compute(array_1.astype(np.double), array_2.astype(np.double), a, b, c).dtype
dtype('float64')

We now do a speed test:

In [25]: %timeit compute_fused_types.compute(array_1, array_2, a, b, c)
11.5 ms ± 258 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

More versions of the function are created at compile time, and the right one is chosen at run time, so it makes sense that the speed doesn’t change when executing this function with integers as before.

Using multiple threads

Cython has support for OpenMP. It also has some nice wrappers around it, like the function prange(). You can see more information about Cython and parallelism in Using Parallelism. Since we do elementwise operations, we can easily distribute the work among multiple threads. It’s important not to forget to pass the correct arguments to the compiler to enable OpenMP. When using the Jupyter notebook, you should use the cell magic like this:

%%cython --force
# distutils: extra_compile_args=-fopenmp
# distutils: extra_link_args=-fopenmp

The GIL must be released (see Releasing the GIL), which is why we declare our clip() function nogil.

# tag: openmp
# You can ignore the previous line.
# It's for internal testing of the cython documentation.

# distutils: extra_compile_args=-fopenmp
# distutils: extra_link_args=-fopenmp

import numpy as np
cimport cython
from cython.parallel import prange

ctypedef fused my_type:
    int
    double
    long long


# We declare our plain C function nogil
cdef my_type clip(my_type a, my_type min_value, my_type max_value) nogil:
    return min(max(a, min_value), max_value)


@cython.boundscheck(False)
@cython.wraparound(False)
def compute(my_type[:, ::1] array_1, my_type[:, ::1] array_2, my_type a, my_type b, my_type c):

    cdef Py_ssize_t x_max = array_1.shape[0]
    cdef Py_ssize_t y_max = array_1.shape[1]

    assert tuple(array_1.shape) == tuple(array_2.shape)

    if my_type is int:
        dtype = np.intc
    elif my_type is double:
        dtype = np.double
    elif my_type is cython.longlong:
        dtype = np.longlong

    result = np.zeros((x_max, y_max), dtype=dtype)
    cdef my_type[:, ::1] result_view = result

    cdef my_type tmp
    cdef Py_ssize_t x, y

    # We use prange here.
    for x in prange(x_max, nogil=True):
        for y in range(y_max):

            tmp = clip(array_1[x, y], 2, 10)
            tmp = tmp * a + array_2[x, y] * b
            result_view[x, y] = tmp + c

    return result

We can have substantial speed gains for minimal effort:

In [25]: %timeit compute_prange.compute(array_1, array_2, a, b, c)
9.33 ms ± 412 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

We’re now 7558 times faster than the pure Python version and 11.1 times faster than NumPy!
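The row-wise work splitting that prange performs can be sketched with a plain-Python thread pool (this sketch cannot match OpenMP's speed, since the GIL is only released inside each vectorized NumPy call; the function name is made up):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def compute_rows(array_1, array_2, a, b, c, rows, out):
    # Each task handles a block of rows, like prange handing
    # row indices to OpenMP threads.
    for x in rows:
        out[x] = np.clip(array_1[x], 2, 10) * a + array_2[x] * b + c

array_1 = np.random.randint(0, 100, (100, 100)).astype(np.intc)
array_2 = np.random.randint(0, 100, (100, 100)).astype(np.intc)
a, b, c = 4, 3, 9
result = np.empty_like(array_1)

row_blocks = np.array_split(np.arange(array_1.shape[0]), 4)
with ThreadPoolExecutor(max_workers=4) as pool:
    for rows in row_blocks:
        pool.submit(compute_rows, array_1, array_2, a, b, c, rows, result)
# Leaving the "with" block waits for all tasks to finish.

expected = np.clip(array_1, 2, 10) * a + array_2 * b + c
assert np.array_equal(result, expected)
```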

Where to go from here?

Pythran as a Numpy backend

Using the flag --np-pythran, it is possible to use the Pythran numpy implementation for numpy related operations. One advantage of using this backend is that the Pythran implementation uses C++ expression templates to save memory transfers and can benefit from the SIMD instructions of modern CPUs.

This can lead to really interesting speedups in some cases, ranging from 2x up to 16x, depending on the targeted CPU architecture and the original algorithm.

Please note that this feature is experimental.

Usage example with distutils

You first need to install Pythran. See its documentation for more information.

Then, simply add a cython: np_pythran=True directive at the top of the Python files that need to be compiled using Pythran numpy support.

Here is an example of a simple setup.py file using distutils:

from distutils.core import setup
from Cython.Build import cythonize

setup(
    name = "My hello app",
    ext_modules = cythonize('hello_pythran.pyx')
)

Then, with the following header in hello_pythran.pyx:

# cython: np_pythran=True

hello_pythran.pyx will be compiled using Pythran numpy support.

Please note that Pythran can further be tweaked by adding settings in the $HOME/.pythranrc file. For instance, this can be used to enable Boost.SIMD support. See the Pythran user manual for more information.


Cython Changelog

0.29.2 (2018-12-14)

Bugs fixed

  • The code generated for deduplicated constants leaked some references. (Github issue #2750)
  • The declaration of sigismember() in libc.signal was corrected. (Github issue #2756)
  • Crashes in compiler and test runner were fixed. (Github issue #2736, #2755)
  • A C compiler warning about an invalid safety check was resolved. (Github issue #2731)

0.29.1 (2018-11-24)

Bugs fixed

  • Extensions compiled with MinGW-64 under Windows could misinterpret integer objects larger than 15 bit and return incorrect results. (Github issue #2670)
  • Cython no longer requires the source to be writable when copying its data into a memory view slice. Patch by Andrey Paramonov. (Github issue #2644)
  • Line tracing of try-statements generated invalid C code. (Github issue #2274)
  • When using the warn.undeclared directive, Cython’s own code generated warnings that are now fixed. Patch by Nicolas Pauss. (Github issue #2685)
  • Cython’s memoryviews no longer require strides for setting the shape field but only the PyBUF_ND flag to be set. Patch by John Kirkham. (Github issue #2716)
  • Some C compiler warnings about unused memoryview code were fixed. Patch by Ho Cheuk Ting. (Github issue #2588)
  • A C compiler warning about implicit signed/unsigned conversion was fixed. (Github issue #2729)
  • Assignments to C++ references returned by operator[] could fail to compile. (Github issue #2671)
  • The power operator and the support for NumPy math functions were fixed in Pythran expressions. Patch by Serge Guelton. (Github issues #2702, #2709)
  • Signatures with memory view arguments now show the expected type when embedded in docstrings. Patch by Matthew Chan and Benjamin Weigel. (Github issue #2634)
  • Some from ... cimport ... constructs were not correctly considered when searching modified dependencies in cythonize() to decide whether to recompile a module. Patch by Kryštof Pilnáček. (Github issue #2638)
  • A struct field type in the cpython.array declarations was corrected. Patch by John Kirkham. (Github issue #2712)

0.29 (2018-10-14)

Features added

  • PEP-489 multi-phase module initialisation has been enabled again. Module reloads in other subinterpreters raise an exception to prevent corruption of the static module state.
  • A set of mypy compatible PEP-484 declarations were added for Cython’s C data types to integrate with static analysers in typed Python code. They are available in the Cython/Shadow.pyi module and describe the types in the special cython module that can be used for typing in Python code. Original patch by Julian Gethmann. (Github issue #1965)
  • Memoryviews are supported in PEP-484/526 style type declarations. (Github issue #2529)
  • @cython.nogil is supported as a C-function decorator in Python code. (Github issue #2557)
  • Raising exceptions from nogil code will automatically acquire the GIL, instead of requiring an explicit with gil block.
  • C++ functions can now be declared as potentially raising both C++ and Python exceptions, so that Cython can handle both correctly. (Github issue #2615)
  • cython.inline() supports a direct language_level keyword argument that was previously only available via a directive.
  • A new language level name 3str was added that mostly corresponds to language level 3, but keeps unprefixed string literals as type ‘str’ in both Py2 and Py3, and the builtin ‘str’ type unchanged. This will become the default in the next Cython release and is meant to help user code a) transition more easily to this new default and b) migrate to Python 3 source code semantics without making support for Python 2.x difficult.
  • In CPython 3.6 and later, looking up globals in the module dict is almost as fast as looking up C globals. (Github issue #2313)
  • For a Python subclass of an extension type, repeated method calls to non-overridden cpdef methods can avoid the attribute lookup in Py3.6+, which makes them 4x faster. (Github issue #2313)
  • (In-)equality comparisons of objects to integer literals are faster. (Github issue #2188)
  • Some internal and 1-argument method calls are faster.
  • Modules that cimport many external extension types from other Cython modules execute less import requests during module initialisation.
  • Constant tuples and slices are deduplicated and only created once per module. (Github issue #2292)
  • The coverage plugin considers more C file extensions such as .cc and .cxx. (Github issue #2266)
  • The cythonize command accepts compile time variable values (as set by DEF) through the new -E option. Patch by Jerome Kieffer. (Github issue #2315)
  • pyximport can import from namespace packages. Patch by Prakhar Goel. (Github issue #2294)
  • Some missing numpy and CPython C-API declarations were added. Patch by John Kirkham. (Github issues #2523, #2520, #2537)
  • Declarations for the pylifecycle C-API functions were added in a new .pxd file cpython.pylifecycle.
  • The Pythran support was updated to work with the latest Pythran 0.8.7. Original patch by Adrien Guinet. (Github issue #2600)
  • %a is included in the string formatting types that are optimised into f-strings. In this case, it is also automatically mapped to %r in Python 2.x.
  • New C macro CYTHON_HEX_VERSION to access Cython’s version in the same style as PY_HEX_VERSION.
  • Constants in libc.math are now declared as const to simplify their handling.
  • An additional check_size clause was added to the ctypedef class name specification to allow suppressing warnings when importing modules with backwards-compatible PyTypeObject size changes. Patch by Matti Picus. (Github issue #2627)

Bugs fixed

  • The exception handling in generators and coroutines under CPython 3.7 was adapted to the newly introduced exception stack. Users of Cython 0.28 who want to support Python 3.7 are encouraged to upgrade to 0.29 to avoid potentially incorrect error reporting and tracebacks. (Github issue #1958)
  • Crash when importing a module under Stackless Python that was built for CPython. Patch by Anselm Kruis. (Github issue #2534)
  • 2-value slicing of typed sequences failed if the start or stop index was None. Patch by Christian Gibson. (Github issue #2508)
  • Multiplied string literals lost their factor when they are part of another constant expression (e.g. ‘x’ * 10 + ‘y’ => ‘xy’).
  • String formatting with the ‘%’ operator didn’t call the special __rmod__() method if the right side is a string subclass that implements it. (Python issue 28598)
  • The directive language_level=3 did not apply to the first token in the source file. (Github issue #2230)
  • Overriding cpdef methods did not work in Python subclasses with slots. Note that this can have a performance impact on calls from Cython code. (Github issue #1771)
  • Fix declarations of builtin or C types using strings in pure python mode. (Github issue #2046)
  • Generator expressions and lambdas failed to compile in @cfunc functions. (Github issue #459)
  • Global names with const types were not excluded from star-import assignments which could lead to invalid C code. (Github issue #2621)
  • Several internal function signatures were fixed that lead to warnings in gcc-8. (Github issue #2363)
  • The numpy helper functions set_array_base() and get_array_base() were adapted to the current numpy C-API recommendations. Patch by Matti Picus. (Github issue #2528)
  • Some NumPy related code was updated to avoid deprecated API usage. Original patch by jbrockmendel. (Github issue #2559)
  • Several C++ STL declarations were extended and corrected. Patch by Valentin Valls. (Github issue #2207)
  • C lines of the module init function were unconditionally not reported in exception stack traces. Patch by Jeroen Demeyer. (Github issue #2492)
  • When PEP-489 support is enabled, reloading the module overwrote any static module state. It now raises an exception instead, given that reloading is not actually supported.
  • Object-returning, C++ exception throwing functions were not checking that the return value was non-null. Original patch by Matt Wozniski (Github Issue #2603)
  • The source file encoding detection could get confused if the c_string_encoding directive appeared within the first two lines. (Github issue #2632)
  • Cython generated modules no longer emit a warning during import when the size of the NumPy array type is larger than what was found at compile time. Instead, this is assumed to be a backwards compatible change on NumPy side.

Other changes

  • Cython now emits a warning when no language_level (2, 3 or ‘3str’) is set explicitly, neither as a cythonize() option nor as a compiler directive. This is meant to prepare the transition of the default language level from currently Py2 to Py3, since that is what most new users will expect these days. The future default will, however, not enforce unicode literals, because this has proven a major obstacle in the support for both Python 2.x and 3.x. The next major release is intended to make this change, so that it will parse all code that does not request a specific language level as Python 3 code, but with str literals. The language level 2 will continue to be supported for an indefinite time.
  • The documentation was restructured, cleaned up and examples are now tested. The NumPy tutorial was also rewritten to simplify the running example. Contributed by Gabriel de Marmiesse. (Github issue #2245)
  • Cython compiles less of its own modules at build time to reduce the installed package size to about half of its previous size. This makes the compiler slightly slower, by about 5-7%.

0.28.6 (2018-11-01)

Bugs fixed

  • Extensions compiled with MinGW-64 under Windows could misinterpret integer objects larger than 15 bit and return incorrect results. (Github issue #2670)
  • Multiplied string literals lost their factor when they are part of another constant expression (e.g. ‘x’ * 10 + ‘y’ => ‘xy’).

0.28.5 (2018-08-03)

Bugs fixed

  • The discouraged usage of GCC’s attribute optimize("Os") was replaced by the similar attribute cold to reduce the code impact of the module init functions. (Github issue #2494)
  • A reference leak in Py2.x was fixed when comparing str to unicode for equality.

0.28.4 (2018-07-08)

Bugs fixed

  • Reallowing tp_clear() in a subtype of an @no_gc_clear extension type generated an invalid C function call to the (non-existent) base type implementation. (Github issue #2309)
  • Exception catching based on a non-literal (runtime) tuple could fail to match the exception. (Github issue #2425)
  • Compile fix for CPython 3.7.0a2. (Github issue #2477)

0.28.3 (2018-05-27)

Bugs fixed

  • Set iteration was broken in non-CPython since 0.28.
  • UnicodeEncodeError in Py2 when %s formatting is optimised for unicode strings. (Github issue #2276)
  • Work around a crash bug in g++ 4.4.x by disabling the size reduction setting of the module init function in this version. (Github issue #2235)
  • Crash when exceptions occur early during module initialisation. (Github issue #2199)

0.28.2 (2018-04-13)

Features added

  • abs() is faster for Python long objects.
  • The C++11 methods front() and end() were added to the declaration of libcpp.string. Patch by Alex Huszagh. (Github issue #2123)
  • The C++11 methods reserve() and bucket_count() are declared for libcpp.unordered_map. Patch by Valentin Valls. (Github issue #2168)

Bugs fixed

  • The copy of a read-only memoryview was considered read-only as well, whereas a common reason to copy a read-only view is to make it writable. The result of the copying is now a writable buffer by default. (Github issue #2134)
  • The switch statement generation failed to apply recursively to the body of converted if-statements.
  • NULL was sometimes rejected as exception return value when the returned type is a fused pointer type. Patch by Callie LeFave. (Github issue #2177)
  • Fixed compatibility with PyPy 5.11. Patch by Matti Picus. (Github issue #2165)

Other changes

  • The NumPy tutorial was rewritten to use memoryviews instead of the older buffer declaration syntax. Contributed by Gabriel de Marmiesse. (Github issue #2162)

0.28.1 (2018-03-18)

Bugs fixed

  • PyFrozenSet_New() was accidentally used in PyPy where it is missing from the C-API.
  • Assignment between some C++ templated types were incorrectly rejected when the templates mix const with ctypedef. (Github issue #2148)
  • Undeclared C++ no-args constructors in subclasses could make the compilation fail if the base class constructor was declared without nogil. (Github issue #2157)
  • Bytes %-formatting inferred basestring (bytes or unicode) as result type in some cases where bytes would have been safe to infer. (Github issue #2153)
  • None was accidentally disallowed as typed return value of dict.pop(). (Github issue #2152)

0.28 (2018-03-13)

Features added

  • Cdef classes can now multiply inherit from ordinary Python classes. (The primary base must still be a c class, possibly object, and the other bases must not be cdef classes.)
  • Type inference is now supported for Pythran compiled NumPy expressions. Patch by Nils Braun. (Github issue #1954)
  • The const modifier can be applied to memoryview declarations to allow read-only buffers as input. (Github issues #1605, #1869)
  • C code in the docstring of a cdef extern block is copied verbatim into the generated file. Patch by Jeroen Demeyer. (Github issue #1915)
  • When compiling with gcc, the module init function is now tuned for small code size instead of whatever compile flags were provided externally. Cython now also disables some code intensive optimisations in that function to further reduce the code size. (Github issue #2102)
  • Decorating an async coroutine with @cython.iterable_coroutine changes its type at compile time to make it iterable. While this is not strictly in line with PEP-492, it improves the interoperability with old-style coroutines that use yield from instead of await.
  • The IPython magic has preliminary support for JupyterLab. (Github issue #1775)
  • The new TSS C-API in CPython 3.7 is supported and has been backported. Patch by Naotoshi Seo. (Github issue #1932)
  • Cython knows the new Py_tss_t type defined in PEP-539 and automatically initialises variables declared with that type to Py_tss_NEEDS_INIT, a value which cannot be used outside of static assignments.
  • The set methods .remove() and .discard() are optimised. Patch by Antoine Pitrou. (Github issue #2042)
  • dict.pop() is optimised. Original patch by Antoine Pitrou. (Github issue #2047)
  • Iteration over sets and frozensets is optimised. (Github issue #2048)
  • Safe integer loops (< range(2^30)) are automatically optimised into C loops.
  • alist.extend([a,b,c]) is optimised into sequential list.append() calls for short literal sequences.
  • Calls to builtin methods that are not specifically optimised into C-API calls now use a cache that avoids repeated lookups of the underlying C function. (Github issue #2054)
  • Single argument function calls can avoid the argument tuple creation in some cases.
  • Some redundant extension type checks are avoided.
  • Formatting C enum values in f-strings is faster, as well as some other special cases.
  • String formatting with the ‘%’ operator is optimised into f-strings in simple cases.
  • Subscripting (item access) is faster in some cases.
  • Some bytearray operations have been optimised similar to bytes.
  • Some PEP-484/526 container type declarations are now considered for loop optimisations.
  • Indexing into memoryview slices with view[i][j] is now optimised into view[i, j].
  • Python compatible cython.* types can now be mixed with type declarations in Cython syntax.
  • Name lookups in the module and in classes are faster.
  • Python attribute lookups on extension types without instance dict are faster.
  • Some missing signals were added to libc/signal.pxd. Patch by Jeroen Demeyer. (Github issue #1914)
  • The warning about repeated extern declarations is now visible by default. (Github issue #1874)
  • The exception handling of the function types used by CPython’s type slot functions was corrected to match the de-facto standard behaviour, so that code that uses them directly benefits from automatic and correct exception propagation. Patch by Jeroen Demeyer. (Github issue #1980)
  • Defining the macro CYTHON_NO_PYINIT_EXPORT will prevent the module init function from being exported as symbol, e.g. when linking modules statically in an embedding setup. Patch by AraHaan. (Github issue #1944)

Bugs fixed

  • If a module name is explicitly provided for an Extension() that is compiled via cythonize(), it was previously ignored and replaced by the source file name. It can now be used to override the target module name, e.g. for compiling prefixed accelerator modules from Python files. (Github issue #2038)
  • The arguments of the num_threads parameter of parallel sections were not sufficiently validated and could lead to invalid C code. (Github issue #1957)
  • Catching exceptions with a non-trivial exception pattern could call into CPython with a live exception set. This triggered incorrect behaviour and crashes, especially in CPython 3.7.
  • The signature of the special __richcmp__() method was corrected to recognise the type of the first argument as self. It was previously treated as plain object, but CPython actually guarantees that it always has the correct type. Note: this can change the semantics of user code that previously relied on self being untyped.
  • Some Python 3 exceptions were not recognised as builtins when running Cython under Python 2.
  • Some async helper functions were not defined in the generated C code when compiling simple async code. (Github issue #2075)
  • Line tracing did not include generators and coroutines. (Github issue #1949)
  • C++ declarations for unordered_map were corrected. Patch by Michael Schatzow. (Github issue #1484)
  • Iterator declarations in C++ deque and vector were corrected. Patch by Alex Huszagh. (Github issue #1870)
  • The const modifiers in the C++ string declarations were corrected, together with the coercion behaviour of string literals into C++ strings. (Github issue #2132)
  • Some declaration types in libc.limits were corrected. Patch by Jeroen Demeyer. (Github issue #2016)
  • @cython.final was not accepted on Python classes with an @cython.cclass decorator. (Github issue #2040)
  • Cython no longer creates useless and incorrect PyInstanceMethod wrappers for methods in Python 3. Patch by Jeroen Demeyer. (Github issue #2105)
  • The builtin bytearray type could not be used as base type of cdef classes. (Github issue #2106)

Other changes

0.27.3 (2017-11-03)

Bugs fixed

  • String forward references to extension types like @cython.locals(x="ExtType") failed to find the named type. (Github issue #1962)
  • NumPy slicing generated incorrect results when compiled with Pythran. Original patch by Serge Guelton (Github issue #1946).
  • Fix “undefined reference” linker error for generators on Windows in Py3.3-3.5. (Github issue #1968)
  • Adapt to recent C-API change of PyThreadState in CPython 3.7.
  • Fix signature of PyWeakref_GetObject() API declaration. Patch by Jeroen Demeyer (Github issue #1975).

0.27.2 (2017-10-22)

Bugs fixed

  • Comprehensions could incorrectly be optimised away when they appeared in boolean test contexts. (Github issue #1920)
  • The special methods __eq__, __lt__ etc. in extension types did not type their first argument as the type of the class but object. (Github issue #1935)
  • Crash on first lookup of “cline_in_traceback” option during exception handling. (Github issue #1907)
  • Some nested module level comprehensions failed to compile. (Github issue #1906)
  • Compiler crash on some complex type declarations in pure mode. (Github issue #1908)
  • std::unordered_map.erase() was declared with an incorrect void return type in libcpp.unordered_map. (Github issue #1484)
  • Invalid use of C++ fallthrough attribute before C++11 and similar issue in clang. (Github issue #1930)
  • Compiler crash on misnamed properties. (Github issue #1905)

0.27.1 (2017-10-01)

Features added

  • The Jupyter magic has a new debug option --verbose that shows details about the distutils invocation. Patch by Boris Filippov (Github issue #1881).

Bugs fixed

  • Py3 list comprehensions in class bodies resulted in invalid C code. (Github issue #1889)
  • Modules built for later CPython 3.5.x versions failed to import in 3.5.0/3.5.1. (Github issue #1880)
  • Deallocating fused type functions and methods kept their GC tracking enabled, which could potentially lead to recursive deallocation attempts.
  • Crash when compiling in C++ mode with old setuptools versions. (Github issue #1879)
  • C++ object arguments for the constructor of Cython-implemented C++ classes are now passed by reference rather than by value, to allow for non-copyable arguments such as unique_ptr.
  • API-exported C++ classes with Python object members failed to compile. (Github issue #1866)
  • Some issues with the new relaxed exception value handling were resolved.
  • Python classes as annotation types could prevent compilation. (Github issue #1887)
  • Cython annotation types in Python files could lead to import failures with a “cython undefined” error. Recognised types are now turned into strings.
  • Coverage analysis could fail to report on extension modules on some platforms.
  • Annotations could be parsed (and rejected) as types even with annotation_typing=False.

Other changes

  • PEP 489 support has been disabled by default to counter incompatibilities with import setups that try to reload or reinitialise modules.

0.27 (2017-09-23)

Features added

  • Extension module initialisation follows PEP 489 in CPython 3.5+, which resolves several differences with regard to normal Python modules. This makes the global names __file__ and __path__ correctly available to module level code and improves the support for module-level relative imports. (Github issues #1715, #1753, #1035)
  • Asynchronous generators (PEP 525) and asynchronous comprehensions (PEP 530) have been implemented. Note that async generators require finalisation support in order to allow for asynchronous operations during cleanup, which is only available in CPython 3.6+. All other functionality has been backported as usual.
  • Variable annotations are now parsed according to PEP 526. Cython types (e.g. cython.int) are evaluated as C type declarations and everything else as Python types. This can be disabled with the directive annotation_typing=False. Note that most complex PEP-484 style annotations are currently ignored. This will change in future releases. (Github issue #1850)
  • Extension types (also in pure Python mode) can implement the normal special methods __eq__, __lt__ etc. for comparisons instead of the low-level __richcmp__ method. (Github issue #690)
  • New decorator @cython.exceptval(x=None, check=False) that makes the signature declarations except x, except? x and except * available to pure Python code. Original patch by Antonio Cuni. (Github issue #1653)
  • Signature annotations are now included in the signature docstring generated by the embedsignature directive. Patch by Lisandro Dalcin (Github issue #1781).
  • The gdb support for Python code (libpython.py) was updated to the latest version in CPython 3.7 (git rev 5fe59f8).
  • The compiler tries to find a usable exception return value for cdef functions with except * if the returned type allows it. Note that this feature is subject to safety limitations, so it is still better to provide an explicit declaration.
  • C functions can be assigned to function pointers with a compatible exception declaration, not only with exact matches. As a side effect, certain compatible signature overrides are now allowed, and some mismatches of exception signatures that previously went undetected are now detected and rejected as errors.
  • The IPython/Jupyter magic integration has a new option %%cython --pgo for profile guided optimisation. It compiles the cell with PGO settings for the C compiler, executes it to generate a runtime profile, and then compiles it again using that profile for C compiler optimisation. Currently only tested with gcc.
  • len(memoryview) can be used in nogil sections to get the size of the first dimension of a memory view (shape[0]). (Github issue #1733)
  • C++ classes can now contain (properly refcounted) Python objects.
  • NumPy dtype subarrays are now accessible through the C-API. Patch by Gerald Dalley (Github issue #245).
  • Resolves several issues with PyPy and uses faster async slots in PyPy3. Patch by Ronan Lamy (Github issues #1871, #1878).
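The asynchronous generator (PEP 525) and asynchronous comprehension (PEP 530) support mirrors plain CPython semantics, so it can be sketched in ordinary Python. A minimal illustrative example (function names are hypothetical; runnable on CPython 3.7+, where no backporting is needed):

```python
import asyncio

async def ticks(n):
    # PEP 525 asynchronous generator: it may await between yields
    for i in range(n):
        await asyncio.sleep(0)
        yield i

async def collect():
    # PEP 530 asynchronous comprehension consuming the async generator
    return [i * i async for i in ticks(4)]

result = asyncio.run(collect())
```

Compiled with Cython, the same source gains the backported functionality on older CPython 3.x versions, subject to the finalisation caveat noted above.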

Bugs fixed

  • Extension types that were cimported from other Cython modules could disagree about the order of fused cdef methods in their call table. This could lead to wrong methods being called and potentially also crashes. The fix required changes to the ordering of fused methods in the call table, which may break existing compiled modules that call fused cdef methods across module boundaries, if these methods were implemented in a different order than they were declared in the corresponding .pxd file. (Github issue #1873)
  • The exception state handling in generators and coroutines could lead to exceptions in the caller being lost if an exception was raised and handled inside of the coroutine when yielding. (Github issue #1731)
  • Loops over range(enum) were not converted into C for-loops. Note that it is still recommended to use an explicit cast to a C integer type in this case.
  • Error positions of names (e.g. variables) were incorrectly reported after the name and not at the beginning of the name.
  • Compile time DEF assignments were evaluated even when they occur inside of falsy IF blocks. (Github issue #1796)
  • Disabling the line tracing from a trace function could fail. Original patch by Dmitry Trofimov. (Github issue #1769)
  • Several issues with the Pythran integration were resolved.
  • abs(signed int) now returns a signed rather than unsigned int. (Github issue #1837)
  • Reading frame.f_locals of a Cython function (e.g. from a debugger or profiler) could modify the module globals. (Github issue #1836)
  • Buffer type mismatches in the NumPy buffer support could leak a reference to the buffer owner.
  • Using the “is_f_contig” and “is_c_contig” memoryview methods together could leave one of them undeclared. (Github issue #1872)
  • Compilation failed if the for-in-range loop target was not a variable but a more complex expression, e.g. an item assignment. (Github issue #1831)
  • Compile time evaluations of (partially) constant f-strings could show incorrect results.
  • Escape sequences in raw f-strings (fr'...') were resolved instead of passing them through as expected.
  • Some ref-counting issues in buffer error handling have been resolved.
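The raw f-string behaviour restored by the fix above matches plain Python semantics; a small illustrative sketch:

```python
name = "world"
# fr'...' raw f-strings keep escape sequences like \n as literal
# characters while still performing {name} substitution
raw = fr'hello\n{name}'
# an ordinary f-string resolves the escape into a newline
plain = f'hello\n{name}'
```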

Other changes

  • Type declarations in signature annotations are now parsed according to PEP 484 typing. Only Cython types (e.g. cython.int) and Python builtin types are currently considered as type declarations. Everything else is ignored, but this will change in a future Cython release. (Github issue #1672)
  • The directive annotation_typing is now True by default, which enables parsing type declarations from annotations.
  • This release no longer supports Python 3.2.
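A minimal pure-Python-mode sketch of annotation typing under the default annotation_typing=True; the function and its names are illustrative, not from the release itself:

```cython
import cython

def mean(total: cython.double, count: cython.int) -> cython.double:
    # Cython types in annotations compile to C double/int
    # declarations; builtin Python types are used as Python type
    # declarations, and anything else is currently ignored.
    return total / count
```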

0.26.1 (2017-08-29)

Features added

Bugs fixed

  • cython.view.array was missing .__len__().
  • Extension types with a .pxd override for their __releasebuffer__ slot (e.g. as provided by Cython for the Python array.array type) could leak a reference to the buffer owner on release, thus not freeing the memory. (Github issue #1638)
  • Auto-decoding failed in 0.26 for strings inside of C++ containers. (Github issue #1790)
  • Compile error when inheriting from C++ container types. (Github issue #1788)
  • Invalid C code in generators (declaration after code). (Github issue #1801)
  • Arithmetic operations on const integer variables could generate invalid code. (Github issue #1798)
  • Local variables with names of special Python methods failed to compile inside of closures. (Github issue #1797)
  • Problem with indirect Emacs buffers in cython-mode. Patch by Martin Albrecht (Github issue #1743).
  • Extension types named result or PickleError generated invalid unpickling code. Patch by Jason Madden (Github issue #1786).
  • Bazel integration failed to compile .py files. Patch by Guro Bokum (Github issue #1784).
  • Some include directories and dependencies were referenced with their absolute paths in the generated files despite lying within the project directory.
  • Failure to compile in Py3.7 due to a modified signature of _PyCFunctionFast().

0.26 (2017-07-19)

Features added

  • Pythran can be used as a backend for evaluating NumPy array expressions. Patch by Adrien Guinet (Github issue #1607).
  • cdef classes now support pickling by default when possible. This can be disabled with the auto_pickle directive.
  • Speed up comparisons of strings if their hash value is available. Patch by Claudio Freire (Github issue #1571).
  • Support pyximport from zip files. Patch by Sergei Lebedev (Github issue #1485).
  • IPython magic now respects the __all__ variable and ignores names with leading-underscore (like import * does). Patch by Syrtis Major (Github issue #1625).
  • abs() is optimised for C complex numbers. Patch by da-woods (Github issue #1648).
  • The display of C lines in Cython tracebacks can now be enabled at runtime via import cython_runtime; cython_runtime.cline_in_traceback=True. The default has been changed to False.
  • The overhead of calling fused types generic functions was reduced.
  • “cdef extern” include files are now also searched relative to the current file. Patch by Jeroen Demeyer (Github issue #1654).
  • Optional optimization for re-acquiring the GIL, controlled by the fast_gil directive.
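A hedged sketch of opting out of the new automatic pickling with the auto_pickle directive (the class itself is illustrative):

```cython
# cython: auto_pickle=False
# Disables the pickling support that cdef classes otherwise
# gain by default in 0.26 where possible.
cdef class Point:
    cdef public double x
    cdef public double y
```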

Bugs fixed

  • Item lookup/assignment with a unicode character as index that is typed (explicitly or implicitly) as Py_UCS4 or Py_UNICODE used the integer value instead of the Unicode string value. Code that relied on the previous behaviour now triggers a warning that can be disabled by applying an explicit cast. (Github issue #1602)
  • f-string processing was adapted to changes in PEP 498 and CPython 3.6.
  • Invalid C code when decoding from UTF-16(LE/BE) byte strings. (Github issue #1696)
  • Unicode escapes in ‘ur’ raw-unicode strings were not resolved in Py2 code. Original patch by Aaron Gallagher (Github issue #1594).
  • File paths of code objects are now relative. Original patch by Jelmer Vernooij (Github issue #1565).
  • Decorators of cdef class methods could be executed twice. Patch by Jeroen Demeyer (Github issue #1724).
  • Dict iteration using the Py2 iter* methods failed in PyPy3. Patch by Armin Rigo (Github issue #1631).
  • Several warnings in the generated code are now suppressed.

Other changes

  • The unraisable_tracebacks option now defaults to True.
  • Coercion of C++ containers to Python is no longer automatic on attribute access (Github issue #1521).
  • Access to Python attributes of cimported modules without the corresponding import is now a compile-time (rather than runtime) error.
  • Do not use special dll linkage for “cdef public” functions. Patch by Jeroen Demeyer (Github issue #1687).
  • cdef/cpdef methods must match their declarations. This is now a warning and will be an error in future releases. (Github issue #1732)

0.25.2 (2016-12-08)

Bugs fixed

  • Fixes several issues with C++ template deduction.
  • Fixes an issue with bound method type inference (Github issue #551).
  • Fixes a bug with cascaded tuple assignment (Github issue #1523).
  • Fixed or silenced many Clang warnings.
  • Fixes a bug with powers of pure real complex numbers (Github issue #1538).

0.25.1 (2016-10-26)

Bugs fixed

  • Fixes a bug with isinstance(o, Exception) (Github issue #1496).
  • Fixes a bug with cython.view.array missing utility code in some cases (Github issue #1502).

Other changes

  • The distutils extension Cython.Distutils.build_ext has been reverted, temporarily, to be old_build_ext to give projects time to migrate. The new build_ext is available as new_build_ext.

0.25 (2016-10-25)

Features added

  • def/cpdef methods of cdef classes benefit from Cython’s internal function implementation, which enables introspection and line profiling for them. Implementation sponsored by Turbostream (www.turbostream-cfd.com).
  • Calls to Python functions are faster, following the recent “FastCall” optimisations that Victor Stinner implemented for CPython 3.6. See https://bugs.python.org/issue27128 and related issues.
  • The new METH_FASTCALL calling convention for PyCFunctions is supported in CPython 3.6. See https://bugs.python.org/issue27810
  • Initial support for using Cython modules in Pyston. Patch by Boxiang Sun.
  • Dynamic Python attributes are allowed on cdef classes if an attribute cdef dict __dict__ is declared in the class. Patch by empyrical.
  • Cython implemented C++ classes can make direct calls to base class methods. Patch by empyrical.
  • C++ classes can now have typedef members. STL containers updated with value_type.
  • New directive cython.no_gc to fully disable GC for a cdef class. Patch by Claudio Freire.
  • Buffer variables are no longer excluded from locals(). Patch by da-woods.
  • Building f-strings is faster, especially when formatting C integers.
  • for-loop iteration over “std::string”.
  • libc/math.pxd provides e and pi as alias constants to simplify usage as a drop-in replacement for Python’s math module.
  • Speed up cython.inline().
  • Binary lshift operations with small constant Python integers are faster.
  • Some integer operations on Python long objects are faster in Python 2.7.
  • Support for the C++ typeid operator.
  • Support for Bazel using the pyx_library rule in //Tools:rules.bzl.
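The new for-loop iteration over std::string can be sketched as follows; this is an illustrative fragment (requires compiling in C++ mode), not code from the release:

```cython
# distutils: language = c++
from libcpp.string cimport string

def count_bytes(string s):
    # 0.25 adds direct for-loop iteration over std::string,
    # yielding the individual char elements
    cdef char c
    cdef Py_ssize_t n = 0
    for c in s:
        n += 1
    return n
```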

Significant Bugs fixed

  • Division of complex numbers avoids overflow by using Smith’s method.
  • Some function signatures in libc.math and numpy.pxd were incorrect. Patch by Michael Seifert.

Other changes

  • The “%%cython” IPython/jupyter magic now defaults to the language level of the current jupyter kernel. The language level can be set explicitly with “%%cython -2” or “%%cython -3”.
  • The distutils extension Cython.Distutils.build_ext has now been updated to use cythonize which properly handles dependencies. The old extension can still be found in Cython.Distutils.old_build_ext and is now deprecated.
  • directive_defaults is no longer available in Cython.Compiler.Options, use get_directive_defaults() instead.

0.24.1 (2016-07-15)

Bugs fixed

  • IPython cell magic was lacking a good way to enable Python 3 code semantics. It can now be used as “%%cython -3”.
  • Follow a recent change in PEP 492 and CPython 3.5.2 that now requires the __aiter__() method of asynchronous iterators to be a simple def method instead of an async def method.
  • Coroutines and generators were lacking the __module__ special attribute.
  • C++ std::complex values failed to auto-convert from and to Python complex objects.
  • Namespaced C++ types could not be used as memory view types due to lack of name mangling. Patch by Ivan Smirnov.
  • Assignments between identical C++ types that were declared with differently typedefed template types could fail.
  • Rebuilds could fail to evaluate dependency timestamps in C++ mode. Patch by Ian Henriksen.
  • Macros defined in the distutils compiler option do not require values anymore. Patch by Ian Henriksen.
  • Minor fixes for MSVC, Cygwin and PyPy.

0.24 (2016-04-04)

Features added

  • PEP 498: Literal String Formatting (f-strings). Original patch by Jelle Zijlstra.
  • PEP 515: Underscores as visual separators in number literals.
  • Parser was adapted to some minor syntax changes in Py3.6, e.g. https://bugs.python.org/issue9232
  • The embedded C code comments that show the original source code can be discarded with the new directive emit_code_comments=False.
  • Cpdef enums are now first-class iterable, callable types in Python.
  • Ctuples can now be declared in pure Python code.
  • Posix declarations for DLL loading and stdio extensions were added. Patch by Lars Buitinck.
  • The Py2-only builtins unicode(), xrange(), reduce() and long are now also available in compile time DEF expressions when compiling with Py3.
  • Exception type tests have slightly lower overhead. This fixes ticket 868.
  • @property syntax fully supported in cdef classes, old syntax deprecated.
  • C++ classes can now be declared with default template parameters.
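The f-string (PEP 498) and underscore number-literal (PEP 515) syntax accepted by the 0.24 parser matches plain Python 3.6; an illustrative, runnable sketch:

```python
# PEP 515: underscores in number literals are ignored by the parser
amount = 1_000_000
rate = 2.5
# PEP 498: literal string formatting with format specifiers
msg = f"{amount} units at {rate:.1f}%"
```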

Bugs fixed

  • C++ exceptions raised by overloaded C++ operators were not always handled. Patch by Ian Henriksen.
  • C string literals were previously always stored as non-const global variables in the module. They are now stored as global constants when possible, and otherwise as non-const C string literals in the generated code that uses them. This improves compatibility with strict C compiler options and prevents non-const string literals with the same content from being incorrectly merged.
  • Compile time evaluated str expressions (DEF) now behave in a more useful way by turning into Unicode strings when compiling under Python 3. This allows using them as intermediate values in expressions. Previously, they always evaluated to bytes objects.
  • isinf() declarations in libc/math.pxd and numpy/math.pxd now reflect the actual tristate int return value instead of using bint.
  • Literal assignments to ctuples avoid Python tuple round-trips in some more corner cases.
  • Iteration over dict(...).items() failed to get optimised when dict arguments included keyword arguments.
  • cProfile now correctly profiles cpdef functions and methods.

0.23.5 (2016-03-26)

  • Compile errors and warnings in integer type conversion code. This fixes ticket 877. Patches by Christian Neukirchen, Nikolaus Rath, Ian Henriksen.
  • Reference leak when “*args” argument was reassigned in closures.
  • Truth-testing Unicode strings could waste time and memory in Py3.3+.
  • Return values of async functions could be ignored and replaced by None.
  • Compiler crash in CPython 3.6.
  • Fix prange() to behave identically to range(). The end condition was miscalculated when the range was not exactly divisible by the step.
  • Optimised all(genexpr)/any(genexpr) calls could warn about unused code. This fixes ticket 876.

0.23.4 (2015-10-10)

Bugs fixed

  • Memory leak when calling Python functions in PyPy.
  • Compilation problem with MSVC in C99-ish mode.
  • Warning about unused values in a helper macro.

0.23.3 (2015-09-29)

Bugs fixed

  • Invalid C code for some builtin methods. This fixes ticket 856 again.
  • Incorrect C code in helper functions for PyLong conversion and string decoding. This fixes ticket 863, ticket 864 and ticket 865. Original patch by Nikolaus Rath.
  • Large folded or inserted integer constants could use too small C integer types and thus trigger a value wrap-around.

Other changes

  • The coroutine and generator types of Cython now also register directly with the Coroutine and Generator ABCs in the backports_abc module if it can be imported. This fixes ticket 870.

0.23.2 (2015-09-11)

Bugs fixed

  • Compiler crash when analysing some optimised expressions.
  • Coverage plugin was adapted to coverage.py 4.0 beta 2.
  • C++ destructor calls could fail when the ‘&’ operator is overloaded.
  • Incorrect C literal generation for large integers in compile-time evaluated DEF expressions and constant folded expressions.
  • Byte string constants could end up as Unicode strings when originating from compile-time evaluated DEF expressions.
  • Invalid C code when caching known builtin methods. This fixes ticket 860.
  • ino_t in posix.types was not declared as unsigned.
  • Declarations in libcpp/memory.pxd were missing operator!(). Patch by Leo Razoumov.
  • Static cdef methods can now be declared in .pxd files.

0.23.1 (2015-08-22)

Bugs fixed

  • Invalid C code for generators. This fixes ticket 858.
  • Invalid C code for some builtin methods. This fixes ticket 856.
  • Invalid C code for unused local buffer variables. This fixes ticket 154.
  • Test failures on 32bit systems. This fixes ticket 857.
  • Code that uses from xyz import * and global C struct/union/array variables could fail to compile due to missing helper functions. This fixes ticket 851.
  • Misnamed PEP 492 coroutine property cr_yieldfrom renamed to cr_await to match CPython.
  • Missing deallocation code for C++ object attributes in certain extension class hierarchies.
  • Crash when async coroutine was not awaited.
  • Compiler crash on yield in signature annotations and default argument values. Both are forbidden now.
  • Compiler crash on certain constructs in finally clauses.
  • Cython failed to build when CPython’s pgen is installed.

0.23 (2015-08-08)

Features added

  • PEP 492 (async/await) was implemented.
  • PEP 448 (Additional Unpacking Generalizations) was implemented.
  • Support for coverage.py 4.0+ can be enabled by adding the plugin “Cython.Coverage” to the “.coveragerc” config file.
  • Annotated HTML source pages can integrate (XML) coverage reports.
  • Tracing is supported in nogil functions/sections and module init code.
  • When generators are used in a Cython module and the module imports the modules “inspect” and/or “asyncio”, Cython enables interoperability by patching these modules during the import to recognise Cython’s internal generator and coroutine types. This can be disabled by C compiling the module with “-D CYTHON_PATCH_ASYNCIO=0” or “-D CYTHON_PATCH_INSPECT=0”
  • When generators or coroutines are used in a Cython module, their types are registered with the Generator and Coroutine ABCs in the collections or collections.abc stdlib module at import time to enable interoperability with code that needs to detect and process Python generators/coroutines. These ABCs were added in CPython 3.5 and are available for older Python versions through the backports_abc module on PyPI. See https://bugs.python.org/issue24018
  • Adding/subtracting/dividing/modulus and equality comparisons with constant Python floats and small integers are faster.
  • Binary and/or/xor/rshift operations with small constant Python integers are faster.
  • When called on generator expressions, the builtins all(), any(), dict(), list(), set(), sorted() and unicode.join() avoid the generator iteration overhead by inlining a part of their functionality into the for-loop.
  • Keyword argument dicts are no longer copied on function entry when they are not being used or only passed through to other function calls (e.g. in wrapper functions).
  • The PyTypeObject declaration in cpython.object was extended.
  • The builtin type type is now declared as PyTypeObject in source, allowing for extern functions taking type parameters to have the correct C signatures. Note that this might break code that uses type just for passing around Python types in typed variables. Removing the type declaration provides a backwards compatible fix.
  • wraparound() and boundscheck() are available as no-ops in pure Python mode.
  • Const iterators were added to the provided C++ STL declarations.
  • Smart pointers were added to the provided C++ STL declarations. Patch by Daniel Filonik.
  • NULL is allowed as default argument when embedding signatures. This fixes ticket 843.
  • When compiling with --embed, the internal module name is changed to __main__ to allow arbitrary program names, including those that would be invalid for modules. Note that this prevents reuse of the generated C code as an importable module.
  • External C++ classes that overload the assignment operator can be used. Patch by Ian Henriksen.
  • Support operator bool() for C++ classes so they can be used in if statements.
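The PEP 448 unpacking generalizations follow standard Python semantics; an illustrative sketch (names are hypothetical):

```python
# PEP 448: multiple * and ** unpackings in displays and calls
a, b = [1, 2], [3, 4]
merged = [*a, *b, 5]
combined = {**{'x': 1}, **{'y': 2}}

def show(*args, **kwargs):
    return args, kwargs

# several iterable/dict unpackings in a single call
args, kwargs = show(*a, *b, **combined)
```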

Bugs fixed

  • Calling “yield from” from Python on a Cython generator that returned a value triggered a crash in CPython. This is now being worked around. See https://bugs.python.org/issue23996
  • Language level 3 did not enable true division (a.k.a. float division) for integer operands.
  • Functions with fused argument types that included a generic ‘object’ fallback could end up using that fallback also for other explicitly listed object types.
  • Relative cimports could accidentally fall back to trying an absolute cimport on failure.
  • The result of calling a C struct constructor no longer requires an intermediate assignment when coercing to a Python dict.
  • C++ exception declarations with mapping functions could fail to compile when pre-declared in .pxd files.
  • cpdef void methods are now permitted.
  • abs(cint) could fail to compile in MSVC and used sub-optimal code in C++. Patch by David Vierra, original patch by Michael Enßlin.
  • Buffer index calculations using index variables with small C integer types could overflow for large buffer sizes. Original patch by David Vierra.
  • C unions use a saner way to coerce from and to Python dicts.
  • When compiling a module foo.pyx, the directories in sys.path are no longer searched when looking for foo.pxd. Patch by Jeroen Demeyer.
  • Memory leaks in the embedding main function were fixed. Original patch by Michael Enßlin.
  • Some complex Python expressions could fail to compile inside of finally clauses.
  • Unprefixed ‘str’ literals were not supported as C varargs arguments.
  • Fixed type errors in conversion of enum types to/from Python. Note that this imposes stricter correctness requirements on enum declarations.

Other changes

  • Changed mangling scheme in header files generated by cdef api declarations.
  • Installation under CPython 3.3+ no longer requires a pass of the 2to3 tool. This also makes it possible to run Cython in Python 3.3+ from a source checkout without installing it first. Patch by Petr Viktorin.
  • jedi-typer.py (in Tools/) was extended and renamed to jedityper.py (to make it importable) and now works with and requires Jedi 0.9. Patch by Tzer-jen Wei.

0.22.1 (2015-06-20)

Bugs fixed

  • Crash when returning values on generator termination.
  • In some cases, exceptions raised during internal isinstance() checks were not propagated.
  • Runtime reported file paths of source files (e.g for profiling and tracing) are now relative to the build root directory instead of the main source file.
  • Tracing exception handling code could enter the trace function with an active exception set.
  • The internal generator function type was not shared across modules.
  • Comparisons of (inferred) ctuples failed to compile.
  • Closures inside of cdef functions returning void failed to compile.
  • Using const C++ references in intermediate parts of longer expressions could fail to compile.
  • C++ exception declarations with mapping functions could fail to compile when pre-declared in .pxd files.
  • C++ compilation could fail with an ambiguity error in recent MacOS-X Xcode versions.
  • C compilation could fail in pypy3.
  • Fixed a memory leak in the compiler when compiling multiple modules.
  • When compiling multiple modules, external library dependencies could leak into later compiler runs. Fix by Jeroen Demeyer. This fixes ticket 845.

0.22 (2015-02-11)

Features added

  • C functions can coerce to Python functions, which allows passing them around as callable objects.
  • C arrays can be assigned by value and auto-coerce from Python iterables and to Python lists (and tuples).
  • Extern C functions can now be declared as cpdef to export them to the module’s Python namespace. Extern C functions in pxd files export their values to their own module, iff it exists.
  • Anonymous C tuple types can be declared as (ctype1, ctype2, …).
  • PEP 479: turn accidental StopIteration exceptions that exit generators into a RuntimeError, activated with future import “generator_stop”.
  • Looping over reversed(range()) is optimised in the same way as range(). Patch by Favian Contreras.
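The PEP 479 behaviour can be sketched in plain Python (names are illustrative; the behaviour shown is the default since Python 3.7, and opt-in earlier via the "generator_stop" future import):

```python
def leaky():
    yield 1
    # Under PEP 479, a StopIteration escaping the generator body
    # is converted into a RuntimeError instead of silently ending
    # the iteration.
    raise StopIteration

g = leaky()
first = next(g)
try:
    next(g)
    outcome = "silent stop"
except RuntimeError:
    outcome = "RuntimeError"
```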

Bugs fixed

  • Mismatching ‘except’ declarations on signatures in .pxd and .pyx files failed to produce a compile error.
  • Failure to find any files for the path pattern(s) passed into cythonize() is now an error to more easily detect accidental typos.
  • The logaddexp family of functions in numpy.math now has correct declarations.
  • In Py2.6/7 and Py3.2, simple Cython memory views could accidentally be interpreted as non-contiguous by CPython, which could trigger a CPython bug when copying data from them, thus leading to data corruption. See CPython issues 12834 and 23349.

Other changes

  • Preliminary support for defining the Cython language with a formal grammar. To try parsing your files against this grammar, use the --formal_grammar directive. Experimental.
  • _ is no longer considered a cacheable builtin as it could interfere with gettext.
  • Cythonize-computed metadata now cached in the generated C files.
  • Several corrections and extensions in numpy, cpython, and libcpp pxd files.

0.21.2 (2014-12-27)

Bugs fixed

  • Crash when assigning a C value to both a Python and C target at the same time.
  • Automatic coercion from C++ strings to str generated incomplete code that failed to compile.
  • Declaring a constructor in a C++ child class erroneously required a default constructor declaration in the super class.
  • resize_smart() in cpython.array was broken.
  • Functions in libcpp.cast are now declared as nogil.
  • Some missing C-API declarations were added.
  • Py3 main code in embedding program code was lacking casts.
  • Exception related to distutils “Distribution” class type in pyximport under latest CPython 2.7 and 3.4 releases when setuptools is being imported later.

0.21.1 (2014-10-18)

Features added

  • New cythonize option -a to generate the annotated HTML source view.
  • Missing C-API declarations in cpython.unicode were added.
  • Passing language='c++' into cythonize() globally enables C++ mode for all modules that were not passed as Extension objects (i.e. only source files and file patterns).
  • Py_hash_t is a known type (used in CPython for hash values).
  • PySlice_*() C-API functions are available from the cpython.slice module.
  • Allow arrays of C++ classes.

Bugs fixed

  • Reference leak for non-simple Python expressions in boolean and/or expressions.
  • To fix a name collision and to reflect availability on host platforms, standard C declarations [ clock(), time(), struct tm and tm* functions ] were moved from posix/time.pxd to a new libc/time.pxd. Patch by Charles Blake.
  • Rerunning unmodified modules in IPython’s cython support failed. Patch by Matthias Bussonier.
  • Casting C++ std::string to Python byte strings failed when auto-decoding was enabled.
  • Fatal exceptions in global module init code could lead to crashes if the already created module was used later on (e.g. through a stale reference in sys.modules or elsewhere).
  • cythonize.py script was not installed on MS-Windows.

Other changes

  • Compilation no longer fails hard when unknown compilation options are passed. Instead, it raises a warning and ignores them (as it did silently before 0.21). This will be changed back to an error in a future release.

0.21 (2014-09-10)

Features added

  • C (cdef) functions allow inner Python functions.
  • Enums can now be declared as cpdef to export their values to the module’s Python namespace. Cpdef enums in pxd files export their values to their own module, iff it exists.
  • Allow @staticmethod decorator to declare static cdef methods. This is especially useful for declaring “constructors” for cdef classes that can take non-Python arguments.
  • Taking a char* from a temporary Python string object is safer in more cases and can be done inside of non-trivial expressions, including arguments of a function call. A compile time error is raised only when such a pointer is assigned to a variable and would thus exceed the lifetime of the string itself.
  • Generators have new properties __name__ and __qualname__ that provide the plain/qualified name of the generator function (following CPython 3.5). See http://bugs.python.org/issue21205
  • The inline function modifier is available as a decorator @cython.inline in pure mode.
  • When cygdb is run in a virtualenv, it enables the same virtualenv inside of the debugger. Patch by Marc Abramowitz.
  • PEP 465: dedicated infix operator for matrix multiplication (A @ B).
  • HTML output of annotated code uses Pygments for code highlighting and generally received a major overhaul by Matthias Bussonier.
  • IPython magic support is now available directly from Cython with the command “%load_ext cython”. Cython code can directly be executed in a cell when marked with “%%cython”. Code analysis is available with “%%cython -a”. Patch by Martín Gaitán.
  • Simple support for declaring Python object types in Python signature annotations. Currently requires setting the compiler directive annotation_typing=True.
  • New directive use_switch (defaults to True) to optionally disable the optimization of chained if statements into C switch statements.
  • Defines dynamic_cast et al. in libcpp.cast and C++ heap data structure operations in libcpp.algorithm.
  • Shipped header declarations in posix.* were extended to cover more of the POSIX API. Patches by Lars Buitinck and Mark Peek.
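The dedicated matrix multiplication operator from PEP 465 dispatches to the __matmul__ special method. A minimal plain-Python sketch (the Vec class here is purely illustrative, not part of Cython):

```python
class Vec:
    """Toy 2-vector used only to illustrate the PEP 465 '@' operator."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __matmul__(self, other):
        # 'a @ b' calls a.__matmul__(b); here it computes a dot product.
        return self.x * other.x + self.y * other.y

a = Vec(1, 2)
b = Vec(3, 4)
dot = a @ b  # 1*3 + 2*4 == 11
```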

Optimizations

  • Simple calls to C implemented Python functions/methods are faster. This also speeds up many operations on builtins that Cython cannot otherwise optimise.
  • The “and”/”or” operators try to avoid unnecessary coercions of their arguments. They now evaluate the truth value of each argument independently and only coerce the final result of the whole expression to the target type (e.g. the type on the left side of an assignment). This also avoids reference counting overhead for Python values during evaluation and generally improves the code flow in the generated C code.
  • The Python expression “2 ** N” is optimised into bit shifting. See http://bugs.python.org/issue21420
  • Cascaded assignments (a = b = …) try to minimise the number of type coercions.
  • Calls to slice() are translated to a straight C-API call.
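The "2 ** N" optimisation relies on an identity that holds for any non-negative integer exponent, easily checked in plain Python:

```python
# For non-negative integer N, 2 ** N equals 1 shifted left by N bits,
# which is what the optimisation emits instead of a pow() call.
for n in range(64):
    assert 2 ** n == 1 << n

result = 2 ** 10  # equivalent to 1 << 10
```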

Bugs fixed

  • Crash when assigning memory views from ternary conditional expressions.
  • Nested C++ templates could lead to unseparated “>>” characters being generated into the C++ declarations, which older C++ compilers could not parse.
  • Sending SIGINT (Ctrl-C) during parallel cythonize() builds could hang the child processes.
  • No longer ignore local setup.cfg files for distutils in pyximport. Patch by Martin Teichmann.
  • Taking a char* from an indexed Python string generated unsafe reference counting code.
  • Set literals now create all of their items before trying to add them to the set, following the behaviour in CPython. This makes a difference in the rare case that the item creation has side effects and some items are not hashable (or if hashing them has side effects, too).
  • Cython no longer generates the cross product of C functions for code that uses memory views of fused types in function signatures (e.g. cdef func(floating[:] a, floating[:] b)). This is considered the expected behaviour by most users and was previously inconsistent with other structured types like C arrays. Code that really wants all type combinations can create the same fused memoryview type under different names and use those in the signature to make it clear which types are independent.
  • Names that were unknown at compile time were looked up as builtins at runtime but not as global module names. Trying both lookups helps with globals() manipulation.
  • Fixed stl container conversion for typedef element types.
  • obj.pop(x) truncated large C integer values of x to Py_ssize_t.
  • __init__.pyc is recognised as marking a package directory (in addition to .py, .pyx and .pxd).
  • Syntax highlighting in cython-mode.el for Emacs no longer incorrectly highlights keywords found as part of longer names.
  • Correctly handle from cython.submodule cimport name.
  • Fix infinite recursion when using super with cpdef methods.
  • Calling dir() without arguments was not guaranteed to return a sorted list.
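The name-lookup fix above matters because CPython resolves an unqualified name in the module globals before falling back to builtins, so a name injected via globals() must shadow the builtin. A plain-Python sketch:

```python
def report_len(obj):
    # 'len' is not assigned locally, so it is looked up at call time:
    # first in the module globals, then in builtins.
    return len(obj)

before = report_len([1, 2, 3])       # falls through to builtins.len -> 3
globals()['len'] = lambda obj: -1    # inject a global that shadows the builtin
after = report_len([1, 2, 3])        # now resolves to the injected global -> -1
del globals()['len']                 # restore normal lookup
```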

Other changes

  • The header line in the generated C files no longer contains the timestamp but only the Cython version that wrote it. This was changed to make builds more reproducible.
  • Removed support for CPython 2.4, 2.5 and 3.1.
  • The licensing implications on the generated code were clarified to avoid legal constraints for users.

0.20.2 (2014-06-16)

Features added

  • Some optimisations for set/frozenset instantiation.
  • Support for C++ unordered_set and unordered_map.

Bugs fixed

  • Access to attributes of optimised builtin methods (e.g. [].append.__name__) could fail to compile.
  • Memory leak when extension subtypes add a memory view as attribute to those of the parent type without having Python object attributes or a user provided dealloc method.
  • Compiler crash on readonly properties in “binding” mode.
  • Auto-encoding with c_string_encoding=ascii failed in Py3.3.
  • Crash when subtyping freelist enabled Cython extension types with Python classes that use __slots__.
  • Freelist usage is restricted to CPython to avoid problems with other Python implementations.
  • Memory leak in memory views when copying overlapping, contiguous slices.
  • Format checking when requesting non-contiguous buffers from cython.array objects was accidentally omitted in Py3.
  • C++ destructor calls in extension types could fail to compile in clang.
  • Buffer format validation failed for sequences of strings in structs.
  • Docstrings on extension type attributes in .pxd files were rejected.

0.20.1 (2014-02-11)

Bugs fixed

  • Build error under recent Mac OS X versions where isspace() could not be resolved by clang.
  • List/Tuple literals multiplied by more than one factor were only multiplied by the last factor instead of all.
  • Lookups of special methods (specifically for context managers) could fail in Python <= 2.6/3.1.
  • Local variables were erroneously appended to the signature introspection of Cython implemented functions with keyword-only arguments under Python 3.
  • In-place assignments to variables with inferred Python builtin/extension types could fail with type errors if the result value type was incompatible with the type of the previous value.
  • The C code generation order of cdef classes, closures, helper code, etc. was not deterministic, thus leading to high code churn.
  • Type inference could fail to deduce C enum types.
  • Type inference could deduce unsafe or inefficient types from integer assignments within a mix of inferred Python variables and integer variables.

0.20 (2014-01-18)

Features added

  • Support for CPython 3.4.
  • Support for calling C++ template functions.
  • yield is supported in finally clauses.
  • The C code generated for finally blocks is duplicated for each exit case to allow for better optimisations by the C compiler.
  • Cython tries to undo the common Python optimisation idiom of assigning a bound method to a local variable when it can generate better code for the direct call.
  • Constant Python float values are cached.
  • String equality comparisons can use faster type specific code in more cases than before.
  • String/Unicode formatting using the ‘%’ operator uses a faster C-API call.
  • bytearray has become a known type and supports coercion from and to C strings. Indexing, slicing and decoding is optimised. Note that this may have an impact on existing code due to type inference.
  • Using cdef basestring stringvar and function arguments typed as basestring is now meaningful and allows assigning exactly str and unicode objects, but no subtypes of these types.
  • Support for the __debug__ builtin.
  • Assertions in Cython compiled modules are disabled if the running Python interpreter was started with the “-O” option.
  • Some types that Cython provides internally, such as functions and generators, are now shared across modules if more than one Cython implemented module is imported.
  • The type inference algorithm works at a finer granularity by taking the results of the control flow analysis into account.
  • A new script in bin/cythonize provides a command line frontend to the cythonize() compilation function (including distutils build).
  • The new extension type decorator @cython.no_gc_clear prevents objects from being cleared during cyclic garbage collection, thus making sure that object attributes are kept alive until deallocation.
  • During cyclic garbage collection, attributes of extension types that cannot create reference cycles due to their type (e.g. strings) are no longer considered for traversal or clearing. This can reduce the processing overhead when searching for or cleaning up reference cycles.
  • Package compilation (i.e. __init__.py files) now works, starting with Python 3.3.
  • The cython-mode.el script for Emacs was updated. Patch by Ivan Andrus.
  • An option common_utility_include_dir was added to cythonize() to save oft-used utility code once in a separate directory rather than as part of each generated file.
  • unraisable_tracebacks directive added to control printing of tracebacks of unraisable exceptions.
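The __debug__ behaviour above mirrors CPython: the flag is True unless the interpreter runs with -O, in which case assert statements are compiled away. A small plain-Python check of that relationship:

```python
import sys

# __debug__ is a compile-time constant in CPython: True in a normal run,
# False under 'python -O', where assert statements are removed entirely.
if __debug__:
    checks_enabled = True
    assert 1 + 1 == 2  # executed only when __debug__ is True
else:
    checks_enabled = False

# sys.flags.optimize reports the -O level; 0 means __debug__ is True.
consistent = (sys.flags.optimize == 0) == checks_enabled
```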

Bugs fixed

  • Abstract Python classes that subtyped a Cython extension type failed to raise an exception on instantiation, and thus ended up being instantiated.
  • set.add(a_tuple) and set.discard(a_tuple) failed with a TypeError in Py2.4.
  • The PEP 3155 __qualname__ was incorrect for nested classes and inner classes/functions declared as global.
  • Several corner cases in the try-finally statement were fixed.
  • The metaclass of a Python class was not inherited from its parent class(es). It is now extracted from the list of base classes if not provided explicitly using the Py3 metaclass keyword argument. In Py2 compilation mode, a __metaclass__ entry in the class dict will still take precedence if not using Py3 metaclass syntax, but only after creating the class dict (which may have been done by a metaclass of a base class, see PEP 3115). It is generally recommended to use the explicit Py3 syntax to define metaclasses for Python types at compile time.
  • The automatic C switch statement generation behaves more safely for heterogeneous value types (e.g. mixing enum and char), allowing for a slightly wider application and reducing corner cases. It now always generates a ‘default’ clause to avoid C compiler warnings about unmatched enum values.
  • Fixed a bug where class hierarchies declared out-of-order could result in broken generated code.
  • Fixed a bug which prevented overriding const methods of C++ classes.
  • Fixed a crash when converting Python objects to C++ strings fails.
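The metaclass extraction described above follows standard Python semantics: when no metaclass is given explicitly, it is taken from the base classes. A plain-Python sketch (Meta and the classes below are illustrative only):

```python
class Meta(type):
    """A trivial metaclass that records the classes it creates."""
    created = []

    def __new__(mcls, name, bases, ns):
        cls = super().__new__(mcls, name, bases, ns)
        Meta.created.append(name)
        return cls

class Base(metaclass=Meta):   # explicit Py3 metaclass syntax
    pass

class Child(Base):            # no metaclass given: inherited from Base
    pass

inherited = type(Child) is Meta
```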

Other changes

  • In Py3 compilation mode, Python2-style metaclasses declared by a __metaclass__ class dict entry are ignored.
  • In Py3.4+, the Cython generator type uses tp_finalize() for safer cleanup instead of tp_del().

0.19.2 (2013-10-13)

Features added

Bugs fixed

  • Some standard declarations were fixed or updated, including the previously incorrect declaration of PyBuffer_FillInfo() and some missing bits in libc.math.
  • Heap allocated subtypes of type used the wrong base type struct at the C level.
  • Calling the unbound method dict.keys/values/items() in dict subtypes could call the bound object method instead of the unbound supertype method.
  • “yield” wasn’t supported in “return” value expressions.
  • Using the “bint” type in memory views led to unexpected results. It is now an error.
  • Assignments to global/closure variables could catch them in an illegal state while deallocating the old value.

Other changes

0.19.1 (2013-05-11)

Features added

  • Completely empty C-API structs for extension type slots (protocols like number/mapping/sequence) are no longer generated into the C code.
  • Docstrings that directly follow a public/readonly attribute declaration in a cdef class will be used as docstring of the auto-generated property. This fixes ticket 206.
  • The automatic signature documentation tries to preserve more semantics of default arguments and argument types. Specifically, bint arguments now appear as type bool.
  • A warning is emitted when negative literal indices are found inside of a code section that disables wraparound handling. This helps with fixing invalid code that might fail in the face of future compiler optimisations.
  • Constant folding for boolean expressions (and/or) was improved.
  • Added a build_dir option to cythonize() which allows one to place the generated .c files outside the source tree.

Bugs fixed

  • isinstance(X, type) failed to get optimised into a call to PyType_Check(), as done for other builtin types.
  • A spurious from datetime cimport * was removed from the “cpython” declaration package. This means that the “datetime” declarations (added in 0.19) are no longer available directly from the “cpython” namespace, but only from “cpython.datetime”. This is the correct way of doing it because the declarations refer to a standard library module, not the core CPython C-API itself.
  • The C code for extension types is now generated in topological order instead of source code order to avoid C compiler errors about missing declarations for subtypes that are defined before their parent.
  • The memoryview type name no longer shows up in the module dict of modules that use memory views. This fixes trac ticket 775.
  • Regression in 0.19 that rejected valid C expressions from being used in C array size declarations.
  • In C++ mode, the C99-only keyword restrict could accidentally be seen by the GNU C++ compiler. It is now specially handled for both GCC and MSVC.
  • Testing large (> int) C integer values for their truth value could fail due to integer wrap-around.

Other changes

0.19 (2013-04-19)

Features added

  • New directives c_string_type and c_string_encoding to more easily and automatically convert between C strings and the different Python string types.
  • The extension type flag Py_TPFLAGS_HAVE_VERSION_TAG is enabled by default on extension types and can be disabled using the type_version_tag compiler directive.
  • EXPERIMENTAL support for simple Cython code level line tracing. Enabled by the “linetrace” compiler directive.
  • Cython implemented functions make their argument and return type annotations available through the __annotations__ attribute (PEP 3107).
  • Access to non-cdef module globals and Python object attributes is faster.
  • Py_UNICODE* coerces from and to Python unicode strings. This is helpful when talking to Windows APIs, which use compatible wchar_t arrays for strings. Note that the Py_UNICODE type is otherwise deprecated as of CPython 3.3.
  • isinstance(obj, basestring) is optimised. In Python 3 it only tests for instances of str (i.e. Py2 unicode).
  • The basestring builtin is mapped to str (i.e. Py2 unicode) when compiling the generated C code under Python 3.
  • Closures use freelists, which can speed up their creation quite substantially. This is also visible for short running generator expressions, for example.
  • A new class decorator @cython.freelist(N) creates a static freelist of N instances for an extension type, thus avoiding the costly allocation step if possible. This can speed up object instantiation by 20-30% in suitable scenarios. Note that freelists are currently only supported for base types, not for types that inherit from others.
  • Fast extension type instantiation using the Type.__new__(Type) idiom has gained support for passing arguments. It is also a bit faster for types defined inside of the module.
  • The Python2-only dict methods .iter*() and .view*() (requires Python 2.7) are automatically mapped to the equivalent keys/values/items methods in Python 3 for typed dictionaries.
  • Slicing unicode strings, lists and tuples is faster.
  • list.append() is faster on average.
  • raise Exception() from None suppresses the exception context in Py3.3.
  • Py3 compatible exec(tuple) syntax is supported in Py2 code.
  • Keyword arguments are supported for cdef functions.
  • External C++ classes can be declared nogil. Patch by John Stumpo. This fixes trac ticket 805.
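The "raise Exception() from None" support listed above follows PEP 409 semantics, which can be checked in plain Python:

```python
def convert(text):
    try:
        return int(text)
    except ValueError:
        # 'from None' suppresses the implicit exception context, so the
        # original ValueError is not shown in the traceback.
        raise KeyError(text) from None

try:
    convert("not a number")
except KeyError as exc:
    suppressed = exc.__suppress_context__   # set by 'from None'
    cause = exc.__cause__                   # explicitly None
```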

Bugs fixed

  • 2-value slicing of unknown objects passes the correct slice when the getitem protocol is used instead of the getslice protocol (especially in Python 3), i.e. None values for missing bounds instead of [0,maxsize]. It is also a bit faster in some cases, e.g. for constant bounds. This fixes trac ticket 636.
  • Cascaded assignments of None values to extension type variables failed with a TypeError at runtime.
  • The __defaults__ attribute was not writable for Cython implemented functions.
  • Default values of keyword-only arguments showed up in __defaults__ instead of __kwdefaults__ (which was not implemented). Both are available for Cython implemented functions now, as specified in Python 3.x.
  • yield works inside of with gil sections. It previously led to a crash. This fixes trac ticket 803.
  • Static methods without explicitly named positional arguments (e.g. having only *args) crashed when being called. This fixes trac ticket 804.
  • dir() without arguments previously returned an unsorted list, which now gets sorted as expected.
  • dict.items(), dict.keys() and dict.values() no longer return lists in Python 3.
  • Exiting from an except-as clause now deletes the exception in Python 3 mode.
  • The declarations of frexp() and ldexp() in math.pxd were incorrect.
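The slicing fix above concerns which bounds reach __getitem__: with the getitem protocol, missing bounds arrive as None rather than 0 and maxsize. A minimal recorder class (illustrative only) shows the CPython behaviour that Cython now matches:

```python
class SliceRecorder:
    """Records the slice object that 2-value slicing passes in."""
    def __getitem__(self, item):
        return item

s = SliceRecorder()
full = s[:]        # both bounds missing
upper = s[:3]      # only the stop bound given

# Missing bounds are passed as None, not as [0, maxsize].
bounds = (full.start, full.stop, upper.start, upper.stop)
```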

Other changes

0.18 (2013-01-28)

Features added

  • Named Unicode escapes (“N{…}”) are supported.
  • Python functions/classes provide the special attribute “__qualname__” as defined by PEP 3155.
  • Added a directive overflowcheck which raises an OverflowException when arithmetic with C ints overflow. This has a modest performance penalty, but is much faster than using Python ints.
  • Calls to nested Python functions are resolved at compile time.
  • Type inference works across nested functions.
  • py_bytes_string.decode(...) is optimised.
  • C const declarations are supported in the language.
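The PEP 3155 __qualname__ attribute mentioned above gives the dotted path from the module level, unlike the plain __name__. In plain Python:

```python
def outer():
    def inner():
        pass
    return inner

# PEP 3155: __qualname__ includes the enclosing scope,
# while __name__ only gives the plain name.
fn = outer()
plain = fn.__name__          # 'inner'
qualified = fn.__qualname__  # 'outer.<locals>.inner'
```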

Bugs fixed

  • Automatic C++ exception mapping didn’t work in nogil functions (only in “with nogil” blocks).

Other changes

0.17.4 (2013-01-03)

Bugs fixed

  • Garbage collection triggered during deallocation of container classes could lead to a double-deallocation.

0.17.3 (2012-12-14)

Features added

Bugs fixed

  • During final interpreter cleanup (with types cleanup enabled at compile time), extension types that inherit from base types over more than one level that were cimported from other modules could lead to a crash.
  • Weak-reference support in extension types (with a cdef __weakref__ attribute) generated incorrect deallocation code.
  • In CPython 3.3, converting a Unicode character to the Py_UNICODE type could fail to raise an overflow for non-BMP characters that do not fit into a wchar_t on the current platform.
  • Negative C integer constants lost their longness suffix in the generated C code.

Other changes

0.17.2 (2012-11-20)

Features added

  • cythonize() gained a best-effort compile mode that can be used to simply ignore .py files that fail to compile.

Bugs fixed

  • Replacing an object reference with the value of one of its cdef attributes could generate incorrect C code that accessed the object after deleting its last reference.
  • C-to-Python type coercions during cascaded comparisons could generate invalid C code, specifically when using the ‘in’ operator.
  • “obj[1,]” passed a single integer into the item getter instead of a tuple.
  • Cyclic imports at module init time did not work in Py3.
  • The names of C++ destructors for template classes were built incorrectly.
  • In pure mode, type casts in Cython syntax and the C ampersand operator are now rejected. Use the pure mode replacements instead.
  • In pure mode, C type names and the sizeof() function are no longer recognised as such and can be used as normal Python names.
  • The extended C level support for the CPython array type was declared too late to be used by user defined classes.
  • C++ class nesting was broken.
  • Better checking for required nullary constructors for stack-allocated C++ instances.
  • Remove module docstring in no-docstring mode.
  • Fix specialization for varargs function signatures.
  • Fix several compiler crashes.

Other changes

  • An experimental distutils script for compiling the CPython standard library was added as Tools/cystdlib.py.

0.17.1 (2012-09-26)

Features added

Bugs fixed

  • A reference leak was fixed in the new dict iteration code when the loop target was not a plain variable but an unpacked tuple.
  • Memory views did not handle the special case of a NULL buffer strides value, as allowed by PEP3118.

Other changes

0.17 (2012-09-01)

Features added

  • Alpha quality support for compiling and running Cython generated extension modules in PyPy (through cpyext). Note that this requires at least PyPy 1.9 and in many cases also adaptations in user code, especially to avoid borrowed references when no owned reference is being held directly in C space (a reference in a Python list or dict is not enough, for example). See the documentation on porting Cython code to PyPy.
  • “yield from” is supported (PEP 380) and a couple of minor problems with generators were fixed.
  • C++ STL container classes automatically coerce from and to the equivalent Python container types on typed assignments and casts. Note that the data in the containers is copied during this conversion.
  • C++ iterators can now be iterated over using “for x in cpp_container” whenever cpp_container has begin() and end() methods returning objects satisfying the iterator pattern, i.e. they can be incremented, dereferenced, and compared for inequality.
  • cdef classes can now have C++ class members (provided a zero-argument constructor exists)
  • A new cpython.array standard cimport file allows efficient interaction with the stdlib array.array data type in Python 2. Since CPython does not export an official C-API for this module, it receives special casing by the compiler in order to avoid setup overhead on the user side. In Python 3, both buffers and memory views on the array type already worked out of the box with earlier versions of Cython due to the native support for the buffer interface in the Py3 array module.
  • Fast dict iteration is now enabled optimistically also for untyped variables when the common iteration methods are used.
  • The unicode string processing code was adapted for the upcoming CPython 3.3 (PEP 393, new Unicode buffer layout).
  • Buffer arguments and memory view arguments in Python functions can be declared “not None” to raise a TypeError on None input.
  • c(p)def functions in pure mode can specify their return type with “@cython.returns()”.
  • Automatic dispatch for fused functions with memoryview arguments
  • Support newaxis indexing for memoryviews
  • Support decorators for fused functions
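The "yield from" support listed above implements PEP 380 delegation, including passing the subgenerator's return value back as the value of the expression. A plain-Python sketch:

```python
def inner_gen():
    yield 1
    yield 2
    return 'done'   # becomes the value of the 'yield from' expression

def outer_gen():
    result = yield from inner_gen()  # delegates iteration to inner_gen
    yield result

values = list(outer_gen())  # [1, 2, 'done']
```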

Bugs fixed

  • Old-style Py2 imports did not work reliably in Python 3.x and were broken in Python 3.3. Regardless of this fix, it’s generally best to be explicit about relative and global imports in Cython code because old-style imports have a higher overhead. To this end, “from __future__ import absolute_import” is supported in Python/Cython 2.x code now (previous versions of Cython already used it when compiling Python 3 code).
  • Stricter constraints on the “inline” and “final” modifiers. If your code does not compile due to this change, chances are these modifiers were previously being ignored by the compiler and can be removed without any performance regression.
  • Exceptions are always instantiated while being raised (as in Python), instead of risking instantiation in potentially unsafe situations when they need to be handled or otherwise processed.
  • locals() properly ignores names that do not have Python compatible types (including automatically inferred types).
  • Some garbage collection issues of memory views were fixed.
  • numpy.pxd compiles in Python 3 mode.
  • Several C compiler warnings were fixed.
  • Several bugs related to memoryviews and fused types were fixed.
  • Several bug-fixes and improvements related to cythonize(), including ccache-style caching.

Other changes

  • libc.string provides a convenience declaration for const uchar in addition to const char.
  • User declared char* types are now recognised as such and auto-coerce to and from Python bytes strings.
  • callable() and next() compile to more efficient C code.
  • list.append() is faster on average.
  • Modules generated by @cython.inline() are written into the directory pointed to by the environment variable CYTHON_CACHE_DIR if set.

0.16 (2012-04-21)

Features added

  • Enhancements to Cython’s function type (support for weak references, default arguments, code objects, dynamic attributes, classmethods, staticmethods, and more)
  • Fused Types - Template-like support for functions and methods CEP 522 (docs)
  • Typed views on memory - Support for efficient direct and indirect buffers (indexing, slicing, transposing, …) CEP 517 (docs)
  • super() without arguments
  • Final cdef methods (which translate into direct calls on known instances)

Bugs fixed

  • fix alignment handling for record types in buffer support

Other changes

  • support default arguments for closures
  • search sys.path for pxd files
  • support C++ template casting
  • faster traceback building and faster generator termination
  • support inplace operators on indexed buffers
  • allow nested prange sections

0.15.1 (2011-09-19)

Features added

Bugs fixed

Other changes

0.15 (2011-08-05)

Features added

  • Generators (yield) - Cython has full support for generators, generator expressions and PEP 342 coroutines.
  • The nonlocal keyword is supported.
  • Re-acquiring the gil: with gil - works as expected within a nogil context.
  • OpenMP support: prange.
  • Control flow analysis prunes dead code and emits warnings and errors about uninitialised variables.
  • Debugger command cy set to assign values of expressions to Cython variables and cy exec counterpart $cy_eval().
  • Exception chaining PEP 3134.
  • Relative imports PEP 328.
  • Improved pure syntax including cython.cclass, cython.cfunc, and cython.ccall.
  • The with statement has its own dedicated and faster C implementation.
  • Support for del.
  • Boundschecking directives implemented for builtin Python sequence types.
  • Several updates and additions to the shipped standard library .pxd files.
  • Forward declaration of types is no longer required for circular references.
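The PEP 3134 exception chaining listed above records the original exception as __cause__ when using "raise X from Y", as this plain-Python sketch shows:

```python
try:
    try:
        {}['missing']
    except KeyError as original:
        # 'raise ... from ...' records the original exception as __cause__.
        raise RuntimeError('lookup failed') from original
except RuntimeError as chained:
    cause_type = type(chained.__cause__)
```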

Bugs fixed

Other changes

  • Uninitialized variables are no longer initialized to None and accessing them has the same semantics as standard Python.
  • globals() now returns a read-only dict of the Cython module’s globals, rather than the globals of the first non-Cython module in the stack
  • Many C++ exceptions are now special cased to give closer Python counterparts. This means that except+ functions that formerly raised generic RuntimeErrors may raise something else such as ArithmeticError.
  • The inlined generator expressions (introduced in Cython 0.13) were disabled in favour of full generator expression support. This breaks code that previously used them inside of cdef functions (usage in def functions continues to work) and induces a performance regression for cases that continue to work but that were previously inlined. We hope to reinstate this feature in the near future.

0.14.1 (2011-02-04)

Features added

  • The gdb debugging support was extended to include all major Cython features, including closures.
  • raise MemoryError() is now safe to use as Cython replaces it with the correct C-API call.

Bugs fixed

Other changes

  • Decorators on special methods of cdef classes now raise a compile time error rather than being ignored.
  • In Python 3 language level mode (-3 option), the ‘str’ type is now mapped to ‘unicode’, so that cdef str s declares a Unicode string even when running in Python 2.

0.14 (2010-12-14)

Features added

  • Python classes can now be nested and receive a proper closure at definition time.
  • Redefinition is supported for Python functions, even within the same scope.
  • Lambda expressions are supported in class bodies and at the module level.
  • Metaclasses are supported for Python classes, both in Python 2 and Python 3 syntax. The Python 3 syntax (using a keyword argument in the type declaration) is preferred and optimised at compile time.
  • “final” extension classes prevent inheritance in Python space. This feature is available through the new “cython.final” decorator. In the future, these classes may receive further optimisations.
  • “internal” extension classes do not show up in the module dictionary. This feature is available through the new “cython.internal” decorator.
  • Extension type inheritance from builtin types, such as “cdef class MyUnicode(unicode)”, now works without further external type redeclarations (which are also strongly discouraged now and continue to issue a warning).
  • GDB support. http://docs.cython.org/src/userguide/debugging.html
  • A new build system with support for inline distutils directives, correct dependency tracking, and parallel compilation. https://github.com/cython/cython/wiki/enhancements-distutils_preprocessing
  • Support for dynamic compilation at runtime via the new cython.inline function and cython.compile decorator. https://github.com/cython/cython/wiki/enhancements-inline
  • “nogil” blocks are supported when compiling pure Python code by writing “with cython.nogil”.
  • Iterating over arbitrary pointer types is now supported, as is an optimized version of the in operator, e.g. x in ptr[a:b].
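The nested-class support listed first above gives class bodies a proper closure at definition time, matching plain Python. A sketch with illustrative names:

```python
def make_class(offset):
    # The class body executes at definition time and can read names
    # from the enclosing function scope, like 'offset' here.
    class Shifter:
        base = offset            # captured from the enclosing scope

        def shift(self, x):
            return x + Shifter.base

    return Shifter

Plus5 = make_class(5)
shifted = Plus5().shift(10)  # 10 + 5 == 15
```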

Bugs fixed

  • In parallel assignments, the right side was evaluated in reverse order in 0.13. This could result in errors if it had side effects (e.g. function calls).
  • In some cases, methods of builtin types would raise a SystemError instead of an AttributeError when called on None.

Other changes

  • Constant tuples are now cached over the lifetime of an extension module, just like CPython does. Constant argument tuples of Python function calls are also cached.
  • Closures have been tightened to include exactly the names used in the inner functions and classes. Previously, they held the complete locals of the defining function.
  • The builtin “next()” function in Python 2.6 and later is now implemented internally and therefore available in all Python versions. This makes it the preferred and portable way of manually advancing an iterator.
  • In addition to the previously supported inlined generator expressions in 0.13, “sorted(genexpr)” can now be used as well. Typing issues were fixed in “sum(genexpr)” that could lead to invalid C code being generated. Other known issues with inlined generator expressions were also fixed, making the upgrade to 0.14 a strong recommendation for code that uses them. Note that general generators and generator expressions are still not supported.
  • Inplace arithmetic operators now respect the cdivision directive and are supported for complex types.
  • Typing a variable as type “complex” previously gave it the Python object type. It now uses the appropriate C/C++ double complex type. A side-effect is that assignments and typed function parameters now accept anything that Python can coerce to a complex, including integers and floats, and not only complex instances.
  • Large integer literals pass through the compiler in a safer way. To prevent truncation in C code, literals that do not fit into 32 bits are turned into Python objects unless they are used in a C context. Such a context can be given either by an explicit C literal suffix such as “UL” or “LL” (or “L” in Python 3 code), or by assignment to a typed variable or a typed function argument, in which case it is up to the user to ensure that the target type provides a sufficiently large value space.
  • Python functions are declared in the order they appear in the file, rather than all being created at module creation time. This is consistent with Python and needed to support, for example, conditional or repeated declarations of functions. In the face of circular imports this may cause code to break, so a new --disable-function-redefinition flag was added to revert to the old behavior. This flag will be removed in a future release, so it should only be used as a stopgap until old code can be fixed.
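The complex typing change above can be sketched as follows (a minimal illustration, not taken from the release notes):

    cdef double complex z = 2      # a plain int now coerces to the C double complex type
    z = z + 1.5j
    print(z.real, z.imag)          # C-level member access, no Python complex object needed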

0.13 (2010-08-25)

Features added

  • Closures are fully supported for Python functions. Cython supports inner functions and lambda expressions. Generators and generator expressions are not supported in this release.
  • Proper C++ support. Cython knows about C++ classes, templates and overloaded function signatures, so that Cython code can interact with them in a straightforward way.
  • Type inference is enabled by default for safe C types (e.g. double, bint, C++ classes) and known extension types. This reduces the need for explicit type declarations and can improve the performance of untyped code in some cases. There is also a verbose compile mode for testing the impact on user code.
  • Cython’s for-in-loop can iterate over C arrays and sliced pointers. The type of the loop variable will be inferred automatically in this case.
  • The Py_UNICODE integer type for Unicode code points is fully supported, including for-loops and ‘in’ tests on unicode strings. It coerces from and to single character unicode strings. Note that untyped for-loop variables will automatically be inferred as Py_UNICODE when iterating over a unicode string. In most cases, this will be much more efficient than yielding sliced string objects, but can also have a negative performance impact when the variable is used in a Python context multiple times, so that it needs to coerce to a unicode string object more than once. If this happens, typing the loop variable as unicode or object will help.
  • The built-in functions any(), all(), sum(), list(), set() and dict() are inlined as plain for loops when called on generator expressions. Note that generator expressions are not generally supported apart from this feature. Also, tuple(genexpr) is not currently supported - use tuple([listcomp]) instead.
  • More shipped standard library declarations. The python_* and stdlib/stdio .pxd files have been deprecated in favor of clib.* and cpython[.*] and may get removed in a future release.
  • Pure Python mode no longer disallows non-Python keywords like ‘cdef’, ‘include’ or ‘cimport’. It also no longer recognises syntax extensions like the for-from loop.
  • Parsing has improved for Python 3 syntax in Python code, although not all features are correctly supported. The missing Python 3 features are being worked on for the next release.
  • from __future__ import print_function is supported in Python 2.6 and later. Note that there is currently no emulation for earlier Python versions, so code that uses print() with this future import will require at least Python 2.6.
  • New compiler directive language_level (valid values: 2 or 3) with corresponding command line options -2 and -3 requests source code compatibility with Python 2.x or Python 3.x respectively. Language level 3 currently enforces unicode literals for unprefixed string literals, enables the print function (requires Python 2.6 or later) and keeps loop variables in list comprehensions from leaking.
  • Loop variables in set/dict comprehensions no longer leak into the surrounding scope (following Python 2.7). List comprehensions are unchanged in language level 2.
  • The “print >> stream” syntax is supported.
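The inlined generator expression support listed above can be used like this sketch (the function name is illustrative):

    def sum_of_squares(int n):
        # compiled into a plain C loop; no generator object is created
        return sum(i * i for i in range(n))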

Bugs fixed

Other changes

  • The availability of type inference by default means that Cython will also infer the type of pointers on assignments. Previously, code like this:

    cdef char* s = ...
    untyped_variable = s
    

    would convert the char* to a Python bytes string and assign that. This is no longer the case and no coercion will happen in the example above. The correct way of doing this is through an explicit cast or by typing the target variable, i.e.

    cdef char* s = ...
    untyped_variable1 = <bytes>s
    untyped_variable2 = <object>s
    
    cdef object py_object = s
    cdef bytes  bytes_string = s
    
  • bool is no longer a valid type name by default. The problem is that it’s not clear whether bool should refer to the Python type or the C++ type, and expecting one and finding the other has already led to several hard-to-find bugs. Both types are available for importing: you can use from cpython cimport bool for the Python bool type, and from libcpp cimport bool for the C++ type. bool is still a valid object by default, so one can still write bool(x).

  • __getsegcount__ is now correctly typed to take a Py_ssize_t* rather than an int*.
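The two bool imports mentioned above, side by side (a sketch):

    from cpython cimport bool      # the Python bool type
    # or, when wrapping C++ code:
    # from libcpp cimport bool     # the C++ bool type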

0.12.1 (2010-02-02)

Features added

  • Type inference improvements.
    • There have been several bug fixes and improvements to the type inferencer.
    • Notably, there is now a “safe” mode, enabled by setting the infer_types directive to None. (The None here refers to the “default” mode, which will become the default in 0.13.) This safe mode limits inference to Python object types and C doubles, which should speed up execution without affecting semantics such as integer overflow behavior, as infer_types=True might. There is also an infer_types.verbose option which shows what types are inferred.
  • The boundscheck directive works for lists and tuples as well as buffers.
  • len(s) and s.decode(“encoding”) are efficiently supported for char* s.
  • Cython’s INLINE macro has been renamed to CYTHON_INLINE to reduce conflict and has better support for the MSVC compiler on Windows. It is no longer clobbered if externally defined.
  • Revision history is now omitted from the source package, resulting in an 85% size reduction. Running make repo will download the history and turn the directory into a complete Mercurial working repository.
  • Cython modules don’t need to be recompiled when the size of an external type grows. (A warning, rather than an error, is produced.) This should be helpful for binary distributions relying on NumPy.
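The efficient char* handling (len() and s.decode()) listed above can be sketched as:

    cdef char* s = "hello world"
    cdef Py_ssize_t n = len(s)    # computed at the C level, no bytes object created
    text = s.decode("ascii")      # decodes straight from the C string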

Bugs fixed

  • Several other bugs and minor improvements have been made. This release should be fully backwards compatible with 0.12.

Other changes

0.12 (2009-11-23)

Features added

  • Type inference with the infer_types directive
  • Seamless C++ complex support
  • Fast extension type instantiation using the normal Python idiom obj = MyType.__new__(MyType)
  • Improved support for Py3.1
  • Cython now runs under Python 3.x using the 2to3 tool
  • unittest support for doctests in Cython modules
  • Optimised handling of C strings (char*): for c in cstring[2:50] and cstring.decode()
  • Looping over C pointers: for i in intptr[:50].
  • pyximport improvements
  • cython_freeze improvements
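The fast instantiation idiom from the list above, as a sketch (the class is illustrative):

    cdef class Point:
        cdef double x, y

    p = Point.__new__(Point)    # takes the fast C-level instantiation path,
                                # skipping __init__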

Bugs fixed

  • Many bug fixes

Other changes

  • Many other optimisations, e.g. enumerate() loops, parallel swap assignments (a,b = b,a), and unicode.encode()
  • More complete numpy.pxd

0.11.2 (2009-05-20)

Features added

  • There’s now native complex floating point support! C99 complex will be used if complex.h is included, otherwise explicit complex arithmetic working on all C compilers is used. [Robert Bradshaw]

    cdef double complex a = 1 + 0.3j
    cdef np.ndarray[np.complex128_t, ndim=2] arr = \
       np.zeros((10, 10), dtype=np.complex128)
    
  • Cython can now generate a main()-method for embedding of the Python interpreter into an executable (see #289) [Robert Bradshaw]

  • @wraparound directive (another way to disable arr[idx] for negative idx) [Dag Sverre Seljebotn]

  • Correct support for NumPy record dtypes with different alignments, and “cdef packed struct” support [Dag Sverre Seljebotn]

  • @callspec directive, allowing custom calling convention macros [Lisandro Dalcin]
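A sketch of the @wraparound directive in use (function and variable names are illustrative):

    cimport cython
    cimport numpy as np

    @cython.wraparound(False)    # disable negative-index handling for speed;
                                 # arr[-1] is then undefined behaviour
    def first_element(np.ndarray[np.float64_t, ndim=1] arr):
        return arr[0]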

Bugs fixed

Other changes

  • Bug fixes and smaller improvements. For the full list, see [1].