Working with Python arrays

Python has a builtin array module supporting dynamic 1-dimensional arrays of primitive types. It is possible to access the underlying C array of a Python array from within Cython. At the same time they are ordinary Python objects which can be stored in lists and serialized between processes when using multiprocessing.

Compared to the manual approach with malloc() and free(), this gives the safe and automatic memory management of Python, and compared to a Numpy array there is no need to install a dependency, as the array module is built into both Python and Cython.

Safe usage with memory views

from cpython cimport array
import array
cdef array.array a = array.array('i', [1, 2, 3])
cdef int[:] ca = a

print(ca[0])

NB: the import brings the regular Python array object into the namespace while the cimport adds functions accessible from Cython.

A Python array is constructed with a type signature and sequence of initial values. For the possible type signatures, refer to the Python documentation for the array module.

Notice that when a Python array is assigned to a variable typed as memory view, there will be a slight overhead to construct the memory view. However, from that point on the variable can be passed to other functions without overhead, so long as it is typed:

from cpython cimport array
import array

cdef array.array a = array.array('i', [1, 2, 3])
cdef int[:] ca = a

cdef int overhead(object a):
    cdef int[:] ca = a
    return ca[0]

cdef int no_overhead(int[:] ca):
    return ca[0]

print(overhead(a))  # new memory view will be constructed, overhead
print(no_overhead(ca))  # ca is already a memory view, so no overhead

Zero-overhead, unsafe access to raw C pointer

To avoid any overhead and to be able to pass a C pointer to other functions, it is possible to access the underlying contiguous array as a pointer. There is no type or bounds checking, so be careful to use the right type and signedness.

from cpython cimport array
import array

cdef array.array a = array.array('i', [1, 2, 3])

# access underlying pointer:
print(a.data.as_ints[0])

from libc.string cimport memset

memset(a.data.as_voidptr, 0, len(a) * sizeof(int))

Note that any length-changing operation on the array object may invalidate the pointer.

Cloning, extending arrays

To avoid having to use the array constructor from the Python module, it is possible to create a new array with the same type as a template, and preallocate a given number of elements. The array is initialized to zero when requested.

from cpython cimport array
import array

cdef array.array int_array_template = array.array('i', [])
cdef array.array newarray

# create an array with 3 elements with same type as template
newarray = array.clone(int_array_template, 3, zero=False)

An array can also be extended and resized; this avoids repeated memory reallocation which would occur if elements would be appended or removed one by one.

from cpython cimport array
import array

cdef array.array a = array.array('i', [1, 2, 3])
cdef array.array b = array.array('i', [4, 5, 6])

# extend a with b, resize as needed
array.extend(a, b)
# resize a, leaving just original three elements
array.resize(a, len(a) - len(b))

API reference

Data fields

data.as_voidptr
data.as_chars
data.as_schars
data.as_uchars
data.as_shorts
data.as_ushorts
data.as_ints
data.as_uints
data.as_longs
data.as_ulongs
data.as_longlongs  # requires Python >=3
data.as_ulonglongs  # requires Python >=3
data.as_floats
data.as_doubles
data.as_pyunicodes

Direct access to the underlying contiguous C array, with given type; e.g., myarray.data.as_ints.

Functions

The following functions are available to Cython from the array module:

int resize(array self, Py_ssize_t n) except -1

Fast resize / realloc. Not suitable for repeated, small increments; resizes underlying array to exactly the requested amount.

int resize_smart(array self, Py_ssize_t n) except -1

Efficient for small increments; uses growth pattern that delivers amortized linear-time appends.

cdef inline array clone(array template, Py_ssize_t length, bint zero)

Fast creation of a new array, given a template array. Type will be same as template. If zero is True, new array will be initialized with zeroes.

cdef inline array copy(array self)

Make a copy of an array.

cdef inline int extend_buffer(array self, char* stuff, Py_ssize_t n) except -1

Efficient appending of new data of same type (e.g. of same array type) n: number of elements (not number of bytes!)

cdef inline int extend(array self, array other) except -1

Extend array with data from another array; types must match.

cdef inline void zero(array self)

Set all elements of array to zero.