Cython and GDAL
brendan-ward edited this page 2014-12-03 21:55:42 -08:00

Rasterio uses Cython to build C extensions that bind to GDAL and to compile performance-critical functions that are then callable from Python.

GDAL functions

GDAL functions are typically made available in rasterio using the following components:

  • A declaration of the external function, copied from one of GDAL's header files into a Cython declaration file (for example: _gdal.pxd).
  • A Cython bridge function that wraps the GDAL function and its associated data structures, and may include additional operations that Cython compiles to C (for example: _features.pyx).
  • A Python function that calls the bridge function and provides additional utility operations and data validation as necessary (for example: features.py).

GDAL data types

GDAL enums need to be declared in a .pxd file before they can be used in Cython.

In _gdal.pxd:

ctypedef enum GDALDataType:
    GDT_Unknown
    GDT_Byte
    GDT_UInt16
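
The declaration above only names the enum members; their values come from GDAL's header, where these three are 0, 1 and 2. As a rough illustration of how such codes might be consumed on the Python side, here is a minimal sketch. The `DTYPE_NAMES` table and `dtype_name()` helper are hypothetical names for this example and are not taken from rasterio.

```python
from enum import IntEnum


class GDALDataType(IntEnum):
    """Python-side mirror of the three enum members declared above."""
    GDT_Unknown = 0
    GDT_Byte = 1
    GDT_UInt16 = 2


# Hypothetical lookup table from enum member to NumPy dtype name.
DTYPE_NAMES = {
    GDALDataType.GDT_Byte: "uint8",
    GDALDataType.GDT_UInt16: "uint16",
}


def dtype_name(gdal_type):
    """Return the NumPy dtype name for a GDAL data type code.

    Raises ValueError for codes that are not declared in the enum and
    for GDT_Unknown, which has no entry in the lookup table.
    """
    member = GDALDataType(gdal_type)  # ValueError if the code is not declared
    try:
        return DTYPE_NAMES[member]
    except KeyError:
        raise ValueError(f"no NumPy dtype for {member.name}") from None
```

Calling `dtype_name(1)` looks up `GDT_Byte` and returns `"uint8"`; an undeclared code fails early instead of silently propagating.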

GDAL data structures

TODO

Memory management

TODO

Wrapping a GDAL function

Here is an example workflow that covers the highlights:

  • Find the declaration of the function you want in the appropriate GDAL header file (for example: gdal_alg.h). Add it to the appropriate declaration file in rasterio (for example: _gdal.pxd). If that file already has a section for declarations from the same header file, add it there; otherwise start a new section with cdef extern from "<name of header file>":.

In gdal_alg.h:

CPLErr GDALPolygonize(
    GDALRasterBandH hSrcBand,
    GDALRasterBandH hMaskBand,
    OGRLayerH hOutLayer,
    int iPixValField,
    char **papszOptions,
    GDALProgressFunc pfnProgress,
    void *pProgressArg
)

In _gdal.pxd:

cdef extern from "gdal_alg.h":
    int GDALPolygonize(
        void *src_band, 
        void *mask_band, 
        void *layer, 
        int fidx, 
        char **options, 
        void *progress_func, 
        void *progress_data
    )

  • Add declarations and wrappers for any required data structures, or declare them as plain C types instead. In the declaration above, for example, the GDAL handle types (GDALRasterBandH, OGRLayerH) and the progress callback are declared as void *, and the CPLErr return value as int.

  • Create a Cython bridge function in the appropriate rasterio Cython file (for example: _features.pyx). This function typically takes Python or NumPy objects as input and transforms them into the data structures that the GDAL function expects. It also transforms the outputs into the proper representation to return to the calling Python code. The .pyx module must cimport the declaration module (for example: _gdal) to access the declared GDAL functions.

  • Create a wrapping Python function, if necessary. (TODO: better describe recommended division of labor between Python and Cython functions).
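
The last two steps can be sketched in plain Python. This is only one possible split of responsibilities, not a documented rasterio recommendation: the public function validates and normalizes its arguments, and the underscore-prefixed function stands in for the compiled bridge. Both `count_values()` and `_count_values()` are hypothetical names, and the stand-in tallies pixel values instead of calling GDAL so that the example runs without it.

```python
def _count_values(rows, mask_rows):
    """Stand-in for a compiled bridge function (hypothetical).

    A real bridge would convert the inputs to GDAL data structures, call
    the GDAL function, and convert the results back. Here it only tallies
    the unmasked pixel values.
    """
    counts = {}
    for row, mask_row in zip(rows, mask_rows):
        for value, keep in zip(row, mask_row):
            if keep:
                counts[value] = counts.get(value, 0) + 1
    return counts


def count_values(image, mask=None):
    """Public wrapper: validate and normalize arguments, then delegate."""
    rows = [list(row) for row in image]
    if not rows or not rows[0]:
        raise ValueError("image must have at least one row and one column")
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("image rows must all have the same length")
    if mask is None:
        # Default mask keeps every pixel.
        mask_rows = [[True] * width for _ in rows]
    else:
        mask_rows = [list(row) for row in mask]
        same_shape = len(mask_rows) == len(rows) and all(
            len(row) == width for row in mask_rows
        )
        if not same_shape:
            raise ValueError("mask must have the same shape as image")
    return _count_values(rows, mask_rows)
```

With this split, the bridge can assume well-formed input and stay small, while errors surface as ordinary Python exceptions before any compiled code runs.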

Naming conventions:

TODO: file names (*.pxd, *.pyx, *.py) and function names (_foo() in *.pyx, foo() in *.py)