Cython and GDAL
Rasterio uses Cython to build C extensions that provide bindings to GDAL and to compile performance-critical functions that remain callable from Python.
GDAL functions
GDAL functions are typically made available in rasterio using the following components:
- Declaration of an external function available in one of GDAL's header files (for example: _gdal.pxd).
- Cython bridge function that wraps the GDAL function and its associated data structures, and may include additional operations to be compiled to C code using Cython (for example: _features.pyx).
- Python function that calls the bridge function above, providing additional utility operations and data validation as necessary (for example: features.py).
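The split between the last two layers can be pictured with a small pure-Python sketch. Everything in it is hypothetical: `_count_values` only stands in for a compiled bridge function (a real one would call into GDAL), and `count_values` stands in for the public wrapper. Neither name is part of rasterio.

```python
# Hypothetical stand-in for a bridge function that would live in a .pyx file.
# A real bridge would call a GDAL function; this one only tallies values so
# the sketch runs without any compiled extension.
def _count_values(rows):
    counts = {}
    for row in rows:
        for value in row:
            counts[value] = counts.get(value, 0) + 1
    return counts


# Hypothetical public function that would live in a .py file.
def count_values(rows):
    """Validate the input, then delegate the work to the bridge function."""
    rows = [list(row) for row in rows]
    if not rows:
        raise ValueError("input must contain at least one row")
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    return _count_values(rows)
```

In this arrangement the public function rejects bad input early, so the lower layer can assume a rectangular grid.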
GDAL data types
GDAL enums need to be wrapped for use in Cython.
In _gdal.pxd:
    ctypedef enum GDALDataType:
        GDT_Unknown
        GDT_Byte
        GDT_UInt16
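On the Python side, the same three enum members can be mirrored to turn the integer codes into readable names. The sketch below is only an illustration: the integer values follow GDAL's `gdal.h`, while `DTYPE_NAMES` and `dtype_name` are hypothetical names that do not come from rasterio.

```python
from enum import IntEnum


class GDALDataType(IntEnum):
    """Python mirror of the three enum members declared above."""
    GDT_Unknown = 0
    GDT_Byte = 1
    GDT_UInt16 = 2


# Hypothetical lookup table from GDAL type codes to NumPy-style dtype names.
DTYPE_NAMES = {
    GDALDataType.GDT_Unknown: None,
    GDALDataType.GDT_Byte: "uint8",
    GDALDataType.GDT_UInt16: "uint16",
}


def dtype_name(code):
    """Return the dtype name for an integer GDAL type code.

    Raises ValueError for codes that this sketch does not cover.
    """
    return DTYPE_NAMES[GDALDataType(code)]
```

Passing a code outside the three listed members raises `ValueError`, which makes an incomplete mapping visible instead of silently returning a wrong type.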
GDAL data structures
TODO
Memory management
TODO
Wrapping a GDAL function
Here is an example workflow that covers the highlights:
- Find the declaration of the function you want in the appropriate GDAL header file (for example: gdal_alg.h). Add this declaration to the appropriate declaration file in rasterio (for example: _gdal.pxd), either in the existing section that holds other declarations from that header file or, if there is none, in a new section preceded by cdef extern from "<name of header file>":
In gdal_alg.h:
    CPLErr GDALPolygonize(
        GDALRasterBandH hSrcBand,
        GDALRasterBandH hMaskBand,
        OGRLayerH hOutLayer,
        int iPixValField,
        char **papszOptions,
        GDALProgressFunc pfnProgress,
        void *pProgressArg
    )
In _gdal.pxd:
    cdef extern from "gdal_alg.h":
        int GDALPolygonize(
            void *src_band,
            void *mask_band,
            void *layer,
            int fidx,
            char **options,
            void *progress_func,
            void *progress_data
        )
- Add declarations and wrappings of any required data structures, or simply use their native data types. For example, in the case above the GDAL types are represented by native data types (e.g., int for CPLErr and void * for the handle types).
- Create a Cython bridge function in the appropriate rasterio Cython file (for example: _features.pyx). This function typically takes Python or NumPy objects as input and transforms them into the data structures that the GDAL function expects. It also converts the outputs into the proper representation to return to the calling Python code. The module needs to cimport the declaration module above.
- Create a wrapping Python function, if necessary. (TODO: better describe recommended division of labor between Python and Cython functions).
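As one small illustration of turning Python objects into what GDAL expects, consider the `char **options` argument in the declaration above. GDAL conventionally reads such string lists as NAME=VALUE pairs. The sketch below covers only the Python-side preparation, using a hypothetical helper named `encode_options` that is not part of rasterio. A bridge function would still have to copy the result into a NULL-terminated C array.

```python
def encode_options(options):
    """Turn a dict of options into GDAL-style b"NAME=VALUE" byte strings.

    Hypothetical helper for illustration only.
    """
    encoded = []
    for name, value in options.items():
        # A name that is empty or contains "=" could not be split back
        # into a name and a value, so reject it up front.
        if not isinstance(name, str) or not name or "=" in name:
            raise ValueError(f"invalid option name: {name!r}")
        encoded.append(f"{name}={value}".encode("utf-8"))
    return encoded
```

For example, `encode_options({"8CONNECTED": 8})` produces a one-element list holding the byte string `8CONNECTED=8`, which is the form in which GDALPolygonize documents its connectedness option.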
Naming conventions:
TODO: file names (*.pxd, *.pyx, *.py) and function names (_foo() in *.pyx, foo() in *.py)