I have the following minimal example of summing an array (taken from here):

#lib.cpp
template<typename T>
T arr_sum(T *arr, int size)
{
    T temp=0;
    for (int i=0; i != size; ++i){
        temp += arr[i];
    }
    return temp;
}

#lib_wrapper.pyx
cimport cython

ctypedef fused  float_t:
    cython.float
    cython.double

cdef extern from "lib.cpp" nogil:
    T arr_sum[T](T *arr, size_t size)

def py_arr_sum(float_t[:] arr not None):
    return arr_sum(&arr[0], arr.shape[0])

#setup.py
from setuptools import setup
from setuptools.extension import Extension
from Cython.Distutils import build_ext
import numpy as np

ext_modules = [Extension("lib_wrapper", ["lib_wrapper.pyx"],
                         include_dirs=[np.get_include()],
                         extra_compile_args=["-std=c++11", "-O1"],
                         language="c++")]

setup(
    name='Rank Filter 1D Cython',
    cmdclass={'build_ext': build_ext},
    ext_modules=ext_modules
)

Running python setup.py build_ext --inplace produces a 202K shared object, lib_wrapper.cpython-39-darwin.so. By comparison, gcc -shared -fPIC -O1 -o lib.so lib.cpp produces a much smaller object of ~4K. I assume the extra size comes from the C++-Python bridge created by Cython.

Considering the numerous methods available, such as the NumPy C API, pybind11, etc., which one would allow creating this bridge without such a large file size overhead? Please exclude ctypes from the suggestions: it seems to add a lot of call overhead.

  • @AhmedAEK, "-Os" cut the file size from 197K to 179K. I did not try the C API. Does that seem to be the best way? I would like to support multiple types and prefer to keep the code clean. If this is the best way, I will go in this direction, although it is probably not the most maintainable code... Commented Apr 27 at 18:51
  • From experience, pybind11 produces slightly smaller binaries than Cython and is slightly more C++-friendly than the pure Python C API, so it is a middle ground (neither the best nor the worst at anything). It is slower than both Cython and the Python C API in terms of performance, but you will have to try real code instead of this example to see the actual impact on size and performance.
    – Ahmed AEK
    Commented Apr 27 at 18:54

1 Answer

The Python C API together with the NumPy C API will give the smallest size of all, as everything else is just a wrapper around them.

To reduce the size of C++ binaries, the best tricks are:

  1. Compile with size optimization, -Os, instead of any other -Ox flag.
  2. Disable RTTI (-fno-rtti) and exceptions (-fno-exceptions). Note that with exceptions disabled, a thrown C++ exception terminates the application, so this won't work well with pybind11's exception handling.
  3. Compile without debug info (don't use the -g flag).
  4. Compile with -DNDEBUG to remove "debugging code" from some libraries.
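
As a rough sketch, the flags above could be wired into the question's setup.py along these lines. The variable name size_flags is just illustrative, and "-g0" is my own addition, on the assumption that Python's default build flags may inject -g on some platforms. The include_dirs entry from the question is left out for brevity; keep it if your module needs the NumPy headers.

```python
# setup.py (sketch): the question's extension with the size-oriented flags applied
from setuptools import Extension

# Flags taken from the list above; "-g0" is an extra assumption (see lead-in).
size_flags = [
    "-std=c++11",
    "-Os",   # optimize for size; with GCC/Clang the last -O option given wins
    "-g0",   # assumption: cancels a -g coming from Python's default CFLAGS
]

# Optional and riskier: only enable this if nothing in the module (including
# Cython-generated or pybind11 code) relies on exceptions or RTTI.
# size_flags += ["-fno-rtti", "-fno-exceptions"]

ext_modules = [
    Extension(
        "lib_wrapper",
        ["lib_wrapper.pyx"],
        extra_compile_args=size_flags,
        define_macros=[("NDEBUG", None)],  # same effect as passing -DNDEBUG
        language="c++",
    )
]
```

The resulting ext_modules list can be passed to the existing setup(...) call unchanged. How much this saves depends on the real code, so it is worth comparing the .so size before and after.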
