# sparrow

C++20 idiomatic APIs for the Apache Arrow Columnar Format

## Introduction

`sparrow` is an implementation of the Apache Arrow Columnar format in C++. It provides array structures with idiomatic APIs and convenient conversions to and from the C interface.

`sparrow` requires a modern C++ compiler supporting C++20.

## Installation

### Package managers

We provide a package for the mamba (or conda) package manager:

```bash
mamba install -c conda-forge sparrow
```

### Install from sources

`sparrow` has a few dependencies that you can install in a mamba environment:

```bash
mamba env create -f environment-dev.yml
mamba activate sparrow
```

You can then create a build directory, and build and install the project with CMake:

```bash
mkdir build
cd build
cmake .. \
    -DCMAKE_BUILD_TYPE=Debug \
    -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX \
    -DBUILD_EXAMPLES=ON \
    -DBUILD_TESTS=ON \
    -DBUILD_DOCS=ON
make install
```

## Usage

### Requirements

Compilers:

- Clang 18 or higher
- GCC 11.2 or higher
- Apple Clang 16 or higher
- MSVC 19.41 or higher

### Initialize data with sparrow and extract C data structures

```cpp
#include <utility>  // std::move

#include "sparrow/sparrow.hpp"

namespace sp = sparrow;

sp::primitive_array<int> ar = {1, 3, 5, 7, 9};
// extract_arrow_structures takes ownership away from "ar"
auto [arrow_array, arrow_schema] = sp::extract_arrow_structures(std::move(ar));
// Use arrow_array and arrow_schema as you need (serialization, passing them to
// a third-party library)
// ...
// You are responsible for releasing the structures in the end
arrow_array.release(&arrow_array);
arrow_schema.release(&arrow_schema);
```

### Initialize data with sparrow and use C data structures

```cpp
#include "sparrow/sparrow.hpp"

namespace sp = sparrow;

sp::primitive_array<int> ar = {1, 3, 5, 7, 9};
// Caution: get_arrow_structures returns pointers, not values
auto [arrow_array, arrow_schema] = sp::get_arrow_structures(ar);
// Use arrow_array and arrow_schema as you need (serialization, passing them to
// a third-party library)
// ...
// Do NOT release the C structures in the end, the "ar" variable will do it for you
```

### Read data from somewhere and pass it to sparrow

```cpp
#include "sparrow/sparrow.hpp"
#include "third-party-lib.hpp"

namespace sp = sparrow;
namespace tpl = third_party_library;

ArrowArray array;
ArrowSchema schema;
tpl::read_arrow_structures(&array, &schema);

// "ar" only references the C structures, it does not own them
sp::array ar(&array, &schema);
// Use ar as you need
// ...
// You are responsible for releasing the structures in the end
array.release(&array);
schema.release(&schema);
```

### Read data from somewhere and move it into sparrow

```cpp
#include <utility>  // std::move

#include "sparrow/sparrow.hpp"
#include "third-party-lib.hpp"

namespace sp = sparrow;
namespace tpl = third_party_library;

ArrowArray array;
ArrowSchema schema;
tpl::read_arrow_structures(&array, &schema);

// "ar" takes ownership of the C structures
sp::array ar(std::move(array), std::move(schema));
// Use ar as you need
// ...
// Do NOT release the C structures in the end, the "ar" variable will do it for you
```

## Documentation

The documentation (currently being written) can be found at https://man-group.github.io/sparrow/index.html

## Acknowledgements

This development has been funded as part of a collaboration between ArcticDB, Bloomberg, and QuantStack.

## License

This software is licensed under the Apache License 2.0. See the LICENSE file for details.
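
## Appendix: how `release` behaves

The manual `release` calls in the usage examples follow the convention of the Arrow C data interface: the producer stores a `release` callback in the structure, the consumer calls it exactly once, and the callback marks the structure as released by resetting `release` to a null pointer. The sketch below illustrates that convention without using sparrow at all. It redeclares the `ArrowArray` layout inside a `demo` namespace so that it compiles without any Arrow header, and `int32_storage`, `make_int32_array`, and `release_if_owned` are hypothetical names introduced only for this illustration, not part of sparrow or Arrow.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

namespace demo
{
    // Same member layout as the ArrowArray struct of the Arrow C data interface,
    // redeclared here so the sketch is self-contained.
    struct ArrowArray
    {
        std::int64_t length;
        std::int64_t null_count;
        std::int64_t offset;
        std::int64_t n_buffers;
        std::int64_t n_children;
        const void** buffers;
        ArrowArray** children;
        ArrowArray* dictionary;
        void (*release)(ArrowArray*);
        void* private_data;
    };

    // Hypothetical producer-side storage: owns the values and the buffer table.
    struct int32_storage
    {
        std::vector<std::int32_t> values;
        const void* buffers[2];
    };

    // Release callback: frees the producer's storage, then marks the structure
    // as released by resetting the callback to nullptr.
    inline void release_int32(ArrowArray* array)
    {
        delete static_cast<int32_storage*>(array->private_data);
        array->private_data = nullptr;
        array->release = nullptr;
    }

    // Hypothetical producer: exports a non-nullable int32 array.
    inline ArrowArray make_int32_array(std::vector<std::int32_t> values)
    {
        auto* storage = new int32_storage{std::move(values), {nullptr, nullptr}};
        storage->buffers[0] = nullptr;                 // no validity bitmap
        storage->buffers[1] = storage->values.data();  // data buffer

        ArrowArray array{};
        array.length = static_cast<std::int64_t>(storage->values.size());
        array.n_buffers = 2;
        array.buffers = storage->buffers;
        array.release = &release_int32;
        array.private_data = storage;
        return array;
    }

    // Consumer-side helper: calls release only if the structure is still live,
    // so calling it a second time is a no-op.
    inline void release_if_owned(ArrowArray& array)
    {
        if (array.release != nullptr)
        {
            array.release(&array);
        }
    }
}
```

A consumer that received `demo::ArrowArray a = demo::make_int32_array({1, 3, 5, 7, 9});` would read the values through `a.buffers[1]` and finish with `demo::release_if_owned(a);`. The null check is what makes it safe to tell "I own this and must release it" apart from "this was already released or moved elsewhere".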