F3: The Open-Source Data File Format for the Future
F3 is a data file format designed for efficiency, interoperability, and extensibility. Its data organization addresses the layout shortcomings of previous-generation formats such as Parquet, while embedded Wasm decoders keep it interoperable and extensible (i.e., future-proof).
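The extensibility idea above can be illustrated with a toy model: a reader uses a native decoder when it knows the encoding and otherwise falls back to the decoder shipped with the data. This is only a sketch under stated assumptions, not F3's actual API. `ColumnChunk`, `Reader`, `decode_plain`, `decode_delta`, `demo`, and the encoding name `"delta-v2"` are all hypothetical, and a plain function pointer stands in for the Wasm module that a real file would embed and run in a sandbox.

```rust
use std::collections::HashMap;

/// A decoder turns encoded bytes into values. Here a function pointer
/// stands in for an embedded Wasm decoder (assumption for illustration).
type Decoder = fn(&[u8]) -> Vec<u32>;

/// Hypothetical column chunk: it names its encoding and carries a
/// fallback decoder alongside the payload.
struct ColumnChunk {
    encoding: &'static str,
    payload: Vec<u8>,
    embedded_decoder: Decoder,
}

/// Hypothetical reader with a fixed set of natively supported encodings.
struct Reader {
    native: HashMap<&'static str, Decoder>,
}

impl Reader {
    /// Prefer the native decoder; fall back to the one embedded in the chunk.
    fn decode(&self, chunk: &ColumnChunk) -> Vec<u32> {
        let decoder = self
            .native
            .get(chunk.encoding)
            .copied()
            .unwrap_or(chunk.embedded_decoder);
        decoder(&chunk.payload)
    }
}

/// Plain encoding: consecutive little-endian u32 values.
fn decode_plain(bytes: &[u8]) -> Vec<u32> {
    bytes
        .chunks_exact(4)
        .map(|b| u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
        .collect()
}

/// Toy delta encoding: each byte is added to a running sum.
fn decode_delta(bytes: &[u8]) -> Vec<u32> {
    let mut acc = 0u32;
    bytes
        .iter()
        .map(|&d| {
            acc += d as u32;
            acc
        })
        .collect()
}

/// Decodes a chunk whose encoding the reader does not know natively.
fn demo() -> Vec<u32> {
    let mut native: HashMap<&'static str, Decoder> = HashMap::new();
    native.insert("plain", decode_plain);
    let reader = Reader { native };

    let chunk = ColumnChunk {
        encoding: "delta-v2", // not registered above, so the fallback is used
        payload: vec![5, 1, 1, 3],
        embedded_decoder: decode_delta,
    };
    reader.decode(&chunk)
}
```

Calling `demo()` from a `main` function exercises the fallback path: the reader only registers `"plain"`, so the chunk is decoded by the decoder it carries. The point of the design is that an old reader can still read data written with an encoding introduced after the reader was built.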
⚠️ This project is a research prototype verifying the ideas in the paper. You should not use it in production.
Build instructions
We have only tested the build on an Intel machine running Debian 12.
    git submodule update --init --recursive
    ./scripts/setup_debian.sh
    # build the PoC package of F3
    cargo build -p fff-poc
    # run unit test for F3
    cargo test -p fff-poc
Important directories
format: FlatBuffer definition of the file format.
fff-poc: The main code of the F3 format. It depends on other subdirectories such as fff-core, fff-encoding, fff-format, and fff-ude-wasm.
fff-bench: Benchmarks and experiments that appear in the paper. In particular, fff-bench/examples should contain most of the experiments, both micro-benchmarks and end-to-end (e2e) ones.