The Koala Benchmark Suite
Benchmarks | Quick Setup | More Info | License
For issues and ideas, open a GitHub issue.
Koala is a benchmark suite for characterizing performance-oriented research that targets the POSIX shell. It consists of 14 sets of real-world shell programs from diverse domains, ranging from CI/CD and AI/ML to biology and the humanities. The programs are accompanied by real inputs that support both small- and large-scale performance characterization and offer varying opportunities for optimization.
If any aspect of Koala is useful, please cite the ATC'25 Koala paper:
@inproceedings{koala2025atc,
  title = {The Koala Benchmarks for the Shell: Characterization and Implications},
  author = {Evangelos Lamprou and Ethan Williams and Georgios Kaoukis and Zhuoxuan Zhang and Michael Greenberg and Konstantinos Kallas and Lukas Lazarek and Nikos Vasilakis},
  booktitle = {Proceedings of the 2025 USENIX Annual Technical Conference (USENIX ATC '25)},
  year = {2025},
  address = {Santa Clara, CA},
  publisher = {USENIX Association},
}
As part of the ATC'25 Artifact Evaluation process, the frozen atc25-ae branch of Koala received all three badges: Artifact Available, Functional, and Reproduced.
Benchmarks
Each of the top-level folders (except infrastructure) contains a benchmark set. Please explore the individual benchmark directories for more details on their specific inputs, dependencies, and usage.
| Benchmark | Description |
|-----------|-------------|
| analytics | Processes real-world network logs to extract and summarize key events. |
| bio | Performs genomic and transcriptomic analysis using population and RNA-seq data. |
| ci-cd | Builds and tests open-source software projects. |
| covid | Analyzes public transit activity during the COVID-19 pandemic. |
| file-mod | Compresses, encrypts, and converts various file formats. |
| inference | Runs media-related inference tasks using large foundation models. |
| ml | Implements a full machine learning pipeline using scikit-learn. |
| nlp | Processes books using shell-based NLP pipelines from Unix for Poets. |
| oneliners | Executes classic and modern one-liner shell pipelines. |
| pkg | Builds AUR packages and analyzes npm packages for permissions. |
| repl | Performs security auditing and Git-based development workflow replay. |
| unixfun | Solves Unix text-processing problems from the 50-year anniversary challenge. |
| weather | Computes and visualizes historical weather statistics. |
| web-search | Implements crawling, indexing, and querying of Wikipedia data. |
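Since every top-level folder except infrastructure is a benchmark set, the available sets can be enumerated directly from a checkout. The following is a minimal sketch, not a script shipped with Koala: the function name `list_benchmarks` is hypothetical, and it assumes its argument (defaulting to the current directory) is the root of the repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of Koala): print the name of every
# top-level directory under the given root, except "infrastructure".
# Assumption: the root is a Koala checkout; hidden directories such as
# .git are not matched by the glob and are therefore skipped.
list_benchmarks() {
    root=${1:-.}
    for dir in "$root"/*/; do
        # If the root has no subdirectories, the glob stays unexpanded.
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        [ "$name" = "infrastructure" ] && continue
        printf '%s\n' "$name"
    done
}
```

Calling `list_benchmarks .` from the repository root should print one benchmark name per line. The individual benchmark directories remain the authoritative source for inputs, dependencies, and usage.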