Apple has released Pico-Banana-400K, a highly curated 400,000-image research dataset which, interestingly, was built using Google’s Gemini-2.5 models. Here are the details.
Apple’s research team has published a study called “Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing”.
Alongside the paper, they also released the full 400,000-image dataset it produced under a non-commercial research license. That means anyone can use and explore it for academic work or AI research purposes, but not for commercial ones.
Right, but what is it?
A few months ago, Google released the Gemini-2.5-Flash-Image model, also known as Nano-Banana, which is arguably the state of the art among image editing models.
Other models have also shown significant improvements, but, as Apple’s researchers put it:
“Despite these advances, open research remains limited by the lack of large-scale, high-quality, and fully shareable editing datasets. Existing datasets often rely on synthetic generations from proprietary models or limited human-curated subsets. Furthermore, these datasets frequently exhibit domain shifts, unbalanced edit type distributions, and inconsistent quality control, hindering the development of robust editing models.”
So, Apple set out to do something about it.
Building Pico-Banana-400K
The first thing Apple did was pull an unspecified number of real photographs from the OpenImages dataset, “selected to ensure coverage of humans, objects, and textual scenes.”
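To get a feel for what this kind of sampling looks like in practice, here is a minimal sketch that pulls a coverage-balanced subset of OpenImages using the open-source FiftyOne library. The class choices and per-category quotas below are illustrative assumptions, not Apple’s actual selection criteria, which the paper leaves unspecified.

```python
# Illustrative sketch only: Apple's real selection pipeline is not public.
# Requires the open-source FiftyOne library: pip install fiftyone
import fiftyone.zoo as foz

# Hypothetical per-category quotas, chosen purely for illustration.
CATEGORIES = {
    "Person": 1000,    # human-centric scenes
    "Car": 1000,       # object-centric scenes
    "Building": 1000,  # scenes likely to include signage/text
}

subsets = []
for cls, n in CATEGORIES.items():
    # Download up to `n` OpenImages samples whose detections include `cls`.
    ds = foz.load_zoo_dataset(
        "open-images-v7",
        split="train",
        label_types=["detections"],
        classes=[cls],
        max_samples=n,
        dataset_name=f"pico-banana-sketch-{cls.lower()}",
    )
    subsets.append(ds)

for ds in subsets:
    print(ds.name, len(ds))
```

Note that balancing for “textual scenes” via class labels alone is a simplification in this sketch; OpenImages has no dedicated text class, so a real pipeline would likely need an OCR-based filtering pass on top.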