Is Gemini 2.5 good at bounding boxes? Sort of...

July 10, 2025

TL;DR: Gemini 2.5 Pro is a decent object detector, matching YOLOv3 from 2018 on MS-COCO val.

Multimodal Large Language Models keep getting better, but are they ready to dethrone CNNs in computer vision tasks like object detection? The allure of skipping dataset collection, annotation, and training is too enticing not to waste a few evenings testing. I decided to write a small benchmark and check Gemini 2.5 on MS-COCO, focusing on