Imagine a downtown urban street on a traffic-heavy Saturday night. Now add a severe snowstorm and an autonomous vehicle that safely navigates it all, despite fogged-up cameras, unpredictable pedestrians in oversized outerwear, streetlights throwing glare off a dozen wet surfaces, and the snow itself, which all but obliterates the visibility of traffic signs and lane markers. A computer vision (CV) system that could make this safe navigation possible would revolutionize not only autonomous vehicles but also robotics, security, and other domains.

Moonshot goals like this eventually become reality through events like the Low-Power Computer Vision Challenge (LPCVC), which just wrapped up in Nashville at the IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR). LPCVC founder and organizer Yung-Hsiang Lu, a professor of electrical and computer engineering at Purdue University, said the month-long challenge, which ran from 1 to 31 March, offers many benefits, both in itself and for the future.

"Participating in LPCVC gives [people] an opportunity to compete with the world's top researchers," said Lu. "For the field, LPCVC gives researchers from different organizations and countries a chance to share their best solutions. Together, they propel the technology forward."

LPCVC: An Evolving Challenge

LPCVC was born out of the buzz about CV and deep learning at the 2013 IEEE Rebooting Computing Summit. Conversations among Lu and other attendees there surfaced a common concern: that deep learning's steep resource requirements would put it beyond the reach of mobile CV applications.

The CV Challenge Takes Shape

From those discussions, the challenge, originally known as the Low-Power Image Recognition Challenge (LPIRC), was formally established to encourage the integration of low-power technologies and CV. The first event was the onsite LPIRC challenge held on 7 June 2015 in Austin, Texas, in conjunction with the Design Automation Conference. Two more onsite challenges followed, and the event evolved from there: in 2018, LPIRC went online to offer greater access to developers across the globe; for 2020, the challenge's name and scope were broadened to the Low-Power Computer Vision Challenge; and the LPCVC then continued annually until 2024, when it took a year-long break.

This Year's Challenge

LPCVC 2025 relaunched with a new platform and a new sponsor: Qualcomm. Shuai Zhang, a senior staff machine learning researcher and tech lead in the Qualcomm Computer Vision Group, said that Qualcomm and LPCVC share a common goal: advancing innovative edge AI solutions in both academia and industry.

Qualcomm and Lu collaborated on designing and implementing the challenge's new online system, including a compiling, submission, and evaluation pipeline, along with online webinars and baseline code. The aim was to make LPCVC even more accessible to everyone, from experts to newer developers and students who might lack background knowledge or the required hardware resources.

Zhang said the challenge's online system also facilitates innovation by providing rapid feedback. "The new pipeline provides immediate compiling feedback via Qualcomm AI Hub, allowing quick debugging and modifications," said Zhang. "This enables participants to focus more on optimization and innovative solutions, rather than waiting for feedback as in previous years." A rough sketch of this compile-and-profile workflow appears below.

LPCVC 2025 included 77 teams from 25 different countries and regions, with the highest number of teams based in India, China, South Korea, and the United States.
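The rapid feedback loop Zhang describes can be approximated with Qualcomm's publicly available qai_hub Python client. The sketch below is only an illustration under assumptions: the stand-in MobileNet model, the target device name, and the exact arguments are hypothetical, and this is not the challenge's official submission code.

```python
# Minimal sketch of a compile-and-profile loop via Qualcomm AI Hub.
# Assumptions: the qai_hub client ("pip install qai-hub") with an API token
# configured, and an illustrative device name; not official LPCVC code.
import torch
import torchvision
import qai_hub as hub

# Any small PyTorch model stands in for a participant's entry.
model = torchvision.models.mobilenet_v3_small(weights=None).eval()
example = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example)

# Submit the traced model for on-device compilation; feedback arrives quickly.
compile_job = hub.submit_compile_job(
    model=traced,
    device=hub.Device("Samsung Galaxy S24 (Family)"),  # hypothetical target
    input_specs={"image": (1, 3, 224, 224)},
)

# Profile the compiled artifact to measure latency on the target hardware.
profile_job = hub.submit_profile_job(
    model=compile_job.get_target_model(),
    device=hub.Device("Samsung Galaxy S24 (Family)"),
)
print(profile_job.download_profile())  # latency and memory statistics
```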
For Lu, this growth in participation is a testament to the increasing global interest in low-power CV. "When we started in 2015, only dozens of people participated. Now hundreds of people from many countries are interested in these topics," said Lu, calling that growth his biggest satisfaction with the challenge this year. His biggest surprise? "That some student teams outperformed industrial teams made of multiple researchers with many years of experience."

LPCVC 2025: Challenge Tracks and Winners

The 2025 challenge had three tracks, each focused on optimizing and deploying models on edge devices such as mobile phones and AI PCs. The winning teams from each track were invited to CVPR's 8th Workshop on Efficient Deep Learning for Computer Vision on 11 June to present their solutions and receive awards.

Track 1: Image Classification

In this track, teams were challenged to create models that could quickly and accurately label images captured under varied lighting conditions and in varied visual styles.

Why It Matters

Accurately classifying widely varied images helps applications perform more reliably under messy real-world conditions. That includes coping with environmental noise in self-driving cars and other safety-critical applications, where misclassification can have serious consequences.

Winning Teams

Track 1's winning team was LabLVM from the Republic of Korea's Ajou University. Although the winning solution's completion time of 1,961 microseconds was slower than the baseline's 419 microseconds, it achieved a remarkable accuracy of 0.97, compared with the baseline's 0.69.

"I prioritized enhancing the model's generalization ability," said Seungmin Oh, leader of the LabLVM team. "The most difficult part was maintaining this robustness while ensuring the model met the strict inference time constraints." Oh was surprised by how effectively vision-language models performed in this context. "On the other hand," said Oh, "I found that synthetically generated data did not significantly improve the model's performance, contrary to my initial expectations."

Track 1's runners-up were The SEU AIC LAB from Southeast University's School of Integrated Circuits in the People's Republic of China (second place) and ETF Amigo from the University of Belgrade's School of Electrical Engineering in the Republic of Serbia (third place).

Track 2: Open-Vocabulary Segmentation with Text Prompt

In this track, teams were challenged to create models that identify and segment objects in images based on text descriptions, even if the objects weren't part of the training set.

Why It Matters

Models that can use text descriptions to identify image content are more flexible; they also better mirror human interaction, letting users simply describe what they want the model to find.

Winning Teams

Track 2's winning solution was created by the SICer team from Southeast University (which placed second in Track 1). In this track, the team's model improved on both accuracy and speed: it achieved a mean intersection over union of 0.61 compared with the baseline's 0.46, and did so in 515.18 milliseconds, compared with the baseline's 863.42 milliseconds.

SICer team leader Yuning Ji said preparing the model was the biggest challenge: "Model training demands substantial computational resources and is time-consuming," Ji said. "Hidden categories necessitate training datasets to be both comprehensive and meticulously annotated."
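For readers unfamiliar with the Track 2 metric, mean intersection over union (mIoU) measures how well predicted segmentation masks overlap the ground truth, averaged across classes. The snippet below is a minimal, generic sketch of the metric, not the challenge's evaluation code; the toy label maps are invented purely for illustration.

```python
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Mean intersection over union across classes present in either map.

    pred and target are integer label maps of identical shape (H, W).
    """
    ious = []
    for cls in range(num_classes):
        pred_mask = pred == cls
        target_mask = target == cls
        union = np.logical_or(pred_mask, target_mask).sum()
        if union == 0:  # class absent from both maps; skip it
            continue
        intersection = np.logical_and(pred_mask, target_mask).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else 0.0

# Toy usage: two 4x4 label maps with three classes.
pred = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 2, 2], [2, 2, 2, 2]])
target = np.array([[0, 0, 1, 1], [0, 1, 1, 1], [2, 2, 2, 2], [0, 2, 2, 2]])
print(f"mIoU = {mean_iou(pred, target, num_classes=3):.2f}")  # prints mIoU = 0.76
```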
Track 2's runners-up were Sailor Moon from the University of Minnesota in the United States (second place) and MaXinLab from the University of California, Irvine, in the United States (third place).

Track 3: Monocular Depth Estimation

In this track, teams were challenged to create models that could infer the relative depth of objects, or the distance between them, in a scene presented in only a single image.

Why It Matters

While humans routinely grasp 3D relationships from 2D inputs, training models to do this is nontrivial; it's also essential, given that most real-world images come from a single camera, whether it's embedded in a drone, a phone, or a surveillance system. Achieving accurate monocular depth estimation can dramatically lower the cost of creating 3D-aware systems. It can also help applications such as robotics and autonomous driving better understand and navigate their environments.

Winning Teams

In Track 3, the winning solution's completion time of 29.79 milliseconds was marginally slower than the baseline's 24.09 milliseconds, but it achieved a remarkable improvement in depth estimation accuracy, with an F-score of 83.80 compared with the baseline's 62.37. The winning team was Sailor Moon from the University of Minnesota (which placed second in Track 2).

The team's leader, Kexin Chen, said the competition's setup, including the limited timeline, the lack of training data, and the fact that metric evaluations were available only for the previous day's last submission, meant the team had to be thoughtful and highly selective about which models it submitted for evaluation. Chen said this limited the optimization techniques the team could explore. Nonetheless, in addition to the wins, team members had a powerful takeaway: "We learned that reducing image resolution significantly improves running time while having only a minor impact on accuracy, especially for ViT-based model architectures," said Chen. (A rough illustration of this trade-off appears at the end of this section.)

Track 3's runners-up were the team from the Shenyang Institute of Automation at the Chinese Academy of Sciences, People's Republic of China (second place) and Circle, an independent team from the United States (third place).

Plans for LPCVC 2026 and Advice for Future Participants

Zhang said that Qualcomm plans to continue collaborating with Professor Lu on LPCVC 2026. Among their possible plans is expanding the challenge's scope to encompass more AI areas, including natural language processing and large language models.

Reflecting on their experiences in LPCVC 2025, the leaders of all three winning teams offered the following advice for competitors aiming to take on LPCVC 2026.

"I encourage future participants to approach the task with creativity. Although Task 1 focused on different lighting conditions and styles, improving performance under these variations doesn't necessarily require generating new images with diverse lighting. There may be alternative, more innovative ways to tackle the problem effectively." —Seungmin Oh, Track 1's winning team leader

"For tracks related to large-scale models, it is critical to ensure ample GPU computational resources and conduct pre-training in advance. This allows sufficient time to explore optimized model architectures and perform fine-tuning on custom-built downstream datasets." —Yuning Ji, Track 2's winning team leader

"Our advice would be to start early and take time to study the sample solutions carefully. The LPCVC 2025 organizing team is incredibly supportive, so don't hesitate to ask questions whenever needed." —Kexin Chen, Track 3's winning team leader
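Chen's point about input resolution is easy to explore in rough form. The sketch below times an untrained ViT-B/16 from torchvision at several input sizes; the model choice, resolutions, and iteration count are arbitrary assumptions, and the numbers will vary by machine, so treat it as an illustration of the speed-versus-resolution trade-off rather than the team's actual benchmark.

```python
import time
import torch
from torchvision.models import vit_b_16

def time_vit(resolution: int, iterations: int = 20) -> float:
    """Average forward-pass time (ms) for a ViT-B/16 at a given input size."""
    # Pretrained weights are omitted: only latency is measured here, not accuracy.
    model = vit_b_16(image_size=resolution).eval()
    x = torch.rand(1, 3, resolution, resolution)
    with torch.no_grad():
        model(x)  # warm-up pass
        start = time.perf_counter()
        for _ in range(iterations):
            model(x)
    return (time.perf_counter() - start) / iterations * 1000

for res in (224, 192, 160, 128):  # must be multiples of the 16-pixel patch size
    print(f"{res}x{res}: {time_vit(res):.1f} ms per image")
```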
For Further Information

For more information about this year's event and LPCVC overall, check out:

All nine of LPCVC 2025's winning solutions, which are available to the public on GitHub, along with the test datasets.

The LPCVC website, which includes full details about this year's and past years' challenges, as well as a newsletter sign-up to stay informed about future events.

To dig deeper, visit the IEEE Computer Society's extensive Computer Vision Resources page, which covers everything from career opportunities to discussions of CV's future and issues related to ethics, standards, diversity, and inclusion in the CV field.