1. Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation
Chenghao Zhang, Guanting Dong, Yufan Liu, Tong Zhao, Zhicheng Dou
Lead affiliation - Gaoling School of Artificial Intelligence, Renmin University of China
3. VitalAgent: A Tool-Augmented Agent for Reactive and Proactive Physiological Monitoring over Wearable Health Data
Di Zhu, Yu Yvonne Wu, Hong Jia, Aaqib Saeed, Vassilis Kostakos, Ting Dang
Lead affiliation - The University of Melbourne, Australia
4. Training Deliberative Monitors for Black-Box Scheming Detection
Aditya Sinha, Akshat Naik, Victor Gillioz, Simon Storf, Kilian Merkelbach, Rich Barton-Cooper, Axel Højmark, Marius Hobbhahn
Lead affiliation - Independent
5. How Consistent Are LLM Agents? Measuring Behavioral Reproducibility in Multi-Step Tool-Calling Pipelines
Lead affiliation - Independent Researcher
7. RoboWits: Unexpected Challenges for Robotic Creative Problem Solving
Chunru Lin, Hongxin Zhang, Fenghao Yu, Zhehuan Chen, Thomas L. Griffiths, Yejin Choi, David Held, Chuang Gan
Lead affiliation - University of Massachusetts Amherst
8. Hallucination Mitigation with Agentic AI, Nested Learning, and AI Sustainability via Semantic Caching
Lead affiliation - Tesisquare
9. GTA: Generating Long-Horizon Tasks for Web Agents at Scale
Tenghao Huang, Kung-Hsiang Huang, Prafulla Kumar Choubey, Yilun Zhou, Muhao Chen, Jonathan May, Chien-Sheng Wu
Lead affiliation - University of Southern California
10. PTCG-Bench: Can LLM Agents Master Pokémon Trading Card Game?
Dongdong Hua, Yifei Sun, Renhong Huang, Feng Gao, Chunping Wang, Yang Yang
Lead affiliation - Zhejiang University