1. Adaptive Latent Agentic Reasoning
Dongwon Jung, Peng Shi, Yi Zhang, Junshan Zhang, Muhao Chen
2. Enhancing Operational Safety via Agentic Dialogue Hazard Identification Analysis
Sanjay Das, Ran Elgedawy, Ethan Seefried, Ryan Burchfield, Tirthankar Ghosal
3. Toward a Modular Architecture for Embedded AI Agent Systems at the Edge
4. Inducing Reasoning Primitives from Agent Traces
Zhihan Lei, Jiarui Yan, Joshua Momo, William W. Cohen
5. What Makes Interaction Trajectories Effective for Training Terminal Agents?
Sidi Yang, Chaofan Tao, Jierun Chen, Tiezheng Yu, Ruoyu Wang, ..., Taiqiang Wu, Lifeng Shang, Xiaohui Li, Ngai Wong, Haoli Bai
6. Economy of Minds: Emerging Multi-Agent Intelligence with Economic Interactions
Zhenting Qi, Huangyuan Su, Ao Qu, Chenyu Wang, Yu Yao, ..., Ju Li, Paul Pu Liang, Himabindu Lakkaraju, Sham Kakade, Yilun Du
7. KForge: LLM-Driven Cross-Platform Kernel Generation for AI Accelerators
Taras Sereda, Burak Bartan, Ankita Nayak, Tom St.John, Natalie Serrino, Zain Asgar
8. Multi²: Hierarchical Multi-Agent Decision-Making with LLM-Based Agents in Interactive Environments
9. MUSE: A Unified Agentic Harness for MLLMs
Jianglin Lu, Hailing Wang, Xu Ma, Qihua Dong, Mingyuan Zhang, Yizhou Wang, Yun Fu
10. EvoDrive: Pareto Evolution for Safety-Critical Autonomous Driving via Self-Improving LLM Agents
Tong Nie, Yuewen Mei, Yihong Tang, Junlin He, Jie Deng, Jian Sun, Wei Ma
11. EvoDS: Self-Evolving Autonomous Data Science Agent with Skill Learning and Context Management
Zherui Yang, Fan Liu, Yansong Ning, Hao Liu
12. FederatedSkill: Federated Learning for Agentic Skill Evolution
Jingbo Yang, Guanyu Yao, Yang Zhang, Ramana Rao Kompella, Gaowen Liu, Shiyu Chang
13. Tool-Aware Optimization with Entropy Guidance for Efficient Agentic Reinforcement Learning
Hongye Cao, Nuo Yan, Haoyuan Deng, Ziwei Wang, Tianpei Yang, Jing Huo, Yuyao Zhang, Yang Gao
14. Diagnosing Knowledge Gaps in LLM Tool Use: An Agentic Benchmark for Novel API Acquisition
Jinnuo Liu, Yue Peng, Jinhan Niu, Hongyi Wen
15. Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions
Xuan Yang, Hao Xu, Tingfeng Hui, Hongsheng Xin, Kaike Zhang, Chunxiao Liu, Ning Miao
17. Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning
Yu Xia, Zhouhang Xie, Xin Xu, Byungkyu Kang, Prarit Lamba, Xiang Gao, Julian McAuley
18. Traj-Evolve: A Self-Evolving Multi-Agent System for Patient Trajectory Modeling in Lung Cancer Early Detection
Sihang Zeng, Matthew Thompson, Ruth Etzioni, Meliha Yetisgen
19. Think-Before-Speak: From Internal Evaluation to Public Expression in Multi-Agent Social Simulation
Kaiqi Yang, Tai-Quan Peng, Sanguk Lee, Hui Liu
20. CP-Agent: Context-Aware Multimodal Reasoning for Cellular Morphological Profiling under Chemical Perturbations
Yuxin Zhang, Yiyao Li, Ping Shu Ho, Simon See, Zhenqin Wu, Kevin Tsia
21. TSQAgent: Rating Time Series Data Quality via Dedicated Agentic Reasoning
Shunyu Wu, Dan Li, Haozheng Ye, Weibin Feng, Jian Lou, Bo Zhang, Wenjie Feng, Chenjuan Guo, See-Kiong Ng
22. LAP: An Agent-to-Instrument Protocol for Autonomous Science
Linwu Zhu, Liqiang Gao, Yan Chen, Dan Zhu, Jian Huang
23. PhotoCraft: Agentic Reasoning with Hierarchical Self-Evolving Memory for Deep Image Search
Kailin Lyu, Zhiqiang Yuan, Jianwei He, Qiwei Yan, Xuanbo Su, ..., Ce Hao, Shengqian Qin, Lianyu Hu, Jinchao Zhang, Jie Zhou
24. MemTrain: Self-Supervised Context Memory Training
Ziheng Li, Xingrun Xing, Haoqing Wang, Zhi-Hong Deng, Yehui Tang
25. SagaQA: A Multi-hop Reasoning Benchmark for Long-form Narrative Understanding in TV Series
Galann Pennec, Zhengyuan Liu, Nicholas Asher, Philippe Muller, Nancy F. Chen
26. Inference Cost Attacks for Retrieval-Augmented Large Language Models
Chengliang Liu, Liangbo Ning, Yujuan Ding, Wenqi Fan
27. Cross-Vendor Sola ISPM Benchmark: Evaluating Agentic AI for Federated Identity Security Reasoning
Eden Yavin, Gal Engelberg, Konstantin Koutsyi, Leon Goldberg, Gal Baron
28. MemoGen: Can Past Experience Improve Future Text-to-Image Generation?
Wenshuo Chen, Kuimou Yu, Bowen Tian, Jianfei Song, Shaofeng Liang, ..., Kaishen Yuan, Lei Wang, Jiemin Wu, Songning Lai, Yutao Yue
29. VirtualMLE: A Virtual ML Engineer that Optimizes Sequential Recommenders
Shiteng Cao, Jingwen Liu, Junda She, Zhiheng Li
30. eMEM: A Hybrid Spatio-Temporal Memory System For Embodied Agents
A. Haroon Rasheed, Maria Kabtoul
31. When Helping Hurts and How to Fix It: Multi-Agent Debate for Data Cleaning
Chirag Parmar, Akshat Mehta, Henglin Wu, Jagadish Ramamurthy, Shweta Medhekar
32. What Benchmarks Don't Measure: The Case for Evaluating Abstention Competence in Autonomous Agents
Victor Ojewale, Suresh Venkatasubramanian
33. AUDITFLOW: Executable Symbolic Environments for Structured Financial Reporting Verification
Yan Wang, Xuguang Ai, Jaisal Patel, Xueqing Peng, Fengran Mo, Yupeng Cao, Haohang Li, Mingyu Cao, Lingfei Qian, Víctor Gutiérrez-Basulto
34. SkillDAG: Self-Evolving Typed Skill Graphs for LLM Skill Selection at Scale
Tong Bai, Zhenglin Wan, Pengfei Zhou, Xingrui Yu, Wangbo Zhao, Yang You, Ivor W. Tsang
35. DELTAMEM: Incremental Experience Memory for LLM Agents via Residual Trees
Haoran Tan, Zeyu Zhang, Zhicheng Cao, Rui Li, Xu Chen
36. EvoTrainer: Co-Evolving LLM Policies and Training Harnesses for Autonomous Agentic Reinforcement Learning
Guhong Chen, Yingcheng Shi, Yongbin Li, Binhua Li, Xander Xu, Hu Wei, Shiwen Ni, Min Yang, Jieping Ye
37. Uncertainty-Aware Clarification in LLM Agents with Information Gain
Mengyi Deng, Zhiwei Li, Xin Li, Tingyu Zhu, Ying Zhao, Zhijiang Guo, Wei Wang
38. ClinicalMC: A Benchmark for Multi-Course Clinical Decision-Making with Large Language Models
Ruihui Hou, Siyi Zhu, Ziyue Huai, Guangya Yu, Yongqi Fan, Chunming Wang, Tong Ruan
39. LEAP: Supercharging LLMs for Formal Mathematics with Agentic Frameworks
Po-Nien Kung, Linfeng Song, Dawsen Hwang, Jinsung Yoon, Chun-Liang Li, ..., Quoc V Le, Burak Gokturk, Thang Luong, Tomas Pfister, Nanyun Peng
40. StepFinder: A Temporal Semantic Framework for Failure Attribution in Multi-Agent Systems
Taiyu Zhu, Yifan Wu, Weilin Jin, Ying Li, Gang Huang
41. Overlaying Governance: A Compositional Authorization Framework for Delegation and Scope in Agentic AI
42. Cross-Lingual Token Arbitrage: Optimizing Code Agent Context Windows via Local LLM Preprocessing
43. The DeepSpeak-Agentic Dataset
Sarah Barrington, Maty Bohacek, Hany Farid
45. Ψ-Bench: Evaluating Persona-Sensitive Influencing in Persuasive Dialogues
Peixuan Han, Hongyi Du, Jiayu Liu, Yihang Sun, Yutong Liu, Jiaxuan You
46. Libra: Efficient Resource Management for Agentic RL Post-Training
Kaiwen Chen, Xin Tan, Jingzong Li, Hong Xu
47. Validation-Gated Multi-Agent Governance for Online Adaptation of Thermal-Hydraulic Surrogate Models under Operating-Regime Shift
Doyeong Lim, Seungyoon Lee, In Cheol Bang
48. Trading Human Curation for Synthetic Augmentation in RLVR
Akshansh <last>, Leonardo Rosa Rodrigues, Michael Korostelev, Youssef Hassan, Mark E. Whiting
49. Skill-RM: Unifying Heterogeneous Evaluation Criteria via Agent Skill
Tao Chen, Gangwei Jiang, Pengyu Cheng, Siyuan Huang, Yihao Liu, ..., Kai Tang, Junling Liu, Qinliang Su, Xiaoxi Jiang, Guanjun Jiang
50. The Deliberative Illusion: Diagnosing Factual Attrition and Stance Homogenization in Multi-Agent LLM Deliberation
Herun Wan, Jiaying Wu, Minnan Luo, Fanxiao Li, Ningnan Wang, Nancy F. Chen, Min-Yen Kan
51. Framing Migration News with LLMs: Structured CoT as a Support for Human Interpretation
David Alonso del Barrio, Jing Wen, Daniel Gatica-Perez
52. HybridThinker: Efficient Chain-of-Thought Reasoning via Compressed Memory and Transient Thought Steps
Xin Liu, Runsong Zhao, Xinyu Liu, Junhao Ruan, Pengcheng Huang, ..., Chunyang Xiao, Chenglong Wang, Changliang Li, Jingbo Zhu, Tong Xiao
53. A New Framework for Cybersecurity Refusals in AI Agents
Eliot Krzysztof Jones, Mateusz Dziemian, Matt Fredrikson, J Zico Kolter
54. What You Approve Is What Executes: Consent Integrity for Black-Box LLM Agents
55. SkillGuard: A Permission Framework for Agent Skills
Shidong Pan, Xiaoyu Sun, Tianyi Zhang, Dianshu Liao, Meixue Si, Zhenchang Xing
56. FORGE: Multi-Agent Graduated Exploitation and Detection Engineering
57. MetaWorld: Scaling Multi-Agent Video World Model from Single-view Video Data
Teng Hu, Mingchun Lu, Yating Wang, Jiangning Zhang, Jinkun Hao, Ye Pan, Ran Yi, Lizhuang Ma, Dacheng Tao
58. JAVEDIT: Joint Audio-Visual Instruction-Guided Video Editing with Agentic Data Curation
Yinan Chen, Chuming Lin, Zhennan Chen, Yuxiang Zeng, Junwei Zhu, ..., Xiaobin Hu, Chengjie Wang, Yong Liu, Jiangning Zhang, Shuicheng Yan
59. CR-Seg: Attention-Guided and CoT-Enhanced Coarse-to-Refined Reasoning Segmentation
Yifan Cao, Xiaocui Yang, Faxian Wan, Shi Feng, Daling Wang, Yifei Zhang
60. Benchmarking Visual State Tracking in Multimodal Video Understanding
Sihyun Yu, Nanye Ma, Pinzhi Huang, Hyunseok Lee, Shusheng Yang, ..., Ellis Brown, Oscar Michel, Boyang Zheng, Jinwoo Shin, Saining Xie
61. Do Matching Mechanisms Work with LLM Agents?
Yukihiro Hoshino, Ayato Kitadai, Nariaki Nishino
62. Causal Mirage Equilibrium in Agentic Machine Intelligence
63. Skill Is Not Document: A Query-Conditional Benchmark and Two-Stage Retriever for LLM Agent Skill Routing
Zifei Wang, Wei Wen, Qiang Ji, Ruizhi Qiao, Xing Sun
64. Taiji: Pareto Optimal Policy Optimization with Semantics-IDs Trade-off for Industrial LLM-Enhanced Recommendation
Yuecheng Li, Zeyu Song, Jing Yao, Chi Lu, Peng Jiang, Kun Gai
65. CARVE: Certified Affordable Repair of Vetoed Maneuvers via Envelopes for Interactive Driving
66. BotDirector: Robot Storytelling Across the Symmetrical Reality with Multi-modal Interactions
Zhe Sun, Meng Wang, Lei Wang, Yuxi Wang, Wanxin Li, Yujia Peng, Zhenliang Zhang
67. Revisiting Embodied Chain-of-Thought for Generalizable Robot Manipulation
Nan Sun, Yuan Zhang, Yongkun Yang, Wentao Zhao, Peiyan Li, ..., Runze Suo, Yifei Su, Xin Xiao, Xinghang Li, Huaping Liu
68. Self-Refining Agentic Reinforcement Learning for Vision-Conditioned UAV Navigation
Roohan Ahmed Khan, Yasheerah Yaqoot, Muhammad Ahsan Mustafa, Dzmitry Tsetserukou
69. An Asymptotic Theory of Chain-of-Thought in In-Context Learning
Kaito Takanami, Cengiz Pehlevan
70. Proof-Refactor: Refactoring Generated Formal Proofs into Modular Artifacts
Yiming Fu, Peixuan Liu, Zichen Wang, Kun Yuan
71. Entropy Gate: Entropy Quenching for Near-Lossless Token Compression in LLM Pipelines
Justice Owusu Agyemang, Jerry John Kponyo, Kwame Opuni-Boachie Obour Agyekum, Francisca Adoma Acheampong, Kwame Agyeman-Prempeh Agyekum, James Dzisi Gadze
72. Learn from Your Mistakes: Tree-like Self-Play for Secure Code LLMs
Wenqi Chen, Ziyan Zhang, Bing Wang, Lin Liu, Hengheng Zhang, Zhengsu Chen
73. OVO-S-Bench: A Hierarchical Benchmark for Streaming Spatial Intelligence in Multimodal LLMs
Yifei Li, Pengyiang Liu, Yuhang Zang, Zhongyue Shi, Qi Fu, Hongye Hao, Jiwen Lu
74. ToolGate: Token-Efficient Pre-Call Control for Tool-Augmented Vision-Language Agents
Anjie Liu, Yan Song, Zhixun Chen, Ziqin Gong, Zhongwei Yu, Jun Wang
75. GTBench: A Curriculum-Grounded Benchmark for Evaluating LLMs as Mathematical Research Assistants in Graph Theory
Noujoud Nader, Ibrahem Aljabea, Patrick Diehl, Deepti Gupta
76. ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning
Ziyan Liu, Xueda Shen, Yuzhe Gu, Songyang Gao, Kuikun Liu, Guangran Cheng, Chengqi Lyu, Dahua Lin, Wenwei Zhang, Kai Chen
77. Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models
Mahtab Bigverdi, Lindsey Li, Weikai Huang, Yiming Liu, Jaemin Cho, ..., Tuhin Kundu, Chris Dangjoo Kim, Zelun Luo, Linda Shapiro, Ranjay Krishna
78. FGRPO: Federated GRPO with Adaptive Aggregation on Non-IID Data
Pengyu Chen, Shaowei Li, Kai Wang, Yunsheng Yuan, Kai Han, Jun Luo, Feng Li
79. Post-Hoc Robustness for Model-Based Reinforcement Learning
Siemen Herremans, Ali Anwar, Siegfried Mercelis
80. A Close Look At World Model Recovery In Supervised Fine-Tuned LLM Planners
Patrick Emami, Nan Qiang, Peter Graf
81. SEA-NLI: Natural Language Inference as a Lens into Southeast Asian Cultural Understanding
Peerawat Chomphooyod, Jian Gang Ngui, Yosephine Susanto, Attapol T. Rutherford, Alham Fikri Aji, Sarana Nutanong, Can Udomcharoenchaikit, Peerat Limkonchotiwat
82. From Script to Semantics: Prompting Strategies for African NLI
Anuj Tiwari, Terry Oko-odion, Hannah Nwokocha
83. Reasoning over Grammar: Can Synthetic Linguistic Reasoning Traces Enhance Low-Resource Machine Translation?
Renhao Pei, Yihong Liu, Sampo Pyysalo, Hinrich Schütze, Shaoxiong Ji
84. Quantifying Faithful Confidence Expression in Large Reasoning Models
Areeb Gani, Asal Meskin, Gabrielle Kaili-May Liu, Arman Cohan
86. Eliciting Complex Spatial Reasoning in MLLMs through Wide-Baseline Matching
Hao Zhong, Muzhi Zhu, Shenyan Zeng, Anzhou Li, Cong Chen, ..., Duochao Shi, Wentao Ye, Tao Lin, Hao Chen, Chunhua Shen
87. Fast Unlearning at Scale via Margin Self-Correction
Federico Di Gennaro, Alexander Shevchenko, Fanny Yang