1. RUBAS: Rubric-Based Reinforcement Learning for Agent Safety
Xian Qi Loye, Qinglin Su, Zhexin Zhang, Shiyao Cui, Qi Zhu, Fei Mi, Hongning Wang, Minlie Huang
2. Plan First, Judge Later, Run Better: A DMAIC-Inspired Agentic System for Industrial Anomaly Detection
Yongzi Yu, Ao Li, Le Wang, Ziyue Li, Fugee Tsung, Yuxuan Liang, Man Li
3. MapAgent: An Industrial-Grade Agentic Framework for City-scale Lane-level Map Generation
Deguo Xia, Zihan Li, Haochen Zhao, Dong Xie, Yuyao Kong, Xiyan Liu, Jizhou Huang, Mengmeng Yang, Diange Yang
4. Scaling Self-Evolving Agents via Parametric Memory
Tao Ren, Weiyao Luo, Hui Yang, Rongzhi Zhu, Xiang Huang, ..., Bingxue Chou, Jieping Ye, Jiafeng Liang, Yongbin Li, Yijie Peng
5. Strabo: Declarative Specification and Implementation of Agentic Interaction Protocols
Samuel H. Christie V, Amit K. Chopra, Munindar P. Singh
6. SMAC-Talk: A Natural Language Extension of the StarCraft Multi-Agent Challenge for Large Language Models
7. Cascading Hallucination in Agentic RAG: The CHARM Framework for Detection and Mitigation
8. Exploring Cross-Scenario Generality of Agentic Memory Systems: Diagnostics and a Strong Baseline
Zhikai Chen, Jialiang Gu, Junyu Yin, Xianxuan Long, Shenglai Zeng, Xiaoze Liu, Kai Guo, Keren Zhou, Jiliang Tang
9. The Meta-Agent Challenge: Are Current Agents Capable of Autonomous Agent Development?
Xinyu Lu, Tianshu Wang, Pengbo Wang, Zujie Wen, Zhiqiang Zhang, ..., Boxi Cao, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun
10. AgentJet: A Flexible Swarm Training Framework for Agentic Reinforcement Learning
Qingxu Fu, Boyin Liu, Shuchang Tao, Zhaoyang Liu, Bolin Ding
11. Rethinking Continual Experience Internalization for Self-Evolving LLM Agents
Jingwen Chen, Wenkai Yang, Shengda Fan, Wenbo Nie, Chenxing Sun, Shaodong Zheng, Yangen Hu, Lu Pan, Ke Zeng, Yankai Lin
12. TIBlender: Early-Warning Threat Intelligence from Cross-Platform Social Media Evidence
Hiroki Nakano, Takashi Koide, Daiki Chiba
13. From Agent Traces to Trust: Evidence Tracing and Execution Provenance in LLM Agents
Yiqi Wang, Jiaqi Zhang, Taotao Cai, Zirui Liu, Qingqiang Sun, Zequn Sun, Zhangkai Wu, Mingkai Zhang, Yanming Zhu
15. Agent Planning Benchmark: A Diagnostic Framework for Planning Capabilities in LLM Agents
Haoyu Sun, Wenxuan Wang, Mingyang Song, Jujie He, Weinan Zhang, Yang Liu, Yang Yang, Yu Cheng
16. Towards Efficient and Evidence-grounded Mobility Prediction with LLM-Driven Agent
Linyao Chen, Qinlao Zhao, Zechen Li, Mingming Li, Likun Ni, Jinyu Chen, Yuhao Yao, Xuan Song, Noboru Koshizuka, Hiroki Kobayashi
17. Evaluating Large Language Models in Dynamic Clinical Decision-Making with Standardized Patient Cases
Cheng Liang, Pengcheng Qiu, Ya Zhang, Yanfeng Wang, Chaoyi Wu, Weidi Xie
18. Streaming Communication in Multi-Agent Reasoning
Zhen Yang, Xiaogang Xu, Wen Wang, Cong Chen, Xander Xu, Ying-Cong Chen
19. Trivium: Temporal Regret as a First-Class Objective for Causal-Memory Controllers
21. Deliberate Evolution: Agentic Reasoning for Sample-Efficient Symbolic Regression with LLMs
Xinyu Pang, Zhanke Zhou, Xuan Li, Fangrui Lv, Shanshan Wei, Sen Cui, Bo Han, Changshui Zhang
23. GARL: Game-Theoretic Reinforcement Learning for Multi-Agent Strategic Prioritisation
Yuxiao Ye, Yiwen Zhang, Huiyuan Xie, Yuqin Huang, Zhiyuan Liu
24. Imbuing Large Language Models with Bidirectional Logic for Robust Chain Repair
Zehua Cheng, Wei Dai, Jiahao Sun, Thomas Lukasiewicz
25. Consensus is Strategically Insufficient: Reasoning-Trace Disagreement as a Knowledge-Representation Signal
Michał Wawer, Jarosław A. Chudziak
27. Simulate, Reason, Decide: Scientific Reasoning with LLMs for Simulation-Driven Decision Making
Yuhan Yang, Ruipu Li, Alexander Rodríguez
28. FALSIFYBENCH: Evaluating Inductive Reasoning in LLMs with Rule Discovery Games
Leonardo Bertolazzi, Katya Tentori, Raffaella Bernardi
29. SMADE-IE: Sparse Multi-Agent Framework with Evidence-Driven Debate for Zero-Shot Information Extraction
Kenfeng Huang, Yi Cai, Xin Wu, Zikun Deng, Li Yuan
30. NoRA: Evaluating Grounded Reasonableness in Visual First-person Normative Action Reasoning
Sichao Li, Sai Ma, Daniel Kilov, Secil Yanik Guyot, Zhuang Li, Seth Lazar
31. AgenticDiffusion: Agentic Diffusion-based Path Planning for Vision-Based UAV Navigation
Faryal Batool, Muhammad Ahsan Mustafa, Fawad Mehboob, Valerii Serpiva, Dzmitry Tsetserukou
34. AutoLab: Can Frontier Models Solve Long-Horizon Auto Research and Engineering Tasks?
Zhangchen Xu, Junda Chen, Yue Huang, Dongfu Jiang, Jiefeng Chen, ..., Mengdi Wang, Radha Poovendran, Misha Sra, Alex Pentland, Zichen Chen
35. Entity Binding Failures in Speech LLM Reasoning: Diagnosis and Chain-of-Thought Intervention
Ming-Hao Hsu, Xiaohai Tian, Jun Zhang, Zhizheng Wu
36. Temporal Order Matters for Agentic Memory: Segment Trees for Long-Horizon Agents
Yifan Simon Liu, Liam Gallagher, Faeze Moradi Kalarde, Jiazhou Liang, Armin Toroghi, Scott Sanner
37. LifeSide: Benchmarking Agents as Lifelong Digital Companions
Yuqian Wu, Zhijie Deng, Wei Chen, Junwei Li, Yutian Jiang, ..., Zhengjun Huang, Qingxiang Liu, Jing Tang, Jiaheng Wei, Yuxuan Liang
39. Stateful Visual Encoders for Vision-Language Models
Zirui Wang, Junwei Yu, Adam Yala, David M. Chan, Joseph E. Gonzalez, Trevor Darrell
41. Online Skill Learning for Web Agents via State-Grounded Dynamic Retrieval
Jiaxi Li, Ke Deng, Yun Wang, Jingyuan Huang, Yucheng Shi, Qiaoyu Tan, Jin Lu, Ninghao Liu
43. Fog of Love: Engineering Virtuous Agent Behavior with Affinity-based Reinforcement Learning in a Game Environment
Ajay Vishwanath, Christian Omlin
44. Tree-Based Formalization of Multi-Agent Complementarity in Human-AI Interactions
45. Episodic Memory Temporal Consistency for Cooperative Multi-Agent Reinforcement Learning
Zicheng Zhao, Yu Lan, Chengzhengxu Li, Zhaohan Zhang, Xiaoming Liu
46. Enhancing the MADDPG Algorithm for Multi-Agent Learning via Action Inference and Importance Sampling
Marc Walden, Jason Liu, Shaashwath Sivakumar, Ryan Liu, Hamza Khan
47. RAMPART: Registry-based Agentic Memory with Priority-Aware Runtime Transformation
48. PersonaTree: Structured Lifecycle Memory for Person Understanding in LLM Agents
Yubo Hou, Jingwei Song, Hongbo Zhang, Zhisheng Chen, Bang Xiao, Tao Wan, Zengchang Qin
49. DAR: Deontic Reasoning with Agentic Harnesses
Guangyao Dou, William Jurayj, Nils Holzenberger, Benjamin Van Durme
50. Arithmetic Pedagogy for Language Models
Andhika Bernard Lumbantobing, Hokky Situngkir
51. Caught in the Act(ivation): Toward Pre-Output and Multi-Turn Detection of Credential Exfiltration by LLM Agents
Kargi Chauhan, Pratibha Revankar
52. From Untrusted Input to Trusted Memory: A Systematic Study of Memory Poisoning Attacks in LLM Agents
Pritam Dash, Tongyu Ge, Aditi Jain, Tanmay Shah, Zhiwei Shang
53. What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Systems
Yuanbo Xie, Tianyun Liu, Yingjie Zhang, Suchen Liu, Yulin Li, Liya Su, Tingwen Liu
54. Selection-Aware Diagnostics for Chain-of-Thought Answer Hijacking
55. A-Live: Passive Liveness Detection via Neuromuscular Micro-Motion Signatures on Commodity Sensors
Mohammed Gharib, Sam Burns, Martin Zizi
56. Food-R1: A Unified Multi-Task Food Vision-Language Model with Reinforcement Learning
Yu Zhu, Yongkang Li, Wenjie Zhu, Haoyi Jiang, Wenyu Liu, Wei Yang, Bin Li, Xinggang Wang
57. MetaPoint: Unlocking Precise Spatial Control in Agentic Visual Generation
Dewei Zhou, Xinyu Huang, Xun Wang, Ji Xie, Yabo Zhang, Liang Li, Kunchang Li, Zongxin Yang, Yi Yang
58. Learning to cooperate with emergent reputation via multi-agent reinforcement learning
Xinwei Song, Yizhe Huang, Dengji Zhao, Xue Feng
59. CoPark: Learning Reactive Parking via Self-Play
Jiarong Wei, Yanxing Chen, Sinuo Song, Yin Wu, Anna Rehr, Abhinav Valada
60. CADENCE: Predicting Realized MAPF Execution Time Beyond Sum of Costs
Abhishek S, Badrikanath Praharaj, Sreeram MV
61. Knowledge Index of Noah's Ark
Sheng Jin, Minghao Liu, Yunze Xiao, Zeqi Zhou, Heli Qi, ..., Wenhao Huang, Jiaheng Liu, Zihan Wang, Weihao Xuan, Ge Zhang
62. SePO: Self-Evolving Prompt Agent for System Prompt Optimization
Wangcheng Tao, Han Wu, Weng-Fai Wong
63. Description-Code Inconsistency in Real-world MCP Servers: Measurement, Detection, and Security Implications
Yutao Shi, Xiaohan Zhang, Xiangjing Zhang, Xihua Shen, Hui Ouyang, Huming Qiu, Mi Zhang, Min Yang
64. MIRAGE: Mobile Agents with Implicit Reasoning and Generative World Models
Zhichao Yang, Yuanze Hu, Haojie Hao, Longkun Hao, Dongshuo Huang, Hongyu Lin, Gen Li, Lanqing Hong, Yihang Lou, Yan Bai
65. FindIt: A Format-Informed Visual Detection Benchmark for Generalist Multimodal LLMs
Eshika Khandelwal, Jingjing Pan, Mingfang Zhang, Quan Kong, Lorenzo Garattoni, Hilde Kuehne
66. Invariant Gradient Alignment for Robust Reasoning Distillation
Zehua Cheng, Wei Dai, Jiahao Sun
67. QO-Bench: Diagnosing Query-Operator-Preserving Retrieval over Typed Event Tuples
Mengao Zhang, Xiang Yang, Chang Liu, Tianhui Tan, Ke-wei Huang
68. 3DThinkVLA: Endowing Vision-Language-Action Models with Latent 3D Priors via 3D-Thinking-Guided Co-training
Jiaxin Shi, Xidong Zhang, Fucai Zhu, Zhe Li, Siyu Zhu, Weihao Yuan
69. Dive into the Scene: Breaking the Perceptual Bottleneck in Vision-Language Decision Making via Focus Plan Generation
Boyuan Xiao, Bohong Chen, Yumeng Li, Ji Feng, Yao-Xiang Ding, Kun Zhou
71. Bayesian learning for the stochastic shortest path problem
Chon Wai Ho, Sumeetpal S. Singh, Jiaqi Guo
72. Smart Transportation Without Neurons -- Fair Metro Network Expansion with Tabular Reinforcement Learning
Dimitris Michailidis, Sennay Ghebreab, Fernando P. Santos