Cortiq

· 138 papers

Agentic AI

A ranked brief selected from the arXiv papers released on this date. Cortiq weighs topical relevance, lead-author context, and public research signals together.

1. Benchmarking Open-Ended Multi-Agent Coordination in Language Agents

Kale-ab Abebe Tessera, Andras Szecsenyi, Cameron Barker, Alexander Rutherford, Davide Paglieri, Aidan Scannell, Henry Gouk, Elliot J. Crowley, Tim Rocktäschel, Amos Storkey

Lead author affiliation - University of Edinburgh

2. Artificial Intelligence for Mathematical Reasoning: An Integrated Survey of Language Models, Neuro-symbolic Systems, and Verified Discovery

Syed Rifat Raiyan, Mohsinul Kabir, Hasan Mahmud, Md Kamrul Hasan

Lead author affiliation - Stanford University

3. VATS: Exploiting Implicit Authority in Error-Path Injection via Systematic Mutation

Harshil Patel, Kunal Pai

Lead author affiliation - Department of Computer Science, University of California, Davis, CA, USA

4. The Cold-Start Safety Gap in LLM Agents

Chung-En Sun, Linbo Liu, Tsui-Wei Weng

Lead author affiliation - University of California, San Diego

5. Overcoming the Regulatory Bottleneck via Agent-to-Agent Protocols: A Nuclear Case Study

Akshay J. Dave, David Grabaskas, Joseph A. Renevitz, Richard B. Vilim

Lead author affiliation - Stanford University

6. A Multi-modal Agentic Co-pilot for Evidence Grounded Computational Pathology

Zhe Xu, Zhengyu Zhang, Zhiyuan Cai, Jiahao Xu, Yijie Lin, ..., Yihui Wang, Yingxue Xu, Ronald Cheong Kin Chan, Li Liang, Hao Chen

Lead author affiliation - Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China

7. REFLECT: Intervention-Supported Error Attribution for Silent Failures in LLM Agent Traces

Xiaofeng Lin, Yingxu Wang, Tung Sum Thomas Kwok, Daniel Guo, Sahil Arun Nale, Charles Fleming, Guang Cheng

Lead author affiliation - University of California, Los Angeles, USA

8. ViMax: Agentic Video Generation

Lingxuan Huang, Sizhe He, Hengji Zhou, Liqiang Nie, Lianghao Xia, Chao Huang

Lead author affiliation - The University of Hong Kong

9. SAGE: An LLM-driven Self Reflective Agentic Framework for Fraud Detection

Yichen Chen, Siying Li, Yuhang Liang, Lijun Wang, Renyang Liu

Lead author affiliation - National University of Singapore

10. PathoSage: Towards Multi-Source Evidence Adjudication in Pathology via Experience-Aware Agentic Workflow

Chengyang Zhang, Wenchuan Zhang, Bo Li, Mengran Li, Bob Zhang, Yuhao Yi, Hong Bu, Jiancheng Lv

Lead author affiliation - not available

15. MAVIS: Multi-Agent Video Retrieval via Structured Video Understanding

Jie Zhang, Qilang Ye, Hao Zhou, Haochen Liang, Fei Luo

Lead author affiliation - School of Computing and Information Technology, Great Bay University

20. Brain-Prompt Injection: A Route-Safety Audit for BCI-LLM Agents

Jianwei Tai

Lead author affiliation - School of Internet, Anhui University

22. A Multi-Agent System for IPMSM Design Optimization via an FEA-AI Hybrid Approach

Jinseong Han, Sunwoong Yang, Namwoo Kang

Lead author affiliation - Cho Chun Shik Graduate School of Mobility, KAIST

24. Scaffold Effects on GAIA: A Controlled Comparison

Jason Starace

Lead author affiliation - Independent Researcher

26. MAGIS: Evidence-Based Multi-Agent Reasoning for Interpretable Strabismus Clinical Decision-Making

Xikai Tang, Yifan Wang, Jiafan Zhuang, Li Luo, Jinming Guo, ..., Jie Cen, Guangqiang Yin, Kunliang Qiu, Ce Zheng, Zhun Fan

Lead author affiliation - School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China

27. Bidirectional Semantic Complementary Tool Retrieval for Remote Sensing Agents

Zeyuan Wang, Dongyang Hou, Cheng Yang, Xuezhi Cui, Linrui Xu, ..., Liangtian Liu, Kai Ouyang, Wang Guo, Lili Zhu, Chao Tao

Lead author affiliation - School of Geosciences and Info-Physics, Central South University, Changsha 410083, China

29. SSR: Can Simulated Patients Learn to Stigmatize Themselves? Modeling Self-Stigma through Internal Monologue

Kunyao Lan, Bingrui Jin, Zichen Zhu, Mengyue Wu

Lead author affiliation - X-LANCE Lab, Dept. of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, China

31. LUNA-AD: Lightweight Uncertainty-Aware Language Model with Lifelong Learning for Autonomous Driving

Ruoyu Yao, Pei Liu, Ruiguo Zhong, Mingxing Peng, Rui Yang, Jun Ma

Lead author affiliation - The Hong Kong University of Science and Technology (Guangzhou), China

44. Rosetta Memory: Adaptive Memory for Cross-LLM Agents

Hao Yang, Shiqi Shen, Haoxuan Li, Zhipeng Wang, Zhi Gong, Xu Chen

Lead author affiliation - Gaoling School of Artificial Intelligence, Renmin University of China

45. Byzantine Cheap Talk: Adversarial Resilience and Topology Effects in LLM Coordination Games

Aya El Mir, Martin Takáč, Salem Lahlou

Lead author affiliation - Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)

46. QueryWeaver: Reliable Multi-Tool Query Execution Planning via LLM-Based Graph Generation

Aishwarya Chakravarthy, Vidhi Kulkarni, Duen Horng Chau

Lead author affiliation - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA

47. SLMJury: Can Small Language Models Judge as Well as Large Ones?

Anish Laddha, Nitesh Pradhan, Gaurav Srivastava

Lead author affiliation - Department of Computer Science and Engineering, LNMIIT, Jaipur, India

52. Hallucination Cascade: Analyzing Error Propagation in Multi-Agent LLM Systems

Saeid Jamshidi, Arghavan Moradi Dakhel, Kawser Wazed Nafi, Foutse Khomh

Lead author affiliation - SWAT Laboratory, Polytechnique Montréal, Montréal, QC, Canada

60. From Statute to Control Flow: Span-Grounded Deontic Trees for Defeasible Scope Parsing

Jian Chen, Siyuan Li, Chucheng Wan, Zixuan Yuan

Lead author affiliation - The Hong Kong University of Science and Technology (Guangzhou)

64. TVI-CoT: Text-Visual Interleaved Chain-of-Thought Reasoning for Multimodal Understanding

Lianyu Hu, Xiaoyu Ma, Zeqin Liao, Yang Liu

Lead author affiliation - College of Computing and Data Science, Nanyang Technological University, Singapore

86. Claw-R1: A Step-Level Data Middleware System for Agentic Reinforcement Learning

Daoyu Wang, Mingyue Cheng, Qingchuan Li, Shuo Yu, Jie Ouyang, Qi Liu

Lead author affiliation - State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China

89. From Holistic Evaluation to Structured Criteria: Rubrics Across the Evolving LLM Landscape

Hao Chen, Ziyu Han, Yukun Yan, Qingfu Zhu, Maosong Sun, Wanxiang Che

Lead author affiliation - Research Center for Social Computing and Interactive Robotics, Harbin Institute of Technology

93. Civil Court Simulation with Large Language Models

Yifan Chen, Haitao Li, Kaiyuan Zhang, Yueyue Wu, Qingyao Ai, Yiqun Liu

Lead author affiliation - Beijing University of Posts and Telecommunications, Beijing, China

101. CRANE: Knowledge Editing for Reasoning MLLMs

Han Huang, Hao Wang, Mengqi Zhang, Shu Wu, Qiang Liu, Liang Wang

Lead author affiliation - University of Chinese Academy of Sciences

103. Claude Code-Driving Scenario Mining for the Argoverse 2 Challenge

Wei Deng, Caoshengzhe Xue, Shuaikun Liu, Zhaohong Liu, Mengshi Qi, Huadong Ma

Lead author affiliation - State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, China

107. Cooperative Long Rope Skipping via Multi-Agent Reinforcement Learning

Zihao Wang, Shijie Peng, Kerui Wu, Yu Huang, Ruiqi Xue, Dong Liu, Tian Xu, Lei Yuan, Yang Yu

Lead author affiliation - National Key Laboratory of Novel Software Technology, Nanjing University, Nanjing, China

115. Agentic Search for Counterfactual Recourse under Fixed LLM Budgets

Yasuo Tabei

Lead author affiliation - RIKEN Center for Advanced Intelligence Project, Tokyo, Japan

116. Data Agents Under Attack: Vulnerabilities in LLM-Driven Analytical Systems

Kuncan Wang, Ziting Wang, Peizhuo Lv, Haoyang Li, Guoliang Li, Gao Cong, Wei Dong

Lead author affiliation - Nanyang Technological University, Singapore; The Hong Kong Polytechnic University; Tsinghua University

118. Multi-Turn Evaluation of Deep Research Agents Under Process-Level Feedback

Rishabh Sabharwal, Hongru Wang, Amos Storkey, Jeff Z. Pan

Lead author affiliation - School of Informatics, University of Edinburgh, United Kingdom

123. Capacity, Not Format: Rethinking Structured Reasoning Failures

Hengxin Fan

Lead author affiliation - Tianjin Normal University

124. ZAS-SQL: Distilling Rules from Failures for Zero-Shot Text-to-SQL

Hongzhou Zheng, Yixin Gou, Wenjia Zhang

Lead author affiliation - Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University

132. When Languages Disagree: Self-Evolving Multilingual LLM Judges

Xiyan Fu, Wei Lu

Lead author affiliation - Nanyang Technological University

134. Proxy Reward Internalization and Mechanistic Exploitation: A Learned Precursor to Reward Hacking and Its Generalization

Mohammad Beigi, Ming Jin, Lifu Huang

Lead author affiliation - not available

135. Tight Sample Complexity of Transformers

Chenxiao Yang, Nathan Srebro, Zhiyuan Li

Lead author affiliation - Toyota Technological Institute at Chicago (TTIC)