Cortiq

138 papers

Agentic AI

A ranked brief from the day's arXiv listing. Cortiq weighs topical fit, lead-author context, and public research signals before the issue is published.

1. Benchmarking Open-Ended Multi-Agent Coordination in Language Agents

Kale-ab Abebe Tessera, Andras Szecsenyi, Cameron Barker, Alexander Rutherford, Davide Paglieri, Aidan Scannell, Henry Gouk, Elliot J. Crowley, Tim Rocktäschel, Amos Storkey

Lead affiliation - University of Edinburgh

2. Artificial Intelligence for Mathematical Reasoning: An Integrated Survey of Language Models, Neuro-symbolic Systems, and Verified Discovery

Syed Rifat Raiyan, Mohsinul Kabir, Hasan Mahmud, Md Kamrul Hasan

Lead affiliation - Stanford University

3. VATS: Exploiting Implicit Authority in Error-Path Injection via Systematic Mutation

Harshil Patel, Kunal Pai

Lead affiliation - Department of Computer Science, University of California, Davis, CA, USA

4. The Cold-Start Safety Gap in LLM Agents

Chung-En Sun, Linbo Liu, Tsui-Wei Weng

Lead affiliation - University of California, San Diego

5. Overcoming the Regulatory Bottleneck via Agent-to-Agent Protocols: A Nuclear Case Study

Akshay J. Dave, David Grabaskas, Joseph A. Renevitz, Richard B. Vilim

Lead affiliation - Stanford University

6. A Multi-modal Agentic Co-pilot for Evidence Grounded Computational Pathology

Zhe Xu, Zhengyu Zhang, Zhiyuan Cai, Jiahao Xu, Yijie Lin, ..., Yihui Wang, Yingxue Xu, Ronald Cheong Kin Chan, Li Liang, Hao Chen

Lead affiliation - Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China

7. REFLECT: Intervention-Supported Error Attribution for Silent Failures in LLM Agent Traces

Xiaofeng Lin, Yingxu Wang, Tung Sum Thomas Kwok, Daniel Guo, Sahil Arun Nale, Charles Fleming, Guang Cheng

Lead affiliation - University of California, Los Angeles, USA

8. ViMax: Agentic Video Generation

Lingxuan Huang, Sizhe He, Hengji Zhou, Liqiang Nie, Lianghao Xia, Chao Huang

Lead affiliation - The University of Hong Kong

9. SAGE: An LLM-driven Self Reflective Agentic Framework for Fraud Detection

Yichen Chen, Siying Li, Yuhang Liang, Lijun Wang, Renyang Liu

Lead affiliation - National University of Singapore

10. PathoSage: Towards Multi-Source Evidence Adjudication in Pathology via Experience-Aware Agentic Workflow

Chengyang Zhang, Wenchuan Zhang, Bo Li, Mengran Li, Bob Zhang, Yuhao Yi, Hong Bu, Jiancheng Lv


15. MAVIS: Multi-Agent Video Retrieval via Structured Video Understanding

Jie Zhang, Qilang Ye, Hao Zhou, Haochen Liang, Fei Luo

Lead affiliation - School of Computing and Information Technology, Great Bay University

20. Brain-Prompt Injection: A Route-Safety Audit for BCI-LLM Agents

Jianwei Tai

Lead affiliation - School of Internet, Anhui University

22. A Multi-Agent System for IPMSM Design Optimization via an FEA-AI Hybrid Approach

Jinseong Han, Sunwoong Yang, Namwoo Kang

Lead affiliation - Cho Chun Shik Graduate School of Mobility, KAIST

24. Scaffold Effects on GAIA: A Controlled Comparison

Jason Starace

Lead affiliation - Independent Researcher

26. MAGIS: Evidence-Based Multi-Agent Reasoning for Interpretable Strabismus Clinical Decision-Making

Xikai Tang, Yifan Wang, Jiafan Zhuang, Li Luo, Jinming Guo, ..., Jie Cen, Guangqiang Yin, Kunliang Qiu, Ce Zheng, Zhun Fan

Lead affiliation - School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China

27. Bidirectional Semantic Complementary Tool Retrieval for Remote Sensing Agents

Zeyuan Wang, Dongyang Hou, Cheng Yang, Xuezhi Cui, Linrui Xu, ..., Liangtian Liu, Kai Ouyang, Wang Guo, Lili Zhu, Chao Tao

Lead affiliation - School of Geosciences and Info-Physics, Central South University, Changsha 410083, China

29. SSR: Can Simulated Patients Learn to Stigmatize Themselves? Modeling Self-Stigma through Internal Monologue

Kunyao Lan, Bingrui Jin, Zichen Zhu, Mengyue Wu

Lead affiliation - X-LANCE Lab, Dept. of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, China

31. LUNA-AD: Lightweight Uncertainty-Aware Language Model with Lifelong Learning for Autonomous Driving

Ruoyu Yao, Pei Liu, Ruiguo Zhong, Mingxing Peng, Rui Yang, Jun Ma

Lead affiliation - The Hong Kong University of Science and Technology (Guangzhou), China

44. Rosetta Memory: Adaptive Memory for Cross-LLM Agents

Hao Yang, Shiqi Shen, Haoxuan Li, Zhipeng Wang, Zhi Gong, Xu Chen

Lead affiliation - Gaoling School of Artificial Intelligence, Renmin University of China

45. Byzantine Cheap Talk: Adversarial Resilience and Topology Effects in LLM Coordination Games

Aya El Mir, Martin Takáč, Salem Lahlou

Lead affiliation - Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)

46. QueryWeaver: Reliable Multi-Tool Query Execution Planning via LLM-Based Graph Generation

Aishwarya Chakravarthy, Vidhi Kulkarni, Duen Horng Chau

Lead affiliation - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA

47. SLMJury: Can Small Language Models Judge as Well as Large Ones?

Anish Laddha, Nitesh Pradhan, Gaurav Srivastava

Lead affiliation - Department of Computer Science and Engineering, LNMIIT, Jaipur, India

52. Hallucination Cascade: Analyzing Error Propagation in Multi-Agent LLM Systems

Saeid Jamshidi, Arghavan Moradi Dakhel, Kawser Wazed Nafi, Foutse Khomh

Lead affiliation - SWAT Laboratory, Polytechnique Montréal, Montréal, QC, Canada

60. From Statute to Control Flow: Span-Grounded Deontic Trees for Defeasible Scope Parsing

Jian Chen, Siyuan Li, Chucheng Wan, Zixuan Yuan

Lead affiliation - The Hong Kong University of Science and Technology (Guangzhou)

64. TVI-CoT: Text-Visual Interleaved Chain-of-Thought Reasoning for Multimodal Understanding

Lianyu Hu, Xiaoyu Ma, Zeqin Liao, Yang Liu

Lead affiliation - College of Computing and Data Science of Nanyang Technological University, Singapore

86. Claw-R1: A Step-Level Data Middleware System for Agentic Reinforcement Learning

Daoyu Wang, Mingyue Cheng, Qingchuan Li, Shuo Yu, Jie Ouyang, Qi Liu

Lead affiliation - State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China

89. From Holistic Evaluation to Structured Criteria: Rubrics Across the Evolving LLM Landscape

Hao Chen, Ziyu Han, Yukun Yan, Qingfu Zhu, Maosong Sun, Wanxiang Che

Lead affiliation - Research Center for Social Computing and Interactive Robotics, Harbin Institute of Technology

93. Civil Court Simulation with Large Language Models

Yifan Chen, Haitao Li, Kaiyuan Zhang, Yueyue Wu, Qingyao Ai, Yiqun Liu

Lead affiliation - Beijing University of Posts and Telecommunications, Beijing, China

101. CRANE: Knowledge Editing for Reasoning MLLMs

Han Huang, Hao Wang, Mengqi Zhang, Shu Wu, Qiang Liu, Liang Wang

Lead affiliation - University of Chinese Academy of Sciences

103. Claude Code-Driving Scenario Mining for the Argoverse 2 Challenge

Wei Deng, Caoshengzhe Xue, Shuaikun Liu, Zhaohong Liu, Mengshi Qi, Huadong Ma

Lead affiliation - State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, China

107. Cooperative Long Rope Skipping via Multi-Agent Reinforcement Learning

Zihao Wang, Shijie Peng, Kerui Wu, Yu Huang, Ruiqi Xue, Dong Liu, Tian Xu, Lei Yuan, Yang Yu

Lead affiliation - National Key Laboratory of Novel Software Technology, Nanjing University, Nanjing, China

115. Agentic Search for Counterfactual Recourse under Fixed LLM Budgets

Yasuo Tabei

Lead affiliation - RIKEN Center for Advanced Intelligence Project, Tokyo, Japan

116. Data Agents Under Attack: Vulnerabilities in LLM-Driven Analytical Systems

Kuncan Wang, Ziting Wang, Peizhuo Lv, Haoyang Li, Guoliang Li, Gao Cong, Wei Dong

Lead affiliation - Nanyang Technological University, Singapore; The Hong Kong Polytechnic University; Tsinghua University

118. Multi-Turn Evaluation of Deep Research Agents Under Process-Level Feedback

Rishabh Sabharwal, Hongru Wang, Amos Storkey, Jeff Z. Pan

Lead affiliation - School of Informatics, University of Edinburgh, United Kingdom

123. Capacity, Not Format: Rethinking Structured Reasoning Failures

Hengxin Fan

Lead affiliation - Tianjin Normal University

124. ZAS-SQL: Distilling Rules from Failures for Zero-Shot Text-to-SQL

Hongzhou Zheng, Yixin Gou, Wenjia Zhang

Lead affiliation - Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University

132. When Languages Disagree: Self-Evolving Multilingual LLM Judges

Xiyan Fu, Wei Lu

Lead affiliation - Nanyang Technological University

134. Proxy Reward Internalization and Mechanistic Exploitation: A Learned Precursor to Reward Hacking and Its Generalization

Mohammad Beigi, Ming Jin, Lifu Huang


135. Tight Sample Complexity of Transformers

Chenxiao Yang, Nathan Srebro, Zhiyuan Li

Lead affiliation - Toyota Technological Institute at Chicago (TTIC)