Motivation
The AI community has placed significant emphasis on mathematical reasoning as a way to probe the intelligence of large language models (LLMs) and multi-modal large language models (MLLMs), such as OpenAI o1 and DeepSeek-R1. As a common information medium, documents contain text, images, tables, diagrams, charts, mathematical notation, and more. By drawing on these elements, multi-modal mathematical reasoning focuses on enabling machines to solve, interpret, and reason about mathematical problems. It combines image and text analysis, symbolic manipulation, numerical computation, and logical inference to address challenges ranging from basic arithmetic to advanced problem solving in algebra, calculus, and beyond, and it is an area of growing importance in document intelligence.
Recent advances in (multi-modal) large language models have driven progress in multi-modal mathematical reasoning tasks such as chart reasoning [1], table reasoning [2,3], and geometry problem solving [4,5]. Many real-world documents, such as academic papers, technical manuals, and financial reports, involve symbolic mathematics and logical reasoning. Processing such documents requires algorithms that combine the precision of symbolic methods with the flexibility of modern AI. This workshop aims to bring together researchers from industry, science, and academia to exchange ideas and discuss ongoing research in multi-modal mathematical reasoning in documents.
Schedule
Time | Events |
13:50-14:00 | Opening Remarks |
14:00-14:40 | Invited Talk 1, Dr. Pan Lu, Advancing Mathematical Reasoning with Language Models and Agentic Systems |
14:40-15:20 | Invited Talk 2, Dr. Wenda Li |
15:20-16:00 | Contributed Talks (Best/Runner-up Paper Talks) |
16:00-16:20 | Coffee Break |
16:20-17:40 | Poster or Oral Session |
References
- [1] Zirui Wang, et al., CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs, NeurIPS 2024.
- [2] Pan Lu, et al., Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning, ICLR 2023.
- [3] Mingyu Zheng, et al., Multimodal Table Understanding, ACL 2024.
- [4] Pan Lu, et al., MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts, ICLR 2024.
- [5] Trieu H. Trinh, et al., Solving olympiad geometry without human demonstrations, Nature 2024.
Call for Papers
Acceptable submission topics may include but are not limited to:
- Multi-modal Mathematical Reasoning
- Symbolic and Neural Approaches for Mathematical Reasoning
- Distillation of General LLM/MLLM Mathematical Reasoning into Document Intelligence
- Mathematical QA in Documents
- Chart Understanding and Reasoning
- Table Understanding and Reasoning
- Flow-chart/Diagram Understanding and Reasoning
- Geometry Problem Solving
- Datasets and Benchmarks for Multi-modal Mathematical Reasoning Tasks
- Applications: use cases in scientific document understanding, financial document analysis, math education with AI, and so on
Submission
The workshop is open to original papers of a theoretical or practical nature. Papers should be formatted according to the instructions on the ICDAR webpage https://www.icdar2025.com/home. Papers are limited to 15 pages (not including references). This workshop will follow a double-blind review process. Authors should not include their names and affiliations anywhere in the manuscript. Authors should also ensure that their identity is not revealed indirectly, by citing their previous work in the third person and omitting acknowledgments until the camera-ready version. Papers must be submitted via the workshop's CMT submission page. The submission link will be announced soon.
At least one author of each accepted paper must register for the workshop in order to present the paper. For further instructions, please refer to the ICDAR 2025 webpage.
Important Dates
- Submission Deadline: April 15, 2025
- Decisions Announced: May 15, 2025
- Camera Ready Deadline: June 1, 2025
- Workshop: September 20, 2025
Publication
Accepted papers will be published in the ICDAR 2025 proceedings (workshop).
Contact
Workshop Chairs
- Qiufeng Wang, Xi'an Jiaotong-Liverpool University, China
- Yang Liu, Sun Yat-sen University, China
- Kun Zhang, Carnegie Mellon University, US
- Cheng-Lin Liu, Institute of Automation, Chinese Academy of Sciences, China
Program Committee Members
- Kaizhu Huang, Duke Kunshan University, China
- Yu Zhou, Nankai University, China
- Fei Yin, Institute of Automation, Chinese Academy of Sciences, China
- Jian Xu, Institute of Automation, Chinese Academy of Sciences, China
- Xiaobo Jin, Xi'an Jiaotong-Liverpool University, China
- Tonghua Su, Harbin Institute of Technology, China
- Meng Fang, University of Liverpool, UK
- Qingxing Cao, Sun Yat-sen University, China
- Zhicheng Yang, Hong Kong University of Science and Technology (Guangzhou), China
- Wenda Li, University of Edinburgh, UK
- Jindong Wang, William & Mary, US
Short CV of the Workshop Chairs
Prof. Qiufeng Wang is a Professor and the Head of the Department of Intelligent Science at the School of Advanced Technology, Xi'an Jiaotong-Liverpool University (XJTLU), and the Director of the Suzhou Municipal Key Lab of Cognitive Computing and Applied Technology. He received the Ph.D. degree in Pattern Recognition and Intelligent Systems from the Institute of Automation, Chinese Academy of Sciences (CASIA), and won the Presidential Scholarship of the Chinese Academy of Sciences. He then worked at the National Laboratory of Pattern Recognition (NLPR) at CASIA and later at Microsoft. Dr. Wang joined XJTLU in February 2017. His research interests include pattern recognition and machine learning, especially document analysis and recognition. Dr. Wang has published more than 80 papers, in venues including IEEE T-PAMI, Pattern Recognition, ICCV, and ICML, and one book on deep learning with Springer.
Prof. Yang Liu is an associate professor at the School of Computer Science, Sun Yat-sen University, and a key member of the Human-Cyber-Physical Intelligence Integration Laboratory (HCP-Lab) at Sun Yat-sen University. He received the Ph.D. degree from Xidian University. His primary research areas include multimodal understanding and causal reasoning. He has published over 30 papers in journals and conferences such as IEEE T-PAMI, T-IP, CVPR, ICCV, ACM MM, and IJCAI. He is the author of the book "Multimodal Large Models: A New Generation of Artificial Intelligence Technology Paradigms". He was named Outstanding Author of the Year 2024 by the Publishing House of Electronics Industry, and received the Excellence Award at the 2023 China Software Conference for the Robotic Large Model and Embodied Intelligence Challenge and the First Prize at the 2023 Guangdong Province Third Youth Academic Showcase in Computer Science.
Prof. Kun Zhang is an associate professor of philosophy and an affiliate faculty member in the machine learning department at Carnegie Mellon University; he is also a visiting professor and the acting chair of the machine learning department and the director of the Center for Integrative AI at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI). He develops methods for making causality transparent by torturing various kinds of data and investigates machine learning problems including transfer learning, representation learning, and reinforcement learning from a causal perspective. He has frequently served as a senior area chair, area chair, or senior program committee member for major conferences in machine learning and artificial intelligence, including UAI, NeurIPS, ICML, IJCAI, AISTATS, and ICLR. He was a co-founder and general & program co-chair of the first Conference on Causal Learning and Reasoning (CLeaR 2022), a program co-chair of the 38th Conference on Uncertainty in Artificial Intelligence (UAI 2022), a general co-chair of UAI 2023, and is a program co-chair of ICDM 2024.
Prof. Cheng-Lin Liu received the B.S. degree in electronic engineering from Wuhan University, Wuhan, China, the M.E. degree in electronic engineering from Beijing Polytechnic University (currently Beijing University of Technology), Beijing, China, and the Ph.D. degree in pattern recognition and intelligent control from the Institute of Automation of Chinese Academy of Sciences, Beijing, China, in 1989, 1992 and 1995, respectively. He was a postdoctoral fellow at Korea Advanced Institute of Science and Technology (KAIST) and later at Tokyo University of Agriculture and Technology from March 1996 to March 1999. From 1999 to 2004, he was a research staff member and later a senior researcher at the Central Research Laboratory, Hitachi, Ltd., Tokyo, Japan. Since 2005, he has been a Professor at the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing, China, and is now the director of the laboratory. His research interests include pattern recognition, image processing, machine learning, and document analysis and understanding. He has published over 400 technical papers in journals and conferences. He is an associate Editor-in-Chief of Pattern Recognition Journal and Acta Automatica Sinica, and an associate editor of several international and domestic journals. He is a Fellow of the CAA, CAAI, IAPR and the IEEE.
Talk 1:
Speaker: Dr. Pan Lu, Stanford University, US.
Title: Advancing Mathematical Reasoning with Language Models and Agentic Systems.
Abstract: Mathematical reasoning is fundamental to human intelligence and plays a crucial role in advancing education, science, and technology. This talk explores the development of language model systems that exhibit robust mathematical reasoning and facilitate scientific discovery, marking a significant step toward general artificial intelligence.
We introduce novel multi-modal and knowledge-intensive benchmarks designed to assess the reasoning capabilities of large language models (LLMs) and vision-language models (VLMs) in real-world scenarios, including those involving visual data, tabular information, and scientific applications. The talk highlights recent advancements in mathematical reasoning within visual contexts and addresses key unresolved challenges.
Additionally, we present cutting-edge retrieval and tool-augmented algorithms that significantly enhance LLM performance in mathematical reasoning tasks. Finally, we explore how agentic systems, leveraging test-time optimization and external tools, can further advance mathematical reasoning and scientific discovery.
Bio: Pan Lu is a postdoctoral researcher at Stanford University. He received his Ph.D. in Computer Science from UCLA in 2024. His long-term goal is to develop systems that can reason and collaborate with humans for the common good. His primary research focuses on machine reasoning, particularly in the areas of mathematical reasoning and scientific discovery. His work has appeared in top-tier venues such as Nature, NeurIPS, ICLR, ICML, and ACL. He has served as the program chair for SoCal NLP 2023, an area chair for ICLR 2025 and ACL 2025, and a co-chair for the MATHAI workshops at NeurIPS from 2021 to 2024. He has received various awards, including the Most Influential NIPS Papers award, the Amazon PhD Fellowship, the Bloomberg PhD Fellowship, the Qualcomm Innovation Fellowship, and the UCLA Dissertation Year Fellowship.
Talk 2:
Speaker: Dr. Wenda Li, University of Edinburgh, UK.
Title: To be added.
Abstract: To be added.
Bio: Wenda Li is a lecturer in hybrid AI at the University of Edinburgh and a visiting research fellow at the University of Cambridge. His research interests include formalising mathematics and applying machine learning techniques to assist users of interactive theorem provers. He has published in several prominent venues in theorem proving and machine learning, including JAR, ITP, CPP, ICML, ICLR, and NeurIPS.