We introduce HiAR-ICL, a novel paradigm that enhances the complex reasoning capabilities of large language models. Unlike traditional in-context learning, HiAR-ICL shifts the focus from example-based analogical learning to abstract thought patterns. The primary contributions of HiAR-ICL are as follows:
In-context learning (ICL) enables large language models (LLMs) to perform downstream tasks through advanced prompting and high-quality demonstrations. However, traditional ICL paradigms encounter significant limitations in complex reasoning tasks, stemming primarily from their dependence on example quality and absence of explicit reasoning guidance. To address these challenges, we introduce HiAR-ICL, a High-level Automated Reasoning paradigm in ICL that shifts focus from specific examples to abstract reasoning patterns, thereby extending the conventional concept of “context” in ICL. Our approach begins by defining five atomic reasoning actions, upon which we employ Monte Carlo Tree Search to systematically construct high-level reasoning patterns. During inference, HiAR-ICL dynamically selects appropriate reasoning patterns based on problem attributes, providing explicit guidance for the model’s reasoning process. Experiments demonstrate HiAR-ICL's effectiveness and efficiency: utilizing only 200 prior samples with Qwen2.5-7B-Instruct, our method achieves 80.6% accuracy on MATH and 62.5% on AMC, exceeding GPT-4o's 77.2% and 57.5%. Our approach enhances performance across models of varying sizes while generalizing effectively across domains. Further analysis reveals that HiAR-ICL can also serve as a plug-and-play inference method compatible with post-training techniques like GRPO. Code and data are available at https://github.com/jinyangwu/HiARICL.
Specifically, HiAR-ICL consists of two main components: (1) an offline stage that defines five atomic reasoning actions and employs Monte Carlo Tree Search (MCTS) to systematically compose them into high-level reasoning patterns, and (2) an online stage that, during inference, dynamically selects the appropriate reasoning pattern based on problem attributes, providing explicit guidance for the model's reasoning process.
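The offline stage can be illustrated with a minimal MCTS sketch. This is not the paper's implementation: the action names and the toy reward function below are placeholders, and the real system would score rollouts by executing the action sequence with an LLM and verifying the answer. It only shows how a search over sequences of atomic actions could surface a high-value pattern ("thought card"):

```python
import math
import random

# Placeholder action names; the actual five atomic actions are defined in the paper.
ACTIONS = ["analyze", "one_step", "chain_of_thought", "divide", "reflect"]
MAX_DEPTH = 4  # maximum length of a reasoning pattern

class Node:
    def __init__(self, path, parent=None):
        self.path = path          # action sequence leading to this node
        self.parent = parent
        self.children = {}        # action -> child Node
        self.visits = 0
        self.value = 0.0

    def ucb(self, c=1.4):
        # Upper Confidence Bound used during selection.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits)

def rollout_reward(path):
    # Toy stand-in for verifying an LLM's answer after executing `path`;
    # here we simply prefer patterns that end with a reflection step.
    return 1.0 if path and path[-1] == "reflect" else random.random() * 0.5

def build_thought_card(iterations=200, seed=0):
    random.seed(seed)
    root = Node([])
    for _ in range(iterations):
        node = root
        # Selection: descend via UCB while nodes are fully expanded.
        while len(node.children) == len(ACTIONS) and len(node.path) < MAX_DEPTH:
            node = max(node.children.values(), key=Node.ucb)
        # Expansion: add one untried action.
        if len(node.path) < MAX_DEPTH:
            action = random.choice([a for a in ACTIONS if a not in node.children])
            node.children[action] = Node(node.path + [action], parent=node)
            node = node.children[action]
        # Simulation and backpropagation.
        reward = rollout_reward(node.path)
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Extract the most-visited action sequence as the thought card.
    card, node = [], root
    while node.children:
        action, node = max(node.children.items(), key=lambda kv: kv[1].visits)
        card.append(action)
    return card

if __name__ == "__main__":
    print(build_thought_card())
```

In the actual method, multiple such cards are distilled from a small set of seed problems, and the online stage matches a new problem's attributes to one of them.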
Main Results
As shown in Table 1, we evaluate the effectiveness of HiAR-ICL across six mainstream reasoning benchmarks, providing comprehensive comparisons against ICL methods. We highlight three key findings:
Out-of-Distribution Generalization
We also evaluate HiAR-ICL against in-context learning (ICL) and supervised fine-tuning (SFT) under OOD scenarios. To ensure a fair comparison, we use the same 200 seed samples for both thought card construction and SFT. As illustrated in Figure 5, while ICL and SFT suffer significant performance degradation, HiAR-ICL demonstrates remarkable resilience, preserving robust performance across multiple models and datasets. These results underscore HiAR-ICL's superior robustness and generalization, positioning it as a more reliable and adaptable solution for diverse reasoning tasks spanning both ID and OOD data.
Plug-and-Play Capability
Similar to ICL methods like CoT, HiAR-ICL operates as a training-free test-time inference framework compatible with post-training techniques. We demonstrate this by applying HiAR-ICL to models that have undergone GRPO training on the MATH training set. Table 5 shows that our framework consistently enhances performance when integrated with these approaches. This synergy suggests HiAR-ICL captures reasoning patterns complementary to those acquired during post-training, affirming its plug-and-play versatility. Our comprehensive results in Tables 1, 5, and 14 further demonstrate HiAR-ICL's broad applicability across model architectures and training paradigms, including base, instruction-tuned, and reinforcement-learning-optimized models. This also demonstrates the generalizability and versatility of high-level thought patterns in complex reasoning tasks. Future work could explore deeper integration with post-training methods to maximize complementary benefits.
More results and analysis are provided in our paper.
@article{wu2024beyond,
title={Beyond Examples: High-level Automated Reasoning Paradigm in In-context Learning via {MCTS}},
author={Wu, Jinyang and Feng, Mingkuan and Zhang, Shuai and Che, Feihu and Wen, Zengqi and Tao, Jianhua},
journal={arXiv preprint arXiv:2411.18478},
year={2024}
}