
Large language models (LLMs) are remarkably good at answering simple questions, but complex tasks that require reasoning and planning call for special prompting techniques. These so-called “System 2” techniques push LLMs to generate intermediate steps toward solving a problem, enhancing their reasoning capabilities.

However, implementing System 2 techniques can slow down LLM applications and make them computationally expensive. In a recent study, researchers at Meta FAIR introduced a new technique called “System 2 distillation” that teaches LLMs to perform complex tasks without generating intermediate steps.

In cognitive science, System 1 and System 2 represent two different modes of thinking. System 1 is fast, intuitive, and automatic, while System 2 is slow, deliberate, and analytical. LLMs are often likened to System 1 thinking, as they can quickly generate text but struggle with tasks that require conscious reasoning and planning.

To address this limitation, researchers have explored prompting LLMs to mimic System 2 thinking by generating intermediate reasoning steps before providing a final answer. Techniques like “Chain of Thought” have been shown to improve accuracy on logical reasoning tasks, but they come at a higher cost in inference compute and latency.
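In its simplest zero-shot form, Chain of Thought amounts to appending a “think step by step” instruction so the model emits its reasoning before the final answer. A minimal sketch of prompt construction — the wording and the `build_cot_prompt` helper are illustrative conventions, not a prescribed format:

```python
def build_cot_prompt(question: str) -> str:
    """Wrap a question in a zero-shot Chain-of-Thought instruction.

    "Let's think step by step" is a common convention; few-shot variants
    instead prepend worked examples that include their reasoning steps.
    """
    return f"Q: {question}\nA: Let's think step by step."


prompt = build_cot_prompt("A train travels 60 km in 1.5 hours. What is its average speed?")
print(prompt)
```

The resulting prompt nudges the model to produce intermediate steps, which is exactly the extra generated text that makes System 2 prompting slower and more expensive at inference time.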

The concept of System 2 distillation draws inspiration from how humans internalize complex tasks over time, transitioning them from System 2 to System 1 thinking. In this process, LLMs are prompted to solve problems using System 2 techniques, and the correct responses are identified through unsupervised mechanisms like self-consistency. The intermediate reasoning steps are then discarded, and the model is fine-tuned based on the initial question and final answer, allowing it to skip the reasoning process and directly provide solutions.
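The pipeline described above can be sketched in a few lines: sample several System 2 completions per question, majority-vote the final answers (self-consistency), discard the reasoning, and keep only question–answer pairs for fine-tuning. Here `sample_fn` is a hypothetical stand-in for sampling a full chain-of-thought response from an LLM and extracting its final answer; the voting threshold and data format are illustrative, not the paper's exact recipe:

```python
from collections import Counter
from itertools import cycle


def self_consistency_answer(question, sample_fn, n_samples=8):
    """Sample several System 2 completions and majority-vote the final
    answer; return None when no answer wins a strict majority."""
    answers = [sample_fn(question) for _ in range(n_samples)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer if count > n_samples // 2 else None


def build_distillation_set(questions, sample_fn):
    """Keep only (question, final answer) pairs for fine-tuning; the
    intermediate reasoning steps are discarded entirely."""
    dataset = []
    for q in questions:
        ans = self_consistency_answer(q, sample_fn)
        if ans is not None:  # drop questions with no consistent answer
            dataset.append({"prompt": q, "completion": ans})
    return dataset


# Toy deterministic stand-in for an LLM sampler (hypothetical): a real
# one would run a chain-of-thought generation and parse out the answer.
_fake_answers = cycle(["4", "4", "4", "4", "4", "4", "5", "4"])


def fake_sampler(question):
    return next(_fake_answers)


data = build_distillation_set(["What is 2 + 2?"], fake_sampler)
print(data)  # [{'prompt': 'What is 2 + 2?', 'completion': '4'}]
```

Fine-tuning on `data` then trains the model to map the question directly to the vetted answer, internalizing the System 2 behavior without ever emitting the reasoning at inference time.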

The researchers tested the System 2 distillation method on various reasoning tasks using different prompting techniques. The results demonstrated a significant improvement in LLMs’ performance on complex tasks, often surpassing the accuracy of the original System 2 methods. Notably, the distilled models could generate responses faster and with fewer computational resources because the intermediate steps were eliminated.

While System 2 distillation shows promise in enhancing LLMs’ reasoning capabilities, there are still areas to explore, such as its applicability to smaller models and its impact on broader task performance. Understanding the limitations of distillation, particularly in tasks that require explicit reasoning like complex math problems, will be critical for further advancements in AI systems.

In conclusion, System 2 distillation offers a valuable optimization strategy for LLM pipelines, enabling them to handle specific tasks efficiently while leaving room for deliberate reasoning on more challenging ones. By distilling complex reasoning skills into LLMs without requiring intermediate steps at inference time, this technique paves the way for more efficient and effective AI systems in the future.