Reasoning Through Chess: How Reasoning Evolves from Data Through Fine-Tuning and Reinforcement Learning
NeurIPS 2025 FoRLM Workshop
Outperformed state-of-the-art open-source reasoning models at chess by applying supervised fine-tuning (SFT) and reinforcement learning (RL) to a 7B-parameter language model. The key focus of this work was to study how the fine-tuning data influences post-RL reasoning, both quantitatively and qualitatively, using custom theoretically inspired datasets.