I Trained a Markdown File to Boost GPT-5.5 by 23 Points – This Shouldn’t Work
Author(s): Chew Loong Nian – AI ENGINEER
Originally published on Towards AI.
In an intriguing exploration of AI capabilities, Chew Loong Nian, an AI Engineer, shares an unconventional method that significantly enhanced the performance of GPT-5.5 using a Markdown file. This approach, outlined in a detailed article on Towards AI, demonstrates a remarkable improvement in the model’s benchmark score from 58.8 to 82.3, a leap of +23.5 points, without altering a single weight or refining any parameters.
The Concept Behind SkillOpt
The core principle of this approach lies in a system named SkillOpt. It involves treating the Markdown skills document, or “skill file,” as an adjustable state while maintaining the target model unchanged. By employing a robust optimization model during training, SkillOpt suggests limited modifications—additions, deletions, or replacements—that are only accepted if they demonstrably enhance a validation score. This mirrors the stability of gradient descent in the text space.
Performance Results and Insights
Chew Loong Nian highlights the results across 52 model combinations, noting that SkillOpt consistently performs best or ties for the best performance. Notably, GPT-5.5’s live chat capability surged from 58.8 to 82.3, with pronounced improvements in format-verified procedural tasks like SpreadsheetBench. The trained skills introduce rules for structure verification, explicit value evaluation, state tracking in embedded navigation, and accurate answer anchoring in tables. This advancement is achieved with minimal changes and limited artifact size.
Reproducing the Workflow
The article outlines a straightforward setup for replicating this workflow: install SkillOpt, configure the backends, execute the training loop, and integrate the learned Markdown into the model’s context. This method provides an efficient way to enhance model performance without extensive resource investment.
SkillOpt-Sleep: An Innovative Extension
Additionally, the article introduces SkillOpt-Sleep, a plugin-like extension designed to learn from a user’s historical transcriptions. It features an offline consolidation loop for review, adoption, and validation, further enhancing the training document’s utility.
Addressing Limitations
Despite its promising results, SkillOpt faces two primary limitations: its dependence on automated scoring judges and its focus on optimizing one document at a time. However, for tasks that require procedural accuracy and verification, training the document rather than the model offers a more dependable and cost-effective optimization strategy compared to traditional fine-tuning methods.
For a deeper dive into this innovative approach and its implications, read the full blog on Medium Here.
Published via Toward AI
“`

