Quantum Verifiable Rewards for Post-Training Qiskit Code Assistant

Training AI models to write better quantum code with quantum hardware verification

📄 We just released a new paper “Quantum Verifiable Rewards for Post-Training Qiskit Code Assistant” - and it’s been getting great attention from folks in the quantum field since day one. 🎉

🔍 What we built: A novel approach to train AI models that can write better quantum code using Qiskit. What makes this interesting:

✅ Quantum verification at the core - Instead of just hoping the AI-generated code works, we actually verify that it runs correctly on real quantum systems

✅ Smart training pipeline - We created synthetic quantum problem-test pairs and used both Direct Preference Optimization (DPO) and Group Relative Policy Optimization (GRPO) to align our models

✅ Real quantum feedback - Our models learn directly from rewards derived from quantum hardware and systems, ensuring the code they generate actually works in practice
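To make the idea of a verifiable reward concrete, here is a minimal sketch in plain Python. It is not the pipeline from the paper: the function names (`verifiable_reward`, `group_relative_advantages`, `preference_pairs`) and the toy Bell-state problem are hypothetical, the test runs locally instead of on a quantum backend, and the binary pass/fail reward is an assumption. It only illustrates how a problem-test pair can score generated code, how GRPO-style group-relative advantages can be computed from those scores, and how passing and failing samples could form DPO-style preference pairs.

```python
import statistics


def verifiable_reward(candidate_code: str, test_code: str) -> float:
    """Return 1.0 if the candidate passes its test, else 0.0 (assumed binary reward).

    NOTE: exec() on untrusted model output is unsafe; a real pipeline would
    sandbox this step and, per the post, run against quantum systems.
    """
    namespace = {}
    try:
        exec(candidate_code, namespace)  # define the candidate's functions
        exec(test_code, namespace)       # run the paired test against them
    except Exception:
        return 0.0                       # syntax error, crash, or failed assert
    return 1.0


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantage: standardize each reward within its sampled group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def preference_pairs(candidates, rewards):
    """DPO-style (chosen, rejected) pairs: every passing sample vs. every failing one."""
    passed = [c for c, r in zip(candidates, rewards) if r > 0.5]
    failed = [c for c, r in zip(candidates, rewards) if r <= 0.5]
    return [(p, f) for p in passed for f in failed]


# Hypothetical problem-test pair: list the outcomes of an ideal Bell state.
test = 'assert sorted(bell_outcomes()) == ["00", "11"]'
candidates = [
    'def bell_outcomes():\n    return ["00", "11"]',  # correct
    'def bell_outcomes():\n    return ["01", "10"]',  # wrong answer
    'def bell_outcomes(:\n    return []',             # does not even parse
]

rewards = [verifiable_reward(c, test) for c in candidates]
advantages = group_relative_advantages(rewards)
pairs = preference_pairs(candidates, rewards)
print(rewards, advantages, len(pairs))
```

In this sketch only the first candidate earns a reward, so it gets a positive advantage while the other two get negative ones, and it is the "chosen" side of both preference pairs.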

The results: Our best model significantly outperforms existing leading open-source baselines on the challenging Qiskit-HumanEval-hard benchmark.

This work presents a solid contribution to making quantum programming more accessible through AI assistance. The approach of using quantum hardware verification to train coding assistants opens up promising directions for future research.

Proud of our team’s thoughtful work on this project: Nicolas Dupuis, Adarsh Tiwari, Youssef MROUEH, David Kremer, Ismael Faro. The intersection of AI and quantum computing continues to offer compelling research opportunities.

Paper: https://arxiv.org/abs/2508.20907

#QuantumComputing #AI #MachineLearning #Qiskit #Research #AIforQuantum


Originally shared on LinkedIn on August 30, 2025 - 76 reactions, 0 comments as of 11/12/2025

Juan Cruz-Benito
AI for Quantum Product Owner & Engineering Manager

Building the convergence of AI and Quantum Computing. Product Owner & Engineering Manager @ IBM Quantum
