Quantum LDPC code discovery requires searching large algebraic design spaces while reliably certifying the parameters and equivalence classes of any candidates found. We present a workflow in which large language models mutate Python programs that generate quantum LDPC code designs, specifically bivariate-bicycle and perturbed bivariate-bicycle code ansätze. The workflow is coupled with a rigorous, independent validation pipeline comprising GF(2) rank computation, distance estimation, mixed-integer linear programming (MILP), BLISS Tanner-graph deduplication, decomposability analysis, and local-Clifford equivalence checks. Across five campaigns totaling roughly 1,650 evolutionary iterations and approximately 2 × 10^5 screened candidate codes, at a cost of about 140 hours of computation and about US$400 in LLM inference, the system identified 465 distinct candidate codes at block length n ≤ 360 (97 CSS bivariate-bicycle codes and 368 non-CSS perturbed variants). Notable finds include an indecomposable [[288,16,12]] code and higher-weight codes reaching k = 50 at distance d = 8; the non-CSS results include perturbed codes that match the gross-code benchmark at [[144,12,12]]. These results suggest that LLM-guided program evolution, combined with independent validation, can be a practical approach to structured quantum-code discovery.
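To make the GF(2) rank step concrete, the following minimal sketch builds a bivariate-bicycle code and recomputes n and k for the [[144,12,12]] gross-code benchmark. It is an illustration under stated assumptions, not the pipeline used in this work: the function names (cyclic_shift, gf2_rank, bb_check_matrices) are hypothetical, and the polynomials A = x^3 + y + y^2, B = y^3 + x + x^2 with l = 12, m = 6 are taken from the published gross-code construction rather than from this text. The distance is not checked here.

```python
import numpy as np


def cyclic_shift(size):
    """Return the size x size cyclic shift permutation matrix."""
    return np.roll(np.eye(size, dtype=np.int64), 1, axis=1)


def gf2_rank(matrix):
    """Rank over GF(2) by Gaussian elimination on a copy of the input."""
    m = (np.array(matrix, dtype=np.int64) % 2).astype(np.uint8)
    rows, cols = m.shape
    rank = 0
    for col in range(cols):
        if rank == rows:
            break
        candidates = np.nonzero(m[rank:, col])[0]
        if candidates.size == 0:
            continue  # no pivot in this column
        pivot = rank + candidates[0]
        m[[rank, pivot]] = m[[pivot, rank]]  # move the pivot row into place
        # Clear every other 1 in this column by XOR-ing in the pivot row.
        others = np.nonzero(m[:, col])[0]
        others = others[others != rank]
        m[others] ^= m[rank]
        rank += 1
    return rank


def bb_check_matrices(l, m, a_terms, b_terms):
    """Build H_X = [A | B] and H_Z = [B^T | A^T] for a bivariate-bicycle code.

    a_terms and b_terms list (i, j) exponent pairs, each meaning x^i y^j,
    where x = S_l (x) I_m and y = I_l (x) S_m.
    """
    x = np.kron(cyclic_shift(l), np.eye(m, dtype=np.int64))
    y = np.kron(np.eye(l, dtype=np.int64), cyclic_shift(m))

    def poly(terms):
        total = np.zeros((l * m, l * m), dtype=np.int64)
        for i, j in terms:
            total += np.linalg.matrix_power(x, i) @ np.linalg.matrix_power(y, j)
        return total % 2

    a, b = poly(a_terms), poly(b_terms)
    return np.hstack([a, b]), np.hstack([b.T, a.T])


# Assumed gross-code polynomials: A = x^3 + y + y^2, B = y^3 + x + x^2.
hx, hz = bb_check_matrices(
    12, 6,
    a_terms=[(3, 0), (0, 1), (0, 2)],
    b_terms=[(0, 3), (1, 0), (2, 0)],
)

# CSS condition: every X check commutes with every Z check.
assert not ((hx @ hz.T) % 2).any()

n = hx.shape[1]
k = n - gf2_rank(hx) - gf2_rank(hz)
print(f"n = {n}, k = {k}")
```

Recomputing k from the check matrices, rather than trusting the value reported by the generating program, is the kind of independent certification the validation pipeline relies on; a production version would likely use a bit-packed or library-backed GF(2) elimination for speed.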