Research Use¶

This repository is most useful as a reproducible evidence generator for small-system quantum simulation studies. It is not a claim that any one method is universally best, and it is not production chemistry software.

For the problem scope and user-facing workflows, read PROBLEM.md. For the current benchmark inventory, read notebooks/benchmarks/SUMMARY.md. For published tables and figures, read notebooks/benchmarks/RESULTS.md.

Research Claims This Repo Can Support¶

Use this repo to support claims of these forms:

for a named small molecule or low-qubit Hamiltonian, method A was more accurate, faster, or more stable than method B under a stated configuration
a solver default is reasonable for a documented calibration panel
a configuration is sensitive to seed, shots, noise channel, mapping, ansatz, optimizer, or active-space choice
a non-chemistry Hamiltonian can be run through the same expert-mode API as the chemistry benchmarks

Do not use this repo, by itself, to claim:

chemical accuracy for large molecules
hardware performance or device-readiness
universal algorithm rankings across problem classes
production-quality quantum chemistry results

Evidence Standard¶

A result should be treated as research evidence only when it records:

the resolved problem: molecule or model, geometry, charge, basis, mapping, active space, qubit count, and Hamiltonian-term count where available
the reference: exact diagonalization, Hartree-Fock, analytical model result, or an explicit statement that no reference is used
the solver configuration: method, ansatz, optimizer, step counts, stepsizes, QPE evolution settings, shots, seeds, and noise model
the metrics: energy, absolute error against the reference where meaningful, runtime, cache-hit state, and method-specific diagnostics
the statistical design: seed list, shot list, repetitions, failure criteria, and aggregation method for stochastic or optimizer-sensitive studies
the environment: package version and relevant dependency versions for release-grade benchmark artifacts

The benchmark row contract is documented in notebooks/benchmarks/SCHEMA.md.

Claim Levels¶

Level	Meaning	Minimum evidence
Smoke	The API path runs.	One tiny deterministic case.
Case study	A method behaves as reported on one problem.	One problem, fixed config, reference where available.
Benchmark	A comparison is decision-useful.	Multiple methods or settings, common reference, runtime/cache metadata, documented metrics.
Reproducibility study	Stability is measured.	Multiple seeds or shots, aggregate statistics, failure-rate notes.
Release-grade evidence	A result can be cited.	Curated artifact export, versioned code, clean validation checks, and documented limitations.

Benchmark Acceptance Checklist¶

Before treating a new notebook as a benchmark, verify that it:

asks one explicit research or method-selection question
avoids duplicating an existing notebook unless it replaces or generalizes it
follows notebooks/benchmarks/TEMPLATE.md for question, scope, reference, metrics, aggregation, limitations, and artifact-export notes
uses the shared problem-resolution and Hamiltonian pipeline
compares against an exact or clearly documented reference when feasible
reports cache hits separately from compute runtime
exports any published table or figure through scripts/export_benchmark_artifacts.py
states limitations in the notebook or nearby docs when a method is known to be noiseless-only, calibration-specific, or small-system-only

Release Protocol¶

For a release intended to be useful in research:

Run the default and full test suites.
Run registered benchmark suites that should become release evidence. Use python -m common.benchmarks list to inspect available suites, then run a selected suite, for example python -m common.benchmarks run --suite h2-cross-method.
Compare refreshed suite outputs against the previous evidence set with python -m common.benchmarks compare --base old_run --head new_run.
Rerun benchmark notebooks whose results are being refreshed.
Export curated artifacts with python scripts/export_benchmark_artifacts.py.
Confirm notebooks/benchmarks/_artifacts/benchmark_manifest.json describes the published artifact set.
Build docs with python -m sphinx -W -b html docs docs/_build/html.
Tag the code and attach or archive the curated artifact set if the release is meant to be cited directly.

Document Boundaries¶

The markdown files intentionally have separate jobs:

README.md: installation, orientation, and quickstart
PROBLEM.md: what practical problems the repo is for
THEORY.md: algorithm background
USAGE.md: API and CLI usage
RESEARCH.md: evidence standards and benchmark acceptance rules
notebooks/benchmarks/SUMMARY.md: benchmark inventory
notebooks/benchmarks/RESULTS.md: curated result surfaces
notebooks/benchmarks/SCHEMA.md: benchmark row and artifact metadata fields