Variational Quantum Classifier¶
This note describes the variational quantum classifier (VQC) implemented in qml.classifiers.
The model is a hybrid quantum–classical binary classifier:
- a classical feature vector is encoded into a quantum circuit
- a parameterised ansatz is applied
- an observable is measured
- the resulting scalar is converted into a class probability
- parameters are trained by minimising a classical loss
Data¶
We consider a binary classification dataset
\[
\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N},
\]
where:
- \(N\) is the number of samples
- \(x_i \in \mathbb{R}^d\) is the feature vector for sample \(i\)
- \(y_i \in \{0,1\}\) is the binary label for sample \(i\)
- \(d\) is the feature dimension
In the current implementation:
- \(d = 2\)
- the dataset is the two-moons dataset
- features are standardised before entering the circuit
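The standardisation step can be sketched in a few lines of NumPy (the helper name is illustrative, not the package's API):

```python
import numpy as np

def standardise(X):
    """Zero-mean, unit-variance scaling applied per feature (column)."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma

X = np.array([[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]])
Xs = standardise(X)
print(Xs.mean(axis=0), Xs.std(axis=0))  # -> approx [0. 0.] and [1. 1.]
```

Standardisation matters here because the features enter the circuit directly as rotation angles, so wildly different feature scales would distort the embedding.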
Let
\[
x = (x_1, \dots, x_d) \in \mathbb{R}^d
\]
denote one standardised input sample.
Quantum state preparation¶
The input is encoded into a quantum state using an angle embedding.
Let:
- \(n\) be the number of qubits
- \(n = d\) in the current implementation
- \(|0\rangle^{\otimes n}\) be the initial computational basis state
The feature map is
\[
|\psi(x)\rangle = U_{\text{enc}}(x)\,|0\rangle^{\otimes n},
\]
where \(U_{\text{enc}}(x)\) is the encoding unitary.
Angle embedding¶
For an input vector \(x \in \mathbb{R}^n\), the encoding applies one \(R_Y\) rotation per qubit:
\[
U_{\text{enc}}(x) = \bigotimes_{j=1}^{n} R_Y(x_j),
\]
where:
- \(x_j\) is feature \(j\)
- \(R_Y(\alpha)\) is a single-qubit rotation by angle \(\alpha\) about the \(Y\) axis
The matrix form is
\[
R_Y(\alpha) =
\begin{pmatrix}
\cos(\alpha/2) & -\sin(\alpha/2) \\
\sin(\alpha/2) & \cos(\alpha/2)
\end{pmatrix},
\]
so the encoded state depends directly on the input features.
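The angle embedding can be simulated directly as a statevector with NumPy. This is an illustrative sketch, not the package's implementation; since the embedding is a tensor product acting on a product state, each qubit can be rotated independently and the results combined with Kronecker products:

```python
import numpy as np

def ry(alpha):
    """R_Y(alpha) rotation matrix, matching the form given above."""
    c, s = np.cos(alpha / 2), np.sin(alpha / 2)
    return np.array([[c, -s], [s, c]])

def angle_embedding(x):
    """Return the statevector U_enc(x) |0...0> for one R_Y per qubit."""
    state = np.array([1.0])
    for xj in x:
        # each qubit starts in |0>, then is rotated by R_Y(x_j);
        # the product state is assembled with Kronecker products
        state = np.kron(state, ry(xj) @ np.array([1.0, 0.0]))
    return state

psi = angle_embedding([0.3, -1.2])
print(psi.shape)  # -> (4,) for n = 2 qubits
```

With all features zero the rotations are identities and the state stays at \(|00\rangle\), which is a quick sanity check on the construction.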
Variational ansatz¶
After encoding, a trainable circuit is applied.
Let
\[
\theta = \{\theta_{\ell,j,k}\}
\]
denote the full set of trainable parameters.
The ansatz unitary is \(U_{\text{ans}}(\theta)\), and the full circuit state is
\[
|\psi(x, \theta)\rangle = U_{\text{ans}}(\theta)\, U_{\text{enc}}(x)\, |0\rangle^{\otimes n}.
\]
Layered hardware-efficient ansatz¶
The implemented ansatz is indexed by:
- a layer index \(\ell = 1,\dots,L\)
- a qubit index \(j = 1,\dots,n\)
where:
- \(L\) is the number of variational layers
- \(n\) is the number of qubits
Each layer applies:
- \(R_Y\) on each qubit
- \(R_Z\) on each qubit
- a chain of CNOT gates for entanglement
The parameter tensor is
\[
\theta \in \mathbb{R}^{L \times n \times 2},
\]
where:
- \(\theta_{\ell,j,1}\) is the \(R_Y\) angle for layer \(\ell\), qubit \(j\)
- \(\theta_{\ell,j,2}\) is the \(R_Z\) angle for layer \(\ell\), qubit \(j\)
One layer has the form
\[
U_\ell(\theta_\ell) = U_{\text{ent}} \left( \prod_{j=1}^{n} R_Z^{(j)}(\theta_{\ell,j,2})\, R_Y^{(j)}(\theta_{\ell,j,1}) \right),
\]
where \(U_{\text{ent}}\) is the entangling unitary.
For the chain entangler:
\[
U_{\text{ent}} = \prod_{j=1}^{n-1} \mathrm{CNOT}_{j,\,j+1}.
\]
Thus the full ansatz is
\[
U_{\text{ans}}(\theta) = U_L(\theta_L) \cdots U_1(\theta_1).
\]
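For \(n = 2\) the layered ansatz can be sketched as explicit matrices. This is an illustrative statevector construction, not the package's code; the angle ordering follows the parameter tensor above (index 1 for \(R_Y\), index 2 for \(R_Z\)):

```python
import numpy as np

def ry(a):
    """Single-qubit R_Y rotation."""
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(a):
    """Single-qubit R_Z rotation."""
    return np.diag([np.exp(-1j * a / 2), np.exp(1j * a / 2)])

# CNOT with qubit 1 (left tensor factor) as control, qubit 2 as target
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def layer_unitary(theta_layer):
    """One hardware-efficient layer for n = 2.

    theta_layer has shape (2, 2): theta_layer[j] = (RY angle, RZ angle)
    for qubit j. Rotations are applied first, then the entangling CNOT.
    """
    rot = np.kron(rz(theta_layer[0, 1]) @ ry(theta_layer[0, 0]),
                  rz(theta_layer[1, 1]) @ ry(theta_layer[1, 0]))
    return CNOT @ rot

def ansatz_unitary(theta):
    """Full ansatz: product of L layers, theta of shape (L, 2, 2)."""
    U = np.eye(4, dtype=complex)
    for theta_layer in theta:
        U = layer_unitary(theta_layer) @ U
    return U
```

With all angles zero, each layer reduces to the bare CNOT, so two layers compose to the identity, which is a convenient correctness check.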
Measurement and model output¶
The circuit measures the expectation value of the Pauli-\(Z\) observable on the first qubit.
Let
\[
M = Z_1,
\]
where \(Z_1\) is Pauli \(Z\) acting on qubit 1.
The raw model output is
\[
f_\theta(x) = \langle \psi(x,\theta) |\, Z_1 \,| \psi(x,\theta) \rangle,
\]
with \(f_\theta(x) \in [-1, 1]\).
This expectation value is then mapped to a probability
\[
p_\theta(y=1 \mid x) = \frac{1 + f_\theta(x)}{2},
\]
and therefore
\[
p_\theta(y=0 \mid x) = 1 - p_\theta(y=1 \mid x),
\]
where:
- \(p_\theta(y=1 \mid x)\) is the predicted probability of class 1
- \(p_\theta(y=0 \mid x)\) is the predicted probability of class 0
Decision rule¶
The predicted class is
\[
\hat{y}(x) =
\begin{cases}
1 & \text{if } p_\theta(y=1 \mid x) \ge \tfrac{1}{2}, \\
0 & \text{otherwise},
\end{cases}
\]
where \(\hat{y}(x)\) is the predicted label for input \(x\).
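Continuing the statevector sketch, measurement, probability map, and decision rule fit in a few lines. The affine map \(p = (1 + \langle Z_1 \rangle)/2\) used here is one common convention and an assumption of this sketch, as are the function names:

```python
import numpy as np

def expectation_z1(state):
    """<psi| Z_1 |psi> for a 2-qubit statevector (qubit 1 = left factor)."""
    Z1 = np.kron(np.diag([1.0, -1.0]), np.eye(2))
    return float(np.real(state.conj() @ Z1 @ state))

def predict_proba(state):
    """Map the expectation value in [-1, 1] affinely onto [0, 1]."""
    return (1.0 + expectation_z1(state)) / 2.0

def predict(state):
    """Threshold the class-1 probability at 1/2."""
    return int(predict_proba(state) >= 0.5)

ket00 = np.array([1, 0, 0, 0], dtype=complex)  # <Z_1> = +1
print(predict_proba(ket00), predict(ket00))  # -> 1.0 1
```

Because the map from \([-1,1]\) to \([0,1]\) is affine and monotone, thresholding the probability at \(1/2\) is equivalent to thresholding the raw expectation value at \(0\).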
Loss function¶
Training uses binary cross-entropy.
For a batch of \(N\) training samples, let:
- \(y_i \in \{0,1\}\) be the true label of sample \(i\)
- \(p_i = p_\theta(y=1 \mid x_i)\) be the predicted probability of class 1 for sample \(i\)
The loss is
\[
\mathcal{L}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \Big[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \Big],
\]
where:
- \(\mathcal{L}(\theta)\) is the training objective
- \(N\) is the number of training samples
In implementation, probabilities are clipped slightly away from 0 and 1 for numerical stability.
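A direct NumPy transcription of this loss, including the numerical-stability clipping, looks as follows (the \(\varepsilon\) value is illustrative):

```python
import numpy as np

def bce_loss(y, p, eps=1e-7):
    """Binary cross-entropy; probabilities clipped away from 0 and 1."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

y = np.array([1, 0, 1, 0])
p = np.array([0.9, 0.1, 0.8, 0.2])
print(round(bce_loss(y, p), 4))  # -> 0.1643
```

Without the clipping, a confidently wrong prediction with \(p_i\) exactly 0 or 1 would produce \(\log 0 = -\infty\) and break the gradient step.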
Optimisation¶
The parameters \(\theta\) are trained using a classical optimiser.
The current implementation uses Adam with a fixed learning rate \(\eta\).
Training proceeds for a fixed number of steps
\[
t = 1, \dots, T,
\]
where \(T\) is the total number of optimisation iterations.
At each step:
- evaluate the circuit on the training set
- compute probabilities \(p_i\)
- compute loss \(\mathcal{L}(\theta)\)
- compute gradients with respect to \(\theta\)
- update \(\theta\) using Adam
The loss history is recorded as
\[
\big(\mathcal{L}^{(1)}, \dots, \mathcal{L}^{(T)}\big),
\]
where \(\mathcal{L}^{(t)}\) is the loss after optimisation step \(t\).
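The step loop above can be sketched generically. Here a toy quadratic stands in for the circuit loss, since the point is the Adam update and the loss-history bookkeeping, not the quantum part; all names and hyperparameter defaults are illustrative:

```python
import numpy as np

def adam_minimise(grad_fn, loss_fn, theta0, eta=0.1, steps=200,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Schematic Adam training loop; records the loss after every step."""
    theta = np.asarray(theta0, dtype=float)
    m = np.zeros_like(theta)
    v = np.zeros_like(theta)
    history = []
    for t in range(1, steps + 1):
        g = grad_fn(theta)                    # gradients w.r.t. theta
        m = beta1 * m + (1 - beta1) * g       # first-moment estimate
        v = beta2 * v + (1 - beta2) * g**2    # second-moment estimate
        m_hat = m / (1 - beta1**t)            # bias corrections
        v_hat = v / (1 - beta2**t)
        theta = theta - eta * m_hat / (np.sqrt(v_hat) + eps)
        history.append(loss_fn(theta))        # loss after step t
    return theta, history

# Toy stand-in for the circuit loss: minimise (theta - 3)^2
theta, history = adam_minimise(grad_fn=lambda th: 2 * (th - 3),
                               loss_fn=lambda th: float((th - 3) ** 2),
                               theta0=np.array(0.0))
print(history[-1] < history[0])  # -> True, the loss decreases
```

In the actual hybrid setting, `grad_fn` would differentiate through the circuit (for example via the parameter-shift rule or automatic differentiation), but the surrounding loop is identical.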
Accuracy¶
After training, predictions are formed on both train and test sets.
For any evaluation set of size \(M\), let:
- \(y_i\) be the true label of sample \(i\)
- \(\hat{y}_i\) be the predicted label of sample \(i\)
Accuracy is
\[
\mathrm{Acc} = \frac{1}{M} \sum_{i=1}^{M} \mathbf{1}\{\hat{y}_i = y_i\},
\]
where:
- \(M\) is the number of evaluated samples
- \(\mathbf{1}\{\cdot\}\) is the indicator function
The implementation reports:
- training accuracy
- test accuracy
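The accuracy metric is a one-liner over predicted and true labels (illustrative helper, not the package's API):

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # -> 0.75
```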
Parameter count¶
The ansatz parameter tensor has shape
\[
(L,\, n,\, 2),
\]
so the total number of trainable parameters is
\[
P = 2 L n,
\]
where:
- \(P\) is the total number of trainable parameters
- \(L\) is the number of layers
- \(n\) is the number of qubits
For the current minimal model:
- \(n = 2\)
- typical choice: \(L = 2\)
so:
\[
P = 2 \times 2 \times 2 = 8.
\]
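The count can be read straight off the tensor shape, which is a useful sanity check when changing \(L\) or \(n\):

```python
import numpy as np

L, n = 2, 2                    # layers and qubits in the minimal model
theta = np.zeros((L, n, 2))    # (layer, qubit, rotation) parameter tensor
P = theta.size                 # total trainable parameters, P = 2 * L * n
print(P)  # -> 8
```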
Current implementation choices¶
The current VQC is intentionally minimal.
Included¶
- binary classification
- two-dimensional input
- angle embedding
- hardware-efficient ansatz
- Pauli-\(Z\) measurement on the first qubit
- binary cross-entropy
- Adam optimisation
- train/test accuracy
- loss curve and decision-boundary visualisation
Interpretation¶
The model learns a map
\[
x \;\mapsto\; p_\theta(y = 1 \mid x),
\]
where the nonlinearity comes from:
- quantum state preparation
- entangling operations
- nonlinear dependence of expectation values on the parameters and inputs
In practice, the VQC behaves like a compact hybrid classifier whose expressive power depends on:
- number of qubits \(n\)
- number of layers \(L\)
- embedding choice
- entanglement structure
- optimiser settings
Relation to the code¶
The implemented workflow is organised as follows:
- qml.data prepares the dataset
- qml.embeddings applies the feature map
- qml.ansatz applies the trainable circuit
- qml.classifiers.run_vqc performs training and evaluation
- qml.visualize creates plots
So the notebook remains a package client, while the full VQC logic lives in the package.
Summary¶
The implemented VQC is a binary classifier defined by:
- a feature map \(U_{\text{enc}}(x)\)
- a trainable ansatz \(U_{\text{ans}}(\theta)\)
- an observable \(M = Z_1\)
- a probability map from expectation values to class probabilities
- a binary cross-entropy training objective
Formally:
\[
p_\theta(y = 1 \mid x) = \frac{1 + \langle 0|^{\otimes n}\, U_{\text{enc}}^\dagger(x)\, U_{\text{ans}}^\dagger(\theta)\, Z_1\, U_{\text{ans}}(\theta)\, U_{\text{enc}}(x)\, |0\rangle^{\otimes n}}{2}.
\]
This is the core VQC workflow used in the repository.