Theory¶
This repository implements core methods in quantum machine learning (QML) using parameterised quantum circuits and quantum feature maps.
The focus is on supervised learning using hybrid quantum–classical models.
Workflows include:
- variational quantum classifiers
- variational quantum regressors
- quantum kernel methods
- trainable quantum kernels
- quantum metric learning
All models rely on parameterised quantum circuits evaluated within classical optimisation loops.
Table of Contents¶
- Hybrid quantum–classical learning
- Data encoding (feature maps)
- Variational quantum circuits
- Expectation values
- Finite-shot estimation (noise-aware execution)
- Variational quantum classifier (VQC)
- Variational quantum regression (VQR)
- Quantum kernel methods
- Trainable quantum kernels
- Quantum convolutional neural networks
- Quantum metric learning
- Relationship between models
- Model capacity
- General workflow
- Noise considerations
- Quantum autoencoders
- References
- Author
- License
Hybrid quantum–classical learning¶
Most QML models take the form:
\[
\hat{y} = f_\theta(x),
\]
where:
- \(x \in \mathbb{R}^d\) is a classical input vector
- \(\theta\) are trainable circuit parameters
- \(f_\theta\) is computed using a quantum circuit
Typical workflow:
- encode classical data into a quantum state
- apply a parameterised circuit
- measure an observable
- compute a classical loss
- update parameters using classical optimisation
Optimisation is performed using gradient-based methods such as Adam.
Gradients are computed using automatic differentiation and the parameter-shift rule.
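As a concrete illustration of this loop, the sketch below uses PennyLane (an assumption; the repository's actual framework, circuit, and hyperparameters may differ) to wire an angle-embedding feature map and a generic entangling ansatz into an Adam optimisation loop:

```python
# Minimal hybrid quantum-classical training loop (illustrative sketch).
import pennylane as qml
from pennylane import numpy as np  # autograd-aware NumPy

n_qubits = 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def model(x, theta):
    # 1. encode classical data into a quantum state
    qml.AngleEmbedding(x, wires=range(n_qubits))
    # 2. apply a parameterised circuit
    qml.StronglyEntanglingLayers(theta, wires=range(n_qubits))
    # 3. measure an observable
    return qml.expval(qml.PauliZ(0))

def loss(theta, X, y):
    # 4. compute a classical loss (mean squared error here)
    return sum((model(x, theta) - yi) ** 2 for x, yi in zip(X, y)) / len(X)

X = np.array([[0.1, 0.5], [0.9, 0.2]], requires_grad=False)
y = np.array([1.0, -1.0], requires_grad=False)
theta = np.array(np.random.uniform(0, np.pi, size=(2, n_qubits, 3)), requires_grad=True)

opt = qml.AdamOptimizer(stepsize=0.1)
for step in range(50):
    # 5. update parameters with a classical optimiser
    #    (gradients via autodiff / parameter-shift)
    theta = opt.step(lambda t: loss(t, X, y), theta)
```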
Data encoding (feature maps)¶
Classical data must be embedded into quantum states.
Given a feature vector \(x = (x_1, \ldots, x_d) \in \mathbb{R}^d\), we prepare a quantum state:
\[
|\psi(x)\rangle = U(x)\,|0\rangle^{\otimes n},
\]
where \(U(x)\) is a unitary parameterised by the data.
Angle embedding¶
A simple encoding uses single-qubit rotations:
\[
U(x) = \bigotimes_{i=1}^{d} R_Y(x_i),
\]
where \(R_Y(x_i) = e^{-i x_i Y / 2}\).
Angle embedding maps classical features directly to rotation angles.
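A minimal sketch of angle embedding, assuming PennyLane; the rotation axis (here \(R_Y\)) and qubit count are illustrative choices:

```python
# Angle embedding sketch: each feature sets one single-qubit rotation angle.
import pennylane as qml
import numpy as np

x = np.array([0.4, 1.2, 2.0])
dev = qml.device("default.qubit", wires=len(x))

@qml.qnode(dev)
def embed(features):
    for i, xi in enumerate(features):
        qml.RY(xi, wires=i)   # feature x_i -> rotation angle on qubit i
    return qml.state()        # the prepared state |psi(x)>

state = embed(x)  # equivalent to qml.AngleEmbedding(x, wires=range(3), rotation="Y")
```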
Variational quantum circuits¶
Variational models use parameterised circuits \(U(\theta)\), composed of single-qubit rotations and entangling gates.
The model output is an expectation value:
\[
f_\theta(x) = \langle 0 |\, U^\dagger(x)\, U^\dagger(\theta)\, M\, U(\theta)\, U(x) \,| 0 \rangle,
\]
where:
- \(M\) is an observable
- \(U(x)\) encodes data
- \(U(\theta)\) contains trainable parameters
Hardware-efficient ansatz¶
A common ansatz uses repeated layers:
\[
U(\theta) = \prod_{l=1}^{L} U_{\mathrm{ent}} \left( \bigotimes_{i=1}^{n} R(\theta_{l,i}) \right),
\]
with entanglement applied between neighbouring qubits, for example:
\[
U_{\mathrm{ent}} = \prod_{i=1}^{n-1} \mathrm{CNOT}_{i,\,i+1}.
\]
A code sketch follows the property list below.
Properties:
- shallow circuit depth
- hardware compatible
- expressive but trainable
- widely used baseline
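The following sketch shows one possible hardware-efficient layer, assuming PennyLane; the rotation axes, CNOT chain, and layer count are illustrative choices, not the repository's exact ansatz:

```python
# Hardware-efficient ansatz sketch: repeated layers of single-qubit rotations
# followed by a nearest-neighbour CNOT chain.
import pennylane as qml
import numpy as np

n_qubits, n_layers = 4, 3
dev = qml.device("default.qubit", wires=n_qubits)

def hea_layer(params):
    # single-qubit rotations (one RY, RZ pair per qubit in this sketch)
    for i in range(n_qubits):
        qml.RY(params[i, 0], wires=i)
        qml.RZ(params[i, 1], wires=i)
    # entanglement: nearest-neighbour CNOT chain
    for i in range(n_qubits - 1):
        qml.CNOT(wires=[i, i + 1])

@qml.qnode(dev)
def ansatz(x, theta):
    qml.AngleEmbedding(x, wires=range(n_qubits))
    for layer in range(n_layers):
        hea_layer(theta[layer])
    return qml.expval(qml.PauliZ(0))

theta = np.random.uniform(0, 2 * np.pi, size=(n_layers, n_qubits, 2))
output = ansatz(np.array([0.1, 0.2, 0.3, 0.4]), theta)
```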
Expectation values¶
Variational models produce scalar outputs via expectation values:
\[
f_\theta(x) = \langle \psi_\theta(x) |\, M \,| \psi_\theta(x) \rangle,
\]
with:
\[
|\psi_\theta(x)\rangle = U(\theta)\, U(x)\, |0\rangle^{\otimes n}.
\]
Typical observable:
\[
M = Z_0 \quad \text{(Pauli-}Z\text{ on a readout qubit)},
\]
giving outputs in \([-1, 1]\).
Finite-shot estimation (noise-aware execution)¶
Expectation values may be computed either analytically or via sampling.
Given \(S\) measurement shots:
\[
\hat{f}_\theta(x) = \frac{1}{S} \sum_{s=1}^{S} m_s,
\]
where \(m_s \in \{-1, +1\}\) are single-shot measurement outcomes of the observable.
Finite-shot evaluation introduces sampling variance:
\[
\operatorname{Var}\!\left[\hat{f}_\theta(x)\right] = \mathcal{O}\!\left(\tfrac{1}{S}\right).
\]
As \(S \to \infty\), the estimate converges to the analytic expectation value.
Finite-shot sampling simulates noise effects present on real hardware.
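A small sketch of analytic versus finite-shot evaluation, assuming PennyLane's `default.qubit` simulator; the circuit and shot count are illustrative:

```python
# Same circuit evaluated analytically (shots=None) and with S measurement shots;
# the sampled estimate fluctuates around the analytic value.
import pennylane as qml

def make_circuit(shots=None):
    dev = qml.device("default.qubit", wires=1, shots=shots)

    @qml.qnode(dev)
    def circuit(x):
        qml.RY(x, wires=0)
        return qml.expval(qml.PauliZ(0))

    return circuit

x = 0.7
exact = make_circuit(shots=None)(x)     # analytic expectation value
sampled = make_circuit(shots=1000)(x)   # estimate from S = 1000 shots
print(exact, sampled)                   # sampled -> exact as shots -> infinity
```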
Variational quantum classifier (VQC)¶
Binary classification uses expectation values mapped to probabilities.
Define:
\[
f_\theta(x) = \langle Z_0 \rangle \in [-1, 1].
\]
Probability of class 1:
\[
p_\theta(y = 1 \mid x) = \frac{1 + f_\theta(x)}{2}.
\]
Prediction rule:
\[
\hat{y} = \begin{cases} 1 & \text{if } p_\theta(y = 1 \mid x) \geq \tfrac{1}{2}, \\ 0 & \text{otherwise.} \end{cases}
\]
Classification loss¶
Binary cross-entropy:
\[
\mathcal{L}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \Big[ y_i \log p_\theta(x_i) + (1 - y_i) \log\big(1 - p_\theta(x_i)\big) \Big],
\]
where \(p_\theta(x_i) = p_\theta(y = 1 \mid x_i)\).
Optimisation adjusts parameters \(\theta\) to minimise classification error.
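A plain-NumPy sketch of the readout-to-probability mapping and the cross-entropy loss; `expval` stands for the circuit output \(\langle Z_0 \rangle\), and the helper names are hypothetical:

```python
# Map <Z> in [-1, 1] to a class-1 probability and score it with cross-entropy.
import numpy as np

def class_probability(expval):
    # p(y=1 | x) = (1 + <Z>) / 2
    return 0.5 * (1.0 + expval)

def binary_cross_entropy(probs, labels, eps=1e-9):
    probs = np.clip(probs, eps, 1 - eps)  # numerical safety near 0 and 1
    return -np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))

def predict(expval):
    # threshold the class-1 probability at 1/2
    return int(class_probability(expval) >= 0.5)
```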
Variational quantum regression (VQR)¶
Regression uses expectation values as continuous predictions.
Prediction:
\[
\hat{y}(x) = f_\theta(x) = \langle \psi_\theta(x) |\, M \,| \psi_\theta(x) \rangle.
\]
Targets are typically rescaled, e.g.
\[
\tilde{y}_i = 2\,\frac{y_i - y_{\min}}{y_{\max} - y_{\min}} - 1,
\]
to match the observable output range \([-1, 1]\).
Regression loss¶
Mean squared error:
\[
\mathcal{L}_{\mathrm{MSE}}(\theta) = \frac{1}{N} \sum_{i=1}^{N} \big( \hat{y}(x_i) - y_i \big)^2.
\]
Evaluation metrics include:
Mean absolute error:
\[
\mathcal{L}_{\mathrm{MAE}} = \frac{1}{N} \sum_{i=1}^{N} \big| \hat{y}(x_i) - y_i \big|.
\]
Regression uses the same quantum architecture as classification but a different loss.
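A NumPy sketch of the regression loss and metric, plus one possible rescaling of targets into \([-1, 1]\); the scaling choice is illustrative:

```python
# Regression metrics: MSE as the training loss, MAE as an evaluation metric.
import numpy as np

def mse(y_pred, y_true):
    return np.mean((y_pred - y_true) ** 2)

def mae(y_pred, y_true):
    return np.mean(np.abs(y_pred - y_true))

def rescale_targets(y):
    # map targets into [-1, 1] to match the Pauli-Z output range (illustrative)
    return 2 * (y - y.min()) / (y.max() - y.min()) - 1
```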
Quantum kernel methods¶
Kernel methods avoid explicit parameter optimisation.
Instead, similarity between inputs is computed:
\[
k(x_i, x_j) = \big| \langle \psi(x_i) \,|\, \psi(x_j) \rangle \big|^2,
\]
where \(|\psi(x)\rangle = U(x)\,|0\rangle^{\otimes n}\) is the embedded state.
Kernel evaluation using quantum circuits¶
Kernel values are computed using:
\[
k(x_i, x_j) = \big| \langle 0 |\, U^\dagger(x_j)\, U(x_i) \,| 0 \rangle \big|^2.
\]
Procedure:
- apply feature map \(U(x_i)\)
- apply inverse feature map \(U^\dagger(x_j)\)
- measure probability of the zero state
Construct kernel matrix:
\[
K_{ij} = k(x_i, x_j).
\]
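A sketch of this procedure, assuming PennyLane and an angle-embedding feature map; `feature_map` and the toy data are illustrative:

```python
# Quantum kernel sketch: overlap |<0| U^dag(x_j) U(x_i) |0>|^2 estimated as the
# probability of the all-zero outcome.
import pennylane as qml
import numpy as np

n_qubits = 2
dev = qml.device("default.qubit", wires=n_qubits)

def feature_map(x):
    qml.AngleEmbedding(x, wires=range(n_qubits))

@qml.qnode(dev)
def kernel_circuit(x_i, x_j):
    feature_map(x_i)               # apply U(x_i)
    qml.adjoint(feature_map)(x_j)  # apply U^dagger(x_j)
    return qml.probs(wires=range(n_qubits))

def kernel(x_i, x_j):
    return kernel_circuit(x_i, x_j)[0]  # probability of |0...0>

X = np.array([[0.1, 0.5], [0.9, 0.2], [0.4, 0.7]])
K = np.array([[kernel(a, b) for b in X] for a in X])  # kernel (Gram) matrix
```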
Support vector machines with quantum kernels¶
Given kernel matrix \(K \in \mathbb{R}^{N \times N}\), the classifier takes the form:
\[
f(x) = \operatorname{sign}\!\left( \sum_{i=1}^{N} \alpha_i\, y_i\, k(x_i, x) + b \right),
\]
where:
- \(\alpha_i\) are learned coefficients
- \(b\) is a bias term
Training solves a convex optimisation problem.
The quantum computer supplies kernel values.
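A sketch of the classical SVM step using scikit-learn's precomputed-kernel interface; it reuses the `kernel` function from the previous sketch, and the toy data are illustrative rather than the repository's pipeline:

```python
# SVM on a precomputed quantum kernel; scikit-learn solves the convex problem.
import numpy as np
from sklearn.svm import SVC

X_train = np.array([[0.1, 0.5], [0.9, 0.2], [0.4, 0.7], [0.8, 0.9]])
y_train = np.array([0, 1, 0, 1])

# Gram matrix between training samples, entries supplied by the quantum kernel
K_train = np.array([[kernel(a, b) for b in X_train] for a in X_train])
clf = SVC(kernel="precomputed").fit(K_train, y_train)

# Prediction needs kernel values between test and training samples
X_test = np.array([[0.2, 0.6]])
K_test = np.array([[kernel(a, b) for b in X_train] for a in X_test])
pred = clf.predict(K_test)
```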
Trainable quantum kernels¶
Instead of fixing the feature map, parameters \(\theta\) can be introduced into the encoding unitary \(U(x; \theta)\), giving kernel:
\[
k_\theta(x_i, x_j) = \big| \langle 0 |\, U^\dagger(x_j; \theta)\, U(x_i; \theta) \,| 0 \rangle \big|^2.
\]
Kernel-target alignment¶
Trainable kernels optimise similarity between:
- kernel matrix \(K_\theta\)
- label similarity matrix \(Y\)
Label similarity:
\[
Y_{ij} = y_i\, y_j, \qquad y_i \in \{-1, +1\}.
\]
Alignment objective:
\[
A(K_\theta, Y) = \frac{\langle K_\theta, Y \rangle_F}{\sqrt{\langle K_\theta, K_\theta \rangle_F \, \langle Y, Y \rangle_F}},
\]
where Frobenius inner product:
\[
\langle A, B \rangle_F = \sum_{i,j} A_{ij} B_{ij}.
\]
Optimisation objective:
\[
\max_\theta \; A(K_\theta, Y).
\]
Alignment encourages kernel similarity to reflect class structure.
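A NumPy sketch of the alignment computation; labels are assumed to be encoded as \(\{0, 1\}\) or \(\{-1, +1\}\):

```python
# Kernel-target alignment: cosine similarity (under the Frobenius inner product)
# between the kernel matrix K and the label similarity matrix Y.
import numpy as np

def target_alignment(K, y):
    y = np.where(y == 0, -1, y)   # use +/-1 labels
    Y = np.outer(y, y)            # Y_ij = y_i * y_j
    inner = np.sum(K * Y)         # Frobenius inner product <K, Y>_F
    norm = np.sqrt(np.sum(K * K) * np.sum(Y * Y))
    return inner / norm

# During training, one maximises this quantity (or minimises its negative)
# with respect to the feature-map parameters theta.
```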
Quantum convolutional neural networks¶
Quantum convolutional neural networks use a hierarchical circuit structure to combine local feature extraction with progressive reduction of the active degrees of freedom.
A QCNN alternates:
- convolution-style local two-qubit blocks
- pooling-style entangling reductions
- a final readout on a reduced register
In the implementation used here, the classifier applies a trainable embedding on four qubits, followed by shared convolution blocks on neighbouring pairs, then a second-stage convolution on the pooled representation.
Hierarchical structure¶
Let the initial embedded state be
\[
|\psi_0(x)\rangle = U_{\mathrm{emb}}(x, \theta_{\mathrm{emb}})\, |0\rangle^{\otimes 4}.
\]
The QCNN transforms it as
\[
|\psi_0(x)\rangle \;\longrightarrow\; U_{\mathrm{conv}}^{(2)}\, U_{\mathrm{pool}}\, U_{\mathrm{conv}}^{(1)}\, |\psi_0(x)\rangle.
\]
This differs from a flat variational classifier because information is processed through successive local blocks rather than a repeated global ansatz layer.
Binary classification readout¶
The final prediction is obtained from a Pauli-\(Z\) expectation on the readout qubit:
\[
f_\theta(x) = \langle \psi_{\mathrm{out}}(x, \theta) |\, Z_r \,| \psi_{\mathrm{out}}(x, \theta) \rangle.
\]
This is mapped to a class probability by
\[
p_\theta(y = 1 \mid x) = \frac{1 + f_\theta(x)}{2}.
\]
Training then minimises binary cross-entropy over the dataset.
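A four-qubit QCNN sketch in PennyLane; the specific two-qubit blocks, pooling gates, and parameter sharing shown here are illustrative stand-ins for the repository's blocks:

```python
# QCNN sketch: embedding, shared first-stage convolutions on neighbouring pairs,
# pooling onto qubits {0, 2}, a second-stage convolution, and a Pauli-Z readout.
import pennylane as qml
import numpy as np

dev = qml.device("default.qubit", wires=4)

def conv_block(params, wires):
    # generic two-qubit "convolution" block
    qml.RY(params[0], wires=wires[0])
    qml.RY(params[1], wires=wires[1])
    qml.CNOT(wires=wires)
    qml.RY(params[2], wires=wires[0])
    qml.RY(params[3], wires=wires[1])

def pool_block(params, source, sink):
    # "pooling": entangle and conditionally rotate the retained qubit
    qml.CRZ(params[0], wires=[source, sink])
    qml.CRX(params[1], wires=[source, sink])

@qml.qnode(dev)
def qcnn(x, theta):
    qml.AngleEmbedding(x, wires=range(4))   # embedding (a trainable variant is possible)
    for pair in [(0, 1), (2, 3), (1, 2)]:   # shared first-stage convolution
        conv_block(theta[0], wires=pair)
    pool_block(theta[1], source=1, sink=0)  # pool qubit 1 -> 0
    pool_block(theta[1], source=3, sink=2)  # pool qubit 3 -> 2
    conv_block(theta[2], wires=(0, 2))      # second-stage convolution
    return qml.expval(qml.PauliZ(0))        # readout qubit

theta = [np.random.uniform(0, 2 * np.pi, 4),
         np.random.uniform(0, 2 * np.pi, 2),
         np.random.uniform(0, 2 * np.pi, 4)]
print(qcnn(np.array([0.1, 0.2, 0.3, 0.4]), theta))
```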
Quantum metric learning¶
Quantum metric learning aims to learn an embedding geometry in which distances between samples reflect label similarity.
Instead of directly predicting labels, the model learns a parameterised quantum embedding:
\[
x \;\mapsto\; |\phi_\theta(x)\rangle = U(x; \theta)\, |0\rangle^{\otimes k}.
\]
The quantum circuit defines a feature map:
\[
\phi_\theta : \mathbb{R}^d \to \mathbb{R}^k,
\]
where expectation values of Pauli observables form an embedding vector:
\[
\phi_\theta(x) = \big( \langle Z_1 \rangle, \langle Z_2 \rangle, \ldots, \langle Z_k \rangle \big)
\]
for a \(k\)-qubit circuit.
Distance-based supervision¶
Given two samples \((x_i, y_i)\) and \((x_j, y_j)\), define embedding distance:
\[
d_{ij} = \big\| \phi_\theta(x_i) - \phi_\theta(x_j) \big\|_2.
\]
Training encourages:
- small distances for same-class pairs
- large distances for different-class pairs
Contrastive loss¶
Define label similarity indicator:
\[
s_{ij} = \begin{cases} 1 & \text{if } y_i = y_j, \\ 0 & \text{otherwise.} \end{cases}
\]
Contrastive objective:
\[
\mathcal{L}(\theta) = \sum_{i < j} \Big[ s_{ij}\, d_{ij}^2 + (1 - s_{ij}) \max(0,\, m - d_{ij})^2 \Big],
\]
where:
- \(m\) is a margin hyperparameter
- \(d_{ij}\) is Euclidean distance in embedding space
The margin encourages separation between classes.
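A NumPy sketch of the contrastive objective over precomputed embedding vectors; the pairwise loop and default margin are illustrative:

```python
# Contrastive loss over embedding vectors (e.g. the Pauli-Z expectation vectors
# produced by the quantum embedding circuit).
import numpy as np

def contrastive_loss(embeddings, labels, margin=1.0):
    loss, n_pairs = 0.0, 0
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(embeddings[i] - embeddings[j])  # Euclidean distance d_ij
            same = float(labels[i] == labels[j])                # indicator s_ij
            loss += same * d**2 + (1 - same) * max(0.0, margin - d) ** 2
            n_pairs += 1
    return loss / n_pairs
```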
Data re-uploading embeddings¶
Expressive embeddings may be constructed using repeated feature encoding layers:
\[
U(x; \theta) = \prod_{l=1}^{L} W(\theta_l)\, S(x),
\]
where \(S(x)\) is a data-encoding layer, \(W(\theta_l)\) is a trainable layer, and \(L\) is the number of re-uploading layers.
Repeated encoding increases expressivity without increasing qubit count.
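A data re-uploading sketch, assuming PennyLane; the encoding layer, trainable-layer template, and layer count are illustrative:

```python
# Data re-uploading: interleave the encoding layer S(x) with trainable layers
# W(theta_l), repeated L times on a fixed number of qubits.
import pennylane as qml
import numpy as np

n_qubits, n_layers = 2, 3
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def embedding(x, theta):
    for layer in range(n_layers):
        qml.AngleEmbedding(x, wires=range(n_qubits))         # S(x): re-encode the data
        qml.StronglyEntanglingLayers(theta[layer:layer + 1],  # W(theta_l): trainable layer
                                     wires=range(n_qubits))
    # embedding vector: per-qubit Pauli-Z expectation values
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

theta = np.random.uniform(0, 2 * np.pi, size=(n_layers, n_qubits, 3))
phi = embedding(np.array([0.3, 0.8]), theta)
```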
Classification in embedding space¶
After optimisation, predictions may be performed using classical methods.
One simple approach uses nearest centroid classification.
Compute class centroids:
\[
c_y = \frac{1}{N_y} \sum_{i \,:\, y_i = y} \phi_\theta(x_i).
\]
Prediction:
\[
\hat{y}(x) = \arg\min_y \big\| \phi_\theta(x) - c_y \big\|_2.
\]
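A NumPy sketch of nearest-centroid classification on embedding vectors; the toy embeddings are illustrative:

```python
# Nearest-centroid classification in the learned embedding space.
import numpy as np

def class_centroids(embeddings, labels):
    # one centroid per class: mean embedding of that class's samples
    return {c: embeddings[labels == c].mean(axis=0) for c in np.unique(labels)}

def predict(embedding, centroids):
    # assign the class whose centroid is closest in Euclidean distance
    return min(centroids, key=lambda c: np.linalg.norm(embedding - centroids[c]))

embeddings = np.array([[0.9, 0.1], [0.8, 0.2], [-0.7, -0.6], [-0.9, -0.4]])
labels = np.array([0, 0, 1, 1])
centroids = class_centroids(embeddings, labels)
print(predict(np.array([0.7, 0.0]), centroids))  # -> 0
```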
Metric learning therefore separates:
- representation learning (quantum)
- classification rule (classical)
Relationship to kernel methods¶
Metric learning and kernel methods both rely on quantum feature maps.
Kernel methods compute similarity:
\[
k(x_i, x_j) = \big| \langle \phi(x_i) \,|\, \phi(x_j) \rangle \big|^2.
\]
Metric learning instead optimises parameters such that Euclidean distances in embedding space reflect label similarity.
Both approaches use quantum circuits to construct feature representations.
Relationship to variational models¶
Variational classifiers directly optimise prediction error.
Metric learning optimises geometry of the feature space.
Advantages:
- decouples representation learning from classifier choice
- allows classical classifiers to operate on quantum features
- supports few-shot learning scenarios
- provides interpretable embedding structure
Model capacity considerations¶
Embedding expressivity depends on:
- number of qubits
- circuit depth
- entanglement structure
- number of re-uploading layers
As circuit depth increases, the embedding may represent more complex similarity structure.
Relationship between models¶
- Variational models learn parameters inside quantum circuits.
- Kernel models use quantum circuits to compute similarity measures.
- Trainable kernels learn parameters inside the feature map rather than classifier weights.
Model capacity¶
Expressivity depends on:
- embedding structure
- circuit depth
- entanglement pattern
- number of qubits
Tradeoffs:
- deeper circuits increase expressivity
- deeper circuits increase noise sensitivity
- more qubits increase dimensional capacity
General workflow¶
Common structure across models:
- prepare dataset
- encode data into quantum state
- evaluate circuit
- compute classical objective
- update parameters or classifier
Noise considerations¶
Finite-shot sampling introduces:
- variance in expectation values
- stochastic gradients
- sensitivity to circuit depth
Noise-aware evaluation allows study of:
- robustness of variational models
- stability of kernel matrices
- sensitivity of optimisation
Finite-shot execution approximates behaviour of real quantum hardware.
Quantum autoencoders¶
Quantum autoencoders learn a unitary compression map that moves irrelevant information into a designated trash subsystem while preserving the informative degrees of freedom in a smaller latent subsystem.
Let the input state be
\[
|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B,
\]
where:
- \(\mathcal{H}_A\) is the retained latent subsystem
- \(\mathcal{H}_B\) is the trash subsystem
The encoder aims to transform the state so that the trash subsystem is close to a fixed reference state, typically \(|0\rangle^{\otimes k}\).
Compression objective¶
Given encoder unitary \(U(\theta)\), the compressed state is
\[
|\psi'\rangle = U(\theta)\, |\psi\rangle.
\]
Compression succeeds when the trash subsystem factors as
\[
|\psi'\rangle \approx |\phi\rangle_A \otimes |0\rangle_B^{\otimes k}.
\]
This repository optimises the probability of measuring the trash subsystem in the all-zero state.
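A sketch of the trash-qubit objective, assuming PennyLane; the input-state preparation, encoder template, and qubit split are illustrative:

```python
# Quantum autoencoder cost sketch: maximise the probability that the trash
# qubits are measured in |0...0> after the encoder U(theta).
import pennylane as qml
from pennylane import numpy as np

n_total = 4            # total qubits
trash_wires = [2, 3]   # "trash" qubits to be driven towards |00>
dev = qml.device("default.qubit", wires=n_total)

@qml.qnode(dev)
def trash_probs(x, theta):
    qml.AngleEmbedding(x, wires=range(n_total))                # input state |psi(x)>
    qml.StronglyEntanglingLayers(theta, wires=range(n_total))  # encoder U(theta)
    return qml.probs(wires=trash_wires)                        # distribution over trash qubits

def cost(theta, X):
    # compression cost: 1 - P(trash = |00>), averaged over the dataset
    return sum(1.0 - trash_probs(x, theta)[0] for x in X) / len(X)

theta = np.array(np.random.uniform(0, 2 * np.pi, size=(2, n_total, 3)), requires_grad=True)
X = np.array([[0.1, 0.4, 0.7, 0.2], [0.9, 0.3, 0.5, 0.8]], requires_grad=False)
print(cost(theta, X))
```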
Reconstruction¶
After compression, a decoder can be defined by the adjoint unitary
\[
D = U^\dagger(\theta).
\]
Applying the decoder gives a reconstructed state
\[
|\psi_{\mathrm{rec}}\rangle = U^\dagger(\theta)\, \big( |\phi\rangle_A \otimes |0\rangle_B^{\otimes k} \big).
\]
The implementation reports both:
- compression fidelity on the trash subsystem
- reconstruction fidelity on the full state
References¶
- Schuld, M., Sinayskiy, I., & Petruccione, F. (2015). An introduction to quantum machine learning.
- Havlíček, V., et al. (2019). Supervised learning with quantum-enhanced feature spaces.
- Farhi, E., & Neven, H. (2018). Classification with quantum neural networks.
- Mitarai, K., et al. (2018). Quantum circuit learning.
- Cristianini, N., et al. (2002). On kernel-target alignment.
Author¶
Sid Richards
LinkedIn: https://www.linkedin.com/in/sid-richards-21374b30b/
GitHub: https://github.com/SidRichardsQuantum
License¶
MIT License — see LICENSE