Geometric Integration & Hamiltonian Monte Carlo
Standard numerical integrators (Euler, Runge-Kutta) are designed to minimize local truncation error in the Euclidean sense. They treat the state space as a flat vector space and ignore the global geometric invariants of the system. For Hamiltonian (energy-conserving) systems, standard integrators exhibit “energy drift”: a systematic increase or decrease in the total energy that eventually leads to catastrophic divergence. In Statistical Mechanics and Bayesian Inference (HMC), this is unacceptable: the proposal map must be volume-preserving and reversible for Detailed Balance to hold, and the drift destroys acceptance rates. We require Geometric Numerical Integration: methods that preserve the underlying structure (symplecticity, reversibility, momentum maps) by design. This note explores the mathematical engine of these methods, starting from the geometry of the cotangent bundle and ending at the stability of shadow Hamiltonians.
1. Hamiltonian Foundations: The Symplectic Manifold
A Hamiltonian system is defined on a Symplectic Manifold $(\mathcal{M}, \omega)$. $\mathcal{M}$ is typically the cotangent bundle $T^*Q$ with coordinates $(q, p)$. $\omega$ is a closed, non-degenerate 2-form, $\omega = \sum_i dq_i \wedge dp_i$.
The Vector Field: Given a Hamiltonian $H: \mathcal{M} \to \mathbb{R}$, the associated vector field $X_H$ is defined by:
$$\iota_{X_H}\,\omega = dH.$$
In local coordinates $(q, p)$, this yields the Hamilton equations:
$$\dot{q} = \frac{\partial H}{\partial p}, \qquad \dot{p} = -\frac{\partial H}{\partial q}.$$
Or in matrix form $\dot{z} = J\,\nabla H(z)$, where $z = (q, p)$ and $J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}$ is the standard symplectic matrix.
Poisson Brackets: The symplectic structure induces a Lie algebra on the space of smooth functions via the Poisson Bracket:
$$\{f, g\} = \sum_i \left( \frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i} \right).$$
A flow $\Phi_t$ is Symplectic if it preserves the 2-form: $\Phi_t^*\,\omega = \omega$. This is equivalent to the Jacobian condition $M^T J M = J$ with $M = \partial \Phi_t / \partial z$. Every Hamiltonian flow is symplectic. By Liouville’s theorem, it also preserves the phase-space volume form $\omega^n / n!$.
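To make the matrix form concrete, here is a small numerical check (a sketch; the harmonic-oscillator Hamiltonian and the values of $\omega$ and $t$ are arbitrary illustrative choices). For a quadratic Hamiltonian $H(z) = \tfrac{1}{2}z^T A z$ the exact flow is the linear map $M_t = \exp(t\,JA)$, and we can verify $M_t^T J M_t = J$ to machine precision:

```python
import numpy as np
from scipy.linalg import expm

# Standard symplectic matrix J for one degree of freedom, z = (q, p).
J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

# Quadratic Hamiltonian H(z) = 1/2 z^T A z (harmonic oscillator, illustrative omega).
omega = 1.3
A = np.diag([omega**2, 1.0])             # H = (omega^2 q^2 + p^2) / 2

t = 0.7
M = expm(t * J @ A)                      # exact Hamiltonian flow: z(t) = M z(0)

print(np.allclose(M.T @ J @ M, J))       # True: the flow preserves the symplectic form
print(np.isclose(np.linalg.det(M), 1.0)) # True: Liouville volume preservation
```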
2. Symplectic Integrators: Splitting and Composition
We cannot solve the flow $\varphi_t^H$ exactly for general $H$. Instead, we decompose the Hamiltonian into integrable parts, $H = H_1 + H_2$. If $H_1$ and $H_2$ can be solved exactly, their flows $\varphi_t^{H_1}$ and $\varphi_t^{H_2}$ are symplectic, and a composition of symplectic maps remains symplectic.
The Leapfrog (Verlet) Scheme: Defined by the symmetric splitting:
$$\Phi_h = \varphi^{V}_{h/2} \circ \varphi^{T}_{h} \circ \varphi^{V}_{h/2}.$$
Typically, $\varphi^{V}$ is the potential (momentum kick) update and $\varphi^{T}$ is the kinetic (position drift) update. Because each sub-map is a shear transform (a triangular Jacobian with unit diagonal), the Jacobian determinant is 1. Symplecticity is therefore guaranteed by construction for any step size $h$; stability, however, still limits how large $h$ can be (Section 5).
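As a sanity check (a minimal sketch, using a 1-D harmonic oscillator with $V(q) = \tfrac{1}{2}\omega^2 q^2$ as a stand-in for a general potential), we can assemble one Leapfrog step from its kick and drift shears and verify $M^T J M = J$ for any step size, including unstable ones:

```python
import numpy as np

def leapfrog_matrix(h, omega):
    """One kick-drift-kick Leapfrog step for H = p^2/2 + omega^2 q^2 / 2,
    assembled from unit-determinant shear matrices acting on (q, p)."""
    kick = np.array([[1.0, 0.0],
                     [-0.5 * h * omega**2, 1.0]])  # p <- p - (h/2) V'(q)
    drift = np.array([[1.0, h],
                      [0.0, 1.0]])                 # q <- q + h p
    return kick @ drift @ kick

J = np.array([[0.0, 1.0], [-1.0, 0.0]])
for h in [0.1, 1.0, 5.0]:   # even step sizes beyond the stability limit stay symplectic
    M = leapfrog_matrix(h, omega=1.0)
    print(h, np.allclose(M.T @ J @ M, J), np.isclose(np.linalg.det(M), 1.0))
```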
3. Lie Group Integrators: Munthe-Kaas Methods
Standard integrators perform the update $y_{n+1} = y_n + h\,f(y_n)$. This works in Euclidean space because $\mathbb{R}^n$ is closed under addition. However, if the state space is a Lie Group $G$ (like the rotation group $SO(3)$), the updated point will generally not lie in $G$. We need integrators that respect the group structure.
The Exponential Map: Let $\mathfrak{g}$ be the Lie algebra of $G$. The exponential map $\exp: \mathfrak{g} \to G$ maps elements of the tangent space at the identity to the group. The update rule becomes:
$$Y_{n+1} = Y_n \exp(h\,\Omega_n),$$
where $\Omega_n \in \mathfrak{g}$ is the “lifted” direction. Munthe-Kaas (RKMK) methods generalize Runge-Kutta to this setting by solving the differential equation in the Lie algebra and then mapping back. This ensures that a robot arm’s joints stay at unit length and rigid bodies do not “stretch” during simulation.
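A minimal sketch of a single-step Lie group method on $SO(3)$ (the constant body angular velocity and step size are illustrative assumptions): the Lie-Euler update multiplies by the exponential of an element of $\mathfrak{so}(3)$ and therefore stays orthogonal to machine precision, while the naive additive Euler update drifts off the group.

```python
import numpy as np
from scipy.linalg import expm

def hat(w):
    """Map an angular velocity vector in R^3 to the Lie algebra so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

w = np.array([0.3, -0.2, 0.9])   # constant body angular velocity (illustrative)
h, steps = 0.05, 2000

R_lie = np.eye(3)                # Lie-Euler:      R <- R exp(h * hat(w))
R_add = np.eye(3)                # additive Euler: R <- R + h * R hat(w)
step = expm(h * hat(w))
for _ in range(steps):
    R_lie = R_lie @ step
    R_add = R_add + h * R_add @ hat(w)

def orthogonality_error(R):
    return np.linalg.norm(R.T @ R - np.eye(3))

print("Lie-Euler drift:     ", orthogonality_error(R_lie))  # ~1e-14: stays on SO(3)
print("Additive Euler drift:", orthogonality_error(R_add))  # large: leaves the group
```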
4. Variational Integrators: The Lagrangian Path
While Hamiltonian mechanics focuses on phase space, Lagrangian mechanics focuses on the path $q(t)$ that makes the Action Integral stationary:
$$S[q] = \int_0^T L(q, \dot{q})\, dt.$$
Discrete Variational Principle: A Variational Integrator is derived by discretizing the action first:
$$S_d = \sum_{k=0}^{N-1} L_d(q_k, q_{k+1}), \qquad L_d(q_k, q_{k+1}) \approx \int_{t_k}^{t_{k+1}} L(q, \dot{q})\, dt,$$
where $L_d$ is the discrete Lagrangian. Hamilton’s Principle ($\delta S_d = 0$ with fixed endpoints) then leads to the Discrete Euler-Lagrange (DEL) equations:
$$D_2 L_d(q_{k-1}, q_k) + D_1 L_d(q_k, q_{k+1}) = 0.$$
The Major Result: Every variational integrator is Symplectic. By discretizing the variational principle rather than the differential equation, we inherit the conservation of the discrete symplectic form automatically. This provides an alternative route to constructing high-order integrators for orbital mechanics and celestial simulation.
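A minimal sketch of this construction (assuming a midpoint discrete Lagrangian and a 1-D harmonic oscillator, both chosen purely for illustration): we pick $L_d(q_k, q_{k+1}) = h\,L\!\bigl(\tfrac{q_k + q_{k+1}}{2}, \tfrac{q_{k+1} - q_k}{h}\bigr)$, write down its slot derivatives by hand, and march the DEL equations by solving for $q_{k+1}$ with a scalar root finder.

```python
from scipy.optimize import brentq

h, omega = 0.1, 1.0

# Midpoint discrete Lagrangian for L(q, v) = v^2/2 - omega^2 q^2 / 2:
#   L_d(a, b) = h * [ ((b - a)/h)^2 / 2 - omega^2 ((a + b)/2)^2 / 2 ]
# Slot derivatives, computed by hand for this L_d:
def D1(a, b):   # derivative of L_d with respect to its first argument
    return -(b - a) / h - (h * omega**2 / 4.0) * (a + b)

def D2(a, b):   # derivative of L_d with respect to its second argument
    return (b - a) / h - (h * omega**2 / 4.0) * (a + b)

# March the Discrete Euler-Lagrange equations
#   D2 L_d(q_{k-1}, q_k) + D1 L_d(q_k, q_{k+1}) = 0
# by solving for q_{k+1} at every step.
q = [1.0, 1.0 - 0.5 * h**2 * omega**2]   # q_0, q_1 consistent with zero initial velocity
for _ in range(500):
    qkm1, qk = q[-2], q[-1]
    residual = lambda qkp1: D2(qkm1, qk) + D1(qk, qkp1)
    q.append(brentq(residual, qk - 1.0, qk + 1.0))

print("max |q| over 500 steps:", max(abs(x) for x in q))   # ~1.0: no amplitude drift
```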
5. Backward Error Analysis & Shadow Hamiltonians
What equation does Leapfrog actually solve? We know it solves $H$ only approximately, but there exists a modified Hamiltonian $\tilde{H}$ such that the numerical points lie exactly on the flow lines of $\tilde{H}$. Let $\hat{A}$ be the Lie operator for the flow of $A$. The Leapfrog operator is:
$$\Phi_h = \exp\!\left(\tfrac{h}{2}\hat{V}\right)\exp\!\left(h\,\hat{T}\right)\exp\!\left(\tfrac{h}{2}\hat{V}\right).$$
Baker-Campbell-Hausdorff (BCH) Formula: The composition can be written as a single exponential $\exp\bigl(h\,\hat{\tilde{H}}\bigr)$. For symmetric splitting, the error terms are always even powers of $h$:
$$\tilde{H} = H + h^2 H_2 + h^4 H_4 + \cdots$$
Assuming a separable $H = T(p) + V(q)$, this simplifies to:
$$\tilde{H} = H + \frac{h^2}{12}\,\{T, \{T, V\}\} - \frac{h^2}{24}\,\{V, \{V, T\}\} + O(h^4).$$
Linear Stability Limits: $\tilde{H}$ is an asymptotic series that generally diverges. However, for small $h$, it can be truncated with exponentially small remainder. Stability is lost when the eigenvalues of the one-step Jacobian move away from the unit circle. For a harmonic oscillator with frequency $\omega$, the stability condition for Leapfrog is:
$$h\,\omega < 2.$$
If $h\,\omega > 2$, the shadow Hamiltonian ceases to exist as a bounded conserved quantity, and the numerical system enters an explosive, effectively chaotic regime.
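A quick numerical illustration of the threshold (a sketch on the unit-frequency oscillator; the two step sizes are chosen arbitrarily on either side of $h\omega = 2$):

```python
def leapfrog_energy(h, omega=1.0, steps=200):
    """Kick-drift-kick Leapfrog on H = p^2/2 + omega^2 q^2 / 2; returns final energy."""
    q, p = 1.0, 0.0
    for _ in range(steps):
        p -= 0.5 * h * omega**2 * q
        q += h * p
        p -= 0.5 * h * omega**2 * q
    return 0.5 * p**2 + 0.5 * omega**2 * q**2

print(leapfrog_energy(h=1.9))   # stays O(1): h*omega < 2, bounded oscillation
print(leapfrog_energy(h=2.1))   # astronomically large: h*omega > 2, exponential blow-up
```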
6. No-Go Theorems (Ge-Marsden)
Can we construct a symplectic integrator that conserves energy exactly? Theorem (Ge-Marsden 1988): If a fixed time-step integrator $\Phi_h$ is:
- Symplectic,
- Exactly energy preserving ($H \circ \Phi_h = H$),
- applied to a non-integrable system,
then $\Phi_h$ must be (a time reparametrization of) the exact flow, which is not computable in general. Meaning: you cannot have both exact geometry and exact energy for general systems. Symplectically integrated Hamiltonian dynamics stays “near” energy contours precisely because of symplecticity, making the preservation of volume more statistically robust than pinning energy itself.
7. HMC Implementation
We use Leapfrog in HMC. Steps:
- Sample momentum $p \sim \mathcal{N}(0, M)$, with $M$ the mass matrix (often the identity).
- Simulate Hamiltonian dynamics for $L$ steps using Leapfrog.
- Metropolis correction: accept the proposal $(q^*, p^*)$ with probability $\min\!\bigl(1, e^{-\Delta H}\bigr)$, where $\Delta H = H(q^*, p^*) - H(q, p)$. If Leapfrog conserved $H$ perfectly, the acceptance probability would be exactly 1. Since it exactly conserves only $\tilde{H}$, the rejection rate depends on $h$.
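A minimal sketch of these three steps (assuming an identity mass matrix and a standard 2-D Gaussian target, both purely illustrative):

```python
import numpy as np

def U_and_grad(q):
    """Negative log-density of a standard 2-D Gaussian target: U(q) = |q|^2 / 2."""
    return 0.5 * np.dot(q, q), q

def hmc_step(q, rng, h=0.1, L=20):
    U0, _ = U_and_grad(q)
    p = rng.standard_normal(q.shape)              # 1. sample momentum ~ N(0, I)
    H0 = U0 + 0.5 * np.dot(p, p)

    q_new = q.copy()
    _, g = U_and_grad(q_new)
    for _ in range(L):                            # 2. L Leapfrog steps
        p = p - 0.5 * h * g
        q_new = q_new + h * p
        _, g = U_and_grad(q_new)
        p = p - 0.5 * h * g

    U1, _ = U_and_grad(q_new)
    H1 = U1 + 0.5 * np.dot(p, p)
    if rng.random() < np.exp(min(0.0, H0 - H1)):  # 3. Metropolis accept/reject
        return q_new
    return q

rng = np.random.default_rng(1)
q = np.zeros(2)
samples = []
for _ in range(2000):
    q = hmc_step(q, rng)
    samples.append(q)
print(np.cov(np.array(samples).T))   # should approach the identity covariance
```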
We use Leapfrog and RK4 to simulate the Kepler Problem (an inverse-square central force). The Hamiltonian is $H(q, p) = \frac{1}{2}\lVert p\rVert^2 - \frac{1}{\lVert q\rVert}$. Symplectic methods maintain a stable (though numerically precessing) elliptical orbit. Non-symplectic methods (like RK4) slowly drain or add energy, causing the planet to spiral into the sun or escape the system.
```python
import numpy as np
import matplotlib.pyplot as plt

def kepler_potential(q):
    return -1.0 / np.linalg.norm(q)

def kepler_grad(q):
    return q / np.linalg.norm(q)**3

def leapfrog_kepler(steps=2000, h=0.01):
    q = np.array([1.0, 0.0])
    p = np.array([0.0, 0.5])  # Low velocity for elliptical orbit
    path = []
    energy = []
    for _ in range(steps):
        p -= (h/2) * kepler_grad(q)
        q += h * p
        p -= (h/2) * kepler_grad(q)
        path.append(q.copy())
        energy.append(0.5*np.dot(p, p) + kepler_potential(q))
    return np.array(path), np.array(energy)

def rk4_kepler(steps=2000, h=0.01):
    z = np.array([1.0, 0.0, 0.0, 0.5])  # q1, q2, p1, p2
    path = []
    energy = []
    def f(state):
        q = state[:2]
        p = state[2:]
        return np.concatenate([p, -kepler_grad(q)])
    for _ in range(steps):
        k1 = f(z)
        k2 = f(z + h/2 * k1)
        k3 = f(z + h/2 * k2)
        k4 = f(z + h * k3)
        z += (h/6) * (k1 + 2*k2 + 2*k3 + k4)
        path.append(z[:2].copy())
        p = z[2:]
        energy.append(0.5*np.dot(p, p) + kepler_potential(z[:2]))
    return np.array(path), np.array(energy)

def plot_kepler_results():
    path_lf, e_lf = leapfrog_kepler()
    path_rk, e_rk = rk4_kepler()
    plt.figure(figsize=(12, 5))
    # Orbits
    plt.subplot(1, 2, 1)
    plt.plot(path_lf[:,0], path_lf[:,1], label='Leapfrog (Symplectic)')
    plt.plot(path_rk[:,0], path_rk[:,1], 'r--', label='RK4 (Energy Drift)')
    plt.scatter([0], [0], color='orange', s=100, label='Sun')
    plt.title('Kepler Orbits (2000 steps)')
    plt.legend()
    # Energy error
    plt.subplot(1, 2, 2)
    plt.plot(e_lf - e_lf[0], label='Leapfrog Error')
    plt.plot(e_rk - e_rk[0], label='RK4 Error')
    plt.title('Energy Fluctuation vs Drift')
    plt.ylabel('H - H0')
    plt.legend()
    # plt.show()
```

8. Constrained Dynamics: The RATTLE Algorithm
Often, we simulate systems with holonomic constraints $g(q) = 0$ (e.g., a molecule with fixed bond lengths). A standard symplectic update will violate these constraints within a few steps. We must modify the Hamiltonian to include Lagrange multipliers $\lambda$:
$$H_c(q, p) = H(q, p) + \lambda^T g(q).$$
RATTLE is the symplectic discretization of constrained Hamiltonian systems. It involves solving for $\lambda$ such that both the position constraint $g(q_{n+1}) = 0$ and the hidden velocity constraint $\nabla g(q_{n+1})^T M^{-1} p_{n+1} = 0$ hold. The preservation of the Symplectic Form on the constrained manifold is non-trivial but guaranteed by RATTLE’s specific staggering of the multiplier updates.
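A minimal sketch of one RATTLE step (assuming unit mass, a single scalar constraint $g(q) = \tfrac{1}{2}(\lVert q\rVert^2 - 1)$, i.e. a planar pendulum of unit length, gravity along the second coordinate, and a bracketing root finder for the position multiplier; all of these choices are illustrative):

```python
import numpy as np
from scipy.optimize import brentq

def grad_U(q):
    """Gravity potential U(q) = q_y (unit mass, unit g)."""
    return np.array([0.0, 1.0])

def g_con(q):
    """Holonomic constraint g(q) = (|q|^2 - 1)/2 = 0: stay on the unit circle."""
    return 0.5 * (np.dot(q, q) - 1.0)

def grad_g(q):
    return q

def rattle_step(q, p, h):
    # Position half: choose lambda_r so that the UPDATED position satisfies g = 0.
    def residual(lam):
        p_half = p - 0.5 * h * (grad_U(q) + lam * grad_g(q))
        return g_con(q + h * p_half)
    lam_r = brentq(residual, -100.0, 100.0)
    p_half = p - 0.5 * h * (grad_U(q) + lam_r * grad_g(q))
    q_new = q + h * p_half

    # Momentum half: choose lambda_v (a linear solve) so that the new momentum
    # is tangent to the constraint manifold: grad_g(q_new) . p_new = 0.
    G = grad_g(q_new)
    lam_v = np.dot(G, p_half - 0.5 * h * grad_U(q_new)) / (0.5 * h * np.dot(G, G))
    p_new = p_half - 0.5 * h * (grad_U(q_new) + lam_v * grad_g(q_new))
    return q_new, p_new

q, p = np.array([1.0, 0.0]), np.array([0.0, 0.0])
for _ in range(5000):
    q, p = rattle_step(q, p, h=0.01)
print(abs(np.linalg.norm(q) - 1.0), abs(np.dot(grad_g(q), p)))   # both ~0: constraints hold
```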
9. The Adaptive Time-Step Paradox
In general numerics, we use an adaptive $h$ to speed up simulation in smooth regions. However, adaptive symplectic integration is impossible with naive step-size changes. If $h$ depends on the state $z$, the resulting map is generally NOT symplectic: energy drift reappears as soon as $h$ begins to vary, because the state-dependent shadow energy creates a non-Hamiltonian feedback loop. To solve this, we use the Poincaré Transformation: instead of changing $h$, we transform the time variable via $dt = g(q)\,d\tau$ (with a monitor function $g(q) > 0$) and solve the modified Hamiltonian
$$\bar{H}(q, p) = g(q)\,\bigl(H(q, p) - H_0\bigr).$$
This allows the use of a fixed step size in $\tau$ while achieving variable resolution in $t$, preserving the geometric invariants of the system.
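A short check of why this works (a sketch of the standard argument; the monitor function $g(q) > 0$ is arbitrary, e.g. chosen small near close encounters). Hamilton’s equations for $\bar{H}$ in the fictive time $\tau$ read:
$$\frac{dq}{d\tau} = \frac{\partial \bar{H}}{\partial p} = g(q)\,\frac{\partial H}{\partial p}, \qquad \frac{dp}{d\tau} = -\frac{\partial \bar{H}}{\partial q} = -g(q)\,\frac{\partial H}{\partial q} - \bigl(H(q, p) - H_0\bigr)\,\nabla g(q).$$
On the trajectory of interest $H \equiv H_0$, so the second term vanishes and both equations reduce to the original Hamilton equations rescaled by $g(q)$, i.e. $dt/d\tau = g(q)$.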
10. Neural Hamiltonian Flows
In generative modeling, we want to map a simple distribution (Gaussian) to a complex one (images) using a reversible, volume-preserving map. Hamiltonian Neural Networks (HNN) and Symplectic Adjoint Networks parametrize the Hamiltonian as a neural network. The flow is then implemented using a Leapfrog integrator layer. Because Leapfrog is exactly symplectic, the network is:
- Guaranteed Reversible: We can reconstruct the latent noise from the image perfectly.
- Volume Preserving: The Jacobian determinant is 1, simplifying the Change of Variables formula in Normalizing Flows.
- Physics-Informed: The network learns the underlying conserved quantities (energy) of the dynamical system it is modeling.
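A minimal sketch of such a Leapfrog layer and its exact inverse (the gradient below is a fixed stand-in for the autodiff gradient of a learned potential $V_\theta$; the double-well choice, step size, and depth are illustrative):

```python
import numpy as np

def grad_V(q):
    """Stand-in for the autodiff gradient of a learned potential V_theta (hypothetical)."""
    return q**3 - q                     # gradient of a double-well, purely illustrative

def leapfrog_layer(q, p, h=0.1, L=8):
    """Forward pass: L kick-drift-kick Leapfrog steps (unit Jacobian determinant)."""
    for _ in range(L):
        p = p - 0.5 * h * grad_V(q)
        q = q + h * p
        p = p - 0.5 * h * grad_V(q)
    return q, p

def leapfrog_layer_inverse(q, p, h=0.1, L=8):
    """Exact inverse: flip the momentum, run the same layer, flip back."""
    q, p = leapfrog_layer(q, -p, h, L)
    return q, -p

rng = np.random.default_rng(0)
q0, p0 = rng.standard_normal(5), rng.standard_normal(5)
q1, p1 = leapfrog_layer(q0, p0)
q2, p2 = leapfrog_layer_inverse(q1, p1)
print(np.max(np.abs(q2 - q0)), np.max(np.abs(p2 - p0)))   # ~1e-15: numerically exact inversion
```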
11. Conclusion: The Edge of Stability
The shadow Hamiltonian $\tilde{H}$ is an asymptotic series. For infinitesimally small $h$, it describes the numerical flow essentially perfectly. However, in practical simulation, we use the largest $h$ possible. As $h$ increases, the error terms grow. At a critical threshold (the resonance limit), the series for $\tilde{H}$ diverges. The discrete map undergoes a Stochastic Transition: the orbits leave the invariant KAM tori and begin to fill the phase space volume. This is the “chaos” regime where HMC becomes inefficient, as the proposal distribution no longer respects the geometry of the target density. The art of geometric integration is thus a balancing act: operating at the very edge of the shadow Hamiltonian’s convergence to maximize sampling speed without falling into the abyss of non-conservative drift.
Historical Timeline
| Year | Event | Significance |
|---|---|---|
| 1833 | William Rowan Hamilton | Formulates Hamiltonian mechanics. |
| 1956 | De Vogelaere | First explicit symplectic integrator (Leapfrog). |
| 1967 | Loup Verlet | Rediscovers Leapfrog for Molecular Dynamics. |
| 1987 | Duane et al. | Hybrid Monte Carlo (HMC). |
| 1988 | Ge & Marsden | No-Go Theorem for exact energy conservation. |
| 2006 | Hairer, Lubich, Wanner | “Geometric Numerical Integration” textbook. |
| 2014 | Hoffman & Gelman | No-U-Turn Sampler (NUTS). |
| 2019 | Greydanus et al. | Hamiltonian Neural Networks. |
Appendix A: Proof of Liouville’s Theorem (Trace-Determinant)
Theorem: If $\dot{z} = f(z)$ and $\nabla \cdot f = 0$, then phase-space volume is conserved. Proof: Let $V(t) = \int_{\Phi_t(\Omega_0)} dz = \int_{\Omega_0} \det M(t)\, dz$ be the volume of a set $\Omega_0$ transported by the flow, where $M(t) = \partial \Phi_t/\partial z$ is the Jacobian of the flow.
Jacobi’s Formula: $\frac{d}{dt}\det M = \det M \cdot \operatorname{tr}\!\left(M^{-1}\frac{dM}{dt}\right)$. Here $\frac{dM}{dt} = \bigl(\nabla f \circ \Phi_t\bigr)\,M$ (chain rule). So $\operatorname{tr}\!\left(M^{-1}\frac{dM}{dt}\right) = \operatorname{tr}(\nabla f) = \nabla \cdot f$, and therefore $\frac{d}{dt}\det M = (\nabla \cdot f)\det M$. For Hamiltonian systems, $f = J\,\nabla H$, so $\nabla \cdot f = \sum_i\left(\frac{\partial^2 H}{\partial q_i\,\partial p_i} - \frac{\partial^2 H}{\partial p_i\,\partial q_i}\right) = 0$. Thus $\frac{d}{dt}\det M = 0$ and $\det M(t) = \det M(0) = 1$. Since $V(t) = \int_{\Omega_0} \det M\, dz$, volume is conserved forever. $\square$
Appendix B: Symplectic Euler
Leapfrog is Order 2. There is a simpler Order 1 symplectic method, Symplectic Euler (Semi-Implicit):
- $p_{n+1} = p_n - h\,\nabla V(q_n)$
- $q_{n+1} = q_n + h\,p_{n+1}$ (use the NEW momentum).
Jacobian (one degree of freedom):
$$M = \begin{pmatrix} \partial q_{n+1}/\partial q_n & \partial q_{n+1}/\partial p_n \\ \partial p_{n+1}/\partial q_n & \partial p_{n+1}/\partial p_n \end{pmatrix} = \begin{pmatrix} 1 - h^2 V''(q_n) & h \\ -h\,V''(q_n) & 1 \end{pmatrix}.$$
Determinant: $\bigl(1 - h^2 V''\bigr)\cdot 1 - h \cdot \bigl(-h\,V''\bigr) = 1$. It is symplectic! But it is not symmetric (Time Reversible). Symmetrization of Symplectic Euler gives Leapfrog: composing a half-step of the method with a half-step of its adjoint, $\Phi^{\mathrm{LF}}_h = \Phi^{\mathrm{SE}*}_{h/2} \circ \Phi^{\mathrm{SE}}_{h/2}$.
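A quick comparison (a sketch on the unit harmonic oscillator; parameters arbitrary) of Symplectic Euler against the fully explicit Euler method, which uses the OLD momentum in both updates and therefore spirals outward:

```python
def energy(q, p):
    return 0.5 * p**2 + 0.5 * q**2             # unit harmonic oscillator, V(q) = q^2/2

def symplectic_euler(q, p, h):
    p = p - h * q                               # kick with grad V(q) = q
    q = q + h * p                               # drift using the NEW momentum
    return q, p

def explicit_euler(q, p, h):
    return q + h * p, p - h * q                 # drift and kick both use OLD values

h, steps = 0.05, 5000
qs, ps = 1.0, 0.0
qe, pe = 1.0, 0.0
for _ in range(steps):
    qs, ps = symplectic_euler(qs, ps, h)
    qe, pe = explicit_euler(qe, pe, h)

print("Symplectic Euler energy:", energy(qs, ps))   # oscillates near 0.5: no drift
print("Explicit Euler energy:  ", energy(qe, pe))   # ~10^5: exponential energy growth
```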
Appendix C: KAM Theory and Long-Term Stability
One of the deepest questions in numerical analysis is: how long does the shadow-Hamiltonian stability last? The Kolmogorov-Arnold-Moser (KAM) theorem states that for small perturbations of integrable systems, most invariant tori (quasi-periodic orbits) survive. Symplectic integrators act as small perturbations of the Hamiltonian $H$. Backward error analysis shows that the discrete trajectory stays close to the flow of $\tilde{H}$, with bounded energy error, for exponentially long times $t = O(e^{c/h})$. This means that while the integrator is not solving the exact system, it is solving a nearby stable reality for astronomical timescales. This is why symplectic integrators are used to calculate the stability of the Solar System over billions of years; a non-symplectic method would have crumbled in a few millennia.
Appendix D: Exterior Calculus and Symplectic Forms
Symplecticity is fundamentally an area-preserving property in the language of Exterior Calculus. Let $\omega = \sum_i dq_i \wedge dp_i$ be the symplectic 2-form. The integral of $\omega$ over a surface $S$ in phase space gives the total oriented area of $S$. A map $\Phi$ is symplectic if the pulled-back form equals $\omega$: $\Phi^*\omega = \omega$. Applying Stokes’ Theorem:
$$\int_S \omega = \int_S \sum_i dq_i \wedge dp_i = -\oint_{\partial S} \sum_i p_i\, dq_i,$$
so a symplectic map equivalently preserves the loop action $\oint p \cdot dq$ around the boundary of any surface.
This geometric interpretation reveals that HMC isn’t just “preserving energy”; it is preserving the very “fabric” of the state space. This lack of compression/expansion in phase space is what prevents the Markov Chain from becoming biased toward high-density regions without proper accounting.
Appendix E: Generating Functions and Hamilton-Jacobi Theory
Every symplectic map can be locally represented by a Generating Function, e.g. a type-2 function $S(q, P, t)$. The relations $p = \frac{\partial S}{\partial q}$ and $Q = \frac{\partial S}{\partial P}$ implicitly define the map $(q, p) \mapsto (Q, P)$. This is the bridge between the Hamiltonian and Lagrangian perspectives. If $S$ satisfies the Hamilton-Jacobi Equation:
$$\frac{\partial S}{\partial t} + H\!\left(q, \frac{\partial S}{\partial q}\right) = 0,$$
then $S$ defines the exact flow of the system. Symplectic integrators can be viewed as maps derived from an approximate generating function $S_h$. This ensures that even though $S_h$ is not exact, the resulting discretization satisfies the consistency conditions required by the symplectic group.
Appendix F: Stochastic Hamiltonian Dynamics
In Bayesian inference, we often consider the Stochastic Differential Equation (SDE) of underdamped Langevin dynamics (unit mass for simplicity):
$$dq = p\,dt, \qquad dp = -\nabla U(q)\,dt - \gamma\,p\,dt + \sqrt{2\gamma}\,dW_t.$$
This is Hamiltonian Langevin Dynamics. The geometric properties change slightly: we no longer preserve volume exactly (due to the friction term), but we preserve the Gibbs Measure $\pi(q, p) \propto e^{-H(q, p)}$ with $H = U(q) + \tfrac{1}{2}\lVert p\rVert^2$. Correctly discretizing this system requires a Stochastic Symplectic Integrator. Splitting methods remain the gold standard, e.g. the BAOAB ordering
$$\Phi_h = \Phi^{B}_{h/2} \circ \Phi^{A}_{h/2} \circ \Phi^{O}_{h} \circ \Phi^{A}_{h/2} \circ \Phi^{B}_{h/2},$$
where $B$ is the momentum kick, $A$ the position drift, and $O$ the Ornstein-Uhlenbeck (friction plus noise) update.
This ensures that the “Hamiltonian part” is perfectly symplectic, while the “Langevin part” (OU process) is solved exactly. This hybrid approach is the backbone of modern large-scale posterior sampling in deep learning.
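A minimal sketch of such a splitting (the BAOAB ordering shown above, with unit mass, unit temperature, and a standard Gaussian potential, all chosen purely for illustration):

```python
import numpy as np

def grad_U(q):
    return q                                   # U(q) = |q|^2 / 2: standard Gaussian target

def baoab_step(q, p, h, gamma, rng):
    p = p - 0.5 * h * grad_U(q)                # B: half momentum kick
    q = q + 0.5 * h * p                        # A: half position drift
    c = np.exp(-gamma * h)                     # O: exact Ornstein-Uhlenbeck update
    p = c * p + np.sqrt(1.0 - c**2) * rng.standard_normal(p.shape)
    q = q + 0.5 * h * p                        # A: half position drift
    p = p - 0.5 * h * grad_U(q)                # B: half momentum kick
    return q, p

rng = np.random.default_rng(0)
q, p = np.zeros(3), np.zeros(3)
samples = []
for _ in range(20000):
    q, p = baoab_step(q, p, h=0.2, gamma=1.0, rng=rng)
    samples.append(q)
print(np.var(np.array(samples), axis=0))       # ~1 in each coordinate: Gibbs marginal of U
```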
Appendix G: Geodesic Integration on Riemannian Manifolds
In General Relativity and Robotics, we often integrate on manifolds where the metric is state-dependent. The kinetic energy becomes $T(q, p) = \tfrac{1}{2}\,p^T G(q)^{-1} p$, where $G(q)$ is the Riemannian metric. The Hamiltonian is no longer separable, and standard Leapfrog fails because its updates become implicit. Geodesic Leapfrog: we must use Implicit methods or Geodesic Splitting. Specifically, we can split the metric itself (if it has symmetry) or solve the implicit half-step equations for $p_{n+1/2}$ and $q_{n+1}$ by fixed-point iteration (the generalized Leapfrog). The result is a trajectory that “rolls” along the curvature of the space without flying off the tangent plane. This is essential for simulating the precession of planets around black holes, where the “straight lines” are themselves curved.
Appendix H: Symplectic Symmetry and Reversible Debugging
Symplectic integrators are Time Reversible. If you negate the momentum ($p \to -p$) and run the integrator for the same number of steps, you should return exactly to the initial state (modulo floating-point error). Bit-Reversible Computing: In large-scale physics simulations, we use Integer Arithmetic or fixed-point representations to make the map exactly reversible even at the bit level. This allows for Reversible Debugging: you can “rewind” a simulation from a crash state to find the exact moment a singularity occurred without storing the entire checkpoint history. This bit-level symmetry is a direct digital encoding of the integrator’s time symmetry, $\Phi_{-h} = \Phi_h^{-1}$.
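A toy sketch of bit-exact reversibility (the fixed-point scale, step size, and harmonic force are all illustrative choices): every sub-step adds an integer increment computed from the other variable only, so running the shears backwards recovers the initial state bit for bit.

```python
SCALE = 2**32            # fixed-point scale: positions/momenta stored as integers
h, omega2 = 0.01, 1.0    # step size and spring constant (illustrative)

def half_kick_increment(q_int):
    return round(0.5 * h * omega2 * q_int)     # integer increment, depends on q only

def drift_increment(p_int):
    return round(h * p_int)                    # integer increment, depends on p only

def step(q_int, p_int):
    p_int -= half_kick_increment(q_int)
    q_int += drift_increment(p_int)
    p_int -= half_kick_increment(q_int)
    return q_int, p_int

def step_back(q_int, p_int):
    # Undo the three shears in reverse order; each increment is recomputed from
    # exactly the same integer argument, so the cancellation is bit-exact.
    p_int += half_kick_increment(q_int)
    q_int -= drift_increment(p_int)
    p_int += half_kick_increment(q_int)
    return q_int, p_int

q0, p0 = int(1.0 * SCALE), 0
q, p = q0, p0
for _ in range(10000):
    q, p = step(q, p)
for _ in range(10000):
    q, p = step_back(q, p)
print(q == q0 and p == p0)   # True: bit-exact return to the initial state
```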
Appendix I: The Symplectic Group
The set of all symplectic matrices forms a Lie group, $Sp(2n, \mathbb{R})$. A matrix $M$ is in the group if $M^T J M = J$. Properties:
- Determinant: $\det M = +1$ (orientation and volume preservation).
- Spectrum: If $\lambda$ is an eigenvalue, then $1/\lambda$, $\bar{\lambda}$, and $1/\bar{\lambda}$ are also eigenvalues. This “symplectic symmetry” in the spectrum is what prevents complex eigenvalues from drifting into the unstable region without a pair-wise collision at the unit circle.
- Dimension: The dimension of the group is $n(2n+1)$. Geometric integration can be viewed as the task of ensuring that our numerical update remains on the manifold $Sp(2n, \mathbb{R})$ despite the errors of discretization.
Appendix J: Multi-symplectic Integrators for PDEs
Can we extend these ideas to infinite-dimensional systems like the Wave Equation or Schrödinger Equation? Hamiltonian PDEs often have a “Multi-symplectic” structure, where there is a symplectic form for both time and space.
Standard symplectic integrators only preserve the temporal structure. Multi-symplectic integrators (like the Preissman box scheme) preserve a local energy-momentum conservation law for every cell in the spacetime grid. This prevents the accumulation of unphysical spatial oscillations (aliasing) and ensures that wave packets preserve their shape over long distance travel.
Appendix K: Connection to Optimal Transport
There is a profound link between Hamiltonian flows and Optimal Transport (OT). Brenier’s Theorem states that the optimal map pushing one distribution to another is the gradient of a convex function, $T = \nabla\varphi$. In the dynamic formulation of OT (Benamou-Brenier), the optimal path is a geodesic in the space of probability measures endowed with the Wasserstein metric. Geometric integration of these “Wasserstein Flows” requires preserving the displacement convexity of the energy. Symplectic integrators applied to the discretized OT problem ensure that the resulting transport maps are physically consistent and do not “leak” probability mass into forbidden regions of the state space.
Appendix L: Numerical Jacobians and Automatic Differentiation
Implementing symplectic integrators like Leapfrog or RATTLE requires the gradient $\nabla V(q)$. In modern machine learning frameworks (JAX, PyTorch), we use Automatic Differentiation (AD) to compute these gradients exactly to machine precision. Using Finite Differences (e.g., $\partial_i V(q) \approx \frac{V(q + \epsilon e_i) - V(q - \epsilon e_i)}{2\epsilon}$) introduces errors that break the symplectic symmetry: even a tiny gradient error acts as a non-conservative force, destroying the long-term stability guarantees of the shadow Hamiltonian. AD, by ensuring that our numerical vector field is an exact gradient field by construction, is the single most important development in making Geometric Integration practical for complex, high-dimensional models.
Appendix M: Glossary of Geometric Terms
- Cotangent Bundle ($T^*Q$): The space of all possible positions and momenta.
- Holonomic Constraint: A coordinate-based restriction (e.g., $g(q) = 0$ for a fixed bond length).
- Poisson Bracket ($\{f, g\}$): The Lie algebra operation on state-space functions.
- Shadow Hamiltonian ($\tilde{H}$): The exact energy of the discrete system.
- Symplectic 2-form ($\omega = \sum_i dq_i \wedge dp_i$): The geometric object that measures phase space area.
- Symplectic Manifold: A smooth manifold equipped with a closed non-degenerate 2-form.
References
1. Hairer, E., Lubich, C., & Wanner, G. (2006). “Geometric Numerical Integration”. The definitive reference for structure-preserving algorithms.
2. Leimkuhler, B., & Reich, S. (2004). “Simulating Hamiltonian Dynamics”. Bridge between molecular dynamics and numerical analysis.
3. Neal, R. M. (2011). “MCMC using Hamiltonian dynamics”. The foundational HMC paper for the machine learning community.
4. Marsden, J. E., & West, M. (2001). “Discrete mechanics and variational integrators”. The seminal paper on deriving integrators from the discrete action principle.
5. Munthe-Kaas, H. (1998). “High order Runge-Kutta methods on manifolds”. Introduced the RKMK methods for Lie group integration.
6. Bridges, T. J., & Reich, S. (2001). “Multi-symplectic integrators: numerical methods for Hamiltonian PDEs”. The primary source for multi-symplectic geometry in wave systems.
7. Ambrosio, L., Gigli, N., & Savaré, G. (2008). “Gradient Flows in Metric Spaces”. Deep connection between Wasserstein geometry and optimal transport.