# The uncertainty principle and virtual particles

October 23, 2020. According to physics folklore, virtual particles are objects which can flit in and out of existence, governed by Heisenberg’s uncertainty principle. But what is uncertainty in time? And what are virtual particles? I offer a semi-rigorous connection using the Mandelstam-Tamm and Robertson-Schrödinger uncertainty relations.

#### 1. Introduction to uncertainty

Prerequisites: quantum mechanics.

When Heisenberg layed down his uncertainty principle in 1927, it came in two flavours: position-momentum and energy-time. Schematically, these are usually written

$\Delta x \cdot \Delta p \gtrsim \hbar, \quad \Delta E \cdot \Delta t \gtrsim \hbar.$

What we mean by uncertainty is a bit uncertain. Often, the first semi-rigorous context we encounter these notions is the wavepacket, i.e. a well-localized wavefunction in space $|\psi(x)\rangle$. This has some spread of positions $\Delta x$, but we can Fourier transform to momentum space, where it will have some spread $\Delta p$. The uncertainty principle is then a mathematical result about the Fourier transform, and in fact, it is true for any wavefunction, not merely the well-localised ones.

We can derive the energy-time form for a wavepacket using the following hack. Let $\psi(x, t)$ denote the profile of the wavepacket. We assume it’s well-localised around some wavenumber $k_0$, so it has group velocity,

$v = \frac{\partial \omega}{\partial k}\bigg|_{k_0} = \frac{\hbar \partial \omega}{\hbar\partial k}\bigg|_{k_0} = \frac{\partial E}{\partial p}\bigg|_{k_0},$

using $E = \hbar \omega$ and $p = \hbar \omega$. If we linearize around the central wavenumber, we have

$\omega(k) \approx \omega(k_0) + v(k - k_0) \quad \Longrightarrow \quad v = \frac{\Delta \omega}{\Delta k} = \frac{\Delta E}{\Delta p},$

where now $\Delta p$ is interpreted as the momentum spread of the wavepacket and $\Delta E$ is the corresponding energy spread. Let $\Delta t$ be the time taken for the wavepacket to travel the distance of its own spread, $\Delta x$, then $v\Delta t = \Delta x$. The position-momentum uncertainty principle then gives

$\Delta x \cdot \Delta p = v\Delta t \cdot \Delta p \approx \frac{\Delta E}{\Delta p} \cdot \Delta t \cdot \Delta p = \Delta E \cdot \Delta t \gtrsim \hbar.$

So we get the energy-time form of the uncertainty principle!

But it’s very different from the position-momentum form. Position and momentum are bases to measure the wavepacket in, i.e. they correspond to legitimate quantum-mechanical operators. Energy is also a legitimate operator; we call it the Hamiltonian. But time is not an operator! Here, it appears very specifically as the time the wavepacket takes to move a whole $\Delta x$ away. Physically, this is the point at which the position changes distinguishably. Prior to this time, there is large overlap between the wavepackets at the two different times. This is the seed of a rigorous version of the energy-time uncertainty principle which doesn’t depend on having a localized wavepacket.

#### 2. MTML bound

In 1945, Mandelstam and Tamm gave the first rigorous version of the energy-time uncertainty relation. It’s important, because it tells us what it really means! I’ll present the nice proof from Levitin and Toffoli (2009). Consider some initial state $|\psi_0\rangle$ in a Hilbert space, which evolves according to a Hamiltonian $H$ with eigenstates $|E_n\rangle$. Averages with respect to this state are denoted $\langle A\rangle_0 = \langle\psi_0|A|\psi_0\rangle$. We can expand our state $|\psi_0\rangle$ and its time-evolved counterpart in energy eigenstates as

$|\psi_0\rangle = \sum_n C_n |E_n\rangle, \quad |\psi_t\rangle = \sum_n C_n e^{-i E_n t/\hbar} |E_n\rangle,$

with normalization giving $\sum_n |C_n|^2 = 1$. Define the overlap

$O(t) = \langle \psi_0 | \psi(t)\rangle = \sum_n |C_n|^2 e^{-i E_n t/\hbar}.$

Motivated by the wavepacket example, we expect an energy-time relation between the energy uncertainty $\Delta E = \sqrt{\langle H^2\rangle_0 - \langle H\rangle_0^2}$, and the time $\tau$ taken to evolve to a state with no overlap, $O(\tau) = 0$, i.e. a state which is perfectly distinguishable. First, we write out the overlap squared:

\begin{align*} |O(t)|^2 & = \sum_{mn}|C_nC_m|^2 e^{i(E_m - E_n)t/\hbar} =\sum_{mn}|C_nC_m|^2 \cos\left[\frac{(E_m - E_n)t}{\hbar}\right]. \end{align*}

The last equality follows because the overlap squared is real, so we can take the real part of each term without prejudice. To proceed, we need to invoke a slightly magical trigonometric inequality,

$\cos x \geq 1 - \frac{4}{\pi^2}x \sin x - \frac{2}{\pi^2}x^2.$

Applied to the overlap squared, this yields

\begin{align*} |O(t)|^2 \geq 1 - &\frac{4}{\pi^2}\sum_{mn}|C_nC_m|^2 \sin\left[\frac{(E_m - E_n)t}{\hbar}\right]\frac{(E_m - E_n)t}{\hbar} \\ - & \frac{2}{\pi^2}\sum_{mn} |C_nC_m|^2\left[\frac{(E_m - E_n)t}{\hbar}\right]^2, \end{align*}

where the first term uses $\sum_{mn}|C_mC_n|^2 = 1$. We can simplify this by observing that the second term is

$\sum_{mn}|C_nC_m|^2 \sin\left[\frac{(E_m - E_n)t}{\hbar}\right]\frac{(E_m - E_n)t}{\hbar} = -t\frac{d}{dt}|O(t)|^2,$

which will vanish when $O(\tau) = 0$. The last term is related to the energy spread, since

\begin{align*} \sum_{mn} |C_nC_m|^2(E_m - E_n)^2 & =\sum_{mn} |C_nC_m|^2(E_m^2 + E_n^2 - 2E_m E_n) \\ & = 2\langle H^2\rangle - 2\langle H\rangle^2 = 2\Delta E. \end{align*}

Thus, we find that when $O(\tau) = 0$,

$0 \geq 1 - \frac{1}{\pi^2}\left(\frac{2t \Delta E}{\hbar}\right)^2.$

Rearranging gives the Mandelstam-Tamm form of the energy-time uncertainty principle:

$\Delta E \cdot \tau \geq \frac{\pi \hbar}{2}.$

You might wonder if we could have manipulated things a different way and obtained a different result. In fact, we can! We will need another magical inequality,

$\cos x \geq 1 - \frac{2}{\pi}(x + \sin x).$

Now just take the real part of $O(t)$:

\begin{align*} \Re[O(t)] & = \sum_n |C_n|^2 \cos(E_nt/\hbar) \\ & \geq \sum_n |C_n|^2 \left[1 - \frac{2}{\pi}\left(\frac{E_nt}{\hbar}+\sin\left(\frac{E_nt}{\hbar}\right)\right)\right]\\ & = 1 - \frac{2 Et}{\pi \hbar} + \Im[O(t)], \end{align*}

where $E = \langle H\rangle$ is the average energy. As before, when the state evolves to an orthogonal state with $O(\tau) = 0$, the real and imaginary part vanish, and we are left with the uncertainty relation of Margolus and Levitin (1997):

$E \cdot \tau \geq \frac{\pi \hbar}{2}.$

Both bounds are correct, so we use whichever is most constraining. We’ll call this combined bound the Margolus-Levitin-Mandelstam-Tamm (MLMT) uncertainty relation:

$\tau \geq \text{max}\left\{\frac{\pi\hbar}{2\Delta E}, \frac{\pi\hbar}{2E}\right\}.$

We will mainly be interested in the Mandelstam-Tamm bound, but Margolus-Levitin was so close I couldn’t resist the temptation to derive it as well.

#### 3. Robertson-Schrödinger uncertainty

Let’s return to the position-momentum version for a moment. As before, for a state $|\psi\rangle$, we define the spread as the standard deviation of expectations measured in the state:

$\Delta x = \sqrt{\langle x^2\rangle-\langle x\rangle^2}, \quad \Delta p = \sqrt{\langle p^2\rangle -\langle p\rangle^2}.$

For this definition, the precise form of the uncertainty relation is $\Delta x \cdot \Delta p \geq \hbar/2$. The canonical commutation relations $[x, p] = i\hbar$ tell us that position and momentum do not commute, and cannot be measured simultaneously. Since this is true for any state, it is true that $\langle[x, p]\rangle = i\hbar$. So, we can rewrite position-momentum uncertainty as

$\Delta x \cdot \Delta p \geq \frac{1}{2} |\langle [x, p]\rangle|.$

In fact, it’s a relatively straightforward exercise to generalise this to arbitrary operators in a pure state:

$\Delta A \cdot \Delta B \geq \frac{1}{2}|\langle [A, B]\rangle|.$

This is called the Robertson-Schrödinger uncertainty principle (continuing our pattern of double-barrel uncertainty relations) after Howard Percy Robertson and Erwin Schrödinger. The expectation of the commutator lower bounds the product of uncertainties. We can connect this to energy-time uncertainty by plugging in the Hamiltonian for one of the operators:

$\Delta E \geq \frac{1}{2 \Delta A}|\langle [H, A]\rangle|.$

Levitin and Toffoli (2009) showed that Mandelstam-Tamm is sometimes tight, so we assume it is the higher bound. In this case,

$\tau \leq \tau_A = \frac{\pi\hbar\Delta A}{|\langle [H, A]\rangle|},$

where $\tau_A$ is a “Mandelstam-Tamm-Robertson-Schrödinger” (MTRS) timescale associated with $A$. The time needed to evolve to an orthogonal state has a bound made smaller by the amount of non-conservation of $A$, as measured by $[H, A]$, and larger by the variance $\Delta A$. As a quick reminder, in the Heisenberg picture $[H, A] = i\hbar\dot{A}$. Thus, if they commute, the expectation of $A$ is conserved.

This connection seems physically reasonable. If the expectation of $A$ is changing, it witnesses that $|\psi(t)\rangle$ is becoming more distinguishable from $|\psi(0)\rangle$. On the other hand, a large uncertainty $\Delta A$ washes out this effect, since $A$ is too blurry to usefully tell us about changes in the state. In fact, since $[H, A] = i\hbar\dot{A}$, we should have

$|\langle [H, A] \rangle| = \hbar |\langle \dot{A}\rangle| \sim \hbar \frac{|\Delta A|}{|\Delta \tau|}.$

If we interpret $|\Delta A| = \Delta A$ as the spread, and plug into MTRS, we get

$\tau_A = \pi \Delta \tau,$

where $\Delta \tau$ is the time needed to change the expectation of $A$ by $\Delta A$:

$\langle A\rangle_{\tau_A/\pi} - \langle A\rangle_{0} \sim \Delta A.$

This is very similar to the wavepacket case, and uses the same sort of implicit localisation, but in the $A$-basis.

#### 4. Virtual particles

We’ll now turn to a specific example of $A$: particle number. The basic setup requires a Fock space, which we take to be bosonic for simplicity. We have a set of creation and annihilation operators $a_i^\dagger, a_i$, satisfying commutation relations

$[a_i, a^\dagger_j] = \delta_{ij}.$

The states take the form of products of single particle states built by acting on the vacuum with creation operators:

$|n_1, n_2, \ldots, n_k\rangle = C (a_1^\dagger)^{n_1}\cdots (a_k^\dagger)^{n_k}|0\rangle,$

for a normalisation constant $C$. The number operator $N_i = a_i^\dagger a_i$ counts the number of $i$ quanta, since the commutation relations imply

$N_i |n_1, n_2, \ldots, n_k\rangle = n_i |n_1, n_2, \ldots, n_k\rangle.$

Now, this is all static; it has nothing to do with the Hamiltonian. In particular, particle number need not be conserved. A state may have a well-defined particle number, with $\Delta N_i$ small, but the expected number of particles can change quickly depending on the commutator $[H, N_i]$. The particles which appear out of nowhere, or rather, by virtue of time evolution, are precisely virtual particles! When Mandelstam-Tamm is tight, we know the state will become distinguishable after the MTR timescale,

$\tau_i = \frac{\pi\hbar\Delta N_i}{|\langle [H, N_i]\rangle|}.$

If my interpretation of the MTRS timescale is correct, they will become distinguishable by measurements of the number operator $N_i$ itself.

#### 5. A coherent cubic example

To get a feeling for what’s going on, consider the simplest nontrivial example, a perturbed harmonic oscillator with Hamiltonian

$H = \hbar \omega \left(N + \frac{1}{2}\right) + \lambda \left[(a^\dagger)^2 a + a^\dagger a^2\right] = H_0 + \lambda I,$

where $N = a^\dagger a$ and $H_0$ is the unperturbed harmonic oscillator, with $[N, H_0] = 0$, and $\lambda I$ is a cubic interaction which can fuse two particles into one, or split a particle into two. Both processes are required for unitarity, i.e. a Hermitian $H$. Using the commutator identity $[A, BC] = [A, B]C + B[A, C]$, it’s easy to show that

$[N, I] = [N, a^\dagger N + Na] = [N, a^\dagger]N + N[N, a] = a^\dagger N - Na.$

Eigenstates of $N$ have zero spread in particle number, but also zero commutator expectation, as a quick calculation shows. A more reasonable example is a coherent state, an eigenvalue of $a$ rather than $N$:

$|\alpha\rangle = e^{-|\alpha|^2/2}\sum_{n\geq 0}\frac{\alpha^n}{\sqrt{n!}}|n\rangle, \quad a |\alpha\rangle = \alpha |\alpha\rangle,$

where $\alpha$ is an arbitrary complex number. This has the following properties:

$\langle N\rangle_\alpha = |\alpha|^2, \quad \Delta N =|\alpha|,$

In the absence of interactions, this is a Gaussian wavepacket which wobbles back and forth in position space, without changing shape, at the frequency of the oscillator. There will be position-momentum uncertainty associated with this motion, and indeed, it saturates the wavepacket uncertainty relation discussed above.

But when we introduce the cubic interaction, there is a new source of uncertainty: changes to the measured particle number, which still appear even if we cling to the wobbling wavepacket and measure its properties. Rather than solve for the wavefunction over time (an analytically impossible task), we can use the MTRS timescale to estimate when the interaction appreciably changes particle number! For a coherent state, the commutator is

$\langle [N, I]\rangle_\alpha = \langle a^\dagger N\rangle_\alpha - \langle N a\rangle_\alpha = -2\Im[\alpha] |\alpha|^2 = -2|\alpha|^3\sin\theta$

for $\theta = \text{arg}(\alpha)$. The MTRS timescale can then be written

$\tau_N = \frac{\pi \hbar}{2 |\lambda\alpha^2 \sin\theta|}.$

This has some sensible features. If $\lambda = 0$, and there is no interaction, then we will never see any change in expected particle number. If $\theta = 0$, then $\alpha$ has no imaginary part, and there is no transition; roughly, $\theta$ tells us the asymmetry between the forward and the backward process, required for there to be an overall change in expected particle number. Finally, as $\alpha \to 0$, we simply recover the ground state of the unperturbed Hamiltonian, which cannot change expected particle number since we need particles to make or destroy particles.

#### 6. Particles and antiparticles

We can do a similar calculation with a basic version of particles and antiparticles. Consider three species of particles with annihilation operators $a, b, c$, and Hamiltonian,

$H = \sum_{x=abc}\hbar \omega_x \left(N_x + \frac{1}{2}\right) + \lambda (N_{\text{pair}}c + c^\dagger N_{\text{pair}}),$

where $a$ and $b$ are each other’s antiparticle, pair-produced by $N_{\text{pair}}$:

$N_{\text{pair}} = (a + b^\dagger) (a^\dagger + b) = N_a + N_b + 1.$

The interaction term means we can destroy a $c$ particle and pair-produce $a$ and $b$, as well as the converse process. We will repeat the same calculation as the previous section, focusing on transitions in $c$ particle number. First, note that since $a, b, c$ and daggers all commute, the canonical commutation relations for $c$ imply

$[N_c, I] = N_{\text{pair}}[N_c, c] + [N_c, c^\dagger]N_{\text{pair}} = c^\dagger N_{\text{pair}} - N_{\text{pair}} c.$

Once again, we consider a coherent $c$-state, $|\gamma\rangle$, with no $a$ or $b$ particles. Then

\begin{align*} \langle [N_c, I]\rangle_\gamma & = -2\Im \langle N_{\text{pair}} c\rangle_\gamma \\ & = -2\Im [\gamma \langle N_a + N_b + 1\rangle_\gamma] \\ & = - 2|\gamma| \sin\theta, \end{align*}

where $\theta = \text{arg}(\gamma)$, using $\langle N_a\rangle_\gamma = \langle N_b\rangle_\gamma = 0$. Thus, we obtain an MTRS timescale

$\tau_c = \frac{\pi \hbar}{2 |\lambda \sin\theta|}.$

Even though there are no $a$ or $b$ particles in the initial state, and we do not observe them, “virtual” $a$ and $b$ particles induce a change in the expected number of $c$ quanta. Interestingly, only the angle $\theta$ and not the size $|\gamma|$ are relevant, presumably because $N_\text{pair}$ does not change the coherent state on $c$, unlike $(a^\dagger)^2$ in the last section. We can generalise this easily to a coherent state of $a, b, c$ particles (since they commute),

$|\psi\rangle = |\alpha, \beta, \gamma\rangle.$

This yields a MTRS timescale

$\tau_c = \frac{\pi\hbar}{2|\lambda \sin\theta| (1+|\alpha|^2 + |\beta|^2)}.$

If there are $a$ and $b$ particles in the initial state, the transition in expected $c$ quanta is achieved more quickly, since we can run the interactions in both directions.

#### 7. Conclusion

It would be nice to obtain a rigorous understanding of the MTMR timescale, and to further explore the relation to the MTML bound. But the most intriguing problem is to connect the inequality and toy models, which all occur in the setting of non-relativistic quantum mechanics, to virtual particles in quantum field theory. One option is to upgrade the cubic particle-antiparticle example to, say, a charged scalar field. A simple and formally satisfying option might be to explore how Bogoliubov transformations appear in the this context, since these appear as the rigorous explanation of virtual vacuum fluctuations attributed to the uncertainty principle, e.g. in black hole physics. I leave all of these to future work, or the interested reader!