---------------------------------------------------
Postulates for the formal core of quantum mechanics
---------------------------------------------------
Quantum mechanics consists of a formal core that is universally agreed
upon (basically being a piece of mathematics with a few meager pointers
on how to match it with experimental reality) and an interpretational
halo that remains highly disputed even after 90 years of modern quantum
mechanics. The latter is the subject of the foundations of quantum
mechanics; it is addressed elsewhere in this FAQ.
Here I focus on the formal side. The relativistic case is outside the
scope of the present axioms, though presumably very little needs to be
changed.
As in any axiomatic setting (necessary for a formal discipline),
there are a number of different but equivalent sets of axioms
or postulates that can be used to define formal quantum mechanics.
Since they are equivalent, their choice is a matter of convenience.
My choice presented here is the formulation which gives most direct
access to statistical mechanics but is free from allusions to
measurement. The reason for the first is that statistical mechanics is
the main tool for applications of quantum mechanics to the macroscopic
systems we are familiar with. The reason for the second is that real
measurements constitute a complex process involving macroscopic
detectors, hence should be explained by quantum statistical mechanics
rather than be part of the axiomatic foundations themselves. (This is
in marked contrast to other foundations, and distinguishes the present
axiom system.)
Thus the following describes nonrelativistic quantum statistical
mechanics in the Schroedinger picture. (As explained later, the
traditional starting point is instead the special case of this setting
where all states are assumed to be pure.)
A second reason for my choice is to emphasise the similarity of
quantum mechanics and classical mechanics. Indeed, the only difference
between classical and quantum mechanics in this axiomatic setting is
that:
- The classical case only works with diagonal operators, where all
operations happen pointwise on the diagonal elements. Thus
multiplication is commutative, and one can identify operators and
functions. In particular, the density matrix degenerates into a
probability density.
- The quantum case allows for noncommutative operators, hence both
observable quantities and the density are (usually
infinite-dimensional) matrices.
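As a small numerical illustration of this contrast (a sketch only, on
hypothetical finite-dimensional examples that are not part of the
axioms): diagonal matrices commute and their trace expectation reduces
to an ordinary probability-weighted sum, while two Pauli matrices do
not commute. The arrays x, y, p, sx, sz are chosen here purely for
illustration.

```python
import numpy as np

# Classical case: observables are diagonal matrices, i.e. functions on a
# (hypothetical) set of 3 configurations; all operations act pointwise.
x_values = np.array([1.0, 2.0, 3.0])
x = np.diag(x_values)                  # observable x
y = np.diag([0.5, -1.0, 4.0])          # another observable y
p = np.array([0.2, 0.5, 0.3])          # probability density
rho_classical = np.diag(p)             # diagonal density matrix

classical_commutator = x @ y - y @ x   # vanishes: multiplication commutes
trace_expectation = np.trace(rho_classical @ x @ x)      # trace rho x^2
pointwise_expectation = float(np.sum(p * x_values**2))   # sum_i p_i x_i^2

# Quantum case: two Pauli matrices as noncommuting observables.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
quantum_commutator = sx @ sz - sz @ sx  # nonzero

print(np.allclose(classical_commutator, 0))
print(np.isclose(trace_expectation, pointwise_expectation))
print(np.allclose(quantum_commutator, 0))
```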
For brevity, I assume knowledge of some basic terms from functional
analysis, which are precisely defined in many mathematics books.
[For a discussion of the difference between a Hermitian and a
self-adjoint operator, see e.g., Definition 3 in
http://arxiv.org/pdf/quant-ph/9907069 . The importance of this
difference is that Hermitian operators have a real spectrum if and
only if they are self-adjoint. Moreover, the Hille-Yosida theorem says
that e^{iX} exists (and is unitary) for a Hermitian operator X if and
only if X is self-adjoint. A detailed discussion and the HY theorem
itself can be found in Vol.3 of the math physics treatise by Thirring.
A Hermitian trace class operator is always self-adjoint.]
The statements of my axioms contain in parentheses some additional
explanations that, strictly speaking, are not part of the axioms but
make them more easily intelligible; the list of examples given only
has illustrative character and is far from being exhaustive.
Quantum mechanics is governed by the following six axioms:
A1. A generic system (e.g., a 'hydrogen molecule') is defined by
specifying a Hilbert space K and a densely defined, self-adjoint
Hermitian linear operator H called the _Hamiltonian_ or the _energy_.
A2. A particular system (e.g., 'the ion in the ion trap on this
particular desk') is characterized by its _state_ rho(t)
at every time t in R (the set of real numbers).
Here rho(t) is a Hermitian, positive semidefinite, linear trace class
operator on K satisfying at all times the condition
trace rho(t) = 1. (normalization)
A3. A system is called _closed_ in a time interval [t1,t2]
if it satisfies the evolution equation
d/dt rho(t) = i/hbar [rho(t),H] for t in [t1,t2],
and _open_ otherwise. (hbar is the reduced Planck constant, and is
often set to 1.)
If nothing else is apparent from the context, a system is assumed to
be closed.
A4. Besides the energy H, certain other densely defined, self-adjoint
Hermitian operators or vectors of such operators are distinguished
as _observables_.
(E.g., the observables for a system of N distinguishable particles
conventionally include for each particle several 3-dimensional vectors:
the _position_ x^a, _momentum_ p^a, _orbital_angular_momentum_ L^a
and the _spin_vector_ (or Bloch vector) sigma^a of the particle with
label a. If u is a 3-vector of unit length then u dot p^a, u dot L^a
and u dot sigma^a define the momentum, orbital angular momentum,
and spin of particle a in direction u.)
A5. For any particular system, and for every vector X of observables
with commuting components, one associates a time-dependent monotone
linear functional <.>_t defining the _expectation_
<f(X)>_t := trace rho(t) f(X)
of bounded continuous functions f(X) at time t.
(This is equivalent to a multivariate probability measure dmu_t(X)
on a suitable sigma algebra over the spectrum spec(X) of X, defined by
integral dmu_t(X) f(X) := trace rho(t) f(X) = <f(X)>_t.
This sigma algebra is uniquely determined.)
A6. Quantum mechanical predictions consist of predicting properties
(typically expectations or conditional probabilities) of the measures
defined in Axiom A5, given reasonable assumptions about the states
(e.g., ground state, equilibrium state, etc.)
Axiom A6 specifies that the formal content of quantum mechanics is
covered exactly by what can be deduced from Axioms A1-A5 without
anything else added (except for restrictions defining the specific
nature of the states and observables), and hence says that
Axioms A1-A5 are complete.
The description of a particular closed system is therefore given by
the specification of a particular Hilbert space in A1, the
specification of the observable quantities in A4, and the
specification of conditions singling out a particular class of
states (in A6). Given this, everything else is determined by the theory,
and hence is (in principle) predicted by the theory.
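To make this concrete, here is a minimal sketch for a two-level toy
system. Everything specific in it is an assumption made for
illustration only: K = C^2, the matrices H, X and rho0, the time t, and
hbar = 1. It builds rho(t) from the unitary propagator and checks the
normalization and positivity from A2, the evolution equation from A3
(by a finite difference), and evaluates an expectation as in A5.

```python
import numpy as np

hbar = 1.0
H = np.array([[1.0, 0.5], [0.5, -1.0]], dtype=complex)        # Hamiltonian (A1)
X = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)        # an observable (A4)
rho0 = np.array([[0.75, 0.25], [0.25, 0.25]], dtype=complex)  # state at t = 0 (A2)

E, V = np.linalg.eigh(H)  # spectral decomposition of H

def propagator(t):
    """U(t) = exp(-i H t / hbar), built from the eigendecomposition of H."""
    return V @ np.diag(np.exp(-1j * E * t / hbar)) @ V.conj().T

def rho(t):
    """State of the closed system at time t: U(t) rho0 U(t)^*."""
    U = propagator(t)
    return U @ rho0 @ U.conj().T

def expectation(t):
    """<X>_t = trace rho(t) X  (A5 with f(X) = X)."""
    return np.trace(rho(t) @ X).real

t, dt = 0.7, 1e-6
trace_t = np.trace(rho(t)).real                      # should stay 1
min_eig = np.linalg.eigvalsh(rho(t)).min()           # should stay >= 0
lhs = (rho(t + dt) - rho(t - dt)) / (2 * dt)         # d/dt rho(t), central difference
rhs = 1j / hbar * (rho(t) @ H - H @ rho(t))          # i/hbar [rho(t), H]
residual = np.max(np.abs(lhs - rhs))

print(trace_t, min_eig, residual, expectation(t))
```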
The description of an open system involves, in addition, the
specification of the details of the dynamical law. (For the basics,
see the entry 'Open quantum systems' in this FAQ.)
In addition to these formal axioms one needs a rudimentary
interpretation relating the formal part to experiments.
The following _minimal_interpretation_ seems to be universally
accepted.
MI. Upon measuring at times t_l (l=1,...,n) a vector X of observables
with commuting components, for a large collection of independent
identical (particular) systems closed for times t [...]
[...] T -> 0, all terms exp(-E_k/T)/Z(T) become 0 or 1,
with 1 only for the k corresponding to the states with least energy.
Thus, if the ground state psi_1 is unique,
lim_{T->0} rho(T) = psi_1 psi_1^*.
This implies that for low enough temperatures, the equilibrium state
is approximately pure. The larger the gap to the second smallest
energy level, the better is the approximation at a given nonzero
temperature. In particular (reinstalling the Boltzmann constant kbar),
the approximation is good if the energy gap exceeds a small multiple
of E^* := kbar T.
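The low-temperature limit can be checked numerically. The sketch below
assumes the equilibrium state has the canonical form
rho(T) = exp(-H/T)/Z(T) with eigenvalues exp(-E_k/T)/Z(T) (units with
Boltzmann constant 1), and uses a hypothetical three-level spectrum
with gap 1; the helper name gibbs_weights is mine. The purity
trace rho^2 equals 1 exactly for a pure state.

```python
import numpy as np

def gibbs_weights(energies, T):
    """Eigenvalues exp(-E_k/T)/Z(T) of the assumed canonical state rho(T)."""
    E = np.asarray(energies, dtype=float)
    w = np.exp(-(E - E.min()) / T)  # shifting by E_min leaves w/Z unchanged
    return w / w.sum()

energies = [0.0, 1.0, 1.5]          # hypothetical spectrum, unique ground state

purities = {}
for T in (2.0, 0.5, 0.1):
    p = gibbs_weights(energies, T)
    purities[T] = float(np.sum(p**2))   # trace rho(T)^2
    print(T, np.round(p, 4), round(purities[T], 4))
```

With these numbers, T = 0.1 corresponds to a gap of ten times T, and the
ground state weight is then very close to 1.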
Simple enough systems with only a few levels
can often be prepared in nearly pure states, by realizing a source
governed by a Hamiltonian in which the first excited state has a much
larger energy than the ground state. Dissipation then brings the
system into equilibrium, and as seen above, the resulting equilibrium
state is nearly pure.
To see how the more traditional setting in terms of the
Schroedinger equation arises, we consider the case of a closed
system in a pure state rho(t) at some time t.
If psi(t) is a unit vector in the range of the pure state rho(t)
then psi(t), called the _state_vector_ of the system at time t,
is determined up to a phase, and one easily verifies that
rho(t) = psi(t)psi(t)^*.
Remarkably, this property persists with time (only) if the system
is closed, i.e., obeys the evolution equation of Axiom A3, and the
state vector then satisfies the Schroedinger equation
i hbar d/dt psi(t) = H psi(t)
if its phase at every time is appropriately chosen. (The density matrix
is independent of this phase.)
Thus the state remains pure at all times. Conversely, for every pure
state, the phases of psi(t) at all times t can be chosen such that the
Schroedinger equation holds.
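A quick numerical check of these statements, as a sketch under
assumptions: a hypothetical 2x2 Hamiltonian H, initial vector psi0, and
hbar = 1. It takes psi(t) = exp(-i H t/hbar) psi0, verifies by finite
differences that psi(t) solves i hbar d/dt psi = H psi and that
rho(t) = psi(t)psi(t)^* solves d/dt rho = i/hbar [rho,H], and confirms
that rho(t) stays pure and does not depend on the phase of psi(t).

```python
import numpy as np

hbar = 1.0
H = np.array([[0.0, 1.0], [1.0, 0.5]], dtype=complex)  # hypothetical Hamiltonian
psi0 = np.array([1.0, 1.0j]) / np.sqrt(2)              # unit vector at t = 0

E, V = np.linalg.eigh(H)

def psi(t):
    """psi(t) = exp(-i H t / hbar) psi0, via the eigendecomposition of H."""
    return V @ (np.exp(-1j * E * t / hbar) * (V.conj().T @ psi0))

def rho(t):
    """Pure state rho(t) = psi(t) psi(t)^*."""
    v = psi(t)
    return np.outer(v, v.conj())

t, dt = 1.3, 1e-6
dpsi = (psi(t + dt) - psi(t - dt)) / (2 * dt)
drho = (rho(t + dt) - rho(t - dt)) / (2 * dt)

# Residual of the Schroedinger equation i hbar d/dt psi = H psi
se_residual = np.max(np.abs(1j * hbar * dpsi - H @ psi(t)))
# Residual of the evolution equation d/dt rho = i/hbar [rho, H]
vn_residual = np.max(np.abs(drho - 1j / hbar * (rho(t) @ H - H @ rho(t))))
# Purity trace rho^2, equal to 1 for a pure state
purity = np.trace(rho(t) @ rho(t)).real
# Changing the phase of psi(t) leaves rho(t) unchanged
shifted = np.exp(0.4j) * psi(t)
phase_independent = np.allclose(np.outer(shifted, shifted.conj()), rho(t))

print(se_residual, vn_residual, purity, phase_independent)
```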
Moreover, if X is a vector of observables with commuting components
and the spectrum of X is discrete, then the measure from Axiom A5
is discrete,
integral dmu(X) f(X) = sum_k p_k f(X_k)
with nonnegative numbers p_k summing to 1, commonly called
_probabilities_. Associated with the p_k are eigenspaces K_k such that
X psi = X_k psi for psi in K_k,
and K is the direct sum of the K_k. Therefore, every state vector psi
can be uniquely decomposed into a sum
psi = sum_k psi_k with psi_k in K_k.
psi_k is called the _projection_ of psi to the eigenspace K_k.
A short calculation using Axiom A5 now reveals that for a pure state
rho(t)=psi(t)psi(t)^*, the probabilities p_k are given by the
so-called _Born_rule_
p_k = |psi_k(t)|^2, (B)
where psi_k(t) is the projection of psi(t) to the eigenspace K_k.
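The short calculation can be mirrored numerically. In the sketch
below, the observable X (with a degenerate eigenvalue) and the state
vector psi on K = C^3 are hypothetical choices of mine. For each
eigenvalue X_k it forms the orthogonal projector P_k onto K_k and
compares trace rho P_k (Axiom A5 with f the indicator function of X_k
on the spectrum) with |psi_k|^2 from (B).

```python
import numpy as np

# Hypothetical observable with discrete, partly degenerate spectrum on K = C^3.
X = np.diag([1.0, 1.0, 2.0]).astype(complex)
psi = np.array([1.0, 1.0j, 1.0]) / np.sqrt(3)   # unit state vector
rho = np.outer(psi, psi.conj())                 # pure state psi psi^*

eigvals, eigvecs = np.linalg.eigh(X)
probs_trace, probs_born = {}, {}
for x_k in np.unique(eigvals.round(12)):
    cols = eigvecs[:, np.isclose(eigvals, x_k)]  # orthonormal basis of K_k
    P_k = cols @ cols.conj().T                   # orthogonal projector onto K_k
    psi_k = P_k @ psi                            # projection of psi to K_k
    probs_trace[float(x_k)] = np.trace(rho @ P_k).real    # trace rho f(X)
    probs_born[float(x_k)] = np.linalg.norm(psi_k) ** 2   # |psi_k|^2

print(probs_trace)
print(probs_born)
```

The two dictionaries agree, and the probabilities sum to 1 because the
projectors P_k sum to the identity.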
Deriving the Born rule (B) from Axioms A1-A5 makes it completely
natural, while the traditional approach starting with (B)
makes it an irreducible rule full of mystery and only justifiable
by its miraculous agreement with experiment.
Note that Born's 1926 paper (reprinted in English translation in
pp.52-55 of the reprint volume ''Quantum Theory and Measurement'' by
Wheeler and Zurek) - which introduced the probabilistic interpretation
that earned him a Nobel prize - didn't relate his interpretation to
measurement. Born's formulation doesn't depend on anything being
measured (let alone to be assigned a precise numerical measurement
value): ''gives the probability for the electron, arriving from the
z-direction, to be thrown out into the direction designated by the
angles alpha, beta, gamma, with the phase change delta''.
Nevertheless, it is often (see, e.g.,
http://en.wikipedia.org/wiki/Born_rule )
claimed as part of Born's rule that the results of the measurement
should equal exactly the eigenvalues. But this form of the rule is
highly unrealistic unless the eigenvalues lambda_i are
system-independent, discrete, and known a priori (as for polarization,
spin, or angular momentum in a particular direction, the common
subjects of experiments involving Alice and Bob), in which case one
can label each measurement record with these numbers.
I didn't mention indistinguishable particles in my examples
illustrating the axioms, for two reasons:
1. One cannot easily specify the set of relevant observables without
introducing lots of additional notation or terminology - whereas the
explanations of the axioms should be very short.
2. I think that the concept of indistinguishable particles is
completely superseded by the concept of a quantum field.
The latter gives much better intuition about the meaning of the
formalism, and the former (which is difficult to justify and even more
difficult to interpret intuitively) is then completely dispensable.