
Let $\renewcommand{\calH}{{\mathcal{H}}}\calH$ be a finite-dimensional Hilbert space. An observable $A$ is here a Hermitian operator, $A\in\mathrm{Herm}(\calH)$. A POVM is here a collection of positive operators summing to the identity: $\{\mu(a): a\in\Sigma\}\subset\mathrm{Pos}(\calH)$ such that $\sum_{a\in\Sigma} \mu(a)=I$, for some set of outcomes $\Sigma$.

An observable $A$ is given physical meaning via the mapping $\rho\mapsto \operatorname{Tr}(A\rho)$ for any state $\rho$, which gives us the expectation value of $A$ on $\rho$. On the other hand, a POVM $\mu$ is given physical meaning by interpreting $\Sigma$ as the set of possible outcomes, with the outcome $a$ occurring with probability $\operatorname{Tr}(\mu(a)\rho)$.
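To make the two rules concrete, here is a minimal numpy sketch; the state, the observable, and the two-outcome POVM below are arbitrary numbers chosen purely for illustration:

```python
import numpy as np

# All numbers below are arbitrary illustrative choices.
rho = np.array([[0.7, 0.2],
                [0.2, 0.3]])                       # a qubit state: Hermitian, PSD, trace 1
A = np.array([[1.0, 0.0],
              [0.0, -1.0]])                        # an observable (here Pauli Z)

# A two-outcome POVM {mu(0), mu(1)}: positive operators summing to the identity.
mu = [np.array([[0.8, 0.1], [0.1, 0.4]]),
      np.array([[0.2, -0.1], [-0.1, 0.6]])]
assert np.allclose(sum(mu), np.eye(2))                          # completeness
assert all(np.linalg.eigvalsh(m).min() >= -1e-12 for m in mu)   # positivity

expectation = np.trace(A @ rho).real               # Tr(A rho): expectation value of A on rho
probs = [np.trace(m @ rho).real for m in mu]       # Tr(mu(a) rho): outcome probabilities
print(expectation, probs, sum(probs))              # the probabilities sum to 1
```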

A POVM is always a collection of observables, but not all collections of observables are POVMs.

Intuitively, I understand the process of "measuring an observable $A$" as tantamount to performing the projective measurement (PVM) built from the spectral projectors of $A$, then attaching numbers (the eigenvalues of $A$) to the corresponding outcomes, and obtaining the corresponding expectation value in the limit of many measurements. In this sense, I would be led to say that POVMs are more "fundamental" than observables, in the sense that measuring an observable amounts to measuring a POVM and then post-processing the corresponding measurement results.
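A small numpy sketch of this intuition, again with an arbitrary illustrative observable and state: the eigendecomposition supplies the projective measurement, and attaching the eigenvalues to the outcomes recovers $\operatorname{Tr}(A\rho)$.

```python
import numpy as np

rho = np.array([[0.7, 0.2], [0.2, 0.3]])           # arbitrary illustrative state
A = np.array([[1.0, 2.0], [2.0, -1.0]])            # arbitrary illustrative observable

# Step 1: the PVM associated with A, i.e. the projectors onto its eigenvectors.
eigvals, eigvecs = np.linalg.eigh(A)
projectors = [np.outer(v, v.conj()) for v in eigvecs.T]

# Step 2: measure the PVM; outcome k occurs with probability Tr(P_k rho).
probs = [np.trace(P @ rho).real for P in projectors]

# Step 3: post-process by attaching the eigenvalue lambda_k to outcome k and averaging.
expectation_from_pvm = sum(l * p for l, p in zip(eigvals, probs))

# This reproduces the expectation value Tr(A rho) of the observable.
assert np.isclose(expectation_from_pvm, np.trace(A @ rho).real)
```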

On the other hand, in several contexts observables, rather than POVMs, are given the spotlight. To name one example, we discuss the uncertainty principle in terms of observables, not in terms of POVMs. In this sense, it seems like observables are generally regarded as "more fundamental".

In light of this, is there any general statement that can be made about the relations between observables and POVMs? Should they be regarded as simply incomparable ideas, or is there merit in thinking of observables as equivalent to post-processing of measurement (POVM) results? Or are there reasons to instead use observables as the primitive idea, and think of POVMs as simply special sets of observables?


While the title might lead one to believe that this question is similar to another one I asked previously, the two questions are really quite different; the apparent similarity is due to the different possible meanings of the term "observable" in different contexts.


2 Answers


One way of looking at the relationship between POVMs and observables arises from identifying their counterparts in the theory of probability, of which quantum mechanics can be thought of as an extension. It is easier to identify the counterparts if we temporarily restrict our attention to a special type of POVM known as projection-valued measures, or PVMs.

Below, we summarize four concepts (probability measure, random variable, PVM, and observable) in a way that highlights the high degree of similarity between probability measures and PVMs on one hand, and random variables and observables on the other.

Probability measures and random variables

In elementary probability we construct random variables as tables that assign probabilities to possible outcomes, e.g. a random variable taking values $\lambda_k$ for $k=1,\dots,n$ with respective probabilities $p_k$ is described by

$$ \begin{array}{c|cccccc} X & \lambda_1 & \lambda_2 & \dots & \lambda_n \\ \hline p(X) & p_1 & p_2 & \dots & p_n \\ \end{array} $$

In the more abstract approach based on measure theory the probabilistic structure is separated out by introducing the probability space $(\Omega, \mathcal{F}, P)$ where the sample space $\Omega$ is the set of all possible outcomes of a random experiment, $\mathcal{F}$ is a $\sigma$-algebra of events on $\Omega$ and $P: \mathcal{F} \to \mathbb{R}_{\ge0}$ is a probability measure.

A random variable $X$ is defined as a measurable function from $\Omega$ to a measurable space, e.g. $\mathbb{R}$. This splits the above table by inserting the sample space

$$ \begin{array}{c|cccccc} X & \lambda_1 & \lambda_2 & \dots & \lambda_n \\ \hline \Omega & \omega_1 & \omega_2 & \dots & \omega_n \\ \hline P & p_1 & p_2 & \dots & p_n \\ \end{array} $$

(Technically, $P$ is defined on $\mathcal{F}$ but when $\Omega$ is finite and $\mathcal{F}=\mathcal{P}(\Omega)$ additivity implies that $P$ is uniquely defined by its values on the singleton subsets of $\Omega$.)

In this view, a random variable can be thought of as a random experiment, represented by the probability measure $P$, which yields an abstract outcome $\omega_k$, followed by post-processing, described by the measurable function $X$, which produces the experiment result $\lambda_k := X(\omega_k)$. Moreover, if the range of $X$ is finite, the random variable can be written as

$$ X = \sum_k\lambda_k\mathbb{1}_{A_k}\tag1 $$

where $\mathbb{1}_E$ is the indicator function of a set $E\subset \Omega$ and the sets $A_k$ form a partition of $\Omega$, i.e. they are disjoint subsets of $\Omega$ whose union is $\Omega$. Note that in terms of the indicator functions, the last condition can be stated as $\mathbb{1}_{A_i}\mathbb{1}_{A_j} = \delta_{ij}\mathbb{1}_{A_i}$ and $\sum_k\mathbb{1}_{A_k} = 1$.
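As a quick numerical sanity check of $(1)$, here is a small Python/numpy sketch over a hypothetical six-point sample space with a three-cell partition; all numbers are made up for illustration.

```python
import numpy as np

omega = np.arange(6)                                # sample space Omega = {0, ..., 5}
P = np.array([0.1, 0.2, 0.15, 0.25, 0.2, 0.1])      # probability of each singleton {omega_k}

# A partition A_1, A_2, A_3 of Omega and the value lambda_k that X takes on each cell.
partition = [np.array([0, 1]), np.array([2, 3, 4]), np.array([5])]
values = [-1.0, 0.5, 2.0]

# Indicator functions 1_{A_k}, represented as 0/1 vectors over Omega.
indicators = [np.isin(omega, A).astype(float) for A in partition]
assert np.allclose(sum(indicators), 1.0)            # the cells cover Omega exactly once

# Eq. (1): X = sum_k lambda_k 1_{A_k}, evaluated pointwise on Omega.
X = sum(l * ind for l, ind in zip(values, indicators))

print(X)                                            # the value X(omega) for each omega
print(P @ X)                                        # E[X] = sum_omega X(omega) P({omega})
```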

Projection-valued measures and observables

In elementary quantum mechanics we construct observables by assigning projectors to possible outcomes and combining them to form a Hermitian operator, e.g. an observable $X$ taking values $\lambda_k$ for $k=1,\dots,n$ with respective projectors $|k\rangle\langle k|$ is $X=\sum_k\lambda_k|k\rangle\langle k|$ or in table format

$$ \begin{array}{c|cccccc} X & \lambda_1 & \lambda_2 & \dots & \lambda_n \\ \hline \pi & |1\rangle\langle 1| & |2\rangle\langle 2| & \dots & |n\rangle\langle n|\end{array} $$

As before, in the more abstract approach using measure theory we introduce a sample space $\Omega$ and a $\sigma$-algebra $\mathcal{F}$. In place of a probability measure $P: \mathcal{F} \to \mathbb{R}_{\ge0}$, we define a projection-valued measure $\pi: \mathcal{F} \to L_H(\mathcal{H})$, where $L_H(\mathcal{H})$ is the space of Hermitian operators on a Hilbert space $\mathcal{H}$ and $\pi(E)$ is a projector for every event $E\in\mathcal{F}$. Moreover, for any partition $\{A_k\}$ of $\Omega$ we require that $\pi(A_i)\pi(A_j) = \delta_{ij}\pi(A_i)$ and $\sum_k\pi(A_k)=I$. For each state $|\psi\rangle$, this gives us a probability measure $P_\psi: \mathcal{F} \to \mathbb{R}_{\ge0}$ defined as $P_\psi(E) = \langle\psi|\pi(E)|\psi\rangle$.

Defining $\Omega = \{\omega_1, \dots, \omega_n\}$, $\mathcal{F}=\mathcal{P}(\Omega)$ and $\pi(A) = \sum_{\omega_k\in A}|k\rangle\langle k|$ we can then write $X=\sum_k\lambda_k\pi(\{\omega_k\})$ or in table format

$$ \begin{array}{c|cccccc} X & \lambda_1 & \lambda_2 & \dots & \lambda_n \\ \hline \Omega & \omega_1 & \omega_2 & \dots & \omega_n \\ \hline \pi & |1\rangle\langle 1| & |2\rangle\langle 2| & \dots & |n\rangle\langle n| \\ \end{array} $$

(As before, technically, $\pi$ is defined on $\mathcal{F}$ but when $\Omega$ is finite and $\mathcal{F}=\mathcal{P}(\Omega)$ additivity implies that $\pi$ is uniquely defined by its values on the singleton subsets of $\Omega$.)

In this view, an observable can be thought of as a projective measurement, represented by the PVM $\pi$, which given a state $|\psi\rangle$ yields an abstract outcome $\omega_k$, followed by post-processing, described by the eigendecomposition of $X$, which produces the measurement result $\lambda_k$. Moreover, if the spectrum of $X$ is finite, the observable can be written as

$$ X = \sum_k\lambda_k\Pi_k\tag2 $$

where $\Pi_k=\pi(\{\omega_k\})$ is the projector on the eigenspace of $X$ associated with eigenvalue $\lambda_k$. Note that the projectors are orthogonal, i.e. $\Pi_i\Pi_j = \delta_{ij}\Pi_i$ and $\sum_k\Pi_k = I$.
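A short numpy sketch of these properties, for an arbitrary illustrative Hermitian operator on a $3$-dimensional Hilbert space: the spectral projectors satisfy the PVM conditions, recover $X$ as in $(2)$, and induce the probability measure $P_\psi$.

```python
import numpy as np

# An arbitrary illustrative Hermitian operator on a 3-dimensional Hilbert space.
X = np.array([[1.0, 0.3, 0.0],
              [0.3, 2.0, 0.3],
              [0.0, 0.3, -2.0]])

eigvals, eigvecs = np.linalg.eigh(X)
Pi = [np.outer(v, v.conj()) for v in eigvecs.T]     # Pi_k = pi({omega_k})

# Orthogonality and completeness: Pi_i Pi_j = delta_ij Pi_i and sum_k Pi_k = I.
for i in range(3):
    for j in range(3):
        expected = Pi[i] if i == j else np.zeros((3, 3))
        assert np.allclose(Pi[i] @ Pi[j], expected)
assert np.allclose(sum(Pi), np.eye(3))

# Eq. (2): X = sum_k lambda_k Pi_k.
assert np.allclose(sum(l * P for l, P in zip(eigvals, Pi)), X)

# The induced probability measure P_psi(E) = <psi|pi(E)|psi> for an event E.
psi = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)
pi_E = Pi[0] + Pi[1]                                # pi(E) for the event E = {omega_1, omega_2}
print(psi @ pi_E @ psi)                             # P_psi(E)
```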

Observables vs POVMs

The correspondence above shows that there is indeed merit in thinking of an observable as equivalent to a special kind of POVM, namely a PVM, together with post-processing of the measurement results. Moreover, the correspondence explains why POVMs are of independent interest outside of their role in specifying observables. Specifically, POVMs are more general than PVMs, since they are not limited to describing projective measurements. This situation also finds its mirror image in measure theory: just as probability measures are not the only interesting type of scalar measure, so PVMs are not the only interesting type of POVM. This situation may be represented graphically as

$$ \begin{array}{|c|c|} \hline \text{random variables} & \\ \hline \text{probability measures} & \text{other scalar measures} \\ \hline \end{array} \begin{array}{|c|c|} \hline \text{observables} & \\ \hline \text{PVMs} & \text{other POVMs} \\ \hline \end{array} $$

where the positioning of one cell above another is to be interpreted as "builds upon", and where the empty cells highlight the fact that the type of construction used to form random variables on top of probability measures, and observables on top of PVMs, does not easily generalize to other scalar measures and POVMs.

This relationship explains why both observables and POVMs are useful and encountered regularly. On one hand, observables, like random variables, provide a higher-level language than measures, one which includes convenient shortcuts for computing quantities such as the mean and standard deviation (useful for expressing results such as the Heisenberg uncertainty principle). On the other hand, they are less general, since they only model projective measurements.
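As an example of such a shortcut, the mean and standard deviation of a measurement of $X$ on $|\psi\rangle$ come straight from a couple of matrix products; the state and observable in the sketch below are arbitrary illustrative choices.

```python
import numpy as np

psi = np.array([0.6, 0.8])                          # arbitrary normalized illustrative state
X = np.array([[0.0, 1.0],
              [1.0, 0.0]])                          # an observable (here Pauli X)

mean = psi.conj() @ X @ psi                         # <X> = <psi|X|psi>
second_moment = psi.conj() @ X @ X @ psi            # <X^2>
std = np.sqrt(second_moment - mean**2)              # Delta X, as it appears in uncertainty relations
print(mean, std)
```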

Note that a general measurement may be emulated by a projective measurement together with an auxiliary subsystem and unitary evolution (this is the content of Naimark's dilation theorem). Therefore, in a physical sense, observables and POVMs describe the same fundamental physical reality. The utility of POVMs lies in the convenient description of the information-theoretic aspects of processes more complex than projective measurement.
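A minimal sketch of one standard way to write such an emulation, via the isometry $V|\psi\rangle=\sum_k |k\rangle\otimes\sqrt{E_k}\,|\psi\rangle$ followed by a projective measurement of the ancilla register; the two-outcome qubit POVM and the state below are arbitrary illustrative choices, and this particular dilation is just one of many.

```python
import numpy as np

def psd_sqrt(M):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    w, U = np.linalg.eigh(M)
    return U @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T

# An arbitrary illustrative two-outcome qubit POVM {E_0, E_1} and state rho.
E = [np.array([[0.8, 0.1], [0.1, 0.4]]),
     np.array([[0.2, -0.1], [-0.1, 0.6]])]
rho = np.array([[0.7, 0.2], [0.2, 0.3]])
d, m = 2, len(E)

# The dilation isometry V|psi> = sum_k |k> (x) sqrt(E_k)|psi>, from C^d into C^m (x) C^d.
V = np.vstack([psd_sqrt(Ek) for Ek in E])           # shape (m*d, d)
assert np.allclose(V.conj().T @ V, np.eye(d))       # V^dagger V = sum_k E_k = I

# Projectively measuring the ancilla register of V rho V^dagger reproduces the POVM statistics.
for k in range(m):
    Pk = np.kron(np.diag(np.eye(m)[k]), np.eye(d))  # |k><k| (x) I on the dilated space
    p_dilated = np.trace(Pk @ V @ rho @ V.conj().T).real
    assert np.isclose(p_dilated, np.trace(E[k] @ rho).real)
```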

Remark on shared limitations

Finally, note that general quantum measurements (as described e.g. in section 2.2.3 on page 84 of Nielsen & Chuang) capture a wider class of processes than both observables and POVMs. Specifically, the latter two model measurement statistics, but fail to provide the most general way to compute the post-measurement state.

  • interesting take, thanks a lot. I guess the gist of it is that observables are sort of the quantum analogue to random variables when the underlying probability distribution is given by a PVM. This makes me wonder: more general POVMs also define an underlying probability distribution. What would an "observable", i.e. a random variable, corresponding to a generic POVM look like? The answer is probably that you still get a "standard" observable via Naimark's theorem or something like that, though I can't quite put my finger on it right now
    – glS
    Commented Mar 17, 2021 at 20:02
  • Thank you for the deep and interesting question. Re gist: Yes, that's a good summary. Re probability distributions defined by POVMs: Yes, any POVM (projective or otherwise) can be thought of as a probability distribution parametrized with $|\psi\rangle$. The issue is that the observable construction which takes outcomes $\lambda_k$ assigned to each operator $E_k$ in the POVM and forms the linear combination $X=\sum_k\lambda_k E_k$ is "lossy" in the sense that there are many other POVMs that yield the same $X$. The spectral theorem means that restricting attention to PVMs removes the lossiness.
    Commented Mar 18, 2021 at 1:23
  • In other words, an observable built on top of a PVM remembers the PVM, so you can recover the projectors and the outcomes. By contrast, an observable built on top of a generic POVM forgets the POVM elements and we can't recover the elements or the outcomes (though we can still compute the average as $\langle\psi|X|\psi\rangle$).
    Commented Mar 18, 2021 at 1:24

In various respects, the POVM formalism is much more fundamental than that of Hermitian observables.

The notion of generalized observables was developed precisely to weaken and generalize the notion of observables as Hermitian operators. For instance, the restrictions imposed by Pauli's theorem on a suitable time observable were a long-debated issue in quantum mechanics, which now has reasonable candidates in terms of POVMs. Other advantages include a simpler version of Gleason's theorem for POVMs. There's a very nice discussion of this in Valter Moretti's Spectral Theory and Quantum Mechanics (chapter 13).

You might also be interested in the theory of noncommutative probability. You'll need the analogue of Gleason's theorem (as in Busch's paper) to see that to every generalized probability measure you can associate a unique density operator $\rho$ that gives the same measurement statistics.
