Let \(S\) be the event that the violin is a Strad, and let \(L\) be the event that the violin has the label Emilio found. The probability we are interested in is \(P(S\given L)\). Bayes’ theorem (see the Appendix) says,

\[P(S\given L) = \frac{P(L\given S)P(S)}{P(L\given S)P(S)+P(L\given \neg S)P(\neg S)}\]

The question provides all the necessary information to compute the probability we are interested in. Namely,

\[\begin{align*} P(S) &= 0.03 \\ P(\neg S) &= 0.97 \\ P(L\given S) &= 0.663\\ P(L\given \neg S) &= 0.39 \end{align*}\]

Therefore,\(^{1}\)

\[P(S\given L) = \frac{0.663\cdot 0.03}{0.663\cdot 0.03 + 0.39\cdot 0.97} \approx 0.04995\]
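The arithmetic above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the four input values are the ones quoted from the question.

```python
# Inputs quoted from the problem statement.
p_s = 0.03               # P(S): prior probability that the violin is a Strad
p_l_given_s = 0.663      # P(L | S)
p_l_given_not_s = 0.39   # P(L | not S)

p_not_s = 1.0 - p_s      # P(not S)

# Bayes' theorem: numerator over the total probability of seeing the label.
numerator = p_l_given_s * p_s
evidence = numerator + p_l_given_not_s * p_not_s   # P(L)
p_s_given_l = numerator / evidence

print(f"P(S | L) = {p_s_given_l:.5f}")
```

The printed value should agree with the approximation in the displayed equation.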

For the second part, let \(B\) denote the event “Emilio sounds bad”. We are now interested in the probability \(P(S\given L \cap B)\). Again, we can use Bayes’ theorem to compute it,

\[P(S\given L\cap B) = \frac{P(L\cap B\given S)P(S)}{P(L\cap B\given S)P(S) + P(L\cap B\given \neg S)P(\neg S)} \label{b1}\]

The unfamiliar term is \(P(L\cap B\given S)\) (and its \(\neg S\) variant in the denominator). We can use the definition of conditional probability (see the Appendix) to expand it as,

\[\begin{align} \begin{aligned} P(L\cap B\given S) & = \frac{P(L\cap B\cap S)}{P(S)}\\ & = \frac{P(L\cap S)P(B\given L\cap S)}{P(S)}\\ & = \frac{P(S)P(L\given S)P(B\given L\cap S)}{P(S)}\\ & = P(L\given S)P(B\given L\cap S) \end{aligned}\label{b2} \end{align}\]

Substituting equation \(\eqref{b2}\), together with its analogue for \(\neg S\), into equation \(\eqref{b1}\), we get,

\[\label{b3} P(S\given L\cap B) = \frac{P(S)P(L\given S)P(B\given L\cap S)}{P(S)P(L\given S)P(B\given L\cap S) + P(\neg S)P(L\given \neg S)P(B\given L\cap \neg S)}\]

Note that equation \(\eqref{b3}\) is nothing but Bayes’ theorem combined with the chain rule.

We know the values of all the terms in equation \(\eqref{b3}\) except \(P(B\given L\cap S)\) and \(P(B\given L\cap \neg S)\): the probability that Emilio sounds bad given that the violin is a Strad and has the label, and the probability that Emilio sounds bad given that the violin is not a Strad and has the label. But there is something fishy here. If we know that the violin is not a Strad, does the fact that it has the label have any effect on the probability we assign to how Emilio will sound? Or take it the other way around and suppose we know that the violin is a Strad. In both cases the answer is no: once we know whether the violin is a Strad, the label tells us nothing further about the sound. In mathematical language, we say that the events \(B\) and \(L\) are conditionally independent given \(S\), and likewise given \(\neg S\) (see the Appendix). In our case this gives,

\[\begin{align} \begin{aligned} P(B\given L\cap S) & = P(B\given S) = 0.95\\ P(B\given L\cap \neg S) & = P(B\given \neg S) = 0.98 \end{aligned} \end{align}\]

Inserting all the values we have into equation \(\eqref{b3}\), we get,\(^{2}\)

\[P(S\given L\cap B) = \frac{0.03\cdot 0.663\cdot 0.95}{0.03\cdot 0.663\cdot 0.95 + 0.97\cdot 0.39\cdot 0.98} \approx 0.048496\]

Therefore, learning that Emilio sounds bad decreases the probability that the violin is a Strad, but only very slightly (from about 0.04995 to about 0.04850), and from an already very low starting point.
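The second computation can be reproduced in the same way. The sketch below assumes, as in the text, that \(L\) and \(B\) are conditionally independent given \(S\) and given \(\neg S\); the function name `posterior_strad` and its parameter names are made up for illustration.

```python
def posterior_strad(p_s, p_l_s, p_l_ns, p_b_s, p_b_ns):
    """Return P(S | L and B) via Bayes' theorem plus the chain rule.

    Assumes L and B are conditionally independent given S and given not-S,
    so that P(L and B | S) = P(L | S) * P(B | S), and likewise for not-S.
    """
    strad = p_s * p_l_s * p_b_s                  # P(S) P(L|S) P(B|S)
    not_strad = (1.0 - p_s) * p_l_ns * p_b_ns    # P(not S) P(L|not S) P(B|not S)
    return strad / (strad + not_strad)


# Values quoted in the text.
p_s_given_lb = posterior_strad(0.03, 0.663, 0.39, 0.95, 0.98)

# Setting both B-likelihoods to 1 drops B and recovers P(S | L).
p_s_given_l = posterior_strad(0.03, 0.663, 0.39, 1.0, 1.0)

print(f"P(S | L)       = {p_s_given_l:.6f}")
print(f"P(S | L and B) = {p_s_given_lb:.6f}")
```

Comparing the two printed values shows the small downward shift described above.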


Appendix

Bayes’ theorem: given a sample space \(\Omega\), events \(A_1,\ldots,A_n \subseteq \Omega\) forming a partition of \(\Omega\) with \(P(A_i) > 0\) for all \(i\leq n\), and an event \(B \subseteq \Omega\) such that \(P(B) > 0\), we have, for each \(i \leq n\),

\[\begin{align*} P(A_i\given B) &= \frac{P(A_i)P(B\given A_i)}{P(B)} \\ & = \frac{P(A_i)P(B\given A_i)}{P(A_1)P(B\given A_1) + \cdots + P(A_n)P(B\given A_n)} \end{align*}\]
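As a sketch of how the general statement translates into a computation, the function below takes the priors \(P(A_i)\) and the likelihoods \(P(B\given A_i)\) as two lists and returns all posteriors \(P(A_i\given B)\). The name `bayes_posterior` is hypothetical, and the example call simply reuses the two-event partition \(\{S, \neg S\}\) from the main text.

```python
def bayes_posterior(priors, likelihoods):
    """Return [P(A_i | B)] for a partition A_1..A_n.

    priors[i] is P(A_i) and likelihoods[i] is P(B | A_i).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    # Each term P(A_i) P(B | A_i); their sum is P(B) by total probability.
    joint = [p * lik for p, lik in zip(priors, likelihoods)]
    evidence = sum(joint)
    if evidence <= 0:
        raise ValueError("P(B) must be positive")
    return [j / evidence for j in joint]


# Partition {S, not S} with the label likelihoods from the main text.
posteriors = bayes_posterior([0.03, 0.97], [0.663, 0.39])
print(posteriors)
```

Because the denominator is shared, the returned posteriors always sum to 1.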

The conditional probability of an event \(A\) given an event \(B\) with \(P(B) > 0\) is the probability of \(A\) occurring given that \(B\) has occurred. It is denoted \(P(A\given B)\) and is defined by the formula:

\[P(A\given B) = \frac{P(A \cap B)}{P(B)}\]

Given three events \(A\), \(B\), and \(C\), we say that \(A\) and \(B\) are conditionally independent given \(C\) if the occurrence of one event does not affect the probability of the other event occurring, given that \(C\) has occurred. This can be expressed mathematically as:

\[P(A \cap B \given C) = P(A \given C) P(B \given C)\]

The formula can be rearranged to a more intuitive form. First, observe that the LHS can be rewritten as,

\[\begin{align} \begin{aligned} P(A \cap B \given C) & = \frac{P(A \cap B\cap C)}{P(C)}\\ & = \frac{P(A\cap C)P(B\given A\cap C)}{P(C)} \end{aligned} \end{align}\]

Now transform the RHS as,

\[\begin{align} \begin{aligned} P(A \given C) P(B \given C) & = \frac{P(A\cap C)}{P(C)} P(B\given C) \end{aligned} \end{align}\]

Re-drawing the equality with the transformed LHS and RHS, we get,

\[\begin{align} \begin{aligned} \frac{P(A\cap C)P(B\given A\cap C)}{P(C)} = \frac{P(A\cap C)}{P(C)} P(B\given C) \end{aligned} \end{align}\]

which, after cancelling the common factor \(P(A\cap C)/P(C)\) (assuming \(P(A\cap C) > 0\)), gives,

\[\begin{align} \begin{aligned} P(B\given A\cap C)= P(B\given C) \end{aligned} \end{align}\]

which more explicitly states that “when you know that \(C\) has occurred, learning that \(A\) has occurred does not provide any information on the probability of \(B\) occurring”.
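The equivalence of the two forms can be illustrated numerically. The sketch below builds a joint distribution over \((S, L, B)\) that satisfies the product form by construction, using the numbers from the main text, and then checks that the conditioning form holds as well. Here \(L\) plays the role of \(A\) and \(S\) the role of \(C\); the helper names `bern` and `prob` are made up for this example.

```python
from itertools import product

p_s = 0.03
p_l = {True: 0.663, False: 0.39}   # P(L | S) and P(L | not S)
p_b = {True: 0.95, False: 0.98}    # P(B | S) and P(B | not S)


def bern(p, outcome):
    """Probability of a yes/no outcome when 'yes' has probability p."""
    return p if outcome else 1.0 - p


# Joint built so that P(l, b | s) = P(l | s) P(b | s): the product form.
joint = {
    (s, l, b): bern(p_s, s) * bern(p_l[s], l) * bern(p_b[s], b)
    for s, l, b in product([True, False], repeat=3)
}


def prob(pred):
    """Total probability of all outcomes (s, l, b) satisfying pred."""
    return sum(p for outcome, p in joint.items() if pred(*outcome))


# Conditioning form: P(B | L and S) should equal P(B | S).
p_b_given_ls = prob(lambda s, l, b: s and l and b) / prob(lambda s, l, b: s and l)
p_b_given_s = prob(lambda s, l, b: s and b) / prob(lambda s, l, b: s)

print(p_b_given_ls, p_b_given_s)
```

The two printed numbers should agree up to floating-point rounding, which is the statement that learning \(L\) adds nothing about \(B\) once \(S\) is known.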


  1. A more precise value is 0.04995102840352595.

  2. A more precise value is 0.048496071267704312.