Bayes (WIP 2)
Instructions: Click and drag black labels along edges; click gray labels to switch to controlling those.
From a friend in business school: “1. What's the probability of a HS athlete going pro? 2. Suppose we know a pro athlete. What's the probability she was a college athlete?”
So I was thinking about my favorite intuitive illustrations and explanations of conditional probability and Bayes' theorem, e.g.
- http://setosa.io/conditional/
- http://yudkowsky.net/rational/bayes
- That brilliant.org one I think
The Victor Powell / Setosa one in particular is unbeatably excellent, I think. But I wanted to try this. It feels more intuitive to me that P(A) and P(B) should be perpendicular. In particular, it makes independence of A and B easy to see: the black and gray lines exactly overlap; P(A) = P(A|B).
When there's some dependence, I'm interested in the diagonal line you get between the midpoints of the P(B|A) line and the P(B|¬A) line. It's not shown by default cuz it got noisy, but it's the diagonal line you see mid-transition when switching between the black and gray parameter sets. I can imagine (and want to try) an alternate control scheme where you just have an intersection you can drag around inside the box and an angle for the diagonal line. That's just as expressive as this scheme (three numbers either way), and sorta interesting, because the angle of the diagonal line is so related (somehow?) to the, uh, covariance or w/e.
One cool thing about this is that you can feel out which things are linear and which are not. The slope of the diagonal line is independent of P(A), which I would not have intuited. And P(A|B) and P(A|¬B) are nonlinear functions of P(A), P(B|A), and P(B|¬A), which I don't think I had any intuition about, but feels central to a lot of counterintuitive results of conditional probability questions.
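To make the linear vs. nonlinear point concrete, here is a minimal Python sketch (not the code behind the widget). It assumes the layout described above: P(A) splits the unit square along x, and the P(B|A) and P(B|¬A) lines sit at those heights over the A and ¬A strips. The function names `derived` and `diagonal_slope` are made up for illustration.

```python
def derived(p_a, p_b_given_a, p_b_given_not_a):
    """Return (P(B), P(A|B), P(A|not B)) from the three dragged parameters.

    Assumes 0 < P(B) < 1, otherwise one of the divisions is undefined.
    """
    p_ab = p_a * p_b_given_a                      # area P(A and B)
    p_not_a_b = (1 - p_a) * p_b_given_not_a       # area P(not A and B)
    p_b = p_ab + p_not_a_b                        # law of total probability
    p_a_given_b = p_ab / p_b                      # Bayes' theorem
    p_a_given_not_b = (p_a - p_ab) / (1 - p_b)    # P(A and not B) / P(not B)
    return p_b, p_a_given_b, p_a_given_not_b


def diagonal_slope(p_a, p_b_given_a, p_b_given_not_a):
    """Slope of the line joining the midpoints of the two horizontal lines.

    Assumed geometry: the P(B|A) line spans x in [0, P(A)] and the
    P(B|not A) line spans x in [P(A), 1].
    """
    x1, y1 = p_a / 2, p_b_given_a
    x2, y2 = (1 + p_a) / 2, p_b_given_not_a
    return (y2 - y1) / (x2 - x1)


# Sweep P(A) in equal steps while holding the two conditionals fixed.
for p_a in (0.1, 0.5, 0.9):
    p_b, p_a_b, p_a_nb = derived(p_a, 0.8, 0.2)
    slope = diagonal_slope(p_a, 0.8, 0.2)
    print(f"P(A)={p_a:.1f}  P(B)={p_b:.3f}  P(A|B)={p_a_b:.3f}  "
          f"P(A|not B)={p_a_nb:.3f}  slope={slope:.3f}")
```

With these illustrative numbers, P(B) moves in equal steps (0.26, 0.50, 0.74) while P(A|B) does not (roughly 0.31, 0.80, 0.97), and the slope stays put. Under the assumed geometry the two midpoints are always 1/2 apart in x, so the slope works out to 2·(P(B|¬A) − P(B|A)), which is why P(A) drops out.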
As nice as it is to “feel out”, I want to be able to SEE any of those things I feel — spatialize the state space. Plot everything as a function of everything else, see the steepest slopes, marginal sensitivities, etc. A kind of phase space, idk. Ideally with the same visualization. Explode it along a third axis of all possible values of the current parameter... yessssssss that'd be so good, so doable. Whichever parameter you're currently holding, explode out all possible values along the z-axis, so you can see the nonlinear effects of dragging by dx.
I still want to make something that captures some feeling I have of weighing prior and posterior confidences, and the updating flowing one way or the other accordingly, almost hydraulically.
Discussion question ideas, like if I were trying to teach this
- How to interpret positions of lines
- P(¬A) and P(¬B) are not explicitly shown. What are they?
- Why don't we need separate controls for "not A" and "not B"?
- Why is it a square? Does it matter? What might other rectangles mean?
- What does and doesn't change when you click a grayed-out probability?
- Calculate the area of a segment and compare to the symbolic rules for probabilities. How do you interpret the areas?
- Calculate an area as a fraction of another larger area. How would you interpret that fraction?
- Something about dependence and independence of various variables (probabilities) wrt others
- Something about linear vs nonlinear responses to other variables (probabilities)
- Drag P(B|A) and watch P(B). Is there a maximum or minimum value of P(B) you can achieve? What is it? What about if you drag P(B|A) and watch P(A|B)?
- Try to get P(A|B) to equal P(B|A). What has to be true? Can you get them to equal any other way, or is your solution unique?
- Under some circumstances, these relationships simplify to something more intuitive or linear. Can you find examples? Why might those situations be confusing?
- Suppose P(B) is very small and P(A|¬B) is very big. Drag P(A|B) from 0 to 1, pausing at values in between. As you move it, how does P(B|A) move? What does it move between, roughly? At the same speed? I.e., if you change P(A|B) by some amount (let's say 0.1), does P(B|A) always change by roughly the same amount, no matter what P(A|B) is? What is that amount?
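For anyone checking their answer to that last question numerically, here is a small Python sketch. The values P(B) = 0.01 and P(A|¬B) = 0.9 are hypothetical stand-ins for "very small" and "very big", and `p_b_given_a` is an illustrative helper, not something from the widget.

```python
def p_b_given_a(p_b, p_a_given_b, p_a_given_not_b):
    """Bayes' theorem with the roles flipped: P(B|A) from P(B), P(A|B), P(A|not B)."""
    p_ab = p_b * p_a_given_b                       # P(A and B)
    p_a = p_ab + (1 - p_b) * p_a_given_not_b       # law of total probability
    return p_ab / p_a


P_B = 0.01              # hypothetical "very small" P(B)
P_A_GIVEN_NOT_B = 0.9   # hypothetical "very big" P(A|not B)

previous = None
for i in range(11):
    x = i / 10  # P(A|B) dragged from 0 to 1 in steps of 0.1
    y = p_b_given_a(P_B, x, P_A_GIVEN_NOT_B)
    change = "" if previous is None else f"  change={y - previous:.5f}"
    print(f"P(A|B)={x:.1f}  P(B|A)={y:.5f}{change}")
    previous = y
```

In this regime the denominator P(A) barely moves as P(A|B) changes, so P(B|A) creeps from 0 up to only about P(B) / P(A|¬B), in nearly equal steps. Making P(B) larger or P(A|¬B) smaller should make the curvature visible again.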
To-do
- Interactions should maybe be more obvious... ghost cursor?
- ✅ Label areas: P(A ∧ B), P(A ∧ ¬B), P(¬A ∧ B), P(¬A ∧ ¬B)
- ✅ Smoothly transition from A|B lines to B|A lines (ghosting the past lines and transitioning solid ones to new spots through the diagonal)
- Show derivations and definitions on hover
- Add a hover-sensitive plain text summary sentence caption ("If B is true, then the probability of A is the probability of A and B divided by the probability of B...", highlighting the relevant quantities and areas, maybe color-coding)
- Add covariance as a lever to fiddle with (closely related but not identical to the diagonal??)
- Toggle independence of A and B (or total dependence?)
- One day a more constructive writeup would be good...
- Could you represent dependence by skewing the square into a rhombus? Like the angle between the P(A) and P(B) edges would vary between 90° (independent) and 0° (totally dependent).
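On the covariance-as-a-lever idea, here is one possible sketch in Python, treating A and B as 0/1 indicator variables so that Cov(A, B) = P(A ∧ B) − P(A)·P(B). The function name `from_covariance` and the choice to hold P(A) and P(B) fixed while the lever moves are assumptions for illustration, not a description of how the widget works.

```python
def from_covariance(p_a, p_b, cov):
    """Recover (P(B|A), P(B|not A)) from P(A), P(B) and the indicator covariance.

    Assumes 0 < P(A) < 1. Raises ValueError if the requested covariance
    would need an impossible joint probability.
    """
    p_ab = cov + p_a * p_b                  # Cov = P(A and B) - P(A) P(B)
    lowest = max(0.0, p_a + p_b - 1)        # P(A and B) cannot go below this
    highest = min(p_a, p_b)                 # ... or above this
    if not lowest <= p_ab <= highest:
        raise ValueError(f"covariance {cov} is not reachable for these marginals")
    p_b_given_a = p_ab / p_a
    p_b_given_not_a = (p_b - p_ab) / (1 - p_a)
    return p_b_given_a, p_b_given_not_a


# Zero covariance is independence: both conditionals equal P(B).
print(from_covariance(0.4, 0.5, 0.0))

# A positive covariance pulls P(B|A) up and P(B|not A) down.
print(from_covariance(0.4, 0.5, 0.05))
```

The gap between the two conditionals satisfies Cov(A, B) = P(A)·(1 − P(A))·(P(B|A) − P(B|¬A)). So the covariance is the gap scaled by a factor that depends on P(A), which may be why it feels closely related to the diagonal but not identical to it.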
See also
- https://en.wikipedia.org/wiki/An_Essay_towards_solving_a_Problem_in_the_Doctrine_of_Chances
- https://en.wikipedia.org/wiki/Beta_distribution#Bayes.27_prior_probability_.28Beta.281.2C1.29.29
- I asked for help here, there's some good discussion, what a ridiculously helpful community: https://math.stackexchange.com/questions/2407913/is-it-possible-to-divide-a-square-into-four-parts-of-arbitrary-size-with-two-lin