
Bayes' Theorem and Stochastic Processes

(Short paper I wrote for a CS23: Discrete Maths course at Cabrillo College. — March 2014)

How does the concept of a stochastic process relate to, say, conditional probabilities and Bayes' Theorem (which is often referred to as inverted conditional probabilities)? Are they the same? Is one a subset of the other? What is a stochastic process? What does it mean for a stochastic process to be discrete or continuous? What are some current and traditional areas of research where stochastic processes are used?


As humans, we find it important to analyse, classify, and understand the world around us. Some systems, such as spring contraptions (or other physics-related systems), are considered deterministic, because they can be explained with a strict set of rules (usually expressed as differential equations). However, there are many other systems that are intrinsically random (or that we simply don't understand fully, so they "seem" random) and cannot be explained with a fixed set of equations; instead we need to rely on probability. We call these "random" systems stochastic systems.

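A stochastic process is the collection of random variables that represents how a stochastic system changes over time. Think of it as the set of measurements you would gather while researching the population of majestic lions in the African savannah. These processes can be discrete or continuous: a discrete variable takes countable values (the number of lions in a pride), while a continuous variable can take any value in a range (the weight of a single lion).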

Discrete vs. continuous, by hand

[Interactive demo: a coin flip builds a discrete tally, while a weight slider moves through a continuous range. Same system, two kinds of variable.]
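As a rough illustration of the two kinds of variable, here is a minimal Python sketch. The function names, the starting weight of 190 kg, and the Gaussian step size are all made up for the example; time advances in steps in both cases, and what differs is the kind of value being recorded.

```python
import random

def coin_tally(n_flips, rng):
    """Discrete variable: running count of heads after each flip."""
    tally, heads = [], 0
    for _ in range(n_flips):
        heads += rng.random() < 0.5  # True adds 1, False adds 0
        tally.append(heads)
    return tally

def weight_walk(n_steps, start_kg, rng, step_sd=0.5):
    """Continuous variable: a weight drifting by small Gaussian steps."""
    path, weight = [], start_kg
    for _ in range(n_steps):
        weight += rng.gauss(0.0, step_sd)  # any real-valued change is possible
        path.append(weight)
    return path

rng = random.Random(0)  # seeded so that reruns give the same numbers
heads = coin_tally(10, rng)
weights = weight_walk(10, 190.0, rng)  # hypothetical lion weight in kg
print(heads)                            # whole numbers only
print([round(w, 2) for w in weights])   # any value in a range
```

The tally can only ever land on whole numbers, while the weight can land anywhere in between, which is the distinction the article is drawing.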

Since these systems are random, the only way we can understand and model them is through probabilities. Say we are trying to model a typical game of Blackjack (a.k.a. 21) and we want to know our probability of winning with a perfect 21 points. There are multiple ways of getting there (e.g. 9+9+3; King+8+3; Ace[1]+Ace[11]+9). This is a discrete random system that we can only model probabilistically. Furthermore, our models have to adapt to our initial conditions, since our chances of winning are not the same if our first card is a 3 rather than a 10.

This is where Probability Theory comes into play, and more specifically conditional probability. Conditional probability measures the probability of an event given that some other event is known to have happened. We denote this as P(A|B), read "the probability of A given B." A simple example: in a class where 20 assignments make up 100% of the grade, what is the probability that a student will pass the class given that they got an A on the first assignment?
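To make the effect of the first card concrete, here is a small sketch that counts outcomes exactly. It is a deliberate simplification, not a full Blackjack model: it assumes a single 52-card deck, looks only at two-card hands, and always counts an ace as 11. The helper names are invented for the example.

```python
from fractions import Fraction
from itertools import permutations

# One 52-card deck by value: ace counted as 11, face cards as 10 (a simplification).
RANK_VALUES = [11, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10]
DECK = [value for value in RANK_VALUES for _suit in range(4)]

def probability(event, given=lambda hand: True):
    """P(event | given), counted over all ordered two-card deals from DECK."""
    hands = [h for h in permutations(DECK, 2) if given(h)]
    hits = sum(1 for h in hands if event(h))
    return Fraction(hits, len(hands))

def is_21(hand):
    return sum(hand) == 21

p_21 = probability(is_21)                                         # P(A)
p_21_given_ten = probability(is_21, given=lambda h: h[0] == 10)   # P(A | first card is worth 10)
p_21_given_three = probability(is_21, given=lambda h: h[0] == 3)  # P(A | first card is a 3)
print(p_21, p_21_given_ten, p_21_given_three)  # 32/663 4/51 0
```

Under these assumptions, knowing the first card is worth 10 raises the chance of a two-card 21 from about 4.8% to about 7.8%, while a first card of 3 rules it out entirely.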

But now, what would happen if I inverted the previous question: "in a class where 20 assignments make up 100% of the grade, what is the probability that a student who passed got an A on the first assignment?" This is where Bayes' Theorem comes in handy. It is a common mistake to think that P(A|B) has to equal P(B|A). The two are equal only if P(A) = P(B) (or in the degenerate case where A and B never occur together, so both are zero); otherwise you have to compute:

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$
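For readers who want to see where the formula comes from, here is a short derivation sketch. It assumes P(A) > 0 and P(B) > 0, and it writes P(A ∩ B) for the probability that both events occur, a notation not used elsewhere in this article.

```latex
\begin{align*}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)}, \qquad P(B \mid A) = \frac{P(A \cap B)}{P(A)} \\
\Rightarrow\; P(A \cap B) &= P(B \mid A)\, P(A) \\
\Rightarrow\; P(A \mid B) &= \frac{P(B \mid A)\, P(A)}{P(B)}
\end{align*}
```

Both conditional probabilities share the numerator P(A ∩ B); only the denominator changes, which is why the two coincide when P(A) = P(B).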

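I wouldn't say Bayes' Theorem is a subset of conditional probability. It follows directly from the definition of conditional probability, and the two share more of a converse relationship: if we were to treat them loosely as statements, conditional probability could be something like p → q, while Bayes' Theorem would be q → p.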

Bayes' Theorem, made visible

A classic trap: a test that's "99% accurate" can still be mostly wrong when the thing it tests for is rare.

[Interactive area diagram: a population split by condition and by test result, with regions for "has it and tests positive" (true positive), "doesn't have it and tests positive" (false positive), and "tests negative". Sliders update the posterior P(A|B), the chance it's real given a positive result.]
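The trap can be checked with a few lines of arithmetic. The sketch below uses hypothetical numbers (1 person in 1000 has the condition, and the test is right 99% of the time for both sick and healthy people); the function and parameter names are invented for the example.

```python
from fractions import Fraction

def posterior(prior, sensitivity, false_positive_rate):
    """P(A|B) by Bayes' Theorem, with A = has the condition, B = tests positive."""
    true_pos = sensitivity * prior                 # P(B|A) * P(A)
    false_pos = false_positive_rate * (1 - prior)  # P(B|not A) * P(not A)
    return true_pos / (true_pos + false_pos)       # the denominator is P(B)

# Hypothetical inputs: prevalence 1/1000, sensitivity 99%, false-positive rate 1%.
p = posterior(Fraction(1, 1000), Fraction(99, 100), Fraction(1, 100))
print(p, float(p))  # 11/122, roughly 0.09
```

With these inputs a positive result is real only about 9% of the time, because the few true positives are swamped by false positives from the much larger healthy group.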

As you can imagine, these ideas of stochastic processes and conditional probability are used for almost anything you can think of: programming computer AI, understanding electron behaviour around the nucleus, predicting population growth, trying to beat the odds at a casino, weather prediction, videogames, and disease/pandemic spread predictions, among others. Pretty much anything that involves a random set of variables is usually modelled with these ideas and theorems.

To give an interesting and useful example, there is a technical indicator known as the Stochastic Oscillator that is used to gauge the momentum of a stock: it compares the latest closing price with the stock's high-low range over a set period, which hints at whether the trend is heading up or down. It's an interesting tool, since a stock that is falling in the short run can still show strong rising momentum over a longer window. Stock fluctuations are typical in the market: random events such as congressional decisions, or global or regional events, might impact a stock's selling price, but the overall momentum suggests whether these bumps are likely to have a lasting negative effect or are just transitional for your chosen time range. (E.g. if you pick a yearly time frame for a stock with strong rising momentum, a congressional decision might make the stock fall for a few months, yet it may well end the year above its original price.)
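For the curious, the usual %K line of the Stochastic Oscillator can be sketched in a few lines of Python. This is a minimal illustration rather than a trading tool: the prices are made up, the function name and the default 14-bar period are choices made for the example, and returning 50 for a completely flat window is an assumption of this sketch, not part of the standard definition.

```python
def percent_k(highs, lows, closes, period=14):
    """%K = 100 * (close - lowest low) / (highest high - lowest low) over the last `period` bars."""
    if not (len(highs) == len(lows) == len(closes)):
        raise ValueError("highs, lows and closes must have the same length")
    values = []
    for i in range(period - 1, len(closes)):
        window_high = max(highs[i - period + 1 : i + 1])
        window_low = min(lows[i - period + 1 : i + 1])
        span = window_high - window_low
        # A flat window has no range; report the midpoint rather than divide by zero (assumption).
        values.append(50.0 if span == 0 else 100.0 * (closes[i] - window_low) / span)
    return values

# Made-up prices, purely for illustration.
highs = [10, 11, 12, 13, 14]
lows = [8, 9, 10, 11, 12]
closes = [9, 10, 11, 12, 14]
print(percent_k(highs, lows, closes, period=3))  # [75.0, 75.0, 100.0]
```

A reading near 100 means the latest close sits at the top of its recent range, and a reading near 0 means it sits at the bottom.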

One particularly flabbergasting example of stochastic models in physics is an approach to Quantum Mechanics generally referred to as the Stochastic interpretation of Quantum Mechanics. It models physics with the assumption that the space-time structure is actually undergoing metric and topological fluctuations (in layman's terms: it changes and distorts, so maybe one second is actually a tad longer than another, or a meter is actually smaller in some places than others). When analysing space-time from a large-scale view, these tiny fluctuations average out and we get a convenient system describable with deterministic models like those in Classical Physics. However, when we try to analyse our universe at a small scale, the fluctuations no longer average out and eventually our classical views break down. At this point we start talking about Quantum Mechanics and we need to rely on stochastic models and Probability Theory to describe its behaviour (e.g. Schrödinger's Equation, which gives us a probability wave of where we might find small particles). What this basically hypothesises is that space-time's fluctuations give rise to Quantum Mechanics, and not the other way around (which is how it is usually considered).
