I asked for examples of things that are Poisson distributed in class. One student said the number of chocolate chips in a cookie is Poisson distributed. He’s right.
Here is the intuition for when you have a Poisson distribution. First, you should have a counting process where you are interested in the total number of events that occur by time t or in space s. If each of these events is independent of the others, then the result is a Poisson distribution.
Let’s consider the Poisson process properties of a chocolate chip cookie. Let N(t) denote the number of chocolate chips in a cookie of size t. N(t) is a Poisson process with rate y if all four of the following conditions hold:
1) The cookie has stationary increments, where the number of chocolate chips in a cookie is proportional to the size of the cookie. In other words, a cookie with twice as much dough should have twice as many chocolate chips (N(t) ~ Poisson (y*t)). That is a reasonable assumption.
2) The cookie has independent increments. The number of chocolate chips in a cookie does not affect the number of chocolate chips in the next cookie.
3) A cookie without any dough cannot have any chocolate chips (N(0)=0).
4) The probability of finding two or more chocolate chips in a cookie of size h is o(h). In other words, you will find at most one chocolate chip in a tiny amount of dough.
All of these assumptions appear to be true, at least in a probabilistic sense. Technically there may be some dependence between chips if we note that bags of chocolate chips have a finite population (whatever is in the bag). There is some dependence between the number of chocolate chips in one cookie and the next if we note that how many chips we have used thus far gives us additional knowledge about how many chips are left. This would violate the independent increments assumption. However, the independence assumption is approximately true since the frequency of chocolate chips in the cookie you are eating is roughly independent of the frequency of chocolate chips in the cookies you have already eaten. As a result, I expect the Poisson to be an excellent approximation.
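To see how little the finite bag matters, here is a minimal simulation sketch. It assumes a hypothetical batch of 6000 chips mixed into 1000 equal-sized cookies, with each chip equally likely to land in any cookie; those numbers and the function names are illustrative choices, not measurements from a real recipe. The chips-per-cookie counts are technically dependent (they must add up to the bag), yet the simulated frequencies should sit close to the Poisson probabilities with mean 6.

```python
import math
import random
from collections import Counter


def poisson_pmf(k, mean):
    """P(N = k) for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def chips_per_cookie(n_chips, n_cookies, rng):
    """Drop each chip from a finite bag into a uniformly random cookie."""
    counts = [0] * n_cookies
    for _ in range(n_chips):
        counts[rng.randrange(n_cookies)] += 1
    return counts


rng = random.Random(0)            # fixed seed so the run is repeatable
n_chips, n_cookies = 6000, 1000   # hypothetical batch: 6 chips per cookie on average
mean = n_chips / n_cookies

counts = chips_per_cookie(n_chips, n_cookies, rng)
observed = Counter(counts)

# Compare the simulated share of cookies with k chips to the Poisson probability.
for k in range(13):
    simulated = observed[k] / n_cookies
    print(f"{k:2d} chips: simulated {simulated:.3f}, Poisson {poisson_pmf(k, mean):.3f}")
```

A Poisson random variable has variance equal to its mean, so another quick check is that the sample variance of the counts comes out near 6.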
The fourth school bus accident in the Richmond, Virginia area occurred this morning. Everyone wants to know, what does this mean?!?
Here’s what I think it means: bus accidents can be modeled as a Poisson process. Equivalently, the time between bus accidents can be modeled using the exponential distribution. This modeling paradigm is appropriate if bus accidents “randomly” occur independently of one another, which is a reasonable assumption.
If the time between bus accidents is exponentially distributed, then we expect that sometimes bus accidents occur in groups of three or four. Example exponential probability distributions are below. The exponential distribution has parameter lambda, where the average time between arrivals (bus accidents in this case) is 1/lambda. Most of the “meat” of the distribution is close to zero, even if the average time between arrivals is very large. This means that we would expect to sometimes observe several small interarrival times and then go a long time before the next arrival.
Let’s put this in terms of bus accidents. If bus accidents occur as a result of chance or coincidence, then we would sometimes expect to observe four bus accidents in a week and then go months before the next bus accident. Four bus accidents in a week does not necessarily imply that something nefarious is going on.
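The clustering can be checked with a short simulation. This is a sketch under an assumed average of four weeks between accidents (an illustrative value, not local data); the function name and sample size are my own choices. It draws exponential gaps and compares the share of short gaps with the exact value from the exponential CDF, F(t) = 1 - exp(-lambda*t).

```python
import math
import random


def fraction_below(gaps, threshold):
    """Share of interarrival times shorter than the threshold."""
    return sum(g < threshold for g in gaps) / len(gaps)


rng = random.Random(1)           # fixed seed so the run is repeatable
mean_gap_weeks = 4.0             # assumed average time between accidents
lam = 1.0 / mean_gap_weeks       # rate parameter of the exponential
gaps = [rng.expovariate(lam) for _ in range(100_000)]

# Exact probabilities from the exponential CDF.
exact_below_mean = 1 - math.exp(-1.0)        # gap shorter than the average
exact_below_week = 1 - math.exp(-lam * 1.0)  # gap shorter than one week

print(f"shorter than the mean: simulated {fraction_below(gaps, mean_gap_weeks):.3f}, "
      f"exact {exact_below_mean:.3f}")
print(f"shorter than one week: simulated {fraction_below(gaps, 1.0):.3f}, "
      f"exact {exact_below_week:.3f}")
print(f"longest simulated gap: {max(gaps):.1f} weeks")
```

Well over half of the gaps are shorter than the average gap, which is why short bursts followed by long quiet stretches are the norm rather than the exception.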
This reasoning can also be used to explain why completely unrelated celebrity deaths sometimes occur in threes.
Example exponential distributions (probability density functions). The average time between arrivals is lambda^-1.
How rare are four bus accidents in a week? Let’s assume that bus accidents occur once every four weeks on average (lambda=1/4 per week). The number of accidents in a week is then Poisson distributed with mean 1/4, and the probability of observing 4+ accidents in a week is about 0.01%. Pretty rare. But that’s any one week. The school year is 36 weeks long, which means that we would have 36 chances to have 4+ accidents in a week. Using the Binomial distribution, we find that the odds of having at least one week with 4+ accidents is 0.5% (once every 200 years).
What about a slightly less extreme week? The probability of observing 3+ accidents in a week is 0.2%. Over the course of a year, the odds of having at least one week with 3+ accidents is 7.5% (once every 13 years).
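These figures can be reproduced with a few lines of arithmetic. The sketch below assumes weekly accident counts are Poisson with mean 1/4 and that the 36 school weeks are independent of one another; the function names are mine, chosen for illustration.

```python
import math


def poisson_tail(k, mean):
    """P(N >= k) for N ~ Poisson(mean), via the complement of the CDF."""
    cdf = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1 - cdf


def at_least_one_week(p_week, n_weeks):
    """P(at least one such week) when the weeks are independent trials."""
    return 1 - (1 - p_week) ** n_weeks


rate_per_week = 0.25   # one accident every four weeks on average
school_weeks = 36

for k in (4, 3):
    p_week = poisson_tail(k, rate_per_week)
    p_year = at_least_one_week(p_week, school_weeks)
    print(f"{k}+ accidents: {p_week:.4%} in a given week, "
          f"{p_year:.2%} in a school year, "
          f"about once every {1 / p_year:.0f} years")
```

The "at least one week" step is the Binomial calculation written as one minus the probability of zero such weeks in 36 tries.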