Understanding the Poisson Distribution

I find that probability distributions are often useful tools to know and understand, but the explanations are not always very intuitive. The Poisson distribution is one of the probability distributions that I have run into quite often. Most recently I ran into it when preparing for some AWS Machine Learning certification questions. Since this is not the first time I have run into it, I figured it would be nice to understand it better. In this article, I explore questions on when, where, and how to apply it. And I try to keep it more intuitive by using some concrete example cases.

Example Uses of the Poisson Distribution

Before looking into details of something, it is nice to have an idea of what that something is useful for. The Poisson Distribution is typically described as a discrete probability distribution. It tells you the probability of a discrete number of events in a timeframe, such as 1 event, or 2 events, or 3 events, … Not 1.1 events, or 2.7 events, or any other fractional events. Whole events only. This is why it is called discrete.

Some example uses for the Poisson distribution include estimating the number of:

  • calls to a call center
  • cars passing a point on a highway
  • fatal airplane crashes in a year
  • particles emitted in radioactive decay
  • faults in physical hardware
  • people visiting a restaurant during lunch time

One typical use of these estimates would be as input to capacity planning (e.g., call center staffing or hardware provisioning). Besides events over time, the Poisson distribution can also be used to estimate instances in an area, volume, or over a distance. I will present some examples of these different application types in the following sections. For further ideas and examples of its application, there is a nice Quora question/answers.

Let’s start by looking at what the Poisson distribution is, and how to calculate it.

Properties of the Poisson Distribution

A process that produces values conforming to a Poisson Distribution is called a Poisson Process. Such a process, and thus the resulting Poisson Distribution, is expected to have the following properties:

  • Discrete values. An event either occurs or not, there is no partial occurrence here.
  • Independent events. Occurrence of one event does not affect the occurrence of other events.
  • Multiple events do not occur at the same time. However, the sub-intervals in which they occur may be very small.
  • Average (a.k.a. the mean) number of events is constant through time.

The following figure/line illustrates a Poisson Process:

Poisson process example, with average of 5 units between events.

The dots in the above figure illustrate events occurring. The x-axis illustrates time passing. The units of time (on the x-axis here) could be any units of time, such as seconds, minutes, hours, or days. They could even be other types of units, such as distance. The Poisson distribution / process does not really care about the unit of measurement, only that the properties of the distribution are met. I generated the above figure using a random process that would generate values between 1-10, with an average value of 5. In practice, I used the Python random module to generate the numbers.
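As a rough sketch of how such a timeline of events can be produced (this is not the exact code behind the figure above): here I assume exponentially distributed gaps between events, which is the textbook way to simulate a Poisson process, rather than the 1-10 range described above. The function name and the seed are made up for illustration.

```python
import random

def simulate_event_times(mean_gap, duration, seed=None):
    """Return sorted event times in [0, duration), using exponential gaps."""
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(1 / mean_gap)  # time of the first event
    while t < duration:
        times.append(t)
        t += rng.expovariate(1 / mean_gap)  # wait for the next event
    return times

# On average one event every 5 time units, observed for 100 time units.
events = simulate_event_times(mean_gap=5, duration=100, seed=42)
print(len(events), "events in 100 time units")
```

Running it with different seeds gives different event counts, scattered around the average of 20.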

As I said, the time unit could be anything, but for clarity of example I will use seconds. To calculate the Poisson distribution, we really only need to know the average number of events in the timeframe of interest. If we count the number of dots in the above figure, we get 18 events (dots). Now assume we want to know the probability of getting exactly 18 events in a timeframe of 100 seconds (as in the figure).

In this process, the average number of events would be 100/5 = 20. We have a timeframe of 100 seconds, and a process that produces events on average every 5 seconds. Thus on average we have 100/5 = 20 events in 100 seconds.

To answer the question on the probability of having exactly 18 events in a timeframe of 100 seconds, we can plug these numbers into the Poisson distribution formula. This gives a chance of about 8.4% for having 18 events occur in this timeframe (100s). In the Poisson Distribution formula this would translate to the parameters k = 18, λ (lambda) = 20.

Let’s look at what these mean, and what the formula is:

The Poisson Distribution Formula

To calculate the Poisson distribution, we can use the formula defined by the French mathematician Siméon Denis Poisson, about 200 years ago:

Poisson distribution formula.
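Written out in LaTeX notation, and consistent with the plain-text calculation used later in this section, the formula reads as follows. Here P(X = k) is my shorthand for "the probability of exactly k events":

```latex
P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}
```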

It looks a bit scary with those fancy symbols. To use this, you really just need to know the values to plug in:

  • e: Euler’s number. A constant value of about 2.71828. Most programming languages provide this as a constant, such as math.e in Python.
  • λ (lambda): The average number of events in an interval. Sometimes μ (mu) is used, but that is just notation, the formula stays the same.
  • k: The number of events to calculate the probability for.

For the more programming oriented, here is the same, in Python:

The Poisson distribution formula in Python.
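A minimal sketch of what such a function can look like, using only the Python standard library. The function and parameter names are my own choice for illustration:

```python
import math

def poisson_probability(k, lam):
    """Probability of exactly k events when the average number of events is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# The example from the previous section: 18 events, with an average of 20.
print(round(poisson_probability(18, 20), 5))
```

For very large values of k or λ the intermediate numbers get huge, so a version working with logarithms may be a safer choice there.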

This code, and the code to generate all the images in this article, is easiest to access on my Kaggle notebook. It’s a bit of a mess, but if you want to look at the code, it’s there.

If we plug the values for the imaginary Poisson Process from the previous section (18 dots on a line, with an average of 20) into this formula, we get the following:

(e^-20)*(20^18)/18! = 0.08439

Which translates to about 8.4% chance of 18 events (k=18) in the timeframe, when the average is 20 events (λ=20). That’s how I got the number in the previous section.

Example: Poisson Distribution for Reliability Analysis

I find concrete examples make concepts much easier to understand. One area that I find is a fitting example here, and where I have experience in seeing the Poisson distribution applied, is system reliability analysis. So I will illustrate the concept here with an example of how to use it to provide assurance of “five nines”, or 99.999%, reliability in relation to potential component failures.

There are many possible components in different systems that would fit this example quite nicely. I use storage nodes (disks in good old times) as an example. Say we have a system that needs to have 100 storage nodes running at all times to provide the required level of capacity. We need to provide this capacity with a reliability of 99.999%. We have some metric (maybe from past observations, from manufacturer testing, or something else) to say the storage nodes we use have an average of 1 failure per 100 units in a year.

Knowing that 1 in 100 storage nodes fails on average during a year does not really give much information on how to achieve 99.999% assurance of having a minimum of 100 nodes up. However, with this information, we can use the Poisson distribution to help find an answer.

To build this distribution, we just plug in these values into the Poisson formula:

  • λ = 1
  • k = 0-7

This results in the following distribution:

Poisson distribution for λ (avg)=1, k (events) = 0-7.

Here λ (avg in the table) is 1, since we have the average number of events at 1. For k (events in the table above), I have simply started at 0, and increased by 1 until I reached the target probability of 99.999%. The Poisson formula with these values is also in the table for each row. Adding up all the probabilities gives us the value of 7 as the number of potential failures until we reach the probability of 99.999% (cumulative in the table).

So how did we reach 7? We have a probability of 36.7879% for 0 failures, and the same 36.7879% for 1 failure. The cumulative probability for 0 or 1 failures is 36.7879 + 36.7879 = 73.5759%. The probability of exactly 2 failures is 18.394%. The cumulative probability of 0, 1, or 2 failures is thus 73.5759 + 18.394 = 91.9699%. And so on, as visible in the table (cumulative column). Since these are mutually exclusive outcomes (we cannot have both 0 and 1 failures at the same time), we sum them up to get the cumulative chance of ending up in any of the scenarios of 0-7 failures. This is the cumulative 99.999% value in the table row 7.
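The cumulative sum described above can be sketched in a few lines of Python. This is an illustration with names I made up, not the code used to produce the original table:

```python
import math

def poisson_probability(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def cumulative_table(lam, max_k):
    """Return (k, probability, cumulative probability) rows for k = 0..max_k."""
    rows = []
    cumulative = 0.0
    for k in range(max_k + 1):
        p = poisson_probability(k, lam)
        cumulative += p
        rows.append((k, p, cumulative))
    return rows

for k, p, cumulative in cumulative_table(lam=1, max_k=7):
    print(f"k={k}  probability={p:.4%}  cumulative={cumulative:.4%}")
```

Note that the unrounded cumulative value for k=7 sits marginally below 99.999% and only reaches it after rounding, which matters for how strictly the target is interpreted.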

So the answer to the original question in this example is to start the year with 107 nodes running to have 99.999% assurance of keeping a minimum of 100 nodes running at all times through the year. Or you could also do 108 depending on how you interpret the numbers (does the last failure count as reaching 99.999%, or do you need one more).

Of course, this does not consider external issues such as meteor hits, datacenter fires, tripping janitors, etc. but rather is to address the “natural” failure rate of the nodes themselves. It also assumes that you wouldn’t be exchanging broken nodes as they fail. But this is intended as an analysis example, and one can always finetune the process based on the results and exact requirements.

For those interested, some further examples of such applications in reliability analysis are available on the internet. Just search for something like poisson distribution reliability engineering.

Poisson Distributions for average of 0-5 events

Now that we have the basic definitions, and a concrete example, let’s look at what the distribution generally looks like. With the same parameters, the Poisson distribution is always the same, regardless of the unit of time or space used. Using an average number of events 1/minute or 1/hour, the distribution itself is the same. Just the time-frame changes. Similarly, looking at an average of instances per area of 10/m2 or 10/cm2 will have the same distribution, just over a different area.

Here I will plot how the distribution evolves as the λ (average number of events in timeframe or area) is changed. I use a varying number of events k in each case, up until the probability given by the Poisson distribution is very small. I use discrete (whole) numbers for the λ, such as 0, 1, 2, 3, 4, and 5, although the average can have continuous values (0.1 would be fine for λ). Let’s start with λ of 0:

Poisson distribution for 0-5 events (k) with an average of 0 events (λ) in time interval.

The above figure shows the Poisson distribution for an average of 0 events (λ). If you think about it for a moment, the average number of events can only be 0 if the number of events is always 0. This is what the above figure and table show. With an average of 0 events, the number of events is always 0.

Poisson distribution for 0-5 events (k) with an average of 1 event (λ) in time interval.

A bit more interesting, the above figure with an average of 1 event (λ) in the time interval shows a bigger spread of probable number of events. This distribution is already familiar from the reliability estimation example I showed, as it had λ of 1. Here the number of events (k) is 0 or 1 equally often (about 36.8% each), after which the probability quickly goes down for larger numbers of events (k). The average of 1 is balanced by the large probability of 0’s and smaller probability of the larger numbers for k.

Poisson distribution for 0-7 events (k) with an average of 2 events (λ) in time interval.

Here, in the figure with an average of 2 events (λ) in the interval, we see some trends with the increase of the average number of events (λ) as compared to the distributions with λ of 0 and 1. As λ increases, the center of the distribution shifts right, the distribution spreads wider apart (x-axis becomes broader), and the percentage of single values becomes smaller (y-axis is shorter). So there are more values for k, each one individually having smaller probabilities but summing to 1 in the end.

Poisson distribution for 0-8 events (k) with an average of 3 events (λ) in time interval.

Compared to the smaller averages (0, 1, 2), the above figure with an average of 3 events (λ) further shows the trend where the average value itself (here k=3) has the highest probability, with the value right below it (here k=2) matching it in probability (for a whole-number λ the two are equal). The center also keeps shifting right, with the spread getting broader.

Poisson distribution for 0-10 events (k) with an average of 4 events (λ) in time interval.

The averages (λ) of 4 (figure above) and 5 (figure below) show the same trends as with 0, 1, 2, and 3, continuing further.

Poisson distribution for 0-11 events (k) with an average of 5 events (λ) in time interval.

To save some space, and to illustrate the Poisson distribution evolution as the average number of events (λ) rises, I animated it over different λ values from 0 to 50:

Poisson distribution from 0 to 50 (discrete) averages (λ) and scaled number of events (k).

As I noted before, the average (λ) does not need to be discrete, even if these examples I use here are. You cannot have 0.1 failures, but you can have on average 0.1 failures.

With the above animated Poisson distribution going through λ values from 0 to 50, the same trends as before become more pronounced. As the average number of events (λ) rises:

  • The overall number of events predicted goes up (x-axis center moves right).
  • The distribution becomes wider (x-axis becomes broader) with more values for k (number of events).
  • The probability of any single number of events (k) gets smaller (y-axis is shorter).
  • The probability always peaks at the average (highest probability where λ=k).
  • The summed probability always stays at 1 (=100%) over all values of k.
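The trends in the list above can also be checked numerically. The following is a small sketch with helper names I made up, and with an arbitrary cut-off of 0.1% for what counts as a "likely" number of events:

```python
import math

def poisson_probability(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def summarize(lam, max_k=60):
    """Return (k values at the peak, peak probability, count of likely k, summed probability)."""
    probabilities = [poisson_probability(k, lam) for k in range(max_k + 1)]
    peak = max(probabilities)
    # Several k can share the peak, so collect all of them (with a small tolerance).
    peak_ks = [k for k, p in enumerate(probabilities) if math.isclose(p, peak)]
    likely = sum(1 for p in probabilities if p >= 0.001)  # k values with at least 0.1%
    return peak_ks, peak, likely, sum(probabilities)

for lam in range(6):
    peak_ks, peak, likely, total = summarize(lam)
    print(f"λ={lam}: peak at k={peak_ks}, peak={peak:.3f}, likely values={likely}, total={total:.6f}")
```

For a whole-number λ (above 0), the peak is shared by k = λ-1 and k = λ, which is why the helper collects every k at the peak instead of just one.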

That sums up my basic exploration of the Poisson distribution.

Example: Snowflakes, Call Centers, and a Poisson Distribution

As I noted, I find concrete examples tend to make things easier to understand. I already presented the example from reliability engineering, which I think makes a great example of the Poisson distribution as it seems like such an obvious fit. And useful too.

However, sometimes the real world is not so clear on all the assumptions on applying the Poisson distribution. With this in mind, I will try something a bit more ambiguous. Because it is winter now (where I live it is), I use falling snowflakes as an example to illustrate this problem. When I say snowflakes falling, I mean when the flakes hit the ground, on a specific area.

Assume we know the average rate of snowflakes falling in an hour (on the selected area) is 100, and we want to know the probability of 90 snowflakes falling. Using the Poisson distribution formula defined earlier in this article, we get the following:

  • e = 2.71828: as defined above (or math.e)
  • λ = 100: The average number of snowflakes falling in a given timeframe (1 hour).
  • k = 90: The number of snowflakes we want to find the probability for.

Putting these into the Poisson formula, we get:

(e^-100)*(100^90)/90! = 0.025.

Which translates to about 2.5% probability of 90 snowflakes falling in an hour (the given time-frame), on the given area.

To calculate the broader Poisson probability distribution for different number of snowflakes, we can, again, simply run the Poisson formula using different values for k. Remember, k here stands for the number of events to find the probability for. In this case the events are the snowflakes falling. Looping k with values from 75 to 125 gives the following distribution:

Poisson distribution for k=75-125, λ=100.

Sorry for the small text in the image. The bottom shows the ticks for x-axis as the numbers from 75 to 125. These are the values k that I looped for this example. The bars in the figure denote the probability of each number of snowflakes (running the Poisson formula with the value k). The numbers on top of the bars are the probabilities for each number of snowflakes falling, given the average of 100 snowflakes in the interval (λ). As usual, the Poisson distribution always peaks at the average point (λ, here 100). The probability at the highest point (k=100) here is only about 4% due to wide spread of the probabilities in this distribution.
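Looping the formula over k can be sketched as below. With an average of 100, the direct formula produces very large intermediate numbers, so in this sketch I compute it through logarithms instead (math.lgamma(k + 1) gives the logarithm of k!). The names are my own, for illustration:

```python
import math

def poisson_probability(k, lam):
    """Poisson probability computed via logarithms to avoid huge intermediate numbers."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

lam = 100  # average number of snowflakes per hour on the area
distribution = {k: poisson_probability(k, lam) for k in range(75, 126)}

print(f"P(k=90)  = {distribution[90]:.3f}")
print(f"P(k=100) = {distribution[100]:.3f}")
```

The two printed values should correspond to the roughly 2.5% and 4% discussed in the text.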

Recall the requirements defined earlier for applying the Poisson distribution:

  • Discrete values
  • Independent events
  • Multiple events not occurring simultaneously
  • The average number of events is constant over time

As long as these hold, we can say that there is about a 4% chance of exactly 100 snowflakes, when the average number of observed snowflakes (λ) in the timeframe is 100. And about a 2.5% chance of observing 90 snowflakes (k=90) in a similar case. And so on, according to the distribution visualized above.

Let’s look at the Poisson distribution requirements here:

  1. Snowflakes are discrete events. There cannot be half a snowflake in this story. Either the snowflake falls or it doesn’t.
  2. The snowflakes can be considered independent, as one would not expect one to affect another.
  3. Multiple snowflakes are often falling at the same time, but do they fall to the ground at the exact same moment?
  4. The average number of snowflakes over the hours is very unlikely to be constant for a very long time.

I considered a few times whether to use snowflakes as an example, since this list actually goes from the first item obviously being true, to the next one slightly debatable, the third more so, and the final one quite obviously not going to hold always. But everyone likes to talk about the weather, so why not.

So if we take the first argument above as true, and the following ones as debatable to different degrees, let’s look at the points 2-4 from the above list.

2: Are falling snowflakes independent? I am not an in-depth expert on the physics of snowflakes or their falling patterns. However, 100 snowflakes on an area in an hour is not much, and I believe the independence of the snowflakes is a safe assumption here. But in different cases, it is always good to consider this more deeply as well (e.g., people tending to cluster in groups).

3: How likely is it that two snowflakes hit the ground at the exact same moment? Depends. If we reduce the time frame of what you consider “the same time” enough, there are very few real-world events that would occur at the exact same moment. And if this was a problem, there would be practically very few, if any, cases where the Poisson distribution would apply. There is almost always the theoretical possibility of simultaneous events, even if incredibly small. But it is very unlikely here. So I would classify this as not a problem to make this assumption. I believe this kind of relaxed assumption is quite common to make (e.g., we could say in the reliability example, there is an incredibly small chance of simultaneous failure).

4: The average number of falling snowflakes being consistent is of course not true over a longer period of time. It is not always snowing, and sometimes there is a bigger storm, other times just a minor dribble. To address this, we can consider the problem differently, as the number of snowflakes falling on a specific area in a smaller interval, such as 1 minute. The number of snowflakes should be more constant at least for the duration of several minutes (or as long as the snowstorm maintains a reasonably steady state). Maybe it would be more meaningful to discuss the distributions of different levels of storms in different stages?

The fourth point, the change over time, highlights an issue that is relevant to many example applications of the Poisson distribution. The rate of something is not always constant, and thus the time factor may need special consideration. I even found a term for it when I was looking for information on this on the internet: time-varying poisson distribution. And as highlighted by the third point above, some other considerations may also need to be slightly relaxed, while the result can still be useful, even if not perfect.

Snowflakes vs Support Calls

The above snow example was a largely made up problem, but weather is always a nice topic to discuss. Perhaps a more realistic, but very similar, case would be to predict the number of calls you could get in a call center. Just replace snowflakes with calls into the call center.

As with the snow example, with calls to a call center, over different time periods (depending also on time-zones), the average might vary, producing different distributions. Your distributions over days might also vary. For example, weekends vs holidays vs business days. But if you had, for example, an average of 100 calls during the business hours on business days, your Poisson distribution would look exactly the same as for the average of 100 snowflakes in the snowflake example.

You could then use this distribution for things like choosing how many people you should hire into the call center to maintain a minimum service level to match your service level agreement. For example, by finding the number of calls at which the cumulative probability of receiving at most that many calls in an hour reaches 90%, and using the capacity to handle that many calls as evidence of being able to provide a minimum of 90% service level.
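As a sketch of this kind of lookup, assuming an average of 100 calls in the timeframe and a 90% target (both just example numbers, and the function names are my own), we can add up the probabilities until the cumulative value reaches the target:

```python
import math

def poisson_probability(k, lam):
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def calls_to_cover(lam, service_level):
    """Smallest call count whose cumulative probability reaches service_level."""
    if not 0 < service_level < 1:
        raise ValueError("service_level must be between 0 and 1 (exclusive)")
    cumulative = 0.0
    k = 0
    while True:
        cumulative += poisson_probability(k, lam)
        if cumulative >= service_level:
            return k
        k += 1

print(calls_to_cover(lam=100, service_level=0.90))
```

How a number of calls translates into a number of people depends on things like call duration, which this sketch does not try to model.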

In such a call center example, you would have other considerations compared to the weather scenario. Including the timezones you serve, business hours, business days, special events, the probability of multiple simultaneous calls, and so on. However, if you think about it for a moment, at the conceptual level these are exactly the same considerations as for the weather example.

So far, my examples have only described events over timeframes. As I noted in the beginning, sometimes the Poisson distribution can also be used to model the number of instances in volumes, or areas, of space.

Example: Poisson Distribution Over an Area – Trees in a Forest, WW2 Bombs in London

Besides events over time, the Poisson distribution can also be used to estimate other types of data. One typical example is the distribution of instances (events) over an area. Imagine the following box to describe a forest, where every dot is a tree:

Imaginary forest, dots representing trees.

A Poisson distribution can similarly represent occurrences of such instances in space as it can represent occurrence of events in time. Imagine now splitting this forest area into a grid of smaller boxes (see my Kaggle kernel for the image generation. This image is from run 31):

The forest, divided into smaller squares.

In the above figure, each tree is now inside a single smaller box, although the rendering makes it look like the dots on the border of two boxes might belong to both. Now, we can count how many trees are in each smaller square:

Calculating the number of trees inside the smaller boxes.

This calculation, for the image above, gives the following distribution:

Number of trees in smaller box in the imaginary forest.

From the calculation, the number of smaller boxes with 0 trees (or dots) is 316, or 79%, of the total 400. A further 17% of the smaller boxes in the grid contain 1 tree (dot), and 4% contain 2 trees (dots).

Now, we can calculate the Poisson probability distribution for the same data. We have 400 smaller boxes, and 100 trees. This makes the average number of trees in a box 100/400=0.25. This is the average number of instances in the area (similar to timeframe in earlier examples), or λ, for the Poisson formula. The number of events k in this case is the number of instances (of trees). Putting these values into the Poisson formula (k=tree_count, λ=0.25), we get:

Number of trees estimated by the Poisson distribution.

Comparing this Poisson distribution to the actual distribution calculated from the tree grid image above, it is nearly identical. Due to random noise, there is always some deviation. The more we would increase the number of “trees” in the actual example (randomly generate more), and the number of mini-boxes (smaller area), the closer the actual observations should come to the theoretical numbers calculated by the Poisson formula. This is how randomness generally seems to work, the bigger sets tend to converge to their theoretical target better. But even with the values here it is quite close.
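The whole forest experiment can be sketched in plain Python. This is an illustration rather than the notebook code behind the figures: I assume the 400 boxes form a 20 x 20 grid, scatter the trees uniformly at random, and use an arbitrary seed. The function names are made up.

```python
import math
import random
from collections import Counter

def poisson_probability(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def trees_per_box(n_trees, grid_size, seed=None):
    """Scatter trees uniformly on a unit square and count the trees in each grid box."""
    rng = random.Random(seed)
    boxes = Counter()
    for _ in range(n_trees):
        x, y = rng.random(), rng.random()
        boxes[(int(x * grid_size), int(y * grid_size))] += 1
    # Map "trees in a box" -> "number of boxes", including the empty boxes.
    histogram = Counter(boxes.values())
    histogram[0] = grid_size * grid_size - len(boxes)
    return histogram

n_trees, grid_size = 100, 20  # 100 trees, 20 x 20 = 400 boxes (assumed grid shape)
histogram = trees_per_box(n_trees, grid_size, seed=1)
lam = n_trees / grid_size**2  # 0.25 trees per box on average

for k in range(4):
    observed = histogram.get(k, 0) / grid_size**2
    print(f"{k} trees: observed {observed:.1%}, Poisson {poisson_probability(k, lam):.1%}")
```

Changing the seed, the number of trees, or the grid size is an easy way to see how much the observed numbers wander around the Poisson estimate.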

Poisson Distribution and the London Bombings in WW2

Besides the forest and the trees, one could use this type of analysis on whatever data is available, when it makes sense. As I read about the Poisson distribution, its application to the London bombings in the Second World War (WW2) often comes up. There is a good summary about this on Stack Exchange. And if you read it, you will note that this is exactly the type of analysis that my forest example above showed.

In this story, the Germans were bombing the British in London, and the British suspected they were targeting some specific locations (bases, ports, etc). To figure out if the Germans were really targeting specific targets with insider knowledge, the British divided London into squares and calculated the number of bomb hits in each square. They then calculated the probability of each number of hits in each square, and compared the results to the Poisson distribution. As the result closely matched the Poisson distribution, the British then concluded that the bombings were, in fact, random, and not based on inside-information. Again, this is what would happen if we just replace “trees” with “bombs” and “forest” with “London” in my forest example above.

Of course, a cunning adversary could account for this by targeting a few high-interest targets, and hiding them in a large number of randomly dropped bombs. But the idea is there, on how the Poisson distribution could be used for this type of analysis as well. Something to note, of course, is that the Stack Exchange post also discusses this as actually being a more suitable problem for a Binomial distribution. Which is closely related to the Poisson distribution (often quite indistinguishable). Let’s see about this in more detail.

Poisson vs Binomial Distribution

When I read about what the Poisson distribution is and where it derives from, I often run into the Binomial distribution. The point being that the binomial distribution comes from the more intuitive notion of single events, and their probabilities. Then some smart people like Mr. Siméon Denis Poisson used this to derive the generalized formula of the Poisson distribution. And now, 200 years later, I can just plug in the numbers. Nice. But I digress. The Binomial distribution formula:

The Binomial distribution formula.
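In LaTeX notation, the standard form of the Binomial formula reads as follows, using n, k, and p as the variable names, and P(X = k) as my shorthand for "the probability of exactly k positive outcomes":

```latex
P(X = k) = \binom{n}{k} \, p^{k} \, (1 - p)^{n - k}
```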

The above figure shows the general Binomial distribution formula. It has only a few variables:

  • n: number of trials
  • k: number of positive outcomes (events we are calculating the probability for)
  • p: probability of the positive outcome

We can further split the Binomial formula into different parts:

Different parts of the Binomial formula.

I found a good description of the binomial formula, and these parts, on the Math is Fun website. To briefly summarize, the three parts in the above figure correspond to:

  • part1: number of possible trial output combinations that can produce the wanted result
  • part2: probability for positive outcomes with expected k positive outcomes
  • part3: probability for negative outcomes with expected k positive outcomes

To summarize, it uses the number of possible trial combinations, the probability of a positive outcome, and the probability of a negative outcome to calculate the probability of k positive outcomes. As before, calculating this formula for the different numbers of k, we can get the Binomial distribution.

What does any of this have to do with the Poisson distribution? As an example, let’s look at the reliability example from before as a Binomial Distribution instead. We have the following numbers for the reliability example:

  • average number of failures (λ, in one year): 1 in 100
  • number of events investigated (k): 0-7

We can convert these into the values needed for the Binomial Distribution:

  • n: number of trials = 100. Because we have an average over 100 trials, and want to know the result for 100 storage nodes.
  • p: probability of failure = 1/100 = 0.01. Because we have a 1 in 100 chance of failure.
  • k: 0-7 as in the Poisson example.

Comparing the results, we get the following table:

Poisson vs Binomial

In this table, poisson refers to the values calculated with the Poisson formula. binomial100 refers to the values calculated with Binomial formula using parameters n=100 and p=0.01. Similarly, binomial1000 uses n=1000 and p=0.001. As you can see, these are all very close.

It is generally said that when n nears infinity and p nears 0 for the Binomial Distribution, while the average n*p stays the same, you will approximate the Poisson Distribution. The idea is that as your interval of sub-trials gets smaller in Binomial, you get closer to the theoretical value calculated by the Poisson Distribution. This is visible in the table, as binomial100 is already close to poisson, but binomial1000 is noticeably closer still, with a larger n and smaller p.

You can further experiment with this yourself, by increasing n, and decreasing p in this example, and using a Binomial Distribution calculator. For example, where p = 0.01, 0.001, 0.0001, while n = 100, 1000, 10000, and so on.
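Instead of an online calculator, the same experiment can be sketched in Python with the standard library (math.comb needs Python 3.8 or newer). The function names are my own, and the column labels follow the table described above:

```python
import math

def poisson_probability(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def binomial_probability(k, n, p):
    """Probability of exactly k positive outcomes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

lam = 1  # average number of failures, so n * p = 1 in every column
print(" k   poisson  binomial100  binomial1000")
for k in range(8):
    print(
        f"{k:2d}  {poisson_probability(k, lam):.6f}"
        f"     {binomial_probability(k, 100, lam / 100):.6f}"
        f"      {binomial_probability(k, 1000, lam / 1000):.6f}"
    )
```

Adding a column for n = 10000 and p = 0.0001 is a one-line change, and should land closer to the Poisson column again.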

The difference between the Binomial and Poisson distribution is a bit elusive. As usual, there is a very nice Stack Exchange question on the topic. I would summarize it as using Poisson when we know the average rate over a timeframe, and the timeframe (or area/volume) can be split into smaller and smaller pieces. Thus the domain of the events is continuous and bordering on infinite trials (as you make the interval smaller and smaller). The Binomial is a good fit if you have a specific, discrete number of known trials and a probability. Like a thousand coin tosses. I recommend the Stack Exchange post for more insights.

Other Examples of Poisson Distribution

As I was looking into the Poisson distribution I found many examples trying to illustrate its use. Let’s look at a few more briefly.

Number of chocolate chips in a cookie dough. I thought this was an interesting one. Instead of events over time, or instances in an area, we are talking about instances in a volume of mass. Kind of like a 3-dimensional area. I guess if we looked at time (and distance) as 1-dimensional, and area as 2-dimensional, this would be the continuation to 3-dimensional. The applicability I think would mainly depend on how well the chips are distributed in the dough, and whether they could be considered independent (not clustering) and on average constant. But I find the application of the Poisson distribution to volumes of mass is an interesting perspective.

Estimating traffic counts. How many cars pass a spot on a highway could be suitable for a Poisson distribution. I think there can be some dependencies between cars, since people tend to drive in clusters for various reasons. And this behaviour likely changes over specific time periods (of days, weeks, etc similar to the call center example).

Estimating bus arrivals. This one does not seem very meaningful, since the buses are on a schedule. Maybe the average rate would be quite constant over some time periods, but the arrivals would not be very independent. Estimating how much they will deviate from their schedule might work a bit better, although weather, traffic, and other aspects might have a big impact.

People arriving in a location. This is a bit tricky. I think people often tend to travel in groups, and if they arrive to some events, such as school classes, workdays, meetings, football games, concerts, or anything else like that, they tend to cluster a lot. So I do not think this would work very well for a Poisson distribution. But maybe in some cases.

Number of fatal airline crashes in a year is quite a classic example. If we look back at the past few years (as of today) and, for example, the issues Boeing had with their MCAS system, this would certainly not fit. But then this (MCAS issue) would show up as an anomaly, similar to what the British were looking for in the London Bombing example. And this would be correct, and a useful find in itself, if otherwise not observed. Other than these types of clustered events due to a specific cause, I believe airplane crashes make a reasonable example of applying the Poisson distribution.

A bit like people, animal behaviour could also be of interest. One example I saw online (sorry, could not find the link anymore) was about cows being distributed across a field. And how they tend to group (or cluster), meaning the distribution would not be truly independent, and likely not very good for a Poisson distribution. This made me think about the minimum distance between events / instances as a parameter. For example, repeating my forest example, but setting a minimum distance from one tree to the next. The longer this minimum distance would be, the further the actual distribution should differ from the predicted Poisson distribution. This type of “personal space” seems likely in many real-world scenarios.
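The minimum distance idea can be tried out with a small simulation. This is only a sketch under my own assumptions: trees are placed at random in a 1 x 1 field, a new tree is rejected if it would be closer than the minimum distance to an existing one, and the field is split into a 10 x 10 grid. For a Poisson distribution the variance of the counts per grid cell should be close to their mean, so I use the variance / mean ratio as a rough measure of how far from Poisson the result is. All names and numbers here are made up for illustration.

```python
import random

def place_trees(n_trees, min_dist, seed=1, max_tries=200000):
    """Place trees uniformly at random, rejecting any that are too close."""
    rng = random.Random(seed)
    trees = []
    tries = 0
    while len(trees) < n_trees and tries < max_tries:
        tries += 1
        x, y = rng.random(), rng.random()
        if all((x - tx) ** 2 + (y - ty) ** 2 >= min_dist ** 2 for tx, ty in trees):
            trees.append((x, y))
    return trees

def dispersion(trees, grid=10):
    """Variance / mean of tree counts per grid cell (about 1 for Poisson)."""
    counts = [0] * (grid * grid)
    for x, y in trees:
        col = min(int(x * grid), grid - 1)
        row = min(int(y * grid), grid - 1)
        counts[row * grid + col] += 1
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / len(counts)
    return variance / mean

# Growing "personal space" should push the ratio further below 1.
for min_dist in (0.0, 0.01, 0.02, 0.03):
    trees = place_trees(400, min_dist)
    print(min_dist, len(trees), round(dispersion(trees), 3))
```

With a single random seed the ratios are noisy, so it makes sense to look at the trend over the different minimum distances rather than at any single value.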

Most of these extra examples I listed here actually seem to have some properties similar to my snowflakes / call center example. You might not have the perfect case for the Poisson distribution, but if you figure out your rules and limitations, and find a suitable angle, you might still benefit from it.


I presented a slightly deeper look at three cases of applying the Poisson distribution in this article. Hopefully they are helpful in understanding its behaviour and potential use. I find the reliability example quite concise for the most “basic” (or standard-like) application of the Poisson distribution. The call center / snowflake example gives it a bit more complex real-world context, and finally the trees in a forest / London bombing example illustrates the expansion of the application from the time domain into two-dimensional space. The additional cases I briefly discussed further highlighted a few interesting points, such as the expansion into 3-dimensional volumes, and finding different angles where the Poisson might be useful, even if the case does not directly match all its criteria.

My day-to-day job, and my daily tasks in general, do not really involve many uses for the Poisson (or Binomial) distribution. However, I believe the distribution is very useful to understand and keep in mind for when the situation arises. And when it does, some good pointers are useful to remember. Some points I find useful:

  • The Poisson distribution is always the same if we have the same parameters λ and k, regardless of the timeframe or area size. I just have to be able to scale the idea.
  • The rate of events generally seems to scale with the size of the timeframe or area, so if I have a rate of 1 in 100, I can also use a rate of 0.1 in 10, or 10 in 1000.
  • The Binomial distribution is a good alternative to keep in mind. I find it a good mental note to consider the Binomial if I have a given probability, and a specific (limited) number of trials.
  • The Poisson on the other hand is a clearer fit if I have the average rate, and a timeframe that can be divided continuously rather than a fixed number of trials.
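As a small sketch of the scaling points above: the three ways of writing the rate all give the same λ for a given window, and thus the same probabilities. The window of 250 units and the helper name `scaled_lambda` are my own choices for illustration.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def scaled_lambda(events, per_units, window):
    """Expected number of events in `window` units, given `events` per `per_units`."""
    return events / per_units * window

window = 250  # hypothetical window size, in the same units as the rate

# 1 in 100, 0.1 in 10, and 10 in 1000 are the same rate written differently.
for events, per_units in ((1, 100), (0.1, 10), (10, 1000)):
    lam = scaled_lambda(events, per_units, window)
    print(events, per_units, lam, round(poisson_pmf(3, lam), 4))
```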

In the end, I think the most important thing is to remember what these distributions are good for. For Poisson, this would be estimating the probability of a discrete number of events, given their average rate in a timeframe/area/volume, when the timeframe is continuous (e.g., you can divide an hour into smaller and smaller time units for more trials) but the events are discrete (no partial events). The Binomial, on the other hand, fits if you have the probability for an event, and the number of trials is discrete (a specific number of trials). Keeping the basic applications in mind, you can then look up the details as needed, and figure out the best way to apply them.
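The idea of dividing an hour into smaller and smaller time units can be checked numerically. In this sketch I treat each time unit as one Binomial trial with probability λ / n, and compare the result to the Poisson formula as n grows. The rate of 3 events per hour and k = 2 are made-up values for illustration.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with probability p each."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical case: on average 3 events per hour, probability of exactly 2.
lam, k = 3.0, 2

# Split the hour into minutes, seconds, and milliseconds as Binomial trials.
for n in (60, 3600, 3600000):
    print(n, binomial_pmf(k, n, lam / n))
print("Poisson", poisson_pmf(k, lam))
```

The Binomial values should move closer to the Poisson value as the time units get smaller, which is one way to see the Poisson as the continuous-time limit of the Binomial.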

That’s all for today. If you think I missed something, or see anything to improve, let me know 🙂

