Continuous Probability Distributions

This is the third article in a three-part series on probability distributions.

The first part talked about Statistics, Probability, and distribution curves. The second part talked about discrete probability distributions with highlights on Poisson, Bernoulli, Binomial, and a particular case of Uniform distribution. In this part, I will talk about commonly used continuous probability distributions, including Normal, Exponential, and a specific instance of Uniform distribution.

Continuous Probability Distributions

In the first part, we saw what a probability distribution is and how we can represent it using a density curve for all the possible outcomes.

An experiment with numerical outcomes on a continuous scale, such as measuring the length of ropes or the height of trees, is represented with a continuous probability distribution. These numbers can take any value in a range, say between 1 meter and 1.1 meters, so this kind of data is treated differently from the discrete case.

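PDF: A probability distribution is represented by a mathematical function. For continuous outcomes, this is the Probability Density Function (PDF), which describes how likely values are relative to one another on a continuous scale. The probability of any single exact value is zero; a probability is obtained as the area under the curve over a range of values. The curve looks like a smooth sketch made without lifting your pencil. The above picture depicts the PDF for the height of adults.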

Please note that the total probability over all outcomes always equals one. For a PDF, this is the total Area Under the Curve of this graph.

CDF: In some statistical scenarios, we want to know the probability that the outcome will be less than or equal to a specific value. For example, while estimating the time to reach the office from home, we might be interested in the probability that this time is less than 30 minutes. This probability is given by the Cumulative Distribution Function, or CDF. It is a function that returns the probability that an outcome will have a value less than or equal to a specific value. The below figure represents the CDF for the height of adults. Note how it reaches 1 at the largest observed height.

In short,

  • Probability Density Function (PDF): Relative likelihood (density) across all possible outcomes
  • Cumulative Distribution Function (CDF): Probability of a value less than or equal to a given value
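The difference between the two functions can be sketched with Python's standard library. This is a minimal illustration, not a fit to real data: it assumes adult height follows a normal distribution, and the mean of 170 cm and standard deviation of 10 cm are hypothetical values chosen only for the example.

```python
from statistics import NormalDist

# Hypothetical model of adult height in cm; the mean and spread are assumed values.
heights = NormalDist(mu=170, sigma=10)

# PDF: a density, not a probability. Only an area under it is a probability.
density_at_170 = heights.pdf(170)

# CDF: probability that a height is less than or equal to the given value.
p_at_most_180 = heights.cdf(180)

# Probability of a range = difference of two CDF values (area under the PDF).
p_between_160_and_180 = heights.cdf(180) - heights.cdf(160)

print(f"density at 170 cm:       {density_at_170:.4f}")
print(f"P(height <= 180):        {p_at_most_180:.4f}")
print(f"P(160 <= height <= 180): {p_between_160_and_180:.4f}")
```

Note that the density at a single point is not itself a probability; probabilities only appear once a range of values is considered.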

Whenever we talk about probability distributions, we implicitly refer to a PDF curve. Continuous distributions play a vital role in machine learning applications. We will discuss some popular continuous probability distributions in this article:

  • Normal distribution
  • Exponential distribution
  • A particular case of uniform distribution

Normal Distribution:

Take some sand in your hand. Now drop it slowly onto the ground. What do you see? It forms a small hill-like structure. Most of the sand ends up in the middle, and fewer sand particles land at the extremities. This tendency to be in the middle is the central tendency, and the resulting pile of sand resembles a normal distribution.

Normal Distribution is a symmetrical arrangement of a data set in which most values cluster in the middle, and the rest taper off symmetrically towards either extreme. It is commonly referred to as the Bell Curve and is used to model continuous data. Example: Weights/Heights of different employees in an organization, the age of students in a class, etc.

The mean (the average value) and the standard deviation are the two parameters of this distribution. Every normal distribution follows a rule of thumb that gives the percentage of data lying within a given number of standard deviations of the mean. It is called the 68-95-99.7 rule, after the approximate percentage of data lying within 1, 2, and 3 standard deviations on either side of the mean. It is represented in the figure below.

For example, imagine a normal distribution with a mean of 100 and a standard deviation of 10. Then, we can expect 95% of the data to be covered by values that are only two standard deviations away from the mean. That is, between 100 – (2 * 10) and 100 + (2 * 10) or between 80 and 120.
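The rule of thumb can be checked numerically. The sketch below uses the same mean of 100 and standard deviation of 10 and computes the share of the distribution within 1, 2, and 3 standard deviations of the mean; the helper name `share_within` is introduced here only for illustration.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=10)

def share_within(k):
    """Fraction of the distribution within k standard deviations of the mean."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)

for k in (1, 2, 3):
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    print(f"within {k} sd ({low:.0f} to {high:.0f}): {share_within(k):.4f}")
# within 1 sd (90 to 110): 0.6827
# within 2 sd (80 to 120): 0.9545
# within 3 sd (70 to 130): 0.9973
```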

During quality testing in industry, practitioners often follow a six-sigma approach. In the simplified sense used here, it means keeping the quality within three standard deviations of the average (mean) quality on either side (hence, six sigma). For example, suppose the length of a candle is 15 cm on average with a standard deviation of 0.2 cm. Then the six-sigma quality check ensures that every candle that goes out for sale has a length between (15 – 3*0.2 = 14.4 cm) and (15 + 3*0.2 = 15.6 cm).
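The candle check can be sketched as follows. This assumes candle lengths are normally distributed, which the example implies but does not state, and the function name `passes_check` is hypothetical. The limits are the mean minus and plus three standard deviations.

```python
from statistics import NormalDist

# Candle length in cm, assumed to be normally distributed.
candle = NormalDist(mu=15.0, sigma=0.2)

# Acceptance limits: three standard deviations on either side of the mean.
lower = candle.mean - 3 * candle.stdev
upper = candle.mean + 3 * candle.stdev

# Share of production expected to fall inside the limits.
share_accepted = candle.cdf(upper) - candle.cdf(lower)

def passes_check(length_cm):
    """Return True if a measured candle length is within the limits."""
    return lower <= length_cm <= upper

print(f"limits: {lower:.1f} cm to {upper:.1f} cm")
print(f"expected share accepted: {share_accepted:.4f}")
print(passes_check(15.5), passes_check(15.7))
```

Under this assumption, roughly 99.7% of candles fall inside the limits, which is the 3-standard-deviation figure from the 68-95-99.7 rule.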

A normal distribution with a mean of zero and a standard deviation of 1 is called the Standard Normal Distribution. Numerical data that follows a normal distribution is often scaled (standardized) to this form to simplify the mathematical calculations. We will skip those details in this article.

Exponential Distribution:

In the second part, we discussed the Poisson distribution, which we can use to model the number of events in a fixed time, for example, the number of customers arriving at a bank every hour. Now, imagine we have to find the time between the arrivals of two consecutive customers. This time can be as short as almost no lag or as long as the whole hour, or occasionally longer. The exponential distribution is used to model the waiting time between two consecutive events (e.g., arrivals, failures of a machine, etc.). For example, we are usually interested in:

  • Time until next online order arrives at a restaurant
  • Time until the next loan application is filed
  • The time you need to wait for the next bus

This distribution is characterized by a rapid decrease in probability for larger values. For example, the time until the next customer arrives will most likely be short, and the chance that the wait will be as long as an hour is very small. This distribution is represented in the below figure.

How quickly the probability drops is controlled by the rate (decay) parameter λ of this distribution: the larger λ is, the faster the curve falls. The mean waiting time is 1/λ.
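A minimal sketch of the exponential distribution follows. The arrival rate of 10 customers per hour is a hypothetical value, and the helpers `exp_pdf` and `exp_cdf` are written out here from the standard formulas (density λ·e^(−λt) and cumulative probability 1 − e^(−λt) for t ≥ 0). `random.expovariate` from the standard library is used to simulate waiting times.

```python
import math
import random

# Hypothetical arrival rate: 10 customers per hour, expressed per minute.
rate = 10 / 60

def exp_pdf(t, lam):
    """Density of the exponential distribution at waiting time t."""
    return lam * math.exp(-lam * t) if t >= 0 else 0.0

def exp_cdf(t, lam):
    """Probability that the waiting time is at most t."""
    return 1 - math.exp(-lam * t) if t >= 0 else 0.0

p_within_5_min = exp_cdf(5, rate)        # next customer within 5 minutes
p_over_60_min = 1 - exp_cdf(60, rate)    # wait longer than a full hour

# Simulate waiting times and compare their average with 1 / rate.
random.seed(0)
waits = [random.expovariate(rate) for _ in range(100_000)]
mean_wait = sum(waits) / len(waits)

print(f"P(wait <= 5 min):  {p_within_5_min:.4f}")
print(f"P(wait > 60 min):  {p_over_60_min:.6f}")
print(f"simulated mean wait: {mean_wait:.2f} min (theory: {1 / rate:.2f})")
```

With these assumed numbers, short waits dominate and a wait of more than an hour is extremely unlikely, which matches the rapid drop in the curve.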

Continuous Uniform Distribution:

Consider a spinner, as shown below. You give it a spin and see where it comes to rest. The resulting angle can be any number (say, 172.564, 124, 46.234, etc.) between 0 and 360 degrees, and every angle is equally likely. This is a case of uniform distribution for a continuous outcome.

Graphically, continuous uniform distribution curves look like a rectangle.

Uniform Distribution has only a starting point and an ending point as its parameters. Here, the parameters are (0 and 360). Please refer to the second part of this article series to understand Uniform distribution for a discrete set of outcomes.
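The spinner can be sketched in a few lines. The helpers `uniform_pdf` and `uniform_cdf` are written here for illustration from the standard formulas: the density is a constant 1/(b − a) between the two parameters, which is why the curve is a rectangle, and the probability of a range is its width divided by (b − a).

```python
import random

START, END = 0.0, 360.0  # the two parameters of the spinner's distribution

def uniform_pdf(x, a=START, b=END):
    """Constant density between a and b, zero elsewhere."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a=START, b=END):
    """Probability that the outcome is at most x."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Probability that the spinner stops in the first quarter turn (0 to 90 degrees).
p_first_quarter = uniform_cdf(90) - uniform_cdf(0)

# Simulate spins and compare the observed share with the formula.
random.seed(0)
spins = [random.uniform(START, END) for _ in range(100_000)]
observed_share = sum(1 for s in spins if s <= 90) / len(spins)

print(f"P(0 <= angle <= 90): {p_first_quarter}")
print(f"observed share in simulation: {observed_share:.4f}")
```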

Conclusion:

In this article, we discussed how probability functions can represent outcomes on a continuous scale. We also discussed the normal distribution, the most popular and widely used distribution, with a special focus on the six-sigma approach used in manufacturing industries. Further, we saw what exponential and uniform distributions are and how we can use them to model continuous data.