Subject 3. Cumulative Distribution Function

Analysts are often interested in finding the probability of a range of outcomes rather than a specific outcome. A cumulative distribution function (cdf) gives the probability that a random variable X is less than or equal to a particular value x, P(X≤x). In contrast, a probability function is used to find the probability of a specific outcome. To derive a cumulative distribution function F(x), simply sum the values of the probability function for all outcomes less than or equal to x.

A cumulative distribution function has two defining characteristics:

  • The cumulative distribution function lies between 0 and 1 for any x: 0 ≤ F(x) ≤ 1.
  • As we increase x, the cdf either increases or remains constant.
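As a minimal sketch of the definition and the two characteristics above, the snippet below builds a cdf by running-summing a probability function and then checks both properties. The fair six-sided die is a hypothetical example chosen for illustration; it does not come from the text.

```python
from fractions import Fraction
from itertools import accumulate

# Hypothetical example: a fair six-sided die (an assumption, not from the text).
outcomes = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6

# The running sum of the probability function gives F(x) at each outcome.
cdf = list(accumulate(pmf))

# Characteristic 1: every cdf value lies between 0 and 1.
bounded = all(0 <= f <= 1 for f in cdf)

# Characteristic 2: the cdf never decreases as x increases.
non_decreasing = all(a <= b for a, b in zip(cdf, cdf[1:]))

for x, f in zip(outcomes, cdf):
    print(f"F({x}) = {f}")
print("bounded:", bounded, "non-decreasing:", non_decreasing)
```

Exact fractions are used here instead of floats so that the final cdf value is exactly 1 with no rounding error.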

Given the cumulative distribution function, the probabilities for the random variable can also be calculated. In general:

P(X = x_n) = F(x_n) - F(x_{n-1})
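The relationship above can be sketched in code by differencing consecutive cdf values. The outcomes and cdf values below are hypothetical, and the helper name pmf_from_cdf is an illustrative choice; the cdf below the smallest outcome is assumed to be 0.

```python
from fractions import Fraction

# Hypothetical ordered outcomes and their cdf values (illustration only).
outcomes = [10, 20, 30, 40]
cdf = [Fraction(1, 10), Fraction(3, 10), Fraction(6, 10), Fraction(1)]

def pmf_from_cdf(cdf_values):
    """Return each F(x_n) - F(x_{n-1}), treating the cdf below x_1 as 0."""
    previous = Fraction(0)
    probabilities = []
    for value in cdf_values:
        probabilities.append(value - previous)
        previous = value
    return probabilities

pmf = pmf_from_cdf(cdf)
for x, prob in zip(outcomes, pmf):
    print(f"P(X = {x}) = {prob}")
```

The recovered probabilities sum to 1, which is a quick sanity check that the cdf values were valid.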

A cumulative frequency distribution is a plot of the number of observations falling in or below an interval. It can show either the actual frequencies at or below each interval (as shown here) or the percentage of the scores at or below each interval. The plot can be a histogram or a polygon.
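The numbers behind such a plot can be tabulated as follows. This is only a sketch: the scores and interval bounds are made-up values for illustration, and it computes both the actual cumulative frequencies and the cumulative percentages without drawing the plot.

```python
from itertools import accumulate

# Hypothetical scores and interval upper bounds (illustration only).
scores = [52, 67, 71, 58, 90, 84, 76, 63, 79, 88]
upper_bounds = [60, 70, 80, 90]  # intervals: <=60, 61-70, 71-80, 81-90

# Count the observations that fall inside each interval.
counts = []
lower = float("-inf")
for upper in upper_bounds:
    counts.append(sum(1 for s in scores if lower < s <= upper))
    lower = upper

# Accumulate the counts to get observations at or below each interval.
cumulative_counts = list(accumulate(counts))
cumulative_percent = [100 * c / len(scores) for c in cumulative_counts]

for upper, c, pct in zip(upper_bounds, cumulative_counts, cumulative_percent):
    print(f"at or below {upper}: {c} observations ({pct}%)")
```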

Example

Consider a probability function: p(X) = X/6 for X = 1, 2, 3 and p(X) = 0 otherwise. In a previous example it was shown that p(1) = 1/6, p(2) = 2/6, and p(3) = 3/6.

  • F(1) indicates the probability that has been accumulated up to and including the point X = 1. Clearly, 1/6 of probability has been accumulated up to this point, so F(1) = 1/6.
  • F(2) indicates the probability that has been accumulated up to and including the point X = 2. When X = 2 is reached, 1/6 has been accumulated from X = 1 and 2/6 from X = 2; the total accumulation is 1/6 + 2/6 = 3/6, or 1/2 of the probability, so F(2) = 3/6.
  • F(3) indicates the probability that has been accumulated up to and including the point X = 3. By the time X = 3 is reached, all the probability has been accumulated: 1/6 from X = 1, 2/6 from X = 2 and 3/6 from X = 3. Thus, 1/6 + 2/6 + 3/6 = 1. Therefore, F(3) = 1.

It is also possible to calculate F(X) for values that are not possible outcomes. F(0) = 0, as no probability has been accumulated up to the point X = 0; F(1.5) = 1/6, as by the time X = 1.5 is reached, 1/6 of probability has been accumulated from X = 1; F(7) = 1, as by the time 7 is reached, all possible probability from X = 1, 2 and 3 has been collected.
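The worked example can be checked with a short sketch. The function names p and F mirror the notation in the text; the implementation itself is an illustrative assumption, using exact fractions.

```python
from fractions import Fraction

def p(x):
    """Probability function from the example: p(X) = X/6 for X = 1, 2, 3."""
    return Fraction(x, 6) if x in (1, 2, 3) else Fraction(0)

def F(x):
    """Cdf: sum p(k) over the possible outcomes k that are <= x."""
    return sum((p(k) for k in (1, 2, 3) if k <= x), Fraction(0))

for x in (0, 1, 1.5, 2, 3, 7):
    print(f"F({x}) = {F(x)}")
```

Note that Fraction reduces its results, so F(2) prints as 1/2 rather than 3/6. The cdf is a step function: it stays flat between outcomes, which is why F(1.5) equals F(1).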

User Contributed Comments 1

sahilb7: F(X) is the cumulative sum of probabilities p(X) for all values less than or equal to X.