Self Online Study - Mathematics - Probability - Binomial Distribution
In statistics, the binomial distribution describes the number of times that a particular event occurs in a sequence of observations. The event is binary: it either occurs or it does not. The binomial distribution is used when a researcher is interested in the occurrence of an event, not in its magnitude. For instance, in a clinical trial, a patient may survive or die. The researcher studies the number of survivors, and not how long each patient survives after treatment. Another example is whether a person is ambitious or not. Here, the binomial distribution describes the number of ambitious persons, and not how ambitious they are.
The binomial distribution is specified by the number of observations, n, and the probability of occurrence, p.
A classic example, often used to illustrate concepts of probability theory, is the tossing of a coin. If a fair coin is tossed 4 times, then we may obtain 0, 1, 2, 3, or 4 heads. We may also obtain 4, 3, 2, 1, or 0 tails, but these outcomes are equivalent to 0, 1, 2, 3, or 4 heads. The probabilities of obtaining 0, 1, 2, 3, or 4 heads are, respectively, 1/16, 4/16, 6/16, 4/16, and 1/16. In the figure on this page the distribution is shown with p = 1/2. Thus, in the example discussed here, 2 heads is the most likely result of 4 tosses, since this outcome has the highest probability (6/16).
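The probabilities quoted for the four-toss example can be checked by listing every possible sequence of heads and tails. The following is a minimal sketch, assuming a fair coin; the function name head_count_distribution is a hypothetical helper introduced only for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def head_count_distribution(tosses):
    """Return {number_of_heads: probability} for a fair coin.

    Works by brute force: every one of the 2**tosses sequences is
    equally likely, so each probability is a simple count ratio.
    """
    outcomes = list(product("HT", repeat=tosses))
    counts = Counter(seq.count("H") for seq in outcomes)
    return {k: Fraction(counts[k], len(outcomes)) for k in sorted(counts)}


dist = head_count_distribution(4)
for heads, prob in dist.items():
    # Fractions print in lowest terms, e.g. 4/16 appears as 1/4.
    print(heads, prob)
```

Enumeration is only practical for a small number of tosses, since the number of sequences doubles with every additional toss.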
Other situations in which binomial distributions arise are quality control, public opinion surveys, medical research, and insurance problems.
In many cases, it is appropriate to summarize a group of independent observations by the number of observations in the group that represent one of two outcomes. For example, the proportion of individuals in a random sample who support one of two political candidates fits this description. In this case, the statistic p̂ (the sample proportion) is the count X of voters who support the candidate divided by the total number of individuals in the group, n. This provides an estimate of the parameter p, the proportion of individuals who support the candidate in the entire population.
The binomial distribution describes the behavior of a count variable X if the following conditions apply:
1: The number of observations n is fixed.
2: Each observation is independent.
3: Each observation represents one of two outcomes ("success" or "failure").
4: The probability of "success" p is the same for each observation.
Bernoulli Theorem :
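Let there be n independent trials in an experiment, and let the random variable X denote the number of successes in these trials. Let the probability of a success in a single trial be p and that of a failure be q, so that p + q = 1. Then the probability of exactly r successes is
$$P(X = r) = \binom{n}{r}\, p^{r}\, q^{\,n-r}, \qquad r = 0, 1, \dots, n,$$
where $\binom{n}{r}$ (also written nCr) is the number of ways of choosing r trials out of n.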
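The formula for the probability of r successes can be sketched in code as follows. This is a minimal illustration, not a library-grade implementation; the function name binomial_pmf is an assumption made here, and math.comb (Python 3.8 or later) supplies the nCr term.

```python
from math import comb


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials,
    each with success probability p."""
    if not 0 <= r <= n:
        return 0.0  # impossible counts have probability zero
    q = 1 - p  # probability of failure, so that p + q = 1
    return comb(n, r) * p**r * q ** (n - r)


# Four tosses of a fair coin, matching the coin example:
print([binomial_pmf(r, 4, 0.5) for r in range(5)])
# [0.0625, 0.25, 0.375, 0.25, 0.0625]
```

The printed values are 1/16, 4/16, 6/16, 4/16, and 1/16 in decimal form.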
Mean and Variance of the Binomial Distribution
The binomial distribution for a random variable X with parameters n and p represents the sum of n independent variables Z, each of which may assume the value 0 or 1. If the probability that each Z variable assumes the value 1 is equal to p, then the mean of each variable is equal to 1*p + 0*(1-p) = p, and the variance is equal to p(1-p). By the addition properties for independent random variables, the mean and variance of the binomial distribution are equal to the sums of the means and variances of the n independent Z variables, so the mean of X is np and the variance of X is np(1-p).
These results are intuitive. Imagine, for example, 8 flips of a coin. If the coin is fair, then p = 0.5. One would expect the mean number of heads to be half the flips, or np = 8*0.5 = 4. The variance is equal to np(1-p) = 8*0.5*0.5 = 2.
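The values np and np(1-p) can also be confirmed directly from the probabilities of each count, without using the shortcut formulas. The sketch below does this for the 8-flip example; the function name binomial_mean_variance is hypothetical, and exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction
from math import comb


def binomial_mean_variance(n, p):
    """Mean and variance of a binomial count, computed by summing over
    the probabilities of r = 0..n instead of using np and np(1-p)."""
    pmf = [comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(n + 1)]
    mean = sum(r * pr for r, pr in enumerate(pmf))
    variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))
    return mean, variance


mean, variance = binomial_mean_variance(8, Fraction(1, 2))
print(mean, variance)  # 4 2
```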
If the count X of "successes" in a group of n observations with success probability p has a binomial distribution with mean np and variance np(1-p), then we can derive the mean and variance of the sample proportion p̂ = X/n, the count of successes divided by the number of observations. Because dividing a random variable by a constant divides its mean by the same constant, the mean of X/n is equal to the mean of X divided by n, or np/n = p. This shows that the sample proportion p̂ is an unbiased estimator of the population proportion p. The variance of X/n is equal to the variance of X divided by n², or (np(1-p))/n² = (p(1-p))/n. This formula indicates that the variance of the sample proportion decreases as the sample size increases.
In the example of rolling a six-sided die 20 times, the probability p of rolling a six on any roll is 1/6, and the count X of sixes has a B(20, 1/6) distribution. The mean of this distribution is 20/6 = 3.33, and the variance is 20*1/6*5/6 = 100/36 = 2.78. The mean of the proportion of sixes in the 20 rolls, X/20, is equal to p = 1/6 = 0.167, and the variance of the proportion is equal to (1/6*5/6)/20 = 0.007.
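The die example can be explored by simulation. The sketch below repeats the 20-roll experiment many times and compares the observed mean and variance of the proportion of sixes with the theoretical values 1/6 and (1/6*5/6)/20. The function name simulate_proportions, the number of repetitions, and the seed are all assumptions chosen for illustration; simulated values will be close to, but not exactly equal to, the theoretical ones.

```python
import random
from statistics import fmean, pvariance


def simulate_proportions(n_rolls=20, repeats=20000, seed=0):
    """Proportion of sixes in n_rolls rolls of a fair die,
    recorded for each of `repeats` independent experiments."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    proportions = []
    for _ in range(repeats):
        sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
        proportions.append(sixes / n_rolls)
    return proportions


proportions = simulate_proportions()
print("simulated mean:    ", round(fmean(proportions), 3))
print("theoretical mean:  ", round(1 / 6, 3))
print("simulated variance:", round(pvariance(proportions), 4))
print("theoretical var.:  ", round((1 / 6) * (5 / 6) / 20, 4))
```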