# IAS: Probability Distribution

Enumerate probability distribution; explain the histogram and probability distribution curve.

## Probability Distribution Curve

Probability distributions are a fundamental concept in statistics. They are used both on a theoretical level and a practical level.

Some practical uses of probability distributions are:

To calculate confidence intervals for parameters and to calculate critical regions for hypothesis tests.

For univariate data, it is often useful to determine a reasonable distributional model for the data.

Statistical intervals and hypothesis tests are often based on specific distributional assumptions. Before computing an interval or test based on a distributional assumption, we need to verify that the assumption is justified for the given data set. In this case, the distribution does not need to be the best-fitting distribution for the data, but an adequate enough model so that the statistical technique yields valid conclusions.

Simulation studies often need random numbers generated from a specific probability distribution.
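As a rough illustration of the simulation use, the sketch below draws repeated samples from a normal distribution and counts how often a nominal 95% confidence interval for the mean contains the true mean. The function name, sample sizes, and parameter values are illustrative assumptions, not taken from the text, and the standard deviation is treated as known to keep the interval simple.

```python
import random
import statistics


def coverage_of_interval(n_trials=2000, sample_size=30, mu=10.0, sigma=2.0, seed=0):
    """Estimate how often a nominal 95% z-interval for the mean covers mu."""
    rng = random.Random(seed)  # fixed seed so the study is repeatable
    z = statistics.NormalDist().inv_cdf(0.975)  # two-sided 95% critical value
    half_width = z * sigma / sample_size ** 0.5  # sigma assumed known
    hits = 0
    for _ in range(n_trials):
        # Random numbers generated from the assumed normal model.
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        mean = statistics.fmean(sample)
        if mean - half_width <= mu <= mean + half_width:
            hits += 1
    return hits / n_trials


coverage = coverage_of_interval()
print(f"estimated coverage: {coverage:.3f}")
```

If the distributional assumption holds, the estimated coverage should land close to 0.95; a large gap would suggest the interval is not valid for data of that kind.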

### Explanation

The probability distribution of a random variable X can be uniquely described by its cumulative distribution function F(x), which is defined by $F(x) = \Pr[X \le x]$ for any x in R.

A distribution is called discrete if its cumulative distribution function consists of a sequence of finite jumps, which means that it belongs to a discrete random variable X: a variable which can only attain values from a certain finite or countable set. A distribution is called continuous if its cumulative distribution function is continuous, which means that it belongs to a random variable X for which Pr[ X = x ] = 0 for all x in R.

Several probability distributions are so important in theory or applications that they have been given specific names:

The Bernoulli distribution, which takes value 1 with probability p and value 0 with probability q = 1 − p.
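The Bernoulli distribution gives a compact example of a discrete cumulative distribution function made of finite jumps. This is a minimal sketch, assuming the usual definition F(x) = Pr[X ≤ x]; the function name `bernoulli_cdf` is hypothetical.

```python
def bernoulli_cdf(x, p):
    """Cumulative distribution function of a Bernoulli(p) variable at x."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if x < 0:
        return 0.0       # no probability mass below 0
    if x < 1:
        return 1.0 - p   # jump of size q = 1 - p at x = 0
    return 1.0           # jump of size p at x = 1


for x in (-0.5, 0.0, 0.5, 1.0, 2.0):
    print(x, bernoulli_cdf(x, 0.3))
```

The function is flat everywhere except at 0 and 1, where it jumps by q and p respectively.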

## The Poisson Distribution

In probability theory and statistics, the Poisson distribution is a discrete probability distribution discovered by Siméon Denis Poisson (1781–1840) and published, together with his probability theory, in 1838. It applies to random variables N that count, among other things, the number of discrete occurrences (sometimes called “arrivals”) that take place during a time interval of given length. The probability that there are exactly k occurrences (k being a non-negative integer, k = 0, 1, 2, …) is

$$\Pr[N = k] = \frac{e^{-\lambda} \lambda^{k}}{k!}$$

where e is the base of the natural logarithm (e = 2.71828…), k! is the factorial of k, and λ is a positive real number equal to the expected number of occurrences during the given interval. For instance, if events occur on average once every 2 minutes and you are interested in the number of events occurring in a 10-minute interval, you would model it with a Poisson distribution with λ = 5.
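The 10-minute example can be worked through numerically. The sketch below assumes the standard Poisson probability Pr[N = k] = e^(−λ) λ^k / k!, with λ = 10 / 2 = 5; the function name `poisson_pmf` is an illustrative choice.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k occurrences when the expected count is lam."""
    if k < 0 or lam <= 0:
        raise ValueError("k must be non-negative and lam must be positive")
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Events every 2 minutes on average, observed over a 10-minute interval.
lam = 10 / 2
probs = [poisson_pmf(k, lam) for k in range(50)]

for k in range(0, 11):
    print(k, round(probs[k], 4))
```

The probabilities over k = 0, 1, 2, … sum to 1, and the values for k = 4 and k = 5 are equal and largest, since λ is a whole number here.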

## The Normal Distribution

The normal or Gaussian distribution is one of the most important probability density functions, not least because many measurement variables have distributions that at least approximate a normal distribution. It is usually described as bell shaped, although its exact characteristics are determined by the mean and standard deviation. It arises when the value of a variable is determined by a large number of independent processes. For example, weight is a function of many processes, both genetic and environmental. Many statistical tests make the assumption that the data come from a normal distribution.

## The Probability Density Function is Given by the Following Formula

$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}}$$

Where x = value of the continuous random variable, μ = mean of the normal random variable (Greek letter ‘mu’), σ = standard deviation of the distribution, e = exponential constant = 2.7183, and π = mathematical constant = 3.1416.
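The normal density can be evaluated directly. The sketch below assumes the standard form f(x) = exp(−(x − μ)² / (2σ²)) / (σ √(2π)) and cross-checks it against `statistics.NormalDist` from the Python standard library; the function name `normal_pdf` is illustrative.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean in standard deviations
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


peak = normal_pdf(0.0, 0.0, 1.0)  # height of the standard bell curve at its mean
print(peak)
print(normal_pdf(1.5, 1.0, 2.0), NormalDist(1.0, 2.0).pdf(1.5))
```

The density is highest at x = μ and symmetric around it; a larger σ lowers and widens the bell.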

## Histogram and Probability Distribution Curve

Histograms are bar charts in which the area of each bar is proportional to the number of observations having values in the range defining the bar. As an example, consider histograms of populations. The population histogram describes the proportion of the population that lies between various limits. It also describes the behavior of individual observations drawn at random from the population; that is, it gives the probability that an individual selected at random from the population will have a value between specified limits. When we're talking about populations and probability, we don't use the words “population histogram”. Instead, we refer to probability densities and distribution functions. (However, it will sometimes suit my purposes to refer to “population histograms” to remind you what a density is.)

When the area of a histogram is standardized to 1, the histogram becomes a probability density function. The area of any portion of the histogram (the area under any part of the curve) is the proportion of the population in the designated region. It is also the probability that an individual selected at random will have a value in the designated region. For example, if 40% of a population has cholesterol values between 200 and 230 mg/dl, 40% of the area of the histogram will be between 200 and 230 mg/dl. The probability that a randomly selected individual will have a cholesterol level in the range 200 to 230 mg/dl is 0.40 or 40%. Strictly speaking, the histogram is properly a density, which tells you the proportion that lies between specified values.
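The standardization step can be sketched in a few lines. The example below scales bar heights so the total area is 1, then reads off the area between 200 and 230. The cholesterol-like values are made up for illustration (drawn from an assumed normal model), and the function name `density_histogram` is hypothetical.

```python
import random


def density_histogram(values, low, high, n_bins):
    """Return (bin_width, heights) scaled so the total bar area equals 1."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if low <= v < high:
            # Guard against a rounding error pushing the index past the last bin.
            index = min(int((v - low) / width), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside [low, high)")
    heights = [c / (total * width) for c in counts]
    return width, heights


rng = random.Random(1)
# Made-up cholesterol-like values in mg/dl, for illustration only.
values = [rng.gauss(215.0, 30.0) for _ in range(5000)]
width, heights = density_histogram(values, 100.0, 330.0, 23)

total_area = sum(h * width for h in heights)
# Bins 10, 11 and 12 cover 200 to 230 mg/dl when the bin width is 10.
area_200_230 = sum(h * width for h in heights[10:13])
print(total_area, area_200_230)
```

The area over 200 to 230 mg/dl is the proportion of the simulated values in that range, which is also the estimated probability that a randomly selected value falls there.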

The cumulative distribution function is something else. It is a curve whose value is the proportion with values less than or equal to the value on the horizontal axis. Densities have the same name as their distribution functions. For example, a bell-shaped curve is a normal density. Observations that can be described by a normal density are said to follow a normal distribution.
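The difference between a density and a cumulative distribution function can be seen by computing the latter from data. This is a minimal sketch with made-up observations; `empirical_cdf` is a hypothetical helper, while `NormalDist.cdf` is part of the Python standard library.

```python
from statistics import NormalDist


def empirical_cdf(values, x):
    """Proportion of observations less than or equal to x."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(1 for v in values if v <= x) / len(values)


data = [1.2, 0.4, 2.7, 1.9, 0.4]  # made-up observations
for x in (0.0, 0.4, 1.5, 3.0):
    print(x, empirical_cdf(data, x))

# For a normal distribution, half of the probability lies at or below the mean.
print(NormalDist(0.0, 1.0).cdf(0.0))
```

Unlike a density, the cumulative curve never decreases: it starts at 0, ends at 1, and its value at any point is the accumulated proportion up to that point.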