# NTE Praxis: Probability


What do you understand by the concept of probability? Explain the various theories of probability.

Probability is a branch of mathematics that measures the likelihood that an event will occur. It gives a quantitative description of how likely a particular event is, conventionally expressed as a number between 0 and 1. An impossible event has probability 0, and an event that is certain to occur has probability 1. A rare event has a probability close to 0; a very common event has a probability close to 1.

## Four Theories of Probability

- Classical or a priori probability: This is the oldest concept, developed in the 17th century. It assumes that the outcomes of a random experiment (such as tossing a coin, drawing cards from a pack, or throwing a die) are equally likely. Despite its demerits, it makes it possible to work out the probabilities of complex events mathematically, and a priori probabilities are of considerable importance in applied statistics. The approach is not valid where outcomes are not equally likely, for example:
  - the lives of different makes of bulbs;
  - the quality of products from a mechanical plant operated under different conditions.

- Empirical concept: This was developed in the 19th century for insurance business data and is based on the concept of relative frequency: historical data are used to predict the future. By contrast, when we say that the probability of a head on a coin toss is 1/2 because there are two equally likely outcomes, head or tail, we are determining a probability by deductive logic, which is the classical approach.
- Subjective or personal approach: This approach was adopted by Frank Ramsey in 1926 and developed by others. It is based on the personal beliefs of the person making the probability statement, informed by past information, noticeable trends, and an appreciation of the future situation. Experienced people use this approach for decision making in their own fields.
- Axiomatic approach: This approach was introduced by the Russian mathematician A. N. Kolmogorov in 1933. Probability is treated as a set function; no precise definition is given, but the following axioms or postulates are adopted:
  - The probability of an event ranges from 0 to 1. An event that is sure not to happen has probability 0, and an event that is sure to happen has probability 1.
  - The probability of the entire sample space (that is, the set of all possible outcomes of an experiment) is 1. Mathematically, P(S) = 1.
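The classical, empirical, and axiomatic views can be compared on a single small experiment. The following is a minimal sketch, assuming a fair six-sided die and the event "the throw is even"; the die, the event, the seed, and the number of trials are illustrative choices, not taken from the text above.

```python
from fractions import Fraction
import random

# Classical (a priori): count favourable outcomes among equally likely ones.
sample_space = [1, 2, 3, 4, 5, 6]            # one throw of a fair die (assumed)
even = [x for x in sample_space if x % 2 == 0]
p_classical = Fraction(len(even), len(sample_space))

# Empirical: relative frequency of the event over many simulated trials.
rng = random.Random(0)                        # fixed seed, arbitrary choice
trials = 10_000
hits = sum(1 for _ in range(trials) if rng.choice(sample_space) % 2 == 0)
p_empirical = hits / trials

# Axiomatic: check the two axioms listed above on the assignment P({x}) = 1/6.
P = {x: Fraction(1, 6) for x in sample_space}
assert all(0 <= p <= 1 for p in P.values())   # every probability lies in [0, 1]
assert sum(P.values()) == 1                   # P(S) = 1

print(p_classical)    # 1/2
print(p_empirical)    # a relative frequency near 0.5
```

The classical value is exact because it comes from counting, while the empirical value is only an estimate that settles near 1/2 as the number of trials grows.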

## Chi-Square Test

What is the chi-square test? Narrate the steps for determining the value of χ² with suitable examples. Explain the conditions for applying χ² and the uses of the chi-square test.

This test was developed by Karl Pearson (1857–1936), a statistician and professor of applied mathematics in London, whose coefficient of correlation (r) is the most widely used. The test measures the magnitude of the discrepancy between theory and observation and is defined as

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

where $O_i$ is the observed frequency and $E_i$ is the expected frequency of the $i$-th cell.

### Steps for Determining the Value of χ²

- When the data are given in tabulated form, calculate the expected frequency for each cell using the formula E = (row total × column total) / (total number of observations).
- Take the difference between O and E for each cell and square it to get (O − E)².
- Divide each (O − E)² by the corresponding expected frequency and add up the results to get χ².
- Compare the calculated value with the table value at the given degrees of freedom and the specified level of significance. If the calculated value is greater than the table value, the difference between the theoretical and observed frequencies is considered significant: it is unlikely to have arisen from fluctuations of simple sampling. If the calculated value is less than the table value, the difference is not considered significant; it is regarded as due to fluctuations of simple sampling and is therefore ignored.
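The four steps above can be sketched in a few lines of code. The counts below are made up purely for illustration (a hypothetical 2 × 2 table of treated/untreated against fever/no fever). The degrees-of-freedom rule (rows − 1) × (columns − 1) and the table value 3.841 for one degree of freedom at the 5% level are standard results that the text above does not state explicitly.

```python
# Hypothetical 2 x 2 table (made-up counts, for illustration only):
# rows = treated / untreated, columns = fever / no fever.
observed = [[20, 80],
            [40, 60]]

row_totals = [sum(row) for row in observed]          # [100, 100]
col_totals = [sum(col) for col in zip(*observed)]    # [60, 140]
n = sum(row_totals)                                  # 200

# Step 1: E = (row total x column total) / total number of observations
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Steps 2 and 3: square each (O - E), divide by E, and add up over all cells
chi_sq = sum((o - e) ** 2 / e
             for o_row, e_row in zip(observed, expected)
             for o, e in zip(o_row, e_row))

# Step 4: compare with the table value at the given degrees of freedom
df = (len(observed) - 1) * (len(observed[0]) - 1)    # (rows - 1) x (columns - 1)
critical_5pct = 3.841                                # table value, df = 1, 5% level
significant = chi_sq > critical_5pct

print(expected)            # [[30.0, 70.0], [30.0, 70.0]]
print(round(chi_sq, 4))    # 9.5238
print(df, significant)     # 1 True
```

With these illustrative counts the calculated value exceeds the table value, so the difference between observed and expected frequencies would be treated as significant.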

### Conditions for Applying χ²

- N must be large, say more than 50, so that the sampling distribution is close to the theoretical distribution.
- No theoretical (expected) cell frequency should be too small, say less than 5, because a small expected frequency may overestimate the value of χ² and lead to unwarranted rejection of the hypothesis. If such frequencies occur, they should be pooled with the preceding or succeeding frequencies. Pooling should not be confused with Yates's correction for continuity, which instead reduces each |O − E| by 0.5 in a 2 × 2 table.
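Both conditions are easy to check before computing χ². The sketch below is one possible way to do it: `pool_small_classes` is a hypothetical helper written for this illustration (it is not a library function), the thresholds 50 and 5 follow the rules of thumb above, and the frequencies are made up.

```python
def pool_small_classes(observed, expected, minimum=5):
    """Pool classes whose expected frequency is below `minimum`
    with a neighbouring class (hypothetical helper for illustration)."""
    obs, exp = [], []
    for o, e in zip(observed, expected):
        if exp and exp[-1] < minimum:
            # The previous class is still too small: it absorbs this one.
            obs[-1] += o
            exp[-1] += e
        else:
            obs.append(o)
            exp.append(e)
    # A final class that is too small is merged back into its predecessor.
    if len(exp) > 1 and exp[-1] < minimum:
        o, e = obs.pop(), exp.pop()
        obs[-1] += o
        exp[-1] += e
    return obs, exp


# Made-up frequencies for six ordered classes (illustration only).
observed = [2, 10, 30, 40, 15, 3]
expected = [3.0, 12.0, 28.0, 38.0, 15.0, 4.0]

large_enough = sum(observed) > 50          # first condition: N more than 50
pooled_obs, pooled_exp = pool_small_classes(observed, expected)

print(large_enough)    # True
print(pooled_obs)      # [12, 30, 40, 18]
print(pooled_exp)      # [15.0, 28.0, 38.0, 19.0]
```

Pooling reduces the number of classes, so the degrees of freedom used to look up the table value should be based on the pooled classes rather than the original ones.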

### Uses of the Chi-Square Test

- As a test of independence: The chi-square test of independence tests the association between two categorical variables. Whether two or more attributes are associated can be tested by framing a hypothesis and comparing the calculated χ² with the table value. Examples include whether the use of quinine is effective in controlling fever, or whether the complexions of husbands and wives are related. Consider two variables measured at the nominal or ordinal level. The question of interest is whether the two variables are independent (not related) or dependent (related). When the variables are independent, knowledge of one gives no information about the other. When they are dependent, knowledge of one helps in predicting the value of the other. Used in this way, the chi-squared test is a nonparametric procedure, whereas in the test of significance about a single population variance it is a parametric procedure.

### Assumptions

- We take a random sample of size n.
- The variables of interest are nominal or ordinal in nature.
- Observations are cross classified according to two criteria such that each observation belongs to one and only one level of each criterion.

## Chi-Square Test Characteristics

### As a Test of Goodness of Fit

The test for independence (one of the most frequent uses of chi-square) tests the null hypothesis that two criteria of classification, when applied to a population of subjects, are independent. If they are not independent, there is an association between them.

A statistical test in which the validity of one hypothesis is tested without specifying an alternative hypothesis is called a goodness-of-fit test. The general procedure consists of defining a test statistic, which is some function of the data measuring the distance between the hypothesis and the data (in fact, the badness of fit), and then calculating the probability of obtaining data with a still larger value of this test statistic than the value observed, assuming the hypothesis is true. This probability is the p-value of the test (some older texts call it the confidence level). Small probabilities (say, less than one percent) indicate a poor fit. Especially high probabilities (close to one) correspond to a fit that is too good to happen very often and may indicate a mistake in the way the test was applied, such as treating data as independent when they are correlated.

An attractive feature of the chi-square goodness-of-fit test is that it can be applied to any univariate distribution for which the cumulative distribution function can be calculated. The test is applied to binned data (i.e., data put into classes). This is not really a restriction, since for non-binned data you can simply calculate a histogram or frequency table before computing the chi-square statistic. However, the value of the statistic depends on how the data are binned. Another disadvantage is that the test requires a sufficient sample size for the chi-square approximation to be valid.
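As a small illustration of goodness of fit, the sketch below tests whether a die is fair. The observed counts are made up for the example, the hypothesis is that all six faces are equally likely, and the table value 11.070 (five degrees of freedom, 5% level) is taken from standard chi-square tables rather than from the text above.

```python
# Made-up counts from 120 throws of a die (illustration only).
observed = [16, 18, 22, 25, 19, 20]
n = sum(observed)                                   # 120

# Hypothesis: the die is fair, so each face is expected n / 6 times.
expected = [n / len(observed)] * len(observed)      # 20.0 for each face

# Test statistic: sum of (O - E)^2 / E over the six classes.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1            # classes minus one; no parameters estimated
critical_5pct = 11.070            # table value, df = 5, 5% level
poor_fit = chi_sq > critical_5pct

print(chi_sq)      # 2.5
print(poor_fit)    # False
```

Here the statistic is well below the table value, so these counts give no reason to doubt the hypothesis of a fair die.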

### As Test of Homogeneity

It is an extension of the test for independence that asks whether two or more independent random samples are drawn from the same population or from different populations. The test for homogeneity examines the proposition that several populations are homogeneous with respect to some characteristic.