
Subject 5. Hypothesis Testing of Linear Regression Coefficients

Hypothesis Tests of the Slope Coefficient

We are frequently interested in testing whether knowledge of an independent variable X is useful in explaining the values of Y. For example, we may want to test whether a linear relationship exists between X and Y. If X and Y are not linearly related, then in the population regression line E(Yi) = b0 + b1Xi we should have b1 = 0. If b1 = 0, the values of X are of no use in predicting Y, and the population regression line is a horizontal line. If we reject the hypothesis that b1 = 0, we are saying that the values of X are helpful in predicting Y.

The null hypothesis does not always have the form H0: b1 = 0, although this is by far the most frequent case.

Suppose an economist claims that, in the U.S., annual income is related to years of education and that the slope of the population regression line is approximately b1 = $2000. That is, the economist claims that an increase of one year in education tends to be associated with an increase of approximately $2000 in annual income. To test this claim, we would test the null hypothesis H0: b1 = $2000 against the two-sided alternative hypothesis H1: b1 ≠ $2000.

Tests of hypotheses concerning the value of b1 are based on the fact that, if the basic assumptions of the simple linear regression model hold, the random variable t = (b̂1 - b1) / sb̂1 follows Student's t-distribution with n - 2 degrees of freedom, where b̂1 is the estimated slope, b1 is the hypothesized value, and sb̂1 is the standard error of the estimated slope.

The two key inputs are the standard error of the estimated parameter and the critical value of the t-distribution associated with the chosen level of significance.

  • sb̂1 = sqrt[ ∑(yi - ŷi)² / (n - 2) ] / sqrt[ ∑(xi - x̄)² ]
  • The t-distribution requires that the number of degrees of freedom be known. For a linear regression with two parameters estimated (the two parameters are the slope and intercept), the number of degrees of freedom is (n - 2), where n is the number of observations. (Generally, the number of degrees of freedom equals the number of observations less the number of parameters estimated.)

The decision rule: reject H0 in favor of H1 if t < -tα/2, n-2 or if t > tα/2, n-2, where tα/2, n-2 is the critical value of the t-distribution with n - 2 degrees of freedom for a two-sided test at significance level α.
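The calculation can be illustrated with a short numerical sketch. The data and variable names below are hypothetical; the sketch simply computes the slope estimate, its standard error, and the t-statistic for H0: b1 = 0 against the critical value with n - 2 degrees of freedom.

```python
# Minimal sketch of a two-sided t-test on the slope coefficient.
# The (x, y) data are hypothetical and used only for illustration.
import numpy as np
from scipy import stats

x = np.array([10.0, 12.0, 14.0, 16.0, 18.0, 20.0])   # hypothetical X values
y = np.array([25.0, 30.0, 28.0, 40.0, 38.0, 45.0])   # hypothetical Y values
n = len(x)

# OLS estimates of the intercept (b0) and slope (b1)
x_bar, y_bar = x.mean(), y.mean()
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b0 = y_bar - b1 * x_bar

# Standard error of the slope: s_e / sqrt(sum((x - x_bar)^2)),
# where s_e^2 = SSE / (n - 2)
residuals = y - (b0 + b1 * x)
s_e = np.sqrt(np.sum(residuals ** 2) / (n - 2))
se_b1 = s_e / np.sqrt(np.sum((x - x_bar) ** 2))

# Test H0: b1 = 0 against H1: b1 != 0 at the 5% significance level
hypothesized_b1 = 0.0
t_stat = (b1 - hypothesized_b1) / se_b1
t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - 2)

print(f"b1 = {b1:.4f}, SE(b1) = {se_b1:.4f}, t = {t_stat:.4f}, critical t = {t_crit:.4f}")
print("Reject H0" if abs(t_stat) > t_crit else "Fail to reject H0")
```

Setting hypothesized_b1 to another value (for example 2000 in the income example above) tests that non-zero null in exactly the same way.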

Regarding the calculated t-statistic used to test whether the slope coefficient is equal to zero:

  • It is equal to the t-statistic for testing whether the pairwise correlation is zero: t = r sqrt(n - 2) / sqrt(1 - r²)
  • It is related to the F-distributed test statistic: t² = F
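As a sanity check, a short sketch on hypothetical data can confirm both relationships numerically (scipy is used here only for convenience; the data are illustrative):

```python
# Verify that the slope t-statistic equals the t-statistic for the pairwise
# correlation, and that its square equals the regression F-statistic.
import numpy as np
from scipy import stats

x = np.array([10.0, 12.0, 14.0, 16.0, 18.0, 20.0])   # hypothetical data
y = np.array([25.0, 30.0, 28.0, 40.0, 38.0, 45.0])
n = len(x)

# t-statistic from the slope (linregress reports the slope's standard error)
res = stats.linregress(x, y)
t_slope = res.slope / res.stderr

# t-statistic from the pairwise correlation r
r = np.corrcoef(x, y)[0, 1]
t_corr = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)

print(f"t from slope       = {t_slope:.6f}")
print(f"t from correlation = {t_corr:.6f}")
print(f"t^2 = {t_slope ** 2:.6f}  (equals the regression F-statistic)")
```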

Hypothesis Tests of the Intercept

Conducting hypothesis tests and calculating confidence intervals for the intercept parameter b0 is not done as often as it is for the slope parameter b1. The reason for this becomes clear upon reviewing the meaning of b0. The intercept parameter b0 is the mean of the responses at x = 0.

We can perform the hypothesis test in a similar manner. One thing to note is that the equation for the standard error of the intercept differs from that of the slope; however, there is no need to memorize it.
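If the intercept does need to be tested, a regression package reports its standard error directly, so the t-statistic is formed the same way. A minimal sketch using statsmodels on hypothetical data:

```python
# Sketch of a t-test on the intercept; statsmodels reports the intercept's
# standard error directly. The data are hypothetical.
import numpy as np
import statsmodels.api as sm

x = np.array([10.0, 12.0, 14.0, 16.0, 18.0, 20.0])
y = np.array([25.0, 30.0, 28.0, 40.0, 38.0, 45.0])

X = sm.add_constant(x)          # adds the intercept column
model = sm.OLS(y, X).fit()

b0, se_b0 = model.params[0], model.bse[0]
t_stat = b0 / se_b0             # test H0: b0 = 0
print(f"b0 = {b0:.4f}, SE(b0) = {se_b0:.4f}, t = {t_stat:.4f}")
print(model.summary())          # the summary table also reports this t and its p-value
```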

Hypothesis Tests of Slope When Independent Variable Is an Indicator Variable

A dummy (indicator) variable takes on the values 1 and 0 only. The numbers 1 and 0 have no numerical (quantitative) meaning; they are used to represent groups. In short, a dummy variable is categorical (qualitative).

For instance, we may have a sample (or population) that includes both females and males. A dummy variable can then be defined as D = 1 for female and D = 0 for male. Such a dummy variable divides the sample into two subsamples (or two sub-populations): one for females and one for males.

Hypothesis testing can be performed in a similar manner when the independent variable is a dummy variable.
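For illustration, here is a minimal sketch with a hypothetical 0/1 dummy (D = 1 for female, D = 0 for male). With a dummy regressor, the intercept is the mean of the D = 0 group, the slope is the difference in group means, and the t-test on the slope asks whether that difference is zero.

```python
# Slope t-test when the independent variable is a dummy (0/1) variable.
# The group labels and y values below are hypothetical.
import numpy as np
import statsmodels.api as sm

d = np.array([1, 1, 1, 0, 0, 0, 1, 0], dtype=float)   # D = 1 female, D = 0 male
y = np.array([52.0, 55.0, 58.0, 48.0, 50.0, 47.0, 56.0, 49.0])

X = sm.add_constant(d)
model = sm.OLS(y, X).fit()

# params[0] is the mean of the D = 0 group; params[1] is the difference in group means.
print("intercept (mean for D=0), slope (difference in means):", model.params)
print("t-statistic and p-value for H0: b1 = 0:", model.tvalues[1], model.pvalues[1])
```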

User Contributed Comments 5

User Comment
vi2009 Why does this statistic have n-2 degrees of freedom instead of n-1?
Vonoko Usually it is n minus the number of coefficients estimated (so the y-intercept coefficient plus one explanatory variable coefficient).
reganbaha Because linear regression is based on 2 different variables, i.e. 1 dof from each population.
DariSH Another good explanation: for n-2, you subtract 1 for the independent variable and another 1 for the intercept.
Actually the formula is n-k-1, where n is the number of observations, k is the number of independent variables, and 1 is for the intercept.
Oksanata Generally, the number of degrees of freedom equals the number of observations less the number of parameters estimated. For a linear regression with 2 parameters estimated (the slope and intercept), the number of degrees of freedom is (n-2).