Regression Practice STA 3024 Spring, 1999 D. Meeter

In any particular regression, the total SS is constant no matter which model is fit (unless you transform y). As you add terms to the model, SSreg increases and SSE decreases by the same amount: whatever is lost from SSE is added to SSreg. The total d.f. also stay the same; if regression gains a d.f., SSE loses a d.f.
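The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the function name anova_bookkeeping and the numbers passed to it (n = 12, total SS = 100) are made up for the example and are not part of the exercises, and an intercept is assumed in every model.

```python
def anova_bookkeeping(total_ss, ss_reg, n, k):
    """Split total SS and d.f. for a least-squares fit with an intercept.

    total_ss: total (corrected) sum of squares, fixed by the data
    ss_reg:   regression sum of squares for the fitted model
    n:        number of observations
    k:        number of predictors (not counting the intercept)
    """
    sse = total_ss - ss_reg        # whatever SSreg does not account for
    df_total = n - 1
    df_reg = k
    df_err = df_total - df_reg     # each d.f. gained by regression is lost by error
    return {"SSreg": ss_reg, "df_reg": df_reg,
            "SSE": sse, "df_err": df_err,
            "MSE": sse / df_err}


# Hypothetical numbers, not from the worksheet: n = 12, total SS = 100.
small = anova_bookkeeping(total_ss=100.0, ss_reg=30.0, n=12, k=1)
large = anova_bookkeeping(total_ss=100.0, ss_reg=45.0, n=12, k=2)
print(small)
print(large)
```

Adding the second predictor moves 15 units of SS and one d.f. from the error line to the regression line; the totals (100 and 11) do not change.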

In a set of 40 observations, the total SS was 5,000.

1. y was regressed on x1; the SSreg was 400. How many d.f. does SSreg have? What is SSE? How many d.f. does SSE have? Is β1 significantly different from zero? What is your estimate of σ², and how many d.f. does it have?
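One way to check your arithmetic for problem 1 is the short sketch below. It assumes the model has an intercept, so the total has n - 1 = 39 d.f., and it uses an approximate table value (about 4.10) for the upper 5% point of F with 1 and 38 d.f.; look up the exact value in your own table.

```python
total_ss, n = 5000.0, 40      # given in the worksheet
ss_reg, k = 400.0, 1          # problem 1: y regressed on x1 only

sse = total_ss - ss_reg       # SS left over for error
df_err = n - 1 - k            # total d.f. minus regression d.f.
mse = sse / df_err            # estimate of the error variance
f_ratio = (ss_reg / k) / mse  # F for testing that the slope is zero

# Assumed table value: upper 5% point of F(1, 38) is roughly 4.10.
F_CRIT_05 = 4.10

print(f"SSE = {sse:.0f} on {df_err} d.f., MSE = {mse:.2f}")
print(f"F = {f_ratio:.2f}, exceeds 5% table value: {f_ratio > F_CRIT_05}")
```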

2. Then y was regressed on x1 and x2, with x2 listed first. The SSE was 4,500; how many d.f. does it have? What is your estimate of σ², and how many d.f. does it have?

What is SSreg? How many d.f.? The program prints

x2: 300

extra due to x1: 200

How many d.f. for the 300? For the 200? How can it be that the SS "due to" x1 is different from the 400 in problem 1?

Test the hypothesis that β1 = 0 by

a) a partial F test;

b) a test which assumes that x1 was added last;

c) a test in which the complete model is x1 and x2 and the reduced model is just x2.

(These are three ways of saying the same thing. The t ratios in Minitab, if squared, give the partial F ratios, so they accomplish the same test. What is the t ratio corresponding to the above test?)
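The equivalence of these tests can be checked numerically. The sketch below uses only the numbers given in problem 2 and assumes an intercept in both models; the variable names are my own. It computes the partial F once from the "extra due to x1" line and once as a complete-versus-reduced comparison, then takes the square root to get the size of the t ratio (the sign of t is that of the fitted coefficient, which the worksheet does not give).

```python
import math

total_ss, n = 5000.0, 40
sse_full, k_full = 4500.0, 2          # complete model: x1 and x2
ss_x2_alone = 300.0                   # x2 entered first
extra_ss_x1 = 200.0                   # "extra due to x1", i.e. x1 added last

df_err_full = n - 1 - k_full
mse_full = sse_full / df_err_full     # error mean square from the complete model

# (a), (b): partial F from the extra SS for the term added last (1 d.f.)
partial_f = (extra_ss_x1 / 1) / mse_full

# (c): complete vs. reduced model, where the reduced model is x2 alone
sse_reduced = total_ss - ss_x2_alone
f_from_models = ((sse_reduced - sse_full) / 1) / mse_full

t_magnitude = math.sqrt(partial_f)    # |t| for x1 in the two-predictor fit

print(f"partial F = {partial_f:.3f} on 1 and {df_err_full} d.f.")
print(f"complete-vs-reduced F = {f_from_models:.3f}")
print(f"|t| = {t_magnitude:.3f}")
```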

3. Then y was regressed on x1, x2, and x3, in that order. The result was

x1: 400

extra due to x2: 400

extra due to x3: 100

What is SSreg? How many d.f.? What is SSE? How many d.f.? What is your estimate of σ²? How many d.f.? Note that the SS for x1 is the same as in problem 1, because x1 was entered first in this three-predictor regression. What do you notice about the SS for x2 in this regression compared to the previous one? This phenomenon is unusual but not impossible. Usually the SS for a variable gets smaller as other variables are entered into the model ahead of it.
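For problem 3, the sequential SS simply add up to SSreg, with one d.f. per term. A minimal sketch, again assuming an intercept and using names of my own choosing:

```python
total_ss, n = 5000.0, 40

# Sequential SS in the order the predictors were entered (problem 3).
sequential_ss = [("x1", 400.0),
                 ("x2 after x1", 400.0),
                 ("x3 after x1 and x2", 100.0)]

ss_reg = sum(ss for _, ss in sequential_ss)   # regression SS for the full model
df_reg = len(sequential_ss)                   # one d.f. per term entered
sse = total_ss - ss_reg
df_err = n - 1 - df_reg
mse = sse / df_err                            # estimate of the error variance

print(f"SSreg = {ss_reg:.0f} on {df_reg} d.f.")
print(f"SSE   = {sse:.0f} on {df_err} d.f.")
print(f"MSE   = {mse:.2f}")
```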

_________________________

Suppose you had data on salaries (y) for the men and women in your company, along with their years of experience (x). You fit a regression to the data with men as the reference category, and get a prediction equation:

[prediction equation not reproduced]

Assuming that (a) the model is adequate and (b) all coefficients are statistically significant, what is the prediction equation for men? For women? Interpret these equations in words. What salary would you predict for a woman with 0 years' experience? For a man? What about with 10 years' experience?
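To see how one fitted equation yields a separate line for each group, here is a sketch with made-up coefficients. Everything in it is an assumption for illustration: the model form (an indicator that is 1 for women and 0 for men, plus an indicator-by-experience term) and the numbers B0 to B3 are hypothetical and are not the worksheet's equation. Substitute the coefficients from your own fit.

```python
# Hypothetical coefficients, for illustration only.
B0 = 30000.0   # intercept: predicted salary for men at 0 years
B1 = 1500.0    # slope on years of experience for men
B2 = -2000.0   # shift in intercept for women (indicator = 1)
B3 = -100.0    # change in slope for women (indicator times years)


def predict_salary(years, female):
    """Predicted salary; female is 1 for women, 0 for men (the reference)."""
    return B0 + B1 * years + B2 * female + B3 * years * female


# Setting the indicator to 0 or 1 gives one line per group.
men_line = (B0, B1)                # intercept and slope for men
women_line = (B0 + B2, B1 + B3)    # intercept and slope for women

for years in (0, 10):
    print(years, predict_salary(years, female=0), predict_salary(years, female=1))
```

With men as the reference category, the men's equation uses only the intercept and the experience slope; the women's equation adds the indicator coefficients to each.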