Name ______________________ STA 4222/5225 Midterm Spring '01 D. Meeter
DUE: March 8
Work all problems on separate paper. Show your work and label the answer.
This is a TEST. Do your own work. QUESTIONS? Ask me or Mr. Bain.
Questions labeled STA 5225 can be attempted by students in STA 4222 for extra credit.
The Hospital Cost Containment Board is doing a study of medical back problems that were treated in hospitals in 2000. The study is funded by a private hospital insurers' association that wants to compare claims made on private insurers with the rest of the cases.
For planning purposes, 1999 data is available, which shows that there were 36,000 back claims, of which 40% involved private insurers.
An important response variable is Length of Stay (LOS) in the hospital, which in 1999 averaged about 6 days, with a variance of 4. (The private cases had a variance of 3.24 and the public a variance of 3.6.)
Plan A is an srs of n claims.
1. Choose a sample size so that the bound on the estimate of mean LOS is about 0.2 days.
2. Do you need to be concerned about the fpc? Why or why not?
3. In this particular study, identify or describe: response, element, population, sampling unit.
5. STA 5225: The above 1999 estimates were compiled from reports from hundreds of hospitals that summarize the data for their hospitals. Can you use similar year 2000 reports as a frame? Why/why not?
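For reference, the planning arithmetic behind Plan A can be sketched as follows. This is only a sketch under stated assumptions: it takes the bound B to be two standard errors of the sample mean and uses the 1999 variance of 4 as the planning value. The function names are hypothetical and are not part of the exam.

```python
import math

def srs_sample_size(N, sigma2, B):
    """Sample size for a mean under srs, with fpc.

    Assumes the bound B is two standard errors, so D = B^2 / 4 and
    n = N * sigma2 / ((N - 1) * D + sigma2).
    """
    D = B ** 2 / 4
    n = N * sigma2 / ((N - 1) * D + sigma2)
    # Small tolerance so floating-point noise does not push an exact
    # integer result up to the next integer.
    return math.ceil(n - 1e-9)

def srs_sample_size_no_fpc(sigma2, B):
    """Same calculation ignoring the fpc: n0 = 4 * sigma2 / B^2."""
    return math.ceil(4 * sigma2 / B ** 2 - 1e-9)

# Planning values taken from the 1999 data quoted above.
N = 36000        # back claims in 1999
sigma2 = 4.0     # variance of LOS
B = 0.2          # target bound, in days

n_with_fpc = srs_sample_size(N, sigma2, B)
n_no_fpc = srs_sample_size_no_fpc(sigma2, B)
print(n_with_fpc, n_no_fpc, round(n_no_fpc / N, 4))
```

Comparing the two sample sizes, and the implied sampling fraction, is one way to judge how much the fpc matters here.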
Plan B is a Neyman allocation of 372 claims to two strata: private, non-private.
6. How many observations would you need in each stratum?
7. Estimate the bound B on the error of estimation. Ignore fpc.
8. Estimate the number of observations needed with srs to achieve the bound B which you got in Question 7. Comment.
9. There are three reasons for stratifying a sample. Which of them (if any) apply in this problem? Give your reasoning.
10. STA 5225: State a variable which might be on the claim file that would be a candidate for an auxiliary variable for ratio or regression estimation. State another variable that might be related to LOS that could not be used for ratio or regression estimation.
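For reference, the Plan B calculations can be sketched as follows. This is a sketch under stated assumptions: the 1999 figures (36,000 claims, 40% private, stratum variances 3.24 and 3.6, overall variance 4) are used as planning values, the bound B is taken to be two standard errors, and the fpc is ignored. The function names are hypothetical.

```python
import math

def neyman_allocation(n, sizes, sigmas):
    """Allocate n in proportion to N_i * sigma_i (Neyman allocation).

    Each stratum is rounded independently, so in general the rounded
    allocations need not sum exactly to n.
    """
    weights = [N_i * s_i for N_i, s_i in zip(sizes, sigmas)]
    total = sum(weights)
    return [round(n * w / total) for w in weights]

def stratified_bound(sizes, variances, alloc):
    """Bound on the stratified mean, ignoring fpc:
    B = 2 * sqrt(sum (N_i / N)^2 * sigma_i^2 / n_i)."""
    N = sum(sizes)
    var = sum((N_i / N) ** 2 * v / n_i
              for N_i, v, n_i in zip(sizes, variances, alloc))
    return 2 * math.sqrt(var)

# Stratum sizes from the 1999 data: 40% of 36,000 claims are private.
sizes = [14400, 21600]      # private, non-private
variances = [3.24, 3.6]     # stratum variances of LOS

alloc = neyman_allocation(372, sizes, [math.sqrt(v) for v in variances])
bound = stratified_bound(sizes, variances, alloc)

# srs sample size giving the same bound, using the overall variance of 4.
srs_n = math.ceil(4 * 4.0 / bound ** 2)
print(alloc, round(bound, 4), srs_n)
```

The gap between the srs sample size and 372 indicates how much stratification gains under these planning values.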
----------------------------------------------------------------------------------------------------------------------
The Florida Department of Agriculture wants to know what quantity of different types of pesticides are being used each year. They propose to take a sample of Florida farms.
10. Which of the following variables might be useful for stratification? State your reasoning.
a) age of the farm owner
b) size of farm in acres
c) county in which farm is located
d) type of crop(s) grown.
11. Several of the responses are of the form "pounds of pesticide type Z used." Explain briefly how ratio or regression estimation could be used to estimate these responses more precisely.
12. Some of the forms are returned "no longer farming". How should this be handled?
13. STA 5225: Some farms refuse to participate. What about selecting a neighboring farm to replace them? (Explain why or why not.) Name three things that you could do to keep the participation rate high.
14. In these True-False questions, BRIEFLY state why you chose True or False.
a) T F The fpc is important when the sampling fraction is large.
b) T F Sample survey methods aren't needed if you census the entire population.
c) T F If we take a larger sample from the same population (using the same method) we should expect, but we cannot guarantee, that we will get a smaller bound B.