Estimation after Model Selection in a Gaussian Model*
EDSEL A PENA
University of South Carolina
This talk considers the problem of estimating the variance and the distribution function of a Gaussian model that is intermediate between a model in which the mean parameter is fully known and a model in which it is completely unknown. The problem is motivated by the desire to understand the theoretical implications of selecting a model from among several competing sub-models and then estimating a parameter of interest, with both steps using the same sample data. This practice is common in many areas: in regression analysis, where a subset of possible predictors is chosen through stepwise regression; in reliability and survival analysis, where it may be known only that the failure time distribution belongs to one of two parametric classes of distribution functions; and in goodness-of-fit testing, where an embedding approach is used to develop the test procedures, as in Neyman's smooth goodness-of-fit approach. The particular issue addressed in this talk is a comparison of the following three approaches to inference: Approach I: use estimators developed under a more general model and, in the extreme case, under a fully nonparametric model; Approach II: perform a two-step process in which the first step selects the sub-model and the second step uses an estimator developed under the chosen sub-model, with both steps using the same sample data; and Approach III (Bayesian): form a weighted combination of the sub-model estimators, with data-dependent weights given by the sub-model posterior probabilities.
*Joint work with Prof. Vanja Dukic of the University of Chicago.
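The three approaches described in the abstract can be illustrated with a small simulation for the variance of a Gaussian sample. This is only a sketch under assumptions that the abstract does not state: the fully specified sub-model fixes the mean at a hypothetical value mu0, the selection step in Approach II uses BIC, and the posterior sub-model probabilities in Approach III are approximated by BIC weights with equal prior probabilities. The function names variance_estimates and simulate_mse are illustrative and are not taken from the talk.

```python
import math
import random


def variance_estimates(x, mu0=0.0):
    """Return (approach I, approach II, approach III) variance estimates.

    Assumes a non-degenerate sample (not all values equal) with len(x) >= 2.
    """
    n = len(x)
    xbar = sum(x) / n

    # Sub-model M0 (mean known and equal to mu0): variance estimator.
    v0 = sum((xi - mu0) ** 2 for xi in x) / n
    # Sub-model M1 (mean unknown): variance MLE and unbiased sample variance.
    ss1 = sum((xi - xbar) ** 2 for xi in x)
    v1 = ss1 / n
    s2 = ss1 / (n - 1)

    # Approach I: always use the estimator from the more general model.
    approach1 = s2

    # BIC = -2 * (maximized log-likelihood) + (number of parameters) * log(n).
    # ASSUMPTION: BIC is used here as the selection rule; the abstract does
    # not specify one.
    bic0 = n * (math.log(2 * math.pi * v0) + 1) + 1 * math.log(n)
    bic1 = n * (math.log(2 * math.pi * v1) + 1) + 2 * math.log(n)

    # Approach II: select a sub-model, then use its estimator (same data).
    approach2 = v0 if bic0 <= bic1 else s2

    # Approach III: weighted combination of the sub-model estimators.
    # ASSUMPTION: posterior probabilities are approximated by BIC weights
    # under equal prior probabilities for M0 and M1.
    d = 0.5 * (bic0 - bic1)
    w0 = 1.0 / (1.0 + math.exp(min(d, 700.0)))  # clamp to avoid overflow
    approach3 = w0 * v0 + (1.0 - w0) * s2

    return approach1, approach2, approach3


def simulate_mse(mu, sigma=1.0, n=20, reps=5000, mu0=0.0, seed=0):
    """Monte Carlo mean squared error of each approach for sigma**2."""
    rng = random.Random(seed)
    sq_err = [0.0, 0.0, 0.0]
    for _ in range(reps):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        estimates = variance_estimates(x, mu0)
        for k in range(3):
            sq_err[k] += (estimates[k] - sigma ** 2) ** 2
    return [s / reps for s in sq_err]
```

Calling simulate_mse(mu=0.0) and simulate_mse(mu=0.5) gives the estimated mean squared errors of the three approaches when the sub-model with known mean is correct and when it is not, which is the kind of comparison the abstract describes.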