#1
Grizzled Veteran
Join Date: Dec 2003
Regression Question
Question: say I'm running your standard regression, fears of multicollinearity be damned, and I have 20 companies and 5 variables (call price as the dependent variable, and size, growth, LN growth and % American as the explanatory variables).
I want to run a full regression (i.e., using all 5 variables), but only 11 of the companies have data for all 5 variables. Am I correct in running the regression only on the reduced dataset of these 11 companies, or is it okay to include other companies that are missing one variable or another? It should be noted that "missing" here doesn't mean the variable doesn't exist (i.e., it may well affect the price); rather, it simply can't be isolated for that specific company. My gut tells me that I ought to be running it on the reduced dataset only, because I'd be biasing the other variables and adding noise if I used a larger set, but I'm not certain. Thanks in advance.

#2
College Starter
Join Date: Dec 2003
Location: The DMV
Off the top of my head, without knowing what you are trying to analyze or what statistical techniques you are using, running your regression on the reduced data set is probably what you should ultimately report.
Also, you can run a few diagnostic regressions. For example, instead of maximizing the number of variables, you can try to maximize the number of companies. The extreme case would be to keep all 20 companies and then keep only the variables they all have data for. You can also run the regressions on various data sets, keeping differing numbers of companies based on data availability for certain variables. Compare the R² of the overall regressions to see whether the incomplete variables or the missing companies really add anything. Look at the standardized coefficients and t-stats (if available) of the individual variables to see whether they change from regression to regression, and also to determine the strength of the variable/company combos you are excluding. Other than that, could you come up with defensible surrogates for your missing data?
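To make the two extremes concrete, here is a rough sketch in Python with NumPy. Everything in it is an assumption for illustration: the data are synthetic, the column names are just placeholders for the four explanatory variables, and `fit_ols` is a hypothetical helper, not anything from a particular stats package. It fits (a) all variables on the 11 complete companies and (b) all 20 companies on only the variables everyone has, then prints R², coefficients and t-stats side by side.

```python
import numpy as np

# Hypothetical synthetic data: 20 companies, 4 explanatory variables.
rng = np.random.default_rng(0)
n_companies = 20
names = ["size", "growth", "ln_growth", "pct_american"]
X = rng.normal(size=(n_companies, 4))
price = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n_companies)

# Knock out values so that only the first 11 companies are complete.
X[11:15, 2] = np.nan
X[15:, 3] = np.nan


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns (beta, t_stats, r2)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = len(y) - A.shape[1]                 # observations minus parameters
    sigma2 = resid @ resid / dof              # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)     # coefficient covariance
    t_stats = beta / np.sqrt(np.diag(cov))
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, t_stats, r2


# (a) Maximize variables: keep only companies with no missing values.
complete = ~np.isnan(X).any(axis=1)
beta_a, t_a, r2_a = fit_ols(X[complete], price[complete])

# (b) Maximize companies: keep only variables observed for every company.
full_cols = ~np.isnan(X).any(axis=0)
beta_b, t_b, r2_b = fit_ols(X[:, full_cols], price)

print(f"(a) {complete.sum()} companies, all variables: R^2 = {r2_a:.3f}")
for name, b, t in zip(["const"] + names, beta_a, t_a):
    print(f"    {name:>12}: coef = {b: .3f}, t = {t: .2f}")

kept = [nm for nm, keep in zip(names, full_cols) if keep]
print(f"(b) {n_companies} companies, variables {kept}: R^2 = {r2_b:.3f}")
for name, b, t in zip(["const"] + kept, beta_b, t_b):
    print(f"    {name:>12}: coef = {b: .3f}, t = {t: .2f}")
```

One caution when reading the output: the two fits use different samples and different numbers of regressors, so the R² values are only a rough guide. Whether the coefficients on the shared variables stay stable between (a) and (b) is probably the more informative check.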