Crapshoot
08-15-2006, 10:18 PM
Question: say I'm running a standard regression, fears of multicollinearity be damned, and I have 20 companies and 5 variables (call price as the dependent variable, and size, growth, LN growth and % American as the explanatory variables).
I'm running a full regression (i.e., using all 5 variables), but only 11 of the companies have data for all 5 variables. Am I correct in running the regression only on the reduced dataset of these 11 companies, or is it okay to include other companies that have one or another variable missing? Note that missing here doesn't mean the variable doesn't exist (i.e., it may well affect the price); rather, it simply can't be isolated for that specific company. My gut tells me that I ought to run it on the reduced dataset only, because I'd be biasing the other variables and adding noise if I used a larger set, but I'm not certain.
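To make the reduced-dataset option concrete, here is a minimal sketch of what I mean by dropping the incomplete companies before fitting. The records, field names and the `complete_cases` helper are all made up for illustration; they are not my actual data.

```python
# Hypothetical records; None marks a variable that can't be isolated
# for that company. All values are invented for illustration only.
COLUMNS = ["call_price", "size", "growth", "ln_growth", "pct_american"]

companies = [
    {"name": "A", "call_price": 12.0, "size": 3.1, "growth": 0.08,
     "ln_growth": -2.53, "pct_american": 0.60},
    {"name": "B", "call_price": 9.5, "size": 2.4, "growth": None,
     "ln_growth": None, "pct_american": 0.35},
    {"name": "C", "call_price": 15.2, "size": 4.0, "growth": 0.12,
     "ln_growth": -2.12, "pct_american": None},
    {"name": "D", "call_price": 7.8, "size": 1.9, "growth": 0.05,
     "ln_growth": -3.00, "pct_american": 0.80},
]


def complete_cases(rows, columns):
    """Keep only the rows where every listed column has a value."""
    return [
        row for row in rows
        if all(row.get(col) is not None for col in columns)
    ]


# The regression would then be fitted on `reduced` only.
reduced = complete_cases(companies, COLUMNS)
print(len(companies), "companies in total,", len(reduced), "with all 5 variables")
```

With my real data this filter would take the 20 companies down to the 11 that have all 5 variables.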
Thanks in advance.