Statistics Question
CraigSca
12-30-2005, 07:09 PM
All about correlations and regressions to the mean.
Suppose I have two variables...and their correlation is .71. According to statistics, 71% of the other variable can be explained by the first variable. If I KNOW what the first variable is - can I make an educated guess/plausible version of the second variable by doing the following:
Using the mean and the S.D. of the first variable, I determine how many S.D.s from the mean the original variable is. I then take this z-score and multiply the second variable's S.D. by this value and add it to its mean. I then multiply this number by the original correlation value (in this case .71). I then take the original mean of the second number and add it to a random (evenly distributed) z-score multiplied by the second variable's S.D. I multiply this number by (1 - correlation value; in this case .29).
Does this therefore create a plausible value for the second number that IS 71% described by the original variable and 29% described by the second number's mean and S.D.?
P.S. Why the hell did I even ask this?
AlexB
12-30-2005, 07:32 PM
This is a statistics question, and the way it is worded allows me to be 50% confident in my answer, which is....
No.
MIJB#19
12-30-2005, 08:05 PM
All about correlations and regressions to the mean.
Suppose I have two variables...and their correlation is .71. According to statistics, 71% of the other variable can be explained by the first variable. If I KNOW what the first variable is - can I make an educated guess/plausible version of the second variable by doing the following:
Using the mean and the S.D. of the first variable, I determine how many S.D.s from the mean the original variable is. I then take this z-score and multiply the second variable's S.D. by this value and add it to its mean. I then multiply this number by the original correlation value (in this case .71). I then take the original mean of the second number and add it to a RANDOM (evenly distributed) z-score multiplied by the second variable's S.D. I multiply this number by (1 - correlation value; in this case .29).
Does this therefore create a PLAUSIBLE value for the second number that IS 71% described by the original variable and 29% described by the second number's mean and S.D.?
P.S. Why the hell did I even ask this?
The bolded capitalized words are hard to match...
I think it's fair to assume that if you make a formula that translates the first given variable into the second variable in about 71 percent of the cases, it would give you a reasonable formula to predict what the new number would be. The randomness, even evenly distributed, will butcher the realism of your new number.
CraigSca
12-30-2005, 09:53 PM
But I don't want the first variable to TRANSLATE 71% of the time, I want it to DESCRIBE 71% of the second variable.
Basically, I want to take the correlation of two variables and, given the real value of the FIRST variable, make an accurate value for the second. Am I doing it the right way?
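For reference, the textbook way to turn a correlation into a single best guess is the regression prediction: convert the first variable to a z-score, shrink that z-score by the correlation, then rescale into the second variable's units. A minimal Python sketch, assuming the relationship is roughly linear; the function name and the example numbers are made up for illustration and are not from this thread:

```python
def predict_second(x, mean_x, sd_x, mean_y, sd_y, r):
    """Best linear guess for the second variable, given the first."""
    z_x = (x - mean_x) / sd_x   # how many S.D.s x sits from its mean
    z_y = r * z_x               # regression to the mean: shrink by r
    return mean_y + z_y * sd_y  # convert back to the second variable's units


# Hypothetical numbers: x is 2 S.D.s above its mean and r = .71,
# so the guess lands 1.42 S.D.s above the second variable's mean.
guess = predict_second(x=120, mean_x=100, sd_x=10, mean_y=50, sd_y=5, r=0.71)
print(guess)
```

This gives the expected value only. It has no random part, so every guess made this way sits closer to the mean than the real values typically would.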
JonInMiddleGA
12-30-2005, 11:47 PM
2/3rds.
Fonzie
12-31-2005, 12:09 AM
A couple of things:
1) A correlation coefficient does not provide the percentage of variance accounted for in one variable by another; it simply describes the strength of the linear relationship between those variables. Squaring the correlation coefficient provides the percentage of variance explained (in this case, .71 x .71 = .5041, or about 50%).
2) I'm not sure I understand the concept of your equation. You appear to be constructing a composite score that is: (the percentage of one variable's explanation of the variance in another variable X the mean of the first variable) + (the proportion of unexplained variance in that relationship X the mean of the second variable), correct? Two concerns about this: a) the variables would need to be measured on the same scale for such an additive approach to provide a meaningful sum; b) when you talk about one variable explaining the variance in the other, the value you get (50%, in this case) is specific to the explaining variable. The source of the remaining variability (the other 50%) is not known, and thus it might not be conceptually correct to use the second variable's properties in such a way.
3) Knowing the purpose behind this might allow me to offer more helpful advice. In fact, I'm not even sure what I've said so far is either helpful or coherent, as it is quite late and I've spent the entire day playing with my 2 year-old (which has a magical way of eroding my verbal skills). ;) So, my apologies if any of this is nonsensical.
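To make the points above concrete, here is a rough Python sketch of one standard way to draw a plausible second value. It rests on assumptions that are not in the original question: both variables are treated as roughly bell-shaped, the random part is drawn from a normal distribution rather than an evenly distributed one, and the two parts are weighted by r and sqrt(1 - r^2) instead of r and (1 - r), so that the simulated values keep the second variable's S.D. All function names and numbers are illustrative.

```python
import math
import random


def plausible_second(x, mean_x, sd_x, mean_y, sd_y, r, rng=random):
    """Draw one plausible value of the second variable, given the first."""
    z_x = (x - mean_x) / sd_x
    explained = r * z_x                             # part tied to the first variable
    noise = math.sqrt(1 - r * r) * rng.gauss(0, 1)  # unexplained part (assumed normal)
    return mean_y + sd_y * (explained + noise)


def correlation(xs, ys):
    """Pearson correlation of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var_x = sum((a - mx) ** 2 for a in xs)
    var_y = sum((b - my) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)


r = 0.71
print(r * r)  # share of variance explained, about one half

# Sanity check on made-up data: simulate many pairs and measure
# the correlation that comes back out.
rng = random.Random(0)
xs = [rng.gauss(100, 10) for _ in range(20000)]
ys = [plausible_second(x, 100, 10, 50, 5, r, rng) for x in xs]
print(correlation(xs, ys))
```

With enough simulated pairs the measured correlation should land close to .71. Weighting the random part by (1 - r) instead would leave the simulated values less spread out than the second variable's real S.D.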
CraigSca
12-31-2005, 09:09 AM
PM incoming....