View Full Version : Statistics and probability question


kcchief19
11-04-2003, 09:55 PM
As has been noted before, I am a bit of a novice when it comes to statistics and probability, so this is probably an easy question.

At work today, we were putting together a mailing and had 547 envelopes with mailing labels on them. However, after we were done we noticed that there was a database error, and we had to relabel all 547 envelopes.

The names on the new labels were the same, but the order in which they were applied to the envelopes was random. Inevitably, someone eventually affixed a person's new label directly over that same person's old label.

What are the odds of this happening?

yabanci
11-04-2003, 09:57 PM
I think under Murphy's law, it's about 100%

wbonnell
11-04-2003, 10:07 PM
Isn't it 1 / (547!)?

1 / (547 * 546 * 545 * 544 * 543 * ... * 1)

If there were 5 labels in the first set and 547 in the second set it would be:

1 / (547 * 546 * 545 * 544 * 543)
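For a sense of scale, here is a minimal Python sketch of how small 1 / (547!) is. The function name is made up for illustration, and the result is reported as a power of ten because the value itself is far too small for an ordinary floating-point number. Note that 1 / (547!) is the chance of one specific ordering out of all 547! possible orderings.

```python
import math

def log10_all_match_probability(n: int) -> float:
    """Return log10 of 1/n!, the chance of one specific ordering of n labels."""
    # math.log10 accepts arbitrarily large integers, so n! never has to
    # be converted to a float (which would overflow for n = 547).
    return -math.log10(math.factorial(n))

# Prints a large negative exponent: the probability is 10 raised to this power.
print(log10_all_match_probability(547))
```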

sabotai
11-04-2003, 10:12 PM
Well, first of all, the odds of it happening at, say, position 212 out of 547 are 1 in 547, because for each of the 547 positions there's 1 chance of the name occurring at that position.

As for it happening again, well, it's simple. If you had a 547-sided die, what are the odds of it coming up with the same number two times in a row?

This is where my stats knowledge ends. :)

Celeval
11-04-2003, 10:20 PM
2/3


No, seriously. Pretty close.

The chance of the first label being different is 546/547. The chance of the first two labels being different is (546/547)*(546/547). The chance of all labels being different is (546/547)^547 = .3675. So the chance of at least one label being the same is 1 - .3675 = .6325, or about 63.2%.

I don't recall if this is exactly right, but it's pretty close, IIRC.
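One way to see how close this is: the shortcut treats each label as an independent 546-in-547 miss, while the standard inclusion-exclusion count for permutations with no item in its original place (a "derangement") accounts for labels being used up. The sketch below compares the two, assuming the relabeling is a uniformly random permutation; the function names are illustrative only.

```python
from fractions import Fraction

def approx_no_match(n: int) -> float:
    """Independent-trials shortcut: ((n - 1) / n) ** n."""
    return ((n - 1) / n) ** n

def exact_no_match(n: int) -> float:
    """Exact chance that a random permutation of n items has no match.

    Uses the inclusion-exclusion result: sum over k = 0..n of (-1)^k / k!.
    """
    total = Fraction(0)
    term = Fraction(1)            # term holds (-1)^k / k!, starting at k = 0
    for k in range(n + 1):
        total += term
        term = -term / (k + 1)    # advance to (-1)^(k+1) / (k+1)!
    return float(total)

n = 547
print("shortcut, at least one match:", 1 - approx_no_match(n))
print("exact, at least one match:   ", 1 - exact_no_match(n))
```

For 547 labels the exact no-match chance is essentially 1/e, so both figures land near 63%, and the answer barely changes once there are more than a handful of envelopes.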

wbonnell
11-04-2003, 10:22 PM
doh! I guess I gave the odds that all labels would match

kcchief19
11-04-2003, 10:30 PM
At first I was thinking it was wbonnell's choice, but then I thought that made the odds too long. Then I thought that it might be simply 1 in 547.

Then I was thinking along the same lines as sabotai, that the odds of matching labels would be the same as rolling two 547-sided dice and coming up the same.

But then it occurred to me that you are also eliminating choices as you go. After I put a new label on an old envelope, I have reduced the number of options in each pile by one. I have no idea how to calculate that.
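One way to handle the "eliminating choices as you go" part without any formula is to simulate it. The sketch below is a rough Monte Carlo estimate, assuming the new labels go onto the envelopes in a uniformly random order (a random shuffle); the function names, trial count, and seed are arbitrary choices for illustration.

```python
import random

def has_match(n: int, rng: random.Random) -> bool:
    """Shuffle n labels and report whether any label lands on its own envelope."""
    labels = list(range(n))
    rng.shuffle(labels)  # each label is used exactly once, as with real stickers
    return any(envelope == label for envelope, label in enumerate(labels))

def estimate_match_probability(n: int, trials: int, seed: int = 0) -> float:
    """Estimate the chance of at least one match over many simulated relabelings."""
    rng = random.Random(seed)
    hits = sum(has_match(n, rng) for _ in range(trials))
    return hits / trials

print(estimate_match_probability(547, 2000))
```

Because each simulated relabeling uses every label exactly once, the shrinking piles are built in. The estimate is noisy, so more trials give a steadier number.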

I want to think Celeval is on the right track, since 2/3 is always the right answer, but I'm not sure I understand. Are you saying that as we were blowing through all 547 envelopes that the chances that we would match up at least one label is 63.2%? That seems plausible to me, but if I only relabeled one envelope, what are the odds of that label being the same?

kcchief19
11-04-2003, 10:31 PM
Originally posted by wbonnell
doh! I guess I gave the odds that all labels would match
That's why those odds seemed long. I think you're right on that.

wbonnell
11-04-2003, 10:33 PM
Originally posted by kcchief19

I want to think Celeval is on the right track, since 2/3 is always the right answer, but I'm not sure I understand. Are you saying that as we were blowing through all 547 envelopes that the chances that we would match up at least one label is 63.2%? That seems plausible to me, but if I only relabeled one envelope, what are the odds of that label being the same?

Shouldn't Celeval's equation eliminate one each time?

(546/547)*(545/546)*(544/545 )....

And if you relabeled one envelope, your chances would be 1 in 547 which should be the same as 1 - 546/547
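The product written above can be multiplied out exactly to see what it comes to. This is a minimal sketch using Python's fractions module; the function name is made up for illustration.

```python
from fractions import Fraction

def stepwise_product(n: int) -> Fraction:
    """Multiply ((n-1)/n) * ((n-2)/(n-1)) * ... * (1/2) exactly."""
    product = Fraction(1)
    for k in range(n, 1, -1):
        product *= Fraction(k - 1, k)  # each numerator cancels the next denominator
    return product

print(stepwise_product(547))  # the product collapses to 1/547
```

Since everything cancels, the product comes out to 1/547, the same figure as the single-envelope chance. That suggests the shrinking fractions do not, on their own, give the chance that no label matches anywhere in the stack.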

sabotai
11-04-2003, 10:38 PM
I knew I should have shown up for class when I took Statistics. :)

Huckleberry
11-04-2003, 11:04 PM
I think this problem is more complicated simply because by affixing each label, you are eliminating that label from the future draw (edit - and that envelope).

All I know is that with 23 or more people in a room, odds are in favor of at least two having the same birthday. I guess it's sort of related. :)
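The birthday figure is easy to check numerically. The sketch below assumes 365 equally likely birthdays and ignores leap years; the function name is illustrative.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Chance that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        # Person i must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

for count in (22, 23):
    print(count, shared_birthday_probability(count))
```

Under those assumptions, 22 people come in just under 50% and 23 people just over, which is where the "23 or more" rule comes from.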

RPI-Fan
11-04-2003, 11:16 PM
What Celeval is saying is that the odds of affixing the same label are equal to (one minus [odds of NOT labeling them the same])...

[ 1 - ( 546/547 * 545/546 * 544/545 * 543/544 ... * 1/2 ) ]

I don't have the mathematical software know-how to work that out, but I imagine Excel or somesuch could take care of it.

wbonnell
11-04-2003, 11:27 PM
Originally posted by RPI-Fan
What Celeval is saying is that the odds of affixing the same label are equal to (one minus [odds of NOT labeling them the same])...

[ 1 - ( 546/547 * 545/546 * 544/545 * 543/544 ... * 1/2 ) ]

I don't have the mathematical software know-how to work that out, but I imagine Excel or somesuch could take care of it.

You need a factorial function:

1 - (546!) / (547!)
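No spreadsheet is needed to evaluate that expression; Python's math.factorial handles integers this large exactly. A minimal sketch, with an illustrative function name:

```python
from fractions import Fraction
import math

def factorial_form(n: int) -> Fraction:
    """Evaluate 1 - (n-1)! / n! exactly."""
    return 1 - Fraction(math.factorial(n - 1), math.factorial(n))

value = factorial_form(547)
print(value, float(value))
```

Since 546!/547! is just 1/547, the expression reduces to 546/547, or about 0.998. That is the chance that one particular envelope does not get its own label back, which is a different quantity from the roughly 63% figure given earlier in the thread for at least one match across the whole stack.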