Old 04-11-2008, 12:55 PM   #42
Huckleberry
College Starter
 
Join Date: Dec 2001
Obviously we all agree that the issues Ksyrup has pointed out are a serious problem. So let's see what we can come up with to suggest to Markus as a solution.

As TroyF astutely observed, a stats-only (or almost stats-only) decision-making algorithm can run into problems with a player who has had an extended slump early in the year. But basing it only on ratings causes silliness like the Cy Young winner being released.

Seems like some sort of Bayesian adjustment to the current year's stats should be utilized. Let's take the example of a system that uses 0% ratings, 50% this year's stats, 30% last year's stats, and 20% stats from two years ago.

I think a minimum number of plate appearances (or innings pitched/batters faced) should be set before the current year's stats are counted on their own. That minimum should be the weighted average of the player's plate appearance totals from the previous two years. Until the player reaches that amount, his current-year stats get filled out with stats from the other two years.

Example:

Garrett Atkins Stats (I ignored HBP for this quick analysis, so his OBP here is lower than his actual OBP):
Code:
Year       PA     AB      H    2B   3B    HR    BB    SO    BA   OBP   SLG
2006      695    602    198    48    1    29    79    76  .329  .399  .556
2007      684    605    182    35    1    25    67    96  .301  .364  .486
2008       37     35     10     2    0     0     1     5  .286  .324  .343
Obviously his 2008 numbers shouldn't be weighted at 50% all on their own yet. So what portion of that 50% should be his 2008 stats? The first thing I did was weight the plate appearances from the previous two seasons at the same ratio as the modifiers (30% and 20%, which works out to 60% last year and 40% two years ago). So the baseline plate appearance total that 2008 has to reach before its stats are considered on their own is 0.6 x 684 + 0.4 x 695 = 688.4 PA. Beyond that, the calculation uses the 2008 stats that have actually been accrued and then fills in the remaining plate appearances needed to get to 688.4 (651.4 right now) with stats from the previous two years. Once again, 60% of those fill-in stats come from 2007 and 40% come from 2006.

So currently, the stats that the game would use for Atkins' current year and last two years would look like this:
Code:
Year       PA     AB      H    2B   3B    HR    BB    SO    BA   OBP   SLG
2006      695    602    198    48    1    29    79    76  .329  .399  .556
2007      684    605    182    35    1    25    67    96  .301  .364  .486
2008**  688.4  606.3  188.3  40.0  0.9  25.2  68.9  88.3  .311  .374  .504
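For anyone who wants to check the math, here is a rough Python sketch that reproduces the 2008** line. It assumes the fill-in works by taking the 60/40 weighted totals of the previous two seasons, scaling them down to the share of the baseline PA still missing, and adding them to the actual 2008 line. The function and variable names are made up for illustration and say nothing about how the game itself is coded. OBP is computed as (H + BB) / PA, since HBP is ignored here.

```python
# Illustrative sketch only: names and structure are assumptions, not game code.
COUNTING = ("PA", "AB", "H", "2B", "3B", "HR", "BB", "SO")

def make_line(values):
    """Build a stat line (dict) from values given in COUNTING order."""
    return dict(zip(COUNTING, values))

def weighted_prior(last_year, two_years_ago, w_last=0.6, w_prior=0.4):
    """60/40 weighted totals of the two previous seasons."""
    return {k: w_last * last_year[k] + w_prior * two_years_ago[k] for k in COUNTING}

def blend_current_year(current, last_year, two_years_ago):
    """Top up the current-year line with prior-year stats until it reaches
    the baseline PA; past the baseline, the current line is used as-is."""
    prior = weighted_prior(last_year, two_years_ago)
    baseline_pa = prior["PA"]
    # Share of the baseline PA that the current season has not covered yet.
    fill = max(0.0, (baseline_pa - current["PA"]) / baseline_pa)
    return {k: current[k] + fill * prior[k] for k in COUNTING}

def rates(line):
    """BA, OBP and SLG from counting stats (HBP ignored, so OBP = (H+BB)/PA)."""
    total_bases = line["H"] + line["2B"] + 2 * line["3B"] + 3 * line["HR"]
    return {
        "BA": line["H"] / line["AB"],
        "OBP": (line["H"] + line["BB"]) / line["PA"],
        "SLG": total_bases / line["AB"],
    }

y2006 = make_line((695, 602, 198, 48, 1, 29, 79, 76))
y2007 = make_line((684, 605, 182, 35, 1, 25, 67, 96))
y2008 = make_line((37, 35, 10, 2, 0, 0, 1, 5))

blended = blend_current_year(y2008, y2007, y2006)
print({k: round(v, 1) for k, v in blended.items()})
print({k: round(v, 3) for k, v in rates(blended).items()})
```

Swapping in a different current-year line (say, 370 PA at the same pace) shows how the prior years' share shrinks as the season goes on.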
Just to show how the effect changes throughout the year, let's assume that Atkins keeps up his 2008 performance through 10 times as many plate appearances (370):
Code:
Year       PA     AB      H    2B   3B    HR    BB    SO    BA   OBP   SLG
2006      695    602    198    48    1    29    79    76  .329  .399  .556
2007      684    605    182    35    1    25    67    96  .301  .364  .486
2008      370    350    100    20    0     0    10    50  .286  .324  .343
2008**  688.4  629.3  187.1  38.6  0.5  12.3  43.2  90.7  .297  .335  .419
The 2008** line is of course the one that the game would use. The effect diminishes continuously as the season goes on, and once the PAs for the current year reach the magic number (688.4 here), it disappears completely.
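To put numbers on how quickly the effect fades, here is a small Python sketch of the share of the weighted prior-year totals that still gets added at various points in the season. It assumes the share drops linearly with current-year PA, which is how the example above works out. The function name and the sample PA checkpoints are hypothetical.

```python
# Illustrative sketch only: fill_fraction and the checkpoints are assumptions.
BASELINE_PA = 0.6 * 684 + 0.4 * 695  # weighted 2007/2006 plate appearances (688.4)

def fill_fraction(current_pa, baseline_pa):
    """Share of the weighted prior-year totals still added to the current line.
    Drops linearly to zero as current-year PA approaches the baseline."""
    return max(0.0, 1.0 - current_pa / baseline_pa)

for pa in (37, 185, 370, 555, 700):
    print(f"{pa:>4} PA: {fill_fraction(pa, BASELINE_PA):.1%} of the prior-year totals added")
```

At 37 PA nearly all of the line is borrowed from 2006 and 2007, at 370 PA a bit under half of the weighted prior totals is still being added, and past the baseline nothing is.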

Just one way to do it.