I wanted to share some combine number findings with the community here. I am still working out which attributes will work best for Madden 11 and the FBG ratings, so here is what I have found so far.
First of all, I have all of the combine and pro day data from 1998 through 2010, including the official heights, weights, and times. Most of this data is NOT normally distributed. This is most likely because scouts only look at the guys who run and test well enough to have a shot at an NFL training camp, although other organizations like the AFL and CFL are usually present. Most of the data is therefore skewed toward the better end of the distribution. For instance, the population mean for the 40 yard dash is 4.81. However, because of the skewness of the distribution, a typical time is actually closer to 4.59.
I created two different charts that display the optimal rating assigned to each performance mark. This was done by setting the next number up (or down) from the fastest/highest mark to 100. For instance, the best 40 time recorded is 4.21, set this year by Trindon Holliday. Now I know what many of you are thinking: I thought Holliday ran a 4.35! How did you get 4.21, a new combine record? Well, what many people do not know is that there are actually three timings associated with every run. Per NFLDraftScout.com, which is owned by TSX, the same provider as my site:
"NFLDraftScout.com uses the best verifiable 40-yard time for each player. There is no single, official 40-yard time for any player, even those who run at the Indianapolis Combine. Those players who participate in the 40 yards at the Combine actually run twice and on each run they are timed by two hand-held stopwatches and one electronic timer (that is actually initiated by hand on the player's first movement). Combine data includes all six of those times for each player, but no single official time. Team scouts and coaches have various approaches for getting the 40 time they use from those six timings. Some use averages. Some throw out slowest and fastest and then average the rest. In deference to each player, NFLDraftScout.com attempts to use the best verifiable time that seems appropriate for each player. That is the 40 time we post."
The times that are posted on NFL.com are NOT the official times. NFL.com typically posts the SLOWEST of only ONE of the three best times. They do this because they get the information in real time via NFL Network and need to publish it instantly (for those who watched it on TV or followed it online). They also do not wait for the revised times to be posted. This is critical to any analysis of combine statistics.
Now back to the example. The TOTAL range for Holliday this year among all three timings was 4.18 to 4.38. His BEST MEDIAN (meaning the middle) time was 4.21. That is the number we post. The good thing is that all of our data uses this philosophy, so the 4.24 posted by Chris Johnson is not a 4.10. It is still a 4.24, which was his best median time. NFL Network actually got this one correct. I guess even a blind squirrel finds a nut every once in a while.
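For anyone who wants to play with the idea, here is a rough Python sketch of one way a "best median" could be computed. The exact procedure is not spelled out above, so this assumes one reading: take the middle of the three timings on each run, then keep the faster of the two runs. The function name `best_median_time` and the timings are made up for illustration and are not real combine data.

```python
from statistics import median


def best_median_time(runs):
    """Return the lowest per-run median from a list of runs.

    Each run is a sequence of timings in seconds, for example two
    hand-held stopwatches and one electronic timer.
    Assumption: "best median" means the faster of the two run medians.
    """
    return min(median(run) for run in runs)


# Hypothetical timings for an imaginary player, NOT real combine data.
runs = [
    (4.52, 4.47, 4.55),  # first run: three timings
    (4.49, 4.44, 4.58),  # second run: three timings
]

print(best_median_time(runs))
```

The median ignores a single stray fast or slow stopwatch on each run, which is why it is less jumpy than simply taking the fastest of all six timings.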
What this means for the ratings is that I took the best mark, went one interval up or down, and set that to 100. This avoids having to put any 100s in the game, which in my experience should be impossible. Should someone break that mark in the near future, they will be assigned a 99 as well or the system will be readjusted.
So for the 40 yard dash, since the best recorded time is 4.21, the next interval that is better is 4.20. This is set to 100, so 4.20 = 100 SPD via the 40 yard dash. The lower bound was set at 20 for one interval worse than the worst mark. For the 40 yard dash, this was 6.16, set by QB Mike McQueary of Penn State in 1998. Therefore 6.17 was set to 20.
What I then did was assign the skewed average a rating that seemed to fit. In the case of the 40 yard dash, the population average is 4.81 and the average after accounting for skewness is 4.59. A fair rating for this number IMO is 80, so I set 4.59 to 80. Now remember, I also have the upper and lower bounds. Fit a simple polynomial curve through all three points (upper bound, lower bound, and skewed average) and BINGO, you can find the SPD rating for every player who ran. This was then done with every drill since 1998.
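If you want to try the curve yourself, a minimal Python sketch follows. The only numbers taken from this post are the three anchor points (4.20 = 100, 4.59 = 80, 6.17 = 20). Treating the "simple polynomial curve" as a quadratic through exactly those three points is an assumption, as are the names `quadratic_through` and `rate` and the choice to round to whole ratings.

```python
def quadratic_through(points):
    """Return f(x) for the unique quadratic through three (x, y) points.

    Uses Lagrange interpolation. Assumption: the "simple polynomial
    curve" is a degree-2 polynomial hitting all three anchors exactly.
    """
    (x0, y0), (x1, y1), (x2, y2) = points

    def f(x):
        return (
            y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
            + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
            + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
        )

    return f


INTERVAL = 0.01    # 40 times are recorded to the hundredth of a second
BEST_TIME = 4.21   # best recorded 40 time in the data set
WORST_TIME = 6.16  # worst recorded 40 time in the data set

ANCHORS = [
    (round(BEST_TIME - INTERVAL, 2), 100),  # one interval better than best
    (4.59, 80),                             # skewed average, rated 80
    (round(WORST_TIME + INTERVAL, 2), 20),  # one interval worse than worst
]

speed_curve = quadratic_through(ANCHORS)


def rate(time_40):
    """SPD rating for a 40 time, rounded to a whole number (assumed)."""
    return round(speed_curve(time_40))


for t in (4.21, 4.40, 4.59, 4.81, 5.20):
    print(t, rate(t))
```

Under these assumptions the record 4.21 comes out at 99, which matches the goal of keeping 100s out of the game. A different curve shape (a cubic, or a piecewise fit) would give different ratings between the anchors.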
The result is a chart that shows how a player should be rated given his drill marks. There were a total of 13490 40 times recorded during this period. Using this method, only 348 players achieved a SPD rating of 90 or higher (2.6%), 3959 had an 80 SPD or higher (29.3%), and 7919 had a 70 SPD or higher (58.7%). It truly makes that 90 SPD rating elite.
The second method I utilized was setting the population average to the average Madden 11 SPD, ACC, AGI, etc. This means that since the average 40 time is 4.81 and the average Madden SPD is 74, a 4.81 40 time = 74 SPD. The upper and lower bounds were also used as in the first method.
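To see how the two methods differ at the same 40 time, here is a small Python sketch that builds both curves side by side. The anchor points come from this post; the quadratic fit, the helper name `quadratic_through`, and the sample times are assumptions for illustration only.

```python
def quadratic_through(points):
    """Unique quadratic through three (x, y) points (Lagrange form).

    Assumption: both methods use a degree-2 curve through their anchors.
    """
    (x0, y0), (x1, y1), (x2, y2) = points

    def f(x):
        return (
            y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
            + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
            + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
        )

    return f


UPPER = (4.20, 100)  # one interval better than the best mark
LOWER = (6.17, 20)   # one interval worse than the worst mark

# Method 1: skewed average of 4.59 pinned to an 80 SPD.
method_1 = quadratic_through([UPPER, (4.59, 80), LOWER])
# Method 2: population average of 4.81 pinned to the Madden average of 74.
method_2 = quadratic_through([UPPER, (4.81, 74), LOWER])

# Sample times chosen only for illustration.
for t in (4.30, 4.45, 4.59, 4.81, 5.00):
    print(t, round(method_1(t)), round(method_2(t)))
```

Both curves share the same end points, so any difference between them comes entirely from where the middle anchor sits.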
This led to some very different results. For the SPD rating, out of the 13490 players that were calculated, a whopping 691 had a SPD above 90 (5.1%), 5689 were above 80 (42.2%), and 9225 were above 70 in SPD (68.4%).
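The percentages for both methods can be checked straight from the counts quoted in this post. The short Python snippet below only redoes that arithmetic; the dictionary layout and the helper name `share` are just for illustration.

```python
TOTAL_TIMES = 13490  # number of 40 times in the 1998-2010 data

# Player counts per SPD threshold, as reported for each method.
counts = {
    "method 1": {90: 348, 80: 3959, 70: 7919},
    "method 2": {90: 691, 80: 5689, 70: 9225},
}


def share(count, total=TOTAL_TIMES):
    """Percentage of all timed players, rounded to one decimal place."""
    return round(100 * count / total, 1)


for method, tiers in counts.items():
    for threshold, count in tiers.items():
        print(f"{method}: {count} players at SPD {threshold} tier = {share(count)}%")
```

Put side by side, method 2 puts roughly twice as many players in the 90 tier as method 1 (691 versus 348).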
These are very different numbers from one method to the other. The inflation of the ratings in the present system provided by EA is evident even when using standardized data! My questions for the community are:
1. Which method do you prefer and why?
2. What would you do differently to determine attribute ratings using hard data, if anything?
I want to be quite clear that I would like thought-provoking opinions and advice, as what is said here may determine how the attributes in the FBG ratings are calculated. Thanks for your time, and sorry for the novel.
Dan B.
www.fbgratings.com/members