I was reading an article about MLB using the new Pitch FX system to record data and found it interesting, so I went and downloaded a few games' worth to look over. While I was going over the play-by-play data (which is the primary pitch data) I thought to myself, "Hey, I bet I could use this data to help make the MLB2K8 CPU AI (and the game overall) a bit more realistic". So I downloaded ALL 702,000+ pitches in the MLB database and got to work. What I came up with is this set of sliders, which I think should provide as realistic a game of baseball as MLB2K8 is capable of. There are still some glaring problems, which I will point out, but I think it's pretty damn close.
Ok, now for those who haven't already scrolled down or just closed the thread altogether, here's the data I collected, how I used it, and the results I eventually got after way too many games of MLB2K8. I swear these results are not "cooked" or cherry-picked in any way. The MLB2K8 game data I present comes from the last 21 games I played with this set of sliders. The only tweak I made to these sliders over that time was a very slight change to the AI Pitching after the patch was released, because the AI started throwing more balls.
First, let's start with the MLB Pitch FX data I downloaded:
<iframe width='760' height='175' frameborder='1' src='http://spreadsheets.google.com/pub?key=pFLe20s9yWG43C6BcZA82CA&output=html&gid=0& single=true&range=a2:f8'></iframe>
This is the actual data from MLB. Of the pitches I downloaded, 672,823 had the data I needed. As you can see in the first (blue) table, I broke them down into 5 basic categories and then figured the percentage of each. I then went a step further and broke out only the pitches that were swung at, with those percentages. Here's what these stats mean for those interested:
Total Balls: What it sounds like. Any pitch not swung at and not termed a strike. Includes pitchouts, hit batters and such.
Total Called Strikes: A pitch not swung at and called a strike.
Total Swinging K's: A pitch swung at and not hit.
Total Foul Balls: A pitch fouled off that does not result in an out.
Total Hits In Play: This is any pitch put into play that results in an out or a hit.
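For anyone who wants to run the same kind of breakdown, here's a rough sketch in Python of how the two sets of percentages can be figured from the five category counts above. The function name, the category labels, and the sample counts are all made up for illustration. They are NOT the real MLB numbers and this is not the spreadsheet I actually used.

```python
# Rough sketch: turn five pitch-category counts into percentages.
# Labels and sample counts are hypothetical, not the real Pitch FX data.

SWING_OUTCOMES = ("swinging_strike", "foul", "in_play")
TAKE_OUTCOMES = ("ball", "called_strike")


def pitch_breakdown(counts):
    """Return (overall %, % of swings only, swing rate %) for a dict of counts."""
    total = sum(counts.get(o, 0) for o in SWING_OUTCOMES + TAKE_OUTCOMES)
    swings = sum(counts.get(o, 0) for o in SWING_OUTCOMES)
    if total == 0:
        return {}, {}, 0.0

    # Share of ALL pitches that fell into each category.
    overall = {o: 100.0 * counts.get(o, 0) / total
               for o in SWING_OUTCOMES + TAKE_OUTCOMES}

    # Share of SWINGS ONLY that ended in each swing result.
    of_swings = {}
    if swings > 0:
        of_swings = {o: 100.0 * counts.get(o, 0) / swings
                     for o in SWING_OUTCOMES}

    swing_rate = 100.0 * swings / total
    return overall, of_swings, swing_rate


# Made-up example: 100 pitches, 45 of them swung at.
sample = {"ball": 40, "called_strike": 15,
          "swinging_strike": 10, "foul": 15, "in_play": 20}
overall, of_swings, swing_rate = pitch_breakdown(sample)
```

The point of keeping two tables is that the "overall" numbers depend on how many balls get thrown, while the "swings only" numbers describe what happens once the bat actually moves.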
This is the main data I then used to adjust sliders to attempt to duplicate these real life averages. I had no idea what I was getting into..... I figured I'd play a few games, take some data, and adjust sliders. But that was too simplistic.
To start with, I played 20 games with the base sliders I had been using, recording in a notebook every time the AI swung and that swing's result. I then put that into a database to find what averages I was getting. Once I had a base to work from, I adjusted sliders, played a few games, looked at the results, adjusted, made marks in my notebook, reviewed data, adjusted....and on and on and on.
I then added another set of stats. I logged onto espn.com and reviewed the MLB 2007 team averages such as hits per game, doubles per game, K's per game, etc. I then took it upon myself to try to get the AI to average something close to these as well.
After playing a crap-load of games, adjusting every so often and looking at my stats, I finally had a set of sliders that was producing some realistic results. It's important to remember these results are averages over a 21 game period. I had games that were wild in terms of Swinging Strikes and such, but things would eventually swing the other way in a later game.
These next two tables are the averages of the 21 games played. I also included another table that I was using that shows the results of the latest 10 games. These games were played with my Cardinals fantasy draft franchise against a good variety of teams. I played two series against the Cubs and Brewers, which are the power teams in my franchise right now. They are #3 and #2 in home runs and average, and both are fighting for the NL Central lead. Then I played a series against the TB Rays, who are an average team, literally #13, #15, etc. in their ratings. Then I played a couple of series against two bottom dwellers, the Dodgers and Pirates. So I think I had a good combination to draw data and stats from.
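If you'd rather not do the notebook-and-database routine by hand, here's a minimal sketch in Python of the tallying step. It assumes each game is logged as a simple list of swing results, and it pools every swing across all games before figuring percentages. Both the log format and the pooling choice are assumptions for this example, and the sample entries are made up.

```python
from collections import Counter


def swing_averages(games):
    """Pool swing results from several games and return each result's % share.

    games: a list with one entry per game, each entry a list of result
    strings (hypothetical labels such as "miss", "foul", "in_play").
    """
    pooled = Counter()
    for game in games:
        pooled.update(game)  # add this game's tallies to the running total

    total = sum(pooled.values())
    if total == 0:
        return {}
    return {result: 100.0 * n / total for result, n in pooled.items()}


# Made-up log of two very short "games", just to show the shape of the data.
log = [
    ["foul", "in_play", "miss", "in_play"],
    ["miss", "in_play"],
]
averages = swing_averages(log)
```

Pooling means a game with lots of swings counts for more than a short one. Averaging each game's percentages instead would weight every game equally, which is worth keeping in mind when one wild game shows up in the log.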
<iframe width='539' height='359' frameborder='1' src='http://spreadsheets.google.com/pub?key=pFLe20s9yWG43C6BcZA82CA&output=html&gid=1& single=true&range=a1:d17'></iframe>
Please note these are percentages of the results when the AI swings the bat. The Called Strikes and Balls percentages are out of all pitches. Balls are the number of balls that I threw, so these aren't entirely AI dependent. They have as much to do with the fact that I don't throw enough balls.
One glaring thing sticks right out. No matter what I set the sliders to, even in my initial stages of collecting data, there aren't enough foul balls by the AI! This 15% or so average stayed pretty much consistent throughout everything (and of course the AI Foul Balls slider is at 100%). If you look at the Hits in Play and Foul Balls, you'll notice these basically offset each other: the AI is short on fouls and over on balls in play by about the same amount. So I think it's safe to say that 2K needs to increase the number of foul balls hit by the AI.
But you'll also notice I'm very close to real life MLB averages with the strikes, both swinging and called. And the MLB average is 45.58% of pitches swung at...if you factor in that I don't throw enough balls, I'm pretty close here as well.
Finally my last set of tables:
<iframe width='835' height='375' frameborder='1' src='http://spreadsheets.google.com/pub?key=pFLe20s9yWG43C6BcZA82CA&output=html&gid=2& single=true&range=a1:i17'></iframe>
Again, these are the overall 21 games and the last 10 games. You'll notice I have the CPU throwing just about the right number of balls. In the last 10 games it went up some due to the patch, which seems to have somewhat increased the balls thrown. I adjusted for this by raising the Throw Strike slider a notch, and the last two games brought the number back down to normal.
Everything here is also pretty close to MLB averages, with the exception of the AI home runs overall. I think I'm to blame there....I got absolutely shelled by the Cubs and Brewers when I played them. They are home run hitting machines in my game and skewed the stats somewhat. You'll notice the last 10 games are exactly average.
Also you'll notice the AI just doesn't walk. Certainly I don't throw enough balls, so that may be why, but I also suspect something else at work here causing this. My guess is the large strike zones of the umps. I noticed that when I played a game with an umpire with a smaller zone, the AI did take walks, yet with the large zone umps, they take practically none. So there's another thing that 2K needs to change to improve the realism.
And so that's it. Way more work than I initially thought it would be, but ultimately I think I've gotten a realistic set of sliders and have the data and stats here to back that up. I also hope that 2K (and Sony for that matter) are aware of this Pitch FX data. There's a wealth of data in there that would improve baseball games like you wouldn't believe. If they aren't, then they need to hire me and pay me a crap load of money to explain it to them.
And I'll finish this off by repeating two things: 2K, we need more AI Foul Balls and we need the strike zones tightened up slightly. With these two simple fixes (which could be addressed in a patch), you would actually have a pretty realistic game on your hands.