Old 03-24-2014, 08:40 PM   #4
SportsDino
College Prospect
 
Join Date: Oct 2001
While locks are the most powerful tool for cutting down the solution space, the number of historic locks is very limited and will not be enough to narrow the pool completely. The second tool is the 'ceiling', which involves a slightly more complex model of the solution space.

Every game is a binary decision: either A or B wins. So while you can represent every unique bracket with a 63-bit string, you can also represent it as a binary tree of depth 6 rooted at the championship game (with that decision producing the winner). The binary tree model allows us to introduce the following basic equation:

The number of possible outcomes from any node in the tree is twice the product of the possible outcomes of its two child nodes, or:

O = 2 * C1 * C2

Building up by induction, a first round game is considered to have two inputs of 1 each (one per team), and there are two results: team A wins or loses. So 2 * 1 * 1 = 2 outcomes. A second round game has two child games from the first round, each with 2 outcomes. All possible combos of those two games involve a 1 or 0 from the A game and a 1 or 0 from the B game, or 4 possible combos. Regardless of how the combos came in, the second round game itself is a win/lose decision, so 4 inputs each with 2 outputs gives 8 possible outcomes, or 2 * 2 * 2 = 8.

This process continues all the way to the championship, which calculates to the true number of possible brackets for the tourney: 2^63, or about 9.2 quintillion.
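The induction above can be checked with a few lines of Python. This is only a sketch of the O = 2 * C1 * C2 rule; the function name `outcomes` is made up for illustration, and it assumes a full 64-team tree with no locks or ceilings.

```python
def outcomes(depth):
    """Count the outcomes of a full sub-tree that is `depth` rounds tall."""
    if depth == 0:
        return 1  # a leaf is a single team, counted as an input of 1
    child = outcomes(depth - 1)
    return 2 * child * child  # O = 2 * C1 * C2 with two identical children


for rnd in range(1, 7):
    print(f"round {rnd}: {outcomes(rnd)} outcomes")

# Round 6 is the championship game: 2**63 = 9223372036854775808
```

Rounds 1 and 2 print 2 and 8, matching the hand calculation, and round 6 gives the full 63-bit count.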

What this allows for is the calculation of independent sub-trees, and the deeper you go into a sub-tree, the more closely related its teams are. Whether Florida makes the championship game has very little to do with Michigan this year, but it has a heck of a lot to do with UCLA. A reduction in a sub-tree carries forward to the rest of the tree. You may not be able to predict much about the B inputs, but in the 1-16 matchup you can guarantee the A input, so the second round game looks like 2 * 1 * 2 = 4 instead of 8. This matches our earlier result: every lock cuts the field in half.
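Here is a rough sketch of how a lock carries forward through the tree. The helper names `full` and `game` are invented for this example, and it assumes exactly one locked first round game with everything else left open.

```python
def full(depth):
    """Outcomes of an unconstrained sub-tree that is `depth` rounds tall."""
    return 1 if depth == 0 else 2 * full(depth - 1) ** 2


def game(c1, c2, locked=False):
    """Outcomes of one game: a locked game has one result instead of two."""
    return (1 if locked else 2) * c1 * c2


locked_game = game(1, 1, locked=True)        # the 1-16 matchup as a lock: 1
open_game = game(1, 1)                       # the other first round game: 2
second_round = game(locked_game, open_game)  # 2 * 1 * 2 = 4 instead of 8

# Carry the reduced sub-tree up to the championship, pairing it with an
# unconstrained sibling sub-tree of equal height at every round.
total = second_round
for depth in range(2, 6):
    total = game(total, full(depth))

print(total, full(6) // total)  # the second number is the reduction factor
```

The single lock leaves 2^62 brackets, exactly half of the unconstrained 2^63.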

The ceiling is a natural result of this new structure. For each team there is a string of games between it and the championship, and each of these has a probability of falling in favor of that team. There are also historical records for how far each seed has made it in the tournament; for instance, a 15 seed has never made the championship game. A ceiling is a prediction that a particular team cannot make it past a certain round. By setting one, although you cannot fix the inputs to that round, you can reduce the outcomes from that round (every combo where that team could have won becomes a guaranteed loss).

For instance, say you have a 3-14 matchup and you choose not to lock it so that the 14 always loses (14 seeds win it 14.7% of the time). However, you may decide that since in all of history a 14 seed made it past round 2 only 2/120 times (1.7%), you want to put a ceiling on the 14 seed. The inputs to the second round game are still 2 and 2, but in every combo where the 14 seed wins the first game, it now always loses the second. Out of 4 input combos, half involve the 14 seed, and normally each would generate both a win and a loss. Now you can remove the win possibilities (2 of them). So the possible outcomes of round 2 are now 2 * 2 * 2 - (2 * 2 / 2) = 8 - 2 = 6.
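The 8 - 2 = 6 count can be confirmed by brute force over the four-team pod. This is just an illustrative sketch: the labels "X" and "Y" for the other first round game and the function name `pod_outcomes` are placeholders, not anything from the actual bracket.

```python
from itertools import product


def pod_outcomes(ceiling_team=None):
    """Enumerate every way the first two rounds of a four-team pod can play out.

    If `ceiling_team` is given, that team is never allowed to win the
    second round game.
    """
    results = set()
    # One winner from each first round game: "3" or "14", and "X" or "Y".
    for w1, w2 in product(("3", "14"), ("X", "Y")):
        # The second round game is won by one of those two winners.
        for pod_winner in (w1, w2):
            if pod_winner == ceiling_team:
                continue  # the ceiling turns this win into a guaranteed loss
            results.add((w1, w2, pod_winner))
    return len(results)


print(pod_outcomes())      # 8
print(pod_outcomes("14"))  # 6
```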

Ceiling reductions are smaller than locks. However, constrained options within each sub-tree get multiplied at each depth level, so those 2 eliminated possibilities can end up eliminating quintillions of combos from the pool (a quarter of the full pool, or 2^61, assuming no other reductions).

It can also be considered in terms of a prime factorization: the total outcomes of the pool are now 2 * 3 * factor(rest o' tree) instead of 2^3 * factor(rest o' tree). Since powers of two are extremely common in these calculations, I usually left the tree in prime factor form for speed when working by hand.
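One way to mimic the prime factor bookkeeping in code is to store each count as a map from prime to exponent, so multiplying sub-trees is just adding exponents. This is a sketch under my own assumptions (a single 14 seed ceiling and nothing else constrained); the names `multiply`, `game`, `full`, and `value` are made up for the example.

```python
from collections import Counter


def multiply(*factorizations):
    """Multiply numbers held as {prime: exponent} maps by adding exponents."""
    total = Counter()
    for f in factorizations:
        total.update(f)  # Counter.update adds counts together
    return total


def game(c1, c2):
    """O = 2 * C1 * C2 in prime factor form."""
    return multiply(Counter({2: 1}), c1, c2)


def full(depth):
    """An unconstrained sub-tree `depth` rounds tall is 2^(2^depth - 1)."""
    return Counter({2: 2 ** depth - 1})


def value(f):
    """Expand a factorization back into an ordinary integer."""
    n = 1
    for prime, exponent in f.items():
        n *= prime ** exponent
    return n


# The pod with the 14 seed ceiling has 6 = 2 * 3 outcomes instead of 2^3.
tree = Counter({2: 1, 3: 1})
for depth in range(2, 6):
    tree = game(tree, full(depth))

print(dict(tree))   # exponent 61 on the prime 2, exponent 1 on the prime 3
print(value(tree))  # 3 * 2**61
```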

The ceiling effect changes based on the locks or other ceilings involved. To calculate the effect, you always subtract from the total number of outcomes the number of input strings that contain the team you are restricting. This can be done recursively by building up the number of inputs that contain the team from its first round leaf to the node where the ceiling hits, then multiplying by the size of the other sub-tree.
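Below is a minimal sketch of that subtraction, based on my reading of the rule above. The function name and its parameters are assumptions for illustration: `beaten_sizes` lists the outcome counts of each sub-tree the team had to beat before reaching the ceiling node (1 for its first round opponent, 2 for an open first round game next door, and so on).

```python
from math import prod


def ceiling_outcomes(own_side, other_side, beaten_sizes):
    """Outcomes of the game where a team hits its ceiling.

    own_side     -- outcomes of the child sub-tree containing the team
    other_side   -- outcomes of the opposite child sub-tree
    beaten_sizes -- outcome counts of the sub-trees the team beat on its way up
    """
    total = 2 * own_side * other_side
    # Input strings containing the team: the ways it can arrive at this node,
    # multiplied by the size of the other sub-tree.
    containing_team = prod(beaten_sizes) * other_side
    return total - containing_team


print(ceiling_outcomes(2, 2, [1]))     # 14 seed capped at round 2: 8 - 2 = 6
print(ceiling_outcomes(2, 1, [1]))     # same, but the other game is a lock: 3
print(ceiling_outcomes(8, 8, [1, 2]))  # cap at round 3 instead: 128 - 16 = 112
```

The second call shows how a lock elsewhere changes the size of the ceiling reduction, which is why the two have to be calculated together.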

Since this involves a subtraction, it will break the prime factorization, but often the result can be refactored, and an automatic tool would eliminate manual calculation as a concern anyway.

Using the lock and ceiling exclusively, we can create the baseline bracket based off historical results, which reduces the solution space to 2.6 billion.