Bryan Wilson Empirical Ratings
Ratings Blog
ENTRY 69 - 5-4-24 - System Bracket Generation Performance 2024
It is time once again to see how my bracket generator did this year.
We will answer the usual question: If you had to put your
faith in 100 brackets, would you be better off picking 100 random human ESPN brackets,
or 100 from my system? Or from some other system? Which produces the better "best" bracket on average? And
how does that answer change if the size of the pool changes?
First, a recap of the 2024 tournament and some things that are relevant to the
following discussion. There were 3 teams that traded the #1 spot throughout the season
and stood well above the rest of the field. Those teams were UConn, Purdue, and Houston.
Most people had the three teams in that order, and UConn's betting odds had been at the
top for a long time. The word "inevitable" was used to describe them. Houston had the
highest computer numbers for most of the season but people didn't trust them after early
exits in other recent seasons due to their offensive issues. They also had 3 key injuries,
and sustained a 4th to their best player during the tournament. Purdue had the undisputed
best player in college basketball, maybe in the last 30 years (Zach Edey), but the issue (like last
year) was whether the supporting cast was strong enough. The tournament was very chalky
this year through the Sweet 16, with all 8 of the 1 and 2 seeds surviving and only
a single double digit seed making it. (11) NC State
was the big story of the tournament, winning 5 games in 5 days in the ACC tournament
in order to qualify at all, then winning 4 more to reach the Final Four. (4) Alabama
also exceeded expectations to make the Final Four, although it was in the region with
the weakest 1 seed (UNC) and a 2 seed that wasn't very trustworthy (Arizona).
Ultimately, the finals returned to chalk with (1) UConn beating (1) Purdue,
giving UConn 12 straight tournament wins by double digits.
How did my advice about bracket picking in Blog Entry 68 hold up? Did I say some good things? Let's see:
- "I definitely would not pick a 16 seed to win this year, and probably not a 15 or 14 either." (We did get a shocker
in (14) Oakland which upset (3) Kentucky, but I did at least call this one out as the most likely 14 seed winner)
- "There are some double digit seeds with reasonable chances - New Mexico, Nevada, Colorado, and Drake." (Wow, these
guys all flopped. Only Colorado even won the first round, and then just barely!)
- "The numbers suggest we see one 1 seed, one 2 seed, one 3/4 seed, and one long shot in the final four." (Pretty close
to correct, just two 1 seeds instead of one 1 seed and one 2 seed)
- "Expect a little more chalk going into the Sweet 16" (Pretty much exactly what I observed above)
- "UNC, Marquette, Kentucky, and Kansas were all over-seeded based on their strength." (All four of these were upset
by worse seeded teams, and two of them were upset multiple rounds before they were expected to lose)
- "(1) UConn is the betting favorite to win the title but they drew a tough quadrant so choosing them to lose
somewhere could distinguish your bracket." (Bad take!)
- "Purdue would be my #1 choice" (They did make the finals, so I'll call this a partial win)
- "Auburn and Iowa State are both great teams that nobody is talking about." (Both of them big flops)
- "Illinois is the best of the 3 seeds" (They were the only one to make the Elite 8, so I'll claim victory)
- "(3) Kentucky, (3) Baylor, (4) Alabama, (6) BYU: Likely at least one of these teams will make the Elite 8." (Alabama made the
Final Four)
- "(5) Gonzaga and (5) St Marys are theoretically strong teams" (Well, one of them was)
- "The best really long shots are (11) New Mexico, (9) Michigan State, and (10) Nevada" (Already mentioned New Mexico
and Nevada as busts, but Michigan St did win their first one and gave UNC a good game)
Overall I'd say the advice was pretty solid, with a few key big misses.
Before we proceed on to the meat of this post where I break down my system's performance
in generating brackets this year, I have an important caveat. I mentioned in Entry 68 that
I would not be changing any of my system's team strengths manually because I felt that it
did not warrant it. HOWEVER, in generating my own brackets for my own pools I DID modify
the team strengths for some of my own personal reasons. I've watched a lot of basketball
this year and came to some of my own conclusions that disagreed with my program.
The short story is this: The generator available on my site was NOT modified manually
in any way. For my own personal brackets, I made these modifications:
- (1) UConn +1 strength because they were clearly the best team in my eyes.
- (1) Houston -2 because of their injuries and also general distrust in their offense, putting
them clearly behind UConn and Purdue but still #3
- (4) Kansas -1.5 because of some key injuries, which clearly hampered them in the Big 12 tournament
and made them play more like a 10 seed than a 4 seed
- (7) Florida -1 because their big man got injured in the SEC tournament
All four of these modifications turned out to be advantageous, so when I do my analysis below
I will mention at each step what effect these would have had.
Let's move on to chances of my system generating a perfect bracket through each round (using standard probabilities and unmodified strength):
Round 1 - 1 in 44.7 million
Round 2 - 3.31 x 10^-11
Round 4 - 3.70 x 10^-15
Perfect Bracket - 1.07 x 10^-15
Probability of perfect bracket maximized at SD=0.93, P = 1.10 x 10^-15
Probability of perfect bracket using strength modifications: 1.52 x 10^-15
Probability of perfect bracket flipping coins = 1 x 10^-19
This year's tournament is basically exactly in the middle in terms of probability. There was
about an average number of upsets and runs.
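As a rough illustration of where numbers like these come from, here is a minimal sketch that multiplies per-game probabilities together, assuming the games are independent. The function name is hypothetical and this is not the generator's actual code; it just reproduces the coin-flip baseline (63 games, First Four excluded) and converts the "1 in N" style used for Round 1 into scientific notation (about 2.24 x 10^-8).

```python
import math

def perfect_bracket_probability(game_probs):
    """Chance that every pick is right, assuming independent games.

    game_probs holds, for each game, the probability that the picked team wins.
    """
    return math.prod(game_probs)

# Coin-flip baseline: 63 games, each a 50/50 guess.
coin_flip = perfect_bracket_probability([0.5] * 63)
print(f"coin flips: {coin_flip:.2e}")

# Converting the "1 in 44.7 million" Round 1 figure to scientific notation.
round1 = 1 / 44.7e6
print(f"round 1: {round1:.2e}")
```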
Good brackets mostly come down to having the Final Four correct. Let's look at final four
incidence rates next. I have a handy chart here to compare which systems had the best probability
at correct Final Fours. As will be a theme throughout the rest of this post, a lot of my usual
sources of information disappeared or were unavailable this season. In particular, ESPN decided
to divulge basically none of the information they usually do about people's picks. I'm sure
they could have, and I can't imagine why they chose not to.
As a reminder, "Typical Strength" represents picking based on how strong each
seed line typically is in past tournaments. "Exp Max" (labeled "EV Max" in the percentile chart further down) is my system on a setting that picks
based on the ratio of the expected number of ESPN points each team will earn in the tournament,
which tends to be relatively conservative. KenPom uses kenpom.com's stated tournament
probabilities of advancing to each round. Coin Flips is just flipping a 50/50 coin for every
matchup in the tournament, included as a baseline.
Final Four % | System SD=1.0 | System (with mod) | Typical Strength | System Exp Max | Exp Max (mod) | KenPom | Coin Flips |
UConn | 31 | 34.4 | 33.6 | 40.8 | 45.4 | 36.5 | 6.25 |
Alabama | 10.2 | 10.2 | 8.9 | 9.3 | 9.2 | 12.5 | 6.25 |
NC State | 1.3 | 1.6 | 1.1 | 0.1 | 0.2 | 1.0 | 6.25 |
Purdue | 37.8 | 37.3 | 33.6 | 53.9 | 54.6 | 36.8 | 6.25 |
All Four | 1 in 6436 | 1 in 4775 | 1 in 9047 | 1 in 48895 | 1 in 21924 | 1 in 5956 | 1 in 65536 |
3 Correct | 1.37% | 1.55% | 1.15% | 2.07% | 2.33% | 1.84% | 0.09% |
2 Correct | 15.5% | 16.7% | 14.6% | 24.7% | 27.2% | 17.8% | 2.06% |
1 Correct | 45.0% | 45.4% | 44.5% | 48.5% | 48.0% | 45.5% | 20.6% |
UConn Champ | 12.1% | 15% | 11.8% | 14.3% | 18.3% | 16.3% | 1.56% |
As often happens in situations with one unlikely Final Four team, the chances of
getting all four of them correct comes down to just the chances of that unlikely team.
Small deviations in the other three teams don't really matter. My system at normal
settings (especially with the personal modifications) had the best chances for
NC State to make the final four. KenPom probably still had the best overall
position because of good odds of 3 correct or all 4, as well as good chances
for UConn to win it all (even though KenPom still had Houston as the favorite).
Exp Max had the best odds of a decent bracket (2 or 3 final four correct),
mainly because it is just more conservative than the others and advances (1) UConn
and (1) Purdue more often.
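The bottom rows of the chart can be rebuilt from the four team rows. Here is a minimal sketch, assuming the four regions are independent (each Final Four team came from a different region) and using a hypothetical helper name. Fed the "System SD=1.0" column it gives about 1 in 6436 for all four, and fed 1/16 per team it reproduces the Coin Flips column.

```python
from itertools import combinations
import math

def final_four_hit_distribution(probs):
    """Return {k: P(exactly k Final Four picks correct)} for independent regions."""
    n = len(probs)
    dist = {}
    for k in range(n + 1):
        total = 0.0
        # Sum over every way of getting exactly k of the n regions right.
        for hit in combinations(range(n), k):
            total += math.prod(p if i in hit else 1 - p for i, p in enumerate(probs))
        dist[k] = total
    return dist

# "System SD=1.0" column: UConn, Alabama, NC State, Purdue (as fractions).
system = final_four_hit_distribution([0.31, 0.102, 0.013, 0.378])
# Coin Flips column: a team must win 4 coin flips to reach the Final Four.
coins = final_four_hit_distribution([1 / 16] * 4)

print(f"system all four: 1 in {1 / system[4]:.0f}")
print(f"system 3 correct: {system[3]:.2%}")
print(f"coins all four: 1 in {1 / coins[4]:.0f}")
print(f"coins 2 correct: {coins[2]:.2%}")
```

Because the all-four probability is a straight product, the 1.3% for NC State drags it down far more than a few points of difference on UConn or Purdue can make up.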
3 Correct is probably good enough to contend in most medium to large size bracket pools. That
could reasonably achieve a score of around 1500. For comparison, I was in a group of
size 1084 on ESPN and a 1500 would have gotten 3rd place in the group.
For the "typical strength" bracket generator, it kind of goes both ways. On
one hand, typical strength always does worse when the actual favorite (UConn)
wins because it puts all four 1 seeds on equal footing. However, UConn's region
was really tough and Typical Strength ignores that fact. So in the end
it kind of balanced out.
Bring on the percentiles, and let's find out which system wins at each
threshold! Same as last year, I've decided to change how I do this analysis to a methodology
that may be slightly easier for someone to look at. For each of my
system's settings I will generate 20 million brackets.
(NOTE: This is different from previous years, where I generated only 100,000. I've modified
my generation setup to allow larger sample sizes, and 20 million will put me
about on equal footing with the ESPN pool size for once in the very high percentiles.) Then I will give you
the score of the bracket that sits at each of a number of percentiles. For
example, we will compare the 0.99 percentile level for each system
and see what ESPN score a bracket in the top 1% generated by that system would get. The implication
is that if each system got to generate 100 brackets and take its best one,
which system would win? I've added one more column as an approximation
of generating brackets using KenPom's tournament probability table as a
generator.
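For anyone who wants to try this on their own simulated scores, here is a minimal sketch of reading off the score at a given level. The function name and the sample scores are made up for illustration and are not real generator output; the idea is just to sort the scores and index into them.

```python
def score_at_fraction(scores, fraction):
    """Score of the bracket sitting at the given fraction of the sorted sample.

    fraction=0.99 means roughly 99% of the generated brackets scored lower.
    """
    ordered = sorted(scores)
    # Clamp so that fraction=1.0 returns the best bracket.
    index = min(len(ordered) - 1, int(fraction * len(ordered)))
    return ordered[index]

# Made-up scores purely for illustration: 0, 10, 20, ..., 990.
sample_scores = [10 * i for i in range(100)]

for fraction in (0.5, 0.75, 1.0):
    print(fraction, score_at_fraction(sample_scores, fraction))
```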
Here is the chart. As a reminder, the system with the HIGHEST
score in a given row is the "winner" of that row, and the best at generating
brackets when it gets that corresponding number of entries. It is also
the best at generating brackets of that ESPN score level.
Percentile | ESPN | SD 0.5 | EV Max | EV Max (mod) | SD 0.8 | SD 1.0 | SD 1.0 (mod) | SD 1.0 Typical | KenPom Approx. | KenPom SD 0.5 |
Best | 1810 | 1740 | 1740 | 1770 | 1800 | 1760 | 1780 | 1770 | 1800 | 1750 |
0.99999 | 1660 | 1640 | 1630 | 1640 | 1650 | 1630 | 1650 | 1630 | 1650 | 1650 |
0.9999 | 1590 | 1590 | 1580 | 1590 | 1580 | 1560 | 1570 | 1560 | 1580 | 1600 |
0.999 | 1520 | 1530 | 1520 | 1530 | 1500 | 1470 | 1480 | 1460 | 1500 | 1540 |
0.99 | 1420 | 1420 | 1410 | 1420 | 1370 | 1330 | 1350 | 1320 | 1370 | 1450 |
0.98 | 1390 | 1380 | 1360 | 1380 | 1320 | 1270 | 1300 | 1250 | 1310 | 1410 |
0.95 | 1310 | 1290 | 1240 | 1300 | 1190 | 1130 | 1170 | 1110 | 1190 | 1350 |
0.9 | 1210 | 1170 | 1120 | 1170 | 1070 | 1020 | 1060 | 1010 | 1080 | 1250 |
0.75 | 990 | 920 | 870 | 950 | 810 | 750 | 790 | 740 | 820 | 1080 |
0.5 | 670 | 720 | 710 | 740 | 630 | 580 | 600 | 570 | 610 | 780 |
0.25 | 480 | 570 | 560 | 580 | 500 | 460 | 470 | 450 | 480 | 600 |
0.1 | 330 | 490 | 480 | 490 | 420 | 390 | 390 | 390 | 400 | 510 |
0.01 | 110 | 390 | 390 | 390 | 330 | 300 | 300 | 300 | 310 | 410 |
For a bit of context in the analysis, it is worth mentioning the score each system
would have gotten by just generating a single bracket picking all favorites:
My System: 900
My System (with mod): 1230
Pick Higher Seed: 1210
KenPom: 1230
ESPN public: 1230
The above are all very similar except for my unmodified system, which suffers
from having UConn make the finals but not quite win it (the champion pick alone is worth 320 points).
It is worth noting (as a point of methodology) that KenPom actually had Houston as
the favorite to win the title, but only because Houston had an easier road to the title.
UConn was still KenPom's highest rated team entering the tournament.
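For reference, here is a minimal sketch of how an ESPN-style score adds up. It assumes the standard 10/20/40/80/160/320 points per correct pick by round; the post itself only states the 320 for the champion, so treat the other values as an assumption. The function name is hypothetical.

```python
# Points per correct pick in each round (assumed standard ESPN scoring;
# only the 320 for the champion is stated in the post).
ROUND_POINTS = [10, 20, 40, 80, 160, 320]
GAMES_PER_ROUND = [32, 16, 8, 4, 2, 1]

def espn_score(correct_picks_by_round):
    """Total score given the number of correct picks in each of the 6 rounds."""
    if len(correct_picks_by_round) != len(ROUND_POINTS):
        raise ValueError("expected one count per round (6 rounds)")
    for correct, games in zip(correct_picks_by_round, GAMES_PER_ROUND):
        if not 0 <= correct <= games:
            raise ValueError("correct picks cannot exceed games in the round")
    return sum(p * c for p, c in zip(ROUND_POINTS, correct_picks_by_round))

max_score = espn_score(GAMES_PER_ROUND)           # a perfect bracket
champion_only = espn_score([0, 0, 0, 0, 0, 1])    # value of the title pick alone
print(max_score, champion_only)
```

Under this scoring every round is worth 320 points in total, which is why missing the champion alone can separate otherwise similar brackets by a wide margin.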
One note about the ESPN public as well: I am beginning to suspect from experience
that many ESPN users are using the "smart bracket generator" option that it has,
which I believe just uses ESPN's BPI on a sufficiently conservative setting
to make the picks (or some other advanced
metric system). If this is true, then it significantly reduces my program's ability
to outcompete the public solely on knowledge of which upsets to pick and how often.
I believe this has been the case for the last several years. It explains why the
public has often mirrored pretty closely my own generator set to a somewhat conservative setting.
Overall the ESPN public seems to have barely come out on top this year, edging out my
preferred setting for making brackets (EV Max) by about 10-20 points at most of the
upper percentiles, including having the highest overall bracket at 1810. None of
the other settings I used were able to generate an 1810 even with 20 million
attempts each (although we can note, ESPN's 2nd highest bracket was 1760 which places
1810 as a pretty high outlier). ESPN's good showing this year is probably largely due
to picking UConn as the national champion in a whopping 24.7% of brackets.
My own generator's best overall setting was EV Max (with the personal modifications). It
was nearly identical with Normal at SD=0.5 (quite conservative) but had slightly
higher upside at the highest percentiles. SD=0.8 may have been even better at
generating elite brackets though (for pools of 10,000 people or more). SD=0.8 is
a nice blend of conservative but also giving NC State a fighting chance to make
the Final Four. It has virtually the same percentile numbers as KenPom at its
normal setting.
The best generation setting I was able to find was KenPom set to a very conservative setting
(SD = 0.5), which dominated all of the other settings (including ESPN Public) for pools of 10,000 or fewer
brackets, but suffered on the high end because NC State was too unlikely to be
picked for a Final Four run.
I generated around 50 brackets among a variety of bracket challenges online. I
used a variety of the above settings to cover my bases well. My best CPU generated
bracket was a 1370, which according to the above percentiles seems to be about
what you'd expect (looking at the 0.98 percentile). I unfortunately mixed things
up while recording brackets and do not know what settings generated that bracket.
In my family's ESPN group with 59 entries, I submitted 7 system generated brackets
and the best of the bunch was a 1070 that used the EV Max setting. It was in
the 82.1 percentile and had UConn vs Purdue correctly in the title game, but most
of the bracket was a disaster other than that. The only other Elite 8 team it had
correct was Alabama. The highest bracket in the group was an extraordinary 1470,
which was good enough for the 99.8 percentile. Even my best generator setting above
would have been quite unlikely to beat that with 59 attempts. If I got to make 59
brackets using the SD=0.5 level, I'd have an 18% chance of one of
them having at least 1470 points, which is pretty awful. KenPom 0.5 making 59 brackets
would have achieved at least the 1470 total with a 34% chance, which is better but still below 50%.
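The link between a per-bracket chance and a "best of N" chance can be sketched as below, assuming each generated bracket is an independent draw. The function names are hypothetical. Backing out the 18% and 34% figures for 59 brackets gives per-bracket chances of roughly 0.34% and 0.70% of reaching 1470, and the same formula says 100 brackets have about a 63% chance of producing at least one score at or above that generator's own 0.99 level.

```python
def chance_at_least_one(p_single, n_brackets):
    """P(at least one of n independent brackets clears the bar)."""
    return 1 - (1 - p_single) ** n_brackets

def implied_single_chance(p_pool, n_brackets):
    """Per-bracket chance implied by a best-of-n chance (inverse of the above)."""
    return 1 - (1 - p_pool) ** (1 / n_brackets)

# 100 brackets against the generator's own top-1% score.
print(f"{chance_at_least_one(0.01, 100):.3f}")

# Per-bracket chances implied by the 59-bracket figures quoted above.
print(f"SD=0.5:     {implied_single_chance(0.18, 59):.4f}")
print(f"KenPom 0.5: {implied_single_chance(0.34, 59):.4f}")
```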
See ya for more bracket fun next year!
ENTRY 68 - 3-18-24 - NCAA Tournament Special 2024
We've got another bracket in front of us. Time to dig into the data and see what it tells us about what kind of year we can expect!
First, a disclaimer - I have discussed in some previous bracket analysis posts some manual effects
of injuries that I have accounted for. I decided for this year to not dive into that,
because for the most part those injuries just don't have as big of an impact as people think,
especially if those teams have had a few games to get used to playing without those players.
However, here are the major injuries to be aware of: Houston has 3 players out for the season,
and another who may miss some time. I think this does affect their strength and might drop them
to the 3rd best chances instead of the best. Marquette is missing Tyler Kolek, but he should be
back for the tournament so I wouldn't say that affects anything. Same with Kansas, which is
missing two starters. They are both expected back and Kansas may actually be undervalued here
if they return. Also Florida had a player injured yesterday which significantly limits their
chances.
Let's start with my system's chances that each team makes the Final Four (by percent):
(Note that for First Four games, I've assumed my higher rated team will be the winner. I have cut the team
from the list if the probability rounds to 0.0)
(1)Houston: 40.5
(1)Purdue: 37.8
(1)Connecticut: 31
(1)North Carolina: 23.3
(2)Tennessee: 22.2
(4)Auburn: 19.5
(2)Arizona: 19.1
(2)Iowa State: 18.8
(3)Baylor: 14.4
(3)Illinois: 13.7
(4)Duke: 12.9
(2)Marquette: 11.6
(3)Creighton: 11.3
(4)Alabama: 10.2
(3)Kentucky: 9.5
(5)Saint Marys: 8.5
(5)Gonzaga: 7.8
(4)Kansas: 6
(6)Texas Tech: 5.8
(6)Brigham Young: 4.8
(11)New Mexico: 4.8
(5)Wisconsin: 4.7
(7)Florida: 4.4
(9)Michigan State: 4.2
(7)Dayton: 4.1
(5)San Diego State: 4
(10)Nevada: 3.8
(7)Texas: 3.7
(10)Colorado: 3.6
(8)Nebraska: 3.1
(8)Mississippi State: 3
(6)South Carolina: 2.7
(6)Clemson: 2.7
(10)Drake: 2.6
(8)Utah State: 2.1
(10)Colorado State: 1.9
(7)Washington State: 1.8
(9)Texas Christian: 1.7
(11)Oregon: 1.7
(8)Florida Atlantic: 1.5
(12)Grand Canyon: 1.5
(9)Texas AM: 1.5
(11)NC State: 1.3
(9)Northwestern: 1.3
(12)James Madison: 0.8
(13)Samford: 0.6
(11)Duquesne: 0.5
(13)College of Charleston: 0.5
(12)Mcneese State: 0.4
(12)Alabama Birmingham: 0.4
(13)Vermont: 0.2
(13)Yale: 0.2
(14)Oakland: 0.1
(14)Akron: 0.1
And here are the title chances for each team:
(1)Houston: 16.8
(1)Purdue: 13.3
(1)Connecticut: 12.1
(4)Auburn: 6.4
(2)Tennessee: 6
(2)Iowa State: 5.6
(1)North Carolina: 5.3
(2)Arizona: 4.5
(3)Illinois: 3.3
(4)Duke: 2.9
(3)Baylor: 2.5
(2)Marquette: 2.2
(3)Creighton: 2.1
(4)Alabama: 2
(3)Kentucky: 1.7
(5)Saint Marys: 1.4
(5)Gonzaga: 1.3
(4)Kansas: 1
(6)Texas Tech: 0.9
(6)Brigham Young: 0.8
(5)Wisconsin: 0.7
(5)San Diego State: 0.7
(7)Florida: 0.6
(11)New Mexico: 0.6
(9)Michigan State: 0.5
(7)Texas: 0.5
(7)Dayton: 0.5
(10)Colorado: 0.4
(8)Nebraska: 0.4
(10)Nevada: 0.4
(10)Drake: 0.4
(8)Mississippi State: 0.3
(6)South Carolina: 0.3
(6)Clemson: 0.2
(8)Utah State: 0.2
(7)Washington State: 0.2
(10)Colorado State: 0.2
(9)Texas Christian: 0.2
(8)Florida Atlantic: 0.2
(9)Texas AM: 0.2
(11)Oregon: 0.1
(9)Northwestern: 0.1
(12)Grand Canyon: 0.1
(11)NC State: 0.1
(12)James Madison: 0.1
Here is the number of teams reaching each threshold, by year, over the last several tournaments:
At least 10% Chance at Final 4:
2015 - 9
2016 - 18
2017 - 15
2018 - 13
2019 - 12
2021 - 9
2022 - 14
2023 - 13
2024 - 14
At least 1% Chance at Final 4:
2015 - 28
2016 - 43
2017 - 43
2018 - 42
2019 - 34
2021 - 46
2022 - 45
2023 - 49
2024 - 44
At least 10% Chance at Title:
2015 - 4
2016 - 4
2017 - 2
2018 - 2
2019 - 4
2021 - 2
2022 - 1
2023 - 2
2024 - 3
At least 1% Chance at Title:
2015 - 12
2016 - 18
2017 - 24
2018 - 19
2019 - 16
2021 - 22
2022 - 16
2023 - 22
2024 - 18
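The 2024 title rows can be checked directly against the title chances listed earlier. Here is a minimal sketch with a hypothetical helper name. Note that it works from the rounded percentages as printed, so a team shown as exactly 1 counts as reaching the 1% threshold.

```python
# Title chances (percent) copied from the list above, in the order given.
title_chances = [
    16.8, 13.3, 12.1, 6.4, 6, 5.6, 5.3, 4.5, 3.3, 2.9,
    2.5, 2.2, 2.1, 2, 1.7, 1.4, 1.3, 1, 0.9, 0.8,
    0.7, 0.7, 0.6, 0.6, 0.5, 0.5, 0.5, 0.4, 0.4, 0.4,
    0.4, 0.3, 0.3, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
    0.1, 0.1, 0.1, 0.1, 0.1,
]

def count_at_least(chances, threshold):
    """Number of teams whose (rounded) chance is at or above the threshold."""
    return sum(1 for chance in chances if chance >= threshold)

print("At least 10% chance at title:", count_at_least(title_chances, 10))
print("At least 1% chance at title:", count_at_least(title_chances, 1))
```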
The above numbers seem very average for recent times. There are a few primary title contenders
and a bunch of other teams have a reasonable shot. The 1 seeds (3 of them, at least) have about
average strength for a 1 seed. North Carolina's strength is more like that of a weak 2 seed.
There were an insane number of upsets in conference tournaments this year. The very strong teams
(like Vermont, Colgate, James Madison, Samford, Mcneese) mostly won their tournaments, which means
there are some dangerous teams. However, my system had most of the 12 seeds originally seeded as
13s before the upsets, so they will have to play up a bit in order to cause some upsets. The 14
seeds and beyond have a steep dropoff and are weaker than usual. I definitely would not pick a
16 seed to win this year, and probably not a 15 or 14 either.
Breakdown of expected number of each seed in the final four (with comparisons to last year):
1 seeds: 1.05 (vs 1.09 last year)
2 seeds: 0.72 (vs 0.72 last year)
3 seeds: 0.49 (vs 0.51 last year)
4 seeds: 0.49 (vs 0.3 last year)
The rest: 1.25 (vs 1.08 last year)
The "rest" category has heavy influence from the 5 and 6 seeds, which are quite strong. There
are some double digit seeds with reasonable chances - New Mexico, Nevada, Colorado, and Drake.
The numbers suggest
we see one 1 seed, one 2 seed, one 3/4 seed, and one long shot in the final four.
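The expected number of teams from a seed line in the Final Four is just the sum of that line's Final Four chances. Here is a minimal sketch using the rounded percentages from the list earlier in this entry (variable names are hypothetical). The 2, 3, and 4 seed sums land on 0.72, 0.49, and 0.49. The 1 seed sum from the rounded list comes out near 1.33, which is higher than the 1.05 quoted above, so treat this as a check on the list rather than an exact reproduction of the breakdown.

```python
# Final Four chances (percent) for the top four seed lines, from the list above.
final_four_pct = {
    1: [40.5, 37.8, 31.0, 23.3],   # Houston, Purdue, Connecticut, North Carolina
    2: [22.2, 19.1, 18.8, 11.6],   # Tennessee, Arizona, Iowa State, Marquette
    3: [14.4, 13.7, 11.3, 9.5],    # Baylor, Illinois, Creighton, Kentucky
    4: [19.5, 12.9, 10.2, 6.0],    # Auburn, Duke, Alabama, Kansas
}

# Expected count per seed line = sum of the individual probabilities.
expected = {seed: sum(pcts) / 100 for seed, pcts in final_four_pct.items()}

# Exactly four teams make the Final Four, so everyone else gets the remainder.
rest = 4 - sum(expected.values())

for seed, value in expected.items():
    print(f"{seed} seeds: {value:.2f}")
print(f"The rest: {rest:.2f}")
```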
Here are the most likely upsets by round and seed line. First round:
16 over 1: 0.27 upsets, most likely Stetson over UConn, 4.6% chance
15 over 2: 0.41 upsets, most likely Western Kentucky over Marquette, 12.8% chance (all of them except Saint Peters over Tennessee are viable)
14 over 3: 0.52 upsets, most likely Oakland over Kentucky, 17.0% chance
13 over 4: 0.80 upsets, most likely Samford over Kansas, 33.8% chance
12 over 5: 1.26 upsets, most likely James Madison over Wisconsin, 35.9% chance (all four are over 25%)
11 over 6: 1.71 upsets, New Mexico is favored over Clemson, 55.3% chance
10 over 7: 1.94 upsets, Drake is favored over Washington St, 52.4% chance
9 over 8: 1.94 upsets, Michigan St favored over Miss St, 54.3% chance (all four matchups are basically coin flips)
It is kind of shocking to see UConn as the most likely 1 vs 16 upset, but both teams have had a relatively
high level of variance in their results this year, which contributes to that.
The number of total upsets is expected to be less than usual, partially due to the observation from
before about suboptimal conference championship winners.
Second round:
8 or 9 over 1: 1.02 upsets, most likely Michigan State over UNC, 35.3% chance
7 or 10 over 2: 1.46 upsets, most likely Florida/Colorado over Marquette, 43.5% chance
6 over 3: 1.67 upsets, most likely Texas Tech over Kentucky, 46.8% chance
5 over 4: 1.73 upsets, with Gonzaga favored over Kansas, 52.6% chance
These numbers are lower across the board than last year, which means expect a little more chalk going
into the Sweet 16. The threat vectors continue to be the same since UNC, Marquette, Kentucky, and Kansas
were all over-seeded based on their strength. As noted before, Kansas could be undervalued here if they
are at full strength for the tournament.
Later rounds:
4 or 5 over 1: Auburn over UConn and Alabama over UNC are both at about 44%.
3 over 2: If Kentucky survives to the Sweet 16, they have great chances to beat Marquette (47%).
2 over 1: Arizona is a slight favorite to make the Final Four over UNC.
If you are picking exactly 1 bracket, first decide on two teams in the 1-2 seed range to move to
the final four. (1) UNC is a weak 1 seed so it might be a good idea to let them fall. (1) UConn is
the betting favorite to win the title but they drew a tough quadrant so choosing them to lose somewhere
could distinguish your bracket. (4) Auburn and (2) Iowa St are both capable of knocking them off. (2)
Arizona and (2) Tennessee are both good choices to reach the Final Four.
The best title chances lie with the teams everyone has thought are the best all year - UConn, Purdue,
and Houston. Everyone agrees that UConn is the best team right now and their betting odds are almost
twice as good as any other team. I think especially with how tough their region is, that is going
too far. Purdue and Houston have much easier regions and I would pick one of them to win it all.
Given Houston's injury situation Purdue would be my #1 choice but any of the three are defensible. I
don't think there is much value in choosing any other champion with just 1 bracket. If you want to
get "fancy", Auburn and Iowa State are both great teams that nobody is talking about. I also have
seen very little chatter about Arizona and they still have the tools to win it all, especially since
they are playing in their backyard for the entire tournament.
Among "long shot" teams, the 3 seeds are all very strong this year and any of them could make the
Final Four. (3) Kentucky is the weakest but they also play a high variance style that could have them either
knocked out early or capable of knocking off 1 or 2 seeds. Illinois is the best of the bunch but
is one of many teams in the tournament (including Kentucky, Baylor, Alabama, BYU, and others) that don't
play much defense so a bad shooting night will doom them. Or they could just shoot hot for 4 games and
make a run. Likely at least one of these teams will make the Elite 8. (4) Auburn is extremely strong
and could have been a 1 seed but they have to go through (1) UConn. (5) Gonzaga and (5) St Marys are theoretically
strong teams but they are a bit of a mystery. The best really long shots are (11) New Mexico,
(9) Michigan State, and (10) Nevada.
The above advice is best if you are really trying to get a high level bracket. If you are trying to
win a smaller pool of 100 or fewer brackets, the best advice remains to pick conservatively. You will
have to accept that many of your picks will be wrong, but trying to nab some good upsets and picking the
wrong ones can turn out even worse than just going safe. I would stick with teams seeded no worse
than a 3 seed (plus maybe (4) Auburn and (4) Duke) for the final four in a small pool.
As always, good luck and have fun bracketing!
ENTRY 67 - 3-17-24 - Final Bracket Projection 2024, Women's Basketball Rankings
Before getting into bracket projections, I'd like to first give a quick announcement that as of a few weeks
ago my site now offers women's college basketball rankings! I will not be doing as much upkeep on the
women's rankings page as I do for the men's. For example, game predictions will not be available. I also
at least for the moment will not be providing seed projections for the women's NCAA tournament. The reason
is that, from what research I have done so far, the committee seems to emphasize different things for
the women's selection than for the men's, so the same algorithm would not be effective for trying to seed the
teams. Also, there are some bugs to work out. For example, the site I use for game data has a different
format for labeling games as conference tournament games, which affects the auto bids. I am not ruling out
doing women's seed projections in the future though once these issues are addressed.
I do still intend to enter the women's bracket into my bracket simulator though, so you will be able to
generate random (but probability-weighted) brackets for the women's tournament by the end of today, same
as the men's. Fun!
First, a few quick words about the early bracket reveal that was done about a month ago.
The top four seed lines were revealed to be:
1: Purdue, UConn, Houston, Arizona
2: UNC, Tennessee, Marquette, Kansas
3: Alabama, Baylor, Iowa St, Duke
4: Auburn, San Diego St, Illinois, Wisconsin
Close: Dayton, Creighton, Clemson (in some order)
The committee member did not make many comments during the actual reveal show. He did especially
emphasize Purdue's noncon work and UConn's wins in Q1/Q2. He mentioned that all 12 committee members
unanimously had the same 1 seeds in the same order.
Bracketologists have read between the lines that road/neutral record and road wins are super
important this year. UNC was kind of shocking to see this close to the 1 line. Auburn was an
interesting case because they had 0 or 1 Q1 wins at this point in the season, so it is
interesting the committee was willing to rank them this high with no quality wins. Alabama
was in a similar position with an upside down Q1 record. Iowa State was seemingly punished for
a putrid noncon SOS.
Here are my computer's projected seeds on Selection Sunday. Remember that Sunday results
(besides the auto-bids) are generally ignored by the committee, so I ran my system without
Sunday's results. Last year Princeton (set to be a 15 seed in my system) upset Yale (a
projected 13 seed). I just set Princeton to a 13 because I figured the committee would be
lazy and put them in the same slot. Princeton did end up with a 15 seed in the final bracket,
so I won't make that mistake again. All of the expected 1-bid teams won on Sunday so I didn't
have to make any last second changes this time.
1: Purdue, Houston, Connecticut, North Carolina,
2: Iowa State, Arizona, Tennessee, Marquette,
3: Baylor, Illinois, Auburn, Creighton,
4: Kansas, Wisconsin, Duke, Alabama,
5: Brigham Young, Florida, Kentucky, Texas Tech,
6: South Carolina, San Diego State, Clemson, Washington State,
7: Colorado, Gonzaga, Dayton, Nebraska,
8: Saint Marys, Nevada, Utah State, Colorado State,
9: Oregon, Boise State, Florida Atlantic, Texas AM,
10: New Mexico, Texas Christian, Michigan State, Pittsburgh, Virginia,
11: Drake, Texas, Oklahoma, NC State, Grand Canyon,
12: Duquesne, Mcneese State, James Madison, Alabama Birmingham,
13: Samford, Vermont, Yale, College of Charleston,
14: Oakland, Akron, Long Beach State, Western Kentucky,
15: Colgate, South Dakota State, Morehead State, Stetson,
16: Longwood, Saint Peters, Montana State, Grambling State, Howard, Wagner,
Barely missed: St Johns, Northwestern, Seton Hall, Ohio State
Also, here are my personal guesses at the seeds:
1: Connecticut, Purdue, Houston, North Carolina,
2: Arizona, Tennessee, Iowa State, Marquette,
3: Creighton, Baylor, Illinois, Auburn,
4: Kansas, Kentucky, Duke, Alabama,
5: Wisconsin, Brigham Young, San Diego State, South Carolina,
6: Florida, Clemson, Texas Tech, Washington State,
7: Saint Marys, Gonzaga, Utah State, Nevada,
8: Dayton, Colorado State, Boise State, Florida Atlantic,
9: Nebraska, Texas, Texas Christian, New Mexico,
10: Northwestern, Michigan State, Oregon, Texas AM, Mississippi State,
11: Colorado, Oklahoma, Drake, NC State, Grand Canyon,
12: James Madison, Mcneese State, Alabama Birmingham, Duquesne,
13: Samford, Vermont, Yale, College of Charleston,
14: Oakland, Akron, Morehead State, Western Kentucky,
15: Long Beach State, Colgate, South Dakota State, Stetson,
16: Longwood, Saint Peters, Montana State, Grambling State, Howard, Wagner,
Barely missed: Virginia, Seton Hall, St Johns, Pitt
Some differences between me and my system:
- I left the first 3 seed lines the same, with slightly different order.
- People have been irrationally high on Kentucky all year and I think it will
be no different here. I move them up a line from 5 to 4.
- I think my program's biggest miss is Colorado at 7. It sees Colorado's predictive metric
high and treats them basically the same way that St Marys and Gonzaga are being treated.
The difference is those teams both got some big time wins, and Colorado's top end wins
are still a little lacking. The best argument for Colorado being higher is many of their
bad results were without their two best players, and they've played better recently.
Bracketologists seem to have ignored this so likely the committee does too.
- The 8 and 9 lines were where it got difficult. I saw basically everyone from 8-11 as
roughly equal and the committee could end up going any number of ways with those teams.
I favored the teams with the good noncon schedules and road records since that is what the reveal
taught us.
- My program has Pitt and Virginia in, and Miss St and Northwestern out. I tend to
agree with the bracket matrix which has these reversed. Especially Pitt has a really
horrible noncon SOS. To me Northwestern is the team my program might be correct to
leave out, because they also had a really bad noncon and a terrible road record, so I
am worried about them. The matrix has Northwestern absolutely safe at a mid 9 seed.
I wouldn't be so sure about that! I actually have no idea why my program hates
Miss St so much. I'll have to look into that in the offseason.
- I have a feeling the committee will be compelled to put in one of the four Big East
bubble teams. St Johns is highest in the NET but doesn't have great wins. Seton Hall
has the best wins but the worst NET. Providence has a lot of Q1 wins but is upside down
in Q1-Q3 which seems disqualifying. Villanova has a good stockpile of decent wins but
also some bad Q3 losses, and just loss quantity in general. I think Seton Hall has the
best case out of the four so I'm going to
ENTRY 66 - 4-5-23 - System Bracket Generation Performance 2023
It is time once again to see how my bracket generator did this year.
We will answer the usual question: If you had to put your
faith in 100 brackets, would you be better off picking 100 random human ESPN brackets,
or 100 from my system? Or from some other system? Which produces the better "best" bracket on average? And
how does that answer change if the size of the pool changes?
First, a recap of the 2023 tournament and some things that happened that are relevant
to our discussion. It was described throughout the year as being the year of parity, and
that no team really looked elite. The #1 spot cycled weekly. Some key injuries took their
toll. That is exactly what we saw in the tournament, with no #1 seeds left by the Elite 8
and a #4 as the best seed left in the Final Four. That team was (4) UConn, which most
bracketologists had as a 3 seed and which had spent some time at #1 in the country back in
December and January before a strange slump in January. They were back in full force,
winning every game by double digits in similar fashion to 2018 Villanova. For big time
runs we had the second ever 16 seed victory over a 1 seed, we had 15 seed Princeton in the sweet
16, and we had 9 seed Florida Atlantic in the Final Four. Somewhat shockingly, we had only
one 13 seed upset and no 12 seeds (after many people had all of those games as coin flips).
I'd first like to revisit Blog Post 65 from 2 weeks ago. I was despairing about how the
manual changes I made in subtracting 2 points from (2) UCLA and (4) Tennessee due to injuries (backed up by
betting market data) were now seemingly being refuted... by betting market data. However,
both teams ended up losing their Sweet 16 game as a favorite to a worse seeded team. I
don't like to be too results oriented, especially with a 2 game sample. But that sure
felt good to be validated by those results. I no longer regret bumping those two teams
down in my generator.
How did my advice about bracket picking in Blog Entry 64 hold up? Did I say some good things? Let's see:
- "They might be the weakest 1 and 2 lines in history." (They sure played like it! Only one 2 seed even made
the Elite 8, and they lost before the Final Four.)
- "The 12 to 13 seeds are shockingly strong this year." (Did not end up panning out. Most didn't even make it close.)
- "The numbers suggest we see one 1 seed, one 2 seed, and two other long shots in the final four."
(I think reasonable advice given how many people pick their brackets, but somehow still too safe)
- "Basically, expect just complete carnage going into the sweet 16. Nobody is safe. The 7-10 seeds are very dangerous."
(There were a total of four 1-2 seeds that did not make it, and a few others with close calls)
- "(1) Purdue is a weak 1 seed. (1) Kansas has a very difficult region so you might want to count
on them losing somewhere. (2) Arizona and (2) UCLA both are vulnerable along their path." (Ha ha! All four
of them lost before their seed suggested.)
- "(3) Baylor and (3) Gonzaga are both strong." (Went 1 for 2 there, with Gonzaga outperforming seed)
- "Uconn is a strong 4 seed." (You don't say!)
- "Stay away from (4) Indiana and (4) Virginia, they are both traps." (Yup!)
- "Some other good options higher up
are (5) Duke, (5) St Marys, and (5) San Diego St." (Well, SD St made the title game so I'll count victory here)
- "Heck, if you're feeling frisky then do some
research and pick exactly one of the 2-15 upsets. Maybe pick one where that 2 seed would have been
vulnerable to the 7-10 in the next round anyway to minimize the damage." (Arizona was my most likely
2 seed to fall before the Sweet 16, which fits perfectly here to pick Princeton)
- "I would stick with teams seeded no worse
than 3 seed for the final four in a small pool." (usually good advice, but a big oops this time)
Let's move on to chances of my system generating a perfect bracket through each round (using standard probabilities):
Round 1 - 1 in 655 million
Round 2 - 1.60 x 10^-13
Round 4 - 1.82 x 10^-17
Perfect Bracket - 3.37 x 10^-18
Probability of perfect bracket maximized at SD=1.44, P = 6.77 x 10^-18
Probability of perfect bracket flipping coins = 1 x 10^-19
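As a sanity check on the coin-flip figure, here is a minimal Python sketch. It assumes the
standard 63-game bracket (First Four games not counted), which is my assumption rather than
something stated above. The 3.37 x 10^-18 value is simply copied from the list.

```python
# Assumption: a 64-team bracket has 63 games (First Four not counted).
GAMES_IN_BRACKET = 63

# Every game picked by a fair coin flip.
p_coin_flips = 0.5 ** GAMES_IN_BRACKET

# The system's perfect-bracket probability, copied from the list above.
p_system = 3.37e-18

print(f"Coin flips: {p_coin_flips:.2e}")
print(f"System vs coin flips: about {p_system / p_coin_flips:.0f}x more likely")
```

The coin-flip value comes out to roughly 1.08 x 10^-19, in line with the rounded 1 x 10^-19 above.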
Owing mostly to the first round and Fairleigh Dickinson, this year is the least likely
tournament we've had since I started keeping records (in 2000). It is about 10 times less
likely than last year's tournament and 1000 times less likely than the typical tournament.
Good brackets mostly come down to having the Final Four correct. Let's look at final four
incidence rates next. I have a handy chart here to compare which systems had the best probability
at correct Final Fours. I have several settings of my own system, as well as ESPN brackets
and some other prominent systems, and complete coin flips for comparison.
Final Four % | System SD=1.0 | Typical Strength | System Exp Max | ESPN Public | KenPom | Austin Mock (The Athletic) | Coin Flips |
San Diego St | 12.2 | 7.4 | 12.2 | 2.1 | 8.7 | 6.9 | 6.25 |
Miami | 3.4 | 7.4 | 2.1 | 3.4 | 2.8 | 1.8 | 6.25 |
UConn | 9.7 | 8.9 | 9.2 | 11.1 | 16.4 | 17.5 | 6.25 |
FAU | 5.1 | 1.9 | 1.5 | 0.7 | 3.8 | 3.9 | 6.25 |
All Four | 1 in 48732 | 1 in 107992 | 1 in 282840 | 1 in 1.8 million | 1 in 65871 | 1 in 117925 | 1 in 65536 |
3 Correct | 0.13% | 0.08% | 0.04% | 0.01% | 0.11% | 0.08% | 0.09% |
2 Correct | 2.81% | 2.07% | 1.78% | 0.76% | 2.84% | 2.4% | 2.06% |
1 Correct | 24.4% | 21.22% | 21.29% | 15.74% | 25.68% | 25.02% | 20.6% |
UConn Champ | 2.4% | 1.7% | 1.5% | 2.3% | 5.1% | 6.5% | 1.56% |
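The "All Four" through "1 Correct" rows can be rebuilt from the four individual percentages.
The sketch below assumes the four picks are independent of each other (each team comes out of
a different region). That is my assumption about how those rows were built, and the function
name is only for illustration. Using the rounded System SD=1.0 column, it lands within
rounding of the chart.

```python
from itertools import combinations
from math import prod

def correct_count_distribution(probs):
    """Return [P(0 correct), P(1 correct), ...] for independent picks."""
    n = len(probs)
    dist = [0.0] * (n + 1)
    for k in range(n + 1):
        # Sum over every way of getting exactly k of the n picks right.
        for hits in combinations(range(n), k):
            dist[k] += prod(
                probs[i] if i in hits else 1 - probs[i] for i in range(n)
            )
    return dist

# System SD=1.0 column: San Diego St, Miami, UConn, FAU (rounded chart values).
system_sd1 = [0.122, 0.034, 0.097, 0.051]
dist = correct_count_distribution(system_sd1)

print(f"All four:  1 in {1 / dist[4]:,.0f}")
print(f"3 correct: {dist[3]:.2%}")
print(f"2 correct: {dist[2]:.2%}")
print(f"1 correct: {dist[1]:.2%}")

# Coin flips: each team has a 1 in 16 chance to win its region.
coin = correct_count_distribution([1 / 16] * 4)
print(f"Coin flips, all four: 1 in {1 / coin[4]:,.0f}")
```

Small differences from the chart are expected since the inputs here are rounded to one decimal.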
Because the Final Four was so unlikely this year, a lot of analyzing it comes down to
what size of pool you are interested in winning. In a small pool, picking UConn to win
it all and getting no other final four might be enough. The numbers above (especially
ESPN) suggest that getting 2 Final Four teams right is only necessary in a pool of
size 150+. Getting all four of the teams right was not very reasonable, but necessary
if you want to be #1 on the leaderboard of a site.
My system was just OK for small pools. It had a much lower chance of picking UConn
as the champ than some of the other models, but was about on par with ESPN public.
It crushed the public in other Final Four metrics though. Most people are not willing
to pick 4 longshots to make the Final Four, and are more likely to follow my previous
advice and pick 3 safe picks and 1 longshot. Most of the hideously long odds the public
had are owed to their total lack of faith in San Diego St and FAU. My system was higher
on these two teams than everyone else. I think the best system for elite level
brackets was likely KenPom. He has a solid mix of reasonable chances to get all four
correct, along with relatively high chances of UConn to win it all. We will see in a moment.
As for the champ UConn - My system was high on them (had them at #10 when the tournament
started) but not high enough (KenPom had them at #4, for example). The future on UConn was
at +1800 when the tournament started (implied title percent 5.3%, which was the 9th
highest odds). The bettors remembered that UConn was ranked #1 and looked unstoppable
at the end of December, and they appeared to be returning to that form. I have some
analysis to do in the offseason to see if I can identify in an objective manner some
teams that have a chance to go on a run like that.
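For reference, the implied title percent comes straight from the moneyline. This is a minimal
sketch of the usual American odds conversion. The function name is my own, and the result
ignores the bookmaker's margin, so implied percentages across a whole futures board will add
up to more than 100%.

```python
def implied_probability(american_odds):
    """Convert American (moneyline) odds to an implied win probability."""
    if american_odds > 0:
        # Underdog style: +1800 means risk 100 to win 1800.
        return 100 / (american_odds + 100)
    # Favorite style: -150 means risk 150 to win 100.
    return -american_odds / (-american_odds + 100)

print(f"UConn at +1800: {implied_probability(1800):.1%}")
```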
Notably, this does not appear to be a great year for "Typical Strength". This setting
ignores knowing anything about the teams and picks based on typical strength of that
seed number alone. It definitely paid off to know that UConn, San Diego St, and FAU
were all very underseeded.
Bring on the percentiles, and let's find out which system wins at each
threshold! Same as last year, I've decided to change how I do this analysis to a methodology
that may be slightly easier for someone to look at. For each of my
system's settings I will generate 100,000 brackets. Then I will give you
the score of the bracket that sits at each of a number of percentiles. For
example, we will compare the 0.99 percentile level for each kind of bracket
and see what ESPN score a bracket in the top 1% would get. The implication
is that if each system got to generate 100 brackets and take its best one,
which system would win? I've added one more column as an approximation
of generating brackets using KenPom's tournament probability table as a
generator.
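Here is a rough sketch of the percentile lookup described above. It is not my generator's
actual code. The function name and the toy scores are hypothetical stand-ins, and the
"rank = ceil(q * N)" convention is just one reasonable way to define which bracket sits at a
given percentile.

```python
import math

def score_at_quantile(scores, q):
    """Return the score sitting at quantile q (0 < q <= 1) of the sorted scores."""
    ordered = sorted(scores)
    # Small epsilon guards against floating point noise in q * len(ordered).
    rank = math.ceil(q * len(ordered) - 1e-9)
    return ordered[max(rank, 1) - 1]

# Hypothetical toy scores standing in for the 100,000 generated brackets.
toy_scores = list(range(10, 1010, 10))  # 100 brackets scoring 10, 20, ..., 1000

for q in (0.25, 0.5, 0.75, 1.0):
    print(q, score_at_quantile(toy_scores, q))
```

With real output you would pass in all 100,000 ESPN scores for one setting and read off
0.99, 0.999, and so on, one column of the chart per setting.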
Here is the chart. As a reminder, the system with the HIGHEST
score in a given row is the "winner" of that row, and the best at generating
brackets when it gets that corresponding number of entries. It is also
the best at generating brackets of that ESPN score level.
Percentile | ESPN | SD 0.5 | EV Max | 0.8 | 1 | 1.2 | 0.5 Seed Typical | 1.0 Seed Typical | *KenPom Approx. |
Best | 1600 | 1570 | 1560 | 1640 | 1580 | 1540 | 1510 | 1520 | 1570 |
0.99999 | 1440 | 1540 | 1510 | 1550 | 1570 | 1530 | 1420 | 1490 | 1540 |
0.9999 | 1320 | 1430 | 1440 | 1430 | 1430 | 1410 | 1240 | 1400 | 1460 |
0.999 | 1170 | 1250 | 1260 | 1280 | 1280 | 1260 | 1090 | 1210 | 1330 |
0.99 | 1050 | 1010 | 1010 | 1030 | 1030 | 1030 | 780 | 1000 | 1100 |
0.98 | 960 | 850 | 870 | 930 | 950 | 940 | 710 | 850 | 1040 |
0.95 | 680 | 740 | 730 | 740 | 740 | 730 | 620 | 690 | 900 |
0.9 | 580 | 650 | 640 | 650 | 640 | 630 | 570 | 590 | 680 |
0.75 | 500 | 530 | 520 | 520 | 500 | 490 | 510 | 490 | 520 |
0.5 | 450 | 460 | 450 | 430 | 410 | 400 | 450 | 420 | 430 |
0.25 | 390 | 410 | 390 | 370 | 350 | 330 | 410 | 360 | 370 |
0.1 | 330 | 370 | 350 | 320 | 300 | 280 | 370 | 310 | 320 |
0.01 | 190 | 300 | 280 | 250 | 230 | 210 | 320 | 250 | 250 |
For a bit of context in the analysis, it is worth mentioning the score each system
would have gotten by just generating a single bracket picking all favorites:
My System: 480
Higher Seed: 470
KenPom: 530
ESPN public: 470
KenPom gets a head start of about 50 points from some good early picks.
His Round of 64 is very solid, and he also picked (6) Creighton
over (3) Baylor and picked (4) UConn to beat (1) Kansas. My system made some ugly
early picks (like picking every single 7-10 matchup incorrectly). Out of 63 rankings
on Massey's Composite of computer rankings, mine ended up 53rd in tournament picks (and
KenPom ended up 4th).
This translates into KenPom dominating the entire chart at nearly every level. If all
systems generated 100 brackets, KenPom's best would likely be about 70 points higher than
my best. He also dominates at mid-elite brackets because of the much higher chance
of UConn winning it all, as expected. However, it seems that my program probably still
holds a slight edge at generating brackets at the very highest level (99.999 percentile
and above) just due to how many more brackets mine will have with all Final Four correct.
The ESPN public was kind of pathetic this year. They tend to shine when there
are fewer upsets. Their numbers are lower than mine at nearly every level. Their
50th percentile bracket makers appear as usual to be pretty safe and are pretty
comparable to my SD=0.5 and Typical SD=0.5 settings (which are quite conservative).
ESPN even seems to have a slight edge at about the 99th percentile. Past that to
the 99.9 percentile, suddenly my system takes a massive 100 point lead. This gap
happens right about at the percentile where my system begins to have a chance of
getting 3 of the Final Four correct, whereas the public had basically no shot.
My SD=0.8 simulations above even managed to produce a 1640 bracket that would have
topped the ESPN leaderboard. That is pretty impressive considering ESPN public
got 20 million shots at it, and my generator did better with only 100,000 attempts.
That 1640 bracket didn't even have all of the Final Four correct, missing on FAU.
Among my own systems, it was a strange year. As noted before, Typical Strength
was very far behind this year. Usually the best results come with the most conservative
bracket setting. However, this year all of my settings performed about the same.
Even at the 90th percentile where the SD=0.5 is usually dominant, this year it
was very flat across all of the settings. Owing to higher SD levels having the
better shot at elite brackets, they were probably best even in smaller pools. Note
that even though my chances at a perfect bracket were maximized at SD=1.44 (the craziest
such setting in tournament history), it is still not advisable to actually use this setting
no matter how large your pool is (unless your pool has 1 million trillion people).
I generated around 50 brackets among a variety of bracket challenges online. I
used a variety of the above settings to cover my bases well. I did end up with
a beautiful gem at Yahoo which scored 1300 points. It used the SD=1 setting
and nearly made it into Yahoo's top leaderboard (which shows the top 50 on the site).
It managed to hit the 1 in 800 chance of getting 3 of the Final Four correct, and
also proceeded to pick UConn to win it all.
In my family's ESPN group with 55 entries, I submitted 6 system generated brackets
and the best of the bunch was an 810 that used the EV Max setting. It was in
the 97.3 percentile and had UConn in the title game, but not winning the title
and no other Final Four correct. Two brackets in the group had UConn winning
it all, and they achieved scores of 1060 and 1050, both at about the 99th percentile.
If I got to make 55 brackets using the SD=1 level, I'd have a 32% chance of one of
them having at least 1060 points, which is not fantastic. KenPom making 55 brackets
would have beaten the 1060 total with a 59% chance, which is much better.
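The "55 brackets" numbers follow from a simple best-of-N calculation. The sketch below
assumes each generated bracket is an independent draw, and the function names are my own.
Running it backwards shows what per-bracket chance of reaching 1060 is implied by the 32%
and 59% figures.

```python
def chance_best_of_n_reaches(p_single, n):
    """Chance that at least one of n independent brackets reaches the target score."""
    return 1 - (1 - p_single) ** n

def single_chance_implied(p_pool, n):
    """Per-bracket chance implied by a best-of-n chance (inverse of the above)."""
    return 1 - (1 - p_pool) ** (1 / n)

print(f"SD=1 per-bracket chance of 1060+:   {single_chance_implied(0.32, 55):.2%}")
print(f"KenPom per-bracket chance of 1060+: {single_chance_implied(0.59, 55):.2%}")
print(f"100 brackets, one in the top 1%:    {chance_best_of_n_reaches(0.01, 100):.1%}")
```

That works out to roughly 0.7% per bracket for SD=1 and 1.6% for KenPom, which fits where
1060 falls between the 0.98 and 0.999 rows of the chart. It also shows that 100 brackets
only give about a 63% chance of landing at least one in the top 1%.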
See ya for more bracket fun next year!
ENTRY 65 - 3-21-23 - Some case studies on handicapping injuries
A bit of full disclosure here - When you generate 2023 brackets on my bracket generator,
it does not use the actual team strength for all of the teams. I made an executive decision
to manually edit some of the team's strengths for bracket generation. In particular, 2 teams
were adjusted - UCLA and Tennessee. Both had key injures just before the tournament. UCLA
lost one of the best defensive players in the country. Tennessee lost its point guard, a
player deemed crucial to the team on both ends of the floor.
What is an injury worth? I've heard it said that even an injury to one of the best players
in the country will not move the Vegas line more than a few points at most. If the player is
not All-American caliber then the line may not move at all. The effect of an injury will
always be overblown by the media. UCLA went from a title contender to a team the media suddenly
claimed was doubtful to make the Final Four, with no chance of winning the title. That
sounds big!
There are a few past tournaments for which I have already made adjustments based on injuries.
Here is a list of the past adjustments I have made:
2019 - Duke lost All-American Zion Williamson for a large portion of the season before
the tournament. They noticeably played worse without him. I manually adjusted their strength
up 2 points for bracket generation (this put them at about the strength they were just before
the injury). Duke ended up losing in the Elite 8 to Michigan State. Adjusting Duke up 2
points probably ultimately hurt the generator.
2021 - The pandemic year. The biggest manual adjustment was for Colgate. I adjusted them -5
because they were clearly overrated from obtaining stats only in Patriot league play. This
brought them pretty near to the correct betting line against their first round opponent,
Arkansas. This proved valuable as they did get blown out by Arkansas. I decided to make a
few other manual adjustments while I was at it due to injuries.
2021 - Villanova lost key player Gillespie 2 games before the tournament. They got knocked
out of their conference tournament early and did not seem to be themselves. Everyone agreed
they were destined for disappointment in the tournament. I manually adjusted -2 for Villanova.
Then they ended up making the Sweet 16 anyway. It did help that they caught the 13 seed in
their second game. They covered the spreads in both of the first two rounds.
2021 - Michigan lost Isaiah Livers 2 games before the tournament. I adjusted them -1 since
he was not as key as Gillespie. They proceeded to nearly make the final 4 anyway, blowing
out popular upset pick (4) Florida St on the way and falling at the hands of the hot UCLA team.
2021 - UConn had an injury to star player Bouknight for much of the season but got him back
for the tournament. I adjusted UConn +1 but they lost handily in the first round to a
mediocre Maryland team anyway.
2021 - Baylor had a lengthy COVID pause and didn't seem to be themselves for a while, losing
some uncharacteristic games. They seemed back in form near the tournament so I bumped them +1.
They did end up winning the national title.
My conclusion from all of this - it seems to be just completely random. Sometimes the bump or
drop seems justified. Other times the next man up turns out to be a blessing in disguise
and no drop is warranted. I decided not to make any adjustments for the 2022 season, telling
myself to just trust the numbers. Even if my numbers turn out to be a little wrong, what
difference does it really make when the randomness outweighs the modification?
So of course I turned right around and made manual adjustments again this year (2023). Here is what
I did:
- (2) UCLA -2
- (4) Tennessee -2
I thought maybe this time I'd do a little more analysis before jumping in on my adjustments.
I checked out Vegas spreads for all of the games both teams played after the injuries, compared
to my own system and KenPom (since his system is often the starting point for Vegas spreads).
The data was undeniable - Vegas was discounting both teams consistently. The market was
treating UCLA about 2.5 to 4 points worse than their base rating. This continued into the NCAA
tournament. You never know with 2 vs 15 matchups really, but against Northwestern my system
(without modification) had UCLA favored by 10, and KenPom favored UCLA by nearly 12.
The Vegas number was only 7.5. UCLA won by 5 so the skepticism seems to have been justified. With
my -2 adjustment, my program's number was nearly dead on the Vegas line.
Here is the crazy part - UCLA lost another player (Singleton) during the Northwestern game.
No matter! The media has done a complete turnaround. They are now FAVORING UCLA against Gonzaga!
My system has Gonzaga favored there, and KenPom has UCLA by 3.5 so the -2 spread represents only
a 1.5 point discount now there, even with an extra player out. The odds to win the title have
UCLA as the 3rd most likely team, right where KenPom ranks them. A week ago, the media darling
was Gonzaga to come out of that region. Nothing has changed about Gonzaga's situation, and UCLA
has lost another player, and yet the media seems to be all in on UCLA now.
It is a similar story for Tennessee. The market gave Tennessee about a 2 to 4 point drop in
the games following the injury. My system had the spread against Louisiana at Tenn -11, and
the actual spread was only -6. KenPom is even higher on Tennessee than I am. They won a close
one by 3. (5) Duke was favored against (4) Tennessee by a whopping 3.5 in the second round.
Part of this is just that everyone became way too high on Duke. Duke lost by 13 in a really
rough game. Now we arrive in the Sweet 16 against Florida Atlantic. Tennessee suddenly no
longer has any discount in the market at all! If anything, they are actually a little higher
than the KenPom numbers would suggest! The injury discount seems to have completely vanished.
Suddenly everyone has Tennessee as the default Final Four team out of the region and is heaping
endless praise on their physicality advantage. Their offensive woes have been forgiven.
Another overreaction by the market in the opposite direction? Tough to tell.
In both cases, you could say I've been swindled again. With the information we have right now,
it seems I would have been better off not adjusting those teams at all. But Vegas seemed so
sure a few weeks ago that those adjustments were right! Is this another case of the bettors
getting carried away? Or do we just have more information now?
The reason I made the changes is I didn't want to look stupid making bracket after bracket
with Tennessee beating Duke (and Purdue about half of the time) when they were clearly not
a good team. I guess I look stupid now, not trusting in the system and seeing the value
in Tennessee's defense. The market still thinks Duke was the favorite (and probably would
still favor them if they played again tomorrow) but maybe the market is just wrong sometimes.
Trust the numbers! And trust a team to find a way. Maybe now that both teams have "proved it"
a bit, people are more willing to give them their original value back. Maybe the discount
should only last a little while before they find a way to fill that gap. We saw that with
Xavier this season. Everyone was worried about Freemantle getting injured, and maybe they
were worse for a while, but the next man up has proved himself and Xavier may be stronger
than ever now.
So what is the final takeaway? I think there is some value to discounting a team directly
after an injury. The betting market seems to think so and generally the betting market
outperforms any computer system over the long run. I would never discount a team more than
about 3 points no matter how important the player is. But if a team has a few games to figure
things out, I think the discount diminishes and you can value the team fully again.
Especially for the purposes of determining whether a team can go deep in the tournament, you
are probably better off just using the base ratings and not applying any discount at all.
ENTRY 64 - 3-13-23 - NCAA Tournament Special 2023
We've got another bracket in front of us. Time to dig into the data and see what it tells us about what kind of year we can expect!
Let's start with my system's chances that each team makes the Final Four (by percent):
(Note that for First Four games, I've assumed my higher rated team will be the winner)
(1)Houston: 37.3
(1)Alabama: 28.6
(1)Purdue: 22.3
(1)Kansas: 21
(2)Texas: 20.9
(3)Gonzaga: 20.7
(2)UCLA: 20
(2)Marquette: 15.6
(2)Arizona: 15.6
(5)San Diego State: 12.1
(3)Baylor: 12.1
(4)Tennessee: 10.4
(3)Kansas State: 10
(4)Connecticut: 9.9
(5)Duke: 9.3
(3)Xavier: 8.5
(5)Saint Marys: 7.6
(10)Utah State: 7.1
(7)Texas AM: 6.8
(6)Kentucky: 6.8
(8)Memphis: 6.7
(6)Creighton: 5.8
(9)Florida Atlantic: 5.2
(4)Virginia: 5
(6)Iowa State: 4.7
(4)Indiana: 4.7
(10)Southern California: 4.6
(9)West Virginia: 4.3
(8)Arkansas: 4.2
(6)Texas Christian: 4
(9)Auburn: 3.8
(7)Michigan State: 3.8
(10)Boise State: 3.6
(5)Miami: 3.5
(8)Maryland: 2.8
(12)Virginia Commonwealth: 2.7
(12)Drake: 2.5
(11)Providence: 2.5
(7)Missouri: 2.3
(8)Iowa: 2.3
(11)NC State: 2.2
(9)Illinois: 2.1
(12)Oral Roberts: 1.9
(10)Penn State: 1.9
(11)Mississippi State: 1.8
(7)Northwestern: 1.8
(11)Nevada: 1.5
(12)College of Charleston: 1.3
(13)Kent State: 1.1
(13)Iona: 0.8
(13)Furman: 0.5
(13)UL Lafayette: 0.5
(15)Vermont: 0.4
(15)Princeton: 0.2
(14)UC Santa Barbara: 0.2
(14)Grand Canyon: 0.2
(15)Colgate: 0.2
(14)Montana State: 0.2
(14)Kennesaw St: 0.1
(16)Northern Kentucky: 0
(15)UNC Asheville: 0
(16)Howard: 0
(16)Texas AM CC: 0
And here are the title chances for each team:
(Note, only includes teams that won at least one of the 100,000 simulations)
(1)Houston: 14.9
(1)Alabama: 11.2
(3)Gonzaga: 6.7
(2)Texas: 6.6
(1)Kansas: 6.4
(2)UCLA: 5.9
(1)Purdue: 5.8
(2)Arizona: 4.3
(2)Marquette: 3.2
(5)San Diego State: 3
(3)Baylor: 2.9
(4)Connecticut: 2.4
(4)Tennessee: 2.1
(5)Duke: 1.9
(5)Saint Marys: 1.8
(3)Kansas State: 1.7
(10)Utah State: 1.5
(3)Xavier: 1.5
(7)Texas AM: 1.4
(8)Memphis: 1.3
(6)Creighton: 1.1
(6)Kentucky: 1
(9)Florida Atlantic: 0.9
(9)West Virginia: 0.8
(4)Virginia: 0.7
(6)Iowa State: 0.7
(8)Arkansas: 0.7
(6)Texas Christian: 0.7
(4)Indiana: 0.7
(10)Southern California: 0.6
(9)Auburn: 0.6
(10)Boise State: 0.5
(7)Michigan State: 0.5
(5)Miami: 0.4
(8)Maryland: 0.4
(12)Virginia Commonwealth: 0.4
(7)Missouri: 0.3
(8)Iowa: 0.3
(12)Drake: 0.3
(11)NC State: 0.3
(11)Providence: 0.2
(9)Illinois: 0.2
(10)Penn State: 0.2
(7)Northwestern: 0.2
(12)Oral Roberts: 0.2
(11)Mississippi State: 0.2
(11)Nevada: 0.1
(12)College of Charleston: 0.1
(13)Kent State: 0.1
(13)Iona: 0.1
(13)Furman: 0
(13)UL Lafayette: 0
(15)Vermont: 0
(15)Princeton: 0
(14)Grand Canyon: 0
(15)Colgate: 0
(14)UC Santa Barbara: 0
(14)Kennesaw St: 0
(14)Montana State: 0
Here is the number of teams reaching each threshold, by year, over the last several years:
At least 10% Chance at Final 4:
2015 - 9
2016 - 18
2017 - 15
2018 - 13
2019 - 12
2021 - 9
2022 - 14
2023 - 13
At least 1% Chance at Final 4:
2015 - 28
2016 - 43
2017 - 43
2018 - 42
2019 - 34
2021 - 46
2022 - 45
2023 - 49
At least 10% Chance at Title:
2015 - 4
2016 - 4
2017 - 2
2018 - 2
2019 - 4
2021 - 2
2022 - 1
2023 - 2
At least 1% Chance at Title:
2015 - 12
2016 - 18
2017 - 24
2018 - 19
2019 - 16
2021 - 22
2022 - 16
2023 - 22
I thought last year's 1 and 2 seeds were pretty weak. This year's 1 and 2 seeds are even slightly
weaker. They might be the weakest 1 and 2 lines in history. This is combined with the fact that
many of them are not playing great at the moment. Two of the 1 seeds (Houston and Kansas) got
blown out in their last outing. Nobody has had very much faith in #1 Purdue, as they've been
struggling for weeks. Houston, Kansas, and Texas have potential injury issues.
The result is the most open field in history, especially at the Final Four level. The 12 to
13 seeds are also shockingly strong this year. Many conferences sent their best or second best
team to the tournament. All four of the 12 seeds are stronger than the average 12 seed usually is.
VCU has about an 8 seed strength. The 13 seeds are on average about 1 point better than a usual
year. The 15 seed line might be the most brutal - they are about 3 points better than usual 15
seeds. The 2-15 matchup this year will play more like the 3-14 matchup in a normal year.
Breakdown of expected number of each seed in the final four (with comparisons to last year):
1 seeds: 1.09 (vs 1.2 last year)
2 seeds: 0.72 (vs 0.83 last year)
3 seeds: 0.51 (vs 0.56 last year)
4 seeds: 0.3 (vs 0.33 last year)
The rest: 1.38 (vs 1.08 last year)
There are a huge number of significant contributors to the "others" category. The numbers suggest
we see one 1 seed, one 2 seed, and two other long shots in the final four.
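The per-seed expected counts are just the Final Four chances added up by seed line. A quick
sketch using the rounded percentages from the list at the top of this entry (the variable
names are my own):

```python
# Final Four chances (percent) copied from the list above, grouped by seed.
final_four_pct = {
    1: [37.3, 28.6, 22.3, 21.0],  # Houston, Alabama, Purdue, Kansas
    2: [20.9, 20.0, 15.6, 15.6],  # Texas, UCLA, Marquette, Arizona
    3: [20.7, 12.1, 10.0, 8.5],   # Gonzaga, Baylor, Kansas State, Xavier
    4: [10.4, 9.9, 5.0, 4.7],     # Tennessee, Connecticut, Virginia, Indiana
}

# Expected number of teams from a seed line = sum of their individual chances.
expected_by_seed = {seed: sum(pcts) / 100 for seed, pcts in final_four_pct.items()}

# Exactly four teams make the Final Four, so the remainder belongs to seeds 5-16.
expected_rest = 4 - sum(expected_by_seed.values())

for seed, value in expected_by_seed.items():
    print(f"{seed} seeds: {value:.2f}")
print(f"The rest: {expected_rest:.2f}")
```

Because the inputs are rounded to one decimal, "the rest" can come out a hundredth or so
off from the 1.38 shown above.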
Here are the most likely upsets by round and seed line. First round:
16 over 1: 0.23 upsets, most likely Howard over Kansas, 9.1% chance
15 over 2: 0.55 upsets, most likely Vermont over Marquette, 19.2% chance (all of them except UNC Asheville over UCLA are viable)
14 over 3: 0.69 upsets, most likely Montana St over Kansas St, 19.7% chance
13 over 4: 1.23 upsets, most likely Kent St over Indiana, 38.1% chance
12 over 5: 1.5 upsets, most likely Drake over Miami, 46.2% chance
11 over 6: 1.67 upsets, all four are about equally likely at 42%
10 over 7: 2.07 upsets, the 10 seed is favored in every matchup except Penn St vs Texas AM. Utah St is over 60% vs. Missouri.
9 over 8: 1.99 upsets, most likely Auburn over Iowa 54.3% chance (all four matchups are basically coin flips)
It is pretty crazy that we have about even odds to get a 2-15 upset. All of the 12-16 expected upset numbers are higher
than the corresponding ones from last year.
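One note on reading these numbers: the expected number of upsets is not quite the same thing
as the chance of seeing at least one. The sketch below assumes independent games and, purely
for illustration, spreads the 0.55 expected 15 over 2 upsets evenly across the four games.
That even split is hypothetical, since the real games are not equally likely.

```python
from math import prod

def chance_of_at_least_one(upset_probs):
    """Chance that at least one of several independent upsets happens."""
    return 1 - prod(1 - p for p in upset_probs)

expected_upsets = 0.55  # 15 over 2 figure from the list above

# Hypothetical: the same total spread evenly over the four 2 vs 15 games.
even_split = [expected_upsets / 4] * 4

print(f"At least one 15 over 2 upset: {chance_of_at_least_one(even_split):.1%}")
```

For a fixed total, the even split gives the lowest possible chance of at least one upset,
so the real figure sits somewhere above this roughly 45%, which is consistent with calling
it about even odds.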
Second round:
8 or 9 over 1: 1.35 upsets, most likely Memphis over Purdue, 41.9% chance
7 or 10 over 2: 1.55 upsets, most likely Utah State over Arizona, 42.8% chance
6 over 3: 1.73 upsets, most likely Kentucky over Kansas State, 48.7% chance
5 over 4: 2.09 upsets, with San Diego St favored over Virginia (62%) and Duke favored over Tennessee (51%)
Basically, expect just complete carnage going into the sweet 16. Nobody is safe. The 7-10 seeds are very dangerous.
Later rounds:
4 or 5 over 1: Duke over Purdue (43.9%) and UConn over Kansas (42.9%) are both worth considering.
3 over 2: (3) Gonzaga is favored over (2) UCLA (54%). Also consider (3) Baylor over (2) Arizona (46%).
3 over 1: (3) Gonzaga is projected to win its region over (1) Kansas. (2) UCLA and (4) UConn are also
in contention in that region. (2) UCLA would be favored to win that region if not for the injury issues.
If you are picking exactly 1 bracket, first decide on two teams in the 1-2 seed range to move to
the final four. (1) Purdue is a weak 1 seed. (1) Kansas has a very difficult region so you might want to count
on them losing somewhere. (2) Arizona and (2) UCLA both are vulnerable along their path. Basically, pick
either Houston or Texas from the Midwest region, and then one more top two seed.
There are a lot of good options for higher seeds to advance. (3) Baylor and (3) Gonzaga are both strong.
Uconn is a strong 4 seed. Tennessee gets a lot of love from computers but has a major injury to contend
with. Stay away from (4) Indiana and (4) Virginia, they are both traps. Some other good options higher up
are (5) Duke, (5) St Marys, and (5) San Diego St. There are many people who believe (6) Creighton and
(6) TCU will still live up to their high preseason expectations.
For the rest of the teams I didn't mention... well, the 12 and 13 seeds are really strong. You really
couldn't go wrong picking about half of those upsets. Feel free to let several 1 and 2 seeds lose
at every stage of the bracket besides the first round. Heck, if you're feeling frisky then do some
research and pick exactly one of the 2-15 upsets. Maybe pick one where that 2 seed would have been
vulnerable to the 7-10 in the next round anyway to minimize the damage.
The above advice is best if you are trying to really get a high level bracket. If you are trying to
win a smaller pool of 100 or fewer brackets, the best advice remains to pick conservatively. You will
have to be aware that many of your picks are wrong, but trying to nab some good upsets and picking the
wrong ones can turn out even worse than just going safe. I would stick with teams seeded no worse
than 3 seed for the final four in a small pool.
Good luck and have fun bracketing!
ENTRY 63 - 3-12-23 - Final Seed Guesses 2023
First, a few quick words about the early bracket reveal that was done about a month ago.
The top five seed lines were revealed to be:
1: Alabama, Houston, Purdue, Kansas
2: Texas, Arizona, Baylor, UCLA
3: Tennessee, Virginia, Iowa St, Kansas St
4: Indiana, Marquette, Gonzaga, Xavier
5: Creighton, Uconn, Miami, St Marys
The biggest surprises: Iowa St, Kansas St, and Indiana were all rated higher than expected.
Gonzaga and UConn were rated lower than expected. Usually in the past, teams were given
unexpectedly high seeds for their nonconference work, which would have seemed to give UConn
a good seed even after their January slump. Creighton's injury issues early in the year seem
to have been taken fully into account to give them such a high seed.
The thing most bracketologists observed from this ranking is that Q1 wins mean more than
ever before, and in particular road Q1 wins. Some of the high seeds had good Q1 road wins,
while others rated a bit lower did not.
Alabama was given #1 overall based on the good Q1/Q2 record and win at Houston. I don't see
that this has changed at all so I anticipate them still being #1 overall. Kansas is the only
challenger but they had some concerning blowout results down the stretch, and honestly they
don't have any super-tier Q1 road wins. Their best road win is at TCU, which is OK but nothing
like winning at Houston.
Here are my computer's projected seeds on Selection Sunday. Remember that Sunday results
(besides the auto-bids) are generally ignored by the committee, so I ran my system without
Sunday's results. I did have to replace Yale with Princeton. Yale was set to be a 13 seed,
and my system would have probably called Princeton a 15 seed, but committee precedent seems
to be to have seeds basically set by Saturday night, so I believe Princeton will get the 13
that Yale was going to get.
1: Alabama, Kansas, Houston, UCLA,
2: Purdue, Texas, Arizona, Gonzaga,
3: Marquette, Baylor, San Diego State, Connecticut,
4: Tennessee, Xavier, Duke, Kansas State,
5: Indiana, Virginia, Missouri, Texas AM,
6: Saint Marys, Memphis, Miami, Iowa State,
7: Texas Christian, Creighton, Kentucky, Michigan State,
8: Florida Atlantic, Northwestern, Boise State, Utah State,
9: Penn State, West Virginia, Maryland, Southern California,
10: Auburn, Illinois, Arkansas, Arizona State,
11: Nevada, Iowa, Rutgers, NC State, Providence, Mississippi State,
12: Virginia Commonwealth, Oral Roberts, College of Charleston, Drake,
13: Kent State, Princeton, Grand Canyon, UL Lafayette,
14: Iona, UC Santa Barbara, Furman, Montana State,
15: Vermont, Colgate, UNC Asheville, Kennesaw St,
16: Northern Kentucky, Howard, Texas AM CC, SE Missouri State, Fairleigh Dickinson, Texas Southern,
Barely missed: Vanderbilt, Oregon, Clemson, Oklahoma State
Also, here are my personal guesses at the seeds:
1: Alabama, Kansas, Houston, Purdue,
2: UCLA, Texas, Arizona, Marquette,
3: Baylor, Gonzaga, Xavier, Connecticut,
4: Kansas State, Tennessee, Duke, Indiana,
5: San Diego State, Virginia, Texas AM, Iowa State,
6: Saint Marys, Miami, Texas Christian, Michigan State,
7: Missouri, Creighton, Kentucky, Northwestern,
8: Florida Atlantic, Maryland, West Virginia, Illinois,
9: Boise State, Memphis, Penn State, Iowa,
10: Auburn, Arkansas, Southern California, Providence,
11: Pittsburgh, Utah State, Mississippi State, Rutgers, Nevada, Arizona State,
12: Oral Roberts, College of Charleston, Virginia Commonwealth, Drake,
13: Kent State, Princeton, UL Lafayette, Iona,
14: Furman, Grand Canyon, UC Santa Barbara, Montana State,
15: Vermont, Colgate, Kennesaw St, UNC Asheville,
16: Northern Kentucky, Howard, Texas AM CC, SE Missouri State, Fairleigh Dickinson, Texas Southern,
Barely missed: NC State, Oklahoma State, Vanderbilt, Wisconsin
Some things I changed up:
- Gave Purdue the last 1 seed. I think UCLA's injury issues, together with the loss in the Pac 12 final,
sealed it.
- I gave Marquette the last 2 seed. I think Gonzaga is going to get shafted.
- I think San Diego St deserves a higher seed but the committee did not like them in the reveal
so a 5 seed seems right. In general the MWC has seemed unfairly undervalued by everyone. I split
the difference a bit and put those teams in between where my computer and the public have them.
A lot of home teams held serve in the conference so there were not many good road wins.
- I had a lot of trouble with the 8 to 10 seed lines. I tried to slightly favor teams with
the better road wins.
- I think NC State will be the team that barely misses the cut. I have them out,
and Pitt in. I don't know why everyone likes Pitt so much more but it seems to be somewhat
of a consensus. I guess they do have better road wins (slightly). With NC State I am just
really worried about the three blowout losses to Clemson, two of them in the last week.
- I think it is super possible that the committee decides to put a few auto bids on the 11
line. I think all of the teams on the 12 line besides Drake have a chance at the 11 line. I could
totally see Oral Roberts and Charleston both getting to 11 and all of the play-in matches
happening at 12. I'm going to place my bet on it not happening, but just saying.
ENTRY 62 - 9-16-22 - How should we rate a team moving up in division?
It is about time we finally answer this question and challenge an assumption I made a
long time ago.
In both football and basketball, it seems every few years we have a
handful of teams that move up from lower divisions and want to try their hand at the
highest level. Since my system does not collect data on lower divisions, I have no
information about these teams. What is a good default rating to apply to such teams
at the start of the year until I can gather some real information about them?
Up until this point, I have always started with the assumption that teams moving up
are roughly on par with the average team from the division they came from. When a
game happens with a lower division team, I label them as "FCS Team" in football, and
"Unranked" in basketball. Although I still give predictions for these games, it is
with the understanding that these predictions are going to be inaccurate because
there is a wide range of team strengths at lower divisions, particularly in basketball.
In both sports, I would estimate that the average prediction error in such games will
be nearly double the prediction error in normal games.
The point is, I have collected this data for a long time and although you don't see it
in my system rankings, "FCS Team" and "Unranked" are treated as single teams that just
play a ton of games each year, and they have their own statistics including rating.
In football, this rating tends to hover around -24. This means in most years the
bottom 3 or 4 FBS teams are worse than the average FCS team.
In basketball, the lower division opponents come from Division 2 or below (instead of
D1-FCS like in football), so as noted there is a wider range of outcomes and a bigger
talent disparity. The result is an average "Unranked" rating of around -33 recently. This is
far worse than nearly all of the worst D1 ratings (although 2013 Grambling takes the cake with
a rating of -35).
With basketball season coming soon, this got me questioning my assumption. Surely
we can't expect a team moving up in division to typically have a rating of -33.
It should be much better than that. How much better? Let's try to answer that
question and update my model (for both sports) to better predict these risers.
Let's start with basketball. I looked at all 32 teams that have moved up to D1
since the year 2000 (North and South Dakota St are the earliest). In the year they first
played as an official member of D1, I recorded their rating (along with a bunch of
other team statistics I collect behind the scenes).
The simple answer is that this average rating was -12.84 (compared to -33). So my model would indeed be much
better off assuming a lot more from these teams. In most years this rating would
place the team just outside the bottom 50 of D1. This makes sense, as you would
assume the teams moving up in division are the ones that are extra successful at
their previous level and ready for a new challenge.
There was a lot of variance. The
worst riser was NJIT in 2007, with a rating of -22.23. That team was 3rd worst that
season. The best riser was Bellarmine in 2021, with a rating of -1.08. That team
went 10-3 in the ASUN in their first year, and was in the top half of all D1 teams
with a rank of 169.
The interesting thing, though, was the shape of the data. Here is a chart showing
the first-season rating of each riser, by debut year:
There is a definite positive tendency to the graph over time. This is confirmed
with a linear best fit line and a corresponding T-test with p-value 0.0032, which
is convincing statistical evidence that the upward trend is not due to chance
alone.
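For anyone who wants to run this kind of trend check themselves, here is a minimal sketch of a least-squares line and the t statistic for its slope. The (year, rating) pairs are made up for illustration and are NOT the actual debut data, and the function name is just a placeholder. Turning the t statistic into a p-value still takes a t-distribution table or a stats package, using n - 2 degrees of freedom.

```python
import math

def slope_t_statistic(xs, ys):
    """Least-squares line through (xs, ys) plus the t statistic of its slope."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Standard error of the slope, using n - 2 degrees of freedom
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, intercept, slope / std_err, n - 2

# HYPOTHETICAL debut years and first-season ratings, for illustration only
years = [2005, 2008, 2010, 2013, 2016, 2019, 2021]
ratings = [-18.0, -15.5, -16.0, -12.0, -10.5, -9.0, -6.5]

slope, intercept, t_stat, dof = slope_t_statistic(years, ratings)
print(f"slope per year = {slope:.3f}, t = {t_stat:.2f}, dof = {dof}")
```

A large positive t statistic relative to its degrees of freedom is what produces a small p-value like the one quoted above.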
I can think of a few reasons for this tendency. One is maybe the lower
division teams are just more competitive than they used to be, and there is a
greater pool of talent that is delivering decent recruits further down the
totem pole. Another is perhaps with advanced statistics and metrics more
readily available to all people, it is easier for the better teams from D2 to
self-select for being ready to move up and determine that they will be able
to compete at the next level.
I decided to at least partially test the first hypothesis. Are average unranked teams
generally more competitive now than they were 20 years ago? I can look at the
rating of the "Unranked" team in my system at the end of each year and see what
trend, if any, is present in the data.
The result was a bit surprising given my hypothesis.
There is a very definite trend downwards, and the correlation is very high. The
average Unranked team is getting measurably worse, and the rating has dropped by
about 6 points over 20 years. More research needs to be done on this before
drawing any conclusions. It is possible teams are just scheduling worse Unranked
teams to make sure they win those games. There is a larger standard deviation
of individual Unranked team performances recently, but it's not enough to explain
the full 6 point difference. The average rating of the team that schedules an
Unranked opponent has been stable at about -4.5. The number of games against
Unranked opponents per year has almost exactly doubled since 2001 (247 to 490). Who knows
what is going on. Although we cannot say there is any evidence that the
average D2 team is getting any better, maybe the upper echelon of such teams
is separating and indeed improving.
SUMMARY: I decided that starting with this year (and with 5 new teams joining D1
in basketball), instead of using average Unranked rating as the default, I will use
a value of -9.75. This is the average from the last 10 years only and seems like a
decent compromise between the -12.84 figure using all the data, and a -6.78 figure
which would be extrapolating the regression line. I will continue to monitor this
in future years and am interested to see how well the 5 new teams do.
What about football? The situation there is a little easier. There have been
16 new D1-FBS teams over the last 20 years (the earliest being UConn in 2000).
Those teams have averaged a rating of -16.95 in their first year, compared
with the average FCS opponent rating of -24. New teams are still definitely
better than the average FCS team, but not by as much. The worst ever riser
was UConn in 2000, with a rating of -31. It is actually conceivable this time
that a new team may be worse than the average FCS team.
Here is a graph of the average debut season rating over time:
The correlation is very low, and the T-test gives a p-value of 0.33, which
is not low enough to provide sufficient evidence that the average debut rating
is changing over time. I therefore trust the -16.95 value for now and starting
the next time we have new D1-FBS teams, I will use this value instead of the -24
default.
ENTRY 61 - 5-5-22 - System Bracket Generation Performance 2022
This post will be analyzing my March Madness bracket generator for the 2022 tournament,
now that it has ended. We will answer the eternal question: If you had to put your
faith in 100 brackets, would you be better off picking 100 random human ESPN brackets,
or 100 from my system? Which produces the better "best" bracket on average? (And similarly for any
size of pool you could wish.)
First, a recap of the 2022 tournament and some things that happened that are relevant
to our discussion. Gonzaga was once again the favorite to win it all and got the #1
overall seed. KenPom had them as something like 27% favorites. Typical betting lines
had Gonzaga at +350 (implied odds of 22.2%). Besides Gonzaga, nobody could be trusted.
It was a season of parity, with every team having dud performances against unranked teams,
especially leading into the tournament. It was viewed as wide open. (1) Arizona and (2)
Kentucky were viewed as the hot teams after late season pushes. A few early season risers
limped to the finish line (for example (2) Auburn, (4) UCLA, and (3) Purdue).
Then in the actual tournament, pretty much everything unexpected happened. Hot Kentucky
lost to (15) St Peters, who went on to make the Elite 8. (4) Providence, the computer-hated
lucky squad, made the Sweet 16 anyway. (4) Arkansas sent Gonzaga packing before the Elite
8. Every ACC team went on a mad tear (after they were maligned all season for being weak),
much like the Pac 12 last year. (1) Kansas vs (8) UNC in the title game resulted in a title
for Kansas.
First, how did my advice in Blog Entry 60 hold up? Did I say some good things? Let's see:
- "I'd try to keep my 1 and 2 seeds mostly alive until the Sweet 16,
and then let some upsets happen from there." (5 of the 8 top 2 seeds made the Sweet 16.
This was a little more carnage than I expected. And Auburn and Kentucky were two of the
less likely ones to fall out)
- "(1) Baylor is clearly the weakest 1 seed so a Round 2 upset is worth taking a shot at" (Yup.)
- "The region with Arizona looks brutal so let some chaos happen there" (Houston did beat them,
but I thought Villanova would lose somewhere)
- "Stay away from 3v14 upsets" (Good call, none of these happened)
- "The 4v13 upsets actually might be more reasonable than the 5v12" (Fail - the 4s all won, and
the two least likely 5v12 ended up winning)
- "The 6 through 8 seed Round 1 games are coin flips" (7 of the 12 possible upsets on these seed lines happened - I'll say this
is close enough to coin flip)
- "For Final 4, take Gonzaga, then two more 1 or 2 seeds, then a long shot" (I had the correct overall
distribution, but just the wrong exact teams)
- "Some nice long shots: UCLA, Houston, Tennessee, Iowa, Arkansas, Texas Tech" (Houston and Arkansas
did well but the rest mostly flopped)
- Out of the most likely Round 1 upsets, I did label St Peters over Kentucky as most likely 15 over 2.
My 12 over 5 didn't happen. My 11 over 6 (Michigan over Colorado St) looks pretty good now. My 10 over 7
was the only one that DIDN'T happen. My 9 Memphis did beat 8 Boise St.
- Round 2 upsets - I did have UNC over Baylor as the most likely 8v1 upset. Houston over Illinois was
a good 5v4 pick. I also picked Houston correctly to give Arizona trouble next round. My other picks didn't pan out here.
None of my Round 3 upset picks worked.
Let's move on to chances of perfect bracket at each round (SD=1):
Round 1 - 1 in 22 million
Round 2 - 1.59 x 10^-12
Round 4 - 2.46 x 10^-16
Perfect Bracket - 3.68 x 10^-17
Probability of perfect bracket maximized at SD=1.2, P = 4.43 x 10^-17
Probability of perfect bracket flipping coins = 1 x 10^-19
The first two rounds were not nearly as crazy as 2021, but the crazy in the Elite 8 and Final 4
made up for that and made it one of the more unlikely tournaments we've seen overall. The
maximized probability at SD=1.2 also indicates it was more crazy than an average tournament.
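As a rough illustration of where numbers this small come from: the chance of a perfect bracket is the product of the chances of every pick, if the games are treated as independent. A 64-team bracket has 63 games (not counting the First Four), so flipping coins gives 0.5^63. The sketch below checks that baseline and rewrites the Round 1 figure in the same notation as the other rounds. The function name is a placeholder, and this does not try to reproduce the system's SD setting.

```python
import math

GAMES_IN_BRACKET = 63  # 32 + 16 + 8 + 4 + 2 + 1, First Four not counted

def perfect_bracket_probability(pick_probs):
    """Multiply the probability of each correct pick, assuming independent games."""
    return math.prod(pick_probs)

coin_flip = perfect_bracket_probability([0.5] * GAMES_IN_BRACKET)
print(f"coin flips: {coin_flip:.2e}")           # 1.08e-19

# "1 in 22 million" for Round 1, in the same notation as the other rounds
print(f"round 1: {1 / 22_000_000:.2e}")         # 4.55e-08
```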
Good brackets mostly come down to having the Final Four correct. Let's look at final four
incidence rates next. I have a handy chart here to compare which systems had the best chances
at correct Final Fours. I have several settings of my own system, as well as ESPN brackets
and some other prominent systems.
Final Four % | System SD=1.0 | System EV Max | Typical Strength | ESPN Public | KenPom |
Duke | 14.8 | 16.4 | 22.7 | 17.2 | 13.6 |
UNC | 3.5 | 0.7 | 1.9 | 2.7 | 2.0 |
Villanova | 17.5 | 19.6 | 22.7 | 17.4 | 15.7 |
Kansas | 30.0 | 39.3 | 38.8 | 42.7 | 27.9 |
All Four | 0.027 | 0.009 | 0.038 | 0.034 | 0.012 |
All but UNC | 0.75 | 1.25 | 1.96 | 1.24 | 0.58 |
Kansas Champ | 9.2 | 11.7 | 13.8 | 8.3 | 6.6 |
Let's start with the Final 4. The public has my program beat this year, whether you look
at the base probabilities or EV Max (my preferred conservative bracket setting, which is
similar to SD = 0.6 or so). I'm going to blame the storylines, which ended up coming through
for them. A lot of people were rooting for the Coach K send-off with a title. UNC has a lot
of homers too. It was a very bad year for KenPom. His probabilities are
lower across the board. Beating everyone, though, is my program set to "Typical Strength".
This setting just uses typical values for strength for seed lines and ignores program
strength. Some years typical strength does well for whatever reason. It often does well
when somewhat neglected teams end up playing to their seed line anyway.
I did do a little better job at picking Kansas to win it all though, which is my one hope
for generating better brackets. Typical Strength once again had the highest number,
ignoring perceived strength of Arizona and Gonzaga and treating all 1s equal.
The implied title odds for Kansas by sports books at +1100 were 8.3%. I personally trusted
Kansas a lot too since the start of the year and picked them in several single bracket
money pools I entered.
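For reference, here is a small sketch of how a betting line turns into implied odds. It uses the standard American moneyline conversion and does not remove the sportsbook's margin, so the true market estimate would be a bit lower. The function name is just a placeholder.

```python
def implied_probability(american_odds):
    """Convert American moneyline odds to an implied win probability (vig not removed)."""
    if american_odds > 0:
        # Underdog line: risk 100 to win `american_odds`
        return 100 / (american_odds + 100)
    # Favorite line: risk `-american_odds` to win 100
    return -american_odds / (-american_odds + 100)

print(f"{implied_probability(1100):.1%}")  # Kansas at +1100 -> 8.3%
print(f"{implied_probability(350):.1%}")   # Gonzaga at +350 -> 22.2%
```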
Often having all four correct just really comes down to getting the least likely one. My
system at regular levels had the best chances there out of everyone, surprisingly.
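The "All Four" and "All but UNC" rows in the chart are consistent with simply multiplying the individual Final Four chances, treating the four regions as independent. Here is a quick sketch using the System SD=1.0 column. The variable and function names are placeholders.

```python
import math

# Final Four chances (percent) from the System SD=1.0 column of the chart
final_four_pct = {"Duke": 14.8, "UNC": 3.5, "Villanova": 17.5, "Kansas": 30.0}

def all_correct_pct(pcts):
    """Chance (percent) that every listed team makes it, assuming independent regions."""
    return 100 * math.prod(p / 100 for p in pcts)

all_four = all_correct_pct(final_four_pct.values())

# All but UNC: the other three make it AND UNC does not
others = [p for team, p in final_four_pct.items() if team != "UNC"]
all_but_unc = all_correct_pct(others) * (1 - final_four_pct["UNC"] / 100)

print(round(all_four, 3), round(all_but_unc, 2))  # 0.027 0.75
```

Because the product is dominated by its smallest factor, a system that gives UNC 3.5% instead of 0.7% gets roughly five times the chance at all four, other things equal.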
Bring on the percentiles, and let's find out which system wins at each
threshold! Same as last year, I've decided to change how I do this analysis to a methodology
that may be slightly easier for someone to look at. For each of my
system's settings I will generate 100,000 brackets. Then I will give you
the score of the bracket that sits at each of a number of percentiles. For
example, we will compare the 0.99 percentile level for each kind of bracket
and see what ESPN score a bracket in the top 1% would get. The implication
is that if each system got to generate 100 brackets and take its best one,
which system would win? It is really the inverse
of the previous charts I made. I've added one more column as an approximation
of generating brackets using KenPom's tournament probability table as a
generator.
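To make the methodology concrete, here is a minimal sketch of reading a score off at a given percentile and of the pool size that percentile roughly corresponds to. The scores are placeholder values rather than real generated brackets, and the function names are made up for the sketch.

```python
def score_at_percentile(scores, percentile):
    """Score of the bracket sitting at the given percentile (0 to 1) of the scores."""
    ordered = sorted(scores)
    index = min(round(percentile * len(ordered)), len(ordered) - 1)
    return ordered[index]

def equivalent_pool_size(percentile):
    """Number of entries for which a bracket at this percentile is roughly the best one."""
    return round(1 / (1 - percentile))

# HYPOTHETICAL scores standing in for a large batch of generated brackets
scores = [10 * k for k in range(100)]

for pct in (0.5, 0.9, 0.99):
    print(pct, score_at_percentile(scores, pct), equivalent_pool_size(pct))
```

So the 0.99 row answers "what would the best of about 100 brackets score", the 0.999 row covers about 1000 brackets, and so on.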
Here is the chart. As a reminder, the system with the HIGHEST
score in a given row is the "winner" of that row, and the best at generating
brackets when it gets that corresponding number of entries. It is also
the best at generating brackets of that ESPN score level.
Percentile | ESPN | SD 0.5 | *EV Max | 0.8 | 1 | 1.2 | 0.5 Seed Typical | 1.0 Seed Typical | *KenPom Approx. |
Best | 1710 | 1550 | 1560 | 1560 | 1600 | 1540 | 1490 | 1610 | 1560 |
0.99999 | 1560 | 1520 | 1510 | 1550 | 1560 | 1540 | 1490 | 1540 | 1550 |
0.9999 | 1460 | 1400 | 1400 | 1470 | 1480 | 1460 | 1360 | 1440 | 1440 |
0.999 | 1310 | 1290 | 1310 | 1300 | 1310 | 1310 | 1310 | 1320 | 1280 |
0.99 | 1180 | 1180 | 1200 | 1170 | 1160 | 1140 | 1240 | 1200 | 1130 |
0.98 | 1130 | 1140 | 1150 | 1130 | 1110 | 1080 | 1200 | 1150 | 1070 |
0.95 | 1040 | 1070 | 1080 | 1040 | 1020 | 980 | 1150 | 1080 | 960 |
0.9 | 850 | 970 | 990 | 910 | 850 | 780 | 1090 | 980 | 730 |
0.75 | 650 | 660 | 670 | 620 | 590 | 560 | 810 | 660 | 560 |
0.5 | 530 | 530 | 530 | 490 | 470 | 450 | 620 | 510 | 450 |
0.25 | 430 | 450 | 450 | 410 | 390 | 370 | 520 | 420 | 380 |
0.1 | 360 | 390 | 390 | 350 | 330 | 310 | 450 | 360 | 320 |
0.01 | 200 | 320 | 320 | 280 | 250 | 240 | 360 | 280 | 250 |
The top 5% of pickers on ESPN seem to be very similar to my system
at the 1.0 setting (which is picking based on regular probabilities). ESPN
appears to have the slight edge by about 10-20 points. Although we saw that ESPN
had a higher chance of getting all or most of the Final 4 correct, we
must condition on Kansas winning it all, and my system did much better there.
We see a big gap in score between the 0.9 percentile and 0.75 percentile,
representing the gap between brackets that picked Kansas and those that didn't.
Some of those above 0.9 did not pick Kansas but were very good brackets otherwise.
Overall my SD=1 setting was my best this year for any group size over 1000.
Keep in mind for "Best" that each of my systems only got 100,000 tries, whereas
ESPN humans had 17.4 million attempts, so there is a sample size issue.
The rest of the ESPN crowd more resembles EV Max, which is a fairly
conservative setting. This is pretty typical, as the average picker doesn't
tend to pick that many upsets and has a lot of chalk. For anything at the
0.98 level or below, the best system of all was 0.5 Typical, ignoring
what we knew about this year's teams and just going with average seed
strength. In small groups (50 or fewer) this would have been a good choice.
KenPom did OK at generating elite brackets (about on par with SD=1 and ESPN)
but really fell off at lower levels due to the number of teams he was
high on that didn't come through.
Summary: My SD=1 appeared to be nearly as good as (but slightly worse than) ESPN
human pickers at high levels this year. But my system is better at picking than most humans,
especially if you allow it to generate conservative brackets. And
this is more evidence that it is a good idea to generate at least a
few brackets by just considering seed numbers only and picking
upsets from there.
My family group had 64 entries (12 were mine, but only 6 of those
were computer generated). The group high was 1160. If I had generated
all 12 of my brackets using EV Max, I would have had an 18% chance to
beat 1060, which is about my fair share of the group. If my program got
to generate 64 brackets on EV Max, on average its best bracket would be
1190. Out of my 25 brackets on ESPN, one of them did get 1180.
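The "fair share" comparison comes down to two small calculations: the share of entries (12 of 64), and the chance that at least one of several generated brackets clears a target score. Here is a sketch, assuming each generated bracket is an independent draw with the same chance of beating the target. The per-bracket chance used below is a made-up placeholder, not a number from the system.

```python
def chance_best_beats(p_single, n_brackets):
    """Chance that at least one of n independent brackets beats a target score."""
    return 1 - (1 - p_single) ** n_brackets

fair_share = 12 / 64  # 12 of the 64 entries in the group = 0.1875
print(f"fair share of the group: {fair_share:.4f}")

# HYPOTHETICAL per-bracket chance of beating the target score
p_single = 0.017
print(f"chance with 12 brackets: {chance_best_beats(p_single, 12):.3f}")
```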
I did win a money pool that allowed 1 entry per person and had 20 entries.
I did so only by virtue of being the only one in the pool that picked
Kansas to win it all. My score of 940 was extremely pitiful for being
one that had the champion picked. It is probably in the bottom 5% of
brackets that had Kansas picked. But ESPN's scoring system is stupid
so whatever.
See ya for more bracket fun next year!
ENTRY 60 - 3-13-22 - NCAA Tournament Special 2022
We've got another bracket in front of us. Time to dig into the data and see what it tells us about what kind of year we can expect!
Like with last year, I don't feel like debriefing the bracketology seed projections
and what I've learned about the bracket prediction at the moment. I'd rather just
break down the bracket we have and see what we can learn about it!
Let's start with my system's chances that each team makes the Final Four (by percent):
(Note that for First Four games, I've assumed my higher rated team will be the winner)
(1)Gonzaga: 41.2
(1)Kansas: 30
(2)Auburn: 26.8
(1)Arizona: 26.5
(2)Kentucky: 23.7
(1)Baylor: 22.9
(3)Tennessee: 20.6
(2)Villanova: 17.5
(5)Houston: 16.7
(3)Texas Tech: 15.9
(4)UCLA: 15.6
(2)Duke: 14.8
(5)Iowa: 13
(3)Purdue: 12.8
(4)Arkansas: 7.4
(3)Wisconsin: 6.5
(5)Saint Marys: 6.4
(6)Louisiana State: 6
(4)Illinois: 5.4
(5)Connecticut: 5
(8)San Diego State: 4.7
(4)Providence: 4.2
(6)Texas: 4.1
(8)North Carolina: 3.5
(10)San Francisco: 3.2
(6)Alabama: 3
(11)Virginia Tech: 3
(10)Loyola Ill: 2.9
(9)Memphis: 2.9
(8)Boise State: 2.6
(6)Colorado State: 2.4
(7)Southern California: 2.4
(7)Michigan State: 2.2
(9)Creighton: 2.1
(11)Michigan: 2
(10)Davidson: 2
(7)Murray State: 1.8
(7)Ohio State: 1.6
(8)Seton Hall: 1.6
(9)Marquette: 1.5
(10)Miami: 1.5
(9)Texas Christian: 1.4
(11)Iowa State: 1.3
(12)Indiana: 1.3
(11)Notre Dame: 1.2
(13)Vermont: 0.9
(12)Alabama Birmingham: 0.8
(13)South Dakota State: 0.8
(12)New Mexico State: 0.8
(12)Richmond: 0.7
(13)Chattanooga: 0.3
(15)Saint Peters: 0.1
(14)Montana State: 0.1
(14)Colgate: 0.1
(13)Akron: 0.1
(14)Longwood: 0
(15)Jacksonville State: 0
(14)Yale: 0
(15)Delaware: 0
(15)Cal State Fullerton: 0
(16)Georgia State: 0
(16)Norfolk State: 0
(16)Wright State: 0
(16)Texas Southern: 0
And here are the title chances for each team:
(Note, only includes teams that won at least one of the 50,000 simulations)
(1)Gonzaga: 17.9
(1)Kansas: 9.2
(1)Arizona: 8.8
(2)Auburn: 6.6
(2)Kentucky: 6.6
(3)Tennessee: 6.3
(1)Baylor: 6.1
(5)Houston: 5.2
(2)Villanova: 4.7
(3)Texas Tech: 4.1
(4)UCLA: 3.4
(2)Duke: 3.3
(3)Purdue: 2.6
(5)Iowa: 2.5
(4)Arkansas: 1.4
(5)Saint Marys: 1
(4)Illinois: 0.9
(5)Connecticut: 0.8
(6)Louisiana State: 0.8
(8)San Diego State: 0.7
(3)Wisconsin: 0.7
(6)Texas: 0.5
(4)Providence: 0.5
(8)North Carolina: 0.4
(10)Loyola Ill: 0.4
(9)Memphis: 0.4
(10)San Francisco: 0.4
(8)Boise State: 0.3
(6)Alabama: 0.3
(11)Virginia Tech: 0.3
(6)Colorado State: 0.3
(11)Michigan: 0.3
(7)Michigan State: 0.3
(9)Creighton: 0.2
(7)Southern California: 0.2
(7)Murray State: 0.2
(10)Davidson: 0.2
(8)Seton Hall: 0.2
(7)Ohio State: 0.2
(9)Texas Christian: 0.1
(9)Marquette: 0.1
(10)Miami: 0.1
(11)Iowa State: 0.1
(12)Indiana: 0.1
(11)Notre Dame: 0.1
(13)Vermont: 0.1
(13)South Dakota State: 0.1
(12)New Mexico State: 0.1
(12)Alabama Birmingham: 0
(12)Richmond: 0
(13)Chattanooga: 0
(14)Colgate: 0
(15)Saint Peters: 0
(13)Akron: 0
(14)Longwood: 0
Here are the number of teams reaching each threshold, by year, for the last few years:
At least 10% Chance at Final 4:
2015 - 9
2016 - 18
2017 - 15
2018 - 13
2019 - 12
2021 - 9
2022 - 14
At least 1% Chance at Final 4:
2015 - 28
2016 - 43
2017 - 43
2018 - 42
2019 - 34
2021 - 46
2022 - 45
At least 10% Chance at Title:
2015 - 4
2016 - 4
2017 - 2
2018 - 2
2019 - 4
2021 - 2
2022 - 1
At least 1% Chance at Title:
2015 - 12
2016 - 18
2017 - 24
2018 - 19
2019 - 16
2021 - 22
2022 - 16
The top seeds this year are some of the weakest we have ever seen. The strength of the 1 and 2 seeds
is pretty comparable to two years from the past in particular - 2014 and 2006. In 2014, we saw
7 seed UConn win the national title game over 8 seed Kentucky. In 2006, we saw no 1 seeds make the
Final 4, only a single 2 seed, and (4) LSU and (11) George Mason were represented there. We see
this represented this year with the historically high numbers of teams that could possibly make
the Final 4. But the title chances still reside with the few really good teams, so we don't see
as much representation there as you might expect from an open field.
Breakdown of expected number of each seed in the final four (with comparisons to last year):
1 seeds: 1.2 (vs 1.37 last year)
2 seeds: 0.83 (vs 0.73 last year)
3 seeds: 0.56 (vs 0.36 last year)
4 seeds: 0.33 (vs 0.3 last year)
The rest: 1.08 (vs 1.24 last year)
What the 1 seeds lose in strength this year is gained mostly by the 2 and 3 seeds.
Unlike last year, where the 3-8 seeds were mostly interchangeable, there is a gradual drop-off in
strength this time going down the seeds so there is a slightly smaller chance of a big longshot
making the Final 4. Much of the "rest" category comes from (5) Houston and (5) Iowa.
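The expected counts in this breakdown are just the sum of the individual Final Four chances on each seed line. Here is a quick sketch using the percentages listed earlier in this post. The results agree with the breakdown to within rounding of the listed percentages.

```python
# Final Four chances (percent) for the top four seed lines, from the list earlier in this post
final_four_pct = {
    1: [41.2, 30.0, 26.5, 22.9],  # Gonzaga, Kansas, Arizona, Baylor
    2: [26.8, 23.7, 17.5, 14.8],  # Auburn, Kentucky, Villanova, Duke
    3: [20.6, 15.9, 12.8, 6.5],   # Tennessee, Texas Tech, Purdue, Wisconsin
    4: [15.6, 7.4, 5.4, 4.2],     # UCLA, Arkansas, Illinois, Providence
}

# Expected number of Final Four teams from a seed line = sum of its teams' chances
expected = {seed: sum(pcts) / 100 for seed, pcts in final_four_pct.items()}

# There are four Final Four spots in total, so everyone else accounts for the remainder
expected_rest = 4 - sum(expected.values())

for seed, value in expected.items():
    print(f"{seed} seeds: {value:.2f}")
print(f"The rest: {expected_rest:.2f}")
```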
Here are the most likely upsets by round and seed line. First round:
16 over 1: 0.21 upsets, most likely Norfolk St over Baylor, 6.9% chance
15 over 2: 0.34 upsets, most likely Saint Peters over Kentucky, 11.2% chance
14 over 3: 0.5 upsets, most likely Colgate over Wisconsin, 17.5% chance
13 over 4: 1.11 upsets, most likely South Dakota St over Providence, 37% chance
12 over 5: 1.16 upsets, most likely Indiana over St Marys, 34.9% chance
11 over 6: 1.74 upsets, most likely Michigan over Colorado St, 47% chance
10 over 7: 2.06 upsets, the 10 seed is actually slightly favored in every matchup except Miami vs USC
9 over 8: 1.85 upsets, most likely Memphis over Boise St, 49.4% chance (none of the 9s are favored)
Even though the top seeds are weak, we also have weak 15 and 16 seeds to offset that. The 14 seeds are particularly weak
compared to normal. However, starting with the 13 v 4 matchup there are some great upset chances.
Second round:
8 over 1: 1.09 upsets, most likely UNC over Baylor, 33.1% chance
7 or 10 over 2: 1.22 upsets, most likely Loyola Chicago over Villanova, 34.6% chance
6 over 3: 1.58 upsets, with LSU favored over Wisconsin, 51.9% chance
5 over 4: 2.12 upsets, with Houston over Illinois and Iowa over Providence as big favorites (over 60% each)
The upset numbers for the 1 and 2 seeds are actually lower than last year, so expect perhaps a little
less chaos than last time. Watch for the particularly vulnerable ones though.
Later rounds:
4 or 5 over 1: Houston over Arizona (48.4%) and UCLA over Baylor (45.4%) are both worth considering.
3 or 6 over 2: Texas Tech over Duke (52.2%) and Tennessee over Villanova (53.1%) are favorites for seed upsets
2 over 1: (2) Kentucky is projected to win its region over (1) Baylor. Note that in the region with (1) Kansas, they will be
favored in every game they play but they have a tough road and (2) Auburn with an easier path actually has
nearly equivalent final 4 chances.
If I had to pick one bracket, I'd try to keep my 1 and 2 seeds mostly alive until the Sweet 16,
and then let some upsets happen from there. Pay attention to which ones show some weakness and
will probably lose at some point, since those could be worth picking to lose early. (1) Baylor is
clearly the weakest 1 seed so a Round 2 upset is worth taking a shot at, and if not, then losing
to UCLA or St Marys in the Sweet 16 is certainly possible. The Region with (1) Arizona is a
brutal region, so even though Arizona looks really good, consider letting some chaos happen there
because of the number of land mines. Stay away from the 3 v 14 upsets (none of them look good) but
a 4 v 13 will probably happen, and I'd call the 6 through 8 seed Round 1 games coin flips and just
pick who you like. For Final 4 picks, I'd take Gonzaga, then two more 1 or 2 seeds, and then you
can get cute with the last pick. (4) UCLA, (5) Houston, (3) Tennessee, and (5) Iowa are spicy picks.
(4) Arkansas was a popular Final Four prospect a week ago but drew Gonzaga in its region. (3) Texas
Tech is who I would pick in Gonzaga's region if I picked them to fall.
As a usual reminder, if you are using my Bracket Generator to make picks, I would only
recommend leaving it at the default setting if you are generating brackets just for fun,
or trying to win a very large pool (1000+ people). Otherwise you are probably better
off sliding it all the way to "conservative" and generating brackets from that position.
Good luck and have fun bracketing!
ENTRY 59 - 3-13-22 - Selection Sunday Final Brackets 2022
Well, here we are again. It is Selection Sunday! This post comes a few hours
before the full bracket will be revealed. So let's break down the bracket situation!
Let's start with a brief look at the Early Reveal from this year, which happened
on the morning of 2/19/22. Here were my program's top seeds at the time of the reveal:
1: Auburn, Arizona, Kansas, Purdue
2: Gonzaga, Kentucky, Baylor, Tennessee
3: Duke, Villanova, Wisconsin, Texas Tech
4: Providence, UCLA, Illinois, Alabama
missed: Houston, LSU, Ohio St, Texas
And here were the committee's actual top 4 seed lines:
1: Gonzaga, Auburn, Arizona, Kansas
2: Baylor, Kentucky, Purdue, Duke
3: Villanova, Texas Tech, Tennessee, Illinois
4: Wisconsin, UCLA, Providence, Texas
missed: Alabama, Houston, Ohio St
My system continues to be confused by teams like Gonzaga. They did not have very
many Q1 wins at the time but by most metrics they were clearly the best team in the
country. The committee reiterated during the show that wins and losses are all they
care about, and they do not use advanced metrics to seed the teams. So there seems
to be a contradiction here. At the time their best wins were Texas, UCLA, and
Texas Tech. They did not have many Q2 games. Most of their schedule was in Q4.
Overall the noncon schedule was not great. But most people had them as a clear #1
overall.
The committee even said their noncon SOS was "good" (it was about #150). I guess a few
high profile games can really make up for a bunch of cupcakes in their mind.
Other than that, the committee really seemed to value quality wins above pretty much
anything else, in particular Q1 wins but to some extent Q2 wins. In particular, a
team's top 2 or 3 wins seem to matter a lot (regardless of location!)
I did make some adjustments to my weights after that early reveal and at least got my program to
reproduce the same list of seeds as the committee reveal, although not necessarily
in the same order. The adjustments required putting much more weight on team
strength (ironic since the committee said they do not value this, although it was
the only way to get Gonzaga to the 1 line). I also decreased the value of bad losses
a bit and emphasized Q1 and Q2 wins more. One of the issues with messing with weights
in response to the committee reveal is there's no telling what consequences it will have
further down the bracket and whether those principles will still hold.
For example, with a team like Murray State, most people have them ranked quite highly
and looking like an 8 or even 7 seed. But their team sheet is kind of lackluster.
Their conference is just so bad. They won at Memphis and that is their only win
over an at large team. At Belmont is their only other Q1 win, and it's barely
Q1. Their 3 Q2 wins are very marginal and barely Q2 wins. The one at home over
Chattanooga helps a bit. But over half of their team sheet is in Q4. If the committee
cares most about who you beat, then how is this deserving of an 8 seed? How, for example,
is this team regarded higher than a North Carolina team that beat Duke on a neutral floor?
Or TCU, which has 8 Q1 wins, including Kansas, Texas, and Texas Tech? This makes actual
0 sense to me.
Enough of the ranting. I'll do more ranting later probably. Here are the projections!
My system's seeds at 4:40pm EST:
1: Arizona, Gonzaga, Baylor, Kansas,
2: Purdue, Tennessee, Auburn, Kentucky,
3: Villanova, Duke, Texas Tech, UCLA,
4: Wisconsin, Illinois, Providence, Arkansas,
5: Iowa, Connecticut, Houston, Saint Marys,
6: Texas, Colorado State, Michigan State, Southern California,
7: Seton Hall, Louisiana State, Boise State, North Carolina,
8: Loyola Ill, Creighton, Texas Christian, Memphis,
9: San Diego State, Ohio State, Marquette, Alabama,
10: Iowa State, Virginia Tech, Texas AM, Davidson,
11: Miami, Indiana, Murray State, Notre Dame, Michigan,
12: Oklahoma, San Francisco, Alabama Birmingham, New Mexico State, South Dakota State,
13: Richmond, Vermont, Chattanooga, Longwood,
14: Akron, Montana State, Colgate, Yale,
15: Delaware, Norfolk State, Cal State Fullerton, Jacksonville State,
16: Georgia State, Texas Southern, Bryant, Saint Peters, Wright State, Texas AM CC,
And here are my own personal projections as of about 5:10pm EST:
1: Gonzaga, Arizona, Kansas, Baylor
2: Auburn, Kentucky, Purdue, Villanova,
3: Tennessee, Duke, Texas Tech, Wisconsin,
4: UCLA, Illinois, Providence, Arkansas,
5: Iowa, Connecticut, Houston, Saint Marys,
6: Texas, Southern California, Louisiana State, Boise State,
7: Colorado State, Michigan State, Alabama, Texas Christian,
8: Seton Hall, Ohio State, Creighton, San Diego State,
9: Marquette, Memphis, Murray State, Iowa State,
10: Loyola Ill, North Carolina, Virginia Tech, Davidson,
11: San Francisco, Texas AM, Miami, Indiana, Michigan,
12: Wyoming, Oklahoma, Alabama Birmingham, New Mexico State, South Dakota State,
13: Chattanooga, Richmond, Vermont, Akron,
14: Yale, Longwood, Montana State, Colgate,
15: Delaware, Cal State Fullerton, Jacksonville State, Georgia State,
16: Norfolk State, Texas Southern, Bryant, Saint Peters, Wright State, Texas AM CC,
Out:
Notre Dame, Xavier, SMU, Wake, Rutgers
The bubble is really big this year. I wouldn't be surprised to be wrong about several of
these teams. Usually if I am wrong it's because the committee chooses to put in a team
with a better Q1 situation even if their metrics and other things are bad. Notre Dame and Wake
look like classic cases of really bad records against Q1 even though the rest of their
profile is OK. The committee doesn't tend to reward clean sheets against Q2-Q4 if the Q1
record is bad. I think Oklahoma, Michigan, and Miami could all be considered to qualify
for this. Rutgers will be an interesting case. They have a great Q1 record but lots of
garbage in Q2-Q3 and bad metrics. How much does that matter? We'll see!
ENTRY 58 - 6-14-21 - System Bracket Generation Performance (2021)
Well, I am back after a bit of a break to analyze what happened in the
2021 NCAA Tournament. In particular, did my system and its bracket picker
do better than an average human? Did it do better than an elite human
picker? Let's see!
First, a quick reminder about some cool things that happened in the
tournament. Gonzaga was the prohibitive favorite entering the
tournament. Gonzaga and Baylor were clear #1 and #2 (in some order) until
Baylor had some COVID issues that seemed to slow them down and they didn't
quite look themselves in the leadup. Then Baylor defeated Gonzaga in the
finals by almost 20 points. The big story was the Pac 12, which had a
substandard nonconference but had multiple teams go on runs in the tournament.
(6) USC made the Elite 8 before falling to Gonzaga. (11) UCLA made the
Final Four in a weak region with hobbled 1 seed Michigan before falling
to Gonzaga in an instant classic with overtime. (12) Oregon State took advantage
of (8) Loyola's upset of (1) Illinois to reach the Elite 8. Outside of those
runs, Syracuse was Syracuse again (reaching the Sweet 16 as a #11) and Oral
Roberts also made the Sweet 16 as a #15. Meanwhile, the Big 10, which was
considered one of the strongest conferences in years, advanced only a
single team to the Sweet 16.
Now, how did my advice from Blog Entry 57 hold up? I had some advice
for bracket makers and I hope it was good.
- "Expect a mild first round and lots of upsets in the second round"
(I think there were too many first round upsets to call this a success,
but the second round did produce lots of high seeds)
- "The 1-2 seeds are vulnerable in the Round of 32. Also the 3-6 seeds
are interchangeable" (I'll call this success. 3 of 8 top two seeds fell
before the Sweet 16. Only a single 3 and a single 4 made the Sweet 16)
- "If I had to pick one bracket, I would advance all 1 and 2 seeds one game."
(Nope.)
- "I would pick 1 upset each of 3v14 and 4v13. They are all reasonable."
(We had one 3v14 and two 4v13 so pretty close)
- "The 12 over 5 upsets are all pretty enticing. I would pick 1 or 2."
(Surprisingly, only the least likely one actually happened)
- "You could flip coins past the first round and do just fine." (More
on this later!)
- "Just make sure you pick Gonzaga to win it all." (Close)
- "Pick your Final 4 from the best 9 teams (the 1-2 seeds and #3 Arkansas),
with maybe one sleeper that is 9 seed or better." (Good advice besides
(11) UCLA, which is a deeper sleeper than anyone expected. At least
that team was in the weak region.)
An extra note just for this year: In the previous post I mentioned
that I manually changed the strength of 5 teams before generating
brackets. In hindsight, how did these changes affect things?
- Colgate (-5): They lost their first round in lopsided fashion
so this change appeared justified.
- Villanova (-2): Due to injury of their star player. They still
made the Sweet 16 so this definitely hurt my brackets.
- Michigan (-1): Due to injury of a key piece. You can take or
leave this one. They still made the Elite 8 and probably should
have made the Final 4 but I think overall this bump probably
hurt my brackets.
- UConn (+1): Due to playing much better with a star player
back. They still lost in the first round, so this hurt me.
- Baylor (+1): Due to probably piecing things together after
a COVID pause dinged them. I guess I should have bumped them up
even more? The +1 put them about halfway toward their strength
from earlier in the season before the pause.
Overall, I guess what I learned is there's no point in making
manual adjustments (besides Colgate) since it's all random anyway.
Let's move on to chances of perfect bracket at each round (SD=1):
Round 1 - 1 in 504 million
Round 2 - 5.33 x 10^-14
Round 4 - 6.51 x 10^-17
Perfect Bracket - 1.21 x 10^-17
Probability of perfect bracket maximized at SD=1.32, P = 1.87 x 10^-17
The Round 1 and Round 2 chances of perfect bracket might be the lowest
ones I've ever seen. Some upsets that were way out there brought the
probability way down. Only the somewhat more likely results in the final
four saved this from being a historically crazy bracket. (2011 is the only
one with a lower probability, at 7.8 x 10^-18. It had (11) VCU vs (8)
Butler in the Final Four and no 1 or 2 seeds there.)
Of particular note, a bracket that flips a literal coin for every game
will have probability 1 x 10^-19 of being perfect. So my system is only
barely better than that this year.
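The coin-flip figure is easy to check. This is a minimal sketch, assuming a 63-game bracket (the 64-team field, with play-in games not counted) and a fair, independent coin for every game:

```python
# Chance that independent fair coin flips match every game of a
# 63-game bracket (64-team field; play-in games are not counted).
GAMES = 63

coin_flip_perfect = 0.5 ** GAMES

print(f"coin flip perfect bracket: {coin_flip_perfect:.2e}")
```

The result rounds to the 1 x 10^-19 quoted above.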
Let's look at final four incidence rates next. My system's probability
of picking each Final Four team correctly, as per Blog Entry 57:
Gonzaga 55.7%
Baylor 34.5%
Houston 27.0%
UCLA 1.6%
All Four: 1 in 1205
All except UCLA: 1 in 19
My system benefits from its 3 top picks (including (2) Houston)
succeeding in making it. Illinois, the heavy human favorite in its
region, lost before the Sweet 16.
Here are the public's
chances at getting each one right as per ESPN data:
Gonzaga 65.7%
Baylor 47.8%
Houston 17.6%
UCLA 0.9%
All Four (using these probs): 1 in 2012
All Four (actual): 1 in 4878
All Except UCLA: 1 in 18
This is a typical pattern. The public plays it conservative,
which helps for favorites and really brings the probabilities down
for long shots. Their chance of a 3/4 final four is about the
same, but their chance of getting all four is much lower. It's curious that
the actual number of perfect final fours is about half of what you'd
expect by combining the individual probabilities.
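For anyone who wants to check the "1 in N" lines, here is a small sketch of combining the individual probabilities. It assumes the four picks are independent, which is a simplification, and the function name is just for illustration:

```python
from math import prod

# Final Four probabilities (percent) quoted above for the four teams that made it.
picks = {
    "my system": {"Gonzaga": 55.7, "Baylor": 34.5, "Houston": 27.0, "UCLA": 1.6},
    "ESPN public": {"Gonzaga": 65.7, "Baylor": 47.8, "Houston": 17.6, "UCLA": 0.9},
}

def one_in(probabilities_pct):
    """Return N such that all events together happen about 1 time in N,
    treating the events as independent (an assumption)."""
    return 1 / prod(p / 100 for p in probabilities_pct)

for name, teams in picks.items():
    all_four = one_in(teams.values())
    without_ucla = one_in(p for team, p in teams.items() if team != "UCLA")
    print(f"{name}: all four 1 in {all_four:.0f}, all except UCLA 1 in {without_ucla:.0f}")
```

Because the listed percentages are rounded, the ESPN product lands close to, but not exactly on, the 1 in 2012 shown above.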
I have a few other comparisons to note. First, here are the same
numbers using KenPom's system and his annual probability chart:
Gonzaga 60.5%
Baylor 31.5%
Houston 28.2%
UCLA 1.3%
All Four: 1 in 1431
All except UCLA: 1 in 19
His numbers are very comparable but the big relative dip in UCLA
caused the drop in performance.
Finally, here are the numbers using my system set to EV Max (a
conservative setting I use to generate most of my brackets which
should be more directly comparable to the public):
Gonzaga 74.5%
Baylor 53.8%
Houston 39.8%
UCLA 0.2%
All Four: 1 in 3125
All except UCLA: 1 in 6
As one might expect, the huge drop in UCLA's chances really hurts
the All Four chances, but the upside is a great 1 out of 6 chance
of nailing 3 out of 4! Will this be enough to generate elite
brackets?
Bring on the percentiles, and let's find out which system wins at each
threshold! I've decided to change how I do this analysis to a methodology
that may be slightly easier for someone to look at. For each of my
system's settings I will generate 100,000 brackets. Then I will give you
the score of the bracket that sits at each of a number of percentiles. For
example, we will compare the 0.99 percentile level for each kind of bracket
and see what ESPN score a bracket in the top 1% would get. The system
with the highest number in that row is the best for making brackets
to reach that particular ESPN score. It is really the inverse
of the previous charts I made. Here is the chart:
Percentile | ESPN | SD 0.5 | EV Max | SD 0.8 | SD 1 | SD 1.2 | 0.5 Seed Typical | 1.0 Seed Typical |
Best       | 1690 | 1590   | 1580   | 1610   | 1630 | 1660   | 1570             | 1560             |
0.99999    | 1570 | 1590   | 1580   | 1590   | 1560 | 1600   | 1570             | 1550             |
0.9999     | 1510 | 1550   | 1540   | 1540   | 1520 | 1540   | 1530             | 1520             |
0.999      | 1450 | 1490   | 1490   | 1470   | 1450 | 1430   | 1480             | 1450             |
0.99       | 1360 | 1410   | 1410   | 1360   | 1320 | 1280   | 1400             | 1310             |
0.98       | 1320 | 1370   | 1370   | 1300   | 1260 | 1210   | 1360             | 1250             |
0.95       | 1190 | 1290   | 1280   | 1190   | 1120 | 1060   | 1280             | 1120             |
0.9        | 1050 | 1160   | 1150   | 1060   | 1000 | 940    | 1160             | 1010             |
0.75       | 880  | 990    | 970    | 860    | 780  | 720    | 960              | 760              |
0.5        | 710  | 830    | 780    | 690    | 620  | 560    | 740              | 580              |
0.25       | 530  | 700    | 640    | 540    | 460  | 410    | 590              | 440              |
0.1        | 200  | 580    | 510    | 410    | 360  | 320    | 490              | 360              |
0.01       | 120  | 390    | 350    | 280    | 250  | 230    | 370              | 270              |
As an example of how to read the chart, in the 0.99 row and the SD 1
column you see that my system set to normal probabilities out of
100 attempts would have a best bracket usually above 1320 in
ESPN Score. The probability is 1% that a random bracket will be
1320 or above.
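The percentile lookup itself is simple. Below is a rough sketch of how one row of the chart could be produced from a list of bracket scores. The function name and the stand-in scores are made up for illustration and are not the generator's actual code or output:

```python
def score_at_percentile(scores, percentile):
    """Return the score sitting at the given percentile (0-1) of the list.

    A bracket at percentile 0.99 scores at least as well as 99% of the
    generated brackets.
    """
    ranked = sorted(scores)
    # The small epsilon guards against floating-point error in the product.
    index = min(int(percentile * len(ranked) + 1e-9), len(ranked) - 1)
    return ranked[index]

# Hypothetical stand-in data: 1,000 made-up scores in 10-point steps.
# A real run would use the scores of the 100,000 generated brackets.
example_scores = [10 * (i % 193) for i in range(1000)]

for level in (0.5, 0.9, 0.99):
    print(level, score_at_percentile(example_scores, level))
```

Running this once per setting and once per percentile level fills in the chart column by column.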
Here is how I interpret the data:
As far as "Best" brackets are concerned, it is not surprising that my
craziest settings generated the best brackets out of 100,000 tries.
Recall the optimal setting for generating a 100% perfect bracket is
SD 1.32, which is even crazier than anything I include on this chart. But
the chance of creating really great brackets comes at a price, as
the 1.2 setting becomes worse than all other settings from the 0.999
percentile downward. This means for practical purposes it really
isn't your best option.
As this year illustrates better than ever, even with the extremely
unlikely outcomes, you are still best off going conservative. The
0.5 SD setting (the most conservative) has the highest score (or is tied
for it) at every percentile from 0.9999 downward. The EV Max setting
has very similar numbers at the top, within 20 points down to the 0.75
level, but falls 40-70 points behind below that.
The Best ESPN Score of 1690 is unfair to compare to my system's bests
since it is out of 17 million attempts and not 100,000. Thus it is
better to compare starting at the 0.99999 level, at which you can see
it does worse than my preferred EV Max setting at every percentile
level. The ESPN scores seem actually very similar to my system's
scores at the 0.8 level, which is slightly conservative.
Did my system do better than just using typical seed strength? In
this setting, I just take the team's seed and look at how strong
teams of that seed have typically been in the past. This gives
a higher chance of Baylor winning it all (since it is equal in
strength to Gonzaga). However, it slightly punishes Houston (22%
Final Four) and severely punishes UCLA (0.6% Final Four). The
result seems to be a wash this year. At both the SD 0.5 and 1.0
levels (both conservative and normal), the scores are nearly
identical to their counterparts. My takeaway is that it seems
to still be a good idea to hedge and generate some of my
brackets using typical seed strength if I generate a large
number of brackets.
My annual family group had 67 brackets. The highest score
turned out to be 1340 (it wasn't me unfortunately). My EV Max
setting has a 2% or 1 in 50 chance of any bracket it generates being
1370 or higher. So my generator likely would have done better
than 1340 if all 67 of the brackets in the group used it instead.
I generated the full 25 allowed brackets on ESPN and one of
my EV Max brackets did score 1470, which was quite lucky
for only 25 attempts, but does show that great brackets can
happen even from few attempts.
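To put rough numbers on "likely" and "quite lucky", here is a quick sketch of the chance that at least one bracket in a group clears a score. It assumes every bracket is an independent draw from the generator, and the function name is only illustrative:

```python
def chance_of_at_least_one(p_single, attempts):
    """Probability that at least one of `attempts` independent brackets
    reaches a score that any single bracket reaches with probability p_single."""
    return 1 - (1 - p_single) ** attempts

# 67 EV Max brackets, each with a 2% chance of scoring 1370 or higher.
family_pool = chance_of_at_least_one(0.02, 67)

# 25 EV Max brackets, each with a 0.1% chance of 1490 or higher
# (the 0.999 row of the chart; 1470 is slightly easier to reach than that).
espn_entries = chance_of_at_least_one(0.001, 25)

print(f"family pool: {family_pool:.1%}, 25 ESPN entries: {espn_entries:.1%}")
```

Under those assumptions the 67-bracket pool clears 1370 roughly three times out of four, while 25 entries reach the 0.999 level only a few percent of the time.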
Anyway, that's all for today! I'm going to conclude from this
year that my system did generate better brackets than the
public (the EV Max or SD 0.5 setting is what I have used as a
benchmark in the past). It was better at generating both
average and elite brackets. Its love of Gonzaga probably
cost it being better than just using average seed strength, but
I can settle for a draw. See you all next year!
ENTRY 57 - 3-15-21 - NCAA Tournament Special 2021
We've got another bracket in front of us. Time to dig into the data and see
what it tells us about what kind of year we can expect!
Quick note: Usually I make a post about how my seed projections did. I
might not end up doing this at all this year since it was such a weird year
and I'm not sure much can really be taken away from it.
VERY IMPORTANT NOTE: Because of the unique Colgate situation (ranked #13
right now), I decided for the purposes of doing analysis that would be
helpful to people (including myself) making brackets to adjust Colgate,
as well as a few other teams' strengths, manually from my system's outputs.
I did this only for teams like Colgate or ones that had significant injury
issues or COVID pauses affecting their strength. There were a total of 5
teams modified, most of them by only a single strength point.
Let's go ahead and start with Final Four and Champions breakdowns:
Final Four Percent
(1)Gonzaga: 55.7
(1)Baylor: 34.5
(2)Houston: 27
(1)Illinois: 25.1
(1)Michigan: 21.8
(2)Alabama: 19.8
(2)Ohio State: 15.6
(2)Iowa: 11
(3)Arkansas: 10.9
(3)Texas: 9
(6)Brigham Young: 8.9
(4)Oklahoma State: 8.7
(3)Kansas: 8.4
(6)San Diego State: 8.1
(3)West Virginia: 8.1
(4)Florida State: 7.8
(4)Purdue: 7.7
(5)Colorado: 7.7
(5)Tennessee: 7.7
(7)Connecticut: 7.4
(6)Southern California: 6
(4)Virginia: 5.6
(9)Saint Bonaventure: 5.5
(6)Texas Tech: 5.5
(5)Villanova: 5.4
(8)Louisiana State: 5.4
(8)Loyola Ill: 5
(8)North Carolina: 5
(5)Creighton: 4.1
(7)Florida: 3.7
(11)Utah State: 3.2
(7)Oregon: 3
(7)Clemson: 2.7
(9)Wisconsin: 2.7
(9)Georgia Tech: 2.5
(10)Virginia Tech: 2.2
(11)Syracuse: 2
(10)Maryland: 1.9
(14)Colgate: 1.9
(10)Rutgers: 1.8
(10)Virginia Commonwealth: 1.7
(12)Georgetown: 1.7
(11)UCLA: 1.6
(8)Oklahoma: 1.3
(12)Winthrop: 1
(9)Missouri: 1
(11)Wichita State: 0.9
(12)Oregon State: 0.8
(13)North Texas: 0.8
(13)UNC Greensboro: 0.6
(13)Ohio: 0.6
(12)UC Santa Barbara: 0.6
(14)Abilene Christian: 0.6
(15)Iona: 0.2
(13)Liberty: 0.2
(14)Morehead State: 0.1
(15)Grand Canyon: 0.1
(14)Eastern Washington: 0.1
(15)Cleveland State: 0.1
Champion Percent
(1)Gonzaga: 30.1
(1)Baylor: 11.6
(2)Houston: 8.1
(1)Illinois: 7.4
(1)Michigan: 5.3
(2)Alabama: 4.7
(2)Ohio State: 2.9
(2)Iowa: 2.8
(3)Arkansas: 2.1
(3)Kansas: 1.9
(6)San Diego State: 1.3
(6)Brigham Young: 1.3
(5)Tennessee: 1.3
(6)Southern California: 1.3
(4)Oklahoma State: 1.2
(3)Texas: 1.2
(3)West Virginia: 1.2
(4)Virginia: 1.2
(4)Purdue: 1.1
(4)Florida State: 1
(5)Colorado: 1
(7)Connecticut: 1
(8)Loyola Ill: 0.8
(9)Saint Bonaventure: 0.8
(8)North Carolina: 0.7
(5)Creighton: 0.7
(8)Louisiana State: 0.7
(5)Villanova: 0.7
(6)Texas Tech: 0.7
(7)Oregon: 0.5
(7)Florida: 0.4
(11)Utah State: 0.4
(7)Clemson: 0.3
(9)Wisconsin: 0.3
(9)Georgia Tech: 0.3
(10)Virginia Commonwealth: 0.2
(10)Virginia Tech: 0.2
(14)Colgate: 0.2
(11)Syracuse: 0.2
(8)Oklahoma: 0.2
(10)Maryland: 0.1
(10)Rutgers: 0.1
(11)UCLA: 0.1
(12)Georgetown: 0.1
(9)Missouri: 0.1
(11)Wichita State: 0.1
(12)Winthrop: 0.1
Here is the number of teams reaching each threshold in each of the last few tournaments:
At least 10% Chance at Final 4:
2015 - 9
2016 - 18
2017 - 15
2018 - 13
2019 - 12
2021 - 9
At least 1% Chance at Final 4:
2015 - 28
2016 - 43
2017 - 43
2018 - 42
2019 - 34
2021 - 46
At least 10% Chance at Title:
2015 - 4
2016 - 4
2017 - 2
2018 - 2
2019 - 4
2021 - 2
At least 1% Chance at Title:
2015 - 12
2016 - 18
2017 - 24
2018 - 19
2019 - 16
2021 - 22
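These counts come straight from the percentage lists. As a minimal sketch, using only the top of the 2021 Final Four list above (the function name is hypothetical):

```python
# Top of the 2021 Final Four list above (percent). Only the teams near the
# 10% cutoff are included here; the full list is longer.
final_four_pct = {
    "Gonzaga": 55.7, "Baylor": 34.5, "Houston": 27.0, "Illinois": 25.1,
    "Michigan": 21.8, "Alabama": 19.8, "Ohio State": 15.6, "Iowa": 11.0,
    "Arkansas": 10.9, "Texas": 9.0, "Brigham Young": 8.9,
}

def teams_at_or_above(percentages, threshold_pct):
    """Count teams whose chance is at least threshold_pct percent."""
    return sum(1 for pct in percentages.values() if pct >= threshold_pct)

print(teams_at_or_above(final_four_pct, 10))
```

The same function applied to the full Final Four and Champion lists with thresholds of 10 and 1 gives the other 2021 entries.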
There is a very unusual distribution of team strength this time. There are
9 teams that separated themselves at the top of my ratings. These are the
ones that have the 10% chance at final four. They are all the 1 and 2 seeds,
plus 3 seed Arkansas. Then after that, ranks #10 through #31 are remarkably
similar. This cluster ends at #31 Oregon. This cluster of 22 teams spans
only 2.5 rating points, which means any of these teams would be virtual coin
flips against each other and are entirely interchangeable.
The result of this is that 9 teams have a good shot at winning it all, but
after that any amount of chaos could happen. Even the 1 and 2 seeds don't
have the separation you would typically expect from the top lines. The 1
seeds in particular, besides Gonzaga, are extremely vulnerable by typical
measures. It is possible this is just due to the shortened nonconference
schedule resulting in fewer chances for the top teams to punish bottom
dwellers and separate in my system. Only time will tell. If the ratings
are to be trusted, expect a relatively mild first round followed by utter
chaos in the second round with frequent upsets at all levels.
Breakdown of expected number of each seed in the final four:
1 seeds: 1.37
2 seeds: 0.73
3 seeds: 0.36
4 seeds: 0.30
The rest: 1.24
These numbers are down across the board (besides "The rest" category)
from the 2019 numbers. This means the final four will likely have one
member, and possibly two, from outside the top 4 seeds. This is the result
of the high likelihood of 1v8 and 2v7 upsets compared to usual years.
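The seed-line breakdown is just the Final Four percentages added up by seed. A short sketch using the values from the list above, with "The rest" taken as whatever is left of the four available spots:

```python
# Final Four percentages for the top four seed lines, from the list above.
final_four_pct_by_seed = {
    1: [55.7, 34.5, 25.1, 21.8],   # Gonzaga, Baylor, Illinois, Michigan
    2: [27.0, 19.8, 15.6, 11.0],   # Houston, Alabama, Ohio State, Iowa
    3: [10.9, 9.0, 8.4, 8.1],      # Arkansas, Texas, Kansas, West Virginia
    4: [8.7, 7.8, 7.7, 5.6],       # Oklahoma State, Florida State, Purdue, Virginia
}

expected = {seed: sum(pcts) / 100 for seed, pcts in final_four_pct_by_seed.items()}
# Four Final Four spots in total, so everything else gets the remainder.
expected_rest = 4 - sum(expected.values())

for seed, value in expected.items():
    print(f"{seed} seeds: {value:.2f}")
print(f"The rest: {expected_rest:.2f}")
```

Since the listed percentages are rounded to one decimal, the remainder can differ from the 1.24 shown by about 0.01.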
Here are the most likely upsets by round and seed line. First round:
16 over 1: 0.23 upsets, most likely Mt St Marys over Michigan, 8.3% chance
15 over 2: 0.53 upsets, most likely Grand Canyon over Iowa, 18% chance
14 over 3: 0.93 upsets, most likely Colgate over Arkansas, 34.3% chance
13 over 4: 1.08 upsets, most likely Ohio over Virginia, 31% chance
12 over 5: 1.40 upsets, most likely Georgetown over Colorado, 36.4% chance
11 over 6: 1.48 upsets, most likely Utah St over Texas Tech, 43.4% chance
10 over 7: 1.73 upsets, most likely Rutgers over Clemson, 46.9% chance
9 over 8: 1.89 upsets, St Bonaventure favored over LSU at 50.6% chance
Because of the weakness of the 1 seeds and the relative strength of 16 seeds
compared to usual years, the 1v16 odds look more like 2v15 odds from normal
years. Vegas generally agrees that the above are the most likely. Vegas
actually favors Rutgers over Clemson.
Here are some upset chances and most likely upsets in later rounds:
8 or 9 over 1: 1.19 upsets, most likely St Bonaventure over Michigan, 38.6% chance
7 over 2: 1.43 upsets, most likely UConn over Alabama or Oregon over Iowa (tied at 38.5% chance)
6 over 3: 1.97 upsets, with San Diego St favored over West Virginia and BYU favored over Texas
5 over 4: 1.94 upsets, with Colorado favored over Florida St and Tennessee favored over Oklahoma St
As mentioned before, the 1 and 2 seeds are extremely vulnerable at this position
in the bracket in particular, and the 3-6 seeds are basically indistinguishable.
One trend to note here is my system's disrespect across the board for all of
the Big 12 teams. I think this disrespect is a product of how badly the
bottom half of the league did in the nonconference, and how those teams
improved during conference play and brought everyone else down. It is
possible all of these Big 12 teams should be given more credit than my system
lets on, but I have not changed any numbers here.
4 or 5 over 1: Most likely is Colorado over Michigan, 38.6% chance.
3 or 6 over 2: Arkansas is barely favored (50.9%) over Ohio State
2 over 1: Houston is favored over Illinois and Alabama is favored over Michigan to make the final four
To summarize again, if I had to pick one bracket, I would advance all the 1 and 2 seeds
at least one game. I would pick 1 upset each of 3v14 and 4v13. All of them are reasonable.
I would take either one or two upsets at each of the other opening round seed
matchups. The 12 over 5 upsets are all pretty enticing. The 8 and 9 seeds are
usually pretty close to coin flip. Vegas actually favors all four 8 seeds by exactly
2 points as I type this, but 2 points is nothing. Then let the bracket just descend
into utter chaos after the first round. You could flip coins past that point and do
fine. Just make sure you pick Gonzaga to win it all. In a smaller pool, you might
be best served at least advancing four of the best 9 teams mentioned above to
the final four. Pick one sleeper if you want, but I'd stick to something 9 seed
or better.
As a usual reminder, if you are using my Bracket Generator to make picks, I would
only recommend leaving it at the default setting if you are generating brackets
just for fun, or trying to win a very large pool (1000+ people). Otherwise you
are probably better off sliding it all the way to "conservative" and generating
brackets from that position.
Good luck and have fun bracketing!
ENTRY 56 - 3-14-21 - Selection Sunday in the Pandemic Year - Bracket Projections
EDIT: Looks like everything is going to form on Sunday so everything I say
below is my FINAL bracket projections.
Well, here we are. We suspect a tournament will happen this time. So of
course we're going to analyze it. But everything will be a little different
this year so we'll just have to try to adjust accordingly.
First, let's discuss briefly the early selection committee reveal in
February. This happened on February 13. My biggest takeaway is that the
Big 12 was ranked very highly by the committee. They have beaten each other
up and all ended up with lots of losses. Texas Tech and Oklahoma are the
biggest examples, with both racking up lots of losses. Texas Tech actually has
a losing record against Q1-Q3 (9-10), which is usually disqualifying for the
tournament in general, but people have them as a 6 seed. A 6 seed! They
are 4-10 vs. Q1. This kind of record usually gets into the realm of past
teams that have just thrown away too many chances. Most of those losses
were close though and people give them the benefit of the doubt.
Other general takeaways: The committee promised to still do their duty as
usual and not change their process due to the pandemic. This in particular
means they will still be looking at wins and losses of just what is on the
schedule, and not what "could have been". They did let slip that the
"eye test" is a little more prominent. This could be what went into the
Big 12. They also said road wins are still looked upon highly even in a
time with no fans in the buildings.
Here was the committee's Top 16:
1: Gonzaga, Baylor, Michigan, Ohio St
2: Illinois, Villanova, Alabama, Houston
3: Virginia, West Virginia, Tennessee, Oklahoma
4: Iowa, Texas Tech, Texas, Missouri
close: USC, Florida St, "other big 10 teams"
And my system for comparison:
1: Gonzaga, Baylor, Michigan, Ohio State,
2: Illinois, Virginia, Alabama, Houston,
3: Villanova, West Virginia, Iowa, Wisconsin,
4: Southern California, Florida State, Kansas, Tennessee,
5: Missouri, Colorado, Creighton, Oklahoma,
OK, so let's move on to the main bracket now, which will be revealed today.
As is tradition recently, I will put out and lock my projections on
Sunday morning. The reason for this is I have analyzed past tournaments,
and it seems to be true that the committee has basically locked in their
own bracket as of Sunday morning and only makes changes if a one-bid
league has an upset, or if there's a bid thief. This makes sense because
they have only an hour after the final games end to reveal their bracket,
and there is just not enough time to do revotes and make sweeping changes.
Here is my system as it stands now at noon Eastern:
1: Gonzaga, Illinois, Baylor, Michigan,
2: Alabama, Ohio State, Iowa, Arkansas,
3: Houston, Kansas, Oklahoma State, Purdue,
4: Texas, West Virginia, Louisiana State, Virginia,
5: Villanova, Southern California, Creighton, Brigham Young,
6: Tennessee, Colorado, Oregon, San Diego State,
7: North Carolina, Florida State, Missouri, Saint Bonaventure,
8: Virginia Commonwealth, Connecticut, Clemson, Loyola Ill,
9: Wisconsin, Georgia Tech, Texas Tech, Florida,
10: Rutgers, Colgate, Wichita State, Maryland,
11: Louisville, UCLA, Utah State, Michigan State, Oklahoma,
12: Syracuse, Virginia Tech, Georgetown, Winthrop, Oregon State,
13: North Texas, UC Santa Barbara, Ohio, UNC Greensboro,
14: Liberty, Abilene Christian, Morehead State, Cleveland State,
15: Eastern Washington, Iona, Hartford, Grand Canyon,
16: Drexel, Norfolk State, Mt St Marys, Oral Roberts, Texas Southern, Appalachian State,
Let's go over the main issues. My system has had some problems
this year, in particular with teams that didn't play a nonconference.
I didn't end up having time to try to work out some fixes so it's just
going to have to stand as it is. Here are the primary issues and
places where my system differs most from others:
Florida State 7 seed vs projected 4: My system sees that Florida St
doesn't have particularly good quality wins and has some bad losses.
I think people are looking past this with the eye test. If the NET
represents an eye test, they are 24th there, but 15th in KenPom. They
got really hot in January and I think people are still seeing that
version of the team.
Texas Tech: Already mentioned above. I have them at 9 and they are
a consensus 6. Same goes for Oklahoma, which is consensus 8 and I
have them at 11.
North Carolina: I have them at 7, projected is 9. They have a
lack of quality wins issue. My system has overrated teams like this
in the past so this is a known issue that needs to be looked into in
the offseason. Not sure though why Florida St gets a pass and UNC does
not. I think it is a visual thing with the Quadrants. Although Florida
St's best wins are very marginally better, they are actually Q1 so they
matter way more in people's eyes. This issue is not fixable in my current
setup since I am approximating the NET in my own system so slight
differences will be a huge deal.
Colgate: My system is duped by them nearly as much as the NET. They
are ranked #9 in the NET, and #12 in my rankings. This is because their
entire conference played no nonconference so there is not a good point
of reference. They have totally demolished everyone in conference. I
think they are a good team (maybe top 60) but not #12. Since my system
has a large reliance on the "eye test", which it counts as team strength,
it has Colgate at a 10 seed vs 13 or 14 projected. If I removed some
of the team strength reliance, it would harm teams like the Big 12 and
Houston, so there is no winning here. I maybe need to qualify the "eye
test" in future seed projections by making sure the high level play
is against sufficiently high competition.
Wichita St: This is my program's biggest potential miss. I have them
as a 10 seed, most have them either on the cut line or out. My system
is not blown away by any one metric but thinks they've done just
enough. It helps that my program has them ranked #47 vs #72 in the NET.
But is anyone really using the NET as a qualifier when it has the
obvious Colgate thing going on? Another thing going on here is
Wichita has not beaten any tourney teams besides Houston once at home.
Some of their wins are pretty decent, away wins vs. 50-100 rated teams,
but these don't move people's needles as much as they probably should.
With all of that said, here are my current projections (myself, not
my system):
1: Gonzaga, Baylor, Michigan, Illinois
2: Alabama, Ohio State, Iowa, Houston,
3: Arkansas, Texas, Oklahoma State, Purdue,
4: Kansas, West Virginia, Virginia, Villanova
5: Tennessee, Creighton, Southern California, Florida State,
6: Brigham Young, Colorado, Louisiana State, Oregon,
7: Texas Tech, Missouri, San Diego State, Clemson,
8: Connecticut, Wisconsin, Saint Bonaventure, Loyola Ill,
9: Virginia Commonwealth, Georgia Tech, Florida, Oklahoma,
10: Rutgers, North Carolina, Maryland, Virginia Tech,
11: Michigan State, Louisville, UCLA, Utah State, Syracuse, Drake
12: Georgetown, Winthrop, Oregon State, UC Santa Barbara,
13: North Texas, Ohio, UNC Greensboro, Colgate,
14: Liberty, Abilene Christian, Morehead State, Cleveland State,
15: Eastern Washington, Iona, Hartford, Grand Canyon,
16: Oral Roberts, Drexel, Norfolk State, Mt St Marys, Texas Southern, Appalachian State,
ENTRY 55 - 3-24-20 - 2020 Basketball Wrap-up - Part 2
It's time for Part 2 of the season ending analysis! In Part 1 yesterday,
I talked about the wrap-up of bracketology. Today, I explain the work
I've done on analyzing a non-existent tournament.
Let's break down our methodology first. I wanted to start with a
reasonable looking tournament that didn't have too many hot takes in it,
and did a reasonable job of following procedures. I looked on
Bracket Matrix for a bracket that held pretty close to consensus and also had
the whole bracket built with regions. I chose Inside the Hall by Andy Bottoms.
Of course, no projection was going to be perfect, and every bracket will
have issues with some teams. For example, this bracket has Kentucky as a 4
seed, which was a little unpopular. But we're going with it just to get
the ball rolling.
This bracket, as well as my system's team strengths, are now loaded
into my Bracket Simulator for the year 2020, as if it is the official
bracket. You can generate your own simulations and write your own fake
stories about the results if you want now.
I did make a few necessary modifications to Andy's bracket. 1) He listed
Little Rock twice on the original bracket I grabbed and left out another
auto-bid. I fixed this for him. 2) For each play-in game, I automatically
advanced my system's projected winner to the real bracket. His play-in
games, and my results, went as follows:
11 seed Xavier beats Texas
11 seed Richmond beats UCLA
16 seed Robert Morris beats NC Central
16 seed Siena beats Prairie View
Now, for the results! Usually when I do this, I'll list out the teams
in order by Final 4 chances and Title chances. Let's do it!
Final Four Breakdown
(1)Kansas: 47.7
(1)Baylor: 34
(1)Dayton: 29.1
(1)Gonzaga: 28.4
(2)San Diego State: 19.8
(3)Duke: 17.5
(2)Villanova: 17.4
(2)Florida State: 16.8
(3)Michigan State: 13.2
(2)Creighton: 11.4
(3)Seton Hall: 11
(3)Maryland: 10.5
(4)Kentucky: 10.3
(4)Oregon: 9.3
(4)Wisconsin: 9.1
(5)Ohio State: 8.9
(6)West Virginia: 7.6
(4)Louisville: 6.9
(5)Auburn: 6.6
(5)Brigham Young: 6.2
(7)Houston: 5.5
(7)Providence: 5.2
(6)Michigan: 5.2
(5)Butler: 4.8
(8)Arizona: 4.7
(11)Texas Tech: 4.3
(7)Illinois: 4.2
(10)Rutgers: 3.5
(6)Penn State: 3.2
(7)Virginia: 3.1
(6)Iowa: 3
(9)Marquette: 2.9
(8)Colorado: 2.9
(11)Richmond: 2.8
(9)Oklahoma: 2.5
(8)Saint Marys: 2.5
(9)Louisiana State: 2.3
(12)Cincinnati: 2.1
(9)Florida: 2
(10)Indiana: 1.9
(10)Arizona State: 1.7
(10)Utah State: 1.6
(11)East Tennessee State: 1.2
(8)Southern California: 1.1
(11)Xavier: 1
(13)Akron: 0.6
(12)Yale: 0.5
(13)New Mexico State: 0.4
(13)Vermont: 0.3
(14)Hofstra: 0.3
(12)Stephen F Austin: 0.2
(14)Bradley: 0.2
(14)Belmont: 0.2
(13)North Texas: 0.2
(15)Eastern Washington: 0.1
(12)Liberty: 0.1
(16)Northern Kentucky: 0.1
(14)UC Irvine: 0.1
(15)Winthrop: 0.1
(15)Arkansas Little Rock: 0.1
Champions Breakdown
(1)Kansas: 25.9
(1)Baylor: 10.2
(1)Gonzaga: 9.1
(1)Dayton: 8.4
(2)San Diego State: 4.5
(3)Duke: 4.2
(3)Michigan State: 3.9
(2)Villanova: 3.5
(2)Florida State: 3.4
(2)Creighton: 3
(3)Seton Hall: 2
(3)Maryland: 1.9
(4)Oregon: 1.8
(4)Louisville: 1.5
(5)Ohio State: 1.5
(4)Kentucky: 1.4
(4)Wisconsin: 1.3
(7)Houston: 1.2
(6)West Virginia: 1.2
(5)Auburn: 1
(5)Brigham Young: 0.9
(6)Michigan: 0.8
(7)Providence: 0.8
(8)Arizona: 0.8
(5)Butler: 0.6
(11)Texas Tech: 0.6
(7)Illinois: 0.5
(9)Florida: 0.4
(6)Iowa: 0.4
(6)Penn State: 0.4
(9)Louisiana State: 0.4
(9)Marquette: 0.3
(10)Rutgers: 0.3
(10)Utah State: 0.3
(7)Virginia: 0.3
(11)Richmond: 0.3
(8)Saint Marys: 0.3
(8)Colorado: 0.2
(9)Oklahoma: 0.2
(10)Indiana: 0.2
(12)Cincinnati: 0.2
(10)Arizona State: 0.1
(11)Xavier: 0.1
(8)Southern California: 0.1
(11)East Tennessee State: 0.1
This was shaping up to be a pretty wild year. The 1 seeds were extremely
weak by past standards, besides Kansas, which was slightly above average for
a 1 seed. The 2 seeds were also weak. Chaos could have erupted at any
moment. Let's try to quantify that by looking at some of the typical
metrics I study. Let's start with the number of teams that cleared each of
two probability thresholds, compared with previous years.
At least 10% Chance at Final 4:
2015 - 9
2016 - 18
2017 - 15
2018 - 13
2019 - 12
2020 - 13
At least 1% Chance at Final 4:
2015 - 28
2016 - 43
2017 - 43
2018 - 42
2019 - 34
2020 - 45
At least 10% Chance at Title:
2015 - 4
2016 - 4
2017 - 2
2018 - 2
2019 - 4
2020 - 2
At least 1% Chance at Title:
2015 - 12
2016 - 18
2017 - 24
2018 - 19
2019 - 16
2020 - 20
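The 2020 counts above can be reproduced directly from the breakdown lists. Here is a minimal sketch for the title thresholds, with the percentages copied from the Champions Breakdown and the list cut off after the last team at 0.8% (the function name is just for illustration):

```python
# Title chances (%) from the Champions Breakdown, truncated after 0.8%.
title_chances = {
    "Kansas": 25.9, "Baylor": 10.2, "Gonzaga": 9.1, "Dayton": 8.4,
    "San Diego State": 4.5, "Duke": 4.2, "Michigan State": 3.9,
    "Villanova": 3.5, "Florida State": 3.4, "Creighton": 3.0,
    "Seton Hall": 2.0, "Maryland": 1.9, "Oregon": 1.8, "Louisville": 1.5,
    "Ohio State": 1.5, "Kentucky": 1.4, "Wisconsin": 1.3, "Houston": 1.2,
    "West Virginia": 1.2, "Auburn": 1.0, "Brigham Young": 0.9,
    "Michigan": 0.8, "Providence": 0.8, "Arizona": 0.8,
}

def count_at_least(chances, threshold):
    """Count teams whose chance (in percent) is at or above the threshold."""
    return sum(1 for pct in chances.values() if pct >= threshold)

print(count_at_least(title_chances, 10))  # teams with at least a 10% title chance
print(count_at_least(title_chances, 1))   # teams with at least a 1% title chance
```

Running the same function over the Final Four list would give the other two 2020 rows.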
The numbers share some qualities of both upset-heavy and upset-light years.
Although lots of teams got small nibbles at the Final Four pie, Kansas
dampened a lot of the title chances for others. When picking brackets,
you would have wanted to send Kansas to the Final Four most of the time,
but anything goes in the other three regions.
We don't have a set of true results to look at, but we can at least compare
my system's favorites to a few other possibilities. One easy comparison
is vs. the Vegas futures just before the shutdown. I put the odds here,
as well as the implied percentage for easy comparison. It is worth mentioning
that adding up all of Vegas's implied percentages across all teams does not
result in a probability of 1. The total is closer to 1.6. That means many
of these percents are likely inflated by up to 30%. I should also mention
I consulted 3 different books to get an approximation here.
Vegas Implied Odds/Percents
Kansas: 5 to 1, implied 17%
Gonzaga: 8 to 1, implied 11%
Michigan St: 10 to 1, implied 9%
Dayton: 11 to 1, implied 8.3%
Kentucky: 12 to 1, implied 7.7%
Baylor: 12 to 1, implied 7.7%
Duke: 13 to 1, implied 7.1%
San Diego St: 18 to 1, implied 5.3%
Florida St: 18 to 1, implied 5.3%
Seton Hall: 20 to 1, implied 4.8%
Maryland: 25 to 1, implied 3.8%
Creighton: 25 to 1, implied 3.8%
Louisville: 25 to 1, implied 3.8%
These percents already add up to almost 95%, with plenty of decent
contenders left, so probably some of those 2-4 seeded teams are
overvalued. Kansas seems to be undervalued, even as a clear favorite,
because of their last few games which were closer than expected against
mediocre competition.
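For anyone who wants to check the implied percentages, here is a small sketch of the conversion from "N to 1" odds. The second function rescales by the book total of roughly 1.6 mentioned above, under the simplifying assumption that the book's margin is spread proportionally across all teams. Real books tend to pad longshots more than favorites, so treat that step as a rough illustration only.

```python
def implied_probability(odds_against):
    """Convert 'N to 1' odds into the implied win probability."""
    return 1.0 / (odds_against + 1.0)

def remove_overround(implied, book_total):
    """Rescale an implied probability, assuming the margin is proportional."""
    return implied / book_total

kansas = implied_probability(5)    # 5 to 1
kentucky = implied_probability(12) # 12 to 1

print(round(kansas * 100, 1), round(kentucky * 100, 1))
print(round(remove_overround(kansas, 1.6) * 100, 1))
```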
It is not surprising that San Diego St was undervalued and Kentucky
overvalued. KenPom had an article recently that spoke about
preseason expectations still having some predictive value in the
postseason. That seems to be borne out here. In spite of Kentucky
not having great metrics, the whole country was still convinced they
were destined for the Final Four.
We can also compare to title chances given by ESPN's BPI. Here are
BPI's top contenders at the moment of the shutdown, available in
an article written on ESPN (one of the many simulation teaser articles
referred to in the previous post):
Kansas: 18%
Duke: 17%
Gonzaga: 15%
Michigan St: 10%
Dayton: 7%
Baylor: 6%
Ohio St: 3%
San Diego St: 3%
Louisville: 3%
Maryland: 2%
The numbers are similar to mine, but in a slightly different order,
and obviously not as heavy on Kansas as mine. Here are KenPom's and
NET's top 10 teams at the time that the season ended, for more
comparison:
KENPOM: Kansas, Gonzaga, Baylor, Dayton, Duke, San Diego St, Mich St, Ohio St, Louisville, West Virginia
NET: Gonzaga, Kansas, Dayton, San Diego St, Baylor, Duke, Mich St, Louisville, BYU, Florida St
If I were actually making brackets for pools this year, I would follow
the advice I have given in the past: being conservative is generally best in
most pools that are not 100,000+ in size. I have developed a special setting
in my system for maximizing the chances of an elite-level
ESPN score, and it is quite conservative. I may have called it
EV Max before. The idea is to run 10,000 simulations and record
how many ESPN points each team is awarded, then take the average per team.
When simulating a game, the teams are then chosen in proportion to
their expected value instead of the original probabilities. This
methodology leaves nearly equal games as they are, while exaggerating
the favorite's chances when heavily favored. The result is almost
identical to running the system on SD=0.6, which is most of the way
toward Conservative on the slider. Here is how often I would have
picked each of the top 8 teams to make the Final Four or to be the
Champion in my brackets for various pools this year:
Final Four:
(1) Kansas 67%
(1) Baylor 50%
(1) Gonzaga 44%
(1) Dayton 43%
(2) San Diego St 27%
(2) Florida St 22%
(2) Villanova 22%
(3) Duke 20% (3rd place out of this region)
(3) Michigan St 14%
Champion:
(1) Kansas 53%
(1) Baylor 12%
(1) Gonzaga 9.4%
(1) Dayton 7.4%
(2) San Diego St 2.9%
(3) Michigan St 2.8%
(3) Duke 2.2%
(2) Villanova 2%
(2) Florida St 1.9%
In both cases, (2) Creighton did not make the top 8 list because
(3) Michigan St was favored over them in the Sweet 16 matchup,
and both were in the Kansas death region. Note
again the shape of the distributions, with higher title numbers than usual for
Kansas and the top 2 or 3 contenders, and lower numbers for basically
everyone else. My system had Kansas as such a high favorite that
it believed it would be crazy not to have them going to the Final Four
in most brackets for a pool, and it even found it reasonable
to lean heavily on them, picking them for the title in a slight
majority of brackets.
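To make the EV Max idea described earlier a little more concrete, here is a rough sketch of one way to read it. It assumes the standard ESPN scoring of 10, 20, 40, 80, 160, and 320 points per round, and it assumes a team's "expected value" is its average total points across all rounds. The advancement probabilities are made-up numbers for illustration, not output from the actual system, and the function names are hypothetical.

```python
# Standard ESPN points per correct pick, by round.
ESPN_POINTS = [10, 20, 40, 80, 160, 320]

def expected_points(advance_probs):
    """Average ESPN points for a team, given its chance to win each round."""
    return sum(p * pts for p, pts in zip(advance_probs, ESPN_POINTS))

def ev_pick_probability(ev_a, ev_b):
    """Chance of picking team A when teams are chosen in proportion to EV."""
    return ev_a / (ev_a + ev_b)

# Hypothetical chances of winning in rounds 1 through 6.
favorite = [0.90, 0.60, 0.40, 0.25, 0.15, 0.08]
underdog = [0.10, 0.02, 0.005, 0.001, 0.0, 0.0]

pick_fav = ev_pick_probability(expected_points(favorite),
                               expected_points(underdog))
print(round(pick_fav, 3))  # higher than the 0.90 single-game probability
```

Because the favorite also collects points in later rounds, its share of the expected value is larger than its single-game win probability, which is where the exaggeration comes from. Two teams with equal expected value would still be picked 50/50.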
That's all I've got for you today. I'm not going to go as far
as trying to pull in a simulation someone did and dissecting bracket
score thresholds as in usual years. I probably won't post again
until summer, when I plan to do some calibration work
to tweak some values, especially in Bracketology. Have a good
day!
ENTRY 54 - 3-23-20 - 2020 Basketball Wrap-up - Part 1
Well, here we are. No tournament to dissect and analyze, yet I am here,
analyzing a tournament that doesn't exist. I don't intend to make some fake
tournament simulation and pretend like it is the real tournament like many
others have done. However, I do think there is still some information to
be gleaned from the info we do have. So, on to the results!
First, the takeaways from the bracketology this season. The recording of my
thoughts here is mostly self-serving, but might be of interest to some people
anyway.
We will never know what the real committee would have done, but I can at least
compare my system's results to what other people thought. There were some
very strange cases that stood out this year and are worth breaking down here
for future use. Let's list them:
GONZAGA: For what seems like the 50th consecutive year, Gonzaga was a unanimous 1 seed
but my program did not think so. In previous years, it at least made sense -
Gonzaga was clearly a great team, and at some point just being great can allow you to
ignore resume flaws. But Gonzaga was not elite, especially earlier in the season,
according to my system. I fixed this early on by significantly increasing the
weight of overall record. I think a bonus may be called for in future years to
account for situations where a team has fewer than 3 losses late into the season.
It's definitely a far cry from a few years ago where all that mattered was
raw number of Q1 wins.
MARYLAND: My system overrated Maryland toward the end relative to others. I had
them as the #6 overall seed, whereas most had them a 3 or even 4 seed. There are a
number of factors working against Maryland - for example, advanced metrics (#11
at KenPom, #18 NET), total number of losses (7), road record (5-6). My system
got too caught up in the Q1 and Q2 records, in particular the quality wins (in
a veritable sea of chances).
OKLAHOMA: Most people had Oklahoma right near the cut-line for most of the season.
I had them comfortably in. They are a classic case of good-lossing their way in.
Some tweaking is definitely needed to recognize such cases.
WICHITA STATE: Another case like Oklahoma of bad Q1 record (2-7), but no terrible
losses. Like Oklahoma, I had them comfortably in most of the season and
projected as a 9 seed, but most people had them out. Why? Their two Q1 wins
were not seen as very good (@Oklahoma St, @UConn). These wins are actually pretty good
(there's a reason they are Q1) but people do not respect those sub-bubble road
wins as much as they should. The perception is that since these are not
tournament teams, they do not really count. I may need to decrease the "best win"
bonus for road games to simulate this.
RUTGERS: I had Rutgers being in serious trouble without the road win over
Purdue near the end. I think their 1-10 road record would have cost them
a bid. Others were not as down on them as I was. I don't think changes are needed
here.
TEXAS TECH: This one confuses me. Most people had them solidly in, but I
had them on the cut line or possibly out. Their 3-11 Q1 record smells like
Oklahoma to me. They also really feel like NC State/Clemson from last year,
with a sparkly NET and advanced metrics but no wins to show for it. Last
year in particular I got burned bad for rewarding teams with good metrics.
Hard to say here. Maybe their N-Louisville win just carries that hard.
FLORIDA: I feel like preseason expectations of this team competing for
the title brought them up without reason. Most people had them as a 9 seed
to end the season, I had them at 11. A 19-12 record in the SEC is not
inspiring. They really don't have great wins. There was a perception that
they were coming on strong later in the season, but timing of wins is not
supposed to matter. They did have a decent volume of wins against tournament
teams so maybe that is good enough, even if most of those wins were
at home or neutral.
TEXAS: My system was higher on Texas than most. That might have to do with
the fact that it didn't heavily consider the 30-40 point losses to marginal
teams. Really ugly results may be worth considering.
Let's suppose for now that the final Bracket Matrix ended up being the true
seeds for the tournament. This will not be perfect, because nobody has ever
come even close to the final bracket (measured in points on Bracket Matrix).
Theoretically a perfect score would be 6 points per team, or 408 points.
The typical best score out of 150 bracketologists for a year is 360 points.
This equates to typically missing on 1 team entirely, and being off by a
seed line on 20 or so other teams. That is a lot of error for the best
of the best!
But unfortunately, that's all we've got for this year. So what would my
final bracket score have been vs. the Bracket Matrix? Here's the breakdown:
Correct Selections: 66 of 68 (I have Xavier and Texas in, UCLA and Cincinnati out)
Correct Seed: 40
+/- 1 Seed Line: 19
Off 2 or more: 7
Final Score: 337
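The final score can be checked with a little arithmetic. This sketch assumes the scoring is 3 points for each correctly selected team, 3 more for an exact seed, and 1 for being within one seed line. That rule is an assumption here, but it is consistent with both the 6 points per team maximum and the 337 total above.

```python
def bracket_matrix_score(correct_selections, exact_seeds, within_one_line):
    """Score a bracket under an assumed 3 / 3 / 1 point scheme."""
    return 3 * correct_selections + 3 * exact_seeds + 1 * within_one_line

print(bracket_matrix_score(66, 40, 19))  # the breakdown listed above
print(bracket_matrix_score(68, 68, 0))   # a theoretically perfect bracket
print(bracket_matrix_score(67, 47, 20))  # miss 1 team, off by a line on 20
```

The last line is the "typical best" case described earlier: one team missed entirely and about 20 teams off by a seed line lands right around 360.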
This would have been my 2nd best score on Bracket Matrix and a decent effort,
but of course not optimal. It probably would have been below average.
I want to end the discussion here with some things I heard and learned
from the selection committee themselves, from the early reveal and David
Worlock appearing on several podcasts.
The early reveal happened on Saturday, February 8. The committee admitted
they did not take the previous day's results into account, so let's compare
to my seeds through Feb 6, using my current weights (which were changed
after that show):
COMMITTEE:
#1 Baylor, Kansas, Gonzaga, San Diego St
#2 Duke, Dayton, Louisville, West Virginia
#3 Maryland, Florida St, Seton Hall, Villanova
#4 Auburn, Oregon, Butler, Michigan St
Close misses: Iowa, Kentucky, LSU
MY SEEDS:
1: Baylor, Kansas, San Diego State, Duke
2: Louisville, Gonzaga, Maryland, West Virginia
3: Florida State, Auburn, Seton Hall, Dayton
4: Butler, Villanova, Penn State, Oregon
Close misses: Michigan St, Creighton, Marquette, Iowa
LSU = 7 seed, Kentucky = 6 seed
The committee gave Dayton a bump on the eye test, because they sure
didn't have results to back it up. Their best win all season was
N-St Marys. That is not 2-line material usually. Gonzaga also
got a bump for their road wins in the Pac 12, even though those teams
were a little suspect this year. Maybe they just didn't put their full
heart into this reveal? Apparently they didn't care about Kentucky's
gaffe against Evansville much. This leads me to believe Kentucky
probably would have been a 3 seed in the final reveal. People
were overall surprised by the lack of respect the committee had for
Penn St. LSU had a gaudy record (17-5, 8-1 SEC) but not really any
great wins to show for it; in fact, their best win at the time was either
H-Florida or N-Arkansas. Yikes. Black marks vs. Vanderbilt and
East Tennessee St too.
The impression I got from all of the podcasts I listened to
is that they really pay a lot of attention to the work teams
do away from home. I am tempted to create new metrics in my
system to track power rating performance in R/N only, as well
as a best-win weighting that only cares about R/N wins. On the
other hand, some of the comments above seem to fly directly in
the face of this idea. Who knows. This year more than
ever, who knows.
That's all for today. I'll come back tomorrow or at least some time
this week and do a Part 2, where I do some simulations with a mock
bracket I chose and see how everything turns out. And by simulations,
I mean like 10,000 of them to get big picture type results, NOT
single simulation teasers.
ENTRY 53 - 8-15-19 - Bracketology Difficulties
I promised in the last post that I would do some follow up on last year's
bracketology situation. I've been working hard to improve the accuracy
of the seed predictor but have uncharacteristically been having trouble this
time around.
To put it simply, I stated in my immediate reaction to last year's bracket
reveal that I "planted my flags in all of the wrong places". Primarily,
the flag I planted was the belief that with the creation of the NET rankings,
the committee would be putting a lot more stock into the computer strength
of teams. This turned out to be incorrect. For the most part, the
committee is still using the new NET only for the purpose of creating Team
Sheets and separating wins and losses into four Quadrants. The basic principle
of analyzing resumes based on wins and losses and who they were against seems
to be intact.
My bracket received 319 points on the Bracket Matrix, which is completely
unacceptable. I only placed ahead of 10 other bracket predictions out of
almost 200. I incorrectly placed 4 teams into the field, which is a personal
worst. My system had TCU, NC State, Indiana, and Clemson in the field. All
of these teams had a common feel to them - they are teams that had strong
computer numbers, but just lost a lot of games. Indiana and TCU had terrible
conference records at 8-12 and 7-11 respectively, but that did not doom
them alone because 7-11 Oklahoma got in (and as a 9 seed even). Clemson and
NC State had a different problem - their WL record and conference records
were not too bad, but they just didn't have a lot of marquee wins. They blew
nearly all of their opportunities.
This was a sentiment the committee cited again and again when interviewed
on selection day. They said "What did these teams do with their chances?
They did not have enough Quadrant 1 wins considering the number of chances
they received."
On the other hand, the 4 inclusions that I missed were
Ohio State, Temple, St Johns, and Belmont. Temple and Ohio State were my
first two teams out. Ohio State had an 8-12 conference record just like
Indiana. Ohio St also had a very similar mark vs. Quadrant 1.
Ohio St did not have as many marquee wins (Indiana won at Michigan St).
Indiana does have 15 losses vs 14 for Ohio St, but the committee showed
recently that a 15 loss Vanderbilt was able to get in. Indiana was
a little better in the computers. The only really stinky metric for
Indiana was RPI, which is apparently not used any more. Apparently.
I just don't understand that inclusion.
Temple was only 2-5 vs Quadrant 1 with little of consequence in terms
of quality wins. They did win at home against Houston. Their best win
away from home is Davidson neutral. They never scored a road win against
a tournament team. They also were a decent 6-2 vs. Quadrant 2.
Belmont ran through the OVC but lost in the conference championship game.
They only played 3 Quadrant 1 games but won 2 of them. This I think is
what the committee was looking at in terms of "making the most of their
chances". I don't personally see how that offsets playing almost 2/3 of
your games in the Quadrant 4 zone and going 8-5 in the other 3 Quadrants.
Saint Johns is the worst of all. They went 8-10 in a mess of a Big East.
Their computer numbers (and even the NET) put them down in the vicinity
of #70 or so. Their non-con SOS was pretty bad (around 200). So they had
good wins to offset all of this, right? Well, they won at Marquette. And
they beat Villanova at home. That's pretty much it for really impressive
results. They are only 5-4 against Q2, including winning only 1 out of 3 vs.
Depaul during the year. 5-7 road/neutral record. I don't see where
the appeal is here.
So what can be done? Maybe there is some factor I haven't considered that
can explain some of these differences?
The most likely culprit is the NET. I am currently using an approximation
of the NET to make my quadrants, but it is not perfect. Just look at
Indiana. I have Indiana as 4-3 in my approximation of Q2. But the real
NET has them at 3-6 against Q2. This may have been a big factor against
them. It shouldn't, really, but it does. Just a few teams sitting on
the fence between classifications can really swing a stat like that
and make it look much worse. A few Q2 wins becoming Q1 wins can make
things a lot more pleasant. Example - Clemson was 1-10 against the NET
Q1. This really presents a strong case against them. However, in my
system's approximation, Clemson was 3-12 against Q1. My approximation
counted wins at South Carolina and home against Syracuse as Q1. It
also counted losses N-Creighton and at Miami in Q1. 3-12 is still not
great, but rings better than 1-10.
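To show how a small ranking difference flips a game between quadrants, here is a short sketch using the published quadrant cutoffs (home 1-30 / neutral 1-50 / away 1-75 for Q1, and so on). The opponent ranks in the example are hypothetical and are not the actual numbers for any team discussed here.

```python
# Upper rank limits for Q1, Q2, Q3 by game site; anything worse is Q4.
QUADRANT_CUTOFFS = {
    "H": (30, 75, 160),   # home
    "N": (50, 100, 200),  # neutral
    "A": (75, 135, 240),  # away
}

def quadrant(opponent_rank, site):
    """Return the quadrant (1-4) of a game from opponent rank and site."""
    for q, limit in enumerate(QUADRANT_CUTOFFS[site], start=1):
        if opponent_rank <= limit:
            return q
    return 4

# Hypothetical road win: opponent is #74 in an approximation, #78 in the NET.
print(quadrant(74, "A"))  # Q1 in the approximation
print(quadrant(78, "A"))  # Q2 in the real NET
```

A four-spot difference in the opponent's rank is enough to turn a Q1 road win into a Q2 win, which is exactly the kind of fence-sitting that moves a record like 3-12 to 1-10.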
There are some options here. It is against the philosophy of my system
to take in another system's rankings as input to make decisions. I am
trying to use only game results to make all system rankings. Thus I
cannot look at the current NET and have to make an approximation. But
I can at least study the NET and make my approximation the best I can.
I took note of the NET rankings at 6 different times during the year
and can run some analysis to try to line up an approximation of it to
best match it during these 6 times. This will take time and requires
creating new infrastructure and heavy programming. I'll need at least
a month for this option.
The other option is to do the best with what I have and change my weights
to best match what happened. I might be able to get 2 of the 4 missed
teams in and leave the other two as lost causes. This is what I have
been doing for the last few days. If I just use the weights from the
2018 season's Bracket Matrix, I already produce an improvement to a score
of 326, a 7 point jump. By further running a genetic algorithm from here
to test different weight configurations,
I was able to find a set of weights to produce 332. This is the best
I've done so far, and is still not even top 50% on Bracket Matrix. The
main changes I needed to make to weights to produce this higher
score make a lot of sense given the committee's remarks: I raised
the weight of % vs. Q1 and Q2, decreased the weight of computer strength,
and put more emphasis in non-con SOS.
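For readers unfamiliar with genetic algorithms, here is a bare-bones sketch of the general technique applied to a weight vector. It is not the actual code used for these results. The `toy_score` function is a made-up stand-in for "Bracket Matrix score produced by these weights", and all names and parameters are hypothetical.

```python
import random

def evolve_weights(score_fn, start, generations=50, pop_size=20,
                   mutation_scale=0.1, seed=0):
    """Search for higher-scoring weights by selection, crossover, mutation."""
    rng = random.Random(seed)
    # Start from the current weights plus randomly perturbed copies.
    population = [list(start)] + [
        [w + rng.gauss(0, mutation_scale) for w in start]
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=score_fn, reverse=True)
        parents = ranked[: pop_size // 4]  # keep the best quarter unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Take each weight from one parent at random, then mutate it.
            child = [rng.choice(pair) + rng.gauss(0, mutation_scale)
                     for pair in zip(a, b)]
            children.append(child)
        population = parents + children
    return max(population, key=score_fn)

def toy_score(weights):
    """Made-up objective: closeness to an arbitrary target weight vector."""
    target = [0.5, 0.2, 0.3]
    return -sum((w - t) ** 2 for w, t in zip(weights, target))

start = [1.0, 1.0, 1.0]
best = evolve_weights(toy_score, start)
print(toy_score(start), toy_score(best))
```

Because the best quarter of each generation is carried over unchanged, the best score can never get worse from one generation to the next, which is why starting from the 2018 weights is a safe baseline.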
So what if there is some ranking factor I have not considered yet? One
metric I have been playing with for the last year is a Strength of
Record metric. I have made a simple SOR metric but it has some flaws
and still needs time to develop. The main problem is it's difficult to
produce a good SOR number that is also not computation intensive. Another
metric I already have available and can run tests on is Average Win and
Average Loss. This again would ideally use NET rankings, but I
am forced to use my approximation. It is possible these could provide
some value in moving teams about. The SOR in particular seems to
have promise and can explain moving some of the teams into and out of
the field. For example, St. Johns scores relatively highly in my
current version of SOR and weighting this higher could get them in.
Clemson and TCU have low SOR scores.
In conclusion on this post, there is a lot of work to do to get the
system generating accurate seeds and it seems to all start with trying
to approximate the NET the best I can using only my own system's tools.
You will not see an update here for at least several months, possibly
even until during the basketball season.
ENTRY 52 - 8-11-19 - Does my system do better historically than just using typical seed strength?
It is time to see if my observations from the last few NCAA Tournaments have
any merit in the long run.
To summarize my observation, it seems that in most recent years, my system
would generate much better brackets if I completely ignored my own (and all
other) computer power rankings and just trusted the seeds given by the
committee (along with what those seeds imply in terms of strength). This
can't possibly be true, can it? The committee must seed based on resume, which
would seem to be disconnected from strength somewhat. There have been
obvious cases recently where teams were not seeded even close to correctly
based on strength (some mid majors like Wichita St and Gonzaga come to mind).
My methodology here will be to use the same SD setting (frequency and size
of upset picks) and run two sets of 10,000 simulations - one with my system's
assigned team strength, and one with typical seed strength. I am running fewer
simulations than before because I do not need enough simulations to ensure
results with very high scores. It would also be very time consuming to do
the analysis with 100,000. I will be looking primarily at the 90th, 95th,
and 99th percentiles for each set of simulations to do the comparison. All
simulations will be done at SD=0.65 (a setting that has been good at producing
results at around the 95-99 percentile).
Unfortunately, I do not have ESPN data for any of these years so I cannot
say whether I am outperforming the public, only how the two settings compare against each other. The 2015 tournament
was already analyzed in Blog Posts 7-10. Keep in mind if you look at that data
that I changed my SD setting definition after that year, so a SD=1.4 in 2015 is about
equivalent to SD=1.0 now.
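As a reference for how the percentile columns below can be read, here is a minimal sketch that pulls a percentile score out of a batch of simulated bracket scores using the nearest-rank method. The scores in the example are made up, and the actual system may compute its percentiles slightly differently.

```python
import math

def percentile_score(scores, pct):
    """Nearest-rank percentile: the score that pct% of brackets are at or below."""
    ordered = sorted(scores)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Made-up ESPN-style scores for 20 simulated brackets.
scores = [610, 720, 540, 880, 930, 700, 650, 1010, 760, 590,
          820, 1240, 670, 790, 1100, 560, 980, 740, 850, 690]

for pct in (90, 95, 99):
    print(pct, percentile_score(scores, pct))
```

With 10,000 simulations per setting, the 99th percentile is the score beaten by only about 100 of the brackets.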
Scores at the 99.9th, 99th, 95th, and 90th percentiles (R = Regular wins the year, T = Typical wins, tie = virtual tie)
2015
Reg - 1580 - 1440 - 1240 - 1090
Typ - 1580 - 1500 - 1370 - 1270 T
2014
Reg - 1120 - 880 - 760 - 710 R
Typ - 830 - 770 - 720 - 690
Best in Family Group 860
95% ESPN = 770
2013
Reg - 1520 - 1380 - 1240 - 1180 R
Typ - 1360 - 1200 - 1120 - 1080
Top Group 1160
2012
Reg - 1620 - 1520 - 1400 - 1340 R
Typ - 1550 - 1440 - 1270 - 1190
Top Group 1440
2011
Reg - 1130 - 780 - 620 - 580
Typ - 1120 - 770 - 630 - 590 tie
2010
Reg - 1390 - 1250 - 1140 - 1060
Typ - 1360 - 1240 - 1160 - 1110 tie
2009
Reg - 1570 - 1400 - 1260 - 1180
Typ - 1640 - 1540 - 1380 - 1310 T
2008
Reg - 1680 - 1580 - 1440 - 1330
Typ - 1670 - 1580 - 1450 - 1350 tie
2007
Reg - 1690 - 1590 - 1460 - 1380
Typ - 1730 - 1640 - 1520 - 1420 T
2006
Reg - 1370 - 1100 - 850 - 780 R
Typ - 1230 - 930 - 830 - 760
2005
Reg - 1560 - 1470 - 1340 - 1240 R
Typ - 1500 - 1390 - 1300 - 1170
2004
Reg - 1510 - 1300 - 1090 - 930 R
Typ - 1360 - 1240 - 970 - 840
2003
Reg - 1350 - 1080 - 940 - 870 R
Typ - 1230 - 1030 - 920 - 860
2002
Reg - 1470 - 1330 - 1230 - 1180
Typ - 1440 - 1330 - 1250 - 1200 tie
2001
Reg - 1600 - 1500 - 1360 - 1270 R
Typ - 1570 - 1450 - 1300 - 1220
So there you have it. If you include also the 5 most recent years, that makes
19 years of data. The number of wins for each system is:
Regular - 9
Typical Strength - 6
Virtual Tie - 4
And the regular program comes out on top! A caveat - a lot of its victories
came at the very beginning of the data, when the committee didn't really
know what it was doing. In more recent years they have done a better job
of seeding, and the seed strength has done better. I guess the only way to tell
for sure is to let things run for about another 50 years and have some solid
data to work with. For now, I think it is most prudent when generating
brackets for my pools to do a mix of program and seed strength until other
evidence emerges.
The regular strength program came out on top exactly when you'd expect - when
it made a call on an underseeded, strong 2 or 3 seed that ended up paying
off and making the Final Four. The Typical Strength system won out when
the tournament went mostly to form in terms of seeds, or when the upsets
that happened were seemingly random. They tended to be pretty close when
there was mass chaos and everyone's brackets were hurting regardless of
what they thought about any of the teams.
This concludes my current look into my bracket generator. For the next
post, I will be doing some work in analyzing the bracketology system.
The 2019 bracket showed that the committee may be changing its views on
what the important criteria are. I must do the same to match them.
ENTRY 51 - 8-10-19 - System Bracket Generation Performance (2016)
This is a continuation of the last four posts, in which I will be analyzing my
bracket generator over the last few tournaments to answer the questions: Does
my system generate better brackets than humans? And what is the best strategy
and level of upsets to pick for different sizes of pools?
The 2016 tournament is our focus today. Note that some aspects of my previous
years' analysis may be missing because I collected less data in the earlier
tournaments. Let's start the discussion by reminding ourselves what actually
happened in the 2016 tournament, and what advice I gave in my analysis after
the bracket reveal (Blog Entry 29).
Kansas came into the tournament as the favorite. UNC was also right there.
Virginia and Oregon received 1 seeds but nobody knew whether to trust them yet.
The first round saw about an average distribution of upsets, besides the
surprise 15 seed victory of Middle Tennessee over Michigan State. This was
especially surprising because most had Michigan St winning that region. The Sweet
16 was pretty average, with (10) Syracuse and (11) Gonzaga the double digit
representatives and facing each other. Gonzaga was probably strong enough to be a 6 or 7 seed
so this was not totally surprising. Syracuse ended up making the Final Four, together with
blue bloods (2) Villanova, (2) Oklahoma, and (1) UNC. Villanova defeated
UNC in the title game.
Here is some of the advice I gave, and how it held up this year. Note
that Blog Entry 30 and 31 already have some analysis of the most likely
upsets so I will not repeat this, but will speak more about the bigger
picture.
- "The top of the field is more wide open. Definitely cover your bases
when choosing title teams in multiple brackets" (Good advice if a 1 seed didn't win I'd say)
- "Go with some crazy. Would not be surprised to see some double digit seeds in the final
four this year." (Yep! I probably wouldn't have picked Syracuse in particular, but I'll take it)
- "Kansas is the favorite with about an 18% chance to win the title."
(They lost in the Elite 8, which isn't ideal)
- "If you want to pick some 1 or 2 seeds to lose early, Oregon and Xavier are
good targets." (Xavier lost after 1 round, Oregon in the regional final, so yay)
- "I think UNC is vulnerable, not because they are bad, but because the Indiana/Kentucky
winner and West Virginia will both be very dangerous" (Nope on all counts here)
- Some late round outright seed upsets called:
- (3) West Virginia over (2) Xavier (both teams lost early so we'll call it a wash)
- (2) Michigan State over (1) Virginia, then (1) UNC (yikes! this will hurt)
So there was some good and some bad advice this time around. As usual,
the bad advice tended to be the advice that cost more bracket points, so
I suspect our analysis will give another year where the public is beating
my generation results. Hard to say for sure, because I remember the public
having the same blind spots as my system.
What are the chances of a perfect bracket with SD=1.0 through each round?
Round 1 - 1 in 128 million (historically quite unlikely due to Middle Tennessee)
Round 2 - 5.49 x 10^-12
Round 4 - 1.76 x 10^-15
Perfect Bracket - 3.6 x 10^-16 (A bit low, but most of this is due to the one big upset)
Probability of perfect bracket maximized at SD=1.08, P = 3.74 x 10^-16
The SD=1.08 marks this year as fairly average besides the Middle Tennessee
upset. Without that it would have been pretty close to 1. Syracuse did
also reduce the probability, but keep in mind the rest of the bracket
went pretty much to form to compensate.
I do not have access to the ESPN public's selection rates for each Final
Four team. I can at least show you my own and compare to previous years.
My system's probability
of picking each Final Four team correctly, as per Blog Entry 29:
UNC 30.4%
Oklahoma 22.9%
Villanova 20.8%
Syracuse 1.1%
All Four: 1 in 6278
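The "All Four" figure is just the product of the four individual probabilities, treating the regions as independent, turned into "1 in N" form. A quick sketch of that arithmetic (the function name is only for illustration):

```python
import math

def one_in_n(probabilities):
    """Return N such that the joint chance of all events is about 1 in N."""
    return round(1 / math.prod(probabilities))

# UNC, Oklahoma, Villanova, Syracuse at SD=1.0
print(one_in_n([0.304, 0.229, 0.208, 0.011]))
```

Syracuse's 1.1% does almost all of the damage here: the other three teams together are only about 1 in 69.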
My system's top two most likely teams to make the Final Four,
(1) Kansas and (2) Michigan St, both took hits early. This means
decreasing the upset level is unlikely to help much. The
composition of teams here is once again nearly identical to 2017 and
2018. Let's try with SD=0.65 and see what that looks like:
UNC 38.3%
Oklahoma 29.7%
Villanova 15.5%
Syracuse 0.2%
All Four: 1 in 28358
Just as in previous years, the more conservative setting helps get
the two best teams in, but the hit to Syracuse hurts too much for
perfect bracket chances. This means we expect conservative to again
be better at lower score thresholds, but worse at higher thresholds.
What about the new Typical Seed Strength setting discussed in the last
blog post? It might help a lot here to kick Michigan St back a rung
and help get other teams in. The percentages with this setting look like:
UNC 63.7%
Oklahoma 26%
Villanova 24.6%
Syracuse 0.1%
All Four: 1 in 24544
Well, two of the teams are much higher. Syracuse being lower ultimately
doomed our chances of perfect brackets, but having good mid-level brackets
with this setting is something I would highly expect at this point.
Time for the percentiles! We must discover which bracket setting is
best for each threshold goal, and whether we've outdone the public. As a reminder,
the first post includes more information about what you see in this table, but
just know that for each score threshold (row), the lowest number is the best. ESPN
had about 13 million brackets generated this year. My system had 100,000 simulations
done at each setting.
Points | ESPN | SD 0.5 | SD 0.65 | SD 0.8 | SD 1 | SD 1.2 | SD 0.65 Typical Seed |
1700 | (Rank 9) | (none) | (none) | (none) | (none) | (none) | (none) |
1660 | (Rank 38) | (none) | (none) | (none) | (none) | (none) | (none) |
1620 | 0.99999 | 0.99999 | (none) | 0.99999 | (none) | (none) | 1 |
1580 | 0.99993 | 0.99998 | (none) | 0.99999 | 0.99998 | 0.99999 | 0.99994 |
1540 | 0.99971 | 0.99983 | 0.99985 | 0.99989 | 0.99992 | 0.99993 | 0.99965 |
1500 | 0.99920 | 0.99927 | 0.99945 | 0.99953 | 0.99968 | 0.99981 | 0.99874 |
1460 | 0.99846 | 0.99843 | 0.99866 | 0.99897 | 0.99917 | 0.99953 | 0.99731 |
1420 | 0.99745 | 0.99738 | 0.99763 | 0.99793 | 0.99854 | 0.99894 | 0.99518 |
1380 | 0.99625 | 0.99578 | 0.99606 | 0.99637 | 0.99749 | 0.99810 | 0.99268 |
1340 | 0.99494 | 0.99417 | 0.99406 | 0.99445 | 0.99573 | 0.99681 | 0.98998 |
1300 | 0.99358 | 0.99218 | 0.99163 | 0.99224 | 0.99353 | 0.99515 | 0.98711 |
1260 | 0.99173 | 0.98969 | 0.98890 | 0.98964 | 0.99120 | 0.99296 | 0.98374 |
1220 | 0.98911 | 0.98675 | 0.98536 | 0.98630 | 0.98801 | 0.99017 | 0.97962 |
1180 | 0.98526 | 0.98239 | 0.98060 | 0.98185 | 0.98404 | 0.98643 | 0.97359 |
1140 | 0.98085 | 0.97741 | 0.97467 | 0.97572 | 0.97831 | 0.98175 | 0.96642 |
1100 | 0.97579 | 0.97140 | 0.96677 | 0.96805 | 0.97126 | 0.97554 | 0.95762 |
1060 | 0.97000 | 0.96463 | 0.95828 | 0.95874 | 0.96215 | 0.96734 | 0.94782 |
1020 | 0.96200 | 0.95784 | 0.94930 | 0.94948 | 0.95222 | 0.95755 | 0.93645 |
980 | 0.94700 | 0.94909 | 0.93938 | 0.93933 | 0.94189 | 0.94740 | 0.91849 |
940 | 0.91800 | 0.93312 | 0.92579 | 0.92752 | 0.93128 | 0.93671 | 0.88639 |
900 | 0.88000 | 0.90453 | 0.90525 | 0.91178 | 0.91885 | 0.92613 | 0.83501 |
860 | 0.82000 | 0.86491 | 0.87543 | 0.88774 | 0.90202 | 0.91316 | 0.76535 |
820 | 0.76500 | 0.81421 | 0.83554 | 0.85429 | 0.87690 | 0.89394 | 0.68235 |
780 | 0.70000 | 0.75475 | 0.78495 | 0.81381 | 0.84356 | 0.86714 | 0.59668 |
740 | 0.63300 | 0.69318 | 0.72912 | 0.76326 | 0.80204 | 0.83356 | 0.51561 |
700 | 0.55800 | 0.62514 | 0.66908 | 0.70683 | 0.75292 | 0.79251 | 0.43371 |
660 | 0.47500 | 0.54189 | 0.59764 | 0.64269 | 0.69800 | 0.74456 | 0.34506 |
620 | 0.38700 | 0.43963 | 0.51137 | 0.56859 | 0.63385 | 0.68787 | 0.25059 |
580 | 0.30000 | 0.32783 | 0.40819 | 0.47766 | 0.55781 | 0.62044 | 0.16310 |
540 | 0.21900 | 0.21535 | 0.29813 | 0.37454 | 0.46435 | 0.53966 | 0.09211 |
500 | 0.15100 | 0.11732 | 0.19080 | 0.26511 | 0.35798 | 0.44158 | 0.04195 |
460 | 0.09900 | 0.05141 | 0.10267 | 0.16368 | 0.24957 | 0.32947 | 0.01519 |
420 | 0.06400 | 0.01674 | 0.04515 | 0.08333 | 0.15096 | 0.21902 | 0.00433 |
380 | 0.04200 | 0.00390 | 0.01479 | 0.03384 | 0.07431 | 0.12571 | 0.00097 |
340 | 0.02900 | 0.00045 | 0.00310 | 0.01049 | 0.02908 | 0.05864 | 0.00020 |
There are some interesting things to discuss here. First, the Typical Seed Strength
setting again outperformed all other columns significantly, except perhaps at
the very high end. This was to be expected with the nature of some of the upsets
in the tournament. (2) Michigan St was liked by both system and public, but
Typical Strength didn't care.
More interesting than that, for once the SD=0.5 setting is not obviously the
best of my system settings this time. It was the best of them on the ranges
0-900 and 1380+, but there is a gap in the middle where SD=0.65 was the best.
The public also underperformed my systems during the interval 980-1460. If picking
all top seeds to win, the bracket score would have been about 720. The ranges above
represent significant improvement, through either getting 2 more Final Four teams in,
or getting the champion correct.
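The "which setting is best on which range" comparison can be checked by hand from the table. Here is a minimal sketch using three rows copied from the table above (regular SD settings only). The helper name `best_setting` is just for illustration and is not part of my generator.

```python
# Percentiles copied from three rows of the table above.
# Lower is better: fewer brackets fell at or below that score.
rows = {
    900:  {"SD 0.5": 0.90453, "SD 0.65": 0.90525, "SD 0.8": 0.91178,
           "SD 1": 0.91885, "SD 1.2": 0.92613},
    1100: {"SD 0.5": 0.97140, "SD 0.65": 0.96677, "SD 0.8": 0.96805,
           "SD 1": 0.97126, "SD 1.2": 0.97554},
    1380: {"SD 0.5": 0.99578, "SD 0.65": 0.99606, "SD 0.8": 0.99637,
           "SD 1": 0.99749, "SD 1.2": 0.99810},
}

def best_setting(percentiles):
    """Return the column name with the lowest percentile in one row."""
    return min(percentiles, key=percentiles.get)

best = {points: best_setting(cols) for points, cols in rows.items()}
for points, name in best.items():
    print(points, name)
```

SD 0.5 comes out on top at 900 and 1380, with SD 0.65 taking the 1100 row in between.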
For elite level brackets, my system faltered this time. Out of all 6 settings
and a total of 600,000 simulations, only two brackets would have cracked the ESPN
top 100 (both of them with score 1640). By multiplying by 22 to simulate producing
the same 13 million brackets that ESPN did, I only would have had 44 top-100 worthy
brackets.
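The scaling arithmetic here is simple enough to write out. This sketch only restates the numbers in the paragraph above; the variable names are made up for illustration.

```python
espn_brackets = 13_000_000   # approximate ESPN total quoted above
simulated = 6 * 100_000      # six settings, 100,000 simulations each
top100_hits = 2              # simulated brackets good enough for the ESPN top 100

# Scale the simulation count up to the size of the ESPN pool.
scale = round(espn_brackets / simulated)
projected_hits = top100_hits * scale
print(scale, projected_hits)
```

With only two hits to scale up from, the projection is rough at best.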
My family group had a maximum of 1120 and about 50 brackets. At SD=0.65 and
generating 50 brackets, I would have had a 78% chance of producing a better
bracket than 1120. With SD=0.65 and Typical Seed Strength, this increases to 85%.
The median 50-bracket pool in this scenario would have had a high score of 1320.
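For anyone who wants to reproduce numbers like these from the table, here is a rough sketch. It assumes the 50 brackets are independent draws from the generator, and it uses linear interpolation between the 1100 and 1140 rows as a stand-in for the exact percentile at 1120 (the actual figures come from the raw simulations, so small differences are expected). The function names are hypothetical.

```python
def percentile_at(score, lo, hi):
    """Linearly interpolate the 'at or below' fraction between two table rows."""
    (s0, p0), (s1, p1) = lo, hi
    return p0 + (p1 - p0) * (score - s0) / (s1 - s0)

def chance_to_beat(score, lo, hi, n_brackets):
    """P(at least one of n independent brackets scores above `score`)."""
    p = percentile_at(score, lo, hi)
    return 1 - p ** n_brackets

# Rows 1100 and 1140 from the SD 0.65 and Typical Seed Strength columns.
sd065 = chance_to_beat(1120, (1100, 0.96677), (1140, 0.97467), 50)
typical = chance_to_beat(1120, (1100, 0.95762), (1140, 0.96642), 50)
print(f"SD 0.65: {sd065:.2f}, Typical Seed Strength: {typical:.2f}")
```

This lands at roughly 0.77 and 0.86, in line with the 78% and 85% quoted above.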
The only other group I monitored was an ESPN group of size 550 brackets. The
high score there was 1480. If my system produced 550 brackets, its chances of
beating 1480 at SD=0.65 are about 40%, and at SD=0.65 Typical Strength are about
60%. I noted in my data that the top scores in that group were 1480, 1460, then
the third highest was 1260. Also, 14 of the 550 brackets chose Villanova as the
champion (about 2.5%). If we assume this approximates the percentage of all ESPN brackets
choosing Villanova, this explains the underperformance we noted in the public brackets
in the middle of the score range. My system's actual odds for Villanova winning were
6.7%, and SD=0.65 chose Villanova 4.5% of the time, both better than the public.
IN SUMMARY: There definitely seems to be something to ignoring
power rankings and using Typical Seed Strength in some of the generated
brackets. I think at SD=0.65 my system barely squeaked out a win over the
public because on the range most people would be interested in, it had
a higher chance to produce such brackets. Typical Seed Strength once again
just demolished everyone.
I think for tomorrow's post I'll do some research going back much earlier to
see if the typical seed strength setting outperforms over a long period of time.
I can go back about 20 years, and although this is a small sample size, it's the
best I can do right now. I must know which one is better over time!
ENTRY 50 - 8-9-19 - System Bracket Generation Performance (2017)
This is a continuation of the last three posts, in which I will be analyzing my
bracket generator over the last few tournaments to answer the questions: Does
my system generate better brackets than humans? And what is the best strategy
and level of upsets to pick for different sizes of pools?
Today we analyze the 2017 tournament. Note that I did not collect quite
as much data from tournaments that happened longer ago, so parts of the analysis
I did on the 2018 and 2019 tournaments may be missing.
Let's start 2017 with a brief reminder of what transpired during the tournament,
and how my advice for bracket picking from Blog Entry 36 holds up.
The tournament strikes me as a pretty average tournament. (1) Villanova and (1) UNC
were expected by the public to fight for the title. I don't have information
on which one was favored by the public to win it all. Villanova bowed out quite
early to (8) Wisconsin and that region ended up going crazy, with (7) South
Carolina coming out of the mix. (1) Gonzaga and (1) UNC survived their regions
to make the Final Four, and the last member was (3) Oregon, which limped into
the tournament with an injured big man and was a popular pick to lose early.
The only other significant upset run was (11) Xavier through to the Elite 8.
Wichita was an interesting dark horse before the tournament with a 10 seed but
the strength of more like a 3 seed, but could not beat (2) Kentucky.
How did my advice for picking the bracket turn out?
- My system had Villanova and Gonzaga as co-favorites to win it all at 12% each.
Gonzaga held up their end of the bargain and made the title game, but Villanova
lost early. UNC was my 4th highest title team at 7.9%.
- I stated that the power level was fairly flat and many teams had a shot
at a Final Four and a title. Although there were some early 1v8 and 2v7 upsets,
the Final Four teams were mostly standard.
- I observed a drop off in power at around the 7-8 seed lines, likely meaning
fewer upsets in the first round. This was spot on, as there was only one 12-seed
upset and no upsets involving 13+ seeds.
- I did observe in my summary to expect probably 1 big upset (14+ seed), but this
did not happen.
- I recommended using higher SD numbers this tournament because of chaos that
would break out after the first round. I think this mostly did not happen, as
in the second round there were 4 upsets in 16 games, and things went mostly
to form toward the end of the tournament.
- My system called outright for the following seed upsets:
- (10) Marquette over (7) South Carolina (real bad call)
- (9) Vanderbilt over (8) Northwestern (nope, but minimal bracket impact)
- (10) Wichita over (7) Dayton (yes, minimal bracket impact)
- (6) SMU over (3) Baylor (SMU lost first round to USC)
- (2) Kentucky over (1) UNC (very bad, calls for elimination of eventual champion)
- (2) Kentucky over (1) Kansas (doesn't matter)
There was some good advice in there about avoiding big upset picks, but most
of the other advice (derived from system simulations) turned out to be pretty
smelly. Based on our 2019 and 2018 analysis we've already done, using higher
SD numbers is probably always a ticket for disaster. Calling for UNC to lose
early is really going to hurt our high end bracket chances.
Let's move on to chances of perfect bracket at each round (SD=1):
Round 1 - 1 in 127,796 (extremely likely historically)
Round 2 - 1 in 2.4 billion
Round 4 - 9.18 x 10^-14
Perfect Bracket - 1.83 x 10^-14 (right about average)
Probability of perfect bracket maximized at SD=0.73, P = 3.29 x 10^-14
Not much to say here. Round 1 was one of the highest probabilities I've seen,
which means very few upsets. Things went crazy in Round 2, then leveled out again.
Let's look at final four incidence rates next. My system's probability
of picking each Final Four team correctly, as per Blog Entry 36:
Gonzaga 34.2%
UNC 26.5%
Oregon 16.7%
South Carolina 1.7%
All Four: 1 in 3887
This composition is very similar to the 2018 group, with three more likely
candidates and an unlikely one in South Carolina. Compare this to the public's
chances at getting each one right as per ESPN data:
Gonzaga 34.6%
UNC 43.0%
Oregon 11.0%
South Carolina 0.6%
All Four: 1 in 10184
Just like 2018, the public had a good eye for UNC, but failed to give South
Carolina quite as much credit as they deserved. The public's chances of getting
2 or 3 teams correct are higher, but chances of getting all 4 correct are lower.
I unfortunately do not have data on how many real ESPN brackets had the final
four correct.
Let's also investigate my system's Final Four percentages if I set
it to SD=0.65, a much more conservative setting:
Gonzaga 47.3%
UNC 36.3%
Oregon 18.8%
South Carolina 0.2%
All Four: 1 in 15490
As has been a recurring theme, this set of percentages should make my
system superior to the public for getting 3 teams right, but is the worst of
all at getting all 4 of them because South Carolina is hurt so much.
Bring on the percentiles, and let's find out which system wins at each
threshold! As a reminder, the previous two posts explain what the numbers
in the chart mean, and in general, lower numbers are better. ESPN had a total
of 18.8 million brackets in its challenge this year. My system ran
100,000 simulations at each setting in the table.
Points | ESPN | SD 0.5 | SD 0.65 | SD 0.8 | SD 1 | SD 1.2 | SD 0.65 Typical Seed |
1700 | (Rank 8) | (none) | (none) | (none) | (none) | (none) | (none) |
1660 | (Rank 48) | (none) | 1 | (none) | (none) | (none) | 0.99999 |
1620 | 0.99998 | 0.99995 | 0.99998 | 0.99999 | 0.99998 | (none) | 0.99994 |
1580 | 0.99992 | 0.99978 | 0.99984 | 0.99994 | 0.99992 | 0.99997 | 0.99977 |
1540 | 0.99967 | 0.99919 | 0.99955 | 0.99975 | 0.99981 | 0.99992 | 0.99923 |
1500 | 0.99899 | 0.99796 | 0.99871 | 0.99918 | 0.99943 | 0.99975 | 0.99766 |
1460 | 0.99700 | 0.99552 | 0.99718 | 0.99824 | 0.99861 | 0.99930 | 0.99390 |
1420 | 0.99400 | 0.99138 | 0.99423 | 0.99614 | 0.99738 | 0.99865 | 0.98566 |
1380 | 0.98900 | 0.98566 | 0.99012 | 0.99298 | 0.99547 | 0.99759 | 0.97055 |
1340 | 0.98400 | 0.97875 | 0.98453 | 0.98945 | 0.99290 | 0.99572 | 0.95188 |
1300 | 0.97900 | 0.97161 | 0.97865 | 0.98510 | 0.98960 | 0.99341 | 0.93618 |
1260 | 0.96900 | 0.96431 | 0.97262 | 0.98027 | 0.98577 | 0.99066 | 0.92322 |
1220 | 0.95700 | 0.95494 | 0.96528 | 0.97431 | 0.98132 | 0.98714 | 0.90550 |
1180 | 0.94100 | 0.94234 | 0.95517 | 0.96693 | 0.97561 | 0.98304 | 0.88233 |
1140 | 0.91900 | 0.92779 | 0.94174 | 0.95654 | 0.96800 | 0.97757 | 0.85287 |
1100 | 0.89200 | 0.91066 | 0.92555 | 0.94313 | 0.95805 | 0.96945 | 0.82082 |
1060 | 0.87200 | 0.89215 | 0.90820 | 0.92784 | 0.94534 | 0.95916 | 0.78564 |
1020 | 0.84900 | 0.87172 | 0.88970 | 0.91120 | 0.93105 | 0.94663 | 0.75427 |
980 | 0.82900 | 0.85137 | 0.87317 | 0.89503 | 0.91696 | 0.93425 | 0.72889 |
940 | 0.81000 | 0.82599 | 0.85441 | 0.87988 | 0.90302 | 0.92125 | 0.70130 |
900 | 0.78800 | 0.78918 | 0.82952 | 0.86182 | 0.88839 | 0.90967 | 0.65808 |
860 | 0.76000 | 0.73914 | 0.79303 | 0.83567 | 0.87088 | 0.89540 | 0.59733 |
820 | 0.72200 | 0.67773 | 0.74598 | 0.79920 | 0.84512 | 0.87587 | 0.52933 |
780 | 0.65100 | 0.61022 | 0.68936 | 0.75210 | 0.80937 | 0.84898 | 0.46019 |
740 | 0.60500 | 0.53927 | 0.62615 | 0.69535 | 0.76403 | 0.81297 | 0.39359 |
700 | 0.55100 | 0.46723 | 0.55953 | 0.63331 | 0.71187 | 0.76932 | 0.33100 |
660 | 0.48800 | 0.39051 | 0.48933 | 0.57046 | 0.65472 | 0.71857 | 0.26520 |
620 | 0.40800 | 0.30783 | 0.41449 | 0.50001 | 0.59326 | 0.66175 | 0.19610 |
580 | 0.33100 | 0.22210 | 0.32911 | 0.41980 | 0.52222 | 0.59745 | 0.13097 |
540 | 0.24800 | 0.14335 | 0.23863 | 0.32967 | 0.43682 | 0.51981 | 0.07539 |
500 | 0.17300 | 0.07954 | 0.15370 | 0.23473 | 0.33832 | 0.42544 | 0.03659 |
460 | 0.12100 | 0.03704 | 0.08516 | 0.14653 | 0.23486 | 0.31928 | 0.01471 |
420 | 0.08200 | 0.01303 | 0.03748 | 0.07534 | 0.13968 | 0.21307 | 0.00427 |
380 | 0.05900 | 0.00291 | 0.01172 | 0.03021 | 0.06693 | 0.11871 | 0.00086 |
340 | 0.03300 | 0.00043 | 0.00254 | 0.00843 | 0.02443 | 0.05092 | 0.00012 |
The 2017 data has some interesting patterns we can discuss. First, the uninteresting
one (as we've been discovering) - The lower the SD number (and thus more conservative
the bracket), the better it is at almost every threshold. Since the optimal SD number
for perfect brackets is 0.73, we would expect with more data to have that be the best
above some threshold like 1800 that doesn't appear here. The SD=0.5 generator beat
the public on every threshold except for roughly from 900-1180. This range seems
like the sweet spot for a bracket that is otherwise not very good but gets the title
game players or the champion correct. This would explain the difference because
the public did have a much better read on the eventual champion. However, at all
other thresholds my system's level of upset picking won it through. Likewise, the
SD=0.65 generator beat the public on 1500+ and about 580 and below.
Now for the interesting bit - I ran the generator one more time with SD=0.65 and
what I call "Typical Seed Strength". This setting ignores all information about the
individual teams and just uses average team strength of teams in the past that have
received that seed. Thus, all 1 seeds are equal, all 2 seeds are equal, etc. I put
these results on the far right of the table. These results demolish ALL of the other
columns! It seems odd that discarding everything we know about the season and going
with seeds only should have that profound of an impact, since we would think predictive
systems should do better than committee resume-based placements. So is there
anything to this in the long run?
I checked this for the 2018 and 2019 tournaments from the previous two Blog posts.
In 2018, the Typical Seed Strength setting again dominated all other methods. But
in 2019, it far underperformed the other settings. I speculate that this will
continue to be random and mostly determined by the tournament's champion. In 2019,
the system favorite won so it would hurt the generation to modify the champion's strength.
However, in 2018 and 2017 my system slightly underestimated the eventual champion
relative to the seed it received. My conclusion from this - it may be worth using
a few of my bracket slots on this setting to account for a possibly over or under rated
team (due to injuries, for example). Unfortunately, I currently have no way to
allow others to generate brackets in this fashion on the site, but may add it as
an option.
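To make the Typical Seed Strength idea concrete, here is a minimal sketch of the substitution described above: average the strength of past teams at each seed, then give every team in the field that average instead of its own rating. All ratings and team names below are invented placeholders, and the function names are hypothetical. This is not the generator's actual code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (seed, rating) pairs for past tournament teams; NOT real data.
history = [(1, 95.0), (1, 93.0), (2, 90.0), (2, 88.0), (3, 86.0)]

def typical_strength(history):
    """Average the historical ratings of all teams that received each seed."""
    by_seed = defaultdict(list)
    for seed, rating in history:
        by_seed[seed].append(rating)
    return {seed: mean(ratings) for seed, ratings in by_seed.items()}

def apply_typical(field, typical):
    """Replace each team's own rating with the typical rating for its seed."""
    return {team: typical[seed] for team, (seed, _own) in field.items()}

typical = typical_strength(history)
# Hypothetical field: team -> (seed, own rating). Own ratings get ignored.
field = {"Team A": (1, 97.5), "Team B": (1, 91.0), "Team C": (2, 92.0)}
flattened = apply_typical(field, typical)
print(flattened)
```

After the substitution, both 1 seeds carry the same strength no matter how far apart their own ratings were, which is the whole point of the setting.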
For elite level brackets, my 5 regular settings between them would have managed
a 1660 and three 1650s, or 4 scores in the ESPN top 100. Multiplying by 36 to give
my program the same number of brackets generated as ESPN public (18.8 million)
results in 144, so about the same effectiveness as humans, but the small sample size
does not allow us to say anything totally conclusive. The Typical Seed setting
generated a 1670, a 1660, and a 1650 just by itself, which equates to 540 brackets
worthy of top 100 from a larger generation, so quite solid.
My family group maxed out at a high score of 1330 this year. I don't know exactly
how many entries it had, but somewhere close to 50. Using my SD=0.65 generator and
making 50 brackets, it would have been exactly 50% to do better than 1330.
The SD=0.5 generator would have been 60% likely to do better. The Typical Strength
setting would have done better than 1330 in 92% of 50-bracket samples, and in fact
its median for 50-bracket samples is a high of 1420.
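The median high score of a 50-bracket pool can be sanity-checked against the table. This sketch assumes independent brackets, so half of all pools top out at or below the score whose percentile p satisfies p^50 = 0.5. It then interpolates linearly between the 1420 and 1460 rows of the Typical Seed column. The function names are hypothetical, and the interpolation is only an approximation of the raw simulation data.

```python
def median_best_percentile(n_brackets):
    """Percentile p with p**n = 0.5: half of n-bracket pools max out at or below it."""
    return 0.5 ** (1 / n_brackets)

def score_at(target, lo, hi):
    """Linearly interpolate the score where the table percentile equals `target`."""
    (s0, p0), (s1, p1) = lo, hi
    return s0 + (s1 - s0) * (target - p0) / (p1 - p0)

target = median_best_percentile(50)
# Rows 1420 and 1460 from the Typical Seed column of the table above.
median_high = score_at(target, (1420, 0.98566), (1460, 0.99390))
print(f"target percentile {target:.5f}, median high about {median_high:.0f}")
```

The target percentile is about 0.986, which sits just above the 1420 row, consistent with the median quoted above.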
I only tracked one other group this year, the Fans of Utah group of size 2500. It
had a high of 1530. Generating 2500 brackets, my system would have an 87% chance
to beat 1530 at the SD=0.5 setting, a 68% chance at the SD=0.65 setting, a 38% chance
at the SD=1 setting, and an 85% chance at the Typical Strength setting.
IN SUMMARY: Conservative wins again here, almost regardless of what your
goal was. Even if your goal was a perfect bracket, SD=0.73 is hardly crazy, and
lower than what most people would think in a year where an unremarkable
7 seed made the Final Four. The Typical Strength setting is worth continuing to
investigate to see if it holds weight in other years. The public wins this year
at the lower thresholds, but at the elite level I still have the edge, because nobody
in their right mind would ever have picked South Carolina to reach the Final Four,
but my system doesn't have a mind.
Back again tomorrow with the 2016 tournament!
ENTRY 49 - 8-8-19 - System Bracket Generation Performance (2018)
This is a continuation of the last two posts, in which I will be analyzing my
bracket generator over the last few tournaments to answer the questions: Does
my system generate better brackets than humans? And what is the best strategy
and level of upsets to pick for different sizes of pools?
Let's start 2018 with a brief reminder of what transpired during the tournament,
and how my advice for bracket picking from Blog Entry 42 holds up.
My system's overwhelming favorite to win the title, Virginia, was the first
1 seed ever to fall to a 16 seed, UMBC. That is not a good start. Virginia's
region turned into a garbage fire with seeds 9, 5, 11, and 7 making the Sweet
16. (9) Kansas State and (11) Loyola made the Elite 8, with Loyola making the
Final Four. The rest of the regions were fairly normal. The most outlandish
thing to happen in another region was (9) Florida State making the Elite 8. The
rest of the Elite 8 was 1, 2, and 3 seeds, with the rest of the Final Four
rounding out as (3) Michigan, (1) Villanova, and (1) Kansas. The rest of
the tournament went to form with Villanova defeating Michigan for the title. The
public did have Virginia as most picked to win the title (18%), but with Villanova
as a close second (16%).
As for my advice?
- My system had Virginia as the clear favorite at 22.7% to win it all, with
Villanova second at 14.9%. (Bad)
- I stated "Almost every at-large team has a shot
at making the Final Four". (Decent call with Loyola in there)
- I said the 1 seeds were relatively weak and the most likely scenario
was only 1 gets in (Seems OK)
- Outright seed upsets predicted as follows:
- (10) Butler over (7) Arkansas (happened)
- (5) Kentucky over (4) Arizona (Arizona first lost to Buffalo, but Kentucky beat Buffalo so we'll call it good)
- (5) West Virginia over (4) Wichita St (Wichita lost to Marshall, then WVU beat Marshall)
- (2) UNC over (1) Xavier (Neither team made the Sweet 16, so doesn't matter)
- (2) Duke over (1) Kansas (Matchup happened but Kansas came out on top)
Overall my advice was some good, some bad. Probably more bad because the misses were
things that mattered a lot more in terms of bracket scoring.
Let's break down the probabilities of perfect bracket by round with SD=1 (read previous
posts to see what this means):
Round 1 - 1 in 54 million (for perspective, about 10 times less likely than 2019, mostly due to Virginia upset)
Round 2 - 3.074 x 10^-13 (about on par with the entire tournament from 2019)
Round 4 - 1.5549 x 10^-16
Perfect Bracket - 2.98 x 10^-17
Probability of perfect bracket maximized at SD=1.36, P = 4.41 x 10^-17
This is the second lowest probability I've seen in the years I have data for (going
back to 2000). Even discounting the Virginia loss, which adds 2 orders
of magnitude alone, it would have still counted
as a moderately crazy year. Most years are in the vicinity of 10^-15. 2019's
10^-13 was by contrast one of the highest there's ever been.
Let's look at final four incidence rates next. My system's probability
of picking each Final Four team correctly, as per Blog Entry 42:
Villanova 38.8%
Kansas 17.7%
Michigan 15.9%
Loyola 1.6%
All Four: 1 in 5723
Compare this to the public's chances of each correct Final Four team, as reported by ESPN:
Villanova 58.4%
Kansas 29.8%
Michigan 19.1%
Loyola 0.5%
All Four (using probs above): 1 in 6016
All Four (actual brackets): 550 out of 17.3 million, or 1 in 31454
The public was higher on three of the four teams' chances, which will probably
pay dividends in the percentile analysis and bracket generation abilities to
come. However, the Loyola chances being roughly 1/3 of mine doomed the
chances of perfection. It is notable that the actual number of public brackets
with the final four correct was much lower than expected by these percentages,
which makes me believe that a good portion of the public brackets with Loyola
in the Final Four were "not serious", and had completely unreasonable other
final four members.
As I did for 2019, here are my system's Final Four percentages if I set
it to SD=0.65, a much more conservative setting:
Villanova 53.0%
Kansas 20.7%
Michigan 18.0%
Loyola 0.1%
All Four: 1 in 50,638
Like the public's numbers, the more likely teams bring the probability
up a bit, but Loyola's diminished chances are too damaging. It is notable
as a reference point to all of these Perfect Final Four odds that picking
random teams from each region to advance to the Final Four would give
odds 1 in 16^4, or 1 in 65536.
Time for the percentile charts! Please read the previous blog post (Entry 48)
for the description of what we are looking at in this chart.
Points | ESPN | SD 0.5 | SD 0.65 | SD 0.8 | SD 1 | SD 1.2 |
1700 | (none) | (none) | (none) | (none) | (none) | (none) |
1660 | (Rank 2) | (none) | (none) | (none) | (none) | (none) |
1620 | (Rank 8) | (none) | (none) | (none) | (none) | (none) |
1580 | (Rank 21) | (none) | (none) | 1 | 0.99997 | (none) |
1540 | (Rank 122) | 0.99999 | 0.99999 | 0.99996 | 0.99995 | 0.99999 |
1500 | 0.99995 | 0.99995 | 0.99995 | 0.99988 | 0.99987 | 0.99996 |
1460 | 0.99973 | 0.99974 | 0.99970 | 0.99973 | 0.99978 | 0.99990 |
1420 | 0.99900 | 0.99921 | 0.99932 | 0.99912 | 0.99945 | 0.99958 |
1380 | 0.99850 | 0.99834 | 0.99839 | 0.99834 | 0.99883 | 0.99903 |
1340 | 0.99550 | 0.99681 | 0.99649 | 0.99675 | 0.99792 | 0.99816 |
1300 | 0.99000 | 0.99450 | 0.99410 | 0.99479 | 0.99617 | 0.99684 |
1260 | 0.98600 | 0.99098 | 0.99096 | 0.99205 | 0.99357 | 0.99482 |
1220 | 0.98200 | 0.98601 | 0.98671 | 0.98815 | 0.99051 | 0.99227 |
1180 | 0.97500 | 0.97831 | 0.98029 | 0.98322 | 0.98640 | 0.98880 |
1140 | 0.96000 | 0.96531 | 0.97013 | 0.97500 | 0.98084 | 0.98391 |
1100 | 0.93400 | 0.94550 | 0.95319 | 0.96247 | 0.97138 | 0.97625 |
1060 | 0.90500 | 0.92008 | 0.93140 | 0.94364 | 0.95784 | 0.96518 |
1020 | 0.87900 | 0.89092 | 0.90475 | 0.92130 | 0.93977 | 0.95011 |
980 | 0.85400 | 0.86072 | 0.87458 | 0.89533 | 0.91733 | 0.93059 |
940 | 0.83700 | 0.83821 | 0.84895 | 0.86941 | 0.89332 | 0.90839 |
900 | 0.82400 | 0.82242 | 0.83010 | 0.84884 | 0.87268 | 0.88731 |
860 | 0.80500 | 0.80594 | 0.81719 | 0.83453 | 0.85794 | 0.87242 |
820 | 0.77100 | 0.78033 | 0.79965 | 0.82005 | 0.84657 | 0.86136 |
780 | 0.72900 | 0.74189 | 0.77444 | 0.80144 | 0.83216 | 0.84996 |
740 | 0.69000 | 0.69498 | 0.74136 | 0.77569 | 0.81302 | 0.83492 |
700 | 0.65600 | 0.63427 | 0.69781 | 0.74069 | 0.78631 | 0.81332 |
660 | 0.60300 | 0.56598 | 0.64479 | 0.69664 | 0.75052 | 0.78347 |
620 | 0.54000 | 0.49888 | 0.58519 | 0.64374 | 0.70490 | 0.74598 |
580 | 0.46600 | 0.43015 | 0.52151 | 0.58541 | 0.65414 | 0.70066 |
540 | 0.38000 | 0.35015 | 0.44713 | 0.51971 | 0.59579 | 0.64746 |
500 | 0.29400 | 0.26244 | 0.35945 | 0.43825 | 0.52334 | 0.58051 |
460 | 0.21600 | 0.17714 | 0.26601 | 0.34181 | 0.43360 | 0.49809 |
420 | 0.15100 | 0.10116 | 0.17383 | 0.24297 | 0.32991 | 0.39931 |
380 | 0.10200 | 0.04411 | 0.09338 | 0.14773 | 0.22425 | 0.28791 |
340 | 0.06300 | 0.01335 | 0.03887 | 0.07217 | 0.12752 | 0.17884 |
Let's get this out of the way first: Why did I not include a column on this
table for SD=1.36, the setting that has the highest probability of producing
a perfect bracket? Recall that lower numbers in this chart are better. You can
see that all of the trends are towards more conservative (smaller) SD numbers being
better at pretty much every threshold. I believe that if I had the computing
power to calculate trillions of simulations, then SD=1.36 might begin to assert
its dominance at somewhere around the 1800 threshold. But with the number of
simulations I am using here, it is pointless to include the column as it will
just be worse on every row. But just for you guys, I did run 100,000 simulations
on SD=1.36 just to see. And it was terrible. Its best bracket was 1560, which
is not even that good compared to the other settings. So forget about it.
The summary from this table with regard to system vs. public is that the public
got me this time. No matter which SD setting I use, the results are worse than
what the public was able to do. This is probably due to what I mentioned before -
the public's distrust of Virginia ended up paying off, and the public's trust of
1 seeds in general helped with Kansas (which was a weak 1 seed, no getting around
it).
The conclusion with respect to my system was a little shocking, but makes
sense when I think about it. Even during a crazy year, picking conservatively
is still the best you can do and leads to the best results! Almost no matter
what size of pool you are trying to win, using SD=0.5 is your best bet and has
lower numbers at almost every row. It is a small sample size, but it seems the
only place that SD=0.5 is inferior is at the top end of the range. SD=0.8 and
SD=1 seem better if your goal was 1500+. In my family group of 57 brackets,
the highest was 1160, so 1500+ is complete overkill. In the Fans of Utah group
on ESPN with size 2300, the highest was 1510, so maybe it starts to become
worth it there.
The largest group, Tournament Challenge, with size 300,000 had
maximum score 1630 (2nd highest score was 1550). In my simulations I had a 1610
at SD=1, as well as 10 total scores above 1550 spread across the other settings.
This seems comparable with humans at the elite level. If you look at ALL ESPN
brackets, there were a total of 82 brackets above 1550. If I took the 500,000
brackets I simulated (100,000 at each level in the chart)
and multiplied by about 34 to give myself the same number of brackets as the
public, I could expect roughly 340 brackets above 1550, outperforming the public.
IN SUMMARY: Even though this was a crazy year by any metric, it
was still a good year to be conservative almost regardless of your goals. There
are some crazy results that it is not worth stretching the upset frequency
to try to obtain, and you are better off just trying to get the three Final Four
teams you can and cut your losses. My system's biases against some of the teams
that ended up going far hurt it on most levels versus the public, but it still
had an excellent shot at producing really top level brackets.
I'll be back tomorrow with the 2017 tournament!
ENTRY 48 - 8-7-19 - System Bracket Generation Performance (2019)
I'm back again today (after yesterday's initial post) to do some analysis on
my "March Madness Bracket Generator". The question to answer today is: Are my
program's rankings sufficiently accurate to be useful in generating brackets
for people? And does it generate excellent brackets often enough to compete with
the best humans can come up with?
Let's start by saying I went back through my logs and was a bit surprised to
find that I haven't done a tournament performance analysis since Entry 10 of
the blog back at the conclusion of the 2015 tournament. I have collected
data since then which would allow doing analysis but haven't taken the time to
actually perform it. Since the analysis takes a while to put together, I
will only focus today on the most recent tournament (2019) and work backwards
from there.
First, some basic stats round by round.
PROBABILITY OF SYSTEM PICKING A PERFECT BRACKET THROUGH EACH ROUND
Round 1 - 1 in 5.4 million
Round 2 - 1 in 980 million (Note - there was a perfect bracket on NCAA.com after 2 rounds)
Round 4 - Probability 5.367 x 10^-13
Perfect Bracket - Probability 1.02 x 10^-13
Probability of perfect bracket maximized at SD = 0.71
There were 17.2 million brackets made on ESPN alone. When combined with other sites, it is probably
fairly conservative to say there were about 50 million brackets made online. 1 perfect bracket out of
50 million would still only happen in my program in 1 out of 20 trials, so this is remarkable.
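The "1 out of 20" figure follows from the Round 2 probability. This quick sketch treats the 50 million online brackets as if each one were an independent draw from my program, which is exactly the hypothetical the sentence above describes. The variable names are just for illustration.

```python
import math

p_single = 1 / 980_000_000   # one system bracket perfect through Round 2
n_brackets = 50_000_000      # rough guess at total online brackets, from the text

# Expected number of perfect brackets, and the chance of at least one.
expected_perfect = n_brackets * p_single
# 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p.
p_at_least_one = -math.expm1(n_brackets * math.log1p(-p_single))
print(f"expected: {expected_perfect:.3f}, at least one: {p_at_least_one:.3f}")
```

Both numbers come out at about 5%, or roughly 1 in 20.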
The optimal standard deviation setting to use for the tournament turned out to be 0.71, which corresponds to
most of the way to "conservative" on my sliding scale. This represents a relatively low frequency of
upsets (SD = 1 would represent an expected number of upsets). At this setting, the probability of
a perfect bracket through Round 2 would have been closer to 1 in 400 million.
Next let's look not at completely perfect brackets, but pretty strong brackets. One measure of this
is having the Final Four correct. Here are the probabilities, directly from Entry 46:
PROBABILITY OF SYSTEM GETTING EACH FINAL FOUR TEAM CORRECT
Virginia: 49.2%
Michigan State: 29.4%
Texas Tech: 15.2%
Auburn: 8.6%
All Four: P = 0.00189
ESPN PERCENT OF PEOPLE TO PICK EACH FINAL FOUR TEAM
Virginia: 42.6%
Michigan State: 18.6%
Texas Tech: 10.3%
Auburn: 3.9%
All Four: 0.000318
We find that my program was almost 6 times more likely to have all of the final four correct.
However, maybe this data was being skewed. My program's percentages are completely independent
events, but the people's are not. A person picking Virginia may be more likely to pick all 1
seeds, for example. Fortunately, ESPN's tournament challenge gave the number of people that
ACTUALLY got all four Final Four teams correct: 7928 out of 17.2 million, which works out to
P = 0.00046. So a little higher than the independent calculation, but still not on the same
level as my system.
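The "All Four" figures above are just products of the four individual percentages, which is only valid if the four picks are independent. Here is a short sketch comparing those products to the actual ESPN count, using the numbers listed above.

```python
from math import prod

# Final Four percentages from the two lists above (Virginia, Michigan State,
# Texas Tech, Auburn), expressed as probabilities.
system = [0.492, 0.294, 0.152, 0.086]
public = [0.426, 0.186, 0.103, 0.039]

p_system = prod(system)                  # independent product for my system
p_public_indep = prod(public)            # same product using ESPN pick rates
p_public_actual = 7928 / 17_200_000      # brackets that actually had all four

print(f"system {p_system:.5f}")
print(f"public (independent) {p_public_indep:.6f}")
print(f"public (actual) {p_public_actual:.5f}")
print(f"system vs actual public: {p_system / p_public_actual:.1f}x")
```

Even against the actual ESPN count rather than the independent product, the system comes out about 4 times more likely to have all four.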
If I set my program to a lower SD number to give fewer upsets, here are the new percentages:
Virginia: 64%
Michigan State: 29.8%
Texas Tech: 15.3%
Auburn: 4.7%
All Four: P = 0.00137
We find that Virginia has a much higher chance to be selected, Mich St and Texas Tech are
strong enough to be unaffected, and Auburn is much lower. The result is a lower P value
because Auburn almost halved in likelihood. However, this is STILL more than double the chance the public
had.
Also via Tournament Challenge nuggets, we were told that Duke and UNC were extremely overvalued
in their chances to make each round and to win it all. The public picked Duke to win in 36% of brackets,
compared to 13.7% in my system. The public picked UNC to win in 15% of brackets, compared to 11% in my
system. These two teams losing early really hurt the public's chances at good brackets.
The Vegas lines before the tournament concurred with the public on Duke. Duke had 2-1 odds to win it all (consistent
with about a 33% implied chance) and UNC had 8-1 odds (implied 11% chance).
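The conversion from odds to implied chance used here is simple enough to write down. A minimal sketch, assuming "N-1" means N-to-1 against and ignoring the bookmaker's margin (so the implied chances across all teams would add up to more than 100%):

```python
def implied_probability(odds_against: float) -> float:
    """Implied win chance for odds quoted as 'odds_against to 1'."""
    return 1.0 / (odds_against + 1.0)

for team, odds in [("Duke", 2), ("UNC", 8)]:
    print(f"{team}: {odds}-1 -> {implied_probability(odds):.1%}")
```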
Let's proceed then to some score distributions. Each entry in the table below is the fraction
of brackets that scored at or below the point total in the row header.
The columns show that fraction for ESPN's brackets and for my program at
different SD settings.
The percentiles from my program are derived from 100,000 simulations per setting.
We know my program outdid the public in Final Four picking, but was its "average" or "top 10%"
bracket better than the public's?
Points | ESPN | SD 0.5 | SD 0.65 | SD 0.8 | SD 1 | SD 1.2 |
1700 | 0.99998 | 0.99992 | 0.99994 | 0.99994 | 0.99997 | 0.99999 |
1660 | 0.99996 | 0.99967 | 0.99969 | 0.99987 | 0.99992 | 0.99995 |
1620 | 0.99987 | 0.99908 | 0.99901 | 0.99953 | 0.99975 | 0.99991 |
1580 | 0.99971 | 0.99777 | 0.99794 | 0.99874 | 0.99945 | 0.99975 |
1540 | 0.99942 | 0.99567 | 0.99601 | 0.99742 | 0.99861 | 0.99943 |
1500 | 0.99900 | 0.99277 | 0.99327 | 0.99542 | 0.99748 | 0.99867 |
1460 | 0.99800 | 0.98837 | 0.98977 | 0.99238 | 0.99540 | 0.99733 |
1420 | 0.99650 | 0.98150 | 0.98423 | 0.98793 | 0.99274 | 0.99530 |
1380 | 0.99400 | 0.97060 | 0.97559 | 0.98165 | 0.98818 | 0.99252 |
1340 | 0.98900 | 0.95134 | 0.96270 | 0.97218 | 0.98227 | 0.98852 |
1300 | 0.98200 | 0.92061 | 0.94144 | 0.95872 | 0.97320 | 0.98300 |
1260 | 0.97100 | 0.87958 | 0.91226 | 0.93719 | 0.96024 | 0.97468 |
1220 | 0.95600 | 0.83293 | 0.87526 | 0.90963 | 0.94191 | 0.96222 |
1180 | 0.94300 | 0.78944 | 0.83674 | 0.87765 | 0.91933 | 0.94532 |
1140 | 0.93200 | 0.75838 | 0.80190 | 0.84570 | 0.89225 | 0.92530 |
1100 | 0.92300 | 0.73939 | 0.77570 | 0.81651 | 0.86436 | 0.90247 |
1060 | 0.91500 | 0.72380 | 0.75624 | 0.79311 | 0.83897 | 0.87793 |
1020 | 0.90600 | 0.70277 | 0.73904 | 0.77363 | 0.81752 | 0.85620 |
980 | 0.89000 | 0.67242 | 0.71696 | 0.75841 | 0.79861 | 0.83654 |
940 | 0.87500 | 0.62809 | 0.68600 | 0.73089 | 0.77990 | 0.81985 |
900 | 0.82800 | 0.57129 | 0.64396 | 0.70042 | 0.75791 | 0.80146 |
860 | 0.81100 | 0.50768 | 0.59488 | 0.66161 | 0.72940 | 0.77964 |
820 | 0.75500 | 0.44210 | 0.53804 | 0.61393 | 0.69317 | 0.75125 |
780 | 0.69900 | 0.37227 | 0.47458 | 0.55767 | 0.64817 | 0.71629 |
740 | 0.61300 | 0.29527 | 0.40434 | 0.49430 | 0.59552 | 0.67316 |
700 | 0.51700 | 0.21396 | 0.32351 | 0.42140 | 0.53338 | 0.61989 |
660 | 0.40700 | 0.13557 | 0.23898 | 0.33987 | 0.46169 | 0.55606 |
620 | 0.30300 | 0.07374 | 0.15965 | 0.25190 | 0.37885 | 0.48221 |
580 | 0.21100 | 0.03252 | 0.09281 | 0.17132 | 0.28846 | 0.39537 |
540 | 0.14600 | 0.01120 | 0.04610 | 0.10289 | 0.20089 | 0.30269 |
500 | 0.10100 | 0.00318 | 0.01894 | 0.05316 | 0.12602 | 0.21173 |
460 | 0.07300 | 0.00079 | 0.00608 | 0.02288 | 0.06748 | 0.13280 |
420 | 0.05500 | 0.00009 | 0.00150 | 0.00796 | 0.03019 | 0.07160 |
380 | 0.04300 | 0.00001 | 0.00028 | 0.00189 | 0.01073 | 0.03254 |
340 | 0.03400 | 0 | 0.00002 | 0.00035 | 0.00315 | 0.01215 |
Let's try to put the numbers above into some perspective.
First, what does a percentile (the numbers above) mean? Let's look at the 1300 score row.
ESPN's percentile of 0.98200 means that 98.2% of ESPN brackets had a 1300 or lower. Turned
around, this means that 1.8% of ESPN brackets were higher than 1300. Likewise, my program
at the regular SD=1 setting had 0.9732, which means 2.68% of brackets were higher than 1300.
The most conservative setting, SD=0.5, had 0.92061, which means about 7.9% of brackets were
higher than 1300. If your goal is to get above 1300, then the conservative setting is more than
4 times as likely to achieve this goal as a random ESPN bracket.
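To make the "turn it around" step concrete, here is a small sketch that converts one row of the table into the share of brackets above that score. The values are copied from the 1300 row as printed in the table; the helper name is only for this example.

```python
# Fraction of brackets at or below 1300, copied from the 1300 row of the table.
row_1300 = {
    "ESPN": 0.98200, "SD 0.5": 0.92061, "SD 0.65": 0.94144,
    "SD 0.8": 0.95872, "SD 1": 0.97320, "SD 1.2": 0.98300,
}

def share_above(cdf_value: float) -> float:
    """Turn an 'at or below' fraction into the fraction strictly above."""
    return 1.0 - cdf_value

above = {name: share_above(value) for name, value in row_1300.items()}
for name, share in above.items():
    # Second figure is how many times more often than a random ESPN bracket.
    print(f"{name:8s} {share:6.2%} above 1300, {share / above['ESPN']:.1f}x ESPN")
```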
So the short version is: You want your number in a row to be as low as possible. This
means our analysis for this chart is really easy - SD=0.5, the most conservative setting,
wins at basically any threshold you could place. It had a top 10% line at about 1280, vs.
1000 for ESPN. This is an enormous difference. It had a median line at about 860, vs.
700 for ESPN. The only row in the table where another setting edges out SD=0.5 is 1620,
where SD=0.65 is slightly lower (0.99901 vs. 0.99908).
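The "top 10% line" and "median line" figures can be read off the table by interpolating between neighboring rows. A minimal sketch, assuming straight-line interpolation between the 40-point rows is good enough; only the rows that bracket the 50% and 90% marks are copied in.

```python
def score_at_percentile(cdf_rows, q):
    """Linearly interpolate the score at which the 'at or below' fraction equals q."""
    rows = sorted(cdf_rows)
    for (s0, f0), (s1, f1) in zip(rows, rows[1:]):
        if f0 <= q <= f1:
            return s0 + (s1 - s0) * (q - f0) / (f1 - f0)
    raise ValueError("percentile falls outside the supplied rows")

# (score, fraction at or below) pairs taken from the table above.
espn = [(660, 0.407), (700, 0.517), (980, 0.890), (1020, 0.906)]
sd05 = [(820, 0.44210), (860, 0.50768), (1260, 0.87958), (1300, 0.92061)]

for label, rows in [("ESPN", espn), ("SD 0.5", sd05)]:
    median = score_at_percentile(rows, 0.50)
    top10 = score_at_percentile(rows, 0.90)
    print(f"{label:7s} median ~{median:.0f}, top 10% line ~{top10:.0f}")
```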
What about for really, really high scores? SD=0.65 might have the edge there. It had a 1790,
a 1760, and two 1750s. SD=0.5 had a 1780, a 1750, and everything else 1740 or below.
Let's try to put these scores in perspective vs. ESPN. On the Full ESPN Tournament Challenge
leaderboard, the lowest score in the top 100 was 1730. Out of the total 500,000 trials
I ran between my 5 system settings, I managed 8 scores of 1740 or above (It seems only fair
to use all 5 settings because you never know which one will be best any given year).
If you multiply this by 34 to give it the same 17 million entries that ESPN had, that would give
my system 272 entries that are top-100 worthy, almost 3 times the number of elite entries that
ESPN had. The 1790 and 1780 would have been ranked #5 and #8 on the leaderboard respectively.
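The scaling step is shown below, along with a caution that is not in the original analysis: 8 is a small count, so the projected 272 carries a lot of noise. The error bar here is a back-of-the-envelope figure that assumes the count of elite brackets behaves like a Poisson variable.

```python
import math

hits = 8                    # simulated brackets scoring 1740 or above
trials = 500_000            # total simulations across the 5 settings
espn_entries = 17_000_000   # rounded ESPN field size used above

scale = espn_entries / trials            # 34
projected = hits * scale                 # 272
# Assumption: Poisson counting noise, so the standard error of the count is sqrt(hits).
rough_error = math.sqrt(hits) * scale

print(f"scale factor: {scale:.0f}")
print(f"projected elite entries: {projected:.0f} +/- {rough_error:.0f}")
```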
How does this match up with winning bracket pools? I followed several different sized
bracket pools to see whether my system would have had a fair chance to win the pool.
My family bracket pool had 58 entries. If my system got to have 58 brackets entered
into a pool on SD=1, its highest bracket would on average be a 1370. The actual group
winner was at 1420. SD=1 would produce a 1420 in 58 tries about 36% of the time. If
you give 58 tries to SD=0.65, then the highest bracket would on average be 1470, which
would be enough to usually win the group.
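The chance that the best of 58 brackets clears a given score follows directly from the table. A minimal sketch, assuming the 58 brackets are independent draws from one setting; because the table counts brackets at or below 1420, this gives the chance of strictly beating 1420, so it lands slightly under the 36% quoted above.

```python
def chance_pool_best_exceeds(cdf_at_score: float, pool_size: int) -> float:
    """P(at least one of pool_size independent brackets scores above the threshold)."""
    return 1.0 - cdf_at_score ** pool_size

# 'At or below 1420' fractions from the 1420 row of the table.
sd1 = chance_pool_best_exceeds(0.99274, 58)
sd065 = chance_pool_best_exceeds(0.98423, 58)

print(f"SD 1:    {sd1:.1%} chance the best of 58 beats 1420")
print(f"SD 0.65: {sd065:.1%} chance the best of 58 beats 1420")
```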
Same game with the Fans Of Utah group on ESPN, size 2500. The winning bracket in that
group was a 1590. SD=1 would produce a 1600, enough to win the group outright, with
probability of about P=0.0004 per bracket (between the 1580 and 1620 rows in the table above).
This equates to about 1 in 2500, exactly the size of the group. If instead we go to SD=0.65,
then it produces 1600 brackets at a rate of about P=0.00150, or 1 in 666 brackets, about 4 times more often.
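There is no 1600 row in the table, so figures like these have to come from somewhere between the 1580 and 1620 rows. The sketch below assumes a straight-line interpolation between those two rows, which may not be exactly how the figures above were obtained.

```python
def cdf_between(score, low_row, high_row):
    """Linearly interpolate the 'at or below' fraction between two table rows."""
    (s0, f0), (s1, f1) = low_row, high_row
    return f0 + (f1 - f0) * (score - s0) / (s1 - s0)

# (score, fraction at or below) from the 1580 and 1620 rows of the table.
p_sd1 = 1 - cdf_between(1600, (1580, 0.99945), (1620, 0.99975))
p_sd065 = 1 - cdf_between(1600, (1580, 0.99794), (1620, 0.99901))

pool = 2500
for label, p in [("SD 1", p_sd1), ("SD 0.65", p_sd065)]:
    at_least_one = 1 - (1 - p) ** pool   # chance the pool contains such a bracket
    print(f"{label:7s} P={p:.5f}, about 1 in {1 / p:.0f}, "
          f"P(at least one in {pool}) = {at_least_one:.0%}")
```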
IN SUMMARY: Since this was a good year to be conservative, and since my system's
prohibitive favorite to win it all (Virginia) ended up coming through, it was easier for
it to generate elite brackets than usual. The public picking Duke to win it all 36% of the
time ruined a lot of brackets in that region before the Final Four. Also, as mentioned
in the previous post, having (5) Auburn favored outright over (4) Kansas and as the most
likely 1 seed upset in the next round probably did it a lot of favors.
I will continue this analysis with the 2018 tournament next (and then 2017 and 2016) to
see how the system does in some years with a bit of a different complexion. Does it
still outperform if there are more upsets, or if the Final Four is a bit crazier? It
might do even better in that case because the public tends to overvalue 1 seeds.
ENTRY 47 - 8-6-19 - Thoughts from the 2019 NCAA Tournament Analysis
It is nearly time for football to kick off this season! I really wanted to do some
recaps and analysis from basketball, so this is about my last chance to do so before
football takes over.
First, if you read this, I have done some updating of things that should have been
done a while ago. The basketball postseason rankings are available. The predictions
through all of the postseason games and the results are posted. I also did some updates
to catch us up with the new year of sports. That includes posting the football
preseason rankings, and I will soon put out Week 1 predictions.
Now with that out of the way, I want to write a series of posts here, and we'll see
what I have time for before other obligations come around. This post will recap
the "bracket picking advice" I gave in the previous post and how that advice panned
out in the real thing.
Future posts will address:
- How my bracketology seeding prediction did, and what the takeaways are for next year
- How my bracket maker does in terms of generating good brackets with hopes of winning pools
- If I have lots of time, some calibration updates to get my rankings working better in general
OK! Let's break down some of the statements of advice I gave for picking a bracket.
- "Overall, this year is much like last year where I would refrain from many big upsets
in the first round because the strength at the top is just too much. Even the second
round might be light on upsets. But after that, there could be some big upsets in
the Elite 8 and Final 4."
I think this is a big hit. All 12 of the teams seeded 1, 2, or 3
made it to the Sweet 16. The remaining Sweet 16 entrants were seeded 4, 4, 5, 12. The 12 was Oregon,
which many people considered underseeded and doesn't count as a true Cinderella.
There were some big upsets after this, with the seeds in the final four being 1, 2, 3, 5.
The one knock against my statement was the number of 12 seed upsets in the first round,
with 3 of them winning and the 4th losing to a miracle.
- "Virginia is by far the best
team to pick to win it all, but be aware Duke has beaten them twice this year and has
at times looked like the real best team. They are close in team strength but
Virginia just has the much easier region to navigate"
Yep. That is pretty much what happened. Duke was probably better but had
to go through a murderer's row. Virginia had a relatively easy final four route.
- "Most likely first round upsets"
16 over 1: Play-in over Gonzaga, 4.2% chance. 8.1% chance of at least one. Never challenged. Virginia was the closest.
15 over 2: Montana over Michigan, 9.5% chance. 24.1% chance of at least one. This game was really a grind. Probably Colgate was closest to an upset.
14 over 3: Yale over LSU, 16.7% chance. 44.8% chance of at least one. Indeed the closest but not quite.
13 over 4: Northeastern over Kansas, 24.7% chance. 61% chance of at least one. Kansas wins by 34, although they sure looked weak in the next round.
12 over 5: Murray St over Marquette, 36.8% chance. 1.24 such upsets expected. Happened, along with 2 others.
11 over 6: St Marys over Villanova, 42.7% chance. 1.38 such upsets expected. Close game, could have gone either way.
10 over 7: Florida over Nevada, 47.4% chance. 1.57 such upsets expected. Florida was the better team.
9 over 8: Oklahoma is favored outright over Mississippi, 50.5% chance. 1.93 such upsets expected. Oklahoma wins by 23.
Some up and some down there. How about later in the bracket?
- "Most likely later upsets"
8 over 1: Syracuse over Gonzaga, 24.2% chance. 0.76 such upsets expected. Syracuse didn't advance anyway. UCF almost toppled Duke in a near stunner.
7 over 2: Nevada over Michigan, 33.2% chance. 1.22 such upsets expected. Michigan was not challenged by Florida. Tennessee barely escaped though.
6 over 3: Buffalo over Texas Tech, 42.5% chance. 1.59 such upsets expected. Nope. LSU was the only 3 seed challenged (against Maryland).
5 over 4: Auburn is favored outright over Kansas, 55.5% chance. 1.87 such upsets expected. Yep Yep Yep!
4 or 5 over 1: Auburn over UNC, 35.8% chance. 1.23 such upsets expected. And Yep.
3 over 2: Texas Tech over Michigan, 47.9% chance. 1.72 such upsets expected. Keep em coming.
2 over 1: Kentucky over UNC, 45.3% chance. The idea here was correct.
I stated in the follow up to this section that:
- "The pressure points in terms of bracket craziness seem to lie with Gonzaga and UNC as
1 seeds, and Michigan and Tennessee as 2 seeds. We also see that, in general,
the strength of especially the 1 seeds is enough that it is entirely reasonable
to pick no upsets with them until the 4 seed matchup at the earliest. Even there,
probably three of the 1 seeds can be expected to make the elite 8."
Nailed that exactly three 1s would make the elite 8. Nailed that
Gonzaga, UNC, Michigan, and Tennessee would show weakness and bow out before their seed
dictated.
- "In a larger pool (at least 100 brackets), you could branch out a little bit and
a good mix would be two 1 seeds, a 2 seed, and someone else from the remaining field
to make the final four."
This slid a little further than expected. The real seeds were
a 1, 2, 3, and 5. All four of these that made it were each the strongest in their seed
line so this mitigates the damage a bit. Strictly following the advice above, a 3/4 final
four prediction is on the table.
Overall, I think the advice I gave was quite sound. There were not many upsets,
particularly at the top seed lines. They advanced to the sweet 16 and then some chaos
took hold. My top seed (which the program gave a 19 percent chance to win it) ended
up taking it home. Even (5) Auburn was not all that unexpected, as only one of the 4 seeds
was given a better chance to reach the Final Four. Someone following my advice would
have been tempted to advance them to at least the Elite 8 to distinguish themselves in
a large pool.
I'll be back soon with some analysis of my system's prowess in generating
elite level brackets!
ENTRY 46 - 3-17-19 - NCAA Tournament Special 2019
So what kind of bracket have we got this year? Who should you pick? How many upsets
will there be? Let's find out!
Quick note: There will be another post later about the seed projections, my thoughts,
and how both my system and I did against others (I think the answer is not well
since we both planted our flags in a lot of the wrong places).
Final Four Breakdown
(1)Virginia: 49.2
(1)Duke: 38.7
(1)North Carolina: 36.3
(1)Gonzaga: 33.1
(2)Michigan State: 29.4
(2)Kentucky: 24.7
(2)Tennessee: 19.9
(2)Michigan: 19.4
(3)Texas Tech: 15.2
(3)Houston: 12.6
(4)Florida State: 11.6
(3)Purdue: 10.4
(3)Louisiana State: 9.2
(5)Auburn: 8.6
(4)Virginia Tech: 7.6
(6)Buffalo: 6.8
(4)Kansas: 5.8
(4)Kansas State: 5.5
(6)Iowa State: 4.7
(5)Mississippi State: 4.6
(5)Wisconsin: 3.9
(5)Marquette: 3.6
(7)Louisville: 3.3
(7)Wofford: 3.1
(7)Nevada: 3
(7)Cincinnati: 2.9
(6)Villanova: 2.9
(6)Maryland: 2.5
(8)Syracuse: 2.1
(10)Florida: 1.8
(11)Saint Marys: 1.5
(8)Virginia Commonwealth: 1.5
(9)Central Florida: 1.4
(9)Baylor: 1.3
(8)Utah State: 1.2
(12)Oregon: 1.1
(9)Oklahoma: 0.8
(12)New Mexico State: 0.7
(12)Murray State: 0.7
(8)Mississippi: 0.7
(9)Washington: 0.6
(10)Minnesota: 0.6
(10)Iowa: 0.6
(11)Arizona State: 0.5
(10)Seton Hall: 0.5
(11)Ohio State: 0.4
(11)Belmont: 0.3
(13)Vermont: 0.2
(14)Yale: 0.1
(13)UC Irvine: 0.1
(13)Saint Louis: 0.1
(13)Northeastern: 0.1
Champions Breakdown
(1)Virginia: 19.4
(1)Duke: 13.7
(1)North Carolina: 11.3
(1)Gonzaga: 10.5
(2)Michigan State: 8.6
(2)Kentucky: 6.3
(2)Tennessee: 5
(2)Michigan: 4.3
(3)Texas Tech: 3.1
(3)Houston: 2.2
(3)Purdue: 1.9
(4)Florida State: 1.7
(5)Auburn: 1.5
(3)Louisiana State: 1.3
(4)Virginia Tech: 1
(6)Buffalo: 1
(4)Kansas State: 0.8
(4)Kansas: 0.8
(6)Iowa State: 0.5
(5)Mississippi State: 0.5
(5)Wisconsin: 0.4
(7)Louisville: 0.4
(6)Villanova: 0.3
(7)Nevada: 0.3
(5)Marquette: 0.3
(7)Wofford: 0.2
(7)Cincinnati: 0.2
(6)Maryland: 0.2
(8)Syracuse: 0.1
(9)Central Florida: 0.1
(8)Virginia Commonwealth: 0.1
(11)Saint Marys: 0.1
Here are the numbers of teams to reach each threshold, by year, over the last few years:
At least 10% Chance at Final 4:
2015 - 9
2016 - 18
2017 - 15
2018 - 13
2019 - 12
At least 1% Chance at Final 4:
2015 - 28
2016 - 43
2017 - 43
2018 - 42
2019 - 36
At least 10% Chance at Title:
2015 - 4
2016 - 4
2017 - 2
2018 - 2
2019 - 4
At least 1% Chance at Title:
2015 - 12
2016 - 18
2017 - 24
2018 - 19
2019 - 16
The bracket mixes some qualities we've seen in recent years. There is definitely
a top tier of roughly 8 teams, which includes all of the 1 and 2 seeds, which is
soaking up most of the chances at making the final four and winning the title.
Within that tier the chances are spread fairly evenly: Virginia is the clear best of
the bunch, but the rest have quite reasonable chances.
Most of the 1-4 seeds have at least a prayer at a title. However, team strength
drops off pretty fast after that. Breaking down expected number of final four appearances
by seed number, we have:
- Expected 1 seeds in final four = 1.57
- Expected 2 seeds in final four = 0.93
- Expected 3 seeds in final four = 0.47
- Expected 4 seeds in final four = 0.31
- Expected final four teams from rest of field = 0.72
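These expected counts are just the Final Four percentages from the breakdown above, added up within each seed line. A minimal sketch of that bookkeeping, with the "rest of field" figure taken as whatever is left of the four available spots:

```python
# Final Four chances (percent) for seeds 1-4, copied from the Final Four Breakdown.
final_four_pct = {
    1: [49.2, 38.7, 36.3, 33.1],   # Virginia, Duke, North Carolina, Gonzaga
    2: [29.4, 24.7, 19.9, 19.4],   # Michigan State, Kentucky, Tennessee, Michigan
    3: [15.2, 12.6, 10.4, 9.2],    # Texas Tech, Houston, Purdue, Louisiana State
    4: [11.6, 7.6, 5.8, 5.5],      # Florida State, Virginia Tech, Kansas, Kansas State
}

# Expected number of Final Four teams from a seed line = sum of its teams' chances.
expected = {seed: sum(pcts) / 100 for seed, pcts in final_four_pct.items()}
rest_of_field = 4 - sum(expected.values())   # four spots in total

for seed, value in expected.items():
    print(f"Expected {seed} seeds in final four = {value:.3f}")
print(f"Expected final four teams from rest of field = {rest_of_field:.3f}")
```

The last figure comes out a hair under the 0.72 listed above because the list subtracts the rounded per-seed values.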
Overall, this year is much like last year where I would refrain from many big upsets
in the first round because the strength at the top is just too much. Even the second
round might be light on upsets. But after that, there could be some big upsets in
the Elite 8 and Final 4.
Let's break down the most likely upsets by round and seed line, as well as chances of
at least one such upset.
16 over 1: Play-in over Gonzaga, 4.2% chance. 8.1% chance of at least one
15 over 2: Montana over Michigan, 9.5% chance. 24.1% chance of at least one
14 over 3: Yale over LSU, 16.7% chance. 44.8% chance of at least one
13 over 4: Northeastern over Kansas, 24.7% chance. 61% chance of at least one
12 over 5: Murray St over Marquette, 36.8% chance. 1.24 such upsets expected
11 over 6: St Marys over Villanova, 42.7% chance. 1.38 such upsets expected
10 over 7: Florida over Nevada, 47.4% chance. 1.57 such upsets expected
9 over 8: Oklahoma is favored outright over Mississippi, 50.5% chance. 1.93 such upsets expected
The percentages in the first three matchups are lower than last year. Past that
it is about the same as usual.
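Each line in the list above gives only the single most likely upset plus a total for the seed line, but the two numbers together say something about the other three games. The sketch below backs that out, assuming the four games on a seed line are independent; the function names are just for this example.

```python
def remaining_at_least_one(p_top: float, p_any: float) -> float:
    """Chance of at least one upset among the other games on the seed line.

    Uses P(no upset at all) = (1 - p_top) * P(no upset among the others),
    which assumes the games are independent.
    """
    return 1.0 - (1.0 - p_any) / (1.0 - p_top)

def remaining_expected(p_top: float, expected_total: float) -> float:
    """Expected upsets among the other games (expected counts simply add)."""
    return expected_total - p_top

# 16 over 1: top game 4.2%, at least one 8.1%.
print(f"other three 16 seeds combined: {remaining_at_least_one(0.042, 0.081):.1%}")
# 12 over 5: top game 36.8%, 1.24 expected in total.
others = remaining_expected(0.368, 1.24)
print(f"other three 12 seeds: {others:.3f} expected, about {others / 3:.1%} each on average")
```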
Best upset chances in some of the later rounds:
8 over 1: Syracuse over Gonzaga, 24.2% chance. 0.76 such upsets expected.
7 over 2: Nevada over Michigan, 33.2% chance. 1.22 such upsets expected.
6 over 3: Buffalo over Texas Tech, 42.5% chance. 1.59 such upsets expected.
5 over 4: Auburn is favored outright over Kansas, 55.5% chance. 1.87 such upsets expected
4 or 5 over 1: Auburn over UNC, 35.8% chance. 1.23 such upsets expected.
3 over 2: Texas Tech over Michigan, 47.9% chance. 1.72 such upsets expected.
2 over 1: Kentucky over UNC, 45.3% chance.
The pressure points in terms of bracket craziness seem to lie with Gonzaga and UNC as
1 seeds, and Michigan and Tennessee as 2 seeds. We also see that, in general,
the strength of especially the 1 seeds is enough that it is entirely reasonable
to pick no upsets with them until the 4 seed matchup at the earliest. Even there,
probably three of the 1 seeds can be expected to make the elite 8.
To summarize, it pays to go a lot of chalk this year in a small pool. I would recommend
in that scenario advancing at least three of the 1 seeds to the final four and having
one of them win the title. Michigan St could maybe be included in that group also.
In a larger pool (at least 100 brackets), you could branch out a little bit and
a good mix would be two 1 seeds, a 2 seed, and someone else from the remaining field
to make the final four. All of the 3 seeds are good choices for the last in. Florida
St and Virginia Tech are OK as 4 seeds, Auburn is the best 5 seed, and Buffalo
and Iowa St your best bets as 6 seeds if you want to go a bit deep. For a complete
deep dive, Florida and St Marys are the most likely double digit seeds to make the Final
Four but each has less than a 2% chance to make it. Virginia is by far the best
team to pick to win it all, but be aware Duke has beaten them twice this year and has
at times looked like the real best team. They are close in team strength but
Virginia just has the much easier region to navigate.
Also a reminder about my random bracket generator: The smaller the pool is, the more
you want to just slide that slider all the way to "conservative". For the largest
of pools, you want it right where it defaults in the middle. Sliding any more toward
"crazy" is pointless unless you want to throw away equity to have some fun.
Good luck with your brackets!
ENTRY 45 - 3-17-19 - Last Bracketology of the Year
Well, it's the best time of year again. It is time for the final bracket projection!
First, some methodology. This is the first year of the NET, so nobody really knows
what is going to happen this time. However, we can guess that it will be similar to last
year with the RPI quadrants. The problem in all of this is my program does not pull
any information from outside sources except the game scores and locations. The RPI formula
was easy enough to duplicate to use in previous years, but the NET is not. I have
been using an approximation of the NET to form quadrants but it is not perfect and is
off by 20+ ranking spots for some teams, which messes with the quadrants. This is a
big source of uncertainty in the final rankings.
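To see why a 20-spot ranking error matters, here is a sketch of quadrant assignment. The cutoffs are the NCAA's published quadrant boundaries as I understand them (they are not taken from this post or from the program), and the opponent ranks in the example are made up for illustration.

```python
# Upper rank bounds for Quadrants 1, 2, 3 by game site; anything worse is Quadrant 4.
# Assumption: NCAA team-sheet cutoffs (home 30/75/160, neutral 50/100/200, away 75/135/240).
CUTOFFS = {
    "home": (30, 75, 160),
    "neutral": (50, 100, 200),
    "away": (75, 135, 240),
}

def quadrant(opponent_rank: int, site: str) -> int:
    """Return the quadrant (1-4) of a game given the opponent's rank and the site."""
    for number, cutoff in enumerate(CUTOFFS[site], start=1):
        if opponent_rank <= cutoff:
            return number
    return 4

# Hypothetical example: a road opponent truly ranked 70 that an approximation puts at 92.
print(quadrant(70, "away"))   # with the true rank
print(quadrant(92, "away"))   # with a rank that is off by 22 spots
```

A miss of that size near a boundary moves the game from one quadrant to the next, which is how an imperfect NET approximation ends up changing a team's count of quality wins.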
I tweaked the formula a little bit this year, as described in some other posts. It
weighs even more heavily the team strength, as it is my NET approximation and the
committee might be relying on it a lot for seeding relatively equal looking resumes.
I also made signature wins worth quite a bit more, as some of my recent tournament misses
have been teams without a signature win.
I have non-con SOS relatively low as a metric right now. This might be a mistake, but
in previous years it just didn't seem to matter much and I'm trusting my gut.
A new thing this year is I am running my power rankings with a recency bias rather
than weighting all games equally. This would seem to fly in the face of the committee's
mantra that all games are equal, but for some reason it has matched other people's
projections a lot more closely this year, and I am trusting that.
Lastly, as usual, I am basically seeding the tournament based on results through
the end of Saturday. Especially since there is no chance for bid thieves on Sunday,
the committee is expected to basically seed lines 1-12 exactly as they are now and
Sunday results have historically been ignored entirely. The only way this changes is
if there is an upset in a one-bid conference that changes who gets the auto bid.
I imagine the committee has a listing of exactly the order that all auto-bids in
contention will appear, and that order is roughly the current order. Thus, if there
are any upsets, I will just put the new team in whatever slot the old one was in before.
So here are my program's (likely) final seeds of this year, barring upsets.
1: Virginia, Duke, North Carolina, Tennessee,
2: Michigan State, Michigan, Kentucky, Gonzaga,
3: Florida State, Houston, Louisiana State, Kansas,
4: Texas Tech, Purdue, Wisconsin, Auburn,
5: Virginia Tech, Kansas State, Iowa State, Mississippi State,
6: Villanova, Louisville, Buffalo, Marquette,
7: Maryland, Wofford, Cincinnati, Nevada,
8: Iowa, Seton Hall, Minnesota, Syracuse,
9: Washington, Utah State, Mississippi, Virginia Commonwealth,
10: Central Florida, Florida, NC State, Clemson,
11: Saint Marys, Oklahoma, Baylor, Arizona State, Oregon,
12: Texas Christian, Indiana, New Mexico State, Murray State, Liberty,
13: UC Irvine, Old Dominion, Saint Louis, Georgia State,
14: Northeastern, Yale, Vermont, Northern Kentucky,
15: Montana, Colgate, Gardner Webb, Bradley,
16: Abilene Christian, Prairie View, Iona, North Dakota State, Fairleigh Dickinson, North Carolina Central,
Some comments on what appears here:
- The most likely misses with these projections are NC State and Clemson. A good
number of people have these two teams out, NC State because of bad non-con SOS (ranked dead last)
and both of them because of a lack of quality wins. My program sees their relatively high team
strength but this has proven to be a trap in some previous years, like with Louisville and USC
last year. Indiana could also end up getting cut.
- Some possible teams from the bubble cut line that could fill in for these three teams
are Temple, Ohio State, Belmont, and St Johns. Temple has a bit of a signature win
problem itself, but has a much better record and non-con SOS. Ohio State had some
circumstances - their best player was out for a few games near the end that tanked
their record, but he is back now. Belmont is a great story but has a signature
win problem and the OVC was very bad. St Johns had some great wins in conference, beating
Marquette and Villanova multiple times, but has a terrible power ranking - worst
out of all teams in consideration.
- Some longer shots, but still possible, are Texas, Creighton, Alabama, UNC Greensboro, and Furman.
Texas is 16-16, nothing else to say there. Creighton came on strong but fell just short.
Alabama just has too many losses at 18-15 with not enough to compensate. UNC Greensboro
is 28-6 and an RPI of 32 but no signature wins so they're out. Furman is similar but
has a win at Villanova in the noncon. Their problem is depth of wins - their other signature
wins are home wins over the top 4 SoCon teams. Past that it gets really rough.
- The problem with team power rating is that it applies pretty well to the top seed lines,
but not nearly as well near the cut line.
Most people have Gonzaga as a clear 1 seed just on their team power, even though they
have a critically low number of signature wins for a 1 seed. It seems to me more and
more that next year I need to implement some kind of conditional logic on how the
weights are applied so they are flexible to which part of the bracket they reference.
Given my thoughts above and some other small tweaks, here is how I would seed the
bracket today. These are my own personal opinions. I started from my program's projections
and just made small tweaks from there so large portions look the same.
1: Duke, Virginia, Gonzaga, Tennessee,
2: North Carolina, Kentucky, Michigan State, Michigan,
3: Houston, Louisiana State, Florida State, Kansas,
4: Texas Tech, Wisconsin, Purdue, Iowa State,
5: Virginia Tech, Kansas State, Mississippi State, Villanova,
6: Auburn, Buffalo, Marquette, Maryland,
7: Louisville, Wofford, Cincinnati, Nevada,
8: Iowa, Seton Hall, Minnesota, Syracuse,
9: Washington, Utah State, Mississippi, Central Florida,
10: Virginia Commonwealth, Florida, Oklahoma, Baylor,
11: Saint Marys, Arizona State, Oregon, Texas Christian, Indiana
12: NC State, Temple, New Mexico State, Murray State, Liberty,
13: UC Irvine, Old Dominion, Saint Louis, Yale,
14: Georgia State, Northeastern, Vermont, Northern Kentucky,
15: Montana, Colgate, Gardner Webb, Bradley,
16: Abilene Christian, Prairie View, Iona, North Dakota State, Fairleigh Dickinson, North Carolina Central,
I will take a chance on NC State barely making it, and St Johns missing. The NET ranking
difference is just too large.
More to come once the bracket is revealed! I will break down the field with
some probabilities and figure out what kind of year this is.
ENTRY 44 - 2-10-19 - The 2019 Early Seed Reveal
First of all, for anyone reading this, I have had a lot less time to write blog entries since
starting my new job and there has not been a lot to report on. I do still collect data on
percentiles in the NCAA tournament, but I just can't take the time to put together a big blog
post about it. Suffice it to say, my system has had reasonable percentiles with respect to
the public, but any single tournament is difficult to get a lot of information from because
of the variance involved. Some time I'll write a bigger post including data from all of the
tournaments I've been collecting data for.
The same goes for football. I haven't had time to do a lot of analysis and adjustment of the
formula, so nothing much to report there. The one thing I do want to definitely do this summer
is revisit home court advantages for both sports and see if adjustments are needed.
At the very least, I will do a seed projection post in about a month, and then a March Madness
breakdown and bracket picking guide.
With that said, we do have some business to speak of today. The top four seed lines were
revealed by the NCAA selection committee yesterday, and we must determine if there are any
important changes needed to the formula to account for the new NET ranking situation.
For anyone who does not know, the committee has discarded the RPI and is now using a new
formula devised by themselves, the NET, to make team sheets and assess quality wins. The
introduction of this new system throws a huge variable into the mix since it is unclear how
much they will lean on this metric. The formula is also not publicly available. The stated
objective of my system is to do all ranking and prediction using only the games that happened
and the results. In particular, while sticking to this restriction, I cannot pull outside
rankings such as the AP Poll or the NET as inputs to the seeding process. I can, however, try
to approximate them using the information I have and use the approximations.
I have already made some changes to my formula this year to try to match what I am seeing
on other people's seed projections and continue to adjust the process to account for special
cases. My main changes so far have been to further emphasize signature wins, remove
influence of RPI entirely (of course), use an approximation of NET for designation of
Quadrants for games, and severely cut emphasis on strength of schedule. One other change
to consider comes from a recent addition to the team sheets, an "Average NET Win" and "Average
NET Loss". These numbers may have some merit for weighting and are worth tinkering with,
as long as I can make some approximation of these.
Here were my program's top 4 seed lines going into the reveal:
1: Virginia, Duke, Tennessee, North Carolina,
2: Gonzaga, Michigan, Kentucky, Michigan State,
3: Louisville, Kansas, Houston, Purdue,
4: Villanova, Wisconsin, Iowa State, Iowa
(5: Virginia Tech, LSU, Marquette, Nevada)
And the actual top 4 seed lines revealed by the committee:
1: Duke, Tennessee, Virginia, Gonzaga
2: Kentucky, Michigan, North Carolina, Michigan State
3: Purdue, Kansas, Houston, Marquette
4: Iowa State, Nevada, Louisville, Wisconsin
(close: Villanova, Virginia Tech, LSU)
Some key commentary we got directly from the committee head:
- "Duke and Tennessee are basically 1a and 1b. Duke's SOS won out in the end."
- "Nevada was in the hunt to move up a line."
- "Wisconsin is in there because of quality G1 wins."
- "Villanova was the next team up on the 5 line."
- "Other teams close by: Virginia Tech, LSU."
My thoughts:
- I only had two complete misses: Villanova and Iowa vs. Marquette and Nevada.
- Villanova I can excuse. Most people on Bracket Matrix had Villanova in the middle of the 4 line.
- Iowa is a bigger miss. The metric helping them the most is Q1 and Q2 wins and win %.
- I literally cannot find a metric which is putting Marquette on the 3 line. They have NET of #21,
middling ranks in other computers, so-so Q1 and Q2 numbers, a not-great loss, a middling SOS, etc.
I am baffled here and am just going to accept this one. Actually, the only metric Marquette is doing
well in is AP poll rank of #10, which I don't have an approximation of currently.
- Nevada also has no business being on the 4 line currently, and much less being "in the hunt
for a 3 seed". They have very few quality wins, and actually no Q1 games at all! I believe they may
also be benefiting from a way over-inflated #5 AP rank. Predictive systems have them closer to #15-#20
(NET has them at #15).
- I'm not sure how Tennessee can be ahead of Virginia unless you are putting your entire stock
in that Gonzaga win.
Takeaways for System Modification:
- The committee insinuated SOS is still important. There is some evidence this is true.
- The committee does not seem concerned with the lack of depth in quality wins for Gonzaga, Nevada,
and Houston. The team power seems to hold enough weight to go on with.
- I would assume Villanova was hurt by some of their losses. Thus bad losses maybe should bump up a bit
in importance.
Overall, most of my real questions this year are closer to the cut line, so this reveal was not
particularly enlightening. I have been running some randomized simulations of the rest of the year,
and with the current weights, some really startling things are happening. Some examples of teams
that have been projected as at large bids by my system in different scenarios:
- Texas team with a losing overall record (15-16) but 7 Q1 wins
- Oklahoma team that is 6-12 in conference
- Indiana team that is 7-13 in conference
- Oregon team that is 10-8 in a bad Pac-12 and no wins over Washington
- Utah team that is 12-6 LOL
- Toledo team that is 26-5 and beats Buffalo, but 2nd best win is probably @ Northern Illinois
- BYU team that is 11-5 in WCC. Enough said.
Basically, I expect this to be the most embarrassing year ever in terms of what it takes to get
an at-large bid. The teams I listed above are an extreme version of the classical conundrum: Do you take a
major conference team with quality wins but a bad record, or do you take mid-majors that have
done nothing all year except refuse to lose? This has always been the case, and in the recent past,
the committee has sided with the bad major conference teams. However, I think this year may
represent a breaking point. The Pac 12 is not usually this bad, so that is 3 more slots up for grabs.
The Big East is a little bit down. The American is a little down. Some other conferences like A-10 are
single bid. This opens up more slots than usual on the bubble, at a time when many of the teams with
resume building wins have dismal conference records. We shall see what happens!
ENTRY 43 - 3-12-18 - Quick Update on Bracket Matrix
Bracket Matrix just finished compiling everyone's brackets. Here are the final
results:
- My system's 341 was good enough for a tie for 94th place out of 187 brackets, which is
just about exactly at the halfway point. I outperformed almost all of the
other computer brackets. I can consider this first year in the Quadrant era
a success I think.
- My own 355, although it was not officially counted on Bracket Matrix, would
have been good enough for a tie for 17th place. As I said in the last update,
I basically got it by taking my computer's bracket and making a few "obvious"
switches based on perception - Xavier for UNC on the 1 line, San Diego St onto
the 11 line to switch with New Mexico St, etc.
I guess maybe I should submit a seed projection for next year?
ENTRY 42 - 3-11-18 - Projection Results and some Tournament Analysis
First, quick observations about the real bracket:
- My system's bracket matrix score was 341. Now, this has some good news and bad news:
- The good news is this is my system's highest score yet. It beats my previous record of 333 last year.
- The bad news? I think other than the issue of which teams to have in the field,
including the two big surprises, this field and seeding was pretty easy to get. My
own projections, as an example, got a score of 355. And I don't pretend to
religiously follow all the NCAA procedures and such. So I think my system will
again be under the average on Bracket Matrix.
- The NCAA Committee member who explained the bracket and seeding did an awful
lot of talking about signature wins. He also mentioned signature wins in the
context of RPI, and made no real mentions of advanced metrics. This explains
the exclusion of USC and Louisville (who had very bad Best Wins) in favor of
Arizona St and Syracuse, which had really great wins. Saint Mary's 1 signature
win was not enough, and the committee cited their 26 total Group 3 and 4 games.
An idea for next year: I really need to play around with weights that are not
just linear, but dependent on each other for value. For example, SOS does not
matter as much when signature wins in multiples can back it up.
OK, now on to the bracket!
Like last year, we'll run some simulations (100,000 of them) and break down
chances of each team to make final 4 and also to claim the title, and compare
to previous years.
Final Four Breakdown
(1)Virginia: 48281
(1)Villanova: 38889
(2)Duke: 29198
(2)North Carolina: 23321
(2)Purdue: 22547
(1)Xavier: 21693
(3)Michigan State: 20491
(2)Cincinnati: 19136
(1)Kansas: 17730
(3)Michigan: 15924
(3)Tennessee: 11713
(4)Gonzaga: 11258
(4)Auburn: 10491
(5)Ohio State: 9457
(3)Texas Tech: 8090
(5)West Virginia: 7626
(4)Wichita State: 7380
(6)Houston: 7223
(5)Clemson: 6943
(5)Kentucky: 5611
(4)Arizona: 5432
(6)Florida: 4192
(6)Texas Christian: 3854
(8)Seton Hall: 3212
(10)Butler: 2980
(7)Texas AM: 2846
(7)Arkansas: 2560
(9)NC State: 2365
(8)Missouri: 2185
(7)Nevada: 2036
(7)Rhode Island: 2033
(8)Creighton: 1846
(9)Florida State: 1829
(8)Virginia Tech: 1814
(11)UCLA: 1687
(11)San Diego State: 1669
(11)Loyola Ill: 1627
(9)Alabama: 1464
(10)Providence: 1417
(10)Oklahoma: 1187
(6)Miami: 1173
(11)Arizona State: 1084
(12)Davidson: 952
(12)New Mexico State: 944
(10)Texas: 918
(9)Kansas State: 896
(12)South Dakota State: 546
(12)Murray State: 476
(13)UNC Greensboro: 321
(14)Montana: 296
(13)Buffalo: 259
(14)Bucknell: 234
(14)Stephen F Austin: 157
(13)Marshall: 125
(13)College of Charleston: 119
(15)Georgia State: 96
(16)Pennsylvania: 88
(15)Iona: 27
(15)Lipscomb: 15
(14)Wright State: 14
(15)Cal State Fullerton: 13
(16)UMBC: 10
Champions Breakdown
(1)Virginia: 22737
(1)Villanova: 14902
(2)Duke: 8834
(2)Purdue: 6050
(2)Cincinnati: 5603
(2)North Carolina: 5262
(3)Michigan State: 5108
(1)Xavier: 4342
(1)Kansas: 3916
(3)Michigan: 3345
(3)Tennessee: 2663
(4)Gonzaga: 1874
(4)Auburn: 1593
(5)Ohio State: 1433
(5)West Virginia: 1431
(3)Texas Tech: 1227
(6)Houston: 1129
(4)Wichita State: 1080
(5)Kentucky: 1055
(4)Arizona: 976
(5)Clemson: 859
(6)Florida: 574
(6)Texas Christian: 403
(10)Butler: 357
(8)Seton Hall: 281
(7)Arkansas: 277
(7)Nevada: 246
(8)Creighton: 242
(7)Texas AM: 226
(9)NC State: 182
(7)Rhode Island: 174
(11)Loyola Ill: 173
(8)Virginia Tech: 144
(11)San Diego State: 129
(8)Missouri: 128
(9)Alabama: 123
(11)UCLA: 122
(9)Florida State: 114
(6)Miami: 96
(10)Providence: 85
(12)Davidson: 82
(9)Kansas State: 78
(10)Texas: 78
(10)Oklahoma: 71
(11)Arizona State: 70
(12)New Mexico State: 37
(12)South Dakota State: 21
(12)Murray State: 18
(13)Buffalo: 18
(14)Bucknell: 11
(14)Montana: 5
(14)Stephen F Austin: 4
(13)UNC Greensboro: 3
(15)Georgia State: 3
(13)College of Charleston: 3
(16)Pennsylvania: 2
(13)Marshall: 1
Here is the number of teams reaching each threshold, by year, over the last few years:
At least 10% Chance at Final 4:
2015 - 9
2016 - 18
2017 - 15
2018 - 13
At least 1% Chance at Final 4:
2015 - 28
2016 - 43
2017 - 43
2018 - 42
At least 10% Chance at Title:
2015 - 4
2016 - 4
2017 - 2
2018 - 2
At least 1% Chance at Title:
2015 - 12
2016 - 18
2017 - 24
2018 - 19
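The 2018 counts above can be reproduced directly from the simulation tallies. Here is a minimal sketch, assuming the Champions Breakdown numbers are counts out of 100,000 simulations; the helper name count_at_least is made up for illustration, and only the top of the list is copied in since no team below it reaches 1%.

```python
N_SIMS = 100_000

# Title counts copied from the top of the Champions Breakdown above.
title_counts = {
    "Virginia": 22737, "Villanova": 14902, "Duke": 8834, "Purdue": 6050,
    "Cincinnati": 5603, "North Carolina": 5262, "Michigan State": 5108,
    "Xavier": 4342, "Kansas": 3916, "Michigan": 3345, "Tennessee": 2663,
    "Gonzaga": 1874, "Auburn": 1593, "Ohio State": 1433,
    "West Virginia": 1431, "Texas Tech": 1227, "Houston": 1129,
    "Wichita State": 1080, "Kentucky": 1055, "Arizona": 976, "Clemson": 859,
}


def count_at_least(counts, threshold, n_sims=N_SIMS):
    """Number of teams whose simulated frequency is at least `threshold`."""
    return sum(1 for c in counts.values() if c / n_sims >= threshold)


print(count_at_least(title_counts, 0.10))  # teams with at least a 10% title chance
print(count_at_least(title_counts, 0.01))  # teams with at least a 1% title chance
```

Running the same helper on the Final Four counts gives the Final Four thresholds.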
This year seems to be a pretty average year. It is a little top heavy, with a lot
of the power residing in the top 6 or so teams, including 3 heavy hitters in Virginia,
Villanova, and Duke. However, once you get past those 3, it becomes more wide open
and the next real drop off happens near the end of the 5 line, where the 1 percent
chance at title divide is. Almost every at large team can reasonably make the
final 4.
Some bracket picking tips:
- Virginia is a clear favorite this year with a 22 percent chance at a title.
- With that said, the other 1 seeds are really weak. Only 1.24 of the 1 seeds are expected
to reach the Final Four, which means the most likely outcome is that only a single
1 seed makes it there.
- My system outright picks the following seed upsets: (10) Butler over (7) Arkansas,
(5) Kentucky over (4) Arizona, (5) West Virginia over (4) Wichita State,
(2) North Carolina over (1) Xavier, (2) Duke over (1) Kansas.
There were a lot of terrible teams that won their conference tourney, but it turns out
there is still a decent chance of each level of upset, even the bigger ones. This is
due to weaker than usual 1-4 seed lines also. Here are the most likely upsets at
each seed line, in addition to the chances of an upset there:
- 16 over 1: Penn over Kansas is probably the best 1 seed upset chance we've had in
many years, at 13.7%. In total a 19.9% chance of at least one of these happening.
- 15 over 2: Georgia St over Cincy at 12.4%, 28.0% chance of at least one.
- 14 over 3: SFA over Texas Tech at 20.2%, 50.8% chance of at least one.
- 13 over 4: Buffalo over Arizona at 25.4%, 60.7% chance of at least one.
- 12 over 5: Davidson over Kentucky at 35%, expected 1.2 such upsets.
- 11 over 6: Loyola Ill over Miami at 49.9%, expected 1.68 such upsets.
- 10 over 7: Butler over Arkansas at 50.9%, expected 1.84 such upsets (almost 50%).
- 9 over 8: Florida St over Missouri at 48.7%, expected 1.87 such upsets. Notably, all four
8 seeds are somehow favored, with the 9 seeds' chances all between 45 and 49%.
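For reference, the "at least one" and "expected" figures in a list like this combine the four per-game upset chances on a seed line. A minimal sketch, assuming the four games are independent: the chance of at least one upset is one minus the product of the favorites' win chances, and the expected count is just the sum. Only the top chance on each line is quoted above, so the other three values below are hypothetical placeholders, not the real numbers.

```python
from math import prod


def at_least_one(upset_probs):
    """Chance that at least one of several independent upsets happens."""
    return 1.0 - prod(1.0 - p for p in upset_probs)


def expected_upsets(upset_probs):
    """Expected number of upsets (holds even without independence)."""
    return sum(upset_probs)


# 0.137 is Penn over Kansas from the list above.
# The other three 16-over-1 chances are made up for illustration.
sixteen_over_one = [0.137, 0.03, 0.025, 0.02]

print(at_least_one(sixteen_over_one))
print(expected_upsets(sixteen_over_one))
```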
Best upset chances in some of the later rounds:
- 8 or 9 over 1: Seton Hall over Kansas at 33.5%, expected 1.01 such upsets
- 7 or 10 over 2: Butler over Purdue at 32.3%, expected 1.15 such upsets
- 6 over 3: Florida over Texas Tech at 45.4%, expected 1.51 such upsets
- 5 over 4: All four are practically coin flips, best is West Virginia over Wichita St. at 51.9%,
expected 1.99 such upsets
- 4 or 5 over 1: Gonzaga over Xavier at 45%, expected 1.43 such upsets
- 3 over 2: Michigan over UNC at 47.9%, expected 1.72 such upsets
Overall, this year again looks pretty crazy. The crazy could start in the
first round due to some weak teams especially in the 6-8 range. If there is
too much carnage in the first round it could clear the way for the 1-3 seeds to
have a safer trip deep into the tournament so it goes both ways. Overall the
opening round is expected to have about 8 upsets.
I'll see you again after the first barrage of games for an update on how
my brackets are doing and how crazy the tournament has been. I used the
idea I had last year about weighting the brackets I am entering into contests
slightly more in favor of the favorites in proportion to the expected
number of points that team will grab in the contest. This is roughly equivalent
to a 0.6 upset level, which can be approximated by setting the slider on
my bracket maker nearly all the way to "Conservative".
ENTRY 41 - 3-11-18 - Final Seed Projection
Well, it is time once again. The bracket will be revealed by the committee just hours
after I type this. So it is time to put out my final projection of the year.
Now, first things first. I learned my lesson from the last two years, and this time
I am putting out my projections now instead of waiting until the Sunday games finish.
Why? Because in all cases, I would have gotten a better projection score using
only data from the day before. It has been pretty obvious that the committee does
not really consider the games that happen on Sunday. The only exception is if
a different team gets an autobid and it causes a lost bid or a different bid in a
bad conference. Then obviously this will be fixed.
Second piece of business: What did I learn from the committee's early reveal a
month ago? Well, here are the actual seeds given by the committee:
1. Virginia, Villanova, Xavier, Purdue
2. Auburn, Kansas, Duke, Cincinnati (Kansas, Cincy switched for region balance)
3. Clemson, Texas Tech, Mich St, North Carolina
4. Tennessee, Ohio St, Arizona, Oklahoma
My takeaways from this early reveal back on 2/11:
- The four teams they chose as 1 seeds, in addition to the fact that Cincy made
the 2 line, tell me that advanced metrics like kenpom, sagarin, BPI, etc. are
definitely important to them now and will be used for seeding purposes.
- The fact that Mich st still did not make a high line regardless of my first
point tells me that noncon SOS and number of Group 1 wins are still important.
- Texas Tech still was relatively high in the order, proving that enough
Group 1 wins can nullify a bad noncon.
- Arizona showing up shows a bias towards power conferences, even if they are
having a down year.
- Oklahoma showing up here was just a big question mark. It does show that
the committee is very interested in elite level wins, which again Mich St. did
not really have at the time of this reveal.
- The committee member who was actually present on the show said very little.
Unfortunately, the bumble heads on CBS mostly talked over him and didn't let
him say anything. The one thing he did say was they liked Virginia as 1st overall
because of their R/N record, so I have to assume that is also important.
I used these ideas to re-weight the system, specifically trying to match the
committee top four seeds as closely as possible. The best I was able to do from
messing with it a bunch was to get the following list:
1: Villanova, Virginia, Xavier, Kansas,
2: Purdue, Cincinnati, Auburn, Duke,
3: North Carolina, Michigan State, Texas Tech, Clemson,
4: Tennessee, Ohio State, Texas AM, Rhode Island,
Arizona and Oklahoma are both lurking in the 6 seed line on this list but that
just could not be helped. I achieved the above list by mostly cranking up
massively the effect of computer-type rankings on the list, as well as
minor adjustments to what I mentioned above: R/N record, best wins, Group 1 wins,
and noncon SOS. The Group 1 wins got the biggest bump out of these.
So here you go, the current list which will serve as my system's final bracket projection
for 2018:
1: Virginia, Villanova, Kansas, North Carolina,
2: Duke, Xavier, Purdue, Cincinnati,
3: Tennessee, Michigan State, Auburn, Michigan,
4: Wichita State, West Virginia, Arizona, Kentucky,
5: Texas Tech, Clemson, Gonzaga, Ohio State,
6: Houston, Florida, Texas AM, Texas Christian,
7: Arkansas, Nevada, Miami, Rhode Island,
8: Seton Hall, Missouri, Southern California, Virginia Tech,
9: Butler, Creighton, Kansas State, NC State,
10: Providence, Saint Marys, Saint Bonaventure, Louisville,
11: UCLA, Oklahoma, Alabama, Florida State, Loyola Ill, New Mexico St,
12: Davidson, San Diego State, Buffalo, South Dakota State,
13: Murray State, UNC Greensboro, Montana, College of Charleston,
14: Marshall, Bucknell, Wright State, Stephen F Austin,
15: Pennsylvania, Lipscomb, Georgia State, Cal State Fullerton,
16: UMBC, Iona, Radford, Texas Southern, LIU Brooklyn, North Carolina Central,
There are still three out of five Sunday games which could cause a re-working
of this bracket: Davidson could win the A10, causing an extra bid for them and
a lost bid on the bubble, and the Ivy and Sun Belt titles are up for grabs
and could go to either team. Penn is winning right now so they should be good
to go. EDIT: Davidson and Georgia St won.
There has been a lot of carnage this year in the lower leagues. Almost every
one sent a non-ideal team to the tournament. NC Central in particular is one
of the worst teams that has ever been sent to the tournament. SFA started out
as a 16 seed projection but has slipped slowly up the ranks as teams filled
in behind it.
The auto bids end at Texas on the 11 line (EDIT: Florida St). Middle Tennessee is listed as the
first team out (EDIT: Texas), and their score is practically tied with Texas, so that could
really go either way. No other team is anywhere close to those two on the
bubble. The next few teams are Syracuse, Arizona St, Marquette, Oklahoma St,
Baylor. Nobody below that is even worth mentioning.
Bracket Matrix agrees with my list of at large teams, except it leaves out
Louisville and puts instead Arizona St. This would be consistent in some sense
with the emphasis on Best Wins that the committee has. However, Louisville
has good computer numbers and I think this might get them in. Of course,
one thing to be considered is that there is theoretically a difference between
selection criteria and seeding criteria. That means if Louisville cannot make
the resume cut, then the computer number criteria may never come into
consideration.
Even though there are some question marks in the seed list above, I am not
going to continue tinkering today because it just doesn't seem worth the time.
Nobody knows exactly what will happen this year and I'm for the most part
writing this year off as an opportunity for data collection. It will be a
nice benefit if my bracket does OK, but I just have no idea what will happen.
Chief among the concerns:
- There's just no way UNC gets a 1 seed with 10 losses. My system is pulled in
by UNC's massive 13(!) Group 1 wins and Number 1 SOS. No other team in
consideration for the 1 line (in particular Xavier, who seems to be the pick for
last 1 seed) has more than 6 Group 1 wins. Are Group 1 wins that important?
- Everything else looks quite reasonable until the 8 line, where USC is rated
too highly vs. the consensus. Most have them on the 11 line. I think the
truth is somewhere in the middle.
- My system has Providence very low as a 10 seed, vs the consensus on the
boundary between 8/9 line. They have 2 victories over Xavier and one over
Villanova, both projected 1 seeds. They suffer in my system because two
of those three wins were at home, and their computer numbers are not good
due to some terrible losses and in general close games against bad
competition. The moral of this story might be that top tier wins can act as a
proxy for good computer numbers and reduce the need for them.
- Louisville I think will be in but closer to the cut line, definitely an
11 seed.
- I think the committee will try to sneak Mid Tenn in with their pristine
OOC SOS of 10, but I don't know who will be cut to get them in. Louisville
and Texas are the obvious choices but I just don't know.
- Not so sure about SFA on the 14 line but not sure who I'd move up.
So with all of this said, here is my best effort at an S-line and seed
projection (my personal projection, not my system's):
1: Virginia, Villanova, Kansas, Xavier,
2: Duke, North Carolina, Purdue, Cincinnati,
3: Tennessee, Michigan State, Auburn, Michigan,
4: Arizona, West Virginia, Texas Tech, Gonzaga,
5: Kentucky, Wichita St, Ohio State, Clemson,
6: Florida, Houston, Texas AM, Miami,
7: Arkansas, Texas Christian, Nevada, Seton Hall,
8: Rhode Island, Missouri, Creighton, Virginia Tech,
9: Butler, Kansas State, Providence, NC State,
10: Oklahoma, Alabama, Southern California, Saint Bonaventure,
11: Saint Marys, UCLA, Florida State, Texas, San Diego State, Loyola Ill,
12: New Mexico State, Buffalo, Davidson, Murray State,
13: South Dakota State, UNC Greensboro, Marshall, College of Charleston,
14: Montana, Bucknell, Wright State, Georgia State,
15: Stephen F Austin, Pennsylvania, Cal State Fullerton, Iona,
16: Lipscomb, UMBC, Radford, LIU Brooklyn, North Carolina Central, Texas Southern,
OK, I am done today. I will update these if Davidson, Harvard, or Georgia St
wins. If Davidson wins, then it will cause a shift on the bubble, and Davidson
can go on the 12 or 13 line. The other two winning would just replace that
team on that exact spot on the S curve and seed list.
UPDATE: Davidson won. Sorry, Middle Tennessee. Also, Georgia State won.
I've edited both lists above to account for this.
ENTRY 40 - 2-11-18 - Early Rankings Reveal 2018
It is time again for the NCAAM committee to put out their fake rankings up to this point.
I will be playing along and updating my automated seed projections for the day, as well
as giving my own prediction for what will happen in their S Curve.
First, some commentary on my system. I did some work over the last week or two to add
extra categories for weighting in my bracketology system. First and most important,
since the committee is switching how they view road games, I am doing the same. Instead
of categories like Top 50 wins, Top 100, and Top 150 (all in RPI measure), there are
now Group 1, Group 2, Group 3, and Group 4. Here is how the groups break down:
Group 1 = 1-30 Home, 1-50 Neutral, 1-75 Road
Group 2 = 31-75 Home, 51-100 Neutral, 76-135 Road
Group 3 = 76-160 Home, 101-200 Neutral, 136-240 Road
Group 4 = everything else
This will not theoretically change much, and as a baseline, I basically converted my
weightings as Top 50 to Group 1, Top 100 to Group 2, etc.
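The group table above can be sketched as a small lookup. This is only an illustration of the cutoffs as listed, not the system's actual code; the names GROUP_CUTOFFS and game_group are made up, and the site is assumed to be given from the point of view of the team whose resume is being evaluated.

```python
# (group, max opponent RPI rank at home, at a neutral site, on the road)
GROUP_CUTOFFS = [
    (1, 30, 50, 75),
    (2, 75, 100, 135),
    (3, 160, 200, 240),
]
SITE_COLUMN = {"home": 1, "neutral": 2, "road": 3}


def game_group(opponent_rpi_rank, site):
    """Return the group (1-4) of a game from the opponent's RPI rank and site."""
    col = SITE_COLUMN[site]
    for row in GROUP_CUTOFFS:
        if opponent_rpi_rank <= row[col]:
            return row[0]
    return 4  # everything else


print(game_group(60, "home"))  # a rank-60 opponent at home
print(game_group(60, "road"))  # the same opponent on the road
```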
The other major change I made was I finally took the time to add some additional
categories to my weightings which have been missing. Namely, the new weighted values
are Non-Con Strength of Schedule and Conference Strength. These should be important
in determining the correct seeding of the 12-16 seeds in particular, and Non-Con
SOS is also often quoted as being a decider for bubble at-large teams.
Some preliminary optimization seems to show that neither of the two added categories
is all that important, which is a little surprising.
So here is my system's projection for the top 16 teams in the S curve, as well as my
own projections. Keep in mind I have done a minimal amount of optimization of the
weightings to this point, because this year has many different bracket principles
than previous years and optimizing to previous years just has less value. I will,
however, scrutinize the real committee seeds and optimize to those the best I can
to get my system ready for March. The benchmark is that my system with current weights
scores 334 on LAST year's bracket matrix, which is about average.
System:
1: Virginia, Villanova, Xavier, Kansas,
2: Auburn, Clemson, Cincinnati, North Carolina,
3: Texas Tech, Purdue, Duke, Tennessee,
4: Michigan State, Texas AM, Ohio State, West Virginia
My Projections:
1: Villanova, Virginia, Xavier, Kansas,
2: Purdue, North Carolina, Michigan State, Duke,
3: Clemson, Auburn, Cincinnati, Texas Tech,
4: West Virginia, Tennessee, Ohio State, Arizona
My system's 5 seed line:
5: Florida, Rhode Island, Oklahoma, Kentucky
My system does not love the work Arizona has done so far. The Pac 12 is not a
premier conference this year and Arizona still has managed to pick up some losses. In
addition to the ugly losses, they do not have a marquee win to date. Their list
of marquee wins is similar to Rhode Island or Houston. Still, I believe the committee
will probably still view the Pac 12 as "power" and rate the leader accordingly.
Purdue and Michigan St also have surprisingly little meat on the bones for being in a
"power" conference. They too will be overrated by the committee vs. the computer
numbers. The eye test is strong with them. However, the eye test is also strong with
teams like Michigan, St Mary's, etc. and they will not get high seeds. Also, historically
Wichita St has had very strong teams and gotten snubbed (and the same may happen this
year, their wins do not match with their talent). I guess the eye test only applies
if a team is also in a sufficiently strong conference. Cincinnati is strong enough
in computer metrics for a 2 seed, but their conference is too weak for the eye test
to apply to them.
Another note which probably won't affect the early reveal: The WCC is very bad this
year besides the obvious two. They are projected as 13th out of 32 conferences. Not
sure if this is good enough to get the Gonzaga/St. Mary's conference champion too high.
My system says currently an 8 seed for Gonzaga, and if they finish as champions
probably a 6 seed is their peak. However, I could imagine the committee giving them
a 4. This seems to clash with what I said in the previous paragraph and makes no
sense, but whatever.
ENTRY 39 - 2-2-18 - Projected Brackets
Here's a small update regarding a new feature of the site. Occasionally at the same time
that I update my seed projections for the basketball tournament, I will also post a projected
bracket to the Bracket Generator page. This projected bracket is an automatically generated
bracket by the program using the current seeds.
The algorithm currently used to generate the bracket is very unintelligent - it just slots
the teams in with the current S-curve with no regard for conference affiliation, geography,
or anything else. I understand that this will cause the bracket to be unrealistic as far
as actual matchups are concerned, but it was a quick job done for my own and others'
amusement, and also to generate data on the likelihood of teams making the final four
and winning the championship based on current team strength. Kenpom has stated numerous
times on his site that chances of final four and national title tend to follow very closely
to just the team's strength and while overseeding/underseeding and placement in a "difficult
bracket" can affect this, the effect is quite minimal.
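As an illustration of what slotting teams straight off the S-curve can look like, here is a minimal sketch. The serpentine ("snake") ordering is an assumption, since the exact slotting order is not spelled out above, and the function name slot_by_s_curve is made up. It ignores conference, geography, and play-in games, and treats every seed line as exactly four teams.

```python
def slot_by_s_curve(s_curve, n_regions=4):
    """Assign an ordered S-curve list to regions in serpentine order.

    Each region's list ends up holding its 1 seed, 2 seed, 3 seed, ...
    """
    regions = [[] for _ in range(n_regions)]
    for i, team in enumerate(s_curve):
        line, pos = divmod(i, n_regions)
        # Reverse direction on every other seed line so the region with
        # the top 1 seed gets the weakest 2 seed, and so on.
        region = pos if line % 2 == 0 else n_regions - 1 - pos
        regions[region].append(team)
    return regions


# Placeholder team labels in S-curve order (T1 is the top overall seed).
for region in slot_by_s_curve([f"T{i}" for i in range(1, 17)]):
    print(region)
```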
I also am in the process of updating my seeding algorithm to reflect new seeding principles
of using more advanced metrics and also accounting for location of the game on a team's
tournament profile. For example, an RPI 1-30 road win is now theoretically on par with an
RPI 1-75 home win. I've already done the programming to collect the data but I still
need to calibrate the system to effectively use this data. The calibration will be
mostly guesswork for this season, as this is the first season of its use and it is a brand
new world for bracketologists in figuring out how important these new considerations are.
My personal guess is that they will have very little impact and seeding will basically go
forward the same as it always has.
The next post you should see is a response to the committee's initial teaser seeds
they plan to reveal in February. After that, we will be ramping into March Madness which
will bring an official final seeding projection, a final bracket, and then the spam
of percentile data.
I also plan to do a number of mini-projects in the summer during which I will test
some possible improvements to the prediction algorithm. I will test some of the
conventional wisdom like the boon from a bye week, the effect of a "trap" game before
playing a big rival, the "hangover" from winning a big game especially in upset fashion,
and other things people claim have a tangible effect on the outcome of games. I also
am curious to see whether I can create some kind of device to guess when a team
has suffered a serious injury by looking only at results.
ENTRY 38 - 8-6-17 - Calibrating the College Basketball Win Probability Model
In Entry 22, I discussed the question: Are the win probabilities my program is producing accurate?
If my program gives a team a 63% edge, are they really 63% to win the game? Anyone using these
probabilities for things such as betting, picking for March Madness pools, etc. would like
to know if they're getting what is advertised.
Entry 22 covered the college football program, while this entry covers college basketball. I
compared program win probabilities vs. actual results for all games in the last 6 seasons
(through 2011-2012), which gave me 16554 games of data.
For each season I used only data after
January 15th so the program has enough data to go on. Each game was placed in a 2% wide
"bin" as in Entry 22. I looked at the bins
to see if actual outcomes matched up with given probabilities.
The results were not as exciting as in football. The correlation of just a straight line fit
was a very high R^2 = 0.9934. Over the 6 seasons, the program got about 1% fewer games correct than
expected, and this difference was mostly experienced at the higher probabilities.
In this graph, the x-axis is the bin (program's given win probability), while the y-axis is the
actual observed win percentage for that bin. Most bins had a sample size of over 500 games,
with the smallest bin (the 98-100% bin) having 277 games.
I will slightly modify the produced probabilities from this point forward but it will not make
a huge difference.
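The binning check described in this entry can be sketched roughly as follows. This is an illustration under assumptions, not the program's own code: each game is assumed to come in as a (predicted win probability, did that team win) pair, and the function name calibration_table is made up. How the original analysis chose which team's probability to record for each game is not stated, so that choice is left to the caller.

```python
def calibration_table(games, n_bins=50):
    """Group games into equal-width probability bins (50 bins = 2% wide).

    games: iterable of (predicted_prob, won) pairs, with won as bool or 0/1.
    Returns {bin_index: (n_games, mean_predicted, observed_win_rate)}.
    """
    totals = {}
    for p, won in games:
        # A probability of exactly 1.0 goes into the top (98-100%) bin.
        b = min(int(p * n_bins), n_bins - 1)
        n, sum_p, wins = totals.get(b, (0, 0.0, 0))
        totals[b] = (n + 1, sum_p + p, wins + int(won))
    return {b: (n, sum_p / n, wins / n)
            for b, (n, sum_p, wins) in sorted(totals.items())}


# Tiny made-up sample: three games at 63% and two near-certain games.
sample = [(0.63, True), (0.63, False), (0.63, True), (0.99, True), (1.0, True)]
for b, (n, predicted, observed) in calibration_table(sample).items():
    print(f"{2 * b}-{2 * b + 2}%: n={n}, predicted={predicted:.3f}, observed={observed:.3f}")
```

A well calibrated model has observed close to predicted in every bin, which is what the straight-line fit is measuring.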
ENTRY 37 - 3-18-17 - Some Round 1 Analysis
First, let's get this out of the way. Quote from Blog entry 36:
There is a big drop off in team power right around the 7-8 seed line. What this
means is I expect less upsets than usual during the opening 32 games.
I would stay away from any seeds 11 and below
making it very far though, as the top 6 seed lines are just too strong relative to the
rest of the field. Let the chaos break out after a quiet first round.
So... yeah. That is pretty much what happened. This might go down as the quietest
first round in tournament history. Even every "upset" that happened was one of the
"obvious" ones that was on everyone's radar. The only true shocker of an "upset" was
SMU's early departure to USC.
So remember the rest of what I said in the last post as well: expect a lot of chaos
from here on out as a lot of good teams remaining in the field means a lot of good
chances at upsets. Nobody is safe. I would expect at least 2 of the 1 and 2 seeds
to fall.
So what does this mean for my system's Round 1? Let's run some numbers.
- My system's chances at a perfect first round are 1 in 128,000. This is the
highest first round probability I've ever seen. It is about 20 times more likely than
the average first round. The odds increase even further at the optimal upset
level of 0.57 (all the way to conservative on the scale), where the chances
of a perfect first round are 1 in 40,300.
- My system would get about 1800 1-loss brackets given the scale of ESPN. Actual
ESPN got 4688 1-loss brackets. I have seen before that the distribution of human
brackets follows most closely the distribution of the 0.7 upset level, so quite
conservative. There exist a few "bracket farmers" which systematically cover a lot
of possibilities and represent most of the high end ESPN brackets. My system
on a 0.7 upset level would have (scaled) 7000 1-loss brackets.
- ESPN still has 164 perfect brackets out of approximately 18.8 million brackets.
This gives odds of 1 in 114,677, very comparable.
- The national bracket and seed favorites bracket each scored 260 points in the first round.
The upset level 1 simulator had a 16% chance of a bracket at least that good, and
the upset level 0.7 simulator had a 31% chance of a bracket with at least 260 points.
- The highest score in my 60 bracket ESPN group is 280. The chances at the two upset
levels of generating a bracket at least that good are 2.4% and 6.2% respectively.
My 6 brackets I entered in that group (3 each of 1 and 0.7 levels) managed a best of
2 brackets with 270.
On a side note, I got an idea for future bracket generation that I'm surprised I did
not explore before. I have kept a log of possible change ideas for my system ever since
it began in 2007. One idea I had in 2014 but never followed up sufficiently on was the
following: The goal of a high bracket score is NOT the same as the goal of getting as
many games right PER ROUND as possible. The problem is that trying to optimize
round by round might kill too much of the chances for a team that ends up advancing far.
When my system gives Texas Southern over 5% chance of beating UNC, the 10 points from
Round 1 is not all that is at stake. UNC has expected points it should receive in the
future rounds, and those points far outweigh the impact of just the single game.
My idea on this front (which I will try to implement in future years) is to come up with
an "expected contribution" of each team to standard scoring points (double points for
each successive round). Then, when flipping weighted coins for each matchup, do not
compare just the probability of that matchup but the EXPECTED CONTRIBUTIONS of each
team in the tournament as a ratio. These numbers can be found by running a large number
of simulations using the real probabilities.
As an example, (3)Oregon vs. (14)Iona is given as a 10.6% chance for Iona. Unfortunately,
this single round's outcome is not the only thing at stake. Oregon would probably win
a second game after this one if it wins, while Iona would not. Oregon may even go on
to win a 3rd and 4th. This is a lot of potential lost points. Calculating instead
each team's expected number of added ESPN points for the year, I get about 64 for Oregon
and about 1.5 for Iona. If I create brackets giving each team its fair share of the
expected number of points earned, it turns the matchup into only a 2.3% chance for Iona.
Returning to the (1)UNC vs (16)Texas Southern example above, the current predictor would
give Texas Southern a 5.4% chance of an upset. However, the expected ESPN point contributions
of the two teams are 121 vs. 0.6. Thus using the ratio of expected point contributions instead
would give a recommended upset frequency of 0.5%. This will significantly reduce the
number of ruined brackets but still give Texas Southern a "fair" chance.
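The arithmetic behind both examples is just each underdog's share of the two teams' combined expected points. A short sketch, using the expected-point figures quoted above (the function name is hypothetical):

```python
def upset_frequency(underdog_points, favorite_points):
    """Underdog's share of the two teams' combined expected points."""
    return underdog_points / (underdog_points + favorite_points)


iona = upset_frequency(1.5, 64)            # (14)Iona vs. (3)Oregon
texas_southern = upset_frequency(0.6, 121)  # (16)Texas Southern vs. (1)UNC

print(f"Iona: {iona:.1%}")                      # Iona: 2.3%
print(f"Texas Southern: {texas_southern:.1%}")  # Texas Southern: 0.5%
```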
This idea is close to, but not quite equivalent to, just using the regular numbers with a 0.65
upset level, which is the bracket generator slider placed most of the way to conservative.
Making this change would lower the chances of a perfect bracket, but theoretically it
should maximize the expected value of brackets by reducing the number of brackets
brutally destroyed by picking an upset of a team that is expected to go very far.
ENTRY 36 - 3-12-17 - Bracketology wrap-up and the NCAA Tourney Special
First, some quick hitters and observations on the seedings and tournament:
- My bracket matrix score is 333. This is my best official score so far and should definitely
put me in the top half. I am satisfied with this score. I will update later on my exact
rank.
- Strength of Schedule continues to be a strong factor. It's possible I didn't have enough
strength of schedule factor because I pulled back. I was worried about Gonzaga but should
have just left them as a lost cause.
- I turned out to be correct about Vanderbilt - they were ranked higher than most people
expected. My program's 7 seed was still too optimistic.
- I should have listened to some of the other bracketologists regarding just ignoring
Sunday altogether. Almost every projection move resulting from Sunday's games was in the
wrong direction. My bracket matrix score would have been 341 if I just used Saturday
night's projections. Will do this in the future until notified that it is unwise.
- Also regarding late moves, the selection committee chair described a "scrubbing" process
by which the bracket was pretty much set on WEDNESDAY and they made only small adjustments
from there. Teams could get "stuck" if they were unable to move ahead of a team immediately
in front. This was the explanation for UNC being the 1 seed ahead of Duke, who won the ACC
tourney while beating UNC. This scrubbing process also places a higher emphasis on the
regular season conference titles rather than the tournament, which can explain some of
my misses.
Now, for the good stuff. We need to look at the tourney field and do some simulations!
We'll start with the projections to make the final four and win it all,
with 10,000 simulations run.
Final Four Breakdown
(1)Gonzaga: 3422
(1)Villanova: 3198
(1)Kansas: 2726
(1)North Carolina: 2653
(2)Kentucky: 2411
(2)Duke: 1851
(2)Louisville: 1810
(2)Arizona: 1695
(3)Oregon: 1668
(3)UCLA: 1356
(4)West Virginia: 1341
(4)Florida: 1259
(3)Florida State: 1232
(6)Southern Methodist: 1089
(4)Purdue: 1076
(10)Wichita State: 920
(5)Virginia: 883
(4)Butler: 878
(3)Baylor: 811
(5)Iowa State: 760
(7)Saint Marys: 754
(5)Notre Dame: 682
(6)Cincinnati: 630
(7)Michigan: 622
(5)Minnesota: 400
(6)Creighton: 388
(8)Wisconsin: 361
(10)Oklahoma State: 322
(8)Arkansas: 268
(8)Miami: 224
(6)Maryland: 214
(9)Vanderbilt: 213
(10)Marquette: 188
(7)South Carolina: 168
(11)Rhode Island: 160
(8)Northwestern: 143
(9)Seton Hall: 131
(11)Wake Forest: 127
(12)Middle Tennessee: 121
(11)Xavier: 120
(10)Virginia Commonwealth: 116
(9)Michigan State: 110
(7)Dayton: 101
(12)Nevada: 99
(9)Virginia Tech: 70
(11)Providence: 65
(12)Princeton: 43
(12)UNC Wilmington: 34
(13)Vermont: 31
(13)Bucknell: 14
(13)East Tennessee State: 13
(14)New Mexico State: 9
(14)Florida Gulf Coast: 9
(14)Iona: 3
(13)Winthrop: 2
(15)Troy: 1
(16)Texas Southern: 1
(15)North Dakota: 1
(16)North Carolina Central: 1
(16)South Dakota State: 1
(14)Kent State: 1
Champions Breakdown
(1)Villanova: 1242
(1)Gonzaga: 1202
(2)Kentucky: 790
(1)North Carolina: 786
(1)Kansas: 775
(2)Duke: 490
(2)Louisville: 443
(3)Oregon: 425
(2)Arizona: 407
(4)Florida: 335
(3)UCLA: 299
(6)Southern Methodist: 287
(4)West Virginia: 287
(3)Florida State: 247
(4)Purdue: 231
(5)Virginia: 208
(10)Wichita State: 205
(3)Baylor: 152
(5)Iowa State: 145
(4)Butler: 143
(7)Saint Marys: 134
(6)Cincinnati: 126
(7)Michigan: 115
(5)Notre Dame: 108
(8)Wisconsin: 60
(10)Oklahoma State: 49
(6)Creighton: 44
(5)Minnesota: 40
(9)Vanderbilt: 28
(8)Arkansas: 27
(7)South Carolina: 19
(8)Miami: 17
(8)Northwestern: 16
(6)Maryland: 14
(10)Virginia Commonwealth: 13
(11)Wake Forest: 11
(12)Nevada: 11
(10)Marquette: 11
(11)Rhode Island: 11
(9)Michigan State: 8
(12)Middle Tennessee: 8
(7)Dayton: 7
(9)Seton Hall: 6
(9)Virginia Tech: 5
(13)Bucknell: 2
(12)Princeton: 2
(13)Vermont: 2
(11)Xavier: 2
(14)New Mexico State: 2
(11)Providence: 2
(12)UNC Wilmington: 1
Here is the number of teams reaching each threshold in each of the last few years:
At least 10% Chance at Final 4:
2015 - 9
2016 - 18
2017 - 15
At least 1% Chance at Final 4:
2015 - 28
2016 - 43
2017 - 43
At least 10% Chance at Title:
2015 - 4
2016 - 4
2017 - 2
At least 1% Chance at Title:
2015 - 12
2016 - 18
2017 - 24
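The 2017 title thresholds above can be reproduced directly from the Champions Breakdown. A small sketch, with the counts copied from that list in the order given (the function name is hypothetical):

```python
# 2017 Champions Breakdown counts out of 10,000 simulations, in list order.
champion_counts = [
    1242, 1202, 790, 786, 775, 490, 443, 425, 407, 335, 299, 287, 287,
    247, 231, 208, 205, 152, 145, 143, 134, 126, 115, 108, 60, 49, 44,
    40, 28, 27, 19, 17, 16, 14, 13, 11, 11, 11, 11, 8, 8, 7, 6, 5,
    2, 2, 2, 2, 2, 2, 1,
]
N_SIMS = 10_000


def teams_at_or_above(counts, n_sims, percent):
    """Number of teams whose simulated frequency is at least `percent`."""
    # Integer comparison avoids any floating point rounding at the cutoff.
    return sum(1 for c in counts if c * 100 >= percent * n_sims)


print(teams_at_or_above(champion_counts, N_SIMS, 10))  # 2
print(teams_at_or_above(champion_counts, N_SIMS, 1))   # 24
```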
What really jumps out to me here are the 1% levels. There are no breakout best teams
this year and there is a very gradual dropoff in team power. What this means is a LOT
of teams have a legitimate chance at a final four and a title.
Some other observations:
- There is a big drop off in team power right around the 7-8 seed line. What this
means is I expect fewer upsets than usual during the opening 32 games. However,
after the first round, there should be more upsets than usual as there is more parity
at the top.
- Gonzaga and Villanova can be considered co-favorites in this tournament. They have a 34%
and 32% chance at a final four, respectively, and each about a 12% chance at a title. Compare this to
last year when Kansas had a 40% chance at a final four and 18% chance at a title. It
is interesting that these two would play in the semifinals. Kansas has the best
shot on the other side of the bracket.
- Wichita State again owns the honor of worst screw-over job by the committee. They
are ranked 12th in my system but were given a 10 seed. They must fight through
Kentucky, but still have a 9% shot at a final four and a 2% shot at a title. Their
presence in that regional hurts Kentucky's chances at both marks.
- Seed upsets that my program predicts outright: (10)Marquette over (7)South Carolina,
(9)Vanderbilt over (8)Northwestern, (10)Wichita St over (7)Dayton, (6)SMU over (3)Baylor,
(2)Kentucky over (1)UNC, (2)Kentucky over (1)Kansas. This is 6 upsets, vs. 8 last year.
Here is the best shot at an upset at each seed vs seed matchup, as well as how many upsets
at each level we can expect:
- 16 over 1: Texas Southern over UNC (5.4%), 13.7% chance of at least 1 upset
- 15 over 2: Troy over Duke (7.3%), 21.6% chance of at least 1 upset
- 14 over 3: FGCU over Florida St (16.5%), 43.6% chance of at least 1 upset
- 13 over 4: Vermont over Purdue (22.7%), 52.4% chance of at least 1 upset
- 12 over 5: MTSU over Minnesota (38.5%), expected 1.2 upsets
- 11 over 6: Xavier over Maryland (45.6%), expected 1.5 upsets
- 10 over 7: Wichita over Dayton (69.1%), expected 2 upsets
- 9 over 8: Vanderbilt over Northwestern (53.6%), expected 1.8 upsets
Later Rounds best upset chances:
- 8 or 9 over 1: Arkansas over UNC (29.1%), expected 1 upset
- 7 or 10 over 2: Saint Marys over Arizona (43.7%) (All 4 of these are viable, expecting 1.5 upsets)
- 6 over 3: SMU over Baylor (55.1%)
- 5 over 4: Iowa State over Purdue (47.2%) (All 4 of these are basically coin flips)
- 4 over 1: Purdue over Kansas (41.3%)
- 3 over 2: Oregon over Louisville (48%) (All 4 basically coin flips)
- 2 over 1: Kentucky over UNC (50.1%) (All 4 basically coin flips)
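For reference, numbers like "chance of at least 1 upset" and "expected upsets" can be derived from the four individual game probabilities at a seed line, assuming the games are independent. Only the best upset chance per line is listed above, so the other three values in this sketch are invented and the output will not match the figures in the list.

```python
from math import prod


def chance_of_at_least_one(upset_probs):
    """1 minus the chance that every favorite wins."""
    return 1 - prod(1 - p for p in upset_probs)


def expected_upsets(upset_probs):
    """Expected number of upsets is the sum of the individual chances."""
    return sum(upset_probs)


# 5.4% is Texas Southern over UNC; the other three are hypothetical.
sixteen_over_one = [0.054, 0.03, 0.03, 0.03]
print(f"{chance_of_at_least_one(sixteen_over_one):.1%}")
print(f"{expected_upsets(sixteen_over_one):.2f}")
```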
SUMMARY OF INFORMATION FOR BRACKET PICKERS
So... lots of coin flips. Very unpredictable tournament, once we get
past the first round. The real bracket
will probably have at least 1 upset at almost every seed matchup. Chances are there will be
one big upset (14, 15, or 16) as well. The 10 vs 7 matchup is particularly badly seeded this
year and the 10 seeds are slightly favored as a whole against the 7 seeds. If you
are filling out brackets, this is the year to go with slightly crazier settings and
play to the chaos. For example, filling out 10 brackets, I would probably choose at
least 8 different champions among them, and I'd have every 1-4 seed making it to the
final four in at least one bracket. SMU and Wichita State are easy picks
to make it there as higher seed numbers. I would stay away from any seeds 11 and below
making it very far though, as the top 6 seed lines are just too strong relative to the
rest of the field. Let the chaos break out after a quiet first round.
ENTRY 35 - 3-12-17 - My Final Bracket and Thoughts
Over the month since my last post, I have tried to revise my program to simultaneously
optimize for several recent years. The idea was to account a little more for a
variety of different resume quirks that a single year could not explain. Here are the results
of my search:
- Last year's criteria were definitely way different from previous years'. There was an insane
and unprecedented emphasis on top 50 wins and strength of schedule.
- Any attempt to optimize for years before 2016 actively hurt the 2016 bracket score.
- Going back any further than about 2010 was completely meaningless. Before that year was
an era of basically everything coming down to overall record, conference record, and conference
strength. Strength of schedule was secondary, particularly the non-con SOS. Some teams that
had no business being in the tournament or seeded highly were there because their conference
lifted them. The same thing goes the other way around. Believe it or not, recent selections
and seedings are fountains of wisdom compared to how things were done 10 years ago.
- There are two ideas I still have for fixing some things that would seem to be out of alignment
for this year's tournament. One is allowing for resume weights to change depending on where
in the seed line a team is projected - for example, SOS is not as important for top seed lines
as it is for bubble teams. For the auto bids from bad leagues, conference RPI and best wins
matter a little more.
- My other idea is that it seems from reading a lot of stuff that certain resume problems
are excusable if other resume items are in place. The same thing goes the other way around:
Certain resume highlights are meaningless if certain other items are not there to back it up.
The most common resume item that causes problems like this is strength of schedule. A good
strength of schedule (like Vanderbilt) should not be so overpowering and lift the team by
itself. I do believe Vanderbilt will get a higher seed than most people think. But I think
that seed will be something like a 10, not an 8 like my program is projecting. On the other
side of the coin, Gonzaga should have enough other resume highlights that its poor SOS from
conference should not have it on the 2 line. They are pretty much a consensus 1. Who knows,
maybe 2 will turn out to be correct.
- I have been too busy recently to attempt to optimize and tinker with these ideas, so for
now we're just going with what we have. I did decrease slightly the effect of strength of
schedule to help the two situations I described above. Doing so is dangerous, though, because
the small change I made already brings 2016's bracket score from 348 to 338.
So here are my own personal seed projections. They are independent of my system, but
I cannot say they are not influenced by my system and other things I have seen online as well.
A bracket projector really should use all information available, including other people's thoughts,
in order to make the most informed decisions.
MY SEEDS (As of Sunday 5:00 EST)
1: Villanova, Duke, Kansas, Gonzaga,
2: North Carolina, Kentucky, Arizona, Louisville,
3: Oregon, UCLA, Baylor, Florida State,
4: Butler, Florida, West Virginia, Notre Dame,
5: Iowa State, Virginia, Purdue, Southern Methodist,
6: Michigan, Cincinnati, Saint Marys, Wisconsin,
7: Minnesota, Miami, Creighton, Maryland,
8: Oklahoma State, Arkansas, Wichita State, Virginia Tech,
9: Dayton, South Carolina, Virginia Commonwealth, Northwestern,
10: Rhode Island, Michigan State, Seton Hall, Vanderbilt,
11: Xavier, Marquette, Middle Tennessee, Nevada,
12: Providence, Wake Forest, Kansas State, Southern California, UNC Wilmington, Princeton,
13: Vermont, East Tennessee State, Bucknell, Winthrop,
14: New Mexico State, Florida Gulf Coast, Iona, Northern Kentucky,
15: Kent State, Troy, North Dakota, Jacksonville State,
16: New Orleans, Texas Southern, South Dakota State, UC Davis, North Carolina Central, Mt St Marys
SYSTEM SEEDS (For Comparison)
1: Villanova, Duke, Kansas, Kentucky,
2: North Carolina, Gonzaga, Arizona, Louisville,
3: Florida State, Florida, Baylor, Butler,
4: Oregon, UCLA, Virginia, West Virginia,
5: Iowa State, Notre Dame, Southern Methodist, Michigan,
6: Purdue, Cincinnati, Minnesota, Creighton,
7: Saint Marys, Wisconsin, Arkansas, Vanderbilt,
8: Maryland, Dayton, Wichita State, Miami,
9: Rhode Island, Oklahoma State, Virginia Tech, Virginia Commonwealth,
10: South Carolina, Seton Hall, Marquette, Northwestern,
11: Xavier, Wake Forest, Michigan State, Nevada, Providence,
12: Middle Tennessee, Kansas State, Southern California, UNC Wilmington, Princeton,
13: Vermont, East Tennessee State, Bucknell, New Mexico State,
14: Winthrop, Florida Gulf Coast, Iona, Northern Kentucky,
15: Kent State, Troy, North Dakota, Jacksonville State,
16: New Orleans, Texas Southern, South Dakota State, UC Davis, North Carolina Central, Mt St Marys
My system has a very clear cut line with Illinois St, Syracuse, and Cal being the closest ones out.
In terms of point value they are all basically an entire seed line's worth of points out of the field,
so I would consider it quite a shock if my system does not get all 68 of the correct teams in
the field this time. Syracuse is the only one I'm really worried about but they have some pretty
big warts to overcome.
ENTRY 34 - 2-11-17 - The Early Rankings Reveal
Well, today was the reveal of the selection committee's first bracket seedings. This
is the first year there has been an early reveal, which offers a lot of potential for
gauging what the committee sees as important this year and doing some tuning before
the big moment in March.
So how does my current program match up with the committee's seeding of the top 4 lines?
Let's take a look, in order of committee overall numbers followed by my system's overall ranks:
1. Villanova (2)
2. Kansas (3)
3. Baylor (1)
4. Gonzaga (9)
5. North Carolina (6)
6. Florida State (4)
7. Louisville (7)
8. Oregon (14)
9. Arizona (15)
10. Virginia (5)
11. Florida (11)
12. Kentucky (10)
13. Butler (8)
14. West Virginia (19)
15. UCLA (21)
16. Duke (12)
In my top 16, left out by the committee: Creighton (13), Purdue (16)
There's no one factor that jumps out at me immediately that explains the differences in
overall rank. RPI certainly explains a few of them, but doesn't explain why Gonzaga is so
high or why West Virginia is in the top 16 at all. If we go on raw team ability that
helps boost Gonzaga and brings Butler down, but doesn't explain how the Pac 12 is so high.
Part of that may be a perception thing - The Pac 12 is way down this year but maybe the
committee doesn't really see it that way. Perhaps I need to tune my metrics which calculate
the "conference boost" that teams get. One other possibility is strength of schedule - it
would seem that I'm placing too much emphasis on SOS and top 50 wins. But these are
precisely the two metrics that made the MOST distinction in last year's final bracket. Who
knows - maybe the committee just didn't work very hard in making this list and got lazy.
Whatever the case, I'm going to rig up my system to analyze this information and run
some genetic algorithms to hopefully bring this year's really important metrics to the top.
I'll make another post here when I'm done with the analysis detailing my results and what
it means going forward.
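For readers unfamiliar with the term, here is a rough sketch of what a genetic algorithm for tuning metric weights can look like. Everything in it is an assumption made for illustration: the two-metric team data, the rank-error fitness, and the selection, crossover, and mutation choices. It is not the system's actual tuning code.

```python
import random


def rank_error(weights, teams, committee_order):
    """Sum of |system rank - committee rank| over all teams (lower is better)."""
    scores = {name: sum(w * m for w, m in zip(weights, metrics))
              for name, metrics in teams.items()}
    system_order = sorted(scores, key=scores.get, reverse=True)
    system_rank = {name: i for i, name in enumerate(system_order)}
    return sum(abs(system_rank[name] - i)
               for i, name in enumerate(committee_order))


def evolve_weights(teams, committee_order, n_weights,
                   generations=50, pop_size=30, seed=0):
    """Return (best weights found, best error recorded each generation)."""
    rng = random.Random(seed)

    def fitness(w):
        return rank_error(w, teams, committee_order)

    population = [[rng.random() for _ in range(n_weights)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness)
        history.append(fitness(population[0]))
        survivors = population[: pop_size // 2]  # best half survives unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            mom, dad = rng.sample(survivors, 2)
            child = [rng.choice(pair) for pair in zip(mom, dad)]  # crossover
            i = rng.randrange(n_weights)
            child[i] = max(0.0, child[i] + rng.gauss(0, 0.1))     # mutation
            children.append(child)
        population = survivors + children
    population.sort(key=fitness)
    history.append(fitness(population[0]))
    return population[0], history


# Hypothetical teams with two normalized metrics each, and a made-up
# committee ordering to fit against.
teams = {"A": (0.9, 0.2), "B": (0.7, 0.8), "C": (0.5, 0.5), "D": (0.2, 0.9)}
committee_order = ["B", "A", "D", "C"]
best, history = evolve_weights(teams, committee_order, n_weights=2)
print(best, history[0], history[-1])
```

Because the best half of each generation is carried over unchanged, the best error can never get worse from one generation to the next.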
Before I leave today, some extra business: there are some suspicious teams in my bracket
projections this week! I'll try to explain some of them.
- Tennessee an 8 seed: Tennessee at this point is on most people's cut lines. What they
have going for them: RPI of 39, a win against Kentucky, to go along with the 4th ranked SOS,
pretty good computer power rankings (42nd in kenpom) and 3
top 50 wins. The arguments against: Only a 9-10 record against Top 150, a mediocre 14-10
overall record, and 6-5 mark in an SEC that is viewed as weak. Tennessee's problem is that
although it has very good SOS, it hardly won any of those marquee games. Probably the 2nd best
win is at Vanderbilt, which somehow counts as a top 50 win. Perhaps what is called for is
a devaluing of the SOS metric if the wins are not there to back it up.
- UCLA a 6 seed: My system continues to be unimpressed by UCLA's resume. The win at Kentucky
is great, but there's not a lot of meat behind that. They have an RPI of 20, a non-con SOS
of 281, and just don't have elite level computer power rankings. This might be the opposite
case of Tennessee in terms of fixes: devalue the bad SOS if there are a few good wins to prove
they are actually a good team. The same thing can be said for Gonzaga right now.
- California a 9 seed: With a 9-3 record in the Pac 12, this is somehow actually higher
than most people have California right now. Case in point as to why the Pac 12 should not
be viewed as a power league this year, yet at the top it somehow is.
- Vanderbilt in the bracket at all: Blasphemy! Out of all the teams with losing records,
especially in conference (5-6 in the SEC), why does Vanderbilt get the pass? Try the number
1 non-con SOS and overall number 2 SOS! This is similar to Tennessee's situation. They scheduled
very tough but didn't really win any of those games. At Florida is certainly their best win, and
is very impressive. But after that it's a wasteland. At Arkansas and home against Iowa State
compete for the 2nd best win. Meanwhile, that loss to RPI 246 Missouri... yikes. But
that was today, so rest assured Vanderbilt will not be in my field tomorrow. That's one of the
nice things about cases like this - they tend to work themselves out by the end of the season.
The bottom line from all of this is the SOS valuation probably needs some work, especially when
it is an outlier among the team's other metrics.
ENTRY 33 - 1-25-17 - The New Regime in Bracketology
I just read an article on kenpom.com which describes a recent meeting between selection
committee members and some celebrated members of the computer ranking community. The
goal was to discuss the use of some more advanced metrics than just the RPI in selecting
and seeding the field for the NCAA basketball tournament. A new evaluation system may
be in place as early as 2018.
My first thought was, FINALLY! It has long been known that it is pretty easy to "game
the system" by playing against somewhat successful members of bad conferences and
boosting strength of schedule numbers that way. Strength of schedule is usually calculated
as an average of opponents, which means as long as you avoid complete pushovers, you can
get a very high strength of schedule number even without playing anyone difficult.
Exhibit A is this year's Middle Tennessee team, which somehow had a non-con SOS rank of 21st while
only playing two top-50 teams and 5 top-100 teams out of 12 games (and no top-25 teams). How is that possible?
Almost all of Middle Tennessee's opponents were in the 75 to 150 range. Only two opponents
were outside the top 200 - and those were in the low 200s. What a scheduling masterpiece. Compare to
the current 15th ranked non-con SOS of Michigan State, widely viewed as a murderer's row.
Games against Arizona, Kentucky, Baylor, Wichita St, and Duke all make it into the top 50.
But that is counterbalanced by 4 sub-200 games, including a sub-300 game.
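A toy illustration of why averaging hides this. The two rank lists below are invented to match the descriptions above (they are not the real schedules), and opponent rank stands in for whatever rating the real SOS formula averages.

```python
from statistics import mean

# Hypothetical opponent ranks. The first schedule has no top-25 teams, two
# top-50 teams, five top-100 teams, and two opponents in the low 200s.
mtsu_like = [40, 48, 70, 85, 95, 110, 120, 130, 140, 150, 205, 215]
# The second has five top-50 opponents and four sub-200 games, one sub-300.
msu_like = [2, 5, 8, 15, 30, 80, 110, 150, 205, 230, 250, 301]

for name, ranks in (("balanced", mtsu_like), ("murderer's row", msu_like)):
    top25 = sum(r <= 25 for r in ranks)
    print(f"{name}: average opponent rank {mean(ranks):.1f}, "
          f"top-25 opponents {top25}")
```

The two averages land within a couple of spots of each other even though one schedule has four top-25 opponents and the other has none.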
So what does this mean for me? I have been intending to update my bracketology system
for a while now. This announcement means my update schedule has been forced. With my current
data and algorithm I have basically just this year before it all goes out the window for next
year. I need to get things moving!
Over the last few days I've done exactly that. I created a smart algorithm for determining
conference alignment by looking ONLY at the games played, by analyzing the connectivity of
the game graph. This allows me to include metrics like conference strength, conference
record, and championships in the mix when analyzing past tournaments. I also updated a
few other metrics like the Road/Neutral Record and how good wins and bad losses are treated.
All of this allowed me to really play with the weights on all of the metrics and figure out
what really matters. I've done some of the initial testing but there is still much left to
be done. Here is what I have discovered.
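As an aside on the conference-detection step mentioned above, here is one simple way such a thing could be done. This is a guess at the general idea, not the actual algorithm: it assumes conference mates are the teams that meet at least twice in a season, and it groups teams connected by such repeat meetings. Non-conference rematches would fool it.

```python
from collections import Counter


def infer_conferences(games, min_meetings=2):
    """Group teams into guessed conferences.

    games: list of (team_a, team_b) pairs, one entry per game played.
    Teams that meet at least min_meetings times are linked, and each
    connected group of linked teams is returned as a sorted list.
    """
    meetings = Counter(frozenset(game) for game in games)
    parent = {}

    def find(team):
        parent.setdefault(team, team)
        while parent[team] != team:
            parent[team] = parent[parent[team]]  # path compression
            team = parent[team]
        return team

    for a, b in games:
        find(a)
        find(b)
    for pair, count in meetings.items():
        if count >= min_meetings:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    groups = {}
    for team in list(parent):
        groups.setdefault(find(team), set()).add(team)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical season: A, B, C play home-and-home; so do X and Y.
# A and X meet once, which is treated as a non-conference game.
games = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "B"),
         ("A", "X"), ("X", "Y"), ("Y", "X")]
print(infer_conferences(games))  # [['A', 'B', 'C'], ['X', 'Y']]
```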
I began by optimizing for just the last year, 2016. I was able to get the program to the point
that it scores 346 points by Bracket Matrix's scoring system, which would have been good enough
for 4th place out of 144 brackets.
Not bad for a few days' work! What was the secret? I used the same basic weights as before,
but I significantly increased the weight on Number of Top 50 Wins and RPI SOS. There were some
other smaller changes but that was the biggest thing. That makes sense given the comments the
selection committee made afterwards and the comments I made in Entry 29 of the blog. I have left
the current algorithm at these settings, and the bracket you will see tomorrow (1/26) will reflect
the new algorithm. After all, last year is probably the best place to start for what will happen
this year.
So how does this set of weights fare in previous years? Over the last five years, it would
have done about average on Bracket Matrix. Then if you go back further than that, it starts
to get bad. Really, really bad.
In the early 2000s, there are consistently several teams projected out that made it in.
All across the board everything is all screwed up. My
system gives 2004 Stanford a 5 seed when they actually got a 1. 2007 Butler, which the committee
gave a 5 seed, is left out entirely by my system. WHAT?
What it comes down to is this - I haven't done calibration yet for officially finding optimal
mid-2000s weights, but from what I see, strength of schedule meant virtually nothing. Quality
wins meant almost nothing. Bad losses? Who cares? It all just came down to RPI, raw number
of losses, and which conference you were blessed to be in. That was about all that mattered.
I'm interested in the coming months to do more research and see where things started to turn around.
What is clear is that schedule and who you beat have become more and more important as time has gone
on, and was responsible for almost all of the "strange" looking seedings given in the 2016
tournament. This is an interesting development which is definitely worth keeping an eye on
in the future.
ENTRY 32 - 3-20-16 - 2016 Tournament Wrap-up
This is a placeholder for when I get around to actually writing this wrapup.
ENTRY 31 - 3-20-16 - Round 2 Quick Update
Finally, somewhat of a return to normalcy! Most of the better seeds
advanced in Round 2. Even the ones that didn't (Gonzaga over Utah,
Wisconsin over Xavier) weren't much of a surprise. On that note, let's
return to something I said two posts ago during the tournament analysis
in regards to the most vulnerable seeds at each level:
- 16 over 1: FGCU over North Carolina (4.1%)
- 15 over 2: CS Bakersfield over Oklahoma (9%)
- 14 over 3: SFA over West Virginia (23%) (None of these picks are that impossible)
- 13 over 4: Hawaii over California (26%)
- 12 over 5: Yale over Baylor (35%)
- 11 over 6: Michigan over Notre Dame (45%) (All of these are really quite likely)
- 10 over 7: VCU is favored over Oregon State (57%)
- 9 over 8: Uconn is favored over Colorado (59%) (Vegas favors the 9 in all four matchups)
- 8 or 9 over 1: Saint Joe's over Oregon (31%)
- 7 or 10 over 2: Wisconsin over Xavier (36%)
- 6 over 3: Arizona over Miami (48%)
- 5 over 4: Purdue is favored over Iowa State (60%)
None of the 16 v 1's happened, but that one was probably the closest. Middle Tennessee
actually beat Michigan State. Then from that point on, almost all of the other
named upsets either happened or came very close! Michigan was doing well for a while
against Notre Dame. Saint Joe's had chances against Oregon. Arizona's successor
(Wichita State) probably should have beaten Miami. And Purdue never got the chance
to play Iowa State because of an epic meltdown in its first game. Oregon is probably still
the most vulnerable 1 seed remaining.
My system's odds of being perfect through Round 2 are now 1 in 182 billion
using the default setting. The setting which maximizes probability of a
perfect bracket is now 1.075. This would result in slightly increased odds
of 1 in 176 billion brackets generated.
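As a sketch of where a "1 in N" figure like this comes from: if a generator flips an independent weighted coin for every game, the chance of a perfect bracket is the product of the probabilities it gave to each result that actually happened. The function name and the sample probabilities below are made up for illustration.

```python
from math import prod


def perfect_bracket_odds(outcome_probs):
    """Return N such that a perfect bracket is a 1-in-N event.

    outcome_probs: probability the generator assigned to the actual
    result of each game played so far.
    """
    return 1 / prod(outcome_probs)


# Four hypothetical games: two favorites held, one coin flip, one upset.
print(round(perfect_bracket_odds([0.9, 0.8, 0.5, 0.35])))
```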
Here are the results of 100,000 simulations on Upset Level 1 using the standard
scoring system (10 points for R1, 20 points for R2):
ESPN Score: Level 1; ESPN Normalized to 100,000
580: 1; 0
570: 0; 1
560: 1; 2
550: 2; 4
540: 10; 15
530: 9; 38
520: 26; 90
510: 46; 200
500: 104; 400
490: 162; 800
480: 327; 800
470: 541; 2100
460: 901; 3100
450: 1338; 4600
440: 2007; 6000
430: 2764; 7600
420: 3774; 8700
410: 4768; 9200
400: 5804; 9300
390: 6783; 10200
380: 7651; 7500
370: 8187; 6400
360: 8543; 5000
350: 8214; 4000
340: 7580; 3000
330: 6967; 2300
320: 6056; 1600
310: 4888; 1300
300: 3942; 1000
290: 2935; 700
280: 2050; 600
270: 1389; 500
260: 887; 300
250: 605; 300
240: 334; 300
230: 192; 200
220: 102; 200
210: 71; 200
200: 16; 100
190: 11; 100
180: 9; 100
170-: 3; 800
Some numbers for comparison:
Median on ESPN: 400
ESPN "National Bracket": 420
Pick all Seed Favorites: 390
High in my 57 entry ESPN group: 430 (a Level 0.5 bracket my system made)
High in a random 550 entry group I'm tracking: 520
As with last year (and as expected), the Level 1 setting is not what
you want for making decent brackets, in spite of having near the best
chance of a perfect one. In reality (as said a few posts ago), to win
small to medium pools, you really want more consistency and probably
a level of 0.5 or even lower. I will update the above info eventually
with numbers for several system Upset Levels.
Out of all the groups I'm in on different sites, the highest bracket I
am in charge of is one where I picked straight-up using the preseason ranks out of a Lindy's
magazine, with a score of 450. Now THAT is hilarious, but also echoes
something I said during the football bowl season. There is somehow
a lot of wisdom to preseason expectations, and those tend to come
out in postseason play more than you'd expect. Reminds me of one of
my favorite KenPom posts, "The pre-season AP poll is great".
ENTRY 30 - 3-18-16 - Round 1 Quick Update
Wow, what a day. I am going to make a quick post here to evaluate some
system performance and quantify how crazy today was.
Through Round 1, the odds of my system having a perfect bracket were 1
in 128 million. There are about 13 million entries on the ESPN bracket
challenge and none of them managed a perfect entry, so at least I was not
shown up there. There were 11 entries on ESPN missing only 1 game. I think
my system given 13 million tries would miss 1 game about twice.
How does this compare to a typical year? Most years, my system has a perfect
first round in about 1 out of 2.8 million brackets. We would thus expect there
to still be perfect entries on ESPN after Round 1. Out of all the tournaments
running back to 2002, this was the second craziest round 1 judging by probability
of perfect bracket. The craziest Round 1 was 2012, which had odds of
1 in 581 million for a perfect bracket (in that year, two 15 seeds won their
first game).
The odds of a perfect Round 1 were maximized with an Upset Level of 1.425
in my system. At that level, a perfect bracket had odds 1 in 75 million.
We can also compare these odds to the odds of a perfect bracket from
automatically advancing the 1 seeds and flipping fair coins for the rest of the
games. This results in 1 in 2^28 = 268 million odds. So my system
only did slightly better than coin flips with no knowledge at all. Crazy!
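The arithmetic in the last few paragraphs can be checked quickly. The expected counts treat every ESPN entry as if it were an independent draw from the system, which is only a rough assumption.

```python
ROUND_1_GAMES = 32
coin_flip_games = ROUND_1_GAMES - 4   # the four 1 vs 16 games are automatic
coin_flip_odds = 2 ** coin_flip_games
print(f"1 in {coin_flip_odds:,}")     # 1 in 268,435,456

espn_entries = 13_000_000
expected_perfect_2016 = espn_entries / 128_000_000    # this year's odds
expected_perfect_typical = espn_entries / 2_800_000   # a typical year
print(f"{expected_perfect_2016:.2f}")     # 0.10
print(f"{expected_perfect_typical:.1f}")  # 4.6
```

So in a typical year a handful of perfect first rounds would be expected among 13 million entries, while this year the expectation was about a tenth of one.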
I ran 100,000 simulations of the first round using my system to see what
the results ended up being. Here are the numbers of brackets to get
each number of games correct (using Upset Level 1 / Default Slider):
Games Correct: Level 1; ESPN Normalized to 100,000
30: 1; 3
29: 4; 25
28: 48; 125
27: 235; 531
26: 789; 1739
25: 2359; 4569
24: 5286; 9230
23: 9470; 13846
22: 12923; 17692
21: 16790; 17692
20: 17100; 13846
19: 14043; 10769
18: 9902; 4615
17: 5669; 2231
16: 2744; 1077
15: 1088; 615
14: 398; 362
13: 115; 708
12: 31; 154
11: 4; 100
10: 1; 54
Some numbers for comparison:
Median on ESPN: 21
ESPN "National Bracket": 22
Pick all Seed Favorites: 19
High in my 57 entry ESPN group: 25 (a Level 1 bracket my system made)
High in a random 550 entry group I'm tracking: 29
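A distribution like the Level 1 column can also be computed exactly rather than simulated, if each pick is treated as an independent weighted coin. This is a sketch under that assumption, with a hypothetical function name and made-up probabilities; it is not how the table above was produced.

```python
def games_correct_distribution(hit_probs):
    """Exact distribution of the number of correct picks.

    hit_probs[i] is the chance that the generator's pick for game i
    matches the real result. Returns a list d with d[k] = P(exactly k
    picks correct).
    """
    dist = [1.0]
    for p in hit_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1 - p)   # this pick misses
            nxt[k + 1] += mass * p     # this pick hits
        dist = nxt
    return dist


# 32 hypothetical first-round games: 8 near-locks, 16 solid picks, 8 upsets
# that the generator would rarely have chosen.
hit_probs = [0.95] * 8 + [0.7] * 16 + [0.3] * 8
dist = games_correct_distribution(hit_probs)
most_likely = max(range(len(dist)), key=dist.__getitem__)
print(most_likely, f"{dist[most_likely]:.3f}")
```

Multiplying each entry by 100,000 gives the expected bracket count at that number of games correct, which is what a simulation of that size approximates.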
I will later update this post when I have more time along the lines of what I did
in analysis last year: Turning the numbers above into a table with numbers of
brackets at various Upset Levels as well as comparison vs. percentiles on ESPN.
ENTRY 29 - 3-14-16 - Bracketology wrap-up and the NCAA Tourney Special
There have been many words said in many places about the selection and seeding process
for this year's tourney, so I'll be brief on that front. Basically it was laughable.
Kentucky should be seeded higher than Texas A&M. Oregon State is way too high. I still
believe Providence is overseeded but at least they weren't given an 8. I understand
perhaps Temple being in the field but Tulsa has no business being there. My guess is
the committee overthought the process when putting them in. It's like what I said a few
posts ago - sometimes it seems like only 1 or 2 factors about a team are treated as
important and the rest are allowed to be ignored. When asked, the committee head said
their top 50 wins pushed them over. Does that mean that is the only important metric now?
It does seem from listening to the committee member that sheer number of top 50 wins
is more important than ever. By sheer number I mean that the win percentage against
top 50 seemed less relevant than just number of wins. Otherwise Baylor would not have
gotten a 5 seed with a 5-11 top 50 record, and Vanderbilt would not have gotten in
at 2-7 against top 50.
I already valued top 50 wins quite highly in my system's
bracketology, but perhaps still not enough. Pretty much every seed line difference
between my system and reality could be explained partially or entirely by top 50
wins. Wichita State barely made it in because it had only 1 top 50 win. Texas A&M had
more top 50 wins than Kentucky. Connecticut had few top 50 wins. Oregon State and
Texas Tech had many top 50 wins. My system's 3 misses, Syracuse, Temple, and Tulsa, all
had many top 50 wins. Who cares that they all had bad losses, losing records against
the top 150, and just don't look like good teams (and lost early in their conference
tourneys)? Just ride the top 50 to victory. Also not sure how this criterion disqualified
Saint Bonaventure, who was a respectable 3-2 vs. the top 50. And North Carolina with
only 5 top 50 wins and getting a 1 seed... whatever.
My system's bracket placed 114th out of 144 on
The Bracket Matrix this year, which is worse
than I'd like. At this moment I'm beyond worrying about that though because
the whole thing seems like a sham this year. I can at least compare my system to other
computer rankings-based systems which were on somewhat even footing. I found 11 computer
systems on Bracket Matrix (including mine), and mine tied for 3rd. I suppose that is
acceptable. It is somehow astonishing
to me how computers can be in general so bad at this. It seems like just the thing
that would be right up a computer's alley. NEXT YEAR THOUGH.
So, about the tourney ahead of us! My system is all set and ready to start making
brackets for you, so hit the link at the top of the page and see what happens!
Like last year, I've compiled some data on how often each team is making the final 4 or
winning the title according to the simulations. After that I'll have some comments
about the bracket and some words about what we learned from last year's simulator.
Final Four Appearances (out of 10000):
(1)Kansas: 4052
(2)Michigan State: 3299
(1)North Carolina: 3042
(1)Virginia: 2961
(1)Oregon: 2553
(2)Oklahoma: 2294
(2)Villanova: 2079
(3)West Virginia: 1840
(3)Texas AM: 1458
(2)Xavier: 1430
(4)Kentucky: 1391
(4)Duke: 1121
(5)Purdue: 1093
(3)Miami: 1015
(5)Indiana: 971
(3)Utah: 853
(5)Baylor: 694
(4)California: 643
(6)Texas: 615
(6)Arizona: 600
(5)Maryland: 556
(4)Iowa State: 526
(6)Seton Hall: 443
(7)Iowa: 346
(8)Saint Josephs: 317
(10)Virginia Commonwealth: 317
(7)Wisconsin: 268
(11)Wichita State: 264
(9)Connecticut: 253
(6)Notre Dame: 250
(9)Cincinnati: 247
(10)Pittsburgh: 234
(11)Gonzaga: 227
(9)Butler: 213
(7)Oregon State: 163
(11)Michigan: 162
(8)Southern California: 151
(7)Dayton: 138
(9)Providence: 123
(8)Colorado: 115
(10)Syracuse: 114
(12)Yale: 113
(14)Stephen F Austin: 101
(8)Texas Tech: 83
(11)Northern Iowa: 61
(13)Hawaii: 34
(13)UNC Wilmington: 33
(12)Arkansas Little Rock: 28
(10)Temple: 28
(12)Chattanooga: 17
(13)Stony Brook: 15
(12)South Dakota State: 12
(14)Fresno State: 11
(13)Iona: 10
(14)Green Bay: 9
(15)Cal State Bakersfield: 5
(16)Florida Gulf Coast: 4
(14)Buffalo: 3
(15)Middle Tennessee: 1
(15)Weber State: 1
National Champions (out of 10000):
(1)Kansas: 1813
(2)Michigan State: 1107
(1)Virginia: 1027
(1)North Carolina: 1009
(2)Villanova: 674
(1)Oregon: 567
(2)Oklahoma: 468
(3)West Virginia: 456
(4)Kentucky: 341
(2)Xavier: 284
(3)Miami: 247
(5)Purdue: 244
(3)Texas AM: 238
(5)Indiana: 177
(4)Duke: 167
(3)Utah: 163
(6)Arizona: 126
(4)California: 106
(5)Maryland: 94
(4)Iowa State: 84
(5)Baylor: 81
(6)Seton Hall: 72
(6)Texas: 67
(7)Iowa: 48
(11)Wichita State: 37
(9)Connecticut: 31
(8)Saint Josephs: 29
(11)Gonzaga: 25
(9)Butler: 23
(10)Virginia Commonwealth: 23
(6)Notre Dame: 22
(7)Wisconsin: 21
(10)Pittsburgh: 18
(9)Cincinnati: 17
(7)Dayton: 14
(8)Southern California: 12
(11)Michigan: 12
(8)Colorado: 12
(7)Oregon State: 10
(9)Providence: 6
(14)Stephen F Austin: 5
(12)Yale: 5
(10)Syracuse: 5
(8)Texas Tech: 5
(12)Arkansas Little Rock: 2
(11)Northern Iowa: 2
(10)Temple: 1
(12)Chattanooga: 1
(13)UNC Wilmington: 1
(13)Hawaii: 1
We'll compare some numbers to last year to get an idea for how open the field is
and how many upsets we can expect. Keep in mind for comparison that last year's
field was historically loaded in the top 2-3 seed lines. This year's best team,
Kansas, would have been the 4th best team in last year's field. Also, the
rest of the tourney field up through 9-10 seeds is relatively closer to the
top 2 seed lines than last year.
- Last year, 12 teams were given at least a 1% chance to win the title, and 4
teams had a 10% chance. This year, 18 teams have a 1% chance and 4 have a 10%
chance. Definitely more wide open. Last year Utah was the only seed worse than 3
with a 1% chance at a title. This year, 6 teams with worse than 3 seeds have
a 1% chance. Definitely make sure when filling out multiple brackets to have
all your bases covered for title teams.
- Last year, 28 teams had at least a 1% shot to make the final four, and 9 teams
had a 10% shot. This year, going by the list above, those numbers are 43 and 14, about one and a half times as many!
Definitely would not be surprised to see some long shots and possibly even
double digit seeds in the final 4 this year. Go crazy with your brackets!
- Kansas is the system favorite with an 18% chance at the title. Compare this
to last year, when Kentucky was a massive favorite at about 36%.
- The list of seed upsets that my system is outright predicting: (9)Connecticut
over (8)Colorado, (10)VCU over (7)Oregon St, (9)Butler over (8)Texas Tech,
(5)Purdue over (4)Iowa State, (3)West Virginia over (2)Xavier, (2)Michigan State
over both (1)Virginia and (1)North Carolina.
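The title-odds counts in the comparison above can be reproduced directly from the National Champions list. This is a minimal sketch: the function name `teams_at_least` is made up for illustration, and the data is copied from the list above, cut off once every remaining team is below 1%.

```python
SIMS = 10000

# (seed, team, titles) copied from the National Champions list above.
# Teams after Iowa State are omitted; every one of them is under 1%.
title_counts = [
    (1, "Kansas", 1813), (2, "Michigan State", 1107), (1, "Virginia", 1027),
    (1, "North Carolina", 1009), (2, "Villanova", 674), (1, "Oregon", 567),
    (2, "Oklahoma", 468), (3, "West Virginia", 456), (4, "Kentucky", 341),
    (2, "Xavier", 284), (3, "Miami", 247), (5, "Purdue", 244),
    (3, "Texas AM", 238), (5, "Indiana", 177), (4, "Duke", 167),
    (3, "Utah", 163), (6, "Arizona", 126), (4, "California", 106),
    (5, "Maryland", 94), (4, "Iowa State", 84),
]

def teams_at_least(counts, pct, sims=SIMS):
    """Rows whose share of the simulations is at least `pct` percent."""
    return [row for row in counts if 100 * row[2] / sims >= pct]

one_pct = teams_at_least(title_counts, 1)
ten_pct = teams_at_least(title_counts, 10)
# Teams seeded worse than 3 that still win the title in at least 1% of sims.
long_shots = [team for seed, team, _ in one_pct if seed > 3]

print(len(one_pct), "teams with at least a 1% title chance")
print(len(ten_pct), "teams with at least a 10% title chance")
print("Seeded worse than 3 with at least 1%:", long_shots)
```

The same function works on the Final Four list if you swap in those counts.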
Here are the best chances of upsets at each seed matchup:
- 16 over 1: FGCU over North Carolina (4.1%)
- 15 over 2: CS Bakersfield over Oklahoma (9%)
- 14 over 3: SFA over West Virginia (23%) (None of these picks are that impossible)
- 13 over 4: Hawaii over California (26%)
- 12 over 5: Yale over Baylor (35%)
- 11 over 6: Michigan over Notre Dame (45%) (All of these are really quite likely)
- 10 over 7: VCU is favored over Oregon State (57%)
- 9 over 8: UConn is favored over Colorado (59%) (Vegas favors the 9 in all four matchups)
- 8 or 9 over 1: Saint Joe's over Oregon (31%)
- 7 or 10 over 2: Wisconsin over Xavier (36%)
- 6 over 3: Arizona over Miami (48%)
- 5 over 4: Purdue is favored over Iowa State (60%)
The 1 and 2 seeds listed above (Oregon and Xavier) continue to be the most vulnerable
at each point throughout the bracket. If you're going to pick a 1 or 2 seed to lose
early (and you probably should), those would be good targets. I also think North Carolina
is vulnerable, not because they are bad, but because the Indiana/Kentucky winner is
going to be very dangerous, and because West Virginia is a strong 3 seed.
Now, if you are to use my bracket maker to create brackets, here are the guidelines I would
aim for, based on what we saw last year:
- First of all, over the summer I modified the upset likelihood algorithm because
it was not realistic about how many upsets were actually called. What this means is
when I say "Upset Level 1" for this year, that is equivalent to "Upset Level 1.3" from
last year. Likewise, "Upset Level 0.75" this year is about the same as "Upset Level 1"
from last year.
- Last year I found that upset level 1 was optimal for a perfect bracket (which corresponds
to upset level 0.75 this year). This is a little abnormal and one should generally expect
more upsets than that. I think a New Upset Level 1 has proven over the course of this
season to be just about right in generating win probabilities. You can see on the
Basketball Predictions page that it estimated pretty closely how many games it was going
to get wrong each week. New Level 1 corresponds with leaving the Upset Level Slider at
its default middle point.
- With that said, the goal of getting a perfect bracket is different from the goal of
getting a pool-winning bracket. In a pool of size about 50-100 entries, an Old Level of
0.65 was about right. That corresponds to a New Level of 0.5. This would correspond
on the Upset Level Slider on my bracket page to sliding it all the way towards Conservative.
However, I expect this season to have more upsets and it is probably best to slide
it about halfway towards Conservative from the default middle point.
- In a very large bracket pool (thousands of entries), just leaving the Slider at
the default middle point is your best bet. It will give you worse brackets on average, but a better
chance of a really good bracket. High reward requires High Risk!
- If you actually want to try to win bracket pools, there is no reason whatsoever
to push the Slider past its default middle point. This actually decreases both the
chances of a decent bracket and a great bracket, so there's no point.
- The one exception
is if your pool rewards correct upset picks especially. Then it might be correct to
move the slider slightly towards Crazy, but not too much. All the way to Crazy is
just for people's amusement (because I know some people like to make beyond crazy
brackets). As a point of reference: All the way crazy means the 1 seeds only have
a 75% chance to move on against the 16 seeds. So it will pick one of these upsets
on average per bracket. No NCAA tourney in history has had even nearly that upset
level. The most crazy NCAA Tourney in history (the tourney with VCU in the final 4)
corresponds to wanting the slider a little bit towards crazy.
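The "one of these upsets on average per bracket" figure for the all-the-way-Crazy setting is simple arithmetic, sketched below. It assumes the four 1-vs-16 games are picked independently, which is an assumption of this sketch rather than something stated above.

```python
P_ONE_SEED_WINS = 0.75  # quoted above for the slider at full Crazy
NUM_MATCHUPS = 4        # one 1-vs-16 game per region

# Expected number of 16-over-1 upsets picked in a single bracket.
expected_upsets = NUM_MATCHUPS * (1 - P_ONE_SEED_WINS)

# Chance that a bracket contains at least one such upset,
# assuming the four picks are independent.
p_at_least_one = 1 - P_ONE_SEED_WINS ** NUM_MATCHUPS

print(f"Expected 16-over-1 upsets per bracket: {expected_upsets:.2f}")
print(f"Chance of at least one: {p_at_least_one:.1%}")
```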
The Summary of the above info:
- Want a perfect bracket or to win a massive pool (over 1000)? Leave the slider where it is.
- Want to win a small pool (less than 20)? All the way to Conservative.
- Want to win a larger pool (20 to 100)? About halfway towards Conservative.
- Only move the slider towards crazy in pools with bonuses for upsets, and even then only move it a little.
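The guidelines above quote three old-to-new Upset Level correspondences. A rough way to convert between the two scales is sketched below. It assumes the scales differ by a single constant factor of 1.3, which is only the ratio implied by two of the three quoted pairs and not the actual change made to the algorithm. The function names are made up for illustration.

```python
# (new level, old level) pairs quoted in the guidelines above.
quoted_pairs = [(1.0, 1.3), (0.75, 1.0), (0.5, 0.65)]

# Assumption: one constant factor relates the two scales. The post never
# states a formula; 1.3 is just the ratio implied by two of the three pairs.
SCALE = 1.3

def old_to_new(old_level):
    """Approximate this year's Upset Level for one of last year's levels."""
    return old_level / SCALE

def new_to_old(new_level):
    """Approximate last year's Upset Level for one of this year's levels."""
    return new_level * SCALE

for new, old in quoted_pairs:
    print(f"old {old} -> new {old_to_new(old):.2f} (quoted as {new})")
```

The middle pair comes out slightly off, which fits the "about the same as" wording used for it.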
That is all I have for today. The next post you see from me should be next Monday
or so, when I will run the numbers on the tourney and analyze how well my system
did and which Upset Levels were optimal for this one.
ENTRY 28 - 3-13-16 - Thoughts on System's Final Bracket
Well, it's March. And it's time for some madness.
Here is the agenda today:
Around halfway through the Big 10 Championship game I'll post my system's final bracket
projection of the year. Once the game is over, I'll update the system's normal rankings.
Later today I'll update the bracket maker page to make brackets for this year's tournament.
That way, people can begin creating endless numbers of brackets and making all kinds of
fabulous predictions about the tournament. Early hint: This is NOT the year to go anywhere
near the 1v16 upset pick. I would avoid the 2v15 upset as well.
Tomorrow, I'll post some initial analysis about the bracket, some of my thoughts, and
bring up some of the numbers I talked about in last year's bracket prediction and the
ensuing analysis.
As for right now, I want to mention some things about the bracket prediction from this morning
and give my own ideas about where the teams are going to fall later today.
- After looking over and over the variables involved with teams being higher or lower
than my system says they should be, I think the most influential variable I am not
putting enough emphasis on is performance at neutral / road sites. That would explain
Monmouth being close to the bubble vs. like 15 spots out like my system says. Also
it would help explain Providence being higher, Oregon State being lower, etc.
- It is also clear from reading lots of analysis from self-proclaimed bracket people that
the criteria for seeding change based on which seed is being referenced. For example,
the eye test seems to be much more important for top 4 seeds than for bubble teams. For
bubble teams, avoiding bad losses and road record seem to be more important. And
for seeding the auto bids from 1-bid leagues, it seems that the rank of the conference
itself in the RPI is more important than any other criterion. For example, my system
currently has Yale as an 11 seed when 12 or 13 is more likely. It gets such a high seed
because the team is highly ranked in my regular rankings and thus passes the system's
closest equivalent of the "eye test". But the eye test is not an important criterion
for those teams because nobody really watches those teams and knows how good they really
are. Plus it is just hard to gauge team skill playing in a poor conference. KenPom
agrees - he has Yale at #38. He also has Stephen F Austin at #34, thus the high seed
projection for them (a 12 seed). But the committee does not appreciate obliterating
horrible opposition like computer rankings do.
So here are my own seed projections. Within each seed line the teams are not
necessarily listed in S-curve order. Keep in mind two things: I would consider myself
an amateur and learning bracketologist at best, and I don't have the fortitude to sit
down and work out things like seed adjustments for regions. Think of these more
as using my system as a starting point and making some modifications to "fix" some
known shortcomings with its methods. We'll see later today who is more correct.
1) Kansas, Oregon, North Carolina, Michigan State
2) Villanova, Oklahoma, West Virginia, Virginia
3) Utah, Miami, Xavier, Kentucky
4) Texas A&M, Purdue, Arizona, Indiana
5) Maryland, Duke, California, Iowa State
6) Seton Hall, Texas, Baylor, Iowa
7) Notre Dame, Saint Josephs, Wisconsin, Connecticut
8) Dayton, Providence, USC, Colorado
9) Wichita State, Butler, Texas Tech, Oregon State
10) VCU, Gonzaga, Cincinnati, Pittsburgh
11) Saint Bonaventure, Saint Marys/Vanderbilt, Monmouth/Syracuse, Arkansas Little Rock
12) UNC Wilmington, Chattanooga, Northern Iowa, Yale
13) Stephen F Austin, South Dakota State, Stony Brook, Hawaii
14) Fresno State, Iona, Middle Tennessee, Buffalo
15) CS Bakersfield, Green Bay, Weber State, UNC Asheville
16) Hampton, FGCU, Southern/Holy Cross, Fairleigh Dickinson/Austin Peay
That felt horrible, by the way. I don't know how other bracket people have made it
through the whole season doing that. I got to around the 9 to 10 seed portion of the
bracket and didn't want to put any of them in the tournament.
ENTRY 27 - 2-26-16 - NCAA Seeding and the Perception Problem
I've been crazy busy with some other projects, but I am back for a quick update leading into
the March stuff.
I felt compelled to put up a post today because Providence has just now fallen out of the bracket
in my system's bracketology, which you can find here: Bracketology.
This punctuates a trend I have been watching this year as my system's seedings have gone out and I
have compared them with the other ones on the Bracket Matrix,
as well as from reading intelligent people's bubble watches. The trend is this: some teams are
seeded far higher or lower than a look at their team sheet should ever be able to justify. Some
examples:
- Maryland: was a top 3 seed in everyone's brackets back in early January, and a 2 in many brackets now. What my system
saw was this - they were 0-1 against the top 50 RPI, only 2-1 against the top 100, and their
best wins were over Georgetown and Princeton, with Georgetown at home. They also had ugly spots
on their computer resume such as the 7 point escape over Rider at home after trailing at halftime.
Even RPI hated Maryland, at #26. Dayton's resume was better in pretty much every way but nobody
had them up there.
- Arizona: has been a 4 seed in most people's brackets for a while, even though throughout most
of January and early February they had beaten nobody in the top 50 of the RPI. On January 1 they
sat at 0-1 against top 50 and had an RPI of 42.
- North Carolina: is currently a 2 seed in most people's brackets in spite of a pathetic 3 wins
over the top 50 RPI at this time of year and an overall losing record of 3-4 against that category.
My system currently favors Oregon as a 2 because of the 8-3 top 50 record and Miami at 7-2. Right
now UNC's best road/neutral win is probably at Florida State or neutral vs Temple. Not what I
would expect from a 2 seed. My system currently has them at 3.
- Butler: had a good start but has been falling fast and currently has a losing league record
in the Big East. They have been in and out of the field in my system for the last few weeks,
but most people have had them in the entire time. This is not a good team! 7-9 against the top
150 is pretty embarrassing.
- Providence: also had a good start and sat in the high single digits in RPI in early January,
but has had a completely miserable two months since. It does not seem to have affected their
tourney chances, though, according to most people. They also sit at 7-8 in the Big East. The win at Villanova is nice and
all, but in addition to the resume being pretty mediocre, they also have horrible computer strength
(61st in my system, 59th in KenPom, about 55th in Sagarin). What does everyone see in this team?
Are people still getting used to the idea that our November/December image of this team is a mirage?
On the bracket matrix, I am one of only two systems that has them out of the bracket entirely right now.
They are sitting at an 8 seed average! This is ludicrous!
Compare this Providence team to last year's BYU. That team "snuck" into the tourney on the back of
a late win over 2-seed Gonzaga at Gonzaga. People were still not sure if that was enough. They
spoke of BYU's resume hanging entirely on that win. In the end it was barely enough to get them
in, but they had to sweat it. And that BYU team had a higher computer score than Providence.
As a programmer of a system, my job is to look at all the data and figure out what the common
traits are which bolster certain profiles in everyone's eyes. Unfortunately, there doesn't seem
to be some easy magic formula which does it all. There are many, many factors which can be entered
which people claim are important - top X record, league record, overall record, RPI, R/N record,
"the eye test", and the list goes on. What is maddening is that whenever anyone discusses a particular
team in regards to seeding, they seem to ignore some of the factors at random and focus on only some
parts of a team's resume. You see this on sports channels when they do "blind resume comparisons".
The little graphics only have room for 4 or 5 factors, and they hand pick ones that will make everyone
choose the team they want. What is also maddening is that somehow, all of the major writers and
sports sites tend to pick the SAME random factors to ignore about each team!
Let me give you an example. Here is Saint Joseph's profile in my computer's eyes. They are 25th
in RPI (something like a 7 seed), reasonable OOC SOS of 75th, 23-5 overall record, 3-3 vs top 50,
5-5 vs top 100, and 11-5 vs top 150. They have a decent but not exciting collection of top wins,
including a big one over Dayton. They are ranked 32nd in my computer's regular rankings and 36th
in KenPom. They have 0 bad losses. Yet until that win over Dayton, they were right on most people's cut line and entirely
out of some people's brackets. What gives? People quoted just an overall lack of quality wins
as the culprit. But turn that around on Arizona - they were in pretty much the same boat but
up high in the bracket. Why is that factor so important for Saint Joseph's but not for Arizona?
Is it because Arizona is perceived as being a "better" team so doesn't need that part of the
resume to be there? If perception as a good team is all that's necessary for a good resume,
what about Wichita State, who is a top 10 team on KenPom (#9 right now) but still very much on
the bubble (a 10 seed on Bracket Matrix)?
I think "perception" is the big one here. People are swayed by results as they look at the
time and don't often step back to re-evaluate. Maryland's win over Georgetown looked big at the
time, but Georgetown has turned out to be pretty mediocre. Everyone still seems to be stuck in
the moment when Providence won at Villanova. And I think a lot of the perception difference between
people and computers is in the star power. All of the teams I listed above are blue-bloods with
5 star recruits and big name leaders. Everyone associates March with the players who rise up
and play at a whole new level. And the committee has shown before that they want big stars
to play in the tournament even if their team is not all that deserving. A lot of surprising
selections and snubs can be explained by this I think. NC State a few years ago was an example
of a surprising selection. UCLA and Ole Miss last year were surprisingly chosen and had some
big star players. And Colorado State and Temple last year were some surprising snubs.
Those teams played a "boring" brand of ball - Colorado State just destroyed teams on the offensive
glass, but somehow that is more "boring" than a good shooter who can knock down 3s.
Look at the teams on the list above. The biggest offender, Providence, has Kris Dunn, who
everyone is crazy about. Maryland has a million stars everyone is excited about. Arizona
is Arizona. UNC is UNC. And then there is Saint Joseph's, who doesn't really have anyone
called a "star". I think this is one of the keys to all of it. But it is not something
I can really program since I have no roster information. It also, like the factors mentioned
above, is not foolproof: Wichita State has two amazing players but doesn't seem to get
the same hype bump. So who knows. Maybe some day I will get to talk to a real bracketologist
who can explain why sometimes factors are allowed to be ignored and sometimes not.
ENTRY 26 - 12-17-15 - Bowl Special
Well, here it is, the bowl special! The bowl predictions are already up and can be viewed here: Predictions page.
My system estimates it will go about 60% on the bowl games. Although this seems much lower than
the typical 70-75% rate it has been hitting throughout the season, keep in mind that the bowls are,
for the most part, set up to be somewhat equal. Over half of the bowl games have win probabilities
falling somewhere between 50% and 60% for the favorite, which would classify them as virtual
toss-ups. There are very few enormous mismatches that give free wins. Past bowl seasons have
confirmed this. My system rarely goes more than a few games above 50%. In fact, it usually
underperforms even the low 60% accuracy estimation. Some reasons I think this may happen:
- Bowl season represents basically a several-week bye week for every team. This allows teams to get
healed up if they are injured. Oregon and TCU can tell you how big of a difference that has made in their
season.
- Several weeks between games gives some teams a lot more time to practice and improve. The same
thing often happens throughout the course of the long basketball season - there tends to be kind
of a return to preseason expectations near the end of the season. A team with lots of talented young
players may have trouble finding chemistry but come together at the end and play up to their potential.
- Motivation is a huge determining factor in these games. An unmotivated team will play way below their
potential level. It is difficult to assess motivation in a computer program.
All of these represent variables that confuse the ratings determined throughout the season and make the
results closer to 50-50 than they should be. Again, some day I may try to program some variables like
these into the prediction algorithm. But that day is not here yet and will probably not be until
at least this summer.
With all of that said, here is a list of some "Interesting" program predictions and my take on them:
ARIZONA VS. NEW MEXICO (System Arizona by 7, Vegas Arizona by 8) - This game could have a lot of those
intangible factors in play for both teams.
New Mexico hasn't been bowling for a while and they have been red hot down the stretch, beating
several of the top teams in the MWC. Meanwhile, Arizona has had a disappointing season and served as
whipping boys to the upper class of the Pac 12. I personally have New Mexico winning a close one here.
SAN JOSE ST VS. GEORGIA ST (System Ga St by 1, Vegas San Jose St. by 2.5) - Kind of the same as the
game above. Georgia State started the year poorly but has won the last 4 to get into a bowl game, including
a thrashing of Georgia Southern in the last game. San Jose State hasn't really done anything impressive
all season. I'm siding with my system here with Georgia St.
WESTERN KENTUCKY VS. SOUTH FLORIDA (System USF by 1, Vegas Western Kentucky by 2.5) - Same story. USF
has a lot of momentum coming into this game, and Western Kentucky has had some injury and inconsistency
issues. I am going with my system with USF to beat a ranked Western Kentucky team.
TOLEDO VS. TEMPLE (System Toledo by 3, Vegas Temple by 1.5) - My system has been down on Temple all season,
with some early season struggles still weighing it down. I've talked about this in some of the previous
blog entries. I am starting to come around on Temple more now myself and I think this game is as much of
a toss-up as any other game on the schedule. My personal pick is on Toledo.
WASHINGTON VS. SOUTHERN MISS (System Wash by 11, Vegas Wash by 8.5) - My system has been high on Washington
for a while. They are currently in my top 25 at #20. They are inconsistent though, and could easily lose
this game. My pick is Washington.
UCLA VS. NEBRASKA (System UCLA by 3, Vegas UCLA by 6.5) - This game is a case-study in the value of a win.
Nebraska has played quite well but lost some head-scratchers at the last second by only a few points. UCLA
has been in the opposite position, of winning some nail-biters at the last second. These teams are only a few
plays away from having their records reversed. So my system gives UCLA only a small edge. Is this philosophy
right? We will see.
CENTRAL MICH VS. MINNESOTA (System Minnesota by 1, Vegas Minnesota by 6) - Minnesota has been just utterly
putrid this year in a good league. Central Michigan has been OK but in a bad league. I have Central Michigan
in this one, but mostly on principle. 5-7 teams do not belong in bowl games!
NORTH CAROLINA VS. BAYLOR (System UNC by 1, Vegas Baylor by 2) - Another case study here - does crushing
everyone in a bad league (the ACC) translate to winning under the spotlight? I myself like Baylor in this
game as long as they are playing a reasonable QB.
AUBURN VS. MEMPHIS (System Memphis by 7, Vegas Auburn by 2.5) - This is an interesting game to watch in the
context of my above comments. This game would have been perceived as a complete mismatch in favor of
Auburn in the preseason. Now, it is Memphis overachieving and Auburn underachieving. However, I think the
extra 3 weeks of practice (as well as the last few Auburn games) are allowing this team to finally pull
into form and look a little more like what we expected in August. Memphis? They have already peaked and
are going to fall behind in the arms race. I pick Auburn here.
LOUISVILLE VS. TEXAS AM (System Tex AM by 1, Vegas Louisville by 1) - I think Vegas used to have Texas AM
here but they have some quarterback eligibility issues which have pushed it the other way. I am taking
Louisville here.
USC VS. WISCONSIN (System USC by 1, Vegas USC by 3) - USC has been a confusing team to gauge
over the last several years. I think the motivation issue is stronger here than with almost any other team.
They can play exceptionally well, then lose to a random patsy. On paper, USC should win this game by 2
touchdowns. Wisconsin is really not that good. I have USC, but I'm worried.
HOUSTON VS. FLORIDA ST (System FSU by 4, Vegas FSU by 7.5) - Florida St has been playing better recently
(ever since the revelation that Dalvin Cook is a really, really good player). Houston has fallen off
a bit, partially due to injuries. I say go with FSU if you can get inside the coaching staff's mind
and see Cook getting 25+ carries.
OKLAHOMA VS. CLEMSON (System Oklahoma by 6, Vegas Oklahoma by 4) - Oklahoma is really good. They have
destroyed nearly everyone in a conference with a lot of power at the top. Beating Oklahoma St. by 30
was a sight to behold. Clemson has done the same, but in a much worse conference. They are battle-tested
with Florida St, Notre Dame, and UNC, but none of those games were easy. I think the smart money is
on Oklahoma, but I just have this feeling about Clemson as the latest team of destiny from the ACC.
MICHIGAN ST VS. ALABAMA (System Alabama by 6, Vegas Alabama by 9.5) - I don't think it will even
be as close as either of these estimates. Michigan State is not a good team at all. They are about
the equivalent of last year's Florida State. And they will fall to about the same fate, losing by at least
4 TDs in their first real game of the entire season. And no, collecting gifts from Ohio State and Michigan
does not count.
NORTHWESTERN VS. TENNESSEE (System Tenn by 7, Vegas Tenn by 8.5) - It is really comical how much
Northwestern is overrated right now. System still has them at #50. Remember, this is the team that
lost to Michigan 38-0. Top 15 teams do not get shut out like that. I pick Tennessee in an easy one.
MICHIGAN VS. FLORIDA (System Mich by 4, Vegas Mich by 4.5) - Speaking of teams that are overrated.
Florida has one of the most putrid offenses I have ever seen. Florida State might be good, but you still
have to manage some kind of offensive points against them. I don't know why Florida has any business even
being in the top 25. The committee got this one wrong. With that said, Michigan has really taken a turn
for the worse as well and Florida's D might just be able to outscore them. I have Michigan in this one.
NOTRE DAME VS. OHIO ST (System Ohio St by 7, Vegas Ohio St by 7) - Do you think Elliott will see the ball?
If so, Ohio State wins this one no problem.
STANFORD VS. IOWA (System Stanford by 6, Vegas Stanford by 7) - I was really pulling for Iowa to win the Big 10
championship. Why? I am certain they would have been the worst team to ever make the playoffs, and it
likely would have stood for a while. Their schedule is a joke, about the likes of the old Boise State schedules
for which they got some BCS Championship game hype. The difference? BSU never had a real chance, and Iowa
was guaranteed in if they won out. Where is the fairness? Where is the objectivity? Stanford wins this,
and it isn't close.
OKLAHOMA ST VS. OLE MISS (System Miss by 4, Vegas Miss by 7.5) - This doesn't really feel like a big 6 bowl.
Oklahoma St got here on the back of some flukes and gift wins, followed by losing badly the only two important games
they played. Ole Miss had the same but a little less extreme than that. So I take Ole Miss.
OREGON VS. TCU (System TCU by 3, Vegas EVEN) - Another one about as close to a toss-up as it gets. This
one comes down to which team gets healthier over the long break. I think TCU is able to pull this one out
since they are a little more complete of a team.
View the link above to see the rest of the picks my system made. They all look like good picks now, but
remember! Football has all kinds of variance. A 3-point edge like a lot of these games have is virtually
a coin flip. Most games of college football have multiple plays, each of which swings the game by 7 or more
points either way. It all comes down to a missed tackle, a cornerback who makes just the right undercut of
a route to lead to a pick 6, the outcome of a scrum for a fumbled ball, etc. These, especially the last one,
are for the most part random and you never know when a team is going to uncharacteristically lose 3 fumbles
and as a result play 21 points worse than their ability suggests.
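To put a rough number on "a 3-point edge is virtually a coin flip", here is a minimal sketch that turns a predicted margin into a win probability. It assumes the actual margin is normally distributed around the predicted one with a standard deviation of 16 points. Both the normal model and the 16 are placeholder assumptions for illustration, not the formula my system uses.

```python
import math

# Assumption: actual margin ~ Normal(predicted margin, MARGIN_SD).
# The 16-point standard deviation is a placeholder, not a fitted value.
MARGIN_SD = 16.0

def win_probability(spread, sd=MARGIN_SD):
    """P(favorite wins) for a favorite predicted to win by `spread` points."""
    return 0.5 * (1 + math.erf(spread / (sd * math.sqrt(2))))

for spread in (1, 3, 7, 11):
    print(f"favored by {spread}: {win_probability(spread):.1%}")
```

Under these assumptions a 3-point favorite wins somewhere in the mid-to-high 50s percent of the time, and three lost fumbles worth 21 points is more than one standard deviation on its own.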
I'll be back on here again to talk some basketball soon.
ENTRY 25 - 12-8-15 - Bowl Special coming up
It's really late right now and I can't type a lot, but I wanted to put something here in case anyone
still comes to this page and has given up on me after the last month. Getting my system ready for
basketball season has been more trouble than I expected, and there are still bugs in the system I am
trying to work out. The primary bug is causing my system to give absurd results of games (i.e. 227-6
as final scores) whenever a prediction period spans 2 different months.
Ever since I started posting the weekly statistics inside the predictions page, it has seemed less necessary
to state them here as well. For the most part, the system has continued to overperform this year and
has been exceeding the "expected" number of correct predictions most weeks.
Soon I will be putting up another blog post as kind of a look back on the football year, and a look
forward to bowl season. The system predictions for the bowl games are now posted and you can find
them on the predictions page. I'll also put in my two cents about the basketball season so far
and what I think will happen in the remainder of the season.
See you next time!
ENTRY 24 - 11-4-15 - Weeks 8 and 9, and upcoming Basketball season
I missed last week due to trying to catch up on some work. Now I'm back!
Week 8 Stats:
Number of FBS Games: 55
Number Correct: 43 = 78.1%
Expected Number Correct: 40.071 = 72.1%
Average Error: 13.34
Average Signed Error: -1.42
Week 9 Stats:
Number of FBS Games: 52
Number Correct: 42 = 80.7%
Expected Number Correct: 39.316 = 75.6%
Average Error: 13.07
Average Signed Error: -2.54
Comments:
You can view the exact predictions and their results on the Predictions page.
Week 9 was a bit of an unusual week for the middle of conference season. There were only 6 games the entire
week which had win probability in the 50-60 range for the favorite, compared with 10 or 12 for most weeks.
As a result, the expected number correct was quite a bit higher than usual, and the system delivered with
another percentage above 80%.
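The link between the number of toss-up games and the expected number correct can be sketched as follows. This assumes the expected number correct is simply the sum of the favorites' win probabilities (the system always picks the favorite). The two slates below are made-up probabilities, not real weekly data.

```python
def expected_correct(favorite_probs):
    """Expected number of correct picks when always picking the favorite."""
    return sum(favorite_probs)

# Hypothetical 8-game slates (made-up win probabilities for the favorite).
tossup_heavy = [0.52, 0.55, 0.58, 0.56, 0.53, 0.90, 0.85, 0.70]
mismatch_heavy = [0.55, 0.80, 0.85, 0.90, 0.92, 0.95, 0.88, 0.75]

for name, slate in (("toss-up heavy", tossup_heavy),
                    ("mismatch heavy", mismatch_heavy)):
    exp = expected_correct(slate)
    print(f"{name}: expect {exp:.2f} of {len(slate)} correct "
          f"({exp / len(slate):.1%})")
```

Fewer games in the 50-60 range pushes the expected percentage up, which is what happened in week 9.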
It is interesting to see the signed errors so far negative (meaning the home teams are not being given enough
credit). In week 9, the home advantage was almost an entire touchdown!
The college football playoff rankings were just released yesterday. Here are some of my comments
about them:
- There has been much said about Memphis and their place in all of this. Some have Memphis in their top 4
right now. I know I have been mistaken about them earlier (in Entry 20 I proclaimed that they would be
embarrassed by Mississippi, and we know how that turned out), but the reasoning I used there still stands.
It is still the same Memphis team that barely wheezed past South Florida and has looked at times pretty
bad against some awful teams. If they win the rest of their games by 20 points over their ranked
conference foes they'll start to have a case in my eyes.
- Although my system has Baylor in the top 4, I agree they don't belong up there yet. They should
not be rewarded for their pitiful nonconference schedule. I can't wait for the end of the year when all
the Big 12 teams have a loss and they are left out again because their schedules don't compare to the other
1-loss schedules. FBS teams should not be playing against FCS teams! It is basically just exploiting a
loophole in scheduling requirements in order to get an extra bye week.
- Since the emphasis is on resume rather than the "eye test" and actual team strength, the committee's
rankings sadly have the same flaw that the actual polls have - they are basically ranked in order
of wins/losses with not much regard for how the games were played. Toledo just got spanked yesterday,
and neither I nor my system is surprised (it had Toledo at #35). My system actually has 6-2 Bowling Green
ahead of them at #30, and that team sure looked the part today against Ohio. Temple is still not a top 25 team, even
after pushing Notre Dame. They still have too much bad history to wash down first, like their fortunate
escape over UMass.
- Others have complained about some of the 2-loss bluebloods that made it in the rankings. Someone has to
explain to me how Texas A&M, UCLA, and Northwestern are in there. It just seems like they took all the 2 loss teams
and threw them in on principle. UCLA's signature win is over a team that is now sliding badly (Arizona) and
ranked 75th in my system. They just got done barely beating Colorado. This is not a good team! Same
with Northwestern - top 25 teams don't lose 38-0 to Michigan. Period. They are just lucky Mike Riley
is coaching Nebraska or that would be 3 losses in a row. Sad lack of vision here.
- Meanwhile, deserving teams like USC, Tennessee, and Washington are not considered since they have 3 or more losses.
My system has USC at #9 and Tennessee at #14 as of right now. Both of those teams are in prime position
to win out given the strength of their teams. Tennessee is a massive favorite in every remaining game and
should make it to 8-4. USC will probably stumble once but still finish at 8-4. And at that point they
will probably both be in the committee's top 25. So at this point you have to ask the question - what
is the purpose of the committee rankings? Is it to find the 4 best resumes, or the 4 best teams? Their
stated goals seem to be the latter, but just like the polls they seem to always tend towards the former
anyway.
In other news, basketball season is nearly upon us! I, of course, have been lazy and haven't gotten the
stuff rolling on my ranking system calibration. It is a much more massive undertaking with basketball
than football, really because I have so much more data available. The structure is not in place right
now to churn through the almost 100,000 games in the database and optimize like I did with football.
What this means is I think again I'll have to push off the greater portion of the work to this summer, and
more or less run out the system I currently have for another season. This is not a complete loss - my
system was in the top 10 computer rankings in its bracket picks, which has to count for something.
The only fixes I intend to make before the season starts are the following:
- Rework the preseason data to align with the football system, so each previous year is its own entity.
Currently it treats the last 5 years as a very long season with much higher weights on more recent games.
- Set up my score collection system to use both Massey's data and KenPom's data. Currently it uses only
KenPom's because it is more descriptive of the game circumstances, but he only posts the games after they
happen. As such, it makes predictions impossible. Of course, with all the little bracket tournaments in
November and December, it will still be impossible to use Massey's data to predict games occurring after
the first round. But it will at least give most of the predictions.
See you next week with the first preseason basketball rankings, which I will have ready in one form
or another before Friday's tipoff!
ENTRY 23 - 10-18-15 - Week 7 System Performance and Probabilities
Week 7 done, and it is time again to look at the stats for this week:
Number of FBS Games: 56
Number Correct: 36 = 64.2%
Expected Number Correct: 40.392 = 72.1%
Average Error: 14.23
Average Signed Error: 1.69
Comments:
You can view the exact predictions and their results on the Predictions page.
Thanks to the midweek work I did, I now have some new numbers to digest each week: The win probabilities!
Read Entry 22 on 10-15-15 for more about how I arrived at these numbers.
The new "Expected Number Correct" will be added to each week's stats, and is also added to the predictions
as they are posted up for each new week. This really hammers home something I've tried to emphasize
in the past: College football is really random. Many of the outcomes are nearly coin flips. For one
prediction system to consistently get a very high percentage correct is just impossible because of the
odds against it. I would be shocked if my system got all of the games right in a given week! Rather
than celebrate a victory, I'd be scrambling to figure out what went wrong and who is accepting money
under the table!
This week is just as appropriate as any to bring up this idea. This was my system's worst week for
predictions of the season at only 36 of them (64%) correct. But the win probabilities suggested I only expected
to get about 40-41 of them correct. Is 4-5 games a reasonable amount to expect to be off by? Some
elementary binomial statistics assuming a straight 72% chance of each game being correct suggests that
I should only start running for the hills if the number correct is off by more like 7 in either direction
(either less than 33 right or more than 48 right). You need only look to last week to see that it goes
both ways - last week my system was 3.5 games above the expected number. The law of large numbers will
have it all average out in the end.
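The binomial check described above can be sketched in a few lines of Python. This is only an illustration under the same simplifying assumption stated in the text (every one of the 56 games has a flat 72% chance of being predicted correctly); the function names are made up for this example.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k correct picks out of n, each correct with probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_between(lo, hi, n, p):
    """Probability that the number correct lands in [lo, hi] inclusive."""
    return sum(binomial_pmf(k, n, p) for k in range(lo, hi + 1))

# Assumption from the text: 56 games, flat 72% chance per game.
n_games, p_correct = 56, 0.72
mean = n_games * p_correct
sd = math.sqrt(n_games * p_correct * (1 - p_correct))

print(f"expected correct: {mean:.2f}, standard deviation: {sd:.2f}")
print(f"two standard deviations: {2 * sd:.2f} games")
print(f"P(33 <= correct <= 48): {prob_between(33, 48, n_games, p_correct):.3f}")
print(f"P(correct <= 36):       {prob_between(0, 36, n_games, p_correct):.3f}")
```

Two standard deviations works out to a little under 7 games, which is where a threshold of "off by 7" would come from under this assumption.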
Overall for the season up through Week 7, the probability model suggested my system would get 272.244
out of the 363 FBS games correct (75%). Actual number correct: 275 (75.8%). Some weeks were above
the prediction and some were below - it all averages out.
So how did my system do so poorly this particular week, going over 4 games below the prediction? Through
the first half of Saturday I was ready to call "conspiracy theory". Almost every coin flip game was going
against me. Normally sure-handed Boise State decided to turn the ball over 8 times to lose to Utah State.
There don't exist words for Michigan's choke. The 2nd-highest win probability game of the day (Ball State
over Georgia State at home, 98.6%) went against me (I'm not exactly sure how this game was that high in the
first place - probably because some teams are still settling into their spot in the rankings). Here is
a breakdown of how teams fared when my system gave them a particular probability to win the game:
50-60%: 6 out of 15 (40%)
60-70%: 6 out of 9 (67%)
70-80%: 11 out of 18 (61%)
80+%: 13 out of 14 (93%)
So really, the damage was done in the coin-flip games (where I would have expected 2 more victories) and
the 70-80% range (where again Michigan and Boise State winning would have put things in balance). After
the horrible start to Saturday, most of the rest of the day went to plan and I did not have to run for
the hills.
Here is how next week looks in win probabilities (This can be found on the predictions page):
Number of FBS Games: 55
Expected Number Correct: 40.08 = 72.8%
My system's overall win rate over the last 15 years is about 73.2%, so we can expect a lot of weeks like
this. Through the rest of the regular season (353 games) my system expects to get 256.9 of them correct
(72.7%).
ENTRY 22 - 10-15-15 - A Football Win Probability Model
I'm back sooner than expected, and the reason is the following: I've ported over my technology from the
bracket prediction program to football, and I'm now using it to predict probability of upsets!
One immediate application of this is to determine whether my program has been performing well of late
or not. Sure, 81% correct predictions last week sounds impressive, but what if that week's slate was
just full of sure picks? Imagine a world where every game last week was so lopsided on paper that
the better team would win 95% of such matchups. Then 81% would be an abysmal record, and one could
potentially expect a picking system to get them all right! On the other hand, imagine a different
world in which every game last week was a coin flip on paper. Then 81% would be very impressive!
Clearly the context and projected win probability matters a lot here.
The above commentary and implied application relies on a probability estimate per game that is accurate.
Up until this point, though, I have not run the probability system through any kind of thorough
diagnostic to see if the probabilities it is spitting out are real in any sense. What I am asking
is the question: In all games for which my computer says "Team A wins with probability 63%", do
those games actually end up going to the favorite exactly 63% of the time? Being off in either direction
is a problem, so accuracy is really essential.
I've acquired game data back through 1996 (Thanks, James Howell!) and ran the probability predictor
through the years 1998-2014, and week 6 to bowl season in each one. The result was 8271 games of data,
which is still on the low end of where I'd want to be, but it has to be good enough for now (I stopped
at 1996 going backwards because it is the first year without draws). For each game, I had the predictor
calculate the probability of the favorite winning. I put this data into bins of 2% intervals (i.e.
50-52%, 52-54%, etc.) so the data would be readable. Then I calculated how often a favorite falling
within this range actually wins and compared this percentage to the bin it was inside.
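The binning procedure just described might look something like the sketch below. The input format (a list of predicted-percent and did-the-favorite-win pairs), the function name, and the sample data are all assumptions made for illustration, not the actual code or data behind this analysis.

```python
from collections import defaultdict

def calibration_table(games, bin_width=2):
    """Bin favorites by predicted win probability and report the actual win rate.

    games: iterable of (predicted_percent, favorite_won) pairs, where
    predicted_percent is between 50 and 100 (hypothetical input format).
    Returns a list of (bin_midpoint, games_in_bin, actual_win_rate).
    """
    n_bins = 50 // bin_width
    counts = defaultdict(lambda: [0, 0])  # bin index -> [games, favorite wins]
    for predicted, favorite_won in games:
        # A prediction of exactly 100 is folded into the top bin (98-100).
        idx = min(int((predicted - 50) // bin_width), n_bins - 1)
        counts[idx][0] += 1
        counts[idx][1] += int(favorite_won)
    table = []
    for idx in sorted(counts):
        n, wins = counts[idx]
        midpoint = 50 + idx * bin_width + bin_width / 2
        table.append((midpoint, n, wins / n))
    return table

# Made-up sample data, purely for illustration.
sample = [(51.0, True), (51.5, False), (63.0, True), (99.9, False), (100.0, True)]
for midpoint, n, rate in calibration_table(sample):
    print(f"bin centered at {midpoint:.0f}%: {n} games, favorite won {rate:.0%}")
```

A well-calibrated predictor would show an actual win rate close to the bin midpoint in every row.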
What I found was that my "first try" at the predictor was not bad. A linear regression on the data
above had R squared of 0.95, which is pretty nice. However, the predictor was overconfident in the
favorite across the board. Across all 8271 games, the predictor suggested my system would get 6363
games correct (76.9%), but instead my system correctly predicted only 6055 of the games
(73.2%). What was more interesting was the shape of the data. See below:
The x-axis has the bins, centered at the midpoint of the bin (so 50-52% is recorded as 51). The
y-axis has the historical probability that a team predicted inside that bin actually wins the game.
The data seems to have multiple important humps at which a big change in win probability happens. The
biggest one is at about the spot on the curve where the team becomes a touchdown favorite. I guess
the idea here is that it doesn't really matter whether a team is a 1 point or a 5 point favorite,
because a random fumble return for a TD will change the results of both games. Once a team is favored
by more than a TD, though, it makes a drastic difference because multiple random events favorable
for the underdog must happen in order for an upset to occur.
The other interesting feature is the tail end. The prediction bin 96-98% was actually only good enough for about
a 91% win rate, and the prediction bin 98-100% only recorded a 96% win rate. The biggest historic upset
I saw from a brief glimpse was in 1998, when Virginia Tech lost to Temple as a "99.9% favorite". Why
are such improbable upsets happening more often than my original model would suggest? My guess just
thinking about it now is that sometimes, when a game goes badly for one team, it tends to snowball.
It starts going worse and worse. A #1 team in a close game at home to an unranked opponent starts to feel the
sweat and the pressure. The coaches get frustrated. The players get frustrated. The team takes
more risks to try to win by what "everyone said" they should win by and add style points. The result
is a feedback loop that exaggerates a bad day into a horrible day and can cause massive upsets. Another
possibility is shots from the blue (good or bad) - a key injury that turns a great team suddenly
average, a key backup quarterback for a bad team turning out to be a stud, etc.
So what I have done as a quick fix for now is grafted the previous prediction model onto this quartic
polynomial and I will use the outputs of this quartic as the new model. I still can't say with
a high amount of statistical significance, but the predictions it is now making for win probability
should actually conform quite closely to the real probabilities. Each bin had about 350 games in it,
so we should reasonably expect an error of plus or minus 5 percentage points per bin. Kind of interesting,
but not surprising, that the biggest bin by far was the 98-100% bin, about double the size of the other ones.
When a brand name school wants to schedule a cupcake, they don't typically mess around.
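One way to arrive at a per-bin error figure like the one above is the usual two-standard-error rule for a proportion. The sketch below assumes that rule (a normal approximation, which is an assumption here rather than something stated in the text) and roughly 350 games per bin.

```python
import math

def two_sigma_margin(win_rate, games_in_bin):
    """Roughly 95% margin of error (as a fraction) for an observed win rate in one bin."""
    standard_error = math.sqrt(win_rate * (1 - win_rate) / games_in_bin)
    return 2 * standard_error

# About 350 games per bin, per the text; the win rates below are just sample points.
for win_rate in (0.50, 0.75, 0.90):
    margin = two_sigma_margin(win_rate, 350)
    print(f"true win rate {win_rate:.0%}: +/- {100 * margin:.1f} percentage points")
```

The margin is widest for the coin-flip bins and shrinks toward the lopsided end, so plus or minus 5 points is a reasonable worst-case figure.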
As a result of this (and I meant to do this for a while anyway), I will now put the weekly predictions
in table form with a win probability for the favorite attached. At the bottom of the table I will
include the number of FBS vs FBS games in the week, and the predicted number of games my system will
get correct. The prediction is calculated by simply adding up all of the win probabilities for the
week. I will see if I can convert over the previous weeks this year to the same format.
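As a small sketch of that calculation: the expected number correct is just the sum of the favorites' win probabilities. The slate below is hypothetical, and the spread calculation assumes the games are independent of each other.

```python
import math

def expected_correct(favorite_probs):
    """Expected number of correct picks when the system always picks the favorite."""
    return sum(favorite_probs)

def spread_of_correct(favorite_probs):
    """Standard deviation of the number correct, treating games as independent."""
    return math.sqrt(sum(p * (1 - p) for p in favorite_probs))

# Hypothetical slate of five games (win probability of the favorite in each).
slate = [0.53, 0.64, 0.78, 0.91, 0.986]
print(f"games: {len(slate)}")
print(f"expected number correct: {expected_correct(slate):.3f}")
print(f"standard deviation: {spread_of_correct(slate):.3f}")
```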
If anyone got to the end of this bunch of words, let me know what you think in the comments!
ENTRY 21 - 10-12-15 - Week 6 System Performance
Week 6 done, and it is time again to look at the stats for this week:
45 out of 55 FBS games correct (81.8%)
Average MOV error: 12.91
Average Signed MOV error: 0.04
Comments:
You can view the exact predictions and their results on the Predictions page.
Somehow my system comes through with an amazing week right in the middle of conference play. As
I have said the last two weeks, I do not expect this trend to continue, and results with these settings
from previous seasons suggest the average for a typical midseason week should be around a 73% prediction
rate. True to form, 4 of the 10 misses were on games predicted by 1 point (in the wrong direction).
However, there were just a lot of close games, and that makes my system 4-4 in such games this week, which
is to be expected.
It continues to be the case that the biggest contributor to MOV error is not surprising outcomes, but
rather blowout games that are just bigger blowouts than predicted. A perfect example is Baylor vs. Kansas,
which was predicted to be a 36 point Baylor victory (Vegas had Baylor by 46) and they surpassed both in
winning by 59. This will be part of the plan in the offseason to make the prediction system its own
animal - right now its MOV predictions come strictly from comparing team ratings and adding the home field
advantage. As stated in the methodology, this way of making predictions ultimately results in assuming
a transitive comparison of teams (i.e. if team A beats B by 10 and B beats C by 10, then A should beat
C by 20). I believe that this is not the case, and the best comparison method is somehow logistic -
up to an extent, the prediction should be some exponential applied to the teams' differences of rating.
I say "up to an extent" because this clearly can't continue forever. If my rating system had all teams
through DIII football and Baylor theoretically played the worst DIII team, they would not win by 5000 points.
DI has a little less parity, but I suspect there will still be some mitigating factor on extreme blowouts.
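To make the transitivity point concrete, here is a minimal sketch of the current style of prediction (rating difference plus home field advantage). The ratings are invented for the example, and the 3.4 point home field value is the one quoted in Entry 17; nothing here is the actual prediction code.

```python
HOME_FIELD_ADVANTAGE = 3.4  # points; the figure quoted in Entry 17

def predicted_margin(home_rating, away_rating, neutral_site=False):
    """Predicted home margin of victory: rating difference plus home field advantage."""
    hfa = 0.0 if neutral_site else HOME_FIELD_ADVANTAGE
    return home_rating - away_rating + hfa

# Hypothetical ratings chosen so that A is 10 better than B and B is 10 better than C.
ratings = {"A": 30.0, "B": 20.0, "C": 10.0}
a_over_b = predicted_margin(ratings["A"], ratings["B"], neutral_site=True)
b_over_c = predicted_margin(ratings["B"], ratings["C"], neutral_site=True)
a_over_c = predicted_margin(ratings["A"], ratings["C"], neutral_site=True)
print(a_over_b, b_over_c, a_over_c)
```

Because the prediction is a plain difference of ratings, the A over C margin is forced to equal the sum of the other two, which is exactly the transitive behavior in question.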
I have other plans too to separate the predictor from the straight comparison of ratings. In particular,
I want to analyze the effect of the previous week or two's games played by a team in predicting
this week's game. Questions are: How many points (if any) does a bye week help a team by? Along the
same lines, how many points (if any) does playing a weak or FCS team help a team by (virtual bye week)?
Does playing against a high ranking team the previous week hurt a team the next week? Does playing
a soft early season schedule hurt a team more than the numbers might suggest when they face their first real
foe? Does being undefeated late into the season cause a team to play worse, as the "pressure" gets to them
and other teams give them their best shot? All questions to answer next summer. This is the entire
principle my ranking system is founded on: There are many different "conventional wisdoms" that people
throw around in regards to predictions and team strength. Which of them hold true in the face of the
past? Data will decide.
ENTRY 20 - 10-6-15 - Week 5 System Performance
Week 5 done, and it is time again to look at the stats for this week:
41 out of 58 FBS games correct (70.7%)
Average MOV error: 12.4
Average Signed MOV error: -0.12
Comments:
You can view the exact predictions and their results on the Predictions page.
Well, I did promise a prediction percentage closer to 70% this week, right? I feel like there were
a lot of missed opportunities this time. The system went 2-4 on games predicted to be decided by 1 point.
According to the system, these are still basically toss-ups (53% win for the higher team), and sometimes
a coin will come up heads 2 times out of 6.
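For anyone who wants to check how unlucky 2-4 really is, here is a quick sketch using the 53% figure above and treating the six games as independent (the independence is an assumption).

```python
import math

def prob_at_most(k, n, p):
    """Probability of k or fewer correct picks out of n, each correct with probability p."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# From the text: 6 one-point games, 53% chance of getting each one right.
p_two_or_fewer = prob_at_most(2, 6, 0.53)
print(f"P(2 or fewer correct out of 6): {p_two_or_fewer:.3f}")
```

That comes out to roughly 29%, so a 2-4 week in these games is nothing unusual.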
There is still a high number of teams ranked lower than teams they beat earlier in the season, and
overall the rankings don't yet line up well with things like the polls. When I set my system weights
to 100% the logarithmic "achievement based" component, it always looks a little more like the polls.
Interestingly enough, if I had run the predictor on this mode for the previous week, this is what the
results would have been overall:
45 out of 58 FBS games correct (77.6%)
Average MOV error: 12.86
Average Signed MOV error: 0.69
A much better win percentage! But this is one isolated week, and the whole point of doing all the work
I did in the offseason is to determine if weeks like this are a fluke or the norm. The totality of the
system data seems to suggest the former - that sometimes one component will outperform the others. This
is one of the reasons I am using them all in some capacity - I think all of the components have something
to add to the picture.
Some comments about the Week 6 rankings, which are now posted:
- Because of the weighting of previous season data, my system will always be a little slow to adjust when
a team is much different from previous seasons. The philosophy is that an entire previous season is
a large amount of data points, and there needs to be really significant evidence in the current season
to suggest a team's strength has changed dramatically. Only a few games can be considered a fluke;
getting towards 6 or more games starts to become a trend. One example here is UCF. My system is still
in shocked disbelief that UCF has gotten so putrid this year so quickly. They are currently ranked
90th, but this is still really generous considering they've lost decisively to several teams ranked
lower. If I set the system to ignore previous season data and use only this season, then UCF is ranked
more accurately at #114. However, this also would result in Navy at #9, Houston at #15, Michigan St at #50,
Oregon at #74, and a number of other travesties. The point here is, usually in ranking systems, trying
to fix one problem resulting from an extreme corner case just causes more problems than what you started
with. I must just move on with the understanding that teams which have had big changes in strength
will be a little off for a while.
- Although many ranking systems would take offense to having Tennessee, a team with a losing record, at
#23, you should stop for a moment and think about how they have lost. One was in overtime to my #3 team.
Another was by 3 points to my #9 team. The third was by 4 points to my #37 team. The two wins were
dominating wins. You get the sense from watching the team that the #23 ranking is really about right -
this team can be very good if it ever learns to close out games. The raw talent is there, and as the
season goes on, I believe the #23 ranking will turn out to hold up pretty well.
- My system has one of the lowest opinions of California, at #49. Again, looking at their games, it
kind of makes sense. Beating Texas by a single point should set off alarm bells this season. They
barely beat Washington State at home. The evidence is there, but most computer systems and the human
polls overrate wins and will simply place all of the teams which are undefeated at or near the top.
Many of the same things apply to Temple, Memphis, and Toledo. Tell me - do you really believe that
Memphis is going to hang with Ole Miss in 2 weeks and make it a game? Do the coaches, who voted
Memphis at #25, really believe that? The same Memphis team which just wheezed past South Florida?
ENTRY 19 - 9-30-15 - Week 4 System Performance
Week 4 done, and it is time again to look at the stats for this week:
41 out of 52 FBS games correct (78.8%)
Average MOV error: 12.98
Average Signed MOV error: 2.75
Comments:
You can view the exact predictions and their results on the Predictions page.
The win percentage is still remaining quite high for it being week 4, but I suspect the number will
come down starting in week 5 as almost all teams are now entering conference play. I would not be
surprised to see next week's prediction percentage closer to 70%.
For the second straight week, we see quite a high over-estimation of the home field advantage. A
signed MOV error of 2.75 says that the real home-field advantage across all of FBS averaged to only
about 0.7 points. I feel as though the observation I made a few weeks ago may be the culprit here,
but in reverse: A lot of the home field advantage average comes from a systematic error in my system
of under-estimating huge blowouts at home over cupcake teams (like Baylor by 53 over Rice). Then,
the actual closely contested games experience less home field advantage than the average. This
is just one more thing I am eager to play around with in the offseason and see if there is anything
to this. Then again, almost a full 1 point of that margin of error is due to just a single game:
Utah 62-20 over Oregon on the road. A sample size of 50 games is still allowing for a huge amount
of random variance.
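The arithmetic behind the home field numbers can be sketched as follows. The 3.4 point baseline is the home field advantage quoted in Entry 17. The sign convention (signed error is predicted home margin minus actual home margin) is inferred from the text rather than confirmed, and the 50-point miss is a hypothetical figure used only to show the scale of a single game's influence.

```python
ASSUMED_HFA = 3.4  # points; the home field advantage quoted in Entry 17

def implied_home_advantage(avg_signed_error, assumed_hfa=ASSUMED_HFA):
    """Home advantage actually observed, given the average signed MOV error.

    Assumed sign convention (inferred from the text, not confirmed): the signed
    error is predicted home margin minus actual home margin, so a positive value
    means home teams were given too much credit.
    """
    return assumed_hfa - avg_signed_error

def single_game_share(game_error, n_games):
    """How much one game's error moves the weekly average signed error."""
    return game_error / n_games

print(f"Week 4 implied home advantage: {implied_home_advantage(2.75):.2f}")
print(f"Week 9 implied home advantage: {implied_home_advantage(-2.54):.2f}")
# Hypothetical: a single 50-point miss in a 52-game week.
print(f"Share of one 50-point miss: {single_game_share(50, 52):.2f}")
```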
Well, somehow West Virginia managed to move up even higher than last week and are sitting at #4 now.
A blowout of that magnitude (45-6) over a Power 5 team is starting to have me believing though, even if it
was Maryland. I would not be surprised at all if West Virginia defeats Oklahoma this week as is
being predicted by my system.
A kind of weird thing - Clemson moved up from #20 to #5 without playing. This is mostly due to one
of their two FBS opponents this season, Appalachian State, making a big move up with a 49-0 victory
over Old Dominion. Like West Virginia, this is the kind of thing that will correct itself over time.
It is more evidence that my system will need some tweaks in the early part of the season to give
more reasonable results. In past years, I have not even bothered putting actual rankings up
until at least 5 or 6 weeks have been played.
One last gripe to address: Utah has only moved up to #21 in my rankings this week, even after
the utter annihilation of Oregon on the road. My system is giving Utah one of the lowest rankings
out of all computer systems, and quite a bit lower than the human polls (at #10). Again, the full
body of work is being scrutinized here and there are some holes: Utah only beat Utah State by 10
points, and they are a very mediocre team. They also only beat Fresno by 20 (about the same margin
by which San Jose State just beat them), so it could be argued that Utah is about on San Jose State's
level. This is similar to the argument I made against Michigan State 2 weeks ago, and Michigan State
has done nothing since then to convince me they are an elite team either. Again, a few weeks' time
will tell whether Utah should really be up there higher or not. They will get rewarded if they
output more performances like their 62-20 win over Oregon to prove that we saw the real Utes that
night and not a mirage.
ENTRY 18 - 9-20-15 - Week 3 System Performance
Week 3 done, and it is time again to look at the stats for this week:
42 out of 52 FBS games correct (80.8%)
Average MOV error: 12
Average Signed MOV error: 1.73
Comments:
You can view the exact predictions and their results on the Predictions page.
While the percentage looks pretty good this week, I somehow feel like it is a little overstated
and the losses really hurt. Among the very close misses were Texas over Cal (by 0.2), Georgia Tech over
Notre Dame (by 0.1), Duke over Northwestern (by 0.9), and Nebraska over Miami (by 1.3). All of these
games came really close (and 3 of them would have been Vegas upsets I believe) but ultimately none
of these teams were able to put together a victory. Some victories to be happy about are correctly
picking Southern Miss over Texas State (most computers like Texas State), Navy over ECU, and Indiana
over Western Kentucky (my system is still down on W Kent even after a great start).
Some odd things to address in Week 4's Rankings:
- Alabama remaining at #1 after the loss. This is due to Alabama having an enormous lead in points
before that game, so one loss was not enough to kick them out. Some other top teams had rough days too:
Ohio State barely got past Northern Illinois, TCU struggled with SMU, Florida St did worse than
predicted against Boston College, etc.
- Oregon finally dropped from #3 to #11 on the back of a second suspect performance, allowing Georgia
State to score some good points. Michigan State dropped even more because of Oregon's plummet and
because of their own relatively small victory margin.
- West Virginia remains at #8 for now mainly as a placeholder. They have only played one FBS game
and used it to crush a decent #60 Georgia Southern team by 44 points. My guess is they will fall 10-15 spots even
if they defeat Maryland this week. A few case-studies like this are giving me ideas for how to
rewrite the early season parameters to be more accurate next season.
- Meanwhile, Charlotte is West Virginia's opposite for now, coming in as 10 points worse than any
other FBS team right now even though they have defeated two of them already, because of the absolute
massacre against Middle Tennessee. Middle Tennessee's
rating is also likely a bit inflated because of this.
This is the first season I've put out serious ratings starting from Week 1 so I imagined there would
be some problems such as these. In previous years I waited until at least 5 weeks had been played.
This gives me lots of good things to think about for next time. The biggest problem with early season
ratings is striking a balance between using only a few current season results and using previous season
results. I also believe that the procedure I alluded to last week of weighting games higher when they
are against opponents of similar skill might be a good way to fix a lot of these problems.
ENTRY 17 - 9-13-15 - Week 2 System Performance, and some Q and A
Week 2 done, and it is time again to look at the stats for this week:
38 out of 51 FBS games correct (74.5%)
Average MOV error: 12.86
Average Signed MOV error: -1.37
Comments:
You can view the exact predictions and their results on the Predictions page.
74 to 75 percent is about average for my system in a given mid-season week. There were some
very close misses, like my Miss St. over LSU prediction coming down to a terrible delay of game
penalty followed by a missed FG as time expired. There were also some good predictions, like Oklahoma to beat
Tennessee, Syracuse to beat Wake, Tulsa to beat New Mexico, and BYU to beat Boise St, all of which
were getting serious votes the other way from either the Vegas line or the other computer prediction
systems. Most of these games can be considered to have come down to just 1 or 2 plays that could
have gone the other way. Such is the way of college football, and is why no system can be expected
to be entirely accurate.
The rankings for Week 3 have a few anomalies I wanted to address in case there are questions about
it. I will present it in Q-A form along the lines of the questions I would expect to get.
Q: Why do the rankings have so many cases of a team being just a few spots below a team
they beat this week or last week? Examples are (24) Nebraska over (25) BYU, (49) Cincy over
(52) Temple, (68) Vanderbilt over (69) Western Kentucky, etc.
A: The goal of a predictive system is to stand as the best predictor of future games. If a team
is rated above another team, it is because the system believes that in the long run, the higher team
would beat the lower team a majority of the time. Some cases such as those listed above can be
considered as the system saying that the actual meeting between the teams was a fluke and would not
be expected to be repeated if the teams played again. See again the above comment about randomness
in college football and remind yourself that without one Hail Mary, BYU would have lost to Nebraska,
and similar for the other games. Think about how many game results would have been totally different
if it wasn't for just 1 stupid penalty, 1 fumble recovery that could have gone either way, or
1 missed tackle that allowed a long TD run or punt return.
Also, this early in the season there is still a lot of sway from the previous season weight. I
understand that in some cases, a team has simply gotten much better (or much worse) than previous
seasons and in these cases it will take a few more weeks for the system to give accurate ratings.
Q: Michigan State just beat Oregon. The game pushed Oregon only from #2 to #3, and Michigan
State actually moved BACKWARDS from #14 to #16 as a result of the game. How dare you give Michigan
State so little credit for such a marquee win!
A: I have a few responses to this. First, rehash again the previous question's answer. The system
is stating here that it believes the outcome of the game was a fluke and that Oregon would win
the majority of such contests. Second, the game was a home game for Michigan State and they still
barely squeaked it out. According to my 3.4 point home field advantage, Oregon is still actually
the better team even counting this result alone, and would be expected to win on a neutral field.
Third, this typifies the knee-jerk recency bias that most people have in regards to sports. The fact
is that Michigan State's ranking is the combination of a number of factors, one of which is the
strength of the team they played last week (Western Michigan). After Michigan State only managed
a 13 point win over them, Georgia Southern annihilated them next week by 26 points. So the system
should give just as much merit to the idea that Michigan State might be worse than Georgia Southern
by comparison against a common opponent!
With this said, I do intend in the future to play around with a feature of the rankings that will
assign a higher weight to games between opponents that are roughly the same skill level. This might
alleviate some of the damage caused by 70-0 wins over cupcakes vs. hard-fought 3 point wins over
close rivals. My hopes are not too high about this currently because I've already tested such a system
in NCAA basketball with negative results (it made predictions worse). There's nothing to say it won't
work in football, though, and is a project for next summer.
Q: TCU and Baylor sure made a strong push up the rankings. Auburn took a bit of a tumble. All
three of them played against FCS teams this week. I thought you said those games have no effect
on the rankings!
A: See comment 3 on the above response. Many other factors affect a team's ranking from week
to week than just who they played this week. This is the beauty of a ranking system with infinite
depth of strength of schedule consideration - a seemingly unimportant game can have far-reaching
consequences. Let's check out those three teams' week 1 opponents:
- TCU's week 1 opponent, Minnesota, beat Colorado State on the road.
- Baylor's week 1 opponent, SMU, beat North Texas by 18.
- Auburn's week 1 opponent, Louisville, lost a shocker at home to Houston.
This is (part of) the real explanation for the moves these three teams made. Examining the
continued progress of past opponents tells you a lot more about current power than you might
expect.
ENTRY 16 - 9-7-15 - Week 1 System Performance
With Week 1 in the books, here are the prediction stats:
32 out of 39 FBS games correct (82.1%)
Average MOV error: 10.97
Average Signed MOV error: -4.05
Comments:
You can view the exact predictions and their results on the Predictions page.
This was a pretty good Week 1 for my system historically (typically the average is about 80%). I
explained in a previous blog post why this prediction percentage is higher than throughout the rest
of the season in general: Week 1 always features a collection of blowout home wins of Power 5
heavyweights over million dollar guaranteed cream puffs. Every system will get these games right,
so it really comes down to the couple of close matchups. I believe this also explains the -4.05
signed MOV error (which suggests that home field advantage should have been 7.5 this weekend).
My system slightly underestimates MOV on blowout wins, and there are plenty of those in
Week 1.
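For anyone who wants to reproduce these three stats, here is a minimal sketch. It assumes the signed error is measured as predicted minus actual margin from the home team's side (this entry does not spell out the sign convention), and the sample games are made-up numbers, not the real Week 1 slate. The function names are illustrative only.

```python
def prediction_stats(games):
    """games: list of (predicted_home_margin, actual_home_margin) pairs."""
    n = len(games)
    # A pick is correct when the predicted and actual margins have the same sign.
    correct = sum(1 for pred, actual in games if pred * actual > 0)
    mean_abs_error = sum(abs(pred - actual) for pred, actual in games) / n
    # Assumed sign convention: predicted minus actual, from the home team's side.
    mean_signed_error = sum(pred - actual for pred, actual in games) / n
    return correct / n, mean_abs_error, mean_signed_error

def implied_home_advantage(current_hfa, mean_signed_error):
    # A negative signed error means home teams beat their predicted margins,
    # so the home advantage that would have zeroed it out is larger.
    return current_hfa - mean_signed_error

# Hypothetical games: (predicted home margin, actual home margin).
sample_games = [(10.0, 24), (-3.0, 7), (14.0, 35)]
accuracy, mae, signed = prediction_stats(sample_games)
print(accuracy, mae, signed)

# With the 3.4 point home advantage and the -4.05 signed error from this week:
print(implied_home_advantage(3.4, -4.05))  # about 7.45, i.e. the 7.5 quoted above
```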
Some of the misses can be blamed on the preseason machinery I'm using.
Picking Vanderbilt over Western Kentucky is an example. WKU only got really good recently, and
Vanderbilt was quite good 4 or 5 years ago, so the preseason weights put more emphasis on
Vanderbilt's good past and WKU's bad past and gave Vanderbilt the predicted win. Before
faulting this and becoming results oriented, though, I'd like to point out that
Texas A&M thumped Arizona St. That was a correct prediction that went against what last season's
postseason results alone would suggest and stayed closer to the overall program strength over the last 5 seasons.
There will be hits and misses either way, and the data I have right now suggests the sustained
success model is the best predictor going forward.
ENTRY 15 - 9-4-15 - Phasing Out Preseason Rankings
Since football season is finally underway, it is officially (past) time to analyze my current system
for phasing out the preseason rankings and see if I can squeeze some extra value out of it.
I already discovered in the process of finding the results for the last blog post that on Week 0,
the optimal rankings are produced by combining the previous 5 seasons' team scores with weights
1, 1/2, 1/5, 1/5, 1/5. This is to say that although the most recent previous season was definitely
most important, the other 4 seasons prior to that should all add up to about the same level of
importance. Sustained excellence should have an impact on predictions of future excellence. A
little bit of playing around seemed to indicate that as the season progresses, these numbers should
hold pretty steady in providing the preseason data for the rest of the year's rankings.
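A minimal sketch of that weighting, assuming the combination is a normalized weighted average of one score per season (the exact form of a "team score" is not given here, and the sample numbers are hypothetical):

```python
SEASON_WEIGHTS = [1, 1/2, 1/5, 1/5, 1/5]  # most recent season first

def preseason_score(past_scores, weights=SEASON_WEIGHTS):
    """Weighted average of a team's scores from previous seasons, most recent first."""
    pairs = list(zip(past_scores, weights))
    total_weight = sum(w for _, w in pairs)
    return sum(s * w for s, w in pairs) / total_weight

# Hypothetical team: strong last year, average in the four seasons before that.
print(round(preseason_score([80, 70, 60, 60, 60]), 2))

# Share of the total weight carried by the most recent season vs. the other four.
recent_share = SEASON_WEIGHTS[0] / sum(SEASON_WEIGHTS)
print(round(recent_share, 3), round(1 - recent_share, 3))
```

The last line shows the split described above: the most recent season carries a little under half of the total weight, and the four older seasons together carry the rest.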
The point of today's parameter manipulation was to determine how many weeks into a new season
the old preseason data should still have effect, and by what factor to phase it out until it is no
longer present at all. What I found is the following:
- The game in Week 1 should have about 1/3 of the impact on that week's rankings, against 2/3
preseason numbers.
- This should fall slowly until Week 9's Rankings, at which point the current year's games should
account for about 4/5 of the weight, against 1/5 preseason numbers.
- Starting with Week 10's Rankings, it is unclear from the data I have whether the preseason
rankings should continue to have an effect or not. If they do continue to have an effect, the
weight of the current season should continue to increase.
To restate the caveat from last week's blog post, this analysis was only done with results from
seasons 2007-2014, with a total of about 420 analyzed games per week that I was investigating.
This is a relatively small sample size to say anything conclusive in most of these cases. Here
are the numbers for correctly predicted game percentage from each week with/without preseason data.
Prediction Week ; Percentage with preseason ; Percentage without preseason
7 ; 75.1 ; 70.0
8 ; 73.6 ; 71.1
9 ; 75.9 ; 74.9
10 ; 74.7 ; 73.8
11 ; 71.7 ; 72.1
12 ; 74.5 ; 75.1
13 ; 74.6 ; 73.7
14 ; 75.6 ; 74.9
It is quite safe to say that preseason data is still helping up through Week 8's predictions
(Week 7's Rankings), but the percentages are too close together after that to say anything with any
statistical confidence at all. With these sample sizes, one needs about a 2.1 percent difference
to be confident it is not just noise. That they seem to flip-flop from week to week after Week 9
is an indication that nothing can be said past this point. I will have to continue to monitor
this situation as more data is gathered and see if then anything can be said. For the current
season, I will phase out the preseason data entirely starting with Week 10's rankings.
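The phase-out schedule described in this entry can be sketched as follows. The entry only says the preseason share "should fall slowly" between Week 1 and Week 9, so the straight-line interpolation here is an assumption, as are the function and parameter names. The endpoints (1/3 current season at Week 1, 4/5 at Week 9, preseason dropped from Week 10 on) come from the text above.

```python
def current_season_weight(week, start=1/3, end=4/5, first_week=1, last_week=9):
    """Share of a given week's rankings that comes from current-season games."""
    if week < first_week:
        return 0.0  # Week 0: preseason data only
    if week > last_week:
        return 1.0  # preseason data dropped from Week 10's rankings on
    # Assumed shape: linear ramp between the two stated endpoints.
    frac = (week - first_week) / (last_week - first_week)
    return start + frac * (end - start)

for week in (0, 1, 5, 9, 10):
    w = current_season_weight(week)
    print(week, round(w, 3), round(1 - w, 3))  # week, current share, preseason share
```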
ENTRY 14 - 8-24-15 - Football Preseason is up
I finally did all of the program analysis I wanted to do before the football season, so the preseason
rankings can finally go up! For anyone interested, here is the list of changes (and attempted changes)
that I made and tested.
The idea was to set all my system parameters so they would both maximize the number of correct predictions
and minimize the point spread errors over the last 7 years, if those preseason rankings were used to
make predictions for the entire season. I can only go back 7 years since I only have data back to 2004, and
since the program uses several previous years' data I can only safely make preseason rankings from 2008 and on.
I should note that all of the following was done entirely ignoring results and predictions against FCS
teams.
The first step, before doing that, is to make sure my actual ranking system is running in top shape for the
games in which it has enough season data to work with (after week 7). This will provide a good foundation to
start from for the preseason stuff.
My program consists of three major components right now: a transitive linear score comparison component, a transitive
scoring ratio comparison component, and a logarithmic comparison component that emphasizes wins more and has
quickly diminishing returns on blowout wins. I set about first optimizing each individual component before
putting them together and determining the optimal configuration of all three together. Here are the overall
conclusions of optimizing the individual components:
- The home advantage of 4 that I have been using is too high. The correct home advantage, which I will now
be using, is 3.4.
- All three components had too high of an overall range between the teams. By dampening all three of them, I was
able to increase slightly the prediction capability and margin of victory estimates.
- The linear component performed the best with about 70.7 percent of games chosen correctly. It only needed
tweaking to dampen the overall range of scores.
- The ratio component needed some minor tweaking from its previous state to perform at its best. The idea
behind the ratio component is to assign teams a number for each game of the ratio of total points in that game
by that team. This approach works well in a game like basketball without modification, since both teams will score
some number of points in the game, but in football an adjustment is needed because 1) Sometimes a team scores 0
points in a game, which makes ratios awkward, and 2) It is probably unfair to base a team's entire performance on
their score. It would probably be more fair to use a measure like total yards that describes how well a team was
moving the ball / stopping the opponent from moving. Since my program does not have this data available, however,
the best fix is to insert some kind of proxy points. I had been using a proxy score of 10 per team in my calculations
before (i.e. a score of 30-10 is instead scored as 40-20 for the purpose of finding the point ratio.) However,
the predictive value of this component was optimized at a proxy score of 45 per team. I find this really interesting
because basically the effect of having such a high proxy score is to de-emphasize strong defense and the ability
to hold opponents under 10 points. After this adjustment, the ratio component predicted on its own about 70.4
percent of games correctly.
- The log component also needed some minor tweaking. This component basically works by taking the difference in
two teams' scores in a game and applying a logarithm to actually record that team's score for the game. The
logarithm sees a big difference between a margin of 1 point and 3 points, between 3 and 7, and between 7 and 20.
Past that point the logarithm dampens blowout wins pretty significantly. The system applies a positive buffer to the margin
of victory to give claiming the victory at all some small amount of points. What I found was that the buffer I was using
before was too high - the victory itself needed to be worth less. Also, the other astonishing thing is that before,
I was assigning automatically a higher log score to away victories and a bigger log penalty for home losses. After
testing, this turned out to be just entirely detrimental and I removed it. After these changes, the component
predicted 69.6 percent of games correctly.
- Just for fun, I also tested two more components which are not a part of the actual system right now: The components
of the linear predictor which are only concerned with offensive and defensive strength. Defense Only prediction was
able to predict 64.6 percent of the games correctly, and Offense Only prediction was able to predict 60.9 percent of
the games correctly. This would seem to agree with the conventional wisdom that defense "wins championships".
- The average margin of victory error between the three reasonable components was about 13.1 points.
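To make the proxy score idea from the ratio component concrete, here is a minimal sketch. It assumes the ratio is the team's share of total points after the proxy is added to both sides, which matches the 30-10 becoming 40-20 example above. The function name is illustrative only.

```python
def point_ratio(team_points, opp_points, proxy=45):
    """Team's share of total points after adding `proxy` points to each side."""
    team = team_points + proxy
    opp = opp_points + proxy
    return team / (team + opp)

# The 30-10 example with the old proxy of 10 (scored as 40-20) and the new proxy of 45.
print(round(point_ratio(30, 10, proxy=10), 3))
print(round(point_ratio(30, 10, proxy=45), 3))

# A 21-0 shutout: with no proxy the ratio is pinned at 1.0, with a proxy it is not.
print(point_ratio(21, 0, proxy=0), round(point_ratio(21, 0, proxy=45), 3))
```

The larger the proxy, the closer every game is pulled toward an even split, which is why a proxy of 45 takes most of the emphasis off holding an opponent to a very low score.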
After putting all the components together and running predictions with the composite, I settled on a ratio of
Score = 4, Ratio = 2, Log = 1. This ratio predicted about 70.4 percent of games correctly, but outperformed the linear
predictor in margin of victory predictions by a decent margin so I went with it.
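One way to read the 4 : 2 : 1 ratio is as a weighted average of the three component outputs. The sketch below assumes exactly that, and also assumes each component has already been put on a common scale (for example a predicted margin). Neither detail is stated in this entry, and the input numbers are hypothetical.

```python
COMPONENT_WEIGHTS = {"score": 4, "ratio": 2, "log": 1}

def composite(component_values, weights=COMPONENT_WEIGHTS):
    """Weighted average of per-component values, keyed by component name."""
    total = sum(weights[name] for name in component_values)
    return sum(weights[name] * value for name, value in component_values.items()) / total

# Hypothetical predicted margins from each component for a single game.
blended = composite({"score": 7.0, "ratio": 5.0, "log": 3.0})
print(round(blended, 2))
```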
Next up was to change the way previous year data is fed into preseason rankings. The old way was to treat the last 5 years
as one very long season (with weights to give more recent seasons more sway) and make rankings from that. This just seems
incorrect, though, since a team beating Auburn 4 years ago during their 3-9 season shouldn't count as if it's against a
current Auburn team. I thus separated each previous season as its own season and performed a weighted average of all of them.
This improved the predictive capability of preseason rankings, but not by as much as you might expect (something like 0.5 percent).
After all of this, it was time to start adjusting components and seeing how preseason rankings held up. I changed up the component
ratios to see what would work best, and the result was really astonishing: The best ratio turned out to be
Score = 4, Ratio = 2, Log = 1, Offense Only = 2. That's right, a team's overall OFFENSIVE strength in previous seasons
turns out to be a decent predictor for future seasons, and adds to the overall predictive strength! I tried the same thing
with Defense Only and it made the predictions worse! This was not what I expected, but that is what the numbers say. Using
this setup (and the setup I am now using for preseason rankings), the preseason rankings should predict about 72.4 percent
of that season's games correctly!
Now... hold on a moment! How can it be that the preseason rankings which only have previous year data do better than rankings which
have sufficient data from the current season? The answer lies in the range of weeks that each calculation was done in. For
preseason rankings, I tested them on games from Weeks 1-10, and for the in-season rankings, I tested them on games from
Weeks 10-Bowls. The explanation is that the OOC early season games tend to be between teams that are not
evenly matched - The entire SEC plays home games against the Sun Belt, the Pac 12 plays against MWC bottom feeders, the Big 10
plays against all the directional Michigan schools, etc. So any system that is anywhere near reasonable will predict these
correctly. My program's overall record on its current setting in games between Weeks 1-3 (excluding FCS games) is 78.3 percent!
However, once the teams get into conference play and bowls, the games are usually more even. Thus naturally looking only
at weeks 1-10 will increase your win percentage.
Let me know if you have any other questions about my rankings or the process I used to update them. I can think of two
possible complaints myself:
1) Optimizing components individually and then putting them together does not necessarily optimize the whole.
However, this is the only option I have available to me right now with the time I have available and the
processing power I have access to.
2) In any sufficiently chaotic system like football seasons, there are a number of totally random results that cannot
really be predicted. It is possible that in changing and tweaking a bunch of small parameters in my system, one
set will produce better results just by coincidence and not because that set of parameters is actually better.
It's the same logic as choosing 20 different coins to use making predictions for Week 1 of the season, and declaring
the one that turns out the best as the best predictive coin going forward. Indeed, the percentage increases I have been
speaking about in these paragraphs were very small. For the sample size of seasons that I had available to use, a difference
in proportion of correct predictions would need to be about 2 percent to be statistically significant, and all my changes
have made a difference of really no more than 1 to 1.5 percent. Thus the conclusions I have drawn might just be coincidence
and they should be monitored again and again going forward as more data is available.
Thanks for reading! View my football preseason rankings on the football page.
ENTRY 13 - 8-15-15 - Previous Years
I spent some time today going through both basketball and football and including postseason rankings
for the last several years. These rankings use my system as it stands now, and were generated today, not
years ago. For this reason, the teams will sometimes have the wrong conferences listed and the wrong
conference record. It is still a project I mean to get to eventually to include for each team a conference
history (at least going back as far as I have season data). Until that point, though, they will be wrong.
The numbers it generates should all be correct, though, since it does not use conference affiliation in its
ranking calculation.
I also went ahead and put up the preseason bracketology for the 2015-16 college basketball season. I have
no problems with doing that at this point because somehow in my mind the bracketology feels like a much
less rigid thing - I don't really feel any problems with changing its algorithm throughout the season.
Perhaps that is because the bracketology is only really tested once - in mid march. The rankings are
tested every week when teams do battle. It just doesn't feel as honest to tinker with the ranking systems
when I could easily adjust parameters to cater to the weekly results and claim that my program was right
all along when it really wasn't.
The program parameters still need tinkering because the machinery to even do the self-analysis is not in
place yet. I am working on it and hope to have the analysis begin by tomorrow.
ENTRY 12 - 8-12-15 - Site Comments and Football Rankings Updates
I have some exciting pieces of news for the site leading up to the football season!
1) I have installed a comment system on the site so it is easier to post comments that others (or I) can read and reply to.
The system is not super complicated and does not allow one to reply to individual previous comments, but I will consider making
such an improvement if there is enough traffic on here to warrant it. I am fond of quoting a math professor I had in college on
matters such as this: "Mathematicians are lazy!"
The comment system appears on all of the site's pages and the comments are universal - you will see the same comments on every page.
Thus it is important if you plan to make a comment to be specific in which page you are currently on and what on that page you are
trying to reference if you have a comment about a particular thing.
2) I will be improving the overall look of the site as time allows over the coming weeks. As much as the plain black and white
is just so appealing, some color and formatting would be nice too.
3) Since I do not use any data except for game scores and locations in my rankings, my preseason 2015 NCAA Football rankings have
actually been available (to me) since the final game was played last year. However, I have not posted it because I planned on doing
some heavy analysis of my system over the summer to determine the optimal list of various parameters to use in my system to give
the best predictive value. In particular, I wanted to determine the best series of weights on previous season data to try to predict
future seasons. The current model takes a weighted average of the last four years, weighing the most recent year as about 75 percent
of the value, and declining rapidly from there. The idea is that in football there is a lot of carry over of talent from year to
year (as opposed to basketball, when many entire starting squads are one and done). It should stand to reason that last year is a
good place to start when trying to determine the strength of this year's group. To account for some teams losing a massive number
of key players, it also makes sense to put some emphasis on the team's tendency over the long run. How much of each is the best is
yet to be determined... that is the work of the coming week. So you can expect preseason rankings to be posted soon.
I have been real lazy this summer. Now comes the cram session of trying to do all the optimization in the last 3 weeks of summer!
I have set a time constraint on myself at a week before the first games kick off. After that point, the system will be set and I
will make no further modifications until next season. I will post progress on what I do in this blog.
ENTRY 11 - 4-10-15 - Postseason System Performance
Here's the quick run down for the postseason predictions:
Postseason
Basic Stats
Number of Games: 145
Picked Correctly: 101 (69.7%)
Higher Rated Winner (no home court advantage): 100 (69.0%)
Average MOV Error (unsigned): 8.13
Very Close Point Spread Predictions (Error less than 1.0)
NC State 66 Louisiana State 65
Predicted margin 0.93510165507713 Actual margin 1 Difference 0.06
Virginia Commonwealth 72 Ohio State 75
Predicted margin -2.5984838192675 Actual margin -3 Difference 0.4
Utah 57 Stephen F Austin 50
Predicted margin 7.9406513017151 Actual margin 7 Difference 0.94
Wichita State 81 Indiana 76
Predicted margin 5.510243751051 Actual margin 5 Difference 0.51
Notre Dame 67 Butler 64
Predicted margin 3.2489390943823 Actual margin 3 Difference 0.24
Oklahoma 72 Dayton 66
Predicted margin 5.7754568327814 Actual margin 6 Difference 0.22
Vermont 78 Radford 71
Predicted margin 6.7202164633315 Actual margin 7 Difference 0.27
Stanford 78 Vanderbilt 75
Predicted margin 3.9888395460495 Actual margin 3 Difference 0.98
Loyola Ill 65 UL Monroe 58
Predicted margin 6.9261645741698 Actual margin 7 Difference 0.07
Five Worst Point Spread Predictions
Kentucky 78 West Virginia 39
Predicted margin 12.595677085376 Actual margin 39 Difference 26.4
Northwestern State 79 Tennessee Martin 104
Predicted margin 4.4533150281138 Actual margin -25 Difference 29.45
Iowa 83 Davidson 52
Predicted margin 1.0510042369852 Actual margin 31 Difference 29.94
Western Michigan 57 Cleveland State 86
Predicted margin 2.8727809273311 Actual margin -29 Difference 31.87
Bowling Green 59 Canisius 82
Predicted margin 9.8685554179776 Actual margin -23 Difference 32.86
Out of the games listed above, 69 of them were played on neutral courts, so the difference
between including home court advantage and not is a little deceiving. One game prediction apart
is still not very much, but is a little more significant over 76 actual home/away games rather
than 145.
One kind of interesting thing - although my bracket simulator was set to the ratings as they
stood on Selection Sunday, this analyzer uses the most current data for each set of predictions.
For example, the analyzer had Wisconsin as a slight favorite over Arizona on Selection Sunday, by
a mere 0.08 points. By the time the game actually happened, Arizona was the slight favorite. The
reason for this is that the Pac 12 had an overall very good postseason, with UCLA making the Sweet 16
and Stanford winning the NIT. This was enough of a boost to Arizona's profile to put them ahead.
In the end, that did not turn out so well.
Some other flip-flops between Selection Sunday and the actual games: The MWC's awful start to the
postseason was enough to change San Diego St from favored to not favored over St. Johns. Again, this
switch turned out poorly. Some almost bad switches - West Virginia started as a 0.3 point favorite over Maryland,
but this moved to only a 0.04 point favorite by the time of the game (because of the Big 12's
early tourney struggles). Also Louisville started as a
1.6 point favorite over Northern Iowa, but was only a 0.47 point favorite by game time.
ENTRY 10 - 4-9-15 - Bracket Thresholds - Post Analysis and System Awards
With the tournament over now, it is time for postseason analysis of all kinds to begin. The program
revision process will not begin in earnest until this summer when I have more time. What I can say is
the following:
- My program ended up in 9th place out of 64 on the College Basketball Composite's bracket pool. This pool was
a comparison of computer rankings, and consisted of each computer system filling out its "favorites"
bracket and comparing them. My program did very strongly in the first two rounds, owing mainly to its
upset picks of West Virginia (5) over Maryland (4) and Utah (5) over Georgetown (4). It also had three
of the final four correct - Kentucky, Wisconsin, and Duke. Some key misses that put it behind the top
programs were not picking Notre Dame over Kansas, and not picking Duke to make the title game.
- My program's bracketology finished in a tie for 63rd out of 136 on the Bracket Matrix. While this only puts
me a small step above the average and there is room for improvement, I am for the moment satisfied with
my progress so far and what I've achieved in making a seed predictor that uses only game results and nothing
else (like injury information or national perception). As far as I'm aware, I am one of only a very few
who does automated seed projections, and out of the few other programs I know of, mine finished the highest.
Now, I have two tables for you. The first table is the histogram of point values for the entire
tournament. Since this is the final one, I'll include program upset levels at many more settings than
the previous levels, in addition to total ESPN numbers (normalized to 10000) and the small ESPN group that
I was a part of and submitted brackets to with 56 entries (also normalized to 10000). The histogram is
broken into 40 point blocks to make the size of the table manageable and reduce noise. The row label
represents the low end of the interval.
Points | 0.3 | 0.5 | 0.65 | 0.8 | 1 | 1.2 | 1.4 | ESPN | My Group |
1700+ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
1660 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 2 | 0 |
1620 | 2 | 1 | 3 | 5 | 2 | 2 | 2 | 3 | 0 |
1580 | 7 | 7 | 9 | 4 | 4 | 2 | 1 | 9 | 0 |
1540 | 20 | 24 | 13 | 15 | 8 | 2 | 3 | 15 | 0 |
1500 | 21 | 34 | 43 | 29 | 20 | 10 | 8 | 30 | 0 |
1460 | 8 | 42 | 46 | 36 | 30 | 19 | 14 | 30 | 0 |
1420 | 15 | 27 | 43 | 41 | 37 | 34 | 28 | 50 | 0 |
1380 | 34 | 52 | 53 | 49 | 40 | 33 | 31 | 60 | 179 |
1340 | 58 | 94 | 63 | 75 | 65 | 62 | 29 | 90 | 0 |
1300 | 65 | 103 | 103 | 120 | 72 | 70 | 42 | 120 | 179 |
1260 | 67 | 140 | 128 | 133 | 97 | 78 | 52 | 140 | 0 |
1220 | 72 | 108 | 149 | 129 | 139 | 90 | 75 | 150 | 179 |
1180 | 67 | 111 | 111 | 135 | 145 | 99 | 98 | 180 | 0 |
1140 | 41 | 88 | 94 | 113 | 134 | 109 | 120 | 170 | 0 |
1100 | 221 | 137 | 117 | 125 | 142 | 119 | 124 | 190 | 179 |
1060 | 545 | 282 | 183 | 173 | 139 | 126 | 99 | 280 | 179 |
1020 | 718 | 495 | 342 | 232 | 156 | 147 | 117 | 470 | 179 |
980 | 733 | 585 | 458 | 370 | 230 | 198 | 150 | 490 | 179 |
940 | 796 | 676 | 553 | 435 | 326 | 281 | 191 | 560 | 357 |
900 | 928 | 656 | 630 | 591 | 439 | 360 | 253 | 600 | 179 |
860 | 1157 | 782 | 654 | 605 | 494 | 393 | 355 | 670 | 357 |
820 | 1292 | 957 | 854 | 672 | 664 | 514 | 422 | 630 | 179 |
780 | 1310 | 1146 | 948 | 789 | 683 | 561 | 514 | 720 | 714 |
740 | 904 | 1181 | 1079 | 1018 | 832 | 700 | 598 | 790 | 1786 |
700 | 563 | 997 | 1090 | 964 | 893 | 748 | 655 | 770 | 179 |
660 | 256 | 665 | 927 | 995 | 1035 | 907 | 807 | 700 | 893 |
620 | 82 | 380 | 689 | 857 | 939 | 976 | 929 | 560 | 714 |
580 | 16 | 163 | 353 | 624 | 805 | 890 | 949 | 420 | 893 |
540 | 1 | 49 | 167 | 344 | 623 | 856 | 948 | 310 | 714 |
500 | 1 | 12 | 70 | 187 | 393 | 655 | 855 | 210 | 714 |
460 | 0 | 4 | 19 | 80 | 216 | 461 | 627 | 150 | 0 |
420 | 0 | 0 | 6 | 39 | 111 | 279 | 453 | 110 | 357 |
380 | 0 | 0 | 2 | 12 | 61 | 131 | 243 | 70 | 357 |
340 | 0 | 0 | 1 | 2 | 22 | 53 | 132 | 60 | 0 |
330- | 0 | 0 | 0 | 1 | 4 | 34 | 76 | 180 | 179 |
Prob. Perfect | 4.7 x 10^-20 | 4.1 x 10^-16 | 1.3 x 10^-14 | 5.5 x 10^-14 | 8.7 x 10^-14 | 6.7 x 10^-14 | 4.0 x 10^-14 | | |
Some comments:
The probability of a perfect bracket is maximized at exactly 1.00. At this upset level, the
system would have picked 1 perfect bracket in 11 trillion. I think this is a reasonable overall
estimation of the public's chances at a perfect bracket as well. A few people managed to have
brackets with only 6 or 7 games wrong. Assuming about a 1/8 chance of switching each of those
individual picks (since most of them were upsets that people were unlikely to choose) gives
roughly the right probability that one of those brackets would instead have been perfect in some
parallel universe.
To give more conclusive data about the elite level bracket range, I'll need to refine the
histogram to show all the details, and run 100,000 simulations at each upset setting so there
is enough data to actually look at.
Points | 0.3 | 0.5 | 0.65 | 0.8 | 1.0 | 1.2 | 1.4 | ESPN |
1760+ | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
1750 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
1740 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
1730 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
1720 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 2 |
1710 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 2 |
1700 | 0 | 1 | 1 | 0 | 2 | 0 | 0 | 3 |
1690 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 4 |
1680 | 0 | 0 | 2 | 3 | 1 | 1 | 1 | 5 |
1670 | 0 | 2 | 0 | 4 | 5 | 1 | 2 | 6 |
1660 | 0 | 0 | 1 | 2 | 1 | 0 | 2 | 8 |
1650 | 0 | 0 | 1 | 5 | 5 | 0 | 3 | 9 |
1640 | 0 | 4 | 7 | 6 | 4 | 3 | 2 | 11 |
1630 | 0 | 9 | 8 | 4 | 8 | 2 | 3 | 13 |
1620 | 2 | 18 | 6 | 12 | 8 | 5 | 5 | 15 |
1610 | 15 | 14 | 11 | 11 | 6 | 7 | 2 | 18 |
1600 | 16 | 28 | 20 | 20 | 7 | 6 | 5 | 23 |
As I've said before, the setting you want to generate brackets at depends entirely on how high
you think your score will have to be to win your pool. You can figure out which setting
is your best one by, at each score level, seeing which setting is going to achieve that goal with the
highest probability.
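That lookup can be sketched directly from the finer histogram above. The counts below are copied from that table (1760+ bin down to the 1600 bin). The sketch assumes every setting was run for the same number of simulations, so raw counts can be compared without converting to probabilities. The function names are illustrative only.

```python
# Low end of each 10-point bin, from the 1760+ row down to the 1600 row.
BIN_FLOORS = list(range(1760, 1590, -10))

# Simulated bracket counts per bin for each upset setting, in the same row order.
COUNTS = {
    "0.3":  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 15, 16],
    "0.5":  [0, 0, 0, 0, 0, 0, 1, 2, 0, 2, 0, 0, 4, 9, 18, 14, 28],
    "0.65": [1, 0, 0, 0, 1, 0, 1, 0, 2, 0, 1, 1, 7, 8, 6, 11, 20],
    "0.8":  [0, 0, 1, 0, 0, 3, 0, 0, 3, 4, 2, 5, 6, 4, 12, 11, 20],
    "1.0":  [0, 0, 0, 0, 0, 0, 2, 0, 1, 5, 1, 5, 4, 8, 8, 6, 7],
    "1.2":  [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 3, 2, 5, 7, 6],
    "1.4":  [0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 3, 2, 3, 5, 2, 5],
}

def count_at_or_above(setting, target):
    """Number of simulated brackets at this setting scoring `target` or higher."""
    return sum(c for floor, c in zip(BIN_FLOORS, COUNTS[setting]) if floor >= target)

def best_setting(target):
    """Setting with the most simulated brackets at or above the target score."""
    return max(COUNTS, key=lambda s: count_at_or_above(s, target))

for target in (1600, 1610):
    print(target, best_setting(target))
```

With so few brackets in these top bins the margins between settings are thin, so a single fluke bracket can flip the answer at the highest targets.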
I already stated that if your goal was a perfect bracket, you wanted the 1.0 setting. I'll assume
that the 1760+ bracket generated above was a fluke (it was a 1790, would have been good enough for
an overall 17th place on ESPN). Under that assumption, the 0.8 setting seems to be best for a goal
of 1610 or higher. Below that, the following list tells you which setting you want:
1830-1920 - 1.0 setting
1610-1830 - 0.8 setting
1600-1610 - 0.5 setting
1580-1590 - 0.65 setting
1540-1580 - 0.5 setting
1380-1540 - 0.65 setting
1260-1380 - 0.5 setting
1100-1260 - 0.8 setting
0000-1100 - 0.3 setting
I will emphasize again that this is only for a single year. There have been previous years for which
as high as a 2.1 setting was the best for generating a perfect bracket. This happened to be a good
year to pick lots of favorites, so a relatively safe upset setting performed quite well.
The maximum score in my group was a 1390. The brackets I actually generated for that group were
three 1.0 brackets, a 1.2 bracket, and three 1.4 brackets. My probability of at least one of those
cracking 1390 turned out to be about 7.1%, which isn't very good. Had I instead generated all seven
at the 0.65 setting as suggested above, my probability of one of them making it would have been
8.7%, a small improvement. As it turns out, out of the 30 or so brackets I generated for various other sites, I ended up with a
1.2 setting bracket finishing with 1480 points.
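The "at least one bracket cracks the target" calculation can be sketched like this. It assumes the brackets are generated independently, and the per-bracket probabilities in the example are placeholders, not the values behind the 7.1% and 8.7% figures above.

```python
def prob_at_least_one(per_bracket_probs):
    """Chance that at least one bracket beats the target, assuming independence."""
    miss_all = 1.0
    for p in per_bracket_probs:
        miss_all *= 1.0 - p
    return 1.0 - miss_all

# Hypothetical chances of beating the target for seven brackets at mixed settings.
mixed = [0.012, 0.012, 0.012, 0.010, 0.008, 0.008, 0.008]
# The same seven brackets all generated at one (hypothetically better) setting.
single = [0.013] * 7
print(round(prob_at_least_one(mixed), 4), round(prob_at_least_one(single), 4))
```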
Next time I have some time I'll post one more entry with a summary of program predictions
from the postseason, with win percentage and average error in prediction included.
ENTRY 9 - 3-31-15 - Final 4 Bracket Thresholds
Same thing as before for Final 4 histograms. Data is bunched into groups of 4, with the
label representing the low end of the four values. Column headers represent again four
upset levels for my program, as well as the ESPN percentiles. 10,000 simulations are done.
Points | 0.3 | 0.65 | 1 | 1.4 | ESPN |
1100+ | 0 | 1 | 1 | 0 | 5 |
1060 | 0 | 3 | 1 | 1 | 15 |
1020 | 1 | 14 | 4 | 1 | 50 |
980 | 36 | 41 | 23 | 11 | 90 |
940 | 556 | 144 | 55 | 30 | 240 |
900 | 1317 | 395 | 162 | 66 | 510 |
860 | 1976 | 801 | 343 | 137 | 970 |
820 | 2212 | 1172 | 594 | 247 | 1070 |
780 | 1880 | 1599 | 873 | 428 | 1330 |
740 | 1097 | 1654 | 1141 | 653 | 1250 |
700 | 570 | 1554 | 1320 | 896 | 1160 |
660 | 251 | 1195 | 1417 | 1191 | 930 |
620 | 94 | 762 | 1266 | 1260 | 700 |
580 | 7 | 394 | 1099 | 1286 | 500 |
540 | 1 | 182 | 725 | 1148 | 350 |
500 | 2 | 74 | 485 | 1005 | 230 |
460 | 0 | 13 | 286 | 721 | 160 |
420 | 0 | 2 | 126 | 470 | 110 |
380 | 0 | 0 | 58 | 245 | 70 |
340 | 0 | 0 | 14 | 133 | 60 |
330- | 0 | 0 | 7 | 71 | 180 |
Prob. Perfect | 1.2 x 10^-18 | 1.3 x 10^-13 | 7.5 x 10^-13 | 3.3 x 10^-13 | unknown |
Some Comments:
The optimal upset setting to achieve a perfect bracket has now dipped below 1 to 0.97. It has happened
3 times in the last 10 years that the optimal setting for a perfect bracket was less than 1, and none of
those was in the last 5 years. In 2011 the optimal setting for a perfect bracket was the 2.11 level!
According to the data provided by ESPN stats, three of the four final four participants were picked by the
public more often than my program gave them credit for at the level 1 upset setting. The public gave
the four teams these chances: Kentucky 77.7%, Wisconsin 41.8%, Duke 52.3%, Michigan St. 9.1%. This can
be compared to my program's numbers: 73.5%, 43.1%, 33.1%, and 2.5% respectively. Michigan St. in particular
was not given enough credit by my program (in spite of being ranked 16th, deserving of more
like a 4 seed than a 7 seed). I'd like to think this was just a case of the public getting lucky with
this one, but the only way to find out for sure is to dig for some more data!
2014 makes a good back-to-back comparison, seeing as it was kind of an opposite final four (lots of unexpected teams). The public's
numbers for the final four teams were Florida 61.9%, Wisconsin 20.7%, Kentucky 3.3%, Connecticut 1.4%. My
program's respective numbers for final four frequency (at the default 1.0 upset level) were
Florida 45.5%, Wisconsin 14.5%, Kentucky 4.3%, Connecticut 4.3%. Multiplying the percentages together, we find that
my program was about twice as likely to have this final four right as the public. It ends up being even
more lopsided than that, since in the public's case the individual probabilities are not independent. A
person is less likely to pick two high-numbered seeds to make the final four than the percentages would indicate.
In actuality, only 612 out of 11 million ESPN brackets got the final four correct, compared to about 1340
of my generated brackets out of 11 million.
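The "multiplying percentages" estimate for 2014 can be reproduced in a few lines. This is a minimal sketch that treats the four picks as independent (which, as noted above, is not quite true for the public); the function and variable names are illustrative and are not taken from the bracket program.

```python
from math import prod

def expected_perfect_final_fours(pick_rates, n_brackets):
    """Expected number of brackets with all four Final Four teams correct,
    assuming the four picks are independent of each other."""
    return prod(pick_rates) * n_brackets

# 2014 pick rates quoted above: Florida, Wisconsin, Kentucky, Connecticut
public_2014 = [0.619, 0.207, 0.033, 0.014]
program_2014 = [0.455, 0.145, 0.043, 0.043]
n_brackets = 11_000_000

public_est = expected_perfect_final_fours(public_2014, n_brackets)
program_est = expected_perfect_final_fours(program_2014, n_brackets)

print(f"public (independence assumed): {public_est:.0f}")
print(f"program:                       {program_est:.0f}")
print(f"ratio program/public:          {program_est / public_est:.2f}")
```

The independence estimate for the public comes out somewhat above the 612 brackets actually observed, which is consistent with the public's picks not being independent.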
For 2013 I also have available that only 47 brackets on ESPN out of 8.15 million got the final four correct.
For comparison, using individual final four percentages, my program would have gotten about 710 perfect
final fours if it had been allowed to fill out that many brackets. The individual numbers: The public picked
Louisville 52.6%, Syracuse 8.1%, Michigan 13.3%, Wichita St. 0.24%. My program's respective numbers:
50.2%, 13.4%, 12.4%, 0.9%
For 2012 there were 25,304 perfect final fours out of 6.45 million ESPN brackets. My program with that
many brackets would have gotten about 42,200 perfect final fours. The individual numbers: The public picked
Kentucky 68.3%, Ohio St 34.4%, Kansas 18.2%, Louisville 6.8%. My program's respective numbers:
61.6%, 40.2%, 38.6%, 6.9%.
For 2011 there were just 2 brackets on ESPN out of 5.9 million with a perfect final four. My program with
that many brackets would have had about a 1 in 5 chance of getting a single perfect final four. This is
mostly due to my program having Butler and VCU, two of the final four teams, rated 51st and 78th respectively
entering the tournament. In past incarnations of my bracket generator, I have had factors that boost
a team's ranking if they win a game or two, making them more likely to win more games after that and make
a deeper run. That is not present in the current system, but would obviously help the chances of a perfect
final four in a case like this. It is not clear whether in the long run it would hurt the other years
too much to be overall beneficial.
Conclusion: My program outperforms the public in final four participants (and thus, likely in bracket
points as well) in most years. It tends to give the underdogs a little more credit than the public, and
the difference is especially pronounced for teams with very high-numbered seeds. Most years, enough underdogs win
that my program serves as a better picking device for distributing enough randomness. However, this year
was much different with 3 one-seeds and an underseeded 7-seed making the final four. The public got the
better of my program in 2015 with its love of the favorites.
It might be a more fair comparison of bracket picking abilities to compare expected values for number
of final four teams correct, rather than straight percentage of brackets with all four perfect. The
comparisons look like this (with my program set to 0.65 upset level as a third comparison):
2015: Program 1.522; Program(0.65) 1.89; Public 1.809
2014: Program 0.686; Program(0.65) 0.76; Public 0.873
2013: Program 0.769; Program(0.65) 0.87; Public 0.742
2012: Program 1.473; Program(0.65) 1.80; Public 1.277
We'll call this one a draw for the regular upset level, and a solid victory at the 0.65 upset level.
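The expected number of correct final four teams is just the sum of the four individual pick rates (linearity of expectation, so no independence assumption is needed). A short sketch using the level 1.0 and public percentages quoted earlier in this entry; the 0.65-level figures cannot be rebuilt this way because their individual rates are not listed. The dictionary layout and function name are illustrative.

```python
# Pick rates (as fractions) for the four actual Final Four teams in each year.
final_four_rates = {
    2015: {"program": [0.735, 0.431, 0.331, 0.025],
           "public":  [0.777, 0.418, 0.523, 0.091]},
    2014: {"program": [0.455, 0.145, 0.043, 0.043],
           "public":  [0.619, 0.207, 0.033, 0.014]},
    2013: {"program": [0.502, 0.134, 0.124, 0.009],
           "public":  [0.526, 0.081, 0.133, 0.0024]},
    2012: {"program": [0.616, 0.402, 0.386, 0.069],
           "public":  [0.683, 0.344, 0.182, 0.068]},
}

def expected_correct(rates):
    """Expected number of Final Four teams picked correctly."""
    return sum(rates)

for year, sides in final_four_rates.items():
    program = expected_correct(sides["program"])
    public = expected_correct(sides["public"])
    print(f"{year}: Program {program:.3f}; Public {public:.3f}")
```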
ENTRY 8 - 3-27-15 - Elite 8 Bracket Thresholds
Here are the Elite 8 histograms. I have bunched the data into groups of 2 this time so
the table isn't tremendously long. The label represents the lower of the two values.
The column headers are the same as before - they represent the upset level for that set
of simulations, where 0.3 = extremely conservative, 0.65 is the lowest end of my slider
on my generator page, 1 is the default, 1.4 is about 2/3 of the way to crazy, and ESPN
shows ESPN percentile values normalized to 10,000 simulations.
Points | 0.3 | 0.65 | 1 | 1.4 | ESPN |
820 | 0 | 0 | 0 | 1 | 5 |
800 | 0 | 0 | 1 | 1 | 15 |
780 | 0 | 3 | 6 | 4 | 30 |
760 | 11 | 22 | 12 | 5 | 70 |
740 | 66 | 53 | 23 | 16 | 130 |
720 | 287 | 136 | 49 | 26 | 240 |
700 | 658 | 266 | 122 | 52 | 420 |
680 | 1337 | 440 | 224 | 67 | 610 |
660 | 1536 | 661 | 284 | 150 | 810 |
640 | 1846 | 886 | 495 | 228 | 1150 |
620 | 1677 | 1116 | 630 | 309 | 1020 |
600 | 1306 | 1290 | 809 | 476 | 1000 |
580 | 848 | 1234 | 1028 | 597 | 910 |
560 | 389 | 1212 | 1094 | 727 | 770 |
540 | 161 | 941 | 1031 | 922 | 640 |
520 | 56 | 705 | 1052 | 935 | 500 |
500 | 16 | 476 | 883 | 1025 | 380 |
480 | 4 | 291 | 727 | 963 | 300 |
460 | 2 | 148 | 549 | 875 | 220 |
440 | 0 | 75 | 434 | 761 | 160 |
420 | 0 | 21 | 228 | 597 | 130 |
400 | 0 | 20 | 142 | 438 | 90 |
380 | 0 | 2 | 88 | 333 | 70 |
360 | 0 | 2 | 57 | 202 | 50 |
340 | 0 | 0 | 19 | 143 | 50 |
320 | 0 | 0 | 7 | 77 | 30 |
300 | 0 | 0 | 5 | 36 | 30 |
290- | 0 | 0 | 1 | 34 | 170 |
|
Prob. Perfect | 4.9 x 10^-18 | 5.9 x 10^-13 | 4.6 x 10^-12 | 2.4 x 10^-12 | unknown |
Some Comments:
This time around I took some time to fish out more data about the ESPN percentiles. What we see is
that somehow the public is significantly outperforming the program at the higher levels. There
are a few explanations I can think of for this. One is that the public has a little more
knowledge about injuries and other situations like that than the program would have access to.
For example, ESPN actually had more brackets with Notre Dame in the Elite 8 than Kansas, probably
because of the continuing injury/eligibility problems with Kansas. My program had them about
even, with Kansas a slight favorite. Same thing goes for Virginia. This kind of foreknowledge
could represent a 60-80 point swing, which puts my program about in line with the ESPN numbers.
It continues to be the case that which upset setting to pick is all a matter of goal. My
program's all-favorites bracket has a score of 680 right now, which outperforms any setting with
randomness just in terms of average. Then, the higher you wish to score, the more randomness
you have to allow. It comes at a steep price - the 1.4 upset setting had the highest score
I saw out of all the simulations (830), but it also has a much lower average and can make some
downright stinky brackets. The 1.4 setting only has a 3% chance of outperforming the
bracket that just picks every seed favorite so far (which currently sits at 650).
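The "3% chance" figure can be read straight off the 1.4 column of the table above. A minimal sketch: since each bucket covers two scores (the label is the lower one), beating 650 means landing in the 660 bucket or higher. The helper name is illustrative, and the bottom "290-" bucket is entered simply as 290.

```python
# (bucket lower bound, count) for the 1.4 column of the Elite 8 table.
level_1_4 = [
    (820, 1), (800, 1), (780, 4), (760, 5), (740, 16), (720, 26), (700, 52),
    (680, 67), (660, 150), (640, 228), (620, 309), (600, 476), (580, 597),
    (560, 727), (540, 922), (520, 935), (500, 1025), (480, 963), (460, 875),
    (440, 761), (420, 597), (400, 438), (380, 333), (360, 202), (340, 143),
    (320, 77), (300, 36), (290, 34),
]

def tail_probability(histogram, min_score):
    """Fraction of simulated brackets whose bucket starts at min_score or higher."""
    total = sum(count for _, count in histogram)
    hits = sum(count for low, count in histogram if low >= min_score)
    return hits / total

# Chance that a 1.4-setting bracket beats the all-seed-favorites bracket (650).
print(f"{tail_probability(level_1_4, 660):.1%}")
```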
The probability of a perfect bracket is maximized at upset level 1.02 at this point, very near
the default. Mathematically, if you want a perfect bracket, you should be performing weighted
coin flips such that they are weighted exactly to the actual game result probabilities. This
suggests the default setting does a pretty good job of accurately reflecting those probabilities.
I have around 30 system-generated brackets entered into various pools right now, with various
upset settings chosen for them. The highest one right now is a 1.0 setting bracket, with a
score of 770. This has probability 0.1% of happening on any given bracket for that setting,
so is a bit higher than expected (but I'm not complaining!). Out of 30 brackets on that setting,
I would expect at least one to be 690 or above.
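A rough check on the "at least one out of 30" statement, using the top of the 1.0 column from the table above. This sketch assumes the 30 brackets are independent draws from the same setting; the names are illustrative. Because buckets are 20 points wide, 690 falls between the 680 and 700 thresholds.

```python
N_SIMS = 10_000

# Top of the 1.0 column in the Elite 8 table: bucket lower bound -> count.
top_of_level_1_0 = {820: 0, 800: 1, 780: 6, 760: 12, 740: 23, 720: 49, 700: 122, 680: 224}

def p_single(min_score):
    """Chance that one generated bracket lands in the min_score bucket or higher."""
    return sum(c for low, c in top_of_level_1_0.items() if low >= min_score) / N_SIMS

def p_at_least_one(p, n):
    """Chance that at least one of n independent brackets clears the threshold."""
    return 1.0 - (1.0 - p) ** n

for threshold in (680, 700):
    p = p_single(threshold)
    print(f">= {threshold}: expected count in 30 = {30 * p:.2f}, "
          f"P(at least one) = {p_at_least_one(p, 30):.2f}")
```

The expected count in 30 brackets is a little above 1 at the 680 threshold and a little below 1 at 700, which brackets the 690 figure quoted above.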
ENTRY 7 - 3-27-15 - Reviewing the Bracket Maker (so far)
With the first 48 games behind us (and 12 more this week - another update to come after that),
it is time to see how my bracket maker did this year. Is it outperforming the public?
What "upset" setting should it be set to in order to optimize points?
These questions are difficult to answer, in part because different goals can be kept in
mind here. For example, just setting it to "pick all favorites" would have resulted in
a bracket in the 98.7th percentile at ESPN at the Round of 32, and in the 98.6th percentile
at the Sweet 16. However, this bracket has a 0 percent chance of being a perfect bracket,
and would not have had a solid chance in a large bracket pool. I believe the solution
to this is to just show histograms of how my program does at different upset settings
at producing brackets that reach a certain score level, and let you decide.
What follows is a simulation of 10,000 brackets at 4 different upset levels: 0.3 is extremely
conservative, 0.65 is the lowest upset setting on my slider, 1 is the default, and 1.4 is about
2/3 of the way towards "crazy" on the slider. The ESPN
column has ESPN's total percentile numbers normalized to 10,000 for comparison. I do not have all of the
ESPN percentile data, so the top and bottom scores contain all the data above/below that
point. The point numbers are by ESPN's scoring system. The bottom row shows probability of
a perfect bracket after this round with each setting.
Points | 0.3 | 0.65 | 1 | 1.4 | ESPN |
300 | 2 | 3 | 4 | 1 | |
290 | 39 | 38 | 17 | 15 | |
280 | 262 | 174 | 92 | 42 | 130 |
270 | 923 | 567 | 296 | 168 | 610 |
260 | 2016 | 1223 | 717 | 401 | 930 |
250 | 2605 | 1879 | 1251 | 825 | 1310 |
240 | 2240 | 2120 | 1727 | 1286 | 1420 |
230 | 1316 | 1894 | 1935 | 1650 | 1590 |
220 | 483 | 1180 | 1663 | 1680 | 1350 |
210 | 96 | 611 | 1175 | 1536 | 990 |
200 | 17 | 230 | 699 | 1091 | 630 |
190 | 0 | 62 | 281 | 668 | 360 |
180 | 1 | 15 | 101 | 385 | 200 |
170 | 0 | 3 | 34 | 162 | 100 |
160 | 0 | 1 | 8 | 67 | 80 |
150 | 0 | 0 | 0 | 13 | 190 |
140 | 0 | 0 | 0 | 8 | |
130 | 0 | 0 | 0 | 2 | |
|
Prob. Perfect | 2.0 x 10^-10 | 6.9 x 10^-8 | 5.5 x 10^-7 | 5.7 x 10^-10 | 8.5 x 10^-8 |
Some Comments:
It is not surprising that since there were not many upsets in the round of 64, the smallest
upset setting (0.3) has the best numbers at pretty much every level. It is interesting to
look at the ESPN numbers and try to estimate how most people pick their brackets. The
conclusion I come to is that people filling out their brackets on ESPN come from all
different philosophies. The numbers look most like an average of all of the different
upset levels, as well as having a hugely disproportionate number of totally crazy brackets
equivalent to the highest upset level on my slider (3.5).
My 50 bracket ESPN pool had a high score of 270 after this round. If you wanted to be
sitting on top of this size of a pool, your best bet was just to go all "chalk". However,
in a slightly larger pool with a high of 290, you'd probably want the 0.3 setting.
Another goal might be to just have a perfect bracket. There was 1 such bracket on ESPN
out of 11.7 million. It turns out the optimal upset setting for my system was 1.17, which
would have given a probability of 6.4 x 10^-7 for a perfect bracket (1 in 1.6 million).
Overall, the 0.65 upset setting was probably best for a combination of high overall scores
and decent chance of perfection.
Now, for the same data after the Round of 32:
Points | 0.3 | 0.65 | 1 | 1.4 | ESPN |
580 | 0 | 0 | 1 | 0 | |
570 | 0 | 0 | 1 | 0 | |
560 | 0 | 0 | 0 | 0 | |
550 | 0 | 0 | 1 | 0 | |
540 | 1 | 4 | 4 | 2 | |
530 | 4 | 15 | 7 | 2 | |
520 | 16 | 22 | 18 | 3 | 30 |
510 | 65 | 56 | 32 | 8 | 40 |
500 | 149 | 100 | 45 | 33 | 70 |
490 | 391 | 211 | 93 | 32 | 60 |
480 | 712 | 271 | 119 | 65 | 240 |
470 | 1157 | 462 | 234 | 107 | 380 |
460 | 1479 | 640 | 335 | 174 | 520 |
450 | 1528 | 810 | 464 | 209 | 880 |
440 | 1424 | 956 | 570 | 317 | 830 |
430 | 1204 | 1063 | 678 | 426 | 900 |
420 | 836 | 1137 | 808 | 581 | 910 |
410 | 512 | 998 | 909 | 621 | 860 |
400 | 286 | 918 | 995 | 718 | 790 |
390 | 135 | 730 | 904 | 783 | 690 |
380 | 59 | 579 | 829 | 846 | 570 |
370 | 30 | 370 | 765 | 863 | 460 |
360 | 8 | 278 | 636 | 827 | 370 |
350 | 3 | 161 | 512 | 720 | 280 |
340 | 1 | 114 | 347 | 637 | 220 |
330 | 0 | 55 | 256 | 524 | 200 |
320 | 0 | 28 | 195 | 470 | 100 |
310 | 0 | 11 | 116 | 355 | 80 |
300 | 0 | 6 | 61 | 246 | 60 |
290 | 0 | 4 | 35 | 160 | 50 |
280 | 0 | 1 | 14 | 103 | 40 |
270 | 0 | 0 | 8 | 72 | 40 |
260 | 0 | 0 | 5 | 40 | 40 |
250 | 0 | 0 | 2 | 27 | 30 |
240 | 0 | 0 | 3 | 13 | 20 |
230- | 0 | 0 | 0 | 16 | 160 |
|
Prob. Perfect | 3.6 x 10^-17 | 7.0 x 10^-12 | 9.8 x 10^-11 | 8.5 x 10^-11 | unknown |
Some Comments:
The 0.3 upset level is better until about a score of 520. From that point, the 0.65 upset
level is better to about the 540 level. If you are shooting for anything higher than that,
the default 1 level is going to give you the highest chance of elite level brackets. The
optimal setting for a perfect bracket is about 1.12.
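One way to locate these crossover points is to ask, for each score, which upset level produced the most brackets at or above that score. The sketch below does that with the top rows of the Round of 32 table. It uses cumulative counts rather than the per-row counts, so the crossovers it reports land within a bucket or two of the ones described above. The function name is illustrative.

```python
LEVELS = ["0.3", "0.65", "1", "1.4"]

# Top rows of the Round of 32 table, highest score first: (points, counts per level).
top_rows = [
    (580, [0, 0, 1, 0]),
    (570, [0, 0, 1, 0]),
    (560, [0, 0, 0, 0]),
    (550, [0, 0, 1, 0]),
    (540, [1, 4, 4, 2]),
    (530, [4, 15, 7, 2]),
    (520, [16, 22, 18, 3]),
    (510, [65, 56, 32, 8]),
    (500, [149, 100, 45, 33]),
    (490, [391, 211, 93, 32]),
    (480, [712, 271, 119, 65]),
]

def best_level_by_threshold(rows):
    """Map each score to the level with the most brackets at or above that score."""
    running = [0] * len(LEVELS)
    best = {}
    for score, counts in rows:  # rows must be ordered from highest score down
        running = [r + c for r, c in zip(running, counts)]
        winner = max(range(len(LEVELS)), key=lambda i: running[i])
        best[score] = LEVELS[winner]
    return best

for score, level in best_level_by_threshold(top_rows).items():
    print(f"at least {score}: best level {level}")
```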
The ESPN numbers show about the same trend as last time - they're kind of an average of
all of the upset settings (plus throwing in the 2 or 3% of brackets filled out by people
who don't know what they're doing).
For some comparison to previous years, the optimal upset setting for a perfect bracket
is usually about 1.4. It has been as high as 2.0 before. The number and magnitude of upsets are
unusually small this year, and I wouldn't expect the above analysis of settings to
hold for every year. Some time this summer I'll do a more thorough analysis of what
upset setting is the best for particular goals over the last 15 years.
More to come next week once the sweet 16 and elite 8 games are finished!
ENTRY 6 - 3-16-15 - The NCAA Tournament Special
The bracket has been released, and the madness and analysis for the year begins.
You'll notice at the top of this page there is a link to my bracket predictor for this
year - it uses my rankings to calculate odds of upsets and flips weighted coins for all
of the matchups. Try it out and see what happens!
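For readers curious what a weighted coin flip looks like in code, here is a minimal sketch. It is not the predictor's actual code: it assumes the predicted margin is the center of a normal distribution with a standard deviation of 10.6 points (the figure mentioned in Entry 2 for game outcomes), and it ignores the upset slider entirely. The team labels and the 5-point margin are made up.

```python
import random
from math import erf, sqrt

GAME_SIGMA = 10.6  # std dev of game margin from Entry 2; using it here is an assumption

def win_probability(predicted_margin, sigma=GAME_SIGMA):
    """P(team wins) if the actual margin is normal around the predicted margin."""
    return 0.5 * (1.0 + erf(predicted_margin / (sigma * sqrt(2.0))))

def flip_weighted_coin(team_a, team_b, predicted_margin, rng):
    """Simulate one game; team_a is favored by predicted_margin points."""
    return team_a if rng.random() < win_probability(predicted_margin) else team_b

rng = random.Random(2015)  # fixed seed so the run is repeatable
n_games = 10_000
wins = sum(flip_weighted_coin("A", "B", 5.0, rng) == "A" for _ in range(n_games))

print(f"model win probability for a 5-point favorite: {win_probability(5.0):.3f}")
print(f"simulated win rate over {n_games} flips:      {wins / n_games:.3f}")
```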
I ran the predictor 1000 times and compiled some data on frequency of both final four
appearance and national champion. Here are the results:
Final Four Appearances (out of 1000):
(1)Kentucky: 735
(1)Villanova: 446
(1)Wisconsin: 431
(2)Arizona: 386
(1)Duke: 331
(2)Virginia: 324
(2)Gonzaga: 266
(3)Iowa State: 164
(5)Utah: 103
(2)Kansas: 99
(3)Oklahoma: 93
(3)Notre Dame: 84
(4)North Carolina: 71
(3)Baylor: 65
(4)Georgetown: 54
(4)Louisville: 50
(5)Northern Iowa: 36
(6)Southern Methodist: 28
(7)Michigan State: 25
(5)West Virginia: 23
(7)Iowa: 22
(7)Wichita State: 18
(4)Maryland: 15
(6)Providence: 15
(10)Ohio State: 14
(5)Arkansas: 11
(10)Davidson: 11
(11)Texas: 11
(9)Saint Johns: 9
(6)Butler: 8
(6)Xavier: 6
(8)Oregon: 5
(8)NC State: 5
(9)Purdue: 5
(11)UCLA: 5
(7)Virginia Commonwealth: 4
(8)San Diego State: 4
(11)Brigham Young: 4
(9)Oklahoma State: 3
(9)Louisiana State: 3
(12)Stephen F Austin: 3
(8)Cincinnati: 1
(10)Georgia: 1
(10)Indiana: 1
(11)Dayton: 1
(13)UC Irvine: 1
National Champions (out of 1000):
(1)Kentucky: 359
(1)Villanova: 141
(1)Wisconsin: 119
(2)Arizona: 107
(2)Virginia: 81
(1)Duke: 69
(2)Gonzaga: 30
(3)Oklahoma: 16
(3)Iowa State: 15
(3)Notre Dame: 12
(3)Baylor: 10
(5)Utah: 10
(2)Kansas: 9
(4)North Carolina: 3
(4)Georgetown: 3
(5)Northern Iowa: 3
(7)Michigan State: 3
(7)Wichita State: 2
(4)Louisville: 1
(5)West Virginia: 1
(5)Arkansas: 1
(6)Southern Methodist: 1
(7)Iowa: 1
(10)Ohio State: 1
(10)Davidson: 1
(11)Texas: 1
Some notes on this frequency table:
- As expected, Kentucky has the highest chance to win it all that I've seen in a while.
However, their chances to win are still less than many people would think (at 36%).
- The last time a team was rated as highly as Kentucky in my system entering the tourney
was Duke 2001. That team went on to win the national title and won every game by
double digits along the way.
- The list of teams with at least a 1% chance to win the title is much smaller than in
most years. There are 12 teams here, compared with last year when there were around
20 teams with at least a 1% shot. It should be noted that Connecticut last year
had about a 0.7% chance to win the title in my system - right on this threshold.
- The same goes for final four appearances: There are 28 teams with at least a 1%
chance of a final four appearance; last year, that figure was about 40. The takeaway
here is that when filling out brackets, this is the year to go conservative and have
no teams seeded worse than 3 making the final four. Overall, 75% of the simulated final four
teams have been 1 or 2 seeds, and 85% have been 1-3 seeds. The cause of all of this is
that there is an unusually high number of elite level teams. There are 6 teams this
year that are all rated higher in my system than last year's number 1 team.
- My program predicts the following outright seed upsets in the tourney:
(9)Purdue over (8)Cincinnati,
(10)Ohio St over (7)VCU,
(5)West Virginia over (4)Maryland,
(5)Utah over (4)Georgetown
- This can be compared to last year, which had 11 outright seed upsets picked.
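The 75% and 85% figures in the notes above follow directly from the Final Four appearance counts. A small sketch that tallies them by seed, using only the counts for the 1, 2, and 3 seeds from the list (1000 runs with 4 Final Four slots each gives 4000 slots in total); the names are illustrative.

```python
RUNS = 1000
SLOTS = 4 * RUNS  # four Final Four teams per simulated bracket

appearances_by_seed = {
    1: [735, 446, 431, 331],  # Kentucky, Villanova, Wisconsin, Duke
    2: [386, 324, 266, 99],   # Arizona, Virginia, Gonzaga, Kansas
    3: [164, 93, 84, 65],     # Iowa State, Oklahoma, Notre Dame, Baylor
}

def share_through_seed(max_seed):
    """Share of simulated Final Four slots filled by seeds 1 through max_seed."""
    hits = sum(sum(counts) for seed, counts in appearances_by_seed.items()
               if seed <= max_seed)
    return hits / SLOTS

print(f"1-2 seeds: {share_through_seed(2):.1%}")
print(f"1-3 seeds: {share_through_seed(3):.1%}")
```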
ENTRY 5 - 2-24-15
Here's the quick run down for this week:
Week of 2-16 to 2-23
Basic Stats
Number of Games: 330
Picked Correctly: 256 (77.8%)
Higher Rated Winner (no home court advantage): 258 (78.2%)
Average MOV Error (unsigned): 8.21
Average MOV Error (signed): 1.21
Notable (Since there were many) Exact Point Spread Predictions (Error less than 0.5)
Boston College 86 Miami 89
Predicted margin -3.227034854241 Actual margin -3 Difference 0.22
Tennessee 48 Kentucky 66
Predicted margin -18.31023777106 Actual margin -18 Difference 0.31
Brigham Young 75 San Diego 62
Predicted margin 13.227448332893 Actual margin 13 Difference 0.22
Memphis 75 Connecticut 72
Predicted margin 3.3840347591871 Actual margin 3 Difference 0.38
Stanford 72 California 61
Predicted margin 10.693293116834 Actual margin 11 Difference 0.3
Five Worst Point Spread Predictions
UNC Greensboro 84 Furman 49
Predicted margin 6.0142286610693 Actual margin 35 Difference 28.98
Bowling Green 56 Miami OH 67
Predicted margin 18.430970995983 Actual margin -11 Difference 29.43
Middle Tennessee 90 Marshall 51
Predicted margin 8.5388320652441 Actual margin 39 Difference 30.46
Drake 78 Missouri State 43
Predicted margin 1.392955143852 Actual margin 35 Difference 33.6
Grambling State 44 Prairie View 95
Predicted margin -13.056085085757 Actual margin -51 Difference 37.94
Random Comments
For the second week in a row, the program was better off picking just straight up rather
than using its 4 point home court advantage. This is in spite of the signed margin of victory
error being 1.21 (indicating that the home court advantage should be about 4 - 1.21 = 2.79). What
is happening here I believe is that the predictions with very large errors are continuing
to skew this average and make it basically not usable.
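The "Picked Correctly" and "Higher Rated Winner" counts come from the same games, with and without the home court bump. A minimal sketch of that comparison, assuming ratings are on a points scale so that the rating difference is the predicted margin; the five games below are made-up numbers chosen only to show how the straight-up pick can win out in a given week.

```python
HOME_COURT = 4.0

# (home rating, away rating, actual home margin); hypothetical illustration data
games = [
    (82.0, 80.0, 5),
    (75.0, 77.5, -3),
    (70.0, 73.0, 2),
    (88.0, 90.0, -10),
    (65.0, 60.0, -1),
]

def correct_picks(games, home_court):
    """Count games where the predicted winner matches the actual winner."""
    correct = 0
    for home, away, actual_margin in games:
        predicted_margin = home - away + home_court
        # Same sign means the pick and the result agree on the winner.
        if predicted_margin * actual_margin > 0:
            correct += 1
    return correct

print("with home court:", correct_picks(games, HOME_COURT))
print("straight up:    ", correct_picks(games, 0.0))
```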
My focus this week has continued to be on the bracketology portion of the site, leading up
to all the March craziness. I made improvements this week by increasing the number of
factors involved in the calculation and improving the accuracy of my virtual RPI formula.
My RPI is now very close to correct (it evaluates a few neutral site games incorrectly).
Together with this, the bracketology also weighs the various record vs Top X of RPI factors
which show up in so many places, and has some simple accounting for number of good wins
and bad losses. The next project is to fix the program to ignore teams that are ineligible
for postseason play, because it's kind of silly to see Syracuse in the current projected
field. I'd also like to add some other factors like conference record, non-conference SOS,
and performance in conference tourney, but these are more difficult to back-optimize
in previous years because conference affiliations change so often.
ENTRY 4 - 2-16-15
Here's the quick run down for this week:
Week of 2-9 to 2-15
Basic Stats
Number of Games: 335
Picked Correctly: 223 (66.6%)
Higher Rated Winner (no home court advantage): 226 (67.5%)
Average MOV Error (unsigned): 9.12
Average MOV Error (signed): 0.66
Exact Point Spread Predictions (Error less than 0.5)
Marist 64 Siena 66
Predicted margin -1.7494888773551 Actual margin -2 Difference 0.25
Northern Kentucky 59 Florida Gulf Coast 65
Predicted margin -5.6067773087137 Actual margin -6 Difference 0.39
Bethune Cookman 51 North Carolina Central 65
Predicted margin -14.039066273478 Actual margin -14 Difference 0.03
Illinois State 62 Wichita State 68
Predicted margin -5.5963862505528 Actual margin -6 Difference 0.4
Mcneese State 72 Northwestern State 75
Predicted margin -3.0736620890667 Actual margin -3 Difference 0.07
New Hampshire 66 Binghamton 48
Predicted margin 18.181057129051 Actual margin 18 Difference 0.18
New Mexico State 74 Cal State Bakersfield 58
Predicted margin 16.002762340738 Actual margin 16 Difference 0
Northern Colorado 81 Montana 83
Predicted margin -1.6649045824558 Actual margin -2 Difference 0.33
Penn State 73 Maryland 76
Predicted margin -2.7781982563839 Actual margin -3 Difference 0.22
Syracuse 72 Duke 80
Predicted margin -7.6111319353393 Actual margin -8 Difference 0.38
Troy 80 UL Lafayette 84
Predicted margin -3.972624983872 Actual margin -4 Difference 0.02
Quinnipiac 57 Iona 60
Predicted margin -3.3948807386064 Actual margin -3 Difference 0.39
Five Worst Point Spread Predictions
Norfolk State 64 Maryland Eastern Shore 82
Predicted margin 8.5570037022892 Actual margin -18 Difference 26.55
Vermont 96 UMass Lowell 53
Predicted margin 15.837870294243 Actual margin 43 Difference 27.16
Duquesne 78 George Washington 62
Predicted margin -11.876744259073 Actual margin 16 Difference 27.87
Virginia Military Institute 93 Furman 59
Predicted margin 3.3057687588848 Actual margin 34 Difference 30.69
South Carolina Upstate 70 Jacksonville 89
Predicted margin 25.935471409207 Actual margin -19 Difference 44.93
Random Comments
This week was not my system's finest hour, it seems. The win percentage was quite low, and
for the first time the straight up rankings outperformed the home adjusted rankings in raw
victories. The average MOV error still indicates somewhere around a 3.5 point home court
advantage is probably correct though. At least I can take solace in the fact that most
computer systems performed poorly this week. The highest (non-adjusted) win percentage on
Massey's Composite site for the week was around 69%.
The big news as far as analysis goes on my site this week is a new Bracketology work in
progress. It will be entirely automated with no human inputs. You can go here if you are
interested in that kind of thing.
ENTRY 3 - 2-9-15
Here's the quick run down for this week:
Week of 2-2 to 2-8
Basic Stats
Number of Games: 332
Picked Correctly: 238 (71.7%)
Higher Rated Winner (no home court advantage): 219 (66.0%)
Average MOV Error (unsigned): 8.82
Average MOV Error (signed): 0.53
Exact Point Spread Predictions (Error less than 0.5)
Grambling State 65 Mississippi Valley State 68
Predicted margin -2.5785381135264 Actual margin -3 Difference 0.42
Western Michigan 67 Kent State 66
Predicted margin 0.79828857564138 Actual margin 1 Difference 0.2
Memphis 74 Jacksonville State 48
Predicted margin 25.888427908482 Actual margin 26 Difference 0.11
Morehead State 72 Tennessee State 57
Predicted margin 15.463018075582 Actual margin 15 Difference 0.46
Marist 63 Niagara 61
Predicted margin 1.7157102814566 Actual margin 2 Difference 0.28
Furman 68 Mercer 74
Predicted margin -6.0725452646812 Actual margin -6 Difference 0.07
Saint Peters 69 Fairfield 58
Predicted margin 10.859642136147 Actual margin 11 Difference 0.14
Auburn 79 Mississippi 86
Predicted margin -6.5699636474489 Actual margin -7 Difference 0.43
Coppin State 77 South Carolina State 74
Predicted margin 3.3035461022013 Actual margin 3 Difference 0.3
UNC Wilmington 53 William and Mary 56
Predicted margin -2.5598318822355 Actual margin -3 Difference 0.44
Alcorn State 61 Jackson State 64
Predicted margin -2.6764482331066 Actual margin -3 Difference 0.32
Five Worst Point Spread Predictions
Lehigh 103 Army 74
Predicted margin 4.3492315438265 Actual margin 29 Difference 24.65
UC Irvine 56 UC Davis 75
Predicted margin 6.7440285349932 Actual margin -19 Difference 25.74
Pepperdine 50 San Diego 72
Predicted margin 8.7938203904259 Actual margin -22 Difference 30.79
Air Force 73 Wyoming 50
Predicted margin -8.5487373202941 Actual margin 23 Difference 31.54
Virginia Military Institute 56 UNC Greensboro 85
Predicted margin 8.8910546556228 Actual margin -29 Difference 37.89
Random Comments
I've been slammed with work this week and didn't have time to run any analysis this time.
It is a little interesting to see such a big difference between win percentage with/without
home court advantage factor this time, especially since the signed error is about the same
as always. Can't wait for bracket time to begin. I've already pencilled Old Dominion and
George Washington in for losses. Can't believe those teams are still in people's fields
at the moment. Oh, and Wyoming too. Why???
ENTRY 2 - 2-2-15
Here's the quick run down for this week:
Week of 1-26 to 2-1
Basic Stats
Number of Games: 331
Picked Correctly: 237 (71.6%)
Higher Rated Winner (no home court advantage): 229 (69.2%)
Average MOV Error (unsigned): 8.70
Average MOV Error (signed): 0.16
Exact Point Spread Predictions (Error less than 0.5)
Iowa State 89 Texas 86
Predicted margin 3.2603858094692 Actual margin 3 Difference 0.26
Evansville 89 Indiana State 78
Predicted margin 11.164180475701 Actual margin 11 Difference 0.16
Jacksonville 65 South Carolina Upstate 78
Predicted margin -13.455308523534 Actual margin -13 Difference 0.45
North Florida 86 Kennesaw St 67
Predicted margin 19.169628714031 Actual margin 19 Difference 0.16
Sacramento State 75 Montana State 59
Predicted margin 16.369371047185 Actual margin 16 Difference 0.36
UNC Greensboro 42 Wofford 58
Predicted margin -16.08390130768 Actual margin -16 Difference 0.08
Canisius 63 Quinnipiac 57
Predicted margin 6.0695848328644 Actual margin 6 Difference 0.06
Alcorn State 56 Southern 65
Predicted margin -9.3332346354188 Actual margin -9 Difference 0.33
Gardner Webb 66 Coastal Carolina 64
Predicted margin 2.0582901520273 Actual margin 2 Difference 0.05
Liberty 62 Charleston Southern 74
Predicted margin -11.809711897302 Actual margin -12 Difference 0.19
Mcneese State 68 New Orleans 61
Predicted margin 6.7302830568131 Actual margin 7 Difference 0.26
UNC Greensboro 73 Western Carolina 78
Predicted margin -5.2415019370929 Actual margin -5 Difference 0.24
Michigan State 76 Michigan 66
Predicted margin 9.7436801069967 Actual margin 10 Difference 0.25
Five Worst Point Spread Predictions
Canisius 67 Marist 75
Predicted margin 21.537915217441 Actual margin -8 Difference 29.53
Hofstra 72 Towson 86
Predicted margin 15.932669034668 Actual margin -14 Difference 29.93
Eastern Michigan 76 Ohio 40
Predicted margin 6.0523222974863 Actual margin 36 Difference 29.94
Miami 50 Georgia Tech 70
Predicted margin 13.595436838937 Actual margin -20 Difference 33.59
Arkansas Pine Bluff 105 Prairie View 68
Predicted margin 0.63891473545624 Actual margin 37 Difference 36.36
Random Comments
The stats seem pretty consistent with last week. The signed error is a little closer to 0 this
time.
I had some time today to run some histograms on prediction error in previous years, going
back to the 2013 season. I piled errors into 1 point intervals in the histogram. What I found
was the following:
- It will definitely be worth excluding non-D1 games from such studies. They accounted for almost
all of the highest error games, and give more credence to the idea that these games probably
do more harm than good in the ranking process. Some of the worst among these made it to nearly
60 points of error (all such cases involving too low of a home victory).
- The median was in the (signed) 0-1 points of error category, which is to say I'm not quite giving
away teams enough credit.
- The mean signed error was 0.066. Note that I truncated error at +/- 20 for this calculation.
- The mean unsigned error was 8.56.
- The middle 50% was from about -6.5 to 8.5.
- The single most likely outcome was 2-3 points of error.
- One surprise was the relatively high number of 20+ error data points, which is to say the away
team significantly overperformed. These represented almost 4% of the total data. My current
standard deviation of about 10.6 that I have been using for game outcome would suggest closer
to 2.5% of data should be in this range.
- The standard deviation of the actual histogram data (with the truncation at +/- 20) is 10.43.
If my end goal is to make an accurate NCAA Tourney predictor, this means the model I've been
working under is probably OK to continue with (since an error of 20 is probably already decisive
enough to change an outcome anyway).
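The comparison between the observed share of 20+ point errors and what a normal model would predict can be sketched as follows. This assumes a zero-mean normal with the 10.6 standard deviation mentioned above and looks at one side only; the exact expected figure depends on how the mean and the bucket edges are handled, so treat it as a rough check. The function name is illustrative.

```python
from math import erfc, sqrt

def upper_tail(threshold, sigma, mean=0.0):
    """P(X > threshold) for X ~ Normal(mean, sigma)."""
    return 0.5 * erfc((threshold - mean) / (sigma * sqrt(2.0)))

expected_share = upper_tail(20.0, 10.6)
observed_share = 0.04  # "almost 4%" from the histogram

print(f"expected share of 20+ errors: {expected_share:.2%}")
print(f"observed share:               {observed_share:.2%}")
```

Under these assumptions the expected share comes out at roughly 3%, in the same neighborhood as the figure quoted above and clearly short of the observed 4%.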
The next thing I'm interested to gather data on is some correlation numbers between prediction
error and a number of factors that might cause more than average error: strength of teams
involved, difference in team strengths, proximity to recent difficult games, etc. We will
see if any of these seem to foretell a large prediction error.
ENTRY 1 - 1-25-15
I apologize for the current brutish way of putting this here (as well as the lack of interactivity
with anyone who happens to stumble this way), but I've wanted to do this for a while. So here
it is. It will eventually allow responses as well as look a little nicer.
So what do I want to say here? I consider myself merely a sports fan first and don't pretend to
have some kind of wisdom about sports that I must get out there. I'll let the analysts and the
other ratings sites who get paid to do it formulate the opinions. This blog is here in case
anyone is interested in my take on ratings systems and wants to follow the progress that my ratings
will make over the years. They are currently set to some predefined defaults that seemed to work
OK, but there is still much optimization to be done. For the sake of not appearing to tailor the
current ratings to some kind of hidden agenda, the formulas will remain unchanged for the duration
of the current basketball season. That doesn't mean I can't do some analyzing now though!
So just to get things started off, I'll do a quick run down on Sunday night each week of how
the system did last week in its basketball predictions. Here we go!
Week of 1-19 to 1-25
Basic Stats
Number of Games: 340
Picked Correctly: 244 (71.8%)
Higher Rated Winner (no home court advantage): 237 (69.7%)
Average MOV Error (unsigned): 8.87
Average MOV Error (signed): 0.77
Exact Point Spread Predictions (Error less than 0.5)
Arkansas State 54 Georgia State 60
Predicted margin -5.96 Actual margin -6 Difference 0.03
South Alabama 66 Arkansas Little Rock 64
Predicted margin 1.52 Actual margin 2 Difference 0.47
Green Bay 78 Illinois Chicago 55
Predicted margin 22.82 Actual margin 23 Difference 0.17
Cal Poly 66 Cal State Fullerton 55
Predicted margin 10.50 Actual margin 11 Difference 0.49
Kennesaw St 88 Stetson 82
Predicted margin 5.87 Actual margin 6 Difference 0.12
Providence 69 Xavier 66
Predicted margin 2.55 Actual margin 3 Difference 0.44
Gonzaga 91 Pacific 60
Predicted margin 31.14 Actual margin 31 Difference 0.14
Mississippi Valley State 65 Prairie View 72
Predicted margin -6.68 Actual margin -7 Difference 0.31
Xavier 89 DePaul 76
Predicted margin 13.41 Actual margin 13 Difference 0.41
Five Worst Point Spread Predictions
Stony Brook 47 Albany 64
Predicted margin 11.14 Actual margin -17 Difference 28.14
Milwaukee 67 Wright State 41
Predicted margin -3.42 Actual margin 26 Difference 29.42
Hartford 61 Maine 70
Predicted margin 21.09 Actual margin -9 Difference 30.09
Baylor 81 Unranked 61
Predicted margin 51.77 Actual margin 20 Difference 31.77
California 44 Arizona State 79
Predicted margin 0.22 Actual margin -35 Difference 35.22
Random Comments
I think my win percentage is pretty good as far as computer systems go. It compares decently
to those on the Wobus analysis on Massey's composite site. The average spread difference
of 8.87 also compares decently to the Vegas line error of 8.4 quoted by KenPom here.
The signed error averages out the negative and positive errors and should give some idea of how
accurate my stated home court advantage of 4 points is. A positive 0.77 value suggests that my
home court advantage is slightly overstated and should be closer to 3.25. This is something to
continue to track in coming weeks. One thing to watch out for here is runaway victories:
it might be a good idea to cap prediction error at +/- 15 or so when calculating these averages.
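A small sketch of the two ideas in this paragraph: backing out an implied home court advantage from the mean signed error, and capping individual errors so that blowouts do not dominate the average. The sign convention (predicted home margin minus actual home margin) is inferred from the text, and the list of errors is made up for illustration.

```python
def capped(value, cap):
    """Clamp value to the range [-cap, +cap]."""
    return max(-cap, min(cap, value))

def mean_signed_error(errors, cap=None):
    """Average signed error, optionally capping each error at +/- cap first."""
    if cap is not None:
        errors = [capped(e, cap) for e in errors]
    return sum(errors) / len(errors)

def implied_home_court(current_hca, signed_error):
    """A positive signed error means the home team was over-predicted on average."""
    return current_hca - signed_error

# This week's numbers: 4-point home court advantage, +0.77 mean signed error.
print(f"implied home court advantage: {implied_home_court(4.0, 0.77):.2f}")

# Hypothetical signed errors, including one runaway game.
errors = [2.0, -3.5, 0.5, 31.8, -1.0, 4.0]
print(f"raw mean:    {mean_signed_error(errors):.2f}")
print(f"capped mean: {mean_signed_error(errors, cap=15):.2f}")
```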
Also worth noting: having finally taken the time to dig through some data and make some histograms,
it seems that the standard deviation modifiers I am using to calculate odds might be slightly too
low. The proportion of games that were over 25 points away from expected was slightly higher
than expected using a Normal table, by about 30%. Part of this probably has to do with the fact
that my system currently does not account for error in the teams' sample means when performing
these calculations. I will play with this.