Model Madness 2020: EDM "Contra" Model


Today, we will be covering our "fifth" model, which is an extension of one of the models from 2019 but with different goals and objectives. Since it partially uses contrarian selection (the intentional selection of underdog teams), it is nicknamed the "Contra" model. It extends the EDM model, which was the first model created for Model Madness.

Also, we'll cover the only tournament set to start tomorrow, the Mid-American Conference Championship. As always, I will divide the best teams in the conference into tiers and analyze their chances based on the numbers we get from my models.



EDM Contrarian "Contra" Model

The way the Contra model works is relatively simple. When one of the two teams is heavily favored by the EDM metric, we still select the heavy favorite. There's no reason to build a model that only selects underdogs, because favorites typically win (otherwise they wouldn't be the favorite).

But in close games, where the EDM difference between the two teams is small, the "Contra" model picks the underdog. The reason for doing this differs from the stated goal of Model Madness. Rather than increasing the odds of getting a pick right, Contra aims to increase the odds of having a bracket that differs from the other brackets in a pool. It should also help me significantly in avoiding the breaking of ALL of my brackets in the first game this year.

But why would someone want to make "Contrarian" picks? Given a large enough group size, the odds of winning a group (and the money sometimes associated with such a group) are lower when everybody is picking the same teams. Although a contrarian bracket (or a bracket with some contrarian picks) is less likely to be correct, it is more likely to be a winner if chaos does indeed occur, because fewer people are making those contrarian picks and so there is less "competition".

To summarize the point: although you are less likely to be right, if you do pick correctly there are fewer people to compete against, as most people picked the team(s) more likely to win. Rather than seeking to maximize your score, you are seeking to maximize your pick diversity, hoping for enough chaos to drag everyone else's scores down to a level you can compete with. Granted, this type of strategy only really makes you money across years and will typically be a loser in any single year because, as mentioned at the beginning, favorites typically win.
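To put rough numbers on that trade-off, here is a toy sketch of a single-game, winner-take-all pool. The pool size, the 60/40 win odds, and the 90/10 split of picks are all made-up numbers for illustration, and `expected_share` is a hypothetical helper, not part of any of my models.

```python
def expected_share(p_win, frac_same_pick, pool_size):
    """Expected fraction of the pot for one entrant in a one-game pool.

    p_win: probability that the entrant's pick wins.
    frac_same_pick: fraction of the pool making the same pick.
    Assumes the pot is split evenly among everyone who picked correctly.
    """
    same_pick = frac_same_pick * pool_size
    return p_win / same_pick


pool = 100  # hypothetical pool size
fav = expected_share(0.60, 0.90, pool)  # 90% of the pool takes the 60% favorite
dog = expected_share(0.40, 0.10, pool)  # 10% of the pool takes the 40% underdog

print(f"favorite picker expects {fav:.4f} of the pot")
print(f"underdog picker expects {dog:.4f} of the pot")
```

Under these assumed numbers the underdog picker is right less often but splits the pot with far fewer people, which is the whole argument for a contrarian pick. A real bracket pool scores many games at once, so treat this only as intuition.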

Now that the strategy is explained and we already have a model that gives us a metric for comparing teams, we need a cut-off point that separates a "dominant" favorite from a "vulnerable" favorite worth picking against. In terms of the EDM, we need to find the difference in EDM rating points that decides when to bet on a favorite and when to bet against one. We should also figure out what a "favorite" is.

To answer the second question, a "favorite" can be either the team that most people pick (if you have access to this info, and people do for most NCAA brackets), the seed of the team, or the favorite based on a metric. For my purposes, I'm using the seed for the conference tourneys and the population selection data if that is provided again this year.

So, back to the other question: how many points should an underdog be spotted before selecting them becomes undesirable? For me, odds of less than 40% don't seem too desirable. So, how do we calculate a difference that correlates with 40%? I went back and took my data from over 5000 games, sorted those games into bins 10 EDM points wide, and determined for each bin the win percentage of the teams that had the advantage.

To clarify, I took all the games where the difference between teams was between 5 and 15 EDM points (and 15-25, 25-35, 35-45, and so on for the entire set) and calculated the win percentage for the favorites.

Next, I applied a 5-bin rolling average across the bins and assigned that average to each bin in place of its raw probability, to smooth the numbers a little since some of the bins had fewer games than others. I then found the highest-difference bin with a probability value of less than 60% and selected the number at the center of that bin. This ended up being 170 EDM points, which corresponds to a 1.7 point advantage as measured by the EDM system.
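The binning and smoothing steps can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, the rolling average is assumed to be centered (with a shorter window at the edges), games with a difference under 5 points are assumed to be left out, and the input data is synthetic rather than the real 5000+ games.

```python
from collections import defaultdict

BIN_WIDTH = 10  # EDM points per bin, as described above
WINDOW = 5      # number of bins in the rolling average


def bin_center(diff):
    """Map an EDM difference to its bin center: 5-15 -> 10, 15-25 -> 20, ..."""
    return int((diff + BIN_WIDTH / 2) // BIN_WIDTH) * BIN_WIDTH


def favorite_win_rates(games):
    """games: iterable of (edm_diff, favorite_won) -> {bin center: win rate}."""
    wins, totals = defaultdict(int), defaultdict(int)
    for diff, favorite_won in games:
        if diff < BIN_WIDTH / 2:
            continue  # assumption: differences under 5 fall below the first bin
        center = bin_center(diff)
        totals[center] += 1
        wins[center] += int(favorite_won)
    return {c: wins[c] / totals[c] for c in sorted(totals)}


def smooth(rates):
    """Centered rolling mean over WINDOW adjacent bins (assumes no empty bins)."""
    centers = list(rates)
    half = WINDOW // 2
    smoothed = {}
    for i, c in enumerate(centers):
        window = [rates[k] for k in centers[max(0, i - half): i + half + 1]]
        smoothed[c] = sum(window) / len(window)
    return smoothed


def contra_threshold(smoothed, cutoff=0.60):
    """Largest bin center whose smoothed favorite win rate is below the cutoff."""
    below = [c for c, p in smoothed.items() if p < cutoff]
    return max(below) if below else None


# Synthetic data only: 1000 games per bin with a made-up, steadily rising
# favorite win rate, chosen so the sketch lands on the same bin as the post.
games = []
for center in range(10, 310, 10):
    n_wins = round(1000 * (0.50 + 0.00058 * center))
    games += [(center, True)] * n_wins + [(center, False)] * (1000 - n_wins)

threshold = contra_threshold(smooth(favorite_win_rates(games)))
print(threshold)
```

With real data the raw bin rates will be noisier than this synthetic curve, which is exactly why the smoothing step is there.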

Next, we demonstrate the Contra model on some example data, spotting the underdog team (determined by seeding) 170 EDM points and seeing who would win this "mock" tournament.

"Contra" Example
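| Seed | Team | EDM |
|------|------|------|
| 1 | A | 2000 |
| 2 | B | 1850 |
| 3 | C | 1800 |
| 4 | D | 1500 |
| 5 | E | 1650 |
| 6 | F | 1700 |
| 7 | G | 1300 |
| 8 | H | 1050 |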

We are going to mock a simple 3-round single-elimination tournament with a traditional seeding structure (1 vs 8, 1/8 vs 4/5, etc.). That gives us the following matchups in Round #1:

| Favorite | | Underdog |
|----------|----|----------|
| #1 A (2000) | vs | #8 H (1050) |
| #4 D (1500) | vs | #5 E (1650) |
| #2 B (1850) | vs | #7 G (1300) |
| #3 C (1800) | vs | #6 F (1700) |

A and B make it to the next round easily due to their large EDM differences (950 and 550). E makes it to the next round because they are the underdog by seed and they also have the higher EDM score. Granted, had we opted to define underdogs by their EDM score, D would have been the underdog and would have gone to the next round, since the difference (150) is less than what they would have been spotted (170). F makes it to the next round because, although they have 100 fewer EDM points than C, they are within range (170) to be a contrarian pick.
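The pick rule, and the way the D vs E result flips depending on how the favorite is defined, can be sketched like this. `contra_pick` and its `favorite_by` parameter are hypothetical names for illustration, and I'm assuming a difference of exactly 170 still goes to the underdog, which the description above doesn't specify.

```python
SPOT = 170  # EDM points spotted to the underdog


def contra_pick(team_a, team_b, favorite_by="seed"):
    """Return the name of the team the Contra rule advances.

    Each team is a (name, seed, edm) tuple; favorite_by is "seed" or "edm".
    """
    if favorite_by == "seed":
        favorite, underdog = sorted([team_a, team_b], key=lambda t: t[1])
    else:
        favorite, underdog = sorted([team_a, team_b], key=lambda t: -t[2])
    # The favorite advances only if it still leads after spotting the underdog.
    if favorite[2] - underdog[2] > SPOT:
        return favorite[0]
    return underdog[0]


d, e = ("D", 4, 1500), ("E", 5, 1650)
print(contra_pick(d, e, favorite_by="seed"))  # E: underdog by seed, higher EDM
print(contra_pick(d, e, favorite_by="edm"))   # D: underdog by EDM, within 170
```

Either way the rule is the same; only the label of who counts as the favorite changes.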

| Favorite | | Underdog |
|----------|----|----------|
| #1 A (2000) | vs | #5 E (1650) |
| #2 B (1850) | vs | #6 F (1700) |

In the next round, A makes it to the final because their lead over E (350) is too large to pick against. F makes it to the final as a contrarian pick, since the difference of 150 is less than the range of 170.

| Favorite | | Underdog |
|----------|----|----------|
| #1 A (2000) | vs | #6 F (1700) |

In the final round, A wins the tournament as the overall favorite, since their lead over F (300) is larger than the 170-point range. Even though one team, B, was close enough to be picked over A (a 150-point difference), A never had to face B and was the dominant favorite over every team they did end up facing. So, there's a balancing act here. While you may pick less likely teams, the strong teams (that you really shouldn't pick against early) still end up making it deep into the tournament.
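For anyone who wants to replay the mock tournament, here is a minimal sketch that runs the whole bracket with seed-based favorites. The function names are hypothetical, the team data comes from the example table, and a difference of exactly 170 is assumed to go to the underdog.

```python
SPOT = 170  # EDM points spotted to the seed-based underdog

# seed -> (team name, EDM rating), taken from the example table
TEAMS = {
    1: ("A", 2000), 2: ("B", 1850), 3: ("C", 1800), 4: ("D", 1500),
    5: ("E", 1650), 6: ("F", 1700), 7: ("G", 1300), 8: ("H", 1050),
}


def pick(seed_x, seed_y):
    """Advance the better seed only if it leads by more than SPOT EDM points."""
    fav, dog = min(seed_x, seed_y), max(seed_x, seed_y)
    margin = TEAMS[fav][1] - TEAMS[dog][1]
    return fav if margin > SPOT else dog


def play(bracket):
    """Play adjacent pairs round by round; return the winners of each round."""
    rounds = []
    while len(bracket) > 1:
        bracket = [pick(bracket[i], bracket[i + 1])
                   for i in range(0, len(bracket), 2)]
        rounds.append([TEAMS[seed][0] for seed in bracket])
    return rounds


# Round 1 order as listed above: 1 vs 8, 4 vs 5, 2 vs 7, 3 vs 6
rounds = play([1, 8, 4, 5, 2, 7, 3, 6])
for number, winners in enumerate(rounds, start=1):
    print(f"Round {number} winners: {', '.join(winners)}")
```

Changing `SPOT` is an easy way to see how aggressive the contrarian picks become: a larger value knocks out more favorites early.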

So that's EDM "Contra" in a nutshell. Hopefully its contrarian picks will help my "perfect" brackets make it a few games further this year and this strategy could help you win a pool if this year happens to be particularly chaotic.

Mid-American Conference

Last Year's Champion: Buffalo

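Leaders

| Team | Record | Seed | EDM | SPM | MASPM | WRI |
|------|--------|------|-----|-----|-------|-----|
| Akron | 24-7 | 1 | 2014.293 (68) | 0.1450 (32) | 0.1267 (31) | 0.290 (64) |

Challengers

| Team | Record | Seed | EDM | SPM | MASPM | WRI |
|------|--------|------|-----|-----|-------|-----|
| Buffalo | 20-11 | 5 | 1714.187 (126) | 0.0807 (85) | 0.0616 (94) | 0.233 (87) |
| Bowling Green | 21-10 | 2 | 1491.096 (183) | 0.0912 (72) | 0.0578 (98) | 0.187 (119) |
| Ball State | 18-13 | 3 | 1893.700 (94) | 0.0228 (154) | 0.0345 (132) | 0.226 (91) |

Outsiders

| Team | Record | Seed | EDM | SPM | MASPM | WRI |
|------|--------|------|-----|-----|-------|-----|
| Kent State | 19-12 | 6 | 1749.426 (119) | 0.0575 (109) | 0.0509 (109) | 0.196 (110) |
| Northern Illinois | 18-13 | 4 | 1521.501 (184) | 0.0164 (165) | 0.0104 (169) | 0.143 (152) |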

Akron is the clear leader in the Mid-American Conference, holding the 1 seed and leading the conference in all four of our model ratings. They have solid ratings within the Top 75 and will present a threat to any team they face in the NCAA tournament if they win the MAC Championship, which they'll need to do as their resume is lacking in quality wins.

The number one challenger is probably last year's victor, Buffalo, who comes in as the 5 seed. They have three Top 100 ratings and come into the MAC Championship having won 6 of their last 8. Bowling Green is the number 2 seed and was the runner-up in last year's tourney. The EDM doesn't like them too much, as they come into the tournament on a 3-game losing streak. Ball State is the 3 seed and won the West Division of the MAC.

A couple of outsiders to mention are Kent State and Northern Illinois. Northern Illinois tied for the lead in the West Division but lost the tiebreaker, as they lost to Ball State twice in the regular season. Their ratings are a little lackluster, all being outside the Top 150. Kent State, on the other hand, has competitive ratings in the Top 120 but a harder path to the championship given their 6 seed.

The "Contra" model (for those wondering) likes Akron.
