
Published on November 25, 2025 by cresencio

Week 12 is in the books, and our ensemble delivered a solid 10-4 performance (71.4%) — perfectly respectable, but the real story is what happened under the hood. XGBoost had its best week of the season, our consensus picks went 6-for-6, and the coin-flip games… well, they lived up to their name.

The Week 12 Scoreboard

10-4 (71.4% accuracy)

Complete Results

| Game | Prediction | Confidence | Result | Margin |
| --- | --- | --- | --- | --- |
| BUF @ HOU | BUF | 51.0% | ❌ HOU 23-19 | -4 |
| NYJ @ BAL | BAL | 83.6% | ✅ BAL 23-10 | +13 |
| NYG @ DET | DET | 83.6% | ✅ DET 34-27 | +7 |
| IND @ KC | KC | 66.0% | ✅ KC 23-20 | +3 |
| SEA @ TEN | SEA | 87.9% | ✅ SEA 30-24 | +6 |
| PIT @ CHI | CHI | 50.1% | ✅ CHI 31-28 | +3 |
| NE @ CIN | CIN | 60.8% | ❌ NE 26-20 | -6 |
| MIN @ GB | GB | 61.4% | ✅ GB 23-6 | +17 |
| ATL @ NO | NO | 58.7% | ❌ ATL 24-10 | -14 |
| CLE @ LV | CLE | 50.1% | ✅ CLE 24-10 | +14 |
| PHI @ DAL | PHI | 54.7% | ❌ DAL 24-21 | -3 |
| JAX @ ARI | JAX | 51.1% | ✅ JAX 27-24 | +3 |
| TB @ LA | LA | 65.6% | ✅ LA 34-7 | +27 |
| CAR @ SF | SF | 71.9% | ✅ SF 20-9 | +11 |

Model Showdown: XGBoost’s Redemption Arc

Week 12 was a tale of model redemption. XGBoost, which had been struggling all season with a cumulative 41.6% accuracy, suddenly clicked into gear with a dominant 85.7% week.

Week 12 Model Breakdown

| Model | W-L | Accuracy | Season Total |
| --- | --- | --- | --- |
| XGBoost | 12-2 | 85.7% | 41.6% |
| Random Forest | 12-2 | 85.7% | 44.9% |
| Logistic | 10-4 | 71.4% | 63.5% |
| ELO | 8-6 | 57.1% | 61.8% |
| Bayesian | 7-7 | 50.0% | 55.6% |
| Ensemble | 10-4 | 71.4% | 61.2% |

The machine learning models finally had their moment. XGBoost and Random Forest tied for the week’s best performance at 85.7%, while our steady ELO model had an uncharacteristically rough outing at 57.1%.

Key Insight: XGBoost went from season-worst to week-best, a 44-point swing that suggests the model may finally be adapting to mid-season patterns.


The Perfect Consensus: 6-for-6

When all models agree, good things happen. Our consensus picks — games where every model pointed the same direction — went a perfect 6-for-6.

Consensus Games

| Game | Consensus Pick | Confidence | Result | Margin |
| --- | --- | --- | --- | --- |
| NYJ @ BAL | Baltimore | 83.6% | ✅ BAL 23-10 | +13 |
| NYG @ DET | Detroit | 83.6% | ✅ DET 34-27 | +7 |
| IND @ KC | Kansas City | 66.0% | ✅ KC 23-20 | +3 |
| SEA @ TEN | Seattle | 87.9% | ✅ SEA 30-24 | +6 |
| TB @ LA | LA Rams | 65.6% | ✅ LA 34-7 | +27 |
| CAR @ SF | San Francisco | 71.9% | ✅ SF 20-9 | +11 |

When our models unanimously agree, they’re now 42-8 (84%) on the season. That’s not just good — that’s bankable.
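The check itself is simple: a game only counts as a consensus pick when every model points at the same winner. Here's a minimal sketch of that idea; the function and data layout are illustrative, not our actual pipeline code.

```python
# Minimal sketch of a consensus check: a game is a "consensus pick" only when
# every model picks the same winner. Field names are illustrative.

def consensus_pick(model_picks: dict[str, str]) -> str | None:
    """Return the unanimous pick across models, or None if any model disagrees."""
    picks = set(model_picks.values())
    return picks.pop() if len(picks) == 1 else None

# Example: Week 12's NYJ @ BAL game, where all five models pointed to Baltimore.
week12_nyj_bal = {
    "xgboost": "BAL",
    "random_forest": "BAL",
    "logistic": "BAL",
    "elo": "BAL",
    "bayesian": "BAL",
}

print(consensus_pick(week12_nyj_bal))  # -> "BAL"
```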


The Four Horsemen of Chaos

Every week has its upsets. Week 12 had four, and three of them came exactly where you'd expect: in the shaky sub-60% confidence range.

Week 12 Upsets

| Game | We Picked | Winner | Confidence | Loss Type |
| --- | --- | --- | --- | --- |
| BUF @ HOU | Buffalo | Houston | 51.0% | Coin flip |
| NE @ CIN | Cincinnati | New England | 60.8% | Value play backfired |
| ATL @ NO | New Orleans | Atlanta | 58.7% | Division chaos |
| PHI @ DAL | Philadelphia | Dallas | 54.7% | Coin flip |

The NE @ CIN miss is the interesting one — that was our biggest “value play” of the season with a 35.6% edge. The models saw a 1-point game and got burned when New England actually won outright. The other three were all in the “anything can happen” zone under 60%.
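For context on what "edge" means: one common way to frame a value play is the gap between a model's win probability and the probability implied by the betting line. The sketch below shows that general idea with placeholder numbers; it isn't necessarily how our 35.6% figure was computed.

```python
# Hypothetical framing of a "value play": the gap between the model's win
# probability and the win probability implied by the moneyline. This is an
# illustration of the general concept, not a description of our actual method.

def implied_probability(american_odds: int) -> float:
    """Convert American moneyline odds to an implied win probability."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)

def edge(model_prob: float, american_odds: int) -> float:
    """Model probability minus market-implied probability."""
    return model_prob - implied_probability(american_odds)

# Example with placeholder numbers: model says 60.8%, market line is +120.
print(f"{edge(0.608, +120):.1%}")  # ~15.3% edge in this hypothetical
```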


Confidence Calibration: The Truth in the Numbers

Season-long, our confidence buckets continue to hold up remarkably well:

| Confidence Range | Games | Correct | Accuracy |
| --- | --- | --- | --- |
| 50-55% | 40 | 20 | 50.0% |
| 55-60% | 35 | 19 | 54.3% |
| 60-65% | 28 | 16 | 57.1% |
| 65-70% | 25 | 15 | 60.0% |
| 70-80% | 32 | 25 | 78.1% |
| 80%+ | 17 | 14 | 82.4% |

The pattern is clear: when we’re confident, we’re usually right. Games over 70% confidence hit at nearly 80%, while coin-flips are exactly that — 50/50.
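For the curious, a table like this is just a bucketing pass over the season's picks: group every prediction by its stated confidence, then compare each bucket's hit rate to that confidence. A rough sketch of the idea, with placeholder data rather than our production code:

```python
# Rough sketch of confidence calibration: bucket picks by stated confidence,
# then compare each bucket's hit rate to that confidence.

BUCKETS = [
    ("50-55%", 0.50, 0.55),
    ("55-60%", 0.55, 0.60),
    ("60-65%", 0.60, 0.65),
    ("65-70%", 0.65, 0.70),
    ("70-80%", 0.70, 0.80),
    ("80%+",   0.80, 1.01),
]

def calibration_table(picks: list[tuple[float, bool]]) -> list[tuple[str, int, int, float]]:
    """picks is a list of (confidence, was_correct) pairs; returns one row per bucket."""
    rows = []
    for label, lo, hi in BUCKETS:
        outcomes = [correct for conf, correct in picks if lo <= conf < hi]
        if outcomes:
            rows.append((label, len(outcomes), sum(outcomes),
                         100 * sum(outcomes) / len(outcomes)))
    return rows

# Example with two placeholder picks: a 51% pick that missed, an 88% pick that hit.
for row in calibration_table([(0.51, False), (0.88, True)]):
    print(row)
```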


The weekly trends tell an interesting story. XGBoost has been inconsistent all season but exploded in Week 12. ELO has been our steadiest performer but stumbled this week. The Ensemble continues to smooth out individual model volatility.
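That smoothing comes from combining model probabilities. Our actual weighting isn't shown here, but an equal-weight average captures the basic effect:

```python
# Simplified, equal-weight illustration of how an ensemble smooths out
# individual model volatility: average each model's home-win probability.
# Real ensembles (including ours) may weight models differently.

def ensemble_probability(model_probs: dict[str, float]) -> float:
    """Average the home-win probabilities produced by each model."""
    return sum(model_probs.values()) / len(model_probs)

# Illustrative numbers only: one overconfident model, one skeptical model,
# three in between. The average lands in a calmer middle ground.
probs = {"xgboost": 0.95, "random_forest": 0.70, "logistic": 0.64,
         "elo": 0.58, "bayesian": 0.41}

print(f"Ensemble home-win probability: {ensemble_probability(probs):.1%}")  # ~65.6%
```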


Scoring Analysis: Close But Not Always Right

We correctly predicted the game winner in 10 of 14 games, but how accurate were our score predictions?

| Metric | Value |
| --- | --- |
| Avg Total Error | 8.6 points |
| Avg Margin Error | 7.0 points |
| Best Prediction | PHI @ DAL (0.97 pts off total) |
| Worst Prediction | MIN @ GB (17.53 pts off total) |

Notable scoring misses:

  • NYG @ DET: Predicted 46.4 total, actual was 61 (+14.6 off)
  • MIN @ GB: Predicted 46.5 total, actual was 29 (-17.5 off)
  • CAR @ SF: Predicted 44.6 total, actual was 29 (-15.6 off)

Score prediction remains our weakest area. Picking winners is much easier than predicting exact margins.
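For reference, both error metrics in the table above are simple absolute differences; here's a quick sketch with placeholder scores rather than our actual model output:

```python
# Sketch of the two scoring metrics: average absolute error on the game total
# (sum of both scores) and on the margin (home minus away). The predicted
# values below are placeholders, not our model's actual output.

def total_error(pred_home, pred_away, act_home, act_away):
    """Absolute error on the combined points total."""
    return abs((pred_home + pred_away) - (act_home + act_away))

def margin_error(pred_home, pred_away, act_home, act_away):
    """Absolute error on the victory margin (home minus away)."""
    return abs((pred_home - pred_away) - (act_home - act_away))

# Example: a hypothetical 24-22 prediction against a 34-27 final score.
print(total_error(24, 22, 34, 27))   # 15 points off the total
print(margin_error(24, 22, 34, 27))  # 5 points off the margin
```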


Season Standings Update

Through 178 games (12 weeks), here’s where each model stands:

| Model | Season W-L | Accuracy | Trend |
| --- | --- | --- | --- |
| Logistic | 113-65 | 63.5% | ↗️ |
| ELO | 110-68 | 61.8% | ↘️ |
| Ensemble | 109-69 | 61.2% | |
| Bayesian | 99-79 | 55.6% | ↘️ |
| Random Forest | 80-98 | 44.9% | ↗️ |
| XGBoost | 74-104 | 41.6% | ↗️ |

Logistic still leads the season despite not being our “feature” model. The Ensemble remains our safest bet — never the best week, rarely the worst.


Key Takeaways

  1. Consensus is king: 6-for-6 this week, 84% on the season. When models agree, follow them.

  2. XGBoost awakens: After struggling all season, XGBoost finally showed what it can do with an 85.7% week. One week doesn’t make a trend, but it’s encouraging.

  3. The “lock of the year” held: SEA @ TEN at 87.9% confidence delivered as promised. Seattle won 30-24.

  4. Value plays can backfire: Our biggest edge of the season (NE @ CIN, +35.6%) went the wrong way when New England won outright 26-20. The models were right that it would be close, but wrong about the winner.

  5. Coin flips gonna flip: Three of our four misses (BUF/HOU, ATL/NO, PHI/DAL) were sub-59% confidence. The math works.

  6. XGBoost vindicated on Houston: Remember that 100% XGBoost confidence on Houston for Thursday Night? It was right. The ensemble’s 51% Buffalo pick was wrong.


Looking Ahead

Week 13 brings a crucial slate of games with playoff implications heating up. Our models will recalibrate with the latest data, and we’ll see if XGBoost’s breakout was a fluke or a turning point.

Lessons for Week 13:

  • Trust the consensus picks — they’re now 42-8 (84%) on the season
  • Be cautious with “value plays” where models disagree with the market
  • XGBoost may deserve more weight after this performance
  • Division games (like ATL @ NO) remain unpredictable

Stay tuned for Week 13 predictions!


Written by cresencio
