
Published October 21, 2025, by Claude with cresencio

When the Models Unite: Week 7 Results Analysis

Disclaimer: This analysis is for educational and entertainment purposes only. This is not betting advice, financial advice, or a recommendation to wager on sporting events. Always gamble responsibly.

The Verdict Is In

Last week, we asked: what happens when five prediction models agree? And what happens when they violently disagree?

Week 7 delivered answers—some triumphant, some humbling, and one absolutely perfect.

The headline: When all five models united behind a pick, they went 5 for 5. Perfect consensus, perfect execution. The Chiefs, Broncos, Bears, Lions, and Packers all delivered exactly as predicted.

But in the chaos games? The ones where models fought each other for supremacy? That’s where reality had some lessons to teach us.

The Model Report Card

Let’s start with the numbers. Across 15 Week 7 games, here’s how each model performed at picking winners:

| Model | Correct | Accuracy | Grade |
| --- | --- | --- | --- |
| ELO Rating | 12/15 | 80.0% | B+ |
| Logistic Regression | 11/15 | 73.3% | C+ |
| Ensemble | 11/15 | 73.3% | C+ |
| XGBoost | 10/15 | 66.7% | D+ |
| Bayesian | 9/15 | 60.0% | D- |

The old guard wins. The simplest model—ELO rating, a system adapted from chess in the 1960s—beat every machine learning approach. Sometimes more data and more complexity don't equal better predictions. Sometimes you just need a solid, battle-tested algorithm that understands one thing really well: relative team strength.
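For readers who want the mechanics, here's a minimal ELO sketch. The formula is the standard chess one on a 400-point scale; the ratings and K-factor below are illustrative placeholders, not our model's actual parameters.

```python
# Minimal ELO sketch: the standard chess formula on a 400-point scale.
# Ratings and K-factor here are illustrative, not the model's real values.

def elo_win_prob(rating_a: float, rating_b: float) -> float:
    """Probability that team A beats team B under standard ELO."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def elo_update(rating: float, expected: float, won: bool, k: float = 20.0) -> float:
    """Nudge a rating toward the observed result; K controls step size."""
    return rating + k * ((1.0 if won else 0.0) - expected)

# Example: a 1550-rated team against a 1500-rated opponent
p = elo_win_prob(1550, 1500)
print(f"win probability: {p:.3f}")   # ~0.571
new_rating = elo_update(1550, p, won=True)
```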

The ensemble model, which combines all four approaches, landed in the middle of the pack. It successfully moderated extreme predictions but didn’t capture ELO’s consistent edge.
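To make "moderated" concrete, here's a sketch of the simplest possible ensemble: an equal-weight average of each model's win probability. The actual combination rule isn't spelled out in this post, so treat the weights and input numbers as assumptions.

```python
# Equal-weight probability averaging: one extreme model (0.90) gets
# pulled back toward the pack. Input probabilities are illustrative only.

def ensemble_prob(model_probs: dict[str, float]) -> float:
    """Average home-win probability across component models."""
    return sum(model_probs.values()) / len(model_probs)

probs = {"elo": 0.55, "logistic": 0.90, "xgboost": 0.80, "bayesian": 0.75}
print(f"ensemble: {ensemble_prob(probs):.1%}")  # 75.0%
```

Averaging dampens the logistic model's extremes, which is exactly why the ensemble rarely tops the leaderboard but rarely bottoms out either.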

The Consensus Five: When Unity Means Victory

Five games had all models in perfect agreement. And all five predictions hit:

  1. KC 31, LV 0 — The week’s most confident pick (91.4%) delivered a shutout
  2. DEN 33, NYG 32 — Predicted at 82.5% confidence, Broncos survived a nail-biter
  3. CHI 26, NO 14 — Bears covered the 77.8% projection comfortably
  4. DET 24, TB 9 — Lions dominated Monday night as forecasted (72.3%)
  5. GB 27, ARI 23 — The closest consensus call (59.6%) still came through

What this tells us: Model consensus isn’t just a confidence signal—it’s a performance predictor. When five fundamentally different approaches all see the same outcome, they’re probably seeing something real.
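A consensus check is also trivially cheap to compute, which is part of its appeal. A sketch, using hypothetical picks:

```python
# Flag games where every model picks the same winner ("consensus").

def consensus_pick(picks: dict[str, str]) -> str | None:
    """Return the unanimous pick, or None if any model disagrees."""
    winners = set(picks.values())
    return winners.pop() if len(winners) == 1 else None

game = {"elo": "KC", "logistic": "KC", "xgboost": "KC",
        "bayesian": "KC", "ensemble": "KC"}
print(consensus_pick(game))  # KC -> a consensus game
```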

The Upset Chronicles: When Confidence Meets Reality

Four games saw the ensemble model proven wrong. Let’s examine the chaos:

1. PIT @ CIN: The Thursday Night Humbling

Predicted: Steelers, 87.2% confidence
Reality: Bengals 33, Steelers 31

This was supposed to be the week’s safest pick after the consensus games. The logistic model screamed “Steelers blowout” at 89.9% confidence. The ensemble agreed: Pittsburgh by 12.

Then Thursday night happened.

The Bengals and Steelers combined for 64 points in a back-and-forth thriller—26 points more than predicted. The expected 37.9-point defensive slugfest became an offensive showcase. Cincinnati won by 2 in a game that shattered every model’s expectations.

The only model that got it right? ELO, which barely favored the Bengals at 50.3%. Sometimes a coin flip beats certainty.

2. MIA @ CLE: The Home Field Surprise

Predicted: Dolphins, 63.9%
Reality: Browns 31, Dolphins 6

Three models favored Miami on the road. Cleveland had other plans, delivering a 25-point beatdown. ELO once again was the lonely voice of reason, giving the Browns a slight 51.4% edge.

Interestingly, this game had the second-best total points prediction despite being an upset—off by just 0.2 points (predicted 37.2, actual 37).

3. CAR @ NYJ: The Tie That Wasn’t

Predicted: Jets 59.2%, score 20-20
Reality: Panthers 13, Jets 6

The models predicted a dead-even 20-20 tie. Instead, we got a defensive slog where the Panthers' 13 points felt like a blowout. The predicted total of 40.2 was off by 21.2 points—the second-worst scoring miss of the week.

4. ATL @ SF: The Sunday Night Flip

Predicted: Falcons, 59.3%
Reality: 49ers 20, Falcons 10

Three models liked Atlanta. San Francisco’s defense had other ideas. The 49ers won by 10, proving that home field in primetime still matters.

The Perfect Prediction: HOU @ SEA

Amidst all the chaos, one prediction deserves its own spotlight:

Houston @ Seattle
Predicted Total: 46.0 points
Actual Total: 46 points
Error: 0.0 points

Not 45.8 rounded up. Not 46.2 rounded down. Exactly 46 points. Seattle 27, Houston 19.

In a week where the average total error was 9.79 points, nailing a game to the exact combined score is the prediction equivalent of a hole-in-one. The ensemble gave Seattle a 57.8% win probability—they delivered. The models said 46 total points—the football gods agreed.

This wasn’t even a consensus game (XGBoost and Bayesian picked Houston), making the perfect total even more remarkable.

The Scoring Report: Points Predictions

Beyond just winners, how did the models handle scoring?

Closest Total Predictions:

  1. HOU @ SEA: 46.0 predicted, 46 actual (0.0 error) ⭐
  2. MIA @ CLE: 37.2 predicted, 37 actual (0.2 error)
  3. NE @ TEN: 44.7 predicted, 44 actual (0.7 error)
  4. NO @ CHI: 41.5 predicted, 40 actual (1.5 error)
  5. GB @ ARI: 47.0 predicted, 50 actual (3.0 error)

Worst Total Predictions:

  1. PIT @ CIN: 37.9 predicted, 64 actual (26.1 error) — The Thursday chaos
  2. CAR @ NYJ: 40.2 predicted, 19 actual (21.2 error) — The defensive surprise
  3. NYG @ DEN: 46.0 predicted, 65 actual (19.0 error) — The high-scoring thriller

Overall Performance:

  • Mean Total Error: 9.79 points
  • Median Total Error: 8.72 points
  • Mean Margin Error: 12.71 points

The models generally predicted totals within about 10 points, which is respectable given the inherent variance in NFL scoring. But those outliers—particularly Thursday night games—show how quickly chaos can overwhelm even sophisticated predictions.
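For transparency, here's how those summary stats could be computed, assuming "total error" means the absolute difference between the predicted and actual combined score. Only the eight games quoted in this post are included, so the output won't reproduce the full 15-game figures.

```python
# Absolute total-score errors over the eight games listed above.
# A subset of the week, so the printed stats won't match the
# full 15-game figures (9.79 mean / 8.72 median).

from statistics import mean, median

games = [  # (predicted total, actual total)
    (46.0, 46),  # HOU @ SEA
    (37.2, 37),  # MIA @ CLE
    (44.7, 44),  # NE @ TEN
    (41.5, 40),  # NO @ CHI
    (47.0, 50),  # GB @ ARI
    (37.9, 64),  # PIT @ CIN
    (40.2, 19),  # CAR @ NYJ
    (46.0, 65),  # NYG @ DEN
]

errors = [abs(pred - actual) for pred, actual in games]
print(f"mean total error:   {mean(errors):.2f} points")
print(f"median total error: {median(errors):.2f} points")
```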

What We Learned

1. Simplicity Wins

ELO’s 80% accuracy beat every complex machine learning model. The lesson? Sometimes a well-calibrated simple model that focuses on the core signal (team strength) outperforms models trying to capture every possible variable.

2. Consensus is Golden

5-for-5 on consensus picks. When fundamentally different modeling approaches all agree, listen.

3. Thursday Nights Are Chaos

The PIT-CIN game wasn’t just wrong—it was spectacularly wrong. Short rest, divisional intensity, prime-time pressure—these factors create volatility that historical data struggles to capture.

4. Home Field Matters in Upsets

Three of the four upsets involved models underestimating the home team (CIN, CLE, SF). Even in the analytics age, playing at home still provides an edge that's hard to quantify.

5. Perfect Predictions Happen

That HOU-SEA total isn’t just luck—it’s validation that the underlying modeling approach, when it all comes together, can capture the true nature of a game. We should study what went right as much as what went wrong.

The Model Personalities Emerge

After watching Week 7 unfold, each model’s “personality” is becoming clearer:

ELO is the steady veteran—doesn’t overthink it, trusts the fundamentals, gets wins.

Logistic Regression is the bold contrarian—takes extreme positions, gets burned sometimes (PIT-CIN), but can nail upsets (MIA-CLE).

XGBoost is the inconsistent talent—shows flashes of brilliance but struggles with consistency.

Bayesian is the philosophical skeptic—high uncertainty, cautious predictions, lowest accuracy but might be the most honest about what we don’t know.

Ensemble is the committee chair—tries to find middle ground, usually lands in the middle of the pack.

Looking Ahead

Week 7 proved that consensus picks are gold, simple models can outperform complex ones, and even 87% confidence can be humbled by Thursday Night Football.

As we head into Week 8, the models have been recalibrated with fresh data. Will ELO maintain its edge? Will the Logistic model’s contrarian streak continue to pay off? Can we get another perfect scoring prediction?

The beauty of this system isn’t just making predictions—it’s learning from failure. Every upset teaches us something. Every perfect call validates the approach. And every week, the models get a little bit smarter.

Week 7 Final Tally:

  • 15 games analyzed
  • 5/5 consensus picks correct
  • 12/15 for ELO (best model)
  • 1 perfect total prediction
  • 87.2% confidence proved wrong in spectacular fashion

→ Check out Week 8 Predictions to see if the models learned their Thursday night lessons.


Analysis based on ELO ratings, Logistic Regression, XGBoost, Bayesian modeling, and Ensemble methods. All predictions generated before games were played. Results validated against official NFL scores as of October 21, 2025.

Disclaimer: This content is for informational and entertainment purposes only. All predictions are based on statistical models and historical data. Past performance does not guarantee future results. This is not betting, gambling, or financial advice. Please gamble responsibly and within your means.

Written by Claude with cresencio
