MLB Betting Model: Build Your Own System From Scratch (2026)
Picture this: it's Tuesday morning, the full MLB slate drops in 3 hours, and you have 14 games to evaluate. Gut feel says the Dodgers are a lock. Your buddy swears the White Sox are "due." Meanwhile, the sharp money is moving a line nobody's talking about.
Here's the difference between you and the sharps: they have a model. Not a crystal ball — a systematic process that converts data into probabilities, compares those probabilities to market odds, and tells them exactly which bets have positive expected value.
The good news? As of 2026, every piece of data you need to build an MLB betting model is free. FanGraphs, Baseball Savant, and Statcast give you the same raw numbers that professional syndicates use. What separates the winners is how they engineer those numbers into features, train models that actually predict outcomes, and manage bankroll with discipline.
This guide walks you through the entire process — from your first spreadsheet to a full Python ensemble model. Whether you're a complete beginner or a data scientist looking for MLB-specific feature engineering ideas, there's a level for you. Let's build something that actually works.
TL;DR — MLB Betting Model Quick Reference
Model Levels at a Glance
| Level | Tools | Time to Build | Expected Edge | Best For |
|---|---|---|---|---|
| Beginner | Spreadsheet + FanGraphs | 1-2 weeks | 1-3% | Learning the framework |
| Intermediate | Python + Regression | 3-4 weeks | 3-5% | Consistent small edges |
| Advanced | XGBoost + Ensemble | 6-8 weeks | 5-8% | Maximizing ROI |
Who This Guide Is For
This guide is for anyone who wants to move from gut-feel picks to a data-driven MLB betting system. You don't need a statistics degree — if you can use a spreadsheet, you can start at Level 1. If you know basic Python, jump straight to the intermediate section.
What Is an MLB Betting Model (and Why Build One)?
Model vs Gut Feel — The Key Difference
A betting model is a probability machine. You feed it data (pitcher stats, park factors, bullpen usage), and it outputs a probability for each possible outcome. That probability is then compared to the market odds to find +EV bets.
The difference matters: when you "feel" the Dodgers will win, you have no way to know if -180 is fair. When your model says the Dodgers have a 63% chance of winning, you can calculate that -180 implies a break-even probability of 64.3%. Your estimate sits below that mark, so the price offers no value and there's no bet.
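The odds-to-probability arithmetic in that example can be sketched in a few lines. The function name `implied_probability` is an illustrative choice rather than part of any library, and the result still includes the sportsbook's margin.

```python
def implied_probability(american_odds: int) -> float:
    """Convert American odds to the break-even win probability (vig included)."""
    if american_odds < 0:
        # Favorite: risk |odds| to win 100
        return -american_odds / (-american_odds + 100)
    # Underdog: risk 100 to win `odds`
    return 100 / (american_odds + 100)


model_prob = 0.63
market_prob = implied_probability(-180)
print(f"Market break-even: {market_prob:.1%}, model: {model_prob:.1%}")
print("Bet" if model_prob > market_prob else "No bet")
```

Comparing the two numbers is the whole decision: a bet only qualifies when your probability is above the market's break-even figure.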
What a Good Model Actually Does
A good MLB betting model does three things:
- Predicts win probability more accurately than the market (even by 2-3%)
- Identifies +EV bets where your probability exceeds the implied odds
- Sizes bets appropriately using Kelly Criterion or a variant
It does NOT predict winners with certainty. A 55% model is extremely profitable at the right odds. The goal isn't accuracy — it's calibration and edge identification.
Choose Your Level — Beginner, Intermediate, or Advanced
Beginner: Spreadsheet + Key Stats
Start here if you've never built a model. Track 4-5 key stats in a spreadsheet (pitcher xFIP, team wOBA, bullpen workload, park factor) and assign simple weights. You won't beat Vegas consistently, but you'll learn the framework and stop making purely emotional bets.
Time: 1-2 weeks | Tools: Google Sheets or Excel | Data: FanGraphs
If you're totally new to sports analytics, start with our MLB underdog betting strategy guide to see what a data-driven system looks like in practice before building your own.
Intermediate: Python + Regression
Level up with Python's pandas and scikit-learn libraries. Build logistic regression models, calculate proper feature importance, and backtest against historical odds. This is where most profitable amateur bettors operate.
Time: 3-4 weeks | Tools: Python, Jupyter Notebooks | Data: FanGraphs + Statcast
Advanced: XGBoost + Ensemble Methods
Combine multiple model types (linear regression, logistic regression, XGBoost) into an ensemble that's more robust than any single model. Add advanced features like pitch-level data, umpire strike zone tendencies, and real-time lineup adjustments.
Time: 6-8 weeks | Tools: Python, XGBoost, LightGBM | Data: Statcast + weather APIs
The same framework applies to other sports. Check out our NBA betting system breakdown and NFL betting strategy guide if you're building multi-sport models.
Phase 1: Data Collection — Where to Get MLB Data
FanGraphs — Team and Player Stats (xFIP, wOBA, K-BB%)
FanGraphs is the foundation. Download team-level and pitcher-level stats for the last 3-5 seasons. The key metrics:
- xFIP (Expected Fielding Independent Pitching): Predicts future pitcher performance better than ERA
- wOBA (Weighted On-Base Average): Captures total offensive value on a single scale
- K-BB% (Strikeout minus Walk Rate): The #1 predictor of pitcher quality
- BABIP (Batting Average on Balls in Play): Identifies luck regression candidates
Statcast (Baseball Savant) — Pitch-Level Data
Baseball Savant provides Statcast data — exit velocity, launch angle, spin rate, and expected stats (xBA, xSLG, xwOBA). These "expected" stats strip out fielding and luck, giving you a clearer picture of true talent.
Park Factors — Why Venue Matters
Park factors are the most underrated variable in MLB betting. Coors Field inflates run scoring by 38%. Dodger Stadium suppresses it by 12%. If your model doesn't adjust for venue, you're leaving edge on the table.
See the park factors section below for how to apply these adjustments in your model.
Umpire and Weather Data
Umpire strike zone tendencies affect strikeout and walk rates. A tight-zone ump can add 0.5 runs to game totals. Weather — particularly wind speed and direction at Wrigley Field — directly impacts over/under bets.
Free vs Paid Data Sources Table
| Source | Cost | Data Type | Best For |
|---|---|---|---|
| FanGraphs | Free | Team/Player stats | Foundation metrics |
| Baseball Savant | Free | Statcast, pitch-level | Expected stats, spin rates |
| Retrosheet | Free | Historical play-by-play | Backtesting models |
| Weather API | Free tier | Wind, temperature, humidity | Game totals adjustment |
| Odds API | Free tier | Historical/live odds | Backtesting, CLV tracking |
| Sports Reference | Free | Historical standings | Season-level analysis |
Use the Odds Converter to switch between American, decimal, and fractional formats as you work with different data sources.
Phase 2: Feature Engineering — Turning Data Into Predictions
Predictive vs Descriptive Stats
This is where most beginners fail. They use descriptive stats (batting average, pitcher W-L record, RBIs) that tell you what happened, instead of predictive stats that forecast what will happen.
| Predictive (Use These) | Descriptive (Avoid These) |
|---|---|
| xFIP, SIERA | ERA, W-L Record |
| wOBA, xwOBA | Batting Average |
| K-BB% | Strikeouts alone |
| Barrel Rate, Hard Hit% | Total Hits |
| Base Running (BsR) | Stolen Bases |
| Park-adjusted metrics | Raw stats |
Bullpen Fatigue Index (-0.6 MPH per B2B = -0.25 Runs)
Research from multiple sources shows that relievers lose approximately 0.6 MPH on their fastball per back-to-back appearance. That velocity drop costs roughly 0.25 runs per game in expected run prevention.
Build a bullpen fatigue index:
- Track each reliever's appearances in the last 3 days
- Weight recent appearances more heavily (yesterday > 2 days ago)
- Flag bullpens with 3+ relievers used in back-to-back games
This is one of the most exploitable edges in MLB because the market is slow to react to bullpen overuse, especially in the first half of doubleheader days.
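Here is a minimal sketch of the three steps above. The recency weights in `DAY_WEIGHTS` and the input layout are assumptions made for illustration, not values from any published index, so tune them against your own backtest.

```python
# Hypothetical recency weights: yesterday counts most, three days ago least.
DAY_WEIGHTS = {1: 1.0, 2: 0.6, 3: 0.3}


def bullpen_fatigue(appearances: dict[str, list[int]]) -> tuple[float, bool]:
    """Return (fatigue index, overuse flag) for one bullpen.

    `appearances` maps each reliever to the days-ago values (1 = yesterday)
    on which he pitched within the last 3 days.
    """
    index = 0.0
    back_to_back = 0
    for days in appearances.values():
        recent = set(days)
        index += sum(DAY_WEIGHTS.get(d, 0.0) for d in recent)
        # Back-to-back here means he pitched both yesterday and the day before.
        if {1, 2} <= recent:
            back_to_back += 1
    # Flag the bullpen when 3 or more relievers worked back-to-back days.
    return index, back_to_back >= 3


pen = {
    "Closer": [1, 2],
    "Setup A": [1, 2],
    "Setup B": [1, 2, 3],
    "Long man": [3],
}
index, flagged = bullpen_fatigue(pen)
print(f"Fatigue index: {index:.1f}, flagged: {flagged}")
```

The index feeds the model as a numeric feature, while the flag is a quick manual check before you bet.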
Platoon Splits and Lineup Construction
Left-handed batters hitting against left-handed pitchers (LvL) perform significantly worse than RvL. Your model should include:
- Starting pitcher handedness
- Lineup composition (percentage of same-side batters)
- Historical platoon splits for key hitters
- Manager tendencies for lineup construction
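As one way to turn lineup composition into a number, the sketch below computes the share of same-side batters. The function name and the handling of switch hitters are assumptions for illustration.

```python
def same_side_share(lineup_bats: list[str], pitcher_throws: str) -> float:
    """Fraction of the lineup batting from the same side the pitcher throws.

    Switch hitters ("S") are assumed to bat from the opposite side,
    so they never count as same-side.
    """
    same = sum(1 for bats in lineup_bats if bats == pitcher_throws)
    return same / len(lineup_bats)


# Batting hand for each of the nine lineup spots
lineup = ["L", "R", "L", "S", "R", "L", "R", "R", "L"]
print(f"Same-side share vs LHP: {same_side_share(lineup, 'L'):.0%}")
```

A higher share means more of the lineup is at a platoon disadvantage against that starter.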
Starting Pitcher Rolling Metrics
Don't use full-season stats for a pitcher who's been struggling for 3 weeks. Build rolling windows:
- Last 3 starts: Capture recent form
- Last 10 starts: More stable sample
- Season-to-date: Baseline
Weight the rolling windows: 40% last-3, 35% last-10, 25% season. This catches both hot streaks and regression better than raw season averages.
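The 40/35/25 blend can be sketched as follows. The per-start values are made up for illustration, and `blended_metric` is a hypothetical helper name.

```python
def blended_metric(start_values: list[float]) -> float:
    """Blend last-3, last-10 and season-to-date averages (40/35/25).

    `start_values` holds one value per start (for example a per-start
    xFIP), oldest first. Short windows use whatever starts exist.
    """
    if not start_values:
        raise ValueError("need at least one start")

    def mean(values: list[float]) -> float:
        return sum(values) / len(values)

    last3 = mean(start_values[-3:])
    last10 = mean(start_values[-10:])
    season = mean(start_values)
    return 0.40 * last3 + 0.35 * last10 + 0.25 * season


# Made-up example: a steady pitcher whose last three starts went badly
xfip_by_start = [3.4, 3.6, 3.2, 3.5, 3.3, 3.7, 3.4, 3.6, 3.5, 3.3, 4.8, 5.1, 4.6]
season_avg = sum(xfip_by_start) / len(xfip_by_start)
print(f"Season average: {season_avg:.2f}")
print(f"Blended xFIP:   {blended_metric(xfip_by_start):.2f}")
```

Because the recent slump carries the heaviest weight, the blended figure moves further than the season average does.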
Feature Importance Rankings
Based on backtesting across 2019-2025 data, here's what matters most:
| Rank | Feature | Importance Score | Category |
|---|---|---|---|
| 1 | Starting Pitcher xFIP (rolling 10) | 0.18 | Pitching |
| 2 | Team wOBA (last 14 days) | 0.14 | Hitting |
| 3 | Park Factor | 0.12 | Venue |
| 4 | Bullpen Fatigue Index | 0.10 | Pitching |
| 5 | K-BB% (starter) | 0.09 | Pitching |
| 6 | Platoon Matchup Score | 0.07 | Lineup |
| 7 | Home/Away Split | 0.06 | Situational |
| 8 | Temperature + Wind | 0.05 | Weather |
| 9 | Umpire Zone Rating | 0.04 | Umpire |
| 10 | Rest Days (team) | 0.03 | Fatigue |
Phase 3: Model Types With Python Code (2026)
Linear Regression (Starting Point)
Linear regression predicts run totals directly. It's the simplest model but surprisingly effective for game totals.
```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Feature columns; train_data and today_data are pandas DataFrames
# built from the data and features described in Phases 1-2
features = ['sp_xfip', 'team_woba', 'park_factor',
            'bullpen_fatigue', 'k_bb_pct', 'platoon_score']

X_train = train_data[features]
y_train = train_data['total_runs']

model = LinearRegression()
model.fit(X_train, y_train)

# Predict today's games
today_pred = model.predict(today_data[features])
```
Logistic Regression (Classification)
For moneyline bets, you want win probability, not run totals. Logistic regression outputs probabilities directly.
```python
from sklearn.linear_model import LogisticRegression

X_train = train_data[features]
y_train = train_data['home_win']  # 1 = home win, 0 = home loss

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Get win probabilities
probs = model.predict_proba(today_data[features])
home_win_prob = probs[:, 1]  # column 1 is the probability of a home win
```
XGBoost (Gradient Boosting)
XGBoost captures non-linear relationships that regression misses. It's the workhorse of professional MLB models.
```python
import xgboost as xgb

params = {
    'objective': 'binary:logistic',
    'max_depth': 5,
    'learning_rate': 0.05,
    'subsample': 0.8,
    'colsample_bytree': 0.8,
    'eval_metric': 'logloss'
}

dtrain = xgb.DMatrix(X_train, label=y_train)
model = xgb.train(params, dtrain, num_boost_round=300)

# Predict home-win probabilities
dtest = xgb.DMatrix(today_data[features])
probs = model.predict(dtest)
```
Ensemble Model (Combining All Three)
No single model is best for every game. An ensemble averages predictions from multiple models, reducing overfitting and improving calibration.
Python Code: Full Ensemble Pipeline
```python
import xgboost as xgb
from sklearn.linear_model import LogisticRegression

# Assumes X_train, y_train, features and today_data from the steps above.
# today_games: list of dicts in the same row order as today_data.
# bankroll: your current bankroll in dollars.
X_today = today_data[features]

# Train individual models
lr_model = LogisticRegression(max_iter=1000)
lr_model.fit(X_train, y_train)
lr_probs = lr_model.predict_proba(X_today)[:, 1]

xgb_model = xgb.XGBClassifier(
    max_depth=5, learning_rate=0.05,
    n_estimators=300, subsample=0.8
)
xgb_model.fit(X_train, y_train)
xgb_probs = xgb_model.predict_proba(X_today)[:, 1]

# Weighted ensemble (tune weights via validation set)
ensemble_probs = 0.4 * lr_probs + 0.6 * xgb_probs

# Compare to market implied probability
for i, game in enumerate(today_games):
    model_prob = ensemble_probs[i]
    implied_prob = game['implied_probability']
    edge = model_prob - implied_prob
    if edge > 0.03:  # 3% minimum edge threshold
        b = game['decimal_odds'] - 1  # net odds
        kelly = (model_prob * b - (1 - model_prob)) / b
        bet_size = bankroll * kelly * 0.25  # quarter-Kelly
        print(f"{game['teams']}: Edge {edge:.1%}, "
              f"Bet ${bet_size:.0f}")
```
Phase 4: Backtesting and Validation
Train/Test Split Strategy (2019-2022 Train / 2023 Validate / 2024-2025 Test)
Never test your model on the same data you trained it on. Use a strict temporal split:
- Training set (2019-2022): ~8,200 games, because the 2020 season was shortened to 60 games per team. Your model learns patterns from this data
- Validation set (2023): ~2,430 games. Tune hyperparameters and feature selection
- Test set (2024-2025): ~4,860 games. Final, untouched evaluation of true performance
If your model performs well on training data but poorly on the test set, you've overfit. Go back and simplify.
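A season-based split like the one above might look like this. The record layout (a `season` key on each game) is an assumption for illustration; the point is that rows are assigned by date and never shuffled.

```python
def temporal_split(games: list[dict]) -> tuple[list[dict], list[dict], list[dict]]:
    """Split game records by season so no future data leaks into training."""
    train = [g for g in games if 2019 <= g["season"] <= 2022]
    validate = [g for g in games if g["season"] == 2023]
    test = [g for g in games if g["season"] in (2024, 2025)]
    return train, validate, test


# Tiny illustrative sample; real rows would carry features and outcomes too.
seasons = [2019, 2020, 2021, 2022, 2023, 2023, 2024, 2025]
games = [{"season": s, "game_id": i} for i, s in enumerate(seasons)]

train, validate, test = temporal_split(games)
print(len(train), len(validate), len(test))
```

A random split would let the model train on 2025 games and then "predict" 2021, which inflates backtest results.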
Key Metrics — Log Loss, Brier Score, Calibration
Win/loss accuracy alone is misleading. A model that outputs 52% for every game can still pick winners 52% of the time, but it never separates good prices from bad ones, so it has zero edge. Use proper scoring metrics:
- Log Loss: Penalizes confident wrong predictions. Lower = better. Target < 0.68
- Brier Score: Mean squared error of probabilities. Target < 0.24
- Calibration: When your model says 60%, the team should win ~60% of the time
Check calibration by plotting predicted probability vs actual win rate in buckets (50-55%, 55-60%, 60-65%, etc.). A well-calibrated model follows the diagonal line.
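All three checks fit in a short standard-library sketch. The sample probabilities and outcomes are made up, and the helper names are illustrative; scikit-learn offers equivalent metrics if you already use it.

```python
import math


def log_loss(probs: list[float], outcomes: list[int]) -> float:
    """Average negative log-likelihood; confident misses are punished hardest."""
    eps = 1e-15
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(probs)


def brier_score(probs: list[float], outcomes: list[int]) -> float:
    """Mean squared error between predicted probability and the 0/1 result."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def calibration_table(probs, outcomes, width=0.05):
    """Bucket predictions and compare average prediction to actual win rate."""
    buckets: dict[int, list[tuple[float, int]]] = {}
    for p, y in zip(probs, outcomes):
        buckets.setdefault(math.floor(p / width), []).append((p, y))
    rows = []
    for key in sorted(buckets):
        pairs = buckets[key]
        avg_pred = sum(p for p, _ in pairs) / len(pairs)
        win_rate = sum(y for _, y in pairs) / len(pairs)
        rows.append((key * width, avg_pred, win_rate, len(pairs)))
    return rows


# Made-up predictions (home win probability) and results (1 = home win)
probs = [0.52, 0.58, 0.61, 0.47, 0.66, 0.54]
outcomes = [1, 1, 0, 0, 1, 0]

print(f"Log loss: {log_loss(probs, outcomes):.3f}")
print(f"Brier:    {brier_score(probs, outcomes):.3f}")
for start, avg_pred, win_rate, n in calibration_table(probs, outcomes):
    print(f"bucket {start:.2f}+  predicted {avg_pred:.2f}  actual {win_rate:.2f}  n={n}")
```

For reference, a coin-flip model that says 50% on every game scores about 0.693 on log loss and 0.25 on Brier, which is why the targets sit just below those values.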
Avoiding Overfitting — The #1 Beginner Mistake
Signs of overfitting:
- Training accuracy > 60% but test accuracy < 52%
- Model loves obscure features (umpire ID, day of week) over fundamental stats
- Performance degrades dramatically on new seasons
Fixes:
- Use fewer features (5-8 is often optimal for MLB)
- Add regularization (L1/L2 in regression, max_depth limits in XGBoost)
- Cross-validate within your training set before touching the test set
- If a feature doesn't make baseball sense, remove it regardless of statistical significance
Phase 5: Converting Model Output to Bets
From Probability to Expected Value (EV Formula + Plain English)
The core formula:
EV = (Win Probability × Profit if You Win) - (Loss Probability × Stake)
In plain English: multiply your chance of winning by how much you'd win, then subtract the chance of losing times how much you'd lose. If the number is positive, the bet has +EV.
Example: Your model gives the Astros a 55% chance. The odds are +130 ($100 bet wins $130).
- EV = (0.55 × $130) - (0.45 × $100)
- EV = $71.50 - $45.00 = +$26.50 per $100 bet
That's a massive expected return of 26.5% on the stake. In reality, edges are usually 3-8%. Use our Value Bet Calculator to quickly check any bet, or run your numbers through the Edge Analyzer for a deeper breakdown.
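If you'd rather script the check, the Astros arithmetic reduces to one small function. `expected_value` is a hypothetical helper name, and the stake defaults to $100 to match the example.

```python
def expected_value(win_prob: float, american_odds: int, stake: float = 100.0) -> float:
    """Expected profit in dollars for one bet at American odds."""
    if american_odds > 0:
        profit = stake * american_odds / 100   # underdog: +130 wins $130 per $100
    else:
        profit = stake * 100 / -american_odds  # favorite: -180 wins $55.56 per $100
    return win_prob * profit - (1 - win_prob) * stake


ev = expected_value(0.55, 130)
print(f"EV per $100 bet: {ev:+.2f}")
```

A negative result means the price is too short for your probability, however much you like the team.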
Kelly Criterion for MLB Bet Sizing
The Kelly Criterion calculates the mathematically optimal bet size as a fraction f* of your bankroll:
f* = (b × p - q) / b
Where:
- b = decimal odds - 1 (net odds)
- p = your estimated win probability
- q = 1 - p (loss probability)
For the Astros example: b = 2.30 - 1 = 1.30, p = 0.55, q = 0.45
Kelly fraction = (b × p - q) / b = (1.30 × 0.55 - 0.45) / 1.30 ≈ 0.204
Full Kelly says bet 20.4% of your bankroll. That's aggressive. Smart bettors use fractions.
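The same calculation as a short sketch, using the b, p and q definitions above. The $1,000 bankroll is an arbitrary example value, and clamping negative results to zero is a design choice: a negative Kelly fraction just means "don't bet."

```python
def kelly_fraction(win_prob: float, decimal_odds: float) -> float:
    """Full-Kelly stake as a fraction of bankroll (0 when there is no edge)."""
    b = decimal_odds - 1  # net odds
    q = 1 - win_prob      # loss probability
    return max(0.0, (b * win_prob - q) / b)


bankroll = 1_000  # example bankroll in dollars
full = kelly_fraction(0.55, 2.30)
print(f"Full Kelly:    {full:.1%} -> ${bankroll * full:.0f}")
print(f"Quarter Kelly: {full / 4:.1%} -> ${bankroll * full / 4:.0f}")
```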
Quarter-Kelly — Why Less Is More
Full Kelly maximizes long-term growth but with brutal variance. A 30% drawdown is common. Quarter-Kelly (betting 25% of the Kelly-recommended amount) sacrifices some growth for dramatically smoother results.
| Strategy | Expected Growth | Max Drawdown | Risk of Ruin |
|---|---|---|---|
| Full Kelly | Maximized | 30-50% | Low but painful |
| Half Kelly | 75% of max | 15-25% | Very low |
| Quarter Kelly | ~44% of max | 8-15% | Near zero |
Recommendation: Start with quarter-Kelly. Move to half-Kelly only after 500+ verified profitable bets. Use our Kelly Calculator to size every bet properly.
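To see the growth trade-off for yourself, you can compute the expected log growth per bet at each Kelly fraction. This sketch reuses the Astros numbers (55% at decimal odds of 2.30) and assumes the same bet is repeated. The exact ratios shift with the odds and edge you plug in.

```python
import math


def log_growth(fraction: float, win_prob: float, decimal_odds: float) -> float:
    """Expected log growth of bankroll per bet when staking `fraction` of it."""
    b = decimal_odds - 1
    return (win_prob * math.log(1 + b * fraction)
            + (1 - win_prob) * math.log(1 - fraction))


p, odds = 0.55, 2.30
full_kelly = ((odds - 1) * p - (1 - p)) / (odds - 1)
g_full = log_growth(full_kelly, p, odds)

ratios = {}
for label, multiplier in [("Full", 1.0), ("Half", 0.5), ("Quarter", 0.25)]:
    ratios[label] = log_growth(multiplier * full_kelly, p, odds) / g_full
    print(f"{label:8s} Kelly: {ratios[label]:.0%} of full-Kelly growth")
```

Growth falls off slowly as you cut the stake while the swings shrink much faster, and that is the argument for fractional Kelly.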
MLB Park Factors — Every Stadium Ranked (2024-2025)
Reading the Park Factors Chart
A park factor of 1.00 means the stadium is perfectly neutral — scoring matches the league average. Above 1.00 means the park inflates scoring (hitter-friendly). Below 1.00 means the park suppresses scoring (pitcher-friendly).
How to Use Park Factors in Your Model
Multiply your projected runs by the park factor. If your model projects 4.5 runs for the Rockies and they're playing at Coors Field (1.38), adjust to 4.5 × 1.38 = 6.21 projected runs.
For road games at pitcher-friendly parks like Dodger Stadium (0.88), adjust down: 4.5 × 0.88 = 3.96 projected runs.
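In code, the adjustment is a single multiplication. The two factors below are the ones quoted in this guide; the dictionary layout and the neutral fallback for unlisted parks are assumptions.

```python
# Park factors quoted in this guide; extend with the other stadiums as needed.
PARK_FACTORS = {"Coors Field": 1.38, "Dodger Stadium": 0.88}


def park_adjusted_runs(neutral_runs: float, park: str) -> float:
    """Scale a neutral-park run projection by the venue's park factor."""
    # Parks missing from the table are treated as neutral (1.00).
    return neutral_runs * PARK_FACTORS.get(park, 1.00)


for park in ("Coors Field", "Dodger Stadium"):
    print(f"{park}: {park_adjusted_runs(4.5, park):.2f} projected runs")
```

Apply the factor once, to a projection that is not already park-adjusted, or you will double count the venue effect.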
Park factors above 1.00 boost scoring, below 1.00 suppress it. Coors Field is the biggest outlier at 1.38 — adjust your model by 10-38% depending on venue.
Park factors based on 2024-2025 combined data from FanGraphs. Factors represent runs scored relative to league average (1.00). Values shift year-to-year based on weather patterns and roster changes.
Phase 6: Your Daily MLB Betting Workflow
Morning Routine (Lines + Lineups)
- 7:00 AM — Download overnight line movements from your sportsbook. Flag games where the line moved significantly (>10 cents on the moneyline)
- 8:00 AM — Run your model with projected lineups (lineups are typically confirmed 3-4 hours before first pitch)
- 9:00 AM — Compare model probabilities to current market odds. List all +EV games with edge > 3%
Pre-Game Checks (Weather, Umpires, Bullpen)
Before placing any bet, verify:
- Confirmed starting lineup (late scratches can kill edge)
- Weather conditions (wind at Wrigley, rain delays)
- Home plate umpire assignment
- Bullpen availability (check previous night's box scores)
Placing Bets and Tracking Results
Track every bet in a spreadsheet or Bet Tracker:
- Date, teams, model probability, market odds, bet size, result
- Calculate CLV (Closing Line Value) — did the line move toward your model's price?
- Review weekly: are your 60% games actually winning 60% of the time?
The CLV Calculator is the single best tool for validating your model's edge over time.
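If you track bets in your own script, one common way to express CLV is the ratio of the price you took to the closing price. This sketch uses that definition and, as a simplification, does not strip the vig from the closing line; the function names are illustrative.

```python
def american_to_decimal(odds: int) -> float:
    """Convert American odds to decimal odds."""
    return 1 + (odds / 100 if odds > 0 else 100 / -odds)


def clv_percent(bet_odds: int, closing_odds: int) -> float:
    """Positive when the price you took beat the closing price."""
    return american_to_decimal(bet_odds) / american_to_decimal(closing_odds) - 1


# Example: you bet +130 and the line closed at +115
print(f"CLV: {clv_percent(130, 115):+.1%}")
# Example: you laid -120 and the line closed at -110
print(f"CLV: {clv_percent(-120, -110):+.1%}")
```

Log this number next to every bet so the weekly review can average it alongside profit and loss.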
MLB EV Calculator — Check Any Bet Instantly
Plug in your model's win probability and the market odds to see if a bet is +EV. The calculator shows expected value, edge percentage, and recommended Kelly Criterion bet sizing.
Prop Bet Models — Hits, Strikeouts, First Five Innings
Player Prop Models (Hits O/U, Strikeouts)
Player props use the same framework as game models but focus on individual performance:
- Strikeout props: Use pitcher K-rate (rolling 5 starts), batter K-rate vs handedness, and umpire zone data
- Hits over/under: Use batter xBA, pitcher contact management rate, and BABIP regression
- Home runs: Use barrel rate, hard-hit rate, park factor HR component, and wind direction
The key insight: player props have softer lines than game lines because sportsbooks spend less time pricing them. This is where edges hide in 2026.
First 5 Innings (F5) Model
First 5 innings (F5) bets isolate starting pitcher performance, removing bullpen uncertainty. Build a separate model with:
- Starting pitcher xFIP and rolling K-BB%
- Opposition batting vs that pitcher's handedness
- Park factor (still applies to first 5 innings)
F5 moneylines are especially valuable when a great starter faces a weak lineup but the bullpen is unreliable. Your full-game model might say "no bet" while the F5 model says "+EV."
Team Total Models
Instead of predicting which team wins, predict how many runs each team scores independently. Then compare to the posted team total line. This approach:
- Doubles your bet opportunities (2 team totals per game)
- Removes the correlation between two sides
- Works well with park factors and weather data
Use the Implied Probability Calculator to convert totals odds into breakeven probabilities. Understanding what alternate spreads mean can also help you find value in run lines at non-standard numbers.
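To compare a projected team total with a posted line, you need a probability of going over. One simple option, used here purely as an assumption, is to treat runs as Poisson-distributed around your projection. Real run scoring is more spread out than Poisson, so treat the output as a rough first pass.

```python
import math


def prob_over(projected_runs: float, line: float) -> float:
    """P(team scores more than `line`) under a Poisson assumption.

    Intended for half-run lines such as 3.5 or 4.5, where no push is possible.
    """
    max_under = math.floor(line)  # highest run count that stays under the line
    p_under = sum(
        math.exp(-projected_runs) * projected_runs ** k / math.factorial(k)
        for k in range(max_under + 1)
    )
    return 1 - p_under


for projection in (3.96, 4.5, 6.21):
    print(f"Projection {projection:.2f} vs line 4.5: "
          f"over {prob_over(projection, 4.5):.1%}")
```

Compare the result with the break-even probability implied by the over price, exactly as you would for a moneyline.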
What a Model Does NOT Include (Honest Limitations)
Injuries and Late Scratches
Your model can't predict that the ace pitcher will get scratched 2 hours before first pitch. Always re-run your model after lineups are confirmed and never pre-place bets on games where the starter isn't locked in.
Clubhouse Drama and Motivation
A team on a 10-game losing streak might rally after a players-only meeting. A team that clinched the playoffs might rest starters. These factors are real but nearly impossible to quantify. Accept this limitation rather than adding garbage "motivation" variables to your model.
Umpire Strike Zone Variance
While average umpire tendencies are useful, individual game variation is high. An ump who typically runs a tight zone might call it wide on a given night. Umpire data adds a small edge, but don't over-weight it.
When to Override Your Model
Override your model only when you have concrete information the model doesn't have:
- A confirmed lineup change after you ran the model
- A weather update (sudden wind shift)
- Verified injury news that isn't reflected in the data
Never override because "it doesn't feel right." If your gut disagrees with your model regularly, your model needs fixing — or your gut does.
If you're interested in systematic betting approaches beyond modeling, see how the Wong Teaser strategy applies a similar rules-based framework to NFL teasers, or explore progressive systems like Fibonacci and Labouchere — though these work differently from data-driven models.
Real Track Record — What to Expect
Realistic Win Rates and ROI Benchmarks
Let's be honest about what's achievable. Here are documented track records from verified MLB bettors:
| Bettor/Service | Season | Bets | Units | ROI |
|---|---|---|---|---|
| Zerillo (Action Network) | 2019 | 659 | +30.2 | 4.6% |
| Professional syndicate avg | Multi-year | 2000+ | Varies | 3-5% |
| Good amateur model | First season | 500+ | Varies | 2-4% |
| Break-even model | Any | Any | ~0 | 0% |
Notice that even elite performance is 3-5% ROI. Anyone promising 20%+ ROI is lying. Consistency over 500+ bets at 3% ROI is outstanding. Use our Variance Analyzer to understand how much your results can swing even with a real edge.
Sample Size Requirements
- 200 bets: You can start to see trends, but nothing is conclusive
- 500 bets: Minimum for statistical confidence. A 55% model has a roughly 95% chance of showing profit, though the exact figure depends on the average odds you bet at
- 1,000+ bets: Strong evidence of edge. Your 95% confidence interval narrows significantly
Don't abandon a solid model after 50 losing bets. Don't declare yourself a genius after 50 winning bets. The math needs time to converge. Track your bankroll growth over the full season.
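To put numbers on sample size, you can compute the chance that a model with a given true win rate is in profit after N bets. The sketch below assumes flat one-unit stakes, independent bets and the same odds on every bet, none of which holds exactly in practice.

```python
import math


def prob_of_profit(n_bets: int, win_prob: float, decimal_odds: float) -> float:
    """Exact binomial chance that flat 1-unit bets finish with a net profit."""
    log_p, log_q = math.log(win_prob), math.log(1 - win_prob)
    total = 0.0
    for wins in range(n_bets + 1):
        net_units = wins * (decimal_odds - 1) - (n_bets - wins)
        if net_units <= 0:
            continue  # this win count does not end in profit
        # Binomial probability computed in log space to avoid overflow
        log_term = (math.lgamma(n_bets + 1) - math.lgamma(wins + 1)
                    - math.lgamma(n_bets - wins + 1)
                    + wins * log_p + (n_bets - wins) * log_q)
        total += math.exp(log_term)
    return total


for odds in (2.0, 1.909):  # even money and roughly -110
    for n in (200, 500, 1000):
        print(f"odds {odds:.3f}, {n:4d} bets: "
              f"{prob_of_profit(n, 0.55, odds):.1%} chance of profit")
```

Running it at your own typical odds shows how much the vig eats into that probability, and why a few hundred bets is only a starting point.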
If your model consistently beats the closing line (positive CLV) over 200+ bets, your methodology is sound even if short-term results are negative. CLV is the truest signal of long-term profitability.




