Baseball Betting Explained: Identifying Regression Candidates
Some teams win more than they should. Some pitchers post ERAs they have no business posting. The results look real, the market prices them accordingly, and then everything falls apart over the next month. If you can identify those situations before the collapse, you're betting against a price built on borrowed time. That's what regression candidate analysis does for you.

What Regression Actually Means in Baseball
Regression to the mean isn't a theory. It's a statistical reality in any sport where variance plays a significant role in short-term results. In baseball, run scoring, win-loss records, and individual stats all contain a meaningful luck component that evens out over large samples. When a team's or player's results diverge sharply from the underlying skill, the divergence almost always closes over time.
The question isn't whether regression will happen. It's whether the gap is large enough to bet against at a price that hasn't caught up yet. Teams that win close games at an unsustainable rate, pitchers whose ERA sits well below their expected metrics, and hitters whose production is built on a BABIP that their contact quality doesn't support are all regression candidates. The challenge is finding them before the market does.
Using Pythagorean Records to Find Team Regression
A team's Pythagorean win record is the number of wins you'd expect from its runs scored and runs allowed across all games played. In its classic form, the expected winning percentage is runs scored squared divided by the sum of runs scored squared and runs allowed squared. Because it looks only at run totals, it removes sequencing and clutch performance from the equation. When a team's actual win total diverges significantly from its Pythagorean record, the team is either overperforming or underperforming relative to its underlying output.
How to use Pythagorean records for betting:
- A team sitting 10 wins above its Pythagorean record has been winning close games at an unsustainable rate, driven by factors like bullpen sequencing, strand rate luck, and walk-off timing rather than genuine team quality
- That team is a regression candidate: its moneyline prices are built on an inflated record, and its true talent level is closer to the Pythagorean number
- A team sitting 8 wins below its Pythagorean record has been losing close games through bad luck and poor sequencing rather than poor performance; its price is deflated relative to its underlying quality
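The Pythagorean comparison can be sketched in a few lines of Python. This is a minimal illustration, not a definitive model: the function names and the sample team are hypothetical, and the exponent of 1.83 is one commonly used value (the classic version of the formula uses 2).

```python
def pythagorean_wins(runs_scored, runs_allowed, games_played, exponent=1.83):
    """Expected wins from run totals via the Pythagorean expectation.

    The exponent is an assumption; 2 is the classic value, 1.83 a common refinement.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games_played * rs / (rs + ra)


def luck_gap(actual_wins, runs_scored, runs_allowed, games_played):
    """Actual wins minus Pythagorean wins; positive means overperforming."""
    return actual_wins - pythagorean_wins(runs_scored, runs_allowed, games_played)


# Hypothetical team: 60-40 record, 450 runs scored, 430 runs allowed.
gap = luck_gap(60, 450, 430, 100)
print(f"Pythagorean gap: {gap:+.1f} wins")
```

A large positive gap marks a team to consider fading; a large negative gap marks a potential buy.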
The BaseRuns metric goes a step further than Pythagorean records by estimating expected runs from component offensive and pitching data, which strips out sequencing effects at a more granular level. Both tools point at the same thing: the gap between what happened and what should have happened given the underlying performance.
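For a rough sense of how a component-based estimate works, here is a sketch using one widely circulated simple version of the BaseRuns formula. The weights below come from that simple version and are an assumption here; published team BaseRuns figures may use different, more detailed weights. The sample stat line is hypothetical.

```python
def base_runs(at_bats, hits, walks, home_runs, total_bases):
    """Estimate runs from component stats (simple BaseRuns variant, assumed weights)."""
    a = hits + walks - home_runs                      # baserunners
    b = (1.4 * total_bases - 0.6 * hits
         - 3 * home_runs + 0.1 * walks) * 1.02        # advancement factor
    c = at_bats - hits                                # outs
    d = home_runs                                     # runs that always score
    return a * b / (b + c) + d


# Hypothetical season line for a team offense.
expected_runs = base_runs(at_bats=5500, hits=1400, walks=500,
                          home_runs=180, total_bases=2300)
print(f"BaseRuns estimate: {expected_runs:.0f} runs")
```

Comparing this estimate with actual runs scored (and the same for runs allowed) shows how much of a team's run differential came from sequencing.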
Identifying Player Regression Candidates
The same logic applies to individual hitters and pitchers. When a player's actual stats diverge significantly from their expected stats, the gap is informative. It tells you whether their recent production is sustainable or built on factors that will normalize.
Hitter regression signals worth tracking:
- A hitter whose batting average is .050 or more above his xBA is likely benefiting from a high BABIP that his hard-hit rate and barrel percentage don't fully support; his hit props and H+R+RBI overs are likely priced on the inflated number
- A hitter whose xSLG and xwOBA significantly exceed his actual production is an underpriced positive regression candidate; his total bases overs at deflated prices have value
Pitcher regression signals worth tracking:
- A pitcher with an ERA 1.00 or more below his xERA has been getting results through strand rate luck, low BABIP, or favorable sequencing rather than genuine run prevention; he's a fade candidate before the regression arrives
- A pitcher with an ERA well above his xFIP and xERA has likely been hurt by poor defense or bad sequencing luck; he's an undervalued buy candidate
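The hitter and pitcher signals above share one shape: compare an actual stat with its expected counterpart and flag the direction of the gap. A minimal sketch, assuming a hypothetical helper name and labels ("fade", "buy", "neutral"); the thresholds in the examples are the ones quoted above.

```python
def regression_signal(actual, expected, threshold, higher_is_better=True):
    """Label a player by the gap between actual and expected stats.

    "fade": results are running ahead of the expected number.
    "buy":  results are lagging the expected number.
    """
    gap = actual - expected
    if not higher_is_better:
        gap = -gap  # for ERA, beating the expected number means a lower actual
    if gap >= threshold:
        return "fade"
    if gap <= -threshold:
        return "buy"
    return "neutral"


# Hypothetical hitter: .310 average against a .255 xBA.
print(regression_signal(0.310, 0.255, threshold=0.050))
# Hypothetical pitcher: 2.60 ERA against a 3.90 xERA.
print(regression_signal(2.60, 3.90, threshold=1.00, higher_is_better=False))
```

The same function works for xSLG or xwOBA gaps, though the threshold for those is a judgment call that the signals above don't pin down.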
The Price Has to Be Wrong for the Edge to Exist
Here's the part most bettors miss about regression analysis. Being "due" for regression is meaningless if the market has already priced it in. A team with an inflated record that is now being priced as an underdog because the market has identified the overperformance doesn't give you an edge. The regression is already reflected in the line.
The edge only exists when:
- The regression signal is present in the underlying data
- The market price hasn't yet adjusted to reflect the impending regression
- You're acting before the public narrative catches up to what the numbers already show
The best regression candidate bets come in the middle of a team or player's hot stretch, when the public is fully buying the narrative and the price is at its most inflated relative to the underlying quality. That's the uncomfortable position regression betting requires: fading things that look like they're working.
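Checking whether the price has adjusted means comparing the market's implied probability with your own estimate. The sketch below uses the standard American odds conversion; the function names, the sample prices, and the 48% estimate are hypothetical, and the implied probability here still includes the bookmaker's margin.

```python
def implied_probability(american_odds):
    """Convert American odds to the implied win probability (vig included)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def edge(estimated_prob, american_odds):
    """Your estimated win probability minus the price's implied probability."""
    return estimated_prob - implied_probability(american_odds)


# Hypothetical game: an overperforming favorite at -150, opponent at +130.
# Suppose the underlying numbers give the opponent a 48% chance.
print(f"Favorite implied: {implied_probability(-150):.3f}")
print(f"Edge on opponent at +130: {edge(0.48, 130):+.3f}")
```

A positive edge means the price hasn't caught up with the underlying numbers; an edge at or below zero means the regression is already in the line.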
Building a Simple Regression Candidate Checklist
You don't need to run a full model to identify regression candidates. A practical checklist covers the most important signals without requiring advanced tools.
For teams:
- Actual wins vs Pythagorean wins: gap larger than 6 in either direction is worth flagging
- Record in one-run games: teams winning more than 65% of one-run games are almost certainly overperforming
- Bullpen ERA vs bullpen xERA: large gap suggests unsustainable late-game run prevention
For pitchers:
- ERA vs xERA: gap larger than 0.80 in either direction is meaningful
- BABIP vs league average: outliers below .250 or above .350 without clear explanatory factors
- Strand rate: above 80% is unsustainably high; below 65% is unsustainably low
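The checklist translates directly into a pair of small functions. This is a sketch under stated assumptions: the function names and flag labels are hypothetical, and the bullpen gap threshold of 0.80 is borrowed from the pitcher ERA rule because the checklist only says "large gap".

```python
def team_flags(actual_wins, pythag_wins, one_run_wins, one_run_losses,
               bullpen_era, bullpen_xera, bullpen_gap=0.80):
    """Return the team checklist items that fire."""
    flags = []
    if abs(actual_wins - pythag_wins) > 6:
        flags.append("Pythagorean gap")
    one_run_games = one_run_wins + one_run_losses
    if one_run_games and one_run_wins / one_run_games > 0.65:
        flags.append("one-run record")
    if abs(bullpen_era - bullpen_xera) > bullpen_gap:  # assumed threshold
        flags.append("bullpen ERA vs xERA")
    return flags


def pitcher_flags(era, xera, babip, strand_rate):
    """Return the pitcher checklist items that fire (rates as decimals)."""
    flags = []
    if abs(era - xera) > 0.80:
        flags.append("ERA vs xERA")
    if babip < 0.250 or babip > 0.350:
        flags.append("BABIP outlier")
    if strand_rate > 0.80 or strand_rate < 0.65:
        flags.append("strand rate outlier")
    return flags


# Hypothetical inputs for illustration only.
print(team_flags(60, 52.1, 20, 8, 3.00, 4.10))
print(pitcher_flags(2.50, 3.60, 0.240, 0.84))
```

A flag is a prompt to look closer, not a bet on its own; the BABIP item in particular still needs the "without clear explanatory factors" check done by hand.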
The Bottom Line on Identifying Regression Candidates
Regression analysis finds the gap between what's happening and what should be happening. Teams winning more than their run differential supports, pitchers posting ERAs their peripherals don't justify, and hitters whose stats are running ahead of their expected numbers are all regression candidates. The edge comes from identifying that gap while the market is still pricing off the inflated results. Once the regression arrives, the price catches up and the opportunity is gone.



