This article is part of our MLB Observations series.
There's a school of thought in analytics that likens baseball players to dice rolls or coin flips. Once you know the base-rate probability of getting heads or rolling a seven with two dice, you can ascertain with a high degree of certainty whether you got proper odds on your bet. For example, if you're paying me 2:1 for tails on a fair coin, that's a great bet for me (at affordable amounts) even if the coin lands heads. The unlucky result (heads) does not render my decision to take tails at 2:1 wrong. My process -- which involved recognizing that 2:1 odds on an even-money proposition are a great value -- was sound irrespective of the result.
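To make the coin-flip arithmetic concrete, here's a minimal sketch of the expected value of that bet. The 2:1 payout and the fair coin come straight from the example above; the one-unit stake is just a convenient assumption.

```python
# Expected value of taking 2:1 on tails with a fair coin.
# The 2:1 payout and the 50/50 coin come from the example in the text;
# the 1-unit stake is an arbitrary, illustrative choice.

p_tails = 0.5   # fair coin
payout = 2.0    # units won per unit staked at 2:1
stake = 1.0     # units risked

ev = p_tails * payout - (1 - p_tails) * stake
print(f"expected value per flip: {ev:+.2f} units")  # +0.50 units

# A single unlucky heads doesn't change this number -- the edge is a
# property of the odds and the probability, not of any one result.
```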
The argument goes, if you can ascertain with reasonable accuracy the probability that a player does well or poorly in a given season or game, then you can, in those situations, feel good about your probability-based decisions, irrespective of the result. Just as getting heads didn't make tails a bad bet at 2:1, Nick Anderson giving up runs and losing the World Series didn't make taking Blake Snell (who was dealing) out of the game before he faced the Dodgers' order a third time a horrendous decision. The probability at the time is not affected by the outcome of the decision.
Where this reasoning breaks down is that with a fair coin or set of dice, the results cannot convey new information about the decision. That you got three heads in a row does not make 2:1 on tails a bad bet, provided the coin is fair. But with players, the results cause us to update our probabilities, to change the base rate. Of course, we don't have access to those results at the time of our decision, but the very fact that results alter our priors means that the accuracy of our priors is subject to results-based scrutiny. We acknowledge our priors were less accurate before we incorporated the results. If priors require constant updating, they are always imperfect, unlike our priors about fair coins or dice, which never require updating. Accordingly, our priors about players are error-prone and must be refined and ultimately judged by results.
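For readers who want to see what "updating the base rate" looks like mechanically, here is a minimal sketch using a simple Beta-Binomial-style update. The model choice and every number in it are hypothetical illustrations, not anything drawn from a team's actual system.

```python
# A hypothetical update of a prior belief about a player's on-base rate
# once new results arrive. All numbers here are made up for illustration.

prior_on_base, prior_outs = 33, 67   # prior roughly equivalent to a .330 on-base rate

new_on_base, new_outs = 45, 55       # suppose: 45 times on base in 100 plate appearances

post_on_base = prior_on_base + new_on_base
post_outs = prior_outs + new_outs

prior_rate = prior_on_base / (prior_on_base + prior_outs)
post_rate = post_on_base / (post_on_base + post_outs)

print(f"base rate before results: {prior_rate:.3f}")  # 0.330
print(f"base rate after results:  {post_rate:.3f}")   # 0.390

# By contrast, a coin we stipulate to be fair stays at 0.500 no matter what
# we observe. Player priors move with results; coin priors don't.
```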
Ideally you want to judge your base rate by a large sample of results over time. Unfortunately, baseball -- and I'd argue life generally -- isn't settled on an average-expected-outcome basis of identical coin flips. It's settled by unique one-off events that are similar (and can be grouped together) only to varying degrees. The entire purpose of having a good analytic model is to get these unique one-off decisions right, not merely often enough to beat the competition, but also when it matters most. Winning 100 $1 bets but losing two $60 ones is a net loss, despite an incredible winning percentage. The model used by the Rays failed in that respect because while we'll never know what Snell would have done, we know removing him resulted in their losing the World Series. One might argue their process was good, and that even if it were the right decision probabilistically, that wouldn't guarantee the right result. But that begs the question. Whether the process was good depends on whether it nets a profit or a loss over time, adjusting for varying stakes. And all the times they pulled Snell and won (it's likely in many of those cases they would have won anyway) have a different impact than pulling him and losing the World Series.
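The stakes point above reduces to simple arithmetic, sketched here with the same bet sizes the paragraph uses.

```python
# A 98% win rate can still be a net loss once bet sizes differ.
# Bet sizes are the ones from the paragraph above.

wins = [1.0] * 100    # one hundred $1 bets, all won
losses = [60.0] * 2   # two $60 bets, both lost

net = sum(wins) - sum(losses)
win_pct = len(wins) / (len(wins) + len(losses))

print(f"win percentage: {win_pct:.1%}")   # 98.0%
print(f"net result: {net:+.0f} dollars")  # -20 dollars

# A model judged only by how often it's right can still lose where it
# matters most: the decisions with the biggest stakes.
```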
Bottom line, this is a complex area where it isn't possible to demonstrate the right decision irrespective of outcome. And while the outcome when the stakes are highest might not be entirely dispositive, becoming world champions is the ultimate goal of every franchise and must carry significant weight when evaluating the correctness of the decision. To argue otherwise is to confuse your model with reality.