Brewer Fanatic

Brewers Pythagorean Record for this year


molitor fan


I don't understand the more complex (3rd order, etc.) formulas. However, I would at least guess the "overachieving" is largely because they've had a couple of starters who have been simply annihilated (the 1-11 stat), while they have been competitive and close in the rest of the games. What is the run differential outside of the Hendrickson/Eveland throwaway starts?

Basically, the more advanced formulas look at hitting lines and pitching lines to estimate how many runs the Brewers should have given up and how many they should have scored. By those numbers, the Brewers "should" have scored about 15 more runs and "should" have allowed about 32 fewer.


Thanks to Russ' blog, they had a run differential of -24 in games started by those two.
It sounds like Russ' blog caused the situation. :)

That’s the only thing Chicago’s good for: to tell people where Wisconsin is.

-- Sigmund Snopek


I don't understand the more complex (3rd order, etc) formulas.

 

There's not much to it. As was mentioned above, 2nd order wins just look at raw stats to estimate how many runs should have been scored and given up. I'm not a big fan of BP's way of doing it, since they basically use basic runs created (AB x OBP x SLG) to calculate the runs (they may have updated the methodology). I use BaseRuns, which is one of the better run estimators, IMO. Here's the latest series probability:

 

http://photos1.blogger.com/blogger/7391/569/400/win%20prob619.gif

 

The BsR win% is essentially my 2nd order wins, which says that the Brewers are essentially a .500 team. BP's 3rd order goes one extra step and adjusts for strength of schedule. The Brewers' SOS is pretty neutral, so the 2nd and 3rd order wins are about the same.

 

As Brett mentioned, the Brewers' unique position of having a revolving door of rookie pitchers get blown up has really skewed the Brewers' Pythagorean record. Their run distribution is all messed up as a result. I have a spreadsheet that's mostly updated. For anyone interested, I'll post an up-to-date version soon.
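
For anyone following along, the plain Pythagorean record from the thread title can be sketched in a few lines of Python. This is only the basic form with an exponent of 2, which is an assumption on my part; it is not BP's 2nd or 3rd order method, and the function names and the run totals in the usage line are made up for illustration, not the Brewers' actual numbers.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    Basic Pythagorean form: RS^x / (RS^x + RA^x), with x = 2 by default.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored, runs_allowed, games, exponent=2.0):
    """Expected wins over a given number of games."""
    return pythagorean_win_pct(runs_scored, runs_allowed, exponent) * games


if __name__ == "__main__":
    # Hypothetical run totals, chosen only to show the call.
    print(round(pythagorean_win_pct(330, 350), 3))
```

Roughly speaking, feeding estimated runs from a run estimator into the same function, instead of actual runs, is the idea behind a 2nd order style number.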


Russ, can you explain base runs?

 

And how can it be better and easier than the simple runs created, 98% accurate model?

 

Well, simple runs created isn't 98% accurate, for one. It's a great "back of the envelope" calculation but can be improved upon by a pretty simple equation.

 

Here's a great thread on Base Runs, started by its inventor:

 

Base Runs


Every year I've figured it, it's been 98%+.

 

Are we calculating this differently?

2005, Basic Runs Created

TEAM          AB    OBP    SLG    Runs  sRC   diff  %corr
Boston        5626  0.357  0.454  910   912     2   99.8%
NY Yankees    5624  0.355  0.450  886   898    12   98.6%
Texas         5716  0.329  0.468  865   880    15   98.3%
Cincinnati    5565  0.339  0.446  820   841    21   97.4%
Philadelphi   5542  0.348  0.423  807   816     9   98.9%
St. Louis     5538  0.339  0.423  805   794   -11   98.7%
Cleveland     5609  0.334  0.453  790   849    59   92.6%
Toronto       5581  0.331  0.407  775   752   -23   97.0%
Oakland       5627  0.330  0.407  772   756   -16   97.9%
Atlanta       5486  0.333  0.435  769   795    26   96.7%
LA Angels     5624  0.325  0.409  761   748   -13   98.2%
Tampa Bay     5552  0.329  0.425  750   776    26   96.5%
Chicago Sox   5529  0.322  0.425  741   757    16   97.9%
Colorado      5542  0.333  0.411  740   758    18   97.5%
Baltimore     5551  0.327  0.434  729   788    59   91.9%
Milwaukee     5448  0.331  0.423  726   763    37   94.9%
Detroit       5602  0.321  0.428  723   770    47   93.5%
NY Mets       5505  0.322  0.416  722   737    15   97.9%
Florida       5502  0.339  0.409  717   763    46   93.6%
Chicago Cub   5584  0.324  0.440  703   796    93   86.8%
Kansas City   5503  0.320  0.396  701   697    -4   99.5%
Seattle       5507  0.317  0.391  699   683   -16   97.7%
Arizona       5550  0.332  0.421  696   776    80   88.5%
Houston       5462  0.322  0.408  693   718    25   96.5%
Minnesota     5564  0.323  0.391  688   703    15   97.9%
LA Dodgers    5433  0.326  0.395  685   700    15   97.9%
San Diego     5502  0.333  0.391  684   716    32   95.3%
Pittsburgh    5573  0.322  0.400  680   718    38   94.4%
San Francis   5462  0.319  0.396  649   690    41   93.7%
Washington    5426  0.322  0.386  639   674    35   94.5%

Most teams last year weren't within 98% correct, at least by the way I calculated it. The way I did is actually a bit misleading, since the difference can be +/- 2% and still be labeled as "98% accurate". I used:

 

1- ABS[(act-calc)/act]
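
If a spreadsheet isn't handy, the same calculation can be sketched in Python. The function names here are my own (hypothetical); the run estimate is the basic AB x OBP x SLG version discussed in this thread, and the accuracy measure is the formula just given. The usage line plugs in the Boston row from the 2005 table.

```python
def basic_runs_created(ab, obp, slg):
    """Basic runs created: AB x OBP x SLG."""
    return ab * obp * slg


def pct_correct(actual, calculated):
    """Accuracy as 1 - ABS[(act - calc) / act]."""
    return 1 - abs((actual - calculated) / actual)


if __name__ == "__main__":
    # Boston, 2005: 5626 AB, .357 OBP, .454 SLG, 910 actual runs.
    src = round(basic_runs_created(5626, 0.357, 0.454))
    print(src, f"{pct_correct(910, src):.1%}")  # prints: 912 99.8%
```

Note that this measure treats an overestimate and an underestimate the same way, which is why a miss of 2% in either direction shows up as "98%".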

 

if you need a program to do it within 10 minutes, it's too much for me.

 

Just Excel. Takes about 1 minute.


Miscellaneous thoughts:

  • Correlating to runs scored, Dan Fox puts RC formulas at .958 (simple) and .964 (current, more complicated Bill James formula).
  • Here's an example of a full season. Does this formula get more accurate with large samples?
      • 2004 AL: .333 OBP x .433 SLG x 78,731 AB = 11,352 vs. 11,358 actual runs
      • 2004 NL: .329 OBP x .423 SLG x 88,622 AB = 12,333 vs. 12,018 actual runs
  • Russ, you've mentioned (very minor) flaws with using OBP times SLG times AB. Can you refresh our memories?



Correlating to runs scored, Dan Fox puts RC formulas at .958 (simple) and .964 (current, more complicated Bill James formula).

 

Just to make clear (I know you know this already), the above values are r^2 values, not "percentage within", which Al brought up earlier.

 

Here's an example of a full season. Does this formula get more accurate with large samples?

 

Run estimators use aggregate stats, so context is largely stripped out (different estimators do this in different ways). For instance, a linear weights approach would say that a HR is worth 1.4 runs, with the .4 part coming from the average number of base runners on during those home runs. Well, if a team has a very high OBP, a HR is actually worth more, since there will be, on average, more base runners on. What about a big HR hitter in the 8th spot? He'll probably have fewer base runners on than would be expected by looking only at team OBP (as basic RC and others do).

 

As you use a larger and larger sample, these peculiarities begin to cancel each other out, making the results become more and more accurate. As a general rule of statistical analysis, larger sample sizes result in greater accuracy for this very reason. Individual data has random uncertainty that should average toward the actual value.
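
Here's a rough Python sketch of that canceling effect, using four rows copied from the 2005 table (actual runs and the basic RC estimate). The function names are mine, and the four teams are hand-picked for illustration, so treat it as a toy example rather than proof.

```python
# (team, actual runs, basic runs created estimate), from the 2005 table.
ROWS = [
    ("Boston", 910, 912),
    ("Cleveland", 790, 849),
    ("Toronto", 775, 752),
    ("Oakland", 772, 756),
]


def relative_error(actual, estimate):
    """Size of the miss as a fraction of actual runs."""
    return abs(actual - estimate) / actual


def mean_team_error(rows):
    """Average of the individual team errors."""
    return sum(relative_error(a, e) for _, a, e in rows) / len(rows)


def pooled_error(rows):
    """Error after adding all the teams together first."""
    total_actual = sum(a for _, a, _ in rows)
    total_estimate = sum(e for _, _, e in rows)
    return relative_error(total_actual, total_estimate)


if __name__ == "__main__":
    print(f"mean team error: {mean_team_error(ROWS):.2%}")
    print(f"pooled error:    {pooled_error(ROWS):.2%}")
```

The pooled miss comes out smaller than the average team miss because the overestimates and underestimates partly offset. A consistent bias in one direction would not cancel this way, no matter how big the sample.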

 

Russ, you've mentioned (very minor) flaws with using OBP times SLG times AB. Can you refresh our memories?

 

Well, all run estimators are "flawed" by default, since they try to model the results of a chain of probabilistic events with aggregate data. Unless you are using a simulator or Markov chain analysis, you are using a very simplified (essentially flawed) scoring model. Some are simply less flawed than others. How to measure "accurate" or "flawed" is subjective anyway. Dan Fox covers this in the article below:

 

A Closer Look at Run Estimation

 

No matter how advanced a run estimator is, it will never be perfect. Even if you know Lee is going to hit 45 home runs, no one can predict exactly WHEN he will and what the exact context will be when he does.

 

Sorry for the long rambling post.


Archived

This topic is now archived and is closed to further replies.
