Brewer Fanatic

Differing WAR numbers


I cringe when I see it used on MLB Network's Clubhouse Confidential to make swift judgements on a player. I LOVE that show, but I think sometimes there is not enough time to get into the nuances of these stats, and thus they get misused.

WAR is about giving results not showing how good a player is

 

How many times on this site have we seen it used to show how good a player is? It seems like the most used stat to show someone is better than someone else.

It seems like a lot of times people look at WAR over a multi-year span though when trying to decide if one player is better than another. I don't really have a problem with this method as a starting point.


It seems like a lot of times people look at WAR over a multi-year span though when trying to decide if one player is better than another. I don't really have a problem with this method as a starting point.

 

Yeah I would agree with this. If you are looking at 4 years of WAR it probably is going to be a decent comparison, not perfect but decent at least. Better than any other single number you can point to at least. I'm not sure it is better than just comparing offense and then fudging the numbers one way or the other to account for park, defense etc though. I just don't think they have a good grasp on adjusting those parts of the game yet, it is too subjective.


It seems like a lot of times people look at WAR over a multi-year span though when trying to decide if one player is better than another. I don't really have a problem with this method as a starting point.

 

Why would four years of a stat that uses an inaccurate number in its calculation be better than one year of a stat that uses an inaccurate number in its calculation? Until they fix the defensive metric to match the level of offensive accuracy it's useless. Useless as in it tells you nothing more than gut feelings or seeing those players twice in their careers does. Just because it's a stat that you can calculate does not mean it's better than nothing. When those calculations are based on inaccurate numbers to begin with, it is impossible to come up with something demonstrably better than nothing.

There needs to be a King Thames version of the bible.

It seems like a lot of times people look at WAR over a multi-year span though when trying to decide if one player is better than another. I don't really have a problem with this method as a starting point.

 

Why would four years of a stat that uses an inaccurate number in its calculation be better than one year of a stat that uses an inaccurate number in its calculation?

If the number is inaccurate when the sample size is small, it would stand to reason that if the sample size were larger the number would become more accurate.
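A quick sketch of that reasoning, assuming each season's measurement error is random, unbiased, and independent of the other seasons (the 6-run noise figure is made up for illustration, not an actual UZR error estimate):

```python
import math

def standard_error(single_season_sd: float, seasons: int) -> float:
    """Standard error of the average of `seasons` independent, equally noisy readings."""
    return single_season_sd / math.sqrt(seasons)

# Hypothetical: one season of a defensive metric carries 6 runs of random error.
SINGLE_SEASON_SD = 6.0

for n in (1, 2, 3, 4):
    # The error in the multi-season average shrinks with the square root of n.
    print(n, "season(s):", round(standard_error(SINGLE_SEASON_SD, n), 2), "runs")
```

Under those assumptions the noise in a four-year average is half the noise in a single year. None of this helps if the error is a consistent bias rather than random noise.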

If the number is inaccurate when the sample size is small, it would stand to reason that if the sample size were larger the number would become more accurate.

 

They use the same single season sample size three years in a row not three years of UZR for one year of offensive stats. Big difference. It's like not using enough hops in making beer. If you make three batches of beer with too few hops in each batch you cannot combine all three batches of beer together and get the right amount of hops to make it taste right. Same thing with WAR. Combining them together doesn't make one good one. It's the same inaccurate ratio multiplied by three.

There needs to be a King Thames version of the bible.

Let me try to address a few of the topics discussed here. Some of this has already been said but I will repeat it:

WAR is LITERALLY trying to estimate how many wins a particular player's performance over a certain time was worth over an average replacement-level player. It is just a framework, however. That framework can be implemented in any number of ways, and Fangraphs and BR have come up with two such implementations. I am sure there will be more and hopefully they will continue to improve the methodology.

WAR is attempting to estimate the value of performance, not the skill level of a player. If a 9 year vet with a career .200 BA throws up a .300 BA in his 10th year, we say he was a .300 batter last year but we estimate his actual skill level at something much closer to .200. We know that a player does not always play at their true skill level, and the most significant reason for that is the uncertainty associated with sample size.

UZR has a much larger uncertainty over one season than the offensive component of WAR. Two reasons. First, players get significantly fewer defensive opportunities than offensive opportunities over a given season. Lower sample size equals higher uncertainty. No metric can remove that uncertainty. Second, offensive metrics record what actually happened; play-by-play defensive metrics (like UZR) attempt to estimate the likelihood of an average defender making a particular play. That is based on a somewhat subjective estimate of ball trajectory and speed. Even if that estimate is unbiased, it will lower the precision (not accuracy) of the UZR estimate. If it is biased (and there is some evidence that it might be), it will lower both.

In summary, WAR probably represents the best publicly available framework for estimating the value of performance, and Fangraphs and BR offer reasonable implementations of that framework. There will be more, probably better implementations of it in the future, and the WAR framework will probably be supplanted altogether by something better someday. Combine that with whatever observational (scouting) based evidence you think you have on a player and that's going to be your best guess.

If the number is inaccurate when the sample size is small, it would stand to reason that if the sample size were larger the number would become more accurate.

 

They use the same single season sample size three years in a row not three years of UZR for one year of offensive stats. Big difference. It's like not using enough hops in making beer. If you make three batches of beer with too few hops in each batch you cannot combine all three batches of beer together and get the right amount of hops to make it taste right. Same thing with WAR. Combining them together doesn't make one good one. It's the same inaccurate ratio multiplied by three.

No, this is not what they are doing with WAR if you look at 3 years of data. You're not taking the same bad measurement and multiplying it by three. You're taking 3 individual measurements, and coming up with a sum total of them over 3 years. To use your recipe comparison: If you need a gallon of milk and 3 pounds of chocolate to make cookies, it really doesn't matter what ratios you mix them in initially as long as they are all combined in the end.

"I wasted so much time in my life hating Juventus or A.C. Milan that I should have spent hating the Cardinals." ~kalle8


No, this is not what they are doing with WAR if you look at 3 years of data. You're not taking the same bad measurement and multiplying it by three. You're taking 3 individual measurements, and coming up with a sum total of them over 3 years. To use your recipe comparison: If you need a gallon of milk and 3 pounds of chocolate to make cookies, it really doesn't matter what ratios you mix them in initially as long as they are all combined in the end.

 

Each individual WAR estimate is inaccurate because each single one has a bad metric in it. If we combine three bad readings we don't get one better one. We get a combination of three bad ones. Each individual year is off due to the UZR metric being used. You can't then just combine them and expect the result to be more accurate than its components.

 

If you need a gallon of milk and 3 pounds of chocolate to make cookies, it really doesn't matter what ratios you mix them in initially as long as they are all combined in the end.

 

Except you don't get the right ratio simply by combining the three. You get the same bad ratio as you started with. In your example you don't get one gallon of milk to three pounds of chocolate. You combined three recipes, all of which had the wrong ratio of milk to chocolate. If the recipe called for a 3-1 ratio of milk to chocolate and you combine three recipes with a ratio of 1-1, you don't get a 3-1 ratio.

 

I am sure there will be more and hopefully they will continue to improve the methodology.

 

When they do I will take more stock in its results. I am not against the goals nor the process. I am against the belief that it is of much use as it is now. What I don't understand is that there is a fairly simple fix. Just use three years of UZR to one season of offensive metrics. While not perfect, it at least uses apples to apples in its calculations. The other option would be to weigh UZR less so at least the bad metric isn't as heavily influential in the result. Maybe the best way would be to separate offense and defense and come up with two WARs. Then you could combine them in one overall number without the inaccuracies.

 

In summary, WAR probably represents the best framework available for estimating the value of performance that is publicly available

 

This would be where I disagree. We do have much more information available to us as we speak. Just not all in one number. If that one number is off then it makes no sense to say, well, it's the best single number we have so let's just accept it until there is a better one. If its methodology is off, and we all seem to accept it is because of the way they use defensive metrics, then its entire result cannot be accurate. I think using a framework that is known to be based off bad information is worse than using none at all. At least we won't be using it as something it isn't, which is to say anywhere near accurate. As it stands it isn't even on the right path to accuracy, so why keep going down that road?

There needs to be a King Thames version of the bible.

Except you don't get the right ratio simply by combining the three. You get the same bad ratio as you started with. In your example you don't get one gallon of milk to three pounds of chocolate. You combined three recipes, all of which had the wrong ratio of milk to chocolate. If the recipe called for a 3-1 ratio of milk to chocolate and you combine three recipes with a ratio of 1-1, you don't get a 3-1 ratio.

Except that it is expected that you put in a ratio of 3-1, 2-1, and 4-1 in the various batches. The three batches together gives you an accurate amount. Bad analogy, but just trying to make it more correct for our given situation.

3 to 4 years of UZR data isn't perfect, but it gives you a pretty accurate evaluation of that player's defensive value over that time. Just because you combined several individually inaccurate years doesn't mean the result is flawed as well.

Just because you combined several individually inaccurate years doesn't mean the result is flawed as well.

 

I understand what you mean, but it isn't accurate, because you change the value of UZR when you incorporate it into another stat and then apply the "multiply by three for accuracy" formula for UZR to that new stat. That new value is derived from combining stats in the wrong amounts to begin with, so it isn't going to represent the value of the original UZR stat simply by combining three years of WAR. It will instead give you some sort of hybrid number that was derived by that entire calculation. To expect something that has been changed prior to being multiplied to get you the same value as the original stat prior to combining it with others is just not correct mathematically.

There needs to be a King Thames version of the bible.

WAR is calculated from runs. UZR is calculated in runs. I'm not sure where you think there is some magical calculation process transforming UZR into something it isn't. They aren't squaring the numbers by themselves. They are multiplying them by a constant.

 

x(3+1+17) = 3x + x + 17x

Where are you getting this "multiply by 3 for accuracy" thing?

 

From Fangraphs:

Offensive players – Take wRAA and UZR (which express offensive and defensive value in runs above average) and add them together. Add in a positional adjustment, since some positions are tougher to play than others, and then convert the numbers so that they’re not based on league average, but on replacement level (which is the value a team would lose if they had to replace that player with a “replacement” player – a minor leaguer or someone from the waiver wire). Convert the run value to wins (10 runs = 1 win) and voila, finished!
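The quoted recipe can be sketched in a few lines. This is only an illustration of the arithmetic as described above, not Fangraphs' actual code; the function name, parameter names, and the sample inputs are all made up.

```python
RUNS_PER_WIN = 10.0  # the "10 runs = 1 win" rule of thumb from the quote

def position_player_war(wraa: float, uzr: float, positional_adj: float,
                        replacement_adj: float,
                        runs_per_win: float = RUNS_PER_WIN) -> float:
    """Hypothetical helper: add the run components, then convert runs to wins."""
    # Offense and defense are both already expressed in runs above average.
    runs_above_average = wraa + uzr + positional_adj
    # Shift the baseline from league average to replacement level.
    runs_above_replacement = runs_above_average + replacement_adj
    return runs_above_replacement / runs_per_win

# Made-up season: +20 batting runs, +5 fielding runs, -2.5 positional runs,
# and +20 runs for the gap between average and replacement level.
print(position_player_war(wraa=20.0, uzr=5.0, positional_adj=-2.5, replacement_adj=20.0))
```

Everything is a plain sum of run values divided by a constant, so a noisy UZR input carries its noise straight through to the final number, but nothing in the calculation transforms it.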

"I wasted so much time in my life hating Juventus or A.C. Milan that I should have spent hating the Cardinals." ~kalle8


I guess I am not explaining myself very well. Problem is I don't know how else to do it. Both offense and defense are based on runs but one is roughly three times more accurate than the other in each single WAR. When you use one measure at three times the accuracy rate of the other and count them as equal you then get a totally separate number than if you used each component at equivalent accuracy rates. To use three years of WAR calculated with equally weighted stats even though they are not equally accurate and think you get the same result as if you used one year of properly weighted stats is not accurate.

Let's put it this way. If you calculate WAR using the methods of today and you get three years of WAR like 2, 3, 2, it would not be the same as using three years of UZR for every year of offensive stats in any single year of WAR. If you have different single-year WARs from each way of calculating it, you don't suddenly get the WAR of a properly weighted WAR by adding three years of the faultily weighted calculation.

Unless each calculation uses the same numbers each will have a different outcome. My argument is you cannot assume three years of a different method of calculations will be the same just because you now have a combined accuracy equivalent of one stat since that combined stat is not in each year of calculation.

 

What you posted from Fangraphs tells us how they did it but fails to tell us why using the stats they did in the measure they did was proper. It certainly doesn't tell us that three years of it balances out its poor use of unequal accuracy levels in equal measure.

There needs to be a King Thames version of the bible.

If your UZR is 15, 10 and 5 over a 3 year period, that is going to add up to 30 runs, or 10/year.

If your offensive Runs are 30, 25, and 20, that is going to add up to 75 runs, or 25/year.

I don't see how you are going to get anything but 35 runs per year when you add those up in any way you choose.
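Spelled out with the numbers above (the variable names are just for illustration):

```python
uzr = [15, 10, 5]        # defensive runs in each of the three seasons
offense = [30, 25, 20]   # offensive runs in each of the three seasons

# Route 1: total each season first, then average the season totals.
season_totals = [o + d for o, d in zip(offense, uzr)]
avg_of_totals = sum(season_totals) / len(season_totals)

# Route 2: average each component over three years, then add the averages.
sum_of_avgs = sum(offense) / len(offense) + sum(uzr) / len(uzr)

print(season_totals)   # per-season runs
print(avg_of_totals)   # runs per year via route 1
print(sum_of_avgs)     # runs per year via route 2
```

Both routes land on 35 runs per year, because adding and averaging can be done in either order.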

 

If you think that 3 years of UZR is worthless and inaccurate, then that's a completely different story. If you think that defense is improperly weighted, that's a different story as well.

"I wasted so much time in my life hating Juventus or A.C. Milan that I should have spent hating the Cardinals." ~kalle8


If you think that defense is improperly weighted, that's a different story as well.

 

That's what I was saying all along. I think we have been talking past each other. They use one year of UZR (defense) to one year of offense (let's call it calc1). To get an accurate reading you have to properly weigh defense to match offense. That means you have to use three years of defense to one of offense in each single calculation to get accurate results (let's call that calc2). If you take three years of calc1 and average them you get one number. If you take three years of calc2 and average them you do not get the same number as you would in calc1. If the two stats don't average out the same in the same amount of time then one is not the same as the other.
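Here is one possible reading of calc1 and calc2 in code, reusing the 15/10/5 UZR and 30/25/20 offense figures posted earlier in the thread. How calc2 picks its three defensive seasons is an assumption, so both options are shown, and the two earlier UZR seasons are invented purely for illustration.

```python
offense = [30, 25, 20]   # offensive runs for seasons 1-3
uzr = [15, 10, 5]        # single-season UZR for seasons 1-3
earlier_uzr = [-5, 0]    # hypothetical UZR for the two seasons before season 1

def mean(values):
    return sum(values) / len(values)

# calc1: each season pairs its own offense with its own UZR.
calc1 = [o + d for o, d in zip(offense, uzr)]

# calc2, option A: every season uses the average UZR of the same three seasons.
same_window_uzr = mean(uzr)
calc2_same_window = [o + same_window_uzr for o in offense]

# calc2, option B: every season uses a trailing three-year UZR average,
# which reaches back into the earlier (hypothetical) seasons.
history = earlier_uzr + uzr
calc2_trailing = [o + mean(history[i:i + 3]) for i, o in enumerate(offense)]

print(mean(calc1), mean(calc2_same_window), mean(calc2_trailing))
```

Under option A the three-year averages of calc1 and calc2 are identical and only the individual season values change. Under option B they differ, but only because seasons outside the three-year span enter the calculation.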

There needs to be a King Thames version of the bible.

You'd perhaps want to regress UZR but you can't use multiple years if you want it to tell you how that player did in a particular year.

 

That's fine but using something that is known to be inaccurate isn't gonna tell you something accurate. If we accept that then why should we accept that it actually tells you how someone did in a particular year? I think you can use three years of UZR and still be at least a little more indicative of the particular season than the way it is done now. I guess it depends on how much variance one thinks a player has defensively from one season to the next. I tend to think defense varies less than offense so I don't really have a problem with using three years. It won't be entirely indicative of that season but the question isn't whether it is accurate or not but how much closer to accurate it is. If we want it to be all about a single season then we have to go back to the drawing board and find a way to measure defense in a single season. If we can't then WAR should be scrapped altogether. Good idea but if it's not workable in the real world then why keep using it?

I'm really not trying to be a wienie about this. It's more that using advanced metrics is very useful but only when it starts with good information. When we don't it leads to less understanding not more. Bad information is worse than none at all.

 

Really, though, Fangraphs breaks up the WAR components. Weight them as you see fit.

 

What's the proper weight for inaccurate information? There is nothing we know of that makes UZR in that amount accurate. Any amount of inaccuracy is bound to make a measure less accurate. The only thing we can do with it is reduce the bad effect but if we use single season defensive metrics we are going to have to use inaccurate information which does nothing but distort the reading.

There needs to be a King Thames version of the bible.

Can I ask anyone that's either posting in or reading this thread -- are you looking for a stat to describe a player's value in a given timeframe, or are you looking for a stat to describe true talent? Because it seems like a lot of the discussion so far here about WAR, & specifically its use of UZR, really is calling for a defensive stat that depicts true talent. Please correct me if I'm wrong, but that's the general sense I'm getting. And if that's the kind of stat that people are after, I'm not sure it can be found.

 

One thing that stands out from this discussion is that the statistics we have to describe offense are pretty well-understood & reasonably accurate portrayals of production or value... and the defensive stats we have currently aren't. I honestly don't think WAR is used as a be-all, end-all stat as much as people sometimes complain it is. It gives a pretty solid idea of what kind of production a player turned in in a given season, but if you don't find it a good relative measure, use a good offensive stat plus a well-informed scouting report+eyeball test on defense.

Stearns Brewing Co.: Sustainability from farm to plate

Because it seems like a lot of the discussion so far here about WAR, & specifically its use of UZR, really is calling for a defensive stat that depicts true talent. Please correct me if I'm wrong, but that's the general sense I'm getting. And I think if that's the kind of stat that people are after, I'm not sure it can be found.

 

Probably not. At least not with one year of data. The number of chances will always be a problem with defensive stats even with perfect inputs. With perfect inputs though it would tell you about how many runs saved or lost over a season. FieldFX would help with that but it may not be publicly available anytime soon.

Fan is short for fanatic.

I blame Wang.


Because it seems like a lot of the discussion so far here about WAR, & specifically its use of UZR, really is calling for a defensive stat that depicts true talent. Please correct me if I'm wrong, but that's the general sense I'm getting. And I think if that's the kind of stat that people are after, I'm not sure it can be found.

 

If it can't be found then WAR can't be accurate. If WAR can't be accurate why keep viewing it as telling us something useful? In other words WAR is useless.

There needs to be a King Thames version of the bible.

You'd perhaps want to regress UZR but you can't use multiple years if you want it to tell you how that player did in a particular year.

 

That's fine but using something that is known to be inaccurate isn't gonna tell you something accurate.

It's definitely imprecise over one season. It's not necessarily inaccurate, however. That is a very important distinction:

As I mentioned earlier, if the underlying batted ball trajectory and speed data that UZR uses is biased (and there is some evidence that it is), that would decrease its accuracy. Lower sample size decreases precision.
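A small simulation may make the distinction concrete. All of the numbers here (a true value of 5 runs, 6 runs of random noise, a 3-run bias) are invented for illustration and say nothing about how large UZR's actual error or bias is.

```python
import random
import statistics

def simulate(true_value, noise_sd, bias, n, seed=0):
    """Draw n noisy readings of true_value, each shifted by a constant bias."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]

TRUE_RUNS = 5.0  # hypothetical true defensive value in runs

# Imprecise but accurate: lots of scatter, centered on the truth.
unbiased = simulate(TRUE_RUNS, noise_sd=6.0, bias=0.0, n=5000)
# Imprecise and inaccurate: same scatter, centered 3 runs too high.
biased = simulate(TRUE_RUNS, noise_sd=6.0, bias=3.0, n=5000)

print("unbiased mean:", round(statistics.mean(unbiased), 1))
print("biased mean:  ", round(statistics.mean(biased), 1))
print("spread of each:", round(statistics.pstdev(unbiased), 1))
```

With enough readings the unbiased set averages out close to the true value, while the biased set stays about 3 runs off no matter how many readings are collected. More data fixes imprecision; it does not fix bias.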

 


There are 2 things we can look at though when we are talking about stats. Production and true talent. WAR isn't necessarily trying to measure true talent. It is trying to estimate production. I think it does a very good job of that.

Fan is short for fanatic.

I blame Wang.


It's definitely imprecise over one season. It's not necessarily inaccurate, however. That is a very important distinction:

 

Fair point. I'll try not to use them interchangeably anymore. As far as the precision goes I'm still not convinced using three years of UZR does anything but make the single season measurement better even if it isn't all compiled in that season. Take McGehee for example. His UZR and UZR/150 last season were 6.5 and 7.3. Does that sound more indicative of his defense last season than his career line of -4.9 and -2.1?

There needs to be a King Thames version of the bible.

Archived

This topic is now archived and is closed to further replies.
