Brewer Fanatic

UZR as it pertains to Hall/Weeks discussions


TheCrew07

I wanted to break this out into its own thread.

 

This isn't some "fan boy" opinion of the system; I've actually taken the time to understand how it was created and exactly what it does. I feel it leans too heavily towards range because it uses weighted averages that don't cancel out.

 

UZR is primarily based on fielding zones and uses Retrosheet data as opposed to STATS data; if you're interested enough to care, the differences are well documented elsewhere.

 

To begin, the following is from MGL's explanation of his metric:

The entire field is broken down into 78 zones. These are the same zones you can find in the hit location diagram in the documentation section of the Retrosheet website (www.retrosheet.org). Of these, UZR uses 64 of them. For infielders, only ground balls, including bunts, are looked at. Pop flies caught or landing on an infield zone are excluded, as are line drives caught by an infielder or hit through the infield. For outfielders, all fair fly balls and line drives are included. None of the foul zones are used in UZR except for 3F and 3L, which are near the first and third base bags (for fair ground balls fielded in foul territory behind the bags). Catchers and pitchers are not included in UZR ratings.

For each zone, the computer keeps track of the following on a league-wide (for a particular year) basis:

  1. The number of hits in that zone.
  2. The average run value of a hit in that zone (using traditional lwts hit values).
  3. The number of outs recorded in that zone for each fielding position.

He's only working with the data that's available to him, and I feel the zone concept is the main problem. Instead of true positioning data, the two systems mentioned above provide zone data. This turns what could be an easy linear weighted system into the weighted averages described below, using ROE and non-ROE errors. If the data were provided relative to the position of the fielder, a very simple sliding scale would work wonderfully.

 

At the same time, the computer keeps track of the total number of fielding errors for each fielding position, but not for each zone individually. Actually it compiles fielding errors in two separate categories: One, ROE errors, are fielding errors that result in an ROE. All other errors, such as on a hit, or a second error on an ROE, are called non-ROE errors.
For now, all ROE errors are treated as outs (this is necessary because, at first, we use outs and ROEs to establish the number of balls that a fielder gets to and because some ROE errors are committed by the receiver of a throw, in which case the fielder needs to get credit for an out). In the above data, "outs" actually refers to "outs plus ROE errors". Errors (ROE and non-ROE) will be accounted for later on.

Basically the calculations are all averages from the start. An average fielder makes so many plays in each zone; take the percentages either way for runs saved and, bingo, you have + or - runs for each zone.

 

First we establish the out rate for all ground balls hit into zone 56. That is 1419 divided by 2474 (1419 plus 1055), or .57. That is, 57% of all ground balls hit into zone 56 in 2002 were turned into outs (by all fielders).
Therefore, the "extra" value of a "caught ball" by a fielder in zone 56 is 1 minus .57, or .43 balls. Since Bordick caught 18 balls in zone 56, he has 18 times .43, or 7.7 "extra" caught balls so far.
Well, since an average SS catches 294 balls in zone 56 out of 1419, or 20.7% of the outs, Bordick is responsible for 20.7% of the 79 hits as well, or 16.4 hits (the third baseman is responsible for the other 62.6 hits).
Since Bordick is responsible for 16.4 of the 79 hits in zone 56, he has 16.4 times .57, or 9.4 "negative" caught balls added to his 7.7 "positive" ones, for a total of -1.7 "extra" caught balls.

 

Now we want to convert those "extra" balls into runs saved or cost. For that, we use the average run value of a hit in zone 56 - which is .47 runs. Since a 2002 AL out is worth -.29 runs, the "swing" between an out and a hit is .47 plus .29, or .76 runs. Since Bordick caught 1.7 fewer balls in zone 56 than an average SS, he has cost his team 1.7 times .76, or 1.3 runs so far.
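
For anyone who wants to check the zone 56 arithmetic, here is a minimal Python sketch of the steps quoted above. The function and variable names are my own (they are not from MGL's article), and the 79 is simply the hit count the article charges against Bordick's zone; treat it as an illustration of the quoted numbers, not MGL's actual code.

```python
# Zone 56 figures quoted above (2002, all fielders).
ZONE_OUTS = 1419        # outs plus ROE errors recorded in zone 56
ZONE_HITS = 1055        # hits recorded in zone 56
SS_ZONE_OUTS = 294      # the share of those outs made by shortstops
HIT_RUN_VALUE = 0.47    # average run value of a hit in zone 56
OUT_RUN_VALUE = -0.29   # run value of a 2002 AL out


def zone_range_runs(caught, hits_in_zone):
    """Return (extra caught balls, runs saved) for one SS in zone 56."""
    out_rate = ZONE_OUTS / (ZONE_OUTS + ZONE_HITS)   # about .57
    ss_share = SS_ZONE_OUTS / ZONE_OUTS              # about 20.7%
    credit = caught * (1 - out_rate)                 # "positive" caught balls
    blame = hits_in_zone * ss_share * out_rate       # "negative" caught balls
    extra_balls = credit - blame
    swing = HIT_RUN_VALUE - OUT_RUN_VALUE            # .47 + .29 = .76 runs
    return extra_balls, extra_balls * swing


extra, runs = zone_range_runs(caught=18, hits_in_zone=79)
print(f"extra caught balls: {extra:.1f}, runs: {runs:.1f}")
```

Rounded to one decimal, this gives -1.7 extra caught balls and -1.3 runs, matching the figures in the quote.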

If we do this for every zone in which any SS made at least one out (i.e., the applicable SS zones), and we add up all the runs Bordick saved or cost in each zone, we get a total of +6.2 runs, or 6.2 runs saved by Bordick while playing SS (he must have done well in the other zones).

Again, keep in mind that all this is done using weighted averages between runs and hits, etc. It's not possible to distinguish between easy and hard plays, so adjustments are made for the handedness of batter and pitcher, speed of the ball, runners on base and outs, park factors, and the G/F tendencies of the pitcher.

The average SS committed 169 ROE errors in 5218 balls gotten to in all zones. That is an error rate of 169 divided by 5218, or .032. Since Bordick got to a total of 277 balls in all zones, he should have committed .032 times 277, or 8.9 errors. Instead, Bordick committed only 1 error, for a net gain in errors of 7.9. Since an infield error is worth around .49 runs, the swing between an error and an out is .49 plus .29, or .78 runs. Therefore, Bordick saved another .78 times 7.9, or 6.2 runs, by virtue of his "good hands". So far, we have Bordick saving 6.2 runs with his range and another 6.2 runs with his sure hands.

The average SS committed 45 non-ROE errors and Bordick none. If we do the same calculations as above, using .3 as the value of a non-ROE error, we come up with Bordick saving another .72 runs. So it looks like even at the ripe old age of 36, Mike Bordick saved his team last year a total of 13 runs by virtue of his outstanding play (range and hands) at SS!
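
The error portion can be checked the same way. This is a rough sketch using only the numbers quoted above; the helper name `error_runs_saved` is made up for illustration. It does not round intermediate values the way the article does (the article rounds the error rate to .032), so the expected ROE errors come out nearer 9.0 than 8.9, but the run totals still land on the same rounded figures.

```python
LEAGUE_SS_BALLS = 5218   # balls gotten to by all SS (outs plus ROE errors)
BORDICK_BALLS = 277      # balls Bordick got to in all zones


def error_runs_saved(league_errors, player_errors, runs_per_error):
    """Runs saved by committing fewer errors than an average SS would."""
    error_rate = league_errors / LEAGUE_SS_BALLS
    expected_errors = error_rate * BORDICK_BALLS
    return (expected_errors - player_errors) * runs_per_error


range_runs = 6.2                                   # range total quoted above
roe_runs = error_runs_saved(169, 1, 0.49 + 0.29)   # error-vs-out swing of .78
non_roe_runs = error_runs_saved(45, 0, 0.30)       # .3 runs per non-ROE error
total = range_runs + roe_runs + non_roe_runs
print(f"ROE: {roe_runs:.1f}, non-ROE: {non_roe_runs:.2f}, total: {total:.1f}")
```

That works out to roughly 6.2 runs from ROE errors, .72 from non-ROE errors, and about 13 runs in total once the 6.2 range runs are added.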

Since the field is broken into zones, range and fielding must be calculated separately, and the range factor is somewhat loosely calculated based on the total number of plays. This isn't a flaw in the methods as much as it is the way the data is reported. If balls in play were recorded as + X from the fielder, it would allow much simpler calculations at all relative ranges, giving a much more exact set of data as well as removing many of the adjustments, such as those for shifts.

Ultimately, the best way to construct a UZR rate which represents the true value of a fielder in comparison to the UZR rate of an average fielder, and is the equivalent of UZR runs, is the following: We'll use Mike Bordick as the example again.

First we take the simple ZR for all SS's, including ROE errors as outs. This is the number of total outs plus ROE errors by all SS's, divided by the total number of "chances" (outs and ROE errors plus the number of hits a SS is responsible for). In 2002, SS outs plus ROE errors were 5218, and SS "chances" were 6786, for an average ZR of .769.

Next, we multiply that average ZR of .769 by Bordick's total "chances", which was 354. The result is 272. That is the number of outs plus ROE errors that an average SS would make given those 354 "chances". Bordick, on the other hand, got to (outs and ROE errors) 8.1 more balls than the average SS, for a total of 280.1 balls "gotten to" (272 plus 8.1). Of those 280.1 balls "gotten to", he committed only one error, for a total of 279.1 "outs". As you can see, those 280.1 balls "gotten to" and 279.1 "outs" are a "fiction", as Bordick actually got to 277 balls and recorded 276 outs.

Nevertheless, if we divide the 279.1 "fictional outs" by his 354 chances, we get a UZR rate for Bordick of .788, which should correspond almost exactly to his 13 total runs saved (UZR rate doesn't account for non-ROE errors [UZR runs does], but we could certainly "fudge it in" if we wanted to).
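
The UZR rate steps can be sketched as follows. Again the names are hypothetical and the 8.1 extra balls is taken straight from the quote as an input rather than derived here. The expected-outs figure is rounded to a whole number only because the article does so (272).

```python
LEAGUE_SS_OUTS = 5218      # SS outs plus ROE errors, 2002
LEAGUE_SS_CHANCES = 6786   # SS "chances", 2002


def uzr_rate(chances, extra_balls, roe_errors):
    """Return (average ZR, UZR rate) following the steps quoted above."""
    avg_zr = LEAGUE_SS_OUTS / LEAGUE_SS_CHANCES
    expected = round(avg_zr * chances)    # the article rounds this to 272
    gotten_to = expected + extra_balls    # "fictional" balls gotten to
    outs = gotten_to - roe_errors         # "fictional" outs
    return avg_zr, outs / chances


avg_zr, rate = uzr_rate(chances=354, extra_balls=8.1, roe_errors=1)
print(f"average ZR: {avg_zr:.3f}, Bordick UZR rate: {rate:.3f}")
```

To three decimals this reproduces the .769 league ZR and the .788 rate for Bordick.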

In part 2 he goes on to detail how the adjustments are done, which is fairly fascinating in itself but outside the scope of what I want to discuss.

 

Early on he's assigned essentially equal value to range and fielding. I'll agree that range is a fundamental component of defense, but I don't personally feel it's as valuable as the ball being fielded. It doesn't matter how many balls a player can get to if he doesn't convert them into outs. The way UZR is constructed, a player like Hall will come out as a + defender because range is considered an equal partner, when I feel range is only significant beyond "average". Since we're talking fielding zones there's no context, and while intuitively it makes sense that the better range a player has the more outs he will convert, I'm just not sold that working the equation backwards gives you a true representation of range.

 

This is why I would favor a sliding scale based on range... the closer the ball is to the fielder the greater the negative impact, and the farther away from the fielder the greater the positive impact. This would allow for a true measure of the value of range, as opposed to the community's best guess. Let's say we're measuring by feet: is it reasonable to expect all MLB quality fielders at 3B to make a play on a ball 6 feet away? 7? How far out do we go until we reach + range, range beyond average? Shouldn't just the outs made at + range factor into the discussion about how valuable range is? At + range we're also dealing with the fewest number of chances, meaning it's statistically less relevant in comparison to the rest of the league. The plays made there will separate good from great from elite defenders, but the percentage of plays will be relatively small. The small percentage of plays at range shouldn't make range approximately as valuable as actually making a play, in my opinion.
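
One possible reading of that sliding scale, sketched in Python. Everything here is an assumption for illustration: the distance buckets and out rates are invented, not real data, and the crediting rule (a made play earns 1 minus the out rate at that distance, a missed ball costs the out rate) is just one way to get the "closer misses hurt more, farther plays help more" behavior described above.

```python
# HYPOTHETICAL out rates by distance (feet) from the fielder's starting
# position. These numbers are made up for illustration, not measured.
OUT_RATE_BY_DISTANCE = {3: 0.98, 6: 0.90, 9: 0.70, 12: 0.40, 15: 0.15}


def play_credit(distance_ft, made_play):
    """Credit a made play with (1 - out rate); charge a miss with -out rate."""
    p = OUT_RATE_BY_DISTANCE[distance_ft]
    return (1 - p) if made_play else -p


# A booted ball 3 feet away costs almost a full play; a play made
# 15 feet away earns most of one.
print(play_credit(3, made_play=False))
print(play_credit(15, made_play=True))
```

With these made-up rates, routine plays near the fielder are worth almost nothing when made and nearly a full play when missed, so the rating only moves much on errors and on plays at the edge of a fielder's range.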

 

Unfortunately the data is what it is, and the metrics do the best they can with what they have, but I think it's certainly open to debate whether or not Hall is a + defender at 3B when his errR component of defense at 3B is -2.2 but his rngR is 6.2. In fact, Bill doesn't have a single positive fielding calculation within UZR at any position in any year; range accounts for all his positive value as a defensive player as measured by UZR. Since I'm not convinced that range carries equal value to fielding, it's fair to debate whether or not Hall is actually a good defender, and at the very least it's reasonable to question his poor fielding habits.

"You can discover more about a person in an hour of play than in a year of conversation."

- Plato

"Wise men talk because they have something to say; fools, because they have to say something."

- Plato


Since I'm not convinced that range carries equal value to fielding

 

Here is where most people disagree with you. An out is an out in my opinion. Two players making the same number of outs with the same chances should be equal. If Hall boots an easy one that most players make that should be equal to Hall making a tough play most players can't make.

 

at the very least it's reasonable to question his poor fielding habits.

 

I agree with this. He does have bad habits but, like Weeks, he has the range to make extra plays many players cannot. That is why I liked Weeks better at 2B than Durham. Weeks missed easy plays that Durham would have made, but Durham missed plays that relied on range that Weeks makes easily.

 

Hall shows up around average in other metrics. Most people looking at Hall as a defender are not just looking at UZR. There is also Dewan's plus/minus system which Hall does well in. I can't believe I have to defend Bill Hall and his defense.

Fan is short for fanatic.

I blame Wang.


  • 4 weeks later...

I honestly can't follow almost any of your post. Now, I am no UZR expert, so that may be part of the problem. Here's what I do know:

 

First of all, here's a link to part 1:

 

http://www.baseballthinkf...n/lichtman_2003-03-14_0/

 

For the record, he's not using Retrosheet data, just their zone definitions. I think you'll find little or no zone information in the Retrosheet files. He's always had to buy his data. I believe he used Baseball Info Solutions' data, although I've seen it also calculated with another data set (STATS?).

 

Second, I can't follow your logic that he's somehow subjectively weighting range and arm (is that what you mean by fielding?). He explains why he has to pull out the errors and how he puts them back in there. It sounds reasonable to me. Also keep in mind that this article was written in 2003. I'm sure he's refined his methodology since then.

 

One of your major beefs is that UZR doesn't know where the fielder is positioned, and I think that's a valid concern. MGL has tried to say that good positioning is part of being a good defender. I don't really buy that, though, since a fielder's position is often dictated by the coaching staff. I guess we'll have to wait until positioning is recorded. Someday soon, I hope.

 

All your other points... I just don't understand. I think you have some misunderstanding of how it works, especially considering some comments you made in the main forum.


Baseball Info Solutions records infielder position, at least to an extent. Not for every play, but when Ortiz/Dunn/whoever hits into an exaggerated shift, the positioning is noted.