Wednesday, March 21, 2007

Runner tagging from third, here's the throw...

A few folks have been writing about measuring an outfielder's arm, most notably John Walsh at Hardball Times. Walsh's method can be found here, but the relevant information goes something like this:

There are many different kinds of plays that require an outfielder to use his arm, and it probably isn't possible to take all of them into account. To keep the analysis manageable, I've isolated five different outfield plays that I use to measure the prowess of outfield arms:

  1. S-1B: A single is hit to the OF with a runner on 1B and 2B unoccupied.
  2. S-2B: A single is hit to the OF with a runner on 2B.
  3. D-1B: A double is hit to the OF with a runner on 1B.
  4. F-3B: An OF fly is caught with a runner on 3B, fewer than 2 outs.
  5. F-2B: An OF fly is caught with a runner on 2B and 3B unoccupied, fewer than two outs.

For each play that falls into one of these categories, I classify the play into one of three possible outcomes:

  1. Kill: an assist was recorded by the outfielder
  2. Hold: the runner did not take the extra base
  3. Advance: the runner took the extra base

On an intuitive level, it makes sense. An outfielder's job in that situation is either to scare the runner so much that he doesn't dare try for the extra base or, if he does run, to throw him out. Walsh's system directly measures how often the runner was scared, and if he ran, how many times he was gunned down. The problem is that there's more to that job than just being able to throw well (i.e., far enough, and on-target).

Let's take a look at the situation of a double with a runner on first base, but this time from the perspective of the runner (or more likely, the third base coach). Whether or not he's even waved home will be a function of several factors: where the ball was hit (and how far away that is from home plate), how quickly the fielder gets to the ball, where on the basepaths the runner is when the fielder picks the ball up, how fast the runner is, and finally what the third base coach believes about the outfielder's arm, as well as how cautious/risk-seeking he is in deciding to send runners. The coach may also be considering who's on deck and whether it might be the wiser choice to stop the runner and leave it up to the next hitter. If the runner goes, then the result of the play will depend on how far the throw has to travel, the runner's speed and location when the throw is made, and the catcher's abilities in blocking the plate, as well as the outfielder's actual throwing abilities. I can grant that some of these are un-measurable (at least without donating my firstborn to Baseball Info Solutions), or that the errors are randomly distributed (i.e., over time, things even out).

The one that most concerns me is the distance that the throw has to travel, specifically because parks are shaped rather differently in the outfield. (Old Tiger Stadium and its 440-foot CF fence come to mind.) A short left field fence not only means that right-handed pull hitters salivate, but also that left fielders have a shorter range of throws that they will have to make.

Does the distance that the throw has to travel affect whether the situation will result in a hold, kill, or advance? With so many other variables to consider, you might think it hard to disentangle such things. However, baseball provides us with a wonderful natural experiment. Consider the fly ball to the outfield with a runner on third and fewer than two outs (in other words, the would-be sacrifice fly). In every case, we can standardize how far away the runner is from home plate (90 feet) when the fielder has the ball in his hand. Thanks to Retrosheet, we have data on where the ball is (roughly). Through the use of some simple trigonometry, the Project Scoresheet hit location grid (used by Retrosheet), a little knowledge of the makeup of a baseball diamond, and the outfield dimensions of the park in question, it is possible to at least estimate how far away from home plate the central point of each zone is. It's not perfect, but it's good enough for government work. (If you'd like to know the specifics of my method, I will send them to you... or perhaps post them at a later date.)
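For the curious, here is a rough sketch of the kind of calculation involved. It is not the exact method used for this post (those specifics aren't given here). It assumes a zone's central point can be described by an angle measured from the left field line and a fraction of the way out to the fence, and that the fence distance between posted markers can be interpolated in a straight line. The function names, the zone description, and the park numbers are all made up for illustration and do not describe a real park.

```python
def fence_distance(angle_deg, markers):
    """Interpolate the fence distance (feet) at an angle from the LF line.

    markers is a list of (angle_deg, distance_ft) pairs, e.g. the posted
    distances down the lines and to center. Linear interpolation between
    neighbouring markers is an assumption of this sketch.
    """
    points = sorted(markers)
    for (a0, d0), (a1, d1) in zip(points, points[1:]):
        if a0 <= angle_deg <= a1:
            t = (angle_deg - a0) / (a1 - a0)
            return d0 + t * (d1 - d0)
    raise ValueError("angle is outside the range covered by the markers")


def zone_center_distance(angle_deg, depth_fraction, markers):
    """Estimated distance (feet) from home plate to a zone's central point.

    depth_fraction is how far toward the fence the zone center sits
    (0.0 = home plate, 1.0 = the fence). Hypothetical parameterization.
    """
    return depth_fraction * fence_distance(angle_deg, markers)


# Hypothetical park: 330 ft down each line, 400 ft to dead center.
example_park = [(0.0, 330.0), (45.0, 400.0), (90.0, 330.0)]

# A made-up left-center zone, halfway between the LF line and center,
# three quarters of the way to the fence.
estimate = zone_center_distance(22.5, 0.75, example_park)
print(round(estimate, 2))
```

The same zone in a park with a shorter left field would come out closer to home plate, which is the whole reason park dimensions enter the picture.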

To answer the question, I took the PBP data for 1993 to 1998, and selected out all instances in which, with a runner on third, a fly ball or line drive was hit to and caught by an outfielder with fewer than two outs. There were 9,415 such instances. In those instances, 84% of the time (7,910) the runner broke for home. 16% of the time, he stayed.

Did how deep the fly ball went have something to do with whether or not he went? I ran a binary logit regression on whether or not the runner went, with the fielder's estimated distance (in feet) from home plate as the predictor. Distance was a significant predictor (as might be expected... runners are more likely to tag on a deep fly than a shallow one), with a beta weight of .056.
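To put that .056 in more familiar terms: if it is the raw, per-foot coefficient from the logit (that is my reading of "beta weight", and it is an assumption), then each extra foot of depth multiplies the odds of the runner going by exp(.056). The intercept isn't reported, so this can't be turned into actual probabilities, only into odds multipliers. The helper name below is made up for illustration.

```python
import math


def odds_multiplier(beta_per_foot, extra_feet):
    """Factor by which the odds of running change for extra_feet of depth.

    Assumes beta_per_foot is an unstandardized logit coefficient per foot.
    """
    return math.exp(beta_per_foot * extra_feet)


BETA_GO = 0.056  # coefficient reported in the post for "did the runner go?"

per_foot = odds_multiplier(BETA_GO, 1)
per_ten_feet = odds_multiplier(BETA_GO, 10)

print(f"1 extra foot:  odds x {per_foot:.3f}")
print(f"10 extra feet: odds x {per_ten_feet:.2f}")
```

Under that reading, ten extra feet of depth raises the odds of the runner tagging by roughly three quarters.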

Surprisingly, the Nagelkerke R-squared value was .495, suggesting that nearly half of the decision of whether or not the runner would run was based on where the ball was hit, before considerations of arm were taken into account. Also, using a cut-point of 50% probability, the model correctly predicted 90.1% of the observed cases. Clearly, runners think a lot about where the ball is before they try running home.
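For anyone who hasn't run into it, the Nagelkerke R-squared is computed from the log-likelihoods of the fitted model and of an intercept-only model. Here is a minimal sketch of the standard formula. The null log-likelihood can be worked out from the counts above (7,910 went out of 9,415); the fitted model's log-likelihood isn't reported in this post, so the sketch only checks the two endpoints. It also prints the accuracy you'd get by always guessing "he goes", which is the natural yardstick for the 90.1% figure. Function and variable names are my own.

```python
import math


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo R-squared from two log-likelihoods and sample size."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Counts taken from the post.
n_flies = 9415
n_went = 7910
n_stayed = n_flies - n_went

# Log-likelihood of the intercept-only model (predicts the base rate).
p_go = n_went / n_flies
ll_null = n_went * math.log(p_go) + n_stayed * math.log(1.0 - p_go)

# Accuracy from always predicting the more common outcome.
baseline_accuracy = max(n_went, n_stayed) / n_flies

print(f"always-goes baseline accuracy: {baseline_accuracy:.1%}")
print("no improvement over null:", nagelkerke_r2(ll_null, ll_null, n_flies))
print("perfect fit:", nagelkerke_r2(ll_null, 0.0, n_flies))
```

So the 90.1% hit rate is an improvement of about six points over simply assuming every runner tags.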

I should take a look at runner speed scores in that regression to see what that does, but I need to get to bed (perhaps this weekend).

Now, of the 7910 cases where the runner broke for home, 13 (.16%) of them resulted in a low-probability event (run down, another runner caught at another base). Of the remaining 7897, 2.9% of them (228) resulted in an out at home plate, with the rest (97.1%) resulting in the runner from third scoring. Again, runners appear to be very careful to pick their situations and seem to do a good job of it.
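If you want to check my arithmetic, the counts in this post chain together like so. Only the four input numbers come from the data; everything else is derived, and the variable names are just labels.

```python
# Counts reported in the post.
caught_flies = 9415   # caught flies/liners, runner on third, fewer than two outs
broke_for_home = 7910
odd_plays = 13        # rundowns, another runner caught elsewhere
out_at_home = 228

# Derived counts.
stayed = caught_flies - broke_for_home
clean_attempts = broke_for_home - odd_plays
scored = clean_attempts - out_at_home

print(f"went:        {broke_for_home / caught_flies:.1%}")
print(f"stayed:      {stayed / caught_flies:.1%}")
print(f"odd plays:   {odd_plays / broke_for_home:.2%}")
print(f"out at home: {out_at_home / clean_attempts:.1%}")
print(f"scored:      {scored / clean_attempts:.1%}")
```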

I ran a similar logit regression and found that distance was again a significant predictor (Beta = .038), although the Nagelkerke R-squared was a mere .172. Only 17% of the variance in whether or not the runner scored could be linked to the distance the throw would have to traverse. This leaves open the possibility that a good arm might very well be a huge part of the difference between a sac fly and a fly-out-throw-em-out double play.

The findings point to some interesting conclusions. First off, we have evidence that runners are extremely smart and rarely take unneeded chances on the basepaths, at least when it comes to sac flies. (I should look up how often they are gunned down in the other situations.) So, it could actually be more important to have the reputation of a good arm than an actual good arm, so that the runner thinks twice before trying to run. Second, if distance predicts whether or not a runner will go (and to a lesser extent if he will make it), then park dimensions matter. Players who play in smaller parks will have shorter throws to make. This could make their hold and kill rates look better, not because of their arm, but because of the fence behind them.

In Walsh's original article, the best throwing left fielder in baseball last year was Manny Ramirez, who just happens to play in front of the Green Monster in Fenway Park, with (you guessed it!) the shortest left field and left center dimensions in the majors.


While I'm here:

Did I mention I'll be in Boston next week?

1 comment:

Anonymous said...

First, I'd love to see your data from the 'Runner Tagging From Third...' article you wrote.

Second, I'm the programmer for Baseball Mogul. Shoot me an e-mail if you'd like a free review copy.

Thanks!