Pitch Combos Part 2: Batted Balls Only

Introduction

Last week, I posted an article on the best pitch type and location combinations in the strike zone. In that analysis, I included swinging and called strikes and assigned those pitches an xwOBA of zero. I noted that one could argue this choice gives too much weight to strikes relative to batted balls. Today, I am exploring the same question using batted balls only. Otherwise my methodology is unchanged, so I encourage you to read that article before going any further.
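As a rough illustration of the batted-ball-only approach, the sketch below averages xwOBA by pitch type and zone over batted balls and then ranks the combinations. The records, the field layout, and the `average_xwoba_by_combo` helper are hypothetical and written in Python for illustration; they are not the code or data behind this article.

```python
from collections import defaultdict

# Hypothetical pitch records: (pitch_type, zone, outcome, xwoba).
# xwoba is None for pitches that were not put in play.
pitches = [
    ("Sinker", 9, "batted_ball", 0.250),
    ("Sinker", 9, "batted_ball", 0.310),
    ("Sinker", 9, "swinging_strike", None),
    ("Four-Seam FB", 5, "batted_ball", 0.520),
    ("Four-Seam FB", 5, "batted_ball", 0.400),
    ("Four-Seam FB", 5, "called_strike", None),
    ("Slider", 7, "batted_ball", 0.290),
]


def average_xwoba_by_combo(records):
    """Mean xwOBA per (pitch_type, zone), using batted balls only."""
    totals = defaultdict(lambda: [0.0, 0])  # combo -> [xwOBA sum, count]
    for pitch_type, zone, outcome, xwoba in records:
        if outcome != "batted_ball":
            continue  # strikes are dropped in this version of the analysis
        totals[(pitch_type, zone)][0] += xwoba
        totals[(pitch_type, zone)][1] += 1
    return {combo: s / n for combo, (s, n) in totals.items()}


averages = average_xwoba_by_combo(pitches)

# Lowest average xwOBA first, i.e. best combos for the pitcher at the top.
ranked = sorted(averages.items(), key=lambda item: item[1])
for (pitch_type, zone), avg in ranked:
    print(f"{pitch_type:<14} zone {zone}  {avg:.3f}")
```

The first and last rows of `ranked` correspond to the “Best” and “Worst” ends of a leaderboard.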

Results

LHP on LHB (left-handed pitcher on left-handed batter)

Best:

Pitch Type | Zone | Average xwOBA
Sinker | 9 (Down and In) | 0.284
Slider | 7 (Down and Away) | 0.290
Two-Seam FB | 9 | 0.292
Sinker | 8 (Middle-Down) | 0.298
Slider | 4 (Middle-Away) | 0.307

Worst:

Pitch Type | Zone | Average xwOBA
Four-Seam FB | 5 (Middle-Middle) | 0.465
Cutter | 5 | 0.453
Four-Seam FB | 8 (Middle-Down) | 0.429
Four-Seam FB | 4 (Middle-Away) | 0.410
Slider | 8 | 0.407

There is a lot of shakeup from last week’s leaderboard. A few constants remain: fastballs continue to get crushed in the middle of the zone, and the away side of the zone generally seems to be the best place to go. The effectiveness of down and in surprised me once again, given the conventional wisdom about lefty hitters “dropping the barrel” on that location. Of all the splits, the batted-ball-only lefty-on-lefty split was the biggest outlier in terms of “best” zones. I suspect that is partly because it has the fewest observations and therefore the most statistical noise: only 7,746 of the roughly 700,000 total 2018 pitches were lefty-on-lefty matchups that resulted in fair contact.
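To put the sample-size concern in rough numbers, the sketch below computes the lefty-on-lefty share of all pitches and shows how the standard error of an average shrinks as a pitch type and zone cell gets more batted balls. The two counts come from this article; the spread of 0.40 and the cell sizes are assumptions chosen for illustration, not measured values.

```python
import math

total_pitches = 700_000       # roughly, per the article
lefty_lefty_contact = 7_746   # per the article

share = lefty_lefty_contact / total_pitches
print(f"Share of all 2018 pitches: {share:.1%}")

# ASSUMPTION: 0.40 is a hypothetical per-batted-ball spread in xwOBA,
# not a value measured for this article.
assumed_sd = 0.40


def standard_error(sd, n):
    """Standard error of a mean computed from n independent observations."""
    return sd / math.sqrt(n)


# ASSUMPTION: hypothetical counts of batted balls in a single
# pitch type and zone cell.
for n in (50, 200, 1000):
    print(f"n={n:>4}: standard error ~ {standard_error(assumed_sd, n):.3f}")
```

Under these assumed numbers, even a cell with 1,000 batted balls carries a standard error larger than the 0.006 gap between the top two lefty-on-lefty entries (0.284 and 0.290), so small differences in rank should be read loosely.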

RHP on LHB

Best:

Pitch Type | Zone | Average xwOBA
Changeup | 7 (Down and Away) | 0.296
Four-Seam FB | 3 (Up and In) | 0.299
Sinker | 7 | 0.303
Two-Seam FB | 7 | 0.329
Slider | 7 | 0.335

Worst:

Pitch Type | Zone | Average xwOBA
Two-Seam FB | 5 (Middle-Middle) | 0.500
Sinker | 5 | 0.472
Four-Seam FB | 5 | 0.452
Four-Seam FB | 4 (Middle-Away) | 0.451
Two-Seam FB | 8 (Middle-Down) | 0.444

In terms of contact prevention, down and away seems key here.

LHP on RHB

Best:

Pitch Type | Zone | Average xwOBA
Four-Seam FB | 1 (Up and In) | 0.281
Changeup | 9 (Down and Away) | 0.294
Two-Seam FB | 9 | 0.310
Sinker | 9 | 0.330
Cutter | 4 (Middle-In) | 0.353

Worst:

Pitch Type | Zone | Average xwOBA
Two-Seam FB | 5 (Middle-Middle) | 0.495
Four-Seam FB | 8 (Middle-Down) | 0.473
Four-Seam FB | 6 (Middle-Away) | 0.461
Sinker | 5 | 0.457
Four-Seam FB | 5 | 0.448

Again, down and away seems like a good contact manager.

RHP on RHB

Best:

Pitch Type | Zone | Average xwOBA
Two-Seam FB | 1 (Up and In) | 0.283
Cutter | 9 (Down and Away) | 0.286
Slider | 9 | 0.287
Curveball | 9 | 0.291
Two-Seam FB | 9 | 0.301

Worst:

Pitch Type | Zone | Average xwOBA
Cutter | 5 (Middle-Middle) | 0.487
Four-Seam FB | 5 | 0.463
Four-Seam FB | 8 (Middle-Down) | 0.458
Four-Seam FB | 3 (Up and Away) | 0.439
Changeup | 5 | 0.437

As in the previous article, down and away and up and in have great success in righty-righty matchups.

Conclusion

Zone 8, or middle-down in the strike zone, stood out to me. It did not appear at all in the previous analysis, yet it consistently showed up here in the “Worst” leaderboards. If you look back at this article’s tables, you can see that a fastball in zone 8 appears in every batter/pitcher split. Its previous absence indicates that while zone 8 gives up damaging contact, it gets enough takes and swings and misses to make it, at the very least, not the worst of the worst.
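The zone 8 observation can be checked directly against the “Worst” tables in this article. The sketch below transcribes those four tables into Python and tallies how often each zone appears; counting only four-seam and two-seam pitches as “fastballs” is a choice made for this sketch.

```python
from collections import Counter

# "Worst" leaderboards transcribed from this article: (pitch type, zone).
worst = {
    "LHP on LHB": [("Four-Seam FB", 5), ("Cutter", 5), ("Four-Seam FB", 8),
                   ("Four-Seam FB", 4), ("Slider", 8)],
    "RHP on LHB": [("Two-Seam FB", 5), ("Sinker", 5), ("Four-Seam FB", 5),
                   ("Four-Seam FB", 4), ("Two-Seam FB", 8)],
    "LHP on RHB": [("Two-Seam FB", 5), ("Four-Seam FB", 8), ("Four-Seam FB", 6),
                   ("Sinker", 5), ("Four-Seam FB", 5)],
    "RHP on RHB": [("Cutter", 5), ("Four-Seam FB", 5), ("Four-Seam FB", 8),
                   ("Four-Seam FB", 3), ("Changeup", 5)],
}

# ASSUMPTION: only these two pitch types count as "fastballs" in this sketch.
FASTBALLS = {"Four-Seam FB", "Two-Seam FB"}

# How many times each zone appears across all four "Worst" tables.
zone_counts = Counter(zone for rows in worst.values() for _, zone in rows)

# Whether each split has at least one fastball in zone 8.
zone8_fastball = {
    split: any(pitch in FASTBALLS and zone == 8 for pitch, zone in rows)
    for split, rows in worst.items()
}

print(zone_counts.most_common())
print(zone8_fastball)
```

Zone 5 accounts for 11 of the 20 entries and zone 8 for 5, with a zone 8 fastball present in all four splits.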

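I also noticed that high pitches almost never showed up in the “Worst” leaderboards in either analysis. The one exception in this article’s tables is the up-and-away four-seam fastball (zone 3) in righty-on-righty matchups. Pitching up in the zone more, even with breaking balls, could be an underrated strategy.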

There doesn’t seem to be much evidence of lefties loving the pitch down and in. In fact, this data indicates down and in could be an effective spot against lefties for both right-handers and left-handers. I tried searching for “lefties dropping the barrel” and “lefties down and in” on Google, but could not find anything definitive about the cliche’s origin. Regardless of the origin, the data does not back this cliche up.

I prefer my previous results to these. After mulling it over, I’ve concluded that called strikes, and especially swinging strikes, are far too valuable to omit from any pitch analysis. Omitting them also makes combos that induce more contact look better than they should, and makes pitches that induce whiffs and takes look worse. However, I could see a focus on batted ball data alone being useful for pitchers who don’t generate many swinging or called strikes.
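A small worked example makes the bias concrete. The two combos below are invented for illustration: one gets many strikes but allows harder contact, the other gets few strikes but allows softer contact. The function names are hypothetical and only mirror the two approaches described in these articles.

```python
# Hypothetical outcomes for two pitch/zone combos. None marks a swinging
# or called strike; numbers are xwOBA values on batted balls.
whiff_heavy = [None] * 6 + [0.400] * 4      # many strikes, harder contact
contact_heavy = [None] * 1 + [0.350] * 9    # few strikes, softer contact


def mean_with_strikes_as_zero(outcomes):
    """Previous article's approach: each strike counts as an xwOBA of 0."""
    return sum(0.0 if x is None else x for x in outcomes) / len(outcomes)


def mean_batted_balls_only(outcomes):
    """This article's approach: strikes are dropped before averaging."""
    batted = [x for x in outcomes if x is not None]
    return sum(batted) / len(batted)


for name, outcomes in (("whiff-heavy", whiff_heavy),
                       ("contact-heavy", contact_heavy)):
    print(f"{name:<14}"
          f" strikes as zero: {mean_with_strikes_as_zero(outcomes):.3f}"
          f"  batted balls only: {mean_batted_balls_only(outcomes):.3f}")
```

With strikes counted as zero, the whiff-heavy combo looks better (0.160 versus 0.315). On batted balls only, the ranking flips (0.400 versus 0.350), which is the distortion described above.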

Future Research

While leaderboards are good for outlier detection and fun to look at, they miss the middle portion of the data, which can still be very insightful. I have an idea for building a pitcher statistic based on every single pitch thrown. The model would take in many features of a pitch, including velocity, location, movement, spin rate, prior pitch data (to account for sequencing), and likely more, and would output probabilities of different outcomes (a swinging strike, a home run, etc.). After summing up a pitcher’s probabilities from every pitch thrown, you could formulate a statistic that evaluates pitchers on their fundamental underlying data: their actual pitches. However, I would likely need a machine learning model, possibly a neural network, to approach this problem, so until I become comfortable implementing machine learning in R, this idea is on hold.
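One way the summing step might look is sketched below in Python. It is not a model: the per-pitch outcome probabilities are written by hand as a stand-in for what a trained model would output, and the outcome list, the run values in `OUTCOME_VALUES`, and the function names are all assumptions made for illustration.

```python
# ASSUMPTION: illustrative run values per outcome from the batting side
# (negative favors the pitcher). These are not real linear weights.
OUTCOME_VALUES = {
    "swinging_strike": -0.05,
    "called_strike": -0.04,
    "ball": 0.04,
    "ball_in_play_out": -0.20,
    "single": 0.45,
    "home_run": 1.40,
}


def expected_value(probabilities):
    """Probability-weighted value of one pitch."""
    total = sum(probabilities.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * OUTCOME_VALUES[outcome]
               for outcome, p in probabilities.items())


def pitcher_score(pitch_probabilities):
    """Average expected value per pitch; lower is better for the pitcher."""
    values = [expected_value(p) for p in pitch_probabilities]
    return sum(values) / len(values)


# Hand-written stand-ins for model output on two pitches.
example_pitches = [
    {"swinging_strike": 0.30, "called_strike": 0.10, "ball": 0.30,
     "ball_in_play_out": 0.20, "single": 0.08, "home_run": 0.02},
    {"swinging_strike": 0.05, "called_strike": 0.25, "ball": 0.20,
     "ball_in_play_out": 0.30, "single": 0.15, "home_run": 0.05},
]

print(f"Score: {pitcher_score(example_pitches):.3f}")
```

Averaging rather than summing keeps pitchers with different pitch counts comparable; that is a design choice of this sketch rather than something specified above.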

Thank you for reading! Please comment any other baseball topics you are interested in reading about or any thoughts you had. If you enjoyed my analysis, please follow me on Twitter for future posts.



2 Comments

  1. Interesting series. Thanks!

    A couple of questions/thoughts:
    Could you give the reasoning behind your methodology? Why exactly are you splitting the analysis into two parts? This does provide some interesting insights, but as you stated yourself, the results by themselves do not give the whole picture of whether a pitch is good or bad. I think a holistic perspective first would have been better; you can always dive deeper into the specific strengths and weaknesses of different pitches later (a 4-seamer creates whiffs, but if it is hit, it’s usually hit hard, etc.).
    As for presentation, it might be an idea to group same-handedness (RHP vs RHB/LHP vs LHB) and different-handedness matchups together, as your results indicate that there are a lot of similarities. Up and in 4-seamers, for example, seem to be great only against opposite-handed batters.
    I never really grasped the difference between 2-seamers and sinkers, and your results seem to show that they (at least for most pitchers) produce similar results. Throwing them together might tackle some sample-size issues as well.

    I would be interested to read some analysis of effective velocity. It is an interesting concept, but one that is not debated too often and that I haven’t really grasped yet.


    1. Hi Niklas, thanks for commenting.

      The reason I split this topic into two separate posts is that in the first post, I included swinging strikes and called strikes in a way that may not have been optimal for some people. I gave these strikes an xwOBA of zero, which could be debated. For example, take an 0-0 count. If the hitter swings and misses at the first pitch, the count is 0-1. If, instead, the hitter hits a high infield popup with an xwOBA of nearly zero, that’s almost always an out. Obviously, as a pitcher you would prefer the second situation, but in my methodology these two outcomes are weighted the same. In a two-strike count, a swinging or called strike produces an out via strikeout, but not every swinging or called strike comes with two strikes, which left me conflicted about weighting these strikes as a zero xwOBA.

      Because of that, I decided to use this methodology in the first post and write a second post on a subset of the original dataset that includes only batted balls. In my opinion, my first post actually is a holistic approach: it accounts for quality of contact, swinging strikes, and called strikes, omitting only foul balls. You are correct that my second post is not as holistic, since it accounts only for quality of contact, which is why I treated it as a supplement rather than a whole new piece.

      In terms of sample size issues, I was fortunately able to use all pitches thrown in 2018 as my dataset, so I felt that only the lefty-lefty, contact-only split suffered from a small sample size, and even that sample wasn’t minuscule. I appreciate the ideas for addressing this.

      Also, thanks for the suggestion on effective velocity. I will definitely consider this topic in the future.
      Ishaan

