Chapter 4 Statcast

Types of data available: pitch-by-pitch data and summary statistics (in leaderboards).

Statcast is the player- and ball-tracking technology used by Major League Baseball. There are two ways to get Statcast data into R:

  1. Downloading the data directly from the Baseball Savant website and importing it into R
  2. Using the baseballr package

Note that Statcast data only goes back to 2015.

4.1 Baseball Savant Website

The Statcast data on Baseball Savant is highly customizable. The leaderboards let you filter by position, team, season, and minimum thresholds. In the top right corner, a menu selects the kind of leaderboard to display: hitting, pitching, fielding, running, or positioning. Each leaderboard can be downloaded as a CSV file. Statcast Search allows for much more user customization. Here is a screenshot of the page:

In the top left corner of the results there are three icons. The middle (blue) icon downloads the results of the search as a CSV file. The rightmost icon downloads the underlying pitch-by-pitch data, so the file is very large. This download often works only for short lists of results; otherwise the request times out.

The file we downloaded is the default table obtained by clicking “Statistics”, then “Player Pitching”, then “2022”. No variables were added or removed.

library(readr)

sc_download <- read_csv("data/sc_pitching_2022.csv")
last_name   first_name  player_id  year  xba    xslg   xwoba  xobp   xiso   exit_velocity_avg  launch_angle_avg  barrel_batted_rate
Wainwright  Adam        425794     2022  0.270  0.419  0.328  0.326  0.148  87.8               11.0              6.5
Verlander   Justin      434378     2022  0.207  0.331  0.255  0.248  0.124  87.8               16.9              6.3
Kluber      Corey       446372     2022  0.261  0.416  0.310  0.294  0.155  87.1               18.5              6.9
Morton      Charlie     450203     2022  0.228  0.389  0.314  0.315  0.161  89.3               13.6              9.5
Quintana    Jose        500779     2022  0.257  0.377  0.305  0.312  0.120  86.5               9.9               5.5
Gibson      Kyle        502043     2022  0.262  0.420  0.326  0.320  0.158  88.5               10.6              7.5
Darvish     Yu          506433     2022  0.227  0.388  0.291  0.275  0.161  88.5               17.0              8.8
Kelly       Merrill     518876     2022  0.230  0.383  0.296  0.291  0.153  88.5               14.0              8.3
Perez       Martin      527048     2022  0.242  0.346  0.295  0.311  0.105  88.2               8.1               4.3
Anderson    Tyler       542881     2022  0.225  0.350  0.275  0.272  0.124  85.0               16.9              4.9
Cole        Gerrit      543037     2022  0.214  0.383  0.284  0.266  0.169  89.4               12.6              9.5
Lyles       Jordan      543475     2022  0.267  0.452  0.341  0.325  0.184  88.6               15.3              10.4
Mikolas     Miles       571945     2022  0.252  0.400  0.306  0.294  0.148  87.8               11.0              6.9
Gausman     Kevin       592332     2022  0.242  0.380  0.285  0.272  0.138  89.0               12.2              8.1
Ray         Robbie      592662     2022  0.223  0.373  0.295  0.292  0.150  89.7               14.7              7.9
Taillon     Jameson     592791     2022  0.260  0.432  0.317  0.297  0.172  88.5               14.5              8.3
Gonzales    Marco       594835     2022  0.267  0.435  0.330  0.320  0.169  86.7               14.5              7.2
Pivetta     Nick        601713     2022  0.248  0.433  0.332  0.324  0.185  90.7               15.2              9.0
Bassitt     Chris       605135     2022  0.228  0.359  0.290  0.291  0.132  85.7               10.8              6.6
Musgrove    Joe         605397     2022  0.225  0.351  0.282  0.284  0.126  86.4               11.7              6.0
Nola        Aaron       605400     2022  0.211  0.340  0.259  0.248  0.129  87.7               12.5              7.1
Rodon       Carlos      607074     2022  0.198  0.309  0.254  0.260  0.111  89.0               19.4              6.5
Freeland    Kyle        607536     2022  0.271  0.455  0.346  0.337  0.184  89.8               12.7              9.7
Fried       Max         608331     2022  0.227  0.328  0.264  0.266  0.101  86.2               7.6               4.0
Irvin       Cole        608344     2022  0.258  0.451  0.324  0.301  0.192  89.4               15.5              9.6
Marquez     German      608566     2022  0.256  0.423  0.327  0.321  0.167  90.5               9.1               7.1
Quantrill   Cal         615698     2022  0.258  0.417  0.321  0.313  0.159  87.6               13.8              7.5
Berrios     Jose        621244     2022  0.275  0.466  0.346  0.329  0.191  90.0               13.9              9.5
Urias       Julio       628711     2022  0.205  0.332  0.262  0.258  0.128  86.7               17.2              6.7
Lopez       Pablo       641154     2022  0.239  0.378  0.301  0.302  0.138  87.9               11.0              9.0
Alcantara   Sandy       645261     2022  0.215  0.331  0.267  0.268  0.116  87.8               5.5               5.3
Cease       Dylan       656302     2022  0.184  0.292  0.257  0.273  0.109  86.8               15.0              6.2
Montgomery  Jordan      656756     2022  0.258  0.400  0.310  0.303  0.143  88.5               9.8               7.1
Wright      Kyle        657140     2022  0.244  0.384  0.306  0.308  0.139  89.0               4.4               6.8
Webb        Logan       657277     2022  0.247  0.364  0.295  0.301  0.116  88.9               3.1               5.5
Ohtani      Shohei      660271     2022  0.204  0.311  0.256  0.260  0.107  87.1               14.5              6.3
McKenzie    Triston     663474     2022  0.222  0.397  0.293  0.274  0.175  90.2               19.7              9.8
McClanahan  Shane       663556     2022  0.207  0.332  0.261  0.257  0.126  87.6               8.3               6.4
Valdez      Framber     664285     2022  0.227  0.330  0.284  0.301  0.102  89.8               -3.6              5.8
Urquidy     Jose        664353     2022  0.256  0.451  0.329  0.306  0.194  89.7               17.2              9.4
Manoah      Alek        666201     2022  0.224  0.343  0.284  0.290  0.118  87.5               16.9              5.4
Gallen      Zac         668678     2022  0.213  0.341  0.278  0.279  0.127  87.8               10.8              7.8
Burnes      Corbin      669203     2022  0.212  0.337  0.273  0.275  0.124  87.2               10.0              5.7
Gilbert     Logan       669302     2022  0.253  0.408  0.314  0.308  0.155  91.0               14.6              7.1
Bieber      Shane       669456     2022  0.245  0.386  0.292  0.281  0.141  89.9               10.0              7.2
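
Once imported, the table behaves like any other data frame. As a minimal sketch, the code below rebuilds four of the rows printed above by hand, so it runs without the CSV file, and sorts them by expected wOBA allowed. The name sc_sample is made up for this illustration; with the real file, the same arrange() call would be applied to sc_download.

```r
library(tibble)
library(dplyr)

# Four rows copied from the printed output above; `sc_sample` is a
# hypothetical stand-in for the full `sc_download` table.
sc_sample <- tribble(
  ~last_name,   ~first_name, ~player_id, ~xba,  ~xwoba,
  "Wainwright", "Adam",      425794,     0.270, 0.328,
  "Verlander",  "Justin",    434378,     0.207, 0.255,
  "Rodon",      "Carlos",    607074,     0.198, 0.254,
  "Cease",      "Dylan",     656302,     0.184, 0.257
)

# Lower xwoba is better for a pitcher, so sort in ascending order.
sc_sorted <- sc_sample %>%
  arrange(xwoba) %>%
  select(last_name, first_name, xwoba)

print(sc_sorted)
```

Wrapping the sort column in desc() would reverse the order, which is the more natural choice when the same table is built for hitters.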

4.2 baseballr

library(baseballr)

There are four functions specifically for Statcast: statcast_search(), statcast_search_batters(), statcast_search_pitchers(), and statcast_leaderboards().
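
Pitch-by-pitch queries can return a very large number of rows, and as noted for the website, large requests may time out. One workaround is to request short date ranges and stack the results. The sketch below assumes that the same concern applies to queries made through baseballr. The helpers make_date_chunks() and fetch_pitches() are hypothetical, written for this illustration, and are not part of the package. The download itself needs baseballr and an internet connection, so that call is left commented out.

```r
# Split a date range into consecutive chunks of at most `days` days.
# `make_date_chunks` is a hypothetical helper, not a baseballr function.
make_date_chunks <- function(start_date, end_date, days = 7) {
  start_date <- as.Date(start_date)
  end_date <- as.Date(end_date)
  stopifnot(start_date <= end_date, days >= 1)
  starts <- seq(start_date, end_date, by = paste(days, "days"))
  # Each chunk ends the day before the next one starts; the last chunk
  # is clipped to the requested end date.
  ends <- pmin(starts + (days - 1), end_date)
  data.frame(start = starts, end = ends)
}

# Query each chunk with statcast_search() and stack the pieces.
# Assumes every chunk comes back with compatible column types.
fetch_pitches <- function(chunks, player_type = "pitcher") {
  pieces <- lapply(seq_len(nrow(chunks)), function(i) {
    baseballr::statcast_search(
      start_date = as.character(chunks$start[i]),
      end_date = as.character(chunks$end[i]),
      player_type = player_type
    )
  })
  dplyr::bind_rows(pieces)
}

chunks <- make_date_chunks("2022-04-07", "2022-04-30")
print(chunks)

# Requires baseballr and network access:
# april_pitches <- fetch_pitches(chunks)
```

A smaller days value lowers the size of each request at the cost of more calls.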

Leaderboards

There are two required arguments to access a leaderboard: ‘leaderboard’ and ‘year’. The options for ‘leaderboard’ are “exit_velocity_barrels”, “expected_statistics”, “pitch_arsenal”, “outs_above_average”, “directional_oaa”, “catch_probability”, “pop_time”, “sprint_speed”, and “running_splits_90_ft”. Optional arguments can further limit the observations returned; a full list can be found in the package documentation.

sc_lead_evb <- statcast_leaderboards(leaderboard = "exit_velocity_barrels", year = 2022)
sc_lead_exp <- statcast_leaderboards(leaderboard = "expected_statistics", year = 2022)
Exit Velocity & Barrels Leaderboard:
year  last_name, first_name   player_id  attempts  avg_hit_angle  anglesweetspotpercent  max_hit_speed  avg_hit_speed  ev50   fbld  gb    max_distance  avg_distance  avg_hr_distance  ev95plus  ev95percent  barrels  brl_percent  brl_pa
2022  Semien, Marcus          543760     547       19.9           32.4                   110.1          87.3           97.8   91.3  84.2  430           186           394              191       34.9         37       6.8          5.1
2022  Rosario, Amed           642708     530       5.0            31.1                   110.8          88.4           99.7   92.1  86.9  450           138           407              203       38.3         24       4.5          3.6
2022  Ramírez, José           608070     528       20.7           33.7                   114.2          87.7           98.9   91.2  86.9  422           179           392              195       36.9         35       6.6          5.1
2022  Turner, Trea            607208     527       10.2           35.3                   112.5          88.9           100.2  92.1  86.3  439           163           402              219       41.6         40       7.6          5.6
2022  Guerrero Jr., Vladimir  665489     526       4.3            27.9                   118.4          92.8           105.5  98.2  90.3  467           144           407              265       50.4         59       11.2         8.4
2022  Freeman, Freddie        518692     517       13.6           42.9                   112.3          91.3           100.8  94.5  87.5  446           185           407              247       47.8         51       9.9          7.2

Expected Statistics Leaderboard:
year  last_name, first_name   player_id  pa   bip  ba     est_ba  est_ba_minus_ba_diff  slg    est_slg  est_slg_minus_slg_diff  woba   est_woba  est_woba_minus_woba_diff
2022  Semien, Marcus          543760     724  547  0.248  0.243   0.005                 0.429  0.394    0.035                   0.317  0.306     0.011
2022  Freeman, Freddie        518692     708  517  0.325  0.313   0.012                 0.511  0.538    -0.027                  0.393  0.403     -0.010
2022  Turner, Trea            607208     708  527  0.298  0.276   0.022                 0.466  0.432    0.034                   0.350  0.335     0.015
2022  Lindor, Francisco       596019     706  504  0.270  0.254   0.016                 0.449  0.427    0.022                   0.342  0.331     0.011
2022  Guerrero Jr., Vladimir  665489     706  526  0.274  0.281   -0.007                0.480  0.464    0.016                   0.351  0.351     0.000
2022  Olson, Matt             621566     699  450  0.240  0.248   -0.008                0.477  0.467    0.010                   0.344  0.347     -0.003
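
Both leaderboards shown above include a player_id column, so they can be combined into a single table. The sketch below illustrates the idea with dplyr's inner_join() on a few rows copied from the output above, which lets it run without a download. The objects evb_sample and exp_sample are hypothetical stand-ins for sc_lead_evb and sc_lead_exp, and the single player column is a simplification of the name column in the real output.

```r
library(tibble)
library(dplyr)

# A few rows and columns copied from the Exit Velocity & Barrels output.
evb_sample <- tribble(
  ~player,                  ~player_id, ~avg_hit_speed, ~brl_percent,
  "Semien, Marcus",         543760,     87.3,           6.8,
  "Rosario, Amed",          642708,     88.4,           4.5,
  "Turner, Trea",           607208,     88.9,           7.6,
  "Guerrero Jr., Vladimir", 665489,     92.8,           11.2
)

# A few rows and columns copied from the Expected Statistics output.
exp_sample <- tribble(
  ~player_id, ~woba, ~est_woba,
  543760,     0.317, 0.306,
  607208,     0.350, 0.335,
  596019,     0.342, 0.331,
  665489,     0.351, 0.351
)

# inner_join() keeps only players present in both samples, so Rosario
# (not in exp_sample) and player 596019 (not in evb_sample) drop out.
combined <- inner_join(evb_sample, exp_sample, by = "player_id")
print(combined)
```

Joining on player_id rather than on the name avoids problems with accented characters and players who share a name. With the full leaderboards, using left_join() instead would keep every row of the first table and fill unmatched columns with NA.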