Tools: Installing MySQL on OS X

If you’ve ever installed MySQL on Linux, you know how easy it is. If you’ve ever installed MySQL on OS X, you know what a terrible pain-in-the-ass it is. It’s a pain whether you install the latest package directly or via MacPorts. To actually make it work, there are about a half dozen post-installation steps you need to follow, none of which are documented anywhere reasonable.

Enter the lovely folks at Mac Mini Vault. They have created and published (via GitHub) a script that installs MySQL from beginning to end, along with several other useful scripts.

If you’re a Mac user and you want to use MySQL for sabermetric analysis, save yourself some headaches and use the script.

Situational Fastball Usage

In a comment on my THT article on pitch sequencing, MGL made the following observation:

One other thing that must be controlled for is game situation and that could be significantly affecting the results. For example, when the pitching team is ahead, especially way ahead, later in the game, the pitcher is more likely to throw a fastball on all pitches, more likely to throw a strike, etc. The batting team is more likely to be taking more pitches, etc.

Unsurprisingly given the source, this is absolutely correct!

To test MGL’s assertion, I computed the percentage of fastball variants (FF, FA, FT, FC, FS, SI in PITCHf/x) thrown by every pitcher from 2008-2014, broken out by platoon, inning, count, index (i.e. whether the pitch was the 1st, 2nd, 3rd, etc. of the PA), and run differential (i.e. how many runs ahead or behind their team was at the time). I then used the delta method to compare (a) the percentage of fastballs thrown in the 7th inning or later when the pitcher’s team was ahead by 4 or more runs, to (b) the percentage of fastballs thrown in the 7th inning or later when the run differential was between 1 and -1.
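For readers who want to try something similar, here is a minimal Python sketch of the two buckets described above. It only computes the raw difference in pooled fastball rates, in percentage points; it does not reproduce the delta method comparison. The `Pitch` record and its field names are hypothetical (real PITCHf/x tables are laid out differently), and `run_diff` is assumed to be measured from the pitching team's point of view.

```python
from collections import namedtuple

# Pitch codes counted as fastball variants in the text above.
FASTBALLS = {"FF", "FA", "FT", "FC", "FS", "SI"}

# Hypothetical record layout, for illustration only.
# run_diff is assumed to be pitching team's runs minus batting team's runs.
Pitch = namedtuple("Pitch", "pitch_type inning run_diff balls strikes")


def situation(pitch):
    """Return 'far_ahead', 'close', or None if the pitch is in neither bucket."""
    if pitch.inning < 7:
        return None
    if pitch.run_diff >= 4:
        return "far_ahead"
    if -1 <= pitch.run_diff <= 1:
        return "close"
    return None


def fastball_rate(pitches, label, count=None):
    """Share of fastballs among pitches in one bucket, optionally for one count.

    count is a (balls, strikes) tuple; None means all counts.
    Returns None when the bucket is empty.
    """
    selected = [
        p for p in pitches
        if situation(p) == label
        and (count is None or (p.balls, p.strikes) == count)
    ]
    if not selected:
        return None
    return sum(p.pitch_type in FASTBALLS for p in selected) / len(selected)


def fastball_gap(pitches, count=None):
    """Far-ahead rate minus close-game rate, in percentage points."""
    far = fastball_rate(pitches, "far_ahead", count)
    close = fastball_rate(pitches, "close", count)
    if far is None or close is None:
        return None
    return 100 * (far - close)


# Made-up pitches, only to show the call pattern.
sample = [
    Pitch("FF", 8, 5, 0, 0),
    Pitch("SL", 8, 5, 0, 0),
    Pitch("SI", 9, 4, 1, 0),
    Pitch("FF", 7, 0, 0, 0),
    Pitch("CU", 7, 1, 0, 0),
    Pitch("CH", 8, -1, 0, 1),
    Pitch("SL", 9, 0, 0, 0),
    Pitch("FF", 3, 6, 0, 0),  # before the 7th inning, so it is ignored
]
print(fastball_gap(sample))               # all counts
print(fastball_gap(sample, count=(1, 0)))  # 1-0 counts only; None here
```

Pooling every pitch into one rate lets pitchers with heavy workloads dominate the result, which is one reason a per-pitcher comparison is preferable to this shortcut.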

Here are some selected findings:

  • On the first pitch, pitchers threw fastballs at a 5.2% higher rate when far ahead
  • On the second pitch (ignoring count), pitchers threw fastballs at a 1.5% higher rate when far ahead
  • Irrespective of count or pitch index, pitchers threw fastballs at a 3.6% higher rate overall when far ahead
  • In 1-0 counts, pitchers threw fastballs at a 6.7% higher rate when far ahead
  • In 0-1 counts, pitchers threw fastballs at a 2.8% lower rate when far ahead

In deeper counts the sample sizes quickly get small, so I’ll stop at 1-0 and 0-1. But the results are definitive: pitchers lean more on their fastball late in games when their team is far ahead. The astute reader will note that pitchers threw fastballs less frequently in 0-1 counts when far ahead (this is also the case for 1-1 and 0-2 counts, in smaller samples), but that is not enough to offset the higher fastball rates in other counts.

In fact, the effect shows up whenever the pitcher’s team is far ahead, not just late in games. In innings 4 through 6, pitchers threw fastballs at a 4.1% higher rate on the first pitch, and a 2.3% higher rate overall, when far ahead than in close games.

So MGL’s observation is exactly right: to really do this kind of pitch-level analysis correctly, we need to control for much more than I did in my last article.

Pitch-level linear weights

One of my current sabermetric obsessions is pitch sequencing. It’s a huge topic, and to try and dent it ever so slightly I have found it useful to be able to assign run values to every major event in every count. In other words: pitch-level linear weights.

Ed Sheehan wrote about this in 2008. I used a similar methodology and updated the results for the bulk of the PITCHf/x era.


Count      1B      2B      3B      HR       E       O      PU       B       S
0-2     0.544   0.844   1.139   1.487   0.575  -0.168  -0.180   0.021  -0.171
1-2     0.523   0.824   1.118   1.466   0.554  -0.189  -0.201   0.040  -0.191
0-1     0.490   0.791   1.085   1.433   0.521  -0.222  -0.234   0.027  -0.054
2-2     0.483   0.784   1.079   1.426   0.514  -0.228  -0.240   0.091  -0.231
1-1     0.463   0.764   1.058   1.406   0.494  -0.248  -0.260   0.048  -0.060
0-0     0.452   0.752   1.047   1.395   0.483  -0.260  -0.272   0.035  -0.038
1-0     0.416   0.717   1.012   1.359   0.448  -0.295  -0.307   0.057  -0.047
2-1     0.415   0.715   1.010   1.357   0.446  -0.297  -0.309   0.097  -0.069
3-2     0.392   0.693   0.987   1.335   0.423  -0.320  -0.332   0.252  -0.323
2-0     0.359   0.660   0.954   1.302   0.390  -0.353  -0.365   0.101  -0.056
3-1     0.317   0.618   0.913   1.260   0.348  -0.394  -0.406   0.178  -0.075
3-0     0.258   0.559   0.854   1.201   0.290  -0.453  -0.465   0.119  -0.059

(Feel free to use these numbers as you see fit, but please cite accordingly.)
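As a rough illustration of how the table might be used, the Python sketch below loads the weights into a lookup and sums the run values over the pitches of one plate appearance. The function names are hypothetical, and it assumes the B and S columns are the run values of a ball and a strike taken in that count.

```python
# Run values copied from the table above; the first column is the count.
TABLE = """\
Count 1B 2B 3B HR E O PU B S
0-2 0.544 0.844 1.139 1.487 0.575 -0.168 -0.180 0.021 -0.171
1-2 0.523 0.824 1.118 1.466 0.554 -0.189 -0.201 0.040 -0.191
0-1 0.490 0.791 1.085 1.433 0.521 -0.222 -0.234 0.027 -0.054
2-2 0.483 0.784 1.079 1.426 0.514 -0.228 -0.240 0.091 -0.231
1-1 0.463 0.764 1.058 1.406 0.494 -0.248 -0.260 0.048 -0.060
0-0 0.452 0.752 1.047 1.395 0.483 -0.260 -0.272 0.035 -0.038
1-0 0.416 0.717 1.012 1.359 0.448 -0.295 -0.307 0.057 -0.047
2-1 0.415 0.715 1.010 1.357 0.446 -0.297 -0.309 0.097 -0.069
3-2 0.392 0.693 0.987 1.335 0.423 -0.320 -0.332 0.252 -0.323
2-0 0.359 0.660 0.954 1.302 0.390 -0.353 -0.365 0.101 -0.056
3-1 0.317 0.618 0.913 1.260 0.348 -0.394 -0.406 0.178 -0.075
3-0 0.258 0.559 0.854 1.201 0.290 -0.453 -0.465 0.119 -0.059
"""


def load_weights(text):
    """Parse the whitespace-separated table into {count: {event: run_value}}."""
    lines = text.strip().splitlines()
    events = lines[0].split()[1:]  # column names after "Count"
    weights = {}
    for line in lines[1:]:
        count, *values = line.split()
        weights[count] = dict(zip(events, map(float, values)))
    return weights


WEIGHTS = load_weights(TABLE)


def sequence_value(pitches, weights=WEIGHTS):
    """Sum run values over (count_before_pitch, event) pairs."""
    return sum(weights[count][event] for count, event in pitches)


# Ball, then strike, then a single on the 1-1 pitch.
pa = [("0-0", "B"), ("1-0", "S"), ("1-1", "1B")]
print(round(sequence_value(pa), 3))  # → 0.451
```

The total, 0.451, lands within rounding of the 0.452 listed for a single in an 0-0 count, which suggests the per-pitch values add up across a plate appearance the way one would hope.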

Good luck, and happy analyzing.