As you prepare for The Players this week, chances are you’re going to read a lot about TPC Sawgrass. While most of it will be high-quality work, it will also be largely speculative. This course breakdown is entirely data-driven and built on regression analysis.
Course History Model
As many of you already know, I hate course history. For any one player, the sample size is too small to tell us reliably how well he fits the course, or whether that fit even matters. However, what if we pooled the results of many players at a given course in order to dramatically increase the sample size? Thanks to DataGolf’s Historic Event Data, we can.
To begin, I took the top 25 and bottom 25 golfers in CH Index (DG defines CH Index as “the average (adj. for field strength) strokes-gained at the course”). Next, I regressed each player’s CH Index on his strokes-gained profile, bogey avoidance, and birdie or better percentage since 2016 (as far back as I collected data for my main model). Regressing CH Index on career SG profiles instead of course-specific SG profiles makes the model predictive rather than descriptive. In other words, it tells us which skills are most likely to translate into success at Sawgrass.
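For readers who want to try something similar, here is a minimal sketch of that regression step using ordinary least squares in NumPy. It is not my actual workflow: the column names are hypothetical, including putting in the SG profile is an assumption, and the data is randomly generated filler with arbitrary coefficients so the script runs on its own. Swap in real CH Index and career stat values to get meaningful output.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept.

    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    """
    X1 = np.column_stack([np.ones(len(X)), X])      # prepend intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)   # least-squares solution
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)                   # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())     # total variation
    return beta, 1.0 - ss_res / ss_tot

# Hypothetical predictor names (assumed, not taken from DataGolf's schema).
predictors = ["sg_ott", "sg_app", "sg_atg", "sg_putt", "bogey_avoid", "bob_pct"]

# SYNTHETIC placeholder data: 50 rows stand in for the top 25 + bottom 25 players.
rng = np.random.default_rng(0)
n_players = 50
X = rng.normal(size=(n_players, len(predictors)))
# Arbitrary made-up relationship, only so the fit has something to find.
ch_index = 0.5 * X[:, 2] + 0.3 * X[:, 0] + rng.normal(scale=0.5, size=n_players)

beta, r2 = fit_ols(X, ch_index)
for name, b in zip(["intercept"] + predictors, beta):
    print(f"{name:>12}: {b:+.3f}")
print(f"R^2 = {r2:.3f}")
```

A statistics package would also report p-values for each coefficient, which is what you need to judge significance; this sketch only recovers the coefficients and the raw R^2.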
This is the fourth week I’ve used this strategy and the results have been extremely encouraging. In previous weeks, adjusted R^2 (the proportion of variance in CH Index scores explained by the SG profiles, penalized for the number of predictors in the model) ranged between .25 and .4. This week, that number jumps to .52! Very exciting.
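As a quick refresher on how adjusted R^2 differs from plain R^2, the standard formula is sketched below. The example inputs are purely illustrative: n = 50 matches the 50 golfers in the sample, but p = 6 predictors and the raw R^2 of 0.60 are assumptions, not numbers from my model.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for a model with n observations and p predictors
    (p does not count the intercept)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Illustrative inputs only (assumed values, see the note above).
print(round(adjusted_r2(0.60, n=50, p=6), 3))
```

The penalty grows as predictors are added, so with only 50 players a stat has to pull real weight to raise the adjusted number.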
Two weeks ago, the CH Model nailed value plays such as Keith Mitchell, Lucas Glover, Jason Kokrak, and Harold Varner III. This past week, it nailed Byeong-Hun An, Sungjae Im, Luke List, Glover & Kokrak once again, and Rafa Cabrera Bello. More of the same, please!
TPC Sawgrass Key Stats
Interestingly, only three stats are statistically significant. Strokes-gained: around the green leads the way, followed by strokes-gained: off the tee, and finally strokes-gained: approach. Even bogey avoidance and birdie or better percentage were not statistically significant once SG stats were included, although bogey avoidance was twice as predictive if I left SG stats out of the model.
Using the regression analysis, I applied the following weight to each stat: SGATG 53%, SGOTT 36%, and SGAPP 11%.
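To make the weighting concrete, here is a minimal sketch of a weighted composite built from those three percentages. How Fantasy National combines inputs internally is not something I can show, so treat the simple linear blend as an assumption; the function name and the sample stat line are hypothetical.

```python
# Weights from the article, expressed as fractions of 1.
WEIGHTS = {"sg_atg": 0.53, "sg_ott": 0.36, "sg_app": 0.11}

def composite_score(stats: dict) -> float:
    """Weighted sum of a player's strokes-gained-per-round values.

    Assumes a plain linear blend; a real tool may standardize or rank
    each stat before weighting.
    """
    return sum(weight * stats[name] for name, weight in WEIGHTS.items())

# Hypothetical stat line, not a real player's numbers.
example_player = {"sg_atg": 0.40, "sg_ott": 0.50, "sg_app": 0.30}
print(round(composite_score(example_player), 3))
```

Because around the green carries more than half the weight, a strong short game moves this score far more than an equally strong approach number.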
Before we get into the model results by player, let’s talk about what this means and why the results are so counterintuitive. Let’s start with SG:APP. DataGolf’s Historic Event Data tool shows us that 39% of the variation in scores at Sawgrass is due to iron play. However, this only helps us if we can reliably predict SG:APP at Sawgrass, which, based on the model results, we clearly can’t. All of the water at Sawgrass is a likely contributor to our inability to predict SG:APP here, as is the inherent short-term instability of SG:APP that comes from how strokes-gained stats are calculated.
SG:OTT is by far the most stable of the SG stats, so it’s no surprise to see it heavily weighted. SG:ATG makes intuitive sense as well, since fewer greens are hit here than at most courses and scrambling percentage is also lower than usual.
Players Who Stand Out
The next step is to create a custom model on Fantasy National, using the weights I specified above. I then pull a rolling report of each player’s rank in the model over the last 12, 24, 50, and 100 rounds. Finally, I take these ranks and use a weighted average to come up with CH Model rankings.
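The blending step can be sketched as follows. The window weights here are hypothetical placeholders, since the exact weights are not listed above, and the players and ranks are made up for illustration.

```python
# Hypothetical weights for each rolling window (rounds -> weight).
WINDOW_WEIGHTS = {12: 0.1, 24: 0.2, 50: 0.3, 100: 0.4}

def blended_rank(ranks_by_window: dict) -> float:
    """Weighted average of a player's model rank across the rolling windows."""
    total = sum(WINDOW_WEIGHTS.values())
    weighted = sum(w * ranks_by_window[window] for window, w in WINDOW_WEIGHTS.items())
    return weighted / total

def final_ranking(players: dict) -> list:
    """Player names ordered from best (lowest blended rank) to worst."""
    return sorted(players, key=lambda name: blended_rank(players[name]))

# Made-up ranks for two placeholder players.
players = {
    "Player A": {12: 5, 24: 3, 50: 4, 100: 2},
    "Player B": {12: 1, 24: 6, 50: 8, 100: 10},
}
print(final_ranking(players))
```

Weighting the longer windows more heavily favors long-term skill over a hot streak; flipping the weights would do the opposite.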
For the full model results, head into FTA+ Chat. DM me on Twitter for a free trial if you’re not already a subscriber. In the meantime, here are a few players who really stand out:
Tommy Fleetwood is 4th in the CH Model, but the 14th most expensive player on DK. He gains over half a stroke per round off the tee and over a third of a stroke in both iron play and short game. The iron number is good, but that around-the-green figure is elite.
Poulter had a real chance to win last year, and it seems that was no accident. At just $7600, he is 11th in the CH Model thanks to a short game on par with Fleetwood’s. His SG:OTT is only slightly positive, but that’s a good sign since he’s middle of the pack at best in distance.
Now we’re getting to the good stuff. List is nearly impossible to trust on a weekly basis, but for the third straight week, he’s a CH Model darling (one big-time hit and one MC so far). Everyone knows List dominates off the tee, but his short game is sneaky good, too. List is 7th in the CH Model.
If you thought List’s ranking was good… Oh boy. Benny An ranks 3rd in the CH Model, gaining half a stroke per round in each ball-striking category and over a quarter of a stroke around the greens.
The CH Model gold has been in the value plays, so here’s one more:
Another back-to-back appearance. Im is 8th in the CH Model, but his SG profile is nearly identical to An’s.
Find me on Twitter @alexblickle1, and I’ll post my favorite GPP Plays later this week.