Wide Receiver Metrics

Update

NOTE (1/29/24): this blog needs to be validated, as I may have introduced bias into the data by hand-selecting the stats for the metrics and then using the same data to build the model.

Additionally, the running back modeling blog has been impacted by the same potential bias. That blog may also have data pollution issues.

I will be revisiting this subject once more, after my compute has been upgraded. Expect updates within the next month. Amendment blogs will be linked here.

TL;DR

Even though most wide receiver stats are sticky, the position is difficult to predict accurately.
Trinity score does an okay job at laying out wide receiver expectations, and is slightly worse as a basis for a predictive model.
While there are many scores that perform better than trinity score, as a basis for a predictive model, it is hard to build something reliable.
Some variation of the yards after catch stat is present in the list of best alternative metrics to trinity score.
Predictive models seem to be much better at predicting the direction of change in fantasy points from year to year, than the exact number of fantasy points in the next season.

Overview

Welcome back all!

I hope the New Year has been treating everyone well so far!

For those who enjoyed the previous post exploring wide receiver sticky stats, this piece dives deeper into what wide receiver stats are useful for predictive modeling.

The goal here is to use the information from the strong correlations found in the previous piece, to 1) find concrete and useful trends and 2) build usable predictive models.

I'll primarily be using data from 2012-2022 to examine trends. Furthermore, I'll be using the same data to build and test predictive models.

Without further ado, let's dive in!

A Cursory Glance

I want to begin by examining this idea of the "trinity" score. For those unfamiliar with what this metric is, I recommend checking out this video.

In short, trinity is a metric that is supposed to be predictable and indicative of fantasy performance. It consists of the three following stats: target share, air yards share, and yards after the catch per reception.

This is what the trinity score, its components, and fantasy points - all normalized - against age look like together on a graph:

For the most part, the components of the trinity score do a good job of mirroring fantasy production. Trinity itself also mirrors fantasy points quite well, even though its spikes are more subdued.

The troublemaker here is yards after the catch per reception. This makes sense too, as yards after the catch per reception correlates quite poorly with fantasy points:

So the question I'm now asking is: why does trinity score use yards after the catch per reception? It appears to be counterproductive as a predictor of how a wide receiver will grow in their career. Does it hold back the correlation between trinity score and fantasy points? Trinity score does have a 0.80 correlation with fantasy points, which is quite strong...

Maybe it makes a bit more sense if you take a look at the picture by season instead of age:

In this situation, yards after the catch per reception is a far better indicator of fantasy performance from season-to-season at the wide receiver position.

In fact, it's the best metric when looking at the data from this perspective:

I would say this is useful in that it shows how trinity keeps up with fantasy production every year, but I would be cautious about using this metric to judge how a wide receiver performs over the course of their career.

It's possible I'm missing something here, as I'm simply aggregating the three components to obtain the trinity value. However, any sort of linear combination would still yield similar results.
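To make that aggregation concrete, here is a minimal sketch under two assumptions: each component is normalized by dividing by its max value (the normalization described in the appendix), and "aggregating" means an equal-weight sum. The field names (tgt_sh, ay_sh, yac_per_rec) and the sample numbers are made up for illustration and are not real player data.

```python
def normalize_by_max(values):
    """Scale a list of positive numbers by its maximum value."""
    peak = max(values)
    return [v / peak for v in values]


def trinity_scores(players):
    """Return one score per player: the sum of the three normalized components.

    Assumption: equal weights. Any other linear combination would only
    change the multipliers applied before the sum.
    """
    tgt = normalize_by_max([p["tgt_sh"] for p in players])
    ay = normalize_by_max([p["ay_sh"] for p in players])
    yac = normalize_by_max([p["yac_per_rec"] for p in players])
    return [t + a + y for t, a, y in zip(tgt, ay, yac)]


# Hypothetical player-seasons, not real data.
sample = [
    {"tgt_sh": 0.30, "ay_sh": 0.40, "yac_per_rec": 4.0},
    {"tgt_sh": 0.15, "ay_sh": 0.20, "yac_per_rec": 5.0},
]
scores = trinity_scores(sample)
```

The first player leads in target share and air yards share, so those two components normalize to 1.0 each, while the second player leads only in yards after the catch per reception.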

I believe a more useful metric would be one that improves on mirroring fantasy production by age, as we care more about how players will develop in their careers than about fluctuations in the overall fantasy production of a position group (unless there are drastic changes there).

That being said, I do want to point out that the correlation between trinity score in the previous season and fantasy points in the current season is close to being strong:

That's a correlation value of about 0.586.
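For anyone who wants to reproduce this kind of lagged correlation on their own data, here is a rough sketch. It pairs a metric in season X with fantasy points in season X + 1 for the same player, then computes a plain Pearson correlation. The row layout, the key names (player, season, trinity, fpts), and the numbers are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_pairs(rows, metric, target):
    """Pair each player's metric in season X with their target in season X + 1."""
    by_key = {(r["player"], r["season"]): r for r in rows}
    pairs = []
    for (player, season), row in by_key.items():
        following = by_key.get((player, season + 1))
        if following is not None:
            pairs.append((row[metric], following[target]))
    return pairs


# Hypothetical rows, not real data.
rows = [
    {"player": "A", "season": 2020, "trinity": 2.1, "fpts": 150.0},
    {"player": "A", "season": 2021, "trinity": 2.4, "fpts": 180.0},
    {"player": "A", "season": 2022, "trinity": 2.0, "fpts": 200.0},
    {"player": "B", "season": 2021, "trinity": 1.2, "fpts": 60.0},
    {"player": "B", "season": 2022, "trinity": 1.5, "fpts": 90.0},
]
pairs = lagged_pairs(rows, "trinity", "fpts")
r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
```

A player's final season in the data has no following season, so it contributes nothing to the pairs.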

I want to emphasis that trinity score is a solid metric. The trinity value of a player matches their fantasy performance quite well, and the trinity value of a previous season is not a bad estimate of fantasy performance in the current season. This means we can 1) expect patterns to hold as players age and 2) use the metric as a decent indicator of player's performance in the upcoming season.

A Better Metric?

The question is, can we come up with a better metric than trinity score, specifically for predicting the career trajectory of a wide receiver?

My first attempt at this is to find the set of variables that has the strongest correlation to fantasy points. The assumption I'm making for now is that since most wide receiver stats are sticky from season-to-season (i.e. strong correlation with themselves from year-to-year), we simply need to find the best set of variables to correlate with fantasy points.

Ideally, I would want to run this optimization in an exhaustive fashion. Regrettably, the number of permutations of stats that are fantasy relevant to the wide receiver position is a bit too large for my laptop - we are talking tens to hundreds of millions of possibilities.

So, in an effort to intelligently explore the state space, I remove stats that fluctuate from year-to-year, like touchdowns and two-point conversion stats.
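A brute-force search of this kind can be sketched as follows. This is an illustration under stated assumptions, not the exact procedure used here: each candidate metric is assumed to be an equal-weight sum of max-normalized stats, candidates are ranked by Pearson correlation with a target column, and the stat names and numbers are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def normalize_by_max(values):
    """Scale a list of positive numbers by its maximum value."""
    peak = max(values)
    return [v / peak for v in values]


def best_combos(stats, target, sizes, top_n=3):
    """Score every combination of stat names and return the top_n by correlation."""
    normalized = {name: normalize_by_max(vals) for name, vals in stats.items()}
    results = []
    for k in sizes:
        for combo in combinations(sorted(normalized), k):
            # Candidate metric: row-wise sum of the chosen normalized stats.
            metric = [sum(vals) for vals in zip(*(normalized[name] for name in combo))]
            results.append((pearson(metric, target), combo))
    results.sort(reverse=True)
    return results[:top_n]


# Hypothetical stats for five player-seasons, not real data.
stats = {
    "rec_yards": [1200.0, 900.0, 600.0, 400.0, 1000.0],
    "yac": [450.0, 300.0, 250.0, 100.0, 380.0],
    "ay_sh": [0.38, 0.30, 0.22, 0.15, 0.33],
}
fpts = [210.0, 150.0, 110.0, 70.0, 160.0]
ranked = best_combos(stats, fpts, sizes=range(1, 4))
```

With three stats there are only seven non-empty combinations, but the count grows very quickly as stats are added, which is why dropping the noisy stats before searching helps so much.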

The highest correlation with fantasy points from the state space I explored was 0.623:

This metric consists of receiving yards, receiving yards after the catch, air yards share, and receiving yards per team pass attempt, all normalized.

There were almost 30,000 permutations of wide receiver stats that had a correlation of at least 0.61 with fantasy points, whereas trinity was at 0.586.

For those curious about how this particular new metric appears against age, when calculated in the same year as fantasy points:

Trinity is thrown on there for comparison. Both metrics become inaccurate from age 31 on, but that's likely because there are only 47 total seasons from wide receivers who played at least 10 games in both season X and season X + 1 during this time period.

This alternative metric is visibly closer to the career trajectory of a wide receiver than trinity score is.

The same comparison, but against season instead of age:

Again, the new metric follows fantasy production much more closely than trinity score does. It seems that our assumption held pretty well.

So, while trinity score is a useful metric, and pretty easy to calculate, I would argue there are better options out there for creating a predictive metric of wide receiver performance. This is one of many stronger options, at least when keeping things linear.

For some other options, I've listed the top ten metrics I found, ranked by correlation with fantasy points, in the appendix.

Predictive Modeling

Now that we have this new metric, how does it compare to trinity when it comes to predictive modeling? You would expect that the new metric would perform better given the stronger correlation to fantasy points in the next season, but let's actually examine this.

We start by creating a multiple linear regression (MLR) model. This follows a pretty standard process: clean the data, split it into train and test sets, fit the model to the training data, and then evaluate its performance on the test set.
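The split, fit, and evaluate steps can be sketched without any libraries. This is a minimal illustration under stated assumptions, not the model from this post: the data is synthetic, the fit uses ordinary least squares through the normal equations, the cleaning step is skipped, and every function name is made up for the example.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_mlr(features, targets):
    """Least-squares fit of y = b + a_0*x_0 + a_1*x_1 + ...; returns [b, a_0, a_1, ...]."""
    design = [[1.0] + list(row) for row in features]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Apply the fitted intercept and slopes to one feature row."""
    return coefs[0] + sum(a * x for a, x in zip(coefs[1:], row))


# Synthetic data: y = 10 + 3*x0 + 2*x1 plus a little noise. Not real data.
rng = random.Random(0)
features = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(200)]
targets = [10 + 3 * x0 + 2 * x1 + rng.gauss(0, 0.1) for x0, x1 in features]

# Hold out the last 20% as a test set (the rows are already in random order).
split = int(0.8 * len(features))
coefs = fit_mlr(features[:split], targets[:split])
errors = [abs(predict(coefs, row) - y) for row, y in zip(features[split:], targets[split:])]
mean_abs_error = sum(errors) / len(errors)
```

On real data, a library implementation would be the more practical choice; the point here is only to show where the train/test boundary sits and that the model is scored on rows it never saw during fitting.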

The results of the model can be seen below:

At first glance, the model appears to do quite poorly, especially on the more elite performers. However, if we instead think about the "success" of the model as guessing in the right direction, there is some light at the end of the tunnel.

The green dots on the scatter plot represent points that were predicted in the same direction as the actual change in fantasy points. In other words, the model was correct at guessing an increase or decrease of a player's fantasy production, but, likely, incorrect at how much said change was.

Likewise, the red dots on the scatter plot represent points that were predicted in the wrong direction, compared to the actual direction of change in fantasy production.

When using these metrics, we see that the model has a success rate of about 63% on these 146 samples. I want to note that this particular model is worse at predicting higher end fantasy performers - as observed from the graph - with a success rate of 56.2% there (I define this as at least 101 actual fantasy points in a season). For those under the threshold, the accuracy was around 68.3%.
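Here is a small sketch of how a directional success rate like this can be computed. It assumes "direction" means the sign of the change relative to the player's previous-season fantasy points, and that a prediction of exactly no change counts as a miss; both are assumptions for illustration. The 101-point threshold comes from the definition above, and the sample numbers are invented.

```python
def directional_success(rows, threshold=None, above=True):
    """Share of rows where predicted and actual change have the same sign.

    Each row is (previous_points, predicted_points, actual_points).
    If threshold is given, only rows whose actual points are at or above it
    (above=True) or below it (above=False) are counted.
    """
    hits = total = 0
    for previous, predicted, actual in rows:
        if threshold is not None and (actual >= threshold) != above:
            continue
        total += 1
        if (predicted - previous) * (actual - previous) > 0:
            hits += 1
    return hits / total if total else float("nan")


# Hypothetical (previous, predicted, actual) fantasy points, not real data.
rows = [
    (120.0, 140.0, 180.0),  # predicted up, went up: hit
    (150.0, 130.0, 160.0),  # predicted down, went up: miss
    (80.0, 70.0, 60.0),     # predicted down, went down: hit
    (90.0, 100.0, 75.0),    # predicted up, went down: miss
    (60.0, 75.0, 95.0),     # predicted up, went up: hit
]
overall = directional_success(rows)
high_end = directional_success(rows, threshold=101, above=True)
low_end = directional_success(rows, threshold=101, above=False)
```

Note that a hit says nothing about magnitude: the first row counts as a success even though the prediction is 40 points short of the actual total.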

Based on the graph, a clear limitation of the model seems to be its prediction range. The highest prediction is around 175 points. That's problematic and is something I need to look into further.

Now, if we compare this result to a model that uses trinity as the predictive metric instead, we get the following:

If you don't really see a noticeable difference, that's because there is not much change here. Distances from the expected values are larger and directional success is worse. So, yes, this version of the model is worse, but not by much.

Taking a look at our directional "success" stats, this model had an overall success rate of 61.6%. Again, the model is better at predicting the production of less relevant players, with a success rate of 63.4% (players who had under 101 actual fantasy points), and worse at predicting the production of more relevant players, with a success rate of 59.4% (players who had at least 101 actual fantasy points).

So, our expectation has been met, at least when it comes to a linear regression model. Trinity score is a solid metric, and while not the best, it is not far from better metrics.

For those interested in other models, I also ran a random forest regressor, a gradient boosting regressor, and a multilayer perceptron.

I consistently had the next most success with the random forest regressor. For those interested in the performance of the random forest regressor model, please check out the relevant section in the appendix, for both the new metric and trinity score based models.

Summary

Thanks to all who made it this far! The recap once more:

Even though most wide receiver stats are sticky, the position is difficult to predict accurately.
Trinity score does an okay job at laying out wide receiver expectations, and is slightly worse as a basis for a predictive model.
While there are many scores that perform better than trinity score, as a basis for a predictive model, it is hard to build something reliable.
Some variation of the yards after catch stat is present in the list of best alternative metrics to trinity score.
Predictive models seem to be much better at predicting the direction of change in fantasy points from year to year, than the exact number of fantasy points in the next season.

Again, thanks for reading!

Cheers,
Alex

Glossary

Term definitions below:

ay_sh: Air yards share.
tgt_sh: Target share.
Yards after catch (YAC) per reception: Yards after catch divided by the number of receptions a receiver has.
Multiple linear regression: Linear regression with more than one variable. Think y = ax + b but now with more variables, each with its own coefficient: y = a_0*x_0 + a_1*x_1 + a_2*x_2 + ... + b.
Random forest regressor: Combines ensemble learning with decision trees to create multiple decision trees for the data. The outputs of the trees are averaged together for the final result. For more details, check out this blog post.
Gradient boosting regressor: A technique that combines weak models to get better performance from the combination of the models. Great for data that has non-linear patterns. More details about the technique can be found in this blog post.
Multilayer perceptron: A simple neural network. Another great blog explaining the details.

Appendix

Alternative Metrics

These ten metrics had the highest correlation with fantasy points in the next season:

For stat definitions, please check out the glossary section on my previous post. Note that normalization is done by dividing each stat by its max value, and "last" refers to the value in the previous season.

Random Forest Regressor Models

The random forest regressor predictive models:

Directional success:

Overall: 60.3%
Less relevant fantasy players: 59.8%
More relevant fantasy players: 60.9%

Directional success:

Overall: 56.8%
Less relevant fantasy players: 58.5%
More relevant fantasy players: 54.7%