Wide Receiver Sticky Stats

TL;DR

Overview

This analysis is a continuation of the current sticky stat series I have been working on. Generally speaking, the exploration of sticky stats is part of a larger effort to find reliable predictors for fantasy football performance, at each position group.

The specific goal of this writing is to find receiving stats that are, positively, consistent from year-to-year and also serve as useful indicators of fantasy performance.

The data used for the findings below is primarily from 2012-2022, unless otherwise stated.

Before diving in, I do recommend checking out some of the other pieces in the series:

A Cursory Glance

As with the other data set explorations, we start with a general overview of the wide receiver data:

The heat map above is an encouraging start! While no single stat has an extremely strong correlation with itself or with other stats, the correlations are reasonably strong across the board.

This is a much better start, especially in comparison to the correlation heat map of general running backs.

I want to point out that the trend of touchdowns being inconsistent, across the position groups, continues. The correlation of receiving_tds vs. receiving_tds_last is 0.54, and it is by far the weakest self-correlated year-to-year stat.
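For readers who want to reproduce a number like this, here is a minimal sketch of how a year-over-year self-correlation could be computed. It assumes one row per player-season; the row layout, the `player_id` and `season` keys, and the helper names are hypothetical, and the touchdown values are made up for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def year_over_year_pairs(rows, stat):
    """Pair each player-season's stat with the same player's previous season."""
    by_key = {(r["player_id"], r["season"]): r[stat] for r in rows}
    current, last = [], []
    for (player_id, season), value in by_key.items():
        prev = by_key.get((player_id, season - 1))
        if prev is not None:  # skip seasons with no prior season on record
            current.append(value)
            last.append(prev)
    return current, last

# Made-up rows for illustration only (not real player data).
rows = [
    {"player_id": "A", "season": 2020, "receiving_tds": 4},
    {"player_id": "A", "season": 2021, "receiving_tds": 6},
    {"player_id": "A", "season": 2022, "receiving_tds": 5},
    {"player_id": "B", "season": 2020, "receiving_tds": 10},
    {"player_id": "B", "season": 2021, "receiving_tds": 8},
    {"player_id": "B", "season": 2022, "receiving_tds": 9},
    {"player_id": "C", "season": 2022, "receiving_tds": 2},
]

current, last = year_over_year_pairs(rows, "receiving_tds")
print(round(pearson(current, last), 2))
```

Player C has no 2021 row, so that season contributes no pair. Only back-to-back seasons are compared.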

Now, if we take a look at a scatter plot for one of the strongest correlations above, Receptions vs. Receptions Last, we can see that we have some noise:

To filter the data to find information on more fantasy relevant players, we will only keep data points where players have played at least 10 games each of the current and previous seasons.
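The filter described above could be sketched roughly as follows. This is only an illustration under assumptions: the `games` key, the row layout, and the function name are hypothetical, and the rows are made up.

```python
MIN_GAMES = 10  # threshold stated in the text

def filter_back_to_back(rows, min_games=MIN_GAMES):
    """Keep player-seasons with >= min_games in that season and the prior one."""
    games = {(r["player_id"], r["season"]): r["games"] for r in rows}
    kept = []
    for r in rows:
        prev_games = games.get((r["player_id"], r["season"] - 1))
        if prev_games is None:
            continue  # no previous season on record
        if r["games"] >= min_games and prev_games >= min_games:
            kept.append(r)
    return kept

# Made-up rows for illustration only.
rows = [
    {"player_id": "A", "season": 2021, "games": 16},
    {"player_id": "A", "season": 2022, "games": 9},   # too few games this year
    {"player_id": "B", "season": 2021, "games": 12},
    {"player_id": "B", "season": 2022, "games": 17},  # kept
    {"player_id": "C", "season": 2022, "games": 15},  # no 2021 season
]

kept = filter_back_to_back(rows)
print([(r["player_id"], r["season"]) for r in kept])
```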

This filter yields the following heat map:

For the most part, receiving stats are still positively consistent across the board. The only correlation that takes a serious hit is receiving_tds vs. receiving_tds_last, which drops to 0.41. Touchdowns seem to be a fluky stat no matter the position in fantasy.

A visual for how noisy the touchdown data is:

To get an even better sense of what the data currently looks like with the applied filter, here is the scatter plot of Receptions vs. Receptions Last:

Hovering over some points shows that our data quality has improved considerably! We have far less noise and a higher concentration of fantasy relevant players.

Perhaps we can do better at finding fantasy relevant players. Let's add an additional filter of being at least 2nd on the depth chart - in both seasons - as a wide receiver:

While this stronger filter removes about 100 samples from the previous heat map, the correlations remain largely unchanged. It seems that, outside of touchdowns, starting wide receivers tend to stay consistent year-over-year.

The landscape for wide receivers is far more stable than the starting running back one.

In fact, it's even more stable compared to the starting quarterback one.

Digging Deeper

When performing this analysis, I have been trying to see if there's a difference between young and old starters at the position. The idea being that at a certain point, there's a drop-off due to injuries, father time catching up, or younger, better players taking over.

For quarterbacks, the age marker for the drop-off was around 30 years old. For backs, there wasn't a clear drop-off, at least when looking at stat correlations year-over-year. However, looking at yards per carry and total carries against fantasy points paints a picture where backs severely decline by and after the age of 28.

The same doesn't really seem to apply for wide receivers. The biggest performance discrepancy I could find by age was at 27.

The correlation heat map for wide receivers who were 29 or younger in their current season:

The correlation heat map for wide receivers who were older than 29 in their current season:

Yes, there is a bit of a dip, but for the most part it's close. Outside of touchdowns and fantasy points, these are still pretty solid correlations from year-to-year.

It would seem that so long as your aging receiver remains a top two option on the depth chart, they should stay fantasy relevant.

Easier said than done to stay a top two option into your thirties:

The graph above shows how many players are either the wide receiver one or two on the depth chart, by age. By 30-32, the sample size has greatly declined. Receivers in a starting role at 33 and 34 still exist, but by 35 there are almost none left.

The graph below shows the same information, but with the average depth chart position as the line, and sample size on hover.

Interestingly enough, if a receiver remains a starter as they get older, they are far more likely to be the wide receiver one (i.e. climb the depth chart), provided they stay healthy (recall this data is with receivers who have played at least 10 games in a current and previous season from 2012-2022).
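The two age graphs boil down to a group-by on age. Here is a minimal sketch of that summary, assuming each row carries an `age` and the `depth_team_mean` value from the glossary; the function name and the rows are hypothetical and made up for illustration.

```python
from collections import defaultdict

def depth_by_age(rows, max_depth=2):
    """Return {age: (sample_size, average depth chart position)} for starters."""
    depths = defaultdict(list)
    for r in rows:
        if r["depth_team_mean"] <= max_depth:  # wide receiver one or two only
            depths[r["age"]].append(r["depth_team_mean"])
    return {age: (len(d), sum(d) / len(d)) for age, d in sorted(depths.items())}

# Made-up rows for illustration only.
rows = [
    {"age": 24, "depth_team_mean": 1.0},
    {"age": 24, "depth_team_mean": 2.0},
    {"age": 24, "depth_team_mean": 3.0},  # not a starter, dropped
    {"age": 31, "depth_team_mean": 1.0},
]

summary = depth_by_age(rows)
for age, (sample_size, avg_depth) in summary.items():
    print(age, sample_size, avg_depth)
```

Keeping the sample size next to the average matters here, since the averages at older ages rest on very few players.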

Fantasy Relevance

Now, the important question: which of these positively predictable stats are fantasy relevant? At the end of the day, it's great if a stat is predictable, but that does not really matter to us if there's no use for it in predicting fantasy performance.

So, in this section, we will check both the correlation of a stat to fantasy points and to its value in the previous year.

Before continuing, I want to note that the data set appears to have a couple of repeat stats. I believe these are small variations of the same stat. I'm looking into the differences now, and am awaiting an answer from a creator of the data set.

Now that that's out of the way, we start with stats from 2012-2022. These are slightly more basic than the next gen stats that we will also examine in this section. The strongest filter (played 10 games in back-to-back seasons and at least wide receiver two in both years) yields the following bar graph, sorted from highest (left) to lowest (right) correlation to self:

For more information on what each abbreviation stands for, please check out the glossary below.

To make more sense of the bar graph above, let's apply a filter: both correlation values must be at least 0.50. This yields the following bar graph:

All of these stats have at least decent-to-strong correlation with both themselves and fantasy points. The three in particular that have above 0.70 correlation for both fantasy points and themselves year-to-year, are wopr_y, ay_sh, and tgt_sh, aka weighted opportunity rating, air yards share, and target share.
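The 0.50 filter and the left-to-right ordering can be sketched in a few lines. This assumes the two correlations per stat have already been computed; the function name is hypothetical, and the stat names and values below are placeholders, not results from the data set.

```python
THRESHOLD = 0.50  # cutoff stated in the text

def focused_stats(correlations, threshold=THRESHOLD):
    """Keep stats whose two correlations both clear the threshold.

    `correlations` maps stat name -> (corr_to_self_last_year, corr_to_fantasy_points).
    The result is sorted from highest to lowest correlation to self.
    """
    kept = [(stat, pair) for stat, pair in correlations.items() if min(pair) >= threshold]
    return sorted(kept, key=lambda item: item[1][0], reverse=True)

# Placeholder names and made-up values, not results from the data set.
correlations = {
    "stat_a": (0.62, 0.71),
    "stat_b": (0.74, 0.55),
    "stat_c": (0.58, 0.31),  # fails on fantasy points
    "stat_d": (0.20, 0.65),  # fails on self-correlation
}

for stat, (to_self, to_fantasy) in focused_stats(correlations):
    print(f"{stat}: self={to_self:.2f}, fantasy={to_fantasy:.2f}")
```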

Seems like wide receivers might be the most predictable position group. At the very least, this position group - for its fantasy relevant players - has the strongest correlation values for stats against fantasy points and against themselves year-over-year.

Diving further, by using Amazon Next Gen Stats, we see the following bar graph:

Before focusing in on the more useful stats, that could serve as indicators for wide receiver fantasy performance, I want to point out two stats in the bar graph above: avg_cushion_mean and avg_separation_mean. Both appear to be at the very least noisy, but more likely irrelevant to fantasy performance.

What does this mean? Well, chances are how much space a wide receiver gets before the snap and how much space a wide receiver has when the (in)completion occurs, doesn't matter for fantasy. I found this to be quite interesting as it goes against intuition and what I have read. It's possible that these stats don't matter for fantasy performance because wide receivers coming into the NFL are already good (enough) at breaking the press and getting separation. I plan to dig deeper on this in a future write-up.

Below we have a more focused version of the previous bar graph:

There is a little bit of change between the order of the columns of this focused bar graph and the previous focused one. I would attribute this mostly to the time frame differences. This advanced stat bar graph is from 2016-2022, whereas the first focused bar graph on more "basic" stats used data from 2012-2022.

I want to end the piece with the following two line graphs. The first compares the top five self-correlating stats from the 2012-2022 bar graph to fantasy points, on a season-by-season basis:

I want to point out the combined_metric, which is the average of the non-fantasy points metrics. I strongly encourage you to utilize the interactive ability of the graph to only display the combined_metric and fantasy_points-mean lines.

To do so, simply double-click on one line in the legend, and then single-click the other. Double-click either selected line to reset the graph.

You should notice that the combined_metric follows fantasy_points-mean pretty closely, but not exactly.
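As a rough illustration of a combined metric like this one, the sketch below averages every non-fantasy-points value for each season. It assumes the per-season means are already on comparable scales (for example, shares between 0 and 1); any rescaling used for the actual graph is not shown here. The function name, the `fantasy_points` key, and the values are hypothetical.

```python
def combined_metric(season_means, exclude=("fantasy_points",)):
    """Average the non-excluded metrics for each season (or age) bucket."""
    combined = {}
    for bucket, stats in season_means.items():
        values = [value for name, value in stats.items() if name not in exclude]
        combined[bucket] = sum(values) / len(values)
    return combined

# Made-up per-season means for illustration only.
season_means = {
    2021: {"tgt_sh": 0.20, "ay_sh": 0.30, "fantasy_points": 150.0},
    2022: {"tgt_sh": 0.22, "ay_sh": 0.26, "fantasy_points": 160.0},
}

print(combined_metric(season_means))
```

The same function works unchanged if the buckets are ages instead of seasons.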

The second line graph compares the top five self-correlating stats from the 2012-2022 bar graph to fantasy points, by age:

We have the same combined_metric here, and I encourage you to toggle the lines on the graph to display only the combined_metric and fantasy_points-mean lines. The combined_metric follows fantasy_points-mean even better, in this case.

This very much touches on the idea of the trinity score, which is an aggregate metric composed of yards after catch (YAC), target share and air yards share.

I would argue that, based on the correlations above, YAC probably should not be used, if we are using a metric that only contains three stats. To me, YAC probably should be replaced with weighted opportunity rating or receiving first down share. Likely the latter as the former might be too repetitive with target share and air yards.

That being said, why limit yourself to three stats when creating a predictive metric? This position has a plethora of useful indicators, and more information can be useful, especially when creating predictive models.

I'll touch more upon this idea, along with an examination of the trinity score, in a follow-up piece.

Summary

That wraps up this one! As always, the recap:

  • Basic wide receiver stats correlate pretty strongly with themselves year-over-year, indicating that the wide receiver position is a positively predictable one.
  • This more or less holds for starting wide receivers.
  • Receiving touchdowns are the most fluky stat from the basic ones, but are more consistent year-over-year for starting wide receivers (0.40) than for starting quarterbacks (0.36) and starting running backs (0.25).
  • Starting wide receivers have somewhat strong correlations on most of their basic stats (receptions, targets, receiving yards, etc.).
  • There isn't as strong a drop-off for wide receivers as for quarterbacks and running backs. As long as a wide receiver is #1 or #2 on the depth chart, they are likely to produce in fantasy. From age 30 on, wide receivers who are still the #1 or #2 go from rare to nearly extinct.
  • Wide receivers have by far the strongest correlating stats, compared to running backs and quarterbacks. This comparison is in regards to the stat against fantasy points and the stat against itself from year-to-year. So, not only are wide receiver stats the most predictable here, but they also are good indicators of fantasy performance.
  • Three of the best metrics (I would argue the three best metrics) for wide receiver predictability are: weighted opportunity rating, air yards share, and target share.
  • Cushion and separation have extraordinarily poor correlation with fantasy points. These metrics appear to be extremely poor indicators for how a wide receiver performs in fantasy and are not really predictable themselves.

Thanks to all for reading!

Have a happy New Year!

Cheers,
Alex

Glossary

Term definitions below:

  • avg_yac_above_expectation_mean: Season average for a receiver's yards after catch (YAC) compared to their expected YAC.
  • avg_yac_mean: Average yards gained after catch by a receiver.
  • avg_expected_yac_mean: Average expected yards after catch, based on numerous factors using tracking data such as how open the receiver is, how fast they're traveling, how many defenders/blockers are in space, etc.
  • catch_percentage_mean: Percentage of caught passes relative to targets.
  • avg_separation_mean: The distance (in yards) measured between a WR/TE and the nearest defender at the time of catch or incompletion.
  • avg_cushion_mean: The distance (in yards) measured between a WR/TE and the defender they're lined up against at the time of snap on all targets.
  • avg_intended_air_yards_mean: Average air yards on all attempted passes.
  • percent_share_of_intended_air_yards_mean: The sum of the receiver's total intended air yards (all attempts) over the sum of his team's total intended air yards. Represented as a percentage, this statistic shows how much of a team's deep yards the player accounts for.
  • racr: Receiving (yards) Air (yards) Conversion Ratio - the number of receiving yards per air yards targeted per game.
  • rtd_sh: Receiving TDs share for a player.
  • receiving_epa: Total EPA on plays where this receiver was targeted.
  • depth_team_mean: The average depth chart position for a player on a roster. Lower is better (1 is the starter).
  • dom: Dominator rating. "Displays the percentage of team yards and touchdowns a specific player accounts for. The idea is that the higher the number, the more dominant that player was for his respective team" - PFF.
  • yac_sh: Yards after catch share.
  • w8dom: A weighted version of dom that favors receiving yards over TDs.
  • ppr_sh: PPR fantasy points share.
  • ry_sh: Receiving yards share.
  • rtdfd_sh: Receiving TDs + 1st Downs share.
  • yptmpa: Receiving yards per team pass attempt.
  • rfd_sh: Receiving 1st Downs share.
  • tgt_sh: Target share.
  • ay_sh: Air yards share.
  • wopr_x: Weighted opportunity rating. "It takes a player's target share and share of team Air Yards and combines them in a way that best predicts both PPR and standard fantasy points. The formula for WOPR is: 1.5 * Target Share + 0.7 * Share of Team Air Yards" - NBC Sports.
  • wopr_y: Another variation of weighted opportunity rating.

Note that any of the stats above with avg and mean are seasonal average values. The redundancy comes from how the data is formatted and then processed in my tooling - please ignore it.
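To make the WOPR formula quoted in the glossary concrete, here is a small worked example. The weights come from the quoted NBC Sports formula; the function names and the player and team totals are hypothetical and made up for illustration.

```python
def share(player_total, team_total):
    """Fraction of a team total that one player accounts for."""
    return player_total / team_total

def wopr(target_share, air_yards_share):
    """Weighted opportunity rating: 1.5 * target share + 0.7 * air yards share."""
    return 1.5 * target_share + 0.7 * air_yards_share

# Made-up season totals for illustration only.
tgt_sh = share(150, 600)    # 150 of the team's 600 targets
ay_sh = share(1400, 4000)   # 1,400 of the team's 4,000 air yards

print(round(tgt_sh, 2), round(ay_sh, 2), round(wopr(tgt_sh, ay_sh), 2))
```

With a 0.25 target share and a 0.35 air yards share, the rating works out to 1.5 * 0.25 + 0.7 * 0.35 = 0.62.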

For further digging on and curiosity about stat definitions, please refer to these vignettes. The majority of the definitions for this piece come from the vignettes, specifically from the season stats and next gen stats. The rest come from the nfl_data_py repository - see the table definitions under the Working with seasonal data section.