Phasing Into Analytics: A Primer On NHL Analytics And How Experts Use Them (Part 1)


This is part 1 of a two-part series by Shayna Goldman that shares insights from six hockey analytics experts about how they integrate analytics into hockey analysis.

Analytics are becoming more deeply integrated into hockey analysis. But as with anything that strays from the traditional approach (in this case, using the “eye test” to analyze hockey), there has been significant criticism. Under the eye test, a game is analyzed solely on what is seen, without looking deeper into what analytics can reveal.

Many firm believers in the eye test struggle to accept analytics, often citing misinformed reasons. Analytics do not try to change the traditional perspective of hockey. Instead, they should be seen as an additional tool for analyzing play and indicating which plays succeed and which do not. Since analytics are more integrated into hockey than ever, this primer for the season goes beyond a simple list of analytical metrics and identifies resources that can help those who wish to better understand analytics.

. . . . .

Ryan Stimson, a writer at Hockey-Graphs, started the Passing Project for the 2013-2014 season. Stimson noticed that hockey had less data available for understanding how offense is generated than soccer did. That season, Stimson tracked games himself; he now has a number of volunteers tracking games to gather as much data as possible for analysis.

Tracking games manually starts with pertinent shot information from the NHL’s play-by-play file. Then, while watching a previously aired game, the tracker records the final three passes prior to each shot attempt (including blocked and missed shots). Details from those passes are also noted, such as the zone, the lane, and whether it was a stretch pass. Some other details tracked, Stimson explained, are “whether the pass originated from behind the net, crossed the Royal Road, went back to the point, etc.”

Once the passing data from that game is collected, Stimson can use the data in descriptive ways to explain how offense was generated. “For example, was a team consistently more successful at advancing the puck down the opposition’s left side? Through the center? Do certain defensemen prevent passes from behind the net versus their teammates?” Stimson said, adding that this information could be used by a coaching staff to understand how a team made plays to set up a chance elsewhere on the ice, giving more information than just looking at shot locations.

Additionally, Stimson has used this data to predict scoring. “If you combine this with their individual shots, you arrive at a more comprehensive player performance metric that I call ‘Primary Shot Contributions’, or a player’s combined shots and passes that lead to shots. This has been shown to be more predictive of a player’s Primary Points over the course of season.”
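
As a rough illustration of the metric Stimson describes, the sketch below adds a player’s own shots to the passes that led to teammates’ shots. The class and field names are hypothetical, and the article does not spell out exactly which passes Stimson counts, so treat this as a simplified reading rather than his actual method.

```python
from dataclasses import dataclass


@dataclass
class PlayerLog:
    """Tracked totals for one player (hypothetical structure)."""
    name: str
    shots: int         # the player's own shot attempts
    shot_assists: int  # passes by the player that led to a teammate's shot


def primary_shot_contributions(log: PlayerLog) -> int:
    # Assumption: the metric is a plain sum of shots and shot assists.
    return log.shots + log.shot_assists


# Made-up numbers, for illustration only.
example = PlayerLog(name="Player A", shots=3, shot_assists=2)
print(example.name, primary_shot_contributions(example))
```

A player who rarely shoots but frequently sets up teammates would score well here even though a shots-only count would overlook him.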

At the Hockey Analytics Conference at the Rochester Institute of Technology, Stimson, along with Matt Cane, used passing data to evaluate defensive play. There, Stimson explained how the passing data can reveal more than the “shots against” statistic alone. The data showed that the way teams defend shot assists is both “repeatable and highly predictive of future goals against.” As Stimson’s detailed analysis shows, the passing data can “explain what has happened, predict what will happen, and also advise on how teams should approach the game from a tactical and systematic approach.”

By collecting this data, Stimson is doing analysis that uncovers many details that the eye test cannot, which could be integrated into standard hockey analysis. To put it simply, Stimson says, “It can be used to identify players that facilitate shot generations and attempting to neutralize those players. If a team cannot transition or recycle back to its playmakers because a team has identified and built a game-plan around stopping them, that has value.”

Much of the data collected identifies players who would improve a lineup, such as depth players who can create secondary scoring. Had teams studied this data, players like Kris Versteeg, Tyler Kennedy, Brad Boyes, and Teddy Purcell likely would have been signed sooner, or signed at all (Boyes and Kennedy are currently without contracts). “We all know who the great players are, but it’s in building the depth of your lineup where you can really start to see the compounding effects of a skilled roster,” Stimson said.

Although this may seem complicated, even the casual fan of statistical analysis can embrace it. Stimson says, “players that pass the puck effectively generate offense for their teammates. It’s a skill, it’s an important one, and without that data, teams, coaches, agents, and fans have an incomplete view of a player and, subsequently, could all be missing out on value in how they evaluate a player. Analytics is really about finding and exploiting market inefficiencies before other teams do. Playmaking is such an important part of hockey and players are falling through the cracks.”

. . . . .

Charlie O’Connor is an analytics columnist at Broad Street Hockey, as well as a contributor at Hockey-Graphs. O’Connor also tracks games manually. The time it takes to track a game may differ based on the project, O’Connor explained. “I found that tracking all offensive zone entries and defensive zone exits during 5-on-5 play in a single game took about six hours to complete.”

Once the data is tracked, O’Connor presents it in two ways. The first is a presentation of raw data. Often, O’Connor synthesizes this data into the traditional metrics used by the hockey community. “For example, I’ll present the total number of entries created by a specific player in ‘Time On Ice’ adjusted form, using ‘Per 60’ metrics. This helps us to determine which players are most efficient in creating entries accounting for ice time, because one player may look like he is the best entry creator on the team by raw totals, but actually just receives far more 5v5 minutes to create those entries.”
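
The “Per 60” adjustment O’Connor mentions is simple arithmetic: scale a raw count to what it would be over 60 minutes of ice time. The sketch below is a minimal illustration; the function name and the player numbers are made up, not taken from his tracking data.

```python
def per_60(count: float, toi_minutes: float) -> float:
    """Scale a raw event count to a rate per 60 minutes of ice time."""
    if toi_minutes <= 0:
        raise ValueError("time on ice must be positive")
    return count * 60.0 / toi_minutes


# Hypothetical players: (entries created, 5v5 minutes played).
players = {
    "Player A": (40, 600.0),
    "Player B": (30, 300.0),
}

rates = {name: per_60(entries, toi) for name, (entries, toi) in players.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} entries per 60")
```

Here Player A leads in raw entries, but Player B creates them at a higher rate once ice time is accounted for, which is the situation O’Connor describes.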

The second way O’Connor presents the tracked data is as evidence in his articles evaluating the Philadelphia Flyers. Manually tracked data gives the narratives O’Connor builds more depth, improving the quality of his analysis.

O’Connor uses the tracked data at both the player and team level. “The team-level tracking metrics have more value in the here and now, because the majority of them have been proven to be repeatable and meaningful by the community at large. The player level data doesn’t necessarily come with that certainty. But in a sense, the player level data might be more valuable because it is so unexplored right now. There are findings that have yet to be teased out at the player level that could be really exciting.”

In August, O’Connor focused a team-level project on the Flyers’ penalty kill. Focusing solely on the penalty kill can have some challenges, because as O’Connor wrote, “Teams usually spend between 400 and 500 minutes of a season shorthanded, compared to over 4,000 minutes at 5-on-5. This makes single-season PK data fairly noisy — in terms of sample size, it’s equivalent to only about 10 regular season hockey games. A penalty kill could be fine structurally, but hamstrung by poor goaltending performances or unlucky bounces. At 5-on-5, inflated (or deflated) percentages have more time to return to proper balances. Penalty kill metrics, on the other hand, are particularly volatile.”
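
One way to sanity-check the “about 10 games” comparison is to convert shorthanded minutes into 5-on-5 game equivalents. The sketch below assumes an 82-game regular season and uses 4,000 minutes as the 5-on-5 season total (the quote says “over 4,000”). The function name is made up, and this may not be how O’Connor arrived at his figure.

```python
SEASON_GAMES = 82               # NHL regular-season length
FIVE_ON_FIVE_MINUTES = 4000.0   # assumed season total at 5-on-5


def equivalent_games(pk_minutes: float,
                     ev_minutes: float = FIVE_ON_FIVE_MINUTES,
                     games: int = SEASON_GAMES) -> float:
    """Express shorthanded minutes as a number of games' worth of 5-on-5 play."""
    # 5-on-5 minutes per game is ev_minutes / games, so divide by that.
    return pk_minutes * games / ev_minutes


for pk_minutes in (400, 500):
    print(f"{pk_minutes} PK minutes is about "
          f"{equivalent_games(pk_minutes):.1f} games of 5-on-5 play")
```

Under these assumptions, a full season of penalty killing amounts to roughly 8 to 10 games’ worth of 5-on-5 play, which is why single-season penalty kill numbers are so noisy.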

The project was written as a series of four pieces, focusing on the system and tactics of the penalty kill, the neutral zone and defensive zone strategies, and recommendations for making the penalty kill as effective as possible in the upcoming season.

 

As O’Connor acknowledged, this does not analyze every element of the Flyers’ penalty kill, and much more research could be done at different levels, such as the micro-statistics of individual players. However, this type of analysis provides an in-depth evaluation of the penalty kill that could be extremely valuable to any team.

Like many fans, O’Connor has considered how useful this data is. For him, it has served as an evaluation tool and as evidence to construct and support narratives for his articles, but he acknowledges its value beyond his own use, including to players and teams, because it is “accounting for on-ice tactics.” That was one of O’Connor’s goals when working on the Flyers’ penalty kill project: to “determine which tactics used by the Flyers on the penalty kill drive the best outcomes. Looking solely at team level data (like I did) is just one piece of the puzzle, though. What if one player excels in one particular formation but struggles in another? I’m certain those scenarios exist, and it’s up to analysts to find them.”

For journalists in particular, O’Connor sees the usefulness of this data. Rather than just citing observations, narratives can be constructed and supported by evidence. “For example, if a player is struggling to score goals, a columnist may write that he’s ‘not getting to the dirty areas’ in front of the net where goals are usually scored. That may be an accurate observation, but why not prove it? That’s where manual tracking comes in — we go out and count the actual events so that we’re no longer just throwing information out there in print or on television without confidence that it’s accurate.”

Analytics can become complex, and some may struggle to understand them. But as O’Connor explains, many of the metrics studied from tracked data are intuitive: “Start talking to a casual hockey fan about Corsi or PDO, and it’s a pretty hard sell without a good communicator to explain what the stats mean. But something like ‘Controlled Entry Percentage’ is easier; it’s just the percentage of all offensive zone entries that occurred with control of the puck.”
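
O’Connor’s definition translates directly into a few lines of code. In the sketch below, each tracked offensive zone entry is represented by a single True/False flag for whether the team had control of the puck; that representation and the function name are assumptions for illustration.

```python
from typing import Iterable


def controlled_entry_percentage(entries: Iterable[bool]) -> float:
    """Percentage of offensive zone entries made with control of the puck.

    Each item is True for a controlled entry and False otherwise.
    """
    flags = list(entries)
    if not flags:
        raise ValueError("no entries were tracked")
    return 100.0 * sum(flags) / len(flags)


# Made-up sample: three controlled entries out of four.
tracked = [True, True, False, True]
print(f"{controlled_entry_percentage(tracked):.1f}%")
```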

However, analytics are not always necessary for the casual fan, which even an analytics columnist understands. But if the casual fan does “want to delve into the ‘how’ and ‘why’ of hockey,” the information is available. Unfortunately, a fan who wants to delve deeper may not get that information from a broadcast of the game, even when an analyst is a former player with “first-hand tactical knowledge.” With other resources available, fans now have the option to explore play further. “I’d like to think that my work helps to educate fans that want to be educated, whether I’m looking at penalty kill formations, public advanced statistics or manually-tracked metrics.”

. . . . .

Another valuable analytics resource is the work of Micah Blake McCurdy. McCurdy is a mathematician who creates data visualizations of hockey analytics. The data used in those visualizations is collected and published by the NHL. McCurdy processes and rearranges the NHL’s data to make these visualizations. “I’ve also trained some statistical models using this data, some of what I put out is the output of those models.”

 

Although the visualizations appear complex, they are simply “graphical representations of what happened and what is likely to happen… Anything that you could quote a statistic about can be rendered visually. Sometimes it’s purely descriptive, so that you can see which players played with one another, which players got lots of minutes, which teams were eliminated early on in the season. Other times it’s predictive, showing which teams are likely to do well or return to earth after a run of ill or good fortune.”

McCurdy’s visualizations are done at both the player level and team level. At the player level, a number of statistics are synthesized into graphs, including “teammates, competition, shot results while on the ice, penalties, pace of play, and shooting and saving percentage.” The team level can include an aggregate of the aforementioned statistics. How much rest a team has is also studied at the team level, both to show how rest has affected teams and to predict how it will affect future games. The visualizations for teams are “weighted by models to estimate how likely various teams are to win games.”

To create predictive data, McCurdy uses statistical libraries such as SciPy and statsmodels. When creating a predictive visualization, McCurdy has to select which features and statistics to include and which to leave out. “The simulations can be described as possible futures that are consistent with the results we’ve seen in the past; by using a great deal of them we can estimate the likelihood of certain events (like making the playoffs) as well as quantify how sure we are of that likelihood.”
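
The idea of simulating many “possible futures” can be sketched with a toy Monte Carlo simulation. This is not McCurdy’s model: it uses only Python’s standard library rather than SciPy or statsmodels, and it assumes every remaining game is independent with a fixed win probability, that a win is worth two points and a loss none (ignoring overtime-loss points), and that a fixed points cutoff decides the playoffs. All names and numbers are hypothetical.

```python
import math
import random


def playoff_probability(current_points: int, games_left: int, win_prob: float,
                        cutoff_points: int, n_sims: int = 10_000, seed: int = 0):
    """Estimate the chance of reaching cutoff_points, with a standard error."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    made_it = 0
    for _ in range(n_sims):
        # One simulated future: play out every remaining game.
        wins = sum(rng.random() < win_prob for _ in range(games_left))
        if current_points + 2 * wins >= cutoff_points:
            made_it += 1
    p = made_it / n_sims
    # Standard error of the estimated proportion: how sure we are of p.
    std_err = math.sqrt(p * (1.0 - p) / n_sims)
    return p, std_err


p, std_err = playoff_probability(current_points=70, games_left=20,
                                 win_prob=0.55, cutoff_points=92)
print(f"estimated playoff chance: {p:.3f} (standard error {std_err:.3f})")
```

Running more simulations shrinks the standard error, which is one simple way to quantify how sure the estimate is.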

For example, McCurdy provides predictions throughout the season on the likelihood of each team making the playoffs. As the season progresses, he updates the predictions with the most recent data.

McCurdy’s data is intended primarily for fans, not analysts. In fact, McCurdy explained that he is “chiefly interested in connecting with people by appealing to their visual intuition instead of their facility with numbers.” With many analytics websites delivering data primarily through numbers, McCurdy’s visual approach offers a different look into analytics. “One strong appeal of visual work is the ability to put context markers on almost every plot showing team or league averages, or any kind of ‘expected output’ indicator. This makes it much easier to keep the proper perspective on results.”

McCurdy refers to his work as “reference material, not punditry.” These depictions can replace tables of numbers for entry-level analytics users. The charts created by McCurdy range in sophistication, from simple bar plots to detailed, complex graphs. However, the more complicated charts do have written explanations to help translate them. “I have written explanations for some of the more complicated charts, but generally I try to establish a visual style and grammar that’s consistent across many plots, so that the things my readers learn once carry throughout the site.”

. . . . .

Hockey analytics are easier to grasp than some would have you believe. Analytics are simply a way to measure and discuss more specific scenarios that occur on the ice. They can provide detailed information about successful offensive and defensive production, predict scoring, demonstrate an individual player’s value or how to build a successful penalty kill, provide supporting evidence in building narratives, explain what happened on the ice, and create visual models that are easier to process than a chart full of numbers. Part two of this feature will explore even more insights that analytics can provide into how the game of hockey is played.