USWNT passing – comparing positions and opponents’ FIFA rankings

Over the past couple of days I’ve been trying to figure out how to create a Tableau workbook that aggregates all our USWNT data in a similar fashion to the NWSL 2016 Tableau workbook. The main challenge has been figuring out how to best show and compare stats from USWNT that, quite frankly, are all over the place due to how varied the quality of opponents has been.

Thankfully, we can use all the USWNT stats tables we’ve got in the GitHub repo, together with the database.csv file that has data for all the matches in the WoSo Stats GitHub repo, to show something like passing stats adjusted for the opponent’s quality.

The visualizations for the USWNT data, for now, are the two worksheets in this Tableau workbook. Below, I’ll explain what each one is and go into some more detail on how the data was calculated and aggregated, to make it easier for you to make similar visualizations.

I won’t delve too much into an actual analysis of the data in the two charts. There’s too much there to go into right now – and why have all the fun when you can do that, too? Anyways, on to the charts.

Visualizing USWNT Open Play Passing Stats

First, here is a visualization of passing stats for the USWNT matches that we have in our database. Each mark on the chart below represents one USWNT player in one match. The x-axis is her total number of open play passes attempted during that match, and the y-axis is her open play passing completion percentage. The color is her designated “position” (more on this later), and the shape of the mark shows whether or not the opponent, at the time, had a FIFA ranking in the top 15.

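[Chart: open play passes attempted (x-axis) vs. open play passing completion percentage (y-axis), colored by position, with mark shape showing whether the opponent was ranked in the FIFA top 15]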
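
To make the chart's ingredients concrete, here is a minimal Python sketch of how the values for one mark could be derived. The function name, field names, and sample numbers are made up for illustration and are not taken from the WoSo Stats tables.

```python
# Hypothetical helper: field names and sample values are illustrative only.
def chart_point(attempts, completed, opponent_fifa_rank):
    """Return the x, y, and shape values for one player-match mark."""
    # Completion percentage is undefined when no passes were attempted.
    completion_pct = 100 * completed / attempts if attempts else None
    return {
        "x_attempts": attempts,
        "y_completion_pct": completion_pct,
        # Mark shape: was the opponent ranked in the FIFA top 15 at the time?
        "top_15_opponent": opponent_fifa_rank <= 15,
    }


point = chart_point(attempts=60, completed=45, opponent_fifa_rank=20)
print(point)
```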

 

Midfielders and defenders generally pass the ball more, which is to be expected. Forwards, who are often surrounded by defenders, and goalkeepers, who may often launch the ball forward, see less of the ball and have lower passing completion percentages. It’s pretty clear that differences in passes attempted and in passing completion percentage have to do with the nature of a player’s position. We need to better adjust for position.

Adjusting For A Player’s Position

This visualization shows passing stats adjusted for a USWNT player’s position. Each stat is expressed as the number of standard deviations it sits above or below the average for USWNT players in her position.

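[Chart: open play passes attempted vs. passing completion percentage, both shown as standard deviations from the average for the player’s position]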
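
The position adjustment described above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the rows and field names are made up, and it uses the population standard deviation, which may not match what the Tableau workbook does.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical player-match rows; names and numbers are illustrative only.
rows = [
    {"player": "Midfielder A", "position": "M", "attempts": 40},
    {"player": "Midfielder B", "position": "M", "attempts": 60},
    {"player": "Defender A", "position": "D", "attempts": 30},
    {"player": "Defender B", "position": "D", "attempts": 50},
    {"player": "Forward A", "position": "F", "attempts": 20},
]


def position_z_scores(rows, field):
    """Standard deviations from the positional average for each row."""
    by_position = defaultdict(list)
    for row in rows:
        by_position[row["position"]].append(row[field])
    # Assumption: population standard deviation within each position.
    stats = {pos: (mean(v), pstdev(v)) for pos, v in by_position.items()}
    scores = []
    for row in rows:
        mu, sigma = stats[row["position"]]
        # A position with no spread (e.g. a single row) gets a score of 0.
        scores.append(0.0 if sigma == 0 else (row[field] - mu) / sigma)
    return scores


z_attempts = position_z_scores(rows, "attempts")
for row, z in zip(rows, z_attempts):
    print(row["player"], round(z, 2))
```

The same function could be applied to completion percentage to get the second axis of the chart.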

Now it’s easier to spot which players, given their “designated” position, attempted to pass the ball more than average and completed their passes at a higher percentage than average. On the other hand, it’s also easier to spot which players passed the ball less than average and completed their passes at a lower percentage than average.

To account for some outliers, in the chart below I used the filters to exclude performances from any USWNT players who played fewer than 30 minutes and any USWNT players who had fewer than 10 open play pass attempts.

[Chart: the position-adjusted passing chart, filtered to performances of at least 30 minutes and at least 10 open play pass attempts]
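
Outside of Tableau, the same filter could be applied with something like the following Python sketch. The field names and sample rows are assumptions for illustration; the thresholds are the ones described above.

```python
MIN_MINUTES = 30   # exclude performances shorter than this
MIN_ATTEMPTS = 10  # exclude performances with fewer open play pass attempts


def keep_performance(row):
    """True when a player-match row passes both filters."""
    return row["minutes"] >= MIN_MINUTES and row["attempts"] >= MIN_ATTEMPTS


# Hypothetical rows; names and numbers are illustrative only.
performances = [
    {"player": "Starter", "minutes": 90, "attempts": 55},
    {"player": "Late sub", "minutes": 12, "attempts": 14},
    {"player": "Quiet forward", "minutes": 75, "attempts": 8},
    {"player": "Half-time sub", "minutes": 45, "attempts": 10},
]

filtered = [row for row in performances if keep_performance(row)]
print([row["player"] for row in filtered])
```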

A couple of things stand out. First, it’s easier to rack up more passing attempts with a high passing completion percentage against lesser opponents, as indicated by how many more cross-shaped marks than circle-shaped marks are in the upper-right. Second, playing top opposition can drastically cut down on both, with several circle-shaped marks spread throughout the bottom-left corner.

Players’ “Designated” Positions and Next Steps

About the positions. Players are only given one for all their matches, instead of one for each match. This means that a player like Allie Long, who in this chart is classified as a “midfielder,” is being misrepresented for games where she has played as a defender.

And even within positions, some further refinement could be used. Fullbacks like Kelley O’Hara and Ali Krieger, who are correctly classified as “defenders,” tend towards lower passing completion percentages because, as fullbacks, they often play higher up the pitch where a completed pass is less likely. But they’re measured against the average for all defenders, which includes centerbacks, who are also correctly called “defenders” but have some of the highest completion percentages in the game. As a result, a fullback’s passing completion percentage looks worse relative to her position than it really is.

A next step is going to be to resolve that Allie Long problem by figuring out a player’s position on a match-by-match basis. After that, the plan is to further break down some positions, like splitting defenders into fullbacks and centerbacks.

Another idea is to show passing stats broken down by thirds of the field. I suspect the difference in passing stats vs Top 15 opponents and non-Top 15 opponents would be even more stark when we look at the attacking third.

You can help!

This data only happens because of help from fans like you (yes, you)! The WoSo Stats project needs help to log more stats and location data for USWNT matches and past NWSL seasons. With your help, we can get even richer data to expand on what we know about the sport.

If you’re interested in logging data for matches (which are all publicly available on YouTube), read more here and email me at wosostats.team@gmail.com or send me a DM at @WoSoStats on Twitter. All the data logged will be publicly available on the WoSo Stats GitHub repo and will help me and others do more analyses like these!

Aerial duels in the 2016 NWSL season (through 54 matches)

In the WoSo Stats Shiny app is a section titled “Aerial Duels” that has data for how many times a player goes up for an aerial duel, and how often she wins them. In the 2016 NWSL Season Tableau workbook, I originally didn’t include a visualization for aerial duels, but I recently created one to get a better look at how the distribution of players looks when you compare the number of times they go up for an aerial duel per 90 minutes to the percentage of aerial duels they win.
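
For anyone who wants to reproduce the two measures, here is a minimal Python sketch of the arithmetic. The function names and the sample numbers are hypothetical and are not taken from the WoSo Stats data.

```python
def per_90(count, minutes_played):
    """Scale a raw count to a rate per 90 minutes played."""
    return count * 90 / minutes_played


def win_pct(won, total):
    """Percentage of aerial duels won."""
    return 100 * won / total


# Hypothetical player: 12 aerial duels, 9 won, over 270 minutes played.
duels_per_90 = per_90(12, 270)
duel_win_pct = win_pct(9, 12)
print(duels_per_90, duel_win_pct)
```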

You can view the “Aerial Duels” section of the Tableau visualization for yourself. As of this writing, with 54 matches logged for the season, two players, Dagny Brynjarsdottir and Natasha Kai, stand apart pretty clearly from the rest of the league for how often they are involved in an aerial duel per 90 minutes.

It makes sense, of course, that they’d have a lot of aerial duels; they’re both tall and are typically thrown into attacking positions high up the field. After Kai (15.6 aerial duels per 90) and Brynjarsdottir (13.6 aerial duels per 90), the rest of the field begins with another Portland Thorn, Lindsey Horan (10.1 aerial duels per 90).

[Chart: aerial duels per 90 minutes (x-axis) vs. percentage of aerial duels won (y-axis) for NWSL players]

The players with the highest aerial duel win percentage with a significant number of aerial duels per 90 (beyond the 25th percentile, the left edge of that light grey rectangle you see running parallel to the y-axis) are further back, with far fewer aerial duels per 90 but with generally greater defensive duties. The top four, again with 54 matches logged so far, are Whitney Engen (82% of aerial duels won), Becky Sauerbrunn (78%), Julie King (78%), and Alanna Kennedy (72%).

Sauerbrunn and Kennedy notably have a very high win percentage while still being above the 75th percentile of aerial duels per 90. As the chart shows, the more aerial duels a player contests per 90, the closer her win percentage tends to get to roughly 45%.
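
The 25th and 75th percentile cutoffs mentioned above can be approximated with Python's standard library. This is a sketch only: the player labels and values are made up, and the "inclusive" quantile method is an assumption, so the cutoffs may not match how Tableau computes its percentiles.

```python
from statistics import quantiles


def percentile_cutoffs(per_90_values):
    """Return the 25th and 75th percentile of aerial duels per 90."""
    # Assumption: "inclusive" interpolation; Tableau may differ.
    q1, _median, q3 = quantiles(per_90_values, n=4, method="inclusive")
    return q1, q3


# Hypothetical aerial duels per 90 for five players.
duels_per_90 = {"P1": 1.0, "P2": 2.0, "P3": 3.0, "P4": 4.0, "P5": 5.0}

p25, p75 = percentile_cutoffs(list(duels_per_90.values()))
above_p75 = [name for name, value in duels_per_90.items() if value > p75]
print(p25, p75, above_p75)
```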

Finally, I looked at how each team compares. The Western New York Flash stands out for having four players – Erceg, Kennedy, McDonald, and Mewis – clustered in the top-right corner of the chart. No other team has a cluster like that.

[Chart: aerial duels per 90 vs. win percentage, with Western New York Flash players highlighted]

Meanwhile, have a look at the Seattle Reign. Their players are generally clustered below the 75th percentile of aerial duels per 90.

[Chart: aerial duels per 90 vs. win percentage, with Seattle Reign players highlighted]

An interesting follow-up to this chart would be to break down the per 90 and win percentage by location. Each match with complete location data has the location of each aerial duel logged, so this should be possible to visualize and analyze once I work out a way to code through the matches and sort the aerial duels by location.

Another, more complex follow-up question is what happens after each of these aerial duels: whether the ball went out of bounds, was recovered by a teammate of the player who won the aerial duel, or was cleared away, whether the aerial duel resulted in a foul, and so on. This data is deep in the spreadsheet that is logged for each match and I haven’t yet figured out an easy way to do that type of analysis, but it is going to be worth digging into.

In the meantime, feel free to dig through the chart and have a look at this for yourself!