USWNT passing – comparing positions and opponents’ FIFA rankings

Over the past couple of days I’ve been trying to figure out how to create a Tableau workbook that aggregates all our USWNT data in a similar fashion to the NWSL 2016 Tableau workbook. The main challenge has been figuring out how to best show and compare stats from USWNT that, quite frankly, are all over the place due to how varied the quality of opponents has been.

Thankfully, we can combine the USWNT stats tables in the GitHub repo with the database.csv file, which has data for all the matches in the WoSo Stats GitHub repo, to show something like passing stats adjusted for the opponent’s quality.

The visualizations for the USWNT data, for now, are the two worksheets in this Tableau workbook. Below, I’ll explain what each one is, with some more detail on how the data was calculated and aggregated, to make it easier for you to make similar visualizations.

I won’t delve too much into an actual analysis of the data in the two charts. There’s too much there to go into right now – and why have all the fun when you can do that, too? Anyways, on to the charts.

Visualizing USWNT Open Play Passing Stats

First, here is a visualization of passing stats for the USWNT matches that we have in our database. Each mark on the chart below represents a USWNT player from a match in our database. The x-axis is her total number of open play passes attempted during that match, and the y-axis is her open play passing completion percentage. The color is her designated “position” (more on this later) and the shape of the mark shows whether or not the opponent, at the time, had a FIFA ranking in the top 15.

Screen Shot 2017-07-16 at 9.27.38 AM

 

Midfielders and defenders generally pass the ball more, which is to be expected. Forwards, who are often surrounded by defenders, and goalkeepers, who may often launch the ball forward, see less of the ball and have lower passing completion percentages. It’s pretty clear that differences in passes attempted and in passing completion percentage have to do with the nature of a player’s position. We need to better adjust for position.

Adjusting For A Player’s Position

This visualization shows passing stats adjusted for a USWNT player’s position by measuring how many standard deviations she is from the average for USWNT players in her position.

Screen Shot 2017-07-16 at 9.52.13 AM.png

Now it’s easier to spot which players, given their “designated” position, attempted to pass the ball more than average and completed their passes at a higher percentage than average. On the other hand, it’s also easier to spot which players passed the ball less than average and completed their passes at a lower percentage than average.
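
As a rough sketch of that adjustment (not the calculation behind the Tableau workbook, which isn’t shown here), a within-position standard score can be computed in base R. The data frame, its column names, and its values are all made up for illustration.

```r
# Hypothetical per-match rows; names and numbers are illustrative only.
passing <- data.frame(
  player      = c("A", "B", "C", "D", "E", "F"),
  position    = c("D", "D", "D", "F", "F", "F"),
  op_pass_att = c(40, 50, 60, 10, 20, 30),
  op_pass_pct = c(80, 85, 90, 60, 70, 80),
  stringsAsFactors = FALSE
)

# Standard score within a group: (value - group mean) / group standard deviation.
z_within <- function(x, group) {
  ave(x, group, FUN = function(v) (v - mean(v)) / sd(v))
}

passing$att_z <- z_within(passing$op_pass_att, passing$position)
passing$pct_z <- z_within(passing$op_pass_pct, passing$position)
print(passing)
```

A score of 0 means the player was exactly average for her position, and positive scores are above average. A position with only one performance would get NA, since a standard deviation needs at least two values.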

To account for some outliers, in the chart below I used the filters to exclude performances from any USWNT players who played fewer than 30 minutes and any USWNT players who had fewer than 10 open play pass attempts.

Screen Shot 2017-07-16 at 10.06.09 AM.png
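
The same two filters could be applied outside Tableau with something like the sketch below. The column names and rows are hypothetical, and the thresholds are the ones described above.

```r
# Hypothetical per-match performances; names and numbers are illustrative only.
performances <- data.frame(
  player      = c("A", "B", "C", "D"),
  minutes     = c(90, 25, 45, 30),
  op_pass_att = c(42, 12, 8, 10),
  stringsAsFactors = FALSE
)

# Keep performances with at least 30 minutes and at least 10 open play pass attempts.
kept <- subset(performances, minutes >= 30 & op_pass_att >= 10)
print(kept)
```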

A few things stand out. One, it’s easier to rack up more passing attempts with a high passing completion percentage against lesser opponents, as indicated by how many more cross-shaped marks compared to circle-shaped marks are in the upper-right. And playing top opposition can drastically cut down on both, with several circle-shaped marks spread out throughout the bottom-left corner.

Players’ “Designated” Positions and Next Steps

About the positions. Players are only given one for all their matches, instead of one for each match. This means that a player like Allie Long who in this chart is classified as a “midfielder” is being misrepresented for games where she has played as a defender.

And even within positions, some further refinement could be used. Fullbacks like Kelley O’Hara and Ali Krieger, who are correctly classified as “defenders,” have a propensity towards lower passing completion percentages because, as fullbacks, they often play higher up the pitch where a completed pass is less likely. But because they’re defenders, their passing completion percentage’s standard deviation from the average for all defenders looks worse than it really is because they’re counted against centerbacks, who are also correctly called “defenders” but have some of the highest completion percentages in the game.

A next step is to resolve that Allie Long problem by figuring out, on a match-by-match basis, a player’s position for a given match, and then to break some positions down further, like defenders into fullbacks and centerbacks.

Another idea is to only show passing stats broken down by thirds of the field. I suspect the difference in passing stats vs Top 15 opponents and non-Top 15 opponents would be even more stark when we look at the attacking third.

You can help!

This data only happens because of help from fans like you (yes, you)! The WoSo Stats project needs help to log more stats and location data for USWNT matches and past NWSL seasons. With your help, we can get even richer data to expand on what we know about the sport.

If you’re interested in logging data for matches (that are all publicly available on YouTube), read more here and email me at wosostats.team@gmail.com or send me a DM at @WoSoStats on Twitter. All the data logged will be publicly available on the WoSo Stats Github repo and will help me and others do more analyses like these!

Passing networks for the Seattle Reign’s 2016 season

Following up on the previous post that delved into the passing network for the Portland Thorns’ 2016 season, now it’s time for the Seattle Reign. Same approach as last time, and the Seattle Reign’s passing networks as an Excel workbook can be downloaded here from the WoSo Stats GitHub repo.

I’ll also look at how the numbers compare to the Portland Thorns’ passing network, although as I write more of these for every team it’s going to be harder to keep these comparisons within the scope of one blog post.

First things first, the first sheet in the Excel workbook, and explanations again for what we’re looking at.

srfc-passnetwork-sheet1

The rows are players passing the ball and the columns are players receiving a completed pass. The cell in the bottom-left area where the “Yanez” row meets the “Barnes” column, then, is the total number of passes that Yanez completed to Barnes throughout the 2016 NWSL season.

Each cell only represents completed passes. This is extremely important, because we’re missing out on data about how many times a player was actually targeted by another teammate. This data is missing because, well, it can get extremely hard, if not outright impossible, to determine, both from the match spreadsheet and even while watching a match, where a missed/blocked/cleared/intercepted pass was supposed to go. Maybe in the future we, or someone else, can go back through all these matches or future matches and figure out how to do that, but for now we’re going to have to go without that. But at the very least understand that these passing numbers only represent completed passes.

The darker the green, the higher the value of the cell. The whiter the cell, the closer it is to zero.

These are raw numbers for the entire season, and they don’t take into account how many minutes each player combo was actually on the field. The table below does, with each cell now representing passes completed per 90 minutes that the player combo was on the field together.

srfc-passnetwork-sheet2
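
For anyone rebuilding a table like this, the per 90 step is an element-wise division of the completed-pass table by the shared-minutes table. The sketch below uses placeholder players and made-up numbers, not the Reign’s actual data.

```r
# Placeholder players; both matrices hold made-up numbers.
players <- c("P1", "P2", "P3")

# Rows are passers, columns are receivers of completed passes.
passes <- matrix(c( 0, 20,  5,
                   10,  0, 30,
                    2, 25,  0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(passer = players, receiver = players))

# Minutes each pair was on the field together (symmetric).
shared_minutes <- matrix(c(  0, 900, 450,
                           900,   0, 450,
                           450, 450,   0),
                         nrow = 3, byrow = TRUE,
                         dimnames = list(players, players))

# Completed passes per 90 shared minutes; the diagonal is 0/0, so blank it out.
per90 <- passes / shared_minutes * 90
diag(per90) <- NA
print(round(per90, 2))
```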

As was done with the previous post, I hid the columns for players who never were on the field with any teammates for 270 or more minutes to exclude any extremely high passing per 90 numbers that may show up merely because a few passes were exchanged during very limited minutes.

Despite that, there are two player relationships with far higher completed passes per 90 than anyone else – Reed-to-Kawasumi (17.1) and Solo-to-Corsie (13.0). Outside of those two, there’s a concentration of passing relationships with relatively high numbers in the upper-left portion of the spreadsheet, with a few more darkly-shaded cells further down the defender columns and defender rows.

Compared to the Portland Thorns’ per 90 passing network, where that upper-left region of the spreadsheet is lighter, a greater proportion of Seattle’s completed passes were coming from defenders or going to defenders. That doesn’t necessarily mean that Seattle’s midfielders or forwards were doing less. If you look at the raw numbers in the area for midfielders-to-midfielders and midfielders-to-forwards, it actually looks like Seattle had more completed passes per 90 going on – there was just even more passing going on in the back.

Below is the same spreadsheet, but with each row (each passer’s recipients) highlighted individually.

srfc-passnetwork-sheet3

This table makes more sense if you look at the columns and look for players with a high number of very dark cells, indicating that they’re a top completed passing target for several players.

With that in mind, Barnes and Fishlock stand out. Barnes was the #1 or #2 target for the most completed passes per 90 for six different players – Kopmeyer, Fletcher, Pickett, Fishlock, Utsugi, and Winters. Fishlock was the #1 or #2 target for four different players – Barnes, Corsie, Little, and Solaun.

As for the forwards, the two biggest targets appear to have been Kawasumi and Yanez, with a relatively high number of passes per 90 going their way from the midfield, other forwards, and the defense.
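
Counting “#1 or #2 target” appearances like this doesn’t have to be done by eye. A minimal sketch, assuming a per 90 matrix with passers as rows and receivers as columns (placeholder players and made-up values here):

```r
# Placeholder players and made-up per 90 values; NA on the diagonal.
players <- c("A", "B", "C", "D")
per90 <- matrix(c(NA,  5,  3,  1,
                   4, NA,  6,  2,
                   2,  7, NA,  1,
                   3,  6,  2, NA),
                nrow = 4, byrow = TRUE,
                dimnames = list(passer = players, receiver = players))

# For each passer (row), find her two most frequent receivers.
# order() puts NA last, so the diagonal is never picked.
top2 <- apply(per90, 1, function(row) {
  colnames(per90)[order(row, decreasing = TRUE)[1:2]]
})

# How many passers have each player as a #1 or #2 target?
counts <- table(factor(as.vector(top2), levels = players))
print(counts)
```

Ties are broken by column order here, which is an arbitrary choice that a real analysis might want to handle differently.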

Below, the highlighting is flipped around and each column’s highest values are highlighted.

srfc-passnetwork-sheet4

Now, look at which rows have a higher number of darker cells, indicating that they’re a top origin for completed passes per 90 for several players.

Defenders stand out as a top origin for completed passes, as opposed to the Thorns’ passing network where those columns were a lighter shade. The Reign in general appear to have some pretty extreme differences throughout this spreadsheet, with players like Utsugi, Fishlock, and Kawasumi passing to certain teammates way more than anyone else.

Passing networks for the Portland Thorns’ 2016 season

We have all 21 of the Portland Thorns’ 2016 matches logged in match spreadsheets like these, and for the ones that have location data I’ve already combed through them for some valuable stats. I wanted to see how much more passing data I could get out of these match spreadsheets, even without any location data, which only a few of our matches have. Below is a quick look at the numbers for a sort of “passing network”, but without the graphics and lines and instead with just tables and some useful formatting.

The R code I created to generate a table of shared passes and a table of shared minutes is here on the WoSo Stats GitHub repo. There are comments in the code that are hopefully enough to explain how it works but I’ll delve into that in greater detail in a future blog post. For now, let’s look at that data for the Portland Thorns to get a better look at how the ball was being passed around.

The Excel spreadsheet shown below, based on tables you can create from the R code mentioned above, can be downloaded here from the WoSo Stats GitHub repo (click on the “Download” button).

I’m just going to briefly go over what we see when we use some Excel formulas and conditional formatting, and what it can quickly tell us about how the Thorns were passing around the ball.

Screen Shot 2017-05-08 at 9.04.07 PM

Here’s the first sheet of the Excel workbook, and there are a few important things to understand that will be true for the following sheets as well.

First of all, the rows are the players passing the ball, and the columns are the players receiving a completed pass. So, let’s look at the bottom-left cell. “Weber” is the row, so she’s the player passing the ball, and “Betos” is the column, so she’s the player receiving the pass; therefore, that cell represents the number of passes that Weber completed to Betos during the entire 2016 season. So, just one.

Second, each cell only represents completed passes. This is extremely important, because we’re missing out on data about how many times a player was actually targeted by another teammate. This data is missing because, well, it can get extremely hard, if not outright impossible, to determine, both from the match spreadsheet and even while watching a match, where a missed/blocked/cleared/intercepted pass was supposed to go. Maybe in the future we, or someone else, can go back through all these matches or future matches and figure out how to do that, but for now we’re going to have to go without that. But at the very least understand that these passing numbers only represent completed passes. So, remember that value of 1 that was where the “Weber” row meets the “Betos” column? For all we know, maybe Weber tried passing the ball back to Betos another 10 times and they were missed (probably not, because forwards usually aren’t passing the ball back to their goalie that much, but you get the idea).
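
To make the row/column convention concrete, here is a small sketch of how a passer-by-receiver table can be tallied from a list of completed passes. This is not the code from the repo; the log format and placeholder player names are assumptions for illustration.

```r
# Hypothetical log of completed passes: one row per pass.
pass_log <- data.frame(
  passer   = c("P1", "P1", "P2", "P3", "P1"),
  receiver = c("P2", "P2", "P3", "P1", "P3"),
  stringsAsFactors = FALSE
)

# Fixing the factor levels keeps every player in the table,
# even if she never completed or received a pass.
players <- c("P1", "P2", "P3")
pass_matrix <- table(
  passer   = factor(pass_log$passer,   levels = players),
  receiver = factor(pass_log$receiver, levels = players)
)
print(pass_matrix)
```

Reading across the “P1” row gives everything P1 completed, and reading down the “P1” column gives everything she received.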

Finally, the darker the green, the higher the value of the cell, just in case it isn’t obvious. The whiter the cell, the closer to zero it is. The darker the cell, the closer to the highest value it is.

Okay, now that we’ve got all that out of the way, what’s going on here? There are some extremely dark pockets in this spreadsheet, but they’re not taking into account the fact that some players were on the field together way more than other pairings. Take Amandine Henry, for example – she finished the season with 48.4 passes attempted per 90 minutes and 38.3 passes completed per 90 minutes, but her row and column of shared passes is way lighter than other Thorns players’ simply because they played more minutes and had more time to pass to each other.

We need another table that has the number of minutes a player shared with each teammate, which is below. Writing up the code to generate this was a pain in the ass, so please admire it just for a few seconds.

Screen Shot 2017-05-08 at 9.14.44 PM

This table is diagonally symmetric and, for the purposes of this analysis, will mainly be used to calculate the per 90 passing numbers below.
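
One way to build a shared-minutes table like the one above is to sum, match by match, the overlap between each pair’s time on the field. The sketch below is not the code from the repo. It assumes one stint per player per match, ignores stoppage time, and uses made-up stints.

```r
# Hypothetical stints: the minute each player came on and went off in each match.
stints <- data.frame(
  match  = c(1, 1, 1, 2, 2),
  player = c("P1", "P2", "P3", "P1", "P2"),
  on     = c(0, 0, 60, 0, 45),
  off    = c(90, 60, 90, 90, 90),
  stringsAsFactors = FALSE
)

players <- sort(unique(stints$player))
shared <- matrix(0, nrow = length(players), ncol = length(players),
                 dimnames = list(players, players))

for (m in unique(stints$match)) {
  s <- stints[stints$match == m, ]
  for (i in seq_len(nrow(s))) {
    for (j in seq_len(nrow(s))) {
      if (i == j) next
      # Overlap of two intervals, floored at zero when they never coincide.
      overlap <- max(0, min(s$off[i], s$off[j]) - max(s$on[i], s$on[j]))
      shared[s$player[i], s$player[j]] <- shared[s$player[i], s$player[j]] + overlap
    }
  }
}

print(shared)
print(isSymmetric(shared))
```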

Screen Shot 2017-05-08 at 9.23.35 PM.png

You may have noticed the following players are missing: Berryhill, Lofton, Pratt, Skogerboe, Williamson, and Fitzgerald. This is because for this spreadsheet I hid the columns for players who never were on the field with any teammates for 270 or more minutes. This is to exclude any extremely high passing per 90 numbers that may show up merely because a few passes were exchanged during very limited minutes.

So, now we’re looking at, for lack of a better term, the “passes completed by the row player to the column player per 90 minutes.” Remember that “Weber to Betos” cell we were looking at, the one in the bottom left? Now it reads as 0.13 passes completed by Weber to Betos every 90 minutes.

I also added each player’s overall passing completion percentage for the season at the end of each row and column, and the black lines are meant to block out different position players. Finally, the grey boxes are values for pairings that shared fewer than 270 minutes. For example, look back to Weber – she was on the field with Betos for at least 270 minutes, so that 0.13 value appears, but she was only on the field with Franch for 91 minutes, so that cell value gets greyed out.
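
The 270-minute rule can be sketched as two steps: blank out any pairing below the threshold, then hide any player who has no pairing at or above it. The players and numbers below are placeholders, apart from the 270-minute cutoff itself.

```r
# Placeholder players with made-up shared minutes and per 90 values.
players <- c("P1", "P2", "P3")
shared_minutes <- matrix(c(  0, 692,  91,
                           692,   0, 100,
                            91, 100,   0),
                         nrow = 3, byrow = TRUE,
                         dimnames = list(players, players))
per90 <- matrix(c(  NA, 0.13, 0.99,
                  1.30,   NA, 2.70,
                  0.00, 1.80,   NA),
                nrow = 3, byrow = TRUE,
                dimnames = list(passer = players, receiver = players))

min_shared <- 270

# Step 1: grey out (set to NA) pairings with fewer than 270 shared minutes.
masked <- per90
masked[shared_minutes < min_shared] <- NA

# Step 2: keep only players with at least one pairing of 270+ minutes.
keep <- apply(shared_minutes >= min_shared, 1, any)
visible <- masked[keep, keep, drop = FALSE]
print(visible)
```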

There’s a lot to dig into here, but one thing I like looking at is how defenders move the ball to the midfielders, how midfielders move the ball to the forwards, and how the goalkeepers and defenders try to get straight to the forwards. By looking at the defender rows, it looks like Klingenberg-to-Heath and Klingenberg-to-Horan are by far the most fruitful defender-to-midfielder passing relationships. The only other defender-to-midfielder relationship that happens as much is Sonnett-to-Henry, and keep in mind Henry only played half the season.

In the midfielder rows, where they meet the forward columns, there are fewer dark cells because it’s just harder to pass the ball to the forwards, so that section of the table is just naturally going to be a lighter shade most of the time. One stat that stands out to me is the high number of passes Shim completed to Raso, 5.08, higher than any other midfielder-to-forward combo, especially considering they were only on the field together for 536 minutes.

Now, let’s look at this table with the highlighting done a little differently. Below are the same numbers as above, but with each row highlighted individually.

Screen Shot 2017-05-08 at 9.43.30 PM.png

Look at the Betos row, for starters. The highest value in that row is the 7.19 completed passes to Menges, so that’s going to be the darkest cell in the row. Meanwhile, the lowest value of 0.13 completed passes to Weber is the whitest cell. A few rows down, Sonnett’s highest value of 5.62 completed passes to Betos is the darkest cell, while her 0.85 completed passes to Heath is the lowest.

This table will probably make most sense if you look at the columns and look for which players have a high number of very dark cells. Menges appears to have been a very frequent passing target for almost every defender. Heath and Henry had a relatively high number of completed passes from defenders, midfielders, and forwards. Nadim had a high number of completed passes from midfielders and other forwards, and Sinclair looks like she was deeper down the field and had a relatively high number of completed passes from midfielders and defenders.

Finally, let’s look at this highlighting flipped around. Now, each column’s highest values are highlighted.
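
Excel’s conditional formatting does this shading for us, but the idea is roughly a min-max rescaling within each row, so that every passer’s lowest value maps to 0 (white) and her highest to 1 (darkest). A sketch with placeholder players and made-up values, assuming that is close to how the color scale behaves:

```r
# Placeholder players and made-up per 90 values; NA on the diagonal.
players <- c("P1", "P2", "P3", "P4")
per90 <- matrix(c(NA,  2,  4,  6,
                   1, NA,  3,  2,
                  10,  5, NA,  0,
                   8,  8,  4, NA),
                nrow = 4, byrow = TRUE,
                dimnames = list(passer = players, receiver = players))

# Rescale a vector to the 0-1 range, ignoring NA.
# (A vector whose values are all equal would give NaN.)
scale01 <- function(v) {
  rng <- range(v, na.rm = TRUE)
  (v - rng[1]) / (rng[2] - rng[1])
}

# Highlight within each row (each passer's receivers)...
by_row <- t(apply(per90, 1, scale01))
# ...or within each column (each receiver's passers).
by_col <- apply(per90, 2, scale01)

print(round(by_row, 2))
```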

Screen Shot 2017-05-08 at 9.58.20 PM.png

Take a quick look at the rows and see which players were more likely to be the origin of a completed pass. Klingenberg, across the board from goalkeepers all the way up to forwards, appears to have been the origin of a relatively high number of completed passes for many teammates. Farther down the table, Allie Long and Amandine Henry were the origin for a great deal of completed passes for several defenders, midfielders and forwards.

There’s more to dig into here, especially when we compare these raw numbers to another team’s passing network. There are three other ones I’ve created for the Seattle Reign, Western New York Flash, and the Houston Dash that can be found here on the WoSo Stats GitHub repo. In later blog posts, I’ll compare these to each other to see just how wildly different a team can pass the ball around. For now, I hope you’ve enjoyed seeing the rich data on passing relationships that we can glean from the data we’ve got.

Morgan Brian and Sarah Killion: Using stats to differentiate midfielders

Two weeks ago, I touched a bit on open play passing stats for Ali Krieger by breaking down attempts and completion percentage by thirds of the field. Since then, I challenged myself to see how much I could dig into passing stats to try to find some differences between two players who on the face of it look very similar – Morgan Brian and Sarah Killion. They’ve both played primarily as defensive midfielders, they both pass the ball a similar amount of times, and they have almost the same passing completion percentage.

The following data is only for the 40 out of 103 NWSL 2016 matches that we’ve logged with complete location data. To see the list of matches this data represents, see the database in the WoSo Stats Github and look for all the matches with “yes” in the “location.complete” column.

As you read through the post below, please consider that this data is only possible thanks to hard work from fans like you who have been logging matches over the past year. The WoSo Stats project needs your help to log more stats and location data for the NWSL 2016 season, for USWNT matches, and beyond. The more data we get, the better we’ll be able to understand the sport. If you’re interested in logging data for matches, read more here and email me at wosostats.team@gmail.com or send me a DM at @WoSoStats on Twitter. All the data logged will be publicly available on the WoSo Stats Github repo.

Getting the passing stats

If you’re not interested in the coding aspect of this or how to get this data yourself, feel free to skip ahead to the next section. All the data used is available to download from this Tableau visualization.

The instructions for how to use the creating-stats.R file are here in the WoSo Stats Github repo. If you’re familiar with R, first things first, source this R file and then run the getStatsInBulk function with the arguments shown below:

your_stats_list <- getStatsInBulk(competition.slug = "nwsl-2016", location = "thirds", location_complete = TRUE, section = "passing")

This will take about a minute. Then run the mergeStatsList function with the following arguments to get the stats table as a data frame named “your_stats”:

your_stats <- mergeStatsList(stats_list = your_stats_list, add_per90 = TRUE, location = "thirds", section = "passing")

The resulting table has columns for open play passes, which are labeled “opPass” in the column names. Open play passes are defined as all passes that aren’t one of the following – namely, dead ball plays:

  • Throw-ins
  • Corner kicks
  • Goal kicks
  • Free kicks
  • Drop kicks or throws by the goalkeeper
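
In other words, an open play pass is whatever is left after the dead ball plays are removed. The sketch below illustrates that definition on a made-up pass list. The play_type labels are hypothetical and are not the codes used in the match spreadsheets or in creating-stats.R.

```r
# Hypothetical pass list; the play_type labels are invented for this example.
passes <- data.frame(
  play_type = c("open", "throw-in", "open", "corner kick",
                "goal kick", "free kick", "gk drop kick", "open"),
  completed = c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Dead ball plays, following the list above.
dead_ball <- c("throw-in", "corner kick", "goal kick", "free kick",
               "gk drop kick", "gk throw")

# Open play passes are everything that is not a dead ball play.
open_play <- passes[!passes$play_type %in% dead_ball, ]

op_pass_att  <- nrow(open_play)
op_pass_comp <- sum(open_play$completed)
op_pass_pct  <- op_pass_comp / op_pass_att
print(c(attempts = op_pass_att, completed = op_pass_comp, pct = op_pass_pct))
```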

A change from previous posts is the “section” argument. Instead of creating a massive stat table with all sorts of stats you may not be interested in, you can now just create a stats table for a specific type of stats (attacking, passing, possession, defense, goalkeeping). For this analysis, we’ll only need to look at passing stats, so we can just assign “passing” to the section argument.

The “your_stats” data frame is the stats table that is behind the Tableau visualization that has all the charts shown below. The Tableau viz was created with Tableau Public, and you should be able to download it yourself. For now, let’s have a look at the data.

Overall Passing Stats

For starters, let’s look at how Brian and Killion look if we just look at two very basic stats – open play passes attempted per 90 and open play passing completion percentage, sorted by total open play passes attempted per 90.

Screen Shot 2017-03-12 at 4.15.54 PM

Both Brian and Killion have nearly the same stats. Brian has 52.1 open play passes attempted per 90 minutes with an 82.3% passing completion percentage. Killion has 53.3 open play passes attempted per 90 minutes with an 84.7% passing completion percentage.

There’s a lot that could be happening deeper underneath those stats, so let’s look at that bar chart, broken down by open play passes attempted per 90 for each third of the field (defensive, middle, and attacking). Here, we begin to see some differences in where Brian and Killion’s passes are happening, and some big similarities as well compared to the players around them.

Screen Shot 2017-03-12 at 4.03.00 PM.png

Killion, per 90 minutes, attempts a couple more open play passes in the middle 3rd. Brian, meanwhile, per 90 minutes, has a few more open play passes in the attacking 3rd. Brian seems slightly more attacking-minded and Killion attempts more of her passes out of the midfield. Killion, quite simply, in the matches we have with location data logged, attempts more open play passes out of the middle 3rd of the field, per 90 minutes, than anyone else in the league.

Compared to almost every other player visible here, they pass the ball in open play out of the middle 3rd more times than anyone else except for Barnes, who is only ahead of Brian. They both have a very high percentage of their passes coming out of the midfield.

Now, what about the passing percentages? Below is a chart stacking, for each player, their open play passing completion percentages in each third of the field. Almost everyone’s passing completion percentage drops as they get closer to the opponent’s goal, so here relative differences are what’s interesting to look at.

Screen Shot 2017-03-12 at 4.37.37 PM

Recall that Killion had more open play passes attempted out of the middle 3rd. Now we can see that she also has a significantly higher passing completion out of the middle 3rd, 85.7%, than Brian – and almost everyone else in this list of top-16 most open play passes attempted per 90, except for Little, who has an astonishing 90.1%, and Fletcher, with whom she’s tied.

Brian, on the other hand, has a significantly higher passing completion percentage out of the attacking 3rd, 77.5% and nearly 12 points higher than Killion – and also tied with Buczkowski for highest out of everyone visible here. Do the math against Brian’s 8.2 open play passes attempted per 90 out of the attacking 3rd, and she’s good for at least 6 completed passes in that third of the field for any given game.
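
The math in question, using the two figures quoted above for Brian in the attacking 3rd:

```r
# Figures quoted in the text for Brian in the attacking 3rd.
attempts_per90 <- 8.2    # open play passes attempted per 90
completion_pct <- 0.775  # open play passing completion percentage

completed_per90 <- attempts_per90 * completion_pct
print(completed_per90)   # 6.355
```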

We’ll break down these middle 3rd and attacking 3rd passes further by breaking them down in two different ways – by the direction of the pass (backwards, sideways, or forwards) and by how many were through balls, launch balls, or crosses. That’ll help us better understand what might be behind the differences in passing percentages and how they might differ in the types of passes they attempt.

Open Play Passes by Direction

Below are bar charts now for only Killion and Brian, showing the percentage of their open play pass attempts that went forward, sideways, and backwards, for each third of the field.

Screen Shot 2017-03-12 at 6.23.57 PM

Brian and Killion have virtually the same distribution of open play passes by direction in the middle 3rd, so any differences we can glean from our stats aren’t quite going to be found here. Killion’s open play passing direction in the attacking 3rd, however, is massively different. 71% of her open play passing attempts in the attacking 3rd are going forward, compared to Brian’s 40%. It’s not clear yet, although it might be a smart guess, if these forward pass attempts are what’s bringing down her passing completion percentage. Also recall that this represents about 5.4 and 8.2 open play pass attempts per 90 in the attacking 3rd for Killion and Brian, respectively. Do the math and this means that, even with fewer attempts in the attacking 3rd, Killion comes out at about 3.8 forward open play pass attempts per 90 compared to Brian’s 3.3. It’s a difference of 1 more forward pass attempt every other game for Killion.
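
Spelled out with the figures quoted above, the forward-pass arithmetic looks like this:

```r
# Attacking 3rd figures quoted in the text.
killion_att_per90 <- 5.4   # open play pass attempts per 90
killion_fwd_share <- 0.71  # share of those attempts that go forward
brian_att_per90   <- 8.2
brian_fwd_share   <- 0.40

killion_fwd_per90 <- killion_att_per90 * killion_fwd_share
brian_fwd_per90   <- brian_att_per90 * brian_fwd_share

print(round(killion_fwd_per90, 1))                    # 3.8
print(round(brian_fwd_per90, 1))                      # 3.3
print(round(killion_fwd_per90 - brian_fwd_per90, 2))  # 0.55, about 1 every other game
```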

Numbers for attempts by direction are good and give insight into how Brian and Killion are trying to move the ball around but we also have data on passing completion percentages. Below are bar charts breaking down open play pass attempts by direction in the middle 3rd. Each pair of bar charts is for a different direction – backwards, sideways, and forward. The red is incomplete pass attempts, and the orange is complete pass attempts.

Screen Shot 2017-03-12 at 6.39.06 PM.png

Recall that Killion had a couple more pass attempts per 90 in this third of the field, and a significantly higher passing completion percentage, but as far as distribution of direction of passes (the previous chart) they were both very similar. Now Killion and Brian have very similar numbers of passes completed per 90 minutes for backwards and sideways passes, but there’s a significant change for forward passes. Killion is good for almost 3 more completed forward passes in the middle third.

Now let’s look at this same chart, but for the attacking 3rd where there were big differences in the distribution of passes by direction and where Brian had a significantly higher passing completion percentage.

Screen Shot 2017-03-12 at 6.48.31 PM.png

The differences in completed passes are barely above 1, but they do add up, especially considering the total number of pass attempts in this third of the field for both players are in the single digits. So that difference of 0.9 more forward pass incompletions per 90 isn’t massive, but it is chipping away at Killion’s passing completion percentage.

At this point it’s worth noting that the past few charts mean different things depending on how much a “forward pass completion,” a higher “passing completion percentage,” or more “pass attempts per 90” means to you. It intuitively seems to make sense that more of each is good, but these two players have each had higher numbers in different areas – neither appears to be significantly higher across all stats. Killion in the middle 3rd has a few more forward passes completed, a higher completion percentage, and more open play pass attempts per 90. Brian in the attacking 3rd, however, has slightly more forward passes completed, a higher completion percentage, and more open play pass attempts per 90. If you’re going to get into a discussion about which midfielder is “better” based on these stats, you also need to talk about what you expect out of a defensive midfielder. How good do you expect them to be at passing in the midfield, and – assuming attacking duties aren’t their primary responsibilities – how good do they have to be in the attacking 3rd to make up for a difference compared to someone else in the middle 3rd?

And then there’s the question of how much passing numbers should be adjusted given a team’s players, formation, tactics, and overall performance. If Killion’s passing numbers in the middle 3rd on the face of it are good enough, is there something about the way Brian’s team, the Houston Dash, plays and performs that may forgive lower numbers? The same goes for the attacking 3rd – Brian’s numbers look better, but is there something about Killion’s team, Sky Blue FC, that when taken into consideration makes her a more valuable player than Brian in the attacking 3rd? And, as far as this project is concerned, how much of this extra information is in all the data we’ve already tracked and can thus analyze ourselves?

Some of this additional information is likely sitting in all the match spreadsheets that have been logged for this WoSo Stats project – there’s the potential for further insights if we could get data on passing networks, on situations such as when a team is trailing, on matchups based on the type of players and teams a player is going up against, and likely much more.

For now, let’s look at two more types of passing data. We’ll look at completed passes that go across different thirds of the field, and special types of passes – launch balls, through balls, and crosses in the middle 3rd and attacking 3rd.

Passing Range

The chart below shows the top players by open play passes attempted per 90, with passes completed from the middle 3rd into different thirds of the field (and within the middle third) and with passes completed from the attacking 3rd back into the middle 3rd and within that attacking 3rd. We only have data for completed passes because sometimes it’s not reliably possible to figure out where an incomplete pass was trying to go – such as when it’s blocked right in front of a player trying to pass the ball and it’s not clear just how far down the field the ball was supposed to go.

Screen Shot 2017-03-12 at 7.32.28 PM

Killion overall is completing more passes within and out of the midfield, close to 5 more. The great majority of those are passes that stay within the middle 3rd, and the same is true for Brian. Brian has a few more passes completed within the attacking 3rd. Overall, there doesn’t appear to be a whole lot here to differentiate the two. They’re both obviously distinct from a lot of other players visible here, but it looks like all we can tell from this is that Killion completes more passes per 90 minutes within the middle 3rd than Brian.

Through Balls, Launch Balls, and Crosses

Finally, a look at through balls and launch balls out of the middle 3rd, and through balls and crosses out of the attacking 3rd. Numbers for both players here per 90 minutes end up being small. Red is incomplete open play pass attempts, and orange is complete open play pass attempts.

 

Screen Shot 2017-03-12 at 7.52.30 PM

Screen Shot 2017-03-12 at 7.52.37 PM

Killion in the middle 3rd appears better at launching the ball forward and completing a through pass, with more completions per 90 and a higher completion percentage for each type of pass.

There’s less to see in the attacking 3rd for either player. Killion and Brian barely complete any through balls from the attacking 3rd, likely because by the time they’ve arrived there from deep in the midfield, most of the opposing team’s defense is already well situated in front of the goal. Killion attempts a negligible number of crosses, and Brian completes about one cross every other game.

Next steps

These two players were an interesting case study because of how similar they are in playing style and how good they are. I had to explore quite a bit of stats as on the face of it they were quite similar with regards to passing attempts and completion percentage, even when broken down by thirds of the field.

In the future, I’d like to do this with other NWSL players who are also considered defensive midfielders – players like Buczkowski and Winters, and others – to see just how alike everyone who plays this type of midfielder role really is. I touched on this briefly, but something like a visualized passing network, showing just who is getting all these passes, could also shed light on not just where players like Killion and Brian distribute the ball, but who they’re passing it to. Are they passing it off to mostly defenders, wingers, attacking midfielders, or straight to the forwards? There are also curious cases where each player has lined up not quite in the defensive midfielder role but maybe somewhere further up the midfield or out on the wing – it could be possible to account for those matches. And I haven’t even added any stats related to defending, which is a whole ‘nother aspect of being a defensive midfielder that is arguably just as important as how well they pass the ball.

This is all beyond the scope of this blog post, and I hope to revisit it another time. Or feel free to go after it yourself, as the data is all there in the WoSo Stats GitHub repo. For now, I hope you’ve enjoyed a look at how the data we’ve logged can dig into the differences – and similarities – between two very good players who, with very few goals and assists, don’t show up prominently on traditional stats sheets but, with the stats we’ve got, show up as vital parts of the midfield.

One last thing, and one last time, the WoSo Stats project needs your help! If you’re interested in logging data for matches, read more here and email me at wosostats.team@gmail.com or send me a DM at @WoSoStats on Twitter. All the data logged will be publicly available on the WoSo Stats Github repo.

A look at Ali Krieger’s passing and defending stats in the NWSL

Yesterday, I did a bit of how-to and analysis in one post for open play passing stats in the NWSL 2016 season – at least for the matches for which we have location data (40 out of 103, as of this writing, to be exact).

Now, I’m going to work with the same dataset in combination with the NWSL 2016 rosters database we’ve got to look at a very statistically interesting type of player – the fullback. We’ll focus on one highly productive example – the Washington Spirit’s Ali Krieger, one of the fullbacks in our dataset with the most open play passes attempted per 90 minutes. We’ll look at how her passing and defensive responsibilities change in each third.

The following visualizations and data are all available to interact with and download from this Tableau visualization. All the data can be calculated from our WoSo Stats GitHub repo by following the instructions in yesterday’s post for how to use the R code I’ve created.

The following data is also only for 40 out of 103 NWSL 2016 matches that we’ve logged with complete location data. To see the list of matches this data represents, see the database in the WoSo Stats Github and look for all the matches with “yes” in the “location.complete” column.
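For anyone who wants to pull that list of matches in R, here is a minimal sketch of the filter. It runs on a tiny made-up stand-in for the database: only the “location.complete” column and its “yes” value come from the description above, while the `match.id` and `competition.slug` columns and every row are assumptions for illustration.

```r
# Toy stand-in for the matches database; every row here is made up.
database <- data.frame(
  match.id = c("match-a", "match-b", "match-c", "match-d"),  # hypothetical IDs
  competition.slug = c("nwsl-2016", "nwsl-2016", "nwsl-2016", "other-comp"),  # assumed column
  location.complete = c("yes", "no", "yes", "yes"),
  stringsAsFactors = FALSE
)

# Keep only NWSL 2016 matches logged with complete location data.
keep <- database$competition.slug == "nwsl-2016" &
  database$location.complete == "yes"
nwsl_located <- database[keep, ]

# Number of matches that survive the filter.
n_located <- nrow(nwsl_located)
```

With the real database file loaded in place of the toy data frame, the same two-condition filter should give the matches behind the charts in this post, provided the column names match.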

Open Play Pass Attempts by Thirds of the Field

Let’s look again at this chart from yesterday. It shows all players with a minimum of 270 minutes logged with location data by the WoSo Stats project, sorted by open play pass attempts per 90, and broken down by the third of the field in which a player’s passes originated.

screen-shot-2017-02-25-at-11-28-48-am

Takes just a little bit of mental math to notice that out of all the defenders up here, the two with a noticeably higher percentage of pass attempts in the attacking 3rd are O’Hara and Krieger. Those of you who follow the sport will know why – they’re both fullbacks, asked to join the attack much more often than their counterparts at centerback.

Now, using the rosters database, let’s filter out anyone who isn’t a defender.

Sticking with these passing stats one more time before we delve into other stats, I duplicated the chart above with only defenders kept. The stacked bars now represent the percentage of each defender’s open play pass attempts that came from each third of the field. I sorted by percentage of passes in the attacking 3rd, so the defenders with the highest share of passes there are at the top.
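If you’d rather do this step in R than in Tableau, the sketch below shows one way it could work: join the passing table to a roster lookup, keep the defenders, and turn attempts in each third into a share of the player’s total. All the column names (`d3_att`, `m3_att`, `a3_att`, `position`) and all the numbers are made up for illustration, and the real stats table and rosters database will likely name things differently.

```r
# Made-up pass attempts by third (not real WoSo Stats numbers).
passes <- data.frame(
  player = c("Player A", "Player B", "Player C"),
  d3_att = c(20, 50, 10),
  m3_att = c(60, 40, 50),
  a3_att = c(20, 10, 40),
  stringsAsFactors = FALSE
)

# Made-up roster lookup; "D" for defender and "F" for forward are assumed codes.
rosters <- data.frame(
  player = c("Player A", "Player B", "Player C"),
  position = c("D", "D", "F"),
  stringsAsFactors = FALSE
)

# Join on player name, then drop anyone who isn't a defender.
defenders <- merge(passes, rosters, by = "player")
defenders <- defenders[defenders$position == "D", ]

# Share of each defender's attempts that came from each third, in percent.
total_att <- defenders$d3_att + defenders$m3_att + defenders$a3_att
defenders$d3_pct <- 100 * defenders$d3_att / total_att
defenders$m3_pct <- 100 * defenders$m3_att / total_att
defenders$a3_pct <- 100 * defenders$a3_att / total_att

# Largest attacking-third share first, like the chart's sort order.
defenders <- defenders[order(-defenders$a3_pct), ]
```

Dividing by each player’s own total is what makes the three shares add up to 100%, so a fullback and a centerback with very different volumes can be compared on the same bar.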

 

screen-shot-2017-02-26-at-12-37-56-pm

The top, for those of you familiar with the NWSL 2016 season, is full of fullbacks. Pickett, Reed, Catley, Gilliland, Hinkle, they’re all there. Reed and Pickett, the two Seattle players visible here, are notably the only defenders with more than 25% of their pass attempts in the attacking 3rd. Further down the chart, we run into more defenders who were shuffled between the fullback and centerback role throughout the matches we’ve logged, and then more and more centerbacks.

Back to Krieger. 18.8% of Krieger’s open play pass attempts were in the attacking 3rd, compared to the other fullback on her team visible here, Kleiner, with 15.3%. Krieger’s middle 3rd open play pass attempts were 60.1% of all her attempts, and her defensive 3rd open play pass attempts were the remaining 21%.

There’s a lot of different things we could look at with regards to passing stats. How is Krieger passing the ball out of her defensive 3rd, relative to other defenders? What’s she doing in the midfield with over 60% of her pass attempts? And what’s going on with those 18.8% of passes in the attacking 3rd compared to everyone who’s above her – and below her?

Open Play Passing in the Attacking 3rd

Let’s just look at those attacking 3rd open play passing attempts. Below is the same group of players from above, now charting the percentage of open play pass attempts in the attacking 3rd vs. their open play passing completion percentage in that attacking 3rd.

Out of all the defenders whose percentage of open play pass attempts in the attacking 3rd is above the 75th percentile, Krieger is one of three, along with Reed and Klingenberg, whose open play passing completion percentage in the attacking 3rd hovers around the 75th percentile.

Screen Shot 2017-02-26 at 1.42.34 PM.png

Recall from yesterday’s post that the median open play passing completion percentage in this 3rd of the field for all players was 60.6%. Krieger’s is 65.4%.
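For reference, here is a rough sketch of how a median and 75th percentile like these can be computed in R. The completion percentages below are made up and deliberately not the real ones. Note that `quantile` uses R’s default interpolation method, which may not match exactly how Tableau draws its reference lines.

```r
# Made-up attacking-third completion percentages for seven players.
a3_comp_pct <- c(40, 50, 55, 60, 65, 70, 80)

# 25th percentile, median, and 75th percentile of the group.
cutoffs <- quantile(a3_comp_pct, probs = c(0.25, 0.5, 0.75))

# Share of players at or below a given value, here a hypothetical 65%.
share_at_or_below <- ecdf(a3_comp_pct)(65)
```

A player sitting at 65% in this toy group would be at or above five of the seven players, which is the kind of comparison being made against the median here.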

Defending in the Middle and Defensive 3rd

Now let’s turn to defending stats. I started off with what in the stats table are called “possession disruptions,” – successful tackles and dispossessions of the opponent. That is, instances where a defender was attempting to go 1 on 1 with an opponent and strip the ball away. Below is a chart for all defenders with at least 270 minutes logged with location data, sorted by opponent possessions disrupted per 90 minutes, and broken down by what third of the field they were in.

Screen Shot 2017-02-26 at 2.25.50 PM.png

Krieger doesn’t even show up in this list. She’s down below in the middle of the pack, not even getting more than two opposing possessions disrupted per 90.

screen-shot-2017-02-26-at-2-26-00-pm

But there’s more than one way to defend, and her contribution to defense is much more apparent when we look at a different type of defending stats – “ball disruptions.” That is, interceptions, blocks, and clearances of the opponent’s ball – usually a pass attempt. Below is a chart for all defenders with at least 270 minutes logged with location data, sorted by ball disruptions per 90 minutes, and broken down by what third of the field they were in.
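Here is a small sketch of that calculation in R: add up interceptions, blocks, and clearances, scale to 90 minutes, apply the 270-minute cutoff, and sort. The column names and the numbers are assumptions made up for this example, not the names used in the actual stats table.

```r
# Made-up defending totals (not real WoSo Stats numbers).
defending <- data.frame(
  player = c("Player A", "Player B", "Player C"),
  minutes = c(360, 270, 90),
  interceptions = c(16, 6, 5),
  blocks = c(4, 3, 1),
  clearances = c(8, 15, 4),
  stringsAsFactors = FALSE
)

# Ball disruptions are the sum of the three event types.
defending$ball_disruptions <- defending$interceptions +
  defending$blocks + defending$clearances

# Scale to a per-90-minute rate.
defending$ball_disruptions_p90 <- 90 * defending$ball_disruptions / defending$minutes

# Apply the 270-minute minimum, then sort from most to fewest per 90.
defending <- defending[defending$minutes >= 270, ]
defending <- defending[order(-defending$ball_disruptions_p90), ]
```

The minutes cutoff matters: Player C has the highest rate in the toy data but only 90 minutes logged, so she is dropped before the sort.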

Screen Shot 2017-02-26 at 2.32.00 PM.png

 

Krieger is not only up there at #3, but she’s also surrounded by mostly centerbacks.

Now let’s look at those disruptions in the defensive 3rd and middle 3rd, broken down by whether they were interceptions, blocks, or clearances.

screen-shot-2017-02-26-at-2-34-55-pm

screen-shot-2017-02-26-at-2-35-03-pm

Krieger is also out of the top 15 when we look at ball disruptions in the defensive 3rd, but in the middle 3rd she’s ridiculously ahead of every other defender. Even when I included all players, not just defenders, in this visualization, she was still far and away the top of the list.

Screen Shot 2017-02-26 at 2.37.36 PM.png

Krieger’s interceptions alone, at 4.1 per 90 minutes, are higher than all ball disruptions combined for some players. I personally think extremely highly of interceptions, as they’re instances when a defending player not only stops the ball but wins clear possession of it – essentially getting credit for a turnover in possession. To get them that high up the pitch compared to every other player might not just make her a good defender, it can also make her a dangerous attacker.

We need your help!

As was noted above, this is only 40 matches out of a 103-game NWSL 2016 season. The WoSo Stats project desperately needs your help to log more basic stats and location data for the 2016 season. The more data we get, the better we’ll understand the sport.

If you’re interested in logging data for matches (that are all publicly available on YouTube), read more here and email me at wosostats.team@gmail.com or send me a DM at @WoSoStats on Twitter. All the data logged will be publicly available on the WoSo Stats Github repo.

 

 

How to break down NWSL passing stats by thirds of the field

In this post, we’re going to look at passing stats by location.

We’ll create two spreadsheets, one with stats for all NWSL 2016 matches that have been logged with location data by the WoSo Stats project, and another with those same stats but broken down by thirds of the field.

I’ll show you the R code used to generate them, and we’ll go over some Tableau visualizations I’ve created to dig into the passing data a little further.

The instructions for how to use the creating-stats.R file are here in the WoSo Stats Github repo. If you’re familiar with R, first things first, source this R file and then run the getStatsInBulk function with the arguments shown below:

your_stats_list <- getStatsInBulk(competition.slug = "nwsl-2016", location_complete = TRUE)

This will take about a minute. Then run the mergeStatsList function with the following arguments to get the stats table as a data frame named “your_stats”:

your_stats <- mergeStatsList(stats_list = your_stats_list, location = "none", add_per90 = TRUE)

In there are columns for open play passes, which in the columns are called “opPass.” Open play passes are defined as all passes that aren’t one of the following – namely, dead ball plays:

  • Throw-ins
  • Corner kicks
  • Goal kicks
  • Free kicks
  • Drop kicks or throws by the goalkeeper
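To make that definition concrete, here is a minimal sketch of filtering out dead ball plays in R. The event log, its column names, and the play type labels are all made up for illustration. The project’s match spreadsheets use their own codes, and creating-stats.R handles this for you.

```r
# Made-up pass log; the play_type labels are illustrative, not the project's codes.
passes <- data.frame(
  play_type = c("open play", "throw-in", "corner kick", "open play",
                "goal kick", "free kick", "goalkeeper drop kick", "open play"),
  completed = c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Dead ball plays, following the list above.
dead_ball <- c("throw-in", "corner kick", "goal kick", "free kick",
               "goalkeeper drop kick", "goalkeeper throw")

# Open play passes are whatever is left after removing dead ball plays.
open_play <- passes[!(passes$play_type %in% dead_ball), ]

op_att <- nrow(open_play)            # open play passes attempted
op_comp <- sum(open_play$completed)  # open play passes completed
op_comp_pct <- 100 * op_comp / op_att
```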

The columns we’re going to be primarily concerned with are those named “opPass Att per 90” and “opPass Comp Pct,” and it might be useful to also look at “opPass Comp per 90.” When we break these down by thirds of the field further below, they’ll be prefixed with their respective location – so, there will be “A3 Pass Att per 90,” “M3 Pass Att per 90,” and “D3 Pass Att per 90.”

If you don’t know anything about R, don’t worry, you can just follow along with the charts below and ignore all these details about the code and spreadsheets.

The data represented in this post will be available to download from this Tableau visualization. There, you can also interact with the charts shown below.

Another fair warning: the following data only represents 40 of the 103 matches in the NWSL 2016 season. They’re all the NWSL 2016 matches in the database with “yes” marked off in the location.complete column. We need more help logging data, and that help could be you!

On to the data, though. What do open play passes look like, without regard for where they came from?

Open Play Passes (without location)

This chart shows open play passing completion percentages, sorted by open play passes attempted per 90. That is, the players at the top attempted the most passes in open play per 90 minutes (take their open play pass attempts, divide it by the number of minutes they played, and multiply that quotient by 90).
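That per 90 arithmetic, along with the completion percentage, can be sketched in R as below. The function names `per90` and `comp_pct` are my own for this example and are not functions from creating-stats.R, and the player’s numbers are made up.

```r
# Per-90 rate: total events divided by minutes played, scaled to 90 minutes.
per90 <- function(total, minutes) {
  90 * total / minutes
}

# Completion percentage: completed passes as a share of attempted passes.
comp_pct <- function(completed, attempted) {
  100 * completed / attempted
}

# Made-up player: 150 open play pass attempts, 120 completed, in 270 minutes.
att_per90 <- per90(150, 270)
pct <- comp_pct(120, 150)
```

For this made-up player, that works out to 50 open play pass attempts per 90 minutes and an 80% completion percentage.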

screen-shot-2017-02-25-at-10-51-31-am

Here is a table showing the data behind this chart, with an added column for open play passes that were actually completed. “GP” is games played (really, the games that we’ve logged) and “MP” is minutes played.

Screen Shot 2017-02-25 at 10.53.50 AM.png

The top 15 is full of players with generally very high passing completion percentages – all are above the median of 74.9%, except for Fishlock and Krieger.

This chart is stacked with Seattle Reign players, but it’s also stacked with largely defensive-minded players. Corsie, Fletcher, Barnes, Averbuch, O’Hara, Hickmann Alves, and Krieger – nearly half the players are defenders. Defenders usually have higher passing percentages (or at least they should), and they probably see more of the ball than the rest of their teammates, so it shouldn’t be surprising that, since we sorted by open play pass attempts per 90, we got a lot of defenders, and that most of them have pretty good passing completion percentages.

How do we look at open play passing stats, then, in a way that accounts for a lot of passing going on in the defensive third? What’s going on with Little? Does her passing completion percentage fall off the top 15 if we look at her passes in the attacking third? And what about O’Hara, a player who is known to run up and down the field? What does her passing look like in the defensive, middle, and attacking thirds of the field?

To get this data, we have to run some R code again.

Open Play Passes (broken down by thirds of the field)

To get a stats table with all stats broken down by thirds of the field (attacking, middle, and defensive thirds), run this code.

your_stats_list <- getStatsInBulk(competition.slug = "nwsl-2016", location_complete = TRUE, location = "thirds")

your_stats <- mergeStatsList(stats_list = your_stats_list, location = "thirds", add_per90 = TRUE)

You might be sitting there for a few minutes, but the “your_stats” data frame, a 900-column table, will have what we’re looking for.
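With that many columns, it helps to pull out just the ones you need. Here is a minimal sketch, using a tiny made-up data frame in place of the real one. The three “Pass Att per 90” column names are the ones mentioned earlier in this post, while the `Player` column, the extra shots column, and all the values are assumptions for illustration.

```r
# Tiny stand-in for the wide stats table; values are made up.
your_stats_demo <- data.frame(
  Player = c("Player A", "Player B"),
  `A3 Pass Att per 90` = c(10.2, 4.1),
  `M3 Pass Att per 90` = c(30.5, 22.0),
  `D3 Pass Att per 90` = c(5.0, 18.3),
  `A3 Shots per 90` = c(1.2, 0.1),  # hypothetical unrelated column
  check.names = FALSE,              # keep the spaces in the column names
  stringsAsFactors = FALSE
)

# Find every column whose name ends in "Pass Att per 90".
pass_cols <- grep("Pass Att per 90$", names(your_stats_demo), value = TRUE)

# Keep the player name plus those columns only.
thirds <- your_stats_demo[, c("Player", pass_cols)]
```

The same `grep` pattern idea should carry over to the real “your_stats” data frame, though the exact column names there are worth checking with `names(your_stats)` first.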

Now, when we sort by open play passes attempted per 90 and break down passes by thirds of the field for that top 15, it becomes clearer where everything was going on.

screen-shot-2017-02-25-at-11-28-48-am

Fishlock – who, it should be pointed out, only has 4 matches logged with location data in this dataset – is far ahead of the pack when it comes to open play pass attempts, but very few are from her own defensive third. The bulk of her open play pass attempts, as it is for almost everyone seen here, is in the middle of the field, but there is a significant portion of attempts going on in the attacking third.

Another player who had a relatively low open play passing completion percentage was Krieger, and the distribution of her passes is more even. Roughly 60% of her passes were in the middle, and roughly 20% each in the defensive and attacking thirds. Her passing completion percentage is probably pretty good in the defensive third, but we’ll soon have a look at what it’s like in the middle and attacking thirds.

And then there’s Little, who had a better open play passing completion percentage by over 20 percentage points than Fishlock, and that’s with a higher percentage of passes in the attacking third (27%, compared to Fishlock’s 24%).

What this chart lacks is passing completion percentages for each third of the field. For that, we can look at a chart, similar to the first one, but for each third of the field.

Open Play Passes in the Defensive 3rd

When looking at open play passing completion percentages in the defensive 3rd, and sorting by how many open play passes were attempted (per 90) out of the defensive 3rd, the chart is exclusively defenders and goalkeepers.

Screen Shot 2017-02-25 at 11.45.50 AM.png

Unsurprisingly, the median open play passing completion percentage in the defensive 3rd, at 81%, is higher than the median for all open play passes. There’s quite a range of passing completion percentages, from over 90% for the likes of Kallman and Fletcher to at or below 70% for Pressley and D’Angelo (a goalkeeper). That’s probably more of a reflection of how they’re trying to get the ball out of their own 3rd – D’Angelo and Pressley are probably launching more speculative long balls into the midfield and attacking 3rd, while Fletcher and Kallman might be passing the ball around in the defensive 3rd much more.

That requires a deeper look at the type of passes out of the defensive 3rd, but we’ll save that for another day. Now, let’s look at this chart, but for passes in the middle 3rd.

Open Play Passes in the Middle 3rd

When looking at open play passing completion percentages in the middle 3rd, sorted by open play passes attempted there, it’s a different story.

Screen Shot 2017-02-25 at 11.53.48 AM.png

Defenders are all out of the picture now, except for Barnes, and the top 15 is now stacked with midfielders. For those of you who follow the NWSL pretty closely, you’ll also notice these are mostly defensive-minded midfielders. Killion, Brian, Winters, Zerboni, Kyle, and Colaprico are all midfielders generally known to lie deep in the field and support the defense. And it makes sense they’d appear at the top of this list, and generally with such high passing completion percentages, as they’re likely to get the ball a lot, either from the defense, other midfielders passing back, or by winning it from the opposing team.

Little is no longer the #2 player, but she is #1 when looking at passing completion percentage for this top 15. She has an impressive 90.1% passing completion percentage in the middle 3rd with 33.2 open play passes attempted per 90 minutes in that third of the field. Killion is up there, too, with an 85.7% completion percentage in the middle 3rd with 36.5 open play passes attempted per 90 minutes.

Meanwhile, the rest of this top 15 is generally at or above the median of 76.3% for passing completion percentage. Fishlock sticks out for the wrong reason – with the most open play passes attempted per 90 in the middle 3rd (42.7) but with a passing completion percentage of only 65.5%, well below the 25th percentile.

What else could we look at here? There are a lot of passes here. How good are these numbers when we look at passes going forward? How many are being launched forward, or how many are going back to the defense? That’s another analysis for another day, but it’s worth considering if simply looking at pass attempts vs. pass completion percentage is going to hide players who maybe don’t pass the ball a lot out of the midfield and don’t have the highest completion percentages – but, maybe they’re more likely to complete a through ball at the expense of a higher passing completion percentage from safer passes, or maybe they’re launching the ball forward and into aerial duels that their teammates are losing but are still creating dangerous loose balls their teams can capitalize on.

The median for passing completion percentage has been dropping the further up we go up the field. It was at 81.3% in the defensive 3rd, 76.3% in the middle 3rd, and now we’re going to see how far it drops in the attacking 3rd.

Open Play Passes in the Attacking 3rd

When we look at open play passing percentages in the attacking 3rd, and sort by open play passes attempted in that third per 90, the percentages are all over the place. There’s also a lot of new names – namely forwards and more attacking-minded midfielders.

screen-shot-2017-02-25-at-12-11-07-pm

The median open play passing completion percentage in this 3rd is low, at 60.6%. That makes sense, as you’re likely not going to have an easy time moving the ball around that close to an opponent’s goal. There are several players who still stand out, though.

Back to Kim Little: her passing completion percentage in this third, at 76.5%, is more than 13 percentage points lower than her 90.1% in the middle 3rd. But compared to the rest of the field, she’s a star, over six percentage points above the 75th percentile.

Perhaps even more impressive is Washington’s Banini, who we haven’t even seen in the top 15 by open play pass attempts per 90 until now. With 14.4 open play pass attempts in this 3rd per 90, she’s completing 83.0% of them. That would be above the 75th percentile even in the middle 3rd!

Fishlock is here, too, although her passing completion percentage is comparable to the rest of the field, unlike in the middle 3rd where she was relatively very low. Relatively low compared to everyone else, though, are Melis and Leon, who attempt to pass the ball a lot in this 3rd but struggle to get half of them completed.

If we were to break this down further, we’d want to look at how many of these completed passes are staying in the attacking 3rd or if a significant number of passes out of this 3rd are going back to the midfield. Also, what about crosses, and are those types of high-risk-high-reward passes behind Melis’ and Leon’s low completion percentage? And what about forwards like Alex Morgan and Lynn Williams, who aren’t even in this top 15? Should we even expect them to have high passing attempt numbers, or should a table like this only include fullbacks, midfielders, inside forwards, and exclude players whose job is to shoot first?

We need your help!

As was noted above, this is only 40 matches out of a 103-game NWSL 2016 season. The WoSo Stats project desperately needs your help to log more basic stats and location data for the 2016 season. The more data we get, the better we’ll understand the sport.

If you’re interested in logging data for matches (that are all publicly available on YouTube), read more here and email me at wosostats.team@gmail.com or send me a DM at @WoSoStats on Twitter. All the data logged will be publicly available on the WoSo Stats Github repo.

It’s been a year!

It’s been a year since this WoSo Stats project went live! To be precise, it’s been a year and two days since this tweet, when I first went public with this project.

A year later, we have over 100 matches in this project’s database, and are 75% complete with logging the NWSL 2016 season.

None of this would have been possible without the incredible, hard work of the dedicated volunteers behind this project. There have been dozens who have helped out, some for one or two matches, and some for far more, but each of them has helped us better understand this beautiful game.

It’s been a humbling experience seeing how eager fans have been to help do something that hasn’t been done before in women’s soccer. I truly believe that the growth of women’s soccer could be one of the next generation’s most interesting, fascinating stories in sports. In the meanwhile, we’ve got an NWSL season to finish, and even bigger hopes for this year.

-Alfredo