As mentioned in Part I of this multipart post, Luca Pappalardo prepared a video, for the Friends of Tracking channel in 2020, to talk about some elements of a paper related to an open Wyscout data set, and advanced statistics related to passing networks, flow centrality and player ranking.
For this post (Part III) I’m going to cover my take on passing networks and flow centrality.
I’ve forked the “mapping-match-events-in-Python” repo into my mmoffoot area and created a new branch called ‘englanddata’ to cover the data set of English Premier League information for the 2017-18 season.
Passing networks
This Notebook creates a player passing network for any of the matches covered in the data set. The passing network is a weighted network where nodes are players and weighted edges represent movements of the ball between players. The size of an edge is proportional to the number of passes between the players.
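To make that definition concrete, here is a minimal sketch of how such a weighted network could be built, assuming the passes are already available as (passer, receiver) pairs. The player labels, the sample passes and the `build_passing_network` helper are my own illustrations; they are not the Notebook's code and not real match data.

```python
from collections import Counter

# Hypothetical pass list for one team: (passer, receiver) pairs.
passes = [
    ("centre_back", "midfielder"),
    ("midfielder", "winger"),
    ("centre_back", "midfielder"),
    ("winger", "striker"),
    ("midfielder", "centre_back"),
]


def build_passing_network(passes):
    """Return (nodes, edges): the set of players and a Counter that maps each
    directed (passer, receiver) pair to the number of passes between them."""
    edges = Counter(passes)
    nodes = {player for pair in edges for player in pair}
    return nodes, edges


nodes, edges = build_passing_network(passes)

# The edge weight is the value a plot would use to scale the edge width.
for (passer, receiver), weight in sorted(edges.items()):
    print(f"{passer} -> {receiver}: {weight}")
```

The edges are kept directed here, so a pass from the centre back to the midfielder is counted separately from a pass going the other way.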
Some finer details are covered in the research paper Cintia et al., The harsh rule of the goals: data-driven performance indicators for football teams, In Proceedings of the 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA’2015), 2015.
I wanted to continue with a passing network from the Tottenham Hotspur – Leicester City match (2018), since that is the match used for the events up to this point. However, the Notebook produced a blank image for both teams. I tried another match and got a network for only one team. Finally, the Arsenal – Burnley match produced what I was looking for, so I switched to that match.
The produced images need some time to study; even after a quick look I’m left wondering what I can take from them. I also tried to gain some insight from a related paper by P. Cintia, S. Rinzivillo, and L. Pappalardo, “A network-based approach to evaluate the performance of football teams,” in Proceedings of the Machine Learning and Data Mining for Sports Analytics workshop, ECML/PKDD 2015, where it mentions again that nodes are players, directed edges represent passes between players, and the size of an edge is proportional to the number of passes between the players. Node 0 indicates the opponent’s goal, and edges ending in node 0 represent goal attempts. However, there is no node 0 in these passing networks.
The related video has a section at minute 15:00 that describes the passing networks too, but unfortunately there’s still not enough in there to hint at what the takeaway is.
Flow centrality
Next up is flow centrality, a feature that can be computed on the passing network. The Notebook describes it as a way to capture the fraction of times that a player intervenes in those paths that result in a shot. The authors take defensive efficiency into account by letting each player start a number of paths proportional to the number of balls that he recovers during the match.
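As a rough illustration of the "fraction of paths that result in a shot" idea, here is a simplified sketch that counts directly over observed possession sequences. This is my own simplification and not the network-based computation from the Notebook or from Duch et al.; in particular it does not model the weighting by recovered balls. The sequences, the player labels and the `shot_path_involvement` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical possession sequences: (players who touched the ball in order,
# whether the sequence ended in a shot).
sequences = [
    (["centre_back", "midfielder", "striker"], True),
    (["centre_back", "left_back"], False),
    (["midfielder", "winger", "striker"], True),
    (["goalkeeper", "centre_back", "midfielder"], False),
]


def shot_path_involvement(sequences):
    """For each player, return the fraction of shot-ending sequences
    in which that player touched the ball at least once."""
    shot_paths = [players for players, ended_in_shot in sequences if ended_in_shot]
    if not shot_paths:
        return {}
    involvement = Counter()
    for players in shot_paths:
        # set() so a player is counted once per sequence, however many touches.
        involvement.update(set(players))
    return {player: count / len(shot_paths) for player, count in involvement.items()}


scores = shot_path_involvement(sequences)
for player, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{player}: {score:.2f}")
```

Players who only appear in sequences that did not end in a shot get no score at all in this sketch, which is one way to see why the measure rewards being part of the moves that lead to an attempt on goal.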
This concept is only lightly explained in the paper referenced in the Notebook, Duch et al., Quantifying the Performance of Individual Players in a Team Activity, PLoS ONE 5(6): e10937, and I must admit I still didn’t get it. A read of a source paper, Freeman LC, A set of measures of centrality based upon betweenness, Sociometry 40: 35–41, 1977, clears it all up: “when a particular person in a group is strategically located on the shortest communication path connecting pairs of others, that person is in a central position”. That same paper goes on to show measures that define centrality in terms of the degree to which a point falls on the shortest path between others and therefore has a potential for control of communication.
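Freeman's idea can be sketched in a few lines. The example below computes unnormalised betweenness on a small undirected, unweighted graph, using Brandes' well-known shortest-path counting algorithm rather than anything taken from the Notebook. The passing links, the player labels and the `betweenness` helper are hypothetical and chosen only so the result is easy to check by hand.

```python
from collections import deque

# Hypothetical undirected passing links between four players.
links = [
    ("defender", "midfielder"),
    ("midfielder", "winger"),
    ("midfielder", "striker"),
    ("winger", "striker"),
]


def betweenness(links):
    """Unnormalised betweenness: for each player, the number of other pairs
    whose shortest path passes through him (split evenly across tied paths)."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    score = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)

        # Walk back from the farthest nodes, accumulating path dependencies.
        dependency = {node: 0.0 for node in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]

    # Each undirected pair was counted once from each end, so halve the totals.
    return {node: value / 2 for node, value in score.items()}


centrality = betweenness(links)
for player, value in sorted(centrality.items(), key=lambda item: -item[1]):
    print(f"{player}: {value}")
```

In this toy graph the defender can only reach the winger and the striker through the midfielder, while the winger and striker are linked directly, so the midfielder sits on two shortest paths and everyone else on none.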
Dare I say Westwood (Burnley) is the betweenness man in control for Burnley.
A possible follow-up idea would be to classify teams based on how these measures capture cohesiveness within the team.