As mentioned in Part I of this multipart post, Luca Pappalardo prepared a video for the Friends of Tracking channel in 2020 to talk through some elements of a paper on an open Wyscout data set, along with advanced statistics related to passing networks, flow centrality and player ranking.
For this post (Part IV) I’m going to cover my take on the PlayeRank framework created by this team of researchers.
I’ve forked the “mapping-match-events-in-Python” repo into my mmoffoot area and created a new branch called ‘englanddata’ to cover the data set of English Premier League information for the 2017-18 season.
An exhaustive description of the PlayeRank framework is available in this paper: Pappalardo, Luca, Cintia, Paolo, Ferragina, Paolo, Massucco, Emanuele, Pedreschi, Dino & Giannotti, Fosca (2019) PlayeRank: Data-driven Performance Evaluation and Player Ranking in Soccer via a Machine Learning Approach. ACM Transactions on Intelligent Systems and Technology 10(5).
This Notebook builds player rankings from match events. The following steps are required:
- compute feature weights (learning)
- compute roles (learning)
- compute performance scores (rating)
- aggregate performance scores (ranking)
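To make the last two steps concrete, here is a minimal sketch of rating and ranking in plain Python. It is not the Notebook's code: the feature names, the weight values and the player names are all made up for illustration, and the weights are hard-coded here rather than learned. The idea is just that a match score is a weighted sum of a player's event counts, and a ranking is those scores aggregated over matches (a simple average in this sketch).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical feature weights. In PlayeRank these come out of the
# learning step; the names and values here are invented.
WEIGHTS = {"accurate_pass": 0.02, "shot_on_target": 0.30, "foul": -0.10}


def performance_score(features, weights=WEIGHTS):
    """Rating: weighted sum of one player's event counts in one match."""
    return sum(weights.get(name, 0.0) * count for name, count in features.items())


def rank_players(match_features):
    """Ranking: average each player's match scores, then sort high to low."""
    scores = defaultdict(list)
    for player, features in match_features:
        scores[player].append(performance_score(features))
    averaged = {player: mean(values) for player, values in scores.items()}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Made-up match data: one (player, event counts) pair per appearance.
matches = [
    ("Player A", {"accurate_pass": 20, "shot_on_target": 3, "foul": 1}),
    ("Player A", {"accurate_pass": 30, "shot_on_target": 1, "foul": 0}),
    ("Player B", {"accurate_pass": 50, "shot_on_target": 0, "foul": 2}),
]

ranking = rank_players(matches)
for player, score in ranking:
    print(f"{player}: {score:.2f}")
```

Averaging is only one possible way to aggregate; it is used here because it keeps the sketch short.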
It doesn’t take long to run through the [In] steps of the Notebook and for the English data you end up with Figure 1 as seen below.
The visual output from the Notebook is interactive, which is great as you can hover over the points to see each player's name. For example, in the striker role H. Kane is the outlier (at the top), with S. Agüero second. There’s even a drop-down menu to do a comparison.
The positions are an interesting element of this ranking system, which is based on a role matrix.
And here are the top players from the English Premier League 2017-2018 for each role position:
- Position 0. = H. Kane
- Position 1. = L. Milivojevic
- Position 2. = N. Monreal
- Position 3. = D. Janmaat
- Position 4. = S. Mane
- Position 5. = M. Salah
- Position 6. = N. Otamendi
- Position 7. = J. Stephens
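A list like the one above boils down to keeping the best-scoring player in each role. A minimal sketch, assuming the ratings are available as (player, role, score) rows; the row layout, names and scores below are placeholders, not values from the Notebook:

```python
def top_by_role(rows):
    """Return {role: (player, score)}, keeping the highest score per role."""
    best = {}
    for player, role, score in rows:
        if role not in best or score > best[role][1]:
            best[role] = (player, score)
    return best


# Placeholder rows: (player, role index, aggregated score).
rows = [
    ("Player A", 0, 0.41),
    ("Player B", 0, 0.37),
    ("Player C", 1, 0.22),
    ("Player D", 1, 0.29),
]

leaders = top_by_role(rows)
for role in sorted(leaders):
    player, score = leaders[role]
    print(f"Position {role}. = {player}")
```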
If I look at the PFA Premier League Team of the Year for 2017-18, Otamendi, Kane and Salah were named in it and also appear here in PlayeRank, but none of the rest do. I wonder how D. Silva and K. De Bruyne, both of whom are in the PFA team, missed out in this PlayeRank framework.
Overall, after 4 posts, I can say this is a great Jupyter Notebook, firstly for really learning about Jupyter Notebooks, and secondly for seeing the structure of Wyscout data and how to use it. That matters given this data set is used by so many top clubs for the scouting, analysis and recruitment of players.
I got caught a little on the passing networks and the flow centrality, but a measure of cohesiveness within the team is certainly a thread worth more investigation and would be a nice continuation of this topic.
The player ranking and the full explanation of the PlayeRank framework were fantastic and a joy to read and interact with.