Expected goals league tables, 23-25 Aug 2019
I’ve had a few requests for an ‘expected goals league table’ recently so I thought that it would make a good quick project. If you’re not familiar with the concept, it’s basically calculating the league table using the expected goals (i.e. the quality of chances) scored and conceded by each team rather than the actual goal tally.
The idea behind this is that it gives a potentially fairer assessment of how each team is doing, based on how a hypothetical average team would have fared from the chances they created and allowed. As the expected goals model isn’t perfect, we can’t use these tables to say with certainty that a given team has been lucky (or unlucky) but it’s safe to say that significant differences tend not to be sustainable over the long run.
There were two main challenges that I came up against here. The first was how to compare these with the actual league tables without making the whole thing too cluttered. I also wanted to have at least one chart element, with points being the obvious choice and a nod towards the Cann table, which ate into the available space. I decided to take a similar approach to Understat and show the differences between the two tables using smaller, colour-coded text i.e. a green number means they look better in the data than in real life, while a red number means they look worse.
The second issue that came up was how to handle draws, as having two teams finish a match with exactly the same expected goals tally is extremely rare. The way I’ve tackled this is fairly simple – if you classify any match in which the teams’ expected goals totals are within 1/3 of a goal of each other as a draw, then you end up with a proportion of draws which is pretty close to the long-term average. I’m sure there are more mathematically rigorous ways to approach this, but this feels quite neat to me.
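The rule described above can be sketched in a few lines of Python. This is a minimal illustration under my reading of the post, not the author's actual code; the team names, xG figures and the `xg_points` helper are all made up for the example. Only the 1/3-of-a-goal draw threshold comes from the text.

```python
# Threshold from the post: matches where the teams' xG totals are
# within a third of a goal of each other are classed as draws.
DRAW_THRESHOLD = 1 / 3

def xg_points(home_xg, away_xg, threshold=DRAW_THRESHOLD):
    """Return (home_points, away_points) for one match settled by xG."""
    margin = home_xg - away_xg
    if abs(margin) <= threshold:
        return 1, 1                      # too close to call: a draw
    return (3, 0) if margin > 0 else (0, 3)

# Hypothetical fixtures: (home, away, home_xg, away_xg)
fixtures = [
    ("Team A", "Team B", 1.8, 0.9),      # clear home win on chances
    ("Team C", "Team D", 1.1, 1.3),      # within 1/3 of a goal: draw
]

# Accumulate an expected-goals points tally per team.
table = {}
for home, away, h_xg, a_xg in fixtures:
    h_pts, a_pts = xg_points(h_xg, a_xg)
    table[home] = table.get(home, 0) + h_pts
    table[away] = table.get(away, 0) + a_pts
```

Running the loop over a full set of fixtures gives the points column of the expected goals table; goals for and against come straight from summing each team's xG totals.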
Anyway, here’s how the table currently looks for each division:
It’s still very early days in the top flight with just three games played. Interestingly, Man Utd would join neighbours City at the top of an expected goals table right now despite winning only one of their opening three games: they out-created Crystal Palace and Wolves but only secured a point from those two matches. Arsenal and Leicester meanwhile would be sitting much lower if chance quality rather than goals scored was used to settle games (at least as far as my model can tell).
There are some huge swings in the second tier: it looks like QPR, Brentford, Derby and Stoke have been performing far better than their results suggest, with the Potters’ defence in particular having conceded far more regularly than the data suggests it should have. There are some suspected overachievers too, with Bristol City and Swansea both seven points better off in the ‘real’ table than my model would expect.
The third challenge, which I neglected to mention above, was sprung on me by the chaos in League 1 this season – with two teams on negative points, I had to add support for negative bars in the chart. Portsmouth and Southend look to be the biggest underachievers so far, with each about four points worse off in real life than if matches were settled by expected goals. Blackpool and Rochdale meanwhile may be in an unsustainably high position, as between them they’ve avoided defeat seven times in matches where they’ve been outcreated by more than a third of a goal.
What stood out for me in League 2 is the amount of green in the ‘for’ column and the corresponding amount of red in the ‘against’ column. I checked and there’s only been an average of 2.3 goals per game in the division so far (compared with 2.6 last season), so hopefully there’ll be some bigger-scoring weeks to follow. The top five teams in this table all look to be underachieving so far: each has dominated at least one match by a wide enough margin to ‘win’ it on expected goals without actually doing so. Forest Green and Macclesfield meanwhile look to have had a few fortunate results.