Updated match timelines for 2018/19

Once again I’ve started the new season even more slowly than Harry Kane, which I’m blaming on the World Cup soaking up a lot of the time I’d usually use for building and upgrading my various data and graphics tools.

Today I finally got around to finishing a revised template for my match timelines, which were one of the first “expected goals” graphics I built and therefore long overdue an update. If you’re not familiar with these, they track the cumulative quality of each team’s shots over the course of a match, measured in expected goals. They also overlay the actual goals scored so that the two can be compared. If you want a bit more context, the original explanation I wrote is here.
Here’s an example of an old one:
… and here’s how they look now:
The main improvements I’ve made are as follows:
More detail on the lines

The original version binned all shots into 90 buckets – one for each minute – which meant that if a team had several shots in the same minute they'd show up as one jump. The new version shows each shot separately at its exact timestamp within the match, so it’s now easier to distinguish a flurry of attempts from one big chance. Another minor (but long overdue) aesthetic improvement rolled into this is that the lines now jump vertically at each shot rather than sloping diagonally up to the next minute.
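For anyone curious how a stepped line like this can be built, here is a minimal sketch rather than my actual template. It assumes each shot is a hypothetical `(minute, xg)` pair with a fractional minute, and it returns the points for a line that runs flat until a shot and then jumps straight up.

```python
from itertools import accumulate


def step_timeline(shots):
    """Return (xs, ys) points for a stepped cumulative xG line.

    `shots` is a list of (minute, xg) tuples (an assumed format);
    the minute can be fractional, e.g. 10.5 for 10:30 on the clock.
    """
    ordered = sorted(shots)  # sort by minute
    totals = accumulate(xg for _, xg in ordered)  # running xG total

    xs, ys = [0.0], [0.0]  # every line starts at kick-off with zero xG
    for (minute, _), total in zip(ordered, totals):
        xs += [minute, minute]  # flat run up to the shot, then a vertical jump
        ys += [ys[-1], total]
    return xs, ys


# Two shots in the same minute stay visible as two separate jumps.
xs, ys = step_timeline([(10.5, 0.1), (10.8, 0.3), (40.0, 0.05)])
```

Plotting `xs` against `ys` with any ordinary line chart gives the stepped shape, because each shot contributes two points at the same minute.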

Inclusion of injury time

As part of the original binning process mentioned above, every shot taken in added time was rolled into either the 45th or the 90th minute, which often obscured a lot of late action. The timelines are now scaled to show the correct amount of added time in each half, and the shots taken during it show up individually.
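As a rough illustration of the scaling, and assuming the data records which half a shot was taken in and how many minutes into that half (an assumption about the data format, not a description of my actual source), the horizontal position could be worked out like this:

```python
def timeline_x(half, minutes_into_half, first_half_added):
    """Map a shot to its horizontal position on the timeline, in minutes.

    half:              1 or 2
    minutes_into_half: minutes since that half kicked off (can exceed 45)
    first_half_added:  minutes of added time played in the first half
    All three parameter names are hypothetical.
    """
    if half == 1:
        return minutes_into_half
    # The second half starts after 45 minutes plus first-half added time,
    # so first-half stoppage-time shots get their own space on the axis.
    return 45 + first_half_added + minutes_into_half


# A shot 47.5 minutes into a first half with 3 added minutes stays at 47.5,
# and the second half then kicks off at position 48.
```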

Ranking the top three players

One thing I was keen to do was make the graphics larger so that they could look more crisp and make better use of modern phone screens, but I quickly realised there was room to include additional data alongside the main timeline. I’ve plumped for including a small colour-coded bar graph showing the three players who racked up the highest expected goals totals in the game and therefore posed the greatest threat.
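The ranking itself is simple to reproduce. The sketch below assumes a hypothetical list of `(player, xg)` pairs, one per shot, and returns the players with the highest totals.

```python
from collections import defaultdict


def top_players(shots, n=3):
    """Return the n players with the highest total xG, best first.

    `shots` is a list of (player_name, xg) tuples (an assumed format).
    """
    totals = defaultdict(float)
    for player, xg in shots:
        totals[player] += xg  # add up each player's shots
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Placeholder names, not real match data.
ranking = top_players([
    ("Player A", 0.4), ("Player B", 0.1), ("Player A", 0.2),
    ("Player C", 0.3), ("Player D", 0.05),
])
```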

Correcting the value of multiple shots

Previously the model that powers these timelines would simply take the expected goals (xG) value of each shot individually and add them all together. This works fine in most cases, but when a single attack contains two or more shots (e.g. in a goalmouth scramble or after a saved penalty) the combined xG of the shots can add up to more than 1. This doesn’t make sense, as it’s basically saying that you could have scored more than one goal from a single attack, so the shot values in these scenarios have been adjusted down. I’ve used a solution that Danny Page suggested: multiply together the probabilities of each shot *not* being scored, subtract the result from 1 to get the overall probability of the group of chances resulting in a goal, and then spread this value proportionally over all of these shots.
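Here is a minimal sketch of that adjustment. It assumes "proportionally" means in proportion to each shot's original xG, and the function name and example values are purely illustrative.

```python
from math import prod


def adjust_attack(xgs):
    """Scale down the xG of shots that belong to the same attack.

    `xgs` is a list of the raw xG values for one attack. The adjusted
    values add up to the chance of at least one shot being scored.
    """
    raw_total = sum(xgs)
    if raw_total == 0:
        return list(xgs)  # nothing to share out

    # Chance that every shot is missed, then flip it to get the
    # chance that the attack produces a goal.
    p_goal = 1 - prod(1 - xg for xg in xgs)

    # Share that probability out in proportion to each raw xG value
    # (this reading of "proportionally" is an assumption).
    return [xg * p_goal / raw_total for xg in xgs]


# A saved penalty (0.76) followed by a rebound (0.5) has a raw total of 1.26.
# The chance of neither going in is 0.24 * 0.5 = 0.12, so the pair is
# worth 0.88 in total after adjustment.
adjusted = adjust_attack([0.76, 0.5])
```

A single shot is left unchanged by this, since its chance of being scored is just its own xG.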


A quick note on data accuracy

While I’ve done the best job I can with the data available, the xG values may not be as accurate as those created from advanced datasets like Opta, StrataBet or StatsBomb, which include things like defender positioning and the exact co-ordinates of each chance. This is the main reason I don’t create match-specific graphics for the Premier League and other top divisions, as others are already covering these in a similar way with more granular information.