Data Rant: assessing Mourinho’s transfer strategy
It is often beneficial to look at the Premier League as a whole to see where Manchester United stands and to identify any general trends. Last season was an unmitigated disaster, even with the FA Cup victory, and lessons should be learned to avoid the further ‘Liverpoolization’ of the club. Earlier Data Rant columns did not strictly observe statistical theory, emphasizing intuition and broad trends above technicalities. With more advanced techniques, however, we can get a more precise picture.
In this column we use data from last season as recorded by whoscored.com. Only players who played at least 1,000 minutes for their club are counted. These players have been categorized into defence, midfield and attack, and their whoscored ratings have been averaged within each category.
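The filtering and averaging step can be sketched roughly as follows. The record layout (team, position group, minutes, rating), the function name and the sample rows are assumptions made purely for illustration; they are not whoscored.com data, and this is not necessarily how the column's figures were prepared.

```python
from collections import defaultdict

MIN_MINUTES = 1000  # threshold used in the column


def squad_summary(players, min_minutes=MIN_MINUTES):
    """Average rating per position group for each team's regular players.

    `players` is an iterable of (team, group, minutes, rating) tuples,
    a hypothetical layout chosen for this sketch.
    Returns {team: {"No1000": count, "<group>": average rating, ...}}.
    """
    ratings = defaultdict(lambda: defaultdict(list))
    for team, group, minutes, rating in players:
        if minutes >= min_minutes:  # keep only players with enough minutes
            ratings[team][group].append(rating)

    summary = {}
    for team, groups in ratings.items():
        row = {"No1000": sum(len(values) for values in groups.values())}
        for group, values in groups.items():
            row[group] = sum(values) / len(values)
        summary[team] = row
    return summary


# Made-up rows, not real ratings.
sample = [
    ("Team A", "defence", 2500, 7.0),
    ("Team A", "defence", 1200, 6.8),
    ("Team A", "midfield", 900, 7.5),  # dropped: under 1,000 minutes
    ("Team A", "attack", 3000, 7.4),
]
print(squad_summary(sample))
```

Each team ends up with one row holding the four figures used in the regressions that follow.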
First we regress points on four variables: the number of players who played more than 1,000 minutes, the average defensive rating, the average midfield rating and the average attack rating. The results are shown below.
Significance codes: *** p < 0.001, ** p < 0.01, * p < 0.05, . p < 0.1.
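The significance legend is a simple lookup from p-value to symbol. A small helper makes the thresholds explicit; the function name is hypothetical, and treating each boundary as exclusive is an assumption of this sketch.

```python
def significance_code(p_value):
    """Map a p-value to the symbol used in the tables' significance legend."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    if p_value < 0.1:
        return "."
    return ""  # not significant at the 10% level


for p in (0.0004, 0.03, 0.07, 0.4):
    print(p, repr(significance_code(p)))
```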
The stars next to each p-value indicate statistical significance: the more stars, the stronger the evidence. The estimates show how much the points tally changes when a particular variable rises by one unit. Notice that, at 44.184, the estimate for defence is almost twice that for attack. Indeed, “defence wins championships” is a soundbite that resonates in the data.
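Estimates of this kind come from an ordinary least-squares fit. Below is a minimal sketch under stated assumptions: the function names are hypothetical, the squads and points are made up rather than taken from whoscored.com, and only the estimates are computed, not the p-values. It is not necessarily how the column's own tables were produced.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(predictors, response):
    """Least-squares coefficients [intercept, b1, b2, ...].

    Solves the normal equations (X'X) beta = X'y, where X is the
    predictor matrix with a leading column of ones for the intercept.
    """
    x = [[1.0] + list(row) for row in predictors]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up squads: (No1000, defence, midfield, attack) and made-up points.
squads = [
    (14, 7.1, 7.0, 7.2),
    (16, 6.9, 7.1, 7.0),
    (15, 7.0, 6.9, 6.9),
    (17, 6.8, 6.8, 7.0),
    (13, 6.7, 6.9, 6.8),
    (18, 6.6, 6.7, 6.8),
    (15, 6.9, 6.8, 6.6),
]
points = [81, 71, 66, 60, 47, 42, 45]
names = ["intercept", "No1000", "defence", "midfield", "attack"]
for name, estimate in zip(names, fit_ols(squads, points)):
    print(f"{name:>9}: {estimate:.3f}")
```

Because ratings sit in a narrow band, a small change in an average rating can correspond to a large change in points, which is one reason estimates for ratings tend to be large numbers.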
“No1000” represents the number of players who played more than 1,000 minutes for each team. Much of Leicester City and Tottenham Hotspur’s success last season has been attributed to each manager’s limited rotation. The regression offers little evidence that this is actually the case.
That midfield strength is not important in earning points may be surprising. At the same time, Tottenham and Leicester arguably punched above their weight by adopting a direct style. Arsenal was also less ideologically attached to the tiki-taka philosophy than in the past. In fact, Sir Alex Ferguson’s United side won the Premier League many times with a weak midfield. While reinforcing the engine room is desirable, of course, the data shows that the Reds’ focus should be on strengthening defence and attack.
One of the key assumptions behind an accurate linear regression is that the predictor variables are not strongly related to each other. Football is a team game and it would have surprised few had defence, for example, been related to midfield. This is not the case. The variance inflation factor (VIF) measures linear relationships between predictor variables. A VIF above 10 is a sign of trouble, and the VIF figures here are all around 2, as shown below.
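To put the VIF figures in context, a predictor's VIF equals 1 / (1 - R²), where R² comes from regressing that predictor on all the others. The short sketch below converts between the two; the function names are hypothetical.

```python
def vif_from_r_squared(r_squared):
    """VIF of a predictor, given the R^2 from regressing it on the others."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must be in [0, 1) for a finite VIF")
    return 1.0 / (1.0 - r_squared)


def r_squared_from_vif(vif):
    """Share of a predictor's variance explained by the other predictors."""
    if vif < 1.0:
        raise ValueError("a VIF cannot be below 1")
    return 1.0 - 1.0 / vif


print(r_squared_from_vif(2.0))   # VIF level reported in the column
print(r_squared_from_vif(10.0))  # the usual warning threshold
```

A VIF of 2 corresponds to an R² of 0.5 and a VIF of 10 to an R² of 0.9, so the rating groups do share some variation but sit well short of the usual warning level.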
This means that Mourinho’s decision to bring in a defender will not improve United’s midfield. To build the United squad, the new manager will have to bring in players in each of the key areas.
There are no obvious outliers in the data. While there is no consensus among statisticians, a Cook’s distance of at least 0.5 is usually required for a data point to be considered an outlier. In other words, all teams performed according to their squad strength last season, though it is interesting that United has the highest Cook’s distance in the top seven. This hints at a lack of consistency, which is borne out by United’s performances.
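Cook’s distance combines how badly a point is fitted (its residual) with how much pull it has on the fit (its leverage). The sketch below is a deliberate simplification: it handles a single predictor, whereas the column's regression has four, and both the function name and the numbers are made up for illustration.

```python
def cooks_distances(x, y):
    """Cook's distance of each point in a one-predictor linear regression.

    D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2, where e_i is the residual,
    h_i the leverage, p the number of coefficients and s^2 the residual
    variance estimate.
    """
    n = len(x)
    p = 2  # intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e * e for e in residuals) / (n - p)
    if mse == 0:
        raise ValueError("perfect fit: Cook's distance is undefined")
    distances = []
    for xi, e in zip(x, residuals):
        leverage = 1.0 / n + (xi - x_mean) ** 2 / sxx
        distances.append(e * e / (p * mse) * leverage / (1.0 - leverage) ** 2)
    return distances


# Made-up average ratings and points, not real league data.
ratings = [6.6, 6.7, 6.8, 6.9, 7.0, 7.1]
points = [38, 42, 50, 47, 66, 81]
for d in cooks_distances(ratings, points):
    flag = "possible outlier" if d >= 0.5 else ""
    print(f"{d:.3f} {flag}")
```

The 0.5 cut-off in the loop mirrors the rule of thumb quoted above; other thresholds are also in use.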
We have worked with the points tally so far. Now we can delve deeper. It is obvious, but to win games a team must score goals while conceding fewer than it scores. Indeed, a regression of points on goals scored and goals conceded, in the table below, reveals that both are important and statistically significant. The estimate for goals scored is positive, while that for goals conceded is negative. The intercept of 67.40 may be thought of as the “base” amount of points. From that base, a team gains 0.51 points for every goal it scores and loses 0.81 points for every goal it concedes.
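Those three estimates can be turned into a quick points calculator. The coefficients are the ones quoted above; the function name and the goal tallies in the example are hypothetical.

```python
INTERCEPT = 67.40          # "base" points from the regression
PER_GOAL_SCORED = 0.51     # points gained per goal scored
PER_GOAL_CONCEDED = -0.81  # points lost per goal conceded


def predicted_points(goals_scored, goals_conceded):
    """Season points implied by the fitted regression line."""
    return (INTERCEPT
            + PER_GOAL_SCORED * goals_scored
            + PER_GOAL_CONCEDED * goals_conceded)


# A hypothetical side that scores 50 and concedes 40 over the season.
print(round(predicted_points(50, 40), 1))
```

For that hypothetical side the line gives 67.40 + 25.5 - 32.4, or about 60.5 points.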
We now look at goals scored and conceded individually, starting with goals scored versus the number of players who played more than 1,000 minutes, average defensive ratings, average midfield ratings and average attack ratings. Here midfield strength is highly important, as shown in the table below. Attacking strength is less important, even less so than the number of players who played more than 1,000 minutes. Chance creation and team chemistry are important in scoring goals.
Significance codes: *** p < 0.001, ** p < 0.01, * p < 0.05, . p < 0.1.
This is in line with previous Data Rant pieces: this column has long argued that chance creation is important, while finishing is a skill that is overrated.
Mourinho is not known for significant rotation and this trait may help improve United’s attacking prowess by default. To improve United’s firepower, however, a midfielder or two must be brought into the squad.
Bournemouth looks to be an outlier but United is decidedly not. Tactics have not been explicitly accounted for here, but the impression is that United simply lacked midfield quality, as in the table below.
United’s midfield is significantly weaker than the defence and attack. After all, United can now count on Marcus Rashford and Anthony Martial, while Memphis Depay may have a better second season and Zlatan Ibrahimovic looks set to arrive. United’s average rating in attack will likely improve, but the current United midfield lacks guile. One wonders whether Ed Woodward should have done more to secure Renato Sanches, who is now bound for Bayern Munich.
We now move on to goals conceded versus the number of players who played more than 1,000 minutes, average defensive ratings, average midfield ratings and average attack ratings.
This is perhaps the most surprising result, shown in the table below. Defensive and attacking strength are both key in preventing goals. One interpretation is that the threat of a counter-attack can pin back the opposition and prevent them from committing wholeheartedly to attack.
Significance codes: *** p < 0.001, ** p < 0.01, * p < 0.05, . p < 0.1.
United’s forward line will only improve next season, so the defence should hold. Ideally, a senior centre-back should join Eric Bailly at Old Trafford, and a proper right-back is also important. The priority, however, is the engine room. Mourinho is the ideal man to foster the team chemistry necessary for an effective forward line, but it should be noted that he had Frank Lampard and Cesc Fabregas at Chelsea, Wesley Sneijder at Inter and Mesut Ozil at Real Madrid to orchestrate his forwards. Nobody of that type currently resides at Old Trafford.