What the hell is an Expected Goal?
xG, Field Tilt and the numbers reshaping football
In every sport worth following, the strongest team doesn’t always win. And in some sports, with football being a prime example, not only does the strongest team not always win, but neither does the team that played better. But what does “playing better” even mean? Some people have tried to answer that question, developing metrics that put numbers to what, until recently, were just gut feelings and intuitions.
What follows is a cheat sheet for sounding football-smart at the pub, in the locker room or at the office.
Expected Goals (xG)
The idea behind xG is fairly simple, and can be summed up with one question: given the conditions under which this shot was taken, how likely is it to end up in the net?
Every data provider in football has its own model for calculating xG, but most of them train on a similar set of variables, such as distance from goal, angle, body part used, the type of pass that preceded the shot and the number of defenders between the shooter and the goal.
The simpler models use binary logistic regression on these variables, while more sophisticated ones, like StatsBomb’s, use gradient boosted tree algorithms such as XGBoost or LightGBM. Either way, the output is a value between 0 and 1.
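As a rough illustration of the logistic-regression flavour, here is a minimal sketch in Python. The coefficients and the three features are made up for the example (they are not fitted to real data and do not come from any provider’s model); the point is only to show how a handful of shot variables gets squashed into a value between 0 and 1.

```python
import math

# Hypothetical coefficients, chosen only for illustration (not fitted to real data).
INTERCEPT = -0.5
COEF_DISTANCE = -0.12   # per metre between the shot and the goal
COEF_ANGLE = 1.2        # per radian of goal mouth visible to the shooter
COEF_HEADER = -0.8      # applied when the shot is a header


def sigmoid(z: float) -> float:
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def shot_xg(distance_m: float, angle_rad: float, is_header: bool) -> float:
    """Toy logistic xG: a weighted sum of shot features passed through a sigmoid."""
    z = (
        INTERCEPT
        + COEF_DISTANCE * distance_m
        + COEF_ANGLE * angle_rad
        + (COEF_HEADER if is_header else 0.0)
    )
    return sigmoid(z)


close_shot = shot_xg(distance_m=8.0, angle_rad=0.8, is_header=False)
far_shot = shot_xg(distance_m=25.0, angle_rad=0.25, is_header=False)
close_header = shot_xg(distance_m=8.0, angle_rad=0.8, is_header=True)
```

With these invented weights, the close shot is rated higher than the long-range one, and the header lower than the same shot taken with the foot. A real model would learn its weights from a large set of historical shots.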
For reference: penalties are converted roughly 75% of the time, so a penalty is typically assigned an xG of around 0.75 (the exact value varies by provider).
The interpretation is straightforward: a team that scored fewer goals than the xG it accumulated in a match either wasn’t clinical enough or was unlucky. Conversely, a team that scored more goals than its xG was either very sharp or very lucky.
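The comparison between goals and xG is just a sum and a subtraction. A small sketch, using an invented shot list for one team in one match:

```python
# Hypothetical shots for one team: (xG of the shot, whether it was scored).
shots = [
    (0.75, True),    # a penalty, converted
    (0.08, False),
    (0.31, False),
    (0.12, True),
    (0.04, False),
]

total_xg = sum(xg for xg, _ in shots)
goals = sum(1 for _, scored in shots if scored)

# Positive: the team scored more than its chances suggested. Negative: less.
goals_minus_xg = goals - total_xg
```

Here the team scored 2 goals from chances worth about 1.3 xG, so it finished above expectation on the night.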
Expected Assists (xA)
If a pass leads to a goal, it’s an assist. If it doesn’t, it isn’t. This creates distortions in how we evaluate players, because it ignores pass quality entirely. A player who plays 10 passes each worth 0.35 xG (good chances, none converted) ends up with 0 assists. Another who plays a 0.05 xG pass that somehow goes in off the goalkeeper gets credited with 1 assist. xA corrects this distortion by weighting each pass according to the quality of the opportunity it creates.
There are two interpretations of xA. The first, the shot-centric version, credits the passer with the xG of the resulting shot. The second, the pass-centric version, estimates the probability that a pass made under certain conditions will lead to a goal. This version requires training a model on a large historical dataset, much like the process used for xG.
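The shot-centric version is simple enough to sketch in a few lines. The snippet below reuses the two players from the example above; the function names are only illustrative.

```python
# Each pass is (xG of the shot it created, whether that shot was scored).
player_a = [(0.35, False)] * 10   # ten good chances, none converted
player_b = [(0.05, True)]         # one poor chance that somehow went in


def assists(passes):
    """Traditional assists: count only the passes whose shot was scored."""
    return sum(1 for _, scored in passes if scored)


def shot_centric_xa(passes):
    """Shot-centric xA: credit the passer with the xG of every shot created."""
    return sum(xg for xg, _ in passes)


a_assists, a_xa = assists(player_a), shot_centric_xa(player_a)
b_assists, b_xa = assists(player_b), shot_centric_xa(player_b)
```

Player A ends up with 0 assists but about 3.5 xA, while player B has 1 assist and 0.05 xA. The pass-centric version would replace the shot’s xG with the output of a separately trained model, which is not shown here.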
Passes per Defensive Action (PPDA)
How do you measure the quality of a team’s pressing? Given that during the pressing phase the team doesn’t have the ball, one approach is to count how many passes the opposition manages to complete for every attempt the pressing team makes to win the ball back. That’s PPDA: the number of passes the opponent completes divided by the number of defensive actions (tackles, interceptions, challenges and fouls) made by the pressing team. A low PPDA means a team is pressing intensely; a high value means they’re giving the opposition too much room. As a benchmark, values below 10 are generally considered good.
Note: some implementations of PPDA only count passes in specific areas of the pitch, to exclude actions in zones that are less relevant.
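A minimal sketch of the calculation, assuming PPDA is taken as opposition passes divided by the pressing team’s defensive actions. The match counts are invented, and the zone restriction mentioned in the note is left out; in practice you would filter both counts to the relevant area of the pitch first.

```python
def ppda(opponent_passes: int, defensive_actions: int) -> float:
    """Opposition passes allowed per defensive action by the pressing team."""
    if defensive_actions == 0:
        # No defensive actions at all: treat the press as non-existent.
        return float("inf")
    return opponent_passes / defensive_actions


# Hypothetical match totals.
pressing_team = ppda(opponent_passes=240, defensive_actions=30)
passive_team = ppda(opponent_passes=420, defensive_actions=28)
```

The first team allows 8 passes per defensive action, which falls under the benchmark of 10; the second allows 15.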
Field Tilt
Field Tilt measures where on the pitch the game is actually being played. To calculate it, you count the passes a team completes in its attacking third, then divide that number by the total passes both teams complete in their respective attacking thirds. The result is each team’s share of territorial dominance. The team with the higher Field Tilt is the one that spent more of the match on the ball near the opposition goal.
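In code, Field Tilt is a single ratio. A sketch with invented pass counts, where each count is the number of passes a team completed in its own attacking third:

```python
def field_tilt(team_passes: int, opponent_passes: int) -> float:
    """Share of all attacking-third passes that belong to the given team."""
    total = team_passes + opponent_passes
    if total == 0:
        # Neither side completed a pass in its attacking third: call it even.
        return 0.5
    return team_passes / total


# Hypothetical match: passes each side completed in its own attacking third.
home_tilt = field_tilt(team_passes=120, opponent_passes=80)
away_tilt = field_tilt(team_passes=80, opponent_passes=120)
```

The two values always add up to 1, so here the home side has a Field Tilt of 60% and the away side 40%.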
Everything beyond this point is unnecessary and might even ruin the game.
Pitch Control
If the ball were in a given area of the pitch, which team would get there first? Pitch Control answers that question. While a spectator could answer it intuitively, implementing it in a model is considerably more complex than it sounds.
The basic idea: at every moment in a match, the pitch is divided into a grid of zones, and for each zone the model calculates which team could reach it first, factoring in the current position and speed of every player on the pitch. The zone is assigned to whichever team gets there first, according to the model. Aggregating all zones gives you a percentage of pitch controlled by each team at any given moment.
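To make the idea concrete, here is a heavily simplified sketch for a single moment in a match. It assumes every player runs in a straight line at a constant speed, which ignores direction of travel, acceleration and reaction time; real Pitch Control models are considerably more elaborate. The grid size, pitch dimensions and player data are all assumptions made for the example.

```python
import math

PITCH_LENGTH, PITCH_WIDTH = 105.0, 68.0   # metres, a common pitch size
GRID_X, GRID_Y = 20, 14                   # hypothetical grid resolution


def arrival_time(player, target):
    """Seconds to reach target, assuming a straight run at constant speed."""
    x, y, speed = player
    return math.hypot(target[0] - x, target[1] - y) / speed


def pitch_control_share(home, away):
    """Fraction of grid zones the home side reaches first at this instant.

    Players are (x, y, speed_in_metres_per_second) tuples.
    """
    home_zones = 0
    for i in range(GRID_X):
        for j in range(GRID_Y):
            centre = (
                (i + 0.5) * PITCH_LENGTH / GRID_X,
                (j + 0.5) * PITCH_WIDTH / GRID_Y,
            )
            t_home = min(arrival_time(p, centre) for p in home)
            t_away = min(arrival_time(p, centre) for p in away)
            if t_home < t_away:   # exact ties go to the away side
                home_zones += 1
    return home_zones / (GRID_X * GRID_Y)


# One player per side, mirrored across the halfway line, same speed.
even_share = pitch_control_share([(26.25, 34.0, 7.0)], [(78.75, 34.0, 7.0)])
# Same positions, but the home player is quicker.
faster_home_share = pitch_control_share([(26.25, 34.0, 9.0)], [(78.75, 34.0, 7.0)])
```

With mirrored players of equal speed each side controls half the pitch; give the home player more pace and the home share grows. Running the same function on every frame of tracking data would give the second-by-second picture.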
Unlike Field Tilt, which is a cumulative measure across the full match, Pitch Control is dynamic and changes from second to second.
Expected Threat (xT)
Despite the name echoing the first two metrics, xT is conceptually closer to Pitch Control. To calculate it, the pitch is divided into a grid of zones, and each zone is assigned a base value representing the historical probability of scoring when in possession in that area. When a player moves the ball from one zone to another, either by passing or dribbling, the xT of that action is the difference in value between the destination zone and the starting zone. The sum of all these differences over ninety minutes gives the team’s total xT for the match. The team with the higher xT is the one that moved the ball into more dangerous areas, regardless of the final scoreline.
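The bookkeeping described above can be sketched as follows. The zone values are invented for a tiny 3 x 4 grid (a real grid is finer and its values are derived from historical data), and the list of actions is hypothetical.

```python
# Hypothetical zone values: rows run across the pitch, columns run from the
# team's own goal (left) to the opposition goal (right).
ZONE_VALUE = [
    [0.01, 0.02, 0.05, 0.12],
    [0.01, 0.03, 0.08, 0.30],
    [0.01, 0.02, 0.05, 0.12],
]


def action_xt(start, end):
    """xT of one pass or dribble: value of the end zone minus the start zone."""
    (r0, c0), (r1, c1) = start, end
    return ZONE_VALUE[r1][c1] - ZONE_VALUE[r0][c0]


# Each action is (start zone, end zone), with zones given as (row, column).
actions = [
    ((1, 0), (1, 1)),   # short pass out of defence
    ((1, 1), (0, 2)),   # pass out to the wing
    ((0, 2), (1, 3)),   # cross into the central zone near goal
    ((1, 3), (1, 2)),   # lay-off backwards, which has negative xT
]

team_xt = sum(action_xt(start, end) for start, end in actions)
```

The cross into the central zone is worth 0.25 on its own, while the backwards lay-off costs 0.22. Some implementations count only successful or only positive actions; the sketch simply sums every difference, as described above.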


