A deep dive into the machine learning models, data pipelines, and AI systems behind our football predictions.
Everything in GoalMind LIVE starts with data. We use the API-Football service as our primary data provider, which gives us access to live match events, historical results, squad information, and statistics for over 176 football competitions worldwide.
During live matches, our poller fetches match events every 15–30 seconds. This includes goals (with scorer, minute, and assist), yellow and red cards, substitutions, and VAR decisions. Each event is stored with deduplication logic to prevent false goal notifications from VAR reversals.
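One way to picture the deduplication step is to key each stored event on the fields that identify it, so that a repeated poll cannot insert the same event twice. The sketch below is illustrative only: `MatchEvent`, `EventStore`, and the field names are hypothetical and are not taken from API-Football's schema or from our production code, and it does not model how a VAR reversal retracts an event.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchEvent:
    """Hypothetical event record; frozen so instances are hashable."""
    fixture_id: int
    minute: int
    team_id: int
    player: str
    kind: str  # e.g. "goal", "card", "subst", "var"


class EventStore:
    """Keeps the set of events already seen across polling cycles."""

    def __init__(self) -> None:
        self._seen: set[MatchEvent] = set()

    def add(self, event: MatchEvent) -> bool:
        """Return True only the first time an identical event arrives."""
        if event in self._seen:
            return False
        self._seen.add(event)
        return True
```

A notification would then be sent only when `add` returns `True`, so polling the same goal again 15 seconds later stays silent.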
Before each match, we fetch team lineups, confirmed starters, injury reports, recent form (last 5–10 matches), head-to-head records, and expected goals (xG) from previous fixtures. This data is used to build the AI analysis context.
We maintain a SQLite database of match results, team form, and prediction accuracy metrics. This data trains and validates our GoalSoon model and provides the head-to-head statistics displayed in match previews.
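As a rough illustration of how head-to-head statistics can be read from such a database, the sketch below uses Python's built-in `sqlite3` module. The table name `match_results`, its columns, and both helper functions are assumptions made for this example, not our actual schema.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a hypothetical results table (illustrative column names)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS match_results (
            fixture_id INTEGER PRIMARY KEY,
            played_on  TEXT    NOT NULL,  -- ISO 8601 date, sorts as text
            home_team  TEXT    NOT NULL,
            away_team  TEXT    NOT NULL,
            home_goals INTEGER NOT NULL,
            away_goals INTEGER NOT NULL
        )
        """
    )


def head_to_head(conn: sqlite3.Connection, team_a: str, team_b: str) -> list:
    """Return all meetings between two teams, most recent first."""
    return conn.execute(
        """
        SELECT played_on, home_team, home_goals, away_goals, away_team
        FROM match_results
        WHERE (home_team = ? AND away_team = ?)
           OR (home_team = ? AND away_team = ?)
        ORDER BY played_on DESC
        """,
        (team_a, team_b, team_b, team_a),
    ).fetchall()
```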
Squad lists, player positions, age, and nationality are refreshed weekly. This powers our squad viewer and ensures AI analyses have accurate information about which players are available for selection.
GoalSoon is our flagship machine learning model. It answers one specific question during live matches: how likely is a goal in the next 10 minutes?
This "next goal" probability is more actionable than a full-match prediction because it reflects the current state of the game, not just the pre-match expectations.
GoalSoon considers the following signals to compute its probability estimate:
Goals in football are not evenly distributed across 90 minutes. Statistical analysis of hundreds of thousands of matches shows a characteristic "time risk curve" — goals are more likely in certain phases (30–45 min, 60–75 min, and injury time). GoalSoon uses this empirical distribution as a baseline and adjusts it based on live game state.
Counts of shots, shots on target, and blocked shots over the last 5 and last 15 minutes are strong leading indicators of goal probability. A match with 4 shots on target in the last 5 minutes is statistically far more likely to produce a goal than one with 0 shots.
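The rolling shot counts can be derived from timestamped shot events. The sketch below is a minimal illustration, assuming each shot is recorded by its match minute; the function and feature names are hypothetical, and the real model also separates shots on target and blocked shots.

```python
def shots_in_window(shot_minutes: list[int], now: int, window: int) -> int:
    """Count shots in the `window` minutes up to and including minute `now`."""
    return sum(1 for minute in shot_minutes if now - window < minute <= now)


def shot_features(shot_minutes: list[int], now: int) -> dict[str, int]:
    """Build the two rolling-window counts used as illustrative features."""
    return {
        "shots_last_5": shots_in_window(shot_minutes, now, 5),
        "shots_last_15": shots_in_window(shot_minutes, now, 15),
    }
```

For shots at minutes 50, 58, 61, 63, and 64, evaluated at minute 64, this yields 3 shots in the last 5 minutes and 5 in the last 15.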
The current score dramatically affects how teams play. A team trailing by one goal with 10 minutes remaining pushes more aggressively, creating higher-risk defensive situations. GoalSoon incorporates the scoreline as a contextual multiplier on attacking intent.
Contrary to the "lightning doesn't strike twice" intuition, matches that have produced recent goals tend to have elevated goal probability in the following minutes — teams are less settled, tactical shape is disrupted, and emotional momentum can produce quick follow-up goals or equalizers.
The quality gap between teams affects match dynamics. When a stronger team is held in a tight match by weaker opposition, it tends to create sustained waves of pressure. GoalSoon captures this by using the ELO differential as a background probability adjustment.
GoalSoon outputs a probability percentage for home and away teams independently. These are displayed as coloured circles on live match cards:
ELO is a mathematical rating system originally developed for chess, adapted for football to measure the relative strength of teams based on match outcomes.
Every team starts with a baseline ELO rating. After each match, ratings are updated based on two factors: the result (win/draw/loss) and the expectation going in. Beating a much stronger team produces a large rating gain; beating a weaker team produces a small one. Losing to a weaker team causes a significant drop.
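The behaviour described above follows from the standard ELO update rule, sketched below. The 400-point scale is the conventional chess value, and `k=20.0` is an assumed constant chosen for illustration; neither is a statement of the exact parameters GoalMind LIVE uses. A win scores 1, a draw 0.5, and a loss 0.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected result for team A (between 0 and 1) under the ELO model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    rating_a: float, rating_b: float, score_a: float, k: float = 20.0
) -> tuple[float, float]:
    """Return updated ratings for A and B after one match.

    score_a is 1.0 for an A win, 0.5 for a draw, 0.0 for an A loss.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating pool is unchanged.
    return rating_a + delta, rating_b - delta
```

With these assumed constants, a 1400-rated team beating a 1600-rated team gains roughly 15 points, while the reverse result is worth only about 5 to the favourite.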
For the FIFA World Cup 2026, we maintain ELO ratings for all 48 participating nations and update them after every World Cup fixture. We then run Monte Carlo simulations, replaying the entire tournament 10,000 times, to estimate each team's probability of advancing from the group stage, progressing through each knockout round, and ultimately winning the tournament.
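To show the idea on a small scale, the sketch below simulates a single four-team group many times and counts how often each team finishes in the top two. It is a simplified illustration, not our tournament simulator: the fixed `draw_prob`, the random tiebreak, the top-two rule, and the placeholder team names and ratings are all assumptions, and real tiebreak rules and other qualification routes are ignored.

```python
import random
from itertools import combinations


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard ELO expectation for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def simulate_group(
    ratings: dict[str, float], rng: random.Random, draw_prob: float = 0.25
) -> list[str]:
    """Play one round-robin group and return teams ranked by points."""
    points = {team: 0 for team in ratings}
    for team_a, team_b in combinations(ratings, 2):
        # Split the non-draw probability according to the ELO expectation.
        win_a = (1.0 - draw_prob) * expected_score(ratings[team_a], ratings[team_b])
        roll = rng.random()
        if roll < draw_prob:
            points[team_a] += 1
            points[team_b] += 1
        elif roll < draw_prob + win_a:
            points[team_a] += 3
        else:
            points[team_b] += 3
    # Ties on points are broken at random (a simplification).
    return sorted(points, key=lambda t: (points[t], rng.random()), reverse=True)


def advancement_probabilities(
    ratings: dict[str, float], n_sims: int = 10_000, seed: int = 0
) -> dict[str, float]:
    """Estimate each team's chance of finishing in the group's top two."""
    rng = random.Random(seed)
    advanced = {team: 0 for team in ratings}
    for _ in range(n_sims):
        for team in simulate_group(ratings, rng)[:2]:
            advanced[team] += 1
    return {team: count / n_sims for team, count in advanced.items()}
```

Calling `advancement_probabilities({"A": 1800, "B": 1600, "C": 1500, "D": 1400})` returns one probability per team, and the four values sum to 2 because exactly two teams advance in every simulated group.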
The written match analysis in GoalMind LIVE is generated by Claude, Anthropic's AI assistant. But it is not simply "write something about this match" — we feed Claude a carefully constructed context package for each analysis.
For each match analysis, our system compiles the following information into a structured prompt:
We use a structured prompt that instructs Claude to focus on tactical insight rather than generic football commentary. Analyses that mention specific player form data, explain the tactical implications of lineup changes, or reference historical patterns between these teams produce more valuable insights than vague match previews.
We also track AI analysis quality over time: after matches, we compare Claude's pre-match narrative with the actual match events to identify where the model's analysis was insightful and where it missed the mark. This informs how we structure future prompts.
We believe in transparency about our prediction performance. GoalMind LIVE tracks prediction accuracy for GoalSoon outcomes over rolling time windows.
It is important to understand what "accuracy" means in a probabilistic prediction system. A prediction of "65% chance of a home win" is not wrong if the away team wins: the remaining 35% has to come up some of the time. We therefore measure accuracy with Brier scores and calibration metrics, which ask whether, over hundreds of predictions, events forecast at 65% actually happen around 65% of the time.
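Both metrics are straightforward to compute from a list of predicted probabilities and observed outcomes (1 if the event happened, 0 if not). The sketch below is a minimal illustration with hypothetical function names and an assumed choice of ten equal-width bins; it is not our production evaluation code.

```python
def brier_score(probs: list[float], outcomes: list[int]) -> float:
    """Mean squared gap between forecast and outcome; lower is better."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def calibration_table(
    probs: list[float], outcomes: list[int], n_bins: int = 10
) -> list[tuple[float, float, int]]:
    """Group forecasts into equal-width bins.

    Returns one (mean forecast, observed frequency, count) row per
    non-empty bin. A well-calibrated model has the first two values close.
    """
    bins: list[list[tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, o))
    rows = []
    for bucket in bins:
        if bucket:
            mean_forecast = sum(p for p, _ in bucket) / len(bucket)
            observed = sum(o for _, o in bucket) / len(bucket)
            rows.append((mean_forecast, observed, len(bucket)))
    return rows
```

A model that always predicts 50% scores 0.25 on the Brier metric regardless of what happens, which makes it a useful reference point when reading reported scores.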
We continuously refine the GoalSoon model using this feedback loop, adjusting feature weights and probability calibration based on observed outcomes.
GoalMind LIVE is built on top of the following external services:
GoalSoon live probabilities, AI analyses, and ELO predictions are all available free in the GoalMind LIVE app.