Abstract
Data-related analysis in football increasingly benefits from Big Data approaches and machine learning methods. One relevant
application of data analysis in football is forecasting, which relies on understanding and accurately modelling the process of
a match. The present paper tackles two neglected facets of forecasting in football: Forecasts on the total number of goals and
in-play forecasting (forecasts based on within-match information). Sentiment analysis techniques were used to extract the
information reflected in almost two million tweets from more than 400 Premier League matches. By means of word clouds
and temporal analysis of several tweet-based features, the Twitter communication over the full course of matches and shortly
before and after goals was visualized and systematically analysed. Moreover, several forecasting models including a random
forest model have been used to obtain in-play forecasts. Results suggest that in-play forecasting of goals is highly challenging, and in-play information does not improve forecasting accuracy. An additional analysis of goals from more than 30,000
matches from the main European football leagues supports the notion that the predictive value of in-play information is highly
limited compared to pre-game information. This is a relevant result for coaches, match analysts and broadcasters, who should
not overestimate the value of in-play information. The present study also sheds light on how the perception and behaviour
of Twitter users change over the course of a football match. A main result is that the sentiment of Twitter users decreases
as the match progresses, which might be caused by unjustifiably high expectations of football fans before the match.
Original language | English |
---|---|
Article number | 23 |
Journal | Social network analysis and mining |
Volume | 12 |
Issue number | 1 |
Pages (from - to) | 1-15 |
Number of pages | 15 |
ISSN | 1869-5469 |
DOIs | |
Publication status | Published - Dec 2022 |
Fingerprint
Explore the research topics of "A big data analysis of Twitter data during premier league matches: do tweets contain information valuable for in‑play forecasting of goals in football?". Together they form a unique fingerprint.
Projects
- 1 Finished
- Use of Big Data Analyses in Forecasting Models in Sport
18.09.17 → 21.03.22
Project: Self-funded