A big data analysis of Twitter data during premier league matches: do tweets contain information valuable for in‑play forecasting of goals in football?

Fabian Wunderlich*, Daniel Memmert

*Corresponding author for this work

Publication: Contribution to journal, Journal article, Research, Peer-reviewed

Abstract

Data-related analysis in football increasingly benefits from Big Data approaches and machine learning methods. One relevant
application of data analysis in football is forecasting, which relies on understanding and accurately modelling the process of
a match. The present paper tackles two neglected facets of forecasting in football: forecasts of the total number of goals and
in-play forecasting (forecasts based on within-match information). Sentiment analysis techniques were used to extract the
information reflected in almost two million tweets from more than 400 Premier League matches. By means of word clouds
and temporal analysis of several tweet-based features, the Twitter communication over the full course of matches and shortly
before and after goals was visualized and systematically analysed. Moreover, several forecasting models including a random
forest model have been used to obtain in-play forecasts. Results suggest that in-play forecasting of goals is highly challenging, and in-play information does not improve forecasting accuracy. An additional analysis of goals from more than 30,000
matches from the main European football leagues supports the notion that the predictive value of in-play information is highly
limited compared to pre-game information. This is a relevant result for coaches, match analysts and broadcasters who should
not overestimate the value of in-play information. The present study also sheds light on how the perception and behaviour
of Twitter users change over the course of a football match. A main result is that the sentiment of Twitter users decreases
as the match progresses, which might be caused by unjustifiably high expectations of football fans before the match.
Original language: English
Article number: 23
Journal: Social Network Analysis and Mining
Volume: 12
Issue number: 1
Pages (from - to): 1-15
Number of pages: 15
ISSN: 1869-5469
Publication status: Published - December 2022
