Forecasting Stadium Attendance Using Machine Learning Models: A Case of the National Football League

Author(s): Yu Pang, Fengchen Wang
Subject(s): Sports Studies
Published by: Masarykova univerzita nakladatelství
Keywords: machine learning; stadium attendance forecast; Random Forest; CatBoost; XGBoost

Summary/Abstract: The study examines the use of machine learning models to forecast attendance at sports stadiums, analyzing National Football League (NFL) games from 2000 to 2019, covering over 5,055 regular-season games. The models, namely Linear Regression, Classification and Regression Trees (CART), Random Forest, CatBoost, and XGBoost, integrate a diverse set of variables such as team performance, economic indicators, stadium characteristics, and weather conditions. Each model's accuracy and effectiveness are assessed using five statistical metrics. With a Mean Absolute Error (MAE) of 0.02 and a Root Mean Squared Error (RMSE) of 0.04, the models display high precision in predicting stadium attendance. The coefficient of determination (R²) reaches 77.27% after optimization. These figures suggest that the models, particularly Random Forest and CatBoost, are highly effective in forecasting attendance rates for NFL games. The variables 'stadium_name', 'personal_income', 'stadium_age', and 'home_club_age' emerge as significant predictors of game attendance. This study fills a theoretical gap in the limited research on the NFL and provides valuable insights for strategic planning and decision-making in professional sports management.
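
For readers unfamiliar with the three metrics named in the abstract, the following is a minimal sketch of how MAE, RMSE, and R² are conventionally computed. It assumes the target is an attendance rate expressed as a fraction of stadium capacity, which the abstract suggests but does not state outright. The function names and the toy values are hypothetical illustrations, not the authors' code or data.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes large errors more heavily than MAE."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

def r2(y_true, y_pred):
    """Coefficient of determination: share of variance explained by the model."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical attendance rates (attendance / capacity); NOT data from the study.
actual = [0.90, 0.95, 1.00, 0.85]
predicted = [0.92, 0.93, 0.97, 0.88]

print(f"MAE:  {mae(actual, predicted):.3f}")
print(f"RMSE: {rmse(actual, predicted):.3f}")
print(f"R2:   {r2(actual, predicted):.3f}")
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, which is consistent with the reported pair of 0.02 and 0.04.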

  • Issue Year: 18/2024
  • Issue No: 2
  • Page Range: 147-164
  • Page Count: 18
  • Language: English