SportSQL: An Interactive System for Real-Time Sports Reasoning and Visualization

Arizona State University

IJCNLP 2025 (System Demonstrations)

SportSQL system workflow

Natural language to SQL and visualizations over live Fantasy Premier League data

Abstract

SportSQL is a modular, interactive system for natural language querying and visualization of dynamic sports data, with a focus on the English Premier League (EPL). It translates user questions into executable SQL over a live, temporally indexed database constructed from real-time Fantasy Premier League (FPL) data. The system supports both tabular and visual outputs, leveraging the symbolic reasoning capabilities of Large Language Models (LLMs) for query parsing, schema linking, and visualization selection.

Introduction

Asking complex questions over sports data usually requires writing SQL or using fixed dashboards. SportSQL lets users ask questions in plain English and get back either tables or charts. The system combines real-time FPL data, a temporally indexed PostgreSQL database, and LLM-powered NL2SQL with optional multi-step “deep research” and automatic visualization selection.

This approach supports simple lookups (“How many goals has Haaland scored?”), multi-query analysis (“Compare Haaland and Salah over the last 3 seasons”), and chart generation (“Show me a chart of top scorers”), all through a single web interface.
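The single-shot path from question to answer can be sketched roughly as follows. This is a minimal illustration, not SportSQL's actual code: the one-table schema, the helper names (`build_prompt`, `is_single_select`, `answer`), and `fake_llm` (a stand-in for a Gemini or OpenAI call) are all assumptions, SQLite replaces PostgreSQL so the snippet runs without a server, and the rows are made-up numbers rather than real FPL statistics.

```python
import sqlite3
from typing import Callable

# Hypothetical one-table schema; the real SportSQL schema is not shown here.
SCHEMA = "CREATE TABLE players (name TEXT, team TEXT, goals INTEGER, assists INTEGER)"


def build_prompt(question: str, schema: str) -> str:
    """Assemble a single-shot NL2SQL prompt from the schema and the question."""
    return (
        "Translate the question into exactly one SQL SELECT statement.\n"
        f"Schema:\n{schema}\n"
        f"Question: {question}\n"
        "SQL:"
    )


def is_single_select(sql: str) -> bool:
    """Accept only one read-only SELECT statement (deliberately conservative)."""
    body = sql.strip().rstrip(";").strip()
    return body.lower().startswith("select") and ";" not in body


def answer(question: str, conn: sqlite3.Connection, llm: Callable[[str], str]):
    """Generate SQL with the model, check it, then execute it."""
    sql = llm(build_prompt(question, SCHEMA))
    if not is_single_select(sql):
        raise ValueError(f"refusing to run non-SELECT statement: {sql!r}")
    return conn.execute(sql).fetchall()


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; always returns a canned query."""
    return "SELECT goals FROM players WHERE name = 'Erling Haaland';"


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
# Toy rows with made-up numbers, not real FPL statistics.
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?, ?)",
    [("Erling Haaland", "Man City", 10, 2), ("Bukayo Saka", "Arsenal", 4, 6)],
)

rows = answer("How many goals has Haaland scored?", conn, fake_llm)
print(rows)
```

The SELECT-only check is a design choice for this sketch: model-generated SQL is treated as untrusted, so anything that is not a single read-only statement is rejected before it reaches the database. The check is strict enough to also reject a valid query that contains a semicolon inside a string literal.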

Key Features

Three query modes: Single NL2SQL, Deep Research, and Interactive Visualization

1. Single-Query NL2SQL — Direct translation of natural language to SQL. Fast, single-shot execution; ideal for simple questions about current season stats (e.g., “How many goals has Erling Haaland scored?”).

2. Deep Research Mode — Multi-query comprehensive analysis. Automatic query decomposition into sub-questions, historical data across multiple seasons, player comparison and trend analysis (e.g., “Compare Haaland and Salah’s offensive performance over the last 3 seasons”).

3. Interactive Visualization — Automatic chart generation. LLM-powered visualization selection, dynamic charts from query results, and a pre-built gallery of common visualizations.
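As a rough illustration of Deep Research Mode, the sketch below splits a comparison into one sub-question per player and season, answers each one separately, and merges the results into a single table. It rests on several assumptions: `decompose` is a rule-based stand-in for the LLM decomposition the system performs, the `season_stats` table and all helper names are hypothetical, SQLite stands in for PostgreSQL, and the players, seasons, and numbers are placeholders.

```python
import sqlite3


def decompose(players, seasons):
    """Rule-based stand-in for LLM decomposition: one sub-question per player and season."""
    return [
        {"player": p, "season": s, "question": f"How many goals did {p} score in {s}?"}
        for p in players
        for s in seasons
    ]


def run_sub_question(conn, sub):
    """Answer one sub-question with a parameterized query (stand-in for NL2SQL)."""
    row = conn.execute(
        "SELECT goals FROM season_stats WHERE player = ? AND season = ?",
        (sub["player"], sub["season"]),
    ).fetchone()
    return row[0] if row else None  # None marks a season with no data


def compare(conn, players, seasons):
    """Merge the sub-answers into a {player: {season: goals}} comparison."""
    report = {p: {} for p in players}
    for sub in decompose(players, seasons):
        report[sub["player"]][sub["season"]] = run_sub_question(conn, sub)
    return report


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE season_stats (player TEXT, season TEXT, goals INTEGER)")
# Placeholder players, seasons, and goal counts; not real data.
conn.executemany(
    "INSERT INTO season_stats VALUES (?, ?, ?)",
    [
        ("Player A", "S1", 20),
        ("Player A", "S2", 25),
        ("Player B", "S1", 18),
        ("Player B", "S2", 15),
    ],
)

report = compare(conn, ["Player A", "Player B"], ["S1", "S2", "S3"])
print(report)
```

Keeping each sub-question independent means a missing season shows up as an explicit gap in the merged report instead of silently dropping a row.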

System architecture

  • Real-time data: Live updates from Fantasy Premier League API
  • Temporal indexing: Historical data across multiple seasons
  • LLM integration: Support for both Gemini and OpenAI models
  • PostgreSQL backend: Efficient query execution and data storage
  • Modular design: Clean separation of concerns for easy extension
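One way to picture temporal indexing over live updates is a table keyed by player, season, and gameweek, where a repeated fetch for the same gameweek overwrites the earlier snapshot. The sketch below is an assumption about how such a table could look, not the project's schema: the `player_gameweek` table, its columns, and both helper functions are hypothetical, SQLite stands in for PostgreSQL, and the values are toy data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per player, season, and gameweek.
conn.execute(
    """
    CREATE TABLE player_gameweek (
        player_id   INTEGER,
        season      TEXT,
        gameweek    INTEGER,
        total_goals INTEGER,
        PRIMARY KEY (player_id, season, gameweek)
    )
    """
)


def store_snapshot(conn, player_id, season, gameweek, total_goals):
    """Insert a snapshot, overwriting any earlier fetch for the same gameweek."""
    conn.execute(
        "INSERT OR REPLACE INTO player_gameweek VALUES (?, ?, ?, ?)",
        (player_id, season, gameweek, total_goals),
    )


def goals_as_of(conn, player_id, season, gameweek):
    """Return the cumulative goals from the latest snapshot at or before `gameweek`."""
    row = conn.execute(
        "SELECT total_goals FROM player_gameweek "
        "WHERE player_id = ? AND season = ? AND gameweek <= ? "
        "ORDER BY gameweek DESC LIMIT 1",
        (player_id, season, gameweek),
    ).fetchone()
    return row[0] if row else None


# Toy data: the second fetch for gameweek 2 replaces the first.
store_snapshot(conn, 1, "S1", 1, 1)
store_snapshot(conn, 1, "S1", 2, 2)
store_snapshot(conn, 1, "S1", 2, 3)

print(goals_as_of(conn, 1, "S1", 1))
print(goals_as_of(conn, 1, "S1", 5))
```

In PostgreSQL the same overwrite behavior would typically be written with `INSERT ... ON CONFLICT ... DO UPDATE` rather than SQLite's `INSERT OR REPLACE`.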

DSQABENCH: Dynamic Sport Question Answering Benchmark

To evaluate system performance, we introduce DSQABENCH, comprising:

  • 1,700+ queries with SQL programs and gold answers
  • Database snapshots for reproducible evaluation
  • Diverse query types: Simple lookups, aggregations, comparisons, temporal queries
  • Real-world complexity: Ambiguous player names, team aliases, and temporal context
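To show how a benchmark of this shape might be scored, the sketch below runs predicted and gold SQL against the same snapshot and compares the result rows while ignoring row order. Execution accuracy is a common NL2SQL metric, but its use here is an assumption, since this page does not state which metric DSQABENCH uses. The `teams` table, the example queries, and the function names are hypothetical, and SQLite stands in for a PostgreSQL snapshot.

```python
import sqlite3
from collections import Counter


def same_result(predicted_rows, gold_rows):
    """Compare two result sets as multisets, so row order does not matter."""
    return Counter(predicted_rows) == Counter(gold_rows)


def execution_accuracy(conn, examples):
    """Fraction of examples whose predicted SQL returns the gold result."""
    if not examples:
        raise ValueError("no examples to score")
    correct = 0
    for ex in examples:
        try:
            predicted = conn.execute(ex["predicted_sql"]).fetchall()
        except sqlite3.Error:
            continue  # SQL that fails to execute counts as wrong
        gold = conn.execute(ex["gold_sql"]).fetchall()
        if same_result(predicted, gold):
            correct += 1
    return correct / len(examples)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teams (name TEXT, clean_sheets INTEGER)")
# Placeholder teams and counts; not real data.
conn.executemany(
    "INSERT INTO teams VALUES (?, ?)",
    [("Team A", 5), ("Team B", 8), ("Team C", 3)],
)

examples = [
    {  # different SQL, same answer: counted as correct
        "gold_sql": "SELECT name FROM teams ORDER BY clean_sheets DESC LIMIT 1",
        "predicted_sql": "SELECT name FROM teams "
        "WHERE clean_sheets = (SELECT MAX(clean_sheets) FROM teams)",
    },
    {  # same rows in a different order: counted as correct
        "gold_sql": "SELECT name FROM teams WHERE clean_sheets >= 5",
        "predicted_sql": "SELECT name FROM teams WHERE clean_sheets >= 5 ORDER BY name DESC",
    },
    {  # misspelled column, so the query fails: counted as wrong
        "gold_sql": "SELECT COUNT(*) FROM teams",
        "predicted_sql": "SELECT COUNT(nam) FROM teams",
    },
]

accuracy = execution_accuracy(conn, examples)
print(f"{accuracy:.2f}")
```

Comparing executed results rather than SQL text lets two differently written but equivalent queries both count as correct. Evaluating against a fixed snapshot keeps the gold answers stable even though the live data changes.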

Example Queries

Simple:

  • “Who are the top 5 goal scorers this season?”
  • “How many assists does Saka have?”
  • “Which team has the most clean sheets?”

Deep Research:

  • “Compare Erling Haaland and Mohamed Salah’s offensive performance over the last 3 seasons”
  • “Analyze Liverpool’s defensive statistics and trends this season”
  • “Show me players who consistently outperform their expected goals”

Visualization:

  • “Show me a chart of top scorers”
  • “Visualize the relationship between expected goals and actual goals for Haaland”
  • “Plot team standings by strength”
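For the visualization queries above, the chart type depends largely on the shape of the query result. SportSQL delegates this choice to an LLM. The sketch below substitutes a simple rule-based heuristic to show the kind of decision involved, so the rules, the column names, and the function names are all assumptions rather than the system's logic.

```python
from numbers import Number

# Hypothetical set of column names treated as a time axis.
TIME_COLUMNS = {"season", "gameweek"}


def column_kind(values):
    """Classify a column as 'numeric' or 'categorical' from its values."""
    is_numeric = all(isinstance(v, Number) and not isinstance(v, bool) for v in values)
    return "numeric" if is_numeric else "categorical"


def choose_chart(columns, rows):
    """Pick a chart type for a two-column result; fall back to a table otherwise."""
    if not rows or len(columns) != 2:
        return "table"
    kinds = [column_kind([row[i] for row in rows]) for i in range(2)]
    if columns[0] in TIME_COLUMNS and kinds[1] == "numeric":
        return "line"  # a value tracked over time
    if kinds == ["categorical", "numeric"]:
        return "bar"  # one value per category, e.g. goals per player
    if kinds == ["numeric", "numeric"]:
        return "scatter"  # relationship between two measures
    return "table"


# Toy result sets with placeholder values.
print(choose_chart(["player", "goals"], [("Player A", 10), ("Player B", 8)]))
print(choose_chart(["expected_goals", "goals"], [(0.8, 1), (1.2, 0)]))
print(choose_chart(["gameweek", "goals"], [(1, 0), (2, 2)]))
```

A fixed heuristic like this is predictable but cannot use the wording of the question. That limitation is one plausible reason to hand the choice to a language model.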

Screenshots

SportSQL web interface supports direct and deep-analysis modes with optional visualization.

SportSQL visualization
Team standings example

Quick Start

Prerequisites: Python 3.10 or 3.11 (3.12 is not supported), PostgreSQL 15 or later, and a Gemini or OpenAI API key.

git clone https://github.com/coral-lab-asu/SportSQL.git
cd SportSQL
conda env create -f environment.yml
conda activate sportsql
# Set up .env with DB credentials and API keys
python src/database/setup_local_db.py
cd website && python app.py --server local --port 5000

Open http://localhost:5000. For full setup (PostgreSQL, env vars, LLM config), see the README and LOCAL_SETUP.md.

BibTeX

@inproceedings{ahuja-etal-2025-sportsql,
    title = "{SPORTSQL}: An Interactive System for Real-Time Sports Reasoning and Visualization",
    author = "Ahuja, Naman and others",
    booktitle = "Proceedings of the 2025 International Joint Conference on Natural Language Processing: System Demonstrations",
    year = "2025",
    url = "https://aclanthology.org/2025.ijcnlp-demo.11",
    pages = "TBD"
}