A modular, interactive system for natural language querying and visualization of dynamic sports data, with a focus on the English Premier League (EPL).

SportSQL translates user questions into executable SQL over a live, temporally indexed database constructed from real-time Fantasy Premier League (FPL) data. It supports both tabular and visual outputs, leveraging symbolic reasoning capabilities of Large Language Models (LLMs) for query parsing, schema linking, and visualization selection.
Paper: SPORTSQL: An Interactive System for Real-Time Sports Reasoning and Visualization
Project page: https://coral-lab-asu.github.io/SportSQL/
Demo / Code: https://github.com/coral-lab-asu/SportSQL
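To make the question-to-SQL step concrete, here is a minimal sketch of executing a generated query over a temporally indexed table. It is an illustration under stated assumptions, not SportSQL's actual schema or pipeline: the `player_gameweek` table, its columns, the toy rows, and the `run_query` helper are all hypothetical, and the in-memory `sqlite3` database stands in for PostgreSQL only so the example is self-contained.

```python
import sqlite3

# Hypothetical schema: one row per player per gameweek, so the gameweek
# column acts as the temporal index. Not SportSQL's real schema.
SCHEMA = """
CREATE TABLE player_gameweek (
    player_name TEXT    NOT NULL,
    gameweek    INTEGER NOT NULL,
    goals       INTEGER NOT NULL,
    assists     INTEGER NOT NULL,
    PRIMARY KEY (player_name, gameweek)
);
"""

# Toy rows with placeholder names and made-up values, for illustration only.
ROWS = [
    ("Player A", 1, 2, 0),
    ("Player A", 2, 1, 1),
    ("Player B", 1, 1, 1),
    ("Player B", 2, 1, 2),
    ("Player C", 1, 0, 1),
    ("Player C", 2, 1, 1),
]

# SQL of the kind an LLM might emit for
# "Who are the top 2 goal scorers this season?" (assumed output).
GENERATED_SQL = """
SELECT player_name, SUM(goals) AS total_goals
FROM player_gameweek
GROUP BY player_name
ORDER BY total_goals DESC, player_name ASC
LIMIT 2;
"""


def run_query(sql, rows=ROWS):
    """Load the toy rows into a fresh in-memory database and run one query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        conn.executemany(
            "INSERT INTO player_gameweek VALUES (?, ?, ?, ?)", rows
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for name, total in run_query(GENERATED_SQL):
        print(f"{name}: {total} goals")
```

Keeping one row per player per gameweek means season totals are computed at query time with `SUM`, so a refresh only needs to append the newest gameweek.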
To evaluate system performance, we introduce the DSQABENCH benchmark.
# Clone the repository
git clone https://github.com/coral-lab-asu/SportSQL.git
cd SportSQL
# Create conda environment
conda env create -f environment.yml
conda activate sportsql
# Or use pip
pip install -r requirements.txt
# Set up PostgreSQL (macOS)
brew install postgresql@15
brew services start postgresql@15
# Configure environment variables
cp .env.example .env
# Edit .env with your database credentials and API keys
# Initialize local database with FPL data
python src/database/setup_local_db.py
# Start the Flask application
cd website
python app.py --server local --port 5000
# Open browser to http://localhost:5000
"Who are the top 5 goal scorers this season?"
"How many assists does Saka have?"
"Which team has the most clean sheets?"
"Compare Erling Haaland and Mohamed Salah's offensive performance over the last 3 seasons"
"Analyze Liverpool's defensive statistics and trends this season"
"Show me players who consistently outperform their expected goals"
"Show me a chart of top scorers"
"Visualize the relationship between expected goals and actual goals for Haaland"
"Plot team standings by strength"
SportSQL/
├── src/                      # Core source code
│   ├── database/             # Database layer (PostgreSQL)
│   ├── llm/                  # LLM integration (Gemini/OpenAI)
│   ├── nl2sql/               # Single-query NL2SQL
│   ├── deep_research/        # Deep research mode
│   └── visualization/        # Chart generation
│
├── website/                  # Web interface
│   ├── app.py                # Flask application
│   ├── static/               # CSS, JS, images
│   └── templates/            # HTML templates
│
├── data/                     # Dataset (CSV files)
├── docs/                     # Documentation
├── scripts/                  # Utility scripts
├── benchmarking/             # Evaluation scripts & results
└── update_player_mappings/   # Ground truth tools
See STRUCTURE.md for detailed documentation.
Create a .env file in the project root:
# PostgreSQL Configuration
LOCAL_DATABASE_HOST=localhost
LOCAL_DATABASE_PORT=5432
LOCAL_DATABASE_USER=your_username
LOCAL_DATABASE_PASSWORD=your_password
LOCAL_DATABASE_NAME=postgres
# LLM Configuration (choose one or both)
# Gemini (default)
API_KEY=your_gemini_api_key
GEMINI_MODEL=gemini-2.0-flash
# OpenAI (optional)
OPENAI_API_KEY=your_openai_api_key
GPT_MODEL=gpt-4o
# Use Gemini (default)
python website/app.py --server local
# Use OpenAI
python website/app.py --server local --llm openai
See docs/LLM_USAGE.md for detailed LLM configuration.
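As a rough sketch of how the variables above could be consumed, the snippet below reads the database and LLM settings from the environment and selects a provider. The `load_settings` function, the returned dictionary layout, and the fallback defaults are assumptions made for illustration; the project's real loading code may differ.

```python
import os


def load_settings(env=None, llm="gemini"):
    """Collect database and LLM settings from environment-style variables.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    The fallback values mirror the example .env above and are assumptions.
    """
    env = os.environ if env is None else env

    database = {
        "host": env.get("LOCAL_DATABASE_HOST", "localhost"),
        "port": int(env.get("LOCAL_DATABASE_PORT", "5432")),
        # Credentials have no sensible default, so a missing key raises KeyError.
        "user": env["LOCAL_DATABASE_USER"],
        "password": env["LOCAL_DATABASE_PASSWORD"],
        "dbname": env.get("LOCAL_DATABASE_NAME", "postgres"),
    }

    if llm == "gemini":
        llm_settings = {
            "provider": "gemini",
            "api_key": env["API_KEY"],
            "model": env.get("GEMINI_MODEL", "gemini-2.0-flash"),
        }
    elif llm == "openai":
        llm_settings = {
            "provider": "openai",
            "api_key": env["OPENAI_API_KEY"],
            "model": env.get("GPT_MODEL", "gpt-4o"),
        }
    else:
        raise ValueError(f"Unknown LLM provider: {llm!r}")

    return {"database": database, "llm": llm_settings}


if __name__ == "__main__":
    example_env = {
        "LOCAL_DATABASE_USER": "your_username",
        "LOCAL_DATABASE_PASSWORD": "your_password",
        "API_KEY": "your_gemini_api_key",
    }
    print(load_settings(example_env)["llm"]["model"])
```

Failing fast on missing credentials surfaces a misconfigured .env at startup rather than on the first query.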
Run the evaluation pipeline on DSQABENCH:
# Evaluate the full pipeline
python scripts/evaluate_pipeline.py
# Test specific components
python scripts/test_evaluation.py
# Run benchmarking scripts
python benchmarking/scripts/llm_sql_evaluator.py
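For readers unfamiliar with how text-to-SQL output is commonly scored, the sketch below computes execution accuracy: a predicted query counts as correct when it returns the same rows as the gold query. This is a generic illustration and not necessarily what the scripts above compute; the `team_stats` table, its toy values, and the function names are hypothetical, and `sqlite3` replaces PostgreSQL to keep the example self-contained.

```python
import sqlite3
from collections import Counter


def execute(conn, sql):
    """Run a query and return its rows as a multiset, or None if it fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, gold_sql) pairs whose result rows match.

    Rows are compared as multisets, so row order is ignored.
    """
    if not pairs:
        return 0.0
    correct = 0
    for predicted_sql, gold_sql in pairs:
        gold = execute(conn, gold_sql)
        predicted = execute(conn, predicted_sql)
        if gold is not None and predicted == gold:
            correct += 1
    return correct / len(pairs)


def demo_connection():
    """Build a tiny in-memory table with placeholder teams and made-up values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE team_stats (team TEXT, clean_sheets INTEGER)")
    conn.executemany(
        "INSERT INTO team_stats VALUES (?, ?)",
        [("Team A", 13), ("Team B", 10), ("Team C", 5)],
    )
    return conn


GOLD = "SELECT team FROM team_stats WHERE clean_sheets > 9"
PAIRS = [
    ("SELECT team FROM team_stats WHERE clean_sheets >= 10", GOLD),  # same rows
    ("SELECT team FROM team_stats WHERE clean_sheets > 12", GOLD),   # misses a row
    ("SELEC team FROM team_stats", GOLD),                            # syntax error
]

if __name__ == "__main__":
    conn = demo_connection()
    print(f"execution accuracy: {execution_accuracy(conn, PAIRS):.2f}")
    conn.close()
```

Because the comparison ignores row order, a stricter check would be needed for questions where ordering is part of the answer, such as "top 5" rankings.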
If you use SportSQL or DSQABENCH in your research, please cite:
@inproceedings{ahuja-etal-2025-sportsql,
title = "{SPORTSQL}: An Interactive System for Real-Time Sports Reasoning and Visualization",
author = "Ahuja, Naman and others",
booktitle = "Proceedings of the 2025 International Joint Conference on Natural Language Processing: System Demonstrations",
year = "2025",
url = "https://aclanthology.org/2025.ijcnlp-demo.11",
pages = "TBD"
}
# Test imports after reorganization
python test_imports.py
# Run evaluation tests
python scripts/test_evaluation.py
# Refresh local database with latest FPL data
python src/database/setup_local_db.py
# Update specific player data
python scripts/update_db.py
The modular architecture makes it easy to extend:
- src/nl2sql/ or src/deep_research/
- src/llm/wrapper.py
- src/visualization/
- src/database/operations.py

We welcome contributions! Please:
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)

This project is licensed under the MIT License; see the LICENSE file for details.
For questions or issues:
If you find SportSQL useful, please consider giving it a star!