I built an AI-powered fantasy football assistant with a natural language chat interface.

Not separate tools you have to choose between.

A conversational system where you ask questions in plain English, and the AI automatically routes to the right tool, fetches the data it needs, and gives you actionable analysis.

Ask "Who should I start this week?" and it:

  • Detects you want lineup optimization

  • Pulls your current roster from ESPN's API

  • Fetches your opponent's lineup for context

  • Analyzes matchups and projections

  • Returns specific recommendations with reasoning

Ask "What trades can I make?" and it:

  • Switches to trade analysis mode

  • Scans all team rosters in your league

  • Identifies teams with complementary needs

  • Generates realistic multi-player trade proposals

The key innovation was removing the friction.

No dropdown menus, no selecting which feature you want.

Just chat, and the system figures out what you need.

The Architecture

This required three main components working together:

1. ESPN API Reverse Engineering

ESPN doesn't offer a documented public API.

I reverse-engineered the undocumented endpoints its fantasy web app uses to pull live data.

2. AI Agent System with Tool Routing

The chat interface doesn't just pass your message to GPT-4.

It intelligently routes to specialized tools based on intent:

  • Lineup optimization

  • Trade analysis

  • Waiver wire recommendations

  • Player comparisons

  • Deep research with web search

Each function has its own data requirements and prompting strategy.

3. Rate Limiting Infrastructure

This is deployed publicly, where anyone can use it.

I implemented a $10/hour spending cap to prevent runaway OpenAI costs:

  • Pre-request cost estimation

  • Real-time usage tracking

  • Automatic request blocking when approaching limit

  • Per-feature cost breakdown in the dashboard

Part 1: How to Reverse Engineer Any API

This process works for ESPN, DoorDash, Netflix, LinkedIn, whatever.

The steps are always the same.

Step 1: Open Developer Tools

Press F12 or right-click and hit "Inspect", then go to the Network tab.

My process:

  1. Clear out old requests

  2. Filter to "Fetch/XHR" requests

  3. Navigate to the page with data you want

  4. Watch what requests show up

For ESPN, I found:

lm-api-reads.fantasy.espn.com/apis/v3/games/ffl/seasons/2025/segments/0/leagues/{LEAGUE_ID}

Step 2: Analyze the Request

Click on the request to see everything.

Headers show authentication methods, required custom headers, browser info.

URL Parameters reveal filtering, pagination, sorting options.

Request Body (for POST) shows expected data format.

Step 3: Extract Authentication

Most APIs require authentication.

Common methods:

Cookie-Based: Look in DevTools under Application → Cookies. Copy the values.

Token-Based: Check for Authorization headers like Authorization: Bearer <token>.

API Keys: Sometimes in URL parameters or custom headers.

Store these in environment variables; never commit them to version control.
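
For ESPN, the cookies that matter are espn_s2 and SWID (you only need them for private leagues). A minimal sketch of keeping them out of the codebase, assuming you export them as environment variables:

import os

# Values copied from DevTools → Application → Cookies, exported as env vars
ESPN_COOKIES = {
    "espn_s2": os.environ["ESPN_S2"],
    "SWID": os.environ["SWID"],   # formatted like "{XXXXXXXX-XXXX-...}"
}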

Step 4: Replicate the Request

Make the same request programmatically.

Key things:

  • Match all headers the browser sends

  • Include authentication

  • Use correct HTTP method

  • Structure request bodies exactly as expected

I use Python's requests library with a Session to maintain cookies, add custom headers, and make authenticated GET requests.
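­
Here's roughly what that looks like, reusing the cookies from Step 3 (the league ID is a placeholder, and the "view" parameters are how ESPN selects which blocks of data come back; this is a sketch, not the exact wrapper):

import requests

LEAGUE_ID = "123456"   # placeholder; use your own league ID
BASE_URL = (
    "https://lm-api-reads.fantasy.espn.com/apis/v3/games/ffl/"
    f"seasons/2025/segments/0/leagues/{LEAGUE_ID}"
)

session = requests.Session()
session.cookies.update(ESPN_COOKIES)    # the dict built in Step 3
session.headers.update({
    "Accept": "application/json",
    "User-Agent": "Mozilla/5.0",         # mirror what the browser sends
})

resp = session.get(BASE_URL, params={"view": ["mRoster", "mMatchup"]})
resp.raise_for_status()
league = resp.json()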

Step 5: Parse the Response

APIs use nested data structures and numeric IDs.

ESPN uses numeric IDs everywhere:

  • Position ID 2 = QB

  • Team ID 1 = Atlanta

  • Lineup slot 20 = Bench

I built mapping dictionaries (sketched below) by:

  • Parsing league settings dynamically

  • Hardcoding static values (NFL teams don't change)

  • Cross-referencing multiple endpoints
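
A simplified version of those mappings (the IDs shown are examples; verify them against your own league's settings response):

# Static lookups, hardcoded once (NFL teams and lineup slots don't change)
PRO_TEAMS = {1: "ATL", 2: "BUF", 3: "CHI"}     # ...continue for all 32 teams
LINEUP_SLOTS = {20: "Bench", 23: "FLEX"}       # ...plus the rest of your slots

def slot_name(slot_id: int) -> str:
    """Translate a numeric lineup-slot ID into something readable."""
    return LINEUP_SLOTS.get(slot_id, f"slot_{slot_id}")

# Position and scoring mappings come from the league settings response
# ("mSettings" view), which describes what your specific league actually uses.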

Step 6: Handle Production Concerns

This is where the real work begins.

Rate Limiting: I limit to 30 requests/minute for ESPN.

Caching: League settings cached all season, projections for 15 minutes.

Error Handling: Retry logic with exponential backoff, graceful failures.

Response Validation: Check fields exist before accessing, handle partial data during live games.
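
The retry logic is the piece most worth showing. A minimal sketch built on the requests session from Step 4 (the retry counts and function name are illustrative):

import time
import requests

def get_with_retry(session, url, params=None, retries=3):
    """GET with exponential backoff; fail gracefully instead of crashing."""
    for attempt in range(retries):
        try:
            resp = session.get(url, params=params, timeout=10)
            resp.raise_for_status()
            return resp.json()
        except (requests.RequestException, ValueError):
            if attempt == retries - 1:
                return None              # caller decides how to handle missing data
            time.sleep(2 ** attempt)     # back off: 1s, 2s, 4s...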

Part 2: The Chat Interface Architecture

The hard part wasn't reverse engineering ESPN's API.

It was building a chat system that intelligently routes to the right tools.

The Problem with Tool Selection

You can't just pass every message to GPT-4 and let it figure out what to do.

That's expensive and slow.

You need intelligent pre-routing based on user intent.

How I Built the Router

The chat handler analyzes the incoming message for keywords and context:

User: "Who should I start this week?"
→ Detects: "start" + "week" 
→ Routes to: Lineup Optimizer
→ Data needed: My roster + opponent roster + projections
User: "What trades can I make for a running back?"
→ Detects: "trade" + position mention
→ Routes to: Trade Analyzer
→ Data needed: All league rosters + positional needs
User: "Should I pick up Player X?"
→ Detects: "pick up" + player name
→ Routes to: Waiver Wire Analyzer
→ Data needed: Available players + my team needs

Each route triggers different ESPN API calls to gather only the necessary data.
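
Under the hood it's closer to a keyword pass than a full LLM call. A stripped-down sketch (the real router uses more keywords plus conversation context):

def route_intent(message: str) -> str:
    """Cheap pre-routing before any model gets involved."""
    text = message.lower()
    if "trade" in text:
        return "trade_analyzer"
    if any(kw in text for kw in ("start", "sit", "lineup")):
        return "lineup_optimizer"
    if any(kw in text for kw in ("pick up", "waiver", "drop")):
        return "waiver_analyzer"
    return "general_chat"               # fall back to a plain LLM answer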

The Data Collection Phase

Once intent is detected, the system makes multiple ESPN API calls:

For lineup optimization:

  1. Get current NFL week

  2. Fetch my team's roster with projections

  3. Get opponent's roster and projections

  4. Pull injury statuses

  5. Retrieve matchup context

For trade analysis:

  1. Fetch all team rosters in the league

  2. Get season-long projections

  3. Calculate positional needs per team

  4. Pull team records and playoff positioning

This happens in parallel where possible to minimize latency.
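
A sketch of the parallel fetch for the lineup case, using a thread pool (the client methods here stand in for my ESPN wrapper and aren't its real names):

from concurrent.futures import ThreadPoolExecutor

def collect_lineup_data(client, week):
    """Fire the independent ESPN calls concurrently, then wait for all of them."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {
            "my_roster": pool.submit(client.get_my_roster, week),
            "opponent_roster": pool.submit(client.get_opponent_roster, week),
            "injuries": pool.submit(client.get_injury_statuses, week),
            "matchup": pool.submit(client.get_matchup, week),
        }
        return {name: f.result() for name, f in futures.items()}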

The AI Layer

After data collection, I transform everything into clean, structured prompts.

Raw ESPN data is a mess for LLMs:

Nested objects, numeric IDs, inconsistent fields.

I built a transformation layer that:

  • Flattens nested structures

  • Translates numeric IDs to readable names

  • Formats specifically for context windows

  • Strips unnecessary fields to save tokens

The AI receives formatted data like:

Your Roster:
- Josh Allen (QB, BUF) - Proj: 22.5 pts - Status: Healthy - vs MIA
- Christian McCaffrey (RB, SF) - Proj: 18.3 pts - Status: Questionable - vs LAR

Opponent's Roster:
- Patrick Mahomes (QB, KC) - Proj: 24.1 pts - Status: Healthy - vs DEN

Instead of raw JSON with nested objects.
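
The formatting step itself is boring but important. A sketch of how a cleaned-up player record becomes one prompt line (the dict keys are my flattened field names, not ESPN's):

def format_player(p: dict) -> str:
    """One readable line per player, built from already-flattened fields."""
    return (f"- {p['name']} ({p['position']}, {p['team']}) "
            f"- Proj: {p['projection']:.1f} pts "
            f"- Status: {p['status']} - vs {p['opponent']}")

def format_roster(title: str, players: list[dict]) -> str:
    return f"{title}:\n" + "\n".join(format_player(p) for p in players)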

Context Window Optimization

Fitting everything into the context window required careful engineering.

For a start/bench decision, the agent needs:

  • My full roster (15 players with stats)

  • Opponent's roster (15 players with stats)

  • League scoring settings

  • Current lineup configuration

  • Matchup analysis

That's 3000+ tokens before the AI responds.

I optimized by:

  • Removing redundant information

  • Abbreviating field names

  • Including only relevant stats

  • Using token-efficient formats

This made the difference between agents that work and agents that hit context limits.

The Specialized Agents

Different questions need different analysis approaches.

Lineup Optimizer: Considers variance, not just projections. A high-variance player might be better if you're projected to lose.

Trade Constructor: Scans all league teams to find complementary needs. Generates specific 1-for-1, 2-for-1, or 2-for-2 proposals with reasoning for both sides.

Waiver Wire Analyzer: Filters to only available players, ranks by upside and fit, focuses on weak positions.

Deep Research Agent: Makes web searches for injury reports, weather, defensive matchups. Slower (30-60 seconds) but gives analysis you can't get from projections alone.

Each agent has custom system prompts and output formats.
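
In practice that's just a lookup from route to prompt. An abbreviated sketch (the wording here is illustrative, not the production prompts):

SYSTEM_PROMPTS = {
    "lineup_optimizer": (
        "You are a fantasy football lineup analyst. Weigh projections and "
        "variance; favor high-variance players when the user is the underdog. "
        "Return start/sit calls with one line of reasoning each."
    ),
    "trade_analyzer": (
        "You construct realistic trades. Only propose deals that plausibly "
        "help both teams, and explain the value for each side."
    ),
    "waiver_analyzer": (
        "You rank available free agents by upside and roster fit, "
        "prioritizing the user's weakest positions."
    ),
}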

Part 3: The Production Engineering

Building a demo is easy.

Making it work reliably in production is hard.

Problem 1: Rate Limiting for Cost Control

This is deployed publicly, so I needed protection against runaway costs.

I implemented a $10/hour spending cap:

Before each request:

  • Estimate token usage based on data size

  • Check current hourly spending from session state

  • Block request if it would exceed limit

  • Return 429 error with clear message

After each request:

  • Record actual token usage

  • Calculate real cost (input + output tokens × pricing)

  • Update session state with timestamp

  • Clean up usage records older than 1 hour

This prevents $500 surprise bills if someone spams the API.
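
The whole mechanism fits in a few functions. A simplified sketch (the cap, pricing, and storage details are illustrative; the real version lives in session state with a per-feature breakdown):

import time

HOURLY_CAP_USD = 10.00
usage_log = []   # (timestamp, cost) pairs; the real app keeps this in session state

def spent_last_hour() -> float:
    cutoff = time.time() - 3600
    usage_log[:] = [(ts, c) for ts, c in usage_log if ts > cutoff]   # drop old records
    return sum(c for _, c in usage_log)

def within_budget(estimated_cost: float) -> bool:
    """Checked before each request; a False here becomes a 429 upstream."""
    return spent_last_hour() + estimated_cost <= HOURLY_CAP_USD

def record_usage(input_tokens: int, output_tokens: int,
                 price_in: float, price_out: float) -> None:
    """Called after each request with the actual token counts."""
    usage_log.append((time.time(), input_tokens * price_in + output_tokens * price_out))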

Problem 2: Handling ESPN's Data Inconsistencies

ESPN's API sometimes returns partial data, especially during live games.

I had to handle:

  • Missing player projections

  • Incomplete roster entries

  • Null injury statuses

  • Different league scoring formats

  • Edge cases like bye weeks

The wrapper validates every field before accessing it and fails gracefully when data is missing.
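
Most of that validation is unglamorous defensive access. A sketch of the pattern (the field names reflect ESPN's payload as I've seen it; treat them as examples):

def safe_projection(player_entry: dict) -> float | None:
    """Walk the nested stats defensively; return None instead of raising."""
    for stat in player_entry.get("stats", []):
        value = stat.get("appliedTotal")
        if value is not None:
            return float(value)
    return None   # caller shows "no projection" rather than crashing mid-game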

Problem 3: Making It Fast Enough to Be Usable

Nobody wants to wait 30 seconds for a lineup decision.

Optimizations I made:

  • Aggressive caching (settings cached all season, projections for 15 minutes)

  • Parallel ESPN API calls where possible

  • Request batching using ESPN's "view" system

  • Context window optimization to reduce AI processing time

Most decisions now feel instant despite making multiple API calls and AI requests.
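
The caching layer is nothing fancy, just a TTL check in front of the fetch. A minimal sketch (keys, TTLs, and the espn_client name are illustrative):

import time

_cache = {}   # key -> (expires_at, value)

def cached(key, ttl_seconds, fetch):
    """Return a fresh cached value if one exists, otherwise fetch and store it."""
    now = time.time()
    if key in _cache and _cache[key][0] > now:
        return _cache[key][1]
    value = fetch()
    _cache[key] = (now + ttl_seconds, value)
    return value

# Example usage: settings cached for a day, projections for 15 minutes
# settings = cached("settings", 86400, espn_client.get_settings)
# projections = cached(f"proj_{week}", 900, lambda: espn_client.get_projections(week))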

Problem 4: Maintaining Conversation Context

The chat interface needs to remember previous messages.

I use Streamlit session state to persist:

  • Full conversation history

  • Previous tool calls

  • User preferences mentioned in chat

  • Cost tracking across the session

This lets you ask follow-up questions like "What about if I trade for a WR instead?" and the system knows you're continuing the trade analysis conversation.
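
The Streamlit side of that is only a few lines. A sketch of the pattern (handle_message is a hypothetical stand-in for the router, data collection, and LLM call):

import streamlit as st

if "messages" not in st.session_state:
    st.session_state.messages = []       # full conversation history
    st.session_state.usage_log = []      # cost tracking for the rate limiter

for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.write(msg["content"])

prompt = st.chat_input("Ask about your team...")
if prompt:
    st.session_state.messages.append({"role": "user", "content": prompt})
    # the full history goes along with the new message, so follow-ups keep context
    reply = handle_message(st.session_state.messages)   # hypothetical helper
    st.session_state.messages.append({"role": "assistant", "content": reply})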

The Technical Stack

Backend:

  • FastAPI for REST endpoints

  • Custom ESPN API wrapper

  • Rate limiter with cost tracking

Frontend:

  • React/TypeScript

  • Real-time chat interface

  • Usage/cost dashboard

  • Quick action buttons

AI:

  • OpenAI for reasoning

  • Structured prompts per agent type

  • Context window optimization

  • Multi-agent routing system

Three Key Takeaways

  1. Every web app exposes its APIs in the browser: The Network tab shows everything. Frontend apps have to make API calls, so you can see exactly what they're doing.

  2. Chat interfaces need intelligent routing: Don't just pass every message to the LLM. Pre-route based on intent, then gather only the necessary data and use specialized prompts.

  3. Production requires engineering: Rate limiting, caching, error handling, and cost control aren't nice-to-haves. They're what separates demos from systems people actually use.

Disclaimer: This post describes techniques for accessing your own data through undocumented APIs for personal projects. Use responsibly, respect rate limits, and don't access data you don't have permission to view. This is for educational purposes and personal use only.
