wrds-download

TUI/CLI tool for browsing and downloading WRDS data

commit daf97adc20f9e2acaa020727efd47940bd3572cb
parent 07747b009035be9ccb836492c7961c0937d1889f
Author: Erik Loualiche
Date:   Fri, 20 Feb 2026 17:02:08 -0600

Merge pull request #6 from LouLouLibs/feat/claude-skill

Add Claude Code skill for WRDS downloads
Diffstat:
M README.md                            |  25 +++++++++++++++++++++++++
A claude-skill-wrds-download/README.md |  40 ++++++++++++++++++++++++++++++++++++++++
A claude-skill-wrds-download/SKILL.md  | 127 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 192 insertions(+), 0 deletions(-)

diff --git a/README.md b/README.md
@@ -221,6 +221,28 @@ For machine-readable output (useful in scripts and coding assistants):
 wrds-dl info --schema crsp --table dsf --json
 ```

+## Claude Code skill
+
+A bundled [Claude Code](https://claude.com/claude-code) skill lets you download WRDS data using natural language:
+
+```
+/wrds-download CRSP daily stock data for 2020
+```
+
+Claude will inspect the table, show you the structure, do a dry run for large tables, and download to Parquet.
+
+To install, copy the skill into your project or personal skills directory:
+
+```sh
+# Project-level (committed to repo)
+cp -r claude-skill-wrds-download .claude/skills/wrds-download
+
+# Personal (all your projects)
+cp -r claude-skill-wrds-download ~/.claude/skills/wrds-download
+```
+
+See [`claude-skill-wrds-download/README.md`](claude-skill-wrds-download/README.md) for details.
+
 ## How it works

 `wrds-dl` connects directly to the WRDS PostgreSQL server using [pgx](https://github.com/jackc/pgx). All operations — metadata browsing, column inspection, and data download — go through a single pooled connection (limited to 1 to avoid triggering multiple Duo 2FA prompts).
@@ -240,6 +262,9 @@ Schema and table names are quoted as PostgreSQL identifiers to prevent SQL injec
 ```
 wrds-download/
 ├── main.go # entrypoint
+├── claude-skill-wrds-download/
+│ ├── SKILL.md # Claude Code skill for natural-language downloads
+│ └── README.md # skill installation instructions
 ├── cmd/
 │ ├── root.go # cobra root command
 │ ├── tui.go # `wrds-dl tui` — launches interactive browser
diff --git a/claude-skill-wrds-download/README.md b/claude-skill-wrds-download/README.md
@@ -0,0 +1,40 @@
+# Claude Skill: wrds-download
+
+A [Claude Code](https://claude.com/claude-code) skill that lets you download data from WRDS using natural language.
+
+## Installation
+
+### Option 1: Copy into your project
+
+```bash
+cp -r claude-skill-wrds-download .claude/skills/wrds-download
+```
+
+### Option 2: Copy to your personal skills (works across all projects)
+
+```bash
+cp -r claude-skill-wrds-download ~/.claude/skills/wrds-download
+```
+
+## Prerequisites
+
+1. **`wrds-dl` binary** on your PATH — see [Installation](../README.md#installation)
+2. **WRDS credentials** configured via environment variables, saved credentials, or `~/.pgpass`
+
+## Usage
+
+In Claude Code, type:
+
+```
+/wrds-download CRSP daily stock data for 2020
+```
+
+```
+/wrds-download Compustat annual fundamentals, gvkey datadate and sales, 2010-2023
+```
+
+```
+/wrds-download IBES analyst EPS estimates for Apple
+```
+
+Claude will inspect the table, show you the structure and row count, do a dry run for large tables, and download the data to a local Parquet file.
diff --git a/claude-skill-wrds-download/SKILL.md b/claude-skill-wrds-download/SKILL.md
@@ -0,0 +1,127 @@
+---
+name: wrds-download
+description: Download data from the WRDS (Wharton Research Data Services) PostgreSQL database to local Parquet or CSV files. Use when the user asks to get data from WRDS, download financial data, or mentions WRDS schemas like crsp, comp, optionm, ibes, etc.
+allowed-tools: Bash(wrds-dl *), Read, Grep
+argument-hint: [description of data needed]
+---
+
+# WRDS Data Download
+
+You help users download data from the Wharton Research Data Services (WRDS) PostgreSQL database using the `wrds-dl` CLI tool.
+
+## Prerequisites
+
+The `wrds-dl` binary must be installed and on the PATH. The user must have WRDS credentials configured via one of:
+- Environment variables: `PGUSER` and `PGPASSWORD`
+- Saved credentials at `~/.config/wrds-dl/credentials`
+- Standard `~/.pgpass` file
+
+If `wrds-dl` is not found, tell the user to install it from https://github.com/LouLouLibs/wrds-download/releases or build from source with `go build`.
+
+## Workflow
+
+Follow these steps for every download request:
+
+### Step 1: Identify the table
+
+Parse the user's request to determine the WRDS schema and table. Common mappings:
+
+| Dataset | Schema | Key Tables |
+|---------|--------|------------|
+| CRSP daily stock | `crsp` | `dsf` (daily), `msf` (monthly), `dsi` (index) |
+| CRSP events | `crsp` | `dsedelist`, `stocknames` |
+| Compustat annual | `comp` | `funda` |
+| Compustat quarterly | `comp` | `fundq` |
+| Compustat global | `comp_global_daily` | `g_funda`, `g_fundq` |
+| IBES | `ibes` | `statsum_epsus`, `actu_epsus` |
+| OptionMetrics | `optionm` | `opprcd` (prices), `secprd` (security) |
+| TAQ | `taqmsec` | `ctm_YYYYMMDD` |
+| CRSP/Compustat merged | `crsp` | `ccmxpf_linktable` |
+| BoardEx | `boardex` | `na_wrds_company_profile` |
+| Institutional (13F) | `tfn` | `s34` |
+| Audit Analytics | `audit` | `auditnonreli` |
+| Ravenpack | `ravenpack` | `rpa_djnw` |
+| Bank Regulatory | `bank` | `call_schedule_rc`, `bhck` |
+
+If unsure which table, ask the user or use `wrds-dl info` to explore.
+
+### Step 2: Inspect the table
+
+Always run `wrds-dl info` first to understand the table structure:
+
+```bash
+wrds-dl info --schema <schema> --table <table>
+```
+
+Use the output to:
+- Confirm the table exists and has the expected columns
+- Note column names for the user's requested variables
+- Check the estimated row count to warn about large downloads
+
+For JSON output (useful for parsing): `wrds-dl info --schema <schema> --table <table> --json`
+
+### Step 3: Dry run
+
+For tables with more than 1 million estimated rows, or when a WHERE clause is involved, always do a dry run first:
+
+```bash
+wrds-dl download --schema <schema> --table <table> \
+  --columns "<cols>" --where "<filter>" --dry-run
+```
+
+Show the user the row count and sample rows. Ask for confirmation before proceeding if the row count is very large (>10M rows).
+
+### Step 4: Download
+
+Build and run the download command:
+
+```bash
+wrds-dl download \
+  --schema <schema> \
+  --table <table> \
+  --columns "<comma-separated columns>" \
+  --where "<SQL filter>" \
+  --out <output_file> \
+  --format <parquet|csv>
+```
+
+#### Defaults and conventions
+- **Format**: Use Parquet unless the user asks for CSV. Parquet is smaller and faster.
+- **Output path**: Name the file descriptively, e.g., `crsp_dsf_2020.parquet` or `comp_funda_2010_2023.parquet`.
+- **Columns**: Select only the columns the user needs. Don't use `*` on wide tables — ask what variables they need.
+- **Limit**: Use `--limit` for testing. Suggest `--limit 1000` if the user is exploring.
+
+#### Common filters
+- Date ranges: `--where "date >= '2020-01-01' AND date < '2021-01-01'"`
+- Specific firms by permno: `--where "permno IN (10107, 93436)"`
+- Specific firms by gvkey: `--where "gvkey IN ('001690', '012141')"`
+- Fiscal year: `--where "fyear >= 2010 AND fyear <= 2023"`
+
+### Step 5: Verify
+
+After download completes, confirm the file was created and report its size:
+
+```bash
+ls -lh <output_file>
+```
+
+## Error handling
+
+- **Authentication errors**: Remind the user to set `PGUSER`/`PGPASSWORD` or run `wrds-dl tui` to save credentials.
+- **Table not found**: Use `wrds-dl info` to check schema/table names. WRDS schemas and table names are lowercase.
+- **Timeout on large tables**: Suggest adding a `--where` filter or `--limit` to reduce the result set.
+- **Duo 2FA prompt**: The connection triggers a Duo push. Tell the user to approve it on their phone.
+
+## Example interactions
+
+**User**: "Download CRSP daily stock data for 2020"
+→ `wrds-dl info --schema crsp --table dsf`
+→ `wrds-dl download --schema crsp --table dsf --where "date >= '2020-01-01' AND date < '2021-01-01'" --out crsp_dsf_2020.parquet`
+
+**User**: "Get Compustat annual fundamentals, just gvkey, datadate, and sales"
+→ `wrds-dl info --schema comp --table funda`
+→ `wrds-dl download --schema comp --table funda --columns "gvkey,datadate,sale" --out comp_funda.parquet`
+
+**User**: "I need IBES analyst estimates"
+→ `wrds-dl info --schema ibes --table statsum_epsus`
+→ Ask what date range and variables they need, then download.
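
For reviewers who want to see the SKILL.md workflow as a command sequence, the sketch below prints, in order, the commands that Steps 2 through 5 describe for one request. It only prints them and never contacts WRDS. `plan_download` is a hypothetical helper and not part of `wrds-dl`; the flags are the ones named in the SKILL.md added by this commit, and the column list is an illustrative assumption.

```sh
# Sketch only: print the wrds-dl commands for Steps 2-5 of SKILL.md.
# plan_download is a hypothetical helper, not part of wrds-dl.
plan_download() {
    schema=$1
    table=$2
    columns=$3
    where=$4
    out=$5

    # Step 2: inspect the table before anything else.
    printf 'wrds-dl info --schema %s --table %s\n' "$schema" "$table"

    # Step 3: dry run, since a WHERE clause is involved.
    printf 'wrds-dl download --schema %s --table %s --columns "%s" --where "%s" --dry-run\n' \
        "$schema" "$table" "$columns" "$where"

    # Step 4: the download itself, defaulting to Parquet.
    printf 'wrds-dl download --schema %s --table %s --columns "%s" --where "%s" --out %s --format parquet\n' \
        "$schema" "$table" "$columns" "$where" "$out"

    # Step 5: confirm the file exists and report its size.
    printf 'ls -lh %s\n' "$out"
}

# Illustrative request: CRSP daily stock data for 2020 (column list assumed).
plan_download crsp dsf "permno,date" \
    "date >= '2020-01-01' AND date < '2021-01-01'" crsp_dsf_2020.parquet
```

Printing the plan rather than executing it keeps the sketch usable without WRDS credentials or a Duo prompt; piping a reviewed plan to `sh` would run it for real.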