Data Quality & Verification
We would rather you verify our data than take our word for it. The free samples are identical in structure to the paid files, so you can run every check below before spending anything. This page shows you exactly how, and lists what our own validation does.
Current coverage (live)
These figures come from our live catalog API and update automatically; query /api/status for the latest.
| Dataset | Coverage | Scale |
|---|---|---|
| Stock daily & 1-minute | 2003-11 to the latest weekly bundle | 15,000+ active and 50,000+ delisted tickers |
| Options EOD | 2002-05 to the latest trading day | ~6,000 underlyings on a typical day |
You can confirm coverage for any individual symbol with the Ticker Lookup on the stocks page or /api/symbol/{ticker}.
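If you prefer to script the per-symbol lookup, a minimal sketch follows. It assumes the endpoint is served over HTTPS and returns JSON. Neither the base URL nor the response fields are documented on this page, so `https://example.com` and both helper names are placeholders.

```python
import json
from urllib.parse import quote, urljoin
from urllib.request import urlopen


def symbol_url(base_url, ticker):
    """Build the /api/symbol/{ticker} URL; base_url is a placeholder you must supply."""
    # quote() escapes characters such as "/" that would otherwise split the path.
    return urljoin(base_url, "/api/symbol/" + quote(ticker, safe=""))


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the body as JSON (assumed response format)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Building the URL needs no network access; fetch_json(url) would perform the request.
url = symbol_url("https://example.com", "AAPL")
print(url)
```

Inspect the decoded response by hand before relying on any particular field name.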
Verify the free sample yourself
Download SAMPLE_AAPL_day.csv (or any file from Free_day.zip) and run:
import pandas as pd
df = pd.read_csv("SAMPLE_AAPL_day.csv", parse_dates=["Time"])
# 1) Structure: 16 columns, the first 8 in a fixed order, no duplicate dates
assert df.shape[1] == 16
assert list(df.columns)[:8] == ["Time","Open","Close","Volume","High","Low","Average","Transactions"]
assert df["Time"].is_unique
# 2) OHLC sanity: High is the max, Low is the min, volume non-negative
assert (df["High"] >= df[["Open","Close","Low"]].max(axis=1)).all()
assert (df["Low"] <= df[["Open","Close","High"]].min(axis=1)).all()
assert (df["Volume"] >= 0).all()
# 3) Trading-day continuity: no gap longer than a long weekend/holiday break
gaps = df["Time"].sort_values().diff().dt.days.dropna() # calendar days between consecutive bars
print("max gap (days):", gaps.max()) # small number = no missing stretches
# 4) Adjustment columns present and consistent on a split date
print(df[df["Split"].notna()][["Time","Close","AdjClose","Split"]])
print("rows:", len(df))
For the AAPL daily sample this confirms the 4-for-1 split on 2020-08-31 (the Split column shows 1:4; adjusted volume before the split is exactly 4× the raw volume), that High/Low bound every bar, and that there are no missing trading days. The same script works on the paid files unchanged.
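The volume relationship can also be checked numerically rather than by eye. The sketch below computes the adjusted-to-raw volume ratio over a date window. The column name `AdjVolume` is an assumption (this page names only the first eight columns plus `AdjClose` and `Split`), and the frame is synthetic, so substitute the real column name from your file. The window matters: if a file contains earlier splits, their factors would presumably compound, so a constant ratio is only expected between two consecutive splits.

```python
import pandas as pd


def volume_adjustment_ratio(df, start, end, raw_col="Volume", adj_col="AdjVolume"):
    """Return adjusted/raw volume for rows with start <= Time < end.

    adj_col is a hypothetical column name; pass the one your file actually uses.
    """
    in_window = (df["Time"] >= pd.Timestamp(start)) & (df["Time"] < pd.Timestamp(end))
    window = df[in_window]
    window = window[window[raw_col] > 0]  # skip zero-volume bars to avoid dividing by zero
    return window[adj_col] / window[raw_col]


# Synthetic rows shaped like a 4-for-1 split taking effect on 2020-08-31.
demo = pd.DataFrame({
    "Time": pd.to_datetime(["2020-08-27", "2020-08-28", "2020-08-31", "2020-09-01"]),
    "Volume": [100, 120, 400, 380],
    "AdjVolume": [400, 480, 400, 380],
})
ratios = volume_adjustment_ratio(demo, "2020-08-01", "2020-08-31")
print(ratios.unique())
```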
A buyer's verification checklist
Before relying on the full dataset, we suggest spot-checking:
- Liquid names (AAPL, TSLA, SPY) for general sanity and volume scale;
- A renamed ticker (FB → META) to confirm the old symbol ends at the rename and the new one continues;
- A reused symbol to confirm separate companies are kept as separate files, not spliced;
- A delisted name to confirm its full history is present and frozen at the delisting date;
- A split date and an ex-dividend date to confirm the adjusted columns move correctly;
- Options: a deep in-the-money contract (Greeks/IV may be blank where no usable quote exists) and an index option (SPX, VIX) for European-style handling.
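As one example of turning a checklist item into a script, the renamed-ticker check can be sketched as below. It assumes each symbol is loaded into its own DataFrame with a `Time` column, as in the sample file. The function name, the five-day tolerance, and the dates are illustrative choices, not part of the dataset.

```python
import pandas as pd


def check_rename_handoff(old, new, max_gap_days=5):
    """Check that the old symbol's history ends shortly before the new one begins."""
    old_last = old["Time"].max()
    new_first = new["Time"].min()
    gap_days = (new_first - old_last).days
    return {
        "old_last": old_last,
        "new_first": new_first,
        "gap_days": gap_days,
        # A gap of zero or less means the files overlap; a large gap means missing history.
        "ok": 0 < gap_days <= max_gap_days,
    }


# Synthetic frames: the old symbol ends on a Friday, the new one starts the next Monday.
old = pd.DataFrame({"Time": pd.to_datetime(["2021-03-04", "2021-03-05"])})
new = pd.DataFrame({"Time": pd.to_datetime(["2021-03-08", "2021-03-09"])})
result = check_rename_handoff(old, new)
print(result["gap_days"], result["ok"])
```

The same comparison of last and first dates works for the delisted-name check: the last row of the file should fall on the delisting date you expect.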
Email [email protected] the symbols you care about and we will confirm coverage and date ranges before you buy.
Our validation process
On every update run we:
- Write each file to a temporary name and rename it only on success — an interrupted run never leaves a truncated file in the dataset;
- Advance the published dataset version only after the entire run completes, so customers always receive a complete, consistent snapshot;
- Retry upstream requests with backoff and re-process any failed tickers on the next run;
- Cross-check every bar against an exchange holiday and half-day calendar;
- Verify column count and date continuity on every incremental append.
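For readers who want to apply the same write-then-rename pattern in their own pipelines, here is a minimal sketch using only the Python standard library. It illustrates the general technique, not our production code, and the function name is hypothetical.

```python
import os
import tempfile


def atomic_write_text(path, text):
    """Write text to a temporary file in the target directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # replaces any existing file in a single step
    except BaseException:
        # On any failure, remove the partial temp file and leave the old target untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "EXAMPLE_day.csv")
    atomic_write_text(target, "Time,Close\n2020-08-31,1.0\n")
    print(os.listdir(workdir))
```

Because the target path only ever points at a fully written file, a reader that opens it mid-update sees either the old version or the new one, never a truncated mix.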
Full details, field definitions and known limitations are on the data methodology page.
What we do not claim
We are a focused, low-overhead data shop, not a tier-1 institutional vendor. We do not provide audited error statistics, an SLA, or tick-level trades and quotes. Prices are consolidated across venues and may differ slightly from single-exchange or differently-consolidated sources. If your use case requires contractually guaranteed accuracy, verify a sample against your reference source first — which is exactly what the checks above are for.
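A comparison against your own reference source can be sketched as follows. It assumes both sources are loaded as DataFrames with `Time` and `Close` columns. The function name and the 0.1% tolerance are illustrative, and the data is synthetic, so choose a tolerance that suits your use case.

```python
import pandas as pd


def compare_close(sample, reference, rel_tol=0.001):
    """Return the dates where the two sources' closes differ by more than rel_tol."""
    # Inner join on Time: only dates present in both sources are compared.
    merged = sample.merge(reference, on="Time", suffixes=("_sample", "_ref"))
    diff = (merged["Close_sample"] - merged["Close_ref"]).abs()
    rel_diff = diff / merged["Close_ref"].abs()
    return merged.loc[rel_diff > rel_tol, ["Time", "Close_sample", "Close_ref"]]


# Synthetic data: the second row differs slightly, the third row differs materially.
dates = pd.to_datetime(["2021-03-03", "2021-03-04", "2021-03-05"])
sample = pd.DataFrame({"Time": dates, "Close": [100.0, 101.0, 102.0]})
reference = pd.DataFrame({"Time": dates, "Close": [100.0, 101.05, 104.0]})
mismatches = compare_close(sample, reference)
print(mismatches)
```

Small differences are expected given the consolidation caveat above; a date that appears in one source but not the other is dropped by the inner join and deserves a separate check.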
Last updated: 2026-06-12. Questions: [email protected].