Sample data: This is a Phase-1 sample snapshot. The official crawling and weekly benchmark jobs are not connected yet. All price, latency and score values serve only to validate the product structure and must be replaced by traceable production data before launch.

Methodology, sources and neutrality

This page is the foundation of trust for the site. The production release must publish its prompt sets, scoring rules, sampling parameters, sources, update cadence, benchmark budgets and the open-source runner repository.

Open methodology

Each task bucket describes prompts, sampling, scoring and update cadence.

Visible freshness

Every data point carries update time, source type and benchmark version.
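
As an illustration only, a data point carrying these three freshness fields could be modelled as below. This is a minimal sketch: the class name `DataPoint`, the field names, the example `source_type` values, the version string `bench-v0` and the seven-day staleness window are all assumptions made for the example, not the site's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DataPoint:
    """Hypothetical record: one published value plus its freshness metadata."""

    metric: str              # e.g. "price", "latency_ms" or "score" (illustrative)
    value: float
    updated_at: datetime     # update time, must be timezone-aware
    source_type: str         # assumed values: "sample", "official", "benchmark"
    benchmark_version: str   # assumed format, e.g. "bench-v0"

    def __post_init__(self) -> None:
        # Refuse records that would be published without freshness metadata.
        if self.updated_at.tzinfo is None:
            raise ValueError("updated_at must be timezone-aware")
        if not self.source_type or not self.benchmark_version:
            raise ValueError("source_type and benchmark_version are required")


def is_stale(point: DataPoint, now: datetime,
             max_age: timedelta = timedelta(days=7)) -> bool:
    """Return True when the point is older than max_age (assumed weekly window)."""
    return now - point.updated_at > max_age


point = DataPoint(
    metric="latency_ms",
    value=420.0,
    updated_at=datetime(2024, 1, 9, tzinfo=timezone.utc),
    source_type="sample",
    benchmark_version="bench-v0",
)
```

Validating in the constructor means a value without its update time, source type or benchmark version cannot be created at all, which keeps the freshness guarantee in one place.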

Cost control

Benchmark jobs reserve fields for budget caps, caching, sampling and monthly cost.
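
One way these cost controls might fit together is sketched below. Everything here is hypothetical: `BenchmarkBudget`, `sample_prompts`, the USD unit, and the choice to cache results by prompt ID are assumptions for illustration, since the benchmark runner does not exist yet.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def sample_prompts(prompt_ids: List[str], rate: float, seed: int) -> List[str]:
    """Pick a reproducible subset of prompts; rate is a fraction in (0, 1]."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    k = max(1, round(len(prompt_ids) * rate))
    # A fixed seed keeps the sample identical between runs.
    return random.Random(seed).sample(prompt_ids, k)


@dataclass
class BenchmarkBudget:
    """Hypothetical monthly budget tracker with a result cache."""

    monthly_cap_usd: float
    spent_usd: float = 0.0                       # monthly cost so far
    cache: Dict[str, str] = field(default_factory=dict)

    def run(self, prompt_id: str, cost_usd: float,
            execute: Callable[[str], str]) -> str:
        # Cached results cost nothing and do not touch the budget.
        if prompt_id in self.cache:
            return self.cache[prompt_id]
        # Refuse the call if it would push spending past the cap.
        if self.spent_usd + cost_usd > self.monthly_cap_usd:
            raise RuntimeError("monthly budget cap reached")
        result = execute(prompt_id)
        self.spent_usd += cost_usd
        self.cache[prompt_id] = result
        return result


budget = BenchmarkBudget(monthly_cap_usd=1.0)
first = budget.run("p1", 0.6, lambda pid: f"answer for {pid}")
again = budget.run("p1", 0.6, lambda pid: "never called")  # served from cache
```

Checking the cache before the cap means repeated prompts stay available even after the budget is exhausted; only new, billable calls are refused.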

Task buckets

Required fields for every data point

Open-source status

Phase 1 reserves the fields, but the benchmark runner repository has not been created yet. Current methodology version: methodology-draft-v0.1.