methodology-draft-v0.1 · 2026-06-09 22:40 CST

Compare 30+ LLMs by price, speed and real-task performance

A developer decision table for model cost, TTFT, context, Chinese coverage and task-specific quality. Phase 1 ships the usable interface and data pipeline contract.

Model snapshot: 30+
Task buckets: 10
Cheapest sample: $0.03
Update cadence: Daily
Sample data: Phase-1 sample snapshot. Official crawling and weekly benchmark jobs are not connected yet. All price, latency and score values exist only to validate the product structure and must be replaced by traceable production data before launch.

Model price and latency table

The clean table style from demo-v2-design-4, extended into a working decision surface.

Pricing
| Model | Provider · license | Input | Output | TTFT | Context | Value | Updated |
| DeepSeek V3 | DeepSeek · closed | $0.14/1M | $0.28/1M | 124ms | 128K | 96 | 2026-06-09 |
| Qwen 2.5 72B | Alibaba Cloud · open | $0.35/1M | $0.70/1M | 156ms | 128K | 91 | 2026-06-09 |
| GPT-4o | OpenAI · closed | $2.50/1M | $10.00/1M | 89ms | 128K | 73 | 2026-06-09 |
| Claude 3.5 Sonnet | Anthropic · closed | $3.00/1M | $15.00/1M | 95ms | 200K | 70 | 2026-06-09 |
| Gemini 2.0 Flash | Google · closed | $0.10/1M | $0.40/1M | 112ms | 1M | 94 | 2026-06-09 |
| Doubao Pro | Volcano Engine · closed | $0.11/1M | $0.22/1M | 141ms | 128K | 95 | 2026-06-09 |
| Kimi K2 | Moonshot AI · closed | $0.18/1M | $0.72/1M | 168ms | 200K | 88 | 2026-06-09 |
| GLM-4 Plus | Zhipu AI · closed | $0.80/1M | $0.80/1M | 184ms | 128K | 78 | 2026-06-09 |
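
Because input and output are priced separately per 1M tokens, the cost of a single request depends on its token mix. The following is a minimal sketch of that arithmetic; the function name `request_cost_usd` and the 2,000-in / 500-out request size are illustrative assumptions, and the prices are the sample values listed above, not production data.

```python
from decimal import Decimal


def request_cost_usd(input_tokens, output_tokens,
                     input_price_per_1m, output_price_per_1m):
    """Return the USD cost of one request given per-1M-token prices.

    Prices are passed as strings (e.g. "2.50") so Decimal keeps them exact.
    """
    million = Decimal(1_000_000)
    input_cost = Decimal(input_tokens) * Decimal(input_price_per_1m)
    output_cost = Decimal(output_tokens) * Decimal(output_price_per_1m)
    return (input_cost + output_cost) / million


# Hypothetical request: 2,000 input tokens and 500 output tokens,
# priced with the sample GPT-4o and DeepSeek V3 rows.
gpt4o_cost = request_cost_usd(2000, 500, "2.50", "10.00")
deepseek_cost = request_cost_usd(2000, 500, "0.14", "0.28")
print(gpt4o_cost, deepseek_cost)
```

With these sample prices the same request costs $0.01 on the GPT-4o row and $0.00042 on the DeepSeek V3 row, which is the kind of gap the table is meant to surface.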

Task bucket leaders

Ten task buckets use sample scores to validate routing and ranking logic.

Phase-1 core flows

The homepage leads directly to routing, comparison, alerts and methodology.

Router

Paste a prompt and get Top 3 recommendations by cost, speed and quality.
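
One way a Top 3 ranking by cost, speed and quality could work is a weighted sum over min-max normalized metrics. This is a sketch under stated assumptions, not the router's actual logic: the `Candidate` fields, the default weights and the function names are hypothetical, the candidates are placeholders rather than real models, and the step that maps a pasted prompt to a task bucket is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    cost_per_1m: float  # USD per 1M tokens (lower is better)
    ttft_ms: float      # time to first token in ms (lower is better)
    quality: float      # task-bucket score, 0-100 (higher is better)


def _normalize(values, lower_is_better):
    """Min-max scale values to [0, 1], where 1 is always the best."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    if lower_is_better:
        return [(hi - v) / (hi - lo) for v in values]
    return [(v - lo) / (hi - lo) for v in values]


def top_recommendations(candidates, weights=(0.4, 0.2, 0.4), k=3):
    """Return the names of the k best candidates by weighted score.

    weights = (cost, speed, quality); the defaults are assumed values.
    """
    if not candidates:
        return []
    w_cost, w_speed, w_quality = weights
    cost = _normalize([c.cost_per_1m for c in candidates], True)
    speed = _normalize([c.ttft_ms for c in candidates], True)
    quality = _normalize([c.quality for c in candidates], False)
    scored = [
        (w_cost * cs + w_speed * sp + w_quality * q, c.name)
        for cs, sp, q, c in zip(cost, speed, quality, candidates)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Placeholder candidates, not real measurements.
pool = [
    Candidate("model-a", 0.1, 100, 90),
    Candidate("model-b", 1.0, 150, 80),
    Candidate("model-c", 2.0, 120, 85),
    Candidate("model-d", 5.0, 200, 60),
]
print(top_recommendations(pool))
```

Normalizing before weighting keeps one metric's unit (dollars versus milliseconds) from dominating the score; changing the weights shifts the ranking toward cost, speed or quality.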

Compare

Compare price, TTFT, context, task scores and provenance side by side.

Get price alerts

Until email storage is connected, the Phase-1 form returns only a local API confirmation.

Provider coverage

Chinese and global providers are shown on one surface to avoid English-only benchmark bias.

Trust layer before growth tricks

Every row is designed to carry update time, source type, benchmark version and methodology links before this becomes a production data service.

Methodology

Open methodology

Each task bucket describes prompts, sampling, scoring and update cadence.

Visible freshness

Every data point carries update time, source type and benchmark version.
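
A record that carries these three provenance fields alongside each value might look like the sketch below. The class name `DataPoint`, the field names, the `source_type` labels and the one-day staleness threshold are all assumptions made for illustration; the draft does not define a schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DataPoint:
    model: str
    metric: str             # e.g. "input_usd_per_1m", "ttft_ms"
    value: float
    updated: date           # when the value was last refreshed
    source_type: str        # assumed labels: "sample", "official", "benchmark"
    benchmark_version: str  # version of the benchmark or snapshot used


def is_stale(point, today, max_age_days=1):
    """Return True if the point is older than max_age_days.

    The default of one day is an assumption based on a daily update cadence.
    """
    return today - point.updated > timedelta(days=max_age_days)


# Sample row value, flagged as sample data rather than a production figure.
point = DataPoint(
    model="DeepSeek V3",
    metric="input_usd_per_1m",
    value=0.14,
    updated=date(2026, 6, 9),
    source_type="sample",
    benchmark_version="phase-1-sample",
)
print(is_stale(point, today=date(2026, 6, 12)))
```

Keeping provenance on the data point itself, rather than on the table as a whole, lets the interface flag individual stale or sample-only values.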

Cost control

Benchmark jobs include fields for budget caps, caching, sampling and monthly cost.
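
As an illustration of how a budget cap and a result cache could interact, the sketch below refuses to run a job that would exceed a monthly cap and reuses cached results at no extra cost. The names `BenchmarkBudget` and `BudgetExceeded`, the cache key shape and the use of estimated rather than metered cost are assumptions; sampling is left out.

```python
class BudgetExceeded(Exception):
    """Raised when a job would push spend past the monthly cap."""


class BenchmarkBudget:
    def __init__(self, monthly_cap_usd):
        self.monthly_cap_usd = monthly_cap_usd
        self.spent_usd = 0.0
        self._cache = {}

    def run(self, key, estimated_cost_usd, job):
        """Run job() once per key, charging its estimated cost.

        Cached results are returned without running the job or spending.
        """
        if key in self._cache:
            return self._cache[key]
        if self.spent_usd + estimated_cost_usd > self.monthly_cap_usd:
            raise BudgetExceeded(
                f"{key!r} needs ${estimated_cost_usd:.2f}, "
                f"but only ${self.monthly_cap_usd - self.spent_usd:.2f} remains"
            )
        result = job()
        self.spent_usd += estimated_cost_usd
        self._cache[key] = result
        return result


budget = BenchmarkBudget(monthly_cap_usd=1.0)
# Hypothetical key: (model, task bucket, benchmark version).
score = budget.run(("model-a", "coding", "v1"), 0.6, lambda: 42)
print(score, budget.spent_usd)
```

Checking the cap before the job runs means an over-budget benchmark fails early instead of being billed and then discarded.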

Reports

Initial content focuses on methodology, Chinese model selection and routing cost.