Sub-100M parameter language models, same eval harness, transparent methodology.
| # | Model | Org | Params | WikiText-2 ↓ | BLiMP ↑ | ARC-Easy ↑ | Training Tokens | Released | Links |
|---|---|---|---|---|---|---|---|---|---|
[Charts: two plots where higher is better, and one bubble chart where lower is better, with bubble size = perplexity (smaller bubble = better). A high-efficiency zone (≥1σ above trend) is highlighted.]
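The page does not say how the trend behind the "≥1σ above trend" zone is fitted. As a rough sketch only, the snippet below assumes a least-squares line of a higher-is-better score against log10 of the parameter count, and flags models whose residual is at least one standard deviation of the residuals above that line. The function name `high_efficiency` and the choice of axes are assumptions, not the leaderboard's documented method.

```python
import math
import statistics


def high_efficiency(points):
    """Return names of models sitting >= 1 sigma above a fitted trend line.

    `points` is a list of (name, params, score) tuples, where `score` is a
    higher-is-better metric. ASSUMPTION: the trend is an ordinary
    least-squares line of score against log10(params); the leaderboard
    itself does not specify this.
    """
    xs = [math.log10(params) for _, params, _ in points]
    ys = [score for _, _, score in points]

    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct parameter counts to fit a trend")

    # Closed-form simple linear regression.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x

    # Residual = how far each model sits above (+) or below (-) the trend.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sigma = statistics.pstdev(residuals)

    return [name for (name, _, _), r in zip(points, residuals) if r >= sigma]
```

With a lower-is-better metric such as perplexity, the sign of the comparison would need to be flipped.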
To submit a model, open a PR with its benchmark results and reproduction steps. We require the parameter count, training data provenance, the eval harness used, and scores for at least 2 of the 3 benchmarks (WikiText-2, BLiMP, ARC-Easy).
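Before opening a PR, a submission can be checked against the stated requirements with something like the sketch below. The dictionary keys (`params`, `training_data_provenance`, `eval_harness`, `scores`) and the benchmark identifiers are hypothetical; the leaderboard does not define a submission schema, so adapt the names to whatever format the repository actually uses.

```python
# Hypothetical benchmark keys; the page only names the benchmarks themselves.
BENCHMARKS = ("wikitext2", "blimp", "arc_easy")
MAX_PARAMS = 100_000_000  # the leaderboard covers sub-100M parameter models


def missing_requirements(submission: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    # Required metadata fields (key names are assumptions).
    for key in ("params", "training_data_provenance", "eval_harness"):
        if not submission.get(key):
            problems.append(f"missing {key}")

    # Scores for at least 2 of the 3 benchmarks.
    scores = submission.get("scores") or {}
    reported = [b for b in BENCHMARKS if scores.get(b) is not None]
    if len(reported) < 2:
        problems.append("need scores for at least 2 of the 3 benchmarks")

    # The leaderboard is limited to models under 100M parameters.
    params = submission.get("params")
    if isinstance(params, (int, float)) and params >= MAX_PARAMS:
        problems.append("params must be under 100M")

    return problems
```

Returning a list of problems rather than raising on the first one lets a contributor fix everything in a single pass.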