truffle-dev/agentlang-index-data

agentlang-index-data

Open dataset of AgentLang Index benchmark runs. Append-only. JSON exports per dated run.

Reproduce a row: every record carries a harnessSha that resolves to a commit in truffle-dev/agentlang-index. License is CC-BY-4.0; attribute as AgentLang Index by Truffle, github.com/truffle-dev/agentlang-index-data.
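
A minimal sketch of resolving a record to its harness commit. The harnessSha field and the harness repository name come from the text above. The helper name harness_commit_url, the sample SHA, and the assumption that the value is a full or abbreviated hex git SHA are illustrative and not confirmed by the dataset.

```python
import re

# Harness repository named in the README.
HARNESS_REPO = "truffle-dev/agentlang-index"


def harness_commit_url(record: dict) -> str:
    """Return the GitHub commit URL for the harness revision of one record."""
    sha = record["harnessSha"]
    # Assumption: the value is a hex git SHA, 7 to 40 characters long.
    if not re.fullmatch(r"[0-9a-f]{7,40}", sha):
        raise ValueError(f"unexpected harnessSha: {sha!r}")
    return f"https://github.com/{HARNESS_REPO}/commit/{sha}"


# Placeholder record; this SHA is made up for illustration.
example = {"harnessSha": "0123abc"}
print(harness_commit_url(example))
# https://github.com/truffle-dev/agentlang-index/commit/0123abc
```

Checking out that commit in a local clone of the harness is then the starting point for re-running the attempt.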

Layout

exports/<YYYY-MM-DD>/manifest.json   harness SHA, Zero version, models, totals
exports/<YYYY-MM-DD>/dashboard.json  per-model/per-language summary
exports/<YYYY-MM-DD>/runs.json       flat array, one record per attempt
exports/<YYYY-MM-DD>/attempts/...    raw response.md / system.md / user.md / result.json
NOTICE                               attribution + harness/Zero version pin

A Parquet mirror under parquet/<YYYY-MM>/runs.parquet is planned for later releases; v1.0 ships JSON only.
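
The layout above can be walked with a few lines of Python. This is a sketch that assumes a local clone of the repository. The function names export_paths and load_runs are hypothetical, and the only thing assumed about runs.json is what the layout states: it is a flat array with one record per attempt.

```python
import json
from pathlib import Path


def export_paths(root: Path, date: str) -> dict:
    """Map the files of one dated export (YYYY-MM-DD) to their paths."""
    base = root / "exports" / date
    return {
        "manifest": base / "manifest.json",
        "dashboard": base / "dashboard.json",
        "runs": base / "runs.json",
        "attempts": base / "attempts",
    }


def load_runs(root: Path, date: str) -> list:
    """Load the flat attempt array for one dated export."""
    with open(export_paths(root, date)["runs"], encoding="utf-8") as f:
        runs = json.load(f)
    # The layout describes runs.json as a flat array; fail loudly otherwise.
    if not isinstance(runs, list):
        raise ValueError("runs.json is expected to be a flat array")
    return runs
```

With a clone in the current directory, load_runs(Path("."), "2026-05-19") would return the attempt records for the first export.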

First export

exports/2026-05-19/ — 300 attempts across 3 OpenAI models and 20 tasks.

  • 3 OpenAI models in one-shot mode: gpt-5, gpt-4o, gpt-4o-mini.
  • 20 tasks x 5 languages (zero, ts, rust, go, python) = 100 attempts per model.
  • gpt-5 reached 79% overall and 0% on Zero. Per-model and per-language breakdown on the leaderboard: https://truffleagent.com/agentlang/leaderboard/
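
The attempt counts above follow from a full grid of models, languages, and tasks. The sketch below enumerates that grid and tallies attempts per model. The model and language names are taken from the list above. The helper attempts_per_model and its model_key default of "model" are hypothetical, since the record schema of runs.json is not described here.

```python
from collections import Counter
from itertools import product

MODELS = ["gpt-5", "gpt-4o", "gpt-4o-mini"]
LANGUAGES = ["zero", "ts", "rust", "go", "python"]
TASK_COUNT = 20


def expected_attempts() -> Counter:
    """Count expected attempts per model for a full one-shot grid."""
    counts = Counter()
    for model, _lang, _task in product(MODELS, LANGUAGES, range(TASK_COUNT)):
        counts[model] += 1
    return counts


def attempts_per_model(runs: list, model_key: str = "model") -> Counter:
    """Tally loaded records per model; model_key is an assumed field name."""
    return Counter(record[model_key] for record in runs)


counts = expected_attempts()
print(counts["gpt-5"])       # 100
print(sum(counts.values()))  # 300
```

Comparing attempts_per_model(runs) against expected_attempts() is one way to confirm that a downloaded export is complete, once the real field name is known.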

Citation

Citation metadata lives in CITATION.cff. GitHub renders a "Cite this repository" button on the sidebar; click it for BibTeX and APA. The dataset is CC-BY-4.0 and the harness referenced from the file is Apache-2.0.

License

CC-BY-4.0. See LICENSE and NOTICE.
