
Monte9/stuck-problems-agent


stuck-problems-agent

An autonomous research loop for important-but-stuck problems, operated by a scheduled Claude Code routine. The routine is the loop; this repo is the policy, the assets, and the memory.

  • CLAUDE.md — the doctrine: one phase per run, decided by a state-machine table
  • .claude/agents/ — planner (brief → spec), generator (one milestone → one artifact), evaluator (fresh-eyes pass/fail)
  • .claude/skills/ — the asset library, extracted from successful manual sessions
  • problems/<name>/problem.md (intake), spec.md (plan), state.md (memory/cursor), artifacts/, verdicts/
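The per-problem layout above can be sketched as a small scaffolding helper. This is a minimal illustration, not the repo's own tooling: the function name `scaffold_problem` is hypothetical, and it assumes that `artifacts/` and `verdicts/` are plain directories and that `spec.md` and `state.md` are left for the loop to write.

```python
from pathlib import Path


def scaffold_problem(repo_root, name, brief):
    """Create problems/<name>/ with an intake brief and empty output folders.

    Hypothetical helper: the README only names the paths, not how they are made.
    """
    pdir = Path(repo_root) / "problems" / name
    (pdir / "artifacts").mkdir(parents=True, exist_ok=True)
    (pdir / "verdicts").mkdir(exist_ok=True)

    problem_md = pdir / "problem.md"
    if not problem_md.exists():
        # Never overwrite an existing brief; the loop may already be using it.
        problem_md.write_text(brief.rstrip() + "\n", encoding="utf-8")
    return pdir
```

Calling `scaffold_problem(".", "my-problem", "One-paragraph brief...")` would leave `spec.md` and `state.md` absent, which is the condition under which the planner is expected to run next.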

How a problem flows

  1. When nothing is active, the loop scouts a new stuck problem itself and writes the brief ([scout]). One problem at a time, in sequence. You can also add a brief by hand to problems/<name>/problem.md.
  2. Next run: the planner writes spec.md with milestones and done-criteria, commits [spec]. No approval step. The loop continues on its own.
  3. Runs alternate generate → evaluate, one phase per hour. PASS advances the milestone. FAIL retries once with the critique attached. A second FAIL blocks the problem and DMs you.
  4. All milestones pass → the publisher drafts report.md and tweet.md, adds the problem to the table below, and DMs you. Publishing to the outside world stays human: you review and post.
  5. The problem is published; the next wake scouts a fresh one. The loop never idles.
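The flow above amounts to a small state machine. The actual table lives in CLAUDE.md and is not reproduced here, so the following is only a sketch under stated assumptions: the `State` fields, the phase names, and the `"blocked"` phase (the loop waiting for a human after a second FAIL) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class State:
    # Hypothetical fields; the real cursor is whatever state.md records.
    has_brief: bool = False      # problem.md exists
    has_spec: bool = False       # spec.md has been committed
    milestone: int = 0           # index of the milestone in progress
    total_milestones: int = 0
    awaiting_eval: bool = False  # an artifact exists with no verdict yet
    fails: int = 0               # consecutive FAILs on this milestone
    blocked: bool = False
    published: bool = False


def next_phase(s: Optional[State]) -> str:
    """Pick exactly one phase for this run."""
    if s is None or s.published or not s.has_brief:
        return "scout"
    if s.blocked:
        return "blocked"         # assumption: waits for a human to unblock
    if not s.has_spec:
        return "plan"
    if s.milestone >= s.total_milestones:
        return "publish"
    return "evaluate" if s.awaiting_eval else "generate"


def apply_verdict(s: State, passed: bool) -> State:
    """PASS advances; the first FAIL retries; the second FAIL blocks."""
    s.awaiting_eval = False
    if passed:
        s.milestone += 1
        s.fails = 0
    else:
        s.fails += 1
        if s.fails >= 2:
            s.blocked = True
    return s
```

Keeping the decision in one pure function means each run can read the state, do one phase, write the state back, and exit.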

Your jobs (and nothing else)

  • Post the publications — when a DM says [published], review report.md and tweet.md, then post them.
  • Unblock — when a commit says [blocked], read the verdict, fix the spec/rubric/state, push.
  • Weekly 15-minute audit — the quality backstop now that no human reviews specs. Read two randomly chosen PASS artifacts. If one makes you wince, the fix goes into the evaluator rubric (versioned in evaluator.md). You're auditing the judge, not the work. Extract a SKILL.md when the log shows repeated freeform work.
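Choosing the two audit items at random can be scripted. This sketch makes assumptions the README does not confirm: that verdicts are Markdown files under `problems/<name>/verdicts/` and that a passing verdict begins with the word PASS. The function name `pick_audit_sample` is hypothetical, and it returns verdict paths, leaving the step from verdict to artifact to the reader.

```python
import random
from pathlib import Path


def pick_audit_sample(repo_root, k=2, seed=None):
    """Return up to k randomly chosen PASS verdict files.

    Assumes verdicts/*.md files whose text starts with "PASS" (any case).
    """
    root = Path(repo_root)
    passes = [
        p
        for p in sorted(root.glob("problems/*/verdicts/*.md"))
        if p.read_text(encoding="utf-8").lstrip().upper().startswith("PASS")
    ]
    rng = random.Random(seed)  # pass a seed only to make a run repeatable
    return rng.sample(passes, min(k, len(passes)))


if __name__ == "__main__":
    for path in pick_audit_sample("."):
        print(path)
```

Sampling from all past PASS verdicts, rather than only the latest ones, keeps the audit from repeatedly landing on the same recent problem.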

Everything else runs without you. Watch the commit log: [blocked] needs you, [published] is your cue to post; the rest is the loop talking to itself.
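Watching the log can be reduced to filtering commit subjects for the two tags that need a human. A minimal sketch, assuming the tag appears somewhere in the commit subject in square brackets; the name `triage` and the action strings are illustrative only.

```python
import re

# Tags that require a human, mapped to the job described above.
HUMAN_ACTIONS = {
    "blocked": "read the verdict, fix the spec/rubric/state, push",
    "published": "review report.md and tweet.md, then post",
}

TAG = re.compile(r"\[([a-z-]+)\]")


def triage(subjects):
    """Return (tag, subject) pairs for commits that need human attention."""
    todo = []
    for subject in subjects:
        for tag in TAG.findall(subject):
            if tag in HUMAN_ACTIONS:
                todo.append((tag, subject))
                break  # one action per commit is enough
    return todo
```

The input can come from `git log --format=%s`, one subject per line; commits tagged [scout] or [spec], and untagged ones, are skipped.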

Problems

Problem | Status | Result
lead-poisoning | published | Indonesia's top-ranked target already bans informal battery smelting, yet ~5 licensed smelters coexist with 200+ illegal ones and 47% of children near Jakarta recyclers have elevated blood lead; the missing piece is a $150-350k enforcement map. report · final artifact
cbt-insomnia-undertreatment | published | Untreated chronic insomnia costs ~$92M per 100,000 US adults per year ($17M–$146M), and Medicare's 2025 digital CBT-I coverage is nominal (the device code has no published fee); a five-action payer dossier shows the cheapest fix is a mailed deprescribing package proven to triple hypnotic discontinuation (26.2% vs 7.5%). report · final artifact
methane-super-emitters | published | Only 13.4% of satellite-notified methane plumes ever drew an operator response; public records named 16 of the 30 largest 2024–26 super-emitter sources (UNEP withholds names by policy), and ~29% of 2024 EU gas imports have no route to the MRV equivalence the EU import rule requires from January 2027. report · final artifact
fishing-dark-fleets | published | Of 214 currently IUU-listed fishing vessels, half have no named owner and 69.6% are stateless; the RFMO blacklists share just one company with FTC's top-ten corporate offenders, and three confirmed subsidy-to-IUU-operator links (incl. ≥€8.2M Spain/EU → Vidal Armadores and $19.0M China → Pingtan after three countries' rulings) hand the new WTO treaty its first test cases via a no-cost Article 8 question due by 15 Sep 2026. report · final artifact
osteoporosis-treatment-gap | published | Only 21.1% of Traditional Medicare patients who break a bone are started on osteoporosis therapy though organized programs reach 68–80% and bisphosphonates prevent ~75 hip fractures per atypical fracture caused; closing the gap to 42% prevents ~3,600 refractures and ~900 deaths/yr, but honest costing shows a ~$136M/yr net cost (not first-year savings), so the fix is a specialty-agnostic CMS care-coordination G-code priced near the $105–$182 per-patient cost. report · final artifact
hospital-charity-care-gap | published | Nonprofit hospitals bill patients ~$14B/yr for care that should be free under a law in force since 2014; a 28-hospital scorecard built from policy text and Form 990 Schedule H finds two of the richest academic hospitals spend the least (Penn/HUP 0.39% of a $1.18B base, NewYork-Presbyterian 0.87% of $9.76B, both bottom-quartile), giving the IRS clean 501(r) audit targets that cost $0 to name. report · final artifact
csam-report-triage | published | The CyberTipline took 21.3M reports in 2025 but fails at triage not detection: one sender filed 1.1M with "no actionable information," and a report scoring 95/100 on completeness scores 6/100 on rescue priority once it's a known viral duplicate; the fix is a metadata-only triage layer NCMEC can build today (8 of 13 fixes need no statute). report · final artifact
