Tag

#opencode

15 entries tagged with #opencode.

GitHub · Library · Featured

Garden Skills - production skill pack for Claude Code, Cursor, and Codex

Three carefully scoped skills: web-design-engineer (with an anti-cliche blocklist that breaks the generic-AI-landing-page loop), gpt-image-2 (80+ templates, three runtime modes including an advisor-only fallback), and kb-retriever (layered data_structure.md navigation for bounded local-KB retrieval). Tested across Claude Code, Claude.ai, Cursor, Codex, Gemini, and OpenCode.

Why I saved this - The web-design skill's anti-cliche blocklist is the most opinionated take on 'stop producing the same hero + 3 cards' I've seen.

GitHub · Tool · Featured

PostTrainBench - can a CLI agent post-train a base LLM in 10 hours?

Benchmark measuring whether Claude Code, Codex CLI, Gemini CLI, and OpenCode can autonomously improve 4 small base models (Qwen3-1.7B/4B, SmolLM3-3B, Gemma-3-4B) on 7 evals (AIME, BFCL, GPQA, GSM8K, HealthBench, HumanEval, Arena Hard) using a single H100 GPU within 10 hours. Includes an agent-as-judge check against reward hacking and a baseline-replacement penalty for tampering.

Why I saved this - Current leader: Opus 4.6 via Claude Code at 23.2 average. The reward-hacking safeguards (eval tampering and model-substitution detection, baseline-replacement penalty) are the part most agent benchmarks skip.
