The idea for Agent-Blackbox came from a real 3 AM Slack message I never want to see again.
Our multi-agent pipeline had just wiped the staging database. PM-agent said the requirement was clean. Coder-agent said it followed the spec perfectly. Verifier-agent said it never even received output. Four hours of grepping through thousands of log lines later, we still had no idea which agent actually failed.
That's the "accountability vacuum," and I built Agent-Blackbox to solve it.
**What problem does it solve?**
When a single LLM call fails, you debug it. When a chain of 5 agents fails, you play detective. Every agent points fingers at the next one. Logs are scattered across systems. There's no verifiable chain of custody for decisions.
Agent-Blackbox gives you `git blame` for AI agents. In 3 seconds, it tells you exactly which agent broke the chain, with cryptographic proof instead of guesswork.
**How it works (the 10-second version)**
Under the hood, it implements two IETF drafts:
- **JEP (Judgment Event Protocol):** a minimal, cryptographically signed log format for agent decisions.
- **JAC (Judgment Accountability Chain):** a `task_based_on` field that links every decision to its parent.
Four verbs, `J` (Judge), `D` (Delegate), `T` (Terminate), and `V` (Verify), model any accountability workflow. Every action produces a JEP receipt signed with Ed25519. When something breaks, you trace the `task_based_on` chain back to the failure point.
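To make that concrete, here's a minimal, hypothetical sketch of a JEP-style receipt chain in Python. The real project signs receipts with Ed25519; this stdlib-only sketch substitutes HMAC-SHA256 so it runs with no dependencies, and every field name except `task_based_on` (which the post describes) is an assumption, not the actual JEP schema:

```python
# Hypothetical sketch of a JEP-style receipt chain. HMAC-SHA256 stands in
# for the project's Ed25519 signatures; field names other than
# `task_based_on` are assumed for illustration.
import hashlib
import hmac
import json

SECRET = b"demo-signing-key"  # stands in for an agent's private key

def sign_receipt(agent, verb, payload, task_based_on=None):
    """Produce a signed receipt for one of the four verbs: J, D, T, V."""
    assert verb in ("J", "D", "T", "V")
    body = {"agent": agent, "verb": verb, "payload": payload,
            "task_based_on": task_based_on}
    canonical = json.dumps(body, sort_keys=True).encode()
    body["sig"] = hmac.new(SECRET, canonical, hashlib.sha256).hexdigest()
    return body

def verify_receipt(receipt):
    """Recompute the signature over everything except `sig` itself."""
    body = {k: v for k, v in receipt.items() if k != "sig"}
    canonical = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(SECRET, canonical, hashlib.sha256).hexdigest()
    return hmac.compare_digest(receipt["sig"], expected)

def first_broken_link(receipts, leaf_id):
    """Walk `task_based_on` from the failing leaf back toward the root,
    returning the agent whose receipt no longer verifies."""
    by_id = {r["payload"]["id"]: r for r in receipts}
    node, broken = by_id[leaf_id], None
    while node is not None:
        if not verify_receipt(node):
            broken = node["agent"]
        parent = node["task_based_on"]
        node = by_id.get(parent) if parent else None
    return broken

# Build a three-agent chain: PM delegates, coder judges, verifier verifies.
pm = sign_receipt("pm-agent", "D", {"id": "t1", "task": "drop stale rows"})
coder = sign_receipt("coder-agent", "J", {"id": "t2", "task": "ran migration"},
                     task_based_on="t1")
ver = sign_receipt("verifier-agent", "V", {"id": "t3", "task": "checked output"},
                   task_based_on="t2")

# Simulate tampering: coder-agent's logged decision was altered after signing.
coder["payload"]["task"] = "dropped all rows"

print(first_broken_link([pm, coder, ver], "t3"))  # -> coder-agent
```

The point of the trace is that no agent's claim is trusted: each link is re-verified against its signature, so "who broke the chain" falls out mechanically instead of from four hours of grepping.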
**How it evolved**
The first version was a messy internal script that just hashed logs. Then I realized the problem was bigger than my team: *everyone* deploying multi-agent systems was hitting the same wall. So I rebuilt it around IETF drafts to make it an open standard, not another closed ecosystem.
v1.0 ships with:
- ✅ Rust core engine (fast)
- ✅ Python SDK (TypeScript coming)
- ✅ `blame-finder dashboard` for visual causality trees
- 🚧 LangChain / CrewAI native adapters (in progress)
**What's next?**
I want to make Agent-Blackbox the default accountability layer for agentic workflows, like what `git blame` did for code. PDF/HTML blame reports, real-time alerting, and deeper framework integrations are on the roadmap.
**Try it yourself**
```bash
pip install agent-blame-finder
blame-finder dashboard
```
GitHub: https://github.com/hjs-spec/Agen...
I'd love feedback, especially from anyone else who's been woken up at 3 AM by a broken agent pipeline. What's your current debugging workflow? How do you trace failures across agents today?
**About Agent-Blackbox on Product Hunt**
"git blame for AI agents: find who broke prod in 3s"
Agent-Blackbox was submitted on Product Hunt and earned 2 upvotes and 1 comment, placing #209 on the daily leaderboard. The GitHub repo (hjs-spec/Agent-Blackbox) describes it as: "🔍 Agent Blame-Finder - Cryptographic blackbox for multi-agent systems. Find which agent messed up in 3 seconds. JEP/JAC IETF reference implementation."
Agent-Blackbox was featured in Task Management (84k followers), Developer Tools (511k followers), Artificial Intelligence (466.2k followers) and GitHub (41.2k followers) on Product Hunt. Together, these topics include over 180.7k products, making this a competitive space to launch in.
Agent-Blackbox was hunted by yuqiang@JEP.