Built 3 AI agents that scored AAA. Zach Wilson broke them on purpose, then I fixed them to run better than before. Shoutout to the DataExpert.io team for putting together something really practical. What I appreciate is how each agent type is separated into its own selectable mode, so you can see exactly what happens under the hood.
Knocked out 3 agent assignments back to back:
1. Context Overflow Agent -
fixed a tool dumping 571K tokens per query down to 133 tokens (99.98% reduction)
2. Infinite Research Agent -
tamed a runaway agent that crashed at $12/query down to $0.001/query, with a BudgetManager that tracks every token in real time
3. MCP Middleware Agent -
built a keyword router that cuts 25 tools down to 6-11 per query, eliminating tool confusion
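For anyone wondering what a keyword router can look like, here is a minimal sketch. It is not the code from the repo: the tool names, keyword sets, `ALWAYS_ON` fallback, and `route_tools` function are all hypothetical, and the real router may score tools differently. The idea is just to match words in the query against per-tool keywords and hand the model only the tools that match.

```python
import re

# Hypothetical tool catalog: names and keywords are illustrative only,
# not the 25 tools from the actual assignment.
TOOL_KEYWORDS = {
    "search_orders": {"order", "purchase", "invoice"},
    "get_weather": {"weather", "forecast", "temperature"},
    "query_database": {"sql", "table", "database", "query"},
    "send_email": {"email", "mail", "send"},
}

# Assumed fallback set: tools that stay available even with no keyword hit.
ALWAYS_ON = {"query_database"}


def route_tools(query: str, max_tools: int = 11) -> list[str]:
    """Return the tool names to expose for this query, best matches first."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    scored = []
    for name, keywords in TOOL_KEYWORDS.items():
        hits = len(words & keywords)
        if hits or name in ALWAYS_ON:
            scored.append((hits, name))
    # Most keyword hits first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:max_tools]]


print(route_tools("What is the weather forecast for Paris?"))
# ['get_weather', 'query_database']
```

Plain keyword matching is crude, but it is cheap, deterministic, and easy to test, which matters when the goal is a smaller tool list on every call.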
On top of the agents I added:
- 46 tests across all 3 agents, all passing, all hitting real APIs
- Langfuse wired in for full observability
- Per-step cost tracking with live dashboards
- Token usage and latency monitoring on every single LLM call
- BudgetManager class with graceful degradation when cost limits are hit
- No mocks, no shortcuts, production-grade testing
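Here is a rough sketch of the budget idea with graceful degradation. It is an illustration under assumptions, not the class from the repo: the prices are placeholders, `run_research` is a hypothetical loop, and the token counts are hard-coded where a real agent would read them from each LLM response.

```python
from dataclasses import dataclass, field

# Placeholder prices in USD per million tokens. These are assumptions for
# the example; check your provider's current pricing.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


@dataclass
class BudgetManager:
    """Tracks spend per step and reports whether the agent may continue."""
    limit_usd: float
    spent_usd: float = 0.0
    steps: list = field(default_factory=list)

    def record(self, step: str, input_tokens: int, output_tokens: int) -> float:
        cost = (input_tokens * INPUT_PRICE_PER_MTOK
                + output_tokens * OUTPUT_PRICE_PER_MTOK) / 1_000_000
        self.spent_usd += cost
        self.steps.append((step, cost))
        return cost

    def can_continue(self) -> bool:
        return self.spent_usd < self.limit_usd


def run_research(budget: BudgetManager, planned_steps: list) -> list:
    """Hypothetical agent loop: stop early with partial results, never crash."""
    findings = []
    for step, input_tokens, output_tokens in planned_steps:
        # The check runs before each step, so spend can overshoot the limit
        # by at most one step.
        if not budget.can_continue():
            findings.append("[stopped early: budget reached, returning partial results]")
            break
        budget.record(step, input_tokens, output_tokens)  # stands in for an LLM call
        findings.append(f"finding from {step}")
    return findings


budget = BudgetManager(limit_usd=0.01)
plan = [("search", 1000, 200), ("read", 2000, 400), ("summarize", 1000, 200)]
print(run_research(budget, plan))
print(f"spent ${budget.spent_usd:.3f} across {len(budget.steps)} steps")
```

With these placeholder numbers the loop runs two steps, then returns what it has instead of starting a third. A stricter version could estimate the next step's cost up front and refuse it before the limit is crossed.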
Every small and mid-size company should have this kind of AI agent platform running in their business.
The tools exist now, the barrier is just knowing how to wire them together.
Code: https://lnkd.in/didukzk7
Tech stack:
Python + LangChain + Claude Sonnet + Langfuse + pytest + Claude Code
So excited to be building AI agents.
2026 is the year of agents -
not just chatbots that talk, but agents that actually do things, track their own costs, and know when to stop.
#AIAgents #LangChain #Langfuse #AIEngineering #BuildInPublic