Semi-automatic mode with hooks
The lessons of the previous six days add up to one point: the agent is fast, but it needs guardrails. The final architecture therefore uses semi-automatic mode, in which the agent works autonomously within defined boundaries and calls in a human when those boundaries are crossed.
The key tool is hooks. A pre-commit hook verifies that the agent isn't overstepping its permissions (no changes outside its assigned directories, no config file modifications without approval). A post-commit hook reports the result to a communication channel. A pre-push hook requires human approval before anything is pushed to main.
This system doesn't require a human to watch every step, only to respond to notifications. The result is an effective mix of autonomy and control.
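The pre-commit check described above could be sketched roughly as follows. This is a minimal illustration rather than the hook actually used in the experiment: the directory list, the config file patterns and the `AGENT_CONFIG_APPROVED` environment variable are all hypothetical stand-ins for whatever policy and approval mechanism a team chooses.

```python
#!/usr/bin/env python3
"""Sketch of a pre-commit guardrail for agent commits (assumed policy)."""
import fnmatch
import os
import subprocess
import sys
from pathlib import PurePosixPath

# Hypothetical policy values, not taken from the article.
ALLOWED_DIRS = ("src/feature_x", "tests/feature_x")
PROTECTED_PATTERNS = ("*.yml", "*.yaml", "*.toml", ".env*")


def find_violations(paths, allowed_dirs=ALLOWED_DIRS,
                    protected=PROTECTED_PATTERNS, config_approved=False):
    """Return (path, reason) pairs for staged paths that break the policy."""
    violations = []
    for raw in paths:
        # A path is allowed if it lies inside one of the assigned directories.
        inside = any(raw.startswith(d.rstrip("/") + "/") for d in allowed_dirs)
        if not inside:
            violations.append((raw, "outside assigned directories"))
            continue
        # Config files inside the allowed area still need explicit approval.
        name = PurePosixPath(raw).name
        is_config = any(fnmatch.fnmatchcase(name, p) for p in protected)
        if is_config and not config_approved:
            violations.append((raw, "config change needs approval"))
    return violations


def staged_paths():
    """List the paths staged for the current commit."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "-z"],
        capture_output=True, text=True, check=True,
    )
    return [p for p in result.stdout.split("\0") if p]


def main():
    # Hypothetical approval signal: a human sets this variable to allow config edits.
    approved = os.environ.get("AGENT_CONFIG_APPROVED") == "1"
    violations = find_violations(staged_paths(), config_approved=approved)
    for path, reason in violations:
        print(f"blocked: {path} ({reason})", file=sys.stderr)
    return 1 if violations else 0
```

To try it, the script would be saved as an executable `.git/hooks/pre-commit` file ending with `raise SystemExit(main())`; Git aborts the commit when the hook exits with a non-zero status. Note that `git commit --no-verify` skips this hook, so a local check like this is a first line of defence and not a hard boundary.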
Code quality: mid-level, not senior
After seven days I have a clear picture of the quality of agent-written code: it corresponds to the work of a solid mid-level developer. It is functional, readable and mostly free of obvious bugs, but the senior dimension is missing.
What the agent does well:
- Implementing defined patterns and conventions
- Writing unit tests for new code
- Refactoring repetitive code
- Generating documentation and comments
- Data transformations and mappings
What the agent does poorly or not at all:
- Architectural decisions with long-term impact
- Security auditing and threat modelling
- Performance optimisation requiring system-level knowledge
- Identifying race conditions and deadlocks
- Weighing business context in technical decisions
Conclusion: agentic AI needs a senior developer as technical lead, not as operator. The lead defines the architecture, sets guardrails and reviews output. An operator just waits for whatever the agent generates.
What to take away from seven days
Agentic AI development is usable in production — but not as a replacement for a development team. As a productivity multiplier for a senior developer, yes.
The series continues with two bonus articles: guardrails (how to keep the agent under control) and a knowledge base (how to get the most from AI when it has access to organisational knowledge).
Appendix: Memecoin experiment
In parallel with the main experiment I had the agent design and partially implement a simple memecoin website — deliberately a trivial project where I didn't mind if the agent made mistakes.
The goal was simple: test how the agent behaves without guardrails and without senior oversight. The result was predictable: functional, but full of decisions that wouldn't survive a week in production. No input validation, hardcoded API keys in the code, no rate limiting.
The experiment confirmed the thesis from day seven: an agent without guardrails is fast to production and dangerous in operation. With guardrails, it's an excellent tool.
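One of the problems found in the sandbox, hardcoded API keys, is the kind of thing an automated guardrail can catch before a human ever looks at the code. The sketch below is a deliberately crude heuristic and an assumption on my part, not a tool used in the experiment: the function name and the regular expression are hypothetical, and a dedicated secret scanner would be more thorough.

```python
import re

# Hypothetical heuristic: a name containing "api key", "secret" or "token"
# that is assigned a quoted literal of 16 or more key-like characters.
SECRET_ASSIGNMENT = re.compile(
    r"""\b[\w-]*(?:api[_-]?key|secret|token)[\w-]*\s*[:=]\s*["'][A-Za-z0-9_\-]{16,}["']""",
    re.IGNORECASE,
)


def find_hardcoded_secrets(source):
    """Return the 1-based line numbers that look like hardcoded credentials."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if SECRET_ASSIGNMENT.search(line):
            hits.append(number)
    return hits
```

A line such as `API_KEY = "sk_live_1234567890abcdef"` would be flagged, while reading the key from the environment with `os.environ["API_KEY"]` would not. A check like this could run in the same pre-commit step as the directory and config checks.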
The memecoin project was never deployed to production, nor was it ever meant to be. It was a sandbox, and as a sandbox it served its purpose.