Agentic Development
Best Practices for the AI-Native Organization
CONFIDENTIAL – INTERNAL USE ONLY
Executive Summary
The shift to agentic development represents a once-in-a-generation inflection point for software engineering organizations. By leveraging autonomous AI agents that plan, implement, test, and iterate on code with minimal human direction, forward-thinking teams are realizing productivity multipliers of 3–5x while maintaining—and in many cases improving—code quality, security posture, and time-to-market.
This guide codifies our organizational best practices for adopting agentic workflows across the software development lifecycle (SDLC). It is intended to serve as the canonical reference for all engineering teams transitioning from traditional or AI-assisted development to fully agentic paradigms. Adherence to these practices is expected for all greenfield projects effective immediately, and for brownfield projects on a rolling basis aligned to sprint cadence.
Key Takeaway: Organizations that fail to adopt agentic practices risk falling behind by an estimated 18–24 months in delivery velocity, making recovery increasingly difficult as competitors compound their advantages.
The Agentic Maturity Model
Our organization recognizes four stages of AI-augmented development maturity. All teams are expected to reach Stage 3 by end of Q2 2026 and Stage 4 by Q4 2026, per the Digital Transformation Steering Committee’s mandate.
| Stage | Name | Description | Target KPI |
|---|---|---|---|
| 1 | AI-Assisted | Autocomplete and inline suggestions (e.g., Copilot). Developer retains full control. | 1.2–1.5x velocity |
| 2 | AI-Augmented | Agents handle discrete tasks (test generation, documentation, boilerplate). Human reviews all output. | 1.5–2.5x velocity |
| 3 | Agentic | Agents autonomously execute multi-step workflows from spec to verified implementation. Human focuses on architecture and review. | 3–5x velocity |
| 4 | Agentic Swarm | Orchestrated multi-agent systems operating in parallel across the full SDLC. Human acts as strategic orchestrator. | 5–10x velocity |
Core Principles of Agentic Development
Spec-Driven Everything
In agentic workflows, the specification is the product. A well-crafted specification is the single most important artifact in the development process, as it serves as both the instruction set for AI agents and the acceptance criteria for human reviewers. Teams must invest significantly more time in specification authoring than in traditional development. The general rule of thumb: if a perfect specification combined with a perfect template would allow an agent to build the product entirely on its own, your specification is good enough.
- All work items must begin with a machine-readable specification before any agent is invoked.
- Specifications must include acceptance criteria, edge cases, security constraints, and integration boundaries.
- Specifications are living documents: agents update them as implementation reveals ambiguity.
- Product managers and engineering leads co-author specifications in structured formats (YAML, Markdown with frontmatter, or the approved Spec Kit templates).
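As an illustration, a machine-readable specification covering the required elements might look like the following. This is a hypothetical YAML sketch; the field names are illustrative, not the approved Spec Kit template:

```yaml
# work-items/payments-retry.spec.yaml — illustrative field names only
id: PAY-1042
title: Add idempotent retry to the payment webhook handler
acceptance_criteria:
  - Retries with the same idempotency key produce exactly one charge.
  - Retries back off exponentially, capped at 5 attempts.
edge_cases:
  - Duplicate delivery of the same event within 50 ms.
  - Upstream 5xx followed by a success on retry.
security_constraints:
  - Never log card numbers or webhook signing secrets.
integration_boundaries:
  - Touches payments-service only; no ledger schema changes.
```

Because agents treat the spec as both instruction set and acceptance criteria, every field above should be testable, not aspirational.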
Human-in-the-Loop, Always
Every line of code must be reviewed by a human, regardless of who or what wrote it. Treating AI-generated code as automatically trustworthy is a termination-eligible offense under our Engineering Standards of Conduct. The human role is shifting from writing code to reading and evaluating code—this is not a reduction in responsibility but a transformation in the nature of the work.
Engineers should develop strong intuitions for AI delegation over time. As a general framework: delegate tasks that are easily verifiable and where you can quickly assess correctness. The more conceptually difficult or design-dependent a task, the more likely it should remain with the human or be worked through collaboratively rather than fully delegated.
Test-Driven Agent Loops
The single biggest differentiator between disciplined agentic engineering and undisciplined “vibe coding” is testing. Without a solid test suite, agents will cheerfully declare “done” on broken code. With tests, an agent can iterate in a closed loop until all tests pass, providing high confidence in the result.
The mandated workflow for all agentic task execution is:
- Agent receives task specification and decomposes it into subtasks.
- Agent writes a failing test for the first subtask (Red).
- Agent implements code until the test passes (Green).
- Agent runs the full test suite to confirm no regressions.
- Agent updates task status and run log, proceeds to next subtask.
- Human reviews the completed batch.
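The mandated red-green loop above can be sketched in Python. The `Agent` methods and `Subtask` model here are hypothetical stand-ins for whatever the approved toolchain exposes, not a real API:

```python
# Sketch of the mandated red-green agent loop. The agent's methods
# (decompose, write_failing_test, implement, run_tests) are hypothetical
# stand-ins for the real toolchain, shown only to make the control flow concrete.
from dataclasses import dataclass

@dataclass
class Subtask:
    name: str
    done: bool = False

def run_agent_loop(agent, spec, max_iterations=10):
    """Drive one task through the decompose -> Red -> Green -> regression cycle."""
    subtasks = agent.decompose(spec)                 # 1. decompose spec into subtasks
    run_log = []
    for subtask in subtasks:
        agent.write_failing_test(subtask)            # 2. Red: failing test first
        for _attempt in range(max_iterations):
            agent.implement(subtask)                 # 3. Green: iterate until passing
            if agent.run_tests(scope=subtask.name):
                break
        else:
            raise RuntimeError(f"{subtask.name}: not green after {max_iterations} tries")
        if not agent.run_tests(scope="full-suite"):  # 4. confirm no regressions
            raise RuntimeError(f"{subtask.name}: regression in full suite")
        subtask.done = True
        run_log.append(subtask.name)                 # 5. update run log, next subtask
    return run_log                                   # 6. batch handed to human review
```

Note that the loop fails loudly rather than declaring "done" on broken code, which is exactly the failure mode an untested agent exhibits.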
Mandatory Requirement: Integration tests are required for all agent-generated code. Unit tests alone do not catch the integration failures that commonly emerge from AI-generated implementations. Browser automation via Playwright with MCP integration is required for frontend agent work.
The Standard Agentic Workflow
All engineering teams must adopt the following end-to-end workflow. Deviations require written approval from the VP of Engineering.
Phase 1: Context Preparation
Before invoking any agent, engineers must prepare the operational context. This includes maintaining up-to-date project templates containing architecture rules, coding standards, integration patterns, and guardrails. Think of the template as the agent’s onboarding document—if a new hire couldn’t build the feature from the template alone, the template is inadequate.
- Maintain an `agents.md` (or equivalent rules file) in every repository root.
- Include dependency maps, API schemas, and architectural decision records (ADRs).
- Refresh context documents monthly or after any significant architectural change.
Phase 2: Specification Generation
Leverage AI agents to collaboratively draft specifications from product requirements. The agent can generate initial drafts, but the product owner and tech lead must review and approve all specifications before implementation begins. Specifications should be stored in version control alongside the code they describe.
Phase 3: Agent-Driven Implementation
Deploy coding agents using the approved toolchain (see Section 6). Agents execute within the test-driven loop described above. For complex features, use the multi-agent orchestration pattern: a planning agent decomposes the work, specialized coding agents handle implementation in parallel, and a review agent synthesizes results.
Phase 4: Human Review and QA
All agent-generated code passes through standard code review. Reviewers should pay particular attention to architectural coherence, security implications, and edge cases—areas where agents reliably underperform. UX review remains a fully human activity. If issues are found, the process loops back to specification refinement.
Phase 5: Continuous Deployment
Agent-generated code follows the same CI/CD pipeline as human-generated code, with the addition of mandatory AI-specific linting rules and security scanning. Canary deployments are required for all agent-generated changes exceeding 500 lines of diff.
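The 500-line canary threshold can be enforced with a small CI gate. This sketch parses the output of `git diff --numstat`; the CI wiring and the threshold policy name are assumptions:

```python
# CI gate sketch: flag agent-generated changes that exceed the 500-line
# canary threshold. In CI, feed this the output of `git diff --numstat`.
CANARY_THRESHOLD = 500  # lines of diff, per this guide

def diff_line_count(numstat_output: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat_output.strip().splitlines():
        added, deleted, _path = line.split("\t", 2)
        if added == "-":  # binary files report "-" counts; skip them
            continue
        total += int(added) + int(deleted)
    return total

def requires_canary(numstat_output: str) -> bool:
    """True when the change must go through a canary deployment."""
    return diff_line_count(numstat_output) > CANARY_THRESHOLD
```

A pipeline step would call `requires_canary` on the PR diff and route the deployment accordingly.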
Governance and Security
Agents must be treated with the same scrutiny applied to third-party contractors. They have the autonomy to edit files and execute commands, and our security posture must reflect that reality.
Access Controls
- Agents operate under scoped permissions: read access to the full codebase, write access only to designated feature branches.
- No agent may push directly to main, staging, or production branches.
- Agents are prohibited from executing destructive operations (database drops, force pushes, production deployments) without human confirmation.
- All agent actions are logged to an immutable audit trail with full provenance tracking.
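One common way to make an audit trail tamper-evident is hash chaining, where each entry commits to the hash of its predecessor. The sketch below is a simplified illustration; field names are hypothetical and the production trail lives in the approved logging infrastructure:

```python
# Hash-chained audit log sketch: each entry embeds the hash of the previous
# entry, so editing any record breaks verification of the whole chain.
# Field names are illustrative, not the production schema.
import hashlib
import json

GENESIS = "0" * 64

def append_entry(log: list, agent_id: str, action: str, target: str) -> dict:
    """Append one provenance record, chained to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"agent_id": agent_id, "action": action, "target": target, "prev": prev_hash}
    body["hash"] = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    log.append(body)
    return body

def verify_chain(log: list) -> bool:
    """Recompute every hash; any tampered field breaks the chain."""
    prev = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

The point of the design is that immutability is verifiable after the fact, not merely asserted by the logging system.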
Model and Tool Governance
Only models and tools approved by the Architecture Review Board (ARB) may be used in production workflows. The current approved toolchain is maintained in the internal wiki and updated quarterly. Unauthorized use of consumer-grade AI tools (ChatGPT web interface, unauthorized Copilot instances, personal API keys) for production code is prohibited.
All agent interactions must transit through the approved API gateway, which enforces rate limiting, content filtering, and data loss prevention (DLP) policies. No source code may be sent to models hosted outside the approved vendor list.
Intellectual Property
All code generated by agents during the course of employment is company property under existing IP assignment agreements. Engineers must not train, fine-tune, or provide feedback to external models using proprietary code or specifications without written authorization from Legal.
Approved Toolchain
| Category | Approved Tools | Use Case | Status |
|---|---|---|---|
| IDE Agent | Cursor, Windsurf | Interactive agentic coding | Production |
| CLI Agent | Claude Code, Codex CLI | Headless agent execution | Production |
| Multi-Agent Orchestration | Custom (MCP-based) | Parallel agent swarms | Pilot |
| Code Review Agent | GitHub Copilot Review | Automated PR review | Production |
| Testing Agent | Playwright + MCP | Integration/E2E testing | Production |
| MCP Registry | GitHub MCP Registry | Tool discovery and integration | Production |
Teams requiring tools outside the approved list must submit a Toolchain Exception Request (TER) through the ARB portal. Estimated turnaround is 3–5 business days.
Metrics and Reporting
All teams must report the following metrics monthly through the Engineering Analytics Dashboard. These metrics will be reviewed in the quarterly Engineering Leadership Forum and will inform resource allocation and team performance assessments.
| Metric | Definition | Target |
|---|---|---|
| Agent Adoption Rate | % of commits with agent involvement | >60% by Q2 |
| Velocity Multiplier | Story points delivered vs. 2024 baseline | 3x by Q3 |
| Agent Code Quality Score | Defect rate in agent-generated vs. human-generated code | Parity or better |
| Spec Coverage | % of work items with machine-readable specifications | >90% by Q2 |
| Test Coverage (Agent) | Integration test coverage for agent-generated code | >80% |
| Mean Time to Review | Average time from agent PR submission to human approval | <4 hours |
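As one way to make the Agent Adoption Rate definition concrete, the sketch below computes it from commit messages. The `Co-Authored-By` agent trailer convention is an assumption for illustration, not a mandated format:

```python
# Sketch: Agent Adoption Rate = % of commits with agent involvement.
# Detecting involvement via a "Co-Authored-By: ... agent ..." trailer is an
# illustrative convention; real pipelines may use structured commit metadata.
def agent_adoption_rate(commit_messages: list) -> float:
    """Return the percentage of commits whose message marks agent involvement."""
    if not commit_messages:
        return 0.0
    agent_commits = sum(
        1 for msg in commit_messages
        if "co-authored-by" in msg.lower() and "agent" in msg.lower()
    )
    return 100.0 * agent_commits / len(commit_messages)
```

The other metrics in the table can be derived similarly from existing CI, review, and issue-tracker data rather than self-reporting.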
Anti-Patterns to Avoid
The following practices are explicitly prohibited. Repeated violations will be addressed through the standard performance management process.
Vibe Coding in Production: Using conversational AI prompting to generate production code without specifications, tests, or structured review. Vibe coding is acceptable for prototypes and throwaway scripts; it is never acceptable for code that will be deployed to any environment accessible by customers.
Cargo-Cult Adoption: Installing agentic tooling without updating workflows, review processes, or team structures. Simply giving every developer a Cursor license does not constitute agentic development.
Agent Autopilot: Allowing agents to operate without human checkpoints. Every agent workflow must include mandatory human review gates as defined in Section 4.
Prompt-and-Pray: Providing vague, underspecified instructions to agents and hoping for good output. Invest in specification quality; the quality of agent output is directly proportional to the quality of the input specification.
Shadow AI: Using unauthorized AI tools, personal API keys, or consumer-grade interfaces for work that touches production code or proprietary data.
Upskilling and Enablement
The transition to agentic development requires a deliberate investment in new skills. Engineering managers must ensure all team members complete the required training by the deadlines specified below.
| Training Module | Audience | Deadline |
|---|---|---|
| Agentic Fundamentals (L100) | All engineers | March 31, 2026 |
| Specification Authoring (L200) | Senior engineers, TLs | April 30, 2026 |
| Agent Orchestration (L300) | Staff engineers, architects | June 30, 2026 |
| Agentic Security (L200) | All engineers | April 30, 2026 |
| AI Governance for Managers (L100) | Engineering managers, directors | March 31, 2026 |
Completion of L100 modules is a prerequisite for the annual performance review cycle. Engineers who complete all applicable modules will receive the “Agentic Practitioner” digital badge on their internal profile.
Looking Ahead
We stand at the beginning of a fundamental transformation in how software is built. The organizations that will thrive in 2027 and beyond are the ones making strategic investments today. Every month of delay represents not just lost productivity but lost learning—and the compound effect of that lost learning creates a competitive gap that becomes increasingly difficult to close.
This guide will be updated quarterly to reflect evolving best practices, new tooling approvals, and lessons learned from our pilot programs. All feedback should be directed to the Agentic Development Center of Excellence (AgenCoE) via the #agentic-dev Slack channel or the dedicated Jira intake form.
Together, we are building the future of software engineering at this organization. Your commitment to these practices is what will make that future a reality.
Document Owner: VP of Engineering, Digital Transformation Office
Next Review: Q2 2026
Distribution: All Engineering Staff