Agentic Development

Best Practices for the AI-Native Organization


CONFIDENTIAL – INTERNAL USE ONLY


Executive Summary

The shift to agentic development represents a once-in-a-generation inflection point for software engineering organizations. By leveraging autonomous AI agents that plan, implement, test, and iterate on code with minimal human direction, forward-thinking teams are realizing productivity multipliers of 3–5x while maintaining—and in many cases improving—code quality, security posture, and time-to-market.

This guide codifies our organizational best practices for adopting agentic workflows across the software development lifecycle (SDLC). It is intended to serve as the canonical reference for all engineering teams transitioning from traditional or AI-assisted development to fully agentic paradigms. Adherence to these practices is expected for all greenfield projects effective immediately, and for brownfield projects on a rolling basis aligned to sprint cadence.

Key Takeaway: Organizations that fail to adopt agentic practices risk falling behind by an estimated 18–24 months in delivery velocity, making recovery increasingly difficult as competitors compound their advantages.


The Agentic Maturity Model

Our organization recognizes four stages of AI-augmented development maturity. All teams are expected to reach Stage 3 by end of Q2 2026 and Stage 4 by Q4 2026, per the Digital Transformation Steering Committee’s mandate.

Stage | Name | Description | Target KPI
1 | AI-Assisted | Autocomplete and inline suggestions (e.g., Copilot). Developer retains full control. | 1.2–1.5x velocity
2 | AI-Augmented | Agents handle discrete tasks (test generation, documentation, boilerplate). Human reviews all output. | 1.5–2.5x velocity
3 | Agentic | Agents autonomously execute multi-step workflows from spec to verified implementation. Human focuses on architecture and review. | 3–5x velocity
4 | Agentic Swarm | Orchestrated multi-agent systems operating in parallel across the full SDLC. Human acts as strategic orchestrator. | 5–10x velocity

Core Principles of Agentic Development

Spec-Driven Everything

In agentic workflows, the specification is the product. A well-crafted specification is the single most important artifact in the development process, as it serves as both the instruction set for AI agents and the acceptance criteria for human reviewers. Teams must invest significantly more time in specification authoring than in traditional development. The general rule of thumb: if a perfect specification combined with a perfect template would allow an agent to build the product entirely on its own, your specification is good enough.

  • All work items must begin with a machine-readable specification before any agent is invoked.
  • Specifications must include acceptance criteria, edge cases, security constraints, and integration boundaries.
  • Specifications are living documents: agents update them as implementation reveals ambiguity.
  • Product managers and engineering leads co-author specifications in structured formats (YAML, Markdown with frontmatter, or the approved Spec Kit templates).
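To make the requirement concrete, a machine-readable specification might look like the following. This is an illustrative sketch only: the field names and the work item shown are hypothetical, not the approved Spec Kit template.

```yaml
# Hypothetical work-item specification (illustrative fields, not the approved template)
id: PAY-1042
title: Add idempotency keys to the payment-intent endpoint
acceptance_criteria:
  - Duplicate POSTs with the same Idempotency-Key header return the original response
  - Keys expire after 24 hours
edge_cases:
  - Concurrent requests with the same key must race safely
  - Reusing a key with a different request body returns HTTP 422
security_constraints:
  - Keys are scoped per API client and never logged in plaintext
integration_boundaries:
  - Touches payments-service only; no schema changes to the ledger
```

Note that every section of the bullet list above (acceptance criteria, edge cases, security constraints, integration boundaries) maps to a top-level key, which is what makes the document consumable by both agents and human reviewers.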

Human-in-the-Loop, Always

Every line of code must be reviewed by a human, regardless of who or what wrote it. Treating AI-generated code as automatically trustworthy is a termination-eligible offense under our Engineering Standards of Conduct. The human role is shifting from writing code to reading and evaluating code—this is not a reduction in responsibility but a transformation in the nature of the work.

Engineers should develop strong intuitions for AI delegation over time. As a general framework: delegate tasks that are easily verifiable and where you can quickly assess correctness. The more conceptually difficult or design-dependent a task, the more likely it should remain with the human or be worked through collaboratively rather than fully delegated.

Test-Driven Agent Loops

The single biggest differentiator between disciplined agentic engineering and undisciplined “vibe coding” is testing. Without a solid test suite, agents will cheerfully declare “done” on broken code. With tests, an agent can iterate in a closed loop until all tests pass, providing high confidence in the result.

The mandated workflow for all agentic task execution is:

  1. Agent receives task specification and decomposes it into subtasks.
  2. Agent writes a failing test for the first subtask (Red).
  3. Agent implements code until the test passes (Green).
  4. Agent runs the full test suite to confirm no regressions.
  5. Agent updates task status and run log, proceeds to next subtask.
  6. Human reviews the completed batch.
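The six steps above can be sketched as a driver loop. The `agent_*` and `full_suite_passes` functions here are hypothetical stubs standing in for calls to the approved coding agent and test runner; this is a sketch of the control flow, not an implementation.

```python
# Sketch of the mandated red-green agent loop. The agent_* functions are
# hypothetical stubs for calls to the approved coding agent and test runner.

def agent_decompose(spec: str) -> list[str]:
    # Step 1: a planning agent splits the spec into subtasks.
    return [f"{spec} / subtask {i}" for i in (1, 2)]

def agent_write_failing_test(subtask: str) -> str:
    # Step 2 (Red): the agent authors a failing test for the subtask.
    return f"test<{subtask}>"

def agent_implement(subtask: str, test: str) -> bool:
    # Step 3 (Green): stub; a real agent iterates until `test` passes.
    return True

def full_suite_passes() -> bool:
    # Step 4: stub for running the project's full test suite.
    return True

def execute_task(spec: str) -> list[dict]:
    run_log = []
    for subtask in agent_decompose(spec):
        test = agent_write_failing_test(subtask)
        if not (agent_implement(subtask, test) and full_suite_passes()):
            raise RuntimeError(f"halt for human triage: {subtask}")
        run_log.append({"subtask": subtask, "status": "done"})  # Step 5
    return run_log  # Step 6: the human reviews this completed batch
```

The key property the sketch encodes is that a subtask never advances past step 4 with a failing suite; the loop halts for human triage rather than declaring "done" on broken code.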

Mandatory Requirement: Integration tests are required for all agent-generated code. Unit tests alone do not catch the integration failures that commonly emerge from AI-generated implementations. Browser automation via Playwright with MCP integration is required for frontend agent work.


The Standard Agentic Workflow

All engineering teams must adopt the following end-to-end workflow. Deviations require written approval from the VP of Engineering.

Phase 1: Context Preparation

Before invoking any agent, engineers must prepare the operational context. This includes maintaining up-to-date project templates containing architecture rules, coding standards, integration patterns, and guardrails. Think of the template as the agent’s onboarding document—if a new hire couldn’t build the feature from the template alone, the template is inadequate.

  • Maintain an agents.md (or equivalent rules file) in every repository root.
  • Include dependency maps, API schemas, and architectural decision records (ADRs).
  • Refresh context documents monthly or after any significant architectural change.
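For teams starting from scratch, a minimal agents.md might look like the following. Every section shown is illustrative (the ADR path, make target, and rules are hypothetical); the point is that each section answers a question a new hire would otherwise have to ask.

```markdown
# agents.md — operational context for coding agents (illustrative example)

## Architecture rules
- All service-to-service calls go through the internal API gateway.
- Follow the module layout documented in the repository's ADRs.

## Coding standards
- TypeScript strict mode; no `any` in exported signatures.

## Guardrails
- Never modify files under migrations/ without an approved specification.
- Run the integration test suite before marking any task done.
```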

Phase 2: Specification Generation

Leverage AI agents to collaboratively draft specifications from product requirements. The agent can generate initial drafts, but the product owner and tech lead must review and approve all specifications before implementation begins. Specifications should be stored in version control alongside the code they describe.

Phase 3: Agent-Driven Implementation

Deploy coding agents using the approved toolchain (see Section 6). Agents execute within the test-driven loop described above. For complex features, use the multi-agent orchestration pattern: a planning agent decomposes the work, specialized coding agents handle implementation in parallel, and a review agent synthesizes results.
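The multi-agent orchestration pattern can be sketched as follows. The `plan`, `implement`, and `review` functions are hypothetical stubs for the planning, coding, and review agents; a real orchestrator would dispatch to the approved toolchain rather than run local functions.

```python
# Sketch of the planner / parallel workers / reviewer pattern described above.
# plan, implement, and review are hypothetical stand-ins for real agents.
from concurrent.futures import ThreadPoolExecutor

def plan(feature: str) -> list[str]:
    # Planning agent: decompose the feature into independent subtasks.
    return [f"{feature} / part {i}" for i in (1, 2, 3)]

def implement(subtask: str) -> str:
    # Coding agent: produce a patch for one subtask.
    return f"patch<{subtask}>"

def review(patches: list[str]) -> dict:
    # Review agent: synthesize the parallel results into one reviewable batch.
    return {"patches": patches, "approved": len(patches) > 0}

def orchestrate(feature: str) -> dict:
    subtasks = plan(feature)
    with ThreadPoolExecutor() as pool:      # coding agents run in parallel
        patches = list(pool.map(implement, subtasks))
    return review(patches)                  # synthesis precedes human review
```

The fan-out/fan-in shape is the essential part: decomposition and synthesis are serialized, while implementation parallelizes across subtasks.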

Phase 4: Human Review and QA

All agent-generated code passes through standard code review. Reviewers should pay particular attention to architectural coherence, security implications, and edge cases—areas where agents reliably underperform. UX review remains a fully human activity. If issues are found, the process loops back to specification refinement.

Phase 5: Continuous Deployment

Agent-generated code follows the same CI/CD pipeline as human-generated code, with the addition of mandatory AI-specific linting rules and security scanning. Canary deployments are required for all agent-generated changes exceeding 500 lines of diff.
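The 500-line canary gate can be expressed as a simple CI check. The threshold comes from this guide; the counting convention (added plus removed lines in a unified diff, excluding file headers) is an assumption and should match however your pipeline measures diff size.

```python
# Sketch of the canary gate for agent-generated changes. The 500-line
# threshold is from this guide; the diff-counting convention is an assumption.

CANARY_THRESHOLD = 500

def diff_size(diff_text: str) -> int:
    """Count changed lines in a unified diff, ignoring +++/--- file headers."""
    return sum(
        1
        for line in diff_text.splitlines()
        if (line.startswith("+") or line.startswith("-"))
        and not line.startswith(("+++", "---"))
    )

def requires_canary(diff_text: str, agent_generated: bool) -> bool:
    """Canary deployment is mandatory for agent changes over the threshold."""
    return agent_generated and diff_size(diff_text) > CANARY_THRESHOLD
```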


Governance and Security

Agents must be treated with the same scrutiny applied to third-party contractors. They have the autonomy to edit files and execute commands, and our security posture must reflect that reality.

Access Controls

  • Agents operate under scoped permissions: read access to the full codebase, write access only to designated feature branches.
  • No agent may push directly to main, staging, or production branches.
  • Agents are prohibited from executing destructive operations (database drops, force pushes, production deployments) without human confirmation.
  • All agent actions are logged to an immutable audit trail with full provenance tracking.
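The branch and destructive-operation rules above can be encoded as a single policy check. The branch names come from this guide; the operation labels and function shape are illustrative assumptions.

```python
# Sketch of the agent guardrails above. Operation labels are illustrative;
# the protected branch names come from this guide.

PROTECTED_BRANCHES = {"main", "staging", "production"}
DESTRUCTIVE_OPS = {"db_drop", "force_push", "prod_deploy"}

def agent_action_allowed(action: str, branch: str,
                         human_confirmed: bool = False) -> bool:
    if action == "push" and branch in PROTECTED_BRANCHES:
        return False  # agents never push directly to protected branches
    if action in DESTRUCTIVE_OPS and not human_confirmed:
        return False  # destructive operations require human confirmation
    return True
```

In practice this check would sit in the API gateway or CI layer so it cannot be bypassed by the agent itself; enforcing policy inside the agent's own process offers no real guarantee.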

Model and Tool Governance

Only models and tools approved by the Architecture Review Board (ARB) may be used in production workflows. The current approved toolchain is maintained in the internal wiki and updated quarterly. Unauthorized use of consumer-grade AI tools (ChatGPT web interface, unauthorized Copilot instances, personal API keys) for production code is prohibited.

All agent interactions must transit through the approved API gateway, which enforces rate limiting, content filtering, and data loss prevention (DLP) policies. No source code may be sent to models hosted outside the approved vendor list.

Intellectual Property

All code generated by agents during the course of employment is company property under existing IP assignment agreements. Engineers must not train, fine-tune, or provide feedback to external models using proprietary code or specifications without written authorization from Legal.


Approved Toolchain

Category | Approved Tools | Use Case | Status
IDE Agent | Cursor, Windsurf | Interactive agentic coding | Production
CLI Agent | Claude Code, Codex CLI | Headless agent execution | Production
Multi-Agent Orchestration | Custom (MCP-based) | Parallel agent swarms | Pilot
Code Review Agent | GitHub Copilot Review | Automated PR review | Production
Testing Agent | Playwright + MCP | Integration/E2E testing | Production
MCP Registry | GitHub MCP Registry | Tool discovery and integration | Production

Teams requiring tools outside the approved list must submit a Toolchain Exception Request (TER) through the ARB portal. Estimated turnaround is 3–5 business days.


Metrics and Reporting

All teams must report the following metrics monthly through the Engineering Analytics Dashboard. These metrics will be reviewed in the quarterly Engineering Leadership Forum and will inform resource allocation and team performance assessments.

Metric | Definition | Target
Agent Adoption Rate | % of commits with agent involvement | >60% by Q2
Velocity Multiplier | Story points delivered vs. 2024 baseline | 3x by Q3
Agent Code Quality Score | Defect rate in agent-generated vs. human-generated code | Parity or better
Spec Coverage | % of work items with machine-readable specifications | >90% by Q2
Test Coverage (Agent) | Integration test coverage for agent-generated code | >80%
Mean Time to Review | Average time from agent PR submission to human approval | <4 hours
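Two of these metrics can be computed directly from commit and sprint data. The input shapes below (a list of commit records with an `agent_involved` flag, and story-point totals) are assumptions about the dashboard's data model, shown only to pin down the definitions.

```python
# Sketch of two dashboard metrics from the table above. The input shapes
# (commit records, story-point totals) are assumed, not the dashboard's API.

def agent_adoption_rate(commits: list[dict]) -> float:
    """Percentage of commits with agent involvement."""
    if not commits:
        return 0.0
    involved = sum(1 for c in commits if c.get("agent_involved"))
    return 100.0 * involved / len(commits)

def velocity_multiplier(points_delivered: float, baseline_2024: float) -> float:
    """Story points delivered relative to the 2024 baseline."""
    return points_delivered / baseline_2024
```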

Anti-Patterns to Avoid

The following practices are explicitly prohibited. Repeated violations will be addressed through the standard performance management process.

Vibe Coding in Production: Using conversational AI prompting to generate production code without specifications, tests, or structured review. Vibe coding is acceptable for prototypes and throwaway scripts; it is never acceptable for code that will be deployed to any environment accessible by customers.

Cargo-Cult Adoption: Installing agentic tooling without updating workflows, review processes, or team structures. Simply giving every developer a Cursor license does not constitute agentic development.

Agent Autopilot: Allowing agents to operate without human checkpoints. Every agent workflow must include mandatory human review gates as defined in Section 4.

Prompt-and-Pray: Providing vague, underspecified instructions to agents and hoping for good output. Invest in specification quality; the quality of agent output is directly proportional to the quality of the input specification.

Shadow AI: Using unauthorized AI tools, personal API keys, or consumer-grade interfaces for work that touches production code or proprietary data.


Upskilling and Enablement

The transition to agentic development requires a deliberate investment in new skills. Engineering managers must ensure all team members complete the required training by the deadlines specified below.

Training Module | Audience | Deadline
Agentic Fundamentals (L100) | All engineers | March 31, 2026
Specification Authoring (L200) | Senior engineers, TLs | April 30, 2026
Agent Orchestration (L300) | Staff engineers, architects | June 30, 2026
Agentic Security (L200) | All engineers | April 30, 2026
AI Governance for Managers (L100) | Engineering managers, directors | March 31, 2026

Completion of L100 modules is a prerequisite for the annual performance review cycle. Engineers who complete all applicable modules will receive the “Agentic Practitioner” digital badge on their internal profile.


Looking Ahead

We stand at the beginning of a fundamental transformation in how software is built. The organizations that will thrive in 2027 and beyond are the ones making strategic investments today. Every month of delay represents not just lost productivity but lost learning—and the compound effect of that lost learning creates a competitive gap that becomes increasingly difficult to close.

This guide will be updated quarterly to reflect evolving best practices, new tooling approvals, and lessons learned from our pilot programs. All feedback should be directed to the Agentic Development Center of Excellence (AgenCoE) via the #agentic-dev Slack channel or the dedicated Jira intake form.

Together, we are building the future of software engineering at this organization. Your commitment to these practices is what will make that future a reality.


Document Owner: VP of Engineering, Digital Transformation Office
Next Review: Q2 2026
Distribution: All Engineering Staff