Why Agentic AI Here

We built a small agentic workflow to show how AI can assist engineering, QA, and platform teams end-to-end. Instead of a single prompt, a chain of agents passes context forward: metrics → discovery → engineering → quality → platform → test design → summary. The demo runs locally with a mock model or with OpenAI if you drop in a key, and it ships reports plus auto-generated Playwright tests.

What Is Agentic AI (and Why It Helps Engineering)

Agentic AI is about orchestrating multiple specialized agents that reason over your real signals and pass context along, instead of relying on one-off prompts (a minimal sketch of the chaining pattern follows the list below). Benefits for engineering teams:

  • Context continuity: Each step (metrics, discovery, engineering, quality, platform) inherits prior findings, so outputs stay grounded.
  • Signal-driven: Agents consume repo signals, metrics, and external links (CI/Sonar/Fortify) to avoid hallucinated guidance.
  • Actionable artifacts: Plans, guardrails, and tests are generated as Markdown/HTML plus executable Playwright specs—things you can drop into CI.
  • Modular and extensible: Swap models (mock/OpenAI), add tools (log readers, CI APIs), or new agents without rewriting the flow.
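
To make the chaining concrete, here is a minimal sketch in Node.js (illustrative only; the agent names and the mock model are placeholders, not the repo's actual source):

// Illustrative context-passing loop: each agent sees everything produced before it.
const mockModel = {
  async generate(prompt) {
    return `(mock response for: ${prompt.slice(0, 60)}...)`;
  },
};

const agents = [
  { name: 'metrics',   run: (ctx, m) => m.generate(`Summarize metrics: ${JSON.stringify(ctx.scenario.metrics)}`) },
  { name: 'discovery', run: (ctx, m) => m.generate(`Given ${ctx.outputs.metrics}, list risks for: ${ctx.scenario.goal}`) },
  { name: 'summary',   run: (ctx, m) => m.generate(`Summarize the run: ${JSON.stringify(ctx.outputs)}`) },
];

async function runChain(scenario, model) {
  const context = { scenario, outputs: {} };
  for (const agent of agents) {
    context.outputs[agent.name] = await agent.run(context, model); // prior findings ride along
  }
  return context.outputs;
}

runChain({ goal: 'harden the login flow', metrics: { p95LatencyMs: 420 } }, mockModel)
  .then((outputs) => console.log(outputs));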

What’s in the Playbook

  • Scenario-first: A JSON scenario defines the problem, stack, constraints, and signals (repo links, CI/Sonar/Fortify, etc.); an example shape is sketched after this list.
  • Agent chain: Each agent consumes prior outputs and emits its slice (risks, plans, guardrails, tests).
  • Signals-aware: Metrics and external signals are injected so the agents reason over real inputs, not generic lore.
  • Test generation: Optional --run-tests auto-generates Playwright UI/API specs and runs them; results land in the report.
  • Offline-friendly: Mock model by default; flip to OpenAI via config/model.json or OPENAI_API_KEY.
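
The authoritative schema is whatever scenarios/banking-app.json actually contains; the shape below is an assumption pieced together from the fields mentioned in this post (goal, constraints, techStack, inputs.repoSignals), with placeholder CI/Sonar/Fortify URLs:

{
  "name": "banking-app",
  "goal": "Assess and harden the demo banking app end to end",
  "techStack": ["Node.js", "Static HTML/JS UI", "Playwright"],
  "constraints": ["no downtime during rollout", "keep security scope unchanged"],
  "inputs": {
    "repoSignals": {
      "repo": "https://github.com/amiya-pattnaik/agentic-engineering-playbook",
      "ci": "https://ci.example.com/pipelines/banking-app",
      "sonar": "https://sonar.example.com/dashboard?id=banking-app",
      "fortify": "https://fortify.example.com/projects/banking-app"
    }
  }
}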

How the Flow Runs

  1. Load scenario + signals: scenarios/banking-app.json sets goal/tech/constraints; optional metrics (data/metrics.json) and links (config/signals.json) are attached.
  2. Chain agents: Metrics → Discovery → Engineering → Quality → Platform → TestDesigner → Summary. Each step gets prior outputs plus signals.
  3. Model calls: Mock responses by default; OpenAI if configured.
  4. Render: Markdown (and optional HTML) report in reports/, with metrics status, plans, risks, and generated tests (a toy renderer is sketched after this list).
  5. (Optional) Tests: --run-tests generates Playwright specs in tests/generated/ and runs them; results are stitched into the report.
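
Step 4 is essentially "turn each agent's slice into a section of the report". A toy version of such a renderer (not the repo's implementation; the function and field names are made up for illustration):

// toy Markdown renderer: one report section per agent output
function renderMarkdown(outputs) {
  const lines = ['# Agentic Run Report', ''];
  for (const [agent, text] of Object.entries(outputs)) {
    lines.push(`## ${agent}`, '', text, '');
  }
  return lines.join('\n');
}

console.log(renderMarkdown({
  metrics: 'p95 latency within budget; error rate trending down',
  quality: 'add contract tests for /api/account',
}));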

Demo Web App

  • Static banking UI in web/; run npm run web, then open http://localhost:3000 (login credentials are shown on the card).
  • Use it as context when running the agent chain; with --run-tests, the Playwright specs hit the UI (login/dashboard) and the /api/account endpoint (a sample spec in the same spirit follows this list).
  • Repo: github.com/amiya-pattnaik/agentic-engineering-playbook
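
A hand-written Playwright spec in the same spirit as the generated ones (selectors, credentials, and routes are assumptions about the demo UI, so adjust to what the generator actually emits into tests/generated/):

// illustrative UI + API spec against the demo app on http://localhost:3000
const { test, expect } = require('@playwright/test');

test('login reaches the dashboard', async ({ page }) => {
  await page.goto('http://localhost:3000');
  await page.getByLabel('Username').fill('demo');        // placeholder for the creds on the login card
  await page.getByLabel('Password').fill('demo123');     // placeholder value
  await page.getByRole('button', { name: 'Login' }).click();
  await expect(page).toHaveURL(/dashboard/);
});

test('account API responds', async ({ request }) => {
  const response = await request.get('http://localhost:3000/api/account');
  expect(response.ok()).toBeTruthy();
});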

Why This Matters for Teams

  • Product-aligned outputs: Plans and risks stay anchored to your metrics and signals, not generic advice.
  • Guardrails baked in: Platform and quality agents add policies (CI/CD, security, budgets) alongside engineering steps.
  • Automation-friendly: Tests and reports are artifacts you can drop into CI; mock mode keeps it offline for demos.
  • Extensible: Add scenarios for your services, wire real tools (logs, CI APIs, SLOs), and swap the model client without changing the flow.

Quickstart (clone + run)

# clone and install
git clone https://github.com/amiya-pattnaik/agentic-engineering-playbook.git
cd agentic-engineering-playbook
npm install

# mock model, Markdown report
node src/run.js scenarios/banking-app.json --metrics data/metrics.json

# Markdown + HTML report
node src/run.js scenarios/banking-app.json --metrics data/metrics.json --html

# Add external signals + auto-generated Playwright tests
node src/run.js scenarios/banking-app.json --metrics data/metrics.json --signals config/signals.json --html --run-tests

# Shortcut for metrics + html + tests
npm run demo:tests

Use OpenAI instead of the mock model:

cp config/model.example.json config/model.json   # add your key
node src/run.js scenarios/banking-app.json --metrics data/metrics.json
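
The real schema is defined by config/model.example.json; as a rough sketch of what it might hold (the key names below are guesses, so mirror the example file rather than this snippet):

{
  "provider": "openai",
  "model": "gpt-4o-mini",
  "apiKey": "sk-...your-key..."
}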

How to Extend

  • Add more scenarios under scenarios/*.json with goal, constraints, techStack, and inputs.repoSignals.
  • Teach agents to read real repo files or CI logs by hooking into src/tools.js (a hypothetical tool is sketched after this list).
  • Point signals at your GitHub/Sonar/Fortify endpoints in config/signals.json.
  • Keep the generated tests runnable: install the Playwright browser once with npx playwright install chromium; after that, --run-tests or npm run demo:tests will generate and execute specs.
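
For example, a hypothetical log-reading tool (the function name and how src/tools.js registers tools are assumptions; adapt it to the actual module):

// hypothetical tool: expose the tail of a CI log so agents reason over real failures
const fs = require('fs');

function readCiLogTail(path = 'logs/ci-latest.log', maxLines = 50) {
  if (!fs.existsSync(path)) return '(no CI log found)';
  const lines = fs.readFileSync(path, 'utf8').trimEnd().split('\n');
  return lines.slice(-maxLines).join('\n');
}

module.exports = { readCiLogTail };

// An agent prompt could then embed the tail, e.g.:
// `Given these CI log lines:\n${readCiLogTail()}\nIdentify the most likely failing stage.`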

Notes on other LLMs (Claude, Gemini, etc.)

The flow is model-agnostic: models.js exposes a generate() method and selects which model client to use. To add another provider:

  • Implement a new model class (e.g., ClaudeModel, GeminiModel) mirroring OpenAIModel (take key/model name, call the provider’s chat endpoint, return text).
  • Update selectModel() to check ANTHROPIC_API_KEY or GEMINI_API_KEY (or config/model.json) before falling back to mock.
  • Add provider-specific settings (max tokens, safety filters) to config/model.json.

Agents stay unchanged—they just call generate().
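
As a concrete sketch, a minimal Claude adapter could look like the following (assumes Node 18+ for global fetch; the constructor shape and model name are illustrative, mirroring the bullets above rather than the repo's exact OpenAIModel interface):

// hypothetical ClaudeModel implementing the same generate() contract the agents call
class ClaudeModel {
  constructor(apiKey, modelName = 'claude-3-5-sonnet-latest') {
    this.apiKey = apiKey;
    this.modelName = modelName;
  }

  async generate(prompt) {
    const response = await fetch('https://api.anthropic.com/v1/messages', {
      method: 'POST',
      headers: {
        'x-api-key': this.apiKey,
        'anthropic-version': '2023-06-01',
        'content-type': 'application/json',
      },
      body: JSON.stringify({
        model: this.modelName,
        max_tokens: 1024,
        messages: [{ role: 'user', content: prompt }],
      }),
    });
    const data = await response.json();
    return data.content[0].text; // Anthropic returns an array of content blocks
  }
}

// selectModel() could then prefer Claude when a key is present, e.g.:
// if (process.env.ANTHROPIC_API_KEY) return new ClaudeModel(process.env.ANTHROPIC_API_KEY);

module.exports = { ClaudeModel };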

Closing Thought

Agentic AI doesn’t have to be abstract. A small, auditable chain—fed by your signals and capped with tests—shows how AI can assist engineering, QA, and platform teams without hand-waving. Start with mock mode, layer in your real signals, then graduate to your preferred model when you’re ready.