Why Agentic AI Here

We built a small agentic workflow to show how AI can assist engineering, QA, and platform teams end-to-end. Instead of a single prompt, a chain of agents passes context forward: metrics → discovery → engineering → quality → platform → test design → summary. The demo runs locally with a mock model or with OpenAI if you drop in a key, and it ships reports plus auto-generated Playwright tests.

What Is Agentic AI (and Why It Helps Engineering)

Agentic AI is about orchestrating multiple specialized agents that reason over your real signals and pass context along, instead of relying on one-off prompts (a minimal sketch of the chaining pattern follows the list below). Benefits for engineering teams:

  • Context continuity: Each step (metrics, discovery, engineering, quality, platform) inherits prior findings, so outputs stay grounded.
  • Signal-driven: Agents consume repo signals, metrics, and external links (CI/Sonar/Fortify) to avoid hallucinated guidance.
  • Actionable artifacts: Plans, guardrails, and tests are generated as Markdown/HTML plus executable Playwright specs—things you can drop into CI.
  • Modular and extensible: Swap models (mock/OpenAI), add tools (log readers, CI APIs), or new agents without rewriting the flow.
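
To make the chaining concrete, here is a minimal sketch in Node.js (illustrative only; the agent names and the mock model are placeholders, not the repo's actual source):

// Illustrative context-passing loop: each agent sees everything produced before it.
const mockModel = {
  async generate(prompt) {
    return `(mock response for: ${prompt.slice(0, 60)}...)`;
  },
};

const agents = [
  { name: 'metrics',   run: (ctx, m) => m.generate(`Summarize metrics: ${JSON.stringify(ctx.scenario.metrics)}`) },
  { name: 'discovery', run: (ctx, m) => m.generate(`Given ${ctx.outputs.metrics}, list risks for: ${ctx.scenario.goal}`) },
  { name: 'summary',   run: (ctx, m) => m.generate(`Summarize the run: ${JSON.stringify(ctx.outputs)}`) },
];

async function runChain(scenario, model) {
  const context = { scenario, outputs: {} };
  for (const agent of agents) {
    context.outputs[agent.name] = await agent.run(context, model); // prior findings ride along
  }
  return context.outputs;
}

runChain({ goal: 'harden the login flow', metrics: { p95LatencyMs: 420 } }, mockModel)
  .then((outputs) => console.log(outputs));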

What’s in the Playbook

  • Scenario-first: A JSON scenario defines the problem, stack, constraints, and signals (repo links, CI/Sonar/Fortify, etc.); an example shape is sketched after this list.
  • Agent chain: Each agent consumes prior outputs and emits its slice (risks, plans, guardrails, tests).
  • Signals-aware: Metrics and external signals are injected so the agents reason over real inputs, not generic lore.
  • Test generation: Optional --run-tests auto-generates Playwright UI/API specs and runs them; results land in the report.
  • Offline-friendly: Mock model by default; flip to OpenAI via config/model.json or OPENAI_API_KEY.
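
The authoritative schema is whatever scenarios/banking-app.json actually contains; the shape below is an assumption pieced together from the fields mentioned in this post (goal, constraints, techStack, inputs.repoSignals), with placeholder CI/Sonar/Fortify URLs:

{
  "name": "banking-app",
  "goal": "Assess and harden the demo banking app end to end",
  "techStack": ["Node.js", "Static HTML/JS UI", "Playwright"],
  "constraints": ["no downtime during rollout", "keep security scope unchanged"],
  "inputs": {
    "repoSignals": {
      "repo": "https://github.com/amiya-pattnaik/agentic-engineering-playbook",
      "ci": "https://ci.example.com/pipelines/banking-app",
      "sonar": "https://sonar.example.com/dashboard?id=banking-app",
      "fortify": "https://fortify.example.com/projects/banking-app"
    }
  }
}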

How the Flow Runs

  1. Load scenario + signals: scenarios/banking-app.json sets goal/tech/constraints; optional metrics (data/metrics.json) and links (config/signals.json) are attached.
  2. Chain agents: Metrics → Discovery → Engineering → Quality → Platform → TestDesigner → Summary. Each step gets prior outputs plus signals.
  3. Model calls: Mock responses by default; OpenAI if configured.
  4. Render: Markdown (and optional HTML) report in reports/, with metrics status, plans, risks, and generated tests (a toy renderer is sketched after this list).
  5. (Optional) Tests: --run-tests generates Playwright specs in tests/generated/ and runs them; results are stitched into the report.
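
Step 4 is essentially "turn each agent's slice into a section of the report". A toy version of such a renderer (not the repo's implementation; the function and field names are made up for illustration):

// toy Markdown renderer: one report section per agent output
function renderMarkdown(outputs) {
  const lines = ['# Agentic Run Report', ''];
  for (const [agent, text] of Object.entries(outputs)) {
    lines.push(`## ${agent}`, '', text, '');
  }
  return lines.join('\n');
}

console.log(renderMarkdown({
  metrics: 'p95 latency within budget; error rate trending down',
  quality: 'add contract tests for /api/account',
}));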

Demo Web App

  • Static banking UI in web/; run npm run web, then open http://localhost:3000 (login credentials are shown on the card).
  • Use it as context when running the agent chain; with --run-tests, the Playwright specs hit the UI (login/dashboard) and the /api/account endpoint (a sample spec in the same spirit follows this list).
  • Repo: github.com/amiya-pattnaik/agentic-engineering-playbook
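
A hand-written Playwright spec in the same spirit as the generated ones (selectors, credentials, and routes are assumptions about the demo UI, so adjust to what the generator actually emits into tests/generated/):

// illustrative UI + API spec against the demo app on http://localhost:3000
const { test, expect } = require('@playwright/test');

test('login reaches the dashboard', async ({ page }) => {
  await page.goto('http://localhost:3000');
  await page.getByLabel('Username').fill('demo');        // placeholder for the creds on the login card
  await page.getByLabel('Password').fill('demo123');     // placeholder value
  await page.getByRole('button', { name: 'Login' }).click();
  await expect(page).toHaveURL(/dashboard/);
});

test('account API responds', async ({ request }) => {
  const response = await request.get('http://localhost:3000/api/account');
  expect(response.ok()).toBeTruthy();
});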

Why This Matters for Teams

  • Product-aligned outputs: Plans and risks stay anchored to your metrics and signals, not generic advice.
  • Guardrails baked in: Platform and quality agents add policies (CI/CD, security, budgets) alongside engineering steps.
  • Automation-friendly: Tests and reports are artifacts you can drop into CI; mock mode keeps it offline for demos.
  • Extensible: Add scenarios for your services, wire real tools (logs, CI APIs, SLOs), and swap the model client without changing the flow.

Quickstart (clone + run)

# clone and install
git clone https://github.com/amiya-pattnaik/agentic-engineering-playbook.git
cd agentic-engineering-playbook
npm install

# mock model, Markdown report
node src/run.js scenarios/banking-app.json --metrics data/metrics.json

# Markdown + HTML report
node src/run.js scenarios/banking-app.json --metrics data/metrics.json --html

# Add external signals + auto-generated Playwright tests
node src/run.js scenarios/banking-app.json --metrics data/metrics.json --signals config/signals.json --html --run-tests

# Shortcut for metrics + html + tests
npm run demo:tests

Use OpenAI instead of the mock model:

cp config/model.example.json config/model.json   # add your key
node src/run.js scenarios/banking-app.json --metrics data/metrics.json
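
The real schema is defined by config/model.example.json; as a rough sketch of what it might hold (the key names below are guesses, so mirror the example file rather than this snippet):

{
  "provider": "openai",
  "model": "gpt-4o-mini",
  "apiKey": "sk-...your-key..."
}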

How to Extend

  • Add more scenarios under scenarios/*.json with goal, constraints, techStack, and inputs.repoSignals.
  • Teach agents to read real repo files or CI logs by hooking into src/tools.js (a hypothetical tool is sketched after this list).
  • Point signals at your GitHub/Sonar/Fortify endpoints in config/signals.json.
  • Keep the generated tests runnable: install the Playwright browser once with npx playwright install chromium; after that, --run-tests or npm run demo:tests will generate and execute specs.
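
For example, a hypothetical log-reading tool (the function name and how src/tools.js registers tools are assumptions; adapt it to the actual module):

// hypothetical tool: expose the tail of a CI log so agents reason over real failures
const fs = require('fs');

function readCiLogTail(path = 'logs/ci-latest.log', maxLines = 50) {
  if (!fs.existsSync(path)) return '(no CI log found)';
  const lines = fs.readFileSync(path, 'utf8').trimEnd().split('\n');
  return lines.slice(-maxLines).join('\n');
}

module.exports = { readCiLogTail };

// An agent prompt could then embed the tail, e.g.:
// `Given these CI log lines:\n${readCiLogTail()}\nIdentify the most likely failing stage.`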

Notes on other LLMs (Claude, Gemini, etc.)

The flow is model-agnostic: models.js exposes a generate() method and selects which model client to use. To add another provider:

  • Implement a new model class (e.g., ClaudeModel, GeminiModel) mirroring OpenAIModel (take key/model name, call the provider’s chat endpoint, return text).
  • Update selectModel() to check ANTHROPIC_API_KEY or GEMINI_API_KEY (or config/model.json) before falling back to mock.
  • Add provider-specific settings (max tokens, safety filters) to config/model.json.

Agents stay unchanged—they just call generate().
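
As a concrete sketch, a minimal Claude adapter could look like the following (assumes Node 18+ for global fetch; the constructor shape and model name are illustrative, mirroring the bullets above rather than the repo's exact OpenAIModel interface):

// hypothetical ClaudeModel implementing the same generate() contract the agents call
class ClaudeModel {
  constructor(apiKey, modelName = 'claude-3-5-sonnet-latest') {
    this.apiKey = apiKey;
    this.modelName = modelName;
  }

  async generate(prompt) {
    const response = await fetch('https://api.anthropic.com/v1/messages', {
      method: 'POST',
      headers: {
        'x-api-key': this.apiKey,
        'anthropic-version': '2023-06-01',
        'content-type': 'application/json',
      },
      body: JSON.stringify({
        model: this.modelName,
        max_tokens: 1024,
        messages: [{ role: 'user', content: prompt }],
      }),
    });
    const data = await response.json();
    return data.content[0].text; // Anthropic returns an array of content blocks
  }
}

// selectModel() could then prefer Claude when a key is present, e.g.:
// if (process.env.ANTHROPIC_API_KEY) return new ClaudeModel(process.env.ANTHROPIC_API_KEY);

module.exports = { ClaudeModel };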

Closing Thought

Agentic AI doesn’t have to be abstract. A small, auditable chain—fed by your signals and capped with tests—shows how AI can assist engineering, QA, and platform teams without hand-waving. Start with mock mode, layer in your real signals, then graduate to your preferred model when you’re ready.