Article Review: Group 21 — Agentic AI Production Patterns

Articles Reviewed

  1. "Agentic AI for Investment Management: From Concept to Production" — Farhad Malik / Data Science Collective (Dec 2025) — Comprehensive guide to deploying agentic AI in financial services: firm structure mapping, agent anatomy, orchestration patterns, guardrail framework, and risk taxonomy. First article in a planned 9-part series.

Key Concepts

The Semi-Autonomous Principle

The article's foundational design constraint: agents are semi-autonomous, not fully autonomous. Humans set objectives, define constraints, approve actions, and retain final accountability. The agent handles execution within those boundaries.

This is the same principle our architecture embeds in handler design — handlers execute idempotently within defined constraints, but the pipeline (ProcessorApi → documentextractor → SNS → loader/uploader) is orchestrated by the system, not by individual handlers making autonomous decisions.

Five-Component Agent Architecture

Every production agent needs five components:

  1. Goal — The specific purpose (e.g., validate a trade, reconcile positions)
  2. LLM — The planning and reasoning engine
  3. Tools — External functions the agent calls (APIs, databases, code execution)
  4. Memory — Short-term (conversation context) + long-term (vector DB, persistent knowledge)
  5. Orchestration Layer — Manages multi-step processes, coordinates tools, handles action sequences
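The five components above can be sketched as a single structure. This is a minimal illustration, not the article's implementation; all names (`Agent`, `run_step`, the toy LLM and tool) are invented for the example:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    """Skeleton of the five-component agent (names are illustrative)."""
    goal: str                                  # 1. Goal: the specific purpose
    llm: Callable[[str], str]                  # 2. LLM: planning/reasoning engine
    tools: dict[str, Callable]                 # 3. Tools: external functions
    short_term_memory: list = field(default_factory=list)  # 4a. conversation context
    long_term_memory: dict = field(default_factory=dict)   # 4b. persistent knowledge

    def run_step(self, observation: str) -> str:
        """5. Orchestration: one step coordinating the LLM and tools."""
        self.short_term_memory.append(observation)
        plan = self.llm(f"{self.goal}: {observation}")
        tool_name, _, arg = plan.partition(" ")
        if tool_name in self.tools:
            return self.tools[tool_name](arg)
        return plan

# Usage: a toy "LLM" that always decides to call the lookup tool
agent = Agent(
    goal="validate a trade",
    llm=lambda prompt: "lookup TRADE-123",
    tools={"lookup": lambda trade_id: f"{trade_id}: OK"},
)
print(agent.run_step("new trade received"))  # TRADE-123: OK
```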

Six Orchestration Patterns

| Pattern | Description | When to Use |
| --- | --- | --- |
| ReAct | Reason → Act → Observe loop | General-purpose tasks with dynamic tool selection |
| Plan-and-Execute | Create full plan first, then execute step-by-step | Complex multi-step workflows with dependencies |
| Reflex | Stimulus triggers immediate action, no planning | Simple threshold-based alerts and notifications |
| Hierarchical | Parent agent delegates to child agents | Complex tasks decomposable into independent sub-tasks |
| Multimodal | Process images, audio, video as inputs/outputs | Document processing, media analysis |
| Toolformer | Model learns when to call tools | Tasks where tool selection is non-obvious |
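The ReAct row (Reason → Act → Observe) reduces to a simple loop. A toy sketch, not a library API; real implementations prompt the LLM for a thought and an action at each turn:

```python
def react_loop(llm, tools, task, max_steps=5):
    """Minimal Reason -> Act -> Observe loop (illustrative only)."""
    observation = task
    for _ in range(max_steps):
        # Reason: ask the model what to do next given the latest observation
        decision = llm(observation)
        if decision.startswith("FINISH"):
            return decision.removeprefix("FINISH").strip()
        # Act: dynamic tool selection based on the model's decision
        tool_name, _, arg = decision.partition(" ")
        # Observe: feed the tool result back into the next reasoning step
        observation = tools[tool_name](arg)
    return observation

# Usage: a scripted "model" that calls one tool, then finishes
script = iter(["price AAPL", "FINISH done"])
result = react_loop(lambda obs: next(script), {"price": lambda s: f"{s}=190"}, "get price")
print(result)  # done
```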

Four-Gate Guardrail Framework

Every agent action passes through four gates:

  1. Input validation — Catch bad data before processing
  2. Output verification — Check that results make sense
  3. Human approval workflows — For material decisions
  4. Detailed audit trails — Who did what and why
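The four gates compose naturally as a wrapper around each action. A hedged sketch: the gate predicates, the approval hook, and the logging format are all placeholders, not the article's design:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("guardrails")

def guarded_action(payload, action, validate_input, verify_output,
                   needs_approval=lambda p: False, approve=lambda p: False):
    """Run an agent action through the four gates (illustrative)."""
    # Gate 1: input validation -- catch bad data before processing
    if not validate_input(payload):
        raise ValueError("input validation failed")
    # Gate 3: human approval workflow -- only for material decisions
    if needs_approval(payload) and not approve(payload):
        raise PermissionError("human approval denied")
    result = action(payload)
    # Gate 2: output verification -- check that results make sense
    if not verify_output(result):
        raise ValueError("output verification failed")
    # Gate 4: audit trail -- who did what, with what inputs and outputs
    log.info("action=%s payload=%r result=%r", action.__name__, payload, result)
    return result

# Usage: a trivially guarded action
def double(x):
    return x * 2

print(guarded_action(21, double, lambda p: p > 0, lambda r: r < 100))  # 42
```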

Three-Layer Deployment Strategy

The article maps deployment priority to organizational structure:

| Layer | Risk | ROI | Accuracy Tolerance | Start Here? |
| --- | --- | --- | --- | --- |
| Back office (settlement, reconciliation) | Lowest | Clear | 95-98% sufficient | POC target |
| Middle office (risk, trade validation) | Medium | Highest | 95-98% sufficient | Scale target |
| Front office (research, alpha generation) | Highest | Variable | Requires sophisticated reasoning | Last |

When NOT to Use Agents (Six Boundaries)

  1. Tasks requiring 100% deterministic accuracy (NAV computation, trade reconciliation)
  2. Emergency scenarios requiring immediate human judgment
  3. Decisions with regulatory/legal ambiguity
  4. Workflows requiring subjective judgment or ethics
  5. Legacy systems with organizational resistance to integration
  6. Environments without centralized, governed data

Risk Taxonomy (Six Categories)

  1. Model risks — Hallucinations, overconfidence, misinterpretation
  2. Data risks — Poor quality inputs, stale data, unreconciled sources
  3. Operational risks — Cascading failures, API outages, hidden dependencies
  4. Human risks — Over-reliance, weak review controls, poor approval workflows
  5. Regulatory risks — Unexplainable decisions, lack of auditability
  6. Financial risks — Mispriced trades, incorrect risk calculations

Relevance to Our Architecture

Pattern Mapping: Agents to NGE Service Modules

The five-component agent architecture has a striking parallel to our service module anatomy:

| Agent Component | NGE Equivalent |
| --- | --- |
| Goal | Handler purpose (event type routing via SNS filter policies) |
| LLM | Core business logic (core/process.py) |
| Tools | Shell infrastructure (shell/ — S3, SQS, SNS, RDS, ES) |
| Memory | Database state (per-case MySQL) + checkpoint state (DynamoDB/S3) |
| Orchestration | Checkpoint pipeline state machine + SNS event routing |

Our service modules are not AI agents, but they follow the same structural decomposition: a clear goal, a reasoning/processing layer, external tool integrations, persistent state, and orchestration logic.

Orchestration Pattern Alignment

Our existing patterns map to the six orchestration patterns:

  • Plan-and-Execute → Our checkpoint pipeline pattern. documentloader's 11-step state machine creates the full plan (checkpoint definitions), then executes step-by-step with resumability.
  • Hierarchical → pr-review's multi-agent architecture. Orchestrator delegates to 5 specialized review agents, each working independently, then aggregates results.
  • ReAct → Not directly used. Our handlers don't dynamically reason about which tool to call — routing is pre-determined by SNS filter policies.
  • Reflex → Our alarm-based responses. DLQ depth > 0 triggers alerting without planning or reasoning.
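The "routing is pre-determined" contrast with ReAct can be made concrete. SNS evaluates a subscription's filter policy against message attributes at publish time, so each handler's event types are fixed when the subscription is created, with no runtime reasoning. Below is a simplified matcher illustrating that evaluation; the `eventType` attribute and handler names are hypothetical, and real SNS policies support more operators than exact string match:

```python
def matches_filter_policy(policy: dict, attributes: dict) -> bool:
    """Simplified SNS filter-policy check: every policy key must appear in
    the message attributes with one of the allowed values (exact-match
    string policies only; real SNS supports additional operators)."""
    return all(attributes.get(key) in allowed for key, allowed in policy.items())

# Hypothetical handler subscriptions, keyed by the events each accepts
loader_policy = {"eventType": ["DocumentExtracted"]}
uploader_policy = {"eventType": ["DocumentLoaded"]}

event = {"eventType": "DocumentExtracted", "caseId": "123"}
print(matches_filter_policy(loader_policy, event))    # True
print(matches_filter_policy(uploader_policy, event))  # False
```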

Four-Gate Framework vs. Our Guardrails

| Gate | Our Implementation |
| --- | --- |
| Input validation | Handler-level: parse SQS message, validate required fields (caseId, batchId, jobId) |
| Output verification | Checkpoint validation: verify step completion before advancing state machine |
| Human approval | Not currently implemented — our pipelines are fully automated. The article suggests this is appropriate for deterministic, rule-based workflows. |
| Audit trails | CloudWatch structured logging, X-Ray tracing, SNS event history |
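The input-validation gate can be sketched at handler level. The required field names come from the table above; the function name and error-handling convention are illustrative, not our actual handler code:

```python
import json

REQUIRED_FIELDS = ("caseId", "batchId", "jobId")

def parse_sqs_message(record: dict) -> dict:
    """Parse an SQS record body and validate required fields before processing.

    Raising ValueError here (rather than processing partial data) lets the
    message redrive and eventually land in the DLQ -- illustrative handling.
    """
    body = json.loads(record["body"])
    missing = [f for f in REQUIRED_FIELDS if not body.get(f)]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return body

# Usage
record = {"body": json.dumps({"caseId": "c1", "batchId": "b1", "jobId": "j1"})}
print(parse_sqs_message(record)["caseId"])  # c1
```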

The absence of human approval in our pipeline is correct per the article's framework — eDiscovery document processing is high-volume, rule-based, and tolerates automated execution. Human review enters at the product level (lawyers reviewing processed documents), not the infrastructure level.

The Accuracy Tolerance Insight

The article's 95-98% accuracy threshold for agent suitability maps to our domain:

  • Document processing — Tolerates imperfect extraction (OCR errors, metadata parsing edge cases). Humans review the final output. Suitable for agentic patterns.
  • Bates stamping — Must be deterministic and sequential. Zero tolerance for gaps or duplicates. Not suitable for probabilistic agents.
  • Search indexing — Tolerates some imprecision (relevance ranking is inherently fuzzy). Suitable.
  • Billing/metering — Must be exact. Not suitable.
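The Bates-stamping constraint (deterministic, sequential, zero tolerance for gaps or duplicates) is exactly the kind of invariant that plain code guarantees by construction and a probabilistic agent cannot. A sketch with an invented prefix and width convention:

```python
def bates_numbers(prefix: str, start: int, count: int, width: int = 6) -> list[str]:
    """Generate a gap-free, duplicate-free sequence of Bates numbers.

    Deterministic by construction: the i-th stamp is a pure function of
    (prefix, start, i), so re-running over the same range always yields
    identical stamps (prefix and zero-padding width are illustrative).
    """
    return [f"{prefix}{n:0{width}d}" for n in range(start, start + count)]

stamps = bates_numbers("NP", 1, 3)
print(stamps)  # ['NP000001', 'NP000002', 'NP000003']
assert len(set(stamps)) == len(stamps)  # no duplicates, by construction
```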

Potential Application: AI-Assisted Architecture Review

The article's multi-agent orchestration pattern (Quarterly Investor Report example: 4 parallel agents → orchestrator → assembler → human review → sender) maps directly to a potential enhancement of our pr-review service:

Current: 5 parallel review agents → verifier aggregator → PR comment

Enhanced: Add domain-specific agents for eDiscovery compliance, data integrity, and multi-tenant safety. The hierarchical pattern supports this naturally.

Build vs. Buy Decision Framework

The article's build/buy/partner framework suggests:

  • Build when workflows are highly proprietary and the org can support 8+ GenAI/ML engineers
  • Buy when standardized solutions exist
  • Partner for specialized capabilities

For Nextpoint: eDiscovery document processing is highly proprietary (per-case isolation, chain of custody, legal hold requirements). The NGE architecture is a "build" decision. AI-assisted tooling (pr-review, nextpoint-ai) is a "build on top of vendor models" hybrid — we build the orchestration, use vendor LLMs for reasoning.
