Article Review: Group 21 — Agentic AI Production Patterns¶
Article Reviewed¶
- "Agentic AI for Investment Management: From Concept to Production" — Farhad Malik / Data Science Collective (Dec 2025) — Comprehensive guide to deploying agentic AI in financial services: firm structure mapping, agent anatomy, orchestration patterns, guardrail framework, and risk taxonomy. First article in a planned 9-part series.
Key Concepts¶
The Semi-Autonomous Principle¶
The article's foundational design constraint: agents are semi-autonomous, not fully autonomous. Humans set objectives, define constraints, approve actions, and retain final accountability. The agent handles execution within those boundaries.
This is the same principle our architecture embeds in handler design — handlers execute idempotently within defined constraints, but the pipeline (ProcessorApi → documentextractor → SNS → loader/uploader) is orchestrated by the system, not by individual handlers making autonomous decisions.
Five-Component Agent Architecture¶
Every production agent needs five components:
- Goal — The specific purpose (e.g., validate a trade, reconcile positions)
- LLM — The planning and reasoning engine
- Tools — External functions the agent calls (APIs, databases, code execution)
- Memory — Short-term (conversation context) + long-term (vector DB, persistent knowledge)
- Orchestration Layer — Manages multi-step processes, coordinates tools, handles action sequences
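The five components can be sketched as a single container. This is a minimal illustration of the decomposition, not the article's implementation: the `llm` and the `lookup` tool are stubs, and the one-step orchestration loop is an assumption for brevity.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class Agent:
    """Minimal container for the five components named above."""
    goal: str                                        # Goal: the specific purpose
    llm: Callable[[str], str]                        # LLM: planning/reasoning engine
    tools: dict[str, Callable[..., Any]]             # Tools: external functions
    memory: list[str] = field(default_factory=list)  # Memory: short-term context

    def step(self, observation: str) -> str:
        """Orchestration layer: one step that consults the LLM, then a tool."""
        self.memory.append(observation)
        plan = self.llm(f"{self.goal}: {observation}")
        tool_name, _, arg = plan.partition(" ")      # plan format: "<tool> <arg>"
        result = self.tools[tool_name](arg)
        self.memory.append(result)
        return result

# Usage with a stubbed "LLM" that always picks the lookup tool
agent = Agent(
    goal="validate a trade",
    llm=lambda prompt: "lookup T-123",
    tools={"lookup": lambda trade_id: f"trade {trade_id}: OK"},
)
print(agent.step("new trade received"))  # trade T-123: OK
```

In production the lambda stubs would be a real model call and real API clients; the structural point is only that all five components have an explicit home.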
Six Orchestration Patterns¶
| Pattern | Description | When to Use |
|---|---|---|
| ReAct | Reason → Act → Observe loop | General-purpose tasks with dynamic tool selection |
| Plan-and-Execute | Create full plan first, then execute step-by-step | Complex multi-step workflows with dependencies |
| Reflex | Stimulus triggers immediate action, no planning | Simple threshold-based alerts and notifications |
| Hierarchical | Parent agent delegates to child agents | Complex tasks decomposable into independent sub-tasks |
| Multimodal | Process images, audio, video as inputs/outputs | Document processing, media analysis |
| Toolformer | Model learns when to call tools | Tasks where tool selection is non-obvious |
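The first pattern in the table, ReAct, reduces to a short loop. This is a generic sketch with a deterministic stand-in for the reasoning step, not any specific framework's API; the `("DONE", answer)` convention and the cache tool are assumptions for illustration.

```python
def react_loop(reason, tools, observation, max_steps=5):
    """ReAct: alternate Reason -> Act -> Observe until the reasoner finishes.

    `reason` stands in for an LLM call; it returns either
    ("DONE", answer) or (tool_name, tool_arg).
    """
    for _ in range(max_steps):
        action, arg = reason(observation)   # Reason
        if action == "DONE":
            return arg
        observation = tools[action](arg)    # Act, then Observe
    raise RuntimeError("step budget exhausted")

# Stubbed reasoner: fetch the value once, then finish.
def reason(obs):
    if obs.startswith("cached:"):
        return ("DONE", obs.removeprefix("cached:"))
    return ("cache_get", obs)

tools = {"cache_get": lambda key: f"cached:{key.upper()}"}
print(react_loop(reason, tools, "position-42"))  # POSITION-42
```

The `max_steps` budget is the key production detail: without it, a reasoner that never emits `DONE` loops forever.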
Four-Gate Guardrail Framework¶
Every agent action passes through four gates:
- Input validation — Catch bad data before processing
- Output verification — Check that results make sense
- Human approval workflows — For material decisions
- Detailed audit trails — Who did what and why
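The four gates compose naturally into a single wrapper around any agent action. A minimal sketch, with callables standing in for each gate and thresholds chosen purely for illustration:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("audit")

def guarded_action(payload, validate, act, verify, needs_approval, approve):
    """Run one agent action through the four gates, in order."""
    if not validate(payload):                           # Gate 1: input validation
        raise ValueError("input rejected")
    result = act(payload)
    if not verify(result):                              # Gate 2: output verification
        raise ValueError("output rejected")
    if needs_approval(result) and not approve(result):  # Gate 3: human approval
        raise PermissionError("approval denied")
    log.info("action=%s result=%s", payload, result)    # Gate 4: audit trail
    return result

# Hypothetical reconciliation action with simple threshold gates
out = guarded_action(
    payload={"amount": 100},
    validate=lambda p: p["amount"] > 0,
    act=lambda p: p["amount"] * 2,
    verify=lambda r: r < 10_000,
    needs_approval=lambda r: r > 1_000,  # only material results need sign-off
    approve=lambda r: True,              # stand-in for a human workflow
)
print(out)  # 200
```

Note that gate 3 is conditional: routine results skip human review, which is what makes the pattern viable at volume.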
Three-Layer Deployment Strategy¶
The article maps deployment priority to organizational structure:
| Layer | Risk | ROI | Accuracy Tolerance | Start Here? |
|---|---|---|---|---|
| Back office (settlement, reconciliation) | Lowest | Clear | 95-98% sufficient | POC target |
| Middle office (risk, trade validation) | Medium | Highest | 95-98% sufficient | Scale target |
| Front office (research, alpha generation) | Highest | Variable | Requires sophisticated reasoning | Last |
When NOT to Use Agents (Six Boundaries)¶
- Tasks requiring 100% deterministic accuracy (NAV computation, trade reconciliation)
- Emergency scenarios requiring immediate human judgment
- Decisions with regulatory/legal ambiguity
- Workflows requiring subjective judgment or ethics
- Legacy systems with organizational resistance to integration
- Environments without centralized, governed data
Risk Taxonomy (Six Categories)¶
- Model risks — Hallucinations, overconfidence, misinterpretation
- Data risks — Poor quality inputs, stale data, unreconciled sources
- Operational risks — Cascading failures, API outages, hidden dependencies
- Human risks — Over-reliance, weak review controls, poor approval workflows
- Regulatory risks — Unexplainable decisions, lack of auditability
- Financial risks — Mispriced trades, incorrect risk calculations
Relevance to Our Architecture¶
Pattern Mapping: Agents to NGE Service Modules¶
The five-component agent architecture has a striking parallel to our service module anatomy:
| Agent Component | NGE Equivalent |
|---|---|
| Goal | Handler purpose (event type routing via SNS filter policies) |
| LLM | Core business logic (core/process.py) |
| Tools | Shell infrastructure (shell/ — S3, SQS, SNS, RDS, ES) |
| Memory | Database state (per-case MySQL) + checkpoint state (DynamoDB/S3) |
| Orchestration | Checkpoint pipeline state machine + SNS event routing |
Our service modules are not AI agents, but they follow the same structural decomposition: a clear goal, a reasoning/processing layer, external tool integrations, persistent state, and orchestration logic.
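The core/shell decomposition in the table can be sketched in a few lines. All names here are illustrative stand-ins, not the actual NGE module API: `Shell` fakes the S3/SQS/SNS/RDS integrations so the pure `process` function is testable in isolation.

```python
def process(document: dict) -> dict:
    """core/: pure business logic, no I/O (easy to test)."""
    return {"id": document["id"], "pages": len(document["text"]) // 1000 + 1}

class Shell:
    """shell/: stands in for the external integrations (S3, SQS, SNS, RDS)."""
    def __init__(self):
        self.published = []
    def fetch(self, doc_id):
        return {"id": doc_id, "text": "x" * 2500}
    def publish(self, event):
        self.published.append(event)

def handler(doc_id, shell):
    """Orchestration: fetch (shell) -> process (core) -> publish (shell)."""
    result = process(shell.fetch(doc_id))
    shell.publish(result)
    return result

shell = Shell()
print(handler("doc-7", shell))  # {'id': 'doc-7', 'pages': 3}
```

The same separation is what makes the agent analogy useful: swap the pure core for an LLM call and the structure is unchanged.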
Orchestration Pattern Alignment¶
Our existing patterns map to the six orchestration patterns:
- Plan-and-Execute → Our checkpoint pipeline pattern. documentloader's 11-step state machine creates the full plan (checkpoint definitions), then executes step-by-step with resumability.
- Hierarchical → pr-review's multi-agent architecture. Orchestrator delegates to 5 specialized review agents, each working independently, then aggregates results.
- ReAct → Not directly used. Our handlers don't dynamically reason about which tool to call — routing is pre-determined by SNS filter policies.
- Reflex → Our alarm-based responses. DLQ depth > 0 triggers alerting without planning or reasoning.
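The Plan-and-Execute mapping above hinges on resumability: the plan (ordered checkpoint names) is fixed up front, and persisted progress lets a retry resume rather than restart. A minimal sketch of that idea, with illustrative step names; this is not documentloader's actual state machine.

```python
def run_pipeline(steps, state):
    """Plan-and-Execute with checkpoints: `steps` is the full plan
    (ordered (name, fn) pairs); `state` persists the count of completed
    steps so a retry resumes from the failure point, not from scratch."""
    for i, (name, fn) in enumerate(steps):
        if i < state.get("completed", 0):
            continue                  # already done on a prior run: skip
        fn()                          # execute this step
        state["completed"] = i + 1    # checkpoint: persist progress

ran = []
steps = [
    ("extract", lambda: ran.append("extract")),
    ("index",   lambda: ran.append("index")),
    ("notify",  lambda: ran.append("notify")),
]
state = {"completed": 1}  # simulate a resume after "extract" succeeded
run_pipeline(steps, state)
print(ran)  # ['index', 'notify']
```

In the real pipeline `state` would live in DynamoDB or S3 rather than a dict, and each step would need to be idempotent in case the process dies between `fn()` and the checkpoint write.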
Four-Gate Framework vs. Our Guardrails¶
| Gate | Our Implementation |
|---|---|
| Input validation | Handler-level: parse SQS message, validate required fields (caseId, batchId, jobId) |
| Output verification | Checkpoint validation: verify step completion before advancing state machine |
| Human approval | Not currently implemented — our pipelines are fully automated. The article suggests this is appropriate for deterministic, rule-based workflows. |
| Audit trails | CloudWatch structured logging, X-Ray tracing, SNS event history |
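The first row of the table, handler-level input validation, looks roughly like the following. The three field names come from the table; the schema and function names are otherwise illustrative, not the actual handler code.

```python
import json

REQUIRED = ("caseId", "batchId", "jobId")

def parse_message(body: str) -> dict:
    """Gate 1 sketch: parse an SQS message body and require the fields above.
    Rejecting early keeps malformed events out of the processing core."""
    msg = json.loads(body)
    missing = [k for k in REQUIRED if k not in msg]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return msg

msg = parse_message('{"caseId": 1, "batchId": 2, "jobId": 3}')
print(msg["caseId"])  # 1
```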
The absence of human approval in our pipeline is correct per the article's framework — eDiscovery document processing is high-volume, rule-based, and tolerates automated execution. Human review enters at the product level (lawyers reviewing processed documents), not the infrastructure level.
The Accuracy Tolerance Insight¶
The article's 95-98% accuracy threshold for agent suitability maps to our domain:
- Document processing — Tolerates imperfect extraction (OCR errors, metadata parsing edge cases). Humans review the final output. Suitable for agentic patterns.
- Bates stamping — Must be deterministic and sequential. Zero tolerance for gaps or duplicates. Not suitable for probabilistic agents.
- Search indexing — Tolerates some imprecision (relevance ranking is inherently fuzzy). Suitable.
- Billing/metering — Must be exact. Not suitable.
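The Bates stamping point is worth making concrete: the requirement is a gap-free, duplicate-free sequence, which is trivially satisfied by deterministic code and cannot be guaranteed by a probabilistic agent. A sketch, assuming the common prefix-plus-zero-padded-counter convention (not a Nextpoint-specific format):

```python
def bates_numbers(prefix: str, start: int, count: int, width: int = 6) -> list[str]:
    """Deterministic Bates sequence: contiguous counters, fixed-width padding.
    No model in the loop means no chance of a skipped or repeated number."""
    return [f"{prefix}{n:0{width}d}" for n in range(start, start + count)]

stamps = bates_numbers("NP", start=101, count=3)
print(stamps)  # ['NP000101', 'NP000102', 'NP000103']
assert len(set(stamps)) == len(stamps)  # no duplicates by construction
```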
Potential Application: AI-Assisted Architecture Review¶
The article's multi-agent orchestration pattern (Quarterly Investor Report example: 4 parallel agents → orchestrator → assembler → human review → sender) maps directly to a potential enhancement of our pr-review service:
- Current: 5 parallel review agents → verifier aggregator → PR comment
- Enhanced: Add domain-specific agents for eDiscovery compliance, data integrity, and multi-tenant safety. The hierarchical pattern supports this naturally.
Build vs. Buy Decision Framework¶
The article's build/buy/partner framework suggests:
- Build when workflows are highly proprietary and the org can support 8+ GenAI/ML engineers
- Buy when standardized solutions exist
- Partner for specialized capabilities
For Nextpoint: eDiscovery document processing is highly proprietary (per-case isolation, chain of custody, legal hold requirements). The NGE architecture is a "build" decision. AI-assisted tooling (pr-review, nextpoint-ai) is a "build on top of vendor models" hybrid — we build the orchestration, use vendor LLMs for reasoning.