ADR-005: Extract Bulk Operations to Lambda Service

Status

Proposed

Date

2026-03-19

Context

Bulk document operations are currently handled by Sidekiq jobs in the Rails monolith, dominated by a single 784-line BulkActionJob that handles 15+ distinct operations.

Current State

Job	Lines	What It Does
`BulkActionJob`	784	God object: labels, tags, fields, review status, privilege, confidentiality, custodians, placeholders, bates removal, redaction foldering
`BulkDeleteJob`	47	Trash/permanently delete exhibits (batches of 100)
`BulkRestoreJob`	39	Restore trashed exhibits + labels
`BulkLabelActivationJob`	16	Activate/deactivate labels
`BulkLabelDestroyJob`	28	Delete labels + archive search reports
`BulkSubreviewAssignmentJob`	23	Assign/unassign users to subreview folders

Key patterns in current code:

- All jobs extend BackgroundProcessing (Sidekiq base class)
- PerCaseModel.set_case(case_id) for multi-tenant DB connection
- Exhibits processed in slices of 100-1000
- TrackedBackgroundJob for progress tracking (polled by frontend)
- kick_off_indexing_for_exhibit_ids(slice) after each batch for ES reindexing
- Manual retry logic for MySQL lock wait timeouts (3 retries, 1s sleep)
- Redis for respawn attempt counting
- BulkActionJob checks FieldLock, safe_to_modify_confidentiality?, Nutrient integration

Trigger point: DocumentsController → Mixins::DocumentBulkManipulation → BulkActionJob.perform_async

Why Extract?

BulkActionJob is a god object — 15+ operations in one 784-line file with complex branching
Sidekiq contention — bulk ops compete with all other Sidekiq jobs on shared Redis queues
No horizontal scaling — Sidekiq workers scale with the Rails deployment, not independently
Progress tracking is poll-based — frontend polls TrackedBackgroundJob table
NGE integration already exists — confidentiality updates already call update_nutrient_confidentiality

Decision

Extract bulk operations into a Lambda-based service following the NGE service module pattern.

Architecture

Rails App
  │
  ├── POST /documents/bulk_update  (existing)
  │     │
  │     ▼
  │   SNS: BulkOperationRequested
  │     │
  │     ▼
  │   SQS Queue → Lambda Handler
  │     │
  │     ├── BulkLabelProcessor      (labels, designations, reorder)
  │     ├── BulkFieldProcessor      (shortcut, author, doc_type, notes, date)
  │     ├── BulkReviewProcessor     (review status, privilege, confidentiality)
  │     ├── BulkCustodianProcessor  (add/remove/update custodians)
  │     ├── BulkDeleteProcessor     (trash, permanent delete, restore)
  │     └── BulkTagProcessor        (add/remove tags)
  │           │
  │           ▼
  │       Per-case MySQL (writer_session)
  │           │
  │           ▼
  │       SNS: BulkOperationProgress / BulkOperationCompleted
  │           │
  │           ▼
  │       PSM (Athena) ← Rails polls for progress (existing pattern)
  │
  └── Nutrient API (for NGE confidentiality/bates)

Phase 1: Decompose BulkActionJob (in Rails first)

Before extracting to Lambda, decompose the god object into focused service classes within Rails. This refactor is low-risk because it needs no infrastructure changes:

# app/services/bulk_operations/
├── label_processor.rb        # Labels, designations, reorder
├── field_processor.rb        # Shortcut, author, doc_type, notes, date
├── review_processor.rb       # Review status, privilege, confidentiality
├── custodian_processor.rb    # Add/remove/update custodians
├── tag_processor.rb          # Add/remove tags
├── delete_processor.rb       # Trash, delete, restore
├── placeholder_processor.rb  # Non-imaged placeholders, bates removal
└── base_processor.rb         # Shared: exhibit loading, slicing, progress, ES reindex

BulkActionJob becomes a thin dispatcher that routes to the correct processor.

Phase 2: Extract to Lambda

Move each processor to a Lambda function following the NGE hexagonal pattern:

- core/: pure business logic (processor classes)
- shell/: MySQL session, Nutrient client, ES reindex trigger
- handlers/: SQS event parsing, routing, error handling
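The layering might look roughly like the minimal sketch below. It assumes the SNS topic delivers to SQS without raw message delivery, so each SQS body wraps the payload in an SNS envelope with a `Message` field. The payload keys (`operation`, `case_id`, `exhibit_ids`, `params`), the `PROCESSORS` registry, and the processor functions are hypothetical names chosen for illustration, not taken from the existing code.

```python
import json
from typing import Callable, Dict, List


# core/: pure business logic. Hypothetical stand-ins for the real processors.
def process_labels(case_id: int, exhibit_ids: List[int], params: dict) -> int:
    return len(exhibit_ids)  # placeholder: pretend every exhibit was updated


def process_tags(case_id: int, exhibit_ids: List[int], params: dict) -> int:
    return len(exhibit_ids)  # placeholder: pretend every exhibit was updated


# Routing table: operation name -> processor (operation names are assumptions).
PROCESSORS: Dict[str, Callable[[int, List[int], dict], int]] = {
    "labels": process_labels,
    "tags": process_tags,
}


# handlers/: SQS event parsing, routing, error handling.
def parse_record(record: dict) -> dict:
    """Unwrap the SNS envelope that SQS delivers in the record body."""
    envelope = json.loads(record["body"])
    return json.loads(envelope["Message"])


def handler(event: dict, context: object = None) -> dict:
    """Route each queued request to the processor registered for its operation."""
    results = []
    for record in event["Records"]:
        request = parse_record(record)
        processor = PROCESSORS.get(request["operation"])
        if processor is None:
            raise ValueError(f"unknown bulk operation: {request['operation']}")
        updated = processor(
            request["case_id"], request["exhibit_ids"], request.get("params", {})
        )
        results.append({"operation": request["operation"], "updated": updated})
    return {"results": results}
```

Keeping the routing table separate from the processors means the same core functions can be exercised in unit tests without constructing an SQS event.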

Phase 3: Progress via PSM

Replace TrackedBackgroundJob polling with PSM events (same pattern as batch processing):

- Lambda emits progress events to SNS
- PSM captures all events via Firehose → Parquet → Athena
- Rails polls Athena for progress (existing NgeCaseTrackerJob pattern)
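As an illustration of the first step, the sketch below processes exhibits slice by slice and publishes one progress event per slice, followed by a completion event. The event field names (`event_type`, `job_id`, `processed`, `total`) and the injected `publish` callable are assumptions; the actual event schema expected by PSM is not specified in this ADR.

```python
import json
from typing import Callable, Iterator, List


def slices(items: List[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_with_progress(
    job_id: str,
    exhibit_ids: List[int],
    process_slice: Callable[[List[int]], None],
    publish: Callable[[str], None],
    slice_size: int = 100,
) -> None:
    """Process exhibits slice by slice, publishing one event per slice."""
    total = len(exhibit_ids)
    done = 0
    for chunk in slices(exhibit_ids, slice_size):
        process_slice(chunk)
        done += len(chunk)
        publish(json.dumps({
            "event_type": "BulkOperationProgress",  # hypothetical schema
            "job_id": job_id,
            "processed": done,
            "total": total,
        }))
    publish(json.dumps({
        "event_type": "BulkOperationCompleted",  # hypothetical schema
        "job_id": job_id,
        "processed": done,
        "total": total,
    }))
```

In a deployed Lambda, `publish` would presumably wrap an SNS client call; injecting it keeps the loop testable without AWS credentials.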

What Stays in Rails

UI controllers — bulk edit modal, parameter validation, exhibit ID expansion
ES indexing trigger — the Lambda emits IndexingRequested events; existing indexer picks them up
Permission checks — can_bulk_update_folder? etc. stay in Rails authorization layer

Consequences

Positive

God object eliminated — 784-line BulkActionJob decomposed into focused processors
Independent scaling — Lambda scales with bulk operation volume, not Rails deployment
Consistent architecture — follows same SNS/SQS/Lambda/PSM pattern as document processing
Phase 1 is low-risk: decomposition happens within Rails first, with no infrastructure changes
Retry handling improves — SQS visibility timeout + DLQ replaces manual 3-retry MySQL lock logic

Negative

Two execution paths — during migration, some operations run in Rails, others in Lambda
Latency increase — SNS→SQS→Lambda adds ~1-2s vs direct Sidekiq enqueue
Multi-tenant complexity: the Lambda needs a Python equivalent of PerCaseModel.set_case
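A Python counterpart to PerCaseModel.set_case could take the form of a context manager that resolves the per-case database and yields a session scoped to it. The sketch below is only an illustration: the `case_<id>` naming scheme and the injected `connect` factory are hypothetical, and the real case-to-database mapping lives in the Rails app and is not shown in this ADR.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


def database_for_case(case_id: int) -> str:
    """Hypothetical naming scheme; the real mapping is not shown in this ADR."""
    if case_id <= 0:
        raise ValueError("case_id must be positive")
    return f"case_{case_id}"


@contextmanager
def writer_session(case_id: int, connect: Callable[[str], Any]) -> Iterator[Any]:
    """Open a connection to one case's database and always close it."""
    conn = connect(database_for_case(case_id))
    try:
        yield conn
        conn.commit()    # commit only if the block finished without error
    except Exception:
        conn.rollback()  # undo partial writes for this case
        raise
    finally:
        conn.close()
```

Scoping the connection to a `with` block makes it harder for a warm Lambda container to carry one case's connection into the next invocation.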

Risks

BulkActionJob has hidden coupling — 784 lines likely contain edge cases not visible from structure analysis. Phase 1 (in-Rails decomposition) mitigates this by surfacing all coupling before extraction.
ES reindexing coordination — bulk ops trigger reindexing after each slice. Must ensure Lambda→ES reindex trigger works reliably.
Nutrient API calls from Lambda — confidentiality updates require Nutrient API access. Must validate network path from Lambda VPC to Nutrient service.
