# ADR-005: Extract Bulk Operations to Lambda Service

## Status

Proposed

## Date

2026-03-19

## Context
Bulk document operations are currently handled by Sidekiq jobs in the Rails monolith,
dominated by a single 784-line BulkActionJob that handles 15+ distinct operations.
### Current State

| Job | Lines | What It Does |
|---|---|---|
| `BulkActionJob` | 784 | God object: labels, tags, fields, review status, privilege, confidentiality, custodians, placeholders, bates removal, redaction foldering |
| `BulkDeleteJob` | 47 | Trash/permanently delete exhibits (batches of 100) |
| `BulkRestoreJob` | 39 | Restore trashed exhibits + labels |
| `BulkLabelActivationJob` | 16 | Activate/deactivate labels |
| `BulkLabelDestroyJob` | 28 | Delete labels + archive search reports |
| `BulkSubreviewAssignmentJob` | 23 | Assign/unassign users to subreview folders |
Key patterns in the current code:

- All jobs extend `BackgroundProcessing` (Sidekiq base class)
- `PerCaseModel.set_case(case_id)` for multi-tenant DB connection
- Exhibits processed in slices of 100-1000
- `TrackedBackgroundJob` for progress tracking (polled by frontend)
- `kick_off_indexing_for_exhibit_ids(slice)` after each batch for ES reindexing
- Manual retry logic for MySQL lock wait timeouts (3 retries, 1s sleep)
- Redis for respawn attempt counting
- `BulkActionJob` checks `FieldLock`, `safe_to_modify_confidentiality?`, and Nutrient integration

Trigger point: `DocumentsController` → `Mixins::DocumentBulkManipulation` → `BulkActionJob.perform_async`
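The manual lock-wait retry pattern listed above can be sketched as follows. This is a minimal illustration of the technique, not the actual code: the method name, the error-message matching, and the `delay` parameter are assumptions.

```ruby
# Sketch of the manual retry pattern: retry a batch up to 3 times on
# MySQL "Lock wait timeout" errors, sleeping between attempts.
# Method name, error matching, and delay handling are illustrative.
MAX_LOCK_RETRIES = 3

def with_lock_retries(delay: 1)
  attempts = 0
  begin
    yield
  rescue StandardError => e
    # Only lock-wait timeouts are retried; everything else re-raises.
    raise unless e.message.include?("Lock wait timeout")
    attempts += 1
    raise if attempts >= MAX_LOCK_RETRIES
    sleep delay
    retry
  end
end

# Hypothetical usage: each slice of exhibit ids updated inside the wrapper.
# with_lock_retries { Exhibit.where(id: slice).update_all(label_id: label.id) }
```

Under the Decision below, this hand-rolled loop is what SQS visibility timeouts and a DLQ would replace.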
### Why Extract?

- `BulkActionJob` is a god object — 15+ operations in one 784-line file with complex branching
- Sidekiq contention — bulk ops compete with all other Sidekiq jobs on shared Redis queues
- No horizontal scaling — Sidekiq workers scale with the Rails deployment, not independently
- Progress tracking is poll-based — the frontend polls the `TrackedBackgroundJob` table
- NGE integration already exists — confidentiality updates already call `update_nutrient_confidentiality`
## Decision

Extract bulk operations into a Lambda-based service following the NGE service module pattern.

### Architecture
```
Rails App
 │
 ├── POST /documents/bulk_update (existing)
 │     │
 │     ▼
 │   SNS: BulkOperationRequested
 │     │
 │     ▼
 │   SQS Queue → Lambda Handler
 │     │
 │     ├── BulkLabelProcessor     (labels, designations, reorder)
 │     ├── BulkFieldProcessor     (shortcut, author, doc_type, notes, date)
 │     ├── BulkReviewProcessor    (review status, privilege, confidentiality)
 │     ├── BulkCustodianProcessor (add/remove/update custodians)
 │     ├── BulkDeleteProcessor    (trash, permanent delete, restore)
 │     └── BulkTagProcessor       (add/remove tags)
 │     │
 │     ▼
 │   Per-case MySQL (writer_session)
 │     │
 │     ▼
 │   SNS: BulkOperationProgress / BulkOperationCompleted
 │     │
 │     ▼
 │   PSM (Athena) ← Rails polls for progress (existing pattern)
 │
 └── Nutrient API (for NGE confidentiality/bates)
```
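The Rails-side entry point in this architecture publishes an event instead of enqueuing a Sidekiq job. A minimal sketch of that publish step, assuming an `aws-sdk-sns`-style client; the class name, topic ARN wiring, and payload keys are assumptions, not the existing API:

```ruby
require "json"

# Sketch: Rails publishes a BulkOperationRequested event to SNS in place of
# BulkActionJob.perform_async. Payload shape is illustrative.
class BulkOperationPublisher
  def initialize(sns_client:, topic_arn:)
    @sns = sns_client
    @topic_arn = topic_arn
  end

  def publish(case_id:, operation:, exhibit_ids:, params: {})
    message = {
      event: "BulkOperationRequested",
      case_id: case_id,
      operation: operation,        # e.g. "add_label", "update_confidentiality"
      exhibit_ids: exhibit_ids,    # expanded by Rails before publishing
      params: params
    }
    @sns.publish(topic_arn: @topic_arn, message: JSON.generate(message))
  end
end
```

The client is injected so the controller mixin can be tested with a stub; exhibit ID expansion stays in Rails, as noted under "What Stays in Rails" below.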
### Phase 1: Decompose BulkActionJob (in Rails first)

Before extracting to Lambda, decompose the god object into focused service classes within Rails. This is zero-risk refactoring:

```
# app/services/bulk_operations/
├── label_processor.rb        # Labels, designations, reorder
├── field_processor.rb        # Shortcut, author, doc_type, notes, date
├── review_processor.rb       # Review status, privilege, confidentiality
├── custodian_processor.rb    # Add/remove/update custodians
├── tag_processor.rb          # Add/remove tags
├── delete_processor.rb       # Trash, delete, restore
├── placeholder_processor.rb  # Non-imaged placeholders, bates removal
└── base_processor.rb         # Shared: exhibit loading, slicing, progress, ES reindex
```

`BulkActionJob` becomes a thin dispatcher that routes each operation to the correct processor.
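The thin-dispatcher shape might look like the sketch below. The operation names, the `OPERATIONS` table, and the processor call interface are assumptions; only the processor file names come from the tree above.

```ruby
# Sketch: a thin dispatcher mapping each requested operation to one of the
# focused processors from Phase 1. Operation keys are illustrative.
module BulkOperations
  class UnknownOperation < StandardError; end

  OPERATIONS = {
    "add_label"            => :label_processor,
    "update_field"         => :field_processor,
    "update_review_status" => :review_processor,
    "update_custodians"    => :custodian_processor,
    "add_tags"             => :tag_processor,
    "trash"                => :delete_processor
  }.freeze

  # registry maps processor keys to callables (injected for testability).
  def self.dispatch(operation, exhibit_ids, params, registry:)
    key = OPERATIONS.fetch(operation) { raise UnknownOperation, operation }
    registry.fetch(key).call(exhibit_ids, params)
  end
end
```

An unknown operation fails loudly rather than falling through branching logic, which is one of the main wins over the 784-line `case` structure.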
### Phase 2: Extract to Lambda

Move each processor to a Lambda function following the NGE hexagonal pattern:

- `core/` — pure business logic (processor classes)
- `shell/` — MySQL session, Nutrient client, ES reindex trigger
- `handlers/` — SQS event parsing, routing, error handling
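A sketch of the `handlers/` layer, shown in Ruby for consistency with the rest of this ADR. Each SQS record wraps an SNS envelope whose `Message` field holds the actual payload; the `router` argument is injected purely for testability (a real entry point would use the standard `handler(event:, context:)` signature and close over its dependencies).

```ruby
require "json"

# Sketch: parse SQS records (each wrapping an SNS envelope) and route the
# decoded payload to a core processor. Envelope field names follow the
# standard SNS-to-SQS delivery format; routing details are illustrative.
module Handlers
  def self.bulk_operation(event:, context:, router:)
    event.fetch("Records").each do |record|
      sns_envelope = JSON.parse(record.fetch("body"))      # SQS body = SNS envelope
      payload = JSON.parse(sns_envelope.fetch("Message"))  # actual event payload
      router.call(payload.fetch("operation"), payload)
    end
    { "status" => "ok" }
  end
end
```

A raised error here leaves the message on the queue, so SQS redelivery and the DLQ handle retries instead of hand-rolled loops.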
### Phase 3: Progress via PSM

Replace `TrackedBackgroundJob` polling with PSM events (same pattern as batch processing):

- Lambda emits progress events to SNS
- PSM captures all events via Firehose → Parquet → Athena
- Rails polls Athena for progress (existing `NgeCaseTrackerJob` pattern)
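The per-slice emission step could be sketched as follows; the event names `BulkOperationProgress` and `BulkOperationCompleted` come from the architecture diagram above, while the class, field names, and completion rule are assumptions.

```ruby
require "json"

# Sketch: after each processed slice, emit a progress event to SNS so PSM
# (Firehose -> Parquet -> Athena) can capture it. Field names illustrative.
class ProgressEmitter
  def initialize(sns_client:, topic_arn:, job_id:, total:)
    @sns, @topic_arn, @job_id, @total = sns_client, topic_arn, job_id, total
    @done = 0
  end

  def slice_completed(count)
    @done += count
    # Final slice flips the event type so Rails can stop polling.
    event = @done >= @total ? "BulkOperationCompleted" : "BulkOperationProgress"
    @sns.publish(
      topic_arn: @topic_arn,
      message: JSON.generate(event: event, job_id: @job_id,
                             processed: @done, total: @total)
    )
  end
end
```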
## What Stays in Rails

- UI controllers — bulk edit modal, parameter validation, exhibit ID expansion
- ES indexing trigger — the Lambda emits `IndexingRequested` events; the existing indexer picks them up
- Permission checks — `can_bulk_update_folder?` etc. stay in the Rails authorization layer
## Consequences

### Positive

- God object eliminated — the 784-line `BulkActionJob` is decomposed into focused processors
- Independent scaling — Lambda scales with bulk operation volume, not the Rails deployment
- Consistent architecture — follows the same SNS/SQS/Lambda/PSM pattern as document processing
- Phase 1 is zero-risk — decomposition happens within Rails first, with no infrastructure changes
- Retry handling improves — SQS visibility timeout + DLQ replaces the manual 3-retry MySQL lock logic
### Negative

- Two execution paths — during migration, some operations run in Rails while others run in Lambda
- Latency increase — SNS → SQS → Lambda adds roughly 1-2 s versus a direct Sidekiq enqueue
- Multi-tenant complexity — the Lambda needs an equivalent of `PerCaseModel.set_case` in Python
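The ADR anticipates a Python port of `PerCaseModel.set_case`; purely to keep this document's examples in one language, the same idea is sketched in Ruby below. The `case_<id>` database naming convention and config shape are assumptions about what the Rails helper does.

```ruby
# Sketch of a PerCaseModel.set_case equivalent for the Lambda shell/ layer:
# resolve the per-case database config from a case_id before opening a
# writer session. Naming convention and config keys are illustrative.
class CaseConnection
  def initialize(secrets:)
    @secrets = secrets  # e.g. host/user/password fetched from Secrets Manager
  end

  # Maps a tenant case_id to its dedicated schema, mirroring what
  # PerCaseModel.set_case(case_id) does in Rails.
  def config_for(case_id)
    raise ArgumentError, "case_id required" if case_id.nil?
    @secrets.merge(database: "case_#{case_id}")
  end
end
```

Whichever language hosts this, the key constraint is the same as in Rails: every query in a bulk operation must run against the one tenant's schema selected from the event's `case_id`.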
## Risks

- `BulkActionJob` has hidden coupling — 784 lines likely contain edge cases not visible from structural analysis. Phase 1 (in-Rails decomposition) mitigates this by surfacing all coupling before extraction.
- ES reindexing coordination — bulk ops trigger reindexing after each slice; the Lambda → ES reindex trigger must work reliably.
- Nutrient API calls from Lambda — confidentiality updates require Nutrient API access; the network path from the Lambda VPC to the Nutrient service must be validated.