NGE vs Legacy Code Divergence Map¶

Overview¶

This document maps every location in the Rails codebase where execution diverges based on whether a case is NGE or Legacy. A case is permanently one or the other — nge_enabled? is set at case creation and never changes.

Three conditional flags control divergence:

1. `Npcase#nge_enabled?`: Is the case NGE? Used broadly across models, controllers, views, and jobs.
2. `Batch#nge_batch?`: Returns `processor_job_id.present?`. A batch-level check (NGE batches have a processor job ID assigned by `ProcessorApi.import()`).
3. `Exhibit#processing_in_nge`: Boolean column. Set to true while a document is actively being processed by Nutrient/DPS. Gates toolbar actions in the UI.
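As a rough illustration, the three flags can be modeled in plain Ruby. This is a minimal sketch, not the actual model code: the Structs stand in for the ActiveRecord models, `toolbar_locked?` is a hypothetical helper name, and the blank check approximates ActiveSupport's `present?`.

```ruby
# Plain-Ruby stand-ins for the three flags. The real classes are ActiveRecord
# models; Structs are used here only to keep the sketch self-contained.
Npcase = Struct.new(:nge_enabled) do
  # Case-level flag: fixed at creation, so callers can branch on it freely.
  def nge_enabled?
    nge_enabled == true
  end
end

Batch = Struct.new(:processor_job_id) do
  # Approximates `processor_job_id.present?` without requiring ActiveSupport.
  def nge_batch?
    !processor_job_id.to_s.strip.empty?
  end
end

Exhibit = Struct.new(:processing_in_nge) do
  # Hypothetical helper: true while Nutrient/DPS is working on the document.
  def toolbar_locked?
    processing_in_nge == true
  end
end

puts Batch.new("job-42").nge_batch?   # true
puts Batch.new(nil).nge_batch?        # false
```

Note that the batch-level check depends only on the presence of a processor job ID, so it can be evaluated without loading the case.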

Total divergence points: 85+ (43 in backend, 45+ in views/frontend)

1. Document Ingestion & Import¶

How documents enter the system — completely different pipelines.

Backend¶

#	File	Condition	NGE Path	Legacy Path
1	`batches_controller.rb:297`	`nge_batch?`	Call `nge_batch_process`: check event dupes, queue `NgeCaseTrackerJob` (2 min delay)	Call `legacy_batch_process` directly
2	`batches_controller.rb:377`	`unless nge_batch?`	Skip setting `next_check_for_complete_time_gmt`	Set next check time to 60s (polling-based completion)
3	`batch.rb:845`	—	Definition: `nge_batch?` = `processor_job_id.present?`	—
4	`batch.rb:466`	`nge_batch?`	Include `batch_process_details` in meta update	Exclude it
5	`batch.rb:644`	`nge_batch?`	Update `ImportStatus` records with batch status	Return immediately (no ImportStatus)
6	`batch_pst.rb:7`	`nge_enabled?`	Call `restructure_pst_info` (JSON-ready format)	`ExtractedPstFolderRollupCollection`
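Rows 1 and 2 describe a dispatch on `nge_batch?` followed by a scheduling decision. A minimal sketch of that branch, assuming a plain Hash in place of the Batch model (the method name `plan_batch_processing` and the returned keys are hypothetical; only the handler names and the two delays come from the table):

```ruby
TRACKER_DELAY_SECONDS  = 120   # "2 min delay" before NgeCaseTrackerJob (row 1)
LEGACY_RECHECK_SECONDS = 60    # Legacy polling interval (row 2)

# `batch` is a plain Hash standing in for the Batch model.
def plan_batch_processing(batch, now: Time.now)
  if batch[:processor_job_id].to_s.strip.empty?
    # Legacy: process directly and poll for completion.
    { handler: :legacy_batch_process,
      tracker_job_at: nil,
      next_check_at: now + LEGACY_RECHECK_SECONDS }
  else
    # NGE: hand off to the tracker job; no polling timestamp is set.
    { handler: :nge_batch_process,
      tracker_job_at: now + TRACKER_DELAY_SECONDS,
      next_check_at: nil }
  end
end
```

Returning a plan rather than performing side effects keeps the sketch testable; the real controller presumably enqueues the job and writes the timestamp directly.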

Views¶

File	NGE	Legacy
`imports/_navigation.html.erb:1`	Different import navigation	Standard navigation
`imports/select.html.erb:11`	Mailbox: "OST, PST, MBOX"	Mailbox: "PST, MBOX"
`imports/select.html.erb:13`	Production: "DAT, CSV, OPT, LOG"	Production: "DAT, CSV"
`imports/select.html.erb:41`	Hidden additional import options	Shows additional options
`imports/select.html.erb:56`	Button: "Start Import"	Button: "Next"
`batches/show.html.erb:12`	Different batch detail rendering	Standard rendering
`batches/list.html.erb:12`	Different batch list rendering	Standard rendering
`batches/_settings.html.erb:3`	"Import Details: {name}"	"Import Data Settings"
`batches/dedupe.html.erb:10`	Different dedup settings UI	Standard dedup UI
`general_settings/import.html.erb:11`	Hidden	Shows import settings

2. Batch Completion & Lifecycle¶

How batches are finalized — NGE skips all Legacy polling/retry machinery.

#	File	Condition	NGE Path	Legacy Path
7	`batch_completion_job.rb:23`	`!nge_batch?`	Skip retrying failed batch split jobs	Retry failed batch parts if max not reached
8	`batch_completion_job.rb:26`	`unless nge_batch?`	Skip in-progress check and `BatchCleanup` retry	Check for in-progress jobs; run `BatchCleanup.process`
9	`batch_completion_job.rb:62`	`!nge_batch?`	Never reschedule next check; proceed to `complete_batch`	Reschedule via `update_batch_next_check_time`
10	`batch_completion_job.rb:69`	`unless nge_batch?`	Skip `Exhibit.request_indexing`	Trigger ES reindex
11	`batch_completion_job.rb:130`	`unless nge_batch?`	Skip `backfill_email_family_id`	Backfill `family_id` on exhibits
12	`batch_completion_job.rb:134`	`nge_enabled?`	Add non-imaged placeholders for containers (pst, ost, mbox, zip, 7z, tar, rar)	Skip placeholder creation
13	`batch_completion_job.rb:164`	`unless nge_batch?`	Skip `processing_errors_array` (errors come from Athena)	Scan processing jobs, create `BatchProcessingEvent` records
14	`batch_completion_job.rb:387`	`nge_batch?`	`pdfs_to_create?` → `false` (Nutrient handles PDFs)	Check `search_hilite` for PDF generation

Why so different: NGE handles retries via SQS, indexing via the loader pipeline, error tracking via Athena, and PDF rendering via Nutrient. The Legacy polling/retry loop in batch_completion_job is entirely replaced.
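The guards in the table can be read as one control flow with the Legacy-only steps wrapped in `unless nge_batch?`. The sketch below is a simplification under stated assumptions: the method name is hypothetical, the step ordering is inferred from the line numbers rather than confirmed, and the case-level placeholder step (row 12) and PDF check (row 14) are omitted.

```ruby
# Returns the ordered steps one completion pass would take.
# Step names echo the table above; the method itself is illustrative only.
def completion_steps(nge_batch:, jobs_in_progress: false)
  steps = []

  unless nge_batch
    steps << :retry_failed_batch_parts
    if jobs_in_progress
      # Legacy polls again later instead of completing now.
      steps << :reschedule_next_check
      return steps
    end
    steps << :batch_cleanup
  end

  steps << :complete_batch

  unless nge_batch
    # NGE gets these from the loader pipeline and Athena instead.
    steps << :request_indexing
    steps << :backfill_email_family_id
    steps << :collect_processing_errors
  end

  steps
end

p completion_steps(nge_batch: true)   # [:complete_batch]
```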

3. Batch Cancellation¶

#	File	Condition	NGE Path	Legacy Path
15	`batch.rb:624`	`nge_batch?`	`cancel_nge_import` → `ProcessorApi.cancel_import()` (external API call, raises on failure)	`cancel_non_nge_import` → local DB update + `BatchStatusUpdateJob`
16	`batch_completion_job.rb:43`	`!nge_batch?`	On cancel: skip "still processing" check, cleanup immediately	Reschedule check if jobs still in progress
17	`batches_controller.rb:276`	`can_cancel_batch(nge_batch?)`	Permission check with `nge=true`	Permission check with `nge=false`
18	`batches/_batch_sidebar.html.erb:48`	`can_cancel_batch(nge_batch?)`	NGE permission check for cancel button visibility	Legacy permission check
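The practical consequence of row 15 is that an NGE cancel can raise, while a Legacy cancel is a local state change. A minimal sketch, assuming a fake API client and a Hash for the batch (`FakeProcessorApi`, `CancelError`, the `"cancelled"` status value, and the return symbols are all hypothetical):

```ruby
class CancelError < StandardError; end

# Local fake standing in for the external Processor API client.
class FakeProcessorApi
  def initialize(should_fail: false)
    @should_fail = should_fail
  end

  def cancel_import(job_id)
    raise CancelError, "processor rejected cancel for #{job_id}" if @should_fail
    :cancel_requested
  end
end

def cancel_batch(batch, api:)
  if batch[:processor_job_id]
    # NGE: external call; a failure propagates to the caller.
    api.cancel_import(batch[:processor_job_id])
  else
    # Legacy: local update, then a follow-up job (BatchStatusUpdateJob in the table).
    batch[:status] = "cancelled"
    :status_update_job_queued
  end
end
```

Callers on the NGE path therefore need a rescue (or an error response) that the Legacy path never exercised.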

Tag / Custom Field Creation During Import¶

How tag values are deduplicated during load file imports — fundamentally different approaches.

Aspect	NGE (documentloader)	Legacy (Rails workers)
Dedup mechanism	MySQL `TagDedupe` table with SHA256 hash + PK constraint	Elasticsearch query per tag value
Race condition handling	SAVEPOINT + IntegrityError catch + 3 retries	ES acts as read-after-write cache
Tagging (exhibit↔tag)	`INSERT IGNORE` bulk insert	Per-row ActiveRecord create
ES in write path?	No — ES indexed separately for search	Yes — queried per value to check existence
DB access	Direct via RDS Proxy (no Rails)	Through ActiveRecord connection pool
Performance at scale	Scales linearly with MySQL write throughput	Degrades with ES query volume (millions of queries on large imports)

Key files:

- NGE: documentloader/shell/tags_ops.py (insert_or_get_tag_id), shell/taggings_ops.py (INSERT IGNORE)
- Legacy: Rails Tag model + Elasticsearch check before creation

Why Legacy uses ES: Parallel (fork-based) workers create race conditions on tag creation, and the ES check prevents duplicates across workers. This is intentional and correct for Legacy's architecture.

Why NGE doesn't need ES: MySQL atomic constraints (TagDedupe PK + SAVEPOINT rollback) handle concurrent Lambda invocations natively. Hashing each tag name with SHA256 produces a fixed-length key, which works around MySQL's varchar(255) key length limit even though tag names can be as long as varchar(2000).

For new modules (ADR-005, ADR-006): Follow documentloader's TagDedupe pattern. Never replicate Legacy's ES-check-before-write pattern in NGE modules.
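The TagDedupe idea can be illustrated without a database. In this sketch, an in-memory table with a uniqueness check stands in for the MySQL primary key, and a raised `DuplicateKey` stands in for the IntegrityError; the class names are hypothetical and the Ruby `insert_or_get_tag_id` only borrows its name from the Python function, it is not a port of it.

```ruby
require "digest"

class DuplicateKey < StandardError; end

# In-memory stand-in for a table with a primary key on the hash column.
class TagDedupeTable
  def initialize
    @rows = {}
    @next_id = 0
    @lock = Mutex.new
  end

  # Behaves like an INSERT against a PK: raises when the key already exists.
  def insert!(hash_key)
    @lock.synchronize do
      raise DuplicateKey, hash_key if @rows.key?(hash_key)
      @rows[hash_key] = (@next_id += 1)
    end
  end

  def find(hash_key)
    @lock.synchronize { @rows[hash_key] }
  end
end

def insert_or_get_tag_id(table, tag_name, retries: 3)
  # SHA256 hex is always 64 characters, however long the tag name is.
  key = Digest::SHA256.hexdigest(tag_name)
  attempts = 0
  begin
    attempts += 1
    table.insert!(key)
  rescue DuplicateKey
    # Another writer won the insert: read its row instead of creating a second one.
    existing = table.find(key)
    return existing if existing
    retry if attempts < retries
    raise
  end
end
```

The write itself is the existence check, so there is no window between "look up" and "create" in which two writers can both decide the tag is missing.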

Document Deduplication During Import¶

How duplicate documents are detected — insert-based (NGE) vs query-based (Legacy).

Aspect	NGE (documentloader)	Legacy (Rails)
Storage	Separate `doc_dedupe` table (insert-based detection)	No separate table — queries `Exhibits` table directly
Dedup key	Composite PK: `(npcase_id, message_id, bcc, md5, doc_type)`	Query by `expansive_hash` + `email_message_id` on Exhibits
Hash algorithm	MD5 (`content_hash` or `attachments_hash`)	MD5 → then SHA256 of MD5 hashes (`expansive_hash`)
When checked	Before exhibit creation (PROCESS_STARTED checkpoint)	After exhibit creation (post-creation merge)
Duplicate handling	Skip creation, return existing exhibit IDs	Create exhibit, then link as duplicate via `ExhibitDedupeMerger`
Race condition handling	SAVEPOINT + IntegrityError + retry with backoff + force-create fallback	Optimistic locking (ActiveRecord)
Email-specific logic	`message_id` = email_message_id; `bcc` part of key (metadata-aware)	`email_message_id` query + `my_dupe?()` metadata filtering (author, date, reply_id)
Attachment logic	`message_id` = `family_id` (parent email); `md5` = content_hash	Grouped via `expansive_hash` (hash of all attachment hashes in family)
Scope	Per-case (`npcase_id` in PK)	Per-case (implicit in case DB schema)

Key files:

- NGE: documentloader/shell/exhibit_ops.py (add_exhibit(), resolve_dedupe_fields(), find_dupes())
- NGE model: core/models/db_models.py (DocDedupe class)
- Legacy: rails/app/models/exhibit.rb (dupe?(), merge_as_dupe(), ExhibitDedupeMerger)

Why NGE uses insert-based (DocDedupe table): Detecting duplicates via failed INSERT is atomic — no gap between "check" and "create" where a race condition can occur. The composite PK encodes document identity (content hash + email metadata + doc type). SAVEPOINT allows graceful rollback without aborting the entire transaction.

Why Legacy uses query-based: Legacy creates the exhibit first, then checks if it's a duplicate. This allows richer metadata-based dedup (my_dupe?() checks author, date, reply_id) but means duplicate exhibits are temporarily created and then merged.

Key behavioral difference: NGE prevents duplicate exhibits from being created. Legacy creates duplicates and then merges them. The NGE approach is more efficient for large imports with high duplicate ratios, because no DB writes are wasted on documents that will be merged anyway.

Batch settings that control dedup:

- allow_dupes on Batch: if true, skip the DocDedupe insert entirely (both NGE and Legacy)
- dedupe_using_message_id?: use email message_id for dedup
- dedupe_using_meta?: use metadata fields for dedup
- bcc_merge_on: include/exclude BCC in dedup key

For new modules: Follow documentloader's insert-based DocDedupe pattern. Pre-check dedup is cheaper than post-creation merge.
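A compact way to picture insert-based detection: attempt to claim the composite key, and treat a failed claim as "duplicate". The sketch below uses a Ruby `Set` in place of the `doc_dedupe` table; `DedupeKey`, `DocDedupeIndex`, `import_document`, and the return symbols are hypothetical names, and the SAVEPOINT/retry handling of the real loader is not modeled.

```ruby
require "set"

# Same fields as the composite primary key described above.
DedupeKey = Struct.new(:npcase_id, :message_id, :bcc, :md5, :doc_type)

class DocDedupeIndex
  def initialize
    @keys = Set.new
  end

  # Set#add? returns nil when the key is already present, which plays the
  # role of a failed INSERT on the composite primary key.
  def claim(key)
    !@keys.add?(key).nil?
  end
end

def import_document(index, key, allow_dupes: false)
  # allow_dupes skips the dedupe insert entirely, so nothing is recorded.
  return :created if allow_dupes
  index.claim(key) ? :created : :skipped_duplicate
end
```

Because `npcase_id` is part of the key, the same message in two different cases is never treated as a duplicate.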

4. Document Viewer & Page Rendering¶

How documents are displayed — Nutrient (PSPDFKit) vs S3 page images.

Backend¶

#	File	Condition	NGE Path	Legacy Path
19	`theater_processor.rb:316`	`nge_enabled`	Get image via `NextpointNutrient.get_cached_filename_for_theater`	Get image via `NextPointS3.get_cached_filename`
20	`theater_document_decorator.rb:51`	`nge_enabled`	`preload = false` (don't preload theater images)	`preload = true`
21	`exhibit.rb:2050`	`nge_doc?`	Spreadsheet URL: rewrite to `/xlsx/{id}/content.xlsx` (NGE S3 layout)	Replace extension with `.xlsx`
22	`documents_controller.rb:245`	NGE-only	`regenerate_pdf`: creates `native_pdf_ocr_job`, sets `processing_in_nge = true`	No legacy equivalent
88	`document_editor.rb:837`	`nge_enabled?`	`process_generate_pdf_options`: download PDF via Nutrient API	Render `generate_pdf_options` modal for user
89	`document_editor.rb:1174`	`nge_enabled`	`select_page_heights`: get heights from Nutrient API (`nutrient_document_info`)	Query `attachments` table for page heights

Views¶

File	NGE	Legacy
`documents/show.html.erb:57`	Render `_nge_document_toolbar` partial	Skip
`documents/_nge_document_toolbar.html.erb`	Entire partial (NGE-only)	N/A
`documents/_scrollable_page_set.html.erb:2,33,83`	Hidden (3 blocks)	Page thumbnails with S3 images
`documents/_action_bar.html.erb:52`	"Wire" button hidden	Shows "Wire" button
`documents/_exhibit_page_content_preview.html.erb:179`	Different preview rendering	Standard preview
`documents/show_document_files.html.erb:65`	Hidden	Shows document files section
`documents/_js_templates.html.erb:227`	Wire button hidden	Shows wire button
`page_notes/_sidebar_document_notes.html.erb:1`	Hidden	Shows sidebar notes

React Components¶

File	NGE	Legacy
`NgeDocumentViewPdf.jsx:12`	`isProcessingInNge` gates PDF viewer	N/A
`DocumentToolbar.jsx:87`	`isProcessingInNge` disables toolbar	Standard toolbar
`DownloadControl.jsx:13`	"Original files are not available"	"Original File Removed"
`documentViewPdfUtils.jsx:67,92`	Different PDF rendering setup	Standard rendering

5. Bates Stamping & Numbering¶

Page numbering — NGE validates against Nutrient page counts.

Backend¶

#	File	Condition	NGE Path	Legacy Path
23	`bates_stamp_job.rb:33`	`@is_nge`	Send admin emails on Nutrient API failures or page count mismatches	Skip ensure block
24	`bates_stamp_job.rb:81`	`@is_nge`	Send `bates_stamp_processing_finished_email`, return	Queue `BatesStampCompletionJob`, send notification
25	`bates_stamp_job.rb:106`	`@is_nge`	Call `NextpointNutrient.nutrient_document_info` for page count; limit bates to Nutrient pages	Use DB-based verified page count only
26	`bates_stamp_job.rb:135`	`@is_nge`	Stop stamping at Nutrient page count; reset `processing_in_nge`	Stamp all verified pages
27	`bates_stamp_job.rb:146`	`nge_enabled?`	Reset `processing_in_nge = false` after stamping	Do not touch flag
28	`bates_management_controller.rb:12`	`has_attribute?(:processing_in_nge)`	Set `processing_in_nge = true` before processing	Do not set flag
29	`exhibit.rb:508`	`nge_enabled?`	After removing bates, reset `processing_in_nge = false`	Only remove bates
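Two patterns recur in these rows: capping the stamped range at the Nutrient page count, and always resetting `processing_in_nge` afterwards. A minimal sketch of both, assuming a Hash for the exhibit (the method names are hypothetical, and taking the minimum of the two counts is an interpretation of "limit bates to Nutrient pages"):

```ruby
# Decide how many pages to stamp. For NGE, `nutrient_pages` must be supplied.
def pages_to_stamp(verified_pages:, nutrient_pages: nil, nge: false)
  return { count: verified_pages, mismatch: false } unless nge

  { count: [verified_pages, nutrient_pages].min,
    mismatch: verified_pages != nutrient_pages }
end

# Set the flag before the work and clear it afterwards, even if the block raises.
def with_processing_flag(exhibit, nge:)
  exhibit[:processing_in_nge] = true if nge
  yield
ensure
  exhibit[:processing_in_nge] = false if nge
end

p pages_to_stamp(verified_pages: 10, nutrient_pages: 8, nge: true)
```

The `ensure` matters because a flag left at true would keep the document toolbar locked after a failed stamping run.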

Views¶

File	NGE	Legacy
`label_bates/new.html.erb:158`	Hidden additional options	Shows options
`general_settings/_exhibit_stamp_template.html.erb:13`	Different stamp template	Standard template
`general_settings/update_stamp_format.html.erb:1,6`	Different stamp format	Standard format
`production_endorsement_schemes/_form.html.erb:165,199`	Different endorsement options	Standard options
`production_endorsement_schemes/placeholder_stamp_image.html.erb:25`	Different placeholder stamp	Standard stamp

Stamp Configuration (NGE-only fields)¶

#	File	Condition	NGE Path	Legacy Path
86	`general_settings_controller.rb:12`	`nge_enabled`	Set `stamp_placement` (vertical/horizontal) + `stamp_names` (array of name/position)	Skip — uses `stamp_format` + `use_background_color_for_stamp` only
87	`general_settings_controller.rb:220`	`nge_enabled?`	Set `confidentiality_stamps_position` (left/right) on ConfidentialityCode	Skip — not applicable for Legacy

6. Image Markups & Redactions¶

How annotations are applied — NGE uses Nutrient API, Legacy uses processing jobs.

#	File	Condition	NGE Path	Legacy Path
30	`image_markups_controller.rb:50`	`nge_enabled?`	Auto-redaction: set `processing_in_nge = true`, queue `AutoRedactionJob`. Other markups: `skip_processing_jobs: true`	Original markup logic with processing jobs
31	`auto_redaction_job.rb:138`	`has_attribute?(:processing_in_nge)`	Reset `processing_in_nge = false` after completion	N/A — entire job is NGE-only
32	`sync_annotation_ids_job.rb:59`	Always	Reset `processing_in_nge = false` after sync	N/A — entire job is NGE-only

Views¶

File	NGE	Legacy
`image_markups/edit.html.erb:20`	Different markup editor	Standard editor

7. Toolbar Locking¶

NGE locks toolbar actions while Nutrient is processing a document.

#	File	Condition	When `processing_in_nge = true`	When idle / Legacy
33	`toolbar_permission_helper.rb:74`	`!processing_in_nge?`	Disable "add new page"	Allow
34	`toolbar_permission_helper.rb:86`	`!processing_in_nge?`	Disable "rotate/duplicate page"	Allow
35	`toolbar_permission_helper.rb:95`	`!processing_in_nge?`	Disable "split/delete page"	Allow
36	`toolbar_permission_helper.rb:109`	`!processing_in_nge?`	Disable "add/replace native"	Allow
37	`documents/_js_templates.html.erb:205`	`nge_enabled?`	Disable export during annotation jobs	Allow
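The four helper rows share one shape: the action is allowed unless the exhibit is currently being processed. A minimal sketch of that gate, with hypothetical action symbols derived from the labels in the table:

```ruby
# Actions that are disabled while Nutrient is processing the document.
LOCKABLE_ACTIONS = %i[
  add_page rotate_page duplicate_page split_page delete_page
  add_native replace_native
].freeze

def toolbar_action_allowed?(action, processing_in_nge:)
  # Anything outside the lockable set is unaffected by the flag.
  return true unless LOCKABLE_ACTIONS.include?(action)
  !processing_in_nge
end
```

Legacy exhibits never set the flag, so every action on them falls through to "allowed".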

8. Non-Imaged Placeholders¶

How placeholder pages are created for container files and non-imaged documents.

#	File	Condition	NGE Path	Legacy Path
38	`non_imaged_placeholder.rb:31`	`nge_enabled?`	`create_instant_layer_for_nge`, set S3 path directly	`setup_placeholder_file` (create local file, upload to S3)

9. Family Linking & Batch Details¶

How email thread/family relationships are displayed.

#	File	Condition	NGE Path	Legacy Path
39	`batch_family_linking.rb:7`	`nge_enabled?`	Return linked batches with merged `username` (hash)	Raw AR relations
40	`batch_family_linking.rb:20`	`nge_enabled?`	Merge `username` and `docs` count into hashes	Raw AR query result
41	`batch_family_linking.rb:61`	`nge_enabled?`	Return `[{key:, value:}]` array (JSON-ready)	HTML string with `<br/>` separators
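Row 41 is a return-shape difference rather than a logic difference. A sketch of the two shapes, assuming the input is a Hash of label/value pairs (the method name and the `"label: value"` text layout on the Legacy side are assumptions; only the `[{key:, value:}]` shape and the `<br/>` separator come from the table):

```ruby
def family_link_details(pairs, nge_enabled:)
  if nge_enabled
    # JSON-ready: an array of {key:, value:} hashes.
    pairs.map { |key, value| { key: key, value: value } }
  else
    # Legacy: a single HTML string for direct rendering.
    pairs.map { |key, value| "#{key}: #{value}" }.join("<br/>")
  end
end
```

Any caller that handles both case types has to branch on the return type as well, since one side is an Array and the other a String.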

10. Export & Production¶

How document exports are rendered and delivered.

Backend¶

#	File	Condition	NGE Path	Legacy Path
42	`batch_notification_decorator.rb:20`	`nge_batch?`	Show description for unknown events	Show external text only
43	`share_job_mixins.rb:73`	—	Pass `nge_enabled:` flag to share job payload	—

Views¶

File	NGE	Legacy
`exports/_export.html.erb:14`	Different export rendering	Standard
`exports/_export.html.erb:40`	`size_of_nge_zips`	`export_volumes.first.file_size`
`exports/show.html.erb:28,61`	Different export detail rendering	Standard
`notification/shared_export_*.erb` (4 files)	Different export notification emails	Standard emails
`general_settings/edit_confidentiality_code.html.erb:24`	Different confidentiality code editing	Standard
`application/_download_all.html.erb:5`	Different download behavior	Standard

11. Global UI & Layout¶

Platform-wide UI differences.

File	NGE	Legacy
`layouts/application.html.erb:32`	Body class: `nge` (enables global CSS)	Body class: `legacy`
`general/_js_support.html.erb:7`	Sets `NP['is_nge_enabled'] = 'true'`	`'false'`
`general/_current_case.html.erb:3,11`	Container: `nge-case_name_for_display_container` + NGE indicator	Standard container
`general/_case_access_list.html.erb:59`	Shows NGE indicator on case list	No indicator
`general/tab_bars/_import_export_center_tab_bar.html.erb:26`	Different tab bar	Standard tab bar
`general/_banner_editor.html.erb:44`	Different banner editor	Standard
`documents/_labels_editor.html.erb:51`	Calls `handleExhibitStamp` on label save	Standard save

12. Review & Coding (Shared — No Divergence in Logic)¶

Review and coding (applying labels, privilege designations, confidentiality codes, review status) work on both NGE and Legacy cases. The business logic is identical — the only difference is the rendering layer:

Aspect	Legacy	NGE
Document rendering	Page images (TIFF/PNG) loaded from S3	PDF rendered from Nutrient with annotation overlays
Review controller	Standard page data	Sets Nutrient secrets for PDF viewer (`reviews_controller.rb:129`)
Label save	Standard save	Also calls `handleExhibitStamp` to update Nutrient bates overlay (`_labels_editor.html.erb:51`)
Bates/confidentiality display	Baked into page images	Nutrient overlay layers rendered on-demand
Coding overlay import	Same logic	Same logic (no `nge_enabled?` checks in `coding_overlays_controller.rb`)

This is an important distinction: review and coding are not divergent features — they are shared workflows where the underlying document representation differs (page images vs PDF). The business logic (label assignment, privilege tagging, review status, bulk coding) is identical across both systems.

13. Document Exchange (Wire)¶

"Wire" is the internal codebase name; "Exchange" is the user-facing product name (configured via $global_config[:wire_product_name] = 'Exchange' in nextpoint_global.yml). Same operation — transferring documents between cases.

This is a Legacy-only feature that is hidden in the NGE UI. The wire system itself has minimal NGE-specific code, but a cross-type validation prevents transfers between NGE and Legacy cases.

#	File	Condition	NGE Path	Legacy Path
90	`general_settings_controller.rb:294`	`nge_enabled != target.nge_enabled`	Block wire transfer with `db_type_mismatch` error	Same — prevents cross-type transfers in both directions

File	NGE	Legacy
`documents/_action_bar.html.erb:52`	"Exchange" button hidden	Shows "Exchange" button
`documents/_js_templates.html.erb:227`	Wire button hidden in toolbar	Shows wire button

Legacy wire transfer architecture:

Multi-phase approval workflow for transferring documents between cases:

OutgoingWire phases: initial_setup → loadfile → work_order → target_approval → fully_approved
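The phase sequence behaves like a linear state machine. A minimal sketch, assuming strictly sequential advancement (the constant and method names are hypothetical, and the optional review and work order steps are not modeled as skippable here):

```ruby
WIRE_PHASES = %i[
  initial_setup loadfile work_order target_approval fully_approved
].freeze

# Returns the phase that follows `current`, or nil once fully approved.
def next_wire_phase(current)
  index = WIRE_PHASES.index(current)
  raise ArgumentError, "unknown phase: #{current}" if index.nil?
  WIRE_PHASES[index + 1]
end

puts next_wire_phase(:loadfile)   # work_order
```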

Models: OutgoingWire (source), IncomingWire (destination), ExhibitOutgoingWire (join)

Jobs:

- WireSetupJob: Links exhibits, deduplicates, advances phases
- DocumentShareGenerationJob: Creates SQLite DB + CSV loadfile of selected exhibits
- WireConfirmationJob: Cross-case/cross-account approval handshake
- DocumentShareJob: Executes transfer: creates IncomingWire + batch (wire_transfer type) in target case, iterates SQLite DB, copies exhibits/attachments/S3 files per document
- DirectDocumentShareJob: Shortcut for intra-account transfers
- DepositionShareJob: Deposition-specific transfers

Flow: User selects exhibits → WireSetupJob → DocumentShareGenerationJob (loadfile) → optional loadfile review → optional work order approval → cross-account approval via WireConfirmationJob → DocumentShareJob copies documents to target case DB + S3.

NGE interaction: Only nextpoint_share_job_mixins.rb:73 — when wire creates a new destination case, it propagates nge_enabled from source case.

Legacy wire vs NGE documentexchanger:

Aspect	Legacy Wire	NGE documentexchanger
Trigger	User clicks Exchange → `WireSetupJob`	API Gateway (sync) + SQS (async) dual entry point
Approval workflow	Multi-phase (`OutgoingWire`: initial_setup → loadfile → work_order → target_approval → fully_approved)	Not yet integrated into Rails
DB transfer	`DocumentShareJob` iterates SQLite DB, copies exhibits one by one via `document.transfer!`	AWS Glue ETL for bulk database copy
S3 file copy	Per-document S3 copy inside `DocumentShareJob`	Per-document Lambda processors via dynamic SQS queues
Annotations	Copies page images (bates/confidentiality already baked in)	Must process via Nutrient (PDF overlays, no page images)
Infrastructure	Sidekiq jobs on shared Redis queue	Dynamic Lambda + SQS provisioned per exchange, torn down on completion
OCR	Copies existing search text	Re-OCR via Hyland Filters (source is PDF, not page images)

NGE status: documentexchanger is built but not yet integrated into the Rails app. The Legacy wire buttons are simply hidden via nge_enabled? in views.

NGE-Only Code (No Legacy Equivalent)¶

These entire components exist only for NGE cases:

Component	Type	Purpose
`NgeCaseTrackerJob`	Sidekiq job	Polls Athena for NGE processing events
`NgeExportJob`	Sidekiq job	Invokes documentexporter Lambda
`AutoRedactionJob`	Sidekiq job	Nutrient-based auto-redaction
`SyncAnnotationIdsJob`	Sidekiq job	Reconciles Nutrient annotation IDs
`NgePageService`	Service	HTTP client for documentpageservice
`ProcessorApi`	Lib	HTTP client for NGE Processor API
`ProcessorApiHelper`	Helper	Import payload builder for NGE
`Batch::AsUpload` workflow	Model	`before_commit` → `initiate_import` for NGE
`mixins/nge_batches.rb`	Controller	NGE batch listing, Athena queries
`_nge_document_toolbar.html.erb`	View	NGE document toolbar partial
`NgeDocumentViewPdf.jsx`	React	NGE PDF viewer component

Legacy Functionality Not Yet Modularized into NGE¶

The Nextpoint platform has two suites: Discovery (document processing, search, review) and Litigation (video, depositions, transcripts, treatments). Many features are common across both suites. NGE modularized the common processing workflows (import, export, exchange).

Common Functionality (Both Discovery and Litigation)¶

These features work in both case types (trial_prep = Litigation, review = Discovery):

Area	Key Components	NGE Status
Import/Upload	`ImportsController`, `BatchesController`, `S3UploadController`, `CaseFolderController`	Modularized (extractor → loader → uploader)
Export/Production	`ExportsController`, `ProductionTemplatesController`, `DocumentExportJob`	Modularized (documentexporter)
Exchange/Wire	`OutgoingWiresController`, `IncomingWiresController`, wire jobs	Built (documentexchanger, not yet integrated)
Document Viewer	`DocumentsController`, `DocumentPagesController`, `AttachmentsController`	Shared — page images (Legacy) vs Nutrient PDF (NGE)
Search	`SearchController`, `SearchAggregationController`, Elasticsearch 7.4	Legacy only
Labels/Coding	`LabelsController`, `CodingOverlaysController`, `BulkLabelsController`	Shared — same logic, different rendering
Bates/Stamps	`BatesManagementController`, `ExhibitStampingController`, `BatesStampJob`	Shared — Nutrient overlays (NGE) vs page images (Legacy)
Markups/Redactions	`ImageMarkupsController`, `HighlightsController`, `PageNotesController`	Shared — Nutrient API (NGE) vs processing jobs (Legacy)
Custodians	`CustodiansController`, `CustodianExhibitsController`	Legacy only
Custom Fields/Grid	`CustomFieldsController`, `GridColumnsController`, `GridTemplatesController`	Legacy only
Family Linking	`FamilyLinkingsController`, `FamilyLinkingJob`	Legacy only (NGE handles during ingestion)
Reporting	`CustomReportController`, `UserActivityReportsController`, `AnalyticsController`	Legacy only
Case Management	`NpcasesController`, `CasePermissionsController`, `CaseNotesController`	Legacy only (core platform)
User/Account	`AccountsController`, `UsersManagementController`, `UserLicensesController`	Legacy only (core platform)
AI	`AiAssistantController`, `ChatbotController` (Bedrock)	Legacy only
Bulk Operations	`BulkDeleteJob`, `BulkRestoreJob`, `BulkActionJob`, `BatchLabelJob`	Legacy only

Additional common features not listed above:

Area	Components	NGE Status
Document Review	`ReviewsController`, sub-review assignments	Shared — same logic, different rendering
Chronology	`ChronologyController` — timeline view	Legacy only
File Room	`FileRoomController` — virtual binders	Legacy only
Search Hit Reports	`SearchHitReportController`, searcher/post-processing jobs	Legacy only

Litigation-Specific Features (Legacy Only)¶

Feature	Components
Evidence Dashboard	`EvidenceController` (`verify_trial_prep` required)
Theater/Presentation	`TheaterController`, `Theater::PagesController`

Litigation Suite — Processing (Separate Domain, Entirely Legacy)¶

The Litigation processing workflows handle video, depositions, and trial presentation — none have NGE equivalents.

EC2 Workers:

Worker	Function
`TranscodeWorker`	Video transcoding via FFmpeg
`VideoStitchWorker`	Multi-segment video stitching
`VideoSyncWorker`	Video synchronization
`FlvConversionWorker`	FLV format conversion
`UpdateVideoAspectRatioWorker`	Video metadata update
`TranscriptParseWorker`	Deposition transcript parsing (LEF, PTX, CMS formats)
`DepositionZipWorker`	Deposition package extraction
`TreatmentWorker`	Litigation presentation images (callout/highlight composites)

Sidekiq Jobs:

Job	Function
`DesignationVideoJob`	Video designation processing
`DepositionPdfJob`	Deposition PDF generation
`DepositionTextJob`	Deposition text extraction
`DepositionShareJob`	Deposition sharing between cases
`DepositionSummaryReportJob`	Deposition summary reports
`TranscriptMetadataReportJob`	Transcript metadata reports
`DepositionDesignationMergeJob`	Merge deposition designations
`DepositionVolumeExhibitsInFolderLinkerJob`	Link exhibits in deposition folders

Discovery Suite — Not Yet Modularized¶

These Discovery features run in Legacy Rails/Sidekiq with no NGE module equivalent:

Category	Components	Description
Search	`SearchHitReportSearcherJob`, `SearchHitReportPostProcessingJob`, `SearchHitReportDeletionJob`, Elasticsearch 7.4, custom Parslet query DSL	Full-text search, hit reports
Review/Coding	Review UI, theater, coding overlays, labels, privilege — shared across NGE/Legacy	Core review workflow (same logic, different rendering)
Bulk Operations	`BulkDeleteJob`, `BulkRestoreJob`, `BulkLabelActivationJob`, `BulkLabelDestroyJob`, `BulkActionJob`, `BulkSubreviewAssignmentJob`, `BatchLabelJob`, `SubreviewSplitJob`	Mass document operations
Bates/Stamps	`BatesStampJob`, `BatesRemovalJob`, `BatesStampCompletionJob`, `BatesRemovalCompletionJob`, `ConfidentialityStatusJob`, `RedactAnnotationsJob`	Runs on both NGE and Legacy (uses Nutrient for NGE)
Document Operations	`PageDeleteJob`, `SplitDocumentOnFlagsJob`, `DocumentPdfCompletionJob`, `CodingOverlayJob`, `FamilyLinkingJob`, `CustodianUpdateJob`, `CustodianDestroyerJob`, `NearDupeTrackerJob`	Individual document manipulation
Wire/Exchange	`WireSetupJob`, `WireConfirmationJob`, `DocumentShareJob`, `DocumentShareGenerationJob`, `DirectDocumentShareJob`, `RemoteWireConfirmationJob`	Legacy wire system (documentexchanger built but not integrated)
Reports	`CustomReportJob`, `UserActivityReportJob`, `ReviewLogJob`, `PageCountReportForRelevancyJob`, `GridDataExportJob`	Custom reports, user activity
Export Utilities	`ExportCopyJob`, `PdfLambdaJob`, `CaseFolderImportJob`	Export duplication, legacy import

EC2 Workers (Discovery — Legacy only):

Worker	Function
`SpreadsheetConversionWorker`	XLS/XLSB/CSV → XLSX for spreadsheet viewer
`DocumentPropertiesUpdateWorker`	Document metadata extraction
`FileEmailWorker`	Email document files to users
`DownloadExhibitPdfWorker`	PDF download generation

Platform Services — Entirely in Rails¶

Service	Description
Authentication	Cognito SRP + session-based + HMAC-SHA1 API auth
Authorization	RBAC via Action/Role/RoleNpcase tables
User Management	Account/User/NpcaseUser CRUD, Cognito provisioning
Case Management	Create/archive/delete cases, per-case DB provisioning
Billing	Account billing, ingestion limits, plan management
AI Features	Bedrock agent, AI summaries, chatbot
Notifications	`DelayedEmailJob`, `BannerAlertJob`, email, alerts, audit logging
Admin/Ops	Background job management, EC2 monitoring, ES indexing, `DatabaseArchiveJob`

Summary: What's Modularized vs What's Not¶

MODULARIZED:                                  NOT MODULARIZED:
────────────                                  ────────────────
Processing (Stage 5):                         LITIGATION SUITE (separate domain):
  ✓ Document ingestion (extractor)              ✗ Video (transcode, stitch, sync)
  ✓ Content extraction (extractor)              ✗ Treatments (presentation images)
  ✓ DB writes + batch lifecycle (loader)
  ✓ Page image generation (uploader)           COMMON / DISCOVERY (still in Legacy):
  ✓ Page manipulation (pageservice)              ✗ Bulk operations (delete, label, restore)
  ✓ Archive extraction (unzipservice)            ✗ Bates/confidentiality (shared, Nutrient for NGE)
                                                 ✗ Wire approval workflow (exchanger not integrated)
Analysis (Stage 7):                              ✗ Reporting (custom, user activity)
  ✓ Search query parser (QLE — production)
  ◐ Search hit reports (SHR — prototype)        PLATFORM (core Rails):
  ✓ AI transcript summaries (nextpoint-ai)       ✗ Auth + RBAC + case mgmt + billing
                                                 ✗ Notifications + admin tooling
Production (Stage 8):
  ✓ Export/production (exporter)               SEPARATE PRODUCT:
                                                 ◆ Data Mining (eda + eda-front-end)
Cross-stage:                                       Own architecture, own AWS accounts
  ◐ Document exchange (exchanger, not live)

✓ = production   ◐ = built/prototype   ◆ = separate product

Summary¶

Functional Area	Backend	Views	Total	Core Difference
Document ingestion	6	10	16	ProcessorApi HTTP vs Legacy workers
Batch completion	8	0	8	External (Athena/Nutrient) vs internal polling/retry
Batch cancellation	3	1	4	External API cancel vs local DB update
Document viewer	4	12	16	Nutrient/PSPDFKit vs S3 page images
Bates stamping	7	5	12	Nutrient page count validation vs DB count
Markups/redactions	3	1	4	Nutrient API + AutoRedactionJob vs processing jobs
Toolbar locking	4	1	5	`processing_in_nge` flag gates actions
Placeholders	1	0	1	Instant Nutrient layer vs local file upload
Family linking	3	0	3	JSON hashes vs AR objects/HTML
Export/production	2	6	8	Different size calc, rendering, emails
Global UI	0	7	7	Body class, JS global, indicators
Wire transfer	0	2	2	Hidden in NGE (documentexchanger built but not yet integrated)
TOTAL	41	45	86

EDRM Mapping¶

The EDRM (Electronic Discovery Reference Model) defines 9 stages for how digital data flows through litigation. Here's how the Nextpoint platform maps to each stage, and what's modularized vs Legacy.

EDRM Stage	Nextpoint Coverage	Suite	NGE Status	Legacy Components
1. Information Governance	Not directly covered	N/A	N/A	N/A (pre-litigation)
2. Identification	Custodian management spans Collection (assignment at import) and Review (reassignment via bulk ops) — not a separate stage in Nextpoint	Common	Legacy only	`CustodiansController` (part of stages 4 + 6)
3. Preservation	S3 storage, `PendingDelete` (deletion prevention)	Common	Legacy only	S3 lifecycle, case folder management
4. Collection	Upload files to File Room, S3 case folder, cloud sources; custodian assignment	Common	Legacy only	`FileRoomController`, `S3UploadController`, `CaseFolderController`, `DropboxController`
5. Processing	Import pipeline: file extraction, OCR, dedup, de-NISTing, format conversion, family linking, page image generation	Common	Modularized	`ImportsController`, `BatchesController`; NGE: extractor → loader → uploader; Legacy: EC2 workers (Preprocess → Container → Conversion → Page)
6. Review	Document viewer, coding, labels, privilege, confidentiality, sub-reviews, bulk operations, custodian reassignment	Common	Shared logic (Nutrient vs page images)	`ReviewsController`, `LabelsController`, `CodingOverlaysController`, `CustodiansController`, bulk jobs
7. Analysis	Search, search hit reports, analytics, chronology, AI summaries, near-dupe detection	Common	Partially modularized	query-language-engine (search parser — production), search-hit-report-backend (hit reports — prototype), nextpoint-ai (transcript summaries — production). Legacy: `SearchController`, `AnalyticsController`, `ChronologyController`
8. Production	Export with bates stamps, confidentiality codes, load files, productions	Common	Modularized	`ExportsController`, `ProductionTemplatesController`, Legacy `ExhibitZipVolumeWorker`
9. Presentation	Theater mode, treatments, video depositions, designations	Litigation	Legacy only	`TheaterController`, `TreatmentsController`, `DepositionsController`

NGE Modules by EDRM Stage¶

EDRM Stage          Module(s)                           What They Replace/Add
──────────          ─────────                           ────────────────────
4. Collection       (no module — File Room,             File upload and case folder
                     S3 upload remain in Legacy)         management stay in Rails

5. Processing  ──→  documentextractor (entry point +    PreprocessWorker, ContainerWorker,
                      file conversion)                   ConversionWorker (LibreOffice, Tika)
                    unzipservice (archive extraction)    ContainerWorker for ZIP/RAR/7Z
                    documentloader (DB writes, dedup)    BatchCompletionJob, family linking
                    documentuploader (page images)       PageWorker (image gen, Nutrient)
                    documentpageservice (page ops)       Page manipulation workers

7. Analysis    ──→  query-language-engine (search        Legacy Ruby/Parslet parser in
                      parser, TypeScript ECS)             lib/search/ (production)
                    search-hit-report-backend (hit       SearchHitReportSearcherJob
                      reports, Ruby Lambda)               (prototype)
                    nextpoint-ai (AI transcript          New capability — Bedrock Claude
                      summaries, Python Lambda)           summaries (production)

8. Production  ──→  documentexporter (Lambda + Step Fn)  ExhibitZipVolumeWorker + ExhibitLoadfileWorker

Cross-stage    ──→  documentexchanger (not integrated)   Wire/exchange system (stage 4→8)
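
The stage-to-module mapping above can also be read as a simple lookup table. The sketch below only restates the diagram; the constant and method names are hypothetical and do not come from the codebase.

```ruby
# Stage-to-module lookup restating the diagram above.
# NGE_MODULES_BY_STAGE and both helper methods are hypothetical names.
NGE_MODULES_BY_STAGE = {
  4 => [],
  5 => %w[documentextractor unzipservice documentloader documentuploader documentpageservice],
  7 => %w[query-language-engine search-hit-report-backend nextpoint-ai],
  8 => %w[documentexporter]
}.freeze

# Returns the modules listed for a stage, or an empty list when the
# diagram shows no NGE module for it.
def nge_modules_for(stage)
  NGE_MODULES_BY_STAGE.fetch(stage, [])
end

# True when at least one NGE module is listed for the stage.
def modularized?(stage)
  !nge_modules_for(stage).empty?
end

(1..9).each do |stage|
  status = modularized?(stage) ? nge_modules_for(stage).join(", ") : "no NGE module"
  puts "Stage #{stage}: #{status}"
end
```

documentexchanger is left out of the hash on purpose: the diagram lists it as cross-stage rather than under a single EDRM stage.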

What This Tells Us¶

NGE tackled the compute-heavy stages first: Processing (5) and Production (8) are where the heavy lifting happens — file conversion, OCR, image generation, PDF rendering, ZIP assembly. These benefit most from Lambda/ECS auto-scaling.

Analysis (7) is now being modularized: Three new services are extracting functionality from the Rails monolith:

- query-language-engine: production, replaces Legacy Parslet parser
- search-hit-report-backend: prototype, offloads ES search + Parquet/Athena analytics
- nextpoint-ai: production, adds new AI summarization capability (Bedrock Claude)

The human-intensive stages remain in the monolith: Review (6) and Presentation (9) are interactive, UI-driven workflows where the bottleneck is human decision-making, not compute. These are well-served by the Rails monolith + Sidekiq.

Complete Platform Map by EDRM Stage¶

NEXTPOINT PLATFORM — EDRM Stage Mapping
═══════════════════════════════════════════

EDRM Stage 1: Information Governance
  (Not covered — pre-litigation)

EDRM Stage 2: Identification
  Custodian management spans stages 4 + 6 (not a separate stage in Nextpoint)

EDRM Stage 3: Preservation
  Legacy: S3 storage, PendingDelete (deletion prevention)

EDRM Stage 4: Collection
  Legacy: FileRoomController, S3UploadController, CaseFolderController,
          DropboxController, custodian assignment at upload time

EDRM Stage 5: Processing ─── MODULARIZED
  NGE:    ✓ documentextractor  — pipeline entry point, file conversion (Hyland)
          ✓ documentloader     — DB writes, batch lifecycle, dedup, family linking
          ✓ documentuploader   — Nutrient document provisioning (PDF only, no page images)
          ✓ documentpageservice — page reorder/rotate/add/remove/split (PDFBox)
          ✓ unzipservice       — archive extraction (ZIP/RAR/7Z/TAR/GZIP/BZIP2)
  Legacy: PreprocessWorker → ContainerWorker → ConversionWorker → PageWorker

EDRM Stage 6: Review ─── SHARED (same logic, different rendering)
  Legacy Rails (both NGE + Legacy cases):
          ReviewsController, LabelsController, CodingOverlaysController,
          CustodiansController, bulk ops (delete/restore/label/subreview)
  Rendering: Legacy = page images (TIFF/PNG) from S3
             NGE = PDF from Nutrient with annotation overlays

EDRM Stage 7: Analysis ─── PARTIALLY MODULARIZED
  NGE:    ✓ query-language-engine     — search query parser (TypeScript ECS)
          ◐ search-hit-report-backend — hit reports (Ruby Lambda, prototype)
          ✓ nextpoint-ai              — AI transcript summaries (Bedrock)
          ◐ neardupe                  — near-dupe detection (PySpark EMR, POC)
  Legacy: SearchController, AnalyticsController, ChronologyController,
          Elasticsearch 7.4, near-dupe (Databricks production), custom reports

EDRM Stage 8: Production ─── MODULARIZED
  NGE:    ✓ documentexporter — Step Functions + ECS Fargate
  Legacy: ExhibitZipVolumeWorker + ExhibitLoadfileWorker
  Key:    Legacy downloads pre-stamped page images from S3
          NGE renders from Nutrient PDF with bates/confidentiality overlays

EDRM Stage 9: Presentation ─── LEGACY ONLY (Litigation suite)
  Legacy: TheaterController, TreatmentsController, DepositionsController,
          video transcoding/stitching/sync, transcript parsing (LEF/PTX/CMS)

Cross-Stage:
  NGE:    ◐ documentexchanger — document exchange (built, not integrated)
  Legacy: OutgoingWire/IncomingWire, WireSetupJob, DocumentShareJob

SEPARATE PRODUCT — Data Mining (own AWS accounts):
  ◆ eda           — Ruby Lambda + Batch + dtSearch (Stages 4-8)
  ◆ eda-front-end — TypeScript SPA + 53 Lambda API + DynamoDB

✓ = production   ◐ = built/prototype   ◆ = separate product

Stages 1-3 are thin: Information Governance, Identification, and Preservation are lightly covered — Nextpoint focuses on stages 4-9 (Collection through Presentation).

Mapping Divergences to NGE Service Modules¶

Each functional divergence area maps to one or more NGE service modules that replaced the Legacy behavior. Some modules handle multiple functional areas.

By NGE Module¶

documentextractor (via ProcessorApi) — Pipeline Entry Point¶

Handles: Ingestion trigger + Cancellation + Pipeline orchestration

Functional Area	How documentextractor handles it
Document ingestion (16 pts)	`ProcessorApi.import()` calls documentextractor's `POST /import` endpoint. documentextractor assigns a worker, extracts content (text, metadata, file conversion via Hyland Filters), and publishes SNS events. These fan out to documentloader (DB writes via SQS), documentuploader (page images via SQS), and PSM (event capture via Firehose). Replaces Legacy's `PreprocessWorker` → `ContainerWorker` → `ConversionWorker` → `PageWorker` chain.
Batch cancellation (4 pts)	`ProcessorApi.cancel_import()` calls `DELETE /import/{case}/{job}/{batch}`. documentextractor tears down the processing pipeline. Replaces Legacy's local DB status update + `BatchStatusUpdateJob`.

Key insight: documentextractor is the NGE entry point from Rails — it's the service that ProcessorApi talks to. It publishes SNS events that fan out to documentloader (DB writes), documentuploader (page images), and PSM (Firehose event capture). Each downstream module also publishes its own events for further subscribers.
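
The fan-out described here can be illustrated with a small in-memory publish/subscribe sketch. It is not the real SNS/SQS wiring: the `Topic` class, the `build_pipeline` helper, the event type string, and the handler bodies are all hypothetical, and only the three subscriber roles come from the text.

```ruby
# Minimal in-memory stand-in for an SNS topic: every subscriber receives
# every published event. Topic and build_pipeline are hypothetical names.
class Topic
  def initialize
    @subscribers = {}
  end

  # Register a named handler (a block taking the event hash).
  def subscribe(name, &handler)
    @subscribers[name] = handler
  end

  # Deliver the event to all subscribers and return the names reached.
  def publish(event)
    @subscribers.each_value { |handler| handler.call(event) }
    @subscribers.keys
  end
end

# Wire up the three downstream consumers named in the text. Each handler
# just records what it would do.
def build_pipeline(log)
  topic = Topic.new
  topic.subscribe(:documentloader)   { |e| log << [:documentloader, :db_write, e[:doc_id]] }
  topic.subscribe(:documentuploader) { |e| log << [:documentuploader, :provision_nutrient, e[:doc_id]] }
  topic.subscribe(:psm)              { |e| log << [:psm, :capture_event, e[:doc_id]] }
  topic
end

log = []
reached = build_pipeline(log).publish({ type: "document_extracted", doc_id: 42 })
puts reached.inspect
log.each { |entry| puts entry.inspect }
```

The point of the sketch is that the publisher knows nothing about its consumers, which is why Rails only needs to talk to documentextractor to start ingestion.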

documentloader (downstream from documentextractor)¶

Handles: Batch lifecycle + DB writes + Family linking

Functional Area	How documentloader handles it
Batch completion (8 pts)	Job processor manages batch lifecycle — creates SQS/Lambda per batch, monitors queue depth, does multi-pass DLQ redrive, atomic teardown. Replaces Legacy's polling loop (`next_check_for_complete_time_gmt`), `BatchCleanup.process`, and `Exhibit.request_indexing`.
Family linking (3 pts)	documentloader assigns `family_id` during ingestion via email thread detection. Replaces Legacy's `backfill_email_family_id` post-processing.

Key insight: documentloader's job processor replaces the Legacy batch polling/retry/completion/cleanup machinery. Combined with documentextractor's ingestion trigger, these two modules account for 31 of the 86 divergence points.
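
To make the "multi-pass DLQ redrive" idea concrete, here is a toy simulation of a per-batch queue. It is an assumption-laden sketch, not documentloader's implementation: the method name, the pass limit, and the block contract (return true on success) are all hypothetical.

```ruby
# Hypothetical simulation of a per-batch queue with multi-pass DLQ redrive.
# The block receives (message, pass_number) and returns true on success.
def drain_batch(messages, max_redrive_passes: 3)
  queue = messages.dup
  dead_letters = []
  passes = 0

  loop do
    # Process everything currently queued; failures land in the DLQ.
    until queue.empty?
      message = queue.shift
      dead_letters << message unless yield(message, passes)
    end

    break if dead_letters.empty? || passes >= max_redrive_passes

    # Redrive: move dead letters back onto the main queue for another pass.
    passes += 1
    queue = dead_letters
    dead_letters = []
  end

  { complete: dead_letters.empty?, redrive_passes: passes, failed: dead_letters }
end

# Example: message "b" fails on the first attempt and succeeds after one redrive.
result = drain_batch(%w[a b c]) { |msg, pass| msg != "b" || pass >= 1 }
puts result.inspect
```

Unlike the Legacy polling loop, completion here is derived from queue state (main queue empty, DLQ empty or retries exhausted) rather than from a timer that rechecks the batch.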

documentuploader (Nutrient/PSPDFKit infrastructure)¶

Handles: Document viewing + Placeholders + Infrastructure for bates/markups

Functional Area	How documentuploader handles it
Document viewer (16 pts)	This is the fundamental rendering shift. Legacy stores individual page images (TIFF/PNG) on S3 — bates stamps, confidentiality codes, and coding are applied directly to those image files. NGE has only the PDF in Nutrient — no individual page images exist. All annotations (bates, confidentiality, coding) are Nutrient overlays rendered on-demand. documentuploader provisions the Nutrient document; Rails reads via `NextpointNutrient.get_cached_filename_for_theater`.
Placeholders (1 pt)	NGE creates Nutrient layers directly (`create_instant_layer_for_nge`) instead of generating local placeholder files and uploading to S3.
Bates/markups infrastructure	documentuploader provisions the Nutrient document (`nutrient_id = document_{case}_{batch}_{nge_doc}_{id}`) that bates stamping and markups operate against.

Key insight: documentuploader doesn't just upload — it establishes the Nutrient document that the entire NGE document viewing, stamping, and annotation stack depends on. Every NextpointNutrient.* call in the divergence map exists because documentuploader set up the Nutrient document.
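
As a small illustration of the identifier pattern quoted in the table (`document_{case}_{batch}_{nge_doc}_{id}`), the helper below assembles an ID from its four components. The method name, keyword names, and sample values are hypothetical; only the pattern itself comes from this document.

```ruby
# Builds an identifier in the shape quoted above:
#   document_{case}_{batch}_{nge_doc}_{id}
# nutrient_document_id and its keyword names are hypothetical.
def nutrient_document_id(case_id:, batch_id:, nge_doc_id:, id:)
  parts = [case_id, batch_id, nge_doc_id, id]
  if parts.any? { |part| part.nil? || part.to_s.empty? }
    raise ArgumentError, "all four components are required"
  end

  "document_#{parts.join('_')}"
end

# Sample values are made up for the example.
puts nutrient_document_id(case_id: 1201, batch_id: 88, nge_doc_id: 5, id: 9001)
```

Rejecting blank components up front avoids producing an ID such as `document_1201__5_9001` that would silently point at nothing.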

documentpageservice (via NgePageService)¶

Handles: Page manipulation + triggers bates/OCR workflows

Functional Area	How documentpageservice handles it
Document viewer — page operations	`NgePageService.process_nge_page_job()` for reorder, rotate, add, remove, split. Called from `attachment.rb:627` and `document_natives_controller.rb:57`. Sets `processing_in_nge = true` while operating.
Bates stamping (12 pts)	Rails calls `NextpointNutrient.nutrient_document_info()` for page counts (set up by documentuploader), then stamps via `ExhibitNutrientAction` concern. documentpageservice handles the underlying page manipulation when pages need OCR or regeneration (`native_pdf_ocr_job`).
Toolbar locking (5 pts)	The `processing_in_nge` flag is set whenever documentpageservice is processing a document. This gates all toolbar actions (add/rotate/split/delete/replace) in the UI until the operation completes.

Key insight: documentpageservice is the reason processing_in_nge exists. Every toolbar lock and unlock in the divergence map is triggered by a documentpageservice operation starting or completing.
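
The locking behaviour can be sketched as a flag that is raised for the duration of a page operation and always cleared afterwards. This is a simplified, synchronous illustration: in the real system the flag is presumably cleared when the asynchronous job completes. `ExhibitStub`, `with_page_job`, and `available_toolbar_actions` are hypothetical names; only `processing_in_nge` and the list of gated actions come from the text.

```ruby
# Hypothetical stand-in for an Exhibit row; only processing_in_nge comes
# from the document.
ExhibitStub = Struct.new(:id, :processing_in_nge)

# The toolbar actions the text says are gated by the flag.
TOOLBAR_ACTIONS = %i[add rotate split delete replace].freeze

# Toolbar actions are available only while no page job is in flight.
def available_toolbar_actions(exhibit)
  exhibit.processing_in_nge ? [] : TOOLBAR_ACTIONS
end

# Set the flag for the duration of a page operation and always clear it,
# even if the operation raises.
def with_page_job(exhibit)
  exhibit.processing_in_nge = true
  yield exhibit
ensure
  exhibit.processing_in_nge = false
end

exhibit = ExhibitStub.new(7, false)
with_page_job(exhibit) do |e|
  puts "during: #{available_toolbar_actions(e).inspect}"
end
puts "after:  #{available_toolbar_actions(exhibit).inspect}"
```

The `ensure` clause is the important design choice: a failed page job that left the flag set would lock the toolbar indefinitely.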

documentexporter (via NgeExportJob)¶

Handles: Export/production

Functional Area	How documentexporter handles it
Export/production (8 pts)	`NgeExportJob` invokes `{region}-{env}-nge-export-lambda` async. Step Functions + ECS Fargate handle image conversion, PDF rendering, ZIP assembly. Replaces Legacy's `ExhibitZipVolumeWorker` + `ExhibitLoadfileWorker`. Different export size calculation (`size_of_nge_zips` vs `export_volumes.first.file_size`) and notification emails.

Key difference — page images vs PDF: Legacy stores individual page images (TIFF/PNG) on S3 with bates stamps, confidentiality codes, and coding applied directly to the image files. NGE only has the PDF document in Nutrient — no individual page images exist. So documentexporter must use Nutrient to overlay bates/confidentiality/coding annotations onto the PDF when generating export images. This is why the export rendering and size calculation differ between NGE and Legacy.
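
The two export differences named in the table (the Lambda name pattern and the size calculation) can be sketched as follows. The structs and method names are hypothetical, and treating `size_of_nge_zips` as a single precomputed number is an assumption; only the name pattern and the two size expressions come from this document.

```ruby
# Hypothetical structs standing in for the export records.
Volume = Struct.new(:file_size)
ExportStub = Struct.new(:nge_enabled, :size_of_nge_zips, :export_volumes)

# Pattern from the text: {region}-{env}-nge-export-lambda
def export_lambda_name(region:, env:)
  "#{region}-#{env}-nge-export-lambda"
end

# NGE reads size_of_nge_zips; Legacy reads the first export volume's size.
def export_size(export)
  if export.nge_enabled
    export.size_of_nge_zips
  else
    export.export_volumes.first&.file_size
  end
end

# Sample values are made up for the example.
nge    = ExportStub.new(true, 5_000_000, [])
legacy = ExportStub.new(false, nil, [Volume.new(3_200_000)])

puts export_lambda_name(region: "us-east-1", env: "staging")
puts export_size(nge)
puts export_size(legacy)
```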

documentexchanger¶

Handles: Wire transfer replacement

Functional Area	How documentexchanger handles it
Wire transfer (2 pts)	documentexchanger (dynamic Lambda+SQS provisioning per exchange, plus AWS Glue ETL) is built to replace the Legacy wire transfer system but is not yet integrated. Until it is, the Legacy wire buttons are simply hidden in the NGE UI.

unzipservice¶

Part of the ingestion pipeline, invoked by documentextractor:

Module	Role in divergence
unzipservice	Archive extraction during ingestion. Replaces Legacy's `ContainerWorker` for ZIP/RAR/7Z.

Nutrient (PSPDFKit) — Cross-Cutting Service¶

Nutrient is not an NGE module but a SaaS dependency that multiple modules and Rails itself call directly. It appears in 37 of the 86 divergence points:

Caller	Nutrient Usage
documentuploader	Provisions documents, generates page images
Rails — ExhibitNutrientAction	Bates stamping, confidentiality stamps, page labels
Rails — BatesStampJob	Page count validation against Nutrient
Rails — AutoRedactionJob	Term and pattern redactions
Rails — SyncAnnotationIdsJob	Annotation ID reconciliation
Rails — SplitDocumentOnFlagsJob	Document splitting via Nutrient API
Rails — theater_processor	Theater view image retrieval
Rails — NativePlaceholder	Non-imaged placeholder layer creation

Summary: Module → Functional Areas¶

documentextractor ───────┬── Document ingestion (16)   ← entry point via ProcessorApi
(ProcessorApi)           └── Batch cancellation (4)
                                                        Total: 20 points

documentloader ──────────┬── Batch completion (8)       ← downstream from extractor
(job processor)          └── Family linking (3)
                                                        Total: 11 points

documentuploader ────────┬── Document viewer (16)
(Nutrient provisioning)  └── Placeholders (1)
                                                        Total: 17 points

documentpageservice ─────┬── Bates stamping (12)
(NgePageService)         ├── Toolbar locking (5)
                         └── Markups/redactions (4)
                                                        Total: 21 points

documentexporter ────────── Export/production (8)
(NgeExportJob → Lambda)                                 Total: 8 points

documentexchanger ───────── Wire transfer (2)
                                                        Total: 2 points

No NGE module ───────────── Global UI (7)
(CSS/JS only)                                           Total: 7 points

Note: For ingestion, Rails talks only to documentextractor (via ProcessorApi); documentloader events come back via Athena. unzipservice is invoked by documentextractor for archive extraction and is transparent to Rails.
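
The per-module totals in the summary can be cross-checked against the 86-point figure with a few lines of Ruby. This is only a sanity-check sketch: the hash restates the numbers from the summary diagram, and the constant names are hypothetical.

```ruby
# Points per module and functional area, copied from the summary diagram.
# MODULE_POINTS, TOTALS_BY_MODULE and GRAND_TOTAL are hypothetical names.
MODULE_POINTS = {
  "documentextractor"   => { "Document ingestion" => 16, "Batch cancellation" => 4 },
  "documentloader"      => { "Batch completion" => 8, "Family linking" => 3 },
  "documentuploader"    => { "Document viewer" => 16, "Placeholders" => 1 },
  "documentpageservice" => { "Bates stamping" => 12, "Toolbar locking" => 5, "Markups/redactions" => 4 },
  "documentexporter"    => { "Export/production" => 8 },
  "documentexchanger"   => { "Wire transfer" => 2 },
  "No NGE module"       => { "Global UI" => 7 }
}.freeze

# Sum the areas under each module, then sum the modules.
TOTALS_BY_MODULE = MODULE_POINTS.transform_values { |areas| areas.values.sum }.freeze
GRAND_TOTAL = TOTALS_BY_MODULE.values.sum

TOTALS_BY_MODULE.each { |name, points| puts format("%-20s %2d", name, points) }
puts format("%-20s %2d", "TOTAL", GRAND_TOTAL)
```

The module totals (20, 11, 17, 21, 8, 2, 7) add up to 86, which matches the total used throughout this section.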
