
Elasticsearch Cluster Management — Reference Implementation

Overview

Self-managed Elasticsearch 7.x, using shared physical indexes with per-case filtered aliases. ~125 physical indexes, 30+ TB, and 10B+ documents across 3 independent regional deployments.

Architecture

Index Structure

Shared indexes, per-case aliases:

Physical Index:  production_exhibits_20200101_00001  (multiple cases share this)
Case Alias:      production_{npcase_id}_exhibits     (filtered by npcase_id term)

  • Each physical index holds documents from many cases
  • Per-case isolation via filtered aliases: { term: { npcase_id: npcase_id } }
  • NOT index-per-case — shared indexes with filtered routing
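As a rough illustration of the filtered-alias setup above, the sketch below builds the request body that the Elasticsearch _aliases API accepts for adding one per-case alias. The helper name filtered_alias_body and its keyword arguments are hypothetical and not taken from the codebase; only the alias naming pattern and the term filter come from this page.

```ruby
# Hypothetical helper: builds an _aliases request body that adds one
# filtered alias pointing a single case at a shared physical index.
def filtered_alias_body(physical_index:, npcase_id:, environment: "production", type: "exhibits")
  {
    actions: [
      {
        add: {
          index: physical_index,
          alias: "#{environment}_#{npcase_id}_#{type}",
          # Searches through the alias only see this case's documents.
          filter: { term: { npcase_id: npcase_id } }
        }
      }
    ]
  }
end

body = filtered_alias_body(physical_index: "production_exhibits_20200101_00001", npcase_id: 4242)
puts body[:actions].first[:add][:alias]
```

With the elasticsearch gem, a body like this would typically be sent with client.indices.update_aliases(body: body).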

Index Naming Convention

{environment}_{type}_{identifier}_{sequential_number}
  production_exhibits_20200101_00001
  production_deposition_volumes_20200101_00001
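The convention above can be parsed and incremented mechanically. The following is a minimal sketch, assuming the identifier is always 8 digits, the sequence number is always 5 zero-padded digits, and the environment contains no underscore. The helper names are hypothetical, and whether the real code keeps the same identifier when it creates the next index is not stated here.

```ruby
# Assumed shape: {environment}_{type}_{8-digit identifier}_{5-digit sequence}
INDEX_NAME_PATTERN =
  /\A(?<environment>[a-z]+)_(?<type>[a-z_]+)_(?<identifier>\d{8})_(?<sequence>\d{5})\z/

# Returns the name's components as a hash, or nil if it does not match.
def parse_index_name(name)
  m = INDEX_NAME_PATTERN.match(name)
  return nil unless m

  {
    environment: m[:environment],
    type: m[:type],
    identifier: m[:identifier],
    sequence: m[:sequence].to_i
  }
end

# Builds the name with the sequence number incremented by one.
def next_index_name(name)
  parts = parse_index_name(name)
  raise ArgumentError, "unrecognized index name: #{name}" unless parts

  format("%s_%s_%s_%05d",
         parts[:environment], parts[:type], parts[:identifier], parts[:sequence] + 1)
end

puts next_index_name("production_exhibits_20200101_00001")
```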

Index Types

  • Exhibits — Document metadata + page-level search text
  • DepositionVolumes — Transcript search

Parent-Child (Join Field)

Parent: Exhibit (document metadata, 290+ fields)
  └── Children: Attachments (pages of the document)
       - bates, npcase_id, position, search_text
       - Routing: {npcase_id}_exhibit_{exhibit_id}

Note: "Attachment" in ES context = pages of the document (TIFF/PNG/PDF pages), NOT email attachments or child documents. Document family relationships use the document_relations table in MySQL.
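Elasticsearch requires a join-field child to live on the same shard as its parent, which is why pages share a routing value derived from the exhibit. The sketch below shows how a page document and its routing might be assembled. The relation name "attachment", the parent_es_id argument, and both helper names are assumptions for illustration; the routing pattern and the exhibit_join field name come from this page.

```ruby
# Routing value shared by an exhibit and all of its pages.
def exhibit_routing(npcase_id, exhibit_id)
  "#{npcase_id}_exhibit_#{exhibit_id}"
end

# Hypothetical builder for one page (child) document.
def page_document(npcase_id:, exhibit_id:, parent_es_id:, position:, bates:, search_text:)
  {
    routing: exhibit_routing(npcase_id, exhibit_id),
    body: {
      npcase_id: npcase_id,
      position: position,
      bates: bates,
      search_text: search_text,
      # Join field: relation name (assumed) plus the parent's document id.
      exhibit_join: { name: "attachment", parent: parent_es_id }
    }
  }
end

page = page_document(npcase_id: 12, exhibit_id: 345, parent_es_id: "exhibit-345",
                     position: 1, bates: "ABC0001", search_text: "first page text")
puts page[:routing]
```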

CRITICAL ISSUE: Hot Partition Problem

This is the biggest ES operational issue.

The Problem

A new index is created only when the latest index's largest primary shard exceeds 30GB, and that check runs ONLY at case creation time (SetupNewCaseJob#setup_indices). Shard size is never re-evaluated afterward.

New cases created → all assigned to latest index (e.g., 00125)
All grow simultaneously
Index 00125 grows from 5MB to 300GB-1TB
Large import on one case degrades search for ALL co-located cases

Evidence

  • Index size variance: 5MB to 1TB (roughly 200,000x difference)
  • Latest index handles bulk of both read AND write load
  • Indexes 00120-00125 all tiny (5MB-1.1GB) — suggests manual creation attempts
  • MAX_SHARD_SIZE = 30.gigabytes in elasticsearch_index.rb line 4

Impact

  • Large imports cause search degradation for unrelated cases
  • ES cluster hot spots on nodes hosting the latest index
  • No automatic rebalancing — manual intervention required

Root Cause

# elasticsearch_index.rb — only checks at case creation
def self.current_index_for(type)
  latest_index = find_latest_index(type)
  if largest_primary_shard_size(latest_index) > MAX_SHARD_SIZE
    create_next_index(type)  # Only happens here
  end
  latest_index
end

Fix Options (for ADR-008)

  1. Periodic shard size check — Background job evaluates shard sizes, creates new index when threshold exceeded (not just at case creation)
  2. ILM (Index Lifecycle Management) — ES native solution for size-based rollover with automatic alias management
  3. Time-based indexes — Monthly/quarterly index rotation regardless of size
  4. Better case distribution — Spread new cases across existing under-capacity indexes instead of always assigning to the latest
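Options 1 and 4 both reduce to decisions over a map of per-index shard sizes. The sketch below is one possible shape for those decisions, not the project's implementation. It assumes the caller has already collected the largest primary shard size per index (for example from the _cat/shards API), and both method names are hypothetical.

```ruby
# Same threshold as MAX_SHARD_SIZE (30.gigabytes), expressed in plain Ruby.
MAX_SHARD_SIZE_BYTES = 30 * 1024**3

# Option 1: a periodic job could call this instead of checking only at case creation.
# shard_sizes maps index name => largest primary shard size in bytes.
def rollover_needed?(shard_sizes, latest_index, max: MAX_SHARD_SIZE_BYTES)
  shard_sizes.fetch(latest_index) > max
end

# Option 4: pick the least-loaded index that is still under the threshold.
# Returns nil when every index is full, meaning a new index is needed.
def index_for_new_case(shard_sizes, max: MAX_SHARD_SIZE_BYTES)
  return nil if shard_sizes.empty?

  name, size = shard_sizes.min_by { |_name, bytes| bytes }
  size < max ? name : nil
end

sizes = {
  "production_exhibits_20200101_00124" => 5 * 1024**2,
  "production_exhibits_20200101_00125" => 300 * 1024**3
}
puts index_for_new_case(sizes)
puts rollover_needed?(sizes, "production_exhibits_20200101_00125")
```

Picking the smallest index is only one placement policy; weighting by expected case size or by node would be equally valid choices for ADR-008 to evaluate.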

NOT viable: Index-per-case. 20K+ cases would create 40K+ shards minimum (one primary + one replica per case index). ES 7.x limits each data node to 1,000 shards by default (cluster.max_shards_per_node), so this would require 40+ dedicated nodes just to stay under that limit.

Stack

Component            Version/Detail
ES Server            7.x (self-managed, not OpenSearch)
ES Ruby Gem          elasticsearch 7.4.0
elasticsearch-model  7.0.0
elasticsearch-rails  7.0.0
Client timeout       180 seconds
Dynamic mapping      :strict

Index Mappings

Exhibit Mapping (290+ fields)

Key fields and nested structures:

Field/Structure               Type                               Purpose
npcase_id                     integer                            Per-case filtering (alias filter)
nge_document_id               keyword                            NGE document identifier
exhibit_join                  join                               Parent-child for exhibit→pages
es_tags (nested)              custom_field_id, name, path, date  Custom field tag values
es_exh_designations (nested)  positions, label_ids               Label/folder assignments
shr_tags (nested)             search report tags                 Search hit report data
relationships (nested)        document relationships             Family linking
search_text                   text (nextpoint_analyzer)          Full-text content
bates_start, bates_end        keyword                            Bates number range
privileged, confidentiality   keyword                            Review status

Custom Analyzers

  • nextpoint_analyzer — Email parsing, edge n-gram, path hierarchies
  • nextpoint_search_analyzer — Search-time analyzer
  • edge_ngram_analyzer — Autocomplete (min_gram: 1, max_gram: 6)
  • custom_path_tree — Folder path hierarchy
  • Character filters — Dash replacement, comma/semicolon handling
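To make the edge_ngram_analyzer settings concrete, the sketch below reproduces in plain Ruby what an edge n-gram step with min_gram 1 and max_gram 6 emits for a single token. It is an illustration of the technique only; the real analyzer runs inside Elasticsearch, and the function name is hypothetical.

```ruby
# Emits the leading substrings of a token, from min_gram up to max_gram characters.
def edge_ngrams(token, min_gram: 1, max_gram: 6)
  upper = [max_gram, token.length].min
  (min_gram..upper).map { |n| token[0, n] }
end

p edge_ngrams("exhibit")
# → ["e", "ex", "exh", "exhi", "exhib", "exhibi"]
```

Because max_gram is 6, no gram longer than six characters is stored for a token, which keeps the autocomplete field small.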

Indexing Pipeline

Per-Document Indexing

# searchable.rb — single document index
def index_document(options = {})
  doc = as_indexed_json(options)
  __elasticsearch__.client.index(
    index: name_for_index,
    id: elasticsearch_id,
    body: doc,
    routing: elasticsearch_routing
  )
end

Bulk Indexing (Fork-Based)

# bulk_indexable.rb — parallel batch processing
def bulk_import(options = {})
  import(options.merge(batch_size: 1000)) do |response|
    log_metrics(response) if ENV['LOG_INDEXING_METRICS'] == 'true'
  end
end

# Forks child processes (up to CPU count)
# Each fork processes 1000-record slices independently
# Parent waits for all children to complete
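The fork pattern described in the comments above can be sketched as follows. This is a simplified stand-in and not the contents of bulk_indexable.rb: the method name is hypothetical, it runs children in waves rather than keeping a steady pool, and a real Rails process would also need to reconnect database and ES clients inside each child.

```ruby
require "etc"

# Runs the given block once per slice, each in its own forked child process.
# At most max_workers children run at a time; returns each child's exit status.
def process_in_forks(ids, slice_size: 1000, max_workers: Etc.nprocessors, &work)
  statuses = []
  ids.each_slice(slice_size).each_slice(max_workers) do |wave|
    pids = wave.map do |slice|
      fork do
        work.call(slice)
        exit!(0) # skip at_exit hooks inherited from the parent
      end
    end
    # Parent blocks until every child in this wave has finished.
    pids.each do |pid|
      _, status = Process.wait2(pid)
      statuses << status.exitstatus
    end
  end
  statuses
end

statuses = process_in_forks((1..2500).to_a, max_workers: 2) { |slice| slice.sum }
puts statuses.inspect
```

A non-zero entry in the returned list marks a slice whose child failed, which the caller could use to retry just that slice.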

Indexing Request System

Managed via ElasticsearchIndexRequest model:

State      Meaning
Requested  Index update queued (indexing_requested_at_gmt set)
Started    Indexer picked up request (indexing_started_at_gmt set)
Completed  Indexing done (indexing_last_completed_at set)

Incremental indexing: Only indexes documents where updated_at_gmt >= last_completed_at - 10s. The 10-second overlap handles clock skew.
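The overlap rule can be expressed in a few lines. The sketch below is illustrative, with hypothetical method names, and uses plain Time values in place of the model's columns.

```ruby
OVERLAP_SECONDS = 10

# Documents updated at or after this instant are picked up again.
def incremental_cutoff(last_completed_at)
  last_completed_at - OVERLAP_SECONDS
end

def needs_reindex?(updated_at_gmt, last_completed_at)
  updated_at_gmt >= incremental_cutoff(last_completed_at)
end

last_run = Time.utc(2024, 1, 1, 12, 0, 0)
puts needs_reindex?(Time.utc(2024, 1, 1, 11, 59, 55), last_run) # inside the overlap
puts needs_reindex?(Time.utc(2024, 1, 1, 11, 59, 40), last_run) # older than the cutoff
```

Documents inside the overlap window may be indexed twice, which is harmless because indexing by document id overwrites the previous version.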

Attachment (Page) Batch Size — Adaptive

DEFAULT_ATTACHMENT_BATCH_SIZE = 1000
On RequestEntityTooLarge/GatewayTimeout: batch_size *= 0.5
On success: batch_size *= 1.1
Minimum: 1
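A minimal sketch of this adaptive loop follows. The BatchRejected class is a stand-in for the RequestEntityTooLarge and GatewayTimeout errors, the method names are hypothetical, and the rounding is an assumption: growth rounds up so that a batch size of 1 can recover, and integer arithmetic avoids floating-point drift.

```ruby
DEFAULT_ATTACHMENT_BATCH_SIZE = 1000
MIN_BATCH_SIZE = 1

# Stand-in for RequestEntityTooLarge / GatewayTimeout.
class BatchRejected < StandardError; end

# batch_size * 0.5, never below the minimum.
def shrink(batch_size)
  [batch_size / 2, MIN_BATCH_SIZE].max
end

# batch_size * 1.1, rounded up.
def grow(batch_size)
  (batch_size * 11 + 9) / 10
end

# Yields batches to the block, halving on rejection and growing on success.
# Returns the batch size in effect when all items have been processed.
def each_adaptive_batch(items, batch_size: DEFAULT_ATTACHMENT_BATCH_SIZE)
  offset = 0
  while offset < items.length
    batch = items[offset, batch_size]
    begin
      yield batch
      offset += batch.length
      batch_size = grow(batch_size)
    rescue BatchRejected
      raise if batch_size == MIN_BATCH_SIZE # cannot shrink further
      batch_size = shrink(batch_size)
    end
  end
  batch_size
end
```

Re-raising at the minimum size keeps a single oversized page from looping forever. The source does not mention an upper bound on growth, so none is applied here.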

Monitoring

Index Status Check Script

File: script/elasticsearch_index_status_check.rb

Alert                      Threshold   Action
Requests running too long  > 24 hours  Email admins with indexer IP
Requests waiting too long  > 2 hours   Email admins
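The two thresholds could be evaluated roughly as below. This is a sketch rather than the logic of elasticsearch_index_status_check.rb: how the script decides that a request is "running" or "waiting" is assumed here (by comparing the three timestamps), and the method name and return symbols are hypothetical.

```ruby
RUNNING_TOO_LONG_SECONDS = 24 * 60 * 60
WAITING_TOO_LONG_SECONDS = 2 * 60 * 60

# Returns :running_too_long, :waiting_too_long, or nil for a healthy request.
def index_request_alert(requested_at:, started_at:, completed_at:, now: Time.now)
  # Assumed: running = started and not completed since that start.
  running = started_at && (completed_at.nil? || completed_at < started_at)
  # Assumed: waiting = requested and not picked up since that request.
  waiting = requested_at && (started_at.nil? || started_at < requested_at)

  if running && now - started_at > RUNNING_TOO_LONG_SECONDS
    :running_too_long
  elsif waiting && now - requested_at > WAITING_TOO_LONG_SECONDS
    :waiting_too_long
  end
end

now = Time.utc(2024, 1, 2, 12, 0, 0)
p index_request_alert(requested_at: now - 3 * 3600, started_at: nil, completed_at: nil, now: now)
```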

Health Check

# Rake task
rake elasticsearch:running
# Calls: Elasticsearch::Model.client.cat.health

Indexing Metrics (Optional)

Enable via ENV['LOG_INDEXING_METRICS'] = 'true'

Logs per batch: group, step, count, type, npcase_id, duration_sec
Steps tracked: database_query, batch_to_bulk, bulk_import, index, delete, run

Reindexing

Zero-Downtime Reindex Script

File: script/elastic_reindexer.rb

  1. Creates new index with updated mappings
  2. Reindexes from old → new index
  3. Switches aliases atomically
  4. Supports per-case reindexing via dbs parameter

Critical for ADR-008 (ES upgrade) — this script handles the actual migration.
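For orientation, steps 2 and 3 of the reindex flow map onto two Elasticsearch request bodies, sketched below. This is not the contents of elastic_reindexer.rb: the helper names and the example target index name are hypothetical. The alias swap relies on the _aliases API applying all actions in one request atomically.

```ruby
# Step 2: body for the _reindex API; optionally limited to one case.
def reindex_body(source_index, dest_index, npcase_id: nil)
  source = { index: source_index }
  source[:query] = { term: { npcase_id: npcase_id } } if npcase_id
  { source: source, dest: { index: dest_index } }
end

# Step 3: remove the alias from the old index and add it to the new one
# in a single request, so searches never see a missing alias.
def alias_swap_body(alias_name:, old_index:, new_index:, npcase_id:)
  {
    actions: [
      { remove: { index: old_index, alias: alias_name } },
      { add: { index: new_index, alias: alias_name,
               filter: { term: { npcase_id: npcase_id } } } }
    ]
  }
end

swap = alias_swap_body(alias_name: "production_4242_exhibits",
                       old_index: "production_exhibits_20200101_00001",
                       new_index: "production_exhibits_20200101_00126", # hypothetical target
                       npcase_id: 4242)
puts swap[:actions].map(&:keys).flatten.inspect
```

With the elasticsearch gem these bodies would typically be passed to client.reindex(body: ...) and client.indices.update_aliases(body: ...).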

Rake Tasks

Task                                     Purpose
elasticsearch:running                    Health check
elasticsearch:create_indices             Create default indexes
elasticsearch:create_aliases             Create per-case filtered aliases
elasticsearch:delete_indices             Remove indexes
elasticsearch:create_index_request_rows  Init index request tracking
elasticsearch:redo_all_the_things        Full setup workflow
elasticsearch:setup_test_environment     Test env setup

Key Files

File                                                   Purpose
config/elasticsearch.yml                               ES host configuration
config/initializers/elasticsearch.rb                   Client setup (timeout, connection)
lib/search/elasticsearch/elasticsearch_index.rb        Index creation, shard management, MAX_SHARD_SIZE
lib/search/elasticsearch/constants.rb                  Analyzers, tokenizers, mapping types
lib/search/elasticsearch/indices/mappings/exhibit.yml  Exhibit mapping (290+ fields)
lib/search/npcase_index_alias.rb                       Per-case alias management
app/models/elasticsearch_index_request.rb              Index request lifecycle
app/models/elasticsearch_indexer.rb                    Indexer operation, error handling
app/models/concerns/searchable.rb                      Per-document indexing
app/models/concerns/bulk_indexable.rb                  Bulk fork-based indexing
app/models/concerns/exhibit_searchable.rb              Exhibit as_indexed_json (37 methods)
app/models/concerns/attachment_searchable.rb           Page indexing with join field
app/models/filter_search.rb                            Document filtering via MySQL JOINs (not ES!)
script/elastic_reindexer.rb                            Zero-downtime reindex
script/elasticsearch_index_status_check.rb             Monitoring/alerts
lib/tasks/elasticsearch.rake                           Management rake tasks

Connection to ADRs

ADR                        Relationship
ADR-008 (ES upgrade)       Hot partition fix, mapping migration, reindexer script, server version upgrade
ADR-011 (custom field S3)  FilterSearch must migrate from MySQL JOINs to ES queries for tag filtering
BACKLOG                    Hot partition is the biggest ES operational issue