Elasticsearch Cluster Management — Reference Implementation¶

Overview¶

Self-managed Elasticsearch 7.x cluster using shared physical indexes with per-case filtered aliases: roughly 125 physical indexes, 30+ TB, and 10B+ documents across 3 independent regional deployments.

Architecture¶

Index Structure¶

Shared indexes, per-case aliases:

Physical Index:  production_exhibits_20200101_00001  (multiple cases share this)
Case Alias:      production_{npcase_id}_exhibits     (filtered by npcase_id term)

Each physical index holds documents from many cases
Per-case isolation via filtered aliases: { term: { npcase_id: npcase_id } }
NOT index-per-case: cases share physical indexes and are isolated by alias filters
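
As a minimal sketch of the filtered alias described above, the following builds the request body for adding one per-case alias. The method name `filtered_alias_action` and its keyword arguments are hypothetical; only the alias name pattern and the term filter come from this page.

```ruby
# Builds an alias-update body that adds one filtered, per-case alias.
# Hypothetical helper: the real alias code lives in npcase_index_alias.rb
# and may be structured differently.
def filtered_alias_action(environment:, type:, npcase_id:, physical_index:)
  {
    actions: [
      {
        add: {
          index: physical_index,
          alias: "#{environment}_#{npcase_id}_#{type}",
          # Every search through the alias is restricted to this case.
          filter: { term: { npcase_id: npcase_id } }
        }
      }
    ]
  }
end

body = filtered_alias_action(
  environment: 'production',
  type: 'exhibits',
  npcase_id: 42,
  physical_index: 'production_exhibits_20200101_00001'
)
puts body[:actions].first[:add][:alias]
```

A body of this shape could be passed to the Ruby client's `indices.update_aliases(body: body)` call.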

Index Naming Convention¶

{environment}_{type}_{identifier}_{sequential_number}
  production_exhibits_20200101_00001
  production_deposition_volumes_20200101_00001
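
The convention can be illustrated with a small parser. This is a sketch under assumptions inferred from the two example names only: the identifier is taken to be 8 digits, the sequence number 5 zero-padded digits, and the next index is assumed to keep the same identifier. `parse_index_name` and `next_index_name` are hypothetical helpers.

```ruby
# Pattern inferred from the example names; the real rules may be looser.
INDEX_NAME_PATTERN =
  /\A(?<environment>[a-z]+)_(?<type>[a-z_]+)_(?<identifier>\d{8})_(?<sequence>\d{5})\z/

# Splits an index name into its four parts, or returns nil if it does not match.
def parse_index_name(name)
  match = INDEX_NAME_PATTERN.match(name)
  return nil unless match

  {
    environment: match[:environment],
    type: match[:type],
    identifier: match[:identifier],
    sequence: match[:sequence].to_i
  }
end

# Returns the name with the sequence number incremented by one.
def next_index_name(name)
  parts = parse_index_name(name)
  raise ArgumentError, "unrecognized index name: #{name}" if parts.nil?

  format('%s_%s_%s_%05d',
         parts[:environment], parts[:type], parts[:identifier], parts[:sequence] + 1)
end

puts next_index_name('production_exhibits_20200101_00125')
```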

Index Types¶

Exhibits — Document metadata + page-level search text
DepositionVolumes — Transcript search

Parent-Child (Join Field)¶

Parent: Exhibit (document metadata, 290+ fields)
  └── Children: Attachments (pages of the document)
       - bates, npcase_id, position, search_text
       - Routing: {npcase_id}_exhibit_{exhibit_id}

Note: "Attachment" in ES context = pages of the document (TIFF/PNG/PDF pages), NOT email attachments or child documents. Document family relationships use document_relations table in MySQL.
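
A hedged sketch of how a page (child) document might be assembled from the routing pattern and fields listed above. Elasticsearch requires a child to be indexed with the same routing value as its parent so that both land on the same shard, which is why the routing key is derived from the exhibit. The relation name `attachment` and both helper names are assumptions, not taken from the mapping.

```ruby
# Routing key shared by an exhibit and all of its pages.
def exhibit_routing(npcase_id, exhibit_id)
  "#{npcase_id}_exhibit_#{exhibit_id}"
end

# Hypothetical builder for a page document. The 'attachment' relation
# name is an assumption; check the exhibit_join mapping for the real one.
def page_document(npcase_id:, exhibit_id:, parent_es_id:, position:, bates:, search_text:)
  {
    routing: exhibit_routing(npcase_id, exhibit_id),
    body: {
      npcase_id: npcase_id,
      position: position,
      bates: bates,
      search_text: search_text,
      # Join field: names the child relation and points at the parent document.
      exhibit_join: { name: 'attachment', parent: parent_es_id }
    }
  }
end

doc = page_document(
  npcase_id: 42, exhibit_id: 7, parent_es_id: 'exhibit_7',
  position: 1, bates: 'ABC0001', search_text: 'first page text'
)
puts doc[:routing]
```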

CRITICAL ISSUE: Hot Partition Problem¶

This is the biggest ES operational issue.

The Problem¶

A new index is created only when the largest primary shard of the latest index exceeds 30GB, and that check runs ONLY at case creation time (SetupNewCaseJob#setup_indices). It is never re-evaluated afterwards.

New cases created → all assigned to latest index (e.g., 00125)
  ↓
All grow simultaneously
  ↓
Index 00125 grows from 5MB to 300GB-1TB
  ↓
Large import on one case degrades search for ALL co-located cases

Evidence¶

Index size variance: 5MB to 1TB (roughly a 200,000x difference)
Latest index handles bulk of both read AND write load
Indexes 00120-00125 all tiny (5MB-1.1GB) — suggests manual creation attempts
MAX_SHARD_SIZE = 30.gigabytes in elasticsearch_index.rb line 4

Impact¶

Large imports cause search degradation for unrelated cases
ES cluster hot spots on nodes hosting the latest index
No automatic rebalancing — manual intervention required

Root Cause¶

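# elasticsearch_index.rb: the size check runs only when a case is created
def self.current_index_for(type)
  latest_index = find_latest_index(type)
  # Roll over once the largest primary shard exceeds MAX_SHARD_SIZE (30 GB).
  # Nothing re-runs this check as cases already on the index keep growing.
  if largest_primary_shard_size(latest_index) > MAX_SHARD_SIZE
    create_next_index(type)  # Only happens here
  end
  latest_index
end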

Fix Options (for ADR-008)¶

Periodic shard size check — Background job evaluates shard sizes, creates new index when threshold exceeded (not just at case creation)
ILM (Index Lifecycle Management) — ES native solution for size-based rollover with automatic alias management
Time-based indexes — Monthly/quarterly index rotation regardless of size
Better case distribution — Spread new cases across existing under-capacity indexes instead of always assigning to the latest
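
The "better case distribution" option could look roughly like the sketch below: given the largest primary shard size per index, pick the least-loaded index that is still under the threshold. The helper name and the input hash are hypothetical; in practice the sizes would have to come from the cluster, and a `nil` result would signal that a new index is needed.

```ruby
# 30 GB in bytes, matching what 30.gigabytes evaluates to in ActiveSupport.
MAX_SHARD_SIZE_BYTES = 30 * 1024**3

# sizes: { index_name => largest primary shard size in bytes }
# Returns the name of the least-loaded index under the limit, or nil
# when every index is at or over the limit.
def pick_index_for_new_case(sizes, max_bytes: MAX_SHARD_SIZE_BYTES)
  candidates = sizes.select { |_name, bytes| bytes < max_bytes }
  return nil if candidates.empty?

  candidates.min_by { |_name, bytes| bytes }.first
end

sizes = {
  'production_exhibits_20200101_00123' => 5 * 1024**2,   # 5 MB
  'production_exhibits_20200101_00124' => 1 * 1024**3,   # 1 GB
  'production_exhibits_20200101_00125' => 300 * 1024**3  # 300 GB
}
puts pick_index_for_new_case(sizes)
```

Choosing the smallest index is only one possible policy; it ignores how fast each co-located case is growing, which is the underlying cause of the hot partition.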

NOT viable: Index-per-case. 20K+ cases would create 40K+ shards minimum (one primary plus one replica each). Elasticsearch 7.x limits a cluster to 1,000 shards per data node by default (cluster.max_shards_per_node), so this would need 40+ data nodes just to hold the shards.
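
The arithmetic behind that estimate, as a small worked example. The helper and its parameter names are illustrative; the defaults mirror the figures quoted above (one primary plus one replica per index, 1,000 shards per node).

```ruby
# Estimates shard count and minimum node count for an index-per-case layout.
def index_per_case_estimate(cases:, indexes_per_case: 1, shards_per_index: 2, max_shards_per_node: 1000)
  total_shards = cases * indexes_per_case * shards_per_index
  nodes = (total_shards.to_f / max_shards_per_node).ceil
  { total_shards: total_shards, nodes: nodes }
end

estimate = index_per_case_estimate(cases: 20_000)
puts "#{estimate[:total_shards]} shards, #{estimate[:nodes]} nodes"
```

If each case needed one index per index type (an assumption, with two types listed on this page), calling the helper with `indexes_per_case: 2` doubles both figures.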

Stack¶

Component	Version/Detail
ES Server	7.x (self-managed, not OpenSearch)
ES Ruby Gem	`elasticsearch 7.4.0`
elasticsearch-model	7.0.0
elasticsearch-rails	7.0.0
Client timeout	180 seconds
Dynamic mapping	`:strict`

Index Mappings¶

Exhibit Mapping (290+ fields)¶

Key fields and nested structures:

Field/Structure	Type	Purpose
`npcase_id`	integer	Per-case filtering (alias filter)
`nge_document_id`	keyword	NGE document identifier
`exhibit_join`	join	Parent-child for exhibit→pages
`es_tags` (nested)	custom_field_id, name, path, date	Custom field tag values
`es_exh_designations` (nested)	positions, label_ids	Label/folder assignments
`shr_tags` (nested)	search report tags	Search hit report data
`relationships` (nested)	document relationships	Family linking
`search_text`	text (nextpoint_analyzer)	Full-text content
`bates_start`, `bates_end`	keyword	Bates number range
`privileged`, `confidentiality`	keyword	Review status

Custom Analyzers¶

nextpoint_analyzer — Email parsing, edge n-gram, path hierarchies
nextpoint_search_analyzer — Search-time analyzer
edge_ngram_analyzer — Autocomplete (min_gram: 1, max_gram: 6)
custom_path_tree — Folder path hierarchy
Character filters — Dash replacement, comma/semicolon handling
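
To make the autocomplete settings concrete, this plain-Ruby sketch lists the prefixes that edge n-gramming with min_gram: 1 and max_gram: 6 would produce for a single token. It is an illustration only: the real analysis runs inside Elasticsearch, and `edge_ngrams` is not a function from the codebase.

```ruby
# Returns the leading substrings of term with lengths min_gram..max_gram,
# capped at the length of the term itself.
def edge_ngrams(term, min_gram: 1, max_gram: 6)
  upper = [term.length, max_gram].min
  (min_gram..upper).map { |length| term[0, length] }
end

p edge_ngrams('bates')
p edge_ngrams('exhibit')
```

Because max_gram is 6, a query prefix longer than six characters has no matching gram, so longer prefixes have to be handled some other way at search time.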

Indexing Pipeline¶

Per-Document Indexing¶

# searchable.rb — single document index
def index_document(options = {})
  doc = as_indexed_json(options)
  __elasticsearch__.client.index(
    index: name_for_index,
    id: elasticsearch_id,
    body: doc,
    routing: elasticsearch_routing
  )
end

Bulk Indexing (Fork-Based)¶

# bulk_indexable.rb — parallel batch processing
def bulk_import(options = {})
  import(options.merge(batch_size: 1000)) do |response|
    log_metrics(response) if ENV['LOG_INDEXING_METRICS'] == 'true'
  end
end

# Forks child processes (up to CPU count)
# Each fork processes 1000-record slices independently
# Parent waits for all children to complete
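
The fork pattern in those comments can be sketched in plain Ruby as follows. This is not the code from bulk_indexable.rb: `process_in_forks` is a hypothetical helper, the round-robin assignment of slices to children is an assumption, and a pipe is used here only so the parent can report how many records each child handled. It requires a platform that supports `fork`.

```ruby
require 'etc'

# Splits ids into slices, hands the slices to forked children, and
# returns the number of records each child handled.
def process_in_forks(ids, slice_size: 1000, max_workers: Etc.nprocessors)
  slices = ids.each_slice(slice_size).to_a
  return [] if slices.empty?

  workers = [max_workers, slices.size].min
  groups = Array.new(workers) { [] }
  slices.each_with_index { |slice, i| groups[i % workers] << slice }

  children = groups.map do |group|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      handled = 0
      group.each do |slice|
        yield slice if block_given? # per-slice work, e.g. a bulk request
        handled += slice.size
      end
      writer.puts(handled)
      writer.close
      exit!(0) # skip at_exit hooks inherited from the parent
    end
    writer.close
    [pid, reader]
  end

  # Parent waits for all children to complete.
  children.map do |pid, reader|
    count = reader.read.to_i
    reader.close
    Process.wait(pid)
    count
  end
end

p process_in_forks((1..2500).to_a, max_workers: 2)
```

In a Rails process, database and Elasticsearch connections generally need to be re-established inside each child after forking; the sketch leaves that out.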

Indexing Request System¶

Managed via ElasticsearchIndexRequest model:

State	Meaning
Requested	Index update queued (`indexing_requested_at_gmt` set)
Started	Indexer picked up request (`indexing_started_at_gmt` set)
Completed	Indexing done (`indexing_last_completed_at` set)

Incremental indexing: Only indexes documents where updated_at_gmt >= last_completed_at - 10s. The 10-second overlap handles clock skew.
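
The cutoff rule can be written out as a short example. The helper names are hypothetical; the 10-second overlap and the `>=` comparison are taken from the sentence above.

```ruby
OVERLAP_SECONDS = 10

# Earliest updated_at_gmt that the next incremental run should pick up.
def incremental_cutoff(last_completed_at, overlap: OVERLAP_SECONDS)
  last_completed_at - overlap
end

# True when a document changed at or after the cutoff.
def needs_reindex?(updated_at_gmt, last_completed_at)
  updated_at_gmt >= incremental_cutoff(last_completed_at)
end

last_run = Time.utc(2024, 1, 1, 12, 0, 0)
puts needs_reindex?(last_run - 5, last_run)
puts needs_reindex?(last_run - 11, last_run)
```

Documents inside the overlap window may be indexed twice. That is harmless because indexing a document under the same ID overwrites the previous version.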

Attachment (Page) Batch Size — Adaptive¶

DEFAULT_ATTACHMENT_BATCH_SIZE = 1000
On RequestEntityTooLarge/GatewayTimeout: batch_size *= 0.5
On success: batch_size *= 1.1
Minimum: 1
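
A minimal sketch of the adaptive rule, assuming integer batch sizes. The class name and the rounding choices (round down when halving, round up when growing, so a batch size of 1 can recover) are assumptions; the source only gives the multipliers, the default, and the minimum.

```ruby
# Tracks a batch size that halves on failure and grows by 10% on success.
class AdaptiveBatchSize
  DEFAULT = 1000
  MINIMUM = 1

  attr_reader :size

  def initialize(size = DEFAULT)
    @size = size
  end

  # Call after RequestEntityTooLarge or GatewayTimeout.
  def shrink!
    @size = [@size / 2, MINIMUM].max
  end

  # Call after a successful batch. Integer math avoids float rounding
  # surprises; ceil guarantees growth even from a size of 1.
  def grow!
    @size = (@size * 11).fdiv(10).ceil
  end
end

batch = AdaptiveBatchSize.new
batch.shrink!
batch.grow!
puts batch.size
```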

Monitoring¶

Index Status Check Script¶

File: script/elasticsearch_index_status_check.rb

Alert	Threshold	Action
Requests running too long	> 24 hours	Email admins with indexer IP
Requests waiting too long	> 2 hours	Email admins
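
The two thresholds can be expressed as a small classifier. This is a sketch, not the contents of the status check script: the function name, the symbols it returns, and the assumption that only open (not yet completed) requests are passed in are all hypothetical.

```ruby
RUNNING_TOO_LONG_SECONDS = 24 * 3600
WAITING_TOO_LONG_SECONDS = 2 * 3600

# Returns :running_too_long, :waiting_too_long, or nil for an open request.
# started_at is nil while the request is still waiting for an indexer.
def index_request_alert(requested_at:, started_at:, now: Time.now.utc)
  if started_at
    :running_too_long if now - started_at > RUNNING_TOO_LONG_SECONDS
  elsif requested_at
    :waiting_too_long if now - requested_at > WAITING_TOO_LONG_SECONDS
  end
end

now = Time.utc(2024, 1, 2, 0, 0, 0)
p index_request_alert(requested_at: now - 3 * 3600, started_at: nil, now: now)
```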

Health Check¶

# Rake task
rake elasticsearch:running
# Calls: Elasticsearch::Model.client.cat.health

Indexing Metrics (Optional)¶

Enable via ENV['LOG_INDEXING_METRICS'] = 'true'

Logs per batch: group, step, count, type, npcase_id, duration_sec
Steps tracked: database_query, batch_to_bulk, bulk_import, index, delete, run

Reindexing¶

Zero-Downtime Reindex Script¶

File: script/elastic_reindexer.rb

Creates new index with updated mappings
Reindexes from old → new index
Switches aliases atomically
Supports per-case reindexing via dbs parameter

Critical for ADR-008 (ES upgrade) — this script handles the actual migration.
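
The atomic alias switch can be sketched as a single alias-update body containing a remove and an add. Elasticsearch applies all actions in one such request atomically, so searches never see the alias missing. The helper name is hypothetical and the real script may build its request differently.

```ruby
# Builds one alias-update body that moves a per-case alias from the old
# index to the new one while keeping its npcase_id filter.
def alias_swap_actions(alias_name:, old_index:, new_index:, npcase_id:)
  {
    actions: [
      { remove: { index: old_index, alias: alias_name } },
      {
        add: {
          index: new_index,
          alias: alias_name,
          filter: { term: { npcase_id: npcase_id } }
        }
      }
    ]
  }
end

body = alias_swap_actions(
  alias_name: 'production_42_exhibits',
  old_index: 'production_exhibits_20200101_00001',
  new_index: 'production_exhibits_20200101_00002',
  npcase_id: 42
)
puts body[:actions].map(&:keys).flatten.inspect
```

A body like this could be sent with the Ruby client's `indices.update_aliases(body: body)`.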

Rake Tasks¶

Task	Purpose
`elasticsearch:running`	Health check
`elasticsearch:create_indices`	Create default indexes
`elasticsearch:create_aliases`	Create per-case filtered aliases
`elasticsearch:delete_indices`	Remove indexes
`elasticsearch:create_index_request_rows`	Init index request tracking
`elasticsearch:redo_all_the_things`	Full setup workflow
`elasticsearch:setup_test_environment`	Test env setup

Key Files¶

File	Purpose
`config/elasticsearch.yml`	ES host configuration
`config/initializers/elasticsearch.rb`	Client setup (timeout, connection)
`lib/search/elasticsearch/elasticsearch_index.rb`	Index creation, shard management, MAX_SHARD_SIZE
`lib/search/elasticsearch/constants.rb`	Analyzers, tokenizers, mapping types
`lib/search/elasticsearch/indices/mappings/exhibit.yml`	Exhibit mapping (290+ fields)
`lib/search/npcase_index_alias.rb`	Per-case alias management
`app/models/elasticsearch_index_request.rb`	Index request lifecycle
`app/models/elasticsearch_indexer.rb`	Indexer operation, error handling
`app/models/concerns/searchable.rb`	Per-document indexing
`app/models/concerns/bulk_indexable.rb`	Bulk fork-based indexing
`app/models/concerns/exhibit_searchable.rb`	Exhibit `as_indexed_json` (37 methods)
`app/models/concerns/attachment_searchable.rb`	Page indexing with join field
`app/models/filter_search.rb`	Document filtering via MySQL JOINs (not ES!)
`script/elastic_reindexer.rb`	Zero-downtime reindex
`script/elasticsearch_index_status_check.rb`	Monitoring/alerts
`lib/tasks/elasticsearch.rake`	Management rake tasks

Connection to ADRs¶

ADR	Relationship
ADR-008 (ES upgrade)	Hot partition fix, mapping migration, reindexer script, server version upgrade
ADR-011 (custom field S3)	FilterSearch must migrate from MySQL JOINs to ES queries for tag filtering
BACKLOG	Hot partition is the biggest ES operational issue
