Elasticsearch Cluster Management — Reference Implementation¶
Overview¶
Self-managed Elasticsearch 7.x cluster using shared physical indexes with per-case filtered aliases: ~125 physical indexes, 30+ TB, and 10B+ documents across 3 independent regional deployments.
Architecture¶
Index Structure¶
Shared indexes, per-case aliases:
Physical Index: production_exhibits_20200101_00001 (multiple cases share this)
Case Alias: production_{npcase_id}_exhibits (filtered by npcase_id term)
- Each physical index holds documents from many cases
- Per-case isolation via filtered aliases: { term: { npcase_id: npcase_id } }
- NOT index-per-case: shared indexes with filtered routing
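As a rough sketch of how one per-case filtered alias could be attached to a shared index, the snippet below builds an `add` action for the Elasticsearch `_aliases` API with the term filter shown above. The helper names (`case_alias_name`, `filtered_alias_action`) are hypothetical and chosen for illustration; they are not taken from the application code.

```ruby
# Hypothetical helpers; names are illustrative, not from the codebase.

# Builds the alias name following the production_{npcase_id}_exhibits pattern.
def case_alias_name(environment, npcase_id, type)
  "#{environment}_#{npcase_id}_#{type}"
end

# Builds one "add" action for the _aliases API. The term filter means
# searches through the alias only see documents belonging to this case.
def filtered_alias_action(physical_index, environment, npcase_id, type)
  {
    add: {
      index: physical_index,
      alias: case_alias_name(environment, npcase_id, type),
      filter: { term: { npcase_id: npcase_id } }
    }
  }
end

action = filtered_alias_action(
  'production_exhibits_20200101_00001', 'production', 42, 'exhibits'
)

# With an elasticsearch-ruby client, this body could be sent as:
#   client.indices.update_aliases(body: { actions: [action] })
```

Keeping the body construction separate from the client call makes the alias definition easy to inspect without a running cluster.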
Index Naming Convention¶
{environment}_{type}_{identifier}_{sequential_number}
production_exhibits_20200101_00001
production_deposition_volumes_20200101_00001
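A minimal sketch of parsing a name in this scheme and deriving the next sequential name. It assumes, based only on the two examples above, that the environment is a single lowercase word, the identifier is 8 digits, and the sequence is zero-padded to 5 digits. `parse_index_name` and `next_index_name` are hypothetical helpers.

```ruby
# Assumed shape: {environment}_{type}_{8-digit identifier}_{5-digit sequence}.
INDEX_NAME_PATTERN =
  /\A(?<environment>[a-z]+)_(?<type>[a-z_]+)_(?<identifier>\d{8})_(?<sequence>\d{5})\z/

# Splits an index name into its parts, or returns nil if it does not match.
def parse_index_name(name)
  match = INDEX_NAME_PATTERN.match(name)
  return nil unless match

  {
    environment: match[:environment],
    type: match[:type],
    identifier: match[:identifier],
    sequence: match[:sequence].to_i
  }
end

# Returns the name with the sequential number incremented by one.
def next_index_name(name)
  parts = parse_index_name(name)
  raise ArgumentError, "unrecognized index name: #{name}" if parts.nil?

  format('%s_%s_%s_%05d',
         parts[:environment], parts[:type], parts[:identifier], parts[:sequence] + 1)
end
```

For example, `next_index_name('production_exhibits_20200101_00001')` yields `production_exhibits_20200101_00002`.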
Index Types¶
- Exhibits — Document metadata + page-level search text
- DepositionVolumes — Transcript search
Parent-Child (Join Field)¶
Parent: Exhibit (document metadata, 290+ fields)
└── Children: Attachments (pages of the document)
- bates, npcase_id, position, search_text
- Routing: {npcase_id}_exhibit_{exhibit_id}
Note: "Attachment" in ES context = pages of the document (TIFF/PNG/PDF pages),
NOT email attachments or child documents. Document family relationships use
document_relations table in MySQL.
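To illustrate the join field, here is a hedged sketch of how a page (child) document and its routing value might be assembled. The relation name `'attachment'`, the `exhibit_es_id` parameter, and both helper names are assumptions; only the field names and the routing format come from the description above. In Elasticsearch, a child must be indexed with the same routing value as its parent so both land on the same shard.

```ruby
# Hypothetical builders; the relation name 'attachment' is an assumption.

# Routing value shared by an exhibit and all of its pages.
def exhibit_routing(npcase_id, exhibit_id)
  "#{npcase_id}_exhibit_#{exhibit_id}"
end

# Body of a child (page) document pointing at its parent exhibit.
def page_document(npcase_id:, exhibit_es_id:, bates:, position:, search_text:)
  {
    npcase_id: npcase_id,
    bates: bates,
    position: position,
    search_text: search_text,
    exhibit_join: { name: 'attachment', parent: exhibit_es_id }
  }
end

doc = page_document(
  npcase_id: 7, exhibit_es_id: 'exhibit-99', bates: 'ABC0001',
  position: 1, search_text: 'first page text'
)
routing = exhibit_routing(7, 99)

# With an elasticsearch-ruby client the page could then be indexed as:
#   client.index(index: alias_or_index, id: page_id, body: doc, routing: routing)
```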
CRITICAL ISSUE: Hot Partition Problem¶
This is the biggest ES operational issue.
The Problem¶
A new index is created only when the largest primary shard of the latest index exceeds 30GB, and that check runs ONLY at
case creation time (SetupNewCaseJob#setup_indices). It is never re-evaluated afterwards.
New cases created → all assigned to latest index (e.g., 00125)
↓
All grow simultaneously
↓
Index 00125 grows from 5MB to 300GB-1TB
↓
Large import on one case degrades search for ALL co-located cases
Evidence¶
- Index size variance: 5MB to 1TB (roughly 200,000x difference)
- Latest index handles bulk of both read AND write load
- Indexes 00120-00125 all tiny (5MB-1.1GB) — suggests manual creation attempts
- MAX_SHARD_SIZE = 30.gigabytes in elasticsearch_index.rb, line 4
Impact¶
- Large imports cause search degradation for unrelated cases
- ES cluster hot spots on nodes hosting the latest index
- No automatic rebalancing — manual intervention required
Root Cause¶
# elasticsearch_index.rb — only checks at case creation
def self.current_index_for(type)
  latest_index = find_latest_index(type)
  if largest_primary_shard_size(latest_index) > MAX_SHARD_SIZE
    create_next_index(type) # Only happens here
  end
  latest_index
end
Fix Options (for ADR-008)¶
- Periodic shard size check — Background job evaluates shard sizes, creates new index when threshold exceeded (not just at case creation)
- ILM (Index Lifecycle Management) — ES native solution for size-based rollover with automatic alias management
- Time-based indexes — Monthly/quarterly index rotation regardless of size
- Better case distribution — Spread new cases across existing under-capacity indexes instead of always assigning to the latest
NOT viable: Index-per-case. 20K+ cases would create 40K+ shards minimum (one primary + one replica per case). ES 7.x limits each data node to 1,000 shards by default (cluster.max_shards_per_node), so this would require 40+ dedicated nodes just to hold the shards.
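The "periodic shard size check" and "better case distribution" options could be sketched roughly as follows. This is not the proposed ADR-008 implementation: the helper names are hypothetical, and the input is assumed to be rows shaped like the JSON output of the cat shards API (for example `client.cat.shards(index: 'production_exhibits_*', format: 'json', bytes: 'b', h: 'index,prirep,store')`), where sizes arrive as strings.

```ruby
# Mirrors MAX_SHARD_SIZE = 30.gigabytes (ActiveSupport uses 1024-based units).
MAX_SHARD_SIZE_BYTES = 30 * 1024**3

# rows: array of hashes with 'index', 'prirep' ('p' or 'r') and 'store' (bytes).
# Returns { index_name => largest primary shard size in bytes }.
def largest_primary_by_index(rows)
  rows.select { |row| row['prirep'] == 'p' }
      .group_by { |row| row['index'] }
      .transform_values { |shards| shards.map { |shard| shard['store'].to_i }.max }
end

# True when the latest index has outgrown the threshold, meaning a background
# job should create the next index instead of waiting for a new case.
def rollover_needed?(rows, latest_index)
  largest_primary_by_index(rows).fetch(latest_index, 0) > MAX_SHARD_SIZE_BYTES
end

# Picks the index with the most headroom for a new case, or nil when every
# index is already over the threshold.
def least_loaded_index(rows)
  name, size = largest_primary_by_index(rows).min_by { |_, bytes| bytes }
  size && size <= MAX_SHARD_SIZE_BYTES ? name : nil
end
```

A scheduled job could call `rollover_needed?` on every run, while case setup could call `least_loaded_index` rather than always choosing the latest index.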
Stack¶
| Component | Version/Detail |
|---|---|
| ES Server | 7.x (self-managed, not OpenSearch) |
| ES Ruby Gem | elasticsearch 7.4.0 |
| elasticsearch-model | 7.0.0 |
| elasticsearch-rails | 7.0.0 |
| Client timeout | 180 seconds |
| Dynamic mapping | :strict |
Index Mappings¶
Exhibit Mapping (290+ fields)¶
Key fields and nested structures:
| Field/Structure | Type | Purpose |
|---|---|---|
| npcase_id | integer | Per-case filtering (alias filter) |
| nge_document_id | keyword | NGE document identifier |
| exhibit_join | join | Parent-child for exhibit→pages |
| es_tags (nested) | custom_field_id, name, path, date | Custom field tag values |
| es_exh_designations (nested) | positions, label_ids | Label/folder assignments |
| shr_tags (nested) | search report tags | Search hit report data |
| relationships (nested) | document relationships | Family linking |
| search_text | text (nextpoint_analyzer) | Full-text content |
| bates_start, bates_end | keyword | Bates number range |
| privileged, confidentiality | keyword | Review status |
Custom Analyzers¶
- nextpoint_analyzer — Email parsing, edge n-gram, path hierarchies
- nextpoint_search_analyzer — Search-time analyzer
- edge_ngram_analyzer — Autocomplete (min_gram: 1, max_gram: 6)
- custom_path_tree — Folder path hierarchy
- Character filters — Dash replacement, comma/semicolon handling
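To make the autocomplete analyzer concrete, the sketch below shows which prefixes an edge n-gram filter with min_gram 1 and max_gram 6 would emit for a single token, plus one possible shape for the analysis settings. The tokenizer choice, the filter chain, and the name `edge_ngram_filter` are assumptions; the real definitions live in constants.rb and may differ.

```ruby
# Returns the prefixes an edge n-gram filter would produce for one token.
def edge_ngrams(token, min_gram: 1, max_gram: 6)
  upper = [max_gram, token.length].min
  (min_gram..upper).map { |length| token[0, length] }
end

# One possible settings fragment (assumed, not copied from constants.rb).
EDGE_NGRAM_ANALYSIS = {
  analysis: {
    filter: {
      edge_ngram_filter: { type: 'edge_ngram', min_gram: 1, max_gram: 6 }
    },
    analyzer: {
      edge_ngram_analyzer: {
        type: 'custom',
        tokenizer: 'standard',
        filter: %w[lowercase edge_ngram_filter]
      }
    }
  }
}.freeze

prefixes = edge_ngrams('exhibit')
```

Because max_gram is 6, a query prefix longer than six characters would not match an indexed n-gram unless the search-time analyzer handles it differently.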
Indexing Pipeline¶
Per-Document Indexing¶
# searchable.rb — single document index
def index_document(options = {})
  doc = as_indexed_json(options)
  __elasticsearch__.client.index(
    index: name_for_index,
    id: elasticsearch_id,
    body: doc,
    routing: elasticsearch_routing
  )
end
Bulk Indexing (Fork-Based)¶
# bulk_indexable.rb — parallel batch processing
def bulk_import(options = {})
  import(options.merge(batch_size: 1000)) do |response|
    log_metrics(response) if ENV['LOG_INDEXING_METRICS'] == 'true'
  end
end
# Forks child processes (up to CPU count)
# Each fork processes 1000-record slices independently
# Parent waits for all children to complete
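The fork-based pattern described in the comments above can be sketched in plain Ruby as follows. This is a simplified illustration, not the code in bulk_indexable.rb: `process_in_forks` is a hypothetical name, it runs children in waves rather than a continuously refilled pool, and it requires a platform that supports `fork`. Real code would typically also re-establish database and Elasticsearch connections inside each child.

```ruby
require 'etc'

# Runs the given block on each slice in a forked child process, with at most
# max_forks children alive at a time. Returns true when every child exited
# successfully (an exception in a child produces a non-zero exit status).
def process_in_forks(records, slice_size: 1000, max_forks: Etc.nprocessors)
  statuses = []
  records.each_slice(slice_size).each_slice(max_forks) do |wave|
    # Start one child per slice in this wave.
    pids = wave.map { |slice| fork { yield slice } }
    # Parent blocks until every child in the wave has finished.
    pids.each do |pid|
      _, status = Process.wait2(pid)
      statuses << status
    end
  end
  statuses.all?(&:success?)
end
```

A caller might use it as `process_in_forks(ids) { |slice| index_slice(slice) }`, where `index_slice` stands in for whatever performs the bulk request.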
Indexing Request System¶
Managed via ElasticsearchIndexRequest model:
| State | Meaning |
|---|---|
| Requested | Index update queued (indexing_requested_at_gmt set) |
| Started | Indexer picked up request (indexing_started_at_gmt set) |
| Completed | Indexing done (indexing_last_completed_at set) |
Incremental indexing: Only indexes documents where updated_at_gmt >= last_completed_at - 10s.
The 10-second overlap handles clock skew.
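The cutoff rule can be written out as a small sketch. The constant and method names below are hypothetical; only the comparison and the 10-second overlap come from the description above.

```ruby
# Overlap subtracted from the last completion time to tolerate clock skew.
CLOCK_SKEW_OVERLAP_SECONDS = 10

# Earliest updated_at_gmt that the next incremental run will pick up.
def incremental_cutoff(last_completed_at)
  last_completed_at - CLOCK_SKEW_OVERLAP_SECONDS
end

# True when a document changed at or after the cutoff and must be reindexed.
def needs_reindex?(updated_at_gmt, last_completed_at)
  updated_at_gmt >= incremental_cutoff(last_completed_at)
end
```

A document updated up to 10 seconds before the last completion is indexed again on the next run, which is harmless because indexing the same document twice is idempotent.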
Attachment (Page) Batch Size — Adaptive¶
DEFAULT_ATTACHMENT_BATCH_SIZE = 1000
On RequestEntityTooLarge/GatewayTimeout: batch_size *= 0.5
On success: batch_size *= 1.1
Minimum: 1
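A minimal sketch of this adaptive scheme, under a few stated assumptions: integer arithmetic replaces the floating-point multipliers, growth rounds up so the size can recover from 1, and the sender block signals a RequestEntityTooLarge/GatewayTimeout condition by returning false instead of raising. All method names are hypothetical.

```ruby
DEFAULT_ATTACHMENT_BATCH_SIZE = 1000
MIN_ATTACHMENT_BATCH_SIZE = 1

# Halves the batch size (batch_size *= 0.5), never going below the minimum.
def shrink_batch_size(batch_size)
  [batch_size / 2, MIN_ATTACHMENT_BATCH_SIZE].max
end

# Grows the batch size by 10% (batch_size *= 1.1), rounded up.
def grow_batch_size(batch_size)
  (batch_size * 11 + 9) / 10
end

# Yields successive batches. The block returns true on success, or false when
# the request was too large or timed out, in which case the same items are
# retried with a smaller batch.
def each_adaptive_batch(items, batch_size: DEFAULT_ATTACHMENT_BATCH_SIZE)
  offset = 0
  while offset < items.length
    batch = items[offset, batch_size]
    if yield(batch)
      offset += batch.length
      batch_size = grow_batch_size(batch_size)
    else
      # A single item cannot be split further, so stop instead of looping.
      raise 'single item rejected' if batch.length == 1

      batch_size = shrink_batch_size(batch_size)
    end
  end
end
```

The source does not state an upper bound on growth, so none is applied here; a production version would likely cap it.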
Monitoring¶
Index Status Check Script¶
File: script/elasticsearch_index_status_check.rb
| Alert | Threshold | Action |
|---|---|---|
| Requests running too long | > 24 hours | Email admins with indexer IP |
| Requests waiting too long | > 2 hours | Email admins |
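The two thresholds could be evaluated along these lines. This is a sketch only: the `IndexRequest` struct stands in for the ElasticsearchIndexRequest model, and the way "waiting" and "running" are derived from the timestamps is an assumption rather than the script's actual logic.

```ruby
# Stand-in for the ElasticsearchIndexRequest timestamps (hypothetical struct).
IndexRequest = Struct.new(:requested_at, :started_at, :completed_at, keyword_init: true)

MAX_RUNNING_SECONDS = 24 * 60 * 60 # requests running too long: > 24 hours
MAX_WAITING_SECONDS = 2 * 60 * 60  # requests waiting too long: > 2 hours

# Returns :waiting_too_long, :running_too_long, or nil when no alert applies.
def request_alert(request, now: Time.now.utc)
  # Assumed: a request is waiting until an indexer starts it after the request time.
  waiting = request.started_at.nil? || request.started_at < request.requested_at
  # Assumed: a request is running until it completes after its start time.
  running = !waiting &&
            (request.completed_at.nil? || request.completed_at < request.started_at)

  if waiting && now - request.requested_at > MAX_WAITING_SECONDS
    :waiting_too_long
  elsif running && now - request.started_at > MAX_RUNNING_SECONDS
    :running_too_long
  end
end
```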
Health Check¶
Indexing Metrics (Optional)¶
Enable via ENV['LOG_INDEXING_METRICS'] = 'true'
Logs per batch: group, step, count, type, npcase_id, duration_sec
Steps tracked: database_query, batch_to_bulk, bulk_import, index, delete, run
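As an illustration of how a step might be timed and logged behind this flag, here is a hedged sketch. `with_indexing_metrics` is a hypothetical helper and the JSON output format is an assumption; the source only lists the logged fields.

```ruby
require 'json'

# Runs the block and, when LOG_INDEXING_METRICS is 'true', writes one JSON
# line containing the given fields plus duration_sec. Returns the block's value.
def with_indexing_metrics(fields, io: $stdout)
  return yield unless ENV['LOG_INDEXING_METRICS'] == 'true'

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  io.puts(fields.merge(duration_sec: duration.round(3)).to_json)
  result
end
```

A caller would wrap each tracked step, for example `with_indexing_metrics({ group: 'exhibits', step: 'bulk_import', count: 1000, type: 'Exhibit', npcase_id: 42 }) { ... }`.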
Reindexing¶
Zero-Downtime Reindex Script¶
File: script/elastic_reindexer.rb
- Creates new index with updated mappings
- Reindexes from old → new index
- Switches aliases atomically
- Supports per-case reindexing via the dbs parameter
Critical for ADR-008 (ES upgrade) — this script handles the actual migration.
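The reindex and alias-switch steps map onto two Elasticsearch APIs, `_reindex` and `_aliases`. The sketch below only builds the request bodies. The helper names are hypothetical, and restricting the copy to one case with an npcase_id term query is an assumption about what per-case reindexing involves, not a description of elastic_reindexer.rb.

```ruby
# Body for the _reindex API; optionally limited to one case's documents.
def reindex_body(source_index, dest_index, npcase_id: nil)
  source = { index: source_index }
  source[:query] = { term: { npcase_id: npcase_id } } if npcase_id
  { source: source, dest: { index: dest_index } }
end

# Actions for the _aliases API. Both actions are applied in a single request,
# so the alias never points at zero or two indexes from a searcher's view.
def alias_swap_actions(alias_name, old_index, new_index, npcase_id)
  [
    { remove: { index: old_index, alias: alias_name } },
    { add: { index: new_index, alias: alias_name,
             filter: { term: { npcase_id: npcase_id } } } }
  ]
end

# With an elasticsearch-ruby client these could be sent as:
#   client.reindex(body: reindex_body(old_index, new_index, npcase_id: 42))
#   client.indices.update_aliases(body: { actions: alias_swap_actions(...) })
```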
Rake Tasks¶
| Task | Purpose |
|---|---|
| elasticsearch:running | Health check |
| elasticsearch:create_indices | Create default indexes |
| elasticsearch:create_aliases | Create per-case filtered aliases |
| elasticsearch:delete_indices | Remove indexes |
| elasticsearch:create_index_request_rows | Init index request tracking |
| elasticsearch:redo_all_the_things | Full setup workflow |
| elasticsearch:setup_test_environment | Test env setup |
Key Files¶
| File | Purpose |
|---|---|
| config/elasticsearch.yml | ES host configuration |
| config/initializers/elasticsearch.rb | Client setup (timeout, connection) |
| lib/search/elasticsearch/elasticsearch_index.rb | Index creation, shard management, MAX_SHARD_SIZE |
| lib/search/elasticsearch/constants.rb | Analyzers, tokenizers, mapping types |
| lib/search/elasticsearch/indices/mappings/exhibit.yml | Exhibit mapping (290+ fields) |
| lib/search/npcase_index_alias.rb | Per-case alias management |
| app/models/elasticsearch_index_request.rb | Index request lifecycle |
| app/models/elasticsearch_indexer.rb | Indexer operation, error handling |
| app/models/concerns/searchable.rb | Per-document indexing |
| app/models/concerns/bulk_indexable.rb | Bulk fork-based indexing |
| app/models/concerns/exhibit_searchable.rb | Exhibit as_indexed_json (37 methods) |
| app/models/concerns/attachment_searchable.rb | Page indexing with join field |
| app/models/filter_search.rb | Document filtering via MySQL JOINs (not ES!) |
| script/elastic_reindexer.rb | Zero-downtime reindex |
| script/elasticsearch_index_status_check.rb | Monitoring/alerts |
| lib/tasks/elasticsearch.rake | Management rake tasks |
Connection to ADRs¶
| ADR | Relationship |
|---|---|
| ADR-008 (ES upgrade) | Hot partition fix, mapping migration, reindexer script, server version upgrade |
| ADR-011 (custom field S3) | FilterSearch must migrate from MySQL JOINs to ES queries for tag filtering |
| BACKLOG | Hot partition is the biggest ES operational issue |