ADR-011: Custom Field Storage Modernization (S3 + ES + Async Audit)¶
Status¶
Proposed
Date¶
2026-03-20
Context¶
The Problem¶
Imports with many custom fields put heavy load on Aurora MySQL. The Legacy path generates 100+ queries per document: 50+ `Tag.create_or_find_with_index` calls (2-3 queries each) plus 50+ `Tagging.create` calls. At 100K documents × 50 custom fields, that is ~20M queries, so imports take ~10 hours at 80-100% database CPU.
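The query arithmetic above can be sanity-checked with a quick sketch (the per-field query count is this ADR's own estimate):

```python
docs = 100_000
fields_per_doc = 50
queries_per_field = 4  # 2-3 for Tag.create_or_find_with_index, 1 for Tagging.create

queries_per_doc = fields_per_doc * queries_per_field
total_queries = docs * queries_per_doc

print(queries_per_doc)  # 200
print(total_queries)    # 20000000
```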
Current Scale¶
- 18K+ Legacy cases, ~1K NGE cases
- Legacy cases will remain until deleted (no old-case migration)
- Tags/taggings are the biggest tables in per-case databases
- 96% of tagging writes are never read — ES is the actual query layer
Current Data Flow¶
```
Extractor → S3 (metadata.json) → SQS → DocumentLoader
                                          ↓
Per document:
  1. Create Exhibit in Aurora (core fields)
  2. For each custom field (50+):
     - Tag.create_or_find_with_index (2-3 queries)
     - Tagging.create (1 query)
  3. Index to Elasticsearch
```

Total: 100+ queries per document (Legacy path)
NGE path: Reduced via TagDedupe + INSERT IGNORE, but still significant
Current Tables¶
- CustomField: Field definitions (title, input_type) — small, stays in RDS
- Tag: Unique values per field (normalized) — large, 2.5B+ rows across all cases
- Tagging: Links exhibits to tags (with user_id for audit) — largest, 5B+ rows
What We Verified¶
- Tagging model HAS an audit trail via the `AuditsTransactions` concern (ModelAudit records)
- Schema has NO actual FK constraints — only Rails-level `dependent: :delete_all`
- Concurrent edit safety is advisory (unique index + try-fail), not pessimistic locks
- FilterSearch (document browse/filter) uses MySQL JOINs on tags/taggings for ALL cases (NGE and Legacy)
- Text search uses Elasticsearch (already)
- Coding overlays DO update tags/taggings
Decision¶
Move custom field values from MySQL (Tag + Tagging tables) to S3 JSON + Elasticsearch for new imports. Maintain audit trail via existing ModelAudit system at the Exhibit level.
New Data Flow¶
```
Extractor → S3 (metadata.json) → SQS → DocumentLoader
                                          ↓
Per document:
  1. Create Exhibit in Aurora (core fields only)
  2. Store custom fields in S3 JSON
  3. Index to Elasticsearch (includes custom fields)
  4. Async audit sync to Aurora (ModelAudit only, no Tag/Tagging)
```
Total: ~10 queries per document
Storage Strategy¶
S3 Structure: one JSON object per exhibit, keyed as `case_{case_id}/exhibit_{exhibit_id}_custom_fields.json` in the `nextpoint-metadata` bucket (matching the import and read paths below).
JSON Format:
```json
{
  "exhibit_id": 67890,
  "custom_fields": {
    "client_name": "Acme Corporation",
    "contract_date": "2024-01-15",
    "priority": "High",
    "amount": "50000"
  },
  "metadata": {
    "imported_at": "2024-01-01T10:00:00Z",
    "batch_id": 789
  }
}
```
Audit Trail¶
Use existing ModelAudit system at Exhibit level (not per-Tagging):
```ruby
class Exhibit < PerCaseModel
  include AuditsTransactions

  attribute :custom_fields_json, :text

  audits_transactions(
    unreported: [:custom_fields_json],
    updated_instance_changes: lambda { |exhibit, model_audit|
      old_cf = JSON.parse(model_audit.parsed_previous_version['custom_fields_json'] || '{}')
      new_cf = JSON.parse(model_audit.parsed_updated_version['custom_fields_json'] || '{}')
      changes = []
      (old_cf.keys + new_cf.keys).uniq.each do |field|
        changes << [field, old_cf[field], new_cf[field]] if old_cf[field] != new_cf[field]
      end
      changes
    }
  )
end
```
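The change-detection lambda above is essentially a dict diff. A minimal Python sketch of the same logic (the function name is ours; fields are sorted for deterministic output):

```python
def custom_field_changes(old_cf: dict, new_cf: dict) -> list:
    """Return (field, old_value, new_value) for every field whose value changed,
    including fields that were added or removed (missing side reads as None)."""
    changes = []
    for field in sorted(set(old_cf) | set(new_cf)):
        if old_cf.get(field) != new_cf.get(field):
            changes.append((field, old_cf.get(field), new_cf.get(field)))
    return changes
```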
FilterSearch Migration¶
Current FilterSearch uses MySQL JOINs for tag-based document filtering. This must be migrated to query Elasticsearch instead. The document list uses infinite scroll with AG Grid — the switch from MySQL JOINs to ES queries is compatible with this UI pattern.
Phase: FilterSearch → ES migration happens in Phase 2, before Phase 3 (stop MySQL tag writes).
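One shape the rewritten tag filters could take, sketched as a Python helper that builds the ES request body. The index layout (custom fields as a flat `custom_fields` object with `.keyword` sub-fields) and the `search_after` pager are assumptions, not the existing mapping:

```python
def tag_filter_query(filters: dict, page_size: int = 100, after=None) -> dict:
    """Build an ES bool-filter query for {field_name: value} custom field filters.

    Assumes custom fields are indexed as a flat `custom_fields` object with
    `.keyword` sub-fields; `search_after` plus a stable sort supports the
    AG Grid infinite-scroll pattern without deep `from` offsets.
    """
    body = {
        "size": page_size,
        "query": {
            "bool": {
                "filter": [
                    {"term": {f"custom_fields.{field}.keyword": value}}
                    for field, value in filters.items()
                ]
            }
        },
        "sort": [{"exhibit_id": "asc"}],  # stable tiebreaker for search_after
    }
    if after is not None:
        body["search_after"] = after
    return body
```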
Custom Field ID Caching¶
CustomField definitions (title → id mapping) must be cached for import performance. All Lambda invocations within a batch should access the same cache.
Options:
- ElastiCache Redis (shared across Lambdas) — recommended
- DynamoDB (if Redis not available in Lambda VPC)
- S3 JSON manifest per batch (simplest, loaded once per Lambda cold start)
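The S3-manifest option could look like this minimal sketch; the bucket/key layout and the manifest shape (`{field_title: custom_field_id}`) are illustrative assumptions:

```python
import json

_FIELD_ID_CACHE = None  # module-level: survives for the life of the Lambda container


def custom_field_ids(s3_client, bucket: str, batch_id: int) -> dict:
    """Load the {field_title: custom_field_id} manifest once per cold start.

    Subsequent invocations on a warm container reuse the cached dict and
    never touch S3 again.
    """
    global _FIELD_ID_CACHE
    if _FIELD_ID_CACHE is None:
        obj = s3_client.get_object(
            Bucket=bucket,
            Key=f"batch_{batch_id}/custom_field_manifest.json",
        )
        _FIELD_ID_CACHE = json.loads(obj["Body"].read())
    return _FIELD_ID_CACHE
```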
Legacy Case Support¶
Legacy cases (18K+) continue using existing Tag/Tagging tables. No migration of old data. Old cases only lose Tag/Tagging data when cases are deleted.
The dual path:
- New imports (NGE): S3 JSON + ES + ModelAudit
- Legacy cases: existing Tag/Tagging MySQL path (unchanged)
- Reads: S3 first, fallback to MySQL (for Legacy data)
Architecture¶
Import Path (documentloader — Python)¶
```python
def process_document(event, metadata):
    # 1. Create Exhibit (core fields only)
    exhibit = create_exhibit(session, metadata)

    # 2. Store custom fields in S3
    s3_client.put_object(
        Bucket=METADATA_BUCKET,
        Key=f"case_{case_id}/exhibit_{exhibit.id}_custom_fields.json",
        Body=json.dumps({
            "exhibit_id": exhibit.id,
            "custom_fields": metadata.get("custom_fields", {}),
            "metadata": {"imported_at": now_iso(), "batch_id": batch_id},
        }),
        ContentType="application/json",
    )

    # 3. Index to ES (includes custom fields)
    es_tags = build_es_tags(metadata["custom_fields"], custom_field_cache)
    index_to_es(exhibit, es_tags)

    # 4. No Tag/Tagging writes — 100+ queries eliminated
```
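`build_es_tags` is referenced but not defined in this ADR. One plausible sketch, assuming the cache maps field titles to CustomField ids and ES stores one entry per (field id, value) pair; the output shape is our assumption, not the existing mapping:

```python
def build_es_tags(custom_fields: dict, field_id_cache: dict) -> list:
    """Map {field_title: value} pairs to ES tag entries.

    Fields without a cached CustomField definition are skipped rather than
    raising, so one unknown field does not fail the whole document.
    """
    return [
        {"custom_field_id": field_id_cache[title], "name": value}
        for title, value in custom_fields.items()
        if title in field_id_cache
    ]
```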
Read Path (Rails — document editor/viewer)¶
```ruby
class Exhibit < PerCaseModel
  def custom_fields
    @custom_fields ||= read_from_s3_or_mysql
  end

  private

  def read_from_s3_or_mysql
    # Try S3 first (new imports)
    metadata = S3.get_object(
      bucket: 'nextpoint-metadata',
      key: "case_#{npcase_id}/exhibit_#{id}_custom_fields.json"
    )
    JSON.parse(metadata.body.read)["custom_fields"]
  rescue Aws::S3::Errors::NoSuchKey
    # Fallback to MySQL (Legacy imports)
    tags_by_field = tags.includes(:custom_field).group_by(&:custom_field)
    tags_by_field.transform_values { |tags| tags.map(&:name) }
  end
end
```
Update Path (Rails — document editor, coding overlays)¶
```ruby
def update_custom_field(field_name, new_value, user_id)
  metadata = read_metadata_from_s3
  old_value = metadata[:custom_fields][field_name]

  # 1. Update S3
  metadata[:custom_fields][field_name] = new_value
  write_metadata_to_s3(metadata)

  # 2. Update Elasticsearch
  ElasticsearchUpdateJob.perform_async(id, field_name, new_value)

  # 3. Audit via ModelAudit (1 query, not 5)
  self.custom_fields_json = metadata[:custom_fields].to_json
  @self_before_action = { id: id, custom_fields_json: { field_name => old_value }.to_json }.to_json
  perform_audit
end
```
Migration Plan¶
Phase 1: Parallel Write (Weeks 1-2)¶
- documentloader writes custom fields to S3 JSON alongside existing Tag/Tagging
- Validate S3 data matches MySQL data
- No user-facing changes
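The Phase 1 validation step could be sketched as a per-exhibit comparison. The helper name and the normalization are ours: per the read path, MySQL yields a list of tag names per field, while the S3 JSON in this sketch stores one scalar value per field:

```python
def s3_matches_mysql(s3_fields: dict, mysql_tags: dict) -> bool:
    """Compare the S3 JSON payload with tag names grouped by field from MySQL.

    Single-element MySQL lists are unwrapped to scalars before comparison so
    {"priority": "High"} matches {"priority": ["High"]}.
    """
    normalized = {
        field: names[0] if len(names) == 1 else names
        for field, names in mysql_tags.items()
    }
    return s3_fields == normalized
```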
Phase 2: FilterSearch → ES Migration (Weeks 3-4)¶
- Rewrite tag-based filters in FilterSearch to query ES instead of MySQL JOINs
- Test with infinite scroll AG Grid views
- Deploy behind feature flag
Phase 3: Read from S3 (Weeks 5-6)¶
- Exhibit model reads custom fields from S3 (MySQL fallback for Legacy)
- Document editor uses S3 for display
- Updates write to S3 + ES + ModelAudit (stop MySQL Tag/Tagging writes)
Phase 4: Cleanup (Weeks 7-8)¶
- Stop parallel MySQL writes for new imports
- Keep existing Tag/Tagging data for Legacy cases (read-only)
- Archive unused code
- Performance optimization
Performance Comparison¶
Import (100K documents, 50 custom fields)¶
| Metric | Current (Legacy) | S3-Based | Improvement |
|---|---|---|---|
| Queries per document | ~200 | ~10 | 20x |
| Total queries | 20M | 1M | 20x |
| Import time | ~10 hours | ~1 hour | 10x |
| DB CPU during import | 80-100% | <20% | 5x |
Update (single custom field change)¶
| Metric | Current | S3-Based |
|---|---|---|
| Operations | 5 queries + 2 audits | 1 S3 GET + 1 S3 PUT + 1 ES + 1 audit |
| Time | ~50ms | ~65ms |
| DB load | 7 queries | 1 query |
Cost (100M documents, 50 custom fields)¶
| Component | Current | S3-Based | Savings |
|---|---|---|---|
| Aurora storage (tags+taggings) | ~$100/mo | $0 | $100/mo |
| S3 storage + requests | $0 | ~$15/mo | -$15/mo |
| Aurora IOPS (import) | High | Low | Significant |
| Net | ~$150/mo | ~$25/mo | ~$125/mo |
Consequences¶
Positive¶
- 10x faster imports for custom-field-heavy load files
- 95% reduction in Aurora queries during import
- Tags/taggings tables stop growing for new imports
- Audit trail maintained via existing ModelAudit system
- Gradual migration with rollback capability
Negative¶
- Dual read path (S3 + MySQL fallback) adds complexity
- FilterSearch must be migrated to ES (additional scope)
- S3 read latency (~50ms vs ~5ms for MySQL), mitigated by Redis caching
- Legacy cases keep old Tag/Tagging data indefinitely (until case deleted)
- Custom field ID cache needed across Lambda invocations
Risks¶
- FilterSearch migration scope — Rewriting tag-based SQL JOINs to ES queries is non-trivial. Mitigation: feature flag, parallel testing.
- S3/ES consistency — Custom field updates must atomically update both. Mitigation: reconciliation job, S3 versioning for rollback.
- Coding overlay performance — Bulk overlay on 10K documents requires 10K S3 writes. Mitigation: batch S3 operations, async processing.
- Legacy coexistence — Dual read path (S3 vs MySQL) must work seamlessly. Mitigation: Exhibit#custom_fields abstracts storage behind single interface.
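The batched-S3 mitigation for bulk coding overlays could be sketched with bounded parallel PUTs; the function name and key layout are illustrative:

```python
import json
from concurrent.futures import ThreadPoolExecutor


def bulk_overlay(s3_client, bucket: str, updates: dict, max_workers: int = 32):
    """Write one updated metadata JSON per exhibit with bounded parallelism.

    updates: {s3_key: metadata_dict}. Parallel PUTs keep a 10K-document
    overlay from running as 10K sequential round-trips.
    """
    def put(item):
        key, metadata = item
        s3_client.put_object(
            Bucket=bucket,
            Key=key,
            Body=json.dumps(metadata),
            ContentType="application/json",
        )

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(put, updates.items()))  # drain to surface any exceptions
```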
References¶
- FilterSearch: `rails/app/models/filter_search.rb` (lines 454-530, tag JOINs)
- Tagging audit: `rails/app/models/tagging.rb` (AuditsTransactions)
- NGE TagDedupe: `documentloader/shell/tags_ops.py` (existing optimization)
- Divergence map: Tag/Custom Field section
- BACKLOG: Import throughput items