ADR-011: Custom Field Storage Modernization (S3 + ES + Async Audit)¶
Status¶
Proposed
Date¶
2026-03-20
Context¶
The Problem¶
Imports with many custom fields put heavy load on Aurora MySQL. The Legacy path generates 100+ queries per document: 50+ `Tag.create_or_find_with_index` calls (2-3 queries each) plus 50+ `Tagging.create` calls. At 100K documents × 50 custom fields, that is ~20M queries, so imports take ~10 hours at 80-100% database CPU.
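The query arithmetic above can be sanity-checked with a quick sketch (the per-field query count is this ADR's own estimate):

```python
docs = 100_000
fields_per_doc = 50
queries_per_field = 4  # 2-3 for Tag.create_or_find_with_index, 1 for Tagging.create

queries_per_doc = fields_per_doc * queries_per_field
total_queries = docs * queries_per_doc

print(queries_per_doc)  # 200
print(total_queries)    # 20000000
```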
Current Scale¶
- 18K+ Legacy cases, ~1K NGE cases
- Legacy cases will remain until deleted (no old-case migration)
- Tags/taggings are the biggest tables in per-case databases
- 96% of tagging writes are never read — ES is the actual query layer
Current Data Flow¶
```
Extractor → S3 (metadata.json) → SQS → DocumentLoader
                                          ↓
Per document:
  1. Create Exhibit in Aurora (core fields)
  2. For each custom field (50+):
     - Tag.create_or_find_with_index (2-3 queries)
     - Tagging.create (1 query)
  3. Index to Elasticsearch
```

Total: 100+ queries per document (Legacy path)
NGE path: Reduced via TagDedupe + INSERT IGNORE, but still significant
Current Tables¶
- CustomField: Field definitions (title, input_type) — small, stays in RDS
- Tag: Unique values per field (normalized) — large, 2.5B+ rows across all cases
- Tagging: Links exhibits to tags (with user_id for audit) — largest, 5B+ rows
What We Verified¶
- Tagging model HAS an audit trail via the `AuditsTransactions` concern (ModelAudit records)
- Schema has NO actual FK constraints — only Rails-level `dependent: :delete_all`
- Concurrent edit safety is advisory (unique index + try-fail), not pessimistic locks
- FilterSearch (document browse/filter) uses MySQL JOINs on tags/taggings for ALL cases (NGE and Legacy)
- Text search uses Elasticsearch (already)
- Coding overlays DO update tags/taggings
Decision¶
Move custom field values from MySQL (Tag + Tagging tables) to S3 JSON + Elasticsearch for new imports. Maintain audit trail via existing ModelAudit system at the Exhibit level.
New Data Flow¶
```
Extractor → S3 (metadata.json) → SQS → DocumentLoader
                                          ↓
Per document:
  1. Create Exhibit in Aurora (core fields only)
  2. Store custom fields in S3 JSON
  3. Index to Elasticsearch (includes custom fields)
  4. Async audit sync to Aurora (ModelAudit only, no Tag/Tagging)
```
Total: ~10 queries per document
Storage Strategy¶
S3 Structure: one JSON object per exhibit, keyed as `case_{case_id}/exhibit_{exhibit_id}_custom_fields.json` in the `nextpoint-metadata` bucket (matching the import and read paths below).
JSON Format:
```json
{
  "exhibit_id": 67890,
  "custom_fields": {
    "client_name": "Acme Corporation",
    "contract_date": "2024-01-15",
    "priority": "High",
    "amount": "50000"
  },
  "metadata": {
    "imported_at": "2024-01-01T10:00:00Z",
    "batch_id": 789
  }
}
```
Audit Trail¶
Use existing ModelAudit system at Exhibit level (not per-Tagging):
```ruby
class Exhibit < PerCaseModel
  include AuditsTransactions

  attribute :custom_fields_json, :text

  audits_transactions(
    unreported: [:custom_fields_json],
    updated_instance_changes: lambda { |exhibit, model_audit|
      old_cf = JSON.parse(model_audit.parsed_previous_version['custom_fields_json'] || '{}')
      new_cf = JSON.parse(model_audit.parsed_updated_version['custom_fields_json'] || '{}')
      changes = []
      (old_cf.keys + new_cf.keys).uniq.each do |field|
        changes << [field, old_cf[field], new_cf[field]] if old_cf[field] != new_cf[field]
      end
      changes
    }
  )
end
```
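The change-detection lambda above is essentially a dict diff. A minimal Python sketch of the same logic (the function name is ours; fields are sorted for deterministic output):

```python
def custom_field_changes(old_cf: dict, new_cf: dict) -> list:
    """Return (field, old_value, new_value) for every field whose value changed,
    including fields that were added or removed (missing side reads as None)."""
    changes = []
    for field in sorted(set(old_cf) | set(new_cf)):
        if old_cf.get(field) != new_cf.get(field):
            changes.append((field, old_cf.get(field), new_cf.get(field)))
    return changes
```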
FilterSearch Migration¶
Current FilterSearch uses MySQL JOINs for tag-based document filtering. This must be migrated to query Elasticsearch instead. The document list uses infinite scroll with AG Grid — the switch from MySQL JOINs to ES queries is compatible with this UI pattern.
Phase: FilterSearch → ES migration happens in Phase 2, before Phase 3 (stop MySQL tag writes).
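One shape the rewritten tag filters could take, sketched as a Python helper that builds the ES request body. The index layout (custom fields as a flat `custom_fields` object with `.keyword` sub-fields) and the `search_after` pager are assumptions, not the existing mapping:

```python
def tag_filter_query(filters: dict, page_size: int = 100, after=None) -> dict:
    """Build an ES bool-filter query for {field_name: value} custom field filters.

    Assumes custom fields are indexed as a flat `custom_fields` object with
    `.keyword` sub-fields; `search_after` plus a stable sort supports the
    AG Grid infinite-scroll pattern without deep `from` offsets.
    """
    body = {
        "size": page_size,
        "query": {
            "bool": {
                "filter": [
                    {"term": {f"custom_fields.{field}.keyword": value}}
                    for field, value in filters.items()
                ]
            }
        },
        "sort": [{"exhibit_id": "asc"}],  # stable tiebreaker for search_after
    }
    if after is not None:
        body["search_after"] = after
    return body
```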
Custom Field ID Caching¶
CustomField definitions (title → id mapping) must be cached for import performance. All Lambda invocations within a batch should access the same cache.
Options:
- ElastiCache Redis (shared across Lambdas) — recommended
- DynamoDB (if Redis not available in Lambda VPC)
- S3 JSON manifest per batch (simplest, loaded once per Lambda cold start)
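The S3-manifest option could look like this minimal sketch; the bucket/key layout and the manifest shape (`{field_title: custom_field_id}`) are illustrative assumptions:

```python
import json

_FIELD_ID_CACHE = None  # module-level: survives for the life of the Lambda container


def custom_field_ids(s3_client, bucket: str, batch_id: int) -> dict:
    """Load the {field_title: custom_field_id} manifest once per cold start.

    Subsequent invocations on a warm container reuse the cached dict and
    never touch S3 again.
    """
    global _FIELD_ID_CACHE
    if _FIELD_ID_CACHE is None:
        obj = s3_client.get_object(
            Bucket=bucket,
            Key=f"batch_{batch_id}/custom_field_manifest.json",
        )
        _FIELD_ID_CACHE = json.loads(obj["Body"].read())
    return _FIELD_ID_CACHE
```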
Legacy Case Support¶
Legacy cases (18K+) continue using existing Tag/Tagging tables. No migration of old data. Old cases only lose Tag/Tagging data when cases are deleted.
The dual path:
- New imports (NGE): S3 JSON + ES + ModelAudit
- Legacy cases: existing Tag/Tagging MySQL path (unchanged)
- Reads: S3 first, fallback to MySQL (for Legacy data)
Architecture¶
Import Path (documentloader — Python)¶
```python
def process_document(event, metadata):
    # 1. Create Exhibit (core fields only)
    exhibit = create_exhibit(session, metadata)

    # 2. Store custom fields in S3
    s3_client.put_object(
        Bucket=METADATA_BUCKET,
        Key=f"case_{case_id}/exhibit_{exhibit.id}_custom_fields.json",
        Body=json.dumps({
            "exhibit_id": exhibit.id,
            "custom_fields": metadata.get("custom_fields", {}),
            "metadata": {"imported_at": now_iso(), "batch_id": batch_id},
        }),
        ContentType="application/json",
    )

    # 3. Index to ES (includes custom fields)
    es_tags = build_es_tags(metadata["custom_fields"], custom_field_cache)
    index_to_es(exhibit, es_tags)

    # 4. No Tag/Tagging writes — 100+ queries eliminated
```
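`build_es_tags` is referenced but not defined in this ADR. One plausible sketch, assuming the cache maps field titles to CustomField ids and ES stores one entry per (field id, value) pair; the output shape is our assumption, not the existing mapping:

```python
def build_es_tags(custom_fields: dict, field_id_cache: dict) -> list:
    """Map {field_title: value} pairs to ES tag entries.

    Fields without a cached CustomField definition are skipped rather than
    raising, so one unknown field does not fail the whole document.
    """
    return [
        {"custom_field_id": field_id_cache[title], "name": value}
        for title, value in custom_fields.items()
        if title in field_id_cache
    ]
```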
Read Path (Rails — document editor/viewer)¶
```ruby
class Exhibit < PerCaseModel
  def custom_fields
    @custom_fields ||= read_from_s3_or_mysql
  end

  private

  def read_from_s3_or_mysql
    # Try S3 first (new imports)
    metadata = S3.get_object(
      bucket: 'nextpoint-metadata',
      key: "case_#{npcase_id}/exhibit_#{id}_custom_fields.json"
    )
    JSON.parse(metadata.body.read)["custom_fields"]
  rescue Aws::S3::Errors::NoSuchKey
    # Fallback to MySQL (Legacy imports)
    tags_by_field = tags.includes(:custom_field).group_by(&:custom_field)
    tags_by_field.transform_values { |tags| tags.map(&:name) }
  end
end
```
Update Path (Rails — document editor, coding overlays)¶
```ruby
def update_custom_field(field_name, new_value, user_id)
  metadata = read_metadata_from_s3
  old_value = metadata[:custom_fields][field_name]

  # 1. Update S3
  metadata[:custom_fields][field_name] = new_value
  write_metadata_to_s3(metadata)

  # 2. Update Elasticsearch
  ElasticsearchUpdateJob.perform_async(id, field_name, new_value)

  # 3. Audit via ModelAudit (1 query, not 5)
  self.custom_fields_json = metadata[:custom_fields].to_json
  @self_before_action = { id: id, custom_fields_json: { field_name => old_value }.to_json }.to_json
  perform_audit
end
```
Migration Plan¶
Phase 1: Parallel Write (Weeks 1-2)¶
- documentloader writes custom fields to S3 JSON alongside existing Tag/Tagging
- Validate S3 data matches MySQL data
- No user-facing changes
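The Phase 1 validation step could be sketched as a per-exhibit comparison. The helper name and the normalization are ours: per the read path, MySQL yields a list of tag names per field, while the S3 JSON in this sketch stores one scalar value per field:

```python
def s3_matches_mysql(s3_fields: dict, mysql_tags: dict) -> bool:
    """Compare the S3 JSON payload with tag names grouped by field from MySQL.

    Single-element MySQL lists are unwrapped to scalars before comparison so
    {"priority": "High"} matches {"priority": ["High"]}.
    """
    normalized = {
        field: names[0] if len(names) == 1 else names
        for field, names in mysql_tags.items()
    }
    return s3_fields == normalized
```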
Phase 2: FilterSearch → ES Migration (Weeks 3-4)¶
- Rewrite tag-based filters in FilterSearch to query ES instead of MySQL JOINs
- Test with infinite scroll AG Grid views
- Deploy behind feature flag
Phase 3: Read from S3 (Weeks 5-6)¶
- Exhibit model reads custom fields from S3 (MySQL fallback for Legacy)
- Document editor uses S3 for display
- Updates write to S3 + ES + ModelAudit (stop MySQL Tag/Tagging writes)
Phase 4: Cleanup (Weeks 7-8)¶
- Stop parallel MySQL writes for new imports
- Keep existing Tag/Tagging data for Legacy cases (read-only)
- Archive unused code
- Performance optimization
Performance Comparison¶
Import (100K documents, 50 custom fields)¶
| Metric | Current (Legacy) | S3-Based | Improvement |
|---|---|---|---|
| Queries per document | ~200 | ~10 | 20x |
| Total queries | 20M | 1M | 20x |
| Import time | ~10 hours | ~1 hour | 10x |
| DB CPU during import | 80-100% | <20% | 5x |
Update (single custom field change)¶
| Metric | Current | S3-Based |
|---|---|---|
| Operations | 5 queries + 2 audits | 1 S3 GET + 1 S3 PUT + 1 ES + 1 audit |
| Time | ~50ms | ~65ms |
| DB load | 7 queries | 1 query |
Cost (100M documents, 50 custom fields)¶
| Component | Current | S3-Based | Savings |
|---|---|---|---|
| Aurora storage (tags+taggings) | ~$100/mo | $0 | $100/mo |
| S3 storage + requests | $0 | ~$15/mo | -$15/mo |
| Aurora IOPS (import) | High | Low | Significant |
| Net | ~$150/mo | ~$25/mo | ~$125/mo |
Consequences¶
Positive¶
- 10x faster imports for custom-field-heavy load files
- 95% reduction in Aurora queries during import
- Tags/taggings tables stop growing for new imports
- Audit trail maintained via existing ModelAudit system
- Gradual migration with rollback capability
Negative¶
- Dual read path (S3 + MySQL fallback) adds complexity
- FilterSearch must be migrated to ES (additional scope)
- S3 read latency (~50ms vs ~5ms for MySQL), mitigated by Redis caching
- Legacy cases keep old Tag/Tagging data indefinitely (until case deleted)
- Custom field ID cache needed across Lambda invocations
Risks¶
- FilterSearch migration scope — Rewriting tag-based SQL JOINs to ES queries is non-trivial. Mitigation: feature flag, parallel testing.
- S3/ES consistency — Custom field updates must atomically update both. Mitigation: reconciliation job, S3 versioning for rollback.
- Coding overlay performance — Bulk overlay on 10K documents requires 10K S3 writes. Mitigation: batch S3 operations, async processing.
- Legacy coexistence — Dual read path (S3 vs MySQL) must work seamlessly. Mitigation: Exhibit#custom_fields abstracts storage behind single interface.
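The batched-S3 mitigation for bulk coding overlays could be sketched with bounded parallel PUTs; the function name and key layout are illustrative:

```python
import json
from concurrent.futures import ThreadPoolExecutor


def bulk_overlay(s3_client, bucket: str, updates: dict, max_workers: int = 32):
    """Write one updated metadata JSON per exhibit with bounded parallelism.

    updates: {s3_key: metadata_dict}. Parallel PUTs keep a 10K-document
    overlay from running as 10K sequential round-trips.
    """
    def put(item):
        key, metadata = item
        s3_client.put_object(
            Bucket=bucket,
            Key=key,
            Body=json.dumps(metadata),
            ContentType="application/json",
        )

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(put, updates.items()))  # drain to surface any exceptions
```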
References¶
- FilterSearch: `rails/app/models/filter_search.rb` (lines 454-530, tag JOINs)
- Tagging audit: `rails/app/models/tagging.rb` (AuditsTransactions)
- NGE TagDedupe: `documentloader/shell/tags_ops.py` (existing optimization)
- Divergence map: Tag/Custom Field section
- BACKLOG: Import throughput items