
ADR-011: Custom Field Storage Modernization (S3 + ES + Async Audit)

Status

Proposed

Date

2026-03-20

Context

The Problem

Imports of load files with many custom fields put heavy, sustained load on Aurora MySQL. The Legacy path generates 100+ queries per document: 50+ Tag.create_or_find_with_index calls (2-3 queries each) plus 50+ Tagging.create calls. At 100K documents × 50 custom fields, that is roughly 20M queries, and imports take ~10 hours at 80-100% database CPU.

Current Scale

  • 18K+ Legacy cases, ~1K NGE cases
  • Legacy cases will remain until deleted (no old-case migration)
  • Tags/taggings are the biggest tables in per-case databases
  • 96% of tagging writes are never read — ES is the actual query layer

Current Data Flow

Extractor → S3 (metadata.json) → SQS → DocumentLoader
  Per document:
  1. Create Exhibit in Aurora (core fields)
  2. For each custom field (50+):
     - Tag.create_or_find_with_index (2-3 queries)
     - Tagging.create (1 query)
  3. Index to Elasticsearch

  Total: 100+ queries per document (Legacy path)
  NGE path: Reduced via TagDedupe + INSERT IGNORE, but still significant

Current Tables

  • CustomField: Field definitions (title, input_type) — small, stays in RDS
  • Tag: Unique values per field (normalized) — large, 2.5B+ rows across all cases
  • Tagging: Links exhibits to tags (with user_id for audit) — largest, 5B+ rows

What We Verified

  • Tagging model HAS audit trail via AuditsTransactions concern (ModelAudit records)
  • Schema has NO actual FK constraints — only Rails-level dependent: :delete_all
  • Concurrent edit safety is advisory (unique index + try-fail), not pessimistic locks
  • FilterSearch (document browse/filter) uses MySQL JOINs on tags/taggings for ALL cases (NGE and Legacy)
  • Text search uses Elasticsearch (already)
  • Coding overlays DO update tags/taggings

Decision

Move custom field values from MySQL (Tag + Tagging tables) to S3 JSON + Elasticsearch for new imports. Maintain audit trail via existing ModelAudit system at the Exhibit level.

New Data Flow

Extractor → S3 (metadata.json) → SQS → DocumentLoader
  Per document:
  1. Create Exhibit in Aurora (core fields only)
  2. Store custom fields in S3 JSON
  3. Index to Elasticsearch (includes custom fields)
  4. Async audit sync to Aurora (ModelAudit only, no Tag/Tagging)

  Total: ~10 queries per document

Storage Strategy

S3 Structure:

s3://nextpoint-metadata/
  case_{case_id}/
    exhibit_{exhibit_id}_custom_fields.json

JSON Format:

{
  "exhibit_id": 67890,
  "custom_fields": {
    "client_name": "Acme Corporation",
    "contract_date": "2024-01-15",
    "priority": "High",
    "amount": "50000"
  },
  "metadata": {
    "imported_at": "2024-01-01T10:00:00Z",
    "batch_id": 789
  }
}
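A minimal reader for this format can double as a sanity check during import and reconciliation. This sketch assumes only the key names shown above; the helper name is illustrative:

```python
import json

REQUIRED_KEYS = {"exhibit_id", "custom_fields", "metadata"}

def parse_custom_fields_doc(raw):
    """Parse one exhibit's custom-fields JSON and verify the
    top-level keys from the format above are present.
    Raises ValueError on a malformed document."""
    doc = json.loads(raw)
    missing = REQUIRED_KEYS - doc.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return doc["exhibit_id"], doc["custom_fields"]
```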

Audit Trail

Use existing ModelAudit system at Exhibit level (not per-Tagging):

class Exhibit < PerCaseModel
  include AuditsTransactions

  attribute :custom_fields_json, :text

  audits_transactions(
    unreported: [:custom_fields_json],
    updated_instance_changes: lambda { |exhibit, model_audit|
      old_cf = JSON.parse(model_audit.parsed_previous_version['custom_fields_json'] || '{}')
      new_cf = JSON.parse(model_audit.parsed_updated_version['custom_fields_json'] || '{}')
      changes = []
      (old_cf.keys + new_cf.keys).uniq.each do |field|
        changes << [field, old_cf[field], new_cf[field]] if old_cf[field] != new_cf[field]
      end
      changes
    }
  )
end

FilterSearch Migration

Current FilterSearch uses MySQL JOINs for tag-based document filtering. This must be migrated to query Elasticsearch instead. The document list uses infinite scroll with AG Grid — the switch from MySQL JOINs to ES queries is compatible with this UI pattern.

Phase: FilterSearch → ES migration happens in Phase 2, before Phase 3 (stop MySQL tag writes).
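As a sketch of what the Phase 2 rewrite produces, a tag-value filter can be expressed as an ES bool query. The `custom_fields.*.keyword` mapping path is an assumption about how the importer will index these fields, not the existing index layout:

```python
def build_custom_field_filter(filters):
    """Translate tag-based filter predicates into an ES bool query.

    `filters` maps field name -> list of accepted values, i.e. the
    equivalent of the current tags/taggings JOIN conditions. Assumes
    custom fields are indexed under a `custom_fields` object with
    `.keyword` sub-fields for exact matching (hypothetical mapping).
    """
    must = [
        {"terms": {f"custom_fields.{field}.keyword": values}}
        for field, values in filters.items()
    ]
    return {"query": {"bool": {"must": must}}}
```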

Custom Field ID Caching

CustomField definitions (title → id mapping) must be cached for import performance. All Lambda invocations within a batch should access the same cache.

Options:

  • ElastiCache Redis (shared across Lambdas) — recommended
  • DynamoDB (if Redis not available in Lambda VPC)
  • S3 JSON manifest per batch (simplest, loaded once per Lambda cold start)
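A minimal sketch of the manifest option, assuming the batch creator writes a `{title: id}` JSON manifest under a hypothetical `batch_{batch_id}/custom_field_manifest.json` key; the module-level dict survives across warm Lambda invocations:

```python
import json

_FIELD_ID_CACHE = {}  # populated once per Lambda cold start, reused while warm

def load_field_id_cache(s3_client, bucket, batch_id):
    """Load the CustomField title -> id manifest for a batch.

    Fetches the manifest from S3 on first use and caches it in
    process memory, so each Lambda container pays one GET per batch.
    Bucket/key layout here is an assumption, not the current schema.
    """
    if batch_id not in _FIELD_ID_CACHE:
        obj = s3_client.get_object(
            Bucket=bucket, Key=f"batch_{batch_id}/custom_field_manifest.json"
        )
        _FIELD_ID_CACHE[batch_id] = json.loads(obj["Body"].read())
    return _FIELD_ID_CACHE[batch_id]
```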

Legacy Case Support

Legacy cases (18K+) continue using existing Tag/Tagging tables. No migration of old data. Old cases only lose Tag/Tagging data when cases are deleted.

The dual path:

  • New imports (NGE): S3 JSON + ES + ModelAudit
  • Legacy cases: existing Tag/Tagging MySQL path (unchanged)
  • Reads: S3 first, fallback to MySQL (for Legacy data)

Architecture

Import Path (documentloader — Python)

def process_document(event, metadata):
    # 1. Create Exhibit (core fields only)
    case_id = metadata["case_id"]        # assumed present in extractor metadata
    batch_id = metadata.get("batch_id")  # used in the S3 metadata block below
    exhibit = create_exhibit(session, metadata)

    # 2. Store custom fields in S3
    s3_client.put_object(
        Bucket=METADATA_BUCKET,
        Key=f"case_{case_id}/exhibit_{exhibit.id}_custom_fields.json",
        Body=json.dumps({
            "exhibit_id": exhibit.id,
            "custom_fields": metadata.get("custom_fields", {}),
            "metadata": {"imported_at": now_iso(), "batch_id": batch_id}
        }),
        ContentType="application/json"
    )

    # 3. Index to ES (includes custom fields)
    es_tags = build_es_tags(metadata.get("custom_fields", {}), custom_field_cache)
    index_to_es(exhibit, es_tags)

    # 4. No Tag/Tagging writes — 100+ queries eliminated

Read Path (Rails — document editor/viewer)

class Exhibit < PerCaseModel
  def custom_fields
    @custom_fields ||= read_from_s3_or_mysql
  end

  private

  def read_from_s3_or_mysql
    # Try S3 first (new imports)
    metadata = S3.get_object(
      bucket: 'nextpoint-metadata',
      key: "case_#{npcase_id}/exhibit_#{id}_custom_fields.json"
    )
    JSON.parse(metadata.body.read)["custom_fields"]
  rescue Aws::S3::Errors::NoSuchKey
    # Fallback to MySQL (Legacy imports)
    tags_by_field = tags.includes(:custom_field).group_by { |t| t.custom_field.title }
    tags_by_field.transform_values { |field_tags| field_tags.map(&:name) }
  end
end

Update Path (Rails — document editor, coding overlays)

def update_custom_field(field_name, new_value, user_id)
  metadata = read_metadata_from_s3
  old_value = metadata[:custom_fields][field_name]

  # 1. Update S3
  metadata[:custom_fields][field_name] = new_value
  write_metadata_to_s3(metadata)

  # 2. Update Elasticsearch
  ElasticsearchUpdateJob.perform_async(id, field_name, new_value)

  # 3. Audit via ModelAudit (1 query, not 5)
  self.custom_fields_json = metadata[:custom_fields].to_json
  @self_before_action = { id: id, custom_fields_json: {field_name => old_value}.to_json }.to_json
  perform_audit
end

Migration Plan

Phase 1: Parallel Write (Weeks 1-2)

  • documentloader writes custom fields to S3 JSON alongside existing Tag/Tagging
  • Validate S3 data matches MySQL data
  • No user-facing changes
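The Phase 1 validation step can be a pure comparison between the two copies of each exhibit's fields; field names and value shapes here are illustrative:

```python
def diff_custom_fields(mysql_fields, s3_fields):
    """Compare the MySQL-derived field map with the S3 JSON copy.

    Both inputs map field name -> value (or list of values). Returns
    the fields that disagree as {field: (mysql_value, s3_value)},
    for the parallel-write validation job to report.
    """
    mismatches = {}
    for field in set(mysql_fields) | set(s3_fields):
        if mysql_fields.get(field) != s3_fields.get(field):
            mismatches[field] = (mysql_fields.get(field), s3_fields.get(field))
    return mismatches
```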

Phase 2: FilterSearch → ES Migration (Weeks 3-4)

  • Rewrite tag-based filters in FilterSearch to query ES instead of MySQL JOINs
  • Test with infinite scroll AG Grid views
  • Deploy behind feature flag

Phase 3: Read from S3 (Weeks 5-6)

  • Exhibit model reads custom fields from S3 (MySQL fallback for Legacy)
  • Document editor uses S3 for display
  • Updates write to S3 + ES + ModelAudit (stop MySQL Tag/Tagging writes)

Phase 4: Cleanup (Weeks 7-8)

  • Stop parallel MySQL writes for new imports
  • Keep existing Tag/Tagging data for Legacy cases (read-only)
  • Archive unused code
  • Performance optimization

Performance Comparison

Import (100K documents, 50 custom fields)

  Metric                 Current (Legacy)   S3-Based   Improvement
  Queries per document   ~200               ~10        20x
  Total queries          20M                1M         20x
  Import time            ~10 hours          ~1 hour    10x
  DB CPU during import   80-100%            <20%       5x

Update (single custom field change)

  Metric       Current                S3-Based
  Operations   5 queries + 2 audits   1 S3 GET + 1 S3 PUT + 1 ES + 1 audit
  Time         ~50ms                  ~65ms
  DB load      7 queries              1 query

Cost (100M documents, 50 custom fields)

  Component                        Current    S3-Based   Savings
  Aurora storage (tags+taggings)   ~$100/mo   $0         $100/mo
  S3 storage + requests            $0         ~$15/mo    -$15/mo
  Aurora IOPS (import)             High       Low        Significant
  Net                              ~$150/mo   ~$25/mo    ~$125/mo

Consequences

Positive

  • 10x faster imports for custom-field-heavy load files
  • 95% reduction in Aurora queries during import
  • Tags/taggings tables stop growing for new imports
  • Audit trail maintained via existing ModelAudit system
  • Gradual migration with rollback capability

Negative

  • Dual read path (S3 + MySQL fallback) adds complexity
  • FilterSearch must be migrated to ES (additional scope)
  • S3 latency for reads (~50ms vs ~5ms MySQL) mitigated by Redis caching
  • Legacy cases keep old Tag/Tagging data indefinitely (until case deleted)
  • Custom field ID cache needed across Lambda invocations

Risks

  • FilterSearch migration scope — Rewriting tag-based SQL JOINs to ES queries is non-trivial. Mitigation: feature flag, parallel testing.
  • S3/ES consistency — Custom field updates must atomically update both. Mitigation: reconciliation job, S3 versioning for rollback.
  • Coding overlay performance — Bulk overlay on 10K documents requires 10K S3 writes. Mitigation: batch S3 operations, async processing.
  • Legacy coexistence — Dual read path (S3 vs MySQL) must work seamlessly. Mitigation: Exhibit#custom_fields abstracts storage behind single interface.
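For the coding-overlay risk above, one way to batch the S3 writes is a bounded thread pool: one PUT per exhibit is unavoidable with the per-exhibit JSON layout, but the pool caps wall-clock time versus 10K sequential writes. Key names follow the layout above; all helper names are hypothetical:

```python
import json
from concurrent.futures import ThreadPoolExecutor

def bulk_write_custom_fields(s3_client, bucket, case_id, updates, workers=32):
    """Apply a coding overlay to many exhibits with parallel S3 PUTs.

    `updates` maps exhibit_id -> full custom_fields dict. Each item
    becomes one PUT to the per-exhibit key; a thread pool bounds
    total latency. Returns the exhibit ids written.
    """
    def put(item):
        exhibit_id, fields = item
        s3_client.put_object(
            Bucket=bucket,
            Key=f"case_{case_id}/exhibit_{exhibit_id}_custom_fields.json",
            Body=json.dumps({"exhibit_id": exhibit_id, "custom_fields": fields}),
            ContentType="application/json",
        )
        return exhibit_id

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(put, updates.items()))
```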

References

  • FilterSearch: rails/app/models/filter_search.rb (lines 454-530, tag JOINs)
  • Tagging audit: rails/app/models/tagging.rb (AuditsTransactions)
  • NGE TagDedupe: documentloader/shell/tags_ops.py (existing optimization)
  • Divergence map: Tag/Custom Field section
  • BACKLOG: Import throughput items