Chain State Synchronization

Real-time chain monitoring and cache synchronization through WebSocket subscriptions.

Overview

Prism supports WebSocket connections to upstream RPC providers for real-time chain state synchronization. Each configured upstream can optionally provide a WebSocket URL (wss://...) to enable push-based block notifications.

┌─────────────────────────────────────────────────────────┐
│                   WebSocket Flow                         │
├─────────────────────────────────────────────────────────┤
│                                                          │
│  1. Connect to wss://upstream.provider.com              │
│  2. Subscribe: eth_subscribe("newHeads")                │
│  3. Receive: Block header notifications                 │
│  4. Process:                                             │
│     ├─ Update chain tip                                 │
│     ├─ Detect reorgs                                    │
│     ├─ Fetch full block data                            │
│     └─ Cache block, transactions, receipts              │
│                                                          │
└─────────────────────────────────────────────────────────┘

Key Components

  1. WebSocketHandler: Manages WebSocket connections and subscriptions

  2. ReorgManager: Detects and handles chain reorganizations

  3. CacheManager: Proactively caches incoming block data

  4. WebSocketFailureTracker: Tracks connection health and prevents endless retries


Why Use WebSocket

WebSocket connections provide several critical advantages over HTTP-only polling:

1. Cache Accuracy

Problem without WebSocket:

  • Cache may serve stale data until health check detects new blocks

  • Health check intervals (default: 60s) create latency gaps

  • Cache invalidation delayed during reorgs

Solution with WebSocket:

  • Instant notification of new blocks (sub-second)

  • Immediate cache updates

  • Real-time reorg detection

2. Reorg Detection

HTTP-only detection:

Health Check (every 60s):
  T=0s:   Check block 18500000 (hash: 0xAAA)
  T=60s:  Check block 18500005 (hash: 0xBBB)

Problem: Missed reorg at T=30s affecting blocks 18500001-18500003
Result: Cache served wrong data for 30 seconds

WebSocket detection:

Real-time Monitoring:
  T=0s:   Block 18500000 (hash: 0xAAA)
  T=12s:  Block 18500001 (hash: 0xBBB)
  T=24s:  Block 18500002 (hash: 0xCCC)
  T=30s:  Block 18500002 (hash: 0xDDD) ← Different hash!

Detection: Immediate reorg detection at T=30s
Action: Invalidate cache instantly
Result: Always serve canonical chain data

3. Reduced Upstream Load

HTTP polling:

  • Periodic eth_blockNumber calls every 60 seconds

  • Redundant if no new blocks

  • Wasted API calls during low activity

WebSocket subscription:

  • Single persistent connection

  • Push notifications only when blocks arrive

  • ~60× fewer API calls (1 connection vs 60 polls/hour)

4. Lower Latency

Metric                  HTTP-only    With WebSocket
New block detection     0-60s        < 1s
Reorg detection         0-60s        < 1s
Cache update latency    60s avg      < 2s
API call overhead       High         Low


Configuration

Enable WebSocket by adding wss_url to upstream provider configuration:

Basic Configuration

[[upstreams.providers]]
name = "infura"
chain_id = 1
https_url = "https://mainnet.infura.io/v3/YOUR_API_KEY"
wss_url = "wss://mainnet.infura.io/ws/v3/YOUR_API_KEY"  # Enable WebSocket
weight = 100
timeout_seconds = 30

Multiple Upstreams with WebSocket

[[upstreams.providers]]
name = "alchemy"
chain_id = 1
https_url = "https://eth-mainnet.g.alchemy.com/v2/YOUR_API_KEY"
wss_url = "wss://eth-mainnet.g.alchemy.com/v2/YOUR_API_KEY"
weight = 100

[[upstreams.providers]]
name = "infura"
chain_id = 1
https_url = "https://mainnet.infura.io/v3/YOUR_API_KEY"
wss_url = "wss://mainnet.infura.io/ws/v3/YOUR_API_KEY"
weight = 100

[[upstreams.providers]]
name = "quicknode"
chain_id = 1
https_url = "https://example.quiknode.pro/YOUR_TOKEN/"
wss_url = "wss://example.quiknode.pro/YOUR_TOKEN/"
weight = 50

Result: Prism subscribes to all three upstreams simultaneously for redundancy.

WebSocket-Only Upstream

Some providers offer WebSocket-only endpoints for specific use cases:

[[upstreams.providers]]
name = "websocket-only-provider"
chain_id = 1
https_url = "https://backup-http.provider.com"  # Fallback HTTP
wss_url = "wss://primary-ws.provider.com"       # Primary WebSocket
weight = 100

URL Format Requirements

Valid WebSocket URLs:

  • ws://localhost:8545 (insecure, local development)

  • wss://mainnet.infura.io/ws/v3/KEY (secure, production)

  • wss://eth-mainnet.g.alchemy.com/v2/KEY (secure, production)

Invalid URLs (rejected at startup):

  • http://provider.com ❌ (must start with ws:// or wss://)

  • https://provider.com ❌ (HTTPS is for HTTP RPC, not WebSocket)

  • `` (empty string) ❌

  • (whitespace only) ❌
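The rules above can be sketched as a small validation check; the function name here is an assumption for illustration, not Prism's actual API:

```rust
// Illustrative startup check for the URL rules above (hypothetical helper,
// not Prism's real validation code): accept ws:// or wss:// only, and
// reject empty or whitespace-only strings.
fn is_valid_ws_url(url: &str) -> bool {
    let url = url.trim();
    !url.is_empty() && (url.starts_with("ws://") || url.starts_with("wss://"))
}

fn main() {
    assert!(is_valid_ws_url("wss://mainnet.infura.io/ws/v3/KEY"));
    assert!(is_valid_ws_url("ws://localhost:8545"));
    assert!(!is_valid_ws_url("https://provider.com")); // wrong scheme
    assert!(!is_valid_ws_url("   "));                  // whitespace only
    println!("ok");
}
```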


WebSocket Subscriptions

Prism uses the standard Ethereum eth_subscribe method to receive real-time block headers.

Subscription Flow

1. Connection Establishment

Client (Prism) ──────────────► Upstream Provider
                CONNECT
         wss://provider.com/ws

Log output:

INFO connecting to websocket ws_url=wss://mainnet.infura.io/ws/v3/... upstream=infura
INFO websocket connected successfully upstream=infura status=101

2. Subscription Request

Prism sends the eth_subscribe JSON-RPC call:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "eth_subscribe",
  "params": ["newHeads"]
}

Purpose: Subscribe to new block header notifications.

3. Subscription Confirmation

Provider responds with a subscription ID:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": "0x9ce59a13059e417087c02d3236a0b9cc"
}

Log output:

INFO subscription confirmed upstream=infura subscription_id=0x9ce59a13059e417087c02d3236a0b9cc

4. Block Notifications

Provider pushes new block headers as they arrive:

{
  "jsonrpc": "2.0",
  "method": "eth_subscription",
  "params": {
    "subscription": "0x9ce59a13059e417087c02d3236a0b9cc",
    "result": {
      "number": "0x11A444F",
      "hash": "0x1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef",
      "parentHash": "0xabcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890",
      "timestamp": "0x649B8E7F",
      "miner": "0x...",
      "gasLimit": "0x1C9C380",
      "gasUsed": "0xB71B0A",
      "baseFeePerGas": "0x7",
      ...
    }
  }
}

Log output:

INFO websocket: processing block notification block_number=18498639 block_hash=0x1234... upstream=infura

Subscription Lifecycle

┌───────────────────────────────────────────────────────┐
│              WebSocket Lifecycle                       │
├───────────────────────────────────────────────────────┤
│                                                        │
│  CONNECT ──► SUBSCRIBE ──► RECEIVE ──► PROCESS       │
│                   │            │           │          │
│                   │            │           └─► Cache  │
│                   │            │               Update │
│                   │            │                      │
│                   │            └──► (loop)            │
│                   │                                   │
│              DISCONNECT ◄──────────────────┐          │
│                   │                        │          │
│                   └──► RECONNECT ──────────┘          │
│                        (with backoff)                 │
│                                                        │
└───────────────────────────────────────────────────────┘

Real-Time Chain Tip Updates

When a new block notification arrives, Prism immediately updates its internal chain state.

Update Process

Step 1: Extract Block Data

// Parse notification
{
  "number": "0x11A444F",  // Block number: 18498639
  "hash": "0x1234...",    // Block hash
}

// Convert to internal format
block_number = parse_hex_u64("0x11A444F") = 18498639
block_hash = parse_hex_array("0x1234...") = [0x12, 0x34, ...]
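The conversion boils down to standard hex parsing; a minimal sketch, where `parse_hex_u64` is a hypothetical helper mirroring the step above:

```rust
// Hypothetical helper mirroring the conversion shown above.
fn parse_hex_u64(s: &str) -> Option<u64> {
    u64::from_str_radix(s.trim_start_matches("0x"), 16).ok()
}

fn main() {
    // 0x11A444F is block 18498639.
    assert_eq!(parse_hex_u64("0x11A444F"), Some(18498639));
    println!("ok");
}
```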

Step 2: Update Chain Tip

ReorgManager::update_tip(18498639, 0x1234...)

  ├─ Current tip: 18498638
  ├─ New tip: 18498639 (higher)
  └─ Action: Update chain state atomically
     ├─ ChainState.current_tip ← 18498639
     └─ ChainState.current_head_hash ← 0x1234...

Log output:

INFO websocket: processing block notification
     block_number=18498639
     block_hash=0x1234...
     upstream=infura
     current_tip=18498638
     is_same_height=false

Step 3: Fetch Full Block Data

Spawn background task to fetch complete block:

tokio::spawn(async move {
  // Fetch block with transactions
  let block = eth_getBlockByNumber(18498639, true).await;

  // Cache header
  cache_manager.insert_header(block.header).await;

  // Cache body (transaction hashes)
  cache_manager.insert_body(block.body).await;

  // Cache transactions
  for tx in block.transactions {
    cache_manager.insert_transaction(tx).await;
  }

  // Fetch and cache receipts
  let receipts = eth_getBlockReceipts(18498639).await;
  for receipt in receipts {
    cache_manager.insert_receipt(receipt).await;

    // Cache logs from receipts
    for log in receipt.logs {
      cache_manager.insert_log(log).await;
    }
  }
});

Log output:

DEBUG block_number=18498639 upstream=infura acquired fetch lock, fetching block
DEBUG block_number=18498639 upstream=infura cached header
DEBUG block_number=18498639 upstream=infura cached body
DEBUG block_number=18498639 upstream=infura transaction_count=150 cached transactions
DEBUG block_number=18498639 upstream=infura receipt_count=150 cached receipts
DEBUG block_number=18498639 upstream=infura log_count=342 conversion_ms=12 converted logs from receipts
DEBUG block_number=18498639 upstream=infura insert_ms=8 stats_ms=2 inserted logs
DEBUG block_number=18498639 upstream=infura fully cached block via websocket

Benefits

  1. Proactive Caching: Next request for block 18498639 is a cache hit

  2. Lower Latency: No upstream fetch needed

  3. Reduced Load: Fewer on-demand upstream calls

  4. Fresher Data: Cache updated within 2 seconds of block arrival


Reorg Detection

WebSocket enables real-time detection of chain reorganizations.

Detection Scenarios

Scenario 1: Same Height, Different Hash

Timeline:

T=0s:   Receive block 18500000 (hash: 0xAAA) ✓
T=12s:  Receive block 18500001 (hash: 0xBBB) ✓
T=24s:  Receive block 18500001 (hash: 0xCCC) ⚠ Same height, different hash!

Detection:

// Block arrives at same height
if new_block_number == current_tip {
  let current_hash = chain_state.current_head_hash();

  if new_block_hash != current_hash {
    // REORG DETECTED!
    warn!(
      block=new_block_number,
      old_hash=current_hash,
      new_hash=new_block_hash,
      "reorg detected at current tip"
    );

    handle_reorg(new_block_number, current_hash, new_block_hash).await;
  }
}

Log output:

WARN block=18500001 old_hash=0xBBB new_hash=0xCCC reorg detected at current tip - processing immediately
INFO divergence_block=18500000 calculated invalidation boundary
INFO invalidated_blocks=2 from_block=18500000 to_block=18500001 reorg completed

Action Taken:

  1. Calculate divergence point (where chains split)

  2. Invalidate cache from divergence to tip

  3. Update chain state with new canonical hash

  4. Continue normal operation

Scenario 2: Rollback Detection

Timeline:

T=0s:   Current tip: 18500000
T=30s:  Receive block 18499990 (lower than current tip!)

This happens when:

  • Chain underwent a deep reorg

  • Health checker detected it before WebSocket reconnected

  • Provider rolled back state manually (e.g., debug_setHead)

Detection:

if new_block_number < current_tip {
  let rollback_depth = current_tip - new_block_number;

  warn!(
    current_tip=current_tip,
    new_block=new_block_number,
    rollback_depth=rollback_depth,
    "websocket: received older block (potential rollback)"
  );

  handle_rollback(new_block_number, current_tip).await;
}

Log output:

WARN current_tip=18500000 new_block=18499990 rollback_depth=10
     websocket: received older block (potential rollback)
INFO new_tip=18499990 old_tip=18500000 invalidate_from=18499990
     blocks_to_invalidate=11 handling chain rollback
INFO invalidated_from=18499990 invalidated_to=18500000 rollback cache invalidation complete
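The inclusive invalidation range in the log follows directly from the two tips; as a sanity-check sketch (`rollback_range` is a hypothetical helper, not Prism's code):

```rust
// Rollback invalidation is inclusive on both ends:
// returns (from, to, count) for a rollback from old_tip down to new_tip.
fn rollback_range(new_tip: u64, old_tip: u64) -> (u64, u64, u64) {
    (new_tip, old_tip, old_tip - new_tip + 1)
}

fn main() {
    // Matches the log: invalidate_from=18499990, blocks_to_invalidate=11.
    assert_eq!(rollback_range(18499990, 18500000), (18499990, 18500000, 11));
    println!("ok");
}
```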

Reorg Coalescing

During "reorg storms" (multiple rapid reorgs), Prism batches updates to prevent cache thrashing.

Problem Without Coalescing

T=0ms:   Reorg to hash 0xAAA → Invalidate cache → Fetch block
T=50ms:  Reorg to hash 0xBBB → Invalidate cache → Fetch block
T=100ms: Reorg to hash 0xCCC → Invalidate cache → Fetch block
T=150ms: Reorg to hash 0xDDD → Invalidate cache → Fetch block

Problem: Cache thrashing - blocks repeatedly invalidated and refetched

Solution: Coalescing Window

[cache.manager_config.reorg_manager]
coalesce_window_ms = 100  # Batch reorgs within 100ms window

Behavior:

T=0ms:   Reorg to 0xAAA → Process immediately
T=50ms:  Reorg to 0xBBB → Defer (within 100ms of last)
T=75ms:  Reorg to 0xCCC → Defer (within 100ms of last)
T=101ms: Window expired → Process batched reorgs (0xBBB, 0xCCC)

Log output:

INFO block=18500001 old_hash=0xAAA new_hash=0xBBB reorg detected - coalescing with pending updates
INFO block=18500001 old_hash=0xBBB new_hash=0xCCC processing coalesced reorg

Result: Reduced cache invalidation and upstream fetches during reorg storms.
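The windowing decision can be sketched deterministically. This is an illustration of the policy, not Prism's implementation; it assumes the window is measured from the last immediately processed reorg:

```rust
// Sketch of the coalescing decision: for each reorg event (offset in ms),
// return true if it is processed immediately or false if it is deferred
// into the pending batch. Window is measured from the last processed event.
fn coalesce(events_ms: &[u64], window_ms: u64) -> Vec<bool> {
    let mut last_processed: Option<u64> = None;
    let mut decisions = Vec::new();
    for &t in events_ms {
        let process_now = match last_processed {
            Some(p) => t.saturating_sub(p) >= window_ms,
            None => true, // first event is always processed immediately
        };
        if process_now {
            last_processed = Some(t);
        }
        decisions.push(process_now);
    }
    decisions
}

fn main() {
    // Timeline above: 0ms processed, 50ms and 75ms deferred, 101ms processed.
    assert_eq!(coalesce(&[0, 50, 75, 101], 100), vec![true, false, false, true]);
    println!("ok");
}
```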

Finality Boundaries

Reorgs respect finality checkpoints:

Block Height:   900         988          1000 (tip)
                │           │            │
Finality:   FINALIZED     SAFE        UNSAFE
            (never        (rarely     (frequently
             reorgs)       reorgs)     reorgs)

Configuration:

[cache.manager_config.reorg_manager]
safety_depth = 12          # Blocks from tip considered "safe" (~2.5 min)
max_reorg_depth = 100      # Maximum backward search depth

Invalidation Rules:

  • Finalized blocks (≤ 900): NEVER invalidated

  • Safe blocks (901-988): Only invalidated for deep reorgs

  • Unsafe blocks (989-1000): Always invalidated during reorg
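Under the example numbers above (tip 1000, finalized boundary 900, safety_depth 12), the three zones can be expressed as a small classifier. This is an illustrative sketch, not Prism's code:

```rust
// Illustrative finality classification following the rules above.
#[derive(Debug, PartialEq)]
enum Finality {
    Finalized, // never invalidated
    Safe,      // only invalidated for deep reorgs
    Unsafe,    // always invalidated during a reorg
}

fn classify(block: u64, tip: u64, finalized: u64, safety_depth: u64) -> Finality {
    if block <= finalized {
        Finality::Finalized
    } else if block <= tip.saturating_sub(safety_depth) {
        Finality::Safe
    } else {
        Finality::Unsafe
    }
}

fn main() {
    // tip=1000, finalized=900, safety_depth=12 → safe zone ends at 988.
    assert_eq!(classify(900, 1000, 900, 12), Finality::Finalized);
    assert_eq!(classify(988, 1000, 900, 12), Finality::Safe);
    assert_eq!(classify(989, 1000, 900, 12), Finality::Unsafe);
    println!("ok");
}
```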


Failure Handling & Reconnection

WebSocket connections can fail for various reasons. Prism handles failures gracefully with automatic reconnection.

Failure Scenarios

1. Connection Refused

Cause: Provider endpoint unavailable or incorrect URL

Error:

ERROR upstream=infura error="Connection refused" websocket connection failed

2. Method Not Allowed (405)

Cause: WebSocket protocol not supported at this endpoint

Error:

ERROR upstream=infura error="HTTP error: 405 Method Not Allowed" websocket connection failed
ConnectionFailed: WebSocket method not allowed for upstream infura (405 Method Not Allowed)

3. Forbidden (403)

Cause: Authentication failed or WebSocket not enabled in plan

Error:

ERROR upstream=infura error="HTTP error: 403 Forbidden" websocket connection failed
ConnectionFailed: WebSocket access forbidden for upstream infura (403 Forbidden)

4. Protocol Mismatch (200 OK)

Cause: Server returned HTTP response instead of WebSocket handshake

Error:

ERROR upstream=infura error="HTTP error: 200 OK" websocket connection failed
ConnectionFailed: Server returned 200 OK but does not support WebSocket protocol for upstream infura

5. Subscription Failure

Cause: eth_subscribe method not supported or rejected

Error:

ERROR upstream=infura error="WebSocket send error: channel closed" websocket connection failed

6. Connection Drop

Cause: Network interruption, provider restart, or idle timeout

Detection:

WARN upstream=infura websocket connection closed

Reconnection Logic

Prism automatically reconnects with exponential backoff:

const MAX_RECONNECT_DELAY: Duration = Duration::from_secs(60);
let mut reconnect_delay = Duration::from_secs(1);

loop {
  match upstream.subscribe_to_new_heads(...).await {
    Ok(()) => {
      info!(upstream=%name, "WebSocket subscription completed normally");
      reconnect_delay = Duration::from_secs(1); // Reset delay
    }
    Err(e) => {
      error!(
        upstream=%name,
        error=%e,
        reconnect_delay_secs=reconnect_delay.as_secs(),
        "WebSocket subscription failed, will retry"
      );

      upstream.record_websocket_failure().await;
      tokio::time::sleep(reconnect_delay).await;
      reconnect_delay = std::cmp::min(reconnect_delay * 2, MAX_RECONNECT_DELAY);
    }
  }

  tokio::time::sleep(Duration::from_millis(100)).await;
}

Reconnection Timeline:

T=0s:    Initial connection attempt → FAIL
T=1s:    Retry #1 (delay: 1s) → FAIL
T=3s:    Retry #2 (delay: 2s) → FAIL
T=7s:    Retry #3 (delay: 4s) → FAIL
T=15s:   Retry #4 (delay: 8s) → FAIL
T=31s:   Retry #5 (delay: 16s) → FAIL
T=63s:   Retry #6 (delay: 32s) → FAIL
T=123s:  Retry #7 (delay: 60s, capped) → SUCCESS

Log output (successful reconnection):

ERROR upstream=infura reconnect_delay_secs=1 WebSocket subscription failed, will retry
ERROR upstream=infura reconnect_delay_secs=2 WebSocket subscription failed, will retry
ERROR upstream=infura reconnect_delay_secs=4 WebSocket subscription failed, will retry
INFO upstream=infura WebSocket subscription completed normally
DEBUG upstream=infura websocket subscription succeeded, reset failure count
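The delay sequence in the timeline comes straight from the doubling-with-cap rule; a standalone sketch:

```rust
use std::time::Duration;

// Reproduces the backoff schedule from the reconnection loop above:
// start at 1s, double after each failure, cap at the maximum delay.
fn backoff_schedule(max: Duration, attempts: usize) -> Vec<Duration> {
    let mut delay = Duration::from_secs(1);
    let mut schedule = Vec::new();
    for _ in 0..attempts {
        schedule.push(delay);
        delay = std::cmp::min(delay * 2, max);
    }
    schedule
}

fn main() {
    let secs: Vec<u64> = backoff_schedule(Duration::from_secs(60), 7)
        .iter()
        .map(|d| d.as_secs())
        .collect();
    assert_eq!(secs, vec![1, 2, 4, 8, 16, 32, 60]); // matches the timeline
    println!("ok");
}
```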

Backoff Strategies

Prism implements a failure tracker to prevent endless retry loops.

WebSocketFailureTracker

Tracks consecutive failures and stops retrying after threshold:

pub struct WebSocketFailureTracker {
  consecutive_failures: u32,
  max_consecutive_failures: u32,    // Default: 3
  failure_reset_duration: Duration, // Default: 300s (5 minutes)
  last_failure_time: Instant,       // Timestamp of the most recent failure
  permanently_failed: bool,
}

Failure Thresholds

Normal Failures (1-2)

Behavior: Retry with backoff

Log output:

DEBUG upstream=infura failure_count=1 websocket subscription failed
DEBUG upstream=infura failure_count=2 websocket subscription failed

Threshold Reached (3)

Behavior: Stop retrying temporarily (5 minutes)

Log output:

WARN upstream=infura failure_count=3 retry_delay_secs=300 websocket subscription failed, stopping retries

Effect: No reconnection attempts for 5 minutes (300 seconds)

Permanent Failure (6+)

Behavior: Stop retrying permanently

Log output:

WARN upstream=infura websocket subscription failed immediately after reset, marking as permanently failed

Effect: WebSocket disabled for this upstream until restart

Failure Reset

After 5 minutes (300 seconds), failure count resets:

pub fn reset_if_expired(&mut self) {
  if self.consecutive_failures >= self.max_consecutive_failures
    && self.last_failure_time.elapsed() >= Duration::from_secs(300)
  {
    self.consecutive_failures = 0;
    self.permanently_failed = false;
  }
}

Result: Retry attempts resume automatically

Configuration

Default thresholds are hardcoded but can be modified in source:

impl Default for WebSocketFailureTracker {
  fn default() -> Self {
    Self {
      consecutive_failures: 0,
      max_consecutive_failures: 3,      // Stop after 3 failures
      failure_reset_duration: Duration::from_secs(300), // 5 minute cooldown
      last_failure_time: Instant::now(),
      permanently_failed: false,
    }
  }
}

Tuning Recommendations:

  • Aggressive retry: max_consecutive_failures = 5 (more retries)

  • Conservative retry: max_consecutive_failures = 2 (fewer retries)

  • Faster reset: failure_reset_duration = 60 (1 minute cooldown)

  • Slower reset: failure_reset_duration = 600 (10 minute cooldown)
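The threshold behavior described above can be exercised with a reduced, runnable sketch (field and method names here are assumptions, not Prism's exact API):

```rust
// Reduced model of the failure-tracking policy: count consecutive
// failures, stop retrying at the threshold, reset on success.
struct FailureTracker {
    consecutive_failures: u32,
    max_consecutive_failures: u32,
}

impl FailureTracker {
    fn record_failure(&mut self) {
        self.consecutive_failures += 1;
    }
    fn record_success(&mut self) {
        self.consecutive_failures = 0;
    }
    fn should_retry(&self) -> bool {
        self.consecutive_failures < self.max_consecutive_failures
    }
}

fn main() {
    let mut tracker = FailureTracker {
        consecutive_failures: 0,
        max_consecutive_failures: 3, // the documented default
    };
    tracker.record_failure();
    tracker.record_failure();
    assert!(tracker.should_retry()); // 2 failures: keep retrying
    tracker.record_failure();
    assert!(!tracker.should_retry()); // threshold reached: stop
    tracker.record_success();
    assert!(tracker.should_retry()); // success resets the count
    println!("ok");
}
```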


Monitoring WebSocket Health

Logs

Prism emits structured logs for WebSocket lifecycle events:

Connection Events

INFO connecting to websocket ws_url=wss://... upstream=infura
INFO websocket connected successfully upstream=infura status=101
INFO subscription confirmed upstream=infura subscription_id=0x...

Block Processing

DEBUG upstream=infura message=<json> received websocket message
DEBUG upstream=infura json={...} parsed json
INFO upstream=infura block_number=18498639 block_hash=0x... extracted block data
DEBUG block_number=18498639 upstream=infura acquired fetch lock, fetching block
DEBUG block_number=18498639 upstream=infura cached header
DEBUG block_number=18498639 upstream=infura transaction_count=150 cached transactions
DEBUG block_number=18498639 upstream=infura fully cached block via websocket

Disconnection Events

WARN upstream=infura websocket connection closed
INFO upstream=infura WebSocket subscription completed normally

Error Events

ERROR upstream=infura error="Connection refused" websocket connection failed
ERROR upstream=infura reconnect_delay_secs=2 WebSocket subscription failed, will retry
WARN upstream=infura failure_count=3 retry_delay_secs=300 websocket subscription failed, stopping retries

Metrics

WebSocket health is tracked in Prometheus metrics:

Connection Status

# WebSocket connection active (1 = connected, 0 = disconnected)
rpc_websocket_connected{upstream="infura"} 1

# Total connections established
rpc_websocket_connections_total{upstream="infura"} 42

# Total disconnections
rpc_websocket_disconnections_total{upstream="infura"} 41

Failure Tracking

# Consecutive failure count
rpc_websocket_consecutive_failures{upstream="infura"} 0

# Total failures
rpc_websocket_failures_total{upstream="infura"} 12

# Permanently failed status (1 = permanent, 0 = normal)
rpc_websocket_permanently_failed{upstream="infura"} 0

Block Processing

# Blocks received via WebSocket
rpc_websocket_blocks_received_total{upstream="infura"} 8456

# Blocks processed and cached
rpc_websocket_blocks_processed_total{upstream="infura"} 8450

# Blocks failed to process
rpc_websocket_blocks_failed_total{upstream="infura"} 6

Reorg Detection

# Reorgs detected via WebSocket
rpc_websocket_reorgs_detected_total{upstream="infura"} 3

# Blocks invalidated due to reorg
rpc_websocket_blocks_invalidated_total{upstream="infura"} 28

Grafana Dashboards

WebSocket Health Panel:

# Connection status (0 = down, 1 = up)
rpc_websocket_connected

# Success rate
rate(rpc_websocket_blocks_processed_total[5m]) /
rate(rpc_websocket_blocks_received_total[5m])

# Failure rate
rate(rpc_websocket_failures_total[5m])

WebSocket Latency Panel:

# Time between block arrival and cache completion
histogram_quantile(0.99,
  rate(rpc_websocket_block_processing_duration_seconds_bucket[5m])
)

Alerting

Alert: WebSocket Connection Down

- alert: WebSocketDisconnected
  expr: rpc_websocket_connected == 0
  for: 5m
  annotations:
    summary: "WebSocket connection down for {{ $labels.upstream }}"
    description: "No WebSocket connection to {{ $labels.upstream }} for 5 minutes"

Alert: High Failure Rate

- alert: WebSocketHighFailureRate
  expr: rate(rpc_websocket_failures_total[5m]) > 0.1
  for: 10m
  annotations:
    summary: "High WebSocket failure rate for {{ $labels.upstream }}"
    description: "WebSocket failures > 10% for {{ $labels.upstream }}"

Alert: Permanent Failure

- alert: WebSocketPermanentlyFailed
  expr: rpc_websocket_permanently_failed == 1
  for: 1m
  annotations:
    summary: "WebSocket permanently failed for {{ $labels.upstream }}"
    description: "WebSocket marked as permanently failed for {{ $labels.upstream }}"

When to Use WebSocket vs HTTP-Only

Use WebSocket When

1. Real-Time Applications

  • DeFi frontends with live data updates

  • Block explorers showing latest blocks

  • Transaction monitoring systems

  • MEV bots requiring instant block notifications

2. Cache Accuracy Critical

  • Financial applications

  • Analytics platforms

  • Data pipelines requiring canonical chain data

  • Applications sensitive to reorgs

3. High Query Volume

  • Applications making frequent eth_getLogs calls

  • Heavy caching workloads

  • Reduced upstream API costs important

4. Reorg-Sensitive Workloads

  • Cross-chain bridges

  • Smart contract indexers

  • Data integrity verification systems

Use HTTP-Only When

1. Development/Testing

  • Local development without WebSocket support

  • Testing environments

  • Proof-of-concept applications

2. Simplified Deployment

  • Don't want to manage WebSocket connections

  • Firewall restrictions on WebSocket protocols

  • Simpler architecture preferred

3. Low Query Volume

  • Occasional API calls

  • Background processing jobs

  • Cache hit rate already high

4. Provider Limitations

  • WebSocket not available in plan

  • WebSocket endpoint unreliable

  • Cost optimization (WebSocket often premium)

Hybrid Configuration

Run some upstreams with WebSocket, others without:

# Primary: WebSocket enabled for real-time updates
[[upstreams.providers]]
name = "infura-primary"
https_url = "https://mainnet.infura.io/v3/KEY"
wss_url = "wss://mainnet.infura.io/ws/v3/KEY"  # ✓ WebSocket
weight = 100

# Backup: HTTP-only for failover
[[upstreams.providers]]
name = "alchemy-backup"
https_url = "https://eth-mainnet.g.alchemy.com/v2/KEY"
# wss_url not configured (HTTP-only)
weight = 50

# Tertiary: HTTP-only for additional capacity
[[upstreams.providers]]
name = "quicknode-tertiary"
https_url = "https://example.quiknode.pro/TOKEN/"
# wss_url not configured (HTTP-only)
weight = 25

Benefits:

  • Real-time updates from primary

  • Cost savings on backup/tertiary

  • Failover to HTTP if WebSocket unavailable


Troubleshooting

Issue: WebSocket Won't Connect

Symptoms:

ERROR upstream=infura error="Connection refused" websocket connection failed

Causes:

  1. Invalid WebSocket URL

  2. Firewall blocking WebSocket port

  3. Provider endpoint down

  4. Authentication failure

Solutions:

1. Verify URL format:

# ✓ Correct
wss_url = "wss://mainnet.infura.io/ws/v3/YOUR_API_KEY"

# ✗ Wrong - missing wss://
wss_url = "mainnet.infura.io/ws/v3/YOUR_API_KEY"

# ✗ Wrong - using https:// instead of wss://
wss_url = "https://mainnet.infura.io/ws/v3/YOUR_API_KEY"

2. Test connection manually:

# Test WebSocket endpoint
websocat wss://mainnet.infura.io/ws/v3/YOUR_API_KEY

# Send test subscription
{"jsonrpc":"2.0","id":1,"method":"eth_subscribe","params":["newHeads"]}

3. Check firewall rules:

# Allow WebSocket port (443 for wss://)
sudo ufw allow 443/tcp

4. Verify API key:

  • Check key is valid and not expired

  • Verify WebSocket is enabled in provider plan

  • Test with different API key

Issue: WebSocket Connects But No Blocks

Symptoms:

INFO subscription confirmed upstream=infura subscription_id=0x...
# No subsequent block notifications

Causes:

  1. No new blocks being mined (testnets)

  2. Subscription not active

  3. Provider not sending notifications

Solutions:

1. Check if chain is producing blocks:

# Query latest block number
curl -X POST http://localhost:3030/ -H "Content-Type: application/json" -d '
{
  "jsonrpc": "2.0",
  "method": "eth_blockNumber",
  "params": [],
  "id": 1
}'

# Wait 15 seconds and query again - should be different

2. Enable debug logging:

RUST_LOG=debug ./prism-server

Look for messages:

DEBUG upstream=infura message=<json> received websocket message

3. Test with different provider:

  • Try Alchemy if Infura has issues

  • Use multiple providers for redundancy

Issue: Frequent Reconnections

Symptoms:

WARN upstream=infura websocket connection closed
INFO websocket connected successfully upstream=infura status=101
# Repeats every few minutes

Causes:

  1. Provider imposing connection time limits

  2. Network instability

  3. Idle timeout

Solutions:

1. Check provider documentation:

  • Some providers limit WebSocket connection duration

  • May require periodic reconnection

2. Monitor network stability:

# Check packet loss
ping -c 100 mainnet.infura.io

# Monitor connection quality
mtr mainnet.infura.io

3. Increase idle timeout (if provider supports):

  • Some providers accept keepalive messages

  • Prism currently doesn't send keepalives (future enhancement)

Issue: High Failure Count

Symptoms:

WARN upstream=infura failure_count=3 retry_delay_secs=300 websocket subscription failed, stopping retries

Causes:

  1. Provider WebSocket endpoint unreliable

  2. Incorrect configuration

  3. Network issues

  4. Provider rate limiting WebSocket connections

Solutions:

1. Verify endpoint is working:

# Test with curl (for HTTPS endpoint)
curl -X POST https://mainnet.infura.io/v3/YOUR_API_KEY \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'

# If HTTPS works but WebSocket doesn't, contact provider

2. Switch to different upstream:

# Disable problematic upstream
[[upstreams.providers]]
name = "infura"
https_url = "https://mainnet.infura.io/v3/KEY"
# wss_url = "..."  # Commented out - WebSocket disabled

# Add reliable alternative
[[upstreams.providers]]
name = "alchemy"
https_url = "https://eth-mainnet.g.alchemy.com/v2/KEY"
wss_url = "wss://eth-mainnet.g.alchemy.com/v2/KEY"  # More reliable

3. Check rate limits:

  • Some providers limit WebSocket connections

  • May require higher-tier plan

Issue: WebSocket Marked Permanently Failed

Symptoms:

WARN upstream=infura websocket subscription failed immediately after reset, marking as permanently failed

Causes:

  1. Provider doesn't support WebSocket

  2. Endpoint permanently unavailable

  3. Configuration error

Solutions:

1. Disable WebSocket for this upstream:

[[upstreams.providers]]
name = "infura"
https_url = "https://mainnet.infura.io/v3/KEY"
# wss_url removed - use HTTP-only

2. Restart Prism to reset failure tracker:

# Graceful restart
systemctl restart prism

# Or send SIGTERM
kill -TERM <prism-pid>

3. Verify provider documentation:

  • Confirm WebSocket support

  • Check correct WebSocket URL format

  • Verify plan includes WebSocket access

Issue: Cache Not Updating in Real-Time

Symptoms:

  • WebSocket connected successfully

  • Blocks received via WebSocket

  • Cache still stale

Causes:

  1. Block processing failing

  2. Cache full (LRU eviction)

  3. Reorg invalidating cache frequently

Solutions:

1. Check block processing logs:

RUST_LOG=debug ./prism-server | grep "fully cached block"

Should see:

DEBUG block_number=18498639 upstream=infura fully cached block via websocket

2. Increase cache sizes:

[cache.manager_config.block_cache]
max_headers = 20000  # Increase from default
max_bodies = 10000

[cache.manager_config.log_cache]
max_exact_results = 100000  # Increase from default

3. Monitor cache evictions:

curl -s http://localhost:3030/metrics | grep cache_evictions

If evictions high, increase cache sizes.

Issue: Reorg Not Detected

Symptoms:

  • Chain reorganized

  • Prism still serving old data

  • No reorg logs

Causes:

  1. WebSocket missed reorg event

  2. Both old and new blocks have same height/hash (rare)

  3. Health checker interval too long

Solutions:

1. Check WebSocket is receiving blocks:

RUST_LOG=debug ./prism-server | grep "websocket: processing block"

2. Decrease health check interval:

[health_check]
interval_seconds = 30  # Check every 30s instead of 60s

3. Enable multiple upstreams:

  • Multiple WebSocket connections increase reorg detection probability

  • Cross-validate between upstreams

4. Monitor reorg metrics:

curl -s http://localhost:3030/metrics | grep reorg

Summary

WebSocket subscriptions provide significant benefits for Prism:

Key Advantages:

  • ⚡ Real-time updates: Sub-second block notifications

  • 🎯 Cache accuracy: Immediate reorg detection and invalidation

  • 💰 Cost savings: ~60× fewer API calls vs. polling

  • 📊 Better observability: Detailed metrics and logs

Best Practices:

  1. Enable WebSocket on at least one upstream for real-time updates

  2. Use multiple WebSocket upstreams for redundancy

  3. Monitor connection health via Prometheus metrics

  4. Configure alerts for permanent failures

  5. Test WebSocket endpoints before deploying to production

  6. Have HTTP-only fallback upstreams

Next: Learn about Caching System or Monitoring.
