RPC Pool Failover
The transaction engine sends requests through a pool of RPC endpoints, scoring each endpoint's health and failing over automatically.
Configuration
Configure multiple RPC endpoints for high availability:
# Primary RPC endpoint
SOLANA_RPC_URL=https://api.devnet.solana.com
# Pool of endpoints (comma-separated)
SOLANA_RPC_POOL_URLS=https://api.devnet.solana.com,https://rpc.ankr.com/solana_devnet,https://devnet.helius-rpc.com
# Health probe interval in milliseconds
SOLANA_RPC_HEALTH_PROBE_MS=15000
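As a rough illustration of how these variables could be consumed, the sketch below parses them into a pool configuration. The helper name `parseRpcPoolConfig`, the de-duplication of the primary URL, and the 15000 ms fallback are assumptions made for this example. The engine's actual loader is not shown in this document.

```typescript
// Hypothetical helper: illustrates one way to read the variables above.
interface RpcPoolConfig {
  urls: string[];
  healthProbeMs: number;
}

function parseRpcPoolConfig(env: Record<string, string | undefined>): RpcPoolConfig {
  const primary = env["SOLANA_RPC_URL"]?.trim();
  const pool = (env["SOLANA_RPC_POOL_URLS"] ?? "")
    .split(",")
    .map((url) => url.trim())
    .filter((url) => url.length > 0);

  // Assumption: the primary endpoint is listed first and duplicates are dropped.
  const urls = Array.from(new Set(primary ? [primary, ...pool] : pool));

  // Assumption: fall back to 15000 ms when the value is missing or invalid.
  const parsed = Number(env["SOLANA_RPC_HEALTH_PROBE_MS"]);
  const healthProbeMs = Number.isFinite(parsed) && parsed > 0 ? parsed : 15000;

  return { urls, healthProbeMs };
}
```

In a Node.js process this would typically be called as `parseRpcPoolConfig(process.env)`.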
Health Scoring Algorithm
Each endpoint maintains a dynamic health score based on:
Success rate - Successful requests increase score
Latency - High latency reduces score
Failure streaks - Consecutive failures penalize score
Recovery - Endpoints can recover score over time
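The exact scoring formula is not documented here. The sketch below shows one plausible way the four factors could combine into a score between 0 and 1. Every constant and function name in it (`SUCCESS_GAIN`, `FAILURE_PENALTY`, `recordSuccess`, `recordFailure`, and so on) is hypothetical and chosen only for illustration.

```typescript
interface EndpointHealth {
  url: string;
  score: number; // assumed range: 0 (unusable) to 1 (healthy)
  successes: number;
  failures: number;
  avgLatencyMs: number;
  failStreak: number;
}

// Illustrative constants, not the engine's real tuning values.
const SUCCESS_GAIN = 0.05;
const FAILURE_PENALTY = 0.1;
const SLOW_LATENCY_MS = 1000;
const LATENCY_PENALTY = 0.02;

const clamp01 = (value: number): number => Math.min(1, Math.max(0, value));

function recordSuccess(health: EndpointHealth, latencyMs: number): EndpointHealth {
  const successes = health.successes + 1;
  // Running mean of latency over successful requests.
  const avgLatencyMs = health.avgLatencyMs + (latencyMs - health.avgLatencyMs) / successes;
  // A slow success still raises the score, but by less.
  const latencyPenalty = latencyMs > SLOW_LATENCY_MS ? LATENCY_PENALTY : 0;
  return {
    ...health,
    successes,
    avgLatencyMs,
    failStreak: 0, // a success ends the streak, which lets the score recover over time
    score: clamp01(health.score + SUCCESS_GAIN - latencyPenalty),
  };
}

function recordFailure(health: EndpointHealth): EndpointHealth {
  const failStreak = health.failStreak + 1;
  // The penalty grows with the length of the failure streak.
  return {
    ...health,
    failures: health.failures + 1,
    failStreak,
    score: clamp01(health.score - FAILURE_PENALTY * failStreak),
  };
}
```

Scaling the penalty by the streak length means a single transient error costs little, while repeated failures push an endpoint down the ranking quickly.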
Failover Behavior
Sort by Score
Endpoints are sorted by health score (descending), then by average latency (ascending).
Attempt Primary
The highest-scored endpoint is tried first for each operation.
Cascade on Failure
If the primary fails, the next endpoint in the sorted list is tried automatically.
Update Health
Success or failure updates the endpoint’s health score for future requests.
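The four steps above can be sketched as a small ranking function plus a retry loop. This is a minimal illustration under stated assumptions, not the engine's implementation: `rankEndpoints`, `withFailover`, and the `onResult` callback are hypothetical names, and the health update itself is left to the caller.

```typescript
interface RankedEndpoint {
  url: string;
  score: number;
  avgLatencyMs: number;
}

// Sort by score (descending), then by average latency (ascending).
function rankEndpoints(endpoints: RankedEndpoint[]): RankedEndpoint[] {
  return [...endpoints].sort(
    (a, b) => b.score - a.score || a.avgLatencyMs - b.avgLatencyMs,
  );
}

// Try each endpoint in ranked order until one succeeds.
async function withFailover<T>(
  endpoints: RankedEndpoint[],
  operation: (url: string) => Promise<T>,
  onResult: (url: string, ok: boolean) => void, // hook for updating health scores
): Promise<T> {
  let lastError: unknown = new Error("RPC pool has no endpoints");
  for (const endpoint of rankEndpoints(endpoints)) {
    let result: T;
    try {
      result = await operation(endpoint.url);
    } catch (error) {
      onResult(endpoint.url, false);
      lastError = error;
      continue; // cascade to the next endpoint in the sorted list
    }
    onResult(endpoint.url, true);
    return result;
  }
  // Every endpoint failed: surface the most recent error.
  throw lastError;
}
```

Copying the array before sorting keeps the caller's list untouched, so the ranking can be recomputed per operation as scores change.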
Monitor Pool Status
Query the current health of all RPC endpoints:
curl -H 'x-api-key: dev-api-key' \
http://localhost:3000/api/v1/rpc/pool/status
Response:
{
  "endpoints": [
    {
      "url": "https://api.devnet.solana.com",
      "score": 0.95,
      "successes": 142,
      "failures": 3,
      "avgLatencyMs": 230,
      "failStreak": 0,
      "lastCheckedAt": "2026-03-08T10:30:00.000Z"
    },
    {
      "url": "https://rpc.ankr.com/solana_devnet",
      "score": 0.88,
      "successes": 98,
      "failures": 8,
      "avgLatencyMs": 420,
      "failStreak": 1,
      "lastCheckedAt": "2026-03-08T10:29:45.000Z",
      "lastError": "Connection timeout"
    }
  ]
}
The pool continuously probes all endpoints in the background at the configured interval to maintain fresh health scores.
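For clients that consume the status endpoint, the response can be given a type. The interfaces below are inferred from the sample response only, so treating `lastError` as optional is an assumption. `degradedEndpoints` and its default threshold of 0.9 are hypothetical and serve only as an example of a monitoring check.

```typescript
// Shape inferred from the sample response above; field optionality is an assumption.
interface RpcEndpointStatus {
  url: string;
  score: number;
  successes: number;
  failures: number;
  avgLatencyMs: number;
  failStreak: number;
  lastCheckedAt: string; // ISO 8601 timestamp
  lastError?: string;
}

interface RpcPoolStatus {
  endpoints: RpcEndpointStatus[];
}

// Hypothetical check: flag endpoints that are mid-streak or below a chosen score.
function degradedEndpoints(status: RpcPoolStatus, minScore = 0.9): RpcEndpointStatus[] {
  return status.endpoints.filter(
    (endpoint) => endpoint.failStreak > 0 || endpoint.score < minScore,
  );
}
```

Applied to the sample response, this returns only the `rpc.ankr.com` entry, which has a fail streak of 1 and a score of 0.88.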
Durable Outbox Queue
The transaction engine uses a SQLite-backed outbox pattern for reliable transaction processing.
Architecture
Persistent Storage - Jobs survive process restarts
Lease-based Claiming - Workers claim jobs with time-bound leases
Automatic Retry - Failed jobs re-enter the queue
Deduplication - Prevents duplicate pending jobs for the same transaction
Outbox Actions
execute - Process a new transaction request
retry - Retry a failed transaction
approve - Process an approval gate decision
Job States
type OutboxStatus =
| 'pending' // Queued, waiting for worker
| 'processing' // Worker has active lease
| 'done' // Successfully completed
| 'failed' // Exceeded max retry attempts
Configuration
# Lease duration - how long a worker can hold a job
TX_OUTBOX_LEASE_MS=30000
# Poll interval - how often worker checks for new jobs
TX_OUTBOX_POLL_MS=2000
# Max attempts before marking job as permanently failed
TX_OUTBOX_MAX_ATTEMPTS=6
Lease and Retry Semantics
Claim Job
The worker claims the oldest pending job, or a processing job whose lease has expired.
WHERE status = 'pending'
   OR (status = 'processing' AND lease_expires_at <= NOW())
ORDER BY created_at ASC
LIMIT 1
Process with Lease
Job moves to processing with a unique leaseId and expiration timestamp.
attempts += 1
lease_expires_at = NOW() + TX_OUTBOX_LEASE_MS
Complete or Fail
Worker marks job as done or failed.
Success: status = 'done', lease cleared
Retryable Failure: status = 'pending' if attempts < TX_OUTBOX_MAX_ATTEMPTS
Permanent Failure: status = 'failed' once attempts reaches TX_OUTBOX_MAX_ATTEMPTS
Automatic Recovery
If a worker crashes, its lease eventually expires and another worker can reclaim the job.
Lease Expiration: Set TX_OUTBOX_LEASE_MS longer than your worst-case transaction confirmation time to avoid premature lease expiration.
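The claim, lease, and completion rules above can be modeled in memory to make the state transitions concrete. This is a sketch under stated assumptions, not the SQLite-backed implementation: `claimNext` and `finish` are hypothetical names, the lease id format is a stand-in for a real unique id, and rejecting results from a stale lease is an assumed safeguard.

```typescript
type OutboxStatus = "pending" | "processing" | "done" | "failed";

interface OutboxJob {
  id: string;
  status: OutboxStatus;
  attempts: number;
  createdAt: number; // epoch milliseconds
  leaseId: string | null;
  leaseExpiresAt: number | null;
}

// Claim the oldest pending job, or a processing job whose lease has expired.
function claimNext(jobs: OutboxJob[], now: number, leaseMs: number): OutboxJob | null {
  const claimable = jobs
    .filter(
      (job) =>
        job.status === "pending" ||
        (job.status === "processing" &&
          job.leaseExpiresAt !== null &&
          job.leaseExpiresAt <= now),
    )
    .sort((a, b) => a.createdAt - b.createdAt);
  const job = claimable[0];
  if (!job) return null;
  job.status = "processing";
  job.attempts += 1;
  job.leaseId = `${job.id}:${job.attempts}`; // stand-in for a unique lease id
  job.leaseExpiresAt = now + leaseMs;
  return job;
}

// Record the outcome; returns false if the caller no longer holds the lease.
function finish(job: OutboxJob, leaseId: string, succeeded: boolean, maxAttempts: number): boolean {
  // Assumption: a worker whose lease was taken over must not overwrite the job.
  if (job.status !== "processing" || job.leaseId !== leaseId) return false;
  if (succeeded) {
    job.status = "done";
  } else {
    job.status = job.attempts < maxAttempts ? "pending" : "failed";
  }
  // Assumption: the lease is cleared on failure as well as on success.
  job.leaseId = null;
  job.leaseExpiresAt = null;
  return true;
}
```

Because `attempts` is incremented at claim time, a job that is reclaimed after a crash still counts the crashed attempt toward `TX_OUTBOX_MAX_ATTEMPTS`.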
Monitor Outbox Status
curl -H 'x-api-key: dev-api-key' \
http://localhost:3000/api/v1/outbox/stats
Response:
{
  "pending": 3,
  "processing": 2,
  "failed": 0,
  "done": 145
}
Adaptive Priority Fee Tuning
The execution tuner derives priority fees from recent network activity and keeps them within configured bounds.
Configuration
# Minimum priority fee (microlamports per compute unit)
SOLANA_PRIORITY_FEE_MIN_MICROLAMPORTS=2000
# Maximum priority fee (microlamports per compute unit)
SOLANA_PRIORITY_FEE_MAX_MICROLAMPORTS=200000
# Percentile of recent fees to target (1-99)
SOLANA_PRIORITY_FEE_PERCENTILE=75
# Multiplier applied to percentile fee, in basis points
# The fee calculation divides this value by 10000, so 10000 bps = 1.0x and 11500 bps = 1.15x
SOLANA_PRIORITY_FEE_MULTIPLIER_BPS=1150
Fee Calculation Algorithm
Collect Recent Fees
Gather recent priority fees from the RPC endpoint’s recent blocks.
Calculate Percentile
Compute the configured percentile (e.g., 75th percentile).
const sortedFees = recentFees.filter(f => f >= 0).sort((a, b) => a - b)
const index = Math.floor((percentile / 100) * (sortedFees.length - 1))
const percentileFee = sortedFees[index]
Apply Multiplier
Boost the percentile fee by the configured multiplier.
const multiplier = PRIORITY_FEE_MULTIPLIER_BPS / 10000
const boostedFee = Math.floor(percentileFee * multiplier)
Clamp to Bounds
Ensure the final fee is within min/max bounds.
const finalFee = clamp(
  boostedFee > 0 ? boostedFee : minFee,
  minFee,
  maxFee
)
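Put together, the four steps might look like the function below. It is a sketch rather than the tuner's actual code: `tunePriorityFee` and `PriorityFeeConfig` are hypothetical names, the empty-sample fallback to the minimum fee is an assumption, and the multiplier is applied as `fee * bps / 10000`, which is algebraically the same as the snippet above.

```typescript
interface PriorityFeeConfig {
  minFee: number; // microlamports per compute unit
  maxFee: number; // microlamports per compute unit
  percentile: number; // 1-99
  multiplierBps: number; // divided by 10000, as in the steps above
}

const clamp = (value: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, value));

function tunePriorityFee(recentFees: number[], cfg: PriorityFeeConfig): number {
  // Step 1: keep valid samples and sort ascending.
  const sorted = recentFees
    .filter((fee) => Number.isFinite(fee) && fee >= 0)
    .sort((a, b) => a - b);

  // Assumption: with no usable samples, fall back to the configured minimum.
  if (sorted.length === 0) return cfg.minFee;

  // Step 2: pick the configured percentile.
  const index = Math.floor((cfg.percentile / 100) * (sorted.length - 1));
  const percentileFee = sorted[index] ?? 0;

  // Step 3: apply the multiplier, multiplying before dividing.
  const boostedFee = Math.floor((percentileFee * cfg.multiplierBps) / 10000);

  // Step 4: clamp to the configured bounds.
  return clamp(boostedFee > 0 ? boostedFee : cfg.minFee, cfg.minFee, cfg.maxFee);
}
```

With fees of 1000, 3000, 5000, 7000, and 9000, a percentile of 75, and a multiplier of 11500 bps, the index is floor(0.75 × 4) = 3, the percentile fee is 7000, and the result is 8050. Multiplying before dividing keeps the intermediate value an integer, which avoids floating-point results that land just below a whole number before `Math.floor` is applied.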
Compute Unit Estimation
The tuner also calculates compute unit limits based on transaction type:
const computeByType: Record<TransactionType, number> = {
  transfer_sol: 120_000,
  transfer_spl: 180_000,
  swap: 380_000,
  stake: 240_000,
  unstake: 240_000,
  lend_supply: 320_000,
  lend_borrow: 350_000,
  create_escrow: 320_000,
  accept_escrow: 280_000,
  release_escrow: 260_000,
  // ...
}
// Base units come from the table above (lookup assumed; `type` is the transaction's type)
const baseUnits = computeByType[type]
// Add buffer for additional instructions
const instructionBuffer = Math.max(0, instructionCount - 1) * 15_000
const computeUnitLimit = clamp(baseUnits + instructionBuffer, 100_000, 1_200_000)
Compute budgets are automatically injected as the first instructions in every transaction.
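As a worked example of the estimate, the sketch below wraps the calculation in a function using three entries from the table. `estimateComputeUnitLimit` is a hypothetical name, and the narrowed `TransactionType` union exists only to keep the example self-contained.

```typescript
// Subset of the documented table, for illustration only.
type TransactionType = "transfer_sol" | "transfer_spl" | "swap";

const computeByType: Record<TransactionType, number> = {
  transfer_sol: 120_000,
  transfer_spl: 180_000,
  swap: 380_000,
};

const clamp = (value: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, value));

function estimateComputeUnitLimit(type: TransactionType, instructionCount: number): number {
  const baseUnits = computeByType[type];
  // Each instruction beyond the first adds a 15,000 unit buffer.
  const instructionBuffer = Math.max(0, instructionCount - 1) * 15_000;
  return clamp(baseUnits + instructionBuffer, 100_000, 1_200_000);
}
```

A swap with 4 instructions gets 380,000 + 3 × 15,000 = 425,000 units. A swap with 60 instructions would compute to 1,265,000 and is capped at 1,200,000.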
Delta Guard Checks
Delta guard validates that observed balance changes match expected deltas to detect simulation drift or unexpected fees.
Configuration
# Absolute tolerance in lamports for small variances
DELTA_GUARD_ABSOLUTE_TOLERANCE_LAMPORTS=10000
Expected Delta Calculation
const expectedLamportsDelta = (type: string, intent: Record<string, unknown>): number | null => {
  const amount = Number(intent['lamports'] ?? intent['amountLamports'] ?? intent['amount'] ?? 0)
  if (!Number.isFinite(amount) || amount <= 0) {
    return null // Cannot compute delta
  }
  // Outflows (negative delta)
  if (
    type === 'transfer_sol' ||
    type === 'stake' ||
    type === 'lend_supply' ||
    type === 'create_escrow'
  ) {
    return -amount
  }
  // Inflows (positive delta)
  if (
    type === 'unstake' ||
    type === 'release_escrow' ||
    type === 'refund_escrow'
  ) {
    return amount
  }
  return null // No delta check for this type
}
Variance Evaluation
Compare Deltas
Calculate the absolute difference between expected and observed deltas.
const absoluteDelta = Math.abs(observed - expected)
Check Absolute Tolerance
If within the configured tolerance, delta guard passes.
if (absoluteDelta <= DELTA_GUARD_ABSOLUTE_TOLERANCE_LAMPORTS) {
  return { ok: true, reason: 'within absolute tolerance' }
}
Calculate Variance BPS
Compute variance in basis points.
const denom = Math.max(1, Math.abs(expected))
const varianceBps = Math.round((absoluteDelta / denom) * 10000)
Check Threshold
Compare variance to the configured threshold (typically 200-500 bps).
const ok = varianceBps <= thresholdBps
Delta Guard Result
interface DeltaGuardResult {
  ok: boolean;
  expectedLamportsDelta: number | null;
  observedLamportsDelta: number | null;
  varianceBps: number | null;
  reason?: string;
}
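Combining the variance steps with this result shape gives a function along the following lines. It is a sketch under stated assumptions: `evaluateDeltaGuard` is a hypothetical name, passing the check when no expected delta is available is an assumption, and so is leaving `varianceBps` as null on the absolute-tolerance path.

```typescript
interface DeltaGuardResult {
  ok: boolean;
  expectedLamportsDelta: number | null;
  observedLamportsDelta: number | null;
  varianceBps: number | null;
  reason?: string;
}

function evaluateDeltaGuard(
  expected: number | null,
  observed: number | null,
  thresholdBps: number,
  absoluteToleranceLamports: number,
): DeltaGuardResult {
  const base = { expectedLamportsDelta: expected, observedLamportsDelta: observed };

  // Assumption: without both deltas there is nothing to compare, so the guard passes.
  if (expected === null || observed === null) {
    return { ...base, ok: true, varianceBps: null, reason: "delta check skipped" };
  }

  // Small differences pass regardless of their relative size.
  const absoluteDelta = Math.abs(observed - expected);
  if (absoluteDelta <= absoluteToleranceLamports) {
    return { ...base, ok: true, varianceBps: null, reason: "within absolute tolerance" };
  }

  // Relative variance in basis points; the denominator is floored at 1 to avoid dividing by zero.
  const denom = Math.max(1, Math.abs(expected));
  const varianceBps = Math.round((absoluteDelta / denom) * 10000);
  const ok = varianceBps <= thresholdBps;
  return {
    ...base,
    ok,
    varianceBps,
    reason: ok ? "within variance threshold" : "variance exceeds threshold",
  };
}
```

For an expected delta of -1,000,000,000 lamports and an observed delta of -1,050,000,000, the absolute difference is 50,000,000 lamports, which is 500 bps of the expected amount. That passes with a 500 bps threshold and fails with a 300 bps threshold.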
Delta Guard Failures indicate a mismatch between simulation and execution. This could be due to:
Unexpected transaction fees
Rent changes
Protocol fee variations
Simulation drift (different blockhash/slot)
Investigate failed transactions to determine if the variance is acceptable.
Restart Recovery
The outbox queue automatically drains pending work on service restart.
Recovery Flow
Load Persistent State
On startup, the transaction engine loads the SQLite database with all pending and processing jobs.
Reclaim Expired Leases
Jobs in processing state with expired leases return to pending.
Resume Processing
The outbox worker starts polling and claiming jobs from the queue.
Process Backlog
All pending jobs are processed in FIFO order (oldest first).
No manual intervention is required after a restart. The system automatically recovers and continues processing.
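The recovery flow can be illustrated with an in-memory model: expired leases are released, then the backlog is ordered oldest first. The function and type names here (`recoverBacklog`, `RecoverableJob`) are hypothetical, and the real engine performs these steps against SQLite rather than an array.

```typescript
interface RecoverableJob {
  id: string;
  status: "pending" | "processing" | "done" | "failed";
  createdAt: number; // epoch milliseconds
  leaseExpiresAt: number | null;
}

// Release expired leases, then return the pending backlog in FIFO order.
function recoverBacklog(jobs: RecoverableJob[], now: number): RecoverableJob[] {
  for (const job of jobs) {
    const leaseExpired =
      job.status === "processing" &&
      job.leaseExpiresAt !== null &&
      job.leaseExpiresAt <= now;
    if (leaseExpired) {
      job.status = "pending";
      job.leaseExpiresAt = null;
    }
  }
  return jobs
    .filter((job) => job.status === "pending")
    .sort((a, b) => a.createdAt - b.createdAt);
}
```

Under these semantics, a job whose lease has not yet expired at startup stays in processing until the lease runs out, so recovery of that job can lag by up to `TX_OUTBOX_LEASE_MS`.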
Database Migration
The system includes automatic migration from legacy JSON snapshots to SQLite.
Migration Behavior
Automatic Detection - On first startup, checks for legacy snapshot file
One-time Import - Migrates jobs if SQLite database is empty
Preserves History - All job states, attempts, and metadata are preserved
Idempotent - Safe to restart during migration
if (existsSync(legacySnapshotFile) && dbRowCount === 0) {
  // Read as UTF-8 text so JSON.parse receives a string rather than a Buffer
  const snapshot = JSON.parse(readFileSync(legacySnapshotFile, 'utf8'))
  db.transaction(() => {
    for (const job of snapshot.jobs) {
      db.insert(job) // INSERT OR IGNORE for idempotency
    }
  })
}
After successful migration, you can safely delete the legacy JSON snapshot file.