Task Worker

The bridge between the job queue and EVA. Consumes SQS messages, enforces billing and concurrency, streams EVA events to Redis, and handles retries on failure.

tinyfish-io/ux-labs → task-worker/

Tech Stack

Language: TypeScript (ESM)
Runtime: Node.js ≥ 20
Deploy: ECS on EC2 (t3.medium)
Concurrency: 10 jobs per task

Job Pipeline

Each job passes through three gates before reaching EVA. Any gate can reject → SQS retry.

1. SQS Queue — removed; long-poll 20s
2. Cancellation Check (gate) — removed; Redis cancelled:{runId}
3. Concurrency Slot (gate) — removed; Redis Lua INCR/DECR
4. Billing Check (gate) — removed; Alguna API (fail-open)
5. EVA SSE Stream — removed; POST /create-and-run-session-sse
6. Redis Pub/Sub — removed; run-events:{runId}

Steps 2–4 are the gates (each can reject → retry). On failure: SQS visibility timeout → retry (max 3).

Retry Architecture

Two retry layers protect against transient failures.

EVA-Level Retry

p-retry with exponential backoff

  • TetraError → retry
  • 30s → 60s → 120s
  • Up to 5 attempts
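The worker uses the p-retry package for this; a dependency-free sketch of the equivalent policy (function and parameter names are hypothetical, not the worker's actual API) looks like:

```typescript
// Sketch of the EVA-level retry policy: retry only TetraError,
// exponential backoff 30s -> 60s -> 120s, up to 5 attempts total.
const BASE_DELAY_MS = 30_000;
const MAX_ATTEMPTS = 5;

// Delay before re-attempting after failure n (1-based), doubling each time.
function backoffMs(failureNumber: number): number {
  return BASE_DELAY_MS * 2 ** (failureNumber - 1);
}

// Stand-in for the transient error class the worker treats as retryable.
class TetraError extends Error {}

async function callEvaWithRetry<T>(
  fn: () => Promise<T>,
  // sleep is injectable so the schedule can be tested without waiting.
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Non-transient errors and the final attempt propagate immediately.
      if (!(err instanceof TetraError) || attempt >= MAX_ATTEMPTS) throw err;
      await sleep(backoffMs(attempt));
    }
  }
}
```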

SQS-Level Retry

Redis failure counter per message

  • Failure count < 3 → visibility timeout 60s → re-poll
  • Failure count ≥ 3 → dead letter
  • Worst case: 5 × 3 = 15 EVA calls
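The decision the consumer makes after each failure can be sketched as a pure function (names are illustrative, not the worker's actual identifiers):

```typescript
// Sketch of the SQS-level retry decision: a per-message failure counter
// (kept in Redis in the real worker) decides between re-poll and DLQ.
const MAX_FAILURES = 3;
const VISIBILITY_TIMEOUT_S = 60;

type RetryDecision =
  | { action: "retry"; visibilityTimeoutS: number } // message reappears after 60s
  | { action: "deadLetter" };                       // give up, route to DLQ

function decideAfterFailure(failureCount: number): RetryDecision {
  if (failureCount < MAX_FAILURES) {
    return { action: "retry", visibilityTimeoutS: VISIBILITY_TIMEOUT_S };
  }
  return { action: "deadLetter" };
}
```

Since each SQS delivery can itself run up to 5 EVA-level attempts, the worst case is 3 × 5 = 15 EVA calls per job.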

Key Concepts

Concurrency Control

Per-user slots via a Redis Lua script (atomic INCR/DECR with TTL). Default 2 slots per user, 20 for internal users. The TTL makes slots crash-safe: a slot held by a crashed worker expires instead of leaking.
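An in-memory sketch of the slot semantics (the real worker performs the check-and-increment atomically inside a Redis Lua script; here a single-threaded Map stands in for Redis, and all names are hypothetical):

```typescript
// Sketch of per-user concurrency slots: INCR, then roll back if over
// the limit - this mirrors the Lua script's check-and-increment.
const DEFAULT_LIMIT = 2;   // slots per external user
const INTERNAL_LIMIT = 20; // slots per internal user

const slots = new Map<string, number>();

function slotLimit(isInternal: boolean): number {
  return isInternal ? INTERNAL_LIMIT : DEFAULT_LIMIT;
}

function tryAcquire(userId: string, isInternal = false): boolean {
  const next = (slots.get(userId) ?? 0) + 1;
  if (next > slotLimit(isInternal)) return false; // slot denied, job retries
  slots.set(userId, next);
  return true;
}

function release(userId: string): void {
  // DECR, floored at 0. In Redis the key also carries a TTL so slots
  // leaked by a crashed worker expire on their own.
  slots.set(userId, Math.max(0, (slots.get(userId) ?? 0) - 1));
}
```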

Billing Gate

Alguna API check — fail-open on transient errors. Internal @tinyfish.io users bypass entirely.
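The fail-open behavior can be sketched as follows (the billing check is an injected placeholder for the Alguna call; the function name and signature are illustrative):

```typescript
// Sketch of the billing gate: internal users bypass it, and a transient
// billing-API error fails OPEN so jobs are not blocked by an outage.
async function canRunJob(
  userEmail: string,
  checkBilling: (email: string) => Promise<boolean>, // hypothetical Alguna call
): Promise<boolean> {
  if (userEmail.endsWith("@tinyfish.io")) return true; // internal bypass
  try {
    return await checkBilling(userEmail);
  } catch {
    // Fail-open: treat a billing-API failure as "allowed".
    return true;
  }
}
```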

Scaling

Auto-scaled: ceil(SQS messages / 10). Sandbox: 10–40 tasks. Prod: 1–3 tasks.
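The scaling target described above amounts to a one-line formula, clamped to each environment's bounds (names are illustrative):

```typescript
// Sketch of the auto-scaling target: one task handles 10 concurrent jobs,
// so the desired task count is ceil(queue depth / 10), clamped per env.
const JOBS_PER_TASK = 10;

function desiredTaskCount(queueDepth: number, minTasks: number, maxTasks: number): number {
  const raw = Math.ceil(queueDepth / JOBS_PER_TASK);
  return Math.min(maxTasks, Math.max(minTasks, raw));
}
```

For example, 250 queued messages in sandbox (bounds 10–40) targets 25 tasks, while any backlog in prod (bounds 1–3) caps out at 3.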

Cancellation

Redis cancelled:{runId} flag. Checked before slot acquisition. EVA also checks each LLM step.
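The ordering matters: the flag is checked before a slot is spent. A sketch of that gate sequence (an in-memory Set stands in for the Redis key, and acquireSlot/runEva are hypothetical injected callbacks):

```typescript
// Sketch of the pre-slot cancellation check: a cancelled run is skipped
// before it consumes a concurrency slot.
const cancelled = new Set<string>(); // stands in for cancelled:{runId} keys

async function processJob(
  runId: string,
  acquireSlot: () => boolean,       // hypothetical concurrency gate
  runEva: () => Promise<void>,      // hypothetical EVA call
): Promise<"cancelled" | "no-slot" | "done"> {
  if (cancelled.has(runId)) return "cancelled"; // check BEFORE taking a slot
  if (!acquireSlot()) return "no-slot";         // gate rejects -> SQS retry
  await runEva(); // EVA re-checks the flag at each LLM step on its side
  return "done";
}
```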

Code Pointers

task-worker/src/index.ts — removed; entry: config → Sentry → Sonar → ECSWorker
task-worker/src/sqs-consumer.ts — removed; infinite poll loop, heartbeat, retry tracking
task-worker/src/eva-client.ts — removed; HTTP + SSE stream parsing from EVA
task-worker/src/redis-publisher.ts — removed; publishes events to run-events:{runId}
task-worker/src/concurrency-manager.ts — removed; per-user slots via Redis Lua