Sinas has a dual-execution model for functions, plus dedicated queue workers for async job processing.

Sandbox Containers

The sandbox container pool is a set of pre-warmed, generic Docker containers for executing untrusted user code. This is the default execution mode for all functions (shared_pool=false). How it works:
  • On startup, the pool creates sandbox_min_size containers (default: 4) ready to accept work.
  • When a function executes, a container is acquired from the idle pool, used, and returned.
  • Containers are recycled (destroyed and replaced) after sandbox_max_executions uses (default: 100) to prevent state leakage between executions.
  • If a container errors during execution, it’s marked as tainted and destroyed immediately.
  • A background replenishment loop monitors the idle count and creates new containers whenever it drops below sandbox_min_idle (default: 2), up to sandbox_max_size (default: 20).
  • Health checks run every 60 seconds to detect and replace dead containers.
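The lifecycle above can be sketched as a small in-memory model (an illustration only, not the actual pool implementation; the class and method names are invented here):

```python
import itertools

class SandboxPool:
    """Toy model of the sandbox pool: acquire/release, recycling after
    max_executions uses, and replenishment below min_idle."""

    def __init__(self, min_size=4, max_size=20, min_idle=2, max_executions=100):
        self.max_size = max_size
        self.min_idle = min_idle
        self.max_executions = max_executions
        self._ids = itertools.count(1)
        # Each container is tracked as {"id": n, "uses": k}
        self.idle = [self._create() for _ in range(min_size)]
        self.in_use = []

    def _create(self):
        return {"id": next(self._ids), "uses": 0}

    def acquire(self):
        if not self.idle:
            raise RuntimeError("no idle container (would wait up to POOL_ACQUIRE_TIMEOUT)")
        c = self.idle.pop(0)
        self.in_use.append(c)
        return c

    def release(self, c, tainted=False):
        self.in_use.remove(c)
        c["uses"] += 1
        # Tainted or exhausted containers are destroyed, never reused.
        if not tainted and c["uses"] < self.max_executions:
            self.idle.append(c)
        self.replenish()

    def replenish(self):
        # Background loop: keep at least min_idle idle containers,
        # never exceeding max_size total.
        while (len(self.idle) < self.min_idle
               and len(self.idle) + len(self.in_use) < self.max_size):
            self.idle.append(self._create())
```

In the real pool, acquisition blocks up to `POOL_ACQUIRE_TIMEOUT` seconds and replenishment runs on a timer rather than synchronously.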
Isolation guarantees: Each container runs with strict resource limits and security hardening:
Constraint             Default
Memory                 512 MB (MAX_FUNCTION_MEMORY)
CPU                    1.0 cores (MAX_FUNCTION_CPU)
Disk                   1 GB (MAX_FUNCTION_STORAGE)
Execution time         300 seconds (FUNCTION_TIMEOUT)
Temp storage           100 MB tmpfs at /tmp
Capabilities           All dropped; only CHOWN/SETUID/SETGID added
Privilege escalation   Disabled (no-new-privileges)
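One way these constraints map onto Docker's API is sketched below, using real parameter names from the Docker Python SDK's `containers.run()`. This is a hedged illustration of how such limits are typically expressed; the image name is a placeholder and the actual implementation may construct containers differently:

```python
def sandbox_container_config(image="sinas-sandbox:latest"):
    """Build the kwargs that would be passed to docker.from_env()
    .containers.run() to enforce the limits in the table above."""
    return {
        "image": image,                          # placeholder image name
        "mem_limit": "512m",                     # MAX_FUNCTION_MEMORY
        "nano_cpus": 1_000_000_000,              # MAX_FUNCTION_CPU = 1.0 cores
        "storage_opt": {"size": "1g"},           # MAX_FUNCTION_STORAGE
        "tmpfs": {"/tmp": "size=100m"},          # 100 MB tmpfs at /tmp
        "cap_drop": ["ALL"],                     # drop all capabilities...
        "cap_add": ["CHOWN", "SETUID", "SETGID"],  # ...then add back three
        "security_opt": ["no-new-privileges"],   # block privilege escalation
        "detach": True,
    }
```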
Runtime scaling: The pool can be scaled up or down at runtime without restarting the application:
# Check current pool state
GET /api/v1/containers/stats
# → {"idle": 2, "in_use": 3, "total": 5, "max_size": 20, ...}

# Scale up for high load
POST /api/v1/containers/scale
{"target": 15}
# → {"action": "scale_up", "previous": 5, "current": 15, "added": 10}

# Scale back down (only removes idle containers — never interrupts running executions)
POST /api/v1/containers/scale
{"target": 4}
# → {"action": "scale_down", "previous": 15, "current": 4, "removed": 11}
Package installation: When new packages are approved, existing containers don’t have them yet. Use the reload endpoint to install approved packages into all idle containers:
POST /api/v1/containers/reload
# → {"status": "completed", "idle_containers": 4, "success": 4, "failed": 0}
Containers that are currently executing are unaffected. New containers created by the replenishment loop automatically include all approved packages.

Shared Containers

Functions marked shared_pool=true run in persistent shared containers instead of sandbox containers. This is an admin-only option for trusted code that benefits from longer-lived containers. Differences from sandbox:
                 Sandbox Containers                    Shared Containers
Trust level      Untrusted user code                   Trusted admin code only
Isolation        Per-request (recycled after N uses)   Shared (persistent containers)
Lifecycle        Created/destroyed automatically       Persist until explicitly scaled down
Scaling          Auto-replenishment + manual           Manual via API only
Load balancing   First available idle container        Round-robin across workers
Best for         User-submitted functions              Admin functions, long-startup libraries
When to use shared_pool=true (shared containers):
  • Functions created and maintained by admins (not user-submitted code)
  • Functions that import heavy libraries (pandas, scikit-learn) where container startup cost matters
  • Performance-critical functions that benefit from warm containers
Management:
# List workers
GET /api/v1/workers

# Check count
GET /api/v1/workers/count
# → {"count": 4}

# Scale workers
POST /api/v1/workers/scale
{"target_count": 6}
# → {"action": "scale_up", "previous_count": 4, "current_count": 6, "added": 2}

# Reload packages in all workers
POST /api/v1/workers/reload
# → {"status": "completed", "total_workers": 6, "success": 6, "failed": 0}

Queue Workers

All function and agent executions are processed asynchronously through Redis-based queues (arq). Two separate worker types handle different workloads:
Worker             Docker service   Queue                   Concurrency      Retries
Function workers   queue-worker     sinas:queue:functions   10 jobs/worker   Up to 3
Agent workers      queue-agent      sinas:queue:agents      5 jobs/worker    None (not idempotent)
Function workers dequeue function execution jobs, route them to either sandbox or shared containers, track results in Redis, and handle retries. Failed jobs that exhaust retries are moved to a dead letter queue (DLQ) for inspection and manual retry.

Agent workers handle chat message processing: they call the LLM, execute tool calls, and stream responses back via Redis Streams. Agent jobs are not retried because LLM calls with tool execution have side effects.

Scaling is controlled via Docker Compose replicas:
# docker-compose.yml
queue-worker:
  command: python -m arq app.queue.worker.WorkerSettings
  deploy:
    replicas: ${QUEUE_WORKER_REPLICAS:-2}

queue-agent:
  command: python -m arq app.queue.worker.AgentWorkerSettings
  deploy:
    replicas: ${QUEUE_AGENT_REPLICAS:-2}
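A function-worker settings class along these lines is what arq expects (a sketch only: `functions`, `queue_name`, `max_jobs`, `max_tries`, and `job_timeout` are real arq `WorkerSettings` attributes, but the handler here is a stub and the real `app.queue.worker` module will differ):

```python
# Stub job handler standing in for the real function-execution job.
async def execute_function(ctx, function_id: str, payload: dict):
    """Dequeued job: route to a sandbox or shared container and
    record the result in Redis (omitted in this sketch)."""
    ...

class WorkerSettings:
    functions = [execute_function]
    queue_name = "sinas:queue:functions"   # matches the table above
    max_jobs = 10                          # QUEUE_FUNCTION_CONCURRENCY
    max_tries = 3                          # QUEUE_MAX_RETRIES
    job_timeout = 300                      # FUNCTION_TIMEOUT
    # redis_settings = RedisSettings(...) would point at the Redis broker
```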
Each worker sends a heartbeat to Redis every 10 seconds (TTL: 30 seconds). If a worker dies, its heartbeat key auto-expires, making it easy to detect dead workers.

Job status tracking:
# Check job status
GET /jobs/{job_id}
# → {"status": "completed", "execution_id": "...", ...}

# Get job result
GET /jobs/{job_id}/result
# → {function output}
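The heartbeat scheme described above can be sketched like this (the key layout is an assumption; `redis.set(..., ex=...)` is the standard redis-py call that refreshes the 30-second expiry on each beat):

```python
import time

HEARTBEAT_INTERVAL = 10   # seconds between writes
HEARTBEAT_TTL = 30        # key expiry: three missed beats marks a worker dead

def heartbeat_key(worker_id: str) -> str:
    # Assumed key layout; the real prefix may differ.
    return f"sinas:worker:heartbeat:{worker_id}"

def send_heartbeat(redis, worker_id: str) -> None:
    # SET key value EX 30 — each beat pushes the expiry forward.
    redis.set(heartbeat_key(worker_id), int(time.time()), ex=HEARTBEAT_TTL)

def dead_workers(known_ids, live_keys):
    """Workers whose heartbeat key has expired are considered dead."""
    alive = {k.rsplit(":", 1)[-1] for k in live_keys}
    return set(known_ids) - alive
```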
Jobs go through states: queued → running → completed or failed. Stale or orphaned jobs can be cancelled via the admin API:
# Cancel a running or queued job
POST /api/v1/queue/jobs/{job_id}/cancel
# → {"status": "cancelled", "job_id": "..."}
Cancellation updates the Redis status to cancelled and marks the DB execution record as CANCELLED. It also publishes to the done channel so any waiters unblock. This is a soft cancel — it does not kill running containers. Results are stored in Redis with a 24-hour TTL.
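The soft-cancel steps can be sketched as follows (key and channel names are assumptions, and the `db` call stands in for the real ORM update):

```python
RESULT_TTL = 24 * 3600  # results are kept in Redis for 24 hours

def cancel_job(redis, db, job_id: str) -> dict:
    """Soft cancel: flip the Redis status, mark the DB record,
    and wake any waiters. Does NOT kill a running container."""
    status_key = f"sinas:job:{job_id}:status"   # assumed key layout
    done_channel = f"sinas:job:{job_id}:done"   # assumed channel name

    redis.set(status_key, "cancelled", ex=RESULT_TTL)
    db.mark_execution_cancelled(job_id)         # stand-in for the ORM update
    redis.publish(done_channel, "cancelled")    # unblocks any waiters
    return {"status": "cancelled", "job_id": job_id}
```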

System Endpoints

Admin endpoints for monitoring and managing the Sinas deployment. All require sinas.system.read:all or sinas.system.update:all permissions. Health check:
GET /api/v1/system/health
Returns a comprehensive health report:
  • services — All Docker Compose containers with status, health, uptime, CPU %, and memory usage. Infrastructure containers (redis, postgres, pgbouncer) are listed first, followed by application containers sorted alphabetically. Sandbox and shared worker containers are included.
  • host — Host-level CPU, memory, and disk usage (read from /proc on Linux).
  • warnings — Auto-generated alerts at three levels:
    • critical — No queue workers running, or infrastructure services (redis, postgres, pgbouncer) down
    • warning — Non-infrastructure services down, unhealthy containers, DLQ items, queue backlog >50, disk/memory >90%
    • info — Disk/memory >75%
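The three warning levels can be sketched as a pure function over the health metrics (the `metrics` shape is an assumption for illustration; DLQ and container-health checks are omitted):

```python
def health_warnings(metrics: dict) -> list:
    """Derive warnings from a health report, following the levels above."""
    warnings = []
    infra = {"redis", "postgres", "pgbouncer"}

    if metrics.get("queue_workers", 0) == 0:
        warnings.append({"level": "critical", "msg": "no queue workers running"})
    for svc in metrics.get("services_down", []):
        # Infrastructure outages are critical; other services only warn.
        level = "critical" if svc in infra else "warning"
        warnings.append({"level": level, "msg": f"{svc} is down"})
    if metrics.get("queue_backlog", 0) > 50:
        warnings.append({"level": "warning", "msg": "queue backlog > 50"})
    for name in ("disk_pct", "memory_pct"):
        pct = metrics.get(name, 0)
        if pct > 90:
            warnings.append({"level": "warning", "msg": f"{name} > 90%"})
        elif pct > 75:
            warnings.append({"level": "info", "msg": f"{name} > 75%"})
    return warnings
```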
Container restart:
POST /api/v1/system/containers/{container_name}/restart
# → {"status": "restarted", "container": "sinas-backend"}
Restarts any Docker container by name (15-second timeout). Returns 404 if the container doesn’t exist.

Flush stuck jobs:
POST /api/v1/system/flush-stuck-jobs
Cancels all jobs that have been stuck in running state for over 2 hours. Useful for recovering from worker crashes or orphaned jobs.
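The selection logic amounts to filtering on elapsed time (a sketch: each job record is assumed to carry a `status` and a `started_at` Unix timestamp, though the real records live in Redis):

```python
import time

STUCK_THRESHOLD = 2 * 3600  # flush jobs running for more than 2 hours

def find_stuck_jobs(jobs, now=None):
    """Return IDs of jobs stuck in `running` past the threshold."""
    now = time.time() if now is None else now
    return [
        j["job_id"]
        for j in jobs
        if j["status"] == "running" and now - j["started_at"] > STUCK_THRESHOLD
    ]
```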

Dependencies (Python Packages)

Functions can only use Python packages that have been approved by an admin. This prevents untrusted code from installing arbitrary dependencies. Approval flow:
  1. Admin approves a dependency (optionally pinning a version)
  2. Package becomes available in newly created containers and workers
  3. Use POST /containers/reload or POST /workers/reload to install into existing containers
POST   /api/v1/dependencies              # Approve dependency (admin)
GET    /api/v1/dependencies              # List approved dependencies
DELETE /api/v1/dependencies/{id}         # Remove approval (admin)
Optionally restrict which packages can be approved with a whitelist:
# In .env — only these packages can be approved
ALLOWED_PACKAGES=requests,pandas,numpy,redis,boto3
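The whitelist check reduces to parsing that comma-separated value (a sketch; the function name is invented here):

```python
def is_package_allowed(name: str, allowed_packages: str) -> bool:
    """An empty ALLOWED_PACKAGES means any package may be approved;
    otherwise the name must appear in the comma-separated whitelist."""
    allowed_packages = allowed_packages.strip()
    if not allowed_packages:
        return True
    whitelist = {p.strip() for p in allowed_packages.split(",")}
    return name in whitelist
```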

Configuration Reference

Container pool:
Variable               Default   Description
POOL_MIN_SIZE          4         Containers created on startup
POOL_MAX_SIZE          20        Maximum total containers
POOL_MIN_IDLE          2         Replenish when idle count drops below this
POOL_MAX_EXECUTIONS    100       Recycle container after this many uses
POOL_ACQUIRE_TIMEOUT   30        Seconds to wait for an available container
Function execution:
Variable                          Default   Description
FUNCTION_TIMEOUT                  300       Max execution time in seconds
MAX_FUNCTION_MEMORY               512       Memory limit per container (MB)
MAX_FUNCTION_CPU                  1.0       CPU cores per container
MAX_FUNCTION_STORAGE              1g        Disk storage limit
FUNCTION_CONTAINER_IDLE_TIMEOUT   3600      Idle container cleanup (seconds)
Workers and queues:
Variable                     Default   Description
DEFAULT_WORKER_COUNT         4         Shared workers created on startup
QUEUE_WORKER_REPLICAS        2         Function queue worker processes
QUEUE_AGENT_REPLICAS         2         Agent queue worker processes
QUEUE_FUNCTION_CONCURRENCY   10        Concurrent jobs per function worker
QUEUE_AGENT_CONCURRENCY      5         Concurrent jobs per agent worker
QUEUE_MAX_RETRIES            3         Retry attempts before DLQ
QUEUE_RETRY_DELAY            10        Seconds between retries
Packages:
Variable                     Default   Description
ALLOW_PACKAGE_INSTALLATION   true      Enable pip in containers
ALLOWED_PACKAGES             (empty)   Comma-separated whitelist (empty = all allowed)