Failure Domains

QNSP is designed with isolated failure domains to contain blast radius.

Service Isolation

Each service can fail independently:

  • Separate process/container
  • Independent health checks
  • Circuit breakers on dependencies

Service Ports & Health Endpoints

Service          Port  Health Endpoints
edge-gateway     8107  GET /health, GET /proxy/health
platform-api     8080  GET /proxy/platform/health, GET /edge/platform/health
auth-service     8081  GET /proxy/auth/health, GET /edge/auth/health
vault-service    8090  GET /proxy/vault/health, GET /edge/vault/health
kms-service      8095  GET /proxy/kms/health, GET /edge/kms/health
storage-service  8092  GET /proxy/storage/health, GET /edge/storage/health
audit-service    8103  GET /proxy/audit/health, GET /edge/audit/health
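
As a rough illustration of how these endpoints might be probed, the sketch below builds a URL for each service and treats any 2xx response as healthy. The ports and paths are taken from the table; the host name, the 2-second timeout, and the helper names (health_url, is_healthy) are assumptions made for the example. It also assumes each path is requested on the port listed in the same row; if the /proxy/ and /edge/ paths are actually served through the edge gateway, the port would need adjusting.

```python
import urllib.error
import urllib.request

# Ports and first-listed health paths copied from the table above.
HEALTH_TARGETS = {
    "edge-gateway": (8107, "/health"),
    "platform-api": (8080, "/proxy/platform/health"),
    "auth-service": (8081, "/proxy/auth/health"),
    "vault-service": (8090, "/proxy/vault/health"),
    "kms-service": (8095, "/proxy/kms/health"),
    "storage-service": (8092, "/proxy/storage/health"),
    "audit-service": (8103, "/proxy/audit/health"),
}


def health_url(service, host="localhost"):
    """Build the health URL for a service (host is an assumed default)."""
    port, path = HEALTH_TARGETS[service]
    return f"http://{host}:{port}{path}"


def is_healthy(url, timeout=2.0, opener=urllib.request.urlopen):
    """Return True if GET url answers with a 2xx status within timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP error statuses all count as unhealthy.
        return False


if __name__ == "__main__":
    for name in HEALTH_TARGETS:
        state = "up" if is_healthy(health_url(name)) else "down"
        print(f"{name}: {state}")
```

The opener parameter exists only so the check can be exercised without a running service.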

Failure Modes

Edge Gateway Failure

  • Impact: All external traffic blocked
  • Mitigation: Multi-instance deployment, health-based routing
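
One possible reading of health-based routing is a round-robin selector that skips instances currently marked unhealthy. The sketch below is illustrative only: the HealthAwareRouter class and the instance names are hypothetical and do not describe QNSP's actual load balancer.

```python
import itertools


class HealthAwareRouter:
    """Round-robin over instances, skipping any marked unhealthy."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._healthy = set(self._instances)  # all instances start healthy
        self._cursor = itertools.cycle(self._instances)

    def mark(self, instance, healthy):
        """Record the latest health-check result for an instance."""
        if healthy:
            self._healthy.add(instance)
        else:
            self._healthy.discard(instance)

    def next_instance(self):
        """Return the next healthy instance, or raise if none remain."""
        for _ in range(len(self._instances)):
            candidate = next(self._cursor)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy instances available")
```

A health checker would call mark() after each probe, so a failed gateway instance stops receiving traffic until it recovers.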

Auth Service Failure

  • Impact: No new tokens issued
  • Mitigation: Cached token validation, graceful degradation

KMS Service Failure

  • Impact: No new encrypt/decrypt operations
  • Mitigation: Cached keys (TTL: 600s default), queue pending operations
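
The key-caching mitigation can be pictured as a small time-bounded cache. The sketch below uses the 600-second default mentioned above; the TTLKeyCache class, the get_key helper, and the fetch_from_kms callable are hypothetical names for illustration, not the service's real interface.

```python
import time


class TTLKeyCache:
    """Hold key material for a limited time so it can be reused
    while the KMS is unreachable."""

    def __init__(self, ttl_seconds=600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key_id -> (expires_at, key_material)

    def put(self, key_id, key_material):
        self._entries[key_id] = (self._clock() + self._ttl, key_material)

    def get(self, key_id):
        """Return cached key material, or None if absent or expired."""
        entry = self._entries.get(key_id)
        if entry is None:
            return None
        expires_at, key_material = entry
        if self._clock() >= expires_at:
            del self._entries[key_id]  # evict the expired entry
            return None
        return key_material


def get_key(cache, key_id, fetch_from_kms):
    """Serve from the cache when possible; otherwise ask the KMS."""
    cached = cache.get(key_id)
    if cached is not None:
        return cached
    key_material = fetch_from_kms(key_id)  # may raise if the KMS is down
    cache.put(key_id, key_material)
    return key_material
```

With this shape, operations on recently used keys keep working for up to the TTL during a KMS outage, while requests for uncached keys still fail and would need to be queued or rejected.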

Storage Service Failure

  • Impact: No data read/write
  • Mitigation: Retry with backoff, client-side caching
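
Retry with backoff is commonly implemented as exponential backoff with jitter. The sketch below is one such variant; the attempt count, delay values, and the retry_with_backoff name are assumptions for illustration rather than QNSP defaults.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep, retry_on=(OSError,)):
    """Call operation(), retrying on the given exceptions.

    The wait before each retry is drawn uniformly from
    [0, min(max_delay, base_delay * 2**attempt)] ("full jitter").
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

Retries are only safe for idempotent operations; a write that may have partially succeeded needs deduplication on the storage side before it can be retried blindly.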

Graceful Degradation

Services implement:

  • Timeouts on all external calls
  • Circuit breakers (open after N failures)
  • Fallback responses where safe
  • A health endpoint for the orchestrator to probe
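
The circuit-breaker and fallback items above can be sketched together. In the minimal version below, the breaker opens after a configurable number of consecutive failures, fails fast while open, and lets a single trial call through once a cool-down has elapsed. The threshold of 3, the 30-second reset, and the CircuitBreaker and CircuitOpenError names are assumptions for illustration; the source only states that breakers open after N failures.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0      # consecutive failures seen so far
        self._opened_at = None  # time the circuit opened, or None if closed

    def call(self, operation, fallback=None):
        """Run operation() through the breaker, using fallback() if given."""
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                # Open: fail fast without touching the dependency.
                if fallback is not None:
                    return fallback()
                raise CircuitOpenError("circuit is open")
            # Cool-down elapsed (half-open): let one trial call through.
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if (self._opened_at is not None
                    or self._failures >= self.failure_threshold):
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            if fallback is not None:
                return fallback()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

The operation passed in should enforce its own timeout, so that a hung dependency registers as a failure instead of blocking the caller. A fallback should only be supplied where a default or stale answer is safe to return.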