Security Architecture
This section describes the security architecture and assumptions of Backend.AI, focusing on network isolation requirements for production deployments.
Warning
Critical Security Requirement: Network Isolation
Backend.AI assumes compute nodes (where Agents run) are deployed in a network-isolated environment with restricted inbound access. Direct access to compute nodes from untrusted networks must be prevented through proper network configuration.
Network Architecture Overview
Backend.AI follows a defense-in-depth approach where different components are deployed in isolated network zones with controlled access paths.
Architecture Diagram
The following diagram illustrates the expected network architecture and traffic flow in a properly configured Backend.AI cluster:
┌─────────────────────────────────────────────────────────────┐
│                       Public Network                        │
│                     (Untrusted Network)                     │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           │ HTTPS (443)
                           │
                     ┌─────▼─────┐
                     │ Firewall  │
                     │   / WAF   │
                     └─────┬─────┘
                           │
┌──────────────────────────┼──────────────────────────────────┐
│                          │         Management Zone          │
│                    ┌─────▼─────┐                            │
│                    │ Webserver │ ◄─── Web UI / REST API     │
│                    └─────┬─────┘                            │
│                          │                                  │
│                    ┌─────▼─────┐                            │
│                    │  Manager  │ ◄─── Business Logic        │
│                    └─────┬─────┘                            │
│                          │                                  │
│                    ┌─────▼─────┐                            │
│                    │ AppProxy  │ ◄─── Interactive Sessions  │
│                    └─────┬─────┘                            │
│                          │                                  │
└──────────────────────────┼──────────────────────────────────┘
                           │
                           │ Internal Network Only
                           │ (No Direct Public Access)
                           │
┌──────────────────────────┼──────────────────────────────────┐
│                          │           Compute Zone           │
│                          │        (Private Network)         │
│                    ┌─────▼─────┐                            │
│           ┌────────┤   Agent   ├────────┐                   │
│           │        └───────────┘        │                   │
│     ┌─────▼─────┐                 ┌─────▼─────┐             │
│     │ Container │                 │ Container │             │
│     │ (Session) │                 │ (Session) │             │
│     └───────────┘                 └───────────┘             │
│                                                             │
└─────────────────────────────────────────────────────────────┘
Traffic Flow
Authorized Access Path:
User → Webserver: Users connect to the webserver via HTTPS through a firewall or Web Application Firewall (WAF)
Webserver → Manager: The webserver forwards authenticated requests to the manager for processing
Manager → Agent: The manager communicates with agents in the compute zone over the internal network for session lifecycle management
User → AppProxy → Agent → Container: For interactive sessions (notebooks, terminals, web apps), users connect through AppProxy, which proxies traffic to containers running on agents
Blocked Access Path:
User → Agent: Direct access from the public network to agents must be blocked
User → Container: Direct access from the public network to containers must be blocked
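The authorized and blocked paths above can be modeled as a small allow-list of hops. The following Python sketch is illustrative only: `ALLOWED_HOPS`, `first_blocked_hop`, and `is_path_allowed` are hypothetical names that mirror the lists in this section and are not part of Backend.AI.

```python
from typing import Optional, Sequence, Tuple

# Hops permitted by the "Authorized Access Path" list in this section.
# Anything not listed here (for example User -> Agent) is treated as blocked.
ALLOWED_HOPS = {
    ("User", "Webserver"),
    ("Webserver", "Manager"),
    ("Manager", "Agent"),
    ("User", "AppProxy"),
    ("AppProxy", "Agent"),
    ("Agent", "Container"),
}


def first_blocked_hop(path: Sequence[str]) -> Optional[Tuple[str, str]]:
    """Return the first hop in `path` that is not allowed, or None if every hop is allowed."""
    for hop in zip(path, path[1:]):
        if hop not in ALLOWED_HOPS:
            return hop
    return None


def is_path_allowed(path: Sequence[str]) -> bool:
    """A path is allowed only if it has at least one hop and no hop is blocked."""
    return len(path) >= 2 and first_blocked_hop(path) is None
```

For example, `is_path_allowed(["User", "AppProxy", "Agent", "Container"])` evaluates to `True`, while `is_path_allowed(["User", "Agent"])` evaluates to `False` because the only hop skips the management zone.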
Network Zones
Management Zone
The management zone contains Backend.AI control plane components:
Webserver: Web UI and REST API gateway
Manager: Core business logic and orchestration
AppProxy: Interactive session proxy
Database: PostgreSQL for persistent state
Etcd: Configuration and coordination
Redis: Caching and pub/sub
Network Requirements:
Must be accessible from trusted networks (VPN, corporate network)
Should be protected by firewall rules allowing only necessary ports
Should implement rate limiting and DDoS protection
TLS/SSL encryption must be enabled for all external-facing services
Compute Zone
The compute zone contains Backend.AI data plane components:
Agents: Container orchestration and resource management
Containers: User computation workloads
Network Requirements:
CRITICAL: Must be isolated in a private network with NO direct inbound access from untrusted networks
Agents must be able to initiate connections to management zone components
Containers should be accessible only through the AppProxy tunnel
Inter-agent communication over overlay networks is required for multi-node cluster sessions
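One way to reason about the compute zone requirements is as a source-address check on inbound traffic. The sketch below uses Python's standard `ipaddress` module. The CIDR ranges and the `inbound_to_compute_allowed` helper are assumptions chosen for illustration; a real deployment would enforce this rule in firewall or security-group configuration, using its own address plan.

```python
import ipaddress

# Hypothetical address plan; substitute the ranges used in your deployment.
MANAGEMENT_ZONE = ipaddress.ip_network("10.0.1.0/24")
COMPUTE_ZONE = ipaddress.ip_network("10.0.2.0/24")


def inbound_to_compute_allowed(source_ip: str) -> bool:
    """Return True if traffic from `source_ip` may enter the compute zone.

    Management zone sources cover Manager and AppProxy traffic.
    Compute zone sources cover inter-agent traffic for multi-node sessions.
    Every other source is treated as untrusted and rejected.
    """
    addr = ipaddress.ip_address(source_ip)
    return addr in MANAGEMENT_ZONE or addr in COMPUTE_ZONE
```

With these example ranges, a manager at `10.0.1.15` and a peer agent at `10.0.2.7` are accepted, while a public address such as `203.0.113.9` is rejected.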
Security Considerations
Interactive Session Access Control
Issue: Interactive sessions do not verify authorization when they are accessed directly rather than through AppProxy
Impact: If compute nodes are reachable from untrusted networks, attackers could bypass the authentication layer and access running sessions
Mitigation: By design, Backend.AI assumes compute nodes are deployed in a network-isolated environment. This is not a software vulnerability but an operational security requirement that must be enforced through proper network configuration.
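Because the mitigation depends on network configuration, it can be useful to test reachability from outside the compute zone. The following is a minimal sketch using Python's standard `socket` module. The helper names `is_port_reachable` and `find_exposed` are hypothetical, and any hostnames or ports passed to them are placeholders to be replaced with values from your own agent configuration.

```python
import socket
from typing import Iterable, List, Tuple


def is_port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, DNS failures, and unreachable hosts.
        return False


def find_exposed(
    targets: Iterable[Tuple[str, int]], timeout: float = 2.0
) -> List[Tuple[str, int]]:
    """Return the subset of (host, port) targets that accept TCP connections."""
    return [(host, port) for host, port in targets if is_port_reachable(host, port, timeout)]
```

Run from a host on an untrusted network, a call such as `find_exposed([("agent-1.example.internal", 6001)])` (placeholder hostname and port) should return an empty list in a correctly isolated deployment. An empty result only covers the ports that were probed, so it supplements firewall review and does not replace it.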
References
Installation Guides - Backend.AI Installation Guide
Cluster Networking - Networking Concepts
Install Backend.AI Agent - Agent Installation