Wild Cloud Setup Process & Infrastructure
Wild Cloud provides a complete, production-ready Kubernetes infrastructure designed for personal use. It combines enterprise-grade technologies to create a self-hosted cloud platform with automated deployment, HTTPS certificates, and web management interfaces.
Setup Phases Overview
The Wild Cloud setup follows a sequential, dependency-aware process:
1. Environment Setup - Install required tools and dependencies
2. DNS/Network Foundation - Set up dnsmasq for DNS and PXE booting
3. Cluster Infrastructure - Deploy Talos Linux nodes and the Kubernetes cluster
4. Cluster Services - Install core services (ingress, storage, certificates, etc.)
Phase 1: Environment Setup
Dependencies Installation
Script: scripts/setup-utils.sh
Required Tools:
- `kubectl` - Kubernetes CLI
- `gomplate` - Template processor for configuration
- `kustomize` - Kubernetes configuration management
- `yq` - YAML processor
- `restic` - Backup tool
- `talosctl` - Talos Linux cluster management
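The setup script installs these automatically. As a quick sanity check, something like the following (an illustrative snippet, not part of `setup-utils.sh`) confirms each tool is on the PATH:

```bash
# Illustrative preflight check (not part of the official scripts):
# report any required tool that is missing from PATH.
for tool in kubectl gomplate kustomize yq restic talosctl; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done
```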
Project Initialization
Command: wild-init
Creates the basic Wild Cloud directory structure:
- `.wildcloud/` - Project marker and cache
- `config.yaml` - Main configuration file
- `secrets.yaml` - Sensitive data storage
- Basic project scaffolding
Phase 2: DNS/Network Foundation
dnsmasq Infrastructure
Location: setup/dnsmasq/
Requirements: Dedicated Linux machine with static IP
Services Provided:
- LAN DNS Server
  - Forwards internal domains (`*.internal.domain.com`) to the cluster
  - Forwards external domains (`*.domain.com`) to the cluster
  - Provides DNS resolution for the entire network
- PXE Boot Server
  - Enables network booting for cluster node installation
  - DHCP/TFTP services for Talos Linux deployment
  - Automated node provisioning
Network Configuration Example:
```yaml
network:
  subnet: 192.168.1.0/24
  gateway: 192.168.1.1
  dnsmasq_ip: 192.168.1.50
  dhcp_range: 192.168.1.100-200
  metallb_pool: 192.168.1.80-89
  control_plane_vip: 192.168.1.90
  node_ips: 192.168.1.91-93
```
Phase 3: Cluster Infrastructure Setup
Talos Linux Foundation
Command: wild-setup-cluster
Talos Configuration:
- Version: v1.11.0 (configurable)
- Immutable OS: Designed specifically for Kubernetes
- System Extensions:
- Intel microcode updates
- iSCSI tools for storage
- gVisor container runtime
- NVIDIA GPU support (optional)
- Additional system utilities
Cluster Setup Process
1. Configuration Generation
Script: wild-cluster-config-generate
- Generates base Talos configurations (`controlplane.yaml`, `worker.yaml`)
- Creates cluster secrets using `talosctl gen config` (see the sketch after this list)
- Establishes the foundation for all node configurations
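Under the hood this wraps a `talosctl gen config` invocation along these lines (the cluster name is a placeholder; the endpoint uses the control plane VIP from the network configuration):

```bash
# Sketch of the underlying talosctl invocation (cluster name is hypothetical).
# Writes controlplane.yaml, worker.yaml, and talosconfig to the current directory.
talosctl gen config my-cluster https://192.168.1.90:6443
```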
2. Node Setup (Atomic Operations)
Script: wild-node-setup <node-name> [options]
Complete Node Lifecycle Management:
- Hardware Detection: Discovers network interfaces and storage devices
- Configuration Generation: Creates node-specific patches and final configs
- Deployment: Applies Talos configuration to the node
Options:
- `--detect`: Force hardware re-detection
- `--no-deploy`: Generate configuration only, skip deployment
Integration with Cluster Setup:
- `wild-setup-cluster` automatically calls `wild-node-setup` for each node
- Individual node failures don't break cluster setup
- Clear retry instructions for failed nodes
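In practice, retrying a single failed node looks something like this (the node name `node-1` is a placeholder):

```bash
# Retry one node without re-running the whole cluster setup:
wild-node-setup node-1 --detect   # full retry with forced hardware re-detection
wild-node-setup node-1            # plain retry using previously detected hardware
```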
Cluster Architecture
Control Plane:
- 3 nodes for high availability
- Virtual IP (VIP) for load balancing
- etcd distributed across all control plane nodes
Worker Nodes:
- Variable count (configured during setup)
- Dedicated workload execution
- Storage participation via Longhorn
Networking:
- All nodes on same LAN segment
- Sequential IP assignment
- MetalLB integration for load balancing
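The control plane VIP is realized through Talos's shared-IP feature: each control plane node's machine config declares the VIP on its network interface and Talos handles failover between nodes. A minimal sketch of such a patch, assuming an `eth0` interface (Wild Cloud generates the real patches during node setup):

```yaml
# Hypothetical control-plane patch fragment; the interface name is an assumption.
machine:
  network:
    interfaces:
      - interface: eth0
        dhcp: true
        vip:
          ip: 192.168.1.90   # control_plane_vip from the network configuration
```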
Phase 4: Cluster Services Installation
Services Deployment Process
Command: wild-setup-services [options]
- `--fetch`: Fetch fresh templates before setup
- `--no-deploy`: Configure only, skip deployment
New Architecture: Per-service atomic operations
- Uses `wild-service-setup <service>` for each service in dependency order
- Each service handles the complete lifecycle: fetch → configure → deploy
- Dependency validation before each service deployment
- Fail-fast with clear recovery instructions
Individual Service Management: wild-service-setup <service> [options]
- Default: Configure and deploy using existing templates
- `--fetch`: Fetch fresh templates before setup
- `--no-deploy`: Configure only, skip deployment
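Typical invocations (service names come from the list below):

```bash
wild-setup-services                  # install all services in dependency order
wild-service-setup traefik --fetch   # refresh templates and redeploy one service
```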
Core Services (Installed in Order)
1. MetalLB Load Balancer
Location: setup/cluster-services/metallb/
- Purpose: Provides load balancing for bare metal clusters
- Functionality: Assigns external IPs to Kubernetes services
- Configuration: IP address pool from local network range
- Integration: Foundation for ingress traffic routing
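The actual manifests live in `setup/cluster-services/metallb/`; conceptually, the `metallb_pool` range from the network configuration maps onto MetalLB's CRDs roughly like this (resource names are placeholders, not Wild Cloud's shipped manifests):

```yaml
# Sketch of a MetalLB address pool for the metallb_pool range above.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: wild-pool            # placeholder name
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.80-192.168.1.89
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: wild-l2              # placeholder name
  namespace: metallb-system
spec:
  ipAddressPools:
    - wild-pool
```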
2. Longhorn Distributed Storage
Location: setup/cluster-services/longhorn/
- Purpose: Distributed block storage for persistent volumes
- Features:
- Cross-node data replication
- Snapshot and backup capabilities
- Volume expansion and management
- Web-based management interface
- Storage: Uses local disks from all cluster nodes
3. Traefik Ingress Controller
Location: setup/cluster-services/traefik/
- Purpose: HTTP/HTTPS reverse proxy and ingress controller
- Features:
- Automatic service discovery
- TLS termination
- Load balancing and routing
- Gateway API support
- Integration: Works with MetalLB for external traffic
4. CoreDNS
Location: setup/cluster-services/coredns/
- Purpose: DNS resolution for cluster services
- Integration: Connects with external DNS providers
- Functionality: Service discovery and DNS forwarding
5. cert-manager
Location: setup/cluster-services/cert-manager/
- Purpose: Automatic TLS certificate management
- Features:
- Let's Encrypt integration
- Automatic certificate issuance and renewal
- Multiple certificate authorities support
- Certificate lifecycle management
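A Let's Encrypt integration of this kind typically centers on a ClusterIssuer. A sketch, assuming an HTTP-01 solver routed through Traefik (the issuer name and email are placeholders, not Wild Cloud's shipped manifest):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt           # placeholder issuer name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com  # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-account-key
    solvers:
      - http01:
          ingress:
            ingressClassName: traefik  # field name on recent cert-manager releases
```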
6. ExternalDNS
Location: setup/cluster-services/externaldns/
- Purpose: Automatic DNS record management
- Functionality:
- Syncs Kubernetes services with DNS providers
- Automatic A/CNAME record creation
- Supports multiple DNS providers (Cloudflare, Route53, etc.)
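For example, annotating a LoadBalancer service is enough for ExternalDNS to create the matching record (the `ghost` service and its port are illustrative):

```yaml
# Illustrative service; ExternalDNS watches the hostname annotation
# and creates an A record pointing at the MetalLB-assigned IP.
apiVersion: v1
kind: Service
metadata:
  name: ghost
  annotations:
    external-dns.alpha.kubernetes.io/hostname: ghost.domain.com
spec:
  type: LoadBalancer
  selector:
    app: ghost
  ports:
    - port: 80
      targetPort: 2368   # Ghost's default HTTP port (illustrative)
```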
7. Kubernetes Dashboard
Location: setup/cluster-services/kubernetes-dashboard/
- Purpose: Web UI for cluster management
- Access: `https://dashboard.internal.domain.com`
- Authentication: Token-based access via `wild-dashboard-token`
- Features: Resource management, monitoring, troubleshooting
8. NFS Storage (Optional)
Location: setup/cluster-services/nfs/
- Purpose: Network file system for shared storage
- Use Cases: Media storage, backups, shared data
- Integration: Mounted as persistent volumes in applications
9. Docker Registry
Location: setup/cluster-services/docker-registry/
- Purpose: Private container registry
- Features: Store custom images locally
- Integration: Used by applications and CI/CD pipelines
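Once deployed, pushing to the registry follows the usual Docker workflow (the registry hostname here is an assumption, not a documented default):

```bash
# Hypothetical registry hostname; substitute your configured domain.
docker tag myapp:latest registry.internal.domain.com/myapp:latest
docker push registry.internal.domain.com/myapp:latest
```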
Infrastructure Components Deep Dive
DNS and Domain Architecture
```
Internet → External DNS → MetalLB LoadBalancer → Traefik → Kubernetes Services
                                  ↑
                        Internal DNS (dnsmasq)
                                  ↑
                         Internal Network
```
Domain Types:
- External: `app.domain.com` - Public-facing services
- Internal: `app.internal.domain.com` - Admin interfaces only
- Resolution: dnsmasq forwards all domain traffic to the cluster
Certificate and TLS Management
Automatic Certificate Flow:
- Service deployed with ingress annotation
- cert-manager detects certificate requirement
- Let's Encrypt challenge initiated
- Certificate issued and stored in Kubernetes secret
- Traefik uses certificate for TLS termination
- Automatic renewal before expiration
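Step 1 of that flow is usually nothing more than an annotated Ingress. A sketch (host, issuer, and service names are placeholders):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ghost
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt  # placeholder issuer name
spec:
  ingressClassName: traefik
  rules:
    - host: ghost.domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ghost
                port:
                  number: 80
  tls:
    - hosts:
        - ghost.domain.com
      secretName: ghost-tls   # cert-manager stores the issued certificate here
```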
Storage Architecture
Longhorn Distributed Storage:
- Block-level replication across nodes
- Default 3-replica policy for data durability
- Automatic failover and recovery
- Snapshot and backup capabilities
- Web UI for management and monitoring
Storage Classes:
- `longhorn` - Default replicated storage
- `longhorn-single` - Single replica for non-critical data
- `nfs` - Shared network storage (if configured)
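Applications consume these classes through ordinary PersistentVolumeClaims, e.g.:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data           # placeholder claim name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn   # replicated 3-way by default
  resources:
    requests:
      storage: 10Gi
```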
Network Traffic Flow
External Request Flow:
- DNS resolution via dnsmasq → cluster IP
- Traffic hits MetalLB load balancer
- MetalLB forwards to Traefik ingress
- Traefik terminates TLS and routes to service
- Service forwards to appropriate pod
- Response follows reverse path
High Availability Features
Control Plane HA:
- 3 control plane nodes with leader election
- Virtual IP for API server access
- etcd cluster with automatic failover
- Distributed workload scheduling
Storage HA:
- Longhorn 3-way replication
- Automatic replica placement across nodes
- Node failure recovery
- Data integrity verification
Networking HA:
- MetalLB speaker pods on all nodes
- Automatic load balancer failover
- Multiple ingress controller replicas
Hardware Requirements
Minimum Specifications
- Nodes: 3 control plane + optional workers
- RAM: 8GB minimum per node (16GB+ recommended)
- CPU: 4 cores minimum per node
- Storage: 100GB+ local storage per node
- Network: Gigabit ethernet connectivity
Network Requirements
- All nodes on same LAN segment
- Static IP assignments or DHCP reservations
- dnsmasq server accessible by all nodes
- Internet connectivity for image pulls and Let's Encrypt
Recommended Hardware
- Control Plane: 16GB RAM, 8 cores, 200GB NVMe SSD
- Workers: 32GB RAM, 16 cores, 500GB NVMe SSD
- Network: Dedicated VLAN or network segment
- Redundancy: UPS protection, dual network interfaces
Configuration Management
Configuration Files
- `config.yaml` - Main configuration (domains, network, apps)
- `secrets.yaml` - Sensitive data (passwords, API keys, certificates)
- `.wildcloud/` - Cache and temporary files
Template System
gomplate Integration:
- All configurations processed as templates
- Access to config and secrets via template variables
- Dynamic configuration generation
- Environment-specific customization
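A template fragment might look like this (a sketch assuming `config.yaml` is exposed to gomplate as a `config` datasource; the exact wiring is Wild Cloud's and not shown here):

```yaml
# Hypothetical gomplate template fragment; assumes config.yaml is wired in
# via something like: gomplate --datasource config=config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-info
data:
  clusterName: {{ (datasource "config").cluster.name }}
  domain: {{ (datasource "config").cloud.domain }}
```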
Configuration Commands
```bash
# Read configuration values
wild-config cluster.name
wild-config apps.ghost.domain

# Set configuration values
wild-config-set cloud.domain "example.com"
wild-config-set cluster.nodeCount 5

# Secret management
wild-secret apps.database.password
wild-secret-set apps.api.key "secret-value"
```
Setup Commands Reference
Complete Setup
```bash
wild-init    # Initialize project
wild-setup   # Complete automated setup
```
Phase-by-Phase Setup
```bash
wild-setup-cluster    # Cluster infrastructure only
wild-setup-services   # Cluster services only
```
Individual Operations
```bash
wild-cluster-config-generate             # Generate base configs
wild-node-setup <node-name>              # Complete node setup (detect → configure → deploy)
wild-node-setup <node-name> --detect     # Force hardware re-detection
wild-node-setup <node-name> --no-deploy  # Configuration only
wild-dashboard-token                     # Get dashboard access
wild-health                              # System health check
```
Troubleshooting and Validation
Health Checks
```bash
wild-health          # Overall system status
kubectl get nodes    # Node status
kubectl get pods -A  # All pod status
talosctl health      # Talos cluster health
```
Service Validation
```bash
kubectl get svc -n metallb-system    # MetalLB status
kubectl get pods -n longhorn-system  # Storage status
kubectl get pods -n traefik          # Ingress status
kubectl get certificates -A          # Certificate status
```
Log Analysis
```bash
talosctl logs -f machined                             # Talos system logs
kubectl logs -n traefik deployment/traefik            # Ingress logs
kubectl logs -n cert-manager deployment/cert-manager  # Certificate logs
```
This comprehensive setup process creates a production-ready personal cloud infrastructure with enterprise-grade reliability, security, and management capabilities.