Cluster Node Setup
This directory contains automation for setting up Talos Kubernetes cluster nodes with static IP configuration.
Hardware Detection and Setup (Recommended)
The automated setup discovers hardware configuration from nodes in maintenance mode and generates machine configurations with the correct interface names and disk paths.
Prerequisites
- Run source .env to load the required environment variables
- Boot nodes with Talos ISO in maintenance mode
- Nodes must be accessible on the network
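To sanity-check the prerequisites, load the environment and confirm a booted node answers in maintenance mode (the address below is just the example DHCP IP used later in this README):
source .env
talosctl -n 192.168.8.168 version --insecure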
Hardware Discovery Workflow
# ONE-TIME CLUSTER INITIALIZATION (run once per cluster)
./init-cluster.sh
# FOR EACH CONTROL PLANE NODE:
# 1. Boot node with Talos ISO (it will get a DHCP IP in maintenance mode)
# 2. Detect hardware and update config.yaml
./detect-node-hardware.sh <maintenance-ip> <node-number>
# Example: Node boots at 192.168.8.168, register as node 1
./detect-node-hardware.sh 192.168.8.168 1
# 3. Generate machine config for registered nodes
./generate-machine-configs.sh
# 4. Apply configuration - node will reboot with static IP
talosctl apply-config --insecure -n 192.168.8.168 --file final/controlplane-node-1.yaml
# 5. Wait for reboot, node should come up at its target static IP (192.168.8.31)
# Repeat steps 1-5 for additional control plane nodes
The detect-node-hardware.sh script will:
- Connect to nodes in maintenance mode via talosctl
- Discover active ethernet interfaces (e.g., enp4s0 instead of hardcoded eth0)
- Discover available installation disks (>10GB)
- Update config.yaml with per-node hardware configuration
- Provide next steps for machine config generation
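Under the hood the discovery relies on standard talosctl maintenance-mode queries; a rough sketch of the two lookups (the script's exact filtering may differ) is:
# Candidate interfaces for the interface field
talosctl -n <maintenance-ip> get links --insecure
# Candidate disks (>10GB) for the disk field
talosctl -n <maintenance-ip> get disks --insecure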
The init-cluster.sh script will:
- Generate Talos cluster secrets and base configurations (once per cluster)
- Set up talosctl context with cluster certificates
- Configure VIP endpoint for cluster communication
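For orientation, the initialization is roughly equivalent to the standard Talos generation commands below; the script may use different file names and flags, so treat this as a sketch only (demo-cluster and the VIP come from this README):
# Generate long-lived cluster secrets (keep these out of version control)
talosctl gen secrets -o secrets.yaml
# Generate base configs and talosconfig against the VIP endpoint
talosctl gen config demo-cluster https://192.168.8.30:6443 --with-secrets secrets.yaml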
The generate-machine-configs.sh script will:
- Check which nodes have been hardware-detected
- Compile network configuration templates with discovered hardware settings
- Create final machine configurations for registered nodes only
- Include system extensions for Longhorn (iscsi-tools, util-linux-tools)
- Update talosctl context with registered node IPs
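The final per-node file is effectively the base config with the compiled per-node patch applied. A hedged sketch of that step, with file locations assumed from this README's layout:
talosctl machineconfig patch generated/controlplane.yaml --patch @patch/controlplane-node-1.yaml --output final/controlplane-node-1.yaml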
Cluster Bootstrap
After all control plane nodes are configured with static IPs:
# Bootstrap the cluster using any control node
talosctl bootstrap --nodes 192.168.8.31 --endpoints 192.168.8.31
# Get kubeconfig
talosctl kubeconfig
# Verify cluster is ready
kubectl get nodes
Complete Example
Here's a complete example of setting up a 3-node control plane:
# CLUSTER INITIALIZATION (once per cluster)
./init-cluster.sh
# NODE 1
# Boot node with Talos ISO, it gets DHCP IP 192.168.8.168
./detect-node-hardware.sh 192.168.8.168 1
./generate-machine-configs.sh
talosctl apply-config --insecure -n 192.168.8.168 --file final/controlplane-node-1.yaml
# Node reboots and comes up at 192.168.8.31
# NODE 2
# Boot second node with Talos ISO, it gets DHCP IP 192.168.8.169
./detect-node-hardware.sh 192.168.8.169 2
./generate-machine-configs.sh
talosctl apply-config --insecure -n 192.168.8.169 --file final/controlplane-node-2.yaml
# Node reboots and comes up at 192.168.8.32
# NODE 3
# Boot third node with Talos ISO, it gets DHCP IP 192.168.8.170
./detect-node-hardware.sh 192.168.8.170 3
./generate-machine-configs.sh
talosctl apply-config --insecure -n 192.168.8.170 --file final/controlplane-node-3.yaml
# Node reboots and comes up at 192.168.8.33
# CLUSTER BOOTSTRAP
talosctl bootstrap --nodes 192.168.8.31 --endpoints 192.168.8.31
talosctl kubeconfig
kubectl get nodes
Configuration Details
Per-Node Configuration
Each control plane node has its own configuration block in config.yaml:
cluster:
  nodes:
    control:
      vip: 192.168.8.30
      node1:
        ip: 192.168.8.31
        interface: enp4s0 # Discovered automatically
        disk: /dev/sdb # Selected during hardware detection
      node2:
        ip: 192.168.8.32
        # interface and disk added after hardware detection
      node3:
        ip: 192.168.8.33
        # interface and disk added after hardware detection
Worker nodes use DHCP by default. You can use the same hardware detection process for worker nodes if static IPs are needed.
Talosconfig Management
Context Naming and Conflicts
When running talosctl config merge ./generated/talosconfig, if a context with the same name already exists, talosctl will create an enumerated version (e.g., demo-cluster-2).
For a clean setup:
- Delete existing contexts before merging: talosctl config contexts, then talosctl config context <name> --remove
- Or use --force to overwrite: talosctl config merge ./generated/talosconfig --force
Recommended approach for new clusters:
# Remove old context if rebuilding cluster
talosctl config context demo-cluster --remove || true
# Merge new configuration
talosctl config merge ./generated/talosconfig
talosctl config endpoint 192.168.8.30
talosctl config node 192.168.8.31 # Add nodes as they are registered
Context Configuration Timeline
- After first node hardware detection: Merge talosconfig and set endpoint/first node
- After additional nodes: Add them to the existing context with talosctl config node <ip1> <ip2> <ip3>
- Before cluster bootstrap: Ensure all control plane nodes are in the node list
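Because talosctl config node sets the node list for the current context (rather than appending), pass every registered IP each time:
# After node 1 is registered
talosctl config node 192.168.8.31
# After node 2 is registered
talosctl config node 192.168.8.31 192.168.8.32
# Before bootstrap, with all control plane nodes registered
talosctl config node 192.168.8.31 192.168.8.32 192.168.8.33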
System Extensions
All nodes include:
- siderolabs/iscsi-tools: Required for Longhorn storage
- siderolabs/util-linux-tools: Utility tools for storage operations
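After a node has been installed and rebooted into the cluster (not in maintenance mode), you can confirm the extensions are present:
talosctl -n 192.168.8.31 get extensions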
Hardware Detection
The detect-node-hardware.sh script automatically discovers:
- Network interfaces: Finds active ethernet interfaces (no more hardcoded eth0)
- Installation disks: Lists available disks >10GB for interactive selection
- Per-node settings: Updates config.yaml with hardware-specific configuration
This eliminates the need to manually configure hardware settings and handles different hardware configurations across nodes.
Template Structure
Configuration templates are stored in patch.templates/ and use gomplate syntax:
- controlplane-node-1.yaml: Template for first control plane node
- controlplane-node-2.yaml: Template for second control plane node
- controlplane-node-3.yaml: Template for third control plane node
- worker.yaml: Template for worker nodes
Templates use per-node variables from config.yaml:
{{ .cluster.nodes.control.node1.ip }}
{{ .cluster.nodes.control.node1.interface }}
{{ .cluster.nodes.control.node1.disk }}
{{ .cluster.nodes.control.vip }}
The wild-compile-template-dir command processes all templates and outputs compiled configurations to the patch/ directory.
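To render a single template by hand for debugging, plain gomplate can do roughly the same thing; this assumes gomplate is installed and that config.yaml is passed as the top-level template context, which matches the variable paths shown above:
gomplate -c .=config.yaml -f patch.templates/controlplane-node-1.yaml -o patch/controlplane-node-1.yaml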
Troubleshooting
Hardware Detection Issues
# Check if node is accessible in maintenance mode
talosctl -n <NODE_IP> version --insecure
# View available network interfaces
talosctl -n <NODE_IP> get links --insecure
# View available disks
talosctl -n <NODE_IP> get disks --insecure
Manual Hardware Discovery
If the automatic detection fails, you can manually inspect hardware:
# Find active ethernet interfaces
talosctl -n <NODE_IP> get links --insecure -o json | jq -s '.[] | select(.spec.operationalState == "up" and .spec.type == "ether" and .metadata.id != "lo") | .metadata.id'
# Find suitable installation disks
talosctl -n <NODE_IP> get disks --insecure -o json | jq -s '.[] | select(.spec.size > 10000000000) | .metadata.id'
Node Status
# View machine configuration (only works after config is applied)
talosctl -n <NODE_IP> get machineconfig