9.2 KiB
9.2 KiB
Discovery and Networking Guide
This guide covers Talos cluster discovery mechanisms, network configuration, and connectivity troubleshooting.
Cluster Discovery System
Talos includes built-in node discovery that allows cluster members to find each other and maintain membership information.
Discovery Registries
Service Registry (Default)
- External Service: Uses public discovery service at
https://discovery.talos.dev/ - Encryption: All data encrypted with AES-GCM before transmission
- Functionality: Works without dependency on etcd/Kubernetes
- Advantages: Available even when control plane is down
Kubernetes Registry (Deprecated)
- Data Source: Uses Kubernetes Node resources and annotations
- Limitation: Incompatible with Kubernetes 1.32+ due to AuthorizeNodeWithSelectors
- Status: Disabled by default, deprecated
Discovery Configuration
cluster:
discovery:
enabled: true
registries:
service:
disabled: false # Default
kubernetes:
disabled: true # Deprecated, disabled by default
To disable service registry:
cluster:
discovery:
enabled: true
registries:
service:
disabled: true
Discovery Data Flow
Service Registry Process
- Data Encryption: Node encrypts affiliate data with cluster key
- Endpoint Encryption: Endpoints separately encrypted for deduplication
- Data Submission: Node submits own data + observed peer endpoints
- Server Processing: Discovery service aggregates and deduplicates data
- Data Distribution: Encrypted updates sent to all cluster members
- Local Processing: Nodes decrypt data for cluster discovery and KubeSpan
Data Protection
- Cluster Isolation: Cluster ID used as key selector
- End-to-End Encryption: Discovery service cannot decrypt affiliate data
- Memory-Only Storage: Data stored in memory with encrypted snapshots
- No Sensitive Exposure: Service only sees encrypted blobs and cluster metadata
Discovery Resources
Node Identity
# View node's unique identity
talosctl get identities -o yaml
Output:
spec:
nodeId: Utoh3O0ZneV0kT2IUBrh7TgdouRcUW2yzaaMl4VXnCd
Identity Characteristics:
- Base62 encoded random 32 bytes
- URL-safe encoding
- Preserved in STATE partition (
node-identity.yaml) - Survives reboots and upgrades
- Regenerated on reset/wipe
Affiliates (Proposed Members)
# View discovered affiliates (proposed cluster members)
talosctl get affiliates
Output:
ID VERSION HOSTNAME MACHINE TYPE ADDRESSES
2VfX3nu67ZtZPl57IdJrU87BMjVWkSBJiL9ulP9TCnF 2 talos-default-controlplane-2 controlplane ["172.20.0.3","fd83:b1f7:fcb5:2802:986b:7eff:fec5:889d"]
Members (Approved Members)
# View cluster members
talosctl get members
Output:
ID VERSION HOSTNAME MACHINE TYPE OS ADDRESSES
talos-default-controlplane-1 2 talos-default-controlplane-1 controlplane Talos (v1.11.0) ["172.20.0.2","fd83:b1f7:fcb5:2802:8c13:71ff:feaf:7c94"]
Raw Registry Data
# View data from specific registries
talosctl get affiliates --namespace=cluster-raw
Output shows registry sources:
ID VERSION HOSTNAME
k8s/2VfX3nu67ZtZPl57IdJrU87BMjVWkSBJiL9ulP9TCnF 3 talos-default-controlplane-2
service/2VfX3nu67ZtZPl57IdJrU87BMjVWkSBJiL9ulP9TCnF 23 talos-default-controlplane-2
Network Architecture
Network Layers
Host Networking
- Node-to-Node: Direct IP connectivity between cluster nodes
- Control Plane: API server communication via control plane endpoint
- Discovery: HTTPS connection to discovery service (port 443)
Container Networking
- CNI: Container Network Interface for pod networking
- Service Mesh: Optional service mesh implementations
- Network Policies: Kubernetes network policy enforcement
Optional: KubeSpan (WireGuard Mesh)
- Mesh Networking: Full mesh WireGuard connections
- Discovery Integration: Uses discovery service for peer coordination
- Encryption: WireGuard public keys distributed via discovery
- Use Cases: Multi-cloud, hybrid, NAT traversal
Network Configuration Patterns
Basic Network Setup
machine:
network:
interfaces:
- interface: eth0
dhcp: true
Static IP Configuration
machine:
network:
interfaces:
- interface: eth0
addresses:
- 192.168.1.100/24
routes:
- network: 0.0.0.0/0
gateway: 192.168.1.1
mtu: 1500
nameservers:
- 8.8.8.8
- 1.1.1.1
Multiple Interface Configuration
machine:
network:
interfaces:
- interface: eth0 # Management interface
dhcp: true
- interface: eth1 # Kubernetes traffic
addresses:
- 10.0.1.100/24
routes:
- network: 10.0.0.0/16
gateway: 10.0.1.1
KubeSpan Configuration
Basic KubeSpan Setup
machine:
network:
kubespan:
enabled: true
Advanced KubeSpan Configuration
machine:
network:
kubespan:
enabled: true
advertiseKubernetesNetworks: true
allowDownPeerBypass: true
mtu: 1420 # Account for WireGuard overhead
filters:
endpoints:
- 0.0.0.0/0 # Allow all endpoints
KubeSpan Features:
- Automatic peer discovery via discovery service
- NAT traversal capabilities
- Encrypted mesh networking
- Kubernetes network advertisement
- Fault tolerance with peer bypass
Network Troubleshooting
Discovery Issues
Check Discovery Service Connectivity
# Test connectivity to discovery service
talosctl get affiliates
# Check discovery configuration
talosctl get discoveryconfig -o yaml
# Monitor discovery events
talosctl events --tail
Common Discovery Problems
-
No Affiliates Discovered:
- Check discovery service connectivity
- Verify cluster ID matches across nodes
- Confirm discovery is enabled
-
Partial Affiliate List:
- Network connectivity issues between nodes
- Discovery service regional availability
- Firewall blocking discovery traffic
-
Discovery Service Unreachable:
- Network connectivity to discovery.talos.dev:443
- Corporate firewall/proxy configuration
- DNS resolution issues
Network Connectivity Testing
Basic Network Tests
# Test network interfaces
talosctl get addresses
talosctl get routes
talosctl get nodeaddresses
# Check network configuration
talosctl get networkconfig -o yaml
# Test connectivity
talosctl -n <IP> ping <target-ip>
Inter-Node Connectivity
# Test control plane endpoint
talosctl health --control-plane-nodes <IP1>,<IP2>,<IP3>
# Check etcd connectivity
talosctl -n <IP> etcd members
# Test Kubernetes API
kubectl get nodes
KubeSpan Troubleshooting
# Check KubeSpan status
talosctl get kubespanpeerspecs
talosctl get kubespanpeerstatuses
# Monitor WireGuard connections
talosctl -n <IP> interfaces
# Check KubeSpan logs
talosctl -n <IP> logs controller-runtime | grep kubespan
Network Performance Optimization
Network Interface Tuning
machine:
network:
interfaces:
- interface: eth0
mtu: 9000 # Jumbo frames if supported
dhcp: true
KubeSpan Performance
- Adjust MTU for WireGuard overhead (typically -80 bytes)
- Consider endpoint filters for large clusters
- Monitor WireGuard peer connection stability
Security Considerations
Discovery Security
- Encrypted Communication: All discovery data encrypted end-to-end
- Cluster Isolation: Cluster ID prevents cross-cluster data access
- No Sensitive Data: Only encrypted metadata transmitted
- Network Security: HTTPS transport with certificate validation
Network Security
- mTLS: All Talos API communication uses mutual TLS
- Certificate Rotation: Automatic certificate lifecycle management
- Network Policies: Implement Kubernetes network policies for workloads
- Firewall Rules: Restrict network access to necessary ports only
Required Network Ports
- 6443: Kubernetes API server
- 2379-2380: etcd client/peer communication
- 10250: kubelet API
- 50000: Talos API (apid)
- 443: Discovery service (outbound)
- 51820: KubeSpan WireGuard (if enabled)
Operational Best Practices
Monitoring
- Monitor discovery service connectivity
- Track cluster member changes
- Alert on network partitions
- Monitor KubeSpan peer status
Backup and Recovery
- Document network configuration
- Backup discovery service configuration
- Test network recovery procedures
- Plan for discovery service outages
Scaling Considerations
- Discovery service scales to thousands of nodes
- KubeSpan mesh scales to hundreds of nodes efficiently
- Consider network segmentation for large clusters
- Plan for multi-region deployments
This networking foundation enables Talos clusters to maintain connectivity and membership across various network topologies while providing security and performance optimization options.