Compare commits

...

22 Commits

Author SHA1 Message Date
Paul Payne
c8fd702d1b Node delete should reset. 2025-11-09 00:15:36 +00:00
Paul Payne
1271eebf38 Add node details to cluster status. 2025-11-08 23:16:03 +00:00
Paul Payne
b00dffd2b6 Allow resetting a node to maintenance mode. 2025-11-08 22:57:35 +00:00
Paul Payne
c623843d53 ISOs need version AND schema 2025-11-08 22:23:26 +00:00
Paul Payne
b330b2aea7 Adds tests. 2025-11-08 20:10:13 +00:00
Paul Payne
7cd434aabf feat(api): Enhance NodeDiscover with subnet auto-detection and discovery cancellation
- Updated NodeDiscover to accept an optional subnet parameter, with auto-detection of local networks if none is provided.
- Removed support for IP list format in NodeDiscover request body.
- Implemented discovery cancellation functionality with NodeDiscoveryCancel endpoint.
- Improved error handling and response messages for better clarity.

feat(cluster): Add operation tracking for cluster bootstrap process

- Integrated operations manager into cluster manager for tracking bootstrap progress.
- Refactored Bootstrap method to run asynchronously with detailed progress updates.
- Added methods to wait for various bootstrap steps (etcd health, VIP assignment, control plane readiness, etc.).

fix(discovery): Optimize node discovery process and improve maintenance mode detection

- Enhanced node discovery to run in parallel with a semaphore to limit concurrent scans.
- Updated probeNode to detect maintenance mode more reliably.
- Added functions to expand CIDR notation into individual IP addresses and retrieve local network interfaces.

refactor(node): Update node manager to handle instance-specific configurations

- Modified NewManager to accept instanceName for tailored talosconfig usage.
- Improved hardware detection logic to handle maintenance mode scenarios.

feat(operations): Implement detailed bootstrap progress tracking

- Introduced BootstrapProgress struct to track and report the status of bootstrap operations.
- Updated operation management to include bootstrap-specific details.

fix(tools): Improve talosctl command execution with context and error handling

- Added context with timeout to talosctl commands to prevent hanging on unreachable nodes.
- Enhanced error handling for version retrieval in maintenance mode.
2025-11-04 17:16:16 +00:00
Paul Payne
005dc30aa5 Adds app endpoints for configuration and status. 2025-10-22 23:17:52 +00:00
Paul Payne
5b7d2835e7 Instance-namespace additional utility endpoints. 2025-10-14 21:06:18 +00:00
Paul Payne
67ca1b85be Functions for common paths. 2025-10-14 19:23:16 +00:00
Paul Payne
679ea18446 Namespace dashboard token endpoint in an instance. 2025-10-14 18:52:27 +00:00
Paul Payne
d2c8ff716e Lint fixes. 2025-10-14 07:31:54 +00:00
Paul Payne
2fd71c32dc Formatting. 2025-10-14 07:13:00 +00:00
Paul Payne
afb0e09aae Service config. Service logs. Service status. 2025-10-14 05:26:45 +00:00
Paul Payne
1d11996cd6 Better support for Talos ISO downloads. 2025-10-12 20:15:43 +00:00
Paul Payne
3a8488eaff Adds CORS. 2025-10-12 16:52:36 +00:00
Paul Payne
9d1abc3e90 Update env var name. 2025-10-12 00:41:04 +00:00
Paul Payne
9f2d5fc7fb Add dnsmasq endpoints. 2025-10-12 00:35:03 +00:00
Paul Payne
47c3b10be9 Dnsmasq management endpoints. 2025-10-11 23:01:26 +00:00
Paul Payne
ff97f14229 Get templates from embedded package. 2025-10-11 22:20:26 +00:00
Paul Payne
89c6a7aa80 Moves setup files into embedded package. 2025-10-11 22:06:39 +00:00
Paul Payne
92032202f4 Dev env setup. 2025-10-11 21:44:35 +00:00
Paul Payne
b76d2fb164 Adds setup files. 2025-10-11 18:15:06 +00:00
167 changed files with 27236 additions and 1067 deletions

.vscode/launch.json vendored Normal file (38 lines)

@@ -0,0 +1,38 @@
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Daemon (debug)",
      "type": "go",
      "request": "launch",
      "mode": "debug",
      "program": "${workspaceFolder}",
      "console": "integratedTerminal",
      "env": {
        "GO_ENV": "development",
        "WILD_CENTRAL_ENV": "development",
        "DEBUG": "true"
      },
      "args": [],
      "cwd": "${workspaceFolder}",
      "stopOnEntry": false,
      "showLog": true,
      "trace": "verbose"
    },
    {
      "name": "Daemon (no debug)",
      "type": "go",
      "request": "launch",
      "program": "${workspaceFolder}",
      "console": "integratedTerminal",
      "env": {
        "GO_ENV": "development",
        "WILD_CENTRAL_ENV": "development"
      },
      "args": [],
      "cwd": "${workspaceFolder}",
      "stopOnEntry": false,
      "showLog": true
    }
  ]
}

.vscode/settings.json vendored Normal file (23 lines)

@@ -0,0 +1,23 @@
{
  "go.gopath": "",
  "go.goroot": "",
  "go.useLanguageServer": true,
  "gopls": {
    "experimentalWorkspaceModule": true,
    "build.experimentalWorkspaceModule": true
  },
  "go.lintTool": "golangci-lint",
  "go.lintOnSave": "workspace",
  "go.formatTool": "goimports",
  "go.testOnSave": false,
  "[go]": {
    "editor.formatOnSave": true,
    "editor.codeActionsOnSave": {
      "source.organizeImports": "explicit"
    },
    "editor.suggest.snippetsPreventQuickSuggestions": false
  }
}

BUILDING_WILD_API.md Normal file (149 lines)

@@ -0,0 +1,149 @@
# Building the Wild Cloud Central API
These are instructions for working with the Wild Cloud Central API (Wild API). Wild API is a web service that runs on Wild Central. Users can interact with the API directly, through the Wild CLI, or through the Wild Web App. The CLI and Web App depend on the API extensively.
Whenever changes are made to the API, it is important that the CLI and Web App are updated accordingly.
Use tests on the API extensively to keep it functioning well for all clients, but don't duplicate test layers: if something is tested in one place, it doesn't need to be tested again in another. Prefer unit tests. Run `make test` after all API changes. If a bug was found by any means other than tests, that is a signal that a test should have been present to catch it earlier, so write a test that catches the bug before fixing it.
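For example, a regression test for a manually-found bug might look like this minimal sketch. The `expandCIDR` helper, its signature, and the /32 behavior are assumptions taken from the discovery work described in the commit log, not a confirmed API:
```go
// In internal/node/discovery_test.go (hypothetical path).
package node

import "testing"

// Sketch of a regression test: a bug found by hand (here, a /32 subnet
// assumed to expand to exactly one host) is captured in a failing test
// before the fix lands.
func TestExpandCIDRSingleHost(t *testing.T) {
	ips, err := expandCIDR("192.168.1.5/32")
	if err != nil {
		t.Fatalf("expandCIDR returned error: %v", err)
	}
	if len(ips) != 1 || ips[0] != "192.168.1.5" {
		t.Errorf("expected [192.168.1.5], got %v", ips)
	}
}
```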
## Dev Environment Requirements
- Go 1.21+
- GNU Make (for build automation)
## Principles
- The API enables all of the functionality needed by the CLI and the webapp. These clients should conform to the API; the API should not be designed around the needs of the CLI or webapp.
- A wild cloud instance is primarily data (YAML files for config, secrets, and manifests).
- Because a wild cloud instance is primarily data, a wild cloud instance can be managed by non-technical users through the webapp or by technical users by SSHing into the device (e.g. VSCode Remote SSH).
- As in v.PoC, gomplate templates should be used only to distinguish between cloud instances. **Within** a cloud instance there should be no templating: templates are compiled when they are copied into an instance. This keeps instances transparent and simple for the user to manage.
- Manage state and infrastructure idempotently.
- Cluster state should be the k8s cluster itself, not local files. It should be accessed via kubectl and talosctl.
- All wild cloud state should be stored on the filesystem in easy-to-read YAML files that can be edited directly or through the webapp.
- All code should be simple and easy to understand.
- Avoid unnecessary complexity.
- Avoid unnecessary dependencies.
- Avoid unnecessary features.
- Avoid unnecessary abstractions.
- Avoid unnecessary comments.
- Avoid unnecessary configuration options.
- Avoid Helm. Use Kustomize.
- The API should be able to run on low-resource devices like a Raspberry Pi 4 (4GB RAM).
- The API should be able to manage multiple Wild Cloud instances on the LAN.
- The API should include functionality to manage a dnsmasq server on the same device. Currently, this is used only to resolve wild cloud domain names to private addresses within the LAN (see the example after this list). The LAN router should be configured to use the Wild Central IP as its DNS server.
- The API is configurable to use various providers for:
- Wild Cloud Apps Directory provider (local FS, git repo, etc)
- DNS (built-in dnsmasq, external DNS server, etc)
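As an illustration of the dnsmasq use case above, the generated config fragment might contain a single wildcard address rule; the domain and IP here are placeholder values reused from the configuration examples elsewhere in this changeset:
```
# /etc/dnsmasq.d/wild-cloud.conf (illustrative fragment)
# Resolve all names under the instance's internal domain to a private LAN address.
address=/int.wild.example.com/192.168.1.100
```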
### Coding Standards
- Use a standard Go project structure.
- Use Go modules.
- Use standard Go libraries wherever possible.
- Use popular, well-maintained libraries for common tasks (e.g. gorilla/mux for HTTP routing).
- Write unit tests for all functions and methods.
- Make and use common modules. For example, one module should handle all interactions with talosctl; another should handle all interactions with kubectl.
- If the code is getting long and complex, break it into smaller modules.
- API requests and responses should be valid JSON. Object attributes should use standard JSON camel-casing (see the sketch below).
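A minimal sketch of that convention in Go; the type and field names are illustrative, not taken from the codebase:
```go
// NodeStatus is a hypothetical response type; the json tags carry the
// camel-cased attribute names over the wire.
type NodeStatus struct {
	Hostname        string `json:"hostname"`
	MaintenanceMode bool   `json:"maintenanceMode"`
	TalosVersion    string `json:"talosVersion"`
}
```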
### Features
- If the WILD_CENTRAL_ENV environment variable is set to "development", the API should run in development mode (see the sketch below).
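A minimal sketch of that check at startup; what development mode actually changes is not specified here, so the log line is only an illustration:
```go
// In main.go (sketch), assuming os and log are imported.
if os.Getenv("WILD_CENTRAL_ENV") == "development" {
	log.Println("running in development mode")
}
```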
## Patterns
### Instance-scoped Endpoints
Instance-scoped endpoints follow a consistent pattern to ensure stateless, RESTful API design. The instance name is always included in the URL path, not retrieved from session state or context.
#### Route Pattern
```go
// In handlers.go
r.HandleFunc("/api/v1/instances/{name}/utilities/dashboard/token", api.UtilitiesDashboardToken).Methods("GET")
```
#### Handler Pattern
```go
// In handlers_utilities.go
func (api *API) UtilitiesDashboardToken(w http.ResponseWriter, r *http.Request) {
	// 1. Extract instance name from URL path parameters
	vars := mux.Vars(r)
	instanceName := vars["name"]

	// 2. Validate instance exists
	if err := api.instance.ValidateInstance(instanceName); err != nil {
		respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
		return
	}

	// 3. Construct instance-specific paths using tools helpers
	kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)

	// 4. Perform instance-specific operations
	token, err := utilities.GetDashboardToken(kubeconfigPath)
	if err != nil {
		respondError(w, http.StatusInternalServerError, "Failed to get dashboard token")
		return
	}

	// 5. Return response
	respondJSON(w, http.StatusOK, map[string]interface{}{
		"success": true,
		"data":    token,
	})
}
```
#### Key Principles
1. **Instance name in URL**: Always include instance name as a path parameter (`{name}`)
2. **Extract from mux.Vars()**: Get instance name from `mux.Vars(r)["name"]`, not from context
3. **Validate instance**: Always validate the instance exists before operations
4. **Use path helpers**: Use `tools.GetKubeconfigPath()`, `tools.GetInstanceConfigPath()`, etc. instead of inline `filepath.Join()` constructions (a sketch of these helpers follows this list)
5. **Stateless handlers**: Handlers should not depend on session state or current context
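The path helpers themselves are not shown in this document. Under the `<dataDir>/instances/<name>/...` layout used elsewhere in this changeset, they would reduce to small wrappers like this sketch (treat the exact signatures as assumptions):
```go
// In internal/tools/paths.go (hypothetical path).
package tools

import "path/filepath"

// GetInstancesPath returns the directory that holds all instances.
func GetInstancesPath(dataDir string) string {
	return filepath.Join(dataDir, "instances")
}

// GetKubeconfigPath returns the kubeconfig for a single instance.
func GetKubeconfigPath(dataDir, instanceName string) string {
	return filepath.Join(dataDir, "instances", instanceName, "kubeconfig")
}
```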
### kubectl and talosctl Commands
When making kubectl or talosctl calls for a specific instance, always use the `tools` package helpers to set the correct context.
#### Using kubectl with Instance Kubeconfig
```go
// In utilities.go or similar
func GetDashboardToken(kubeconfigPath string) (*DashboardToken, error) {
	cmd := exec.Command("kubectl", "-n", "kubernetes-dashboard", "create", "token", "dashboard-admin")
	tools.WithKubeconfig(cmd, kubeconfigPath)

	output, err := cmd.Output()
	if err != nil {
		return nil, fmt.Errorf("failed to create token: %w", err)
	}

	token := strings.TrimSpace(string(output))
	return &DashboardToken{Token: token}, nil
}
```
#### Using talosctl with Instance Talosconfig
```go
// In cluster operations
func GetClusterHealth(talosconfigPath string, nodeIP string) error {
	cmd := exec.Command("talosctl", "health", "--nodes", nodeIP)
	tools.WithTalosconfig(cmd, talosconfigPath)

	output, err := cmd.Output()
	if err != nil {
		return fmt.Errorf("failed to check health: %w", err)
	}

	_ = output // Process output...
	return nil
}
```
#### Key Principles
1. **Use tools helpers**: Always use `tools.WithKubeconfig()` or `tools.WithTalosconfig()` instead of manually setting environment variables (a sketch follows this list)
2. **Get paths from tools package**: Use `tools.GetKubeconfigPath()` or `tools.GetTalosconfigPath()` to construct config paths
3. **One config per command**: Each exec.Command should have its config set via the appropriate helper
4. **Error handling**: Always check for command execution errors and provide context
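The helper bodies are not shown in this diff. A plausible implementation simply sets the relevant environment variable on the single command; `KUBECONFIG` and `TALOSCONFIG` are the standard variables read by kubectl and talosctl, but the code below is a sketch, not the actual implementation:
```go
// In internal/tools/exec.go (hypothetical path).
package tools

import (
	"os"
	"os/exec"
)

// WithKubeconfig points one kubectl command at an instance's kubeconfig.
func WithKubeconfig(cmd *exec.Cmd, kubeconfigPath string) {
	cmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfigPath)
}

// WithTalosconfig points one talosctl command at an instance's talosconfig.
func WithTalosconfig(cmd *exec.Cmd, talosconfigPath string) {
	cmd.Env = append(os.Environ(), "TALOSCONFIG="+talosconfigPath)
}
```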

Makefile Normal file (73 lines)

@@ -0,0 +1,73 @@
.PHONY: help build dev test clean install lint fmt vet

# Binary name
BINARY_NAME=wildd
BUILD_DIR=build

# Go parameters
GOCMD=go
GOBUILD=$(GOCMD) build
GOCLEAN=$(GOCMD) clean
GOTEST=$(GOCMD) test
GOGET=$(GOCMD) get
GOMOD=$(GOCMD) mod
GOFMT=$(GOCMD) fmt
GOVET=$(GOCMD) vet

# Build flags
LDFLAGS=-ldflags "-s -w"

help: ## Show this help message
	@echo 'Usage: make [target]'
	@echo ''
	@echo 'Available targets:'
	@awk 'BEGIN {FS = ":.*?## "} /^[a-zA-Z_-]+:.*?## / {printf " %-15s %s\n", $$1, $$2}' $(MAKEFILE_LIST)

build: ## Build the daemon binary
	@echo "Building $(BINARY_NAME)..."
	@mkdir -p $(BUILD_DIR)
	$(GOBUILD) $(LDFLAGS) -o $(BUILD_DIR)/$(BINARY_NAME) .

dev: ## Run the daemon in development mode with live reloading
	@echo "Starting $(BINARY_NAME) in development mode..."
	$(GOCMD) run .

test: ## Run tests
	@echo "Running tests..."
	$(GOTEST) -v ./...

test-cover: ## Run tests with coverage
	@echo "Running tests with coverage..."
	$(GOTEST) -v -coverprofile=coverage.out ./...
	$(GOCMD) tool cover -html=coverage.out -o coverage.html
	@echo "Coverage report generated: coverage.html"

clean: ## Clean build artifacts
	@echo "Cleaning..."
	$(GOCLEAN)
	@rm -rf $(BUILD_DIR)
	@rm -f coverage.out coverage.html

install: build ## Install the binary to $(go env GOPATH)/bin
	@echo "Installing $(BINARY_NAME)..."
	@mkdir -p $$($(GOCMD) env GOPATH)/bin
	@cp $(BUILD_DIR)/$(BINARY_NAME) $$($(GOCMD) env GOPATH)/bin/

deps: ## Download dependencies
	@echo "Downloading dependencies..."
	$(GOMOD) download
	$(GOMOD) tidy

fmt: ## Format Go code
	@echo "Formatting code..."
	$(GOFMT) ./...

vet: ## Run go vet
	@echo "Running go vet..."
	$(GOVET) ./...

lint: fmt vet ## Run formatters and linters

check: lint test ## Run all checks (lint + test)

all: clean lint test build ## Clean, lint, test, and build

README.md (178 changed lines)

@@ -1,168 +1,38 @@
# Wild Central Daemon
# Wild Central API
The Wild Central Daemon is a lightweight service that runs on a local machine (e.g., a Raspberry Pi) to manage Wild Cloud instances on the local network. It provides an interface for users to interact with and manage their Wild Cloud environments.
The Wild Central API is a lightweight service that runs on a local machine (e.g., a Raspberry Pi) to manage Wild Cloud instances on the local network. It provides an interface for users to interact with and manage their Wild Cloud environments.
## Development
Start the development server:
```bash
make dev
```
## Usage
The API will be available at `http://localhost:5055`.
### Batch Configuration Update Endpoint
### Environment Variables
#### Overview
- `WILD_API_DATA_DIR` - Directory for instance data (default: `/var/lib/wild-central`)
- `WILD_DIRECTORY` - Path to Wild Cloud apps directory (default: `/opt/wild-cloud/apps`)
- `WILD_API_DNSMASQ_CONFIG_PATH` - Path to dnsmasq config file (default: `/etc/dnsmasq.d/wild-cloud.conf`)
- `WILD_CORS_ORIGINS` - Comma-separated list of allowed CORS origins for production (default: localhost development origins)
The batch configuration update endpoint allows updating multiple configuration values in a single atomic request.
## API Endpoints
#### Endpoint
The API provides the following endpoint categories:
```
PATCH /api/v1/instances/{name}/config
```
- **Instances** - Create, list, get, and delete Wild Cloud instances
- **Configuration** - Manage instance config.yaml
- **Secrets** - Manage instance secrets.yaml (redacted by default)
- **Nodes** - Discover, configure, and manage cluster nodes
- **Cluster** - Bootstrap and manage Talos/Kubernetes clusters
- **Services** - Install and manage base infrastructure services
- **Apps** - Deploy and manage Wild Cloud applications
- **PXE** - Manage PXE boot assets for network installation
- **Operations** - Track and stream long-running operations
- **Utilities** - Helper functions and status endpoints
- **dnsmasq** - Configure and manage dnsmasq for network services
#### Request Format
```json
{
"updates": [
{"path": "string", "value": "any"},
{"path": "string", "value": "any"}
]
}
```
#### Response Format
Success (200 OK):
```json
{
"message": "Configuration updated successfully",
"updated": 3
}
```
Error (400 Bad Request / 404 Not Found / 500 Internal Server Error):
```json
{
"error": "error message"
}
```
#### Usage Examples
##### Example 1: Update Basic Configuration Values
```bash
curl -X PATCH http://localhost:8080/api/v1/instances/my-cloud/config \
-H "Content-Type: application/json" \
-d '{
"updates": [
{"path": "baseDomain", "value": "example.com"},
{"path": "domain", "value": "wild.example.com"},
{"path": "internalDomain", "value": "int.wild.example.com"}
]
}'
```
Response:
```json
{
"message": "Configuration updated successfully",
"updated": 3
}
```
##### Example 2: Update Nested Configuration Values
```bash
curl -X PATCH http://localhost:8080/api/v1/instances/my-cloud/config \
-H "Content-Type: application/json" \
-d '{
"updates": [
{"path": "cluster.name", "value": "prod-cluster"},
{"path": "cluster.loadBalancerIp", "value": "192.168.1.100"},
{"path": "cluster.ipAddressPool", "value": "192.168.1.100-192.168.1.200"}
]
}'
```
##### Example 3: Update Array Values
```bash
curl -X PATCH http://localhost:8080/api/v1/instances/my-cloud/config \
-H "Content-Type: application/json" \
-d '{
"updates": [
{"path": "cluster.nodes.activeNodes[0]", "value": "node-1"},
{"path": "cluster.nodes.activeNodes[1]", "value": "node-2"}
]
}'
```
##### Example 4: Error Handling - Invalid Instance
```bash
curl -X PATCH http://localhost:8080/api/v1/instances/nonexistent/config \
-H "Content-Type: application/json" \
-d '{
"updates": [
{"path": "baseDomain", "value": "example.com"}
]
}'
```
Response (404):
```json
{
"error": "Instance not found: instance nonexistent does not exist"
}
```
##### Example 5: Error Handling - Empty Updates
```bash
curl -X PATCH http://localhost:8080/api/v1/instances/my-cloud/config \
-H "Content-Type: application/json" \
-d '{
"updates": []
}'
```
Response (400):
```json
{
"error": "updates array is required and cannot be empty"
}
```
##### Example 6: Error Handling - Missing Path
```bash
curl -X PATCH http://localhost:8080/api/v1/instances/my-cloud/config \
-H "Content-Type: application/json" \
-d '{
"updates": [
{"path": "", "value": "example.com"}
]
}'
```
Response (400):
```json
{
"error": "update[0]: path is required"
}
```
#### Configuration Path Syntax
The `path` field uses YAML path syntax as implemented by the `yq` tool:
- Simple fields: `baseDomain`
- Nested fields: `cluster.name`
- Array elements: `cluster.nodes.activeNodes[0]`
- Array append: `cluster.nodes.activeNodes[+]`
Refer to the yq documentation for advanced path syntax.
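For instance, the append form can be exercised in the same way as the earlier examples (the node name is a placeholder):
```bash
curl -X PATCH http://localhost:8080/api/v1/instances/my-cloud/config \
  -H "Content-Type: application/json" \
  -d '{
    "updates": [
      {"path": "cluster.nodes.activeNodes[+]", "value": "node-3"}
    ]
  }'
```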
See the API handler files in `internal/api/v1/` for detailed endpoint documentation.

go.mod (7 changed lines)

@@ -4,5 +4,12 @@ go 1.24
require (
github.com/gorilla/mux v1.8.1
github.com/rs/cors v1.11.1
gopkg.in/yaml.v3 v3.0.1
)
require (
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/stretchr/testify v1.11.1 // indirect
)

go.sum (8 changed lines)

@@ -1,5 +1,13 @@
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/gorilla/mux v1.8.1 h1:TuBL49tXwgrFYWhqrNgrUNEY92u81SPhu7sTdzQEiWY=
github.com/gorilla/mux v1.8.1/go.mod h1:AKf9I4AEqPTmMytcMc0KkNouC66V3BtZ4qD5fmWSiMQ=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/rs/cors v1.11.1 h1:eU3gRzXLRK57F5rKMGMZURNdIG4EoAmX8k94r9wXWHA=
github.com/rs/cors v1.11.1/go.mod h1:XyqrcTp5zjWr1wsJ8PIRZssZ8b/WMcMf71DJnit4EMU=
github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=


@@ -4,9 +4,9 @@ import (
"encoding/json"
"fmt"
"io"
"log"
"net/http"
"os"
"path/filepath"
"time"
"github.com/gorilla/mux"
@@ -14,71 +14,81 @@ import (
"github.com/wild-cloud/wild-central/daemon/internal/config"
"github.com/wild-cloud/wild-central/daemon/internal/context"
"github.com/wild-cloud/wild-central/daemon/internal/dnsmasq"
"github.com/wild-cloud/wild-central/daemon/internal/instance"
"github.com/wild-cloud/wild-central/daemon/internal/operations"
"github.com/wild-cloud/wild-central/daemon/internal/secrets"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// API holds all dependencies for API handlers
type API struct {
dataDir string
directoryPath string // Path to Wild Cloud Directory
appsDir string
config *config.Manager
secrets *secrets.Manager
context *context.Manager
instance *instance.Manager
broadcaster *operations.Broadcaster // SSE broadcaster for operation output
dataDir string
appsDir string // Path to external apps directory
config *config.Manager
secrets *secrets.Manager
context *context.Manager
instance *instance.Manager
dnsmasq *dnsmasq.ConfigGenerator
opsMgr *operations.Manager // Operations manager
broadcaster *operations.Broadcaster // SSE broadcaster for operation output
}
// NewAPI creates a new API handler with all dependencies
func NewAPI(dataDir, directoryPath string) (*API, error) {
// Note: Setup files (cluster-services, cluster-nodes, etc.) are now embedded in the binary
func NewAPI(dataDir, appsDir string) (*API, error) {
// Ensure base directories exist
instancesDir := filepath.Join(dataDir, "instances")
instancesDir := tools.GetInstancesPath(dataDir)
if err := os.MkdirAll(instancesDir, 0755); err != nil {
return nil, fmt.Errorf("failed to create instances directory: %w", err)
}
// Apps directory is now in Wild Cloud Directory
appsDir := filepath.Join(directoryPath, "apps")
// Determine dnsmasq config path
dnsmasqConfigPath := "/etc/dnsmasq.d/wild-cloud.conf"
if os.Getenv("WILD_API_DNSMASQ_CONFIG_PATH") != "" {
dnsmasqConfigPath = os.Getenv("WILD_API_DNSMASQ_CONFIG_PATH")
log.Printf("Using custom dnsmasq config path: %s", dnsmasqConfigPath)
}
return &API{
dataDir: dataDir,
directoryPath: directoryPath,
appsDir: appsDir,
config: config.NewManager(),
secrets: secrets.NewManager(),
context: context.NewManager(dataDir),
instance: instance.NewManager(dataDir),
broadcaster: operations.NewBroadcaster(),
dataDir: dataDir,
appsDir: appsDir,
config: config.NewManager(),
secrets: secrets.NewManager(),
context: context.NewManager(dataDir),
instance: instance.NewManager(dataDir),
dnsmasq: dnsmasq.NewConfigGenerator(dnsmasqConfigPath),
opsMgr: operations.NewManager(dataDir),
broadcaster: operations.NewBroadcaster(),
}, nil
}
// RegisterRoutes registers all API routes (Phase 1 + Phase 2)
func (api *API) RegisterRoutes(r *mux.Router) {
// Phase 1: Instance management
// Instance management
r.HandleFunc("/api/v1/instances", api.CreateInstance).Methods("POST")
r.HandleFunc("/api/v1/instances", api.ListInstances).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}", api.GetInstance).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}", api.DeleteInstance).Methods("DELETE")
// Phase 1: Config management
// Config management
r.HandleFunc("/api/v1/instances/{name}/config", api.GetConfig).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/config", api.UpdateConfig).Methods("PUT")
r.HandleFunc("/api/v1/instances/{name}/config", api.ConfigUpdateBatch).Methods("PATCH")
// Phase 1: Secrets management
// Secrets management
r.HandleFunc("/api/v1/instances/{name}/secrets", api.GetSecrets).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/secrets", api.UpdateSecrets).Methods("PUT")
// Phase 1: Context management
// Context management
r.HandleFunc("/api/v1/context", api.GetContext).Methods("GET")
r.HandleFunc("/api/v1/context", api.SetContext).Methods("POST")
// Phase 2: Node management
// Node management
r.HandleFunc("/api/v1/instances/{name}/nodes/discover", api.NodeDiscover).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/nodes/detect", api.NodeDetect).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/discovery", api.NodeDiscoveryStatus).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/discovery/cancel", api.NodeDiscoveryCancel).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/nodes/hardware/{ip}", api.NodeHardware).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/nodes/fetch-templates", api.NodeFetchTemplates).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/nodes", api.NodeAdd).Methods("POST")
@@ -86,21 +96,28 @@ func (api *API) RegisterRoutes(r *mux.Router) {
r.HandleFunc("/api/v1/instances/{name}/nodes/{node}", api.NodeGet).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/nodes/{node}", api.NodeUpdate).Methods("PUT")
r.HandleFunc("/api/v1/instances/{name}/nodes/{node}/apply", api.NodeApply).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/nodes/{node}/reset", api.NodeReset).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/nodes/{node}", api.NodeDelete).Methods("DELETE")
// Phase 2: PXE asset management
r.HandleFunc("/api/v1/instances/{name}/pxe/assets", api.PXEListAssets).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/pxe/assets/download", api.PXEDownloadAsset).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/pxe/assets/{type}", api.PXEGetAsset).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/pxe/assets/{type}", api.PXEDeleteAsset).Methods("DELETE")
// PXE Asset management (schematic@version composite key)
r.HandleFunc("/api/v1/pxe/assets", api.AssetsList).Methods("GET")
r.HandleFunc("/api/v1/pxe/assets/{schematicId}/{version}", api.AssetsGet).Methods("GET")
r.HandleFunc("/api/v1/pxe/assets/{schematicId}/{version}/download", api.AssetsDownload).Methods("POST")
r.HandleFunc("/api/v1/pxe/assets/{schematicId}/{version}", api.AssetsDelete).Methods("DELETE")
r.HandleFunc("/api/v1/pxe/assets/{schematicId}/{version}/pxe/{assetType}", api.AssetsServePXE).Methods("GET")
r.HandleFunc("/api/v1/pxe/assets/{schematicId}/{version}/status", api.AssetsGetStatus).Methods("GET")
// Phase 2: Operations
// Instance-schematic relationship
r.HandleFunc("/api/v1/instances/{name}/schematic", api.SchematicGetInstanceSchematic).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/schematic", api.SchematicUpdateInstanceSchematic).Methods("PUT")
// Operations
r.HandleFunc("/api/v1/instances/{name}/operations", api.OperationList).Methods("GET")
r.HandleFunc("/api/v1/operations/{id}", api.OperationGet).Methods("GET")
r.HandleFunc("/api/v1/operations/{id}/stream", api.OperationStream).Methods("GET")
r.HandleFunc("/api/v1/operations/{id}/cancel", api.OperationCancel).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/operations/{id}", api.OperationGet).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/operations/{id}/stream", api.OperationStream).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/operations/{id}/cancel", api.OperationCancel).Methods("POST")
// Phase 3: Cluster operations
// Cluster operations
r.HandleFunc("/api/v1/instances/{name}/cluster/config/generate", api.ClusterGenerateConfig).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/cluster/bootstrap", api.ClusterBootstrap).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/cluster/endpoints", api.ClusterConfigureEndpoints).Methods("POST")
@@ -111,7 +128,7 @@ func (api *API) RegisterRoutes(r *mux.Router) {
r.HandleFunc("/api/v1/instances/{name}/cluster/talosconfig", api.ClusterGetTalosconfig).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/cluster/reset", api.ClusterReset).Methods("POST")
// Phase 4: Services
// Services
r.HandleFunc("/api/v1/instances/{name}/services", api.ServicesList).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/services", api.ServicesInstall).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/services/install-all", api.ServicesInstallAll).Methods("POST")
@@ -126,8 +143,10 @@ func (api *API) RegisterRoutes(r *mux.Router) {
r.HandleFunc("/api/v1/instances/{name}/services/{service}/fetch", api.ServicesFetch).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/services/{service}/compile", api.ServicesCompile).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/services/{service}/deploy", api.ServicesDeploy).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/services/{service}/logs", api.ServicesGetLogs).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/services/{service}/config", api.ServicesUpdateConfig).Methods("PATCH")
// Phase 4: Apps
// Apps
r.HandleFunc("/api/v1/apps", api.AppsListAvailable).Methods("GET")
r.HandleFunc("/api/v1/apps/{app}", api.AppsGetAvailable).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/apps", api.AppsListDeployed).Methods("GET")
@@ -136,19 +155,32 @@ func (api *API) RegisterRoutes(r *mux.Router) {
r.HandleFunc("/api/v1/instances/{name}/apps/{app}", api.AppsDelete).Methods("DELETE")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/status", api.AppsGetStatus).Methods("GET")
// Phase 5: Backup & Restore
// Enhanced app endpoints
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/enhanced", api.AppsGetEnhanced).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/runtime", api.AppsGetEnhancedStatus).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/logs", api.AppsGetLogs).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/events", api.AppsGetEvents).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/readme", api.AppsGetReadme).Methods("GET")
// Backup & Restore
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/backup", api.BackupAppStart).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/backup", api.BackupAppList).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/restore", api.BackupAppRestore).Methods("POST")
// Phase 5: Utilities
r.HandleFunc("/api/v1/utilities/health", api.UtilitiesHealth).Methods("GET")
// Utilities
r.HandleFunc("/api/v1/instances/{name}/utilities/health", api.InstanceUtilitiesHealth).Methods("GET")
r.HandleFunc("/api/v1/utilities/dashboard/token", api.UtilitiesDashboardToken).Methods("GET")
r.HandleFunc("/api/v1/utilities/nodes/ips", api.UtilitiesNodeIPs).Methods("GET")
r.HandleFunc("/api/v1/utilities/controlplane/ip", api.UtilitiesControlPlaneIP).Methods("GET")
r.HandleFunc("/api/v1/utilities/secrets/{secret}/copy", api.UtilitiesSecretCopy).Methods("POST")
r.HandleFunc("/api/v1/utilities/version", api.UtilitiesVersion).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/utilities/dashboard/token", api.UtilitiesDashboardToken).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/utilities/nodes/ips", api.UtilitiesNodeIPs).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/utilities/controlplane/ip", api.UtilitiesControlPlaneIP).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/utilities/secrets/{secret}/copy", api.UtilitiesSecretCopy).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/utilities/version", api.UtilitiesVersion).Methods("GET")
// dnsmasq management
r.HandleFunc("/api/v1/dnsmasq/status", api.DnsmasqStatus).Methods("GET")
r.HandleFunc("/api/v1/dnsmasq/config", api.DnsmasqGetConfig).Methods("GET")
r.HandleFunc("/api/v1/dnsmasq/restart", api.DnsmasqRestart).Methods("POST")
r.HandleFunc("/api/v1/dnsmasq/generate", api.DnsmasqGenerate).Methods("POST")
r.HandleFunc("/api/v1/dnsmasq/update", api.DnsmasqUpdate).Methods("POST")
}
// CreateInstance creates a new instance
@@ -172,10 +204,19 @@ func (api *API) CreateInstance(w http.ResponseWriter, r *http.Request) {
return
}
respondJSON(w, http.StatusCreated, map[string]string{
// Attempt to update dnsmasq configuration with all instances
// This is non-critical - include warning in response if it fails
response := map[string]interface{}{
"name": req.Name,
"message": "Instance created successfully",
})
}
if err := api.updateDnsmasqForAllInstances(); err != nil {
log.Printf("Warning: Could not update dnsmasq configuration: %v", err)
response["warning"] = fmt.Sprintf("dnsmasq update failed: %v. Use POST /api/v1/dnsmasq/update to retry.", err)
}
respondJSON(w, http.StatusCreated, response)
}
// ListInstances lists all instances
@@ -262,12 +303,9 @@ func (api *API) GetConfig(w http.ResponseWriter, r *http.Request) {
respondJSON(w, http.StatusOK, configMap)
}
// UpdateConfig updates instance configuration
func (api *API) UpdateConfig(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
name := vars["name"]
if err := api.instance.ValidateInstance(name); err != nil {
// updateYAMLFile updates a YAML file with the provided key-value pairs
func (api *API) updateYAMLFile(w http.ResponseWriter, r *http.Request, instanceName, fileType string) {
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
@@ -284,22 +322,71 @@ func (api *API) UpdateConfig(w http.ResponseWriter, r *http.Request) {
return
}
configPath := api.instance.GetInstanceConfigPath(name)
var filePath string
if fileType == "config" {
filePath = api.instance.GetInstanceConfigPath(instanceName)
} else {
filePath = api.instance.GetInstanceSecretsPath(instanceName)
}
// Update each key-value pair
for key, value := range updates {
valueStr := fmt.Sprintf("%v", value)
if err := api.config.SetConfigValue(configPath, key, valueStr); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to update config key %s: %v", key, err))
// Read existing config/secrets file
existingContent, err := storage.ReadFile(filePath)
if err != nil && !os.IsNotExist(err) {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to read existing %s: %v", fileType, err))
return
}
// Parse existing content or initialize empty map
var existingConfig map[string]interface{}
if len(existingContent) > 0 {
if err := yaml.Unmarshal(existingContent, &existingConfig); err != nil {
respondError(w, http.StatusBadRequest, fmt.Sprintf("Failed to parse existing %s: %v", fileType, err))
return
}
} else {
existingConfig = make(map[string]interface{})
}
// Merge updates into existing config (shallow merge for top-level keys)
// This preserves unmodified keys while updating specified ones
for key, value := range updates {
existingConfig[key] = value
}
// Marshal the merged config back to YAML with proper formatting
yamlContent, err := yaml.Marshal(existingConfig)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to marshal YAML: %v", err))
return
}
// Write the complete merged YAML content to the file with proper locking
lockPath := filePath + ".lock"
if err := storage.WithLock(lockPath, func() error {
return storage.WriteFile(filePath, yamlContent, 0644)
}); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to update %s: %v", fileType, err))
return
}
// Capitalize first letter of fileType for message
fileTypeCap := fileType
if len(fileType) > 0 {
fileTypeCap = string(fileType[0]-32) + fileType[1:]
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "Config updated successfully",
"message": fmt.Sprintf("%s updated successfully", fileTypeCap),
})
}
// UpdateConfig updates instance configuration
func (api *API) UpdateConfig(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
name := vars["name"]
api.updateYAMLFile(w, r, name, "config")
}
// GetSecrets retrieves instance secrets (redacted by default)
func (api *API) GetSecrets(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
@@ -345,39 +432,7 @@ func (api *API) GetSecrets(w http.ResponseWriter, r *http.Request) {
func (api *API) UpdateSecrets(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
name := vars["name"]
if err := api.instance.ValidateInstance(name); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
body, err := io.ReadAll(r.Body)
if err != nil {
respondError(w, http.StatusBadRequest, "Failed to read request body")
return
}
var updates map[string]interface{}
if err := yaml.Unmarshal(body, &updates); err != nil {
respondError(w, http.StatusBadRequest, fmt.Sprintf("Invalid YAML: %v", err))
return
}
// Get secrets file path
secretsPath := api.instance.GetInstanceSecretsPath(name)
// Update each secret
for key, value := range updates {
valueStr := fmt.Sprintf("%v", value)
if err := api.secrets.SetSecret(secretsPath, key, valueStr); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to update secret %s: %v", key, err))
return
}
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "Secrets updated successfully",
})
api.updateYAMLFile(w, r, name, "secrets")
}
// GetContext retrieves current context
@@ -427,7 +482,7 @@ func (api *API) SetContext(w http.ResponseWriter, r *http.Request) {
}
// StatusHandler returns daemon status information
func (api *API) StatusHandler(w http.ResponseWriter, r *http.Request, startTime time.Time, dataDir, directoryPath string) {
func (api *API) StatusHandler(w http.ResponseWriter, r *http.Request, startTime time.Time, dataDir, appsDir string) {
// Get list of instances
instances, err := api.instance.ListInstances()
if err != nil {
@@ -443,7 +498,8 @@ func (api *API) StatusHandler(w http.ResponseWriter, r *http.Request, startTime
"uptime": uptime.String(),
"uptimeSeconds": int(uptime.Seconds()),
"dataDir": dataDir,
"directoryPath": directoryPath,
"appsDir": appsDir,
"setupFiles": "embedded", // Indicate that setup files are now embedded
"instances": map[string]interface{}{
"count": len(instances),
"names": instances,
@@ -456,7 +512,7 @@ func (api *API) StatusHandler(w http.ResponseWriter, r *http.Request, startTime
func respondJSON(w http.ResponseWriter, status int, data interface{}) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(status)
json.NewEncoder(w).Encode(data)
_ = json.NewEncoder(w).Encode(data)
}
func respondError(w http.ResponseWriter, status int, message string) {


@@ -4,11 +4,16 @@ import (
"encoding/json"
"fmt"
"net/http"
"os"
"path/filepath"
"strconv"
"strings"
"github.com/gorilla/mux"
"github.com/wild-cloud/wild-central/daemon/internal/apps"
"github.com/wild-cloud/wild-central/daemon/internal/operations"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// AppsListAvailable lists all available apps
@@ -106,80 +111,62 @@ func (api *API) AppsAdd(w http.ResponseWriter, r *http.Request) {
})
}
// AppsDeploy deploys an app to the cluster
func (api *API) AppsDeploy(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// startAppOperation starts an app operation (deploy or delete) in the background
func (api *API) startAppOperation(w http.ResponseWriter, instanceName, appName, operationType, successMessage string, operation func(*apps.Manager, string, string) error) {
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Start deploy operation
// Start operation
opsMgr := operations.NewManager(api.dataDir)
opID, err := opsMgr.Start(instanceName, "deploy_app", appName)
opID, err := opsMgr.Start(instanceName, operationType, appName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to start operation: %v", err))
return
}
// Deploy in background
// Execute operation in background
go func() {
appsMgr := apps.NewManager(api.dataDir, api.appsDir)
opsMgr.UpdateStatus(instanceName, opID, "running")
_ = opsMgr.UpdateStatus(instanceName, opID, "running")
if err := appsMgr.Deploy(instanceName, appName); err != nil {
opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
if err := operation(appsMgr, instanceName, appName); err != nil {
_ = opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
opsMgr.Update(instanceName, opID, "completed", "App deployed", 100)
_ = opsMgr.Update(instanceName, opID, "completed", successMessage, 100)
}
}()
respondJSON(w, http.StatusAccepted, map[string]string{
"operation_id": opID,
"message": "App deployment initiated",
"message": fmt.Sprintf("App %s initiated", operationType),
})
}
// AppsDeploy deploys an app to the cluster
func (api *API) AppsDeploy(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
api.startAppOperation(w, instanceName, appName, "deploy_app", "App deployed",
func(mgr *apps.Manager, instance, app string) error {
return mgr.Deploy(instance, app)
})
}
// AppsDelete deletes an app
func (api *API) AppsDelete(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Start delete operation
opsMgr := operations.NewManager(api.dataDir)
opID, err := opsMgr.Start(instanceName, "delete_app", appName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to start operation: %v", err))
return
}
// Delete in background
go func() {
appsMgr := apps.NewManager(api.dataDir, api.appsDir)
opsMgr.UpdateStatus(instanceName, opID, "running")
if err := appsMgr.Delete(instanceName, appName); err != nil {
opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
opsMgr.Update(instanceName, opID, "completed", "App deleted", 100)
}
}()
respondJSON(w, http.StatusAccepted, map[string]string{
"operation_id": opID,
"message": "App deletion initiated",
})
api.startAppOperation(w, instanceName, appName, "delete_app", "App deleted",
func(mgr *apps.Manager, instance, app string) error {
return mgr.Delete(instance, app)
})
}
// AppsGetStatus returns app status
@@ -204,3 +191,190 @@ func (api *API) AppsGetStatus(w http.ResponseWriter, r *http.Request) {
respondJSON(w, http.StatusOK, status)
}
// AppsGetEnhanced returns enhanced app details with runtime status
func (api *API) AppsGetEnhanced(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Get enhanced app details
appsMgr := apps.NewManager(api.dataDir, api.appsDir)
enhanced, err := appsMgr.GetEnhanced(instanceName, appName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get app details: %v", err))
return
}
respondJSON(w, http.StatusOK, enhanced)
}
// AppsGetEnhancedStatus returns just runtime status for an app
func (api *API) AppsGetEnhancedStatus(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Get runtime status
appsMgr := apps.NewManager(api.dataDir, api.appsDir)
status, err := appsMgr.GetEnhancedStatus(instanceName, appName)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Failed to get runtime status: %v", err))
return
}
respondJSON(w, http.StatusOK, status)
}
// AppsGetLogs returns logs for an app (from first pod)
func (api *API) AppsGetLogs(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Parse query parameters
tailStr := r.URL.Query().Get("tail")
sinceSecondsStr := r.URL.Query().Get("sinceSeconds")
podName := r.URL.Query().Get("pod")
tail := 100 // default
if tailStr != "" {
if t, err := strconv.Atoi(tailStr); err == nil && t > 0 {
tail = t
}
}
sinceSeconds := 0
if sinceSecondsStr != "" {
if s, err := strconv.Atoi(sinceSecondsStr); err == nil && s > 0 {
sinceSeconds = s
}
}
// Get logs
kubeconfigPath := api.dataDir + "/instances/" + instanceName + "/kubeconfig"
kubectl := tools.NewKubectl(kubeconfigPath)
// If no pod specified, get the first pod
if podName == "" {
pods, err := kubectl.GetPods(appName, true)
if err != nil || len(pods) == 0 {
respondError(w, http.StatusNotFound, "No pods found for app")
return
}
podName = pods[0].Name
}
logOpts := tools.LogOptions{
Tail: tail,
SinceSeconds: sinceSeconds,
}
logs, err := kubectl.GetLogs(appName, podName, logOpts)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get logs: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]interface{}{
"pod": podName,
"logs": logs,
})
}
// AppsGetEvents returns kubernetes events for an app
func (api *API) AppsGetEvents(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Parse query parameters
limitStr := r.URL.Query().Get("limit")
limit := 20 // default
if limitStr != "" {
if l, err := strconv.Atoi(limitStr); err == nil && l > 0 {
limit = l
}
}
// Get events
kubeconfigPath := api.dataDir + "/instances/" + instanceName + "/kubeconfig"
kubectl := tools.NewKubectl(kubeconfigPath)
events, err := kubectl.GetRecentEvents(appName, limit)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get events: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]interface{}{
"events": events,
})
}
// AppsGetReadme returns the README.md content for an app
func (api *API) AppsGetReadme(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Validate app name to prevent path traversal
if appName == "" || appName == "." || appName == ".." ||
strings.Contains(appName, "/") || strings.Contains(appName, "\\") {
respondError(w, http.StatusBadRequest, "Invalid app name")
return
}
// Try instance-specific README first
instancePath := filepath.Join(api.dataDir, "instances", instanceName, "apps", appName, "README.md")
content, err := os.ReadFile(instancePath)
if err == nil {
w.Header().Set("Content-Type", "text/markdown; charset=utf-8")
w.Write(content)
return
}
// Fall back to global directory
globalPath := filepath.Join(api.appsDir, appName, "README.md")
content, err = os.ReadFile(globalPath)
if err != nil {
if os.IsNotExist(err) {
respondError(w, http.StatusNotFound, fmt.Sprintf("README not found for app '%s' in instance '%s'", appName, instanceName))
} else {
respondError(w, http.StatusInternalServerError, "Failed to read README file")
}
return
}
w.Header().Set("Content-Type", "text/markdown; charset=utf-8")
w.Write(content)
}


@@ -0,0 +1,172 @@
package v1
import (
"encoding/json"
"fmt"
"net/http"
"os"
"github.com/gorilla/mux"
"github.com/wild-cloud/wild-central/daemon/internal/assets"
)
// AssetsList lists all available assets (schematic@version combinations)
func (api *API) AssetsList(w http.ResponseWriter, r *http.Request) {
assetsMgr := assets.NewManager(api.dataDir)
assetList, err := assetsMgr.ListAssets()
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to list assets: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]interface{}{
"assets": assetList,
})
}
// AssetsGet returns details for a specific asset (schematic@version)
func (api *API) AssetsGet(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
schematicID := vars["schematicId"]
version := vars["version"]
assetsMgr := assets.NewManager(api.dataDir)
asset, err := assetsMgr.GetAsset(schematicID, version)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Asset not found: %v", err))
return
}
respondJSON(w, http.StatusOK, asset)
}
// AssetsDownload downloads assets for a schematic@version
func (api *API) AssetsDownload(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
schematicID := vars["schematicId"]
version := vars["version"]
// Parse request body
var req struct {
Platform string `json:"platform,omitempty"`
AssetTypes []string `json:"asset_types,omitempty"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
respondError(w, http.StatusBadRequest, "Invalid request body")
return
}
// Default platform to amd64 if not specified
if req.Platform == "" {
req.Platform = "amd64"
}
// Download assets
assetsMgr := assets.NewManager(api.dataDir)
if err := assetsMgr.DownloadAssets(schematicID, version, req.Platform, req.AssetTypes); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to download assets: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]interface{}{
"message": "Assets downloaded successfully",
"schematic_id": schematicID,
"version": version,
"platform": req.Platform,
})
}
// AssetsServePXE serves a PXE asset file
func (api *API) AssetsServePXE(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
schematicID := vars["schematicId"]
version := vars["version"]
assetType := vars["assetType"]
assetsMgr := assets.NewManager(api.dataDir)
// Get asset path
assetPath, err := assetsMgr.GetAssetPath(schematicID, version, assetType)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Asset not found: %v", err))
return
}
// Open file
file, err := os.Open(assetPath)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to open asset: %v", err))
return
}
defer file.Close()
// Get file info for size
info, err := file.Stat()
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to stat asset: %v", err))
return
}
// Set appropriate content type
var contentType string
switch assetType {
case "kernel":
contentType = "application/octet-stream"
case "initramfs":
contentType = "application/x-xz"
case "iso":
contentType = "application/x-iso9660-image"
default:
contentType = "application/octet-stream"
}
// Set headers
w.Header().Set("Content-Type", contentType)
w.Header().Set("Content-Length", fmt.Sprintf("%d", info.Size()))
// Set Content-Disposition to suggest filename for download
w.Header().Set("Content-Disposition", fmt.Sprintf("attachment; filename=\"%s\"", info.Name()))
// Serve file
http.ServeContent(w, r, info.Name(), info.ModTime(), file)
}
// AssetsGetStatus returns download status for a schematic@version
func (api *API) AssetsGetStatus(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
schematicID := vars["schematicId"]
version := vars["version"]
assetsMgr := assets.NewManager(api.dataDir)
status, err := assetsMgr.GetAssetStatus(schematicID, version)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Asset not found: %v", err))
return
}
respondJSON(w, http.StatusOK, status)
}
// AssetsDelete deletes an asset (schematic@version) and all its files
func (api *API) AssetsDelete(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
schematicID := vars["schematicId"]
version := vars["version"]
assetsMgr := assets.NewManager(api.dataDir)
if err := assetsMgr.DeleteAsset(schematicID, version); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to delete asset: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "Asset deleted successfully",
"schematic_id": schematicID,
"version": version,
})
}


@@ -27,15 +27,15 @@ func (api *API) BackupAppStart(w http.ResponseWriter, r *http.Request) {
// Run backup in background
go func() {
opMgr.UpdateProgress(instanceName, opID, 10, "Starting backup")
_ = opMgr.UpdateProgress(instanceName, opID, 10, "Starting backup")
info, err := mgr.BackupApp(instanceName, appName)
if err != nil {
opMgr.Update(instanceName, opID, "failed", err.Error(), 100)
_ = opMgr.Update(instanceName, opID, "failed", err.Error(), 100)
return
}
opMgr.Update(instanceName, opID, "completed", "Backup completed", 100)
_ = opMgr.Update(instanceName, opID, "completed", "Backup completed", 100)
_ = info // Metadata saved in backup.json
}()
@@ -92,14 +92,14 @@ func (api *API) BackupAppRestore(w http.ResponseWriter, r *http.Request) {
// Run restore in background
go func() {
opMgr.UpdateProgress(instanceName, opID, 10, "Starting restore")
_ = opMgr.UpdateProgress(instanceName, opID, 10, "Starting restore")
if err := mgr.RestoreApp(instanceName, appName, opts); err != nil {
opMgr.Update(instanceName, opID, "failed", err.Error(), 100)
_ = opMgr.Update(instanceName, opID, "failed", err.Error(), 100)
return
}
opMgr.Update(instanceName, opID, "completed", "Restore completed", 100)
_ = opMgr.Update(instanceName, opID, "completed", "Restore completed", 100)
}()
respondJSON(w, http.StatusAccepted, map[string]interface{}{


@@ -46,15 +46,15 @@ func (api *API) ClusterGenerateConfig(w http.ResponseWriter, r *http.Request) {
}
// Create cluster config
config := cluster.ClusterConfig{
clusterConfig := cluster.ClusterConfig{
ClusterName: clusterName,
VIP: vip,
Version: version,
}
// Generate configuration
clusterMgr := cluster.NewManager(api.dataDir)
if err := clusterMgr.GenerateConfig(instanceName, &config); err != nil {
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
if err := clusterMgr.GenerateConfig(instanceName, &clusterConfig); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to generate config: %v", err))
return
}
@@ -90,26 +90,14 @@ func (api *API) ClusterBootstrap(w http.ResponseWriter, r *http.Request) {
return
}
// Start bootstrap operation
opsMgr := operations.NewManager(api.dataDir)
opID, err := opsMgr.Start(instanceName, "bootstrap", req.Node)
// Bootstrap with progress tracking
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
opID, err := clusterMgr.Bootstrap(instanceName, req.Node)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to start operation: %v", err))
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to start bootstrap: %v", err))
return
}
// Bootstrap in background
go func() {
clusterMgr := cluster.NewManager(api.dataDir)
opsMgr.UpdateStatus(instanceName, opID, "running")
if err := clusterMgr.Bootstrap(instanceName, req.Node); err != nil {
opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
opsMgr.Update(instanceName, opID, "completed", "Bootstrap completed", 100)
}
}()
respondJSON(w, http.StatusAccepted, map[string]string{
"operation_id": opID,
"message": "Bootstrap initiated",
@@ -138,7 +126,7 @@ func (api *API) ClusterConfigureEndpoints(w http.ResponseWriter, r *http.Request
}
// Configure endpoints
clusterMgr := cluster.NewManager(api.dataDir)
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
if err := clusterMgr.ConfigureEndpoints(instanceName, req.IncludeNodes); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to configure endpoints: %v", err))
return
@@ -161,7 +149,7 @@ func (api *API) ClusterGetStatus(w http.ResponseWriter, r *http.Request) {
}
// Get status
clusterMgr := cluster.NewManager(api.dataDir)
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
status, err := clusterMgr.GetStatus(instanceName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get status: %v", err))
@@ -183,7 +171,7 @@ func (api *API) ClusterHealth(w http.ResponseWriter, r *http.Request) {
}
// Get health checks
clusterMgr := cluster.NewManager(api.dataDir)
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
checks, err := clusterMgr.Health(instanceName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get health: %v", err))
@@ -219,7 +207,7 @@ func (api *API) ClusterGetKubeconfig(w http.ResponseWriter, r *http.Request) {
}
// Get kubeconfig
clusterMgr := cluster.NewManager(api.dataDir)
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
kubeconfig, err := clusterMgr.GetKubeconfig(instanceName)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Kubeconfig not found: %v", err))
@@ -243,7 +231,7 @@ func (api *API) ClusterGenerateKubeconfig(w http.ResponseWriter, r *http.Request
}
// Regenerate kubeconfig from cluster
clusterMgr := cluster.NewManager(api.dataDir)
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
if err := clusterMgr.RegenerateKubeconfig(instanceName); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to generate kubeconfig: %v", err))
return
@@ -266,7 +254,7 @@ func (api *API) ClusterGetTalosconfig(w http.ResponseWriter, r *http.Request) {
}
// Get talosconfig
clusterMgr := cluster.NewManager(api.dataDir)
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
talosconfig, err := clusterMgr.GetTalosconfig(instanceName)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Talosconfig not found: %v", err))
@@ -314,13 +302,13 @@ func (api *API) ClusterReset(w http.ResponseWriter, r *http.Request) {
// Reset in background
go func() {
clusterMgr := cluster.NewManager(api.dataDir)
opsMgr.UpdateStatus(instanceName, opID, "running")
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
_ = opsMgr.UpdateStatus(instanceName, opID, "running")
if err := clusterMgr.Reset(instanceName, req.Confirm); err != nil {
opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
_ = opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
opsMgr.Update(instanceName, opID, "completed", "Cluster reset completed", 100)
_ = opsMgr.Update(instanceName, opID, "completed", "Cluster reset completed", 100)
}
}()
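The hunks above move the bootstrap goroutine out of the HTTP handler and into the cluster manager, which now owns the operation lifecycle and returns the operation ID. A minimal sketch of the manager-side method, assuming the operations.Manager API used throughout this diff (Start, UpdateStatus, Update); runBootstrap is a hypothetical stand-in for the real step-by-step logic, which is not shown here:

func (m *Manager) Bootstrap(instanceName, node string) (string, error) {
	opID, err := m.opsMgr.Start(instanceName, "bootstrap", node)
	if err != nil {
		return "", err
	}
	go func() {
		// Mirror the status transitions the handlers previously performed.
		_ = m.opsMgr.UpdateStatus(instanceName, opID, "running")
		if err := m.runBootstrap(instanceName, node, opID); err != nil {
			_ = m.opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
			return
		}
		_ = m.opsMgr.Update(instanceName, opID, "completed", "Bootstrap completed", 100)
	}()
	return opID, nil
}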

View File

@@ -0,0 +1,656 @@
package v1
import (
"bytes"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"testing"
"github.com/gorilla/mux"
"gopkg.in/yaml.v3"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
)
func setupTestAPI(t *testing.T) (*API, string) {
tmpDir := t.TempDir()
appsDir := filepath.Join(tmpDir, "apps")
api, err := NewAPI(tmpDir, appsDir)
if err != nil {
t.Fatalf("Failed to create test API: %v", err)
}
return api, tmpDir
}
func createTestInstance(t *testing.T, api *API, name string) {
if err := api.instance.CreateInstance(name); err != nil {
t.Fatalf("Failed to create test instance: %v", err)
}
}
func TestUpdateYAMLFile_DeltaUpdate(t *testing.T) {
api, _ := setupTestAPI(t)
instanceName := "test-instance"
createTestInstance(t, api, instanceName)
configPath := api.instance.GetInstanceConfigPath(instanceName)
// Create initial config
initialConfig := map[string]interface{}{
"domain": "old.com",
"email": "admin@old.com",
"cluster": map[string]interface{}{
"name": "test-cluster",
},
}
initialYAML, _ := yaml.Marshal(initialConfig)
if err := storage.WriteFile(configPath, initialYAML, 0644); err != nil {
t.Fatalf("Failed to write initial config: %v", err)
}
// Update only domain
updateData := map[string]interface{}{
"domain": "new.com",
}
updateYAML, _ := yaml.Marshal(updateData)
req := httptest.NewRequest("PUT", "/api/v1/instances/"+instanceName+"/config", bytes.NewBuffer(updateYAML))
w := httptest.NewRecorder()
vars := map[string]string{"name": instanceName}
req = mux.SetURLVars(req, vars)
api.UpdateConfig(w, req)
if w.Code != http.StatusOK {
t.Fatalf("Expected status 200, got %d: %s", w.Code, w.Body.String())
}
// Verify merged config
resultData, err := storage.ReadFile(configPath)
if err != nil {
t.Fatalf("Failed to read result: %v", err)
}
var result map[string]interface{}
if err := yaml.Unmarshal(resultData, &result); err != nil {
t.Fatalf("Failed to parse result: %v", err)
}
// Domain should be updated
if result["domain"] != "new.com" {
t.Errorf("Expected domain='new.com', got %v", result["domain"])
}
// Email should be preserved
if result["email"] != "admin@old.com" {
t.Errorf("Expected email='admin@old.com', got %v", result["email"])
}
// Cluster should be preserved
if cluster, ok := result["cluster"].(map[string]interface{}); !ok {
t.Errorf("Cluster not preserved as map")
} else if cluster["name"] != "test-cluster" {
t.Errorf("Cluster name not preserved")
}
}
func TestUpdateYAMLFile_FullReplacement(t *testing.T) {
api, _ := setupTestAPI(t)
instanceName := "test-instance"
createTestInstance(t, api, instanceName)
configPath := api.instance.GetInstanceConfigPath(instanceName)
// Create initial config
initialConfig := map[string]interface{}{
"domain": "old.com",
"email": "admin@old.com",
"oldKey": "oldValue",
}
initialYAML, _ := yaml.Marshal(initialConfig)
if err := storage.WriteFile(configPath, initialYAML, 0644); err != nil {
t.Fatalf("Failed to write initial config: %v", err)
}
// Send a complete replacement config (shallow-merge semantics still apply)
newConfig := map[string]interface{}{
"domain": "new.com",
"email": "new@new.com",
"newKey": "newValue",
}
newYAML, _ := yaml.Marshal(newConfig)
req := httptest.NewRequest("PUT", "/api/v1/instances/"+instanceName+"/config", bytes.NewBuffer(newYAML))
w := httptest.NewRecorder()
vars := map[string]string{"name": instanceName}
req = mux.SetURLVars(req, vars)
api.UpdateConfig(w, req)
if w.Code != http.StatusOK {
t.Fatalf("Expected status 200, got %d: %s", w.Code, w.Body.String())
}
// Verify result
resultData, err := storage.ReadFile(configPath)
if err != nil {
t.Fatalf("Failed to read result: %v", err)
}
var result map[string]interface{}
if err := yaml.Unmarshal(resultData, &result); err != nil {
t.Fatalf("Failed to parse result: %v", err)
}
// All new values should be present
if result["domain"] != "new.com" {
t.Errorf("Expected domain='new.com', got %v", result["domain"])
}
if result["email"] != "new@new.com" {
t.Errorf("Expected email='new@new.com', got %v", result["email"])
}
if result["newKey"] != "newValue" {
t.Errorf("Expected newKey='newValue', got %v", result["newKey"])
}
// Old key should still be present (shallow merge)
if result["oldKey"] != "oldValue" {
t.Errorf("Expected oldKey='oldValue', got %v", result["oldKey"])
}
}
func TestUpdateYAMLFile_NestedStructure(t *testing.T) {
api, _ := setupTestAPI(t)
instanceName := "test-instance"
createTestInstance(t, api, instanceName)
configPath := api.instance.GetInstanceConfigPath(instanceName)
// Update with nested structure
updateData := map[string]interface{}{
"cloud": map[string]interface{}{
"domain": "test.com",
"dns": map[string]interface{}{
"ip": "1.2.3.4",
"port": 53,
},
},
}
updateYAML, _ := yaml.Marshal(updateData)
req := httptest.NewRequest("PUT", "/api/v1/instances/"+instanceName+"/config", bytes.NewBuffer(updateYAML))
w := httptest.NewRecorder()
vars := map[string]string{"name": instanceName}
req = mux.SetURLVars(req, vars)
api.UpdateConfig(w, req)
if w.Code != http.StatusOK {
t.Fatalf("Expected status 200, got %d: %s", w.Code, w.Body.String())
}
// Verify nested structure preserved
resultData, err := storage.ReadFile(configPath)
if err != nil {
t.Fatalf("Failed to read result: %v", err)
}
var result map[string]interface{}
if err := yaml.Unmarshal(resultData, &result); err != nil {
t.Fatalf("Failed to parse result: %v", err)
}
// Verify nested structure is proper YAML, not Go map notation
resultStr := string(resultData)
if bytes.Contains(resultData, []byte("map[")) {
t.Errorf("Result contains Go map notation: %s", resultStr)
}
// Verify structure is accessible
cloud, ok := result["cloud"].(map[string]interface{})
if !ok {
t.Fatalf("cloud is not a map: %T", result["cloud"])
}
if cloud["domain"] != "test.com" {
t.Errorf("Expected cloud.domain='test.com', got %v", cloud["domain"])
}
dns, ok := cloud["dns"].(map[string]interface{})
if !ok {
t.Fatalf("cloud.dns is not a map: %T", cloud["dns"])
}
if dns["ip"] != "1.2.3.4" {
t.Errorf("Expected dns.ip='1.2.3.4', got %v", dns["ip"])
}
if dns["port"] != 53 {
t.Errorf("Expected dns.port=53, got %v", dns["port"])
}
}
func TestUpdateYAMLFile_EmptyFileCreation(t *testing.T) {
api, _ := setupTestAPI(t)
instanceName := "test-instance"
createTestInstance(t, api, instanceName)
configPath := api.instance.GetInstanceConfigPath(instanceName)
// Truncate the config file to make it empty (but still exists)
if err := storage.WriteFile(configPath, []byte(""), 0644); err != nil {
t.Fatalf("Failed to empty config file: %v", err)
}
// Update should populate empty file
updateData := map[string]interface{}{
"domain": "new.com",
"email": "admin@new.com",
}
updateYAML, _ := yaml.Marshal(updateData)
req := httptest.NewRequest("PUT", "/api/v1/instances/"+instanceName+"/config", bytes.NewBuffer(updateYAML))
w := httptest.NewRecorder()
vars := map[string]string{"name": instanceName}
req = mux.SetURLVars(req, vars)
api.UpdateConfig(w, req)
if w.Code != http.StatusOK {
t.Fatalf("Expected status 200, got %d: %s", w.Code, w.Body.String())
}
// Verify content
resultData, err := storage.ReadFile(configPath)
if err != nil {
t.Fatalf("Failed to read result: %v", err)
}
var result map[string]interface{}
if err := yaml.Unmarshal(resultData, &result); err != nil {
t.Fatalf("Failed to parse result: %v", err)
}
if result["domain"] != "new.com" {
t.Errorf("Expected domain='new.com', got %v", result["domain"])
}
if result["email"] != "admin@new.com" {
t.Errorf("Expected email='admin@new.com', got %v", result["email"])
}
}
func TestUpdateYAMLFile_EmptyUpdate(t *testing.T) {
api, _ := setupTestAPI(t)
instanceName := "test-instance"
createTestInstance(t, api, instanceName)
configPath := api.instance.GetInstanceConfigPath(instanceName)
// Create initial config
initialConfig := map[string]interface{}{
"domain": "test.com",
}
initialYAML, _ := yaml.Marshal(initialConfig)
if err := storage.WriteFile(configPath, initialYAML, 0644); err != nil {
t.Fatalf("Failed to write initial config: %v", err)
}
// Empty update
updateData := map[string]interface{}{}
updateYAML, _ := yaml.Marshal(updateData)
req := httptest.NewRequest("PUT", "/api/v1/instances/"+instanceName+"/config", bytes.NewBuffer(updateYAML))
w := httptest.NewRecorder()
vars := map[string]string{"name": instanceName}
req = mux.SetURLVars(req, vars)
api.UpdateConfig(w, req)
if w.Code != http.StatusOK {
t.Fatalf("Expected status 200, got %d: %s", w.Code, w.Body.String())
}
// Verify file unchanged
resultData, err := storage.ReadFile(configPath)
if err != nil {
t.Fatalf("Failed to read result: %v", err)
}
var result map[string]interface{}
if err := yaml.Unmarshal(resultData, &result); err != nil {
t.Fatalf("Failed to parse result: %v", err)
}
if result["domain"] != "test.com" {
t.Errorf("Expected domain='test.com', got %v", result["domain"])
}
}
func TestUpdateYAMLFile_YAMLFormatting(t *testing.T) {
api, _ := setupTestAPI(t)
instanceName := "test-instance"
createTestInstance(t, api, instanceName)
configPath := api.instance.GetInstanceConfigPath(instanceName)
// Update with complex nested structure
updateData := map[string]interface{}{
"cloud": map[string]interface{}{
"domain": "test.com",
"dns": map[string]interface{}{
"ip": "1.2.3.4",
},
},
"cluster": map[string]interface{}{
"nodes": []interface{}{
map[string]interface{}{
"name": "node1",
"ip": "10.0.0.1",
},
map[string]interface{}{
"name": "node2",
"ip": "10.0.0.2",
},
},
},
}
updateYAML, _ := yaml.Marshal(updateData)
req := httptest.NewRequest("PUT", "/api/v1/instances/"+instanceName+"/config", bytes.NewBuffer(updateYAML))
w := httptest.NewRecorder()
vars := map[string]string{"name": instanceName}
req = mux.SetURLVars(req, vars)
api.UpdateConfig(w, req)
if w.Code != http.StatusOK {
t.Fatalf("Expected status 200, got %d: %s", w.Code, w.Body.String())
}
// Verify YAML formatting
resultData, err := storage.ReadFile(configPath)
if err != nil {
t.Fatalf("Failed to read result: %v", err)
}
resultStr := string(resultData)
// Should not contain Go map notation
if bytes.Contains(resultData, []byte("map[")) {
t.Errorf("Result contains Go map notation: %s", resultStr)
}
// Should be valid YAML
var result map[string]interface{}
if err := yaml.Unmarshal(resultData, &result); err != nil {
t.Fatalf("Result is not valid YAML: %v", err)
}
// Should have proper indentation (check for nested structure indicators)
if !bytes.Contains(resultData, []byte(" ")) {
t.Error("Result appears to lack proper indentation")
}
}
func TestUpdateYAMLFile_InvalidYAML(t *testing.T) {
api, _ := setupTestAPI(t)
instanceName := "test-instance"
createTestInstance(t, api, instanceName)
// Send invalid YAML
invalidYAML := []byte("invalid: yaml: content: [")
req := httptest.NewRequest("PUT", "/api/v1/instances/"+instanceName+"/config", bytes.NewBuffer(invalidYAML))
w := httptest.NewRecorder()
vars := map[string]string{"name": instanceName}
req = mux.SetURLVars(req, vars)
api.UpdateConfig(w, req)
if w.Code != http.StatusBadRequest {
t.Errorf("Expected status 400, got %d", w.Code)
}
}
func TestUpdateYAMLFile_InvalidInstance(t *testing.T) {
api, _ := setupTestAPI(t)
updateData := map[string]interface{}{
"domain": "test.com",
}
updateYAML, _ := yaml.Marshal(updateData)
req := httptest.NewRequest("PUT", "/api/v1/instances/nonexistent/config", bytes.NewBuffer(updateYAML))
w := httptest.NewRecorder()
vars := map[string]string{"name": "nonexistent"}
req = mux.SetURLVars(req, vars)
api.UpdateConfig(w, req)
if w.Code != http.StatusNotFound {
t.Errorf("Expected status 404, got %d", w.Code)
}
}
func TestUpdateYAMLFile_FilePermissions(t *testing.T) {
api, _ := setupTestAPI(t)
instanceName := "test-instance"
createTestInstance(t, api, instanceName)
configPath := api.instance.GetInstanceConfigPath(instanceName)
updateData := map[string]interface{}{
"domain": "test.com",
}
updateYAML, _ := yaml.Marshal(updateData)
req := httptest.NewRequest("PUT", "/api/v1/instances/"+instanceName+"/config", bytes.NewBuffer(updateYAML))
w := httptest.NewRecorder()
vars := map[string]string{"name": instanceName}
req = mux.SetURLVars(req, vars)
api.UpdateConfig(w, req)
if w.Code != http.StatusOK {
t.Fatalf("Expected status 200, got %d: %s", w.Code, w.Body.String())
}
// Check file permissions
info, err := os.Stat(configPath)
if err != nil {
t.Fatalf("Failed to stat config file: %v", err)
}
expectedPerm := os.FileMode(0644)
if info.Mode().Perm() != expectedPerm {
t.Errorf("Expected permissions %v, got %v", expectedPerm, info.Mode().Perm())
}
}
func TestUpdateYAMLFile_UpdateSecrets(t *testing.T) {
api, _ := setupTestAPI(t)
instanceName := "test-instance"
createTestInstance(t, api, instanceName)
secretsPath := api.instance.GetInstanceSecretsPath(instanceName)
// Update secrets
updateData := map[string]interface{}{
"dbPassword": "secret123",
"apiKey": "key456",
}
updateYAML, _ := yaml.Marshal(updateData)
req := httptest.NewRequest("PUT", "/api/v1/instances/"+instanceName+"/secrets", bytes.NewBuffer(updateYAML))
w := httptest.NewRecorder()
vars := map[string]string{"name": instanceName}
req = mux.SetURLVars(req, vars)
api.UpdateSecrets(w, req)
if w.Code != http.StatusOK {
t.Fatalf("Expected status 200, got %d: %s", w.Code, w.Body.String())
}
// Verify secrets file created and contains data
resultData, err := storage.ReadFile(secretsPath)
if err != nil {
t.Fatalf("Failed to read secrets: %v", err)
}
var result map[string]interface{}
if err := yaml.Unmarshal(resultData, &result); err != nil {
t.Fatalf("Failed to parse secrets: %v", err)
}
if result["dbPassword"] != "secret123" {
t.Errorf("Expected dbPassword='secret123', got %v", result["dbPassword"])
}
if result["apiKey"] != "key456" {
t.Errorf("Expected apiKey='key456', got %v", result["apiKey"])
}
}
func TestUpdateYAMLFile_ConcurrentUpdates(t *testing.T) {
api, _ := setupTestAPI(t)
instanceName := "test-instance"
createTestInstance(t, api, instanceName)
// This test exercises concurrent updates and verifies data integrity:
// every request should succeed and the file should remain valid YAML
numUpdates := 10
done := make(chan bool, numUpdates)
for i := 0; i < numUpdates; i++ {
go func(index int) {
updateData := map[string]interface{}{
"counter": index,
}
updateYAML, _ := yaml.Marshal(updateData)
req := httptest.NewRequest("PUT", "/api/v1/instances/"+instanceName+"/config", bytes.NewBuffer(updateYAML))
w := httptest.NewRecorder()
vars := map[string]string{"name": instanceName}
req = mux.SetURLVars(req, vars)
api.UpdateConfig(w, req)
done <- w.Code == http.StatusOK
}(i)
}
// Wait for all updates to complete
successCount := 0
for i := 0; i < numUpdates; i++ {
if <-done {
successCount++
}
}
if successCount != numUpdates {
t.Errorf("Expected %d successful updates, got %d", numUpdates, successCount)
}
// Verify file is still valid YAML
configPath := api.instance.GetInstanceConfigPath(instanceName)
resultData, err := storage.ReadFile(configPath)
if err != nil {
t.Fatalf("Failed to read final config: %v", err)
}
var result map[string]interface{}
if err := yaml.Unmarshal(resultData, &result); err != nil {
t.Fatalf("Final config is not valid YAML: %v", err)
}
}
func TestUpdateYAMLFile_PreservesComplexTypes(t *testing.T) {
api, _ := setupTestAPI(t)
instanceName := "test-instance"
createTestInstance(t, api, instanceName)
configPath := api.instance.GetInstanceConfigPath(instanceName)
// Create config with various types
updateData := map[string]interface{}{
"stringValue": "text",
"intValue": 42,
"floatValue": 3.14,
"boolValue": true,
"arrayValue": []interface{}{"a", "b", "c"},
"mapValue": map[string]interface{}{
"nested": "value",
},
"nullValue": nil,
}
updateYAML, _ := yaml.Marshal(updateData)
req := httptest.NewRequest("PUT", "/api/v1/instances/"+instanceName+"/config", bytes.NewBuffer(updateYAML))
w := httptest.NewRecorder()
vars := map[string]string{"name": instanceName}
req = mux.SetURLVars(req, vars)
api.UpdateConfig(w, req)
if w.Code != http.StatusOK {
t.Fatalf("Expected status 200, got %d: %s", w.Code, w.Body.String())
}
// Verify types preserved
resultData, err := storage.ReadFile(configPath)
if err != nil {
t.Fatalf("Failed to read result: %v", err)
}
var result map[string]interface{}
if err := yaml.Unmarshal(resultData, &result); err != nil {
t.Fatalf("Failed to parse result: %v", err)
}
if result["stringValue"] != "text" {
t.Errorf("String value not preserved: %v", result["stringValue"])
}
if result["intValue"] != 42 {
t.Errorf("Int value not preserved: %v", result["intValue"])
}
if result["floatValue"] != 3.14 {
t.Errorf("Float value not preserved: %v", result["floatValue"])
}
if result["boolValue"] != true {
t.Errorf("Bool value not preserved: %v", result["boolValue"])
}
arrayValue, ok := result["arrayValue"].([]interface{})
if !ok {
t.Errorf("Array not preserved as slice: %T", result["arrayValue"])
} else if len(arrayValue) != 3 {
t.Errorf("Array length not preserved: %d", len(arrayValue))
}
mapValue, ok := result["mapValue"].(map[string]interface{})
if !ok {
t.Errorf("Map not preserved: %T", result["mapValue"])
} else if mapValue["nested"] != "value" {
t.Errorf("Nested map value not preserved: %v", mapValue["nested"])
}
if result["nullValue"] != nil {
t.Errorf("Null value not preserved: %v", result["nullValue"])
}
}
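Taken together, these tests pin down the update semantics: top-level keys in the PUT body overwrite or add, and keys absent from the body survive (which is why oldKey persists even in the "full replacement" case). A merge with exactly those properties fits in a few lines; this is a sketch of the asserted behavior, not the handler's actual implementation:

// Shallow merge matching the behavior the tests assert: keys in update win,
// all other keys in existing are preserved.
func shallowMerge(existing, update map[string]interface{}) map[string]interface{} {
	merged := make(map[string]interface{}, len(existing)+len(update))
	for k, v := range existing {
		merged[k] = v
	}
	for k, v := range update {
		merged[k] = v
	}
	return merged
}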

View File

@@ -0,0 +1,137 @@
package v1
import (
"fmt"
"log"
"net/http"
"github.com/wild-cloud/wild-central/daemon/internal/config"
)
// DnsmasqStatus returns the status of the dnsmasq service
func (api *API) DnsmasqStatus(w http.ResponseWriter, r *http.Request) {
status, err := api.dnsmasq.GetStatus()
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get dnsmasq status: %v", err))
return
}
if status.Status != "active" {
respondJSON(w, http.StatusServiceUnavailable, status)
return
}
respondJSON(w, http.StatusOK, status)
}
// DnsmasqGetConfig returns the current dnsmasq configuration
func (api *API) DnsmasqGetConfig(w http.ResponseWriter, r *http.Request) {
configContent, err := api.dnsmasq.ReadConfig()
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to read dnsmasq config: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]interface{}{
"config_file": api.dnsmasq.GetConfigPath(),
"content": configContent,
})
}
// DnsmasqRestart restarts the dnsmasq service
func (api *API) DnsmasqRestart(w http.ResponseWriter, r *http.Request) {
if err := api.dnsmasq.RestartService(); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to restart dnsmasq: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "dnsmasq service restarted successfully",
})
}
// DnsmasqGenerate generates the dnsmasq configuration without applying it (dry-run)
func (api *API) DnsmasqGenerate(w http.ResponseWriter, r *http.Request) {
// Get all instances
instanceNames, err := api.instance.ListInstances()
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to list instances: %v", err))
return
}
// Load global config
globalConfigPath := api.getGlobalConfigPath()
globalCfg, err := config.LoadGlobalConfig(globalConfigPath)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to load global config: %v", err))
return
}
// Load all instance configs
var instanceConfigs []config.InstanceConfig
for _, name := range instanceNames {
instanceConfigPath := api.instance.GetInstanceConfigPath(name)
instanceCfg, err := config.LoadCloudConfig(instanceConfigPath)
if err != nil {
log.Printf("Warning: Could not load instance config for %s: %v", name, err)
continue
}
instanceConfigs = append(instanceConfigs, *instanceCfg)
}
// Generate config without writing or restarting
configContent := api.dnsmasq.Generate(globalCfg, instanceConfigs)
respondJSON(w, http.StatusOK, map[string]interface{}{
"message": "dnsmasq configuration generated (dry-run mode)",
"config": configContent,
})
}
// DnsmasqUpdate regenerates and updates the dnsmasq configuration with all instances
func (api *API) DnsmasqUpdate(w http.ResponseWriter, r *http.Request) {
if err := api.updateDnsmasqForAllInstances(); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to update dnsmasq: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "dnsmasq configuration updated successfully",
})
}
// updateDnsmasqForAllInstances helper regenerates dnsmasq config from all instances
func (api *API) updateDnsmasqForAllInstances() error {
// Get all instances
instanceNames, err := api.instance.ListInstances()
if err != nil {
return fmt.Errorf("listing instances: %w", err)
}
// Load global config
globalConfigPath := api.getGlobalConfigPath()
globalCfg, err := config.LoadGlobalConfig(globalConfigPath)
if err != nil {
return fmt.Errorf("loading global config: %w", err)
}
// Load all instance configs
var instanceConfigs []config.InstanceConfig
for _, name := range instanceNames {
instanceConfigPath := api.instance.GetInstanceConfigPath(name)
instanceCfg, err := config.LoadCloudConfig(instanceConfigPath)
if err != nil {
log.Printf("Warning: Could not load instance config for %s: %v", name, err)
continue
}
instanceConfigs = append(instanceConfigs, *instanceCfg)
}
// Regenerate and write dnsmasq config
return api.dnsmasq.UpdateConfig(globalCfg, instanceConfigs)
}
// getGlobalConfigPath returns the path to the global config file
func (api *API) getGlobalConfigPath() string {
// This should match the structure from data.Paths
return filepath.Join(api.dataDir, "config.yaml")
}
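DnsmasqGenerate and updateDnsmasqForAllInstances duplicate the same warn-and-skip loop for loading instance configs. If that duplication grows, a shared helper would keep the behavior in one place; a hypothetical refactor, not part of this diff:

// loadInstanceConfigs consolidates the config-loading loop shared by
// DnsmasqGenerate and updateDnsmasqForAllInstances; unreadable instance
// configs are logged and skipped, matching the current behavior.
func (api *API) loadInstanceConfigs(names []string) []config.InstanceConfig {
	var cfgs []config.InstanceConfig
	for _, name := range names {
		path := api.instance.GetInstanceConfigPath(name)
		cfg, err := config.LoadCloudConfig(path)
		if err != nil {
			log.Printf("Warning: Could not load instance config for %s: %v", name, err)
			continue
		}
		cfgs = append(cfgs, *cfg)
	}
	return cfgs
}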

View File

@@ -4,6 +4,7 @@ import (
"encoding/json"
"fmt"
"net/http"
"strings"
"github.com/gorilla/mux"
@@ -12,6 +13,7 @@ import (
)
// NodeDiscover initiates node discovery
// Accepts optional subnet parameter. If no subnet provided, auto-detects local networks.
func (api *API) NodeDiscover(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
@@ -22,9 +24,9 @@ func (api *API) NodeDiscover(w http.ResponseWriter, r *http.Request) {
return
}
// Parse request body
// Parse request body - only subnet is supported
var req struct {
IPList []string `json:"ip_list"`
Subnet string `json:"subnet,omitempty"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
@@ -32,21 +34,51 @@ func (api *API) NodeDiscover(w http.ResponseWriter, r *http.Request) {
return
}
if len(req.IPList) == 0 {
respondError(w, http.StatusBadRequest, "ip_list is required")
return
// Build IP list
var ipList []string
var err error
if req.Subnet != "" {
// Expand provided CIDR notation to individual IPs
ipList, err = discovery.ExpandSubnet(req.Subnet)
if err != nil {
respondError(w, http.StatusBadRequest, fmt.Sprintf("Invalid subnet: %v", err))
return
}
} else {
// Auto-detect: Get local networks when no subnet provided
networks, err := discovery.GetLocalNetworks()
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to detect local networks: %v", err))
return
}
if len(networks) == 0 {
respondError(w, http.StatusNotFound, "No local networks found")
return
}
// Expand all detected networks
for _, network := range networks {
ips, err := discovery.ExpandSubnet(network)
if err != nil {
continue // Skip invalid networks
}
ipList = append(ipList, ips...)
}
}
// Start discovery
discoveryMgr := discovery.NewManager(api.dataDir, instanceName)
if err := discoveryMgr.StartDiscovery(instanceName, req.IPList); err != nil {
if err := discoveryMgr.StartDiscovery(instanceName, ipList); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to start discovery: %v", err))
return
}
respondJSON(w, http.StatusAccepted, map[string]string{
"message": "Discovery started",
"status": "running",
respondJSON(w, http.StatusAccepted, map[string]interface{}{
"message": "Discovery started",
"status": "running",
"ips_to_scan": len(ipList),
})
}
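discovery.ExpandSubnet and discovery.GetLocalNetworks are called here but implemented elsewhere in the discovery package. For orientation, a plausible ExpandSubnet looks roughly like the sketch below, built on net.ParseCIDR; this is an assumption about its shape, not the package's actual code:

// expandSubnet expands a CIDR such as "192.168.1.0/24" into its host IPs.
func expandSubnet(cidr string) ([]string, error) {
	ip, ipNet, err := net.ParseCIDR(cidr)
	if err != nil {
		return nil, err
	}
	var ips []string
	for addr := ip.Mask(ipNet.Mask); ipNet.Contains(addr); incIP(addr) {
		ips = append(ips, addr.String())
	}
	if len(ips) > 2 {
		ips = ips[1 : len(ips)-1] // drop network and broadcast addresses
	}
	return ips, nil
}

// incIP increments an IP address in place, carrying across octets.
func incIP(ip net.IP) {
	for i := len(ip) - 1; i >= 0; i-- {
		ip[i]++
		if ip[i] != 0 {
			break
		}
	}
}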
@@ -84,7 +116,7 @@ func (api *API) NodeHardware(w http.ResponseWriter, r *http.Request) {
}
// Detect hardware
nodeMgr := node.NewManager(api.dataDir)
nodeMgr := node.NewManager(api.dataDir, instanceName)
hwInfo, err := nodeMgr.DetectHardware(nodeIP)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to detect hardware: %v", err))
@@ -95,6 +127,7 @@ func (api *API) NodeHardware(w http.ResponseWriter, r *http.Request) {
}
// NodeDetect detects hardware on a single node (POST with IP in body)
// IP address is required.
func (api *API) NodeDetect(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
@@ -115,13 +148,14 @@ func (api *API) NodeDetect(w http.ResponseWriter, r *http.Request) {
return
}
// Validate IP is provided
if req.IP == "" {
respondError(w, http.StatusBadRequest, "ip is required")
respondError(w, http.StatusBadRequest, "IP address is required")
return
}
// Detect hardware
nodeMgr := node.NewManager(api.dataDir)
// Detect hardware for specific IP
nodeMgr := node.NewManager(api.dataDir, instanceName)
hwInfo, err := nodeMgr.DetectHardware(req.IP)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to detect hardware: %v", err))
@@ -150,7 +184,7 @@ func (api *API) NodeAdd(w http.ResponseWriter, r *http.Request) {
}
// Add node
nodeMgr := node.NewManager(api.dataDir)
nodeMgr := node.NewManager(api.dataDir, instanceName)
if err := nodeMgr.Add(instanceName, &nodeData); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to add node: %v", err))
return
@@ -174,7 +208,7 @@ func (api *API) NodeList(w http.ResponseWriter, r *http.Request) {
}
// List nodes
nodeMgr := node.NewManager(api.dataDir)
nodeMgr := node.NewManager(api.dataDir, instanceName)
nodes, err := nodeMgr.List(instanceName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to list nodes: %v", err))
@@ -199,7 +233,7 @@ func (api *API) NodeGet(w http.ResponseWriter, r *http.Request) {
}
// Get node
nodeMgr := node.NewManager(api.dataDir)
nodeMgr := node.NewManager(api.dataDir, instanceName)
nodeData, err := nodeMgr.Get(instanceName, nodeIdentifier)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Node not found: %v", err))
@@ -225,7 +259,7 @@ func (api *API) NodeApply(w http.ResponseWriter, r *http.Request) {
opts := node.ApplyOptions{}
// Apply node configuration
nodeMgr := node.NewManager(api.dataDir)
nodeMgr := node.NewManager(api.dataDir, instanceName)
if err := nodeMgr.Apply(instanceName, nodeIdentifier, opts); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to apply node configuration: %v", err))
return
@@ -257,7 +291,7 @@ func (api *API) NodeUpdate(w http.ResponseWriter, r *http.Request) {
}
// Update node
nodeMgr := node.NewManager(api.dataDir)
nodeMgr := node.NewManager(api.dataDir, instanceName)
if err := nodeMgr.Update(instanceName, nodeIdentifier, updates); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to update node: %v", err))
return
@@ -281,7 +315,7 @@ func (api *API) NodeFetchTemplates(w http.ResponseWriter, r *http.Request) {
}
// Fetch templates
nodeMgr := node.NewManager(api.dataDir)
nodeMgr := node.NewManager(api.dataDir, instanceName)
if err := nodeMgr.FetchTemplates(instanceName); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to fetch templates: %v", err))
return
@@ -293,6 +327,7 @@ func (api *API) NodeFetchTemplates(w http.ResponseWriter, r *http.Request) {
}
// NodeDelete removes a node
// Query parameter: skip_reset=true to force delete without resetting
func (api *API) NodeDelete(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
@@ -304,14 +339,76 @@ func (api *API) NodeDelete(w http.ResponseWriter, r *http.Request) {
return
}
// Delete node
nodeMgr := node.NewManager(api.dataDir)
if err := nodeMgr.Delete(instanceName, nodeIdentifier); err != nil {
// Parse skip_reset query parameter (default: false)
skipReset := r.URL.Query().Get("skip_reset") == "true"
// Delete node (with reset unless skipReset=true)
nodeMgr := node.NewManager(api.dataDir, instanceName)
if err := nodeMgr.Delete(instanceName, nodeIdentifier, skipReset); err != nil {
// Check if it's a reset-related error
errMsg := err.Error()
if !skipReset && (strings.Contains(errMsg, "reset") || strings.Contains(errMsg, "timed out")) {
respondError(w, http.StatusConflict, errMsg)
return
}
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to delete node: %v", err))
return
}
message := "Node deleted successfully"
if !skipReset {
message = "Node reset and removed successfully"
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "Node deleted successfully",
"message": message,
})
}
// NodeDiscoveryCancel cancels an in-progress discovery operation
func (api *API) NodeDiscoveryCancel(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Cancel discovery
discoveryMgr := discovery.NewManager(api.dataDir, instanceName)
if err := discoveryMgr.CancelDiscovery(instanceName); err != nil {
respondError(w, http.StatusBadRequest, fmt.Sprintf("Failed to cancel discovery: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "Discovery cancelled successfully",
})
}
// NodeReset resets a node to maintenance mode
func (api *API) NodeReset(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
nodeIdentifier := vars["node"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Reset node
nodeMgr := node.NewManager(api.dataDir, instanceName)
if err := nodeMgr.Reset(instanceName, nodeIdentifier); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to reset node: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "Node reset successfully - now in maintenance mode",
"node": nodeIdentifier,
})
}
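One note on NodeDelete above: classifying reset failures by substring-matching err.Error() is brittle if the node package ever rewords its messages. A sentinel-error variant would be sturdier; this is a hypothetical alternative that would require cooperating changes in the node package, not a description of the current code:

// Hypothetical sentinel errors exported by the node package.
var (
	ErrResetFailed  = errors.New("node reset failed")
	ErrResetTimeout = errors.New("node reset timed out")
)

// The handler-side classification would then become:
if !skipReset && (errors.Is(err, ErrResetFailed) || errors.Is(err, ErrResetTimeout)) {
	respondError(w, http.StatusConflict, err.Error())
	return
}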

View File

@@ -11,17 +11,18 @@ import (
"github.com/gorilla/mux"
"github.com/wild-cloud/wild-central/daemon/internal/operations"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// OperationGet returns operation status
func (api *API) OperationGet(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
opID := vars["id"]
// Extract instance name from query param or header
instanceName := r.URL.Query().Get("instance")
if instanceName == "" {
respondError(w, http.StatusBadRequest, "instance parameter is required")
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
@@ -63,12 +64,12 @@ func (api *API) OperationList(w http.ResponseWriter, r *http.Request) {
// OperationCancel cancels an operation
func (api *API) OperationCancel(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
opID := vars["id"]
// Extract instance name from query param
instanceName := r.URL.Query().Get("instance")
if instanceName == "" {
respondError(w, http.StatusBadRequest, "instance parameter is required")
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
@@ -88,12 +89,12 @@ func (api *API) OperationCancel(w http.ResponseWriter, r *http.Request) {
// OperationStream streams operation output via Server-Sent Events (SSE)
func (api *API) OperationStream(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
opID := vars["id"]
// Extract instance name from query param
instanceName := r.URL.Query().Get("instance")
if instanceName == "" {
respondError(w, http.StatusBadRequest, "instance parameter is required")
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
@@ -110,7 +111,7 @@ func (api *API) OperationStream(w http.ResponseWriter, r *http.Request) {
}
// Check if operation is already completed
statusFile := filepath.Join(api.dataDir, "instances", instanceName, "operations", opID+".json")
statusFile := filepath.Join(tools.GetInstanceOperationsPath(api.dataDir, instanceName), opID+".json")
isCompleted := false
if data, err := os.ReadFile(statusFile); err == nil {
var op map[string]interface{}
@@ -122,7 +123,7 @@ func (api *API) OperationStream(w http.ResponseWriter, r *http.Request) {
}
// Send existing log file content first (if exists)
logPath := filepath.Join(api.dataDir, "instances", instanceName, "operations", opID, "output.log")
logPath := filepath.Join(tools.GetInstanceOperationsPath(api.dataDir, instanceName), opID, "output.log")
if _, err := os.Stat(logPath); err == nil {
file, err := os.Open(logPath)
if err == nil {
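The two path changes in this file swap hand-assembled strings for tools.GetInstanceOperationsPath. Judging from the literals they replace, that helper is presumably equivalent to the following (the tools package itself is not shown in this diff):

// Inferred shape of the helper, matching the paths it replaces above.
func GetInstanceOperationsPath(dataDir, instanceName string) string {
	return filepath.Join(dataDir, "instances", instanceName, "operations")
}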

View File

@@ -3,42 +3,64 @@ package v1
import (
"encoding/json"
"fmt"
"log"
"net/http"
"github.com/gorilla/mux"
"github.com/wild-cloud/wild-central/daemon/internal/assets"
"github.com/wild-cloud/wild-central/daemon/internal/pxe"
)
// PXEListAssets lists all PXE assets for an instance
// DEPRECATED: This endpoint is deprecated. Use GET /api/v1/assets/{schematicId} instead.
func (api *API) PXEListAssets(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
// Add deprecation warning header
w.Header().Set("X-Deprecated", "This endpoint is deprecated. Use GET /api/v1/assets/{schematicId} instead.")
log.Printf("Warning: Deprecated endpoint /api/v1/instances/%s/pxe/assets called", instanceName)
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// List assets
pxeMgr := pxe.NewManager(api.dataDir)
assets, err := pxeMgr.ListAssets(instanceName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to list assets: %v", err))
// Get schematic ID from instance config
configPath := api.instance.GetInstanceConfigPath(instanceName)
schematicID, err := api.config.GetConfigValue(configPath, "cluster.nodes.talos.schematicId")
if err != nil || schematicID == "" || schematicID == "null" {
// Fall back to old PXE manager if no schematic configured
pxeMgr := pxe.NewManager(api.dataDir)
assets, err := pxeMgr.ListAssets(instanceName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to list assets: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]interface{}{
"assets": assets,
})
return
}
respondJSON(w, http.StatusOK, map[string]interface{}{
"assets": assets,
"assets": []interface{}{},
"message": "Please use the new /api/v1/pxe/assets endpoint with both schematic ID and version",
})
}
// PXEDownloadAsset downloads a PXE asset
// DEPRECATED: This endpoint is deprecated. Use POST /api/v1/assets/{schematicId}/download instead.
func (api *API) PXEDownloadAsset(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
// Add deprecation warning header
w.Header().Set("X-Deprecated", "This endpoint is deprecated. Use POST /api/v1/assets/{schematicId}/download instead.")
log.Printf("Warning: Deprecated endpoint /api/v1/instances/%s/pxe/assets/download called", instanceName)
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
@@ -62,80 +84,141 @@ func (api *API) PXEDownloadAsset(w http.ResponseWriter, r *http.Request) {
return
}
if req.URL == "" {
respondError(w, http.StatusBadRequest, "url is required")
// Get schematic ID from instance config
configPath := api.instance.GetInstanceConfigPath(instanceName)
schematicID, err := api.config.GetConfigValue(configPath, "cluster.nodes.talos.schematicId")
// If no schematic configured or URL provided (old behavior), fall back to old PXE manager
if (err != nil || schematicID == "" || schematicID == "null") || req.URL != "" {
if req.URL == "" {
respondError(w, http.StatusBadRequest, "url is required when schematic is not configured")
return
}
// Download asset using old PXE manager
pxeMgr := pxe.NewManager(api.dataDir)
if err := pxeMgr.DownloadAsset(instanceName, req.AssetType, req.Version, req.URL); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to download asset: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "Asset downloaded successfully",
"asset_type": req.AssetType,
"version": req.Version,
})
return
}
// Download asset
pxeMgr := pxe.NewManager(api.dataDir)
if err := pxeMgr.DownloadAsset(instanceName, req.AssetType, req.Version, req.URL); err != nil {
// Proxy to new asset system
if req.Version == "" {
respondError(w, http.StatusBadRequest, "version is required")
return
}
assetsMgr := assets.NewManager(api.dataDir)
assetTypes := []string{req.AssetType}
platform := "amd64" // Default platform for backward compatibility
if err := assetsMgr.DownloadAssets(schematicID, req.Version, platform, assetTypes); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to download asset: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "Asset downloaded successfully",
"asset_type": req.AssetType,
"version": req.Version,
"message": "Asset downloaded successfully",
"asset_type": req.AssetType,
"version": req.Version,
"schematic_id": schematicID,
})
}
// PXEGetAsset returns information about a specific asset
// DEPRECATED: This endpoint is deprecated. Use GET /api/v1/assets/{schematicId}/pxe/{assetType} instead.
func (api *API) PXEGetAsset(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
assetType := vars["type"]
// Add deprecation warning header
w.Header().Set("X-Deprecated", "This endpoint is deprecated. Use GET /api/v1/assets/{schematicId}/pxe/{assetType} instead.")
log.Printf("Warning: Deprecated endpoint /api/v1/instances/%s/pxe/assets/%s called", instanceName, assetType)
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Get asset path
pxeMgr := pxe.NewManager(api.dataDir)
assetPath, err := pxeMgr.GetAssetPath(instanceName, assetType)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Asset not found: %v", err))
// Get schematic ID from instance config
configPath := api.instance.GetInstanceConfigPath(instanceName)
schematicID, err := api.config.GetConfigValue(configPath, "cluster.nodes.talos.schematicId")
if err != nil || schematicID == "" || schematicID == "null" {
// Fall back to old PXE manager if no schematic configured
pxeMgr := pxe.NewManager(api.dataDir)
assetPath, err := pxeMgr.GetAssetPath(instanceName, assetType)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Asset not found: %v", err))
return
}
// Verify asset
valid, err := pxeMgr.VerifyAsset(instanceName, assetType)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to verify asset: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]interface{}{
"type": assetType,
"path": assetPath,
"valid": valid,
})
return
}
// Verify asset
valid, err := pxeMgr.VerifyAsset(instanceName, assetType)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to verify asset: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]interface{}{
"type": assetType,
"path": assetPath,
"valid": valid,
})
respondError(w, http.StatusBadRequest, "This deprecated endpoint requires version. Please use /api/v1/pxe/assets/{schematicId}/{version}/pxe/{assetType}")
}
// PXEDeleteAsset deletes a PXE asset
// DEPRECATED: This endpoint is deprecated. Use DELETE /api/v1/assets/{schematicId} instead.
func (api *API) PXEDeleteAsset(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
assetType := vars["type"]
// Add deprecation warning header
w.Header().Set("X-Deprecated", "This endpoint is deprecated. Use DELETE /api/v1/assets/{schematicId} instead.")
log.Printf("Warning: Deprecated endpoint DELETE /api/v1/instances/%s/pxe/assets/%s called", instanceName, assetType)
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Delete asset
pxeMgr := pxe.NewManager(api.dataDir)
if err := pxeMgr.DeleteAsset(instanceName, assetType); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to delete asset: %v", err))
// Get schematic ID from instance config
configPath := api.instance.GetInstanceConfigPath(instanceName)
schematicID, err := api.config.GetConfigValue(configPath, "cluster.nodes.talos.schematicId")
if err != nil || schematicID == "" || schematicID == "null" {
// Fall back to old PXE manager if no schematic configured
pxeMgr := pxe.NewManager(api.dataDir)
if err := pxeMgr.DeleteAsset(instanceName, assetType); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to delete asset: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "Asset deleted successfully",
"type": assetType,
})
return
}
// Note: In the new system, individual assets are not deleted, only entire schematics.
// For backward compatibility, return 200 with a message explaining the limitation.
respondJSON(w, http.StatusOK, map[string]string{
"message": "Asset deleted successfully",
"type": assetType,
"message": "Individual asset deletion not supported in schematic mode. Use DELETE /api/v1/assets/{schematicId} to delete all assets.",
"type": assetType,
"schematic_id": schematicID,
})
}
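The four deprecated handlers in this file each repeat the same "is a schematic configured?" probe (GetConfigValue plus empty and "null" checks) before deciding whether to fall back to the old PXE manager. A small helper would keep that decision in one place; a hypothetical refactor:

// instanceSchematicID centralizes the schematic-configured check used by
// PXEListAssets, PXEDownloadAsset, PXEGetAsset, and PXEDeleteAsset.
func (api *API) instanceSchematicID(instanceName string) (string, bool) {
	configPath := api.instance.GetInstanceConfigPath(instanceName)
	id, err := api.config.GetConfigValue(configPath, "cluster.nodes.talos.schematicId")
	if err != nil || id == "" || id == "null" {
		return "", false
	}
	return id, true
}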

View File

@@ -0,0 +1,122 @@
package v1
import (
"encoding/json"
"fmt"
"net/http"
"github.com/gorilla/mux"
"github.com/wild-cloud/wild-central/daemon/internal/assets"
)
// SchematicGetInstanceSchematic returns the schematic configuration for an instance
func (api *API) SchematicGetInstanceSchematic(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
configPath := api.instance.GetInstanceConfigPath(instanceName)
// Get schematic ID from config
schematicID, err := api.config.GetConfigValue(configPath, "cluster.nodes.talos.schematicId")
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get schematic ID: %v", err))
return
}
// Get version from config
version, err := api.config.GetConfigValue(configPath, "cluster.nodes.talos.version")
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get version: %v", err))
return
}
// If schematic is configured, get asset status
var assetStatus interface{}
if schematicID != "" && schematicID != "null" && version != "" && version != "null" {
assetsMgr := assets.NewManager(api.dataDir)
status, err := assetsMgr.GetAssetStatus(schematicID, version)
if err == nil {
assetStatus = status
}
}
respondJSON(w, http.StatusOK, map[string]interface{}{
"schematic_id": schematicID,
"version": version,
"assets": assetStatus,
})
}
// SchematicUpdateInstanceSchematic updates the schematic configuration for an instance
func (api *API) SchematicUpdateInstanceSchematic(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Parse request body
var req struct {
SchematicID string `json:"schematic_id"`
Version string `json:"version"`
Download bool `json:"download,omitempty"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
respondError(w, http.StatusBadRequest, "Invalid request body")
return
}
if req.SchematicID == "" {
respondError(w, http.StatusBadRequest, "schematic_id is required")
return
}
if req.Version == "" {
respondError(w, http.StatusBadRequest, "version is required")
return
}
configPath := api.instance.GetInstanceConfigPath(instanceName)
// Update schematic ID in config
if err := api.config.SetConfigValue(configPath, "cluster.nodes.talos.schematicId", req.SchematicID); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to set schematic ID: %v", err))
return
}
// Update version in config
if err := api.config.SetConfigValue(configPath, "cluster.nodes.talos.version", req.Version); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to set version: %v", err))
return
}
response := map[string]interface{}{
"message": "Schematic configuration updated successfully",
"schematic_id": req.SchematicID,
"version": req.Version,
}
// Optionally download assets
if req.Download {
assetsMgr := assets.NewManager(api.dataDir)
platform := "amd64" // Default platform
if err := assetsMgr.DownloadAssets(req.SchematicID, req.Version, platform, nil); err != nil {
response["download_warning"] = fmt.Sprintf("Failed to download assets: %v", err)
} else {
response["download_status"] = "Assets downloaded successfully"
}
}
respondJSON(w, http.StatusOK, response)
}
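For reference, setting an instance's schematic and triggering an immediate asset download is a single request against SchematicUpdateInstanceSchematic. The host and route below are assumptions, since the router registration is not part of this diff:

// Hypothetical client call; placeholders stand in for real values.
body, _ := json.Marshal(map[string]interface{}{
	"schematic_id": "<your-schematic-id>",
	"version":      "<talos-version>",
	"download":     true,
})
req, _ := http.NewRequest(http.MethodPut,
	"http://localhost:8080/api/v1/instances/demo/schematic", // assumed route
	bytes.NewReader(body))
req.Header.Set("Content-Type", "application/json")
resp, err := http.DefaultClient.Do(req)
if err == nil {
	resp.Body.Close()
}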

View File

@@ -5,14 +5,15 @@ import (
"fmt"
"net/http"
"os"
"path/filepath"
"strings"
"github.com/gorilla/mux"
"gopkg.in/yaml.v3"
"github.com/wild-cloud/wild-central/daemon/internal/contracts"
"github.com/wild-cloud/wild-central/daemon/internal/operations"
"github.com/wild-cloud/wild-central/daemon/internal/services"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// ServicesList lists all base services
@@ -27,7 +28,7 @@ func (api *API) ServicesList(w http.ResponseWriter, r *http.Request) {
}
// List services
servicesMgr := services.NewManager(api.dataDir, filepath.Join(api.directoryPath, "setup", "cluster-services"))
servicesMgr := services.NewManager(api.dataDir)
svcList, err := servicesMgr.List(instanceName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to list services: %v", err))
@@ -52,7 +53,7 @@ func (api *API) ServicesGet(w http.ResponseWriter, r *http.Request) {
}
// Get service
servicesMgr := services.NewManager(api.dataDir, filepath.Join(api.directoryPath, "setup", "cluster-services"))
servicesMgr := services.NewManager(api.dataDir)
service, err := servicesMgr.Get(instanceName, serviceName)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Service not found: %v", err))
@@ -104,20 +105,20 @@ func (api *API) ServicesInstall(w http.ResponseWriter, r *http.Request) {
defer func() {
if r := recover(); r != nil {
fmt.Printf("[ERROR] Service install goroutine panic: %v\n", r)
opsMgr.Update(instanceName, opID, "failed", fmt.Sprintf("Internal error: %v", r), 0)
_ = opsMgr.Update(instanceName, opID, "failed", fmt.Sprintf("Internal error: %v", r), 0)
}
}()
fmt.Printf("[DEBUG] Service install goroutine started: service=%s instance=%s opID=%s\n", req.Name, instanceName, opID)
servicesMgr := services.NewManager(api.dataDir, filepath.Join(api.directoryPath, "setup", "cluster-services"))
opsMgr.UpdateStatus(instanceName, opID, "running")
servicesMgr := services.NewManager(api.dataDir)
_ = opsMgr.UpdateStatus(instanceName, opID, "running")
if err := servicesMgr.Install(instanceName, req.Name, req.Fetch, req.Deploy, opID, api.broadcaster); err != nil {
fmt.Printf("[DEBUG] Service install failed: %v\n", err)
opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
_ = opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
fmt.Printf("[DEBUG] Service install completed successfully\n")
opsMgr.Update(instanceName, opID, "completed", "Service installed", 100)
_ = opsMgr.Update(instanceName, opID, "completed", "Service installed", 100)
}
}()
@@ -159,13 +160,13 @@ func (api *API) ServicesInstallAll(w http.ResponseWriter, r *http.Request) {
// Install in background
go func() {
servicesMgr := services.NewManager(api.dataDir, filepath.Join(api.directoryPath, "setup", "cluster-services"))
opsMgr.UpdateStatus(instanceName, opID, "running")
servicesMgr := services.NewManager(api.dataDir)
_ = opsMgr.UpdateStatus(instanceName, opID, "running")
if err := servicesMgr.InstallAll(instanceName, req.Fetch, req.Deploy, opID, api.broadcaster); err != nil {
opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
_ = opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
opsMgr.Update(instanceName, opID, "completed", "All services installed", 100)
_ = opsMgr.Update(instanceName, opID, "completed", "All services installed", 100)
}
}()
@@ -197,13 +198,13 @@ func (api *API) ServicesDelete(w http.ResponseWriter, r *http.Request) {
// Delete in background
go func() {
servicesMgr := services.NewManager(api.dataDir, filepath.Join(api.directoryPath, "setup", "cluster-services"))
opsMgr.UpdateStatus(instanceName, opID, "running")
servicesMgr := services.NewManager(api.dataDir)
_ = opsMgr.UpdateStatus(instanceName, opID, "running")
if err := servicesMgr.Delete(instanceName, serviceName); err != nil {
opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
_ = opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
opsMgr.Update(instanceName, opID, "completed", "Service deleted", 100)
_ = opsMgr.Update(instanceName, opID, "completed", "Service deleted", 100)
}
}()
@@ -225,11 +226,11 @@ func (api *API) ServicesGetStatus(w http.ResponseWriter, r *http.Request) {
return
}
// Get status
servicesMgr := services.NewManager(api.dataDir, filepath.Join(api.directoryPath, "setup", "cluster-services"))
status, err := servicesMgr.GetStatus(instanceName, serviceName)
// Get detailed status
servicesMgr := services.NewManager(api.dataDir)
status, err := servicesMgr.GetDetailedStatus(instanceName, serviceName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get status: %v", err))
respondError(w, http.StatusNotFound, fmt.Sprintf("Failed to get status: %v", err))
return
}
@@ -241,7 +242,7 @@ func (api *API) ServicesGetManifest(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
serviceName := vars["service"]
servicesMgr := services.NewManager(api.dataDir, filepath.Join(api.directoryPath, "setup", "cluster-services"))
servicesMgr := services.NewManager(api.dataDir)
manifest, err := servicesMgr.GetManifest(serviceName)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Service not found: %v", err))
@@ -256,7 +257,7 @@ func (api *API) ServicesGetConfig(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
serviceName := vars["service"]
servicesMgr := services.NewManager(api.dataDir, filepath.Join(api.directoryPath, "setup", "cluster-services"))
servicesMgr := services.NewManager(api.dataDir)
// Get manifest
manifest, err := servicesMgr.GetManifest(serviceName)
@@ -286,7 +287,7 @@ func (api *API) ServicesGetInstanceConfig(w http.ResponseWriter, r *http.Request
return
}
servicesMgr := services.NewManager(api.dataDir, filepath.Join(api.directoryPath, "setup", "cluster-services"))
servicesMgr := services.NewManager(api.dataDir)
// Get manifest to know which config paths to read
manifest, err := servicesMgr.GetManifest(serviceName)
@@ -296,7 +297,7 @@ func (api *API) ServicesGetInstanceConfig(w http.ResponseWriter, r *http.Request
}
// Load instance config as map for dynamic path extraction
configPath := filepath.Join(api.dataDir, "instances", instanceName, "config.yaml")
configPath := tools.GetInstanceConfigPath(api.dataDir, instanceName)
configData, err := os.ReadFile(configPath)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to read instance config: %v", err))
@@ -364,7 +365,7 @@ func (api *API) ServicesFetch(w http.ResponseWriter, r *http.Request) {
}
// Fetch service files
servicesMgr := services.NewManager(api.dataDir, filepath.Join(api.directoryPath, "setup", "cluster-services"))
servicesMgr := services.NewManager(api.dataDir)
if err := servicesMgr.Fetch(instanceName, serviceName); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to fetch service: %v", err))
return
@@ -388,7 +389,7 @@ func (api *API) ServicesCompile(w http.ResponseWriter, r *http.Request) {
}
// Compile templates
servicesMgr := services.NewManager(api.dataDir, filepath.Join(api.directoryPath, "setup", "cluster-services"))
servicesMgr := services.NewManager(api.dataDir)
if err := servicesMgr.Compile(instanceName, serviceName); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to compile templates: %v", err))
return
@@ -412,7 +413,7 @@ func (api *API) ServicesDeploy(w http.ResponseWriter, r *http.Request) {
}
// Deploy service (without operation tracking for standalone deploy)
servicesMgr := services.NewManager(api.dataDir, filepath.Join(api.directoryPath, "setup", "cluster-services"))
servicesMgr := services.NewManager(api.dataDir)
if err := servicesMgr.Deploy(instanceName, serviceName, "", nil); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to deploy service: %v", err))
return
@@ -422,3 +423,110 @@ func (api *API) ServicesDeploy(w http.ResponseWriter, r *http.Request) {
"message": fmt.Sprintf("Service %s deployed successfully", serviceName),
})
}
// ServicesGetLogs retrieves or streams service logs
func (api *API) ServicesGetLogs(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
serviceName := vars["service"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Parse query parameters
query := r.URL.Query()
logsReq := contracts.ServiceLogsRequest{
Container: query.Get("container"),
Follow: query.Get("follow") == "true",
Previous: query.Get("previous") == "true",
Since: query.Get("since"),
}
// Parse tail parameter (reject non-numeric values)
if tailStr := query.Get("tail"); tailStr != "" {
var tail int
if _, err := fmt.Sscanf(tailStr, "%d", &tail); err != nil {
respondError(w, http.StatusBadRequest, "tail parameter must be an integer")
return
}
logsReq.Tail = tail
}
// Validate parameters
if logsReq.Tail < 0 {
respondError(w, http.StatusBadRequest, "tail parameter must be positive")
return
}
if logsReq.Tail > 5000 {
respondError(w, http.StatusBadRequest, "tail parameter cannot exceed 5000")
return
}
if logsReq.Previous && logsReq.Follow {
respondError(w, http.StatusBadRequest, "previous and follow cannot be used together")
return
}
servicesMgr := services.NewManager(api.dataDir)
// Stream logs with SSE if follow=true
if logsReq.Follow {
// Set SSE headers
w.Header().Set("Content-Type", "text/event-stream")
w.Header().Set("Cache-Control", "no-cache")
w.Header().Set("Connection", "keep-alive")
w.Header().Set("X-Accel-Buffering", "no")
// Stream logs
if err := servicesMgr.StreamLogs(instanceName, serviceName, logsReq, w); err != nil {
// Log error but can't send response (SSE already started)
fmt.Printf("Error streaming logs: %v\n", err)
}
return
}
// Get buffered logs
logsResp, err := servicesMgr.GetLogs(instanceName, serviceName, logsReq)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get logs: %v", err))
return
}
respondJSON(w, http.StatusOK, logsResp)
}
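For reference, a minimal sketch of a client consuming the follow=true stream. The URL path and port are assumptions (route registration is not part of this hunk); only the SSE framing matches the headers set above.

// Sketch: consume the SSE log stream. URL shape is assumed, not from this diff.
package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"strings"
)

func main() {
	url := "http://localhost:5055/api/v1/instances/demo/services/traefik/logs?follow=true&tail=100"
	resp, err := http.Get(url)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// SSE frames arrive as "data: <payload>" lines separated by blank lines.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		if line := scanner.Text(); strings.HasPrefix(line, "data: ") {
			fmt.Println(strings.TrimPrefix(line, "data: "))
		}
	}
}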
// ServicesUpdateConfig updates service configuration
func (api *API) ServicesUpdateConfig(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
serviceName := vars["service"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Parse request body
var update contracts.ServiceConfigUpdate
if err := json.NewDecoder(r.Body).Decode(&update); err != nil {
respondError(w, http.StatusBadRequest, fmt.Sprintf("Invalid request body: %v", err))
return
}
// Validate request
if len(update.Config) == 0 {
respondError(w, http.StatusBadRequest, "config field is required and must not be empty")
return
}
// Update config
servicesMgr := services.NewManager(api.dataDir)
response, err := servicesMgr.UpdateConfig(instanceName, serviceName, update, api.broadcaster)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to update config: %v", err))
return
}
respondJSON(w, http.StatusOK, response)
}

@@ -4,26 +4,12 @@ import (
"encoding/json"
"fmt"
"net/http"
"path/filepath"
"github.com/gorilla/mux"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
"github.com/wild-cloud/wild-central/daemon/internal/utilities"
)
// UtilitiesHealth returns cluster health status (legacy, no instance context)
func (api *API) UtilitiesHealth(w http.ResponseWriter, r *http.Request) {
status, err := utilities.GetClusterHealth("")
if err != nil {
respondError(w, http.StatusInternalServerError, "Failed to get cluster health")
return
}
respondJSON(w, http.StatusOK, map[string]interface{}{
"success": true,
"data": status,
})
}
// InstanceUtilitiesHealth returns cluster health status for a specific instance
func (api *API) InstanceUtilitiesHealth(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
@@ -36,7 +22,7 @@ func (api *API) InstanceUtilitiesHealth(w http.ResponseWriter, r *http.Request)
}
// Get kubeconfig path for this instance
kubeconfigPath := filepath.Join(api.dataDir, "instances", instanceName, "kubeconfig")
kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)
status, err := utilities.GetClusterHealth(kubeconfigPath)
if err != nil {
@@ -50,12 +36,24 @@ func (api *API) InstanceUtilitiesHealth(w http.ResponseWriter, r *http.Request)
})
}
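A sketch of what the tools path helpers plausibly look like: the signatures match the call sites in this diff, and the bodies mirror the inline filepath.Join expressions they replace, but the actual implementation is not shown here.

package tools

import "path/filepath"

// Inferred from the inline joins this refactor replaces; treat as a sketch.
func GetInstancePath(dataDir, instanceName string) string {
	return filepath.Join(dataDir, "instances", instanceName)
}

func GetKubeconfigPath(dataDir, instanceName string) string {
	return filepath.Join(GetInstancePath(dataDir, instanceName), "kubeconfig")
}

func GetInstanceConfigPath(dataDir, instanceName string) string {
	return filepath.Join(GetInstancePath(dataDir, instanceName), "config.yaml")
}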
// UtilitiesDashboardToken returns a Kubernetes dashboard token
// InstanceUtilitiesDashboardToken returns a Kubernetes dashboard token for a specific instance
func (api *API) UtilitiesDashboardToken(w http.ResponseWriter, r *http.Request) {
token, err := utilities.GetDashboardToken()
vars := mux.Vars(r)
instanceName := vars["name"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Get kubeconfig path for the instance
kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)
token, err := utilities.GetDashboardToken(kubeconfigPath)
if err != nil {
// Try fallback method
token, err = utilities.GetDashboardTokenFromSecret()
token, err = utilities.GetDashboardTokenFromSecret(kubeconfigPath)
if err != nil {
respondError(w, http.StatusInternalServerError, "Failed to get dashboard token")
return
@@ -70,7 +68,19 @@ func (api *API) UtilitiesDashboardToken(w http.ResponseWriter, r *http.Request)
// UtilitiesNodeIPs returns IP addresses for all cluster nodes
func (api *API) UtilitiesNodeIPs(w http.ResponseWriter, r *http.Request) {
nodes, err := utilities.GetNodeIPs()
vars := mux.Vars(r)
instanceName := vars["name"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Get kubeconfig path for this instance
kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)
nodes, err := utilities.GetNodeIPs(kubeconfigPath)
if err != nil {
respondError(w, http.StatusInternalServerError, "Failed to get node IPs")
return
@@ -86,7 +96,19 @@ func (api *API) UtilitiesNodeIPs(w http.ResponseWriter, r *http.Request) {
// UtilitiesControlPlaneIP returns the control plane IP
func (api *API) UtilitiesControlPlaneIP(w http.ResponseWriter, r *http.Request) {
ip, err := utilities.GetControlPlaneIP()
vars := mux.Vars(r)
instanceName := vars["name"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Get kubeconfig path for this instance
kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)
ip, err := utilities.GetControlPlaneIP(kubeconfigPath)
if err != nil {
respondError(w, http.StatusInternalServerError, "Failed to get control plane IP")
return
@@ -103,8 +125,15 @@ func (api *API) UtilitiesControlPlaneIP(w http.ResponseWriter, r *http.Request)
// UtilitiesSecretCopy copies a secret between namespaces
func (api *API) UtilitiesSecretCopy(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
secretName := vars["secret"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
var req struct {
SourceNamespace string `json:"source_namespace"`
DestinationNamespace string `json:"destination_namespace"`
@@ -120,7 +149,10 @@ func (api *API) UtilitiesSecretCopy(w http.ResponseWriter, r *http.Request) {
return
}
if err := utilities.CopySecretBetweenNamespaces(secretName, req.SourceNamespace, req.DestinationNamespace); err != nil {
// Get kubeconfig path for this instance
kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)
if err := utilities.CopySecretBetweenNamespaces(kubeconfigPath, secretName, req.SourceNamespace, req.DestinationNamespace); err != nil {
respondError(w, http.StatusInternalServerError, "Failed to copy secret")
return
}
@@ -133,7 +165,19 @@ func (api *API) UtilitiesSecretCopy(w http.ResponseWriter, r *http.Request) {
// UtilitiesVersion returns cluster and Talos versions
func (api *API) UtilitiesVersion(w http.ResponseWriter, r *http.Request) {
k8sVersion, err := utilities.GetClusterVersion()
vars := mux.Vars(r)
instanceName := vars["name"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Get kubeconfig path for this instance
kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)
k8sVersion, err := utilities.GetClusterVersion(kubeconfigPath)
if err != nil {
respondError(w, http.StatusInternalServerError, "Failed to get cluster version")
return

@@ -2,6 +2,7 @@ package apps
import (
"bytes"
"encoding/json"
"fmt"
"os"
"os/exec"
@@ -31,12 +32,15 @@ func NewManager(dataDir, appsDir string) *Manager {
// App represents an application
type App struct {
Name string `json:"name" yaml:"name"`
Description string `json:"description" yaml:"description"`
Version string `json:"version" yaml:"version"`
Category string `json:"category" yaml:"category"`
Dependencies []string `json:"dependencies" yaml:"dependencies"`
Config map[string]string `json:"config,omitempty" yaml:"config,omitempty"`
Name string `json:"name" yaml:"name"`
Description string `json:"description" yaml:"description"`
Version string `json:"version" yaml:"version"`
Category string `json:"category,omitempty" yaml:"category,omitempty"`
Icon string `json:"icon,omitempty" yaml:"icon,omitempty"`
Dependencies []string `json:"dependencies" yaml:"dependencies"`
Config map[string]string `json:"config,omitempty" yaml:"config,omitempty"`
DefaultConfig map[string]interface{} `json:"defaultConfig,omitempty" yaml:"defaultConfig,omitempty"`
RequiredSecrets []string `json:"requiredSecrets,omitempty" yaml:"requiredSecrets,omitempty"`
}
// DeployedApp represents a deployed application instance
@@ -78,12 +82,30 @@ func (m *Manager) ListAvailable() ([]App, error) {
continue
}
var app App
if err := yaml.Unmarshal(data, &app); err != nil {
var manifest AppManifest
if err := yaml.Unmarshal(data, &manifest); err != nil {
continue
}
app.Name = entry.Name() // Use directory name as app name
// Convert manifest to App struct
app := App{
Name: entry.Name(), // Use directory name as app name
Description: manifest.Description,
Version: manifest.Version,
Category: manifest.Category,
Icon: manifest.Icon,
DefaultConfig: manifest.DefaultConfig,
RequiredSecrets: manifest.RequiredSecrets,
}
// Extract dependencies from Requires field
if len(manifest.Requires) > 0 {
app.Dependencies = make([]string, len(manifest.Requires))
for i, dep := range manifest.Requires {
app.Dependencies[i] = dep.Name
}
}
apps = append(apps, app)
}
@@ -103,19 +125,37 @@ func (m *Manager) Get(appName string) (*App, error) {
return nil, fmt.Errorf("failed to read app file: %w", err)
}
var app App
if err := yaml.Unmarshal(data, &app); err != nil {
var manifest AppManifest
if err := yaml.Unmarshal(data, &manifest); err != nil {
return nil, fmt.Errorf("failed to parse app file: %w", err)
}
app.Name = appName
return &app, nil
// Convert manifest to App struct
app := &App{
Name: appName,
Description: manifest.Description,
Version: manifest.Version,
Category: manifest.Category,
Icon: manifest.Icon,
DefaultConfig: manifest.DefaultConfig,
RequiredSecrets: manifest.RequiredSecrets,
}
// Extract dependencies from Requires field
if len(manifest.Requires) > 0 {
app.Dependencies = make([]string, len(manifest.Requires))
for i, dep := range manifest.Requires {
app.Dependencies[i] = dep.Name
}
}
return app, nil
}
// ListDeployed lists deployed apps for an instance
func (m *Manager) ListDeployed(instanceName string) ([]DeployedApp, error) {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
instancePath := filepath.Join(m.dataDir, "instances", instanceName)
instancePath := tools.GetInstancePath(m.dataDir, instanceName)
appsDir := filepath.Join(instancePath, "apps")
apps := []DeployedApp{}
@@ -139,44 +179,104 @@ func (m *Manager) ListDeployed(instanceName string) ([]DeployedApp, error) {
appName := entry.Name()
// Initialize app with basic info
app := DeployedApp{
Name: appName,
Namespace: appName,
Status: "added", // Default status: added but not deployed
}
// Try to get version from manifest
manifestPath := filepath.Join(appsDir, appName, "manifest.yaml")
if storage.FileExists(manifestPath) {
manifestData, _ := os.ReadFile(manifestPath)
var manifest struct {
Version string `yaml:"version"`
}
if yaml.Unmarshal(manifestData, &manifest) == nil {
app.Version = manifest.Version
}
}
// Check if namespace exists in cluster
checkCmd := exec.Command("kubectl", "get", "namespace", appName, "-o", "json")
tools.WithKubeconfig(checkCmd, kubeconfigPath)
output, err := checkCmd.CombinedOutput()
if err != nil {
// Namespace doesn't exist - app not deployed
continue
}
// Parse namespace status
var ns struct {
Status struct {
Phase string `json:"phase"`
} `json:"status"`
}
if err := yaml.Unmarshal(output, &ns); err == nil && ns.Status.Phase == "Active" {
// App is deployed - get more details
app := DeployedApp{
Name: appName,
Namespace: appName,
Status: "deployed",
if err == nil {
// Namespace exists - parse status
var ns struct {
Status struct {
Phase string `json:"phase"`
} `json:"status"`
}
if yaml.Unmarshal(output, &ns) == nil && ns.Status.Phase == "Active" {
// Namespace is active - app is deployed
app.Status = "deployed"
// Try to get version from manifest
manifestPath := filepath.Join(appsDir, appName, "manifest.yaml")
if storage.FileExists(manifestPath) {
manifestData, _ := os.ReadFile(manifestPath)
var manifest struct {
Version string `yaml:"version"`
// Get ingress URL if available
// Try Traefik IngressRoute first
ingressCmd := exec.Command("kubectl", "get", "ingressroute", "-n", appName, "-o", "json")
tools.WithKubeconfig(ingressCmd, kubeconfigPath)
ingressOutput, err := ingressCmd.CombinedOutput()
if err == nil {
var ingressList struct {
Items []struct {
Spec struct {
Routes []struct {
Match string `json:"match"`
} `json:"routes"`
} `json:"spec"`
} `json:"items"`
}
if json.Unmarshal(ingressOutput, &ingressList) == nil && len(ingressList.Items) > 0 {
// Extract host from the first route match (format: Host(`example.com`))
if len(ingressList.Items[0].Spec.Routes) > 0 {
match := ingressList.Items[0].Spec.Routes[0].Match
// Parse Host(`domain.com`) format
if strings.Contains(match, "Host(`") {
start := strings.Index(match, "Host(`") + 6
end := strings.Index(match[start:], "`")
if end > 0 {
host := match[start : start+end]
app.URL = "https://" + host
}
}
}
}
}
if yaml.Unmarshal(manifestData, &manifest) == nil {
app.Version = manifest.Version
// If no IngressRoute, try standard Ingress
if app.URL == "" {
ingressCmd := exec.Command("kubectl", "get", "ingress", "-n", appName, "-o", "json")
tools.WithKubeconfig(ingressCmd, kubeconfigPath)
ingressOutput, err := ingressCmd.CombinedOutput()
if err == nil {
var ingressList struct {
Items []struct {
Spec struct {
Rules []struct {
Host string `json:"host"`
} `json:"rules"`
} `json:"spec"`
} `json:"items"`
}
if json.Unmarshal(ingressOutput, &ingressList) == nil && len(ingressList.Items) > 0 {
if len(ingressList.Items[0].Spec.Rules) > 0 {
host := ingressList.Items[0].Spec.Rules[0].Host
if host != "" {
app.URL = "https://" + host
}
}
}
}
}
}
apps = append(apps, app)
}
apps = append(apps, app)
}
return apps, nil
@@ -190,9 +290,9 @@ func (m *Manager) Add(instanceName, appName string, config map[string]string) er
return fmt.Errorf("app %s not found at %s", appName, manifestPath)
}
instancePath := filepath.Join(m.dataDir, "instances", instanceName)
configFile := filepath.Join(instancePath, "config.yaml")
secretsFile := filepath.Join(instancePath, "secrets.yaml")
instancePath := tools.GetInstancePath(m.dataDir, instanceName)
configFile := tools.GetInstanceConfigPath(m.dataDir, instanceName)
secretsFile := tools.GetInstanceSecretsPath(m.dataDir, instanceName)
appDestDir := filepath.Join(instancePath, "apps", appName)
// Check instance config exists
@@ -306,8 +406,8 @@ func (m *Manager) Add(instanceName, appName string, config map[string]string) er
// Deploy deploys an app to the cluster
func (m *Manager) Deploy(instanceName, appName string) error {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
instancePath := filepath.Join(m.dataDir, "instances", instanceName)
secretsFile := filepath.Join(instancePath, "secrets.yaml")
instancePath := tools.GetInstancePath(m.dataDir, instanceName)
secretsFile := tools.GetInstanceSecretsPath(m.dataDir, instanceName)
// Get compiled app manifests from instance directory
appDir := filepath.Join(instancePath, "apps", appName)
@@ -323,7 +423,7 @@ func (m *Manager) Deploy(instanceName, appName string) error {
applyNsCmd := exec.Command("kubectl", "apply", "-f", "-")
applyNsCmd.Stdin = bytes.NewReader(namespaceYaml)
tools.WithKubeconfig(applyNsCmd, kubeconfigPath)
applyNsCmd.CombinedOutput() // Ignore errors - namespace might already exist
_, _ = applyNsCmd.CombinedOutput() // Ignore errors - namespace might already exist
// Create Kubernetes secrets from secrets.yaml
if storage.FileExists(secretsFile) {
@@ -334,7 +434,7 @@ func (m *Manager) Deploy(instanceName, appName string) error {
// Delete existing secret if it exists (to update it)
deleteCmd := exec.Command("kubectl", "delete", "secret", fmt.Sprintf("%s-secrets", appName), "-n", appName, "--ignore-not-found")
tools.WithKubeconfig(deleteCmd, kubeconfigPath)
deleteCmd.CombinedOutput()
_, _ = deleteCmd.CombinedOutput()
// Create secret from literals
createSecretCmd := exec.Command("kubectl", "create", "secret", "generic", fmt.Sprintf("%s-secrets", appName), "-n", appName)
@@ -369,9 +469,9 @@ func (m *Manager) Deploy(instanceName, appName string) error {
// Delete removes an app from the cluster and configuration
func (m *Manager) Delete(instanceName, appName string) error {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
instancePath := filepath.Join(m.dataDir, "instances", instanceName)
configFile := filepath.Join(instancePath, "config.yaml")
secretsFile := filepath.Join(instancePath, "secrets.yaml")
instancePath := tools.GetInstancePath(m.dataDir, instanceName)
configFile := tools.GetInstanceConfigPath(m.dataDir, instanceName)
secretsFile := tools.GetInstanceSecretsPath(m.dataDir, instanceName)
// Get compiled app manifests from instance directory
appDir := filepath.Join(instancePath, "apps", appName)
@@ -390,7 +490,7 @@ func (m *Manager) Delete(instanceName, appName string) error {
// Wait for namespace deletion to complete (timeout after 60s)
waitCmd := exec.Command("kubectl", "wait", "--for=delete", "namespace", appName, "--timeout=60s")
tools.WithKubeconfig(waitCmd, kubeconfigPath)
waitCmd.CombinedOutput() // Ignore errors - namespace might not exist
_, _ = waitCmd.CombinedOutput() // Ignore errors - namespace might not exist
// Delete local app configuration directory
if err := os.RemoveAll(appDir); err != nil {
@@ -425,7 +525,7 @@ func (m *Manager) Delete(instanceName, appName string) error {
// GetStatus returns the status of a deployed app
func (m *Manager) GetStatus(instanceName, appName string) (*DeployedApp, error) {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
instancePath := filepath.Join(m.dataDir, "instances", instanceName)
instancePath := tools.GetInstancePath(m.dataDir, instanceName)
appDir := filepath.Join(instancePath, "apps", appName)
app := &DeployedApp{
@@ -485,7 +585,7 @@ func (m *Manager) GetStatus(instanceName, appName string) (*DeployedApp, error)
var podList struct {
Items []struct {
Status struct {
Phase string `json:"phase"`
ContainerStatuses []struct {
Ready bool `json:"ready"`
} `json:"containerStatuses"`
@@ -526,3 +626,214 @@ func (m *Manager) GetStatus(instanceName, appName string) (*DeployedApp, error)
return app, nil
}
// GetEnhanced returns enhanced app information with runtime status
func (m *Manager) GetEnhanced(instanceName, appName string) (*EnhancedApp, error) {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
instancePath := tools.GetInstancePath(m.dataDir, instanceName)
configFile := tools.GetInstanceConfigPath(m.dataDir, instanceName)
appDir := filepath.Join(instancePath, "apps", appName)
enhanced := &EnhancedApp{
Name: appName,
Status: "not-added",
Namespace: appName,
}
// Check if app was added to instance
if !storage.FileExists(appDir) {
return enhanced, nil
}
enhanced.Status = "not-deployed"
// Load manifest
manifestPath := filepath.Join(appDir, "manifest.yaml")
if storage.FileExists(manifestPath) {
manifestData, _ := os.ReadFile(manifestPath)
var manifest AppManifest
if yaml.Unmarshal(manifestData, &manifest) == nil {
enhanced.Version = manifest.Version
enhanced.Description = manifest.Description
enhanced.Icon = manifest.Icon
enhanced.Manifest = &manifest
}
}
// Note: README content is now served via dedicated /readme endpoint
// No need to populate readme/documentation fields here
// Load config
yq := tools.NewYQ()
configJSON, err := yq.Get(configFile, fmt.Sprintf(".apps.%s | @json", appName))
if err == nil && configJSON != "" && configJSON != "null" {
var config map[string]string
if json.Unmarshal([]byte(configJSON), &config) == nil {
enhanced.Config = config
}
}
// Check if namespace exists
checkNsCmd := exec.Command("kubectl", "get", "namespace", appName, "-o", "json")
tools.WithKubeconfig(checkNsCmd, kubeconfigPath)
nsOutput, err := checkNsCmd.CombinedOutput()
if err != nil {
// Namespace doesn't exist - not deployed
return enhanced, nil
}
// Parse namespace to check if it's active
var ns struct {
Status struct {
Phase string `json:"phase"`
} `json:"status"`
}
if err := json.Unmarshal(nsOutput, &ns); err != nil || ns.Status.Phase != "Active" {
return enhanced, nil
}
enhanced.Status = "deployed"
// Get URL (ingress)
enhanced.URL = m.getAppURL(kubeconfigPath, appName)
// Get runtime status
runtime, err := m.getRuntimeStatus(kubeconfigPath, appName)
if err == nil {
enhanced.Runtime = runtime
// Update status based on runtime
if len(runtime.Pods) > 0 {
allRunning := true
allReady := true
for _, pod := range runtime.Pods {
if pod.Status != "Running" {
allRunning = false
}
// Check ready ratio
parts := strings.Split(pod.Ready, "/")
if len(parts) == 2 && parts[0] != parts[1] {
allReady = false
}
}
if allRunning && allReady {
enhanced.Status = "running"
} else if allRunning {
enhanced.Status = "starting"
} else {
enhanced.Status = "unhealthy"
}
}
}
return enhanced, nil
}
// GetEnhancedStatus returns just the runtime status for an app
func (m *Manager) GetEnhancedStatus(instanceName, appName string) (*RuntimeStatus, error) {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
// Check if namespace exists
checkNsCmd := exec.Command("kubectl", "get", "namespace", appName, "-o", "json")
tools.WithKubeconfig(checkNsCmd, kubeconfigPath)
if err := checkNsCmd.Run(); err != nil {
return nil, fmt.Errorf("namespace not found or not deployed")
}
return m.getRuntimeStatus(kubeconfigPath, appName)
}
// getRuntimeStatus fetches runtime information from kubernetes
func (m *Manager) getRuntimeStatus(kubeconfigPath, namespace string) (*RuntimeStatus, error) {
kubectl := tools.NewKubectl(kubeconfigPath)
runtime := &RuntimeStatus{}
// Get pods (with detailed info for app status display)
pods, err := kubectl.GetPods(namespace, true)
if err == nil {
runtime.Pods = pods
}
// Get replicas
replicas, err := kubectl.GetReplicas(namespace)
if err == nil && (replicas.Desired > 0 || replicas.Current > 0) {
runtime.Replicas = replicas
}
// Get resources
resources, err := kubectl.GetResources(namespace)
if err == nil {
runtime.Resources = resources
}
// Get recent events (last 10)
events, err := kubectl.GetRecentEvents(namespace, 10)
if err == nil {
runtime.RecentEvents = events
}
return runtime, nil
}
// getAppURL extracts the ingress URL for an app
func (m *Manager) getAppURL(kubeconfigPath, appName string) string {
// Try Traefik IngressRoute first
ingressCmd := exec.Command("kubectl", "get", "ingressroute", "-n", appName, "-o", "json")
tools.WithKubeconfig(ingressCmd, kubeconfigPath)
ingressOutput, err := ingressCmd.CombinedOutput()
if err == nil {
var ingressList struct {
Items []struct {
Spec struct {
Routes []struct {
Match string `json:"match"`
} `json:"routes"`
} `json:"spec"`
} `json:"items"`
}
if json.Unmarshal(ingressOutput, &ingressList) == nil && len(ingressList.Items) > 0 {
if len(ingressList.Items[0].Spec.Routes) > 0 {
match := ingressList.Items[0].Spec.Routes[0].Match
// Parse Host(`domain.com`) format
if strings.Contains(match, "Host(`") {
start := strings.Index(match, "Host(`") + 6
end := strings.Index(match[start:], "`")
if end > 0 {
host := match[start : start+end]
return "https://" + host
}
}
}
}
}
// If no IngressRoute, try standard Ingress
ingressCmd = exec.Command("kubectl", "get", "ingress", "-n", appName, "-o", "json")
tools.WithKubeconfig(ingressCmd, kubeconfigPath)
ingressOutput, err = ingressCmd.CombinedOutput()
if err == nil {
var ingressList struct {
Items []struct {
Spec struct {
Rules []struct {
Host string `json:"host"`
} `json:"rules"`
} `json:"spec"`
} `json:"items"`
}
if json.Unmarshal(ingressOutput, &ingressList) == nil && len(ingressList.Items) > 0 {
if len(ingressList.Items[0].Spec.Rules) > 0 {
host := ingressList.Items[0].Spec.Rules[0].Host
if host != "" {
return "https://" + host
}
}
}
}
return ""
}
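The Host(`…`) parsing above is easy to verify in isolation; a standalone sketch of the same extraction:

package main

import (
	"fmt"
	"strings"
)

// hostFromMatch mirrors the extraction logic in getAppURL.
func hostFromMatch(match string) string {
	if !strings.Contains(match, "Host(`") {
		return ""
	}
	start := strings.Index(match, "Host(`") + len("Host(`")
	end := strings.Index(match[start:], "`")
	if end <= 0 {
		return ""
	}
	return match[start : start+end]
}

func main() {
	fmt.Println(hostFromMatch("Host(`app.example.com`) && PathPrefix(`/api`)")) // app.example.com
}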

internal/apps/models.go

@@ -0,0 +1,55 @@
package apps
import "github.com/wild-cloud/wild-central/daemon/internal/tools"
// AppManifest represents the complete app manifest from manifest.yaml
type AppManifest struct {
Name string `json:"name" yaml:"name"`
Description string `json:"description" yaml:"description"`
Version string `json:"version" yaml:"version"`
Icon string `json:"icon,omitempty" yaml:"icon,omitempty"`
Category string `json:"category,omitempty" yaml:"category,omitempty"`
Requires []AppDependency `json:"requires,omitempty" yaml:"requires,omitempty"`
DefaultConfig map[string]interface{} `json:"defaultConfig,omitempty" yaml:"defaultConfig,omitempty"`
RequiredSecrets []string `json:"requiredSecrets,omitempty" yaml:"requiredSecrets,omitempty"`
}
// AppDependency represents a dependency on another app
type AppDependency struct {
Name string `json:"name" yaml:"name"`
}
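For illustration, a manifest.yaml shape that unmarshals into these types. The field values are invented, and the yaml package version (v3 here) is an assumption; a trimmed local copy of the struct keeps the example self-contained.

package main

import (
	"fmt"

	"gopkg.in/yaml.v3" // assumed; any package with a compatible yaml.Unmarshal works
)

type appDependency struct {
	Name string `yaml:"name"`
}

// Trimmed local copy of AppManifest so the example stands alone.
type appManifest struct {
	Name        string          `yaml:"name"`
	Description string          `yaml:"description"`
	Version     string          `yaml:"version"`
	Requires    []appDependency `yaml:"requires,omitempty"`
}

const sample = `
name: immich
description: Self-hosted photo management
version: "1.0.0"
requires:
  - name: postgres
`

func main() {
	var m appManifest
	if err := yaml.Unmarshal([]byte(sample), &m); err != nil {
		panic(err)
	}
	fmt.Println(m.Name, "depends on", m.Requires[0].Name) // immich depends on postgres
}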
// EnhancedApp extends DeployedApp with runtime status information
type EnhancedApp struct {
Name string `json:"name"`
Status string `json:"status"`
Version string `json:"version"`
Namespace string `json:"namespace"`
URL string `json:"url,omitempty"`
Description string `json:"description,omitempty"`
Icon string `json:"icon,omitempty"`
Manifest *AppManifest `json:"manifest,omitempty"`
Runtime *RuntimeStatus `json:"runtime,omitempty"`
Config map[string]string `json:"config,omitempty"`
Readme string `json:"readme,omitempty"`
Documentation string `json:"documentation,omitempty"`
}
// RuntimeStatus contains runtime information from kubernetes
type RuntimeStatus struct {
Pods []PodInfo `json:"pods,omitempty"`
Replicas *ReplicaInfo `json:"replicas,omitempty"`
Resources *ResourceUsage `json:"resources,omitempty"`
RecentEvents []KubernetesEvent `json:"recentEvents,omitempty"`
}
// Type aliases for kubectl wrapper types
// These types are defined in internal/tools and shared across the codebase
type PodInfo = tools.PodInfo
type ContainerInfo = tools.ContainerInfo
type ContainerState = tools.ContainerState
type PodCondition = tools.PodCondition
type ReplicaInfo = tools.ReplicaInfo
type ResourceUsage = tools.ResourceUsage
type KubernetesEvent = tools.KubernetesEvent
type LogEntry = tools.LogEntry

internal/assets/assets.go

@@ -0,0 +1,448 @@
package assets
import (
"crypto/sha256"
"fmt"
"io"
"net/http"
"os"
"path/filepath"
"strings"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
)
// Manager handles centralized Talos asset management
type Manager struct {
dataDir string
}
// NewManager creates a new asset manager
func NewManager(dataDir string) *Manager {
return &Manager{
dataDir: dataDir,
}
}
// Asset represents a Talos boot asset
type Asset struct {
Type string `json:"type"` // kernel, initramfs, iso
Path string `json:"path"` // Full path to asset file
Size int64 `json:"size"` // File size in bytes
SHA256 string `json:"sha256"` // SHA256 hash
Downloaded bool `json:"downloaded"` // Whether asset exists
}
// PXEAsset represents a schematic@version combination and its assets
type PXEAsset struct {
SchematicID string `json:"schematic_id"`
Version string `json:"version"`
Path string `json:"path"`
Assets []Asset `json:"assets"`
}
// AssetStatus represents download status for a schematic
type AssetStatus struct {
SchematicID string `json:"schematic_id"`
Version string `json:"version"`
Assets map[string]Asset `json:"assets"`
Complete bool `json:"complete"`
}
// GetAssetDir returns the asset directory for a schematic@version composite key
func (m *Manager) GetAssetDir(schematicID, version string) string {
composite := fmt.Sprintf("%s@%s", schematicID, version)
return filepath.Join(m.dataDir, "assets", composite)
}
// GetAssetsRootDir returns the root assets directory
func (m *Manager) GetAssetsRootDir() string {
return filepath.Join(m.dataDir, "assets")
}
// ListAssets returns all available schematic@version combinations
func (m *Manager) ListAssets() ([]PXEAsset, error) {
assetsDir := m.GetAssetsRootDir()
// Ensure assets directory exists
if err := storage.EnsureDir(assetsDir, 0755); err != nil {
return nil, fmt.Errorf("ensuring assets directory: %w", err)
}
entries, err := os.ReadDir(assetsDir)
if err != nil {
return nil, fmt.Errorf("reading assets directory: %w", err)
}
var assets []PXEAsset
for _, entry := range entries {
if entry.IsDir() {
// Parse directory name as schematicID@version
parts := strings.SplitN(entry.Name(), "@", 2)
if len(parts) != 2 {
// Skip invalid directory names (old format or other)
continue
}
schematicID := parts[0]
version := parts[1]
asset, err := m.GetAsset(schematicID, version)
if err != nil {
// Skip invalid assets
continue
}
assets = append(assets, *asset)
}
}
return assets, nil
}
// GetAsset returns details for a specific schematic@version combination
func (m *Manager) GetAsset(schematicID, version string) (*PXEAsset, error) {
if schematicID == "" {
return nil, fmt.Errorf("schematic ID cannot be empty")
}
if version == "" {
return nil, fmt.Errorf("version cannot be empty")
}
assetDir := m.GetAssetDir(schematicID, version)
// Check if asset directory exists
if !storage.FileExists(assetDir) {
return nil, fmt.Errorf("asset %s@%s not found", schematicID, version)
}
// List assets for this schematic@version
assets, err := m.listAssetFiles(schematicID, version)
if err != nil {
return nil, fmt.Errorf("listing assets: %w", err)
}
return &PXEAsset{
SchematicID: schematicID,
Version: version,
Path: assetDir,
Assets: assets,
}, nil
}
// AssetExists checks if a schematic@version exists
func (m *Manager) AssetExists(schematicID, version string) bool {
return storage.FileExists(m.GetAssetDir(schematicID, version))
}
// listAssetFiles lists all asset files for a schematic@version
func (m *Manager) listAssetFiles(schematicID, version string) ([]Asset, error) {
assetDir := m.GetAssetDir(schematicID, version)
var assets []Asset
// Check for PXE assets (kernel and initramfs for both platforms)
pxeDir := filepath.Join(assetDir, "pxe")
pxePatterns := []string{
"kernel-amd64",
"kernel-arm64",
"initramfs-amd64.xz",
"initramfs-arm64.xz",
}
for _, pattern := range pxePatterns {
assetPath := filepath.Join(pxeDir, pattern)
info, err := os.Stat(assetPath)
var assetType string
if strings.HasPrefix(pattern, "kernel-") {
assetType = "kernel"
} else {
assetType = "initramfs"
}
asset := Asset{
Type: assetType,
Path: assetPath,
Downloaded: err == nil,
}
if err == nil && info != nil {
asset.Size = info.Size()
// Calculate SHA256 if file exists
if hash, err := calculateSHA256(assetPath); err == nil {
asset.SHA256 = hash
}
}
assets = append(assets, asset)
}
// Check for ISO assets (glob pattern to find all ISOs)
isoDir := filepath.Join(assetDir, "iso")
isoMatches, err := filepath.Glob(filepath.Join(isoDir, "talos-*.iso"))
if err == nil {
for _, isoPath := range isoMatches {
info, err := os.Stat(isoPath)
asset := Asset{
Type: "iso",
Path: isoPath,
Downloaded: err == nil,
}
if err == nil && info != nil {
asset.Size = info.Size()
// Calculate SHA256 if file exists
if hash, err := calculateSHA256(isoPath); err == nil {
asset.SHA256 = hash
}
}
assets = append(assets, asset)
}
}
return assets, nil
}
// DownloadAssets downloads specified assets for a schematic
func (m *Manager) DownloadAssets(schematicID, version, platform string, assetTypes []string) error {
if schematicID == "" {
return fmt.Errorf("schematic ID cannot be empty")
}
if version == "" {
return fmt.Errorf("version cannot be empty")
}
if platform == "" {
platform = "amd64" // Default to amd64
}
// Validate platform
if platform != "amd64" && platform != "arm64" {
return fmt.Errorf("invalid platform: %s (must be amd64 or arm64)", platform)
}
if len(assetTypes) == 0 {
// Default to all asset types
assetTypes = []string{"kernel", "initramfs", "iso"}
}
assetDir := m.GetAssetDir(schematicID, version)
// Ensure asset directory exists
if err := storage.EnsureDir(assetDir, 0755); err != nil {
return fmt.Errorf("creating asset directory: %w", err)
}
// Download each requested asset
for _, assetType := range assetTypes {
if err := m.downloadAsset(schematicID, assetType, version, platform); err != nil {
return fmt.Errorf("downloading %s: %w", assetType, err)
}
}
return nil
}
// downloadAsset downloads a single asset
func (m *Manager) downloadAsset(schematicID, assetType, version, platform string) error {
assetDir := m.GetAssetDir(schematicID, version)
// Determine subdirectory, filename, and URL based on asset type and platform
var subdir, filename, urlPath string
switch assetType {
case "kernel":
subdir = "pxe"
filename = fmt.Sprintf("kernel-%s", platform)
urlPath = fmt.Sprintf("kernel-%s", platform)
case "initramfs":
subdir = "pxe"
filename = fmt.Sprintf("initramfs-%s.xz", platform)
urlPath = fmt.Sprintf("initramfs-%s.xz", platform)
case "iso":
subdir = "iso"
// Include version in filename for clarity
filename = fmt.Sprintf("talos-%s-metal-%s.iso", version, platform)
urlPath = fmt.Sprintf("metal-%s.iso", platform)
default:
return fmt.Errorf("unknown asset type: %s", assetType)
}
// Create subdirectory structure
assetTypeDir := filepath.Join(assetDir, subdir)
if err := storage.EnsureDir(assetTypeDir, 0755); err != nil {
return fmt.Errorf("creating %s directory: %w", subdir, err)
}
assetPath := filepath.Join(assetTypeDir, filename)
// Skip if asset already exists (idempotency)
if storage.FileExists(assetPath) {
return nil
}
// Construct download URL from Image Factory
url := fmt.Sprintf("https://factory.talos.dev/image/%s/%s/%s", schematicID, version, urlPath)
// Download file
resp, err := http.Get(url)
if err != nil {
return fmt.Errorf("downloading from %s: %w", url, err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
return fmt.Errorf("download failed with status %d from %s", resp.StatusCode, url)
}
// Create temporary file
tmpFile := assetPath + ".tmp"
out, err := os.Create(tmpFile)
if err != nil {
return fmt.Errorf("creating temporary file: %w", err)
}
defer out.Close()
// Copy data
_, err = io.Copy(out, resp.Body)
if err != nil {
os.Remove(tmpFile)
return fmt.Errorf("writing file: %w", err)
}
// Close file before rename
out.Close()
// Move to final location
if err := os.Rename(tmpFile, assetPath); err != nil {
os.Remove(tmpFile)
return fmt.Errorf("moving file to final location: %w", err)
}
return nil
}
// GetAssetStatus returns the download status for a schematic@version
func (m *Manager) GetAssetStatus(schematicID, version string) (*AssetStatus, error) {
if schematicID == "" {
return nil, fmt.Errorf("schematic ID cannot be empty")
}
if version == "" {
return nil, fmt.Errorf("version cannot be empty")
}
assetDir := m.GetAssetDir(schematicID, version)
// Check if asset directory exists
if !storage.FileExists(assetDir) {
return nil, fmt.Errorf("asset %s@%s not found", schematicID, version)
}
// List assets
assets, err := m.listAssetFiles(schematicID, version)
if err != nil {
return nil, fmt.Errorf("listing assets: %w", err)
}
// Build asset map and check completion
assetMap := make(map[string]Asset)
complete := true
for _, asset := range assets {
assetMap[asset.Type] = asset
if !asset.Downloaded {
complete = false
}
}
return &AssetStatus{
SchematicID: schematicID,
Version: version,
Assets: assetMap,
Complete: complete,
}, nil
}
// GetAssetPath returns the path to a specific asset file
func (m *Manager) GetAssetPath(schematicID, version, assetType string) (string, error) {
if schematicID == "" {
return "", fmt.Errorf("schematic ID cannot be empty")
}
if version == "" {
return "", fmt.Errorf("version cannot be empty")
}
assetDir := m.GetAssetDir(schematicID, version)
var subdir, pattern string
switch assetType {
case "kernel":
subdir = "pxe"
pattern = "kernel-amd64"
case "initramfs":
subdir = "pxe"
pattern = "initramfs-amd64.xz"
case "iso":
subdir = "iso"
pattern = "talos-*.iso" // Glob pattern for version and platform-specific filename
default:
return "", fmt.Errorf("unknown asset type: %s", assetType)
}
assetTypeDir := filepath.Join(assetDir, subdir)
// Find matching file (supports glob pattern for ISO)
var assetPath string
if strings.Contains(pattern, "*") {
matches, err := filepath.Glob(filepath.Join(assetTypeDir, pattern))
if err != nil {
return "", fmt.Errorf("searching for asset: %w", err)
}
if len(matches) == 0 {
return "", fmt.Errorf("asset %s not found for schematic %s", assetType, schematicID)
}
assetPath = matches[0] // Use first match
} else {
assetPath = filepath.Join(assetTypeDir, pattern)
}
if !storage.FileExists(assetPath) {
return "", fmt.Errorf("asset %s not found for schematic %s", assetType, schematicID)
}
return assetPath, nil
}
// DeleteAsset removes a schematic@version and all its assets
func (m *Manager) DeleteAsset(schematicID, version string) error {
if schematicID == "" {
return fmt.Errorf("schematic ID cannot be empty")
}
if version == "" {
return fmt.Errorf("version cannot be empty")
}
assetDir := m.GetAssetDir(schematicID, version)
if !storage.FileExists(assetDir) {
return nil // Already deleted, idempotent
}
return os.RemoveAll(assetDir)
}
// calculateSHA256 computes the SHA256 hash of a file
func calculateSHA256(filePath string) (string, error) {
file, err := os.Open(filePath)
if err != nil {
return "", err
}
defer file.Close()
hash := sha256.New()
if _, err := io.Copy(hash, file); err != nil {
return "", err
}
return fmt.Sprintf("%x", hash.Sum(nil)), nil
}
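A usage sketch tying the pieces together; the data directory and schematic ID are placeholders.

package main

import (
	"fmt"
	"log"

	"github.com/wild-cloud/wild-central/daemon/internal/assets"
)

func main() {
	mgr := assets.NewManager("/var/lib/wild-central") // placeholder data dir
	schematic := "example-schematic-id"               // placeholder; normally an Image Factory hash
	// nil asset types defaults to kernel, initramfs, and iso.
	if err := mgr.DownloadAssets(schematic, "v1.8.0", "amd64", nil); err != nil {
		log.Fatal(err)
	}
	status, err := mgr.GetAssetStatus(schematic, "v1.8.0")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("all assets downloaded:", status.Complete)
}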

@@ -28,10 +28,10 @@ type BackupInfo struct {
// RestoreOptions configures restore behavior
type RestoreOptions struct {
DBOnly bool `json:"db_only"`
PVCOnly bool `json:"pvc_only"`
SkipGlobals bool `json:"skip_globals"`
SnapshotID string `json:"snapshot_id,omitempty"`
}
// Manager handles backup and restore operations
@@ -46,7 +46,7 @@ func NewManager(dataDir string) *Manager {
// GetBackupDir returns the backup directory for an instance
func (m *Manager) GetBackupDir(instanceName string) string {
return filepath.Join(m.dataDir, "instances", instanceName, "backups")
return tools.GetInstanceBackupsPath(m.dataDir, instanceName)
}
// GetStagingDir returns the staging directory for backups

@@ -1,6 +1,7 @@
package cluster
import (
"context"
"encoding/json"
"fmt"
"log"
@@ -10,6 +11,7 @@ import (
"strings"
"time"
"github.com/wild-cloud/wild-central/daemon/internal/operations"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
@@ -18,13 +20,15 @@ import (
type Manager struct {
dataDir string
talosctl *tools.Talosctl
opsMgr *operations.Manager
}
// NewManager creates a new cluster manager
func NewManager(dataDir string) *Manager {
func NewManager(dataDir string, opsMgr *operations.Manager) *Manager {
return &Manager{
dataDir: dataDir,
talosctl: tools.NewTalosctl(),
opsMgr: opsMgr,
}
}
@@ -35,20 +39,29 @@ type ClusterConfig struct {
Version string `json:"version"`
}
// NodeStatus represents the health status of a single node
type NodeStatus struct {
Hostname string `json:"hostname"`
Ready bool `json:"ready"`
KubernetesReady bool `json:"kubernetes_ready"`
Role string `json:"role"` // "control-plane" or "worker"
}
// ClusterStatus represents cluster health and status
type ClusterStatus struct {
Status string `json:"status"` // ready, pending, error
Nodes int `json:"nodes"`
ControlPlaneNodes int `json:"control_plane_nodes"`
WorkerNodes int `json:"worker_nodes"`
KubernetesVersion string `json:"kubernetes_version"`
TalosVersion string `json:"talos_version"`
Services map[string]string `json:"services"`
Status string `json:"status"` // ready, pending, error
Nodes int `json:"nodes"`
ControlPlaneNodes int `json:"control_plane_nodes"`
WorkerNodes int `json:"worker_nodes"`
KubernetesVersion string `json:"kubernetes_version"`
TalosVersion string `json:"talos_version"`
Services map[string]string `json:"services"`
NodeStatuses map[string]NodeStatus `json:"node_statuses,omitempty"`
}
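For reference, the JSON shape these structs serialize to, as a testable example that would live in a _test.go in package cluster; all values are invented.

package cluster

import (
	"encoding/json"
	"fmt"
)

// ExampleClusterStatus prints the wire shape of the status payload.
func ExampleClusterStatus() {
	s := ClusterStatus{
		Status:            "ready",
		Nodes:             3,
		ControlPlaneNodes: 1,
		WorkerNodes:       2,
		KubernetesVersion: "v1.31.0",
		TalosVersion:      "v1.8.0",
		NodeStatuses: map[string]NodeStatus{
			"cp-1": {Hostname: "cp-1", Ready: true, KubernetesReady: true, Role: "control-plane"},
		},
	}
	b, _ := json.MarshalIndent(s, "", "  ")
	fmt.Println(string(b))
}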
// GetTalosDir returns the talos directory for an instance
func (m *Manager) GetTalosDir(instanceName string) string {
return filepath.Join(m.dataDir, "instances", instanceName, "talos")
return tools.GetInstanceTalosPath(m.dataDir, instanceName)
}
// GetGeneratedDir returns the generated config directory
@@ -96,12 +109,28 @@ func (m *Manager) GenerateConfig(instanceName string, config *ClusterConfig) err
return nil
}
// Bootstrap bootstraps the cluster on the specified node
func (m *Manager) Bootstrap(instanceName, nodeName string) error {
// Get node configuration to find the target IP
instancePath := filepath.Join(m.dataDir, "instances", instanceName)
configPath := filepath.Join(instancePath, "config.yaml")
// Bootstrap bootstraps the cluster on the specified node with progress tracking
func (m *Manager) Bootstrap(instanceName, nodeName string) (string, error) {
// Create operation for tracking
opID, err := m.opsMgr.Start(instanceName, "bootstrap", nodeName)
if err != nil {
return "", fmt.Errorf("failed to start bootstrap operation: %w", err)
}
// Run bootstrap asynchronously
go func() {
if err := m.runBootstrapWithTracking(instanceName, nodeName, opID); err != nil {
_ = m.opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
}
}()
return opID, nil
}
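A sketch of the caller side of this asynchronous contract. operations.NewManager's signature is assumed, and how the operation is later read back (an HTTP status endpoint or a Get method) is not shown in this diff.

package main

import (
	"fmt"
	"log"

	"github.com/wild-cloud/wild-central/daemon/internal/cluster"
	"github.com/wild-cloud/wild-central/daemon/internal/operations"
)

func main() {
	opsMgr := operations.NewManager("/var/lib/wild-central") // signature assumed
	mgr := cluster.NewManager("/var/lib/wild-central", opsMgr)
	opID, err := mgr.Bootstrap("demo", "node-1")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("bootstrap started as operation", opID)
	// Poll the operations API with opID until it reports "completed" or
	// "failed"; the read side of that API is not part of this diff.
}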
// runBootstrapWithTracking runs the bootstrap process with detailed progress tracking
func (m *Manager) runBootstrapWithTracking(instanceName, nodeName, opID string) error {
ctx := context.Background()
configPath := tools.GetInstanceConfigPath(m.dataDir, instanceName)
yq := tools.NewYQ()
// Get node's target IP
@@ -115,17 +144,71 @@ func (m *Manager) Bootstrap(instanceName, nodeName string) error {
return fmt.Errorf("node %s does not have a target IP configured", nodeName)
}
// Get talosconfig path for this instance
// Get VIP
vipRaw, err := yq.Get(configPath, ".cluster.nodes.control.vip")
if err != nil {
return fmt.Errorf("failed to get VIP: %w", err)
}
vip := tools.CleanYQOutput(vipRaw)
if vip == "" || vip == "null" {
return fmt.Errorf("control plane VIP not configured")
}
// Step 0: Run talosctl bootstrap
if err := m.runBootstrapCommand(instanceName, nodeIP, opID); err != nil {
return err
}
// Step 1: Wait for etcd health
if err := m.waitForEtcd(ctx, instanceName, nodeIP, opID); err != nil {
return err
}
// Step 2: Wait for VIP assignment
if err := m.waitForVIP(ctx, instanceName, nodeIP, vip, opID); err != nil {
return err
}
// Step 3: Wait for control plane components
if err := m.waitForControlPlane(ctx, instanceName, nodeIP, opID); err != nil {
return err
}
// Step 4: Wait for API server on VIP
if err := m.waitForAPIServer(ctx, instanceName, vip, opID); err != nil {
return err
}
// Step 5: Configure cluster access
if err := m.configureClusterAccess(instanceName, vip, opID); err != nil {
return err
}
// Step 6: Verify node registration
if err := m.waitForNodeRegistration(ctx, instanceName, opID); err != nil {
return err
}
// Mark as completed
_ = m.opsMgr.Update(instanceName, opID, "completed", "Bootstrap completed successfully", 100)
return nil
}
// runBootstrapCommand executes the initial bootstrap command
func (m *Manager) runBootstrapCommand(instanceName, nodeIP, opID string) error {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 0, "bootstrap", 1, 1, "Running talosctl bootstrap command")
talosconfigPath := tools.GetTalosconfigPath(m.dataDir, instanceName)
// Set talosctl endpoint (with proper context via TALOSCONFIG env var)
// Set talosctl endpoint
cmdEndpoint := exec.Command("talosctl", "config", "endpoint", nodeIP)
tools.WithTalosconfig(cmdEndpoint, talosconfigPath)
if output, err := cmdEndpoint.CombinedOutput(); err != nil {
return fmt.Errorf("failed to set talosctl endpoint: %w\nOutput: %s", err, string(output))
}
// Bootstrap command (with proper context via TALOSCONFIG env var)
// Bootstrap command
cmd := exec.Command("talosctl", "bootstrap", "--nodes", nodeIP)
tools.WithTalosconfig(cmd, talosconfigPath)
output, err := cmd.CombinedOutput()
@@ -133,16 +216,152 @@ func (m *Manager) Bootstrap(instanceName, nodeName string) error {
return fmt.Errorf("failed to bootstrap cluster: %w\nOutput: %s", err, string(output))
}
// Retrieve kubeconfig after bootstrap (best-effort with retry)
log.Printf("Waiting for Kubernetes API server to become ready...")
if err := m.retrieveKubeconfigFromCluster(instanceName, nodeIP, 5*time.Minute); err != nil {
log.Printf("Warning: %v", err)
log.Printf("You can retrieve it manually later using: wild cluster kubeconfig --generate")
return nil
}
// waitForEtcd waits for etcd to become healthy
func (m *Manager) waitForEtcd(ctx context.Context, instanceName, nodeIP, opID string) error {
maxAttempts := 30
talosconfigPath := tools.GetTalosconfigPath(m.dataDir, instanceName)
for attempt := 1; attempt <= maxAttempts; attempt++ {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 1, "etcd", attempt, maxAttempts, "Waiting for etcd to become healthy")
cmd := exec.Command("talosctl", "-n", nodeIP, "etcd", "status")
tools.WithTalosconfig(cmd, talosconfigPath)
output, err := cmd.CombinedOutput()
if err == nil && strings.Contains(string(output), nodeIP) {
return nil
}
if attempt < maxAttempts {
time.Sleep(10 * time.Second)
}
}
return fmt.Errorf("etcd did not become healthy after %d attempts", maxAttempts)
}
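waitForVIP, waitForControlPlane, waitForAPIServer, and waitForNodeRegistration below all repeat this attempt/report/sleep shape; one possible refactor (not in this diff) is a shared poller:

package cluster

import (
	"fmt"
	"time"
)

// pollUntil is a sketch of a shared retry loop: it reports each attempt,
// runs check, and sleeps between attempts. Not part of this diff.
func pollUntil(maxAttempts int, interval time.Duration, report func(attempt int), check func() bool) error {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		report(attempt)
		if check() {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(interval)
		}
	}
	return fmt.Errorf("condition not met after %d attempts", maxAttempts)
}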
// waitForVIP waits for VIP to be assigned to the node
func (m *Manager) waitForVIP(ctx context.Context, instanceName, nodeIP, vip, opID string) error {
maxAttempts := 90
talosconfigPath := tools.GetTalosconfigPath(m.dataDir, instanceName)
for attempt := 1; attempt <= maxAttempts; attempt++ {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 2, "vip", attempt, maxAttempts, "Waiting for VIP assignment")
cmd := exec.Command("talosctl", "-n", nodeIP, "get", "addresses")
tools.WithTalosconfig(cmd, talosconfigPath)
output, err := cmd.CombinedOutput()
if err == nil && strings.Contains(string(output), vip+"/32") {
return nil
}
if attempt < maxAttempts {
time.Sleep(10 * time.Second)
}
}
return fmt.Errorf("VIP was not assigned after %d attempts", maxAttempts)
}
// waitForControlPlane waits for control plane components to start
func (m *Manager) waitForControlPlane(ctx context.Context, instanceName, nodeIP, opID string) error {
maxAttempts := 60
talosconfigPath := tools.GetTalosconfigPath(m.dataDir, instanceName)
for attempt := 1; attempt <= maxAttempts; attempt++ {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 3, "controlplane", attempt, maxAttempts, "Waiting for control plane components")
cmd := exec.Command("talosctl", "-n", nodeIP, "containers", "-k")
tools.WithTalosconfig(cmd, talosconfigPath)
output, err := cmd.CombinedOutput()
if err == nil && strings.Contains(string(output), "kube-") {
return nil
}
if attempt < maxAttempts {
time.Sleep(10 * time.Second)
}
}
return fmt.Errorf("control plane components did not start after %d attempts", maxAttempts)
}
// waitForAPIServer waits for Kubernetes API server to respond
func (m *Manager) waitForAPIServer(ctx context.Context, instanceName, vip, opID string) error {
maxAttempts := 60
apiURL := fmt.Sprintf("https://%s:6443/healthz", vip)
for attempt := 1; attempt <= maxAttempts; attempt++ {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 4, "apiserver", attempt, maxAttempts, "Waiting for Kubernetes API server")
cmd := exec.Command("curl", "-k", "-s", "--max-time", "5", apiURL)
output, err := cmd.CombinedOutput()
if err == nil && strings.Contains(string(output), "ok") {
return nil
}
if attempt < maxAttempts {
time.Sleep(10 * time.Second)
}
}
return fmt.Errorf("API server did not respond after %d attempts", maxAttempts)
}
// configureClusterAccess configures talosctl and kubectl to use the VIP
func (m *Manager) configureClusterAccess(instanceName, vip, opID string) error {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 5, "configure", 1, 1, "Configuring cluster access")
talosconfigPath := tools.GetTalosconfigPath(m.dataDir, instanceName)
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
// Set talosctl endpoint to VIP
cmdEndpoint := exec.Command("talosctl", "config", "endpoint", vip)
tools.WithTalosconfig(cmdEndpoint, talosconfigPath)
if output, err := cmdEndpoint.CombinedOutput(); err != nil {
return fmt.Errorf("failed to set talosctl endpoint: %w\nOutput: %s", err, string(output))
}
// Retrieve kubeconfig
cmdKubeconfig := exec.Command("talosctl", "kubeconfig", "--nodes", vip, kubeconfigPath)
tools.WithTalosconfig(cmdKubeconfig, talosconfigPath)
if output, err := cmdKubeconfig.CombinedOutput(); err != nil {
return fmt.Errorf("failed to retrieve kubeconfig: %w\nOutput: %s", err, string(output))
}
return nil
}
// waitForNodeRegistration waits for the node to register with Kubernetes
func (m *Manager) waitForNodeRegistration(ctx context.Context, instanceName, opID string) error {
maxAttempts := 10
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
for attempt := 1; attempt <= maxAttempts; attempt++ {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 6, "nodes", attempt, maxAttempts, "Waiting for node registration")
cmd := exec.Command("kubectl", "get", "nodes")
tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.CombinedOutput()
if err == nil && strings.Contains(string(output), "Ready") {
return nil
}
if attempt < maxAttempts {
time.Sleep(10 * time.Second)
}
}
return fmt.Errorf("node did not register after %d attempts", maxAttempts)
}
// retrieveKubeconfigFromCluster retrieves kubeconfig from the cluster with retry logic
func (m *Manager) retrieveKubeconfigFromCluster(instanceName, nodeIP string, timeout time.Duration) error {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
@@ -183,8 +402,7 @@ func (m *Manager) retrieveKubeconfigFromCluster(instanceName, nodeIP string, tim
// RegenerateKubeconfig regenerates the kubeconfig by retrieving it from the cluster
func (m *Manager) RegenerateKubeconfig(instanceName string) error {
instancePath := filepath.Join(m.dataDir, "instances", instanceName)
configPath := filepath.Join(instancePath, "config.yaml")
configPath := tools.GetInstanceConfigPath(m.dataDir, instanceName)
yq := tools.NewYQ()
@@ -206,8 +424,7 @@ func (m *Manager) RegenerateKubeconfig(instanceName string) error {
// ConfigureEndpoints updates talosconfig to use VIP and retrieves kubeconfig
func (m *Manager) ConfigureEndpoints(instanceName string, includeNodes bool) error {
instancePath := filepath.Join(m.dataDir, "instances", instanceName)
configPath := filepath.Join(instancePath, "config.yaml")
configPath := tools.GetInstanceConfigPath(m.dataDir, instanceName)
talosconfigPath := tools.GetTalosconfigPath(m.dataDir, instanceName)
yq := tools.NewYQ()
@@ -276,7 +493,8 @@ func (m *Manager) GetStatus(instanceName string) (*ClusterStatus, error) {
}
// Get node count and types using kubectl
cmd := exec.Command("kubectl", "--kubeconfig", kubeconfigPath, "get", "nodes", "-o", "json")
cmd := exec.Command("kubectl", "get", "nodes", "-o", "json")
tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.Output()
if err != nil {
status.Status = "unreachable"
@@ -286,6 +504,7 @@ func (m *Manager) GetStatus(instanceName string) (*ClusterStatus, error) {
var nodesResult struct {
Items []struct {
Metadata struct {
Name string `json:"name"`
Labels map[string]string `json:"labels"`
} `json:"metadata"`
Status struct {
@@ -330,20 +549,38 @@ func (m *Manager) GetStatus(instanceName string) (*ClusterStatus, error) {
}
}
// Count control plane and worker nodes
// Count control plane and worker nodes, and populate per-node status
status.NodeStatuses = make(map[string]NodeStatus)
for _, node := range nodesResult.Items {
hostname := node.Metadata.Name // K8s node name is hostname
role := "worker"
if _, isControl := node.Metadata.Labels["node-role.kubernetes.io/control-plane"]; isControl {
role = "control-plane"
status.ControlPlaneNodes++
} else {
status.WorkerNodes++
}
// Check if node is ready
nodeReady := false
for _, cond := range node.Status.Conditions {
if cond.Type == "Ready" && cond.Status != "True" {
status.Status = "degraded"
if cond.Type == "Ready" {
nodeReady = (cond.Status == "True")
if !nodeReady {
status.Status = "degraded"
}
break
}
}
status.NodeStatuses[hostname] = NodeStatus{
Hostname: hostname,
Ready: true, // listed by the K8s API, so the node is reachable
KubernetesReady: nodeReady,
Role: role,
}
}
// Check basic service status
@@ -359,9 +596,9 @@ func (m *Manager) GetStatus(instanceName string) (*ClusterStatus, error) {
}
for _, svc := range services {
cmd := exec.Command("kubectl", "--kubeconfig", kubeconfigPath,
"get", "pods", "-n", svc.namespace, "-l", svc.selector,
cmd := exec.Command("kubectl", "get", "pods", "-n", svc.namespace, "-l", svc.selector,
"-o", "jsonpath={.items[*].status.phase}")
tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.Output()
if err != nil || len(output) == 0 {
status.Services[svc.name] = "not_found"

@@ -101,17 +101,31 @@ type NodeConfig struct {
}
type InstanceConfig struct {
BaseDomain string `yaml:"baseDomain" json:"baseDomain"`
Domain string `yaml:"domain" json:"domain"`
InternalDomain string `yaml:"internalDomain" json:"internalDomain"`
Backup struct {
Root string `yaml:"root" json:"root"`
} `yaml:"backup" json:"backup"`
DHCPRange string `yaml:"dhcpRange" json:"dhcpRange"`
NFS struct {
Host string `yaml:"host" json:"host"`
MediaPath string `yaml:"mediaPath" json:"mediaPath"`
} `yaml:"nfs" json:"nfs"`
Cloud struct {
Router struct {
IP string `yaml:"ip" json:"ip"`
} `yaml:"router" json:"router"`
DNS struct {
IP string `yaml:"ip" json:"ip"`
ExternalResolver string `yaml:"externalResolver" json:"externalResolver"`
} `yaml:"dns" json:"dns"`
DHCPRange string `yaml:"dhcpRange" json:"dhcpRange"`
Dnsmasq struct {
Interface string `yaml:"interface" json:"interface"`
} `yaml:"dnsmasq" json:"dnsmasq"`
BaseDomain string `yaml:"baseDomain" json:"baseDomain"`
Domain string `yaml:"domain" json:"domain"`
InternalDomain string `yaml:"internalDomain" json:"internalDomain"`
NFS struct {
MediaPath string `yaml:"mediaPath" json:"mediaPath"`
Host string `yaml:"host" json:"host"`
StorageCapacity string `yaml:"storageCapacity" json:"storageCapacity"`
} `yaml:"nfs" json:"nfs"`
DockerRegistryHost string `yaml:"dockerRegistryHost" json:"dockerRegistryHost"`
Backup struct {
Root string `yaml:"root" json:"root"`
} `yaml:"backup" json:"backup"`
} `yaml:"cloud" json:"cloud"`
Cluster struct {
Name string `yaml:"name" json:"name"`
LoadBalancerIp string `yaml:"loadBalancerIp" json:"loadBalancerIp"`

@@ -0,0 +1,714 @@
package config
import (
"os"
"path/filepath"
"strings"
"testing"
)
// Test: LoadGlobalConfig loads valid configuration
func TestLoadGlobalConfig(t *testing.T) {
tests := []struct {
name string
configYAML string
verify func(t *testing.T, config *GlobalConfig)
wantErr bool
}{
{
name: "loads complete configuration",
configYAML: `wildcloud:
repository: "https://github.com/example/repo"
currentPhase: "setup"
completedPhases:
- "phase1"
- "phase2"
server:
port: 8080
host: "localhost"
operator:
email: "admin@example.com"
cloud:
dns:
ip: "192.168.1.1"
externalResolver: "8.8.8.8"
router:
ip: "192.168.1.254"
dynamicDns: "example.dyndns.org"
dnsmasq:
interface: "eth0"
cluster:
endpointIp: "192.168.1.100"
nodes:
talos:
version: "v1.8.0"
`,
verify: func(t *testing.T, config *GlobalConfig) {
if config.Wildcloud.Repository != "https://github.com/example/repo" {
t.Error("repository not loaded correctly")
}
if config.Server.Port != 8080 {
t.Error("port not loaded correctly")
}
if config.Cloud.DNS.IP != "192.168.1.1" {
t.Error("DNS IP not loaded correctly")
}
if config.Cluster.EndpointIP != "192.168.1.100" {
t.Error("endpoint IP not loaded correctly")
}
},
wantErr: false,
},
{
name: "applies default values",
configYAML: `cloud:
dns:
ip: "192.168.1.1"
cluster:
nodes:
talos:
version: "v1.8.0"
`,
verify: func(t *testing.T, config *GlobalConfig) {
if config.Server.Port != 5055 {
t.Errorf("default port not applied, got %d, want 5055", config.Server.Port)
}
if config.Server.Host != "0.0.0.0" {
t.Errorf("default host not applied, got %q, want %q", config.Server.Host, "0.0.0.0")
}
},
wantErr: false,
},
{
name: "preserves custom port and host",
configYAML: `server:
port: 9000
host: "127.0.0.1"
cloud:
dns:
ip: "192.168.1.1"
cluster:
nodes:
talos:
version: "v1.8.0"
`,
verify: func(t *testing.T, config *GlobalConfig) {
if config.Server.Port != 9000 {
t.Errorf("custom port not preserved, got %d, want 9000", config.Server.Port)
}
if config.Server.Host != "127.0.0.1" {
t.Errorf("custom host not preserved, got %q, want %q", config.Server.Host, "127.0.0.1")
}
},
wantErr: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "config.yaml")
if err := os.WriteFile(configPath, []byte(tt.configYAML), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
config, err := LoadGlobalConfig(configPath)
if tt.wantErr {
if err == nil {
t.Error("expected error, got nil")
}
return
}
if err != nil {
t.Errorf("unexpected error: %v", err)
return
}
if config == nil {
t.Fatal("config is nil")
}
if tt.verify != nil {
tt.verify(t, config)
}
})
}
}
// Test: LoadGlobalConfig error cases
func TestLoadGlobalConfig_Errors(t *testing.T) {
tests := []struct {
name string
setupFunc func(t *testing.T) string
errContains string
}{
{
name: "non-existent file",
setupFunc: func(t *testing.T) string {
return filepath.Join(t.TempDir(), "nonexistent.yaml")
},
errContains: "reading config file",
},
{
name: "invalid yaml",
setupFunc: func(t *testing.T) string {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "config.yaml")
content := `invalid: yaml: [[[`
if err := os.WriteFile(configPath, []byte(content), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
return configPath
},
errContains: "parsing config file",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
configPath := tt.setupFunc(t)
_, err := LoadGlobalConfig(configPath)
if err == nil {
t.Error("expected error, got nil")
} else if !strings.Contains(err.Error(), tt.errContains) {
t.Errorf("error %q does not contain %q", err.Error(), tt.errContains)
}
})
}
}
// Test: SaveGlobalConfig saves configuration correctly
func TestSaveGlobalConfig(t *testing.T) {
tests := []struct {
name string
config *GlobalConfig
verify func(t *testing.T, configPath string)
}{
{
name: "saves complete configuration",
config: &GlobalConfig{
Wildcloud: struct {
Repository string `yaml:"repository" json:"repository"`
CurrentPhase string `yaml:"currentPhase" json:"currentPhase"`
CompletedPhases []string `yaml:"completedPhases" json:"completedPhases"`
}{
Repository: "https://github.com/example/repo",
CurrentPhase: "setup",
CompletedPhases: []string{"phase1", "phase2"},
},
Server: struct {
Port int `yaml:"port" json:"port"`
Host string `yaml:"host" json:"host"`
}{
Port: 8080,
Host: "localhost",
},
},
verify: func(t *testing.T, configPath string) {
content, err := os.ReadFile(configPath)
if err != nil {
t.Fatalf("failed to read saved config: %v", err)
}
contentStr := string(content)
if !strings.Contains(contentStr, "repository") {
t.Error("saved config missing repository field")
}
if !strings.Contains(contentStr, "8080") {
t.Error("saved config missing port value")
}
},
},
{
name: "saves empty configuration",
config: &GlobalConfig{},
verify: func(t *testing.T, configPath string) {
if _, err := os.Stat(configPath); os.IsNotExist(err) {
t.Error("config file not created")
}
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "subdir", "config.yaml")
err := SaveGlobalConfig(tt.config, configPath)
if err != nil {
t.Errorf("SaveGlobalConfig failed: %v", err)
return
}
// Verify file exists
if _, err := os.Stat(configPath); err != nil {
t.Errorf("config file not created: %v", err)
return
}
// Verify file permissions
info, err := os.Stat(configPath)
if err != nil {
t.Fatalf("failed to stat config file: %v", err)
}
if info.Mode().Perm() != 0644 {
t.Errorf("expected permissions 0644, got %v", info.Mode().Perm())
}
// Verify content can be loaded back
loadedConfig, err := LoadGlobalConfig(configPath)
if err != nil {
t.Errorf("failed to reload saved config: %v", err)
} else if loadedConfig == nil {
t.Error("loaded config is nil")
}
if tt.verify != nil {
tt.verify(t, configPath)
}
})
}
}
// Test: SaveGlobalConfig creates directory
func TestSaveGlobalConfig_CreatesDirectory(t *testing.T) {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "nested", "dirs", "config.yaml")
config := &GlobalConfig{}
err := SaveGlobalConfig(config, configPath)
if err != nil {
t.Fatalf("SaveGlobalConfig failed: %v", err)
}
// Verify nested directories were created
if _, err := os.Stat(filepath.Dir(configPath)); err != nil {
t.Errorf("directory not created: %v", err)
}
// Verify file exists
if _, err := os.Stat(configPath); err != nil {
t.Errorf("config file not created: %v", err)
}
}
// Test: GlobalConfig.IsEmpty checks if config is empty
func TestGlobalConfig_IsEmpty(t *testing.T) {
tests := []struct {
name string
config *GlobalConfig
want bool
}{
{
name: "nil config is empty",
config: nil,
want: true,
},
{
name: "default config is empty",
config: &GlobalConfig{},
want: true,
},
{
name: "config with only DNS IP is empty",
config: &GlobalConfig{
Cloud: struct {
DNS struct {
IP string `yaml:"ip" json:"ip"`
ExternalResolver string `yaml:"externalResolver" json:"externalResolver"`
} `yaml:"dns" json:"dns"`
Router struct {
IP string `yaml:"ip" json:"ip"`
DynamicDns string `yaml:"dynamicDns" json:"dynamicDns"`
} `yaml:"router" json:"router"`
Dnsmasq struct {
Interface string `yaml:"interface" json:"interface"`
} `yaml:"dnsmasq" json:"dnsmasq"`
}{
DNS: struct {
IP string `yaml:"ip" json:"ip"`
ExternalResolver string `yaml:"externalResolver" json:"externalResolver"`
}{
IP: "192.168.1.1",
},
},
},
want: true,
},
{
name: "config with only Talos version is empty",
config: &GlobalConfig{
Cluster: struct {
EndpointIP string `yaml:"endpointIp" json:"endpointIp"`
Nodes struct {
Talos struct {
Version string `yaml:"version" json:"version"`
} `yaml:"talos" json:"talos"`
} `yaml:"nodes" json:"nodes"`
}{
Nodes: struct {
Talos struct {
Version string `yaml:"version" json:"version"`
} `yaml:"talos" json:"talos"`
}{
Talos: struct {
Version string `yaml:"version" json:"version"`
}{
Version: "v1.8.0",
},
},
},
},
want: true,
},
{
name: "config with both DNS IP and Talos version is not empty",
config: &GlobalConfig{
Cloud: struct {
DNS struct {
IP string `yaml:"ip" json:"ip"`
ExternalResolver string `yaml:"externalResolver" json:"externalResolver"`
} `yaml:"dns" json:"dns"`
Router struct {
IP string `yaml:"ip" json:"ip"`
DynamicDns string `yaml:"dynamicDns" json:"dynamicDns"`
} `yaml:"router" json:"router"`
Dnsmasq struct {
Interface string `yaml:"interface" json:"interface"`
} `yaml:"dnsmasq" json:"dnsmasq"`
}{
DNS: struct {
IP string `yaml:"ip" json:"ip"`
ExternalResolver string `yaml:"externalResolver" json:"externalResolver"`
}{
IP: "192.168.1.1",
},
},
Cluster: struct {
EndpointIP string `yaml:"endpointIp" json:"endpointIp"`
Nodes struct {
Talos struct {
Version string `yaml:"version" json:"version"`
} `yaml:"talos" json:"talos"`
} `yaml:"nodes" json:"nodes"`
}{
Nodes: struct {
Talos struct {
Version string `yaml:"version" json:"version"`
} `yaml:"talos" json:"talos"`
}{
Talos: struct {
Version string `yaml:"version" json:"version"`
}{
Version: "v1.8.0",
},
},
},
},
want: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := tt.config.IsEmpty()
if got != tt.want {
t.Errorf("IsEmpty() = %v, want %v", got, tt.want)
}
})
}
}
// Test: LoadCloudConfig loads instance configuration
func TestLoadCloudConfig(t *testing.T) {
tests := []struct {
name string
configYAML string
verify func(t *testing.T, config *InstanceConfig)
wantErr bool
}{
{
name: "loads complete instance configuration",
configYAML: `cloud:
router:
ip: "192.168.1.254"
dns:
ip: "192.168.1.1"
externalResolver: "8.8.8.8"
dhcpRange: "192.168.1.100,192.168.1.200"
baseDomain: "example.com"
domain: "home"
internalDomain: "internal.example.com"
cluster:
name: "my-cluster"
loadBalancerIp: "192.168.1.10"
nodes:
talos:
version: "v1.8.0"
activeNodes:
- node1:
role: "control"
interface: "eth0"
disk: "/dev/sda"
`,
verify: func(t *testing.T, config *InstanceConfig) {
if config.Cloud.BaseDomain != "example.com" {
t.Error("base domain not loaded correctly")
}
if config.Cluster.Name != "my-cluster" {
t.Error("cluster name not loaded correctly")
}
if config.Cluster.Nodes.Talos.Version != "v1.8.0" {
t.Error("talos version not loaded correctly")
}
},
wantErr: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "config.yaml")
if err := os.WriteFile(configPath, []byte(tt.configYAML), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
config, err := LoadCloudConfig(configPath)
if tt.wantErr {
if err == nil {
t.Error("expected error, got nil")
}
return
}
if err != nil {
t.Errorf("unexpected error: %v", err)
return
}
if config == nil {
t.Fatal("config is nil")
}
if tt.verify != nil {
tt.verify(t, config)
}
})
}
}
// Test: LoadCloudConfig error cases
func TestLoadCloudConfig_Errors(t *testing.T) {
tests := []struct {
name string
setupFunc func(t *testing.T) string
errContains string
}{
{
name: "non-existent file",
setupFunc: func(t *testing.T) string {
return filepath.Join(t.TempDir(), "nonexistent.yaml")
},
errContains: "reading config file",
},
{
name: "invalid yaml",
setupFunc: func(t *testing.T) string {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "config.yaml")
content := `invalid: yaml: [[[`
if err := os.WriteFile(configPath, []byte(content), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
return configPath
},
errContains: "parsing config file",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
configPath := tt.setupFunc(t)
_, err := LoadCloudConfig(configPath)
if err == nil {
t.Error("expected error, got nil")
} else if !strings.Contains(err.Error(), tt.errContains) {
t.Errorf("error %q does not contain %q", err.Error(), tt.errContains)
}
})
}
}
// Test: SaveCloudConfig saves instance configuration
func TestSaveCloudConfig(t *testing.T) {
tests := []struct {
name string
config *InstanceConfig
verify func(t *testing.T, configPath string)
}{
{
name: "saves instance configuration",
config: &InstanceConfig{
Cloud: struct {
Router struct {
IP string `yaml:"ip" json:"ip"`
} `yaml:"router" json:"router"`
DNS struct {
IP string `yaml:"ip" json:"ip"`
ExternalResolver string `yaml:"externalResolver" json:"externalResolver"`
} `yaml:"dns" json:"dns"`
DHCPRange string `yaml:"dhcpRange" json:"dhcpRange"`
Dnsmasq struct {
Interface string `yaml:"interface" json:"interface"`
} `yaml:"dnsmasq" json:"dnsmasq"`
BaseDomain string `yaml:"baseDomain" json:"baseDomain"`
Domain string `yaml:"domain" json:"domain"`
InternalDomain string `yaml:"internalDomain" json:"internalDomain"`
NFS struct {
MediaPath string `yaml:"mediaPath" json:"mediaPath"`
Host string `yaml:"host" json:"host"`
StorageCapacity string `yaml:"storageCapacity" json:"storageCapacity"`
} `yaml:"nfs" json:"nfs"`
DockerRegistryHost string `yaml:"dockerRegistryHost" json:"dockerRegistryHost"`
Backup struct {
Root string `yaml:"root" json:"root"`
} `yaml:"backup" json:"backup"`
}{
BaseDomain: "example.com",
Domain: "home",
},
},
verify: func(t *testing.T, configPath string) {
content, err := os.ReadFile(configPath)
if err != nil {
t.Fatalf("failed to read saved config: %v", err)
}
contentStr := string(content)
if !strings.Contains(contentStr, "example.com") {
t.Error("saved config missing base domain")
}
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "subdir", "config.yaml")
err := SaveCloudConfig(tt.config, configPath)
if err != nil {
t.Errorf("SaveCloudConfig failed: %v", err)
return
}
// Verify file exists
if _, err := os.Stat(configPath); err != nil {
t.Errorf("config file not created: %v", err)
return
}
// Verify content can be loaded back
loadedConfig, err := LoadCloudConfig(configPath)
if err != nil {
t.Errorf("failed to reload saved config: %v", err)
} else if loadedConfig == nil {
t.Error("loaded config is nil")
}
if tt.verify != nil {
tt.verify(t, configPath)
}
})
}
}
// Test: Round-trip save and load preserves data
func TestGlobalConfig_RoundTrip(t *testing.T) {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "config.yaml")
// Create config with all fields
original := &GlobalConfig{
Wildcloud: struct {
Repository string `yaml:"repository" json:"repository"`
CurrentPhase string `yaml:"currentPhase" json:"currentPhase"`
CompletedPhases []string `yaml:"completedPhases" json:"completedPhases"`
}{
Repository: "https://github.com/example/repo",
CurrentPhase: "setup",
CompletedPhases: []string{"phase1", "phase2"},
},
Server: struct {
Port int `yaml:"port" json:"port"`
Host string `yaml:"host" json:"host"`
}{
Port: 8080,
Host: "localhost",
},
Operator: struct {
Email string `yaml:"email" json:"email"`
}{
Email: "admin@example.com",
},
}
// Save config
if err := SaveGlobalConfig(original, configPath); err != nil {
t.Fatalf("SaveGlobalConfig failed: %v", err)
}
// Load config
loaded, err := LoadGlobalConfig(configPath)
if err != nil {
t.Fatalf("LoadGlobalConfig failed: %v", err)
}
// Verify all fields match
if loaded.Wildcloud.Repository != original.Wildcloud.Repository {
t.Errorf("repository mismatch: got %q, want %q", loaded.Wildcloud.Repository, original.Wildcloud.Repository)
}
if loaded.Server.Port != original.Server.Port {
t.Errorf("port mismatch: got %d, want %d", loaded.Server.Port, original.Server.Port)
}
if loaded.Operator.Email != original.Operator.Email {
t.Errorf("email mismatch: got %q, want %q", loaded.Operator.Email, original.Operator.Email)
}
}
// Test: Round-trip save and load for instance config
func TestInstanceConfig_RoundTrip(t *testing.T) {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "config.yaml")
// Create instance config
original := &InstanceConfig{}
original.Cloud.BaseDomain = "example.com"
original.Cloud.Domain = "home"
original.Cluster.Name = "my-cluster"
// Save config
if err := SaveCloudConfig(original, configPath); err != nil {
t.Fatalf("SaveCloudConfig failed: %v", err)
}
// Load config
loaded, err := LoadCloudConfig(configPath)
if err != nil {
t.Fatalf("LoadCloudConfig failed: %v", err)
}
// Verify fields match
if loaded.Cloud.BaseDomain != original.Cloud.BaseDomain {
t.Errorf("base domain mismatch: got %q, want %q", loaded.Cloud.BaseDomain, original.Cloud.BaseDomain)
}
if loaded.Cluster.Name != original.Cluster.Name {
t.Errorf("cluster name mismatch: got %q, want %q", loaded.Cluster.Name, original.Cluster.Name)
}
}
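// For orientation, a minimal sketch of the save path pinned down by the
// tests in this file: parent directories are created, the struct is
// marshaled to YAML, and the file is written with 0644 permissions.
// The yaml.v3 dependency and the error wrapping are assumptions; the
// actual SaveGlobalConfig/SaveCloudConfig implementations may differ.
// Assumes imports: fmt, os, path/filepath, gopkg.in/yaml.v3.
func saveGlobalConfigSketch(config *GlobalConfig, configPath string) error {
	// TestSaveGlobalConfig_CreatesDirectory: nested directories are created.
	if err := os.MkdirAll(filepath.Dir(configPath), 0755); err != nil {
		return fmt.Errorf("creating config directory: %w", err)
	}
	data, err := yaml.Marshal(config)
	if err != nil {
		return fmt.Errorf("marshaling config: %w", err)
	}
	// TestSaveGlobalConfig asserts 0644 on the written file.
	return os.WriteFile(configPath, data, 0644)
}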

View File

@@ -152,16 +152,19 @@ func (m *Manager) CopyConfig(srcPath, dstPath string) error {
}
// GetInstanceConfigPath returns the path to an instance's config file
// Deprecated: Use tools.GetInstanceConfigPath instead
func GetInstanceConfigPath(dataDir, instanceName string) string {
return filepath.Join(dataDir, "instances", instanceName, "config.yaml")
return tools.GetInstanceConfigPath(dataDir, instanceName)
}
// GetInstanceSecretsPath returns the path to an instance's secrets file
// Deprecated: Use tools.GetInstanceSecretsPath instead
func GetInstanceSecretsPath(dataDir, instanceName string) string {
return filepath.Join(dataDir, "instances", instanceName, "secrets.yaml")
return tools.GetInstanceSecretsPath(dataDir, instanceName)
}
// GetInstancePath returns the path to an instance directory
// Deprecated: Use tools.GetInstancePath instead
func GetInstancePath(dataDir, instanceName string) string {
return filepath.Join(dataDir, "instances", instanceName)
return tools.GetInstancePath(dataDir, instanceName)
}

View File

@@ -0,0 +1,905 @@
package config
import (
"os"
"path/filepath"
"strings"
"sync"
"testing"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
)
// Test: NewManager creates manager successfully
func TestNewManager(t *testing.T) {
m := NewManager()
if m == nil {
t.Fatal("NewManager returned nil")
}
if m.yq == nil {
t.Error("Manager.yq is nil")
}
}
// Test: EnsureInstanceConfig creates config file with proper structure
func TestEnsureInstanceConfig(t *testing.T) {
tests := []struct {
name string
setupFunc func(t *testing.T, instancePath string)
wantErr bool
errContains string
}{
{
name: "creates config when not exists",
setupFunc: nil,
wantErr: false,
},
{
name: "returns nil when config exists",
setupFunc: func(t *testing.T, instancePath string) {
configPath := filepath.Join(instancePath, "config.yaml")
content := `baseDomain: "test.local"
domain: "test"
internalDomain: "internal.test"
dhcpRange: ""
backup:
root: ""
nfs:
host: ""
mediaPath: ""
cluster:
name: ""
loadBalancerIp: ""
ipAddressPool: ""
hostnamePrefix: ""
certManager:
cloudflare:
domain: ""
zoneID: ""
externalDns:
ownerId: ""
nodes:
talos:
version: ""
schematicId: ""
control:
vip: ""
activeNodes: []
`
if err := storage.WriteFile(configPath, []byte(content), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
},
wantErr: false,
},
{
name: "returns error when config is invalid yaml",
setupFunc: func(t *testing.T, instancePath string) {
configPath := filepath.Join(instancePath, "config.yaml")
content := `invalid: yaml: content: [[[`
if err := storage.WriteFile(configPath, []byte(content), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
},
wantErr: true,
errContains: "invalid config file",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
instancePath := t.TempDir()
m := NewManager()
if tt.setupFunc != nil {
tt.setupFunc(t, instancePath)
}
err := m.EnsureInstanceConfig(instancePath)
if tt.wantErr {
if err == nil {
t.Error("expected error, got nil")
} else if tt.errContains != "" && !strings.Contains(err.Error(), tt.errContains) {
t.Errorf("error %q does not contain %q", err.Error(), tt.errContains)
}
return
}
if err != nil {
t.Errorf("unexpected error: %v", err)
return
}
// Verify config file exists
configPath := filepath.Join(instancePath, "config.yaml")
if !storage.FileExists(configPath) {
t.Error("config file not created")
}
// Verify config is valid YAML
if err := m.ValidateConfig(configPath); err != nil {
t.Errorf("config validation failed: %v", err)
}
// Verify config has expected structure
content, err := storage.ReadFile(configPath)
if err != nil {
t.Fatalf("failed to read config: %v", err)
}
contentStr := string(content)
requiredFields := []string{"baseDomain:", "domain:", "cluster:", "backup:", "nfs:"}
for _, field := range requiredFields {
if !strings.Contains(contentStr, field) {
t.Errorf("config missing required field: %s", field)
}
}
})
}
}
// Test: GetConfigValue retrieves values correctly
func TestGetConfigValue(t *testing.T) {
tests := []struct {
name string
configYAML string
key string
want string
wantErr bool
errContains string
}{
{
name: "get simple string value",
configYAML: `baseDomain: "example.com"
domain: "test"
`,
key: "baseDomain",
want: "example.com",
wantErr: false,
},
{
name: "get nested value with dot notation",
configYAML: `cluster:
name: "my-cluster"
nodes:
talos:
version: "v1.8.0"
`,
key: "cluster.nodes.talos.version",
want: "v1.8.0",
wantErr: false,
},
{
name: "get empty string value",
configYAML: `baseDomain: ""
`,
key: "baseDomain",
want: "",
wantErr: false,
},
{
name: "get non-existent key returns null",
configYAML: `baseDomain: "example.com"
`,
key: "nonexistent",
want: "null",
wantErr: false,
},
{
name: "get from array",
configYAML: `cluster:
nodes:
activeNodes:
- "node1"
- "node2"
`,
key: "cluster.nodes.activeNodes.[0]",
want: "node1",
wantErr: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "config.yaml")
if err := storage.WriteFile(configPath, []byte(tt.configYAML), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
m := NewManager()
got, err := m.GetConfigValue(configPath, tt.key)
if tt.wantErr {
if err == nil {
t.Error("expected error, got nil")
} else if tt.errContains != "" && !strings.Contains(err.Error(), tt.errContains) {
t.Errorf("error %q does not contain %q", err.Error(), tt.errContains)
}
return
}
if err != nil {
t.Errorf("unexpected error: %v", err)
return
}
if got != tt.want {
t.Errorf("got %q, want %q", got, tt.want)
}
})
}
}
// Test: GetConfigValue error cases
func TestGetConfigValue_Errors(t *testing.T) {
tests := []struct {
name string
setupFunc func(t *testing.T) string
key string
errContains string
}{
{
name: "non-existent file",
setupFunc: func(t *testing.T) string {
return filepath.Join(t.TempDir(), "nonexistent.yaml")
},
key: "baseDomain",
errContains: "config file not found",
},
{
name: "malformed yaml",
setupFunc: func(t *testing.T) string {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "config.yaml")
content := `invalid: yaml: [[[`
if err := storage.WriteFile(configPath, []byte(content), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
return configPath
},
key: "baseDomain",
errContains: "getting config value",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
configPath := tt.setupFunc(t)
m := NewManager()
_, err := m.GetConfigValue(configPath, tt.key)
if err == nil {
t.Error("expected error, got nil")
} else if !strings.Contains(err.Error(), tt.errContains) {
t.Errorf("error %q does not contain %q", err.Error(), tt.errContains)
}
})
}
}
// Test: SetConfigValue sets values correctly
func TestSetConfigValue(t *testing.T) {
tests := []struct {
name string
initialYAML string
key string
value string
verifyFunc func(t *testing.T, configPath string)
}{
{
name: "set simple value",
initialYAML: `baseDomain: ""
domain: ""
`,
key: "baseDomain",
value: "example.com",
verifyFunc: func(t *testing.T, configPath string) {
m := NewManager()
got, err := m.GetConfigValue(configPath, "baseDomain")
if err != nil {
t.Fatalf("verify failed: %v", err)
}
if got != "example.com" {
t.Errorf("got %q, want %q", got, "example.com")
}
},
},
{
name: "set nested value",
initialYAML: `cluster:
name: ""
nodes:
talos:
version: ""
`,
key: "cluster.nodes.talos.version",
value: "v1.8.0",
verifyFunc: func(t *testing.T, configPath string) {
m := NewManager()
got, err := m.GetConfigValue(configPath, "cluster.nodes.talos.version")
if err != nil {
t.Fatalf("verify failed: %v", err)
}
if got != "v1.8.0" {
t.Errorf("got %q, want %q", got, "v1.8.0")
}
},
},
{
name: "update existing value",
initialYAML: `baseDomain: "old.com"
`,
key: "baseDomain",
value: "new.com",
verifyFunc: func(t *testing.T, configPath string) {
m := NewManager()
got, err := m.GetConfigValue(configPath, "baseDomain")
if err != nil {
t.Fatalf("verify failed: %v", err)
}
if got != "new.com" {
t.Errorf("got %q, want %q", got, "new.com")
}
},
},
{
name: "create new nested path",
initialYAML: `cluster: {}
`,
key: "cluster.newField",
value: "newValue",
verifyFunc: func(t *testing.T, configPath string) {
m := NewManager()
got, err := m.GetConfigValue(configPath, "cluster.newField")
if err != nil {
t.Fatalf("verify failed: %v", err)
}
if got != "newValue" {
t.Errorf("got %q, want %q", got, "newValue")
}
},
},
{
name: "set value with special characters",
initialYAML: `baseDomain: ""
`,
key: "baseDomain",
value: `special"quotes'and\backslashes`,
verifyFunc: func(t *testing.T, configPath string) {
m := NewManager()
got, err := m.GetConfigValue(configPath, "baseDomain")
if err != nil {
t.Fatalf("verify failed: %v", err)
}
if got != `special"quotes'and\backslashes` {
t.Errorf("got %q, want %q", got, `special"quotes'and\backslashes`)
}
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "config.yaml")
if err := storage.WriteFile(configPath, []byte(tt.initialYAML), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
m := NewManager()
if err := m.SetConfigValue(configPath, tt.key, tt.value); err != nil {
t.Errorf("SetConfigValue failed: %v", err)
return
}
// Verify the value was set correctly
tt.verifyFunc(t, configPath)
// Verify config is still valid YAML
if err := m.ValidateConfig(configPath); err != nil {
t.Errorf("config validation failed after set: %v", err)
}
})
}
}
// Test: SetConfigValue error cases
func TestSetConfigValue_Errors(t *testing.T) {
tests := []struct {
name string
setupFunc func(t *testing.T) string
key string
value string
errContains string
}{
{
name: "non-existent file",
setupFunc: func(t *testing.T) string {
return filepath.Join(t.TempDir(), "nonexistent.yaml")
},
key: "baseDomain",
value: "example.com",
errContains: "config file not found",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
configPath := tt.setupFunc(t)
m := NewManager()
err := m.SetConfigValue(configPath, tt.key, tt.value)
if err == nil {
t.Error("expected error, got nil")
} else if !strings.Contains(err.Error(), tt.errContains) {
t.Errorf("error %q does not contain %q", err.Error(), tt.errContains)
}
})
}
}
// Test: SetConfigValue with concurrent access
func TestSetConfigValue_ConcurrentAccess(t *testing.T) {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "config.yaml")
initialYAML := `counter: "0"
`
if err := storage.WriteFile(configPath, []byte(initialYAML), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
m := NewManager()
const numGoroutines = 10
var wg sync.WaitGroup
errors := make(chan error, numGoroutines)
// Launch multiple goroutines trying to write different values
for i := 0; i < numGoroutines; i++ {
wg.Add(1)
go func(val int) {
defer wg.Done()
key := "counter"
value := string(rune('0' + val))
if err := m.SetConfigValue(configPath, key, value); err != nil {
errors <- err
}
}(i)
}
wg.Wait()
close(errors)
// Check if any errors occurred
for err := range errors {
t.Errorf("concurrent write error: %v", err)
}
// Verify config is still valid after concurrent access
if err := m.ValidateConfig(configPath); err != nil {
t.Errorf("config validation failed after concurrent writes: %v", err)
}
// Verify we can read the value (should be one of the written values)
value, err := m.GetConfigValue(configPath, "counter")
if err != nil {
t.Errorf("failed to read value after concurrent writes: %v", err)
}
if value == "" || value == "null" {
t.Error("counter value is empty after concurrent writes")
}
}
// Test: EnsureConfigValue sets value only when not set
func TestEnsureConfigValue(t *testing.T) {
tests := []struct {
name string
initialYAML string
key string
value string
expectSet bool
}{
{
name: "sets value when empty string",
initialYAML: `baseDomain: ""
`,
key: "baseDomain",
value: "example.com",
expectSet: true,
},
{
name: "sets value when null",
initialYAML: `baseDomain: null
`,
key: "baseDomain",
value: "example.com",
expectSet: true,
},
{
name: "does not set value when already set",
initialYAML: `baseDomain: "existing.com"
`,
key: "baseDomain",
value: "new.com",
expectSet: false,
},
{
name: "sets value when key does not exist",
initialYAML: `domain: "test"
`,
key: "baseDomain",
value: "example.com",
expectSet: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "config.yaml")
if err := storage.WriteFile(configPath, []byte(tt.initialYAML), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
m := NewManager()
// Get initial value
initialVal, _ := m.GetConfigValue(configPath, tt.key)
// Call EnsureConfigValue
if err := m.EnsureConfigValue(configPath, tt.key, tt.value); err != nil {
t.Errorf("EnsureConfigValue failed: %v", err)
return
}
// Get final value
finalVal, err := m.GetConfigValue(configPath, tt.key)
if err != nil {
t.Fatalf("GetConfigValue failed: %v", err)
}
if tt.expectSet {
if finalVal != tt.value {
t.Errorf("expected value to be set to %q, got %q", tt.value, finalVal)
}
} else {
if finalVal != initialVal {
t.Errorf("expected value to remain %q, got %q", initialVal, finalVal)
}
}
// Call EnsureConfigValue again - should be idempotent
if err := m.EnsureConfigValue(configPath, tt.key, "different.com"); err != nil {
t.Errorf("second EnsureConfigValue failed: %v", err)
return
}
// Value should not change on second call
secondVal, err := m.GetConfigValue(configPath, tt.key)
if err != nil {
t.Fatalf("GetConfigValue failed: %v", err)
}
if secondVal != finalVal {
t.Errorf("value changed on second ensure: %q -> %q", finalVal, secondVal)
}
})
}
}
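// The cases above pin down EnsureConfigValue's contract: it writes only when
// the key is missing, null, or empty, and is otherwise a no-op, which also
// makes it idempotent. A sketch of that contract in terms of the manager's
// own methods (illustrative, not necessarily the real implementation):
func (m *Manager) ensureConfigValueSketch(configPath, key, value string) error {
	current, err := m.GetConfigValue(configPath, key)
	if err != nil {
		return err
	}
	// Missing keys read back as "null" (see TestGetConfigValue) and unset
	// values read back as ""; only then do we write.
	if current == "" || current == "null" {
		return m.SetConfigValue(configPath, key, value)
	}
	return nil
}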
// Test: ValidateConfig validates YAML correctly
func TestValidateConfig(t *testing.T) {
tests := []struct {
name string
configYAML string
wantErr bool
errContains string
}{
{
name: "valid yaml",
configYAML: `baseDomain: "example.com"
domain: "test"
cluster:
name: "my-cluster"
`,
wantErr: false,
},
{
name: "invalid yaml - bad indentation",
configYAML: `baseDomain: "example.com"\n domain: "test"`,
wantErr: true,
errContains: "yaml validation failed",
},
{
name: "invalid yaml - unclosed bracket",
configYAML: `cluster: { name: "test"`,
wantErr: true,
errContains: "yaml validation failed",
},
{
name: "empty file",
configYAML: "",
wantErr: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "config.yaml")
if err := storage.WriteFile(configPath, []byte(tt.configYAML), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
m := NewManager()
err := m.ValidateConfig(configPath)
if tt.wantErr {
if err == nil {
t.Error("expected error, got nil")
} else if tt.errContains != "" && !strings.Contains(err.Error(), tt.errContains) {
t.Errorf("error %q does not contain %q", err.Error(), tt.errContains)
}
return
}
if err != nil {
t.Errorf("unexpected error: %v", err)
}
})
}
}
// Test: ValidateConfig error cases
func TestValidateConfig_Errors(t *testing.T) {
t.Run("non-existent file", func(t *testing.T) {
tempDir := t.TempDir()
configPath := filepath.Join(tempDir, "nonexistent.yaml")
m := NewManager()
err := m.ValidateConfig(configPath)
if err == nil {
t.Error("expected error, got nil")
} else if !strings.Contains(err.Error(), "config file not found") {
t.Errorf("error %q does not contain 'config file not found'", err.Error())
}
})
}
// Test: CopyConfig copies configuration correctly
func TestCopyConfig(t *testing.T) {
tests := []struct {
name string
srcYAML string
setupDst func(t *testing.T, dstPath string)
wantErr bool
errContains string
}{
{
name: "copies config successfully",
srcYAML: `baseDomain: "example.com"
domain: "test"
cluster:
name: "my-cluster"
`,
setupDst: nil,
wantErr: false,
},
{
name: "creates destination directory",
srcYAML: `baseDomain: "example.com"`,
setupDst: nil,
wantErr: false,
},
{
name: "overwrites existing destination",
srcYAML: `baseDomain: "new.com"
`,
setupDst: func(t *testing.T, dstPath string) {
oldContent := `baseDomain: "old.com"`
if err := storage.WriteFile(dstPath, []byte(oldContent), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
},
wantErr: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tempDir := t.TempDir()
srcPath := filepath.Join(tempDir, "source.yaml")
dstPath := filepath.Join(tempDir, "subdir", "dest.yaml")
// Create source file
if err := storage.WriteFile(srcPath, []byte(tt.srcYAML), 0644); err != nil {
t.Fatalf("setup failed: %v", err)
}
// Setup destination if needed
if tt.setupDst != nil {
if err := storage.EnsureDir(filepath.Dir(dstPath), 0755); err != nil {
t.Fatalf("setup failed: %v", err)
}
tt.setupDst(t, dstPath)
}
m := NewManager()
err := m.CopyConfig(srcPath, dstPath)
if tt.wantErr {
if err == nil {
t.Error("expected error, got nil")
} else if tt.errContains != "" && !strings.Contains(err.Error(), tt.errContains) {
t.Errorf("error %q does not contain %q", err.Error(), tt.errContains)
}
return
}
if err != nil {
t.Errorf("unexpected error: %v", err)
return
}
// Verify destination file exists
if !storage.FileExists(dstPath) {
t.Error("destination file not created")
}
// Verify content matches source
srcContent, err := storage.ReadFile(srcPath)
if err != nil {
t.Fatalf("failed to read source: %v", err)
}
dstContent, err := storage.ReadFile(dstPath)
if err != nil {
t.Fatalf("failed to read destination: %v", err)
}
if string(srcContent) != string(dstContent) {
t.Error("destination content does not match source")
}
// Verify destination is valid YAML
if err := m.ValidateConfig(dstPath); err != nil {
t.Errorf("destination config validation failed: %v", err)
}
})
}
}
// Test: CopyConfig error cases
func TestCopyConfig_Errors(t *testing.T) {
tests := []struct {
name string
setupFunc func(t *testing.T, tempDir string) (srcPath, dstPath string)
errContains string
}{
{
name: "source file does not exist",
setupFunc: func(t *testing.T, tempDir string) (string, string) {
return filepath.Join(tempDir, "nonexistent.yaml"),
filepath.Join(tempDir, "dest.yaml")
},
errContains: "source config file not found",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tempDir := t.TempDir()
srcPath, dstPath := tt.setupFunc(t, tempDir)
m := NewManager()
err := m.CopyConfig(srcPath, dstPath)
if err == nil {
t.Error("expected error, got nil")
} else if !strings.Contains(err.Error(), tt.errContains) {
t.Errorf("error %q does not contain %q", err.Error(), tt.errContains)
}
})
}
}
// Test: File permissions are preserved
func TestEnsureInstanceConfig_FilePermissions(t *testing.T) {
tempDir := t.TempDir()
m := NewManager()
if err := m.EnsureInstanceConfig(tempDir); err != nil {
t.Fatalf("EnsureInstanceConfig failed: %v", err)
}
configPath := filepath.Join(tempDir, "config.yaml")
info, err := os.Stat(configPath)
if err != nil {
t.Fatalf("failed to stat config file: %v", err)
}
// Verify file has 0644 permissions
if info.Mode().Perm() != 0644 {
t.Errorf("expected permissions 0644, got %v", info.Mode().Perm())
}
}
// Test: Idempotent config creation
func TestEnsureInstanceConfig_Idempotent(t *testing.T) {
tempDir := t.TempDir()
m := NewManager()
// First call creates config
if err := m.EnsureInstanceConfig(tempDir); err != nil {
t.Fatalf("first EnsureInstanceConfig failed: %v", err)
}
configPath := filepath.Join(tempDir, "config.yaml")
firstContent, err := storage.ReadFile(configPath)
if err != nil {
t.Fatalf("failed to read config: %v", err)
}
// Second call should not modify config
if err := m.EnsureInstanceConfig(tempDir); err != nil {
t.Fatalf("second EnsureInstanceConfig failed: %v", err)
}
secondContent, err := storage.ReadFile(configPath)
if err != nil {
t.Fatalf("failed to read config: %v", err)
}
if string(firstContent) != string(secondContent) {
t.Error("config content changed on second call")
}
}
// Test: Config structure contains all required fields
func TestEnsureInstanceConfig_RequiredFields(t *testing.T) {
tempDir := t.TempDir()
m := NewManager()
if err := m.EnsureInstanceConfig(tempDir); err != nil {
t.Fatalf("EnsureInstanceConfig failed: %v", err)
}
configPath := filepath.Join(tempDir, "config.yaml")
content, err := storage.ReadFile(configPath)
if err != nil {
t.Fatalf("failed to read config: %v", err)
}
contentStr := string(content)
requiredFields := []string{
"baseDomain:",
"domain:",
"internalDomain:",
"dhcpRange:",
"backup:",
"nfs:",
"cluster:",
"loadBalancerIp:",
"ipAddressPool:",
"hostnamePrefix:",
"certManager:",
"externalDns:",
"nodes:",
"talos:",
"version:",
"schematicId:",
"control:",
"vip:",
"activeNodes:",
}
for _, field := range requiredFields {
if !strings.Contains(contentStr, field) {
t.Errorf("config missing required field: %s", field)
}
}
}

View File

@@ -6,6 +6,7 @@ import (
"strings"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// Manager handles current instance context tracking
@@ -53,7 +54,7 @@ func (m *Manager) SetCurrentContext(instanceName string) error {
}
// Verify instance exists
instancePath := filepath.Join(m.dataDir, "instances", instanceName)
instancePath := tools.GetInstancePath(m.dataDir, instanceName)
if !storage.FileExists(instancePath) {
return fmt.Errorf("instance %s does not exist", instanceName)
}
@@ -101,7 +102,7 @@ func (m *Manager) ValidateContext() error {
return err
}
instancePath := filepath.Join(m.dataDir, "instances", contextName)
instancePath := tools.GetInstancePath(m.dataDir, contextName)
if !storage.FileExists(instancePath) {
return fmt.Errorf("current context %s points to non-existent instance", contextName)
}
@@ -116,7 +117,7 @@ func (m *Manager) GetCurrentInstancePath() (string, error) {
return "", err
}
return filepath.Join(m.dataDir, "instances", contextName), nil
return tools.GetInstancePath(m.dataDir, contextName), nil
}
// GetCurrentInstanceConfigPath returns the path to the current instance's config file

View File

@@ -0,0 +1,339 @@
// Package contracts contains API contracts for service management endpoints
package contracts
import "time"
// ==============================
// Request/Response Types
// ==============================
// ServiceManifest represents basic service information
type ServiceManifest struct {
Name string `json:"name"`
Description string `json:"description"`
Namespace string `json:"namespace"`
ConfigReferences []string `json:"configReferences,omitempty"`
ServiceConfig map[string]ConfigDefinition `json:"serviceConfig,omitempty"`
}
// ConfigDefinition describes a configuration value the user is prompted for during service setup
type ConfigDefinition struct {
Path string `json:"path"`
Prompt string `json:"prompt"`
Default string `json:"default"`
Type string `json:"type,omitempty"`
}
// PodStatus represents the status of a single pod
type PodStatus struct {
Name string `json:"name"` // Pod name
Status string `json:"status"` // Pod phase: Running, Pending, Failed, etc.
Ready string `json:"ready"` // Ready containers e.g. "1/1", "0/1"
Restarts int `json:"restarts"` // Container restart count
Age string `json:"age"` // Human-readable age e.g. "2h", "5m"
Node string `json:"node"` // Node name where pod is scheduled
IP string `json:"ip,omitempty"` // Pod IP if available
}
// DetailedServiceStatus provides comprehensive service status
type DetailedServiceStatus struct {
Name string `json:"name"` // Service name
Namespace string `json:"namespace"` // Kubernetes namespace
DeploymentStatus string `json:"deploymentStatus"` // "Ready", "Progressing", "Degraded", "NotFound"
Replicas ReplicaStatus `json:"replicas"` // Desired/current/ready replicas
Pods []PodStatus `json:"pods"` // Pod details
Config map[string]interface{} `json:"config,omitempty"` // Current config from config.yaml
Manifest *ServiceManifest `json:"manifest,omitempty"` // Service manifest if available
LastUpdated time.Time `json:"lastUpdated"` // Timestamp of status
}
// ReplicaStatus tracks deployment replica counts
type ReplicaStatus struct {
Desired int32 `json:"desired"` // Desired replica count
Current int32 `json:"current"` // Current replica count
Ready int32 `json:"ready"` // Ready replica count
Available int32 `json:"available"` // Available replica count
}
// ServiceLogsRequest holds the query parameters for log retrieval
type ServiceLogsRequest struct {
Container string `json:"container,omitempty"` // Specific container name (optional)
Tail int `json:"tail,omitempty"` // Number of lines from end (default: 100)
Follow bool `json:"follow,omitempty"` // Stream logs via SSE
Previous bool `json:"previous,omitempty"` // Get previous container logs
Since string `json:"since,omitempty"` // RFC3339 or duration string e.g. "10m"
}
// ServiceLogsResponse is the response body for non-streaming log retrieval
type ServiceLogsResponse struct {
Service string `json:"service"` // Service name
Namespace string `json:"namespace"` // Kubernetes namespace
Container string `json:"container,omitempty"` // Container name if specified
Lines []string `json:"lines"` // Log lines
Truncated bool `json:"truncated"` // Whether logs were truncated
Timestamp time.Time `json:"timestamp"` // Response timestamp
}
// ServiceLogsSSEEvent is a single streamed log event delivered via Server-Sent Events
type ServiceLogsSSEEvent struct {
Type string `json:"type"` // "log", "error", "end"
Line string `json:"line,omitempty"` // Log line content
Error string `json:"error,omitempty"` // Error message if type="error"
Container string `json:"container,omitempty"` // Container source
Timestamp time.Time `json:"timestamp"` // Event timestamp
}
// ServiceConfigUpdate is the request body for updating service configuration
type ServiceConfigUpdate struct {
Config map[string]interface{} `json:"config"` // Configuration updates
Redeploy bool `json:"redeploy"` // Trigger recompilation/redeployment
Fetch bool `json:"fetch"` // Fetch fresh templates before redeployment
}
// ServiceConfigResponse is the response returned after a config update
type ServiceConfigResponse struct {
Service string `json:"service"` // Service name
Namespace string `json:"namespace"` // Kubernetes namespace
Config map[string]interface{} `json:"config"` // Updated configuration
Redeployed bool `json:"redeployed"` // Whether service was redeployed
Message string `json:"message"` // Success/info message
}
// ==============================
// Error Response
// ==============================
// ErrorResponse is the standard error format for all endpoints
type ErrorResponse struct {
Error ErrorDetail `json:"error"`
}
// ErrorDetail contains error information
type ErrorDetail struct {
Code string `json:"code"` // Machine-readable error code
Message string `json:"message"` // Human-readable error message
Details map[string]interface{} `json:"details,omitempty"` // Additional error context
}
// Standard error codes
const (
ErrCodeNotFound = "SERVICE_NOT_FOUND"
ErrCodeInstanceNotFound = "INSTANCE_NOT_FOUND"
ErrCodeInvalidRequest = "INVALID_REQUEST"
ErrCodeKubectlFailed = "KUBECTL_FAILED"
ErrCodeConfigInvalid = "CONFIG_INVALID"
ErrCodeDeploymentFailed = "DEPLOYMENT_FAILED"
ErrCodeStreamingError = "STREAMING_ERROR"
ErrCodeInternalError = "INTERNAL_ERROR"
)
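// A hedged sketch of how a handler might emit this format (the daemon's
// actual helper, if any, is not shown in this diff; assumes imports
// encoding/json and net/http):
func writeErrorSketch(w http.ResponseWriter, status int, code, message string, details map[string]interface{}) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(ErrorResponse{
		Error: ErrorDetail{Code: code, Message: message, Details: details},
	})
}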
// ==============================
// API Endpoint Specifications
// ==============================
/*
1. GET /api/v1/instances/{name}/services/{service}/status
Purpose: Returns comprehensive service status including pods and health
Response Codes:
- 200 OK: Service status retrieved successfully
- 404 Not Found: Instance or service not found
- 500 Internal Server Error: kubectl command failed
Example Request:
GET /api/v1/instances/production/services/nginx/status
Example Response (200 OK):
{
"name": "nginx",
"namespace": "nginx",
"deploymentStatus": "Ready",
"replicas": {
"desired": 3,
"current": 3,
"ready": 3,
"available": 3
},
"pods": [
{
"name": "nginx-7c5464c66d-abc123",
"status": "Running",
"ready": "1/1",
"restarts": 0,
"age": "2h",
"node": "worker-1",
"ip": "10.42.1.5"
}
],
"config": {
"nginx.image": "nginx:1.21",
"nginx.replicas": 3
},
"manifest": {
"name": "nginx",
"description": "NGINX web server",
"namespace": "nginx"
},
"lastUpdated": "2024-01-15T10:30:00Z"
}
Example Error Response (404):
{
"error": {
"code": "SERVICE_NOT_FOUND",
"message": "Service nginx not found in instance production",
"details": {
"instance": "production",
"service": "nginx"
}
}
}
*/
/*
2. GET /api/v1/instances/{name}/services/{service}/logs
Purpose: Retrieve or stream service logs
Query Parameters:
- container (string): Specific container name
- tail (int): Number of lines from end (default: 100, max: 5000)
- follow (bool): Stream logs via SSE (default: false)
- previous (bool): Get previous container logs (default: false)
- since (string): RFC3339 timestamp or duration (e.g. "10m")
Response Codes:
- 200 OK: Logs retrieved successfully (or SSE stream started)
- 400 Bad Request: Invalid query parameters
- 404 Not Found: Instance, service, or container not found
- 500 Internal Server Error: kubectl command failed
Example Request (buffered):
GET /api/v1/instances/production/services/nginx/logs?tail=50
Example Response (200 OK):
{
"service": "nginx",
"namespace": "nginx",
"container": "nginx",
"lines": [
"2024/01/15 10:00:00 [notice] Configuration loaded",
"2024/01/15 10:00:01 [info] Server started on port 80"
],
"truncated": false,
"timestamp": "2024-01-15T10:30:00Z"
}
Example Request (streaming):
GET /api/v1/instances/production/services/nginx/logs?follow=true
Accept: text/event-stream
Example SSE Response:
data: {"type":"log","line":"2024/01/15 10:00:00 [notice] Configuration loaded","container":"nginx","timestamp":"2024-01-15T10:30:00Z"}
data: {"type":"log","line":"2024/01/15 10:00:01 [info] Request from 10.0.0.1","container":"nginx","timestamp":"2024-01-15T10:30:01Z"}
data: {"type":"error","error":"Container restarting","timestamp":"2024-01-15T10:30:02Z"}
data: {"type":"end","timestamp":"2024-01-15T10:30:03Z"}
*/
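/*
Client sketch: each SSE payload above is a single JSON-encoded
ServiceLogsSSEEvent per "data:" line, so a client can decode the stream line
by line. Everything below is illustrative; the address assumes the daemon's
default port 5055.

	package main

	import (
		"bufio"
		"encoding/json"
		"fmt"
		"net/http"
		"strings"
	)

	type sseEvent struct {
		Type  string `json:"type"`
		Line  string `json:"line,omitempty"`
		Error string `json:"error,omitempty"`
	}

	func main() {
		url := "http://localhost:5055/api/v1/instances/production/services/nginx/logs?follow=true"
		resp, err := http.Get(url)
		if err != nil {
			panic(err)
		}
		defer resp.Body.Close()
		scanner := bufio.NewScanner(resp.Body)
		for scanner.Scan() {
			line := scanner.Text()
			if !strings.HasPrefix(line, "data: ") {
				continue // skip blank separators between events
			}
			var ev sseEvent
			if err := json.Unmarshal([]byte(strings.TrimPrefix(line, "data: ")), &ev); err != nil {
				continue
			}
			switch ev.Type {
			case "log":
				fmt.Println(ev.Line)
			case "error":
				fmt.Println("stream error:", ev.Error)
			case "end":
				return
			}
		}
	}
*/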
/*
3. PATCH /api/v1/instances/{name}/services/{service}/config
Purpose: Update service configuration in config.yaml and optionally redeploy
Request Body: ServiceConfigUpdate (JSON)
Response Codes:
- 200 OK: Configuration updated successfully
- 400 Bad Request: Invalid configuration
- 404 Not Found: Instance or service not found
- 500 Internal Server Error: Update or deployment failed
Example Request:
PATCH /api/v1/instances/production/services/nginx/config
Content-Type: application/json
{
"config": {
"nginx.image": "nginx:1.22",
"nginx.replicas": 5,
"nginx.resources.memory": "512Mi"
},
"redeploy": true
}
Example Response (200 OK):
{
"service": "nginx",
"namespace": "nginx",
"config": {
"nginx.image": "nginx:1.22",
"nginx.replicas": 5,
"nginx.resources.memory": "512Mi"
},
"redeployed": true,
"message": "Service configuration updated and redeployed successfully"
}
Example Error Response (400):
{
"error": {
"code": "CONFIG_INVALID",
"message": "Invalid configuration: nginx.replicas must be a positive integer",
"details": {
"field": "nginx.replicas",
"value": -1,
"constraint": "positive integer"
}
}
}
*/
// ==============================
// Validation Rules
// ==============================
/*
Query Parameter Validation:
ServiceLogsRequest:
- tail: Must be between 1 and 5000 (default: 100)
- since: Must be valid RFC3339 timestamp or Go duration string (e.g. "5m", "1h")
- container: Must match existing container name if specified
- follow: When true, response uses Server-Sent Events (SSE)
- previous: Cannot be combined with follow=true
ServiceConfigUpdate:
- config: Must be valid YAML-compatible structure
- config keys: Must follow service's expected configuration schema
- redeploy: When true, triggers kustomize recompilation and kubectl apply
Path Parameters:
- instance name: Must match existing instance directory
- service name: Must match installed service name
*/
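/*
Server-side sketch of the ServiceLogsRequest rules above (function name and
error wording are illustrative, not the daemon's actual handler; assumes
imports fmt and time):

	func validateLogsRequest(req *ServiceLogsRequest) error {
		if req.Tail == 0 {
			req.Tail = 100 // documented default
		}
		if req.Tail < 1 || req.Tail > 5000 {
			return fmt.Errorf("tail must be between 1 and 5000")
		}
		if req.Previous && req.Follow {
			return fmt.Errorf("previous cannot be combined with follow=true")
		}
		if req.Since != "" {
			if _, err := time.Parse(time.RFC3339, req.Since); err != nil {
				if _, err := time.ParseDuration(req.Since); err != nil {
					return fmt.Errorf("since must be an RFC3339 timestamp or a Go duration such as \"5m\"")
				}
			}
		}
		return nil
	}
*/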
// ==============================
// HTTP Status Code Summary
// ==============================
/*
200 OK:
- Service status retrieved successfully
- Logs retrieved successfully (non-streaming)
- Configuration updated successfully
400 Bad Request:
- Invalid query parameters
- Invalid configuration in request body
- Validation errors
404 Not Found:
- Instance does not exist
- Service not installed in instance
- Container name not found (for logs)
500 Internal Server Error:
- kubectl command execution failed
- File system operations failed
- Unexpected errors during processing
*/

View File

@@ -37,8 +37,8 @@ func (m *Manager) Initialize() error {
if err != nil {
return fmt.Errorf("failed to get current directory: %w", err)
}
if os.Getenv("WILD_CENTRAL_DATA") != "" {
dataDir = os.Getenv("WILD_CENTRAL_DATA")
if os.Getenv("WILD_API_DATA_DIR") != "" {
dataDir = os.Getenv("WILD_API_DATA_DIR")
} else {
dataDir = filepath.Join(cwd, "data")
}

View File

@@ -3,6 +3,7 @@ package discovery
import (
"encoding/json"
"fmt"
"net"
"os"
"path/filepath"
"sync"
@@ -15,45 +16,43 @@ import (
// Manager handles node discovery operations
type Manager struct {
dataDir string
nodeMgr *node.Manager
talosctl *tools.Talosctl
dataDir string
nodeMgr *node.Manager
talosctl *tools.Talosctl
discoveryMu sync.Mutex
}
// NewManager creates a new discovery manager
func NewManager(dataDir string, instanceName string) *Manager {
// Get talosconfig path for the instance
talosconfigPath := filepath.Join(dataDir, "instances", instanceName, "setup", "cluster-nodes", "generated", "talosconfig")
talosconfigPath := tools.GetTalosconfigPath(dataDir, instanceName)
return &Manager{
dataDir: dataDir,
nodeMgr: node.NewManager(dataDir),
nodeMgr: node.NewManager(dataDir, instanceName),
talosctl: tools.NewTalosconfigWithConfig(talosconfigPath),
}
}
// DiscoveredNode represents a discovered node on the network
// DiscoveredNode represents a discovered node on the network (maintenance mode only)
type DiscoveredNode struct {
IP string `json:"ip"`
Hostname string `json:"hostname,omitempty"`
MaintenanceMode bool `json:"maintenance_mode"`
Version string `json:"version,omitempty"`
Interface string `json:"interface,omitempty"`
Disks []string `json:"disks,omitempty"`
}
// DiscoveryStatus represents the current state of discovery
type DiscoveryStatus struct {
Active bool `json:"active"`
StartedAt time.Time `json:"started_at,omitempty"`
NodesFound []DiscoveredNode `json:"nodes_found"`
Error string `json:"error,omitempty"`
Active bool `json:"active"`
StartedAt time.Time `json:"started_at,omitempty"`
NodesFound []DiscoveredNode `json:"nodes_found"`
Error string `json:"error,omitempty"`
}
// GetDiscoveryDir returns the discovery directory for an instance
func (m *Manager) GetDiscoveryDir(instanceName string) string {
return filepath.Join(m.dataDir, "instances", instanceName, "discovery")
return tools.GetInstanceDiscoveryPath(m.dataDir, instanceName)
}
// GetDiscoveryStatusPath returns the path to discovery status file
@@ -127,61 +126,69 @@ func (m *Manager) runDiscovery(instanceName string, ipList []string) {
status, _ := m.GetDiscoveryStatus(instanceName)
status.Active = false
m.writeDiscoveryStatus(instanceName, status)
_ = m.writeDiscoveryStatus(instanceName, status)
}()
// Discover nodes by probing each IP
discoveredNodes := []DiscoveredNode{}
// Discover nodes by probing each IP in parallel
var wg sync.WaitGroup
resultsChan := make(chan DiscoveredNode, len(ipList))
// Limit concurrent scans to avoid overwhelming the network
semaphore := make(chan struct{}, 50)
for _, ip := range ipList {
node, err := m.probeNode(ip)
if err != nil {
// Node not reachable or not a Talos node
continue
}
wg.Add(1)
go func(ip string) {
defer wg.Done()
discoveredNodes = append(discoveredNodes, *node)
// Acquire semaphore
semaphore <- struct{}{}
defer func() { <-semaphore }()
node, err := m.probeNode(ip)
if err != nil {
// Node not reachable or not a Talos node
return
}
resultsChan <- *node
}(ip)
}
// Close results channel when all goroutines complete
go func() {
wg.Wait()
close(resultsChan)
}()
// Collect results and update status incrementally
discoveredNodes := []DiscoveredNode{}
for node := range resultsChan {
discoveredNodes = append(discoveredNodes, node)
// Update status incrementally
m.discoveryMu.Lock()
status, _ := m.GetDiscoveryStatus(instanceName)
status.NodesFound = discoveredNodes
m.writeDiscoveryStatus(instanceName, status)
_ = m.writeDiscoveryStatus(instanceName, status)
m.discoveryMu.Unlock()
}
}
// probeNode attempts to detect if a node is running Talos
// probeNode attempts to detect if a node is running Talos in maintenance mode
func (m *Manager) probeNode(ip string) (*DiscoveredNode, error) {
// Attempt to get version (quick connectivity test)
version, err := m.talosctl.GetVersion(ip, false)
// Try insecure connection first (maintenance mode)
version, err := m.talosctl.GetVersion(ip, true)
if err != nil {
// Not in maintenance mode or not reachable
return nil, err
}
// Node is reachable, get hardware info
hwInfo, err := m.nodeMgr.DetectHardware(ip)
if err != nil {
// Still count it as discovered even if we can't get full hardware
return &DiscoveredNode{
IP: ip,
MaintenanceMode: false,
Version: version,
}, nil
}
// Extract just the disk paths for discovery output
diskPaths := make([]string, len(hwInfo.Disks))
for i, disk := range hwInfo.Disks {
diskPaths[i] = disk.Path
}
// If insecure connection works, node is in maintenance mode
return &DiscoveredNode{
IP: ip,
MaintenanceMode: hwInfo.MaintenanceMode,
MaintenanceMode: true,
Version: version,
Interface: hwInfo.Interface,
Disks: diskPaths,
}, nil
}
@@ -245,3 +252,132 @@ func (m *Manager) writeDiscoveryStatus(instanceName string, status *DiscoverySta
return nil
}
// CancelDiscovery cancels an in-progress discovery operation
func (m *Manager) CancelDiscovery(instanceName string) error {
m.discoveryMu.Lock()
defer m.discoveryMu.Unlock()
// Get current status
status, err := m.GetDiscoveryStatus(instanceName)
if err != nil {
return err
}
if !status.Active {
return fmt.Errorf("no discovery in progress")
}
// Mark discovery as cancelled
status.Active = false
status.Error = "Discovery cancelled by user"
if err := m.writeDiscoveryStatus(instanceName, status); err != nil {
return err
}
return nil
}
// GetLocalNetworks discovers local network interfaces and returns their CIDR addresses
// Skips loopback, link-local, and down interfaces
// Only returns IPv4 networks
func GetLocalNetworks() ([]string, error) {
interfaces, err := net.Interfaces()
if err != nil {
return nil, fmt.Errorf("failed to get network interfaces: %w", err)
}
var networks []string
for _, iface := range interfaces {
// Skip loopback and down interfaces
if iface.Flags&net.FlagLoopback != 0 || iface.Flags&net.FlagUp == 0 {
continue
}
addrs, err := iface.Addrs()
if err != nil {
continue
}
for _, addr := range addrs {
ipnet, ok := addr.(*net.IPNet)
if !ok {
continue
}
// Only IPv4 for now
if ipnet.IP.To4() == nil {
continue
}
// Skip link-local addresses (169.254.0.0/16)
if ipnet.IP.IsLinkLocalUnicast() {
continue
}
networks = append(networks, ipnet.String())
}
}
return networks, nil
}
// ExpandSubnet expands a CIDR notation subnet into individual IP addresses
// Example: "192.168.8.0/24" → ["192.168.8.1", "192.168.8.2", ..., "192.168.8.254"]
// Also handles single IPs (without CIDR notation)
func ExpandSubnet(subnet string) ([]string, error) {
// Check if it's a CIDR notation
ip, ipnet, err := net.ParseCIDR(subnet)
if err != nil {
// Not a CIDR, might be single IP
if net.ParseIP(subnet) != nil {
return []string{subnet}, nil
}
return nil, fmt.Errorf("invalid IP or CIDR: %s", subnet)
}
// Special case: /32 (single host) - just return the IP
ones, _ := ipnet.Mask.Size()
if ones == 32 {
return []string{ip.String()}, nil
}
var ips []string
// Iterate through all IPs in the subnet
for ip := ip.Mask(ipnet.Mask); ipnet.Contains(ip); incIP(ip) {
// Skip network address (first IP)
if ip.Equal(ipnet.IP) {
continue
}
// Skip broadcast address (last IP)
if isLastIP(ip, ipnet) {
continue
}
ips = append(ips, ip.String())
}
return ips, nil
}
// incIP increments an IP address
func incIP(ip net.IP) {
for j := len(ip) - 1; j >= 0; j-- {
ip[j]++
if ip[j] > 0 {
break
}
}
}
// isLastIP checks if an IP is the last IP in a subnet (broadcast address)
func isLastIP(ip net.IP, ipnet *net.IPNet) bool {
lastIP := make(net.IP, len(ip))
for i := range ip {
lastIP[i] = ip[i] | ^ipnet.Mask[i]
}
return ip.Equal(lastIP)
}
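// Taken together, these helpers support the subnet auto-detection described
// in the commit message: when no subnet is supplied, local IPv4 networks are
// detected and each CIDR is expanded into the IP list handed to discovery.
// A sketch of that flow (the function name is illustrative):
func buildScanList(subnet string) ([]string, error) {
	subnets := []string{subnet}
	if subnet == "" {
		detected, err := GetLocalNetworks()
		if err != nil {
			return nil, err
		}
		subnets = detected
	}
	var ips []string
	for _, s := range subnets {
		expanded, err := ExpandSubnet(s)
		if err != nil {
			return nil, err
		}
		ips = append(ips, expanded...)
	}
	return ips, nil
}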

View File

@@ -5,16 +5,31 @@ import (
"log"
"os"
"os/exec"
"strconv"
"strings"
"time"
"github.com/wild-cloud/wild-central/daemon/internal/config"
)
// ConfigGenerator handles dnsmasq configuration generation
type ConfigGenerator struct{}
type ConfigGenerator struct {
configPath string
}
// NewConfigGenerator creates a new dnsmasq config generator
func NewConfigGenerator() *ConfigGenerator {
return &ConfigGenerator{}
func NewConfigGenerator(configPath string) *ConfigGenerator {
if configPath == "" {
configPath = "/etc/dnsmasq.d/wild-cloud.conf"
}
return &ConfigGenerator{
configPath: configPath,
}
}
// GetConfigPath returns the dnsmasq config file path
func (g *ConfigGenerator) GetConfigPath() string {
return g.configPath
}
// Generate creates a dnsmasq configuration from the app config
@@ -22,8 +37,8 @@ func (g *ConfigGenerator) Generate(cfg *config.GlobalConfig, clouds []config.Ins
resolution_section := ""
for _, cloud := range clouds {
resolution_section += fmt.Sprintf("local=/%s/\naddress=/%s/%s\n", cloud.Domain, cloud.Domain, cfg.Cluster.EndpointIP)
resolution_section += fmt.Sprintf("local=/%s/\naddress=/%s/%s\n", cloud.InternalDomain, cloud.InternalDomain, cfg.Cluster.EndpointIP)
resolution_section += fmt.Sprintf("local=/%s/\naddress=/%s/%s\n", cloud.Cloud.Domain, cloud.Cloud.Domain, cfg.Cluster.EndpointIP)
resolution_section += fmt.Sprintf("local=/%s/\naddress=/%s/%s\n", cloud.Cloud.InternalDomain, cloud.Cloud.InternalDomain, cfg.Cluster.EndpointIP)
}
template := `# Configuration file for dnsmasq.
@@ -71,3 +86,91 @@ func (g *ConfigGenerator) RestartService() error {
}
return nil
}
// ServiceStatus represents the status of the dnsmasq service
type ServiceStatus struct {
Status string `json:"status"`
PID int `json:"pid"`
ConfigFile string `json:"config_file"`
InstancesConfigured int `json:"instances_configured"`
LastRestart time.Time `json:"last_restart"`
}
// GetStatus checks the status of the dnsmasq service
func (g *ConfigGenerator) GetStatus() (*ServiceStatus, error) {
status := &ServiceStatus{
ConfigFile: g.configPath,
}
// Check if service is active
cmd := exec.Command("systemctl", "is-active", "dnsmasq.service")
output, err := cmd.Output()
if err != nil {
status.Status = "inactive"
return status, nil
}
statusStr := strings.TrimSpace(string(output))
status.Status = statusStr
// Get PID if running
if statusStr == "active" {
cmd = exec.Command("systemctl", "show", "dnsmasq.service", "--property=MainPID")
output, err := cmd.Output()
if err == nil {
parts := strings.Split(strings.TrimSpace(string(output)), "=")
if len(parts) == 2 {
if pid, err := strconv.Atoi(parts[1]); err == nil {
status.PID = pid
}
}
}
// Get last restart time
cmd = exec.Command("systemctl", "show", "dnsmasq.service", "--property=ActiveEnterTimestamp")
output, err = cmd.Output()
if err == nil {
parts := strings.Split(strings.TrimSpace(string(output)), "=")
if len(parts) == 2 {
// Parse systemd timestamp format
if t, err := time.Parse("Mon 2006-01-02 15:04:05 MST", parts[1]); err == nil {
status.LastRestart = t
}
}
}
}
// Count instances in config
if data, err := os.ReadFile(g.configPath); err == nil {
// Count "local=/" occurrences (each instance has multiple)
count := strings.Count(string(data), "local=/")
// Each instance creates 2 "local=/" entries (domain and internal domain)
status.InstancesConfigured = count / 2
}
return status, nil
}
// ReadConfig reads the current dnsmasq configuration
func (g *ConfigGenerator) ReadConfig() (string, error) {
data, err := os.ReadFile(g.configPath)
if err != nil {
return "", fmt.Errorf("reading dnsmasq config: %w", err)
}
return string(data), nil
}
// UpdateConfig regenerates and writes the dnsmasq configuration for all instances
func (g *ConfigGenerator) UpdateConfig(cfg *config.GlobalConfig, instances []config.InstanceConfig) error {
// Generate fresh config from scratch
configContent := g.Generate(cfg, instances)
// Write config
log.Printf("Writing dnsmasq config to: %s", g.configPath)
if err := os.WriteFile(g.configPath, []byte(configContent), 0644); err != nil {
return fmt.Errorf("writing dnsmasq config: %w", err)
}
// Restart service to apply changes
return g.RestartService()
}

View File

@@ -9,47 +9,51 @@ import (
"github.com/wild-cloud/wild-central/daemon/internal/context"
"github.com/wild-cloud/wild-central/daemon/internal/secrets"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// Manager handles instance lifecycle operations
type Manager struct {
dataDir string
configMgr *config.Manager
secretsMgr *secrets.Manager
contextMgr *context.Manager
dataDir string
configMgr *config.Manager
secretsMgr *secrets.Manager
contextMgr *context.Manager
}
// NewManager creates a new instance manager
func NewManager(dataDir string) *Manager {
return &Manager{
dataDir: dataDir,
configMgr: config.NewManager(),
secretsMgr: secrets.NewManager(),
contextMgr: context.NewManager(dataDir),
dataDir: dataDir,
configMgr: config.NewManager(),
secretsMgr: secrets.NewManager(),
contextMgr: context.NewManager(dataDir),
}
}
// Instance represents a Wild Cloud instance
type Instance struct {
Name string
Path string
ConfigPath string
Name string
Path string
ConfigPath string
SecretsPath string
}
// GetInstancePath returns the path to an instance directory
// Deprecated: Use tools.GetInstancePath instead
func (m *Manager) GetInstancePath(name string) string {
return filepath.Join(m.dataDir, "instances", name)
return tools.GetInstancePath(m.dataDir, name)
}
// GetInstanceConfigPath returns the path to an instance's config file
// Deprecated: Use tools.GetInstanceConfigPath instead
func (m *Manager) GetInstanceConfigPath(name string) string {
return filepath.Join(m.GetInstancePath(name), "config.yaml")
return tools.GetInstanceConfigPath(m.dataDir, name)
}
// GetInstanceSecretsPath returns the path to an instance's secrets file
// Deprecated: Use tools.GetInstanceSecretsPath instead
func (m *Manager) GetInstanceSecretsPath(name string) string {
return filepath.Join(m.GetInstancePath(name), "secrets.yaml")
return tools.GetInstanceSecretsPath(m.dataDir, name)
}
// InstanceExists checks if an instance exists
@@ -71,7 +75,7 @@ func (m *Manager) CreateInstance(name string) error {
}
// Acquire lock for instance creation
lockPath := filepath.Join(m.dataDir, "instances", ".lock")
lockPath := tools.GetInstancesLockPath(m.dataDir)
return storage.WithLock(lockPath, func() error {
// Create instance directory
if err := storage.EnsureDir(instancePath, 0755); err != nil {
@@ -123,7 +127,7 @@ func (m *Manager) DeleteInstance(name string) error {
}
// Acquire lock for instance deletion
lockPath := filepath.Join(m.dataDir, "instances", ".lock")
lockPath := tools.GetInstancesLockPath(m.dataDir)
return storage.WithLock(lockPath, func() error {
// Remove instance directory
if err := os.RemoveAll(instancePath); err != nil {
@@ -136,7 +140,7 @@ func (m *Manager) DeleteInstance(name string) error {
// ListInstances returns a list of all instance names
func (m *Manager) ListInstances() ([]string, error) {
instancesDir := filepath.Join(m.dataDir, "instances")
instancesDir := tools.GetInstancesPath(m.dataDir)
// Ensure instances directory exists
if !storage.FileExists(instancesDir) {

View File

@@ -1,29 +1,43 @@
package node
import (
"context"
"fmt"
"os"
"os/exec"
"path/filepath"
"strings"
"time"
"github.com/wild-cloud/wild-central/daemon/internal/config"
"github.com/wild-cloud/wild-central/daemon/internal/setup"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// Manager handles node configuration and state management
type Manager struct {
dataDir string
configMgr *config.Manager
talosctl *tools.Talosctl
dataDir string
configMgr *config.Manager
talosctl *tools.Talosctl
}
// NewManager creates a new node manager
func NewManager(dataDir string) *Manager {
func NewManager(dataDir string, instanceName string) *Manager {
var talosctl *tools.Talosctl
// If instanceName is provided, use instance-specific talosconfig
// Otherwise, create basic talosctl (will use --insecure mode)
if instanceName != "" {
talosconfigPath := tools.GetTalosconfigPath(dataDir, instanceName)
talosctl = tools.NewTalosconfigWithConfig(talosconfigPath)
} else {
talosctl = tools.NewTalosctl()
}
return &Manager{
dataDir: dataDir,
configMgr: config.NewManager(),
talosctl: tools.NewTalosctl(),
talosctl: talosctl,
}
}
@@ -58,7 +72,7 @@ type ApplyOptions struct {
// GetInstancePath returns the path to an instance directory
func (m *Manager) GetInstancePath(instanceName string) string {
return filepath.Join(m.dataDir, "instances", instanceName)
return tools.GetInstancePath(m.dataDir, instanceName)
}
// List returns all nodes for an instance
@@ -242,23 +256,53 @@ func (m *Manager) Add(instanceName string, node *Node) error {
}
// Delete removes a node from config.yaml
func (m *Manager) Delete(instanceName, nodeIdentifier string) error {
// If skipReset is false, the node will be reset before deletion (with 30s timeout)
func (m *Manager) Delete(instanceName, nodeIdentifier string, skipReset bool) error {
// Get node to find hostname
node, err := m.Get(instanceName, nodeIdentifier)
if err != nil {
return err
}
// Reset node first unless skipReset is true
if !skipReset {
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
// Use goroutine to respect context timeout
done := make(chan error, 1)
go func() {
done <- m.Reset(instanceName, nodeIdentifier)
}()
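// If the timeout below fires, the Reset call keeps running in the
// background; only the wait on its result is abandoned.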
select {
case err := <-done:
if err != nil {
return fmt.Errorf("failed to reset node before deletion (use skip_reset=true to force delete): %w", err)
}
case <-ctx.Done():
return fmt.Errorf("node reset timed out after 30 seconds (use skip_reset=true to force delete)")
}
}
// Delete node from config.yaml
return m.deleteFromConfig(instanceName, node.Hostname)
}
// deleteFromConfig removes a node entry from config.yaml
func (m *Manager) deleteFromConfig(instanceName, hostname string) error {
instancePath := m.GetInstancePath(instanceName)
configPath := filepath.Join(instancePath, "config.yaml")
// Delete node from config.yaml
// Path: cluster.nodes.active.{hostname}
nodePath := fmt.Sprintf("cluster.nodes.active.%s", node.Hostname)
// Path: .cluster.nodes.active["hostname"]
// Use bracket notation to safely handle hostnames with special characters
nodePath := fmt.Sprintf(".cluster.nodes.active[\"%s\"]", hostname)
yq := tools.NewYQ()
// Use yq to delete the node
_, err = yq.Exec("eval", "-i", fmt.Sprintf("del(%s)", nodePath), configPath)
delExpr := fmt.Sprintf("del(%s)", nodePath)
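// e.g. del(.cluster.nodes.active["node-01"]) for hostname "node-01"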
_, err := yq.Exec("eval", "-i", delExpr, configPath)
if err != nil {
return fmt.Errorf("failed to delete node: %w", err)
}
@@ -267,10 +311,20 @@ func (m *Manager) Delete(instanceName, nodeIdentifier string) error {
}
// DetectHardware queries node hardware information via talosctl
// Automatically detects maintenance mode by trying insecure first, then secure
func (m *Manager) DetectHardware(nodeIP string) (*HardwareInfo, error) {
// Query node with insecure flag (maintenance mode)
insecure := true
// Try insecure first (maintenance mode)
hwInfo, err := m.detectHardwareWithMode(nodeIP, true)
if err == nil {
return hwInfo, nil
}
// Fall back to secure (configured node)
return m.detectHardwareWithMode(nodeIP, false)
}
// detectHardwareWithMode queries node hardware with specified connection mode
func (m *Manager) detectHardwareWithMode(nodeIP string, insecure bool) (*HardwareInfo, error) {
// Try to get default interface (with default route)
iface, err := m.talosctl.GetDefaultInterface(nodeIP, insecure)
if err != nil {
@@ -298,7 +352,7 @@ func (m *Manager) DetectHardware(nodeIP string) (*HardwareInfo, error) {
Interface: iface,
Disks: disks,
SelectedDisk: selectedDisk,
MaintenanceMode: true,
MaintenanceMode: insecure, // If we used insecure, it's in maintenance mode
}, nil
}
@@ -335,11 +389,11 @@ func (m *Manager) Apply(instanceName, nodeIdentifier string, opts ApplyOptions)
}
}
// Always auto-fetch templates if they don't exist
// Always auto-extract templates from embedded files if they don't exist
templatesDir := filepath.Join(setupDir, "patch.templates")
if !m.templatesExist(templatesDir) {
if err := m.copyTemplatesFromDirectory(templatesDir); err != nil {
return fmt.Errorf("failed to copy templates: %w", err)
if err := m.extractEmbeddedTemplates(templatesDir); err != nil {
return fmt.Errorf("failed to extract templates: %w", err)
}
}
@@ -379,9 +433,9 @@ func (m *Manager) Apply(instanceName, nodeIdentifier string, opts ApplyOptions)
// Determine which IP to use and whether node is in maintenance mode
//
// Three scenarios:
// 1. Production node (currentIP empty/same, maintenance=false): use targetIP, no --insecure
// 1. Production node (already applied, maintenance=false): use targetIP, no --insecure
// 2. IP changing (currentIP != targetIP): use currentIP, --insecure (always maintenance)
// 3. Maintenance at target (maintenance=true, no IP change): use targetIP, --insecure
// 3. Fresh/maintenance node (never applied OR maintenance=true): use targetIP, --insecure
var deployIP string
var maintenanceMode bool
@@ -389,12 +443,13 @@ func (m *Manager) Apply(instanceName, nodeIdentifier string, opts ApplyOptions)
// Scenario 2: IP is changing - node is at currentIP, moving to targetIP
deployIP = node.CurrentIP
maintenanceMode = true
} else if node.Maintenance {
// Scenario 3: Explicit maintenance mode, no IP change
} else if node.Maintenance || !node.Applied {
// Scenario 3: Explicit maintenance mode OR never been applied (fresh node)
// Fresh nodes need --insecure because they have self-signed certificates
deployIP = node.TargetIP
maintenanceMode = true
} else {
// Scenario 1: Production node at target IP
// Scenario 1: Production node at target IP (already applied, not in maintenance)
deployIP = node.TargetIP
maintenanceMode = false
}
@@ -407,8 +462,8 @@ func (m *Manager) Apply(instanceName, nodeIdentifier string, opts ApplyOptions)
// Post-application updates: move to production IP, exit maintenance mode
node.Applied = true
node.CurrentIP = node.TargetIP // Node now on production IP
node.Maintenance = false // Exit maintenance mode
node.CurrentIP = node.TargetIP // Node now on production IP
node.Maintenance = false // Exit maintenance mode
if err := m.updateNodeStatus(instanceName, node); err != nil {
return fmt.Errorf("failed to update node status: %w", err)
}
@@ -504,51 +559,36 @@ func (m *Manager) templatesExist(templatesDir string) bool {
return err1 == nil && err2 == nil
}
// copyTemplatesFromDirectory copies patch templates from directory/ to instance
func (m *Manager) copyTemplatesFromDirectory(destDir string) error {
// Find the directory/setup/cluster-nodes/patch.templates directory
// It should be in the same parent as the data directory
sourceDir := filepath.Join(filepath.Dir(m.dataDir), "directory", "setup", "cluster-nodes", "patch.templates")
// Check if source directory exists
if _, err := os.Stat(sourceDir); err != nil {
return fmt.Errorf("source templates directory not found: %s", sourceDir)
}
// extractEmbeddedTemplates extracts patch templates from embedded files to instance directory
func (m *Manager) extractEmbeddedTemplates(destDir string) error {
// Create destination directory
if err := os.MkdirAll(destDir, 0755); err != nil {
return fmt.Errorf("failed to create templates directory: %w", err)
}
// Copy controlplane.yaml
if err := m.copyFile(
filepath.Join(sourceDir, "controlplane.yaml"),
filepath.Join(destDir, "controlplane.yaml"),
); err != nil {
return fmt.Errorf("failed to copy controlplane template: %w", err)
// Get embedded template files
controlplaneData, err := setup.GetClusterNodesFile("patch.templates/controlplane.yaml")
if err != nil {
return fmt.Errorf("failed to get controlplane template: %w", err)
}
// Copy worker.yaml
if err := m.copyFile(
filepath.Join(sourceDir, "worker.yaml"),
filepath.Join(destDir, "worker.yaml"),
); err != nil {
return fmt.Errorf("failed to copy worker template: %w", err)
workerData, err := setup.GetClusterNodesFile("patch.templates/worker.yaml")
if err != nil {
return fmt.Errorf("failed to get worker template: %w", err)
}
// Write templates
if err := os.WriteFile(filepath.Join(destDir, "controlplane.yaml"), controlplaneData, 0644); err != nil {
return fmt.Errorf("failed to write controlplane template: %w", err)
}
if err := os.WriteFile(filepath.Join(destDir, "worker.yaml"), workerData, 0644); err != nil {
return fmt.Errorf("failed to write worker template: %w", err)
}
return nil
}
// copyFile copies a file from src to dst
func (m *Manager) copyFile(src, dst string) error {
data, err := os.ReadFile(src)
if err != nil {
return err
}
return os.WriteFile(dst, data, 0644)
}
// updateNodeStatus updates node status flags in config.yaml
func (m *Manager) updateNodeStatus(instanceName string, node *Node) error {
instancePath := m.GetInstancePath(instanceName)
@@ -576,17 +616,21 @@ func (m *Manager) updateNodeStatus(instanceName string, node *Node) error {
}
// Update configured flag
configuredValue := "false"
if node.Configured {
if err := yq.Set(configPath, basePath+".configured", "true"); err != nil {
return err
}
configuredValue = "true"
}
if err := yq.Set(configPath, basePath+".configured", configuredValue); err != nil {
return err
}
// Update applied flag
appliedValue := "false"
if node.Applied {
if err := yq.Set(configPath, basePath+".applied", "true"); err != nil {
return err
}
appliedValue = "true"
}
if err := yq.Set(configPath, basePath+".applied", appliedValue); err != nil {
return err
}
return nil
@@ -660,9 +704,55 @@ func (m *Manager) Update(instanceName string, hostname string, updates map[strin
return nil
}
// FetchTemplates copies patch templates from directory/ to instance
// FetchTemplates extracts patch templates from embedded files to instance
func (m *Manager) FetchTemplates(instanceName string) error {
instancePath := m.GetInstancePath(instanceName)
destDir := filepath.Join(instancePath, "setup", "cluster-nodes", "patch.templates")
return m.copyTemplatesFromDirectory(destDir)
return m.extractEmbeddedTemplates(destDir)
}
// Reset resets a node to maintenance mode
func (m *Manager) Reset(instanceName, nodeIdentifier string) error {
// Get node
node, err := m.Get(instanceName, nodeIdentifier)
if err != nil {
return fmt.Errorf("node not found: %w", err)
}
// Determine IP to reset
resetIP := node.CurrentIP
if resetIP == "" {
resetIP = node.TargetIP
}
// Execute reset command with graceful=false and reboot flags
talosconfigPath := tools.GetTalosconfigPath(m.dataDir, instanceName)
cmd := exec.Command("talosctl", "-n", resetIP, "--talosconfig", talosconfigPath, "reset", "--graceful=false", "--reboot")
output, err := cmd.CombinedOutput()
if err != nil {
// Check if error is due to node rebooting (expected after reset command)
outputStr := string(output)
if strings.Contains(outputStr, "connection refused") || strings.Contains(outputStr, "Unavailable") {
// This is expected - node is rebooting after successful reset
// Continue with config cleanup
} else {
// Real error - return it
return fmt.Errorf("failed to reset node: %w\nOutput: %s", err, outputStr)
}
}
// Update node status to maintenance mode, then remove from config
node.Maintenance = true
node.Configured = false
node.Applied = false
if err := m.updateNodeStatus(instanceName, node); err != nil {
return fmt.Errorf("failed to update node status: %w", err)
}
// Remove node from config.yaml after successful reset
if err := m.deleteFromConfig(instanceName, node.Hostname); err != nil {
return fmt.Errorf("failed to remove node from config: %w", err)
}
return nil
}

View File

@@ -8,8 +8,12 @@ import (
"time"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// Bootstrap step constants
const totalBootstrapSteps = 7
// Manager handles async operation tracking
type Manager struct {
dataDir string
@@ -22,23 +26,38 @@ func NewManager(dataDir string) *Manager {
}
}
// BootstrapProgress tracks detailed bootstrap progress
type BootstrapProgress struct {
CurrentStep int `json:"current_step"` // 0-6
StepName string `json:"step_name"`
Attempt int `json:"attempt"`
MaxAttempts int `json:"max_attempts"`
StepDescription string `json:"step_description"`
}
// OperationDetails contains operation-specific details
type OperationDetails struct {
BootstrapProgress *BootstrapProgress `json:"bootstrap,omitempty"`
}
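// When populated, details serialize as, for example (values illustrative):
//
//   "details": {"bootstrap": {"current_step": 2, "step_name": "etcd health",
//     "attempt": 1, "max_attempts": 30, "step_description": "..."}}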
// Operation represents a long-running operation
type Operation struct {
ID string `json:"id"`
Type string `json:"type"` // discover, setup, download, bootstrap
Target string `json:"target"`
Instance string `json:"instance"`
Status string `json:"status"` // pending, running, completed, failed, cancelled
Message string `json:"message,omitempty"`
Progress int `json:"progress"` // 0-100
LogFile string `json:"logFile,omitempty"` // Path to output log file
StartedAt time.Time `json:"started_at"`
EndedAt time.Time `json:"ended_at,omitempty"`
ID string `json:"id"`
Type string `json:"type"` // discover, setup, download, bootstrap
Target string `json:"target"`
Instance string `json:"instance"`
Status string `json:"status"` // pending, running, completed, failed, cancelled
Message string `json:"message,omitempty"`
Progress int `json:"progress"` // 0-100
Details *OperationDetails `json:"details,omitempty"` // Operation-specific details
LogFile string `json:"logFile,omitempty"` // Path to output log file
StartedAt time.Time `json:"started_at"`
EndedAt time.Time `json:"ended_at,omitempty"`
}
// GetOperationsDir returns the operations directory for an instance
func (m *Manager) GetOperationsDir(instanceName string) string {
return filepath.Join(m.dataDir, "instances", instanceName, "operations")
return tools.GetInstanceOperationsPath(m.dataDir, instanceName)
}
// generateID generates a unique operation ID
@@ -78,20 +97,6 @@ func (m *Manager) Start(instanceName, opType, target string) (string, error) {
return opID, nil
}
// Get returns operation status
func (m *Manager) Get(opID string) (*Operation, error) {
// Operation ID contains instance name, but we need to find it
// For now, we'll scan all instances (not ideal but simple)
// Better approach: encode instance in operation ID or maintain index
// Simplified: assume operation ID format is op_{type}_{target}_{timestamp}
// We need to know which instance to look in
// For now, return error if we can't find it
// This needs improvement in actual implementation
return nil, fmt.Errorf("operation lookup not implemented - need instance context")
}
// GetByInstance returns an operation for a specific instance
func (m *Manager) GetByInstance(instanceName, opID string) (*Operation, error) {
opsDir := m.GetOperationsDir(instanceName)
@@ -230,13 +235,38 @@ func (m *Manager) Cleanup(instanceName string, olderThan time.Duration) error {
for _, op := range ops {
if (op.Status == "completed" || op.Status == "failed" || op.Status == "cancelled") &&
!op.EndedAt.IsZero() && op.EndedAt.Before(cutoff) {
m.Delete(instanceName, op.ID)
_ = m.Delete(instanceName, op.ID)
}
}
return nil
}
// UpdateBootstrapProgress updates bootstrap-specific progress details
func (m *Manager) UpdateBootstrapProgress(instanceName, opID string, step int, stepName string, attempt, maxAttempts int, stepDescription string) error {
op, err := m.GetByInstance(instanceName, opID)
if err != nil {
return err
}
if op.Details == nil {
op.Details = &OperationDetails{}
}
op.Details.BootstrapProgress = &BootstrapProgress{
CurrentStep: step,
StepName: stepName,
Attempt: attempt,
MaxAttempts: maxAttempts,
StepDescription: stepDescription,
}
op.Progress = (step * 100) / (totalBootstrapSteps - 1)
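// Linear map of steps 0..6 onto 0..100% (e.g. step 3 reports 50%).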
op.Message = fmt.Sprintf("Step %d/%d: %s (attempt %d/%d)", step+1, totalBootstrapSteps, stepName, attempt, maxAttempts)
return m.writeOperation(op)
}
// writeOperation writes operation to disk
func (m *Manager) writeOperation(op *Operation) error {
opsDir := m.GetOperationsDir(op.Instance)

View File

@@ -9,6 +9,7 @@ import (
"path/filepath"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// Manager handles PXE boot asset management
@@ -35,7 +36,7 @@ type Asset struct {
// GetPXEDir returns the PXE directory for an instance
func (m *Manager) GetPXEDir(instanceName string) string {
return filepath.Join(m.dataDir, "instances", instanceName, "pxe")
return tools.GetInstancePXEPath(m.dataDir, instanceName)
}
// ListAssets returns available PXE assets for an instance

View File

@@ -95,6 +95,11 @@ func (m *Manager) GetSecret(secretsPath, key string) (string, error) {
return "", fmt.Errorf("getting secret %s: %w", key, err)
}
// yq returns "null" for non-existent keys
if value == "" || value == "null" {
return "", fmt.Errorf("secret not found: %s", key)
}
return value, nil
}

File diff suppressed because it is too large

142
internal/services/config.go Normal file
View File

@@ -0,0 +1,142 @@
package services
import (
"fmt"
"os"
"strings"
"gopkg.in/yaml.v3"
"github.com/wild-cloud/wild-central/daemon/internal/contracts"
"github.com/wild-cloud/wild-central/daemon/internal/operations"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// UpdateConfig updates service configuration and optionally redeploys
func (m *Manager) UpdateConfig(instanceName, serviceName string, update contracts.ServiceConfigUpdate, broadcaster *operations.Broadcaster) (*contracts.ServiceConfigResponse, error) {
// 1. Validate service exists
manifest, err := m.GetManifest(serviceName)
if err != nil {
return nil, fmt.Errorf("service not found: %w", err)
}
namespace := manifest.Namespace
if deployment, ok := serviceDeployments[serviceName]; ok {
namespace = deployment.namespace
}
// 2. Load instance config
configPath := tools.GetInstanceConfigPath(m.dataDir, instanceName)
if !storage.FileExists(configPath) {
return nil, fmt.Errorf("config file not found for instance %s", instanceName)
}
configData, err := os.ReadFile(configPath)
if err != nil {
return nil, fmt.Errorf("failed to read config: %w", err)
}
var config map[string]interface{}
if err := yaml.Unmarshal(configData, &config); err != nil {
return nil, fmt.Errorf("failed to parse config: %w", err)
}
// 3. Validate config keys against service manifest
validPaths := make(map[string]bool)
for _, path := range manifest.ConfigReferences {
validPaths[path] = true
}
for _, cfg := range manifest.ServiceConfig {
validPaths[cfg.Path] = true
}
for key := range update.Config {
if !validPaths[key] {
return nil, fmt.Errorf("invalid config key '%s' for service %s", key, serviceName)
}
}
// 4. Update config values
for key, value := range update.Config {
if err := setNestedValue(config, key, value); err != nil {
return nil, fmt.Errorf("failed to set config key '%s': %w", key, err)
}
}
// 5. Write updated config
updatedData, err := yaml.Marshal(config)
if err != nil {
return nil, fmt.Errorf("failed to marshal config: %w", err)
}
if err := os.WriteFile(configPath, updatedData, 0644); err != nil {
return nil, fmt.Errorf("failed to write config: %w", err)
}
// 6. Redeploy if requested
redeployed := false
if update.Redeploy {
// Fetch fresh templates if requested
if update.Fetch {
if err := m.Fetch(instanceName, serviceName); err != nil {
return nil, fmt.Errorf("failed to fetch templates: %w", err)
}
}
// Recompile templates
if err := m.Compile(instanceName, serviceName); err != nil {
return nil, fmt.Errorf("failed to recompile templates: %w", err)
}
// Redeploy service
if err := m.Deploy(instanceName, serviceName, "", broadcaster); err != nil {
return nil, fmt.Errorf("failed to redeploy service: %w", err)
}
redeployed = true
}
// 7. Build response
message := "Service configuration updated successfully"
if redeployed {
message = "Service configuration updated and redeployed successfully"
}
return &contracts.ServiceConfigResponse{
Service: serviceName,
Namespace: namespace,
Config: update.Config,
Redeployed: redeployed,
Message: message,
}, nil
}
// setNestedValue sets a value in a nested map using dot notation
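// For example, setNestedValue(cfg, "cloud.dns.ip", "192.168.8.50") (an
// illustrative key) creates the intermediate "cloud" and "dns" maps as needed.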
func setNestedValue(data map[string]interface{}, path string, value interface{}) error {
keys := strings.Split(path, ".")
current := data
for i, key := range keys {
if i == len(keys)-1 {
// Last key - set the value
current[key] = value
return nil
}
// Navigate to the next level
if next, ok := current[key].(map[string]interface{}); ok {
current = next
} else if current[key] == nil {
// Create intermediate map if it doesn't exist
next := make(map[string]interface{})
current[key] = next
current = next
} else {
return fmt.Errorf("path '%s' conflicts with existing non-map value at '%s'", path, key)
}
}
return nil
}

287
internal/services/logs.go Normal file
View File

@@ -0,0 +1,287 @@
package services
import (
"bufio"
"encoding/json"
"fmt"
"io"
"time"
"github.com/wild-cloud/wild-central/daemon/internal/contracts"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// GetLogs retrieves buffered logs from a service
func (m *Manager) GetLogs(instanceName, serviceName string, opts contracts.ServiceLogsRequest) (*contracts.ServiceLogsResponse, error) {
// 1. Get service namespace
manifest, err := m.GetManifest(serviceName)
if err != nil {
return nil, fmt.Errorf("service not found: %w", err)
}
namespace := manifest.Namespace
if deployment, ok := serviceDeployments[serviceName]; ok {
namespace = deployment.namespace
}
// 2. Get kubeconfig path
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
if !storage.FileExists(kubeconfigPath) {
return nil, fmt.Errorf("kubeconfig not found - cluster may not be bootstrapped")
}
kubectl := tools.NewKubectl(kubeconfigPath)
// 3. Get pod name (use first pod if no specific container specified)
podName := ""
if opts.Container == "" {
// Get first pod in namespace
podName, err = kubectl.GetFirstPodName(namespace)
if err != nil {
// Check if it's because there are no pods
pods, _ := kubectl.GetPods(namespace, false)
if len(pods) == 0 {
// Return empty logs response instead of error when no pods exist
return &contracts.ServiceLogsResponse{
Lines: []string{"No pods found for service. The service may not be deployed yet."},
}, nil
}
return nil, fmt.Errorf("failed to find pod: %w", err)
}
// If no container specified, get first container
containers, err := kubectl.GetPodContainers(namespace, podName)
if err != nil {
return nil, fmt.Errorf("failed to get pod containers: %w", err)
}
if len(containers) > 0 {
opts.Container = containers[0]
}
} else {
// Find pod with specified container
pods, err := kubectl.GetPods(namespace, false)
if err != nil {
return nil, fmt.Errorf("failed to list pods: %w", err)
}
if len(pods) > 0 {
podName = pods[0].Name
} else {
return nil, fmt.Errorf("no pods found in namespace %s", namespace)
}
}
// 4. Set default tail if not specified
if opts.Tail == 0 {
opts.Tail = 100
}
// Enforce maximum tail
if opts.Tail > 5000 {
opts.Tail = 5000
}
// 5. Get logs
logOpts := tools.LogOptions{
Container: opts.Container,
Tail: opts.Tail,
Previous: opts.Previous,
Since: opts.Since,
SinceSeconds: 0,
}
logEntries, err := kubectl.GetLogs(namespace, podName, logOpts)
if err != nil {
return nil, fmt.Errorf("failed to get logs: %w", err)
}
// 6. Convert structured logs to string lines
lines := make([]string, 0, len(logEntries))
for _, entry := range logEntries {
lines = append(lines, entry.Message)
}
truncated := false
if len(lines) > opts.Tail {
lines = lines[len(lines)-opts.Tail:]
truncated = true
}
return &contracts.ServiceLogsResponse{
Service: serviceName,
Namespace: namespace,
Container: opts.Container,
Lines: lines,
Truncated: truncated,
Timestamp: time.Now(),
}, nil
}
// StreamLogs streams logs from a service using SSE
func (m *Manager) StreamLogs(instanceName, serviceName string, opts contracts.ServiceLogsRequest, writer io.Writer) error {
// 1. Get service namespace
manifest, err := m.GetManifest(serviceName)
if err != nil {
return fmt.Errorf("service not found: %w", err)
}
namespace := manifest.Namespace
if deployment, ok := serviceDeployments[serviceName]; ok {
namespace = deployment.namespace
}
// 2. Get kubeconfig path
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
if !storage.FileExists(kubeconfigPath) {
return fmt.Errorf("kubeconfig not found - cluster may not be bootstrapped")
}
kubectl := tools.NewKubectl(kubeconfigPath)
// 3. Get pod name
podName := ""
if opts.Container == "" {
podName, err = kubectl.GetFirstPodName(namespace)
if err != nil {
// Check if it's because there are no pods
pods, _ := kubectl.GetPods(namespace, false)
if len(pods) == 0 {
// Send a message event indicating no pods
fmt.Fprintf(writer, "data: No pods found for service. The service may not be deployed yet.\n\n")
return nil
}
return fmt.Errorf("failed to find pod: %w", err)
}
// Get first container
containers, err := kubectl.GetPodContainers(namespace, podName)
if err != nil {
return fmt.Errorf("failed to get pod containers: %w", err)
}
if len(containers) > 0 {
opts.Container = containers[0]
}
} else {
pods, err := kubectl.GetPods(namespace, false)
if err != nil {
return fmt.Errorf("failed to list pods: %w", err)
}
if len(pods) > 0 {
podName = pods[0].Name
} else {
return fmt.Errorf("no pods found in namespace %s", namespace)
}
}
// 4. Set default tail for streaming
if opts.Tail == 0 {
opts.Tail = 50
}
// 5. Stream logs
logOpts := tools.LogOptions{
Container: opts.Container,
Tail: opts.Tail,
Since: opts.Since,
}
cmd, err := kubectl.StreamLogs(namespace, podName, logOpts)
if err != nil {
return fmt.Errorf("failed to start log stream: %w", err)
}
// Get stdout pipe
stdout, err := cmd.StdoutPipe()
if err != nil {
return fmt.Errorf("failed to get stdout pipe: %w", err)
}
stderr, err := cmd.StderrPipe()
if err != nil {
return fmt.Errorf("failed to get stderr pipe: %w", err)
}
// Start command
if err := cmd.Start(); err != nil {
return fmt.Errorf("failed to start kubectl logs: %w", err)
}
// Stream logs line by line as SSE events
scanner := bufio.NewScanner(stdout)
errScanner := bufio.NewScanner(stderr)
// Channel to signal completion
done := make(chan error, 1)
// Read stderr in background
go func() {
for errScanner.Scan() {
event := contracts.ServiceLogsSSEEvent{
Type: "error",
Error: errScanner.Text(),
Container: opts.Container,
Timestamp: time.Now(),
}
_ = writeSSEEvent(writer, event)
}
}()
// Read stdout
go func() {
for scanner.Scan() {
event := contracts.ServiceLogsSSEEvent{
Type: "log",
Line: scanner.Text(),
Container: opts.Container,
Timestamp: time.Now(),
}
if err := writeSSEEvent(writer, event); err != nil {
done <- err
return
}
}
if err := scanner.Err(); err != nil {
done <- err
return
}
done <- nil
}()
// Wait for completion or error
err = <-done
_ = cmd.Process.Kill()
// Send end event
endEvent := contracts.ServiceLogsSSEEvent{
Type: "end",
Timestamp: time.Now(),
}
_ = writeSSEEvent(writer, endEvent)
return err
}
// writeSSEEvent writes an SSE event to the writer
func writeSSEEvent(w io.Writer, event contracts.ServiceLogsSSEEvent) error {
// Marshal the event to JSON safely
jsonData, err := json.Marshal(event)
if err != nil {
return fmt.Errorf("failed to marshal SSE event: %w", err)
}
// Write SSE format: "data: <json>\n\n"
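// Assuming lowercase JSON tags on ServiceLogsSSEEvent, a frame looks like:
//   data: {"type":"log","line":"...","container":"app","timestamp":"..."}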
data := fmt.Sprintf("data: %s\n\n", jsonData)
_, err = w.Write([]byte(data))
if err != nil {
return err
}
// Flush if writer supports it
if flusher, ok := w.(interface{ Flush() }); ok {
flusher.Flush()
}
return nil
}

View File

@@ -24,9 +24,9 @@ type ServiceManifest struct {
// ConfigDefinition defines config that should be prompted during service setup
type ConfigDefinition struct {
Path string `yaml:"path" json:"path"` // Config path to set
Prompt string `yaml:"prompt" json:"prompt"` // User prompt text
Default string `yaml:"default" json:"default"` // Default value (supports templates)
Path string `yaml:"path" json:"path"` // Config path to set
Prompt string `yaml:"prompt" json:"prompt"` // User prompt text
Default string `yaml:"default" json:"default"` // Default value (supports templates)
Type string `yaml:"type,omitempty" json:"type,omitempty"` // Value type: string|int|bool (default: string)
}

View File

@@ -2,6 +2,7 @@ package services
import (
"fmt"
"io/fs"
"os"
"os/exec"
"path/filepath"
@@ -10,30 +11,56 @@ import (
"gopkg.in/yaml.v3"
"github.com/wild-cloud/wild-central/daemon/internal/operations"
"github.com/wild-cloud/wild-central/daemon/internal/setup"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// Manager handles base service operations
type Manager struct {
dataDir string
servicesDir string // Path to services directory
manifests map[string]*ServiceManifest // Cached service manifests
dataDir string
manifests map[string]*ServiceManifest // Cached service manifests
}
// NewManager creates a new services manager
func NewManager(dataDir, servicesDir string) *Manager {
// Note: Service definitions are now loaded from embedded setup files
func NewManager(dataDir string) *Manager {
m := &Manager{
dataDir: dataDir,
servicesDir: servicesDir,
dataDir: dataDir,
}
// Load all service manifests
manifests, err := LoadAllManifests(servicesDir)
if err != nil {
// Log error but continue - services without manifests will fall back to hardcoded map
fmt.Printf("Warning: failed to load service manifests: %v\n", err)
manifests = make(map[string]*ServiceManifest)
// Load all service manifests from embedded files
manifests := make(map[string]*ServiceManifest)
services, err := setup.ListServices()
if err == nil {
for _, serviceName := range services {
manifest, err := setup.GetManifest(serviceName)
if err == nil {
// Convert setup.ServiceManifest to services.ServiceManifest
// Convert setup.ConfigDefinition map to services.ConfigDefinition map
serviceConfig := make(map[string]ConfigDefinition)
for key, cfg := range manifest.ServiceConfig {
serviceConfig[key] = ConfigDefinition{
Path: cfg.Path,
Prompt: cfg.Prompt,
Default: cfg.Default,
Type: cfg.Type,
}
}
manifests[serviceName] = &ServiceManifest{
Name: manifest.Name,
Description: manifest.Description,
Namespace: manifest.Namespace,
Category: manifest.Category,
Dependencies: manifest.Dependencies,
ConfigReferences: manifest.ConfigReferences,
ServiceConfig: serviceConfig,
}
}
}
} else {
fmt.Printf("Warning: failed to load service manifests from embedded files: %v\n", err)
}
m.manifests = manifests
@@ -42,12 +69,13 @@ func NewManager(dataDir, servicesDir string) *Manager {
// Service represents a base service
type Service struct {
Name string `json:"name"`
Description string `json:"description"`
Status string `json:"status"`
Version string `json:"version"`
Namespace string `json:"namespace"`
Name string `json:"name"`
Description string `json:"description"`
Status string `json:"status"`
Version string `json:"version"`
Namespace string `json:"namespace"`
Dependencies []string `json:"dependencies,omitempty"`
HasConfig bool `json:"hasConfig"` // Whether service has configurable fields
}
// Base services in Wild Cloud (kept for reference/validation)
@@ -91,7 +119,8 @@ func (m *Manager) checkServiceStatus(instanceName, serviceName string) string {
// Special case: NFS doesn't have a deployment, check for StorageClass instead
if serviceName == "nfs" {
cmd := exec.Command("kubectl", "--kubeconfig", kubeconfigPath, "get", "storageclass", "nfs", "-o", "name")
cmd := exec.Command("kubectl", "get", "storageclass", "nfs", "-o", "name")
tools.WithKubeconfig(cmd, kubeconfigPath)
if err := cmd.Run(); err == nil {
return "deployed"
}
@@ -124,28 +153,25 @@ func (m *Manager) checkServiceStatus(instanceName, serviceName string) string {
func (m *Manager) List(instanceName string) ([]Service, error) {
services := []Service{}
// Discover services from the services directory
entries, err := os.ReadDir(m.servicesDir)
// Discover services from embedded setup files
serviceNames, err := setup.ListServices()
if err != nil {
return nil, fmt.Errorf("failed to read services directory: %w", err)
return nil, fmt.Errorf("failed to list services from embedded files: %w", err)
}
for _, entry := range entries {
if !entry.IsDir() {
continue // Skip non-directories like README.md
}
name := entry.Name()
for _, name := range serviceNames {
// Get service info from manifest if available
var namespace, description, version string
var dependencies []string
var hasConfig bool
if manifest, ok := m.manifests[name]; ok {
namespace = manifest.Namespace
description = manifest.Description
version = manifest.Category // Using category as version for now
dependencies = manifest.Dependencies
hasConfig = len(manifest.ServiceConfig) > 0
} else {
// Fall back to hardcoded map
namespace = name + "-system" // default
@@ -161,6 +187,7 @@ func (m *Manager) List(instanceName string) ([]Service, error) {
Description: description,
Version: version,
Dependencies: dependencies,
HasConfig: hasConfig,
}
services = append(services, service)
@@ -232,11 +259,17 @@ func (m *Manager) InstallAll(instanceName string, fetch, deploy bool, opID strin
func (m *Manager) Delete(instanceName, serviceName string) error {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
serviceDir := filepath.Join(m.servicesDir, serviceName)
manifestsFile := filepath.Join(serviceDir, "manifests.yaml")
// Check if service exists in embedded files
if !setup.ServiceExists(serviceName) {
return fmt.Errorf("service %s not found", serviceName)
}
// Get manifests file from embedded setup or instance directory
instanceServiceDir := filepath.Join(tools.GetInstancePath(m.dataDir, instanceName), "setup", "cluster-services", serviceName)
manifestsFile := filepath.Join(instanceServiceDir, "manifests.yaml")
if !storage.FileExists(manifestsFile) {
return fmt.Errorf("service %s not found", serviceName)
return fmt.Errorf("service manifests not found - service may not be installed")
}
cmd := exec.Command("kubectl", "delete", "-f", manifestsFile)
@@ -292,46 +325,50 @@ func (m *Manager) GetConfigReferences(serviceName string) ([]string, error) {
return manifest.ConfigReferences, nil
}
// Fetch copies service files from directory to instance
// Fetch extracts service files from embedded setup to instance
func (m *Manager) Fetch(instanceName, serviceName string) error {
// 1. Validate service exists in directory
sourceDir := filepath.Join(m.servicesDir, serviceName)
if !dirExists(sourceDir) {
return fmt.Errorf("service %s not found in directory", serviceName)
// 1. Validate service exists in embedded files
if !setup.ServiceExists(serviceName) {
return fmt.Errorf("service %s not found in embedded files", serviceName)
}
// 2. Create instance service directory
instanceDir := filepath.Join(m.dataDir, "instances", instanceName,
instanceDir := filepath.Join(tools.GetInstancePath(m.dataDir, instanceName),
"setup", "cluster-services", serviceName)
if err := os.MkdirAll(instanceDir, 0755); err != nil {
return fmt.Errorf("failed to create service directory: %w", err)
}
// 3. Copy files:
// 3. Extract files from embedded setup:
// - README.md (if exists, optional)
// - install.sh (if exists, optional)
// - wild-manifest.yaml
// - kustomize.template/* (if exists, optional)
// Copy README.md
copyFileIfExists(filepath.Join(sourceDir, "README.md"),
filepath.Join(instanceDir, "README.md"))
// Copy install.sh (optional)
installSh := filepath.Join(sourceDir, "install.sh")
if fileExists(installSh) {
if err := copyFile(installSh, filepath.Join(instanceDir, "install.sh")); err != nil {
return fmt.Errorf("failed to copy install.sh: %w", err)
}
// Make install.sh executable
os.Chmod(filepath.Join(instanceDir, "install.sh"), 0755)
// Extract README.md if it exists
if readmeData, err := setup.GetServiceFile(serviceName, "README.md"); err == nil {
_ = os.WriteFile(filepath.Join(instanceDir, "README.md"), readmeData, 0644)
}
// Copy kustomize.template directory if it exists
templateDir := filepath.Join(sourceDir, "kustomize.template")
if dirExists(templateDir) {
// Extract install.sh if it exists
if installData, err := setup.GetServiceFile(serviceName, "install.sh"); err == nil {
installPath := filepath.Join(instanceDir, "install.sh")
if err := os.WriteFile(installPath, installData, 0755); err != nil {
return fmt.Errorf("failed to write install.sh: %w", err)
}
}
// Extract wild-manifest.yaml
if manifestData, err := setup.GetServiceFile(serviceName, "wild-manifest.yaml"); err == nil {
_ = os.WriteFile(filepath.Join(instanceDir, "wild-manifest.yaml"), manifestData, 0644)
}
// Extract kustomize.template directory
templateFS, err := setup.GetKustomizeTemplate(serviceName)
if err == nil {
destTemplateDir := filepath.Join(instanceDir, "kustomize.template")
if err := copyDir(templateDir, destTemplateDir); err != nil {
return fmt.Errorf("failed to copy templates: %w", err)
if err := extractFS(templateFS, destTemplateDir); err != nil {
return fmt.Errorf("failed to extract templates: %w", err)
}
}
@@ -340,7 +377,7 @@ func (m *Manager) Fetch(instanceName, serviceName string) error {
// serviceFilesExist checks if service files exist in the instance
func (m *Manager) serviceFilesExist(instanceName, serviceName string) bool {
serviceDir := filepath.Join(m.dataDir, "instances", instanceName,
serviceDir := filepath.Join(tools.GetInstancePath(m.dataDir, instanceName),
"setup", "cluster-services", serviceName)
installSh := filepath.Join(serviceDir, "install.sh")
return fileExists(installSh)
@@ -358,55 +395,35 @@ func dirExists(path string) bool {
return err == nil && info.IsDir()
}
func copyFile(src, dst string) error {
input, err := os.ReadFile(src)
if err != nil {
return err
}
return os.WriteFile(dst, input, 0644)
}
func copyFileIfExists(src, dst string) error {
if !fileExists(src) {
return nil
}
return copyFile(src, dst)
}
func copyDir(src, dst string) error {
// Create destination directory
if err := os.MkdirAll(dst, 0755); err != nil {
return err
}
// Read source directory
entries, err := os.ReadDir(src)
if err != nil {
return err
}
// Copy each entry
for _, entry := range entries {
srcPath := filepath.Join(src, entry.Name())
dstPath := filepath.Join(dst, entry.Name())
if entry.IsDir() {
if err := copyDir(srcPath, dstPath); err != nil {
return err
}
} else {
if err := copyFile(srcPath, dstPath); err != nil {
return err
}
// extractFS extracts files from an fs.FS to a destination directory
func extractFS(fsys fs.FS, dst string) error {
return fs.WalkDir(fsys, ".", func(path string, d fs.DirEntry, err error) error {
if err != nil {
return err
}
}
return nil
// Create destination path
dstPath := filepath.Join(dst, path)
if d.IsDir() {
// Create directory
return os.MkdirAll(dstPath, 0755)
}
// Read file from embedded FS
data, err := fs.ReadFile(fsys, path)
if err != nil {
return err
}
// Write file to destination
return os.WriteFile(dstPath, data, 0644)
})
}
// Compile processes gomplate templates into final Kubernetes manifests
func (m *Manager) Compile(instanceName, serviceName string) error {
instanceDir := filepath.Join(m.dataDir, "instances", instanceName)
instanceDir := tools.GetInstancePath(m.dataDir, instanceName)
serviceDir := filepath.Join(instanceDir, "setup", "cluster-services", serviceName)
templateDir := filepath.Join(serviceDir, "kustomize.template")
outputDir := filepath.Join(serviceDir, "kustomize")
@@ -484,7 +501,7 @@ func (m *Manager) Compile(instanceName, serviceName string) error {
func (m *Manager) Deploy(instanceName, serviceName, opID string, broadcaster *operations.Broadcaster) error {
fmt.Printf("[DEBUG] Deploy() called for service=%s instance=%s opID=%s\n", serviceName, instanceName, opID)
instanceDir := filepath.Join(m.dataDir, "instances", instanceName)
instanceDir := tools.GetInstancePath(m.dataDir, instanceName)
serviceDir := filepath.Join(instanceDir, "setup", "cluster-services", serviceName)
installScript := filepath.Join(serviceDir, "install.sh")
@@ -511,7 +528,7 @@ func (m *Manager) Deploy(instanceName, serviceName, opID string, broadcaster *op
env := os.Environ()
env = append(env,
fmt.Sprintf("WILD_INSTANCE=%s", instanceName),
fmt.Sprintf("WILD_CENTRAL_DATA=%s", m.dataDir),
fmt.Sprintf("WILD_API_DATA_DIR=%s", m.dataDir),
fmt.Sprintf("KUBECONFIG=%s", kubeconfigPath),
)
fmt.Printf("[DEBUG] Environment configured: WILD_INSTANCE=%s, KUBECONFIG=%s\n", instanceName, kubeconfigPath)
@@ -557,8 +574,8 @@ func (m *Manager) Deploy(instanceName, serviceName, opID string, broadcaster *op
err := cmd.Run()
fmt.Printf("[DEBUG] Command completed for opID=%s, err=%v\n", opID, err)
if broadcaster != nil {
outputWriter.Flush() // Flush any remaining buffered data
broadcaster.Close(opID) // Close all SSE clients
outputWriter.Flush() // Flush any remaining buffered data
broadcaster.Close(opID) // Close all SSE clients
}
return err
} else {
@@ -580,8 +597,7 @@ func (m *Manager) validateConfig(instanceName, serviceName string) error {
}
// Load instance config
instanceDir := filepath.Join(m.dataDir, "instances", instanceName)
configFile := filepath.Join(instanceDir, "config.yaml")
configFile := tools.GetInstanceConfigPath(m.dataDir, instanceName)
configData, err := os.ReadFile(configFile)
if err != nil {

146
internal/services/status.go Normal file
View File

@@ -0,0 +1,146 @@
package services
import (
"fmt"
"os"
"time"
"gopkg.in/yaml.v3"
"github.com/wild-cloud/wild-central/daemon/internal/contracts"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// GetDetailedStatus returns comprehensive service status including pods and health
func (m *Manager) GetDetailedStatus(instanceName, serviceName string) (*contracts.DetailedServiceStatus, error) {
// 1. Get service manifest and namespace
manifest, err := m.GetManifest(serviceName)
if err != nil {
return nil, fmt.Errorf("service not found: %w", err)
}
namespace := manifest.Namespace
deploymentName := manifest.GetDeploymentName()
// Check hardcoded map for correct deployment name
if deployment, ok := serviceDeployments[serviceName]; ok {
namespace = deployment.namespace
deploymentName = deployment.deploymentName
}
// 2. Get kubeconfig path
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
if !storage.FileExists(kubeconfigPath) {
return &contracts.DetailedServiceStatus{
Name: serviceName,
Namespace: namespace,
DeploymentStatus: "NotFound",
Replicas: contracts.ReplicaStatus{},
Pods: []contracts.PodStatus{},
LastUpdated: time.Now(),
}, nil
}
kubectl := tools.NewKubectl(kubeconfigPath)
// 3. Get deployment information
deploymentInfo, err := kubectl.GetDeployment(deploymentName, namespace)
deploymentStatus := "NotFound"
replicas := contracts.ReplicaStatus{}
if err == nil {
replicas = contracts.ReplicaStatus{
Desired: deploymentInfo.Desired,
Current: deploymentInfo.Current,
Ready: deploymentInfo.Ready,
Available: deploymentInfo.Available,
}
// Determine deployment status
if deploymentInfo.Ready == deploymentInfo.Desired && deploymentInfo.Desired > 0 {
deploymentStatus = "Ready"
} else if deploymentInfo.Ready < deploymentInfo.Desired {
if deploymentInfo.Current > deploymentInfo.Desired {
deploymentStatus = "Progressing"
} else {
deploymentStatus = "Degraded"
}
} else if deploymentInfo.Desired == 0 {
deploymentStatus = "Scaled to Zero"
}
}
// 4. Get pod information
podInfos, err := kubectl.GetPods(namespace, false)
pods := make([]contracts.PodStatus, 0, len(podInfos))
if err == nil {
for _, podInfo := range podInfos {
pods = append(pods, contracts.PodStatus{
Name: podInfo.Name,
Status: podInfo.Status,
Ready: podInfo.Ready,
Restarts: podInfo.Restarts,
Age: podInfo.Age,
Node: podInfo.Node,
IP: podInfo.IP,
})
}
}
// 5. Load current config values
configPath := tools.GetInstanceConfigPath(m.dataDir, instanceName)
configValues := make(map[string]interface{})
if storage.FileExists(configPath) {
configData, err := os.ReadFile(configPath)
if err == nil {
var instanceConfig map[string]interface{}
if err := yaml.Unmarshal(configData, &instanceConfig); err == nil {
// Extract values for all config paths
for _, path := range manifest.ConfigReferences {
if value := getNestedValue(instanceConfig, path); value != nil {
configValues[path] = value
}
}
for _, cfg := range manifest.ServiceConfig {
if value := getNestedValue(instanceConfig, cfg.Path); value != nil {
configValues[cfg.Path] = value
}
}
}
}
}
// 6. Convert ServiceConfig to contracts.ConfigDefinition
contractsServiceConfig := make(map[string]contracts.ConfigDefinition)
for key, cfg := range manifest.ServiceConfig {
contractsServiceConfig[key] = contracts.ConfigDefinition{
Path: cfg.Path,
Prompt: cfg.Prompt,
Default: cfg.Default,
Type: cfg.Type,
}
}
// 7. Build detailed status response
status := &contracts.DetailedServiceStatus{
Name: serviceName,
Namespace: namespace,
DeploymentStatus: deploymentStatus,
Replicas: replicas,
Pods: pods,
Config: configValues,
Manifest: &contracts.ServiceManifest{
Name: manifest.Name,
Description: manifest.Description,
Namespace: manifest.Namespace,
ConfigReferences: manifest.ConfigReferences,
ServiceConfig: contractsServiceConfig,
},
LastUpdated: time.Now(),
}
return status, nil
}

15
internal/setup/README.md Normal file
View File

@@ -0,0 +1,15 @@
# Setup instructions
Install dependencies:
Follow the instructions to [set up a dnsmasq machine](./dnsmasq/README.md).
Follow the instructions to [set up cluster nodes](./cluster-nodes/README.md).
Follow the instructions to set up [cluster services](./cluster-services/README.md).
Now make sure everything works:
```bash
wild-health
```

View File

@@ -0,0 +1,80 @@
#!/bin/bash
# Talos cluster initialization script
# This script performs one-time cluster setup: generates secrets, base configs, and sets up talosctl
set -euo pipefail
# Check if WC_HOME is set
if [ -z "${WC_HOME:-}" ]; then
echo "Error: WC_HOME environment variable not set. Run \`source ./env.sh\`."
exit 1
fi
NODE_SETUP_DIR="${WC_HOME}/setup/cluster-nodes"
# Get cluster configuration from config.yaml
CLUSTER_NAME=$(wild-config cluster.name)
VIP=$(wild-config cluster.nodes.control.vip)
TALOS_VERSION=$(wild-config cluster.nodes.talos.version)
echo "Initializing Talos cluster: $CLUSTER_NAME"
echo "VIP: $VIP"
echo "Talos version: $TALOS_VERSION"
# Create directories
mkdir -p generated final patch
# Check if cluster secrets already exist
if [ -f "generated/secrets.yaml" ]; then
echo ""
echo "⚠️ Cluster secrets already exist!"
echo "This will regenerate ALL cluster certificates and invalidate existing nodes."
echo ""
read -p "Do you want to continue? (y/N): " -r
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
echo "Cancelled."
exit 0
fi
echo ""
fi
# Generate fresh cluster secrets
echo "Generating cluster secrets..."
cd generated
talosctl gen secrets -o secrets.yaml --force
echo "Generating base machine configs..."
talosctl gen config --with-secrets secrets.yaml "$CLUSTER_NAME" "https://$VIP:6443" --force
cd ..
# Setup talosctl context
echo "Setting up talosctl context..."
# Remove existing context if it exists
talosctl config context "$CLUSTER_NAME" --remove 2>/dev/null || true
# Merge new configuration
talosctl config merge ./generated/talosconfig
talosctl config endpoint "$VIP"
echo ""
echo "✅ Cluster initialization complete!"
echo ""
echo "Cluster details:"
echo " - Name: $CLUSTER_NAME"
echo " - VIP: $VIP"
echo " - Secrets: generated/secrets.yaml"
echo " - Base configs: generated/controlplane.yaml, generated/worker.yaml"
echo ""
echo "Talosctl context configured:"
talosctl config info
echo ""
echo "Next steps:"
echo "1. Register nodes with hardware detection:"
echo " ./detect-node-hardware.sh <maintenance-ip> <node-number>"
echo ""
echo "2. Generate machine configurations:"
echo " ./generate-machine-configs.sh"
echo ""
echo "3. Apply configurations to nodes"

View File

@@ -0,0 +1,23 @@
machine:
install:
disk: {{ index .cluster.nodes.active "{{NODE_NAME}}" "disk" }}
image: factory.talos.dev/metal-installer/{{SCHEMATIC_ID}}:{{VERSION}}
network:
hostname: "{{NODE_NAME}}"
interfaces:
- interface: {{ index .cluster.nodes.active "{{NODE_NAME}}" "interface" }}
dhcp: false
addresses:
- "{{NODE_IP}}/24"
routes:
- network: 0.0.0.0/0
gateway: {{ .cloud.router.ip }}
vip:
ip: {{ .cluster.nodes.control.vip }}
# cluster:
# discovery:
# enabled: true
# registries:
# service:
# disabled: true
# allowSchedulingOnControlPlanes: true

View File

@@ -0,0 +1,23 @@
machine:
install:
disk: {{ index .cluster.nodes.active "{{NODE_NAME}}" "disk" }}
image: factory.talos.dev/metal-installer/{{ .cluster.nodes.talos.schematicId}}:{{ .cluster.nodes.talos.version}}
network:
hostname: "{{NODE_NAME}}"
interfaces:
- interface: {{ index .cluster.nodes.active "{{NODE_NAME}}" "interface" }}
dhcp: true
addresses:
- "{{NODE_IP}}/24"
routes:
- network: 0.0.0.0/0
gateway: {{ .cloud.router.ip }}
kubelet:
extraMounts:
- destination: /var/lib/longhorn
type: bind
source: /var/lib/longhorn
options:
- bind
- rshared
- rw

View File

@@ -0,0 +1,63 @@
# Talos Version to Schematic ID Mappings
#
# This file contains mappings of Talos versions to their corresponding
# default schematic IDs for wild-cloud deployments.
#
# Schematic IDs are generated from factory.talos.dev and include
# common system extensions needed for typical hardware.
#
# To add new versions:
# 1. Go to https://factory.talos.dev/
# 2. Select the system extensions you need
# 3. Generate the schematic
# 4. Add the version and schematic ID below
# Format: Each schematic ID is the primary key with version and definition nested
"434a0300db532066f1098e05ac068159371d00f0aba0a3103a0e826e83825c82":
schematic:
customization:
systemExtensions:
officialExtensions:
- siderolabs/gvisor
- siderolabs/intel-ucode
- siderolabs/iscsi-tools
- siderolabs/util-linux-tools
"f309e674d9ad94655e2cf8a43ea1432475c717cd1885f596bd7ec852b900bc5b":
schematic:
customization:
systemExtensions:
officialExtensions:
- siderolabs/gvisor
- siderolabs/intel-ucode
- siderolabs/iscsi-tools
- siderolabs/nvidia-container-toolkit-lts
- siderolabs/nvidia-container-toolkit-production
- siderolabs/nvidia-fabricmanager-lts
- siderolabs/nvidia-fabricmanager-production
- siderolabs/nvidia-open-gpu-kernel-modules-lts
- siderolabs/nvidia-open-gpu-kernel-modules-production
- siderolabs/util-linux-tools"
"56774e0894c8a3a3a9834a2aea65f24163cacf9506abbcbdc3ba135eaca4953f":
schematic:
customization:
systemExtensions:
officialExtensions:
- siderolabs/gvisor
- siderolabs/intel-ucode
- siderolabs/iscsi-tools
- siderolabs/nvidia-container-toolkit-production
- siderolabs/nvidia-fabricmanager-production
- siderolabs/nvidia-open-gpu-kernel-modules-production
- siderolabs/util-linux-tools
"9ac1424dbdf4b964154a36780dbf2215bf17d2752cd0847fa3b81d7da761457f":
schematic:
customization:
systemExtensions:
officialExtensions:
- siderolabs/gvisor
- siderolabs/intel-ucode
- siderolabs/iscsi-tools
- siderolabs/nonfree-kmod-nvidia-production
- siderolabs/nvidia-container-toolkit-production
- siderolabs/nvidia-fabricmanager-production
- siderolabs/util-linux-tools

View File

@@ -0,0 +1,102 @@
# Wild Cloud Cluster Services
Creates a fully functional personal cloud infrastructure on a bare metal Kubernetes cluster that provides:
1. **External access** to services via configured domain names (using ${DOMAIN})
2. **Internal-only access** to admin interfaces (via internal.${DOMAIN} subdomains)
3. **Secure traffic routing** with automatic TLS
4. **Reliable networking** with proper load balancing
## Service Management
Wild Cloud uses a streamlined per-service setup approach:
**Primary Command**: `wild-service-setup <service> [options]`
- **Default**: Configure and deploy service using existing templates
- **`--fetch`**: Fetch fresh templates before setup (for updates)
- **`--no-deploy`**: Configure only, skip deployment (for planning)
**Master Orchestrator**: `wild-setup-services`
- Sets up all services in proper dependency order
- Each service validates its prerequisites before deployment
- Fail-fast approach with clear recovery instructions
## Architecture
```
Internet → External DNS → MetalLB LoadBalancer → Traefik → Kubernetes Services
                               ↑
Internal Network → Internal DNS
```
## Key Components
- **[MetalLB](metallb/README.md)** - Provides load balancing for bare metal clusters
- **[Traefik](traefik/README.md)** - Handles ingress traffic, TLS termination, and routing
- **[cert-manager](cert-manager/README.md)** - Manages TLS certificates
- **[CoreDNS](coredns/README.md)** - Provides DNS resolution for services
- **[ExternalDNS](externaldns/README.md)** - Automatic DNS record management
- **[Longhorn](longhorn/README.md)** - Distributed storage system for persistent volumes
- **[NFS](nfs/README.md)** - Network file system for shared media storage (optional)
- **[Kubernetes Dashboard](kubernetes-dashboard/README.md)** - Web UI for cluster management (accessible via https://dashboard.internal.${DOMAIN})
- **[Docker Registry](docker-registry/README.md)** - Private container registry for custom images
- **[Utils](utils/README.md)** - Cluster utilities and debugging tools
## Common Usage Patterns
### Complete Infrastructure Setup
```bash
# All services with fresh templates (recommended for first-time setup)
wild-setup-services --fetch
# All services using existing templates (fastest)
wild-setup-services
# Configure all services but don't deploy (for planning)
wild-setup-services --no-deploy
```
### Individual Service Management
```bash
# Most common - reconfigure and deploy existing service
wild-service-setup cert-manager
# Get fresh templates and deploy (for updates)
wild-service-setup cert-manager --fetch
# Configure only, don't deploy (for planning)
wild-service-setup cert-manager --no-deploy
```
### Service Dependencies
Services are automatically deployed in dependency order:
1. **metallb** → Load balancing foundation
2. **traefik** → Ingress (requires metallb)
3. **cert-manager** → TLS certificates (requires traefik)
4. **externaldns** → DNS automation (requires cert-manager)
5. **kubernetes-dashboard** → Admin UI (requires cert-manager)
Each service validates its dependencies before deployment.
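In effect, the orchestrator is a dependency-ordered loop over `wild-service-setup`; a minimal sketch (the real `wild-setup-services` adds prerequisite checks and recovery hints):
```bash
#!/bin/bash
# Sketch of the dependency-ordered loop wild-setup-services performs.
# set -e makes it fail fast: the first failing service stops the run so it
# can be fixed and re-run individually.
set -e
for service in metallb traefik cert-manager externaldns kubernetes-dashboard; do
  wild-service-setup "$service" "$@"   # passes through --fetch / --no-deploy
done
```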
## Idempotent Design
All setup is designed to be idempotent and reliable:
- **Atomic Operations**: Each service handles its complete lifecycle
- **Dependency Validation**: Services check prerequisites before deployment
- **Error Recovery**: Failed services can be individually fixed and re-run
- **Safe Retries**: Operations can be repeated without harm
- **Incremental Updates**: Configuration changes applied cleanly
Example recovery from cert-manager failure:
```bash
# Fix the issue, then resume
wild-service-setup cert-manager --fetch
# Continue with remaining services
wild-service-setup externaldns --fetch
```

View File

@@ -0,0 +1,260 @@
#!/bin/bash
set -e
set -o pipefail
# Ensure WILD_INSTANCE is set
if [ -z "${WILD_INSTANCE}" ]; then
echo "❌ ERROR: WILD_INSTANCE is not set"
exit 1
fi
# Ensure WILD_API_DATA_DIR is set
if [ -z "${WILD_API_DATA_DIR}" ]; then
echo "❌ ERROR: WILD_API_DATA_DIR is not set"
exit 1
fi
# Ensure KUBECONFIG is set
if [ -z "${KUBECONFIG}" ]; then
echo "❌ ERROR: KUBECONFIG is not set"
exit 1
fi
INSTANCE_DIR="${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}"
CLUSTER_SETUP_DIR="${INSTANCE_DIR}/setup/cluster-services"
CERT_MANAGER_DIR="${CLUSTER_SETUP_DIR}/cert-manager"
echo "🔧 === Setting up cert-manager ==="
echo ""
#######################
# Dependencies
#######################
# Check Traefik dependency
echo "🔍 Verifying Traefik is ready (required for cert-manager)..."
kubectl wait --for=condition=Available deployment/traefik -n traefik --timeout=60s 2>/dev/null || {
echo "⚠️ Traefik not ready, but continuing with cert-manager installation"
echo "💡 Note: cert-manager may not work properly without Traefik"
}
if [ ! -d "${CERT_MANAGER_DIR}/kustomize" ]; then
echo "❌ ERROR: Compiled templates not found at ${CERT_MANAGER_DIR}/kustomize"
echo "Templates should be compiled before deployment."
exit 1
fi
# Note: DNS validation and Cloudflare token setup moved to configuration phase
# The configuration should be set via: wild config set cluster.certManager.cloudflare.*
########################
# Kubernetes components
########################
echo "📦 Installing cert-manager components..."
# Using stable URL for cert-manager installation
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.1/cert-manager.yaml || \
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.13.1/cert-manager.yaml
# Wait for cert-manager to be ready
echo "⏳ Waiting for cert-manager to be ready..."
kubectl wait --for=condition=Available deployment/cert-manager -n cert-manager --timeout=120s
kubectl wait --for=condition=Available deployment/cert-manager-cainjector -n cert-manager --timeout=120s
kubectl wait --for=condition=Available deployment/cert-manager-webhook -n cert-manager --timeout=120s
# Create Cloudflare API token secret
# Read token from Wild Central secrets file
echo "🔐 Creating Cloudflare API token secret..."
SECRETS_FILE="${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}/secrets.yaml"
CLOUDFLARE_API_TOKEN=$(yq '.cloudflare.token' "$SECRETS_FILE" 2>/dev/null | tr -d '"')
if [ -z "$CLOUDFLARE_API_TOKEN" ] || [ "$CLOUDFLARE_API_TOKEN" = "null" ]; then
echo "❌ ERROR: Cloudflare API token not found"
echo "💡 Please set: wild secret set cloudflare.token YOUR_TOKEN"
exit 1
fi
kubectl create secret generic cloudflare-api-token \
--namespace cert-manager \
--from-literal=api-token="${CLOUDFLARE_API_TOKEN}" \
--dry-run=client -o yaml | kubectl apply -f -
# Ensure webhook is fully operational
echo "🔍 Verifying cert-manager webhook is fully operational..."
until kubectl get validatingwebhookconfigurations cert-manager-webhook &>/dev/null; do
echo "⏳ Waiting for cert-manager webhook to register..."
sleep 5
done
# Configure cert-manager to use external DNS for challenge verification
echo "🌐 Configuring cert-manager to use external DNS servers..."
kubectl patch deployment cert-manager -n cert-manager --patch '
spec:
template:
spec:
dnsPolicy: None
dnsConfig:
nameservers:
- "1.1.1.1"
- "8.8.8.8"
searches:
- cert-manager.svc.cluster.local
- svc.cluster.local
- cluster.local
options:
- name: ndots
value: "5"'
# Wait for cert-manager to restart with new DNS config
echo "⏳ Waiting for cert-manager to restart with new DNS configuration..."
kubectl rollout status deployment/cert-manager -n cert-manager --timeout=120s
########################
# Create issuers and certificates
########################
# Apply Let's Encrypt issuers and certificates using kustomize
echo "🚀 Creating Let's Encrypt issuers and certificates..."
kubectl apply -k ${CERT_MANAGER_DIR}/kustomize
# Wait for issuers to be ready
echo "⏳ Waiting for Let's Encrypt issuers to be ready..."
kubectl wait --for=condition=Ready clusterissuer/letsencrypt-prod --timeout=60s || echo "⚠️ Production issuer not ready, proceeding anyway..."
kubectl wait --for=condition=Ready clusterissuer/letsencrypt-staging --timeout=60s || echo "⚠️ Staging issuer not ready, proceeding anyway..."
# Give cert-manager a moment to process the certificates
sleep 5
######################################
# Fix stuck certificates and cleanup
######################################
needs_restart=false
# STEP 1: Fix certificates stuck with 404 errors
echo "🔍 Checking for certificates with failed issuance attempts..."
stuck_certs=$(kubectl get certificates --all-namespaces -o json 2>/dev/null | \
jq -r '.items[] | select(.status.conditions[]? | select(.type=="Issuing" and .status=="False" and (.message | contains("404")))) | "\(.metadata.namespace) \(.metadata.name)"')
if [ -n "$stuck_certs" ]; then
echo "⚠️ Found certificates stuck with non-existent orders, recreating them..."
echo "$stuck_certs" | while read ns name; do
echo "🔄 Recreating certificate $ns/$name..."
cert_spec=$(kubectl get certificate "$name" -n "$ns" -o json | jq '.spec')
kubectl delete certificate "$name" -n "$ns"
echo "{\"apiVersion\":\"cert-manager.io/v1\",\"kind\":\"Certificate\",\"metadata\":{\"name\":\"$name\",\"namespace\":\"$ns\"},\"spec\":$cert_spec}" | kubectl apply -f -
done
needs_restart=true
sleep 5
else
echo "✅ No certificates stuck with failed orders"
fi
# STEP 2: Clean up orphaned orders
echo "🔍 Checking for orphaned ACME orders..."
orphaned_orders=$(kubectl logs -n cert-manager deployment/cert-manager --tail=200 2>/dev/null | \
grep -E "failed to retrieve the ACME order.*404" 2>/dev/null | \
sed -n 's/.*resource_name="\([^"]*\)".*/\1/p' | \
sort -u || true)
if [ -n "$orphaned_orders" ]; then
echo "⚠️ Found orphaned ACME orders from logs"
for order in $orphaned_orders; do
echo "🗑️ Deleting orphaned order: $order"
orders_found=$(kubectl get orders --all-namespaces 2>/dev/null | grep "$order" 2>/dev/null || true)
if [ -n "$orders_found" ]; then
echo "$orders_found" | while read ns name rest; do
kubectl delete order "$name" -n "$ns" 2>/dev/null || true
done
fi
done
needs_restart=true
else
echo "✅ No orphaned orders found in logs"
fi
# STEP 2.5: Check for Cloudflare DNS cleanup errors
echo "🔍 Checking for Cloudflare DNS cleanup errors..."
cloudflare_errors=$(kubectl logs -n cert-manager deployment/cert-manager --tail=200 2>/dev/null | \
grep -c "Error: 7003.*Could not route" 2>/dev/null || true)
cloudflare_errors=${cloudflare_errors:-0}
if [ "$cloudflare_errors" -gt "0" ]; then
echo "⚠️ Found $cloudflare_errors Cloudflare DNS cleanup errors (stale DNS record references)"
echo "💡 Deleting stuck challenges and orders to allow fresh start"
# Delete all challenges and orders in cert-manager namespace
kubectl delete challenges --all -n cert-manager 2>/dev/null || true
kubectl delete orders --all -n cert-manager 2>/dev/null || true
needs_restart=true
else
echo "✅ No Cloudflare DNS cleanup errors"
fi
# STEP 3: Single restart if anything needs cleaning
if [ "$needs_restart" = true ]; then
echo "🔄 Restarting cert-manager to clear internal state..."
kubectl rollout restart deployment cert-manager -n cert-manager
kubectl rollout status deployment/cert-manager -n cert-manager --timeout=120s
echo "⏳ Waiting for cert-manager to recreate fresh challenges..."
sleep 15
else
echo "✅ No restart needed - cert-manager state is clean"
fi
#########################
# Final checks
#########################
# Wait for the certificates to be issued with progress feedback
echo "⏳ Waiting for wildcard certificates to be ready (this may take several minutes)..."
# Function to wait for certificate with progress output
wait_for_cert() {
local cert_name="$1"
local timeout=300
local elapsed=0
echo " 📜 Checking $cert_name..."
while [ $elapsed -lt $timeout ]; do
if kubectl get certificate "$cert_name" -n cert-manager -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null | grep -q "True"; then
echo "$cert_name is ready"
return 0
fi
# Show progress every 30 seconds
if [ $((elapsed % 30)) -eq 0 ] && [ $elapsed -gt 0 ]; then
local status=$(kubectl get certificate "$cert_name" -n cert-manager -o jsonpath='{.status.conditions[?(@.type=="Ready")].message}' 2>/dev/null || echo "Waiting...")
echo " ⏳ Still waiting for $cert_name... ($elapsed/${timeout}s) - $status"
fi
sleep 5
elapsed=$((elapsed + 5))
done
echo " ⚠️ Timeout waiting for $cert_name (will continue anyway)"
return 1
}
wait_for_cert "wildcard-internal-wild-cloud"
wait_for_cert "wildcard-wild-cloud"
# Final health check
echo "🔍 Performing final cert-manager health check..."
failed_certs=$(kubectl get certificates --all-namespaces -o json 2>/dev/null | jq -r '.items[] | select(.status.conditions[]? | select(.type=="Ready" and .status!="True")) | "\(.metadata.namespace)/\(.metadata.name)"' | wc -l)
if [ "$failed_certs" -gt 0 ]; then
echo "⚠️ Found $failed_certs certificates not in Ready state"
echo "💡 Check certificate status with: kubectl get certificates --all-namespaces"
echo "💡 Check cert-manager logs with: kubectl logs -n cert-manager deployment/cert-manager"
else
echo "✅ All certificates are in Ready state"
fi
echo ""
echo "✅ cert-manager setup complete!"
echo ""
echo "💡 To verify the installation:"
echo " kubectl get certificates --all-namespaces"
echo " kubectl get clusterissuers"

View File

@@ -0,0 +1,19 @@
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: wildcard-internal-wild-cloud
namespace: cert-manager
spec:
secretName: wildcard-internal-wild-cloud-tls
dnsNames:
- "*.{{ .cloud.internalDomain }}"
- "{{ .cloud.internalDomain }}"
issuerRef:
name: letsencrypt-prod
kind: ClusterIssuer
duration: 2160h # 90 days
renewBefore: 360h # 15 days
privateKey:
algorithm: RSA
size: 2048

View File

@@ -0,0 +1,12 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- namespace.yaml
- letsencrypt-staging-dns01.yaml
- letsencrypt-prod-dns01.yaml
- internal-wildcard-certificate.yaml
- wildcard-certificate.yaml
# Note: cert-manager.yaml contains the main installation manifests
# but is applied separately via URL in the install script

View File

@@ -0,0 +1,25 @@
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
email: {{ .operator.email }}
privateKeySecretRef:
name: letsencrypt-prod
server: https://acme-v02.api.letsencrypt.org/directory
solvers:
# DNS-01 solver for wildcard certificates
- dns01:
cloudflare:
apiTokenSecretRef:
name: cloudflare-api-token
key: api-token
selector:
dnsZones:
- "{{ .cluster.certManager.cloudflare.domain }}"
# Keep the HTTP-01 solver for non-wildcard certificates
- http01:
ingress:
class: traefik

View File

@@ -0,0 +1,25 @@
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-staging
spec:
acme:
email: {{ .operator.email }}
privateKeySecretRef:
name: letsencrypt-staging
server: https://acme-staging-v02.api.letsencrypt.org/directory
solvers:
# DNS-01 solver for wildcard certificates
- dns01:
cloudflare:
apiTokenSecretRef:
name: cloudflare-api-token
key: api-token
selector:
dnsZones:
- "{{ .cluster.certManager.cloudflare.domain }}"
# Keep the HTTP-01 solver for non-wildcard certificates
- http01:
ingress:
class: traefik
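To exercise a solver end to end without consuming production rate limits, a throwaway certificate can be issued against the staging issuer. A sketch; the certificate name and dnsName are placeholders:
```bash
# Sketch: issue and then discard a test certificate via letsencrypt-staging.
kubectl apply -f - <<'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: smoke-test
  namespace: cert-manager
spec:
  secretName: smoke-test-tls
  dnsNames:
    - smoke-test.example.com
  issuerRef:
    name: letsencrypt-staging
    kind: ClusterIssuer
EOF
kubectl wait --for=condition=Ready certificate/smoke-test -n cert-manager --timeout=300s
kubectl delete certificate smoke-test -n cert-manager
kubectl delete secret smoke-test-tls -n cert-manager
```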

View File

@@ -0,0 +1,4 @@
apiVersion: v1
kind: Namespace
metadata:
name: cert-manager

View File

@@ -0,0 +1,19 @@
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: wildcard-wild-cloud
namespace: cert-manager
spec:
secretName: wildcard-wild-cloud-tls
dnsNames:
- "*.{{ .cloud.domain }}"
- "{{ .cloud.domain }}"
issuerRef:
name: letsencrypt-prod
kind: ClusterIssuer
duration: 2160h # 90 days
renewBefore: 360h # 15 days
privateKey:
algorithm: RSA
size: 2048

View File

@@ -0,0 +1,25 @@
name: cert-manager
description: X.509 certificate management for Kubernetes
namespace: cert-manager
category: infrastructure
dependencies:
- traefik
configReferences:
- cloud.domain
- cloud.baseDomain
- cloud.internalDomain
- operator.email
serviceConfig:
cloudflareDomain:
path: cluster.certManager.cloudflare.domain
prompt: "Enter Cloudflare domain"
default: "{{ .cloud.baseDomain }}"
type: string
cloudflareZoneID:
path: cluster.certManager.cloudflare.zoneID
prompt: "Enter Cloudflare zone ID"
default: ""
type: string

View File

@@ -0,0 +1,112 @@
#!/bin/bash
# Common functions for Wild Central service installation scripts
# TODO: We should use this. :P
# Ensure required environment variables are set
if [ -z "${WILD_INSTANCE}" ]; then
echo "❌ ERROR: WILD_INSTANCE environment variable is not set"
exit 1
fi
if [ -z "${WILD_API_DATA_DIR}" ]; then
echo "❌ ERROR: WILD_API_DATA_DIR environment variable is not set"
exit 1
fi
# Get the instance directory path
get_instance_dir() {
echo "${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}"
}
# Get the secrets file path
get_secrets_file() {
echo "$(get_instance_dir)/secrets.yaml"
}
# Get the config file path
get_config_file() {
echo "$(get_instance_dir)/config.yaml"
}
# Get a secret value from the secrets file
# Usage: get_secret "path.to.secret"
get_secret() {
local path="$1"
local secrets_file="$(get_secrets_file)"
if [ ! -f "$secrets_file" ]; then
echo ""
return 1
fi
local value=$(yq ".$path" "$secrets_file" 2>/dev/null)
# Remove quotes and return empty string if null
value=$(echo "$value" | tr -d '"')
if [ "$value" = "null" ]; then
echo ""
return 1
fi
echo "$value"
}
# Get a config value from the config file
# Usage: get_config "path.to.config"
get_config() {
local path="$1"
local config_file="$(get_config_file)"
if [ ! -f "$config_file" ]; then
echo ""
return 1
fi
local value=$(yq ".$path" "$config_file" 2>/dev/null)
# Remove quotes and return empty string if null
value=$(echo "$value" | tr -d '"')
if [ "$value" = "null" ]; then
echo ""
return 1
fi
echo "$value"
}
# Check if a secret exists and is not empty
# Usage: require_secret "path.to.secret" "Friendly Name" "wild secret set command"
require_secret() {
local path="$1"
local name="$2"
local set_command="$3"
local value=$(get_secret "$path")
if [ -z "$value" ]; then
# Errors go to stderr so callers capturing stdout still see them
echo "❌ ERROR: $name not found" >&2
echo "💡 Please set: $set_command" >&2
exit 1
fi
echo "$value"
}
# Check if a config value exists and is not empty
# Usage: require_config "path.to.config" "Friendly Name" "wild config set command"
require_config() {
local path="$1"
local name="$2"
local set_command="$3"
local value=$(get_config "$path")
if [ -z "$value" ]; then
# Errors go to stderr so callers capturing stdout still see them
echo "❌ ERROR: $name not found" >&2
echo "💡 Please set: $set_command" >&2
exit 1
fi
echo "$value"
}
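A usage sketch for these helpers; the sourcing path is an assumption, since the scripts above don't source this file yet:
```bash
#!/bin/bash
set -e
set -o pipefail
# Sketch: how an install script might consume common.sh (path assumed).
source "$(dirname "$0")/../common.sh"

# require_* prints a hint and exits if the value is missing; otherwise it
# echoes the value, so it is meant to be called via command substitution.
CLOUDFLARE_API_TOKEN=$(require_secret "cloudflare.token" \
  "Cloudflare API token" "wild secret set cloudflare.token YOUR_TOKEN")
INTERNAL_DOMAIN=$(require_config "cloud.internalDomain" \
  "Internal domain" "wild config set cloud.internalDomain internal.example.com")
echo "Deploying for ${INTERNAL_DOMAIN}"
```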

View File

@@ -0,0 +1,45 @@
# CoreDNS
- https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
- https://github.com/kubernetes/dns/blob/master/docs/specification.md
- https://coredns.io/
CoreDNS runs the `kubernetes` plugin, so it returns all k8s service endpoints in the well-known format.
All services and pods are registered in CoreDNS.
- <service-name>.<namespace>.svc.cluster.local
- <service-name>.<namespace>
- <service-name> (if in the same namespace)
- <pod-ipv4-address>.<namespace>.pod.cluster.local
- <pod-ipv4-address>.<service-name>.<namespace>.svc.cluster.local
Any query for a resource in the `internal.$DOMAIN` domain will be answered with the IP of the Traefik proxy. We expose the CoreDNS server on the LAN via MetalLB just for this capability.
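A quick check from a LAN machine, assuming placeholder addresses (the CoreDNS LoadBalancer IP and an internal hostname):
```bash
# Expect an A record answering with the Traefik load balancer IP.
dig +short @192.168.1.241 app.internal.example.com A
# The custom template config returns NXDOMAIN for AAAA queries.
dig @192.168.1.241 app.internal.example.com AAAA
```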
## Default CoreDNS Configuration
This is the default CoreDNS configuration, for reference:
```txt
.:53 {
    errors
    health {
        lameduck 5s
    }
    ready
    log . {
        class error
    }
    prometheus :9153
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    forward . /etc/resolv.conf {
        max_concurrent 1000
    }
    cache 30 {
        disable success cluster.local
        disable denial cluster.local
    }
    loop
    reload
    loadbalance
}
```

View File

@@ -0,0 +1,57 @@
#!/bin/bash
set -e
set -o pipefail
# Ensure WILD_INSTANCE is set
if [ -z "${WILD_INSTANCE}" ]; then
echo "❌ ERROR: WILD_INSTANCE is not set"
exit 1
fi
# Ensure WILD_API_DATA_DIR is set
if [ -z "${WILD_API_DATA_DIR}" ]; then
echo "❌ ERROR: WILD_API_DATA_DIR is not set"
exit 1
fi
# Ensure KUBECONFIG is set
if [ -z "${KUBECONFIG}" ]; then
echo "❌ ERROR: KUBECONFIG is not set"
exit 1
fi
INSTANCE_DIR="${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}"
CLUSTER_SETUP_DIR="${INSTANCE_DIR}/setup/cluster-services"
COREDNS_DIR="${CLUSTER_SETUP_DIR}/coredns"
echo "🔧 === Setting up CoreDNS ==="
echo ""
# Templates should already be compiled
echo "📦 Using pre-compiled CoreDNS templates..."
if [ ! -d "${COREDNS_DIR}/kustomize" ]; then
echo "❌ ERROR: Compiled templates not found at ${COREDNS_DIR}/kustomize"
echo "Templates should be compiled before deployment."
exit 1
fi
# Apply the custom DNS override
# TODO: Is this needed now that we are no longer on k3s?
echo "🚀 Applying CoreDNS custom override configuration..."
kubectl apply -f "${COREDNS_DIR}/kustomize/coredns-custom-config.yaml"
echo "🔄 Restarting CoreDNS pods to apply changes..."
kubectl rollout restart deployment/coredns -n kube-system
echo "⏳ Waiting for CoreDNS rollout to complete..."
kubectl rollout status deployment/coredns -n kube-system
echo ""
echo "✅ CoreDNS configured successfully"
echo ""
echo "💡 To verify the installation:"
echo " kubectl get pods -n kube-system -l k8s-app=kube-dns"
echo " kubectl get svc -n kube-system coredns"
echo " kubectl describe svc -n kube-system coredns"
echo ""
echo "📋 To view CoreDNS logs:"
echo " kubectl logs -n kube-system -l k8s-app=kube-dns -f"

View File

@@ -0,0 +1,28 @@
---
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns-custom
namespace: kube-system
data:
# Custom server block for internal domains. All internal domains should
# resolve to the cluster proxy.
internal.server: |
{{ .cloud.internalDomain }} {
errors
cache 30
reload
template IN A {
match (.*)\.{{ .cloud.internalDomain | strings.ReplaceAll "." "\\." }}\.
answer "{{`{{ .Name }}`}} 60 IN A {{ .cluster.loadBalancerIp }}"
}
template IN AAAA {
match (.*)\.{{ .cloud.internalDomain | strings.ReplaceAll "." "\\." }}\.
rcode NXDOMAIN
}
}
# Custom override to set external resolvers.
external.override: |
forward . {{ .cloud.dns.externalResolver }} {
max_concurrent 1000
}

View File

@@ -0,0 +1,15 @@
name: coredns
description: DNS server for internal cluster DNS resolution
namespace: kube-system
category: infrastructure
configReferences:
- cloud.internalDomain
- cluster.loadBalancerIp
serviceConfig:
externalResolver:
path: cloud.dns.externalResolver
prompt: "Enter external DNS resolver"
default: "8.8.8.8"
type: string

View File

@@ -0,0 +1,53 @@
#!/bin/bash
set -e
set -o pipefail
# Ensure WILD_INSTANCE is set
if [ -z "${WILD_INSTANCE}" ]; then
echo "❌ ERROR: WILD_INSTANCE is not set"
exit 1
fi
# Ensure WILD_API_DATA_DIR is set
if [ -z "${WILD_API_DATA_DIR}" ]; then
echo "❌ ERROR: WILD_API_DATA_DIR is not set"
exit 1
fi
# Ensure KUBECONFIG is set
if [ -z "${KUBECONFIG}" ]; then
echo "❌ ERROR: KUBECONFIG is not set"
exit 1
fi
INSTANCE_DIR="${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}"
CLUSTER_SETUP_DIR="${INSTANCE_DIR}/setup/cluster-services"
DOCKER_REGISTRY_DIR="${CLUSTER_SETUP_DIR}/docker-registry"
echo "🔧 === Setting up Docker Registry ==="
echo ""
# Templates should already be compiled
echo "📦 Using pre-compiled Docker Registry templates..."
if [ ! -d "${DOCKER_REGISTRY_DIR}/kustomize" ]; then
echo "❌ ERROR: Compiled templates not found at ${DOCKER_REGISTRY_DIR}/kustomize"
echo "Templates should be compiled before deployment."
exit 1
fi
echo "🚀 Deploying Docker Registry..."
kubectl apply -k "${DOCKER_REGISTRY_DIR}/kustomize"
echo "⏳ Waiting for Docker Registry to be ready..."
kubectl wait --for=condition=available --timeout=300s deployment/docker-registry -n docker-registry
echo ""
echo "✅ Docker Registry installed successfully"
echo ""
echo "📊 Deployment status:"
kubectl get pods -n docker-registry
kubectl get services -n docker-registry
echo ""
echo "💡 To use the registry:"
echo " docker tag myimage registry.local/myimage"
echo " docker push registry.local/myimage"

View File

@@ -0,0 +1,36 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: docker-registry
labels:
app: docker-registry
spec:
replicas: 1
selector:
matchLabels:
app: docker-registry
strategy:
rollingUpdate:
maxSurge: 0
maxUnavailable: 1
type: RollingUpdate
template:
metadata:
labels:
app: docker-registry
spec:
containers:
- image: registry:3.0.0
name: docker-registry
ports:
- containerPort: 5000
protocol: TCP
volumeMounts:
- mountPath: /var/lib/registry
name: docker-registry-storage
readOnly: false
volumes:
- name: docker-registry-storage
persistentVolumeClaim:
claimName: docker-registry-pvc

View File

@@ -0,0 +1,20 @@
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: docker-registry
spec:
rules:
- host: {{ .cloud.dockerRegistryHost }}
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: docker-registry
port:
number: 5000
tls:
- hosts:
- {{ .cloud.dockerRegistryHost }}
secretName: wildcard-internal-wild-cloud-tls

View File

@@ -0,0 +1,14 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: docker-registry
labels:
- includeSelectors: true
pairs:
app: docker-registry
managedBy: wild-cloud
resources:
- deployment.yaml
- ingress.yaml
- service.yaml
- namespace.yaml
- pvc.yaml

View File

@@ -0,0 +1,4 @@
apiVersion: v1
kind: Namespace
metadata:
name: docker-registry

View File

@@ -0,0 +1,12 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: docker-registry-pvc
spec:
storageClassName: longhorn
accessModes:
- ReadWriteOnce
volumeMode: Filesystem
resources:
requests:
storage: {{ .cluster.dockerRegistry.storage }}

View File

@@ -0,0 +1,13 @@
---
apiVersion: v1
kind: Service
metadata:
name: docker-registry
labels:
app: docker-registry
spec:
ports:
- port: 5000
targetPort: 5000
selector:
app: docker-registry

View File

@@ -0,0 +1,20 @@
name: docker-registry
description: Private Docker image registry for cluster
namespace: docker-registry
category: infrastructure
dependencies:
- traefik
- cert-manager
serviceConfig:
registryHost:
path: cloud.dockerRegistryHost
prompt: "Enter Docker Registry hostname"
default: "registry.{{ .cloud.internalDomain }}"
type: string
storage:
path: cluster.dockerRegistry.storage
prompt: "Enter Docker Registry storage size"
default: "100Gi"
type: string

View File

@@ -0,0 +1,14 @@
# External DNS
See: https://github.com/kubernetes-sigs/external-dns
ExternalDNS keeps selected zones (chosen via `--domain-filter`) synchronized with Ingresses, Services of `type=LoadBalancer`, and nodes across various DNS providers.
Currently, we are configured to use only Cloudflare.
Docs: https://github.com/kubernetes-sigs/external-dns/blob/master/docs/tutorials/cloudflare.md
Any Ingress whose `metadata.annotations` includes
external-dns.alpha.kubernetes.io/hostname: `<something>.${DOMAIN}`
will have Cloudflare records created by ExternalDNS.
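A minimal sketch of such an Ingress (hostname and backend service are placeholders); once applied, ExternalDNS should create the matching Cloudflare record:
```bash
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo
  annotations:
    external-dns.alpha.kubernetes.io/hostname: demo.example.com
spec:
  rules:
    - host: demo.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: demo
                port:
                  number: 80
EOF
```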

View File

@@ -0,0 +1,79 @@
#!/bin/bash
set -e
set -o pipefail
# Ensure WILD_INSTANCE is set
if [ -z "${WILD_INSTANCE}" ]; then
echo "❌ ERROR: WILD_INSTANCE is not set"
exit 1
fi
# Ensure WILD_API_DATA_DIR is set
if [ -z "${WILD_API_DATA_DIR}" ]; then
echo "❌ ERROR: WILD_API_DATA_DIR is not set"
exit 1
fi
# Ensure KUBECONFIG is set
if [ -z "${KUBECONFIG}" ]; then
echo "❌ ERROR: KUBECONFIG is not set"
exit 1
fi
INSTANCE_DIR="${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}"
CLUSTER_SETUP_DIR="${INSTANCE_DIR}/setup/cluster-services"
EXTERNALDNS_DIR="${CLUSTER_SETUP_DIR}/externaldns"
echo "🌐 === Setting up ExternalDNS ==="
echo ""
# Check cert-manager dependency
echo "🔍 Verifying cert-manager is ready (required for ExternalDNS)..."
kubectl wait --for=condition=Available deployment/cert-manager -n cert-manager --timeout=60s 2>/dev/null && \
kubectl wait --for=condition=Available deployment/cert-manager-webhook -n cert-manager --timeout=60s 2>/dev/null || {
echo "⚠️ cert-manager not ready, but continuing with ExternalDNS installation"
echo "💡 Note: ExternalDNS may not work properly without cert-manager"
}
# Templates should already be compiled
echo "📦 Using pre-compiled ExternalDNS templates..."
if [ ! -d "${EXTERNALDNS_DIR}/kustomize" ]; then
echo "❌ ERROR: Compiled templates not found at ${EXTERNALDNS_DIR}/kustomize"
echo "Templates should be compiled before deployment."
exit 1
fi
# Apply ExternalDNS manifests using kustomize
echo "🚀 Deploying ExternalDNS..."
kubectl apply -k ${EXTERNALDNS_DIR}/kustomize
# Setup Cloudflare API token secret
echo "🔐 Creating Cloudflare API token secret..."
SECRETS_FILE="${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}/secrets.yaml"
CLOUDFLARE_API_TOKEN=$(yq '.cloudflare.token' "$SECRETS_FILE" 2>/dev/null | tr -d '"')
if [ -z "$CLOUDFLARE_API_TOKEN" ] || [ "$CLOUDFLARE_API_TOKEN" = "null" ]; then
echo "❌ ERROR: Cloudflare API token not found."
echo "💡 Please set: wild secret set cloudflare.token YOUR_TOKEN"
exit 1
fi
kubectl create secret generic cloudflare-api-token \
--namespace externaldns \
--from-literal=api-token="${CLOUDFLARE_API_TOKEN}" \
--dry-run=client -o yaml | kubectl apply -f -
# Wait for ExternalDNS to be ready
echo "⏳ Waiting for Cloudflare ExternalDNS to be ready..."
kubectl rollout status deployment/external-dns -n externaldns --timeout=60s
# echo "⏳ Waiting for CoreDNS ExternalDNS to be ready..."
# kubectl rollout status deployment/external-dns-coredns -n externaldns --timeout=60s
echo ""
echo "✅ ExternalDNS installed successfully"
echo ""
echo "💡 To verify the installation:"
echo " kubectl get pods -n externaldns"
echo " kubectl logs -n externaldns -l app=external-dns -f"
echo " kubectl logs -n externaldns -l app=external-dns-coredns -f"
echo ""

View File

@@ -0,0 +1,39 @@
---
# Cloudflare provider for ExternalDNS
apiVersion: apps/v1
kind: Deployment
metadata:
name: external-dns
namespace: externaldns
spec:
selector:
matchLabels:
app: external-dns
strategy:
type: Recreate
template:
metadata:
labels:
app: external-dns
spec:
serviceAccountName: external-dns
containers:
- name: external-dns
image: registry.k8s.io/external-dns/external-dns:v0.13.4
args:
- --source=service
- --source=ingress
- --txt-owner-id={{ .cluster.externalDns.ownerId }}
- --provider=cloudflare
- --domain-filter=payne.io
#- --exclude-domains=internal.${DOMAIN}
- --cloudflare-dns-records-per-page=5000
- --publish-internal-services
- --no-cloudflare-proxied
- --log-level=debug
env:
- name: CF_API_TOKEN
valueFrom:
secretKeyRef:
name: cloudflare-api-token
key: api-token

View File

@@ -0,0 +1,35 @@
---
# Common RBAC resources for all ExternalDNS deployments
apiVersion: v1
kind: ServiceAccount
metadata:
name: external-dns
namespace: externaldns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: external-dns
rules:
- apiGroups: [""]
resources: ["services", "endpoints", "pods"]
verbs: ["get", "watch", "list"]
- apiGroups: ["extensions", "networking.k8s.io"]
resources: ["ingresses"]
verbs: ["get", "watch", "list"]
- apiGroups: [""]
resources: ["nodes"]
verbs: ["list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: external-dns-viewer
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: external-dns
subjects:
- kind: ServiceAccount
name: external-dns
namespace: externaldns
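To confirm the binding grants what ExternalDNS needs, `kubectl auth can-i` can impersonate the service account; expect `yes` for each:
```bash
kubectl auth can-i list ingresses.networking.k8s.io \
  --as=system:serviceaccount:externaldns:external-dns
kubectl auth can-i watch services \
  --as=system:serviceaccount:externaldns:external-dns
kubectl auth can-i list nodes \
  --as=system:serviceaccount:externaldns:external-dns
```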

View File

@@ -0,0 +1,7 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- namespace.yaml
- externaldns-rbac.yaml
- externaldns-cloudflare.yaml

View File

@@ -0,0 +1,4 @@
apiVersion: v1
kind: Namespace
metadata:
name: externaldns

View File

@@ -0,0 +1,15 @@
name: externaldns
description: Automatically configures DNS records for services
namespace: externaldns
category: infrastructure
configReferences:
- cloud.internalDomain
- cluster.name
serviceConfig:
ownerId:
path: cluster.externalDns.ownerId
prompt: "Enter ExternalDNS owner ID (unique identifier for this cluster)"
default: "wild-cloud-{{ .cluster.name }}"
type: string

View File

@@ -0,0 +1,91 @@
#!/bin/bash
set -e
set -o pipefail
# Ensure WILD_INSTANCE is set
if [ -z "${WILD_INSTANCE}" ]; then
echo "❌ ERROR: WILD_INSTANCE is not set"
exit 1
fi
# Ensure WILD_API_DATA_DIR is set
if [ -z "${WILD_API_DATA_DIR}" ]; then
echo "❌ ERROR: WILD_API_DATA_DIR is not set"
exit 1
fi
# Ensure KUBECONFIG is set
if [ -z "${KUBECONFIG}" ]; then
echo "❌ ERROR: KUBECONFIG is not set"
exit 1
fi
INSTANCE_DIR="${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}"
CLUSTER_SETUP_DIR="${INSTANCE_DIR}/setup/cluster-services"
KUBERNETES_DASHBOARD_DIR="${CLUSTER_SETUP_DIR}/kubernetes-dashboard"
echo "🎮 === Setting up Kubernetes Dashboard ==="
echo ""
# Templates should already be compiled
echo "📦 Using pre-compiled Dashboard templates..."
if [ ! -d "${KUBERNETES_DASHBOARD_DIR}/kustomize" ]; then
echo "❌ ERROR: Compiled templates not found at ${KUBERNETES_DASHBOARD_DIR}/kustomize"
echo "Templates should be compiled before deployment."
exit 1
fi
NAMESPACE="kubernetes-dashboard"
# Apply the official dashboard installation
echo "🚀 Installing Kubernetes Dashboard core components..."
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
# Wait for cert-manager certificates to be ready
echo "🔐 Waiting for cert-manager certificates to be ready..."
kubectl wait --for=condition=Ready certificate wildcard-internal-wild-cloud -n cert-manager --timeout=300s || echo "⚠️ Warning: Internal wildcard certificate not ready yet"
kubectl wait --for=condition=Ready certificate wildcard-wild-cloud -n cert-manager --timeout=300s || echo "⚠️ Warning: Wildcard certificate not ready yet"
# Copying cert-manager secrets to the dashboard namespace (if available)
echo "📋 Copying cert-manager secrets to dashboard namespace..."
if kubectl get secret wildcard-internal-wild-cloud-tls -n cert-manager >/dev/null 2>&1; then
kubectl get secret wildcard-internal-wild-cloud-tls -n cert-manager -o yaml | \
sed "s/namespace: cert-manager/namespace: ${NAMESPACE}/" | \
kubectl apply -f -
else
echo "⚠️ Warning: wildcard-internal-wild-cloud-tls secret not yet available"
fi
if kubectl get secret wildcard-wild-cloud-tls -n cert-manager >/dev/null 2>&1; then
kubectl get secret wildcard-wild-cloud-tls -n cert-manager -o yaml | \
sed "s/namespace: cert-manager/namespace: ${NAMESPACE}/" | \
kubectl apply -f -
else
echo "⚠️ Warning: wildcard-wild-cloud-tls secret not yet available"
fi
# Apply dashboard customizations using kustomize
echo "🔧 Applying dashboard customizations..."
kubectl apply -k "${KUBERNETES_DASHBOARD_DIR}/kustomize"
# Restart CoreDNS to pick up the changes
# echo "🔄 Restarting CoreDNS to pick up DNS changes..."
# kubectl delete pods -n kube-system -l k8s-app=kube-dns
# Wait for dashboard to be ready
echo "⏳ Waiting for Kubernetes Dashboard to be ready..."
kubectl rollout status deployment/kubernetes-dashboard -n $NAMESPACE --timeout=60s
echo ""
echo "✅ Kubernetes Dashboard installed successfully"
echo ""
# INTERNAL_DOMAIN should be available in environment (set from config before deployment)
if [ -n "${INTERNAL_DOMAIN}" ]; then
echo "🌐 Access the dashboard at: https://dashboard.${INTERNAL_DOMAIN}"
else
echo "💡 Access the dashboard via the configured internal domain"
fi
echo ""
echo "💡 To get the authentication token:"
echo " kubectl create token admin-user -n kubernetes-dashboard"
echo ""

View File

@@ -0,0 +1,32 @@
---
# Service Account and RBAC for Dashboard admin access
apiVersion: v1
kind: ServiceAccount
metadata:
name: dashboard-admin
namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: dashboard-admin
subjects:
- kind: ServiceAccount
name: dashboard-admin
namespace: kubernetes-dashboard
roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
---
# Token for dashboard-admin
apiVersion: v1
kind: Secret
metadata:
name: dashboard-admin-token
namespace: kubernetes-dashboard
annotations:
kubernetes.io/service-account.name: dashboard-admin
type: kubernetes.io/service-account-token
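Because this is a long-lived service-account-token Secret, the token can also be read straight from the Secret, as an alternative to the short-lived `kubectl create token` flow:
```bash
# Read the long-lived token for dashboard-admin.
kubectl get secret dashboard-admin-token -n kubernetes-dashboard \
  -o jsonpath='{.data.token}' | base64 -d; echo
```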

View File

@@ -0,0 +1,84 @@
---
# Internal-only middleware
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
name: internal-only
namespace: kubernetes-dashboard
spec:
ipWhiteList:
# Restrict to local private network ranges
sourceRange:
- 127.0.0.1/32 # localhost
- 10.0.0.0/8 # Private network
- 172.16.0.0/12 # Private network
- 192.168.0.0/16 # Private network
---
# HTTPS redirect middleware
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
name: dashboard-redirect-scheme
namespace: kubernetes-dashboard
spec:
redirectScheme:
scheme: https
permanent: true
---
# IngressRoute for Dashboard
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: kubernetes-dashboard-https
namespace: kubernetes-dashboard
spec:
entryPoints:
- websecure
routes:
- match: Host(`dashboard.{{ .cloud.internalDomain }}`)
kind: Rule
middlewares:
- name: internal-only
namespace: kubernetes-dashboard
services:
- name: kubernetes-dashboard
port: 443
serversTransport: dashboard-transport
tls:
secretName: wildcard-internal-wild-cloud-tls
---
# HTTP to HTTPS redirect.
# FIXME: Is this needed?
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: kubernetes-dashboard-http
namespace: kubernetes-dashboard
spec:
entryPoints:
- web
routes:
- match: Host(`dashboard.{{ .cloud.internalDomain }}`)
kind: Rule
middlewares:
- name: dashboard-redirect-scheme
namespace: kubernetes-dashboard
services:
- name: kubernetes-dashboard
port: 443
serversTransport: dashboard-transport
---
# ServersTransport for HTTPS backend with skip verify.
# FIXME: Is this needed?
apiVersion: traefik.io/v1alpha1
kind: ServersTransport
metadata:
name: dashboard-transport
namespace: kubernetes-dashboard
spec:
insecureSkipVerify: true
serverName: dashboard.{{ .cloud.internalDomain }}

View File

@@ -0,0 +1,6 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- dashboard-admin-rbac.yaml
- dashboard-kube-system.yaml

View File

@@ -0,0 +1,11 @@
name: kubernetes-dashboard
description: Web-based Kubernetes user interface
namespace: kubernetes-dashboard
category: infrastructure
dependencies:
- traefik
- cert-manager
configReferences:
- cloud.internalDomain

View File

@@ -0,0 +1,20 @@
# Longhorn Storage
See: [Longhorn Docs v 1.8.1](https://longhorn.io/docs/1.8.1/deploy/install/install-with-kubectl/)
## Installation Notes
- Manifest copied from https://raw.githubusercontent.com/longhorn/longhorn/v1.8.1/deploy/longhorn.yaml
- Using kustomize to apply custom configuration (see `kustomization.yaml`)
## Important Settings
- **Number of Replicas**: Set to 1 (default is 3) to accommodate smaller clusters
- This avoids "degraded" volumes when fewer than 3 nodes are available
- For production with 3+ nodes, consider changing back to 3 for better availability
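One way to express the single-replica default (a sketch; the actual override lives in `kustomization.yaml` and may differ) is a StorageClass with an explicit replica count:
```bash
# Sketch: a Longhorn StorageClass pinning replicas to 1.
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-single-replica
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "1"
  staleReplicaTimeout: "2880"
EOF
```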
## Common Operations
- View volumes: `kubectl get volumes.longhorn.io -n longhorn-system`
- Check volume status: `kubectl describe volumes.longhorn.io <volume-name> -n longhorn-system`
- Access Longhorn UI: Set up port-forwarding with `kubectl -n longhorn-system port-forward service/longhorn-frontend 8080:80`

View File

@@ -0,0 +1,52 @@
#!/bin/bash
set -e
set -o pipefail
# Ensure WILD_INSTANCE is set
if [ -z "${WILD_INSTANCE}" ]; then
echo "❌ ERROR: WILD_INSTANCE is not set"
exit 1
fi
# Ensure WILD_API_DATA_DIR is set
if [ -z "${WILD_API_DATA_DIR}" ]; then
echo "❌ ERROR: WILD_API_DATA_DIR is not set"
exit 1
fi
# Ensure KUBECONFIG is set
if [ -z "${KUBECONFIG}" ]; then
echo "❌ ERROR: KUBECONFIG is not set"
exit 1
fi
INSTANCE_DIR="${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}"
CLUSTER_SETUP_DIR="${INSTANCE_DIR}/setup/cluster-services"
LONGHORN_DIR="${CLUSTER_SETUP_DIR}/longhorn"
echo "🔧 === Setting up Longhorn ==="
echo ""
# Templates should already be compiled
echo "📦 Using pre-compiled Longhorn templates..."
if [ ! -d "${LONGHORN_DIR}/kustomize" ]; then
echo "❌ ERROR: Compiled templates not found at ${LONGHORN_DIR}/kustomize"
echo "Templates should be compiled before deployment."
exit 1
fi
echo "🚀 Deploying Longhorn..."
kubectl apply -k ${LONGHORN_DIR}/kustomize/
echo "⏳ Waiting for Longhorn to be ready..."
kubectl wait --for=condition=available --timeout=300s deployment/longhorn-driver-deployer -n longhorn-system || true
echo ""
echo "✅ Longhorn installed successfully"
echo ""
echo "💡 To verify the installation:"
echo " kubectl get pods -n longhorn-system"
echo " kubectl get storageclass"
echo ""
echo "🌐 To access the Longhorn UI:"
echo " kubectl port-forward -n longhorn-system svc/longhorn-frontend 8080:80"

View File

@@ -0,0 +1,21 @@
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: longhorn-ingress
namespace: longhorn-system
spec:
rules:
- host: "longhorn.{{ .cloud.internalDomain }}"
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: longhorn-frontend
port:
number: 80
tls:
- secretName: wildcard-internal-wild-cloud-tls
hosts:
- "longhorn.{{ .cloud.internalDomain }}"

View File

@@ -0,0 +1,6 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- longhorn.yaml
- ingress.yaml

File diff suppressed because it is too large

View File

@@ -0,0 +1,7 @@
name: longhorn
description: Cloud-native distributed block storage for Kubernetes
namespace: longhorn-system
category: infrastructure
dependencies:
- traefik

View File

@@ -0,0 +1,56 @@
#!/bin/bash
set -e
set -o pipefail
# Ensure WILD_INSTANCE is set
if [ -z "${WILD_INSTANCE}" ]; then
echo "❌ ERROR: WILD_INSTANCE is not set"
exit 1
fi
# Ensure WILD_API_DATA_DIR is set
if [ -z "${WILD_API_DATA_DIR}" ]; then
echo "❌ ERROR: WILD_API_DATA_DIR is not set"
exit 1
fi
# Ensure KUBECONFIG is set
if [ -z "${KUBECONFIG}" ]; then
echo "❌ ERROR: KUBECONFIG is not set"
exit 1
fi
INSTANCE_DIR="${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}"
CLUSTER_SETUP_DIR="${INSTANCE_DIR}/setup/cluster-services"
METALLB_DIR="${CLUSTER_SETUP_DIR}/metallb"
echo "🔧 === Setting up MetalLB ==="
echo ""
# Templates should already be compiled
echo "📦 Using pre-compiled MetalLB templates..."
if [ ! -d "${METALLB_DIR}/kustomize" ]; then
echo "❌ ERROR: Compiled templates not found at ${METALLB_DIR}/kustomize"
echo "Templates should be compiled before deployment."
exit 1
fi
echo "🚀 Deploying MetalLB installation..."
kubectl apply -k ${METALLB_DIR}/kustomize/installation
echo "⏳ Waiting for MetalLB controller to be ready..."
kubectl wait --for=condition=Available deployment/controller -n metallb-system --timeout=60s
echo "⏳ Extra buffer for webhook initialization..."
sleep 10
echo "⚙️ Applying MetalLB configuration..."
kubectl apply -k ${METALLB_DIR}/kustomize/configuration
echo ""
echo "✅ MetalLB installed and configured successfully"
echo ""
echo "💡 To verify the installation:"
echo " kubectl get pods -n metallb-system"
echo " kubectl get ipaddresspools.metallb.io -n metallb-system"
echo ""
echo "🌐 MetalLB will now provide LoadBalancer IPs for your services"

View File

@@ -0,0 +1,3 @@
namespace: metallb-system
resources:
- pool.yaml

View File

@@ -0,0 +1,19 @@
---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: first-pool
namespace: metallb-system
spec:
addresses:
- {{ .cluster.ipAddressPool }}
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: l2-advertisement
namespace: metallb-system
spec:
ipAddressPools:
- first-pool
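With the pool and L2 advertisement applied, any `type: LoadBalancer` Service should receive an address from the configured range; a quick sketch:
```bash
# Sketch: request a LoadBalancer IP from the pool and inspect the assignment.
kubectl create deployment echo --image=nginx
kubectl expose deployment echo --port=80 --type=LoadBalancer
kubectl get svc echo -o jsonpath='{.status.loadBalancer.ingress[0].ip}'; echo
# Cleanup:
kubectl delete svc echo && kubectl delete deployment echo
```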

View File

@@ -0,0 +1,3 @@
namespace: metallb-system
resources:
- github.com/metallb/metallb/config/native?ref=v0.15.0

View File

@@ -0,0 +1,19 @@
name: metallb
description: Bare metal load-balancer for Kubernetes
namespace: metallb-system
category: infrastructure
configReferences:
- cluster.name
serviceConfig:
ipRange:
path: cluster.ipAddressPool
prompt: "Enter IP range for MetalLB (e.g., 192.168.1.240-192.168.1.250)"
default: "192.168.1.240-192.168.1.250"
type: string
loadBalancerIp:
path: cluster.loadBalancerIp
prompt: "Enter primary load balancer IP"
default: "192.168.1.240"
type: string

Some files were not shown because too many files have changed in this diff