Compare commits

...

8 Commits

Author SHA1 Message Date
Paul Payne
7cd434aabf feat(api): Enhance NodeDiscover with subnet auto-detection and discovery cancellation
- Updated NodeDiscover to accept an optional subnet parameter, with auto-detection of local networks if none is provided.
- Removed support for IP list format in NodeDiscover request body.
- Implemented discovery cancellation functionality with NodeDiscoveryCancel endpoint.
- Improved error handling and response messages for better clarity.

feat(cluster): Add operation tracking for cluster bootstrap process

- Integrated operations manager into cluster manager for tracking bootstrap progress.
- Refactored Bootstrap method to run asynchronously with detailed progress updates.
- Added methods to wait for various bootstrap steps (etcd health, VIP assignment, control plane readiness, etc.).

fix(discovery): Optimize node discovery process and improve maintenance mode detection

- Enhanced node discovery to run in parallel with a semaphore to limit concurrent scans.
- Updated probeNode to detect maintenance mode more reliably.
- Added functions to expand CIDR notation into individual IP addresses and retrieve local network interfaces.

refactor(node): Update node manager to handle instance-specific configurations

- Modified NewManager to accept instanceName for tailored talosconfig usage.
- Improved hardware detection logic to handle maintenance mode scenarios.

feat(operations): Implement detailed bootstrap progress tracking

- Introduced BootstrapProgress struct to track and report the status of bootstrap operations.
- Updated operation management to include bootstrap-specific details.

fix(tools): Improve talosctl command execution with context and error handling

- Added context with timeout to talosctl commands to prevent hanging on unreachable nodes.
- Enhanced error handling for version retrieval in maintenance mode.
2025-11-04 17:16:16 +00:00
Paul Payne
005dc30aa5 Adds app endpoints for configuration and status. 2025-10-22 23:17:52 +00:00
Paul Payne
5b7d2835e7 Instance-namespace additional utility endpoints. 2025-10-14 21:06:18 +00:00
Paul Payne
67ca1b85be Functions for common paths. 2025-10-14 19:23:16 +00:00
Paul Payne
679ea18446 Namespace dashboard token endpoint in an instance. 2025-10-14 18:52:27 +00:00
Paul Payne
d2c8ff716e Lint fixes. 2025-10-14 07:31:54 +00:00
Paul Payne
2fd71c32dc Formatting. 2025-10-14 07:13:00 +00:00
Paul Payne
afb0e09aae Service config. Service logs. Service status. 2025-10-14 05:26:45 +00:00
36 changed files with 3414 additions and 533 deletions

View File

@@ -38,7 +38,106 @@
- Write unit tests for all functions and methods.
- Make and use common modules. For example, one module should handle all interactions with talosctl. Another module should handle all interactions with kubectl.
- If the code is getting long and complex, break it into smaller modules.
- API requests and responses should be valid JSON. Object attribute names should use standard JSON camelCase.
### Features
- If WILD_CENTRAL_ENV environment variable is set to "development", the API should run in development mode.
## Patterns
### Instance-scoped Endpoints
Instance-scoped endpoints follow a consistent pattern to ensure stateless, RESTful API design. The instance name is always included in the URL path, not retrieved from session state or context.
#### Route Pattern
```go
// In handlers.go
r.HandleFunc("/api/v1/instances/{name}/utilities/dashboard/token", api.UtilitiesDashboardToken).Methods("GET")
```
#### Handler Pattern
```go
// In handlers_utilities.go
func (api *API) UtilitiesDashboardToken(w http.ResponseWriter, r *http.Request) {
// 1. Extract instance name from URL path parameters
vars := mux.Vars(r)
instanceName := vars["name"]
// 2. Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// 3. Construct instance-specific paths using tools helpers
kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)
// 4. Perform instance-specific operations
token, err := utilities.GetDashboardToken(kubeconfigPath)
if err != nil {
respondError(w, http.StatusInternalServerError, "Failed to get dashboard token")
return
}
// 5. Return response
respondJSON(w, http.StatusOK, map[string]interface{}{
"success": true,
"data": token,
})
}
```
#### Key Principles
1. **Instance name in URL**: Always include instance name as a path parameter (`{name}`)
2. **Extract from mux.Vars()**: Get instance name from `mux.Vars(r)["name"]`, not from context
3. **Validate instance**: Always validate the instance exists before operations
4. **Use path helpers**: Use `tools.GetKubeconfigPath()`, `tools.GetInstanceConfigPath()`, etc. instead of inline `filepath.Join()` constructions
5. **Stateless handlers**: Handlers should not depend on session state or current context
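The path helpers referenced in principle 4 might look like the following. This is a sketch assuming the layout implied by the handlers in this change set (`<dataDir>/instances/<name>/kubeconfig`); the real `tools` package may differ.

```go
package main

import "path/filepath"

// GetInstancesPath returns the root directory holding all instances.
func GetInstancesPath(dataDir string) string {
	return filepath.Join(dataDir, "instances")
}

// GetKubeconfigPath returns the kubeconfig path for a named instance,
// replacing ad-hoc filepath.Join constructions in handlers.
func GetKubeconfigPath(dataDir, instanceName string) string {
	return filepath.Join(GetInstancesPath(dataDir), instanceName, "kubeconfig")
}
```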
### kubectl and talosctl Commands
When making kubectl or talosctl calls for a specific instance, always use the `tools` package helpers to set the correct context.
#### Using kubectl with Instance Kubeconfig
```go
// In utilities.go or similar
func GetDashboardToken(kubeconfigPath string) (*DashboardToken, error) {
cmd := exec.Command("kubectl", "-n", "kubernetes-dashboard", "create", "token", "dashboard-admin")
tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.Output()
if err != nil {
return nil, fmt.Errorf("failed to create token: %w", err)
}
token := strings.TrimSpace(string(output))
return &DashboardToken{Token: token}, nil
}
```
#### Using talosctl with Instance Talosconfig
```go
// In cluster operations
func GetClusterHealth(talosconfigPath string, nodeIP string) error {
cmd := exec.Command("talosctl", "health", "--nodes", nodeIP)
tools.WithTalosconfig(cmd, talosconfigPath)
output, err := cmd.Output()
if err != nil {
return fmt.Errorf("failed to check health: %w", err)
}
// Process output...
return nil
}
```
#### Key Principles
1. **Use tools helpers**: Always use `tools.WithKubeconfig()` or `tools.WithTalosconfig()` instead of manually setting environment variables
2. **Get paths from tools package**: Use `tools.GetKubeconfigPath()` or `tools.GetTalosconfigPath()` to construct config paths
3. **One config per command**: Each exec.Command should have its config set via the appropriate helper
4. **Error handling**: Always check for command execution errors and provide context

go.mod
View File

@@ -4,7 +4,6 @@ go 1.24
require (
github.com/gorilla/mux v1.8.1
github.com/rs/cors v1.11.1
gopkg.in/yaml.v3 v3.0.1
)
require github.com/rs/cors v1.11.1 // indirect

View File

@@ -7,7 +7,6 @@ import (
"log"
"net/http"
"os"
"path/filepath"
"time"
"github.com/gorilla/mux"
@@ -19,6 +18,7 @@ import (
"github.com/wild-cloud/wild-central/daemon/internal/instance"
"github.com/wild-cloud/wild-central/daemon/internal/operations"
"github.com/wild-cloud/wild-central/daemon/internal/secrets"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// API holds all dependencies for API handlers
@@ -30,6 +30,7 @@ type API struct {
context *context.Manager
instance *instance.Manager
dnsmasq *dnsmasq.ConfigGenerator
opsMgr *operations.Manager // Operations manager
broadcaster *operations.Broadcaster // SSE broadcaster for operation output
}
@@ -37,7 +38,7 @@ type API struct {
// Note: Setup files (cluster-services, cluster-nodes, etc.) are now embedded in the binary
func NewAPI(dataDir, appsDir string) (*API, error) {
// Ensure base directories exist
instancesDir := filepath.Join(dataDir, "instances")
instancesDir := tools.GetInstancesPath(dataDir)
if err := os.MkdirAll(instancesDir, 0755); err != nil {
return nil, fmt.Errorf("failed to create instances directory: %w", err)
}
@@ -57,6 +58,7 @@ func NewAPI(dataDir, appsDir string) (*API, error) {
context: context.NewManager(dataDir),
instance: instance.NewManager(dataDir),
dnsmasq: dnsmasq.NewConfigGenerator(dnsmasqConfigPath),
opsMgr: operations.NewManager(dataDir),
broadcaster: operations.NewBroadcaster(),
}, nil
}
@@ -85,6 +87,7 @@ func (api *API) RegisterRoutes(r *mux.Router) {
r.HandleFunc("/api/v1/instances/{name}/nodes/discover", api.NodeDiscover).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/nodes/detect", api.NodeDetect).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/discovery", api.NodeDiscoveryStatus).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/discovery/cancel", api.NodeDiscoveryCancel).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/nodes/hardware/{ip}", api.NodeHardware).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/nodes/fetch-templates", api.NodeFetchTemplates).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/nodes", api.NodeAdd).Methods("POST")
@@ -108,9 +111,9 @@ func (api *API) RegisterRoutes(r *mux.Router) {
// Operations
r.HandleFunc("/api/v1/instances/{name}/operations", api.OperationList).Methods("GET")
r.HandleFunc("/api/v1/operations/{id}", api.OperationGet).Methods("GET")
r.HandleFunc("/api/v1/operations/{id}/stream", api.OperationStream).Methods("GET")
r.HandleFunc("/api/v1/operations/{id}/cancel", api.OperationCancel).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/operations/{id}", api.OperationGet).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/operations/{id}/stream", api.OperationStream).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/operations/{id}/cancel", api.OperationCancel).Methods("POST")
// Cluster operations
r.HandleFunc("/api/v1/instances/{name}/cluster/config/generate", api.ClusterGenerateConfig).Methods("POST")
@@ -138,6 +141,8 @@ func (api *API) RegisterRoutes(r *mux.Router) {
r.HandleFunc("/api/v1/instances/{name}/services/{service}/fetch", api.ServicesFetch).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/services/{service}/compile", api.ServicesCompile).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/services/{service}/deploy", api.ServicesDeploy).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/services/{service}/logs", api.ServicesGetLogs).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/services/{service}/config", api.ServicesUpdateConfig).Methods("PATCH")
// Apps
r.HandleFunc("/api/v1/apps", api.AppsListAvailable).Methods("GET")
@@ -148,19 +153,25 @@ func (api *API) RegisterRoutes(r *mux.Router) {
r.HandleFunc("/api/v1/instances/{name}/apps/{app}", api.AppsDelete).Methods("DELETE")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/status", api.AppsGetStatus).Methods("GET")
// Enhanced app endpoints
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/enhanced", api.AppsGetEnhanced).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/runtime", api.AppsGetEnhancedStatus).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/logs", api.AppsGetLogs).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/events", api.AppsGetEvents).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/readme", api.AppsGetReadme).Methods("GET")
// Backup & Restore
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/backup", api.BackupAppStart).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/backup", api.BackupAppList).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/apps/{app}/restore", api.BackupAppRestore).Methods("POST")
// Utilities
r.HandleFunc("/api/v1/utilities/health", api.UtilitiesHealth).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/utilities/health", api.InstanceUtilitiesHealth).Methods("GET")
r.HandleFunc("/api/v1/utilities/dashboard/token", api.UtilitiesDashboardToken).Methods("GET")
r.HandleFunc("/api/v1/utilities/nodes/ips", api.UtilitiesNodeIPs).Methods("GET")
r.HandleFunc("/api/v1/utilities/controlplane/ip", api.UtilitiesControlPlaneIP).Methods("GET")
r.HandleFunc("/api/v1/utilities/secrets/{secret}/copy", api.UtilitiesSecretCopy).Methods("POST")
r.HandleFunc("/api/v1/utilities/version", api.UtilitiesVersion).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/utilities/dashboard/token", api.UtilitiesDashboardToken).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/utilities/nodes/ips", api.UtilitiesNodeIPs).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/utilities/controlplane/ip", api.UtilitiesControlPlaneIP).Methods("GET")
r.HandleFunc("/api/v1/instances/{name}/utilities/secrets/{secret}/copy", api.UtilitiesSecretCopy).Methods("POST")
r.HandleFunc("/api/v1/instances/{name}/utilities/version", api.UtilitiesVersion).Methods("GET")
// dnsmasq management
r.HandleFunc("/api/v1/dnsmasq/status", api.DnsmasqStatus).Methods("GET")
@@ -290,12 +301,9 @@ func (api *API) GetConfig(w http.ResponseWriter, r *http.Request) {
respondJSON(w, http.StatusOK, configMap)
}
// UpdateConfig updates instance configuration
func (api *API) UpdateConfig(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
name := vars["name"]
if err := api.instance.ValidateInstance(name); err != nil {
// updateYAMLFile updates a YAML file with the provided key-value pairs
func (api *API) updateYAMLFile(w http.ResponseWriter, r *http.Request, instanceName, fileType string, updateFunc func(string, string, string) error) {
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
@@ -312,22 +320,40 @@ func (api *API) UpdateConfig(w http.ResponseWriter, r *http.Request) {
return
}
configPath := api.instance.GetInstanceConfigPath(name)
var filePath string
if fileType == "config" {
filePath = api.instance.GetInstanceConfigPath(instanceName)
} else {
filePath = api.instance.GetInstanceSecretsPath(instanceName)
}
// Update each key-value pair
for key, value := range updates {
valueStr := fmt.Sprintf("%v", value)
if err := api.config.SetConfigValue(configPath, key, valueStr); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to update config key %s: %v", key, err))
if err := updateFunc(filePath, key, valueStr); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to update %s key %s: %v", fileType, key, err))
return
}
}
// Capitalize first letter of fileType for message
fileTypeCap := fileType
if len(fileType) > 0 {
fileTypeCap = string(fileType[0]-32) + fileType[1:]
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "Config updated successfully",
"message": fmt.Sprintf("%s updated successfully", fileTypeCap),
})
}
// UpdateConfig updates instance configuration
func (api *API) UpdateConfig(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
name := vars["name"]
api.updateYAMLFile(w, r, name, "config", api.config.SetConfigValue)
}
// GetSecrets retrieves instance secrets (redacted by default)
func (api *API) GetSecrets(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
@@ -373,39 +399,7 @@ func (api *API) GetSecrets(w http.ResponseWriter, r *http.Request) {
func (api *API) UpdateSecrets(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
name := vars["name"]
if err := api.instance.ValidateInstance(name); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
body, err := io.ReadAll(r.Body)
if err != nil {
respondError(w, http.StatusBadRequest, "Failed to read request body")
return
}
var updates map[string]interface{}
if err := yaml.Unmarshal(body, &updates); err != nil {
respondError(w, http.StatusBadRequest, fmt.Sprintf("Invalid YAML: %v", err))
return
}
// Get secrets file path
secretsPath := api.instance.GetInstanceSecretsPath(name)
// Update each secret
for key, value := range updates {
valueStr := fmt.Sprintf("%v", value)
if err := api.secrets.SetSecret(secretsPath, key, valueStr); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to update secret %s: %v", key, err))
return
}
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "Secrets updated successfully",
})
api.updateYAMLFile(w, r, name, "secrets", api.secrets.SetSecret)
}
// GetContext retrieves current context
@@ -485,7 +479,7 @@ func (api *API) StatusHandler(w http.ResponseWriter, r *http.Request, startTime
func respondJSON(w http.ResponseWriter, status int, data interface{}) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(status)
json.NewEncoder(w).Encode(data)
_ = json.NewEncoder(w).Encode(data)
}
func respondError(w http.ResponseWriter, status int, message string) {

View File

@@ -4,11 +4,16 @@ import (
"encoding/json"
"fmt"
"net/http"
"os"
"path/filepath"
"strconv"
"strings"
"github.com/gorilla/mux"
"github.com/wild-cloud/wild-central/daemon/internal/apps"
"github.com/wild-cloud/wild-central/daemon/internal/operations"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// AppsListAvailable lists all available apps
@@ -106,80 +111,62 @@ func (api *API) AppsAdd(w http.ResponseWriter, r *http.Request) {
})
}
// AppsDeploy deploys an app to the cluster
func (api *API) AppsDeploy(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// startAppOperation starts an app operation (deploy or delete) in the background
func (api *API) startAppOperation(w http.ResponseWriter, instanceName, appName, operationType, successMessage string, operation func(*apps.Manager, string, string) error) {
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Start deploy operation
// Start operation
opsMgr := operations.NewManager(api.dataDir)
opID, err := opsMgr.Start(instanceName, "deploy_app", appName)
opID, err := opsMgr.Start(instanceName, operationType, appName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to start operation: %v", err))
return
}
// Deploy in background
// Execute operation in background
go func() {
appsMgr := apps.NewManager(api.dataDir, api.appsDir)
opsMgr.UpdateStatus(instanceName, opID, "running")
_ = opsMgr.UpdateStatus(instanceName, opID, "running")
if err := appsMgr.Deploy(instanceName, appName); err != nil {
opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
if err := operation(appsMgr, instanceName, appName); err != nil {
_ = opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
opsMgr.Update(instanceName, opID, "completed", "App deployed", 100)
_ = opsMgr.Update(instanceName, opID, "completed", successMessage, 100)
}
}()
respondJSON(w, http.StatusAccepted, map[string]string{
"operation_id": opID,
"message": "App deployment initiated",
"message": fmt.Sprintf("App %s initiated", operationType),
})
}
// AppsDeploy deploys an app to the cluster
func (api *API) AppsDeploy(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
api.startAppOperation(w, instanceName, appName, "deploy_app", "App deployed",
func(mgr *apps.Manager, instance, app string) error {
return mgr.Deploy(instance, app)
})
}
// AppsDelete deletes an app
func (api *API) AppsDelete(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Start delete operation
opsMgr := operations.NewManager(api.dataDir)
opID, err := opsMgr.Start(instanceName, "delete_app", appName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to start operation: %v", err))
return
}
// Delete in background
go func() {
appsMgr := apps.NewManager(api.dataDir, api.appsDir)
opsMgr.UpdateStatus(instanceName, opID, "running")
if err := appsMgr.Delete(instanceName, appName); err != nil {
opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
opsMgr.Update(instanceName, opID, "completed", "App deleted", 100)
}
}()
respondJSON(w, http.StatusAccepted, map[string]string{
"operation_id": opID,
"message": "App deletion initiated",
})
api.startAppOperation(w, instanceName, appName, "delete_app", "App deleted",
func(mgr *apps.Manager, instance, app string) error {
return mgr.Delete(instance, app)
})
}
// AppsGetStatus returns app status
@@ -204,3 +191,190 @@ func (api *API) AppsGetStatus(w http.ResponseWriter, r *http.Request) {
respondJSON(w, http.StatusOK, status)
}
// AppsGetEnhanced returns enhanced app details with runtime status
func (api *API) AppsGetEnhanced(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Get enhanced app details
appsMgr := apps.NewManager(api.dataDir, api.appsDir)
enhanced, err := appsMgr.GetEnhanced(instanceName, appName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get app details: %v", err))
return
}
respondJSON(w, http.StatusOK, enhanced)
}
// AppsGetEnhancedStatus returns just runtime status for an app
func (api *API) AppsGetEnhancedStatus(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Get runtime status
appsMgr := apps.NewManager(api.dataDir, api.appsDir)
status, err := appsMgr.GetEnhancedStatus(instanceName, appName)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Failed to get runtime status: %v", err))
return
}
respondJSON(w, http.StatusOK, status)
}
// AppsGetLogs returns logs for an app (from first pod)
func (api *API) AppsGetLogs(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Parse query parameters
tailStr := r.URL.Query().Get("tail")
sinceSecondsStr := r.URL.Query().Get("sinceSeconds")
podName := r.URL.Query().Get("pod")
tail := 100 // default
if tailStr != "" {
if t, err := strconv.Atoi(tailStr); err == nil && t > 0 {
tail = t
}
}
sinceSeconds := 0
if sinceSecondsStr != "" {
if s, err := strconv.Atoi(sinceSecondsStr); err == nil && s > 0 {
sinceSeconds = s
}
}
// Get logs
kubeconfigPath := api.dataDir + "/instances/" + instanceName + "/kubeconfig"
kubectl := tools.NewKubectl(kubeconfigPath)
// If no pod specified, get the first pod
if podName == "" {
pods, err := kubectl.GetPods(appName, true)
if err != nil || len(pods) == 0 {
respondError(w, http.StatusNotFound, "No pods found for app")
return
}
podName = pods[0].Name
}
logOpts := tools.LogOptions{
Tail: tail,
SinceSeconds: sinceSeconds,
}
logs, err := kubectl.GetLogs(appName, podName, logOpts)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get logs: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]interface{}{
"pod": podName,
"logs": logs,
})
}
// AppsGetEvents returns kubernetes events for an app
func (api *API) AppsGetEvents(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Parse query parameters
limitStr := r.URL.Query().Get("limit")
limit := 20 // default
if limitStr != "" {
if l, err := strconv.Atoi(limitStr); err == nil && l > 0 {
limit = l
}
}
// Get events
kubeconfigPath := api.dataDir + "/instances/" + instanceName + "/kubeconfig"
kubectl := tools.NewKubectl(kubeconfigPath)
events, err := kubectl.GetRecentEvents(appName, limit)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get events: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]interface{}{
"events": events,
})
}
// AppsGetReadme returns the README.md content for an app
func (api *API) AppsGetReadme(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
appName := vars["app"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Validate app name to prevent path traversal
if appName == "" || appName == "." || appName == ".." ||
strings.Contains(appName, "/") || strings.Contains(appName, "\\") {
respondError(w, http.StatusBadRequest, "Invalid app name")
return
}
// Try instance-specific README first
instancePath := filepath.Join(api.dataDir, "instances", instanceName, "apps", appName, "README.md")
content, err := os.ReadFile(instancePath)
if err == nil {
w.Header().Set("Content-Type", "text/markdown; charset=utf-8")
w.Write(content)
return
}
// Fall back to global directory
globalPath := filepath.Join(api.appsDir, appName, "README.md")
content, err = os.ReadFile(globalPath)
if err != nil {
if os.IsNotExist(err) {
respondError(w, http.StatusNotFound, fmt.Sprintf("README not found for app '%s' in instance '%s'", appName, instanceName))
} else {
respondError(w, http.StatusInternalServerError, "Failed to read README file")
}
return
}
w.Header().Set("Content-Type", "text/markdown; charset=utf-8")
w.Write(content)
}

View File

@@ -27,15 +27,15 @@ func (api *API) BackupAppStart(w http.ResponseWriter, r *http.Request) {
// Run backup in background
go func() {
opMgr.UpdateProgress(instanceName, opID, 10, "Starting backup")
_ = opMgr.UpdateProgress(instanceName, opID, 10, "Starting backup")
info, err := mgr.BackupApp(instanceName, appName)
if err != nil {
opMgr.Update(instanceName, opID, "failed", err.Error(), 100)
_ = opMgr.Update(instanceName, opID, "failed", err.Error(), 100)
return
}
opMgr.Update(instanceName, opID, "completed", "Backup completed", 100)
_ = opMgr.Update(instanceName, opID, "completed", "Backup completed", 100)
_ = info // Metadata saved in backup.json
}()
@@ -92,14 +92,14 @@ func (api *API) BackupAppRestore(w http.ResponseWriter, r *http.Request) {
// Run restore in background
go func() {
opMgr.UpdateProgress(instanceName, opID, 10, "Starting restore")
_ = opMgr.UpdateProgress(instanceName, opID, 10, "Starting restore")
if err := mgr.RestoreApp(instanceName, appName, opts); err != nil {
opMgr.Update(instanceName, opID, "failed", err.Error(), 100)
_ = opMgr.Update(instanceName, opID, "failed", err.Error(), 100)
return
}
opMgr.Update(instanceName, opID, "completed", "Restore completed", 100)
_ = opMgr.Update(instanceName, opID, "completed", "Restore completed", 100)
}()
respondJSON(w, http.StatusAccepted, map[string]interface{}{

View File

@@ -46,15 +46,15 @@ func (api *API) ClusterGenerateConfig(w http.ResponseWriter, r *http.Request) {
}
// Create cluster config
config := cluster.ClusterConfig{
clusterConfig := cluster.ClusterConfig{
ClusterName: clusterName,
VIP: vip,
Version: version,
}
// Generate configuration
clusterMgr := cluster.NewManager(api.dataDir)
if err := clusterMgr.GenerateConfig(instanceName, &config); err != nil {
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
if err := clusterMgr.GenerateConfig(instanceName, &clusterConfig); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to generate config: %v", err))
return
}
@@ -90,26 +90,14 @@ func (api *API) ClusterBootstrap(w http.ResponseWriter, r *http.Request) {
return
}
// Start bootstrap operation
opsMgr := operations.NewManager(api.dataDir)
opID, err := opsMgr.Start(instanceName, "bootstrap", req.Node)
// Bootstrap with progress tracking
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
opID, err := clusterMgr.Bootstrap(instanceName, req.Node)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to start operation: %v", err))
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to start bootstrap: %v", err))
return
}
// Bootstrap in background
go func() {
clusterMgr := cluster.NewManager(api.dataDir)
opsMgr.UpdateStatus(instanceName, opID, "running")
if err := clusterMgr.Bootstrap(instanceName, req.Node); err != nil {
opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
opsMgr.Update(instanceName, opID, "completed", "Bootstrap completed", 100)
}
}()
respondJSON(w, http.StatusAccepted, map[string]string{
"operation_id": opID,
"message": "Bootstrap initiated",
@@ -138,7 +126,7 @@ func (api *API) ClusterConfigureEndpoints(w http.ResponseWriter, r *http.Request
}
// Configure endpoints
clusterMgr := cluster.NewManager(api.dataDir)
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
if err := clusterMgr.ConfigureEndpoints(instanceName, req.IncludeNodes); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to configure endpoints: %v", err))
return
@@ -161,7 +149,7 @@ func (api *API) ClusterGetStatus(w http.ResponseWriter, r *http.Request) {
}
// Get status
clusterMgr := cluster.NewManager(api.dataDir)
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
status, err := clusterMgr.GetStatus(instanceName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get status: %v", err))
@@ -183,7 +171,7 @@ func (api *API) ClusterHealth(w http.ResponseWriter, r *http.Request) {
}
// Get health checks
clusterMgr := cluster.NewManager(api.dataDir)
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
checks, err := clusterMgr.Health(instanceName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get health: %v", err))
@@ -219,7 +207,7 @@ func (api *API) ClusterGetKubeconfig(w http.ResponseWriter, r *http.Request) {
}
// Get kubeconfig
clusterMgr := cluster.NewManager(api.dataDir)
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
kubeconfig, err := clusterMgr.GetKubeconfig(instanceName)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Kubeconfig not found: %v", err))
@@ -243,7 +231,7 @@ func (api *API) ClusterGenerateKubeconfig(w http.ResponseWriter, r *http.Request
}
// Regenerate kubeconfig from cluster
clusterMgr := cluster.NewManager(api.dataDir)
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
if err := clusterMgr.RegenerateKubeconfig(instanceName); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to generate kubeconfig: %v", err))
return
@@ -266,7 +254,7 @@ func (api *API) ClusterGetTalosconfig(w http.ResponseWriter, r *http.Request) {
}
// Get talosconfig
clusterMgr := cluster.NewManager(api.dataDir)
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
talosconfig, err := clusterMgr.GetTalosconfig(instanceName)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Talosconfig not found: %v", err))
@@ -314,13 +302,13 @@ func (api *API) ClusterReset(w http.ResponseWriter, r *http.Request) {
// Reset in background
go func() {
clusterMgr := cluster.NewManager(api.dataDir)
opsMgr.UpdateStatus(instanceName, opID, "running")
clusterMgr := cluster.NewManager(api.dataDir, api.opsMgr)
_ = opsMgr.UpdateStatus(instanceName, opID, "running")
if err := clusterMgr.Reset(instanceName, req.Confirm); err != nil {
opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
_ = opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
opsMgr.Update(instanceName, opID, "completed", "Cluster reset completed", 100)
_ = opsMgr.Update(instanceName, opID, "completed", "Cluster reset completed", 100)
}
}()

View File

@@ -12,6 +12,7 @@ import (
)
// NodeDiscover initiates node discovery
// Accepts optional subnet parameter. If no subnet provided, auto-detects local networks.
func (api *API) NodeDiscover(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
@@ -22,10 +23,9 @@ func (api *API) NodeDiscover(w http.ResponseWriter, r *http.Request) {
return
}
-// Parse request body - support both subnet and ip_list formats
+// Parse request body - only subnet is supported
var req struct {
-Subnet string `json:"subnet"`
-IPList []string `json:"ip_list"`
+Subnet string `json:"subnet,omitempty"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
@@ -33,16 +33,38 @@ func (api *API) NodeDiscover(w http.ResponseWriter, r *http.Request) {
return
}
-// If subnet provided, use it as a single "IP" for discovery
-// The discovery manager will scan this subnet
+// Build IP list
var ipList []string
+var err error
if req.Subnet != "" {
-ipList = []string{req.Subnet}
-} else if len(req.IPList) > 0 {
-ipList = req.IPList
+// Expand provided CIDR notation to individual IPs
+ipList, err = discovery.ExpandSubnet(req.Subnet)
+if err != nil {
+respondError(w, http.StatusBadRequest, fmt.Sprintf("Invalid subnet: %v", err))
+return
+}
} else {
-respondError(w, http.StatusBadRequest, "subnet or ip_list is required")
-return
+// Auto-detect: Get local networks when no subnet provided
+networks, err := discovery.GetLocalNetworks()
+if err != nil {
+respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to detect local networks: %v", err))
+return
+}
+if len(networks) == 0 {
+respondError(w, http.StatusNotFound, "No local networks found")
+return
+}
+// Expand all detected networks
+for _, network := range networks {
+ips, err := discovery.ExpandSubnet(network)
+if err != nil {
+continue // Skip invalid networks
+}
+ipList = append(ipList, ips...)
+}
}
// Start discovery
@@ -52,9 +74,10 @@ func (api *API) NodeDiscover(w http.ResponseWriter, r *http.Request) {
return
}
-respondJSON(w, http.StatusAccepted, map[string]string{
-"message": "Discovery started",
-"status": "running",
+respondJSON(w, http.StatusAccepted, map[string]interface{}{
+"message": "Discovery started",
+"status": "running",
+"ips_to_scan": len(ipList),
})
}
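The handler above leans on `discovery.ExpandSubnet` to turn CIDR input into individual scannable addresses. A minimal sketch of such a helper, assuming IPv4 subnets and dropping the network and broadcast addresses (the daemon's real implementation may differ):

```go
package main

import (
	"fmt"
	"net"
)

// expandSubnet expands CIDR notation (e.g. "192.168.1.0/30") into the usable
// host IPs. Sketch only; discovery.ExpandSubnet may behave differently.
func expandSubnet(cidr string) ([]string, error) {
	ip, ipNet, err := net.ParseCIDR(cidr)
	if err != nil {
		return nil, fmt.Errorf("invalid subnet: %w", err)
	}
	var ips []string
	for ip := ip.Mask(ipNet.Mask); ipNet.Contains(ip); incIP(ip) {
		ips = append(ips, ip.String())
	}
	// Drop network and broadcast addresses for subnets larger than /31.
	if len(ips) > 2 {
		return ips[1 : len(ips)-1], nil
	}
	return ips, nil
}

// incIP increments an IP address in place, carrying across bytes.
func incIP(ip net.IP) {
	for i := len(ip) - 1; i >= 0; i-- {
		ip[i]++
		if ip[i] != 0 {
			break
		}
	}
}

func main() {
	ips, _ := expandSubnet("192.168.1.0/30")
	fmt.Println(ips) // [192.168.1.1 192.168.1.2]
}
```

Note a /16 expands to ~65k addresses, which is why the response now reports `ips_to_scan` and why the scan itself is bounded by a semaphore.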
@@ -92,7 +115,7 @@ func (api *API) NodeHardware(w http.ResponseWriter, r *http.Request) {
}
// Detect hardware
-nodeMgr := node.NewManager(api.dataDir)
+nodeMgr := node.NewManager(api.dataDir, instanceName)
hwInfo, err := nodeMgr.DetectHardware(nodeIP)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to detect hardware: %v", err))
@@ -103,6 +126,7 @@ func (api *API) NodeHardware(w http.ResponseWriter, r *http.Request) {
}
// NodeDetect detects hardware on a single node (POST with IP in body)
+// IP address is required.
func (api *API) NodeDetect(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
@@ -123,13 +147,14 @@ func (api *API) NodeDetect(w http.ResponseWriter, r *http.Request) {
return
}
// Validate IP is provided
if req.IP == "" {
-respondError(w, http.StatusBadRequest, "ip is required")
+respondError(w, http.StatusBadRequest, "IP address is required")
return
}
-// Detect hardware
-nodeMgr := node.NewManager(api.dataDir)
+// Detect hardware for specific IP
+nodeMgr := node.NewManager(api.dataDir, instanceName)
hwInfo, err := nodeMgr.DetectHardware(req.IP)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to detect hardware: %v", err))
@@ -158,7 +183,7 @@ func (api *API) NodeAdd(w http.ResponseWriter, r *http.Request) {
}
// Add node
-nodeMgr := node.NewManager(api.dataDir)
+nodeMgr := node.NewManager(api.dataDir, instanceName)
if err := nodeMgr.Add(instanceName, &nodeData); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to add node: %v", err))
return
@@ -182,7 +207,7 @@ func (api *API) NodeList(w http.ResponseWriter, r *http.Request) {
}
// List nodes
-nodeMgr := node.NewManager(api.dataDir)
+nodeMgr := node.NewManager(api.dataDir, instanceName)
nodes, err := nodeMgr.List(instanceName)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to list nodes: %v", err))
@@ -207,7 +232,7 @@ func (api *API) NodeGet(w http.ResponseWriter, r *http.Request) {
}
// Get node
-nodeMgr := node.NewManager(api.dataDir)
+nodeMgr := node.NewManager(api.dataDir, instanceName)
nodeData, err := nodeMgr.Get(instanceName, nodeIdentifier)
if err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Node not found: %v", err))
@@ -233,7 +258,7 @@ func (api *API) NodeApply(w http.ResponseWriter, r *http.Request) {
opts := node.ApplyOptions{}
// Apply node configuration
-nodeMgr := node.NewManager(api.dataDir)
+nodeMgr := node.NewManager(api.dataDir, instanceName)
if err := nodeMgr.Apply(instanceName, nodeIdentifier, opts); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to apply node configuration: %v", err))
return
@@ -265,7 +290,7 @@ func (api *API) NodeUpdate(w http.ResponseWriter, r *http.Request) {
}
// Update node
-nodeMgr := node.NewManager(api.dataDir)
+nodeMgr := node.NewManager(api.dataDir, instanceName)
if err := nodeMgr.Update(instanceName, nodeIdentifier, updates); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to update node: %v", err))
return
@@ -289,7 +314,7 @@ func (api *API) NodeFetchTemplates(w http.ResponseWriter, r *http.Request) {
}
// Fetch templates
-nodeMgr := node.NewManager(api.dataDir)
+nodeMgr := node.NewManager(api.dataDir, instanceName)
if err := nodeMgr.FetchTemplates(instanceName); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to fetch templates: %v", err))
return
@@ -313,7 +338,7 @@ func (api *API) NodeDelete(w http.ResponseWriter, r *http.Request) {
}
// Delete node
-nodeMgr := node.NewManager(api.dataDir)
+nodeMgr := node.NewManager(api.dataDir, instanceName)
if err := nodeMgr.Delete(instanceName, nodeIdentifier); err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to delete node: %v", err))
return
@@ -323,3 +348,26 @@ func (api *API) NodeDelete(w http.ResponseWriter, r *http.Request) {
"message": "Node deleted successfully",
})
}
// NodeDiscoveryCancel cancels an in-progress discovery operation
func (api *API) NodeDiscoveryCancel(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Cancel discovery
discoveryMgr := discovery.NewManager(api.dataDir, instanceName)
if err := discoveryMgr.CancelDiscovery(instanceName); err != nil {
respondError(w, http.StatusBadRequest, fmt.Sprintf("Failed to cancel discovery: %v", err))
return
}
respondJSON(w, http.StatusOK, map[string]string{
"message": "Discovery cancelled successfully",
})
}


@@ -11,17 +11,18 @@ import (
"github.com/gorilla/mux"
"github.com/wild-cloud/wild-central/daemon/internal/operations"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// OperationGet returns operation status
func (api *API) OperationGet(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
+instanceName := vars["name"]
opID := vars["id"]
-// Extract instance name from query param or header
-instanceName := r.URL.Query().Get("instance")
-if instanceName == "" {
-respondError(w, http.StatusBadRequest, "instance parameter is required")
+// Validate instance exists
+if err := api.instance.ValidateInstance(instanceName); err != nil {
+respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
@@ -63,12 +64,12 @@ func (api *API) OperationList(w http.ResponseWriter, r *http.Request) {
// OperationCancel cancels an operation
func (api *API) OperationCancel(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
+instanceName := vars["name"]
opID := vars["id"]
-// Extract instance name from query param
-instanceName := r.URL.Query().Get("instance")
-if instanceName == "" {
-respondError(w, http.StatusBadRequest, "instance parameter is required")
+// Validate instance exists
+if err := api.instance.ValidateInstance(instanceName); err != nil {
+respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
@@ -88,12 +89,12 @@ func (api *API) OperationCancel(w http.ResponseWriter, r *http.Request) {
// OperationStream streams operation output via Server-Sent Events (SSE)
func (api *API) OperationStream(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
+instanceName := vars["name"]
opID := vars["id"]
-// Extract instance name from query param
-instanceName := r.URL.Query().Get("instance")
-if instanceName == "" {
-respondError(w, http.StatusBadRequest, "instance parameter is required")
+// Validate instance exists
+if err := api.instance.ValidateInstance(instanceName); err != nil {
+respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
@@ -110,7 +111,7 @@ func (api *API) OperationStream(w http.ResponseWriter, r *http.Request) {
}
// Check if operation is already completed
-statusFile := filepath.Join(api.dataDir, "instances", instanceName, "operations", opID+".json")
+statusFile := filepath.Join(tools.GetInstanceOperationsPath(api.dataDir, instanceName), opID+".json")
isCompleted := false
if data, err := os.ReadFile(statusFile); err == nil {
var op map[string]interface{}
@@ -122,7 +123,7 @@ func (api *API) OperationStream(w http.ResponseWriter, r *http.Request) {
}
// Send existing log file content first (if exists)
-logPath := filepath.Join(api.dataDir, "instances", instanceName, "operations", opID, "output.log")
+logPath := filepath.Join(tools.GetInstanceOperationsPath(api.dataDir, instanceName), opID, "output.log")
if _, err := os.Stat(logPath); err == nil {
file, err := os.Open(logPath)
if err == nil {


@@ -5,14 +5,15 @@ import (
"fmt"
"net/http"
"os"
"path/filepath"
"strings"
"github.com/gorilla/mux"
"gopkg.in/yaml.v3"
"github.com/wild-cloud/wild-central/daemon/internal/contracts"
"github.com/wild-cloud/wild-central/daemon/internal/operations"
"github.com/wild-cloud/wild-central/daemon/internal/services"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// ServicesList lists all base services
@@ -104,20 +105,20 @@ func (api *API) ServicesInstall(w http.ResponseWriter, r *http.Request) {
defer func() {
if r := recover(); r != nil {
fmt.Printf("[ERROR] Service install goroutine panic: %v\n", r)
-opsMgr.Update(instanceName, opID, "failed", fmt.Sprintf("Internal error: %v", r), 0)
+_ = opsMgr.Update(instanceName, opID, "failed", fmt.Sprintf("Internal error: %v", r), 0)
}
}()
fmt.Printf("[DEBUG] Service install goroutine started: service=%s instance=%s opID=%s\n", req.Name, instanceName, opID)
servicesMgr := services.NewManager(api.dataDir)
-opsMgr.UpdateStatus(instanceName, opID, "running")
+_ = opsMgr.UpdateStatus(instanceName, opID, "running")
if err := servicesMgr.Install(instanceName, req.Name, req.Fetch, req.Deploy, opID, api.broadcaster); err != nil {
fmt.Printf("[DEBUG] Service install failed: %v\n", err)
-opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
+_ = opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
fmt.Printf("[DEBUG] Service install completed successfully\n")
-opsMgr.Update(instanceName, opID, "completed", "Service installed", 100)
+_ = opsMgr.Update(instanceName, opID, "completed", "Service installed", 100)
}
}()
@@ -160,12 +161,12 @@ func (api *API) ServicesInstallAll(w http.ResponseWriter, r *http.Request) {
// Install in background
go func() {
servicesMgr := services.NewManager(api.dataDir)
-opsMgr.UpdateStatus(instanceName, opID, "running")
+_ = opsMgr.UpdateStatus(instanceName, opID, "running")
if err := servicesMgr.InstallAll(instanceName, req.Fetch, req.Deploy, opID, api.broadcaster); err != nil {
-opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
+_ = opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
-opsMgr.Update(instanceName, opID, "completed", "All services installed", 100)
+_ = opsMgr.Update(instanceName, opID, "completed", "All services installed", 100)
}
}()
@@ -198,12 +199,12 @@ func (api *API) ServicesDelete(w http.ResponseWriter, r *http.Request) {
// Delete in background
go func() {
servicesMgr := services.NewManager(api.dataDir)
-opsMgr.UpdateStatus(instanceName, opID, "running")
+_ = opsMgr.UpdateStatus(instanceName, opID, "running")
if err := servicesMgr.Delete(instanceName, serviceName); err != nil {
-opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
+_ = opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
} else {
-opsMgr.Update(instanceName, opID, "completed", "Service deleted", 100)
+_ = opsMgr.Update(instanceName, opID, "completed", "Service deleted", 100)
}
}()
@@ -225,11 +226,11 @@ func (api *API) ServicesGetStatus(w http.ResponseWriter, r *http.Request) {
return
}
-// Get status
+// Get detailed status
servicesMgr := services.NewManager(api.dataDir)
-status, err := servicesMgr.GetStatus(instanceName, serviceName)
+status, err := servicesMgr.GetDetailedStatus(instanceName, serviceName)
if err != nil {
-respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get status: %v", err))
+respondError(w, http.StatusNotFound, fmt.Sprintf("Failed to get status: %v", err))
return
}
@@ -296,7 +297,7 @@ func (api *API) ServicesGetInstanceConfig(w http.ResponseWriter, r *http.Request
}
// Load instance config as map for dynamic path extraction
-configPath := filepath.Join(api.dataDir, "instances", instanceName, "config.yaml")
+configPath := tools.GetInstanceConfigPath(api.dataDir, instanceName)
configData, err := os.ReadFile(configPath)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to read instance config: %v", err))
@@ -422,3 +423,110 @@ func (api *API) ServicesDeploy(w http.ResponseWriter, r *http.Request) {
"message": fmt.Sprintf("Service %s deployed successfully", serviceName),
})
}
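Several hunks in this diff replace ad-hoc `filepath.Join(dataDir, "instances", …)` calls with `tools.Get…Path` helpers. A minimal sketch of what those helpers likely look like, assuming the `instances/<name>` layout visible elsewhere in the diff (the real `tools` package may differ):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Centralizing the instance directory layout in one place means a future
// layout change touches one file instead of every handler.
func getInstancePath(dataDir, instance string) string {
	return filepath.Join(dataDir, "instances", instance)
}

func getInstanceConfigPath(dataDir, instance string) string {
	return filepath.Join(getInstancePath(dataDir, instance), "config.yaml")
}

func getInstanceOperationsPath(dataDir, instance string) string {
	return filepath.Join(getInstancePath(dataDir, instance), "operations")
}

func main() {
	fmt.Println(getInstanceConfigPath("/var/lib/wild", "prod"))
	// /var/lib/wild/instances/prod/config.yaml
}
```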
// ServicesGetLogs retrieves or streams service logs
func (api *API) ServicesGetLogs(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
serviceName := vars["service"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Parse query parameters
query := r.URL.Query()
logsReq := contracts.ServiceLogsRequest{
Container: query.Get("container"),
Follow: query.Get("follow") == "true",
Previous: query.Get("previous") == "true",
Since: query.Get("since"),
}
// Parse tail parameter
if tailStr := query.Get("tail"); tailStr != "" {
var tail int
if _, err := fmt.Sscanf(tailStr, "%d", &tail); err == nil {
logsReq.Tail = tail
}
}
// Validate parameters
if logsReq.Tail < 0 {
respondError(w, http.StatusBadRequest, "tail parameter must not be negative")
return
}
if logsReq.Tail > 5000 {
respondError(w, http.StatusBadRequest, "tail parameter cannot exceed 5000")
return
}
if logsReq.Previous && logsReq.Follow {
respondError(w, http.StatusBadRequest, "previous and follow cannot be used together")
return
}
servicesMgr := services.NewManager(api.dataDir)
// Stream logs with SSE if follow=true
if logsReq.Follow {
// Set SSE headers
w.Header().Set("Content-Type", "text/event-stream")
w.Header().Set("Cache-Control", "no-cache")
w.Header().Set("Connection", "keep-alive")
w.Header().Set("X-Accel-Buffering", "no")
// Stream logs
if err := servicesMgr.StreamLogs(instanceName, serviceName, logsReq, w); err != nil {
// Log error but can't send response (SSE already started)
fmt.Printf("Error streaming logs: %v\n", err)
}
return
}
// Get buffered logs
logsResp, err := servicesMgr.GetLogs(instanceName, serviceName, logsReq)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get logs: %v", err))
return
}
respondJSON(w, http.StatusOK, logsResp)
}
// ServicesUpdateConfig updates service configuration
func (api *API) ServicesUpdateConfig(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
serviceName := vars["service"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
// Parse request body
var update contracts.ServiceConfigUpdate
if err := json.NewDecoder(r.Body).Decode(&update); err != nil {
respondError(w, http.StatusBadRequest, fmt.Sprintf("Invalid request body: %v", err))
return
}
// Validate request
if len(update.Config) == 0 {
respondError(w, http.StatusBadRequest, "config field is required and must not be empty")
return
}
// Update config
servicesMgr := services.NewManager(api.dataDir)
response, err := servicesMgr.UpdateConfig(instanceName, serviceName, update, api.broadcaster)
if err != nil {
respondError(w, http.StatusInternalServerError, fmt.Sprintf("Failed to update config: %v", err))
return
}
respondJSON(w, http.StatusOK, response)
}


@@ -4,26 +4,12 @@ import (
"encoding/json"
"fmt"
"net/http"
-"path/filepath"
"github.com/gorilla/mux"
+"github.com/wild-cloud/wild-central/daemon/internal/tools"
"github.com/wild-cloud/wild-central/daemon/internal/utilities"
)
-// UtilitiesHealth returns cluster health status (legacy, no instance context)
-func (api *API) UtilitiesHealth(w http.ResponseWriter, r *http.Request) {
-status, err := utilities.GetClusterHealth("")
-if err != nil {
-respondError(w, http.StatusInternalServerError, "Failed to get cluster health")
-return
-}
-respondJSON(w, http.StatusOK, map[string]interface{}{
-"success": true,
-"data": status,
-})
-}
// InstanceUtilitiesHealth returns cluster health status for a specific instance
func (api *API) InstanceUtilitiesHealth(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
@@ -36,7 +22,7 @@ func (api *API) InstanceUtilitiesHealth(w http.ResponseWriter, r *http.Request)
}
// Get kubeconfig path for this instance
-kubeconfigPath := filepath.Join(api.dataDir, "instances", instanceName, "kubeconfig")
+kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)
status, err := utilities.GetClusterHealth(kubeconfigPath)
if err != nil {
@@ -50,12 +36,24 @@ func (api *API) InstanceUtilitiesHealth(w http.ResponseWriter, r *http.Request)
})
}
-// UtilitiesDashboardToken returns a Kubernetes dashboard token
+// UtilitiesDashboardToken returns a Kubernetes dashboard token for a specific instance
func (api *API) UtilitiesDashboardToken(w http.ResponseWriter, r *http.Request) {
-token, err := utilities.GetDashboardToken()
+vars := mux.Vars(r)
+instanceName := vars["name"]
+// Validate instance exists
+if err := api.instance.ValidateInstance(instanceName); err != nil {
+respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
+return
+}
+// Get kubeconfig path for the instance
+kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)
+token, err := utilities.GetDashboardToken(kubeconfigPath)
if err != nil {
// Try fallback method
-token, err = utilities.GetDashboardTokenFromSecret()
+token, err = utilities.GetDashboardTokenFromSecret(kubeconfigPath)
if err != nil {
respondError(w, http.StatusInternalServerError, "Failed to get dashboard token")
return
@@ -70,7 +68,19 @@ func (api *API) UtilitiesDashboardToken(w http.ResponseWriter, r *http.Request)
// UtilitiesNodeIPs returns IP addresses for all cluster nodes
func (api *API) UtilitiesNodeIPs(w http.ResponseWriter, r *http.Request) {
-nodes, err := utilities.GetNodeIPs()
+vars := mux.Vars(r)
+instanceName := vars["name"]
+// Validate instance exists
+if err := api.instance.ValidateInstance(instanceName); err != nil {
+respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
+return
+}
+// Get kubeconfig path for this instance
+kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)
+nodes, err := utilities.GetNodeIPs(kubeconfigPath)
if err != nil {
respondError(w, http.StatusInternalServerError, "Failed to get node IPs")
return
@@ -86,7 +96,19 @@ func (api *API) UtilitiesNodeIPs(w http.ResponseWriter, r *http.Request) {
// UtilitiesControlPlaneIP returns the control plane IP
func (api *API) UtilitiesControlPlaneIP(w http.ResponseWriter, r *http.Request) {
-ip, err := utilities.GetControlPlaneIP()
+vars := mux.Vars(r)
+instanceName := vars["name"]
+// Validate instance exists
+if err := api.instance.ValidateInstance(instanceName); err != nil {
+respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
+return
+}
+// Get kubeconfig path for this instance
+kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)
+ip, err := utilities.GetControlPlaneIP(kubeconfigPath)
if err != nil {
respondError(w, http.StatusInternalServerError, "Failed to get control plane IP")
return
@@ -103,8 +125,15 @@ func (api *API) UtilitiesControlPlaneIP(w http.ResponseWriter, r *http.Request)
// UtilitiesSecretCopy copies a secret between namespaces
func (api *API) UtilitiesSecretCopy(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
instanceName := vars["name"]
secretName := vars["secret"]
// Validate instance exists
if err := api.instance.ValidateInstance(instanceName); err != nil {
respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
return
}
var req struct {
SourceNamespace string `json:"source_namespace"`
DestinationNamespace string `json:"destination_namespace"`
@@ -120,7 +149,10 @@ func (api *API) UtilitiesSecretCopy(w http.ResponseWriter, r *http.Request) {
return
}
-if err := utilities.CopySecretBetweenNamespaces(secretName, req.SourceNamespace, req.DestinationNamespace); err != nil {
+// Get kubeconfig path for this instance
+kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)
+if err := utilities.CopySecretBetweenNamespaces(kubeconfigPath, secretName, req.SourceNamespace, req.DestinationNamespace); err != nil {
respondError(w, http.StatusInternalServerError, "Failed to copy secret")
return
}
@@ -133,7 +165,19 @@ func (api *API) UtilitiesSecretCopy(w http.ResponseWriter, r *http.Request) {
// UtilitiesVersion returns cluster and Talos versions
func (api *API) UtilitiesVersion(w http.ResponseWriter, r *http.Request) {
-k8sVersion, err := utilities.GetClusterVersion()
+vars := mux.Vars(r)
+instanceName := vars["name"]
+// Validate instance exists
+if err := api.instance.ValidateInstance(instanceName); err != nil {
+respondError(w, http.StatusNotFound, fmt.Sprintf("Instance not found: %v", err))
+return
+}
+// Get kubeconfig path for this instance
+kubeconfigPath := tools.GetKubeconfigPath(api.dataDir, instanceName)
+k8sVersion, err := utilities.GetClusterVersion(kubeconfigPath)
if err != nil {
respondError(w, http.StatusInternalServerError, "Failed to get cluster version")
return


@@ -2,6 +2,7 @@ package apps
import (
"bytes"
"encoding/json"
"fmt"
"os"
"os/exec"
@@ -31,12 +32,15 @@ func NewManager(dataDir, appsDir string) *Manager {
// App represents an application
type App struct {
-Name string `json:"name" yaml:"name"`
-Description string `json:"description" yaml:"description"`
-Version string `json:"version" yaml:"version"`
-Category string `json:"category" yaml:"category"`
-Dependencies []string `json:"dependencies" yaml:"dependencies"`
-Config map[string]string `json:"config,omitempty" yaml:"config,omitempty"`
+Name string `json:"name" yaml:"name"`
+Description string `json:"description" yaml:"description"`
+Version string `json:"version" yaml:"version"`
+Category string `json:"category,omitempty" yaml:"category,omitempty"`
+Icon string `json:"icon,omitempty" yaml:"icon,omitempty"`
+Dependencies []string `json:"dependencies" yaml:"dependencies"`
+Config map[string]string `json:"config,omitempty" yaml:"config,omitempty"`
+DefaultConfig map[string]interface{} `json:"defaultConfig,omitempty" yaml:"defaultConfig,omitempty"`
+RequiredSecrets []string `json:"requiredSecrets,omitempty" yaml:"requiredSecrets,omitempty"`
}
// DeployedApp represents a deployed application instance
@@ -78,12 +82,30 @@ func (m *Manager) ListAvailable() ([]App, error) {
continue
}
-var app App
-if err := yaml.Unmarshal(data, &app); err != nil {
+var manifest AppManifest
+if err := yaml.Unmarshal(data, &manifest); err != nil {
continue
}
-app.Name = entry.Name() // Use directory name as app name
+// Convert manifest to App struct
+app := App{
+Name: entry.Name(), // Use directory name as app name
+Description: manifest.Description,
+Version: manifest.Version,
+Category: manifest.Category,
+Icon: manifest.Icon,
+DefaultConfig: manifest.DefaultConfig,
+RequiredSecrets: manifest.RequiredSecrets,
+}
+// Extract dependencies from Requires field
+if len(manifest.Requires) > 0 {
+app.Dependencies = make([]string, len(manifest.Requires))
+for i, dep := range manifest.Requires {
+app.Dependencies[i] = dep.Name
+}
+}
apps = append(apps, app)
}
@@ -103,19 +125,37 @@ func (m *Manager) Get(appName string) (*App, error) {
return nil, fmt.Errorf("failed to read app file: %w", err)
}
-var app App
-if err := yaml.Unmarshal(data, &app); err != nil {
+var manifest AppManifest
+if err := yaml.Unmarshal(data, &manifest); err != nil {
return nil, fmt.Errorf("failed to parse app file: %w", err)
}
-app.Name = appName
-return &app, nil
+// Convert manifest to App struct
+app := &App{
+Name: appName,
+Description: manifest.Description,
+Version: manifest.Version,
+Category: manifest.Category,
+Icon: manifest.Icon,
+DefaultConfig: manifest.DefaultConfig,
+RequiredSecrets: manifest.RequiredSecrets,
+}
+// Extract dependencies from Requires field
+if len(manifest.Requires) > 0 {
+app.Dependencies = make([]string, len(manifest.Requires))
+for i, dep := range manifest.Requires {
+app.Dependencies[i] = dep.Name
+}
+}
+return app, nil
}
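`ListAvailable` and `Get` now perform the same manifest-to-App conversion; the shared logic could live in one helper. A sketch with minimal stand-in types (the real `AppManifest` has more fields, and `manifestToApp` is a hypothetical name, not part of the diff):

```go
package main

import "fmt"

// Minimal stand-ins for the types used in the diff.
type Require struct{ Name string }

type AppManifest struct {
	Description string
	Version     string
	Requires    []Require
}

type App struct {
	Name         string
	Description  string
	Version      string
	Dependencies []string
}

// manifestToApp flattens the manifest's Requires entries into a plain list
// of dependency names, the conversion both ListAvailable and Get repeat.
func manifestToApp(name string, m AppManifest) App {
	app := App{Name: name, Description: m.Description, Version: m.Version}
	if len(m.Requires) > 0 {
		app.Dependencies = make([]string, len(m.Requires))
		for i, dep := range m.Requires {
			app.Dependencies[i] = dep.Name
		}
	}
	return app
}

func main() {
	m := AppManifest{Description: "Wiki", Version: "1.0", Requires: []Require{{Name: "postgres"}}}
	fmt.Println(manifestToApp("dokuwiki", m).Dependencies) // [postgres]
}
```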
// ListDeployed lists deployed apps for an instance
func (m *Manager) ListDeployed(instanceName string) ([]DeployedApp, error) {
+kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
-instancePath := filepath.Join(m.dataDir, "instances", instanceName)
+instancePath := tools.GetInstancePath(m.dataDir, instanceName)
appsDir := filepath.Join(instancePath, "apps")
apps := []DeployedApp{}
@@ -173,6 +213,66 @@ func (m *Manager) ListDeployed(instanceName string) ([]DeployedApp, error) {
if yaml.Unmarshal(output, &ns) == nil && ns.Status.Phase == "Active" {
// Namespace is active - app is deployed
app.Status = "deployed"
// Get ingress URL if available
// Try Traefik IngressRoute first
ingressCmd := exec.Command("kubectl", "get", "ingressroute", "-n", appName, "-o", "json")
tools.WithKubeconfig(ingressCmd, kubeconfigPath)
ingressOutput, err := ingressCmd.CombinedOutput()
if err == nil {
var ingressList struct {
Items []struct {
Spec struct {
Routes []struct {
Match string `json:"match"`
} `json:"routes"`
} `json:"spec"`
} `json:"items"`
}
if json.Unmarshal(ingressOutput, &ingressList) == nil && len(ingressList.Items) > 0 {
// Extract host from the first route match (format: Host(`example.com`))
if len(ingressList.Items[0].Spec.Routes) > 0 {
match := ingressList.Items[0].Spec.Routes[0].Match
// Parse Host(`domain.com`) format
if strings.Contains(match, "Host(`") {
start := strings.Index(match, "Host(`") + 6
end := strings.Index(match[start:], "`")
if end > 0 {
host := match[start : start+end]
app.URL = "https://" + host
}
}
}
}
}
// If no IngressRoute, try standard Ingress
if app.URL == "" {
ingressCmd := exec.Command("kubectl", "get", "ingress", "-n", appName, "-o", "json")
tools.WithKubeconfig(ingressCmd, kubeconfigPath)
ingressOutput, err := ingressCmd.CombinedOutput()
if err == nil {
var ingressList struct {
Items []struct {
Spec struct {
Rules []struct {
Host string `json:"host"`
} `json:"rules"`
} `json:"spec"`
} `json:"items"`
}
if json.Unmarshal(ingressOutput, &ingressList) == nil && len(ingressList.Items) > 0 {
if len(ingressList.Items[0].Spec.Rules) > 0 {
host := ingressList.Items[0].Spec.Rules[0].Host
if host != "" {
app.URL = "https://" + host
}
}
}
}
}
}
}
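The URL lookup above parses the host out of a Traefik IngressRoute match expression of the form Host(`example.com`). That string handling can be isolated and tested on its own; a sketch, with the helper name being illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// hostFromMatch extracts the hostname from a Traefik IngressRoute match
// expression such as "Host(`wiki.example.com`) && PathPrefix(`/`)".
// It returns "" when no Host(`…`) clause is present.
func hostFromMatch(match string) string {
	const marker = "Host(`"
	start := strings.Index(match, marker)
	if start < 0 {
		return ""
	}
	start += len(marker)
	end := strings.Index(match[start:], "`")
	if end <= 0 {
		return "" // unterminated or empty host clause
	}
	return match[start : start+end]
}

func main() {
	fmt.Println(hostFromMatch("Host(`wiki.example.com`) && PathPrefix(`/`)"))
	// wiki.example.com
}
```

Note this only inspects the first Host clause of the first route, matching the behavior in `ListDeployed`; multi-host routes would need a loop.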
@@ -190,9 +290,9 @@ func (m *Manager) Add(instanceName, appName string, config map[string]string) er
return fmt.Errorf("app %s not found at %s", appName, manifestPath)
}
-instancePath := filepath.Join(m.dataDir, "instances", instanceName)
-configFile := filepath.Join(instancePath, "config.yaml")
-secretsFile := filepath.Join(instancePath, "secrets.yaml")
+instancePath := tools.GetInstancePath(m.dataDir, instanceName)
+configFile := tools.GetInstanceConfigPath(m.dataDir, instanceName)
+secretsFile := tools.GetInstanceSecretsPath(m.dataDir, instanceName)
appDestDir := filepath.Join(instancePath, "apps", appName)
// Check instance config exists
@@ -306,8 +406,8 @@ func (m *Manager) Add(instanceName, appName string, config map[string]string) er
// Deploy deploys an app to the cluster
func (m *Manager) Deploy(instanceName, appName string) error {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
-instancePath := filepath.Join(m.dataDir, "instances", instanceName)
-secretsFile := filepath.Join(instancePath, "secrets.yaml")
+instancePath := tools.GetInstancePath(m.dataDir, instanceName)
+secretsFile := tools.GetInstanceSecretsPath(m.dataDir, instanceName)
// Get compiled app manifests from instance directory
appDir := filepath.Join(instancePath, "apps", appName)
@@ -323,7 +423,7 @@ func (m *Manager) Deploy(instanceName, appName string) error {
applyNsCmd := exec.Command("kubectl", "apply", "-f", "-")
applyNsCmd.Stdin = bytes.NewReader(namespaceYaml)
tools.WithKubeconfig(applyNsCmd, kubeconfigPath)
-applyNsCmd.CombinedOutput() // Ignore errors - namespace might already exist
+_, _ = applyNsCmd.CombinedOutput() // Ignore errors - namespace might already exist
// Create Kubernetes secrets from secrets.yaml
if storage.FileExists(secretsFile) {
@@ -334,7 +434,7 @@ func (m *Manager) Deploy(instanceName, appName string) error {
// Delete existing secret if it exists (to update it)
deleteCmd := exec.Command("kubectl", "delete", "secret", fmt.Sprintf("%s-secrets", appName), "-n", appName, "--ignore-not-found")
tools.WithKubeconfig(deleteCmd, kubeconfigPath)
-deleteCmd.CombinedOutput()
+_, _ = deleteCmd.CombinedOutput()
// Create secret from literals
createSecretCmd := exec.Command("kubectl", "create", "secret", "generic", fmt.Sprintf("%s-secrets", appName), "-n", appName)
@@ -369,9 +469,9 @@ func (m *Manager) Deploy(instanceName, appName string) error {
// Delete removes an app from the cluster and configuration
func (m *Manager) Delete(instanceName, appName string) error {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
-instancePath := filepath.Join(m.dataDir, "instances", instanceName)
-configFile := filepath.Join(instancePath, "config.yaml")
-secretsFile := filepath.Join(instancePath, "secrets.yaml")
+instancePath := tools.GetInstancePath(m.dataDir, instanceName)
+configFile := tools.GetInstanceConfigPath(m.dataDir, instanceName)
+secretsFile := tools.GetInstanceSecretsPath(m.dataDir, instanceName)
// Get compiled app manifests from instance directory
appDir := filepath.Join(instancePath, "apps", appName)
@@ -390,7 +490,7 @@ func (m *Manager) Delete(instanceName, appName string) error {
// Wait for namespace deletion to complete (timeout after 60s)
waitCmd := exec.Command("kubectl", "wait", "--for=delete", "namespace", appName, "--timeout=60s")
tools.WithKubeconfig(waitCmd, kubeconfigPath)
-waitCmd.CombinedOutput() // Ignore errors - namespace might not exist
+_, _ = waitCmd.CombinedOutput() // Ignore errors - namespace might not exist
// Delete local app configuration directory
if err := os.RemoveAll(appDir); err != nil {
@@ -425,7 +525,7 @@ func (m *Manager) Delete(instanceName, appName string) error {
// GetStatus returns the status of a deployed app
func (m *Manager) GetStatus(instanceName, appName string) (*DeployedApp, error) {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
-instancePath := filepath.Join(m.dataDir, "instances", instanceName)
+instancePath := tools.GetInstancePath(m.dataDir, instanceName)
appDir := filepath.Join(instancePath, "apps", appName)
app := &DeployedApp{
@@ -526,3 +626,214 @@ func (m *Manager) GetStatus(instanceName, appName string) (*DeployedApp, error)
return app, nil
}
// GetEnhanced returns enhanced app information with runtime status
func (m *Manager) GetEnhanced(instanceName, appName string) (*EnhancedApp, error) {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
instancePath := tools.GetInstancePath(m.dataDir, instanceName)
configFile := tools.GetInstanceConfigPath(m.dataDir, instanceName)
appDir := filepath.Join(instancePath, "apps", appName)
enhanced := &EnhancedApp{
Name: appName,
Status: "not-added",
Namespace: appName,
}
// Check if app was added to instance
if !storage.FileExists(appDir) {
return enhanced, nil
}
enhanced.Status = "not-deployed"
// Load manifest
manifestPath := filepath.Join(appDir, "manifest.yaml")
if storage.FileExists(manifestPath) {
manifestData, _ := os.ReadFile(manifestPath)
var manifest AppManifest
if yaml.Unmarshal(manifestData, &manifest) == nil {
enhanced.Version = manifest.Version
enhanced.Description = manifest.Description
enhanced.Icon = manifest.Icon
enhanced.Manifest = &manifest
}
}
// Note: README content is now served via dedicated /readme endpoint
// No need to populate readme/documentation fields here
// Load config
yq := tools.NewYQ()
configJSON, err := yq.Get(configFile, fmt.Sprintf(".apps.%s | @json", appName))
if err == nil && configJSON != "" && configJSON != "null" {
var config map[string]string
if json.Unmarshal([]byte(configJSON), &config) == nil {
enhanced.Config = config
}
}
// Check if namespace exists
checkNsCmd := exec.Command("kubectl", "get", "namespace", appName, "-o", "json")
tools.WithKubeconfig(checkNsCmd, kubeconfigPath)
nsOutput, err := checkNsCmd.CombinedOutput()
if err != nil {
// Namespace doesn't exist - not deployed
return enhanced, nil
}
// Parse namespace to check if it's active
var ns struct {
Status struct {
Phase string `json:"phase"`
} `json:"status"`
}
if err := json.Unmarshal(nsOutput, &ns); err != nil || ns.Status.Phase != "Active" {
return enhanced, nil
}
enhanced.Status = "deployed"
// Get URL (ingress)
enhanced.URL = m.getAppURL(kubeconfigPath, appName)
// Get runtime status
runtime, err := m.getRuntimeStatus(kubeconfigPath, appName)
if err == nil {
enhanced.Runtime = runtime
// Update status based on runtime
if runtime.Pods != nil && len(runtime.Pods) > 0 {
allRunning := true
allReady := true
for _, pod := range runtime.Pods {
if pod.Status != "Running" {
allRunning = false
}
// Check ready ratio
parts := strings.Split(pod.Ready, "/")
if len(parts) == 2 && parts[0] != parts[1] {
allReady = false
}
}
if allRunning && allReady {
enhanced.Status = "running"
} else if allRunning {
enhanced.Status = "starting"
} else {
enhanced.Status = "unhealthy"
}
}
}
return enhanced, nil
}
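The rollup at the end of GetEnhanced maps pod phases and ready ratios onto an app status. A minimal standalone sketch of that logic, where Pod is a hypothetical stand-in for the tools.PodInfo fields it reads:

```go
package main

import (
	"fmt"
	"strings"
)

// Pod is a minimal stand-in for tools.PodInfo.
type Pod struct {
	Status string // kubernetes phase, e.g. "Running"
	Ready  string // ready ratio, e.g. "1/2"
}

// appStatus mirrors the GetEnhanced rollup: every pod Running and
// fully ready => "running"; all Running but some not ready =>
// "starting"; anything else => "unhealthy".
func appStatus(pods []Pod) string {
	allRunning, allReady := true, true
	for _, p := range pods {
		if p.Status != "Running" {
			allRunning = false
		}
		parts := strings.Split(p.Ready, "/")
		if len(parts) == 2 && parts[0] != parts[1] {
			allReady = false
		}
	}
	switch {
	case allRunning && allReady:
		return "running"
	case allRunning:
		return "starting"
	default:
		return "unhealthy"
	}
}

func main() {
	fmt.Println(appStatus([]Pod{{"Running", "1/1"}}))
	fmt.Println(appStatus([]Pod{{"Running", "0/1"}}))
	fmt.Println(appStatus([]Pod{{"CrashLoopBackOff", "0/1"}}))
}
```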
// GetEnhancedStatus returns just the runtime status for an app
func (m *Manager) GetEnhancedStatus(instanceName, appName string) (*RuntimeStatus, error) {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
// Check if namespace exists
checkNsCmd := exec.Command("kubectl", "get", "namespace", appName, "-o", "json")
tools.WithKubeconfig(checkNsCmd, kubeconfigPath)
if err := checkNsCmd.Run(); err != nil {
return nil, fmt.Errorf("namespace not found or not deployed")
}
return m.getRuntimeStatus(kubeconfigPath, appName)
}
// getRuntimeStatus fetches runtime information from kubernetes
func (m *Manager) getRuntimeStatus(kubeconfigPath, namespace string) (*RuntimeStatus, error) {
kubectl := tools.NewKubectl(kubeconfigPath)
runtime := &RuntimeStatus{}
// Get pods (with detailed info for app status display)
pods, err := kubectl.GetPods(namespace, true)
if err == nil {
runtime.Pods = pods
}
// Get replicas
replicas, err := kubectl.GetReplicas(namespace)
if err == nil && (replicas.Desired > 0 || replicas.Current > 0) {
runtime.Replicas = replicas
}
// Get resources
resources, err := kubectl.GetResources(namespace)
if err == nil {
runtime.Resources = resources
}
// Get recent events (last 10)
events, err := kubectl.GetRecentEvents(namespace, 10)
if err == nil {
runtime.RecentEvents = events
}
return runtime, nil
}
// getAppURL extracts the ingress URL for an app
func (m *Manager) getAppURL(kubeconfigPath, appName string) string {
// Try Traefik IngressRoute first
ingressCmd := exec.Command("kubectl", "get", "ingressroute", "-n", appName, "-o", "json")
tools.WithKubeconfig(ingressCmd, kubeconfigPath)
ingressOutput, err := ingressCmd.CombinedOutput()
if err == nil {
var ingressList struct {
Items []struct {
Spec struct {
Routes []struct {
Match string `json:"match"`
} `json:"routes"`
} `json:"spec"`
} `json:"items"`
}
if json.Unmarshal(ingressOutput, &ingressList) == nil && len(ingressList.Items) > 0 {
if len(ingressList.Items[0].Spec.Routes) > 0 {
match := ingressList.Items[0].Spec.Routes[0].Match
// Parse Host(`domain.com`) format
if strings.Contains(match, "Host(`") {
start := strings.Index(match, "Host(`") + 6
end := strings.Index(match[start:], "`")
if end > 0 {
host := match[start : start+end]
return "https://" + host
}
}
}
}
}
// If no IngressRoute, try standard Ingress
ingressCmd = exec.Command("kubectl", "get", "ingress", "-n", appName, "-o", "json")
tools.WithKubeconfig(ingressCmd, kubeconfigPath)
ingressOutput, err = ingressCmd.CombinedOutput()
if err == nil {
var ingressList struct {
Items []struct {
Spec struct {
Rules []struct {
Host string `json:"host"`
} `json:"rules"`
} `json:"spec"`
} `json:"items"`
}
if json.Unmarshal(ingressOutput, &ingressList) == nil && len(ingressList.Items) > 0 {
if len(ingressList.Items[0].Spec.Rules) > 0 {
host := ingressList.Items[0].Spec.Rules[0].Host
if host != "" {
return "https://" + host
}
}
}
}
return ""
}
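getAppURL recovers the hostname from a Traefik route match expression such as Host(`app.example.com`) && PathPrefix(`/`). The parsing step can be sketched on its own; extractHostURL is a hypothetical helper, not part of the daemon:

```go
package main

import (
	"fmt"
	"strings"
)

// extractHostURL pulls the hostname out of a Traefik route match
// expression like "Host(`app.example.com`) && PathPrefix(`/`)" and
// returns it as an https URL, or "" when no Host(...) clause exists.
func extractHostURL(match string) string {
	const marker = "Host(`"
	start := strings.Index(match, marker)
	if start < 0 {
		return ""
	}
	start += len(marker)
	end := strings.Index(match[start:], "`")
	if end <= 0 {
		return ""
	}
	return "https://" + match[start:start+end]
}

func main() {
	fmt.Println(extractHostURL("Host(`app.example.com`) && PathPrefix(`/`)"))
	fmt.Println(extractHostURL("PathPrefix(`/`)") == "")
}
```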

internal/apps/models.go Normal file

@@ -0,0 +1,55 @@
package apps
import "github.com/wild-cloud/wild-central/daemon/internal/tools"
// AppManifest represents the complete app manifest from manifest.yaml
type AppManifest struct {
Name string `json:"name" yaml:"name"`
Description string `json:"description" yaml:"description"`
Version string `json:"version" yaml:"version"`
Icon string `json:"icon,omitempty" yaml:"icon,omitempty"`
Category string `json:"category,omitempty" yaml:"category,omitempty"`
Requires []AppDependency `json:"requires,omitempty" yaml:"requires,omitempty"`
DefaultConfig map[string]interface{} `json:"defaultConfig,omitempty" yaml:"defaultConfig,omitempty"`
RequiredSecrets []string `json:"requiredSecrets,omitempty" yaml:"requiredSecrets,omitempty"`
}
// AppDependency represents a dependency on another app
type AppDependency struct {
Name string `json:"name" yaml:"name"`
}
// EnhancedApp extends DeployedApp with runtime status information
type EnhancedApp struct {
Name string `json:"name"`
Status string `json:"status"`
Version string `json:"version"`
Namespace string `json:"namespace"`
URL string `json:"url,omitempty"`
Description string `json:"description,omitempty"`
Icon string `json:"icon,omitempty"`
Manifest *AppManifest `json:"manifest,omitempty"`
Runtime *RuntimeStatus `json:"runtime,omitempty"`
Config map[string]string `json:"config,omitempty"`
Readme string `json:"readme,omitempty"`
Documentation string `json:"documentation,omitempty"`
}
// RuntimeStatus contains runtime information from kubernetes
type RuntimeStatus struct {
Pods []PodInfo `json:"pods,omitempty"`
Replicas *ReplicaInfo `json:"replicas,omitempty"`
Resources *ResourceUsage `json:"resources,omitempty"`
RecentEvents []KubernetesEvent `json:"recentEvents,omitempty"`
}
// Type aliases for kubectl wrapper types
// These types are defined in internal/tools and shared across the codebase
type PodInfo = tools.PodInfo
type ContainerInfo = tools.ContainerInfo
type ContainerState = tools.ContainerState
type PodCondition = tools.PodCondition
type ReplicaInfo = tools.ReplicaInfo
type ResourceUsage = tools.ResourceUsage
type KubernetesEvent = tools.KubernetesEvent
type LogEntry = tools.LogEntry


@@ -26,27 +26,27 @@ func NewManager(dataDir string) *Manager {
// Asset represents a Talos boot asset
type Asset struct {
-Type string `json:"type"` // kernel, initramfs, iso
-Path string `json:"path"` // Full path to asset file
-Size int64 `json:"size"` // File size in bytes
-SHA256 string `json:"sha256"` // SHA256 hash
-Downloaded bool `json:"downloaded"` // Whether asset exists
+Type       string `json:"type"`       // kernel, initramfs, iso
+Path       string `json:"path"`       // Full path to asset file
+Size       int64  `json:"size"`       // File size in bytes
+SHA256     string `json:"sha256"`     // SHA256 hash
+Downloaded bool   `json:"downloaded"` // Whether asset exists
}
// Schematic represents a Talos schematic and its assets
type Schematic struct {
-SchematicID string `json:"schematic_id"`
-Version string `json:"version"`
-Path string `json:"path"`
-Assets []Asset `json:"assets"`
+SchematicID string  `json:"schematic_id"`
+Version     string  `json:"version"`
+Path        string  `json:"path"`
+Assets      []Asset `json:"assets"`
}
// AssetStatus represents download status for a schematic
type AssetStatus struct {
-SchematicID string `json:"schematic_id"`
-Version string `json:"version"`
-Assets map[string]Asset `json:"assets"`
-Complete bool `json:"complete"`
+SchematicID string           `json:"schematic_id"`
+Version     string           `json:"version"`
+Assets      map[string]Asset `json:"assets"`
+Complete    bool             `json:"complete"`
}
// GetAssetDir returns the asset directory for a schematic


@@ -46,7 +46,7 @@ func NewManager(dataDir string) *Manager {
// GetBackupDir returns the backup directory for an instance
func (m *Manager) GetBackupDir(instanceName string) string {
-return filepath.Join(m.dataDir, "instances", instanceName, "backups")
+return tools.GetInstanceBackupsPath(m.dataDir, instanceName)
}
// GetStagingDir returns the staging directory for backups


@@ -1,6 +1,7 @@
package cluster
import (
"context"
"encoding/json"
"fmt"
"log"
@@ -10,6 +11,7 @@ import (
"strings"
"time"
"github.com/wild-cloud/wild-central/daemon/internal/operations"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
@@ -18,13 +20,15 @@ import (
type Manager struct {
dataDir string
talosctl *tools.Talosctl
opsMgr *operations.Manager
}
// NewManager creates a new cluster manager
-func NewManager(dataDir string) *Manager {
+func NewManager(dataDir string, opsMgr *operations.Manager) *Manager {
return &Manager{
dataDir: dataDir,
talosctl: tools.NewTalosctl(),
opsMgr: opsMgr,
}
}
@@ -48,7 +52,7 @@ type ClusterStatus struct {
// GetTalosDir returns the talos directory for an instance
func (m *Manager) GetTalosDir(instanceName string) string {
-return filepath.Join(m.dataDir, "instances", instanceName, "talos")
+return tools.GetInstanceTalosPath(m.dataDir, instanceName)
}
// GetGeneratedDir returns the generated config directory
@@ -96,12 +100,28 @@ func (m *Manager) GenerateConfig(instanceName string, config *ClusterConfig) err
return nil
}
-// Bootstrap bootstraps the cluster on the specified node
-func (m *Manager) Bootstrap(instanceName, nodeName string) error {
-// Get node configuration to find the target IP
-instancePath := filepath.Join(m.dataDir, "instances", instanceName)
-configPath := filepath.Join(instancePath, "config.yaml")
+// Bootstrap bootstraps the cluster on the specified node with progress tracking
+func (m *Manager) Bootstrap(instanceName, nodeName string) (string, error) {
// Create operation for tracking
opID, err := m.opsMgr.Start(instanceName, "bootstrap", nodeName)
if err != nil {
return "", fmt.Errorf("failed to start bootstrap operation: %w", err)
}
// Run bootstrap asynchronously
go func() {
if err := m.runBootstrapWithTracking(instanceName, nodeName, opID); err != nil {
_ = m.opsMgr.Update(instanceName, opID, "failed", err.Error(), 0)
}
}()
return opID, nil
}
// runBootstrapWithTracking runs the bootstrap process with detailed progress tracking
func (m *Manager) runBootstrapWithTracking(instanceName, nodeName, opID string) error {
ctx := context.Background()
configPath := tools.GetInstanceConfigPath(m.dataDir, instanceName)
yq := tools.NewYQ()
// Get node's target IP
@@ -115,17 +135,71 @@ func (m *Manager) Bootstrap(instanceName, nodeName string) error {
return fmt.Errorf("node %s does not have a target IP configured", nodeName)
}
-// Get talosconfig path for this instance
+// Get VIP
vipRaw, err := yq.Get(configPath, ".cluster.nodes.control.vip")
if err != nil {
return fmt.Errorf("failed to get VIP: %w", err)
}
vip := tools.CleanYQOutput(vipRaw)
if vip == "" || vip == "null" {
return fmt.Errorf("control plane VIP not configured")
}
// Step 0: Run talosctl bootstrap
if err := m.runBootstrapCommand(instanceName, nodeIP, opID); err != nil {
return err
}
// Step 1: Wait for etcd health
if err := m.waitForEtcd(ctx, instanceName, nodeIP, opID); err != nil {
return err
}
// Step 2: Wait for VIP assignment
if err := m.waitForVIP(ctx, instanceName, nodeIP, vip, opID); err != nil {
return err
}
// Step 3: Wait for control plane components
if err := m.waitForControlPlane(ctx, instanceName, nodeIP, opID); err != nil {
return err
}
// Step 4: Wait for API server on VIP
if err := m.waitForAPIServer(ctx, instanceName, vip, opID); err != nil {
return err
}
// Step 5: Configure cluster access
if err := m.configureClusterAccess(instanceName, vip, opID); err != nil {
return err
}
// Step 6: Verify node registration
if err := m.waitForNodeRegistration(ctx, instanceName, opID); err != nil {
return err
}
// Mark as completed
_ = m.opsMgr.Update(instanceName, opID, "completed", "Bootstrap completed successfully", 100)
return nil
}
// runBootstrapCommand executes the initial bootstrap command
func (m *Manager) runBootstrapCommand(instanceName, nodeIP, opID string) error {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 0, "bootstrap", 1, 1, "Running talosctl bootstrap command")
talosconfigPath := tools.GetTalosconfigPath(m.dataDir, instanceName)
-// Set talosctl endpoint (with proper context via TALOSCONFIG env var)
+// Set talosctl endpoint
cmdEndpoint := exec.Command("talosctl", "config", "endpoint", nodeIP)
tools.WithTalosconfig(cmdEndpoint, talosconfigPath)
if output, err := cmdEndpoint.CombinedOutput(); err != nil {
return fmt.Errorf("failed to set talosctl endpoint: %w\nOutput: %s", err, string(output))
}
-// Bootstrap command (with proper context via TALOSCONFIG env var)
+// Bootstrap command
cmd := exec.Command("talosctl", "bootstrap", "--nodes", nodeIP)
tools.WithTalosconfig(cmd, talosconfigPath)
output, err := cmd.CombinedOutput()
@@ -133,16 +207,152 @@ func (m *Manager) Bootstrap(instanceName, nodeName string) error {
return fmt.Errorf("failed to bootstrap cluster: %w\nOutput: %s", err, string(output))
}
-// Retrieve kubeconfig after bootstrap (best-effort with retry)
-log.Printf("Waiting for Kubernetes API server to become ready...")
-if err := m.retrieveKubeconfigFromCluster(instanceName, nodeIP, 5*time.Minute); err != nil {
-log.Printf("Warning: %v", err)
-log.Printf("You can retrieve it manually later using: wild cluster kubeconfig --generate")
return nil
}
// waitForEtcd waits for etcd to become healthy
func (m *Manager) waitForEtcd(ctx context.Context, instanceName, nodeIP, opID string) error {
maxAttempts := 30
talosconfigPath := tools.GetTalosconfigPath(m.dataDir, instanceName)
for attempt := 1; attempt <= maxAttempts; attempt++ {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 1, "etcd", attempt, maxAttempts, "Waiting for etcd to become healthy")
cmd := exec.Command("talosctl", "-n", nodeIP, "etcd", "status")
tools.WithTalosconfig(cmd, talosconfigPath)
output, err := cmd.CombinedOutput()
if err == nil && strings.Contains(string(output), nodeIP) {
return nil
}
if attempt < maxAttempts {
time.Sleep(10 * time.Second)
}
}
return fmt.Errorf("etcd did not become healthy after %d attempts", maxAttempts)
}
// waitForVIP waits for VIP to be assigned to the node
func (m *Manager) waitForVIP(ctx context.Context, instanceName, nodeIP, vip, opID string) error {
maxAttempts := 90
talosconfigPath := tools.GetTalosconfigPath(m.dataDir, instanceName)
for attempt := 1; attempt <= maxAttempts; attempt++ {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 2, "vip", attempt, maxAttempts, "Waiting for VIP assignment")
cmd := exec.Command("talosctl", "-n", nodeIP, "get", "addresses")
tools.WithTalosconfig(cmd, talosconfigPath)
output, err := cmd.CombinedOutput()
if err == nil && strings.Contains(string(output), vip+"/32") {
return nil
}
if attempt < maxAttempts {
time.Sleep(10 * time.Second)
}
}
return fmt.Errorf("VIP was not assigned after %d attempts", maxAttempts)
}
// waitForControlPlane waits for control plane components to start
func (m *Manager) waitForControlPlane(ctx context.Context, instanceName, nodeIP, opID string) error {
maxAttempts := 60
talosconfigPath := tools.GetTalosconfigPath(m.dataDir, instanceName)
for attempt := 1; attempt <= maxAttempts; attempt++ {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 3, "controlplane", attempt, maxAttempts, "Waiting for control plane components")
cmd := exec.Command("talosctl", "-n", nodeIP, "containers", "-k")
tools.WithTalosconfig(cmd, talosconfigPath)
output, err := cmd.CombinedOutput()
if err == nil && strings.Contains(string(output), "kube-") {
return nil
}
if attempt < maxAttempts {
time.Sleep(10 * time.Second)
}
}
return fmt.Errorf("control plane components did not start after %d attempts", maxAttempts)
}
// waitForAPIServer waits for Kubernetes API server to respond
func (m *Manager) waitForAPIServer(ctx context.Context, instanceName, vip, opID string) error {
maxAttempts := 60
apiURL := fmt.Sprintf("https://%s:6443/healthz", vip)
for attempt := 1; attempt <= maxAttempts; attempt++ {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 4, "apiserver", attempt, maxAttempts, "Waiting for Kubernetes API server")
cmd := exec.Command("curl", "-k", "-s", "--max-time", "5", apiURL)
output, err := cmd.CombinedOutput()
if err == nil && strings.Contains(string(output), "ok") {
return nil
}
if attempt < maxAttempts {
time.Sleep(10 * time.Second)
}
}
return fmt.Errorf("API server did not respond after %d attempts", maxAttempts)
}
// configureClusterAccess configures talosctl and kubectl to use the VIP
func (m *Manager) configureClusterAccess(instanceName, vip, opID string) error {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 5, "configure", 1, 1, "Configuring cluster access")
talosconfigPath := tools.GetTalosconfigPath(m.dataDir, instanceName)
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
// Set talosctl endpoint to VIP
cmdEndpoint := exec.Command("talosctl", "config", "endpoint", vip)
tools.WithTalosconfig(cmdEndpoint, talosconfigPath)
if output, err := cmdEndpoint.CombinedOutput(); err != nil {
return fmt.Errorf("failed to set talosctl endpoint: %w\nOutput: %s", err, string(output))
}
// Retrieve kubeconfig
cmdKubeconfig := exec.Command("talosctl", "kubeconfig", "--nodes", vip, kubeconfigPath)
tools.WithTalosconfig(cmdKubeconfig, talosconfigPath)
if output, err := cmdKubeconfig.CombinedOutput(); err != nil {
return fmt.Errorf("failed to retrieve kubeconfig: %w\nOutput: %s", err, string(output))
}
return nil
}
// waitForNodeRegistration waits for the node to register with Kubernetes
func (m *Manager) waitForNodeRegistration(ctx context.Context, instanceName, opID string) error {
maxAttempts := 10
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
for attempt := 1; attempt <= maxAttempts; attempt++ {
_ = m.opsMgr.UpdateBootstrapProgress(instanceName, opID, 6, "nodes", attempt, maxAttempts, "Waiting for node registration")
cmd := exec.Command("kubectl", "get", "nodes")
tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.CombinedOutput()
if err == nil && strings.Contains(string(output), "Ready") {
return nil
}
if attempt < maxAttempts {
time.Sleep(10 * time.Second)
}
}
return fmt.Errorf("node did not register after %d attempts", maxAttempts)
}
// retrieveKubeconfigFromCluster retrieves kubeconfig from the cluster with retry logic
func (m *Manager) retrieveKubeconfigFromCluster(instanceName, nodeIP string, timeout time.Duration) error {
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
@@ -183,8 +393,7 @@ func (m *Manager) retrieveKubeconfigFromCluster(instanceName, nodeIP string, tim
// RegenerateKubeconfig regenerates the kubeconfig by retrieving it from the cluster
func (m *Manager) RegenerateKubeconfig(instanceName string) error {
-instancePath := filepath.Join(m.dataDir, "instances", instanceName)
-configPath := filepath.Join(instancePath, "config.yaml")
+configPath := tools.GetInstanceConfigPath(m.dataDir, instanceName)
yq := tools.NewYQ()
@@ -206,8 +415,7 @@ func (m *Manager) RegenerateKubeconfig(instanceName string) error {
// ConfigureEndpoints updates talosconfig to use VIP and retrieves kubeconfig
func (m *Manager) ConfigureEndpoints(instanceName string, includeNodes bool) error {
-instancePath := filepath.Join(m.dataDir, "instances", instanceName)
-configPath := filepath.Join(instancePath, "config.yaml")
+configPath := tools.GetInstanceConfigPath(m.dataDir, instanceName)
talosconfigPath := tools.GetTalosconfigPath(m.dataDir, instanceName)
yq := tools.NewYQ()
@@ -276,7 +484,8 @@ func (m *Manager) GetStatus(instanceName string) (*ClusterStatus, error) {
}
// Get node count and types using kubectl
-cmd := exec.Command("kubectl", "--kubeconfig", kubeconfigPath, "get", "nodes", "-o", "json")
+cmd := exec.Command("kubectl", "get", "nodes", "-o", "json")
+tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.Output()
if err != nil {
status.Status = "unreachable"
@@ -359,9 +568,9 @@ func (m *Manager) GetStatus(instanceName string) (*ClusterStatus, error) {
}
for _, svc := range services {
-cmd := exec.Command("kubectl", "--kubeconfig", kubeconfigPath,
-"get", "pods", "-n", svc.namespace, "-l", svc.selector,
+cmd := exec.Command("kubectl", "get", "pods", "-n", svc.namespace, "-l", svc.selector,
"-o", "jsonpath={.items[*].status.phase}")
+tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.Output()
if err != nil || len(output) == 0 {
status.Services[svc.name] = "not_found"


@@ -109,14 +109,14 @@ type InstanceConfig struct {
IP string `yaml:"ip" json:"ip"`
ExternalResolver string `yaml:"externalResolver" json:"externalResolver"`
} `yaml:"dns" json:"dns"`
-DHCPRange string `yaml:"dhcpRange" json:"dhcpRange"`
-Dnsmasq struct {
+DHCPRange      string `yaml:"dhcpRange" json:"dhcpRange"`
+Dnsmasq        struct {
Interface string `yaml:"interface" json:"interface"`
} `yaml:"dnsmasq" json:"dnsmasq"`
-BaseDomain string `yaml:"baseDomain" json:"baseDomain"`
-Domain string `yaml:"domain" json:"domain"`
-InternalDomain string `yaml:"internalDomain" json:"internalDomain"`
-NFS struct {
+BaseDomain     string `yaml:"baseDomain" json:"baseDomain"`
+Domain         string `yaml:"domain" json:"domain"`
+InternalDomain string `yaml:"internalDomain" json:"internalDomain"`
+NFS            struct {
MediaPath string `yaml:"mediaPath" json:"mediaPath"`
Host string `yaml:"host" json:"host"`
StorageCapacity string `yaml:"storageCapacity" json:"storageCapacity"`


@@ -152,16 +152,19 @@ func (m *Manager) CopyConfig(srcPath, dstPath string) error {
}
// GetInstanceConfigPath returns the path to an instance's config file
+// Deprecated: Use tools.GetInstanceConfigPath instead
func GetInstanceConfigPath(dataDir, instanceName string) string {
-return filepath.Join(dataDir, "instances", instanceName, "config.yaml")
+return tools.GetInstanceConfigPath(dataDir, instanceName)
}
// GetInstanceSecretsPath returns the path to an instance's secrets file
+// Deprecated: Use tools.GetInstanceSecretsPath instead
func GetInstanceSecretsPath(dataDir, instanceName string) string {
-return filepath.Join(dataDir, "instances", instanceName, "secrets.yaml")
+return tools.GetInstanceSecretsPath(dataDir, instanceName)
}
// GetInstancePath returns the path to an instance directory
+// Deprecated: Use tools.GetInstancePath instead
func GetInstancePath(dataDir, instanceName string) string {
-return filepath.Join(dataDir, "instances", instanceName)
+return tools.GetInstancePath(dataDir, instanceName)
}


@@ -6,6 +6,7 @@ import (
"strings"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// Manager handles current instance context tracking
@@ -53,7 +54,7 @@ func (m *Manager) SetCurrentContext(instanceName string) error {
}
// Verify instance exists
-instancePath := filepath.Join(m.dataDir, "instances", instanceName)
+instancePath := tools.GetInstancePath(m.dataDir, instanceName)
if !storage.FileExists(instancePath) {
return fmt.Errorf("instance %s does not exist", instanceName)
}
@@ -101,7 +102,7 @@ func (m *Manager) ValidateContext() error {
return err
}
-instancePath := filepath.Join(m.dataDir, "instances", contextName)
+instancePath := tools.GetInstancePath(m.dataDir, contextName)
if !storage.FileExists(instancePath) {
return fmt.Errorf("current context %s points to non-existent instance", contextName)
}
@@ -116,7 +117,7 @@ func (m *Manager) GetCurrentInstancePath() (string, error) {
return "", err
}
-return filepath.Join(m.dataDir, "instances", contextName), nil
+return tools.GetInstancePath(m.dataDir, contextName), nil
}
// GetCurrentInstanceConfigPath returns the path to the current instance's config file


@@ -0,0 +1,339 @@
// Package contracts contains API contracts for service management endpoints
package contracts
import "time"
// ==============================
// Request/Response Types
// ==============================
// ServiceManifest represents basic service information
type ServiceManifest struct {
Name string `json:"name"`
Description string `json:"description"`
Namespace string `json:"namespace"`
ConfigReferences []string `json:"configReferences,omitempty"`
ServiceConfig map[string]ConfigDefinition `json:"serviceConfig,omitempty"`
}
// ConfigDefinition defines config that should be prompted during service setup
type ConfigDefinition struct {
Path string `json:"path"`
Prompt string `json:"prompt"`
Default string `json:"default"`
Type string `json:"type,omitempty"`
}
// PodStatus represents the status of a single pod
type PodStatus struct {
Name string `json:"name"` // Pod name
Status string `json:"status"` // Pod phase: Running, Pending, Failed, etc.
Ready string `json:"ready"` // Ready containers e.g. "1/1", "0/1"
Restarts int `json:"restarts"` // Container restart count
Age string `json:"age"` // Human-readable age e.g. "2h", "5m"
Node string `json:"node"` // Node name where pod is scheduled
IP string `json:"ip,omitempty"` // Pod IP if available
}
// DetailedServiceStatus provides comprehensive service status
type DetailedServiceStatus struct {
Name string `json:"name"` // Service name
Namespace string `json:"namespace"` // Kubernetes namespace
DeploymentStatus string `json:"deploymentStatus"` // "Ready", "Progressing", "Degraded", "NotFound"
Replicas ReplicaStatus `json:"replicas"` // Desired/current/ready replicas
Pods []PodStatus `json:"pods"` // Pod details
Config map[string]interface{} `json:"config,omitempty"` // Current config from config.yaml
Manifest *ServiceManifest `json:"manifest,omitempty"` // Service manifest if available
LastUpdated time.Time `json:"lastUpdated"` // Timestamp of status
}
// ReplicaStatus tracks deployment replica counts
type ReplicaStatus struct {
Desired int32 `json:"desired"` // Desired replica count
Current int32 `json:"current"` // Current replica count
Ready int32 `json:"ready"` // Ready replica count
Available int32 `json:"available"` // Available replica count
}
// ServiceLogsRequest query parameters for log retrieval
type ServiceLogsRequest struct {
Container string `json:"container,omitempty"` // Specific container name (optional)
Tail int `json:"tail,omitempty"` // Number of lines from end (default: 100)
Follow bool `json:"follow,omitempty"` // Stream logs via SSE
Previous bool `json:"previous,omitempty"` // Get previous container logs
Since string `json:"since,omitempty"` // RFC3339 or duration string e.g. "10m"
}
// ServiceLogsResponse for non-streaming log retrieval
type ServiceLogsResponse struct {
Service string `json:"service"` // Service name
Namespace string `json:"namespace"` // Kubernetes namespace
Container string `json:"container,omitempty"` // Container name if specified
Lines []string `json:"lines"` // Log lines
Truncated bool `json:"truncated"` // Whether logs were truncated
Timestamp time.Time `json:"timestamp"` // Response timestamp
}
// ServiceLogsSSEEvent for streaming logs via Server-Sent Events
type ServiceLogsSSEEvent struct {
Type string `json:"type"` // "log", "error", "end"
Line string `json:"line,omitempty"` // Log line content
Error string `json:"error,omitempty"` // Error message if type="error"
Container string `json:"container,omitempty"` // Container source
Timestamp time.Time `json:"timestamp"` // Event timestamp
}
// ServiceConfigUpdate request to update service configuration
type ServiceConfigUpdate struct {
Config map[string]interface{} `json:"config"` // Configuration updates
Redeploy bool `json:"redeploy"` // Trigger recompilation/redeployment
Fetch bool `json:"fetch"` // Fetch fresh templates before redeployment
}
// ServiceConfigResponse response after config update
type ServiceConfigResponse struct {
Service string `json:"service"` // Service name
Namespace string `json:"namespace"` // Kubernetes namespace
Config map[string]interface{} `json:"config"` // Updated configuration
Redeployed bool `json:"redeployed"` // Whether service was redeployed
Message string `json:"message"` // Success/info message
}
// ==============================
// Error Response
// ==============================
// ErrorResponse standard error format for all endpoints
type ErrorResponse struct {
Error ErrorDetail `json:"error"`
}
// ErrorDetail contains error information
type ErrorDetail struct {
Code string `json:"code"` // Machine-readable error code
Message string `json:"message"` // Human-readable error message
Details map[string]interface{} `json:"details,omitempty"` // Additional error context
}
// Standard error codes
const (
ErrCodeNotFound = "SERVICE_NOT_FOUND"
ErrCodeInstanceNotFound = "INSTANCE_NOT_FOUND"
ErrCodeInvalidRequest = "INVALID_REQUEST"
ErrCodeKubectlFailed = "KUBECTL_FAILED"
ErrCodeConfigInvalid = "CONFIG_INVALID"
ErrCodeDeploymentFailed = "DEPLOYMENT_FAILED"
ErrCodeStreamingError = "STREAMING_ERROR"
ErrCodeInternalError = "INTERNAL_ERROR"
)
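The ErrorResponse envelope serializes to the shape shown in the endpoint examples below. A small sketch with the contract types copied locally so it compiles standalone; errorJSON is a hypothetical convenience, not part of the contracts package:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the contract types so the example is self-contained.
type ErrorDetail struct {
	Code    string                 `json:"code"`
	Message string                 `json:"message"`
	Details map[string]interface{} `json:"details,omitempty"`
}

type ErrorResponse struct {
	Error ErrorDetail `json:"error"`
}

// errorJSON builds the standard error envelope as a JSON string.
// Details is omitted from the output when nil, via omitempty.
func errorJSON(code, msg string, details map[string]interface{}) string {
	b, _ := json.Marshal(ErrorResponse{Error: ErrorDetail{Code: code, Message: msg, Details: details}})
	return string(b)
}

func main() {
	fmt.Println(errorJSON("SERVICE_NOT_FOUND", "Service nginx not found in instance production",
		map[string]interface{}{"instance": "production", "service": "nginx"}))
}
```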
// ==============================
// API Endpoint Specifications
// ==============================
/*
1. GET /api/v1/instances/{name}/services/{service}/status
Purpose: Returns comprehensive service status including pods and health
Response Codes:
- 200 OK: Service status retrieved successfully
- 404 Not Found: Instance or service not found
- 500 Internal Server Error: kubectl command failed
Example Request:
GET /api/v1/instances/production/services/nginx/status
Example Response (200 OK):
{
"name": "nginx",
"namespace": "nginx",
"deploymentStatus": "Ready",
"replicas": {
"desired": 3,
"current": 3,
"ready": 3,
"available": 3
},
"pods": [
{
"name": "nginx-7c5464c66d-abc123",
"status": "Running",
"ready": "1/1",
"restarts": 0,
"age": "2h",
"node": "worker-1",
"ip": "10.42.1.5"
}
],
"config": {
"nginx.image": "nginx:1.21",
"nginx.replicas": 3
},
"manifest": {
"name": "nginx",
"description": "NGINX web server",
"namespace": "nginx"
},
"lastUpdated": "2024-01-15T10:30:00Z"
}
Example Error Response (404):
{
"error": {
"code": "SERVICE_NOT_FOUND",
"message": "Service nginx not found in instance production",
"details": {
"instance": "production",
"service": "nginx"
}
}
}
*/
/*
2. GET /api/v1/instances/{name}/services/{service}/logs
Purpose: Retrieve or stream service logs
Query Parameters:
- container (string): Specific container name
- tail (int): Number of lines from end (default: 100, max: 5000)
- follow (bool): Stream logs via SSE (default: false)
- previous (bool): Get previous container logs (default: false)
- since (string): RFC3339 timestamp or duration (e.g. "10m")
Response Codes:
- 200 OK: Logs retrieved successfully (or SSE stream started)
- 400 Bad Request: Invalid query parameters
- 404 Not Found: Instance, service, or container not found
- 500 Internal Server Error: kubectl command failed
Example Request (buffered):
GET /api/v1/instances/production/services/nginx/logs?tail=50
Example Response (200 OK):
{
"service": "nginx",
"namespace": "nginx",
"container": "nginx",
"lines": [
"2024/01/15 10:00:00 [notice] Configuration loaded",
"2024/01/15 10:00:01 [info] Server started on port 80"
],
"truncated": false,
"timestamp": "2024-01-15T10:30:00Z"
}
Example Request (streaming):
GET /api/v1/instances/production/services/nginx/logs?follow=true
Accept: text/event-stream
Example SSE Response:
data: {"type":"log","line":"2024/01/15 10:00:00 [notice] Configuration loaded","container":"nginx","timestamp":"2024-01-15T10:30:00Z"}
data: {"type":"log","line":"2024/01/15 10:00:01 [info] Request from 10.0.0.1","container":"nginx","timestamp":"2024-01-15T10:30:01Z"}
data: {"type":"error","error":"Container restarting","timestamp":"2024-01-15T10:30:02Z"}
data: {"type":"end","timestamp":"2024-01-15T10:30:03Z"}
*/
/*
3. PATCH /api/v1/instances/{name}/services/{service}/config
Purpose: Update service configuration in config.yaml and optionally redeploy
Request Body: ServiceConfigUpdate (JSON)
Response Codes:
- 200 OK: Configuration updated successfully
- 400 Bad Request: Invalid configuration
- 404 Not Found: Instance or service not found
- 500 Internal Server Error: Update or deployment failed
Example Request:
PATCH /api/v1/instances/production/services/nginx/config
Content-Type: application/json
{
"config": {
"nginx.image": "nginx:1.22",
"nginx.replicas": 5,
"nginx.resources.memory": "512Mi"
},
"redeploy": true
}
Example Response (200 OK):
{
"service": "nginx",
"namespace": "nginx",
"config": {
"nginx.image": "nginx:1.22",
"nginx.replicas": 5,
"nginx.resources.memory": "512Mi"
},
"redeployed": true,
"message": "Service configuration updated and redeployed successfully"
}
Example Error Response (400):
{
"error": {
"code": "CONFIG_INVALID",
"message": "Invalid configuration: nginx.replicas must be a positive integer",
"details": {
"field": "nginx.replicas",
"value": -1,
"constraint": "positive integer"
}
}
}
*/
// ==============================
// Validation Rules
// ==============================
/*
Query Parameter Validation:
ServiceLogsRequest:
- tail: Must be between 1 and 5000 (default: 100)
- since: Must be valid RFC3339 timestamp or Go duration string (e.g. "5m", "1h")
- container: Must match existing container name if specified
- follow: When true, response uses Server-Sent Events (SSE)
- previous: Cannot be combined with follow=true
ServiceConfigUpdate:
- config: Must be valid YAML-compatible structure
- config keys: Must follow service's expected configuration schema
- redeploy: When true, triggers kustomize recompilation and kubectl apply
Path Parameters:
- instance name: Must match existing instance directory
- service name: Must match installed service name
*/
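The `ServiceLogsRequest` rules above can be expressed as a small validator. This is a sketch of the documented constraints, not the daemon's actual implementation; it treats `tail == 0` as "use the default of 100".

```go
package main

import (
	"fmt"
	"time"
)

// validateLogsQuery applies the documented rules: tail in [1, 5000],
// since either RFC3339 or a Go duration string, and previous
// incompatible with follow.
func validateLogsQuery(tail int, since string, follow, previous bool) (int, error) {
	if tail == 0 {
		tail = 100 // documented default
	}
	if tail < 1 || tail > 5000 {
		return 0, fmt.Errorf("tail must be between 1 and 5000, got %d", tail)
	}
	if since != "" {
		if _, terr := time.Parse(time.RFC3339, since); terr != nil {
			if _, derr := time.ParseDuration(since); derr != nil {
				return 0, fmt.Errorf("since must be RFC3339 or a duration: %q", since)
			}
		}
	}
	if follow && previous {
		return 0, fmt.Errorf("previous cannot be combined with follow")
	}
	return tail, nil
}

func main() {
	tail, err := validateLogsQuery(0, "10m", true, false)
	fmt.Println(tail, err) // 100 <nil>
}
```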
// ==============================
// HTTP Status Code Summary
// ==============================
/*
200 OK:
- Service status retrieved successfully
- Logs retrieved successfully (non-streaming)
- Configuration updated successfully
400 Bad Request:
- Invalid query parameters
- Invalid configuration in request body
- Validation errors
404 Not Found:
- Instance does not exist
- Service not installed in instance
- Container name not found (for logs)
500 Internal Server Error:
- kubectl command execution failed
- File system operations failed
- Unexpected errors during processing
*/


@@ -3,6 +3,7 @@ package discovery
import (
"encoding/json"
"fmt"
"net"
"os"
"path/filepath"
"sync"
@@ -24,23 +25,21 @@ type Manager struct {
// NewManager creates a new discovery manager
func NewManager(dataDir string, instanceName string) *Manager {
// Get talosconfig path for the instance
talosconfigPath := filepath.Join(dataDir, "instances", instanceName, "setup", "cluster-nodes", "generated", "talosconfig")
talosconfigPath := tools.GetTalosconfigPath(dataDir, instanceName)
return &Manager{
dataDir: dataDir,
nodeMgr: node.NewManager(dataDir),
nodeMgr: node.NewManager(dataDir, instanceName),
talosctl: tools.NewTalosconfigWithConfig(talosconfigPath),
}
}
// DiscoveredNode represents a discovered node on the network
// DiscoveredNode represents a discovered node on the network (maintenance mode only)
type DiscoveredNode struct {
IP string `json:"ip"`
Hostname string `json:"hostname,omitempty"`
MaintenanceMode bool `json:"maintenance_mode"`
Version string `json:"version,omitempty"`
Interface string `json:"interface,omitempty"`
Disks []string `json:"disks,omitempty"`
IP string `json:"ip"`
Hostname string `json:"hostname,omitempty"`
MaintenanceMode bool `json:"maintenance_mode"`
Version string `json:"version,omitempty"`
}
// DiscoveryStatus represents the current state of discovery
@@ -53,7 +52,7 @@ type DiscoveryStatus struct {
// GetDiscoveryDir returns the discovery directory for an instance
func (m *Manager) GetDiscoveryDir(instanceName string) string {
return filepath.Join(m.dataDir, "instances", instanceName, "discovery")
return tools.GetInstanceDiscoveryPath(m.dataDir, instanceName)
}
// GetDiscoveryStatusPath returns the path to discovery status file
@@ -127,61 +126,69 @@ func (m *Manager) runDiscovery(instanceName string, ipList []string) {
status, _ := m.GetDiscoveryStatus(instanceName)
status.Active = false
m.writeDiscoveryStatus(instanceName, status)
_ = m.writeDiscoveryStatus(instanceName, status)
}()
// Discover nodes by probing each IP
discoveredNodes := []DiscoveredNode{}
// Discover nodes by probing each IP in parallel
var wg sync.WaitGroup
resultsChan := make(chan DiscoveredNode, len(ipList))
// Limit concurrent scans to avoid overwhelming the network
semaphore := make(chan struct{}, 50)
for _, ip := range ipList {
node, err := m.probeNode(ip)
if err != nil {
// Node not reachable or not a Talos node
continue
}
wg.Add(1)
go func(ip string) {
defer wg.Done()
discoveredNodes = append(discoveredNodes, *node)
// Acquire semaphore
semaphore <- struct{}{}
defer func() { <-semaphore }()
node, err := m.probeNode(ip)
if err != nil {
// Node not reachable or not a Talos node
return
}
resultsChan <- *node
}(ip)
}
// Close results channel when all goroutines complete
go func() {
wg.Wait()
close(resultsChan)
}()
// Collect results and update status incrementally
discoveredNodes := []DiscoveredNode{}
for node := range resultsChan {
discoveredNodes = append(discoveredNodes, node)
// Update status incrementally
m.discoveryMu.Lock()
status, _ := m.GetDiscoveryStatus(instanceName)
status.NodesFound = discoveredNodes
m.writeDiscoveryStatus(instanceName, status)
_ = m.writeDiscoveryStatus(instanceName, status)
m.discoveryMu.Unlock()
}
}
// probeNode attempts to detect if a node is running Talos
// probeNode attempts to detect if a node is running Talos in maintenance mode
func (m *Manager) probeNode(ip string) (*DiscoveredNode, error) {
// Attempt to get version (quick connectivity test)
version, err := m.talosctl.GetVersion(ip, false)
// Try insecure connection first (maintenance mode)
version, err := m.talosctl.GetVersion(ip, true)
if err != nil {
// Not in maintenance mode or not reachable
return nil, err
}
// Node is reachable, get hardware info
hwInfo, err := m.nodeMgr.DetectHardware(ip)
if err != nil {
// Still count it as discovered even if we can't get full hardware
return &DiscoveredNode{
IP: ip,
MaintenanceMode: false,
Version: version,
}, nil
}
// Extract just the disk paths for discovery output
diskPaths := make([]string, len(hwInfo.Disks))
for i, disk := range hwInfo.Disks {
diskPaths[i] = disk.Path
}
// If insecure connection works, node is in maintenance mode
return &DiscoveredNode{
IP: ip,
MaintenanceMode: hwInfo.MaintenanceMode,
MaintenanceMode: true,
Version: version,
Interface: hwInfo.Interface,
Disks: diskPaths,
}, nil
}
@@ -245,3 +252,132 @@ func (m *Manager) writeDiscoveryStatus(instanceName string, status *DiscoverySta
return nil
}
// CancelDiscovery cancels an in-progress discovery operation
func (m *Manager) CancelDiscovery(instanceName string) error {
m.discoveryMu.Lock()
defer m.discoveryMu.Unlock()
// Get current status
status, err := m.GetDiscoveryStatus(instanceName)
if err != nil {
return err
}
if !status.Active {
return fmt.Errorf("no discovery in progress")
}
// Mark discovery as cancelled
status.Active = false
status.Error = "Discovery cancelled by user"
if err := m.writeDiscoveryStatus(instanceName, status); err != nil {
return err
}
return nil
}
// GetLocalNetworks discovers local network interfaces and returns their CIDR addresses
// Skips loopback, link-local, and down interfaces
// Only returns IPv4 networks
func GetLocalNetworks() ([]string, error) {
interfaces, err := net.Interfaces()
if err != nil {
return nil, fmt.Errorf("failed to get network interfaces: %w", err)
}
var networks []string
for _, iface := range interfaces {
// Skip loopback and down interfaces
if iface.Flags&net.FlagLoopback != 0 || iface.Flags&net.FlagUp == 0 {
continue
}
addrs, err := iface.Addrs()
if err != nil {
continue
}
for _, addr := range addrs {
ipnet, ok := addr.(*net.IPNet)
if !ok {
continue
}
// Only IPv4 for now
if ipnet.IP.To4() == nil {
continue
}
// Skip link-local addresses (169.254.0.0/16)
if ipnet.IP.IsLinkLocalUnicast() {
continue
}
networks = append(networks, ipnet.String())
}
}
return networks, nil
}
// ExpandSubnet expands a CIDR notation subnet into individual IP addresses
// Example: "192.168.8.0/24" → ["192.168.8.1", "192.168.8.2", ..., "192.168.8.254"]
// Also handles single IPs (without CIDR notation)
func ExpandSubnet(subnet string) ([]string, error) {
// Check if it's a CIDR notation
ip, ipnet, err := net.ParseCIDR(subnet)
if err != nil {
// Not a CIDR, might be single IP
if net.ParseIP(subnet) != nil {
return []string{subnet}, nil
}
return nil, fmt.Errorf("invalid IP or CIDR: %s", subnet)
}
// Special case: /32 (single host) - just return the IP
ones, _ := ipnet.Mask.Size()
if ones == 32 {
return []string{ip.String()}, nil
}
var ips []string
// Iterate through all IPs in the subnet
for ip := ip.Mask(ipnet.Mask); ipnet.Contains(ip); incIP(ip) {
// Skip network address (first IP)
if ip.Equal(ipnet.IP) {
continue
}
// Skip broadcast address (last IP)
if isLastIP(ip, ipnet) {
continue
}
ips = append(ips, ip.String())
}
return ips, nil
}
// incIP increments an IP address
func incIP(ip net.IP) {
for j := len(ip) - 1; j >= 0; j-- {
ip[j]++
if ip[j] > 0 {
break
}
}
}
// isLastIP checks if an IP is the last IP in a subnet (broadcast address)
func isLastIP(ip net.IP, ipnet *net.IPNet) bool {
lastIP := make(net.IP, len(ip))
for i := range ip {
lastIP[i] = ip[i] | ^ipnet.Mask[i]
}
return ip.Equal(lastIP)
}
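The expansion logic in `ExpandSubnet`, `incIP`, and `isLastIP` above can be exercised with a standalone sketch. The `expand` helper here folds the same network/broadcast skipping into one function (the original's `/32` special case is omitted for brevity); `incIP` is reproduced verbatim.

```go
package main

import (
	"fmt"
	"net"
)

// incIP increments an IP address in place (same as the helper above).
func incIP(ip net.IP) {
	for j := len(ip) - 1; j >= 0; j-- {
		ip[j]++
		if ip[j] > 0 {
			break
		}
	}
}

// expand lists usable host IPs in a CIDR, skipping the network and
// broadcast addresses, following the same approach as ExpandSubnet.
func expand(cidr string) ([]string, error) {
	ip, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return nil, err
	}
	// Broadcast address: network bits OR'd with the inverted mask.
	last := make(net.IP, len(ipnet.IP))
	for i := range ipnet.IP {
		last[i] = ipnet.IP[i] | ^ipnet.Mask[i]
	}
	var ips []string
	for ip = ip.Mask(ipnet.Mask); ipnet.Contains(ip); incIP(ip) {
		if ip.Equal(ipnet.IP) || ip.Equal(last) {
			continue
		}
		ips = append(ips, ip.String())
	}
	return ips, nil
}

func main() {
	ips, err := expand("192.168.8.0/30")
	if err != nil {
		panic(err)
	}
	fmt.Println(ips) // [192.168.8.1 192.168.8.2]
}
```

A `/24` yields 254 usable addresses (`.1` through `.254`), matching the docstring example on `ExpandSubnet`.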


@@ -89,11 +89,11 @@ func (g *ConfigGenerator) RestartService() error {
// ServiceStatus represents the status of the dnsmasq service
type ServiceStatus struct {
Status string `json:"status"`
PID int `json:"pid"`
ConfigFile string `json:"config_file"`
InstancesConfigured int `json:"instances_configured"`
LastRestart time.Time `json:"last_restart"`
Status string `json:"status"`
PID int `json:"pid"`
ConfigFile string `json:"config_file"`
InstancesConfigured int `json:"instances_configured"`
LastRestart time.Time `json:"last_restart"`
}
// GetStatus checks the status of the dnsmasq service


@@ -9,6 +9,7 @@ import (
"github.com/wild-cloud/wild-central/daemon/internal/context"
"github.com/wild-cloud/wild-central/daemon/internal/secrets"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// Manager handles instance lifecycle operations
@@ -38,18 +39,21 @@ type Instance struct {
}
// GetInstancePath returns the path to an instance directory
// Deprecated: Use tools.GetInstancePath instead
func (m *Manager) GetInstancePath(name string) string {
return filepath.Join(m.dataDir, "instances", name)
return tools.GetInstancePath(m.dataDir, name)
}
// GetInstanceConfigPath returns the path to an instance's config file
// Deprecated: Use tools.GetInstanceConfigPath instead
func (m *Manager) GetInstanceConfigPath(name string) string {
return filepath.Join(m.GetInstancePath(name), "config.yaml")
return tools.GetInstanceConfigPath(m.dataDir, name)
}
// GetInstanceSecretsPath returns the path to an instance's secrets file
// Deprecated: Use tools.GetInstanceSecretsPath instead
func (m *Manager) GetInstanceSecretsPath(name string) string {
return filepath.Join(m.GetInstancePath(name), "secrets.yaml")
return tools.GetInstanceSecretsPath(m.dataDir, name)
}
// InstanceExists checks if an instance exists
@@ -71,7 +75,7 @@ func (m *Manager) CreateInstance(name string) error {
}
// Acquire lock for instance creation
lockPath := filepath.Join(m.dataDir, "instances", ".lock")
lockPath := tools.GetInstancesLockPath(m.dataDir)
return storage.WithLock(lockPath, func() error {
// Create instance directory
if err := storage.EnsureDir(instancePath, 0755); err != nil {
@@ -123,7 +127,7 @@ func (m *Manager) DeleteInstance(name string) error {
}
// Acquire lock for instance deletion
lockPath := filepath.Join(m.dataDir, "instances", ".lock")
lockPath := tools.GetInstancesLockPath(m.dataDir)
return storage.WithLock(lockPath, func() error {
// Remove instance directory
if err := os.RemoveAll(instancePath); err != nil {
@@ -136,7 +140,7 @@ func (m *Manager) DeleteInstance(name string) error {
// ListInstances returns a list of all instance names
func (m *Manager) ListInstances() ([]string, error) {
instancesDir := filepath.Join(m.dataDir, "instances")
instancesDir := tools.GetInstancesPath(m.dataDir)
// Ensure instances directory exists
if !storage.FileExists(instancesDir) {


@@ -20,11 +20,22 @@ type Manager struct {
}
// NewManager creates a new node manager
func NewManager(dataDir string) *Manager {
func NewManager(dataDir string, instanceName string) *Manager {
var talosctl *tools.Talosctl
// If instanceName is provided, use instance-specific talosconfig
// Otherwise, create basic talosctl (will use --insecure mode)
if instanceName != "" {
talosconfigPath := tools.GetTalosconfigPath(dataDir, instanceName)
talosctl = tools.NewTalosconfigWithConfig(talosconfigPath)
} else {
talosctl = tools.NewTalosctl()
}
return &Manager{
dataDir: dataDir,
configMgr: config.NewManager(),
talosctl: tools.NewTalosctl(),
talosctl: talosctl,
}
}
@@ -59,7 +70,7 @@ type ApplyOptions struct {
// GetInstancePath returns the path to an instance's nodes directory
func (m *Manager) GetInstancePath(instanceName string) string {
return filepath.Join(m.dataDir, "instances", instanceName)
return tools.GetInstancePath(m.dataDir, instanceName)
}
// List returns all nodes for an instance
@@ -254,12 +265,14 @@ func (m *Manager) Delete(instanceName, nodeIdentifier string) error {
configPath := filepath.Join(instancePath, "config.yaml")
// Delete node from config.yaml
// Path: cluster.nodes.active.{hostname}
nodePath := fmt.Sprintf("cluster.nodes.active.%s", node.Hostname)
// Path: .cluster.nodes.active["hostname"]
// Use bracket notation to safely handle hostnames with special characters
nodePath := fmt.Sprintf(".cluster.nodes.active[\"%s\"]", node.Hostname)
yq := tools.NewYQ()
// Use yq to delete the node
_, err = yq.Exec("eval", "-i", fmt.Sprintf("del(%s)", nodePath), configPath)
delExpr := fmt.Sprintf("del(%s)", nodePath)
_, err = yq.Exec("eval", "-i", delExpr, configPath)
if err != nil {
return fmt.Errorf("failed to delete node: %w", err)
}
@@ -268,10 +281,20 @@ func (m *Manager) Delete(instanceName, nodeIdentifier string) error {
}
// DetectHardware queries node hardware information via talosctl
// Automatically detects maintenance mode by trying insecure first, then secure
func (m *Manager) DetectHardware(nodeIP string) (*HardwareInfo, error) {
// Query node with insecure flag (maintenance mode)
insecure := true
// Try insecure first (maintenance mode)
hwInfo, err := m.detectHardwareWithMode(nodeIP, true)
if err == nil {
return hwInfo, nil
}
// Fall back to secure (configured node)
return m.detectHardwareWithMode(nodeIP, false)
}
// detectHardwareWithMode queries node hardware with specified connection mode
func (m *Manager) detectHardwareWithMode(nodeIP string, insecure bool) (*HardwareInfo, error) {
// Try to get default interface (with default route)
iface, err := m.talosctl.GetDefaultInterface(nodeIP, insecure)
if err != nil {
@@ -299,10 +322,11 @@ func (m *Manager) DetectHardware(nodeIP string) (*HardwareInfo, error) {
Interface: iface,
Disks: disks,
SelectedDisk: selectedDisk,
MaintenanceMode: true,
MaintenanceMode: insecure, // If we used insecure, it's in maintenance mode
}, nil
}
// Apply generates configuration and applies it to node
// This follows the wild-node-apply flow:
// 1. Auto-fetch templates if missing
@@ -380,9 +404,9 @@ func (m *Manager) Apply(instanceName, nodeIdentifier string, opts ApplyOptions)
// Determine which IP to use and whether node is in maintenance mode
//
// Three scenarios:
// 1. Production node (currentIP empty/same, maintenance=false): use targetIP, no --insecure
// 1. Production node (already applied, maintenance=false): use targetIP, no --insecure
// 2. IP changing (currentIP != targetIP): use currentIP, --insecure (always maintenance)
// 3. Maintenance at target (maintenance=true, no IP change): use targetIP, --insecure
// 3. Fresh/maintenance node (never applied OR maintenance=true): use targetIP, --insecure
var deployIP string
var maintenanceMode bool
@@ -390,12 +414,13 @@ func (m *Manager) Apply(instanceName, nodeIdentifier string, opts ApplyOptions)
// Scenario 2: IP is changing - node is at currentIP, moving to targetIP
deployIP = node.CurrentIP
maintenanceMode = true
} else if node.Maintenance {
// Scenario 3: Explicit maintenance mode, no IP change
} else if node.Maintenance || !node.Applied {
// Scenario 3: Explicit maintenance mode OR never been applied (fresh node)
// Fresh nodes need --insecure because they have self-signed certificates
deployIP = node.TargetIP
maintenanceMode = true
} else {
// Scenario 1: Production node at target IP
// Scenario 1: Production node at target IP (already applied, not in maintenance)
deployIP = node.TargetIP
maintenanceMode = false
}
@@ -535,16 +560,6 @@ func (m *Manager) extractEmbeddedTemplates(destDir string) error {
return nil
}
// copyFile copies a file from src to dst
func (m *Manager) copyFile(src, dst string) error {
data, err := os.ReadFile(src)
if err != nil {
return err
}
return os.WriteFile(dst, data, 0644)
}
// updateNodeStatus updates node status flags in config.yaml
func (m *Manager) updateNodeStatus(instanceName string, node *Node) error {
instancePath := m.GetInstancePath(instanceName)


@@ -8,8 +8,12 @@ import (
"time"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// Bootstrap step constants
const totalBootstrapSteps = 7
// Manager handles async operation tracking
type Manager struct {
dataDir string
@@ -22,23 +26,38 @@ func NewManager(dataDir string) *Manager {
}
}
// BootstrapProgress tracks detailed bootstrap progress
type BootstrapProgress struct {
CurrentStep int `json:"current_step"` // 0-6
StepName string `json:"step_name"`
Attempt int `json:"attempt"`
MaxAttempts int `json:"max_attempts"`
StepDescription string `json:"step_description"`
}
// OperationDetails contains operation-specific details
type OperationDetails struct {
BootstrapProgress *BootstrapProgress `json:"bootstrap,omitempty"`
}
// Operation represents a long-running operation
type Operation struct {
ID string `json:"id"`
Type string `json:"type"` // discover, setup, download, bootstrap
Target string `json:"target"`
Instance string `json:"instance"`
Status string `json:"status"` // pending, running, completed, failed, cancelled
Message string `json:"message,omitempty"`
Progress int `json:"progress"` // 0-100
LogFile string `json:"logFile,omitempty"` // Path to output log file
StartedAt time.Time `json:"started_at"`
EndedAt time.Time `json:"ended_at,omitempty"`
ID string `json:"id"`
Type string `json:"type"` // discover, setup, download, bootstrap
Target string `json:"target"`
Instance string `json:"instance"`
Status string `json:"status"` // pending, running, completed, failed, cancelled
Message string `json:"message,omitempty"`
Progress int `json:"progress"` // 0-100
Details *OperationDetails `json:"details,omitempty"` // Operation-specific details
LogFile string `json:"logFile,omitempty"` // Path to output log file
StartedAt time.Time `json:"started_at"`
EndedAt time.Time `json:"ended_at,omitempty"`
}
// GetOperationsDir returns the operations directory for an instance
func (m *Manager) GetOperationsDir(instanceName string) string {
return filepath.Join(m.dataDir, "instances", instanceName, "operations")
return tools.GetInstanceOperationsPath(m.dataDir, instanceName)
}
// generateID generates a unique operation ID
@@ -78,19 +97,6 @@ func (m *Manager) Start(instanceName, opType, target string) (string, error) {
return opID, nil
}
// Get returns operation status
func (m *Manager) Get(opID string) (*Operation, error) {
// Operation ID contains instance name, but we need to find it
// For now, we'll scan all instances (not ideal but simple)
// Better approach: encode instance in operation ID or maintain index
// Simplified: assume operation ID format is op_{type}_{target}_{timestamp}
// We need to know which instance to look in
// For now, return error if we can't find it
// This needs improvement in actual implementation
return nil, fmt.Errorf("operation lookup not implemented - need instance context")
}
// GetByInstance returns an operation for a specific instance
func (m *Manager) GetByInstance(instanceName, opID string) (*Operation, error) {
@@ -230,13 +236,38 @@ func (m *Manager) Cleanup(instanceName string, olderThan time.Duration) error {
for _, op := range ops {
if (op.Status == "completed" || op.Status == "failed" || op.Status == "cancelled") &&
!op.EndedAt.IsZero() && op.EndedAt.Before(cutoff) {
m.Delete(instanceName, op.ID)
_ = m.Delete(instanceName, op.ID)
}
}
return nil
}
// UpdateBootstrapProgress updates bootstrap-specific progress details
func (m *Manager) UpdateBootstrapProgress(instanceName, opID string, step int, stepName string, attempt, maxAttempts int, stepDescription string) error {
op, err := m.GetByInstance(instanceName, opID)
if err != nil {
return err
}
if op.Details == nil {
op.Details = &OperationDetails{}
}
op.Details.BootstrapProgress = &BootstrapProgress{
CurrentStep: step,
StepName: stepName,
Attempt: attempt,
MaxAttempts: maxAttempts,
StepDescription: stepDescription,
}
op.Progress = (step * 100) / (totalBootstrapSteps - 1)
op.Message = fmt.Sprintf("Step %d/%d: %s (attempt %d/%d)", step+1, totalBootstrapSteps, stepName, attempt, maxAttempts)
return m.writeOperation(op)
}
// writeOperation writes operation to disk
func (m *Manager) writeOperation(op *Operation) error {
opsDir := m.GetOperationsDir(op.Instance)


@@ -9,6 +9,7 @@ import (
"path/filepath"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// Manager handles PXE boot asset management
@@ -35,7 +36,7 @@ type Asset struct {
// GetPXEDir returns the PXE directory for an instance
func (m *Manager) GetPXEDir(instanceName string) string {
return filepath.Join(m.dataDir, "instances", instanceName, "pxe")
return tools.GetInstancePXEPath(m.dataDir, instanceName)
}
// ListAssets returns available PXE assets for an instance

internal/services/config.go (new file, 142 lines)

@@ -0,0 +1,142 @@
package services
import (
"fmt"
"os"
"strings"
"gopkg.in/yaml.v3"
"github.com/wild-cloud/wild-central/daemon/internal/contracts"
"github.com/wild-cloud/wild-central/daemon/internal/operations"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// UpdateConfig updates service configuration and optionally redeploys
func (m *Manager) UpdateConfig(instanceName, serviceName string, update contracts.ServiceConfigUpdate, broadcaster *operations.Broadcaster) (*contracts.ServiceConfigResponse, error) {
// 1. Validate service exists
manifest, err := m.GetManifest(serviceName)
if err != nil {
return nil, fmt.Errorf("service not found: %w", err)
}
namespace := manifest.Namespace
if deployment, ok := serviceDeployments[serviceName]; ok {
namespace = deployment.namespace
}
// 2. Load instance config
configPath := tools.GetInstanceConfigPath(m.dataDir, instanceName)
if !storage.FileExists(configPath) {
return nil, fmt.Errorf("config file not found for instance %s", instanceName)
}
configData, err := os.ReadFile(configPath)
if err != nil {
return nil, fmt.Errorf("failed to read config: %w", err)
}
var config map[string]interface{}
if err := yaml.Unmarshal(configData, &config); err != nil {
return nil, fmt.Errorf("failed to parse config: %w", err)
}
// 3. Validate config keys against service manifest
validPaths := make(map[string]bool)
for _, path := range manifest.ConfigReferences {
validPaths[path] = true
}
for _, cfg := range manifest.ServiceConfig {
validPaths[cfg.Path] = true
}
for key := range update.Config {
if !validPaths[key] {
return nil, fmt.Errorf("invalid config key '%s' for service %s", key, serviceName)
}
}
// 4. Update config values
for key, value := range update.Config {
if err := setNestedValue(config, key, value); err != nil {
return nil, fmt.Errorf("failed to set config key '%s': %w", key, err)
}
}
// 5. Write updated config
updatedData, err := yaml.Marshal(config)
if err != nil {
return nil, fmt.Errorf("failed to marshal config: %w", err)
}
if err := os.WriteFile(configPath, updatedData, 0644); err != nil {
return nil, fmt.Errorf("failed to write config: %w", err)
}
// 6. Redeploy if requested
redeployed := false
if update.Redeploy {
// Fetch fresh templates if requested
if update.Fetch {
if err := m.Fetch(instanceName, serviceName); err != nil {
return nil, fmt.Errorf("failed to fetch templates: %w", err)
}
}
// Recompile templates
if err := m.Compile(instanceName, serviceName); err != nil {
return nil, fmt.Errorf("failed to recompile templates: %w", err)
}
// Redeploy service
if err := m.Deploy(instanceName, serviceName, "", broadcaster); err != nil {
return nil, fmt.Errorf("failed to redeploy service: %w", err)
}
redeployed = true
}
// 7. Build response
message := "Service configuration updated successfully"
if redeployed {
message = "Service configuration updated and redeployed successfully"
}
return &contracts.ServiceConfigResponse{
Service: serviceName,
Namespace: namespace,
Config: update.Config,
Redeployed: redeployed,
Message: message,
}, nil
}
// setNestedValue sets a value in a nested map using dot notation
func setNestedValue(data map[string]interface{}, path string, value interface{}) error {
keys := strings.Split(path, ".")
current := data
for i, key := range keys {
if i == len(keys)-1 {
// Last key - set the value
current[key] = value
return nil
}
// Navigate to the next level
if next, ok := current[key].(map[string]interface{}); ok {
current = next
} else if current[key] == nil {
// Create intermediate map if it doesn't exist
next := make(map[string]interface{})
current[key] = next
current = next
} else {
return fmt.Errorf("path '%s' conflicts with existing non-map value at '%s'", path, key)
}
}
return nil
}
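The dot-notation walk in `setNestedValue` creates intermediate maps on demand and rejects paths that would overwrite a scalar. A standalone sketch of the same traversal:

```go
package main

import (
	"fmt"
	"strings"
)

// setNested mirrors setNestedValue above: walk dot-separated keys,
// creating intermediate maps as needed, erroring on scalar conflicts.
func setNested(data map[string]interface{}, path string, value interface{}) error {
	keys := strings.Split(path, ".")
	current := data
	for i, key := range keys {
		if i == len(keys)-1 {
			current[key] = value // last key: set the value
			return nil
		}
		if next, ok := current[key].(map[string]interface{}); ok {
			current = next
		} else if current[key] == nil {
			next := make(map[string]interface{})
			current[key] = next
			current = next
		} else {
			return fmt.Errorf("path %q conflicts with non-map value at %q", path, key)
		}
	}
	return nil
}

func main() {
	cfg := map[string]interface{}{}
	if err := setNested(cfg, "nginx.resources.memory", "512Mi"); err != nil {
		panic(err)
	}
	fmt.Println(cfg) // map[nginx:map[resources:map[memory:512Mi]]]
}
```

Note that a key like `"nginx.resources.memory"` is split into three levels here, so flat keys containing literal dots cannot be set as-is; callers are expected to pass hierarchical paths.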

internal/services/logs.go (new file, 287 lines)

@@ -0,0 +1,287 @@
package services
import (
"bufio"
"encoding/json"
"fmt"
"io"
"time"
"github.com/wild-cloud/wild-central/daemon/internal/contracts"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// GetLogs retrieves buffered logs from a service
func (m *Manager) GetLogs(instanceName, serviceName string, opts contracts.ServiceLogsRequest) (*contracts.ServiceLogsResponse, error) {
// 1. Get service namespace
manifest, err := m.GetManifest(serviceName)
if err != nil {
return nil, fmt.Errorf("service not found: %w", err)
}
namespace := manifest.Namespace
if deployment, ok := serviceDeployments[serviceName]; ok {
namespace = deployment.namespace
}
// 2. Get kubeconfig path
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
if !storage.FileExists(kubeconfigPath) {
return nil, fmt.Errorf("kubeconfig not found - cluster may not be bootstrapped")
}
kubectl := tools.NewKubectl(kubeconfigPath)
// 3. Get pod name (use first pod if no specific container specified)
podName := ""
if opts.Container == "" {
// Get first pod in namespace
podName, err = kubectl.GetFirstPodName(namespace)
if err != nil {
// Check if it's because there are no pods
pods, _ := kubectl.GetPods(namespace, false)
if len(pods) == 0 {
// Return empty logs response instead of error when no pods exist
return &contracts.ServiceLogsResponse{
Lines: []string{"No pods found for service. The service may not be deployed yet."},
}, nil
}
return nil, fmt.Errorf("failed to find pod: %w", err)
}
// If no container specified, get first container
containers, err := kubectl.GetPodContainers(namespace, podName)
if err != nil {
return nil, fmt.Errorf("failed to get pod containers: %w", err)
}
if len(containers) > 0 {
opts.Container = containers[0]
}
} else {
// Find pod with specified container
pods, err := kubectl.GetPods(namespace, false)
if err != nil {
return nil, fmt.Errorf("failed to list pods: %w", err)
}
if len(pods) > 0 {
podName = pods[0].Name
} else {
return nil, fmt.Errorf("no pods found in namespace %s", namespace)
}
}
// 4. Set default tail if not specified
if opts.Tail == 0 {
opts.Tail = 100
}
// Enforce maximum tail
if opts.Tail > 5000 {
opts.Tail = 5000
}
// 5. Get logs
logOpts := tools.LogOptions{
Container: opts.Container,
Tail: opts.Tail,
Previous: opts.Previous,
Since: opts.Since,
SinceSeconds: 0,
}
logEntries, err := kubectl.GetLogs(namespace, podName, logOpts)
if err != nil {
return nil, fmt.Errorf("failed to get logs: %w", err)
}
// 6. Convert structured logs to string lines
lines := make([]string, 0, len(logEntries))
for _, entry := range logEntries {
lines = append(lines, entry.Message)
}
truncated := false
if len(lines) > opts.Tail {
lines = lines[len(lines)-opts.Tail:]
truncated = true
}
return &contracts.ServiceLogsResponse{
Service: serviceName,
Namespace: namespace,
Container: opts.Container,
Lines: lines,
Truncated: truncated,
Timestamp: time.Now(),
}, nil
}
// StreamLogs streams logs from a service using SSE
func (m *Manager) StreamLogs(instanceName, serviceName string, opts contracts.ServiceLogsRequest, writer io.Writer) error {
// 1. Get service namespace
manifest, err := m.GetManifest(serviceName)
if err != nil {
return fmt.Errorf("service not found: %w", err)
}
namespace := manifest.Namespace
if deployment, ok := serviceDeployments[serviceName]; ok {
namespace = deployment.namespace
}
// 2. Get kubeconfig path
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
if !storage.FileExists(kubeconfigPath) {
return fmt.Errorf("kubeconfig not found - cluster may not be bootstrapped")
}
kubectl := tools.NewKubectl(kubeconfigPath)
// 3. Get pod name
podName := ""
if opts.Container == "" {
podName, err = kubectl.GetFirstPodName(namespace)
if err != nil {
// Check if it's because there are no pods
pods, _ := kubectl.GetPods(namespace, false)
if len(pods) == 0 {
// Send a message event indicating no pods
fmt.Fprintf(writer, "data: No pods found for service. The service may not be deployed yet.\n\n")
return nil
}
return fmt.Errorf("failed to find pod: %w", err)
}
// Get first container
containers, err := kubectl.GetPodContainers(namespace, podName)
if err != nil {
return fmt.Errorf("failed to get pod containers: %w", err)
}
if len(containers) > 0 {
opts.Container = containers[0]
}
} else {
pods, err := kubectl.GetPods(namespace, false)
if err != nil {
return fmt.Errorf("failed to list pods: %w", err)
}
if len(pods) > 0 {
podName = pods[0].Name
} else {
return fmt.Errorf("no pods found in namespace %s", namespace)
}
}
// 4. Set default tail for streaming
if opts.Tail == 0 {
opts.Tail = 50
}
// 5. Stream logs
logOpts := tools.LogOptions{
Container: opts.Container,
Tail: opts.Tail,
Since: opts.Since,
}
cmd, err := kubectl.StreamLogs(namespace, podName, logOpts)
if err != nil {
return fmt.Errorf("failed to start log stream: %w", err)
}
// Get stdout pipe
stdout, err := cmd.StdoutPipe()
if err != nil {
return fmt.Errorf("failed to get stdout pipe: %w", err)
}
stderr, err := cmd.StderrPipe()
if err != nil {
return fmt.Errorf("failed to get stderr pipe: %w", err)
}
// Start command
if err := cmd.Start(); err != nil {
return fmt.Errorf("failed to start kubectl logs: %w", err)
}
// Stream logs line by line as SSE events
scanner := bufio.NewScanner(stdout)
errScanner := bufio.NewScanner(stderr)
// Channel to signal completion
done := make(chan error, 1)
// Read stderr in background
go func() {
for errScanner.Scan() {
event := contracts.ServiceLogsSSEEvent{
Type: "error",
Error: errScanner.Text(),
Container: opts.Container,
Timestamp: time.Now(),
}
_ = writeSSEEvent(writer, event)
}
}()
// Read stdout
go func() {
for scanner.Scan() {
event := contracts.ServiceLogsSSEEvent{
Type: "log",
Line: scanner.Text(),
Container: opts.Container,
Timestamp: time.Now(),
}
if err := writeSSEEvent(writer, event); err != nil {
done <- err
return
}
}
if err := scanner.Err(); err != nil {
done <- err
return
}
done <- nil
}()
// Wait for completion or error
err = <-done
_ = cmd.Process.Kill()
// Send end event
endEvent := contracts.ServiceLogsSSEEvent{
Type: "end",
Timestamp: time.Now(),
}
_ = writeSSEEvent(writer, endEvent)
return err
}
// writeSSEEvent writes an SSE event to the writer
func writeSSEEvent(w io.Writer, event contracts.ServiceLogsSSEEvent) error {
// Marshal the event to JSON safely
jsonData, err := json.Marshal(event)
if err != nil {
return fmt.Errorf("failed to marshal SSE event: %w", err)
}
// Write SSE format: "data: <json>\n\n"
data := fmt.Sprintf("data: %s\n\n", jsonData)
_, err = w.Write([]byte(data))
if err != nil {
return err
}
// Flush if writer supports it
if flusher, ok := w.(interface{ Flush() }); ok {
flusher.Flush()
}
return nil
}

@@ -37,10 +37,25 @@ func NewManager(dataDir string) *Manager {
manifest, err := setup.GetManifest(serviceName)
if err == nil {
// Convert setup.ServiceManifest to services.ServiceManifest
// Convert setup.ConfigDefinition map to services.ConfigDefinition map
serviceConfig := make(map[string]ConfigDefinition)
for key, cfg := range manifest.ServiceConfig {
serviceConfig[key] = ConfigDefinition{
Path: cfg.Path,
Prompt: cfg.Prompt,
Default: cfg.Default,
Type: cfg.Type,
}
}
manifests[serviceName] = &ServiceManifest{
- Name: manifest.Name,
- Description: manifest.Description,
- Category: manifest.Category,
+ Name: manifest.Name,
+ Description: manifest.Description,
+ Namespace: manifest.Namespace,
+ Category: manifest.Category,
+ Dependencies: manifest.Dependencies,
+ ConfigReferences: manifest.ConfigReferences,
+ ServiceConfig: serviceConfig,
}
}
}
@@ -60,6 +75,7 @@ type Service struct {
Version string `json:"version"`
Namespace string `json:"namespace"`
Dependencies []string `json:"dependencies,omitempty"`
HasConfig bool `json:"hasConfig"` // Whether service has configurable fields
}
// Base services in Wild Cloud (kept for reference/validation)
@@ -103,7 +119,8 @@ func (m *Manager) checkServiceStatus(instanceName, serviceName string) string {
// Special case: NFS doesn't have a deployment, check for StorageClass instead
if serviceName == "nfs" {
- cmd := exec.Command("kubectl", "--kubeconfig", kubeconfigPath, "get", "storageclass", "nfs", "-o", "name")
+ cmd := exec.Command("kubectl", "get", "storageclass", "nfs", "-o", "name")
+ tools.WithKubeconfig(cmd, kubeconfigPath)
if err := cmd.Run(); err == nil {
return "deployed"
}
@@ -147,12 +164,14 @@ func (m *Manager) List(instanceName string) ([]Service, error) {
// Get service info from manifest if available
var namespace, description, version string
var dependencies []string
var hasConfig bool
if manifest, ok := m.manifests[name]; ok {
namespace = manifest.Namespace
description = manifest.Description
version = manifest.Category // Using category as version for now
dependencies = manifest.Dependencies
hasConfig = len(manifest.ServiceConfig) > 0
} else {
// Fall back to hardcoded map
namespace = name + "-system" // default
@@ -168,6 +187,7 @@ func (m *Manager) List(instanceName string) ([]Service, error) {
Description: description,
Version: version,
Dependencies: dependencies,
HasConfig: hasConfig,
}
services = append(services, service)
@@ -245,7 +265,7 @@ func (m *Manager) Delete(instanceName, serviceName string) error {
}
// Get manifests file from embedded setup or instance directory
- instanceServiceDir := filepath.Join(m.dataDir, "instances", instanceName, "setup", "cluster-services", serviceName)
+ instanceServiceDir := filepath.Join(tools.GetInstancePath(m.dataDir, instanceName), "setup", "cluster-services", serviceName)
manifestsFile := filepath.Join(instanceServiceDir, "manifests.yaml")
if !storage.FileExists(manifestsFile) {
@@ -313,7 +333,7 @@ func (m *Manager) Fetch(instanceName, serviceName string) error {
}
// 2. Create instance service directory
- instanceDir := filepath.Join(m.dataDir, "instances", instanceName,
+ instanceDir := filepath.Join(tools.GetInstancePath(m.dataDir, instanceName),
"setup", "cluster-services", serviceName)
if err := os.MkdirAll(instanceDir, 0755); err != nil {
return fmt.Errorf("failed to create service directory: %w", err)
@@ -327,7 +347,7 @@ func (m *Manager) Fetch(instanceName, serviceName string) error {
// Extract README.md if it exists
if readmeData, err := setup.GetServiceFile(serviceName, "README.md"); err == nil {
- os.WriteFile(filepath.Join(instanceDir, "README.md"), readmeData, 0644)
+ _ = os.WriteFile(filepath.Join(instanceDir, "README.md"), readmeData, 0644)
}
// Extract install.sh if it exists
@@ -340,7 +360,7 @@ func (m *Manager) Fetch(instanceName, serviceName string) error {
// Extract wild-manifest.yaml
if manifestData, err := setup.GetServiceFile(serviceName, "wild-manifest.yaml"); err == nil {
- os.WriteFile(filepath.Join(instanceDir, "wild-manifest.yaml"), manifestData, 0644)
+ _ = os.WriteFile(filepath.Join(instanceDir, "wild-manifest.yaml"), manifestData, 0644)
}
// Extract kustomize.template directory
@@ -357,7 +377,7 @@ func (m *Manager) Fetch(instanceName, serviceName string) error {
// serviceFilesExist checks if service files exist in the instance
func (m *Manager) serviceFilesExist(instanceName, serviceName string) bool {
- serviceDir := filepath.Join(m.dataDir, "instances", instanceName,
+ serviceDir := filepath.Join(tools.GetInstancePath(m.dataDir, instanceName),
"setup", "cluster-services", serviceName)
installSh := filepath.Join(serviceDir, "install.sh")
return fileExists(installSh)
@@ -375,52 +395,6 @@ func dirExists(path string) bool {
return err == nil && info.IsDir()
}
- func copyFile(src, dst string) error {
- input, err := os.ReadFile(src)
- if err != nil {
- return err
- }
- return os.WriteFile(dst, input, 0644)
- }
- func copyFileIfExists(src, dst string) error {
- if !fileExists(src) {
- return nil
- }
- return copyFile(src, dst)
- }
- func copyDir(src, dst string) error {
- // Create destination directory
- if err := os.MkdirAll(dst, 0755); err != nil {
- return err
- }
- // Read source directory
- entries, err := os.ReadDir(src)
- if err != nil {
- return err
- }
- // Copy each entry
- for _, entry := range entries {
- srcPath := filepath.Join(src, entry.Name())
- dstPath := filepath.Join(dst, entry.Name())
- if entry.IsDir() {
- if err := copyDir(srcPath, dstPath); err != nil {
- return err
- }
- } else {
- if err := copyFile(srcPath, dstPath); err != nil {
- return err
- }
- }
- }
- return nil
- }
// extractFS extracts files from an fs.FS to a destination directory
func extractFS(fsys fs.FS, dst string) error {
return fs.WalkDir(fsys, ".", func(path string, d fs.DirEntry, err error) error {
@@ -449,7 +423,7 @@ func extractFS(fsys fs.FS, dst string) error {
// Compile processes gomplate templates into final Kubernetes manifests
func (m *Manager) Compile(instanceName, serviceName string) error {
- instanceDir := filepath.Join(m.dataDir, "instances", instanceName)
+ instanceDir := tools.GetInstancePath(m.dataDir, instanceName)
serviceDir := filepath.Join(instanceDir, "setup", "cluster-services", serviceName)
templateDir := filepath.Join(serviceDir, "kustomize.template")
outputDir := filepath.Join(serviceDir, "kustomize")
@@ -527,7 +501,7 @@ func (m *Manager) Compile(instanceName, serviceName string) error {
func (m *Manager) Deploy(instanceName, serviceName, opID string, broadcaster *operations.Broadcaster) error {
fmt.Printf("[DEBUG] Deploy() called for service=%s instance=%s opID=%s\n", serviceName, instanceName, opID)
- instanceDir := filepath.Join(m.dataDir, "instances", instanceName)
+ instanceDir := tools.GetInstancePath(m.dataDir, instanceName)
serviceDir := filepath.Join(instanceDir, "setup", "cluster-services", serviceName)
installScript := filepath.Join(serviceDir, "install.sh")
@@ -623,8 +597,7 @@ func (m *Manager) validateConfig(instanceName, serviceName string) error {
}
// Load instance config
- instanceDir := filepath.Join(m.dataDir, "instances", instanceName)
- configFile := filepath.Join(instanceDir, "config.yaml")
+ configFile := tools.GetInstanceConfigPath(m.dataDir, instanceName)
configData, err := os.ReadFile(configFile)
if err != nil {

internal/services/status.go Normal file
@@ -0,0 +1,146 @@
package services
import (
"fmt"
"os"
"time"
"gopkg.in/yaml.v3"
"github.com/wild-cloud/wild-central/daemon/internal/contracts"
"github.com/wild-cloud/wild-central/daemon/internal/storage"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// GetDetailedStatus returns comprehensive service status including pods and health
func (m *Manager) GetDetailedStatus(instanceName, serviceName string) (*contracts.DetailedServiceStatus, error) {
// 1. Get service manifest and namespace
manifest, err := m.GetManifest(serviceName)
if err != nil {
return nil, fmt.Errorf("service not found: %w", err)
}
namespace := manifest.Namespace
deploymentName := manifest.GetDeploymentName()
// Check hardcoded map for correct deployment name
if deployment, ok := serviceDeployments[serviceName]; ok {
namespace = deployment.namespace
deploymentName = deployment.deploymentName
}
// 2. Get kubeconfig path
kubeconfigPath := tools.GetKubeconfigPath(m.dataDir, instanceName)
if !storage.FileExists(kubeconfigPath) {
return &contracts.DetailedServiceStatus{
Name: serviceName,
Namespace: namespace,
DeploymentStatus: "NotFound",
Replicas: contracts.ReplicaStatus{},
Pods: []contracts.PodStatus{},
LastUpdated: time.Now(),
}, nil
}
kubectl := tools.NewKubectl(kubeconfigPath)
// 3. Get deployment information
deploymentInfo, err := kubectl.GetDeployment(deploymentName, namespace)
deploymentStatus := "NotFound"
replicas := contracts.ReplicaStatus{}
if err == nil {
replicas = contracts.ReplicaStatus{
Desired: deploymentInfo.Desired,
Current: deploymentInfo.Current,
Ready: deploymentInfo.Ready,
Available: deploymentInfo.Available,
}
// Determine deployment status
if deploymentInfo.Ready == deploymentInfo.Desired && deploymentInfo.Desired > 0 {
deploymentStatus = "Ready"
} else if deploymentInfo.Ready < deploymentInfo.Desired {
if deploymentInfo.Current > deploymentInfo.Desired {
deploymentStatus = "Progressing"
} else {
deploymentStatus = "Degraded"
}
} else if deploymentInfo.Desired == 0 {
deploymentStatus = "Scaled to Zero"
}
}
// 4. Get pod information
podInfos, err := kubectl.GetPods(namespace, false)
pods := make([]contracts.PodStatus, 0, len(podInfos))
if err == nil {
for _, podInfo := range podInfos {
pods = append(pods, contracts.PodStatus{
Name: podInfo.Name,
Status: podInfo.Status,
Ready: podInfo.Ready,
Restarts: podInfo.Restarts,
Age: podInfo.Age,
Node: podInfo.Node,
IP: podInfo.IP,
})
}
}
// 5. Load current config values
configPath := tools.GetInstanceConfigPath(m.dataDir, instanceName)
configValues := make(map[string]interface{})
if storage.FileExists(configPath) {
configData, err := os.ReadFile(configPath)
if err == nil {
var instanceConfig map[string]interface{}
if err := yaml.Unmarshal(configData, &instanceConfig); err == nil {
// Extract values for all config paths
for _, path := range manifest.ConfigReferences {
if value := getNestedValue(instanceConfig, path); value != nil {
configValues[path] = value
}
}
for _, cfg := range manifest.ServiceConfig {
if value := getNestedValue(instanceConfig, cfg.Path); value != nil {
configValues[cfg.Path] = value
}
}
}
}
}
// 6. Convert ServiceConfig to contracts.ConfigDefinition
contractsServiceConfig := make(map[string]contracts.ConfigDefinition)
for key, cfg := range manifest.ServiceConfig {
contractsServiceConfig[key] = contracts.ConfigDefinition{
Path: cfg.Path,
Prompt: cfg.Prompt,
Default: cfg.Default,
Type: cfg.Type,
}
}
// 7. Build detailed status response
status := &contracts.DetailedServiceStatus{
Name: serviceName,
Namespace: namespace,
DeploymentStatus: deploymentStatus,
Replicas: replicas,
Pods: pods,
Config: configValues,
Manifest: &contracts.ServiceManifest{
Name: manifest.Name,
Description: manifest.Description,
Namespace: manifest.Namespace,
ConfigReferences: manifest.ConfigReferences,
ServiceConfig: contractsServiceConfig,
},
LastUpdated: time.Now(),
}
return status, nil
}
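The replica-to-status mapping in step 3 can be isolated into a small pure function; a minimal sketch (status strings mirror the ones used above) looks like:

```go
package main

import "fmt"

// classify mirrors the deployment-status logic above: Ready when all
// desired replicas are ready, Progressing while a rollout has surplus
// pods, Degraded when short of desired, Scaled to Zero for zero desired.
func classify(desired, current, ready int32) string {
	switch {
	case desired == 0:
		return "Scaled to Zero"
	case ready == desired:
		return "Ready"
	case current > desired:
		return "Progressing"
	default:
		return "Degraded"
	}
}

func main() {
	fmt.Println(classify(3, 3, 3)) // all replicas ready
	fmt.Println(classify(3, 4, 2)) // rollout in progress
	fmt.Println(classify(3, 3, 1)) // short of desired
	fmt.Println(classify(0, 0, 0)) // intentionally scaled down
}
```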

@@ -19,11 +19,22 @@ var clusterServices = setupFS
// ServiceManifest represents the wild-manifest.yaml structure
type ServiceManifest struct {
- Name string `yaml:"name"`
- Description string `yaml:"description"`
- Version string `yaml:"version"`
- Category string `yaml:"category"`
- // Add other fields as needed from wild-manifest.yaml
+ Name string `yaml:"name"`
+ Description string `yaml:"description"`
+ Version string `yaml:"version"`
+ Category string `yaml:"category"`
+ Namespace string `yaml:"namespace"`
+ Dependencies []string `yaml:"dependencies,omitempty"`
+ ConfigReferences []string `yaml:"configReferences,omitempty"`
+ ServiceConfig map[string]ConfigDefinition `yaml:"serviceConfig,omitempty"`
}
// ConfigDefinition defines config that should be prompted during service setup
type ConfigDefinition struct {
Path string `yaml:"path"`
Prompt string `yaml:"prompt"`
Default string `yaml:"default"`
Type string `yaml:"type,omitempty"`
}
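A hypothetical wild-manifest.yaml matching the expanded struct (field names taken from the yaml tags above; the service name and all values are illustrative only, not from the repo) might look like:

```yaml
name: example-service
description: Illustrative cluster service
version: "1.0.0"
category: apps
namespace: example
dependencies:
  - postgres
configReferences:
  - cluster.domain
serviceConfig:
  storageSize:
    path: services.example.storageSize
    prompt: "How much storage should the service use?"
    default: 50Gi
    type: string
```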
// ListServices returns all available cluster services

@@ -96,7 +96,7 @@ func WithLock(lockPath string, fn func() error) error {
if err != nil {
return err
}
- defer lock.Release()
+ defer func() { _ = lock.Release() }()
return fn()
}
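The `defer func() { _ = lock.Release() }()` change explicitly discards the release error (satisfying errcheck) while keeping release-on-every-path semantics. The pattern in isolation, with a toy mutex-backed lock standing in for the file lock, is:

```go
package main

import (
	"fmt"
	"sync"
)

// toyLock is a hypothetical stand-in for the file lock; Release returns
// an error so the deferred closure has something to discard.
type toyLock struct{ mu sync.Mutex }

func (l *toyLock) Acquire() error { l.mu.Lock(); return nil }
func (l *toyLock) Release() error { l.mu.Unlock(); return nil }

// withLock mirrors the WithLock shape: acquire, defer release, run fn.
func withLock(lock *toyLock, fn func() error) error {
	if err := lock.Acquire(); err != nil {
		return err
	}
	defer func() { _ = lock.Release() }() // discard release error; fn's error wins
	return fn()
}

func main() {
	var l toyLock
	err := withLock(&l, func() error {
		fmt.Println("critical section")
		return nil
	})
	fmt.Println(err)
}
```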

@@ -35,3 +35,53 @@ func GetTalosconfigPath(dataDir, instanceName string) string {
func GetKubeconfigPath(dataDir, instanceName string) string {
return filepath.Join(dataDir, "instances", instanceName, "kubeconfig")
}
// GetInstancePath returns the path to an instance directory
func GetInstancePath(dataDir, instanceName string) string {
return filepath.Join(dataDir, "instances", instanceName)
}
// GetInstanceConfigPath returns the path to an instance's config file
func GetInstanceConfigPath(dataDir, instanceName string) string {
return filepath.Join(dataDir, "instances", instanceName, "config.yaml")
}
// GetInstanceSecretsPath returns the path to an instance's secrets file
func GetInstanceSecretsPath(dataDir, instanceName string) string {
return filepath.Join(dataDir, "instances", instanceName, "secrets.yaml")
}
// GetInstanceTalosPath returns the path to an instance's talos directory
func GetInstanceTalosPath(dataDir, instanceName string) string {
return filepath.Join(dataDir, "instances", instanceName, "talos")
}
// GetInstancePXEPath returns the path to an instance's PXE directory
func GetInstancePXEPath(dataDir, instanceName string) string {
return filepath.Join(dataDir, "instances", instanceName, "pxe")
}
// GetInstanceOperationsPath returns the path to an instance's operations directory
func GetInstanceOperationsPath(dataDir, instanceName string) string {
return filepath.Join(dataDir, "instances", instanceName, "operations")
}
// GetInstanceBackupsPath returns the path to an instance's backups directory
func GetInstanceBackupsPath(dataDir, instanceName string) string {
return filepath.Join(dataDir, "instances", instanceName, "backups")
}
// GetInstanceDiscoveryPath returns the path to an instance's discovery directory
func GetInstanceDiscoveryPath(dataDir, instanceName string) string {
return filepath.Join(dataDir, "instances", instanceName, "discovery")
}
// GetInstancesPath returns the path to the instances directory
func GetInstancesPath(dataDir string) string {
return filepath.Join(dataDir, "instances")
}
// GetInstancesLockPath returns the path to the instances directory lock file
func GetInstancesLockPath(dataDir string) string {
return filepath.Join(dataDir, "instances", ".lock")
}

@@ -1,10 +1,16 @@
package tools
import (
"encoding/json"
"fmt"
"os/exec"
"sort"
"strconv"
"strings"
"time"
)
- // Kubectl provides a thin wrapper around the kubectl command-line tool
+ // Kubectl provides a comprehensive wrapper around the kubectl command-line tool
type Kubectl struct {
kubeconfigPath string
}
@@ -16,6 +22,115 @@ func NewKubectl(kubeconfigPath string) *Kubectl {
}
}
// Pod Information Structures
// PodInfo represents pod information from kubectl
type PodInfo struct {
Name string `json:"name"`
Status string `json:"status"`
Ready string `json:"ready"`
Restarts int `json:"restarts"`
Age string `json:"age"`
Node string `json:"node,omitempty"`
IP string `json:"ip,omitempty"`
Containers []ContainerInfo `json:"containers,omitempty"`
Conditions []PodCondition `json:"conditions,omitempty"`
}
// ContainerInfo represents detailed container information
type ContainerInfo struct {
Name string `json:"name"`
Image string `json:"image"`
Ready bool `json:"ready"`
RestartCount int `json:"restartCount"`
State ContainerState `json:"state"`
}
// ContainerState represents the state of a container
type ContainerState struct {
Status string `json:"status"`
Reason string `json:"reason,omitempty"`
Message string `json:"message,omitempty"`
Since time.Time `json:"since,omitempty"`
}
// PodCondition represents a pod condition
type PodCondition struct {
Type string `json:"type"`
Status string `json:"status"`
Reason string `json:"reason,omitempty"`
Message string `json:"message,omitempty"`
Since time.Time `json:"since,omitempty"`
}
// Deployment Information Structures
// DeploymentInfo represents deployment information
type DeploymentInfo struct {
Desired int32 `json:"desired"`
Current int32 `json:"current"`
Ready int32 `json:"ready"`
Available int32 `json:"available"`
}
// ReplicaInfo represents aggregated replica information
type ReplicaInfo struct {
Desired int `json:"desired"`
Current int `json:"current"`
Ready int `json:"ready"`
Available int `json:"available"`
}
// Resource Information Structures
// ResourceMetric represents resource usage for a specific resource type
type ResourceMetric struct {
Used string `json:"used"`
Requested string `json:"requested"`
Limit string `json:"limit"`
Percentage float64 `json:"percentage"`
}
// ResourceUsage represents aggregated resource usage
type ResourceUsage struct {
CPU *ResourceMetric `json:"cpu,omitempty"`
Memory *ResourceMetric `json:"memory,omitempty"`
Storage *ResourceMetric `json:"storage,omitempty"`
}
// Event Information Structures
// KubernetesEvent represents a Kubernetes event
type KubernetesEvent struct {
Type string `json:"type"`
Reason string `json:"reason"`
Message string `json:"message"`
Count int `json:"count"`
FirstSeen time.Time `json:"firstSeen"`
LastSeen time.Time `json:"lastSeen"`
Object string `json:"object"`
}
// Logging Structures
// LogOptions configures log retrieval
type LogOptions struct {
Container string
Tail int
Previous bool
Since string
SinceSeconds int
}
// LogEntry represents a structured log entry
type LogEntry struct {
Timestamp time.Time `json:"timestamp"`
Message string `json:"message"`
Pod string `json:"pod"`
}
// Pod Operations
// DeploymentExists checks if a deployment exists in the specified namespace
func (k *Kubectl) DeploymentExists(name, namespace string) bool {
args := []string{
@@ -31,3 +146,594 @@ func (k *Kubectl) DeploymentExists(name, namespace string) bool {
err := cmd.Run()
return err == nil
}
// GetPods retrieves pod information for a namespace
// If detailed is true, includes containers and conditions
func (k *Kubectl) GetPods(namespace string, detailed bool) ([]PodInfo, error) {
args := []string{
"get", "pods",
"-n", namespace,
"-o", "json",
}
if k.kubeconfigPath != "" {
args = append([]string{"--kubeconfig", k.kubeconfigPath}, args...)
}
cmd := exec.Command("kubectl", args...)
output, err := cmd.Output()
if err != nil {
return nil, fmt.Errorf("failed to get pods: %w", err)
}
var podList struct {
Items []struct {
Metadata struct {
Name string `json:"name"`
CreationTimestamp time.Time `json:"creationTimestamp"`
} `json:"metadata"`
Spec struct {
NodeName string `json:"nodeName"`
Containers []struct {
Name string `json:"name"`
Image string `json:"image"`
} `json:"containers"`
} `json:"spec"`
Status struct {
Phase string `json:"phase"`
PodIP string `json:"podIP"`
Conditions []struct {
Type string `json:"type"`
Status string `json:"status"`
LastTransitionTime time.Time `json:"lastTransitionTime"`
Reason string `json:"reason"`
Message string `json:"message"`
} `json:"conditions"`
ContainerStatuses []struct {
Name string `json:"name"`
Image string `json:"image"`
Ready bool `json:"ready"`
RestartCount int `json:"restartCount"`
State struct {
Running *struct{ StartedAt time.Time } `json:"running,omitempty"`
Waiting *struct{ Reason, Message string } `json:"waiting,omitempty"`
Terminated *struct {
Reason string
Message string
FinishedAt time.Time
} `json:"terminated,omitempty"`
} `json:"state"`
} `json:"containerStatuses"`
} `json:"status"`
} `json:"items"`
}
if err := json.Unmarshal(output, &podList); err != nil {
return nil, fmt.Errorf("failed to parse pod list: %w", err)
}
pods := make([]PodInfo, 0, len(podList.Items))
for _, pod := range podList.Items {
// Calculate ready containers
readyCount := 0
totalCount := len(pod.Status.ContainerStatuses)
totalRestarts := 0
for _, cs := range pod.Status.ContainerStatuses {
if cs.Ready {
readyCount++
}
totalRestarts += cs.RestartCount
}
// Ensure status is never empty
status := pod.Status.Phase
if status == "" {
status = "Unknown"
}
podInfo := PodInfo{
Name: pod.Metadata.Name,
Status: status,
Ready: fmt.Sprintf("%d/%d", readyCount, totalCount),
Restarts: totalRestarts,
Age: formatAge(time.Since(pod.Metadata.CreationTimestamp)),
Node: pod.Spec.NodeName,
IP: pod.Status.PodIP,
}
// Include detailed information if requested
if detailed {
// Add container details
containers := make([]ContainerInfo, 0, len(pod.Status.ContainerStatuses))
for _, cs := range pod.Status.ContainerStatuses {
containerState := ContainerState{Status: "unknown"}
if cs.State.Running != nil {
containerState.Status = "running"
containerState.Since = cs.State.Running.StartedAt
} else if cs.State.Waiting != nil {
containerState.Status = "waiting"
containerState.Reason = cs.State.Waiting.Reason
containerState.Message = cs.State.Waiting.Message
} else if cs.State.Terminated != nil {
containerState.Status = "terminated"
containerState.Reason = cs.State.Terminated.Reason
containerState.Message = cs.State.Terminated.Message
containerState.Since = cs.State.Terminated.FinishedAt
}
containers = append(containers, ContainerInfo{
Name: cs.Name,
Image: cs.Image,
Ready: cs.Ready,
RestartCount: cs.RestartCount,
State: containerState,
})
}
podInfo.Containers = containers
// Add condition details
conditions := make([]PodCondition, 0, len(pod.Status.Conditions))
for _, cond := range pod.Status.Conditions {
conditions = append(conditions, PodCondition{
Type: cond.Type,
Status: cond.Status,
Reason: cond.Reason,
Message: cond.Message,
Since: cond.LastTransitionTime,
})
}
podInfo.Conditions = conditions
}
pods = append(pods, podInfo)
}
return pods, nil
}
// GetFirstPodName returns the name of the first pod in a namespace
func (k *Kubectl) GetFirstPodName(namespace string) (string, error) {
pods, err := k.GetPods(namespace, false)
if err != nil {
return "", err
}
if len(pods) == 0 {
return "", fmt.Errorf("no pods found in namespace %s", namespace)
}
return pods[0].Name, nil
}
// GetPodContainers returns container names for a pod
func (k *Kubectl) GetPodContainers(namespace, podName string) ([]string, error) {
args := []string{
"get", "pod", podName,
"-n", namespace,
"-o", "jsonpath={.spec.containers[*].name}",
}
if k.kubeconfigPath != "" {
args = append([]string{"--kubeconfig", k.kubeconfigPath}, args...)
}
cmd := exec.Command("kubectl", args...)
output, err := cmd.Output()
if err != nil {
return nil, fmt.Errorf("failed to get pod containers: %w", err)
}
containerNames := strings.Fields(string(output))
return containerNames, nil
}
// Deployment Operations
// GetDeployment retrieves deployment information
func (k *Kubectl) GetDeployment(name, namespace string) (*DeploymentInfo, error) {
args := []string{
"get", "deployment", name,
"-n", namespace,
"-o", "json",
}
if k.kubeconfigPath != "" {
args = append([]string{"--kubeconfig", k.kubeconfigPath}, args...)
}
cmd := exec.Command("kubectl", args...)
output, err := cmd.Output()
if err != nil {
return nil, fmt.Errorf("failed to get deployment: %w", err)
}
var deployment struct {
Status struct {
Replicas int32 `json:"replicas"`
UpdatedReplicas int32 `json:"updatedReplicas"`
ReadyReplicas int32 `json:"readyReplicas"`
AvailableReplicas int32 `json:"availableReplicas"`
} `json:"status"`
Spec struct {
Replicas int32 `json:"replicas"`
} `json:"spec"`
}
if err := json.Unmarshal(output, &deployment); err != nil {
return nil, fmt.Errorf("failed to parse deployment: %w", err)
}
return &DeploymentInfo{
Desired: deployment.Spec.Replicas,
Current: deployment.Status.Replicas,
Ready: deployment.Status.ReadyReplicas,
Available: deployment.Status.AvailableReplicas,
}, nil
}
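The anonymous struct above pulls only the replica counters out of the deployment JSON. Feeding the same decoding logic an abridged `kubectl get deployment -o json` fragment (hypothetical values) shows the mapping:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// deploymentInfo mirrors the DeploymentInfo fields above.
type deploymentInfo struct {
	Desired, Current, Ready, Available int32
}

// parseDeployment decodes the subset of deployment JSON that GetDeployment reads.
func parseDeployment(raw []byte) (*deploymentInfo, error) {
	var d struct {
		Spec struct {
			Replicas int32 `json:"replicas"`
		} `json:"spec"`
		Status struct {
			Replicas          int32 `json:"replicas"`
			ReadyReplicas     int32 `json:"readyReplicas"`
			AvailableReplicas int32 `json:"availableReplicas"`
		} `json:"status"`
	}
	if err := json.Unmarshal(raw, &d); err != nil {
		return nil, err
	}
	return &deploymentInfo{
		Desired:   d.Spec.Replicas,
		Current:   d.Status.Replicas,
		Ready:     d.Status.ReadyReplicas,
		Available: d.Status.AvailableReplicas,
	}, nil
}

func main() {
	raw := []byte(`{"spec":{"replicas":3},"status":{"replicas":3,"readyReplicas":2,"availableReplicas":2}}`)
	info, err := parseDeployment(raw)
	fmt.Println(info, err)
}
```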
// GetReplicas retrieves aggregated replica information for a namespace
func (k *Kubectl) GetReplicas(namespace string) (*ReplicaInfo, error) {
info := &ReplicaInfo{}
// Get deployments
deployCmd := exec.Command("kubectl", "get", "deployments", "-n", namespace, "-o", "json")
WithKubeconfig(deployCmd, k.kubeconfigPath)
deployOutput, err := deployCmd.Output()
if err == nil {
var deployList struct {
Items []struct {
Spec struct {
Replicas int `json:"replicas"`
} `json:"spec"`
Status struct {
Replicas int `json:"replicas"`
ReadyReplicas int `json:"readyReplicas"`
AvailableReplicas int `json:"availableReplicas"`
} `json:"status"`
} `json:"items"`
}
if json.Unmarshal(deployOutput, &deployList) == nil {
for _, deploy := range deployList.Items {
info.Desired += deploy.Spec.Replicas
info.Current += deploy.Status.Replicas
info.Ready += deploy.Status.ReadyReplicas
info.Available += deploy.Status.AvailableReplicas
}
}
}
// Get statefulsets
stsCmd := exec.Command("kubectl", "get", "statefulsets", "-n", namespace, "-o", "json")
WithKubeconfig(stsCmd, k.kubeconfigPath)
stsOutput, err := stsCmd.Output()
if err == nil {
var stsList struct {
Items []struct {
Spec struct {
Replicas int `json:"replicas"`
} `json:"spec"`
Status struct {
Replicas int `json:"replicas"`
ReadyReplicas int `json:"readyReplicas"`
} `json:"status"`
} `json:"items"`
}
if json.Unmarshal(stsOutput, &stsList) == nil {
for _, sts := range stsList.Items {
info.Desired += sts.Spec.Replicas
info.Current += sts.Status.Replicas
info.Ready += sts.Status.ReadyReplicas
// StatefulSets don't have availableReplicas, use ready as proxy
info.Available += sts.Status.ReadyReplicas
}
}
}
return info, nil
}
// Resource Monitoring
// GetResources retrieves aggregated resource usage for a namespace
func (k *Kubectl) GetResources(namespace string) (*ResourceUsage, error) {
cmd := exec.Command("kubectl", "get", "pods", "-n", namespace, "-o", "json")
WithKubeconfig(cmd, k.kubeconfigPath)
output, err := cmd.Output()
if err != nil {
return nil, fmt.Errorf("failed to get pods: %w", err)
}
var podList struct {
Items []struct {
Spec struct {
Containers []struct {
Resources struct {
Requests map[string]string `json:"requests,omitempty"`
Limits map[string]string `json:"limits,omitempty"`
} `json:"resources"`
} `json:"containers"`
} `json:"spec"`
} `json:"items"`
}
if err := json.Unmarshal(output, &podList); err != nil {
return nil, fmt.Errorf("failed to parse pod list: %w", err)
}
// Aggregate resources
cpuRequests := int64(0)
cpuLimits := int64(0)
memRequests := int64(0)
memLimits := int64(0)
for _, pod := range podList.Items {
for _, container := range pod.Spec.Containers {
if req, ok := container.Resources.Requests["cpu"]; ok {
cpuRequests += parseResourceQuantity(req)
}
if lim, ok := container.Resources.Limits["cpu"]; ok {
cpuLimits += parseResourceQuantity(lim)
}
if req, ok := container.Resources.Requests["memory"]; ok {
memRequests += parseResourceQuantity(req)
}
if lim, ok := container.Resources.Limits["memory"]; ok {
memLimits += parseResourceQuantity(lim)
}
}
}
// Build resource usage with metrics
usage := &ResourceUsage{}
// CPU metrics (if any resources defined)
if cpuRequests > 0 || cpuLimits > 0 {
cpuUsed := cpuRequests // Approximate "used" as requests for now
cpuPercentage := 0.0
if cpuLimits > 0 {
cpuPercentage = float64(cpuUsed) / float64(cpuLimits) * 100
}
usage.CPU = &ResourceMetric{
Used: formatCPU(cpuUsed),
Requested: formatCPU(cpuRequests),
Limit: formatCPU(cpuLimits),
Percentage: cpuPercentage,
}
}
// Memory metrics (if any resources defined)
if memRequests > 0 || memLimits > 0 {
memUsed := memRequests // Approximate "used" as requests for now
memPercentage := 0.0
if memLimits > 0 {
memPercentage = float64(memUsed) / float64(memLimits) * 100
}
usage.Memory = &ResourceMetric{
Used: formatMemory(memUsed),
Requested: formatMemory(memRequests),
Limit: formatMemory(memLimits),
Percentage: memPercentage,
}
}
return usage, nil
}
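The percentage math above depends on `parseResourceQuantity` normalizing quantities to a common scale (millicores for "m"-suffixed CPU values, bytes for memory). A trimmed standalone copy of its core cases behaves like:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseQuantity is a trimmed copy of parseResourceQuantity: "m" suffixes
// are CPU millicores, binary (Ki/Mi/Gi/Ti) and decimal (K/M/G/T) suffixes
// are byte multipliers, and plain numbers pass through unchanged.
func parseQuantity(q string) int64 {
	q = strings.TrimSpace(q)
	if q == "" {
		return 0
	}
	if strings.HasSuffix(q, "m") { // CPU millicores
		v, _ := strconv.ParseInt(strings.TrimSuffix(q, "m"), 10, 64)
		return v
	}
	multipliers := map[string]int64{
		"Ki": 1 << 10, "Mi": 1 << 20, "Gi": 1 << 30, "Ti": 1 << 40,
		"K": 1e3, "M": 1e6, "G": 1e9, "T": 1e12,
	}
	for suffix, mult := range multipliers {
		if strings.HasSuffix(q, suffix) {
			v, _ := strconv.ParseInt(strings.TrimSuffix(q, suffix), 10, 64)
			return v * mult
		}
	}
	v, _ := strconv.ParseInt(q, 10, 64)
	return v
}

func main() {
	fmt.Println(parseQuantity("500m"), parseQuantity("128Mi"), parseQuantity("2"))
}
```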
// GetRecentEvents retrieves recent events for a namespace
func (k *Kubectl) GetRecentEvents(namespace string, limit int) ([]KubernetesEvent, error) {
cmd := exec.Command("kubectl", "get", "events", "-n", namespace,
"--sort-by=.lastTimestamp", "-o", "json")
WithKubeconfig(cmd, k.kubeconfigPath)
output, err := cmd.Output()
if err != nil {
return nil, fmt.Errorf("failed to get events: %w", err)
}
var eventList struct {
Items []struct {
Type string `json:"type"`
Reason string `json:"reason"`
Message string `json:"message"`
Count int `json:"count"`
FirstTimestamp time.Time `json:"firstTimestamp"`
LastTimestamp time.Time `json:"lastTimestamp"`
InvolvedObject struct {
Kind string `json:"kind"`
Name string `json:"name"`
} `json:"involvedObject"`
} `json:"items"`
}
if err := json.Unmarshal(output, &eventList); err != nil {
return nil, fmt.Errorf("failed to parse events: %w", err)
}
// Sort by last timestamp (most recent first)
sort.Slice(eventList.Items, func(i, j int) bool {
return eventList.Items[i].LastTimestamp.After(eventList.Items[j].LastTimestamp)
})
// Limit results
if limit > 0 && len(eventList.Items) > limit {
eventList.Items = eventList.Items[:limit]
}
events := make([]KubernetesEvent, 0, len(eventList.Items))
for _, event := range eventList.Items {
events = append(events, KubernetesEvent{
Type: event.Type,
Reason: event.Reason,
Message: event.Message,
Count: event.Count,
FirstSeen: event.FirstTimestamp,
LastSeen: event.LastTimestamp,
Object: fmt.Sprintf("%s/%s", event.InvolvedObject.Kind, event.InvolvedObject.Name),
})
}
return events, nil
}
// Logging Operations
// GetLogs retrieves logs from a pod
func (k *Kubectl) GetLogs(namespace, podName string, opts LogOptions) ([]LogEntry, error) {
args := []string{"logs", podName, "-n", namespace}
if opts.Container != "" {
args = append(args, "-c", opts.Container)
}
if opts.Tail > 0 {
args = append(args, "--tail", strconv.Itoa(opts.Tail))
}
if opts.SinceSeconds > 0 {
args = append(args, "--since", fmt.Sprintf("%ds", opts.SinceSeconds))
} else if opts.Since != "" {
args = append(args, "--since", opts.Since)
}
if opts.Previous {
args = append(args, "--previous")
}
if k.kubeconfigPath != "" {
args = append([]string{"--kubeconfig", k.kubeconfigPath}, args...)
}
cmd := exec.Command("kubectl", args...)
output, err := cmd.Output()
if err != nil {
return nil, fmt.Errorf("failed to get logs: %w", err)
}
lines := strings.Split(string(output), "\n")
entries := make([]LogEntry, 0, len(lines))
for _, line := range lines {
if line == "" {
continue
}
entries = append(entries, LogEntry{
Timestamp: time.Now(), // Best effort: logs are fetched without --timestamps, so no per-line timestamp is available
Message: line,
Pod: podName,
})
}
return entries, nil
}
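The flag assembly in GetLogs can be factored into a pure helper, which makes its precedence rule (SinceSeconds wins over the free-form Since string) easy to verify. `buildLogArgs` is a hypothetical name, not part of the code above:

```go
package main

import (
	"fmt"
	"strconv"
)

// LogOptions mirrors the fields GetLogs consumes.
type LogOptions struct {
	Container    string
	Tail         int
	SinceSeconds int
	Since        string
	Previous     bool
}

// buildLogArgs assembles kubectl args the same way GetLogs does:
// SinceSeconds takes precedence over the free-form Since string.
func buildLogArgs(namespace, podName string, opts LogOptions) []string {
	args := []string{"logs", podName, "-n", namespace}
	if opts.Container != "" {
		args = append(args, "-c", opts.Container)
	}
	if opts.Tail > 0 {
		args = append(args, "--tail", strconv.Itoa(opts.Tail))
	}
	if opts.SinceSeconds > 0 {
		args = append(args, "--since", fmt.Sprintf("%ds", opts.SinceSeconds))
	} else if opts.Since != "" {
		args = append(args, "--since", opts.Since)
	}
	if opts.Previous {
		args = append(args, "--previous")
	}
	return args
}

func main() {
	args := buildLogArgs("default", "web-0", LogOptions{Tail: 50, SinceSeconds: 300, Since: "1h"})
	fmt.Println(args) // [logs web-0 -n default --tail 50 --since 300s]
}
```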
// StreamLogs streams logs from a pod
func (k *Kubectl) StreamLogs(namespace, podName string, opts LogOptions) (*exec.Cmd, error) {
args := []string{
"logs", podName,
"-n", namespace,
"-f", // follow
}
if opts.Container != "" {
args = append(args, "-c", opts.Container)
}
if opts.Tail > 0 {
args = append(args, "--tail", fmt.Sprintf("%d", opts.Tail))
}
if opts.Since != "" {
args = append(args, "--since", opts.Since)
}
if k.kubeconfigPath != "" {
args = append([]string{"--kubeconfig", k.kubeconfigPath}, args...)
}
cmd := exec.Command("kubectl", args...)
return cmd, nil
}
// Helper Functions
// formatAge converts a duration to a human-readable age string
func formatAge(d time.Duration) string {
if d < time.Minute {
return fmt.Sprintf("%ds", int(d.Seconds()))
}
if d < time.Hour {
return fmt.Sprintf("%dm", int(d.Minutes()))
}
if d < 24*time.Hour {
return fmt.Sprintf("%dh", int(d.Hours()))
}
return fmt.Sprintf("%dd", int(d.Hours()/24))
}
// parseResourceQuantity converts Kubernetes resource quantities to millicores (CPU "m" suffix) or bytes (memory suffixes).
// Note: a bare number is returned unchanged, so a plain CPU value like "2" means cores, not millicores,
// and decimal quantities such as "1.5Gi" are not handled.
func parseResourceQuantity(quantity string) int64 {
quantity = strings.TrimSpace(quantity)
if quantity == "" {
return 0
}
// Handle CPU (cores)
if strings.HasSuffix(quantity, "m") {
val, _ := strconv.ParseInt(strings.TrimSuffix(quantity, "m"), 10, 64)
return val
}
// Handle memory (bytes)
multipliers := map[string]int64{
"Ki": 1024,
"Mi": 1024 * 1024,
"Gi": 1024 * 1024 * 1024,
"Ti": 1024 * 1024 * 1024 * 1024,
"K": 1000,
"M": 1000 * 1000,
"G": 1000 * 1000 * 1000,
"T": 1000 * 1000 * 1000 * 1000,
}
for suffix, mult := range multipliers {
if strings.HasSuffix(quantity, suffix) {
val, _ := strconv.ParseInt(strings.TrimSuffix(quantity, suffix), 10, 64)
return val * mult
}
}
// Plain number
val, _ := strconv.ParseInt(quantity, 10, 64)
return val
}
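The suffix handling above is deterministic despite the map iteration (no quantity string matches two suffixes, since binary suffixes end in "i"). A standalone copy, trimmed to a few suffixes, with spot checks:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseResourceQuantity, copied from above (suffix table trimmed):
// "m" means millicores, Ki/Mi/Gi are binary byte multipliers,
// K/M/G are decimal, and a bare number is returned unchanged.
func parseResourceQuantity(quantity string) int64 {
	quantity = strings.TrimSpace(quantity)
	if quantity == "" {
		return 0
	}
	if strings.HasSuffix(quantity, "m") {
		val, _ := strconv.ParseInt(strings.TrimSuffix(quantity, "m"), 10, 64)
		return val
	}
	multipliers := map[string]int64{
		"Ki": 1024, "Mi": 1024 * 1024, "Gi": 1024 * 1024 * 1024,
		"K": 1000, "M": 1000 * 1000, "G": 1000 * 1000 * 1000,
	}
	for suffix, mult := range multipliers {
		if strings.HasSuffix(quantity, suffix) {
			val, _ := strconv.ParseInt(strings.TrimSuffix(quantity, suffix), 10, 64)
			return val * mult
		}
	}
	val, _ := strconv.ParseInt(quantity, 10, 64)
	return val
}

func main() {
	fmt.Println(parseResourceQuantity("500m"), parseResourceQuantity("128Mi"), parseResourceQuantity("1G"))
	// 500 134217728 1000000000
}
```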
// formatCPU formats millicores to human-readable format
func formatCPU(millicores int64) string {
if millicores == 0 {
return "0"
}
if millicores < 1000 {
return fmt.Sprintf("%dm", millicores)
}
return fmt.Sprintf("%.1f", float64(millicores)/1000.0)
}
// formatMemory formats bytes to human-readable format
func formatMemory(bytes int64) string {
if bytes == 0 {
return "0"
}
const unit = 1024
if bytes < unit {
return fmt.Sprintf("%dB", bytes)
}
div, exp := int64(unit), 0
for n := bytes / unit; n >= unit; n /= unit {
div *= unit
exp++
}
// Include Pi/Ei so very large int64 values can't index past the slice
units := []string{"Ki", "Mi", "Gi", "Ti", "Pi", "Ei"}
return fmt.Sprintf("%.1f%s", float64(bytes)/float64(div), units[exp])
}
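formatMemory picks the largest power-of-1024 unit below the value. A standalone copy (with the unit slice extended through Ei so huge int64 values can't index out of range) and a few spot checks:

```go
package main

import "fmt"

// formatMemory, copied from above: walks up powers of 1024 and
// formats with one decimal place and a binary unit suffix.
func formatMemory(bytes int64) string {
	if bytes == 0 {
		return "0"
	}
	const unit = 1024
	if bytes < unit {
		return fmt.Sprintf("%dB", bytes)
	}
	div, exp := int64(unit), 0
	for n := bytes / unit; n >= unit; n /= unit {
		div *= unit
		exp++
	}
	// Pi/Ei included defensively; int64 tops out around 8 Ei.
	units := []string{"Ki", "Mi", "Gi", "Ti", "Pi", "Ei"}
	return fmt.Sprintf("%.1f%s", float64(bytes)/float64(div), units[exp])
}

func main() {
	fmt.Println(formatMemory(512), formatMemory(1536), formatMemory(3*1024*1024))
	// 512B 1.5Ki 3.0Mi
}
```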


@@ -1,10 +1,12 @@
package tools
import (
"context"
"encoding/json"
"fmt"
"os/exec"
"strings"
"time"
)
// Talosctl provides a thin wrapper around the talosctl command-line tool
@@ -92,8 +94,11 @@ func (t *Talosctl) GetDisks(nodeIP string, insecure bool) ([]DiskInfo, error) {
args = append(args, "--insecure")
}
// Build args with talosconfig if available
finalArgs := t.buildArgs(args)
// Use jq to slurp the NDJSON into an array (like v.PoC does with jq -s)
talosCmd := exec.Command("talosctl", args...)
talosCmd := exec.Command("talosctl", finalArgs...)
jqCmd := exec.Command("jq", "-s", ".")
// Pipe talosctl output to jq
@@ -159,10 +164,10 @@ func (t *Talosctl) GetDisks(nodeIP string, insecure bool) ([]DiskInfo, error) {
return disks, nil
}
// GetLinks queries network interfaces from a node
func (t *Talosctl) GetLinks(nodeIP string, insecure bool) ([]map[string]interface{}, error) {
// getResourceJSON executes a talosctl get command and returns parsed JSON array
func (t *Talosctl) getResourceJSON(resourceType, nodeIP string, insecure bool) ([]map[string]interface{}, error) {
args := []string{
"get", "links",
"get", resourceType,
"--nodes", nodeIP,
"-o", "json",
}
@@ -171,8 +176,11 @@ func (t *Talosctl) GetLinks(nodeIP string, insecure bool) ([]map[string]interfac
args = append(args, "--insecure")
}
// Use jq to slurp the NDJSON into an array (like v.PoC does with jq -s)
talosCmd := exec.Command("talosctl", args...)
// Build args with talosconfig if available
finalArgs := t.buildArgs(args)
// Use jq to slurp the NDJSON into an array
talosCmd := exec.Command("talosctl", finalArgs...)
jqCmd := exec.Command("jq", "-s", ".")
// Pipe talosctl output to jq
@@ -184,59 +192,29 @@ func (t *Talosctl) GetLinks(nodeIP string, insecure bool) ([]map[string]interfac
output, err := jqCmd.CombinedOutput()
if err != nil {
return nil, fmt.Errorf("failed to process links JSON: %w\nOutput: %s", err, string(output))
return nil, fmt.Errorf("failed to process %s JSON: %w\nOutput: %s", resourceType, err, string(output))
}
if err := talosCmd.Wait(); err != nil {
return nil, fmt.Errorf("talosctl get links failed: %w", err)
return nil, fmt.Errorf("talosctl get %s failed: %w", resourceType, err)
}
var result []map[string]interface{}
if err := json.Unmarshal(output, &result); err != nil {
return nil, fmt.Errorf("failed to parse links JSON: %w", err)
return nil, fmt.Errorf("failed to parse %s JSON: %w", resourceType, err)
}
return result, nil
}
// GetLinks queries network interfaces from a node
func (t *Talosctl) GetLinks(nodeIP string, insecure bool) ([]map[string]interface{}, error) {
return t.getResourceJSON("links", nodeIP, insecure)
}
// GetRoutes queries routing table from a node
func (t *Talosctl) GetRoutes(nodeIP string, insecure bool) ([]map[string]interface{}, error) {
args := []string{
"get", "routes",
"--nodes", nodeIP,
"-o", "json",
}
if insecure {
args = append(args, "--insecure")
}
// Use jq to slurp the NDJSON into an array (like v.PoC does with jq -s)
talosCmd := exec.Command("talosctl", args...)
jqCmd := exec.Command("jq", "-s", ".")
// Pipe talosctl output to jq
jqCmd.Stdin, _ = talosCmd.StdoutPipe()
if err := talosCmd.Start(); err != nil {
return nil, fmt.Errorf("failed to start talosctl: %w", err)
}
output, err := jqCmd.CombinedOutput()
if err != nil {
return nil, fmt.Errorf("failed to process routes JSON: %w\nOutput: %s", err, string(output))
}
if err := talosCmd.Wait(); err != nil {
return nil, fmt.Errorf("talosctl get routes failed: %w", err)
}
var result []map[string]interface{}
if err := json.Unmarshal(output, &result); err != nil {
return nil, fmt.Errorf("failed to parse routes JSON: %w", err)
}
return result, nil
return t.getResourceJSON("routes", nodeIP, insecure)
}
// GetDefaultInterface finds the interface with the default route
@@ -310,20 +288,45 @@ func (t *Talosctl) GetPhysicalInterface(nodeIP string, insecure bool) (string, e
// GetVersion gets Talos version from a node
func (t *Talosctl) GetVersion(nodeIP string, insecure bool) (string, error) {
args := t.buildArgs([]string{
"version",
"--nodes", nodeIP,
"--short",
})
var args []string
// When using insecure mode (for maintenance mode nodes), don't use talosconfig
// Insecure mode is for unconfigured nodes that don't have authentication set up
if insecure {
args = append(args, "--insecure")
args = []string{
"version",
"--nodes", nodeIP,
"--short",
"--insecure",
}
} else {
// For configured nodes, use talosconfig if available
args = t.buildArgs([]string{
"version",
"--nodes", nodeIP,
"--short",
})
}
cmd := exec.Command("talosctl", args...)
// Use context with timeout to prevent hanging on unreachable nodes
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
cmd := exec.CommandContext(ctx, "talosctl", args...)
output, err := cmd.CombinedOutput()
outputStr := string(output)
// Special case: In maintenance mode, talosctl version returns an error
// "API is not implemented in maintenance mode" but this means the node IS reachable
// and IS in maintenance mode, so we treat this as a success
if err != nil && strings.Contains(outputStr, "API is not implemented in maintenance mode") {
// We can't query the server version in maintenance mode,
// so report the sentinel value "maintenance" instead
return "maintenance", nil
}
if err != nil {
return "", fmt.Errorf("talosctl version failed: %w\nOutput: %s", err, string(output))
return "", fmt.Errorf("talosctl version failed: %w\nOutput: %s", err, outputStr)
}
// Parse output to extract server version


@@ -7,6 +7,8 @@ import (
"fmt"
"os/exec"
"strings"
"github.com/wild-cloud/wild-central/daemon/internal/tools"
)
// HealthStatus represents cluster health information
@@ -38,7 +40,7 @@ func GetClusterHealth(kubeconfigPath string) (*HealthStatus, error) {
}
// Check MetalLB
if err := checkComponent(kubeconfigPath, "MetalLB", "metallb-system", "app=metallb"); err != nil {
if err := checkComponent(kubeconfigPath, "metallb-system", "app=metallb"); err != nil {
status.Components["metallb"] = "unhealthy"
status.Issues = append(status.Issues, fmt.Sprintf("MetalLB: %v", err))
status.Overall = "degraded"
@@ -47,7 +49,7 @@ func GetClusterHealth(kubeconfigPath string) (*HealthStatus, error) {
}
// Check Traefik
if err := checkComponent(kubeconfigPath, "Traefik", "traefik", "app.kubernetes.io/name=traefik"); err != nil {
if err := checkComponent(kubeconfigPath, "traefik", "app.kubernetes.io/name=traefik"); err != nil {
status.Components["traefik"] = "unhealthy"
status.Issues = append(status.Issues, fmt.Sprintf("Traefik: %v", err))
status.Overall = "degraded"
@@ -56,7 +58,7 @@ func GetClusterHealth(kubeconfigPath string) (*HealthStatus, error) {
}
// Check cert-manager
if err := checkComponent(kubeconfigPath, "cert-manager", "cert-manager", "app.kubernetes.io/instance=cert-manager"); err != nil {
if err := checkComponent(kubeconfigPath, "cert-manager", "app.kubernetes.io/instance=cert-manager"); err != nil {
status.Components["cert-manager"] = "unhealthy"
status.Issues = append(status.Issues, fmt.Sprintf("cert-manager: %v", err))
status.Overall = "degraded"
@@ -65,7 +67,7 @@ func GetClusterHealth(kubeconfigPath string) (*HealthStatus, error) {
}
// Check Longhorn
if err := checkComponent(kubeconfigPath, "Longhorn", "longhorn-system", "app=longhorn-manager"); err != nil {
if err := checkComponent(kubeconfigPath, "longhorn-system", "app=longhorn-manager"); err != nil {
status.Components["longhorn"] = "unhealthy"
status.Issues = append(status.Issues, fmt.Sprintf("Longhorn: %v", err))
status.Overall = "degraded"
@@ -81,13 +83,9 @@ func GetClusterHealth(kubeconfigPath string) (*HealthStatus, error) {
}
// checkComponent checks if a component is running
func checkComponent(kubeconfigPath, name, namespace, selector string) error {
args := []string{"get", "pods", "-n", namespace, "-l", selector, "-o", "json"}
if kubeconfigPath != "" {
args = append([]string{"--kubeconfig", kubeconfigPath}, args...)
}
cmd := exec.Command("kubectl", args...)
func checkComponent(kubeconfigPath, namespace, selector string) error {
cmd := exec.Command("kubectl", "get", "pods", "-n", namespace, "-l", selector, "-o", "json")
tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.Output()
if err != nil {
return fmt.Errorf("failed to get pods: %w", err)
@@ -127,15 +125,17 @@ func checkComponent(kubeconfigPath, name, namespace, selector string) error {
}
// GetDashboardToken retrieves or creates a Kubernetes dashboard token
func GetDashboardToken() (*DashboardToken, error) {
func GetDashboardToken(kubeconfigPath string) (*DashboardToken, error) {
// Check if service account exists
cmd := exec.Command("kubectl", "get", "serviceaccount", "-n", "kubernetes-dashboard", "dashboard-admin")
tools.WithKubeconfig(cmd, kubeconfigPath)
if err := cmd.Run(); err != nil {
return nil, fmt.Errorf("dashboard-admin service account not found")
}
// Create token
cmd = exec.Command("kubectl", "-n", "kubernetes-dashboard", "create", "token", "dashboard-admin")
tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.Output()
if err != nil {
return nil, fmt.Errorf("failed to create token: %w", err)
@@ -148,9 +148,10 @@ func GetDashboardToken() (*DashboardToken, error) {
}
// GetDashboardTokenFromSecret retrieves dashboard token from secret (fallback method)
func GetDashboardTokenFromSecret() (*DashboardToken, error) {
func GetDashboardTokenFromSecret(kubeconfigPath string) (*DashboardToken, error) {
cmd := exec.Command("kubectl", "-n", "kubernetes-dashboard", "get", "secret",
"dashboard-admin-token", "-o", "jsonpath={.data.token}")
tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.Output()
if err != nil {
return nil, fmt.Errorf("failed to get token secret: %w", err)
@@ -167,8 +168,9 @@ func GetDashboardTokenFromSecret() (*DashboardToken, error) {
}
// GetNodeIPs returns IP addresses for all cluster nodes
func GetNodeIPs() ([]*NodeIP, error) {
func GetNodeIPs(kubeconfigPath string) ([]*NodeIP, error) {
cmd := exec.Command("kubectl", "get", "nodes", "-o", "json")
tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.Output()
if err != nil {
return nil, fmt.Errorf("failed to get nodes: %w", err)
@@ -212,9 +214,10 @@ func GetNodeIPs() ([]*NodeIP, error) {
}
// GetControlPlaneIP returns the IP of the first control plane node
func GetControlPlaneIP() (string, error) {
func GetControlPlaneIP(kubeconfigPath string) (string, error) {
cmd := exec.Command("kubectl", "get", "nodes", "-l", "node-role.kubernetes.io/control-plane",
"-o", "jsonpath={.items[0].status.addresses[?(@.type==\"InternalIP\")].address}")
tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.Output()
if err != nil {
return "", fmt.Errorf("failed to get control plane IP: %w", err)
@@ -229,9 +232,10 @@ func GetControlPlaneIP() (string, error) {
}
// CopySecretBetweenNamespaces copies a secret from one namespace to another
func CopySecretBetweenNamespaces(secretName, srcNamespace, dstNamespace string) error {
func CopySecretBetweenNamespaces(kubeconfigPath, secretName, srcNamespace, dstNamespace string) error {
// Get secret from source namespace
cmd := exec.Command("kubectl", "get", "secret", "-n", srcNamespace, secretName, "-o", "json")
tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.Output()
if err != nil {
return fmt.Errorf("failed to get secret from %s: %w", srcNamespace, err)
@@ -259,6 +263,7 @@ func CopySecretBetweenNamespaces(secretName, srcNamespace, dstNamespace string)
// Apply to destination namespace
cmd = exec.Command("kubectl", "apply", "-f", "-")
tools.WithKubeconfig(cmd, kubeconfigPath)
cmd.Stdin = strings.NewReader(string(secretJSON))
if output, err := cmd.CombinedOutput(); err != nil {
return fmt.Errorf("failed to apply secret to %s: %w\nOutput: %s", dstNamespace, err, string(output))
@@ -268,8 +273,9 @@ func CopySecretBetweenNamespaces(secretName, srcNamespace, dstNamespace string)
}
// GetClusterVersion returns the Kubernetes cluster version
func GetClusterVersion() (string, error) {
func GetClusterVersion(kubeconfigPath string) (string, error) {
cmd := exec.Command("kubectl", "version", "-o", "json")
tools.WithKubeconfig(cmd, kubeconfigPath)
output, err := cmd.Output()
if err != nil {
return "", fmt.Errorf("failed to get cluster version: %w", err)

main.go

@@ -78,9 +78,9 @@ func main() {
// Configure CORS
// Default to development origins
allowedOrigins := []string{
"http://localhost:5173", // Vite dev server
"http://localhost:5174", // Alternative port
"http://localhost:3000", // Common React dev port
"http://localhost:5173", // Vite dev server
"http://localhost:5174", // Alternative port
"http://localhost:3000", // Common React dev port
"http://127.0.0.1:5173",
"http://127.0.0.1:5174",
"http://127.0.0.1:3000",
@@ -89,10 +89,7 @@ func main() {
// Override with production origins if set
if corsOrigins := os.Getenv("WILD_CORS_ORIGINS"); corsOrigins != "" {
// Split comma-separated origins
allowedOrigins = []string{}
for _, origin := range splitAndTrim(corsOrigins, ",") {
allowedOrigins = append(allowedOrigins, origin)
}
allowedOrigins = splitAndTrim(corsOrigins, ",")
log.Printf("CORS configured for production origins: %v", allowedOrigins)
} else {
log.Printf("CORS configured for development origins")
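splitAndTrim is referenced here but not shown in this diff. A minimal implementation consistent with its use (splitting comma-separated origins, trimming whitespace, dropping empty entries) might look like the following; the real helper may differ:

```go
package main

import (
	"fmt"
	"strings"
)

// splitAndTrim is an assumed implementation: split on sep, trim
// surrounding whitespace, and drop empty entries so a trailing
// comma doesn't produce an empty origin.
func splitAndTrim(s, sep string) []string {
	parts := strings.Split(s, sep)
	out := make([]string, 0, len(parts))
	for _, p := range parts {
		if trimmed := strings.TrimSpace(p); trimmed != "" {
			out = append(out, trimmed)
		}
	}
	return out
}

func main() {
	fmt.Println(splitAndTrim(" https://a.example , https://b.example ,", ","))
	// [https://a.example https://b.example]
}
```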