Commit Graph

87 Commits

Author SHA1 Message Date
Paul Payne
c1ddf46f44 Restore strategies. 2026-05-25 23:09:39 +00:00
Paul Payne
a533082388 Improve Directory pages. 2026-05-25 22:29:47 +00:00
Paul Payne
288b448e48 Remove expensive asset-hashing operation. 2026-05-25 22:01:43 +00:00
Paul Payne
ce5ca426d6 Normalize talos and kubeconfig paths. 2026-05-25 22:01:20 +00:00
Paul Payne
fa59b5d8ad Fix flaky test. 2026-05-25 21:59:24 +00:00
Paul Payne
d38ed94d12 Node addition improvements. Global and instance config merging. Gomplate IPC. 2026-05-25 20:55:07 +00:00
Paul Payne
e2144412ce SSE node discovery. Node reset. Node apply fix. 2026-05-25 18:37:30 +00:00
Paul Payne
e93a14aa92 More informative error logs. 2026-05-25 18:35:05 +00:00
Paul Payne
e82c92b72e Node health monitoring. 2026-05-25 07:35:53 +00:00
Paul Payne
270fbeabef Adds node reboot. 2026-05-25 07:26:29 +00:00
Paul Payne
fdab9484a6 feat: Add cluster config backup and move schedules to per-app backup pages
Cluster config backup archives kubeconfig, talosconfig, config.yaml,
secrets.yaml, and Talos node configs for disaster recovery. Appears as
"Cluster Config" row on the backups page with its own detail page.

Backup schedules are now shown on each app's individual backup page
instead of the main backups overview, with active operations visible
per-app for real-time feedback during backup/restore.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-24 21:54:46 +00:00
Paul Payne
322492a85f fix: Resolve SSE test race condition by making client registration synchronous
RegisterClient was async (channel-based), so Broadcast could be processed
before the client was registered in the map, causing flaky test failures.
Register directly under the mutex instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-24 21:54:13 +00:00
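The fix described in this commit can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the `Hub` type, its fields, `NewHub`, and the return value of `Broadcast` are hypothetical, and only the name `RegisterClient` and the idea of registering under the mutex come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// Hub is a hypothetical SSE fan-out type; the real one may look different.
type Hub struct {
	mu      sync.Mutex
	clients map[string]chan string
}

// NewHub returns a Hub with an empty client map.
func NewHub() *Hub {
	return &Hub{clients: make(map[string]chan string)}
}

// RegisterClient adds the client to the map before it returns, so any
// Broadcast that starts afterwards is guaranteed to see the client.
func (h *Hub) RegisterClient(id string) <-chan string {
	ch := make(chan string, 8) // buffered so Broadcast does not block on a slow client
	h.mu.Lock()
	h.clients[id] = ch
	h.mu.Unlock()
	return ch
}

// Broadcast offers msg to every registered client and returns how many
// clients accepted it. Clients with a full buffer are skipped.
func (h *Hub) Broadcast(msg string) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	delivered := 0
	for _, ch := range h.clients {
		select {
		case ch <- msg:
			delivered++
		default:
		}
	}
	return delivered
}

func main() {
	h := NewHub()
	ch := h.RegisterClient("test-client")
	h.Broadcast("node:discovered")
	fmt.Println(<-ch)
}
```

With a channel-based registration, the `Broadcast` call in `main` could run before the client reached the map. Taking the lock inside `RegisterClient` removes that ordering gap.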
Paul Payne
11c875a513 fix: Resolve all golangci-lint errors across API codebase
Handle unchecked errors (errcheck), fix nil-deref false positives (SA5011),
suppress deprecated-but-functional API warnings (SA1019), remove unused code,
and use fmt.Fprintf over WriteString(fmt.Sprintf(...)).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-24 21:52:59 +00:00
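One of the changes named in this commit, preferring `fmt.Fprintf` over `WriteString(fmt.Sprintf(...))` and handling the returned error, might look like the sketch below. The `renderStatus` helper and its format string are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// renderStatus is a hypothetical helper, not a function from the repository.
func renderStatus(name string, ready int) (string, error) {
	var sb strings.Builder
	// Before: sb.WriteString(fmt.Sprintf("%s: %d ready\n", name, ready))
	// After: format directly into the writer and check the returned error.
	if _, err := fmt.Fprintf(&sb, "%s: %d ready\n", name, ready); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderStatus("control-plane", 3)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```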
Paul Payne
3e9aa153e2 Go format. 2026-05-24 20:54:13 +00:00
Paul Payne
7cad37db07 More logging. 2026-05-24 20:40:02 +00:00
Paul Payne
eff5246144 Add more resiliency to backups and operations. Use Longhorn CRDs instead of a janky tunnel. 2026-05-24 20:35:51 +00:00
Paul Payne
81604879dc slog integration 2026-05-24 20:29:22 +00:00
Paul Payne
44c7cb6f72 Backup UX. 2026-05-24 20:03:27 +00:00
Paul Payne
7a3ef65683 Refactor upgrade plan computation to support new app.yaml structure
- Updated `checkSourceDrift` to read version from app.yaml and corresponding slot manifest.
- Introduced `computeUpgradePlanFromMeta` to handle upgrade plans using centralized routing rules from app.yaml.
- Enhanced `ComputeUpgradePlan` to fall back to old-style manifest.yaml if app.yaml is not present.
- Added tests for both new and old upgrade plan computation methods, ensuring backward compatibility.
- Improved handling of upgrade paths, including waypoint resolution and circular dependency detection.
2026-05-24 18:30:00 +00:00
Paul Payne
9ac643a50f First version of app upgrade. 2026-05-24 03:59:36 +00:00
Paul Payne
8e55a589fb fix: handle nil manifest in resolveDeploymentResource function 2026-05-23 20:42:23 +00:00
Paul Payne
cd31e6a365 Better app state drift convergence. 2026-05-23 20:05:25 +00:00
Paul Payne
d185f9cf10 feat: Refactor app sidebar and apps component for improved clarity and functionality
- Updated AppSidebar to rename "Available Apps" to "App Directory" for better user understanding.
- Refactored AppsComponent to streamline app data handling and improve loading states.
- Introduced AppDirectoryPage to provide a dedicated view for browsing available apps.
- Added AppInfoPage to display detailed information about individual apps, including README and configuration options.
- Implemented useCatalogReadme hook to fetch README content for apps.
- Enhanced API service to include a method for fetching app README files.
- Improved error handling and loading states across components for better user experience.
2026-05-23 13:24:23 +00:00
Paul Payne
b73b7c25e5 refactor: remove service-related components and hooks
- Deleted ServiceLifecycleBadges, ServiceLogViewer, ServiceLogsDialog, ServiceStatusBadge, and ServiceStatusDialog components.
- Removed useServices and useServiceStatus hooks.
- Cleaned up services API by removing servicesApi and related types.
- Updated index files to reflect the removal of service components and hooks.
2026-05-23 11:21:33 +00:00
Paul Payne
b393c29fc9 feat: enhance database name detection logic and add unit tests for environment variable checks 2026-05-22 23:40:27 +00:00
Paul Payne
548f210849 fix: Derive config-only app status from directory contents, suppress false yellow indicators
Config-only apps (e.g., SMTP) have no kustomization.yaml or install.sh — they only
provide configuration for other apps. Previously they showed a permanent yellow dot
because their status was always "added". Now the API detects config-only apps from
directory contents and the frontend suppresses status indicators for them.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-22 23:36:49 +00:00
Paul Payne
3621c92194 refactor: Remove legacy cloud.smtp config, converge on apps.smtp
SMTP is now managed as an infrastructure app (apps.smtp.*), not as a
cloud-level config (cloud.smtp.*). Remove the SMTP struct from the API
config, the SMTP card from the advanced config page, and update tests
and documentation to reference the new location.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-22 23:26:43 +00:00
Paul Payne
8d8bfed515 feat: optimize cluster state retrieval with batch fetching of active namespaces and ingress URLs 2026-05-21 04:23:50 +00:00
Paul Payne
66389eebf3 feat: add support for operation cancellation and enhance backup scheduling features
- Added 'operation:cancelled' event handling in useGlobalSSE and useOperations hooks.
- Implemented useCancelOperation hook for cancelling operations with immediate UI feedback.
- Enhanced useSchedules hook to utilize SSE for schedule updates and improved schedule management.
- Updated BackupsPage to include schedule management UI and display active operations.
- Refactored operations handling to streamline fetching and filtering of operations.
- Improved backup and recovery plan handling with new health summary and recovery plan tracking.
- Updated API services for schedules and operations to align with new backend endpoints.
2026-05-21 04:20:31 +00:00
Paul Payne
fff321b05c fix: Use camelCase JSON tags for ClusterStatus and NodeStatus API responses
The Go structs used snake_case JSON tags (kubernetes_version, control_plane_nodes, etc.)
but the frontend expected camelCase, with no conversion layer. This broke the Kubernetes
version, node counts, and node status display on the dashboard and cluster pages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-19 04:40:21 +00:00
Paul Payne
bdbc26d892 WIP: Blue-green backup-restore implementation
Continuation of blue-green backup work. Includes recovery plan
generation, active deployment tracking, and strategy updates for
postgres, mysql, longhorn, and config. Incomplete — branched to
make way for services/apps convergence.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-18 04:25:40 +00:00
Paul Payne
0dabb8b824 feat: add script execution functionality in app manifests and enhance manifest handling 2026-05-18 04:24:18 +00:00
Paul Payne
7ddebaa80c feat: add directory copying functionality during app update process 2026-05-18 03:39:31 +00:00
Paul Payne
18259034ea feat: enhance deployment process with CRD handling and refactor node management components 2026-05-18 03:33:05 +00:00
Paul Payne
48f8809587 feat: add node upgrade and rollback commands to CLI
- Implemented `node upgrade <hostname> <version>` command to upgrade a node's Talos version.
- Implemented `node rollback <hostname>` command to roll back a node to its previous Talos version.
- Added corresponding API calls for node upgrade and rollback in the nodes service.
- Enhanced CLI help documentation for new commands.

feat: introduce Talos utilities in CLI

- Added `talos` command group for Talos Image Factory utilities.
- Implemented `talos versions` command to list available Talos versions.
- Implemented `talos validate <schematic-id> <version>` command to validate schematic compatibility.
- Implemented `talos client` command to show talosctl client information and upgrade functionality.

feat: integrate Talos version selection in UI components

- Created `TalosVersionSelect` component for selecting Talos versions.
- Updated `CentralComponent`, `ClusterSettings`, `NodeUpgradeDialog`, and asset pages to use the new version selection component.
- Added validation feedback for schematic compatibility in relevant components.

fix: update entity tile to support version display

- Modified `EntityTile` component to accept React nodes for descriptions, allowing for version display alongside IP addresses.

chore: refactor API hooks for Talos

- Created new hooks for fetching Talos versions, validating schematics, and managing Talos client information and upgrades.
- Updated existing services and components to utilize the new hooks for improved data management and reactivity.
2026-05-18 02:53:54 +00:00
Paul Payne
da7a165447 Enhance namespace resolution and backup functionality
- Implemented ResolveNamespace method to determine the Kubernetes namespace for an app based on priority: config.yaml, manifest, or appName.
- Updated AppsGetLogs and AppsGetEvents handlers to use the resolved namespace.
- Modified backup process to correctly reference the app's namespace when copying secrets.
- Added unit tests for ResolveNamespace and copyDir functions to ensure correct behavior.
2026-05-17 23:24:20 +00:00
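The priority order described for `ResolveNamespace` (config.yaml, then manifest, then appName) might reduce to something like the sketch below. The function signature is hypothetical: the real method presumably reads these values from files, whereas this version takes them as already-loaded strings and treats an empty string as "not set".

```go
package main

import "fmt"

// resolveNamespace returns the first non-empty candidate in priority order:
// the namespace from config.yaml, then the one from the manifest, and
// finally the app name as the fallback.
func resolveNamespace(configNS, manifestNS, appName string) string {
	if configNS != "" {
		return configNS
	}
	if manifestNS != "" {
		return manifestNS
	}
	return appName
}

func main() {
	fmt.Println(resolveNamespace("", "media", "jellyfin")) // media
	fmt.Println(resolveNamespace("", "", "jellyfin"))      // jellyfin
}
```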
Paul Payne
2b8ef8e2b6 Add manifests endpoint and integrate into app detail panel 2026-05-17 22:32:55 +00:00
Paul Payne
b1f0ba07d8 Refactor application and cluster services management
- Removed the ClusterServicesComponent and integrated its functionality into the AppsComponent.
- Updated AppSidebar to reflect the removal of cluster services navigation.
- Adjusted AppsComponent to handle infrastructure services, including fetching, compiling, and deploying.
- Enhanced AppDetailPanel to support infrastructure-specific actions and lifecycle status display.
- Modified routing to redirect cluster-related paths to the apps section, ensuring proper phase checks.
- Updated phase guard logic to accommodate multiple required phases for app management.
- Cleaned up unused ServiceCard component and related imports.
- Adjusted app status types to include category for better categorization of apps.
2026-05-17 22:26:09 +00:00
Paul Payne
f37cb458f8 Add infrastructure setup. 2026-05-17 19:22:56 +00:00
Paul Payne
2dd661118c Consolidate service/app. 2026-05-17 19:11:10 +00:00
Paul Payne
7e017eaec7 Removes unnecessary crowdsec cluster role. 2026-05-16 22:45:53 +00:00
Paul Payne
f1a6a70bf8 Update crowdsec and change deployment to use /var/log on affinity traefik node instead of API (performance improvement and traffic-burst DOS prevention). 2026-05-16 22:45:01 +00:00
Paul Payne
5804c5fdd0 k8s dashboard replaced by headlamp 2026-05-16 22:21:53 +00:00
Paul Payne
42a58d383f Blue-green backup-restore implementation (incomplete). 2026-03-04 17:32:23 +00:00
Paul Payne
773d2e88c8 fix(postgres): Separate drop and create database commands in Restore function 2026-03-02 05:19:37 +00:00
Paul Payne
86cf4443d6 feat(postgres): Enhance database name retrieval from config.yaml for PostgreSQL strategy 2026-03-01 17:56:11 +00:00
Paul Payne
5f305672ac feat(kubectl): Add GetPodsByDeployment and GetPodsByLabel methods for improved pod retrieval
refactor(logs): Enhance pod retrieval logic in GetLogs and StreamLogs methods
refactor(backup): Update JSON unmarshalling to marshal response data before parsing
2026-02-28 21:31:58 +00:00
Paul Payne
26c88decd8 New backup system. 2026-02-28 20:51:46 +00:00
Paul Payne
aa528a2b01 feat(config): Implement config value extraction and tracking for service compilation 2026-02-28 07:58:44 +00:00
Paul Payne
395b740d78 refactor(deployment): Update deployment state terminology from 'needs_redeploy' to 'out_of_sync' 2026-02-28 07:21:43 +00:00