Compare commits

7 Commits

Author SHA1 Message Date
Paul Payne
12e87635c6 docs: Update ADDING-APPS.md to remove cloud.smtp references
SMTP config is now at apps.smtp.* via the SMTP infrastructure app,
not cloud.smtp.*. Remove the old variable listing and update the
configuration flow documentation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-22 23:31:14 +00:00
Paul Payne
351dff14d4 feat: add BackupTarget configuration and update kustomization to include it 2026-05-21 04:22:13 +00:00
Paul Payne
0645624ded feat: update Immich version and image tags to 1.135.3 in manifest.yaml 2026-05-21 04:21:40 +00:00
Paul Payne
afa21ef650 feat: add initial Kubernetes manifests for e2e-test-app including deployment, service, PVC, and database initialization job 2026-05-21 04:20:56 +00:00
Paul Payne
5733c20098 feat: add repair-certificates script for managing stuck certificates and ACME orders 2026-05-18 04:24:21 +00:00
Paul Payne
54abfdd469 Add kustomization.yaml for cert-manager with custom DNS settings
- Introduced a new kustomization.yaml file for cert-manager.
- Configured a patch to modify the cert-manager Deployment to use a custom DNS policy and settings.
- Set dnsPolicy to None and specified custom nameservers and search options.
2026-05-18 03:39:21 +00:00
Paul Payne
e4c24d4a8c feat: update CrowdSec and Traefik manifests; remove installation scripts and add secret management 2026-05-18 03:33:37 +00:00
24 changed files with 13679 additions and 430 deletions

@@ -121,15 +121,6 @@ Here's a comprehensive rundown of all config variables that get set during clust
 - cloud.dockerRegistryHost - Docker registry hostname (e.g., "registry.internal.cloud2.payne.io")
-##### SMTP Configuration (SMTP Service):
-- cloud.smtp.host - SMTP server hostname
-- cloud.smtp.port - SMTP port (typically "465" or "587")
-- cloud.smtp.user - SMTP username
-- cloud.smtp.from - Default 'from' email address
-- cloud.smtp.tls - Enable TLS (true/false)
-- cloud.smtp.startTls - Enable STARTTLS (true/false)
 ###### Backup Configuration:
 - cloud.backup.root - Root path for backups
@@ -214,8 +205,7 @@ Configuration Flow
 - ExternalDNS → cluster.externalDns.ownerId
 - NFS → cloud.nfs.*
 - Docker Registry → cloud.dockerRegistryHost, cluster.dockerRegistry.storage
-- SMTP → cloud.smtp.*
-4. Apps: Each app adds its configuration under apps.<name>.* based on its manifest
+4. Apps: Each app adds its configuration under apps.<name>.* based on its manifest (including SMTP as an infrastructure app at apps.smtp.*)
 #### Manifest App Reference Resolution:
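
For manifests or scripts that still reference the old keys, the rename in this file can be sketched as a simple path rewrite. This is a minimal sketch and not part of the commit: `migrate_smtp_key` is a hypothetical helper, and it assumes the sub-key names (host, port, and so on) are unchanged under `apps.smtp.*`, which the diff does not confirm.

```shell
#!/bin/bash
# Hypothetical helper: map an old cloud.smtp.* config path to apps.smtp.*.
# Assumes sub-key names are unchanged; every other path passes through untouched.
migrate_smtp_key() {
  local key="$1"
  case "$key" in
    cloud.smtp.*) printf 'apps.smtp.%s\n' "${key#cloud.smtp.}" ;;
    *)            printf '%s\n' "$key" ;;
  esac
}

migrate_smtp_key "cloud.smtp.host"
migrate_smtp_key "cloud.backup.root"
```

Only the prefix is rewritten, so unrelated keys such as `cloud.backup.root` are left alone.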

@@ -1 +1,20 @@
# cert-manager
X.509 certificate management for Kubernetes using Let's Encrypt.
## Upstream
The `upstream/cert-manager.yaml` file is downloaded from the official cert-manager release:
- Source: https://github.com/cert-manager/cert-manager/releases/download/v1.17.2/cert-manager.yaml
- Version: v1.17.2
To update, download the new version and replace the file.
## DNS Configuration
The upstream cert-manager deployment is patched via kustomize overlay (`upstream/kustomization.yaml`) to use external DNS resolvers (1.1.1.1, 8.8.8.8) instead of cluster DNS. This is required for ACME DNS-01 challenge verification.
## Maintenance
The `scripts/repair-certificates.sh` script can fix stuck certificates, orphaned ACME orders, and Cloudflare DNS cleanup errors. Run it manually when certificate issuance has issues.

@@ -1,233 +0,0 @@
#!/bin/bash
set -e
set -o pipefail
if [ -z "${WILD_INSTANCE}" ]; then
echo "ERROR: WILD_INSTANCE is not set"
exit 1
fi
if [ -z "${WILD_API_DATA_DIR}" ]; then
echo "ERROR: WILD_API_DATA_DIR is not set"
exit 1
fi
if [ -z "${KUBECONFIG}" ]; then
echo "ERROR: KUBECONFIG is not set"
exit 1
fi
INSTANCE_DIR="${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}"
CERT_MANAGER_DIR="${INSTANCE_DIR}/apps/cert-manager"
echo "=== Setting up cert-manager ==="
echo ""
#######################
# Dependencies
#######################
echo "Verifying Traefik is ready (required for cert-manager)..."
kubectl wait --for=condition=Available deployment/traefik -n traefik --timeout=60s 2>/dev/null || {
echo "WARNING: Traefik not ready, but continuing with cert-manager installation"
echo "Note: cert-manager may not work properly without Traefik"
}
if [ ! -f "${CERT_MANAGER_DIR}/kustomization.yaml" ]; then
echo "ERROR: Compiled templates not found at ${CERT_MANAGER_DIR}/"
echo "Templates should be compiled before deployment."
exit 1
fi
########################
# Kubernetes components
########################
echo "Installing cert-manager components..."
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.17.2/cert-manager.yaml || \
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.17.2/cert-manager.yaml
echo "Waiting for cert-manager to be ready..."
kubectl wait --for=condition=Available deployment/cert-manager -n cert-manager --timeout=120s
kubectl wait --for=condition=Available deployment/cert-manager-cainjector -n cert-manager --timeout=120s
kubectl wait --for=condition=Available deployment/cert-manager-webhook -n cert-manager --timeout=120s
echo "Creating Cloudflare API token secret..."
SECRETS_FILE="${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}/secrets.yaml"
CLOUDFLARE_API_TOKEN=$(yq '.apps.cert-manager.cloudflareToken' "$SECRETS_FILE" 2>/dev/null)
CLOUDFLARE_API_TOKEN=$(echo "$CLOUDFLARE_API_TOKEN")
if [ -z "$CLOUDFLARE_API_TOKEN" ] || [ "$CLOUDFLARE_API_TOKEN" = "null" ]; then
echo "ERROR: Cloudflare API token not found"
echo "Please set: apps.cert-manager.cloudflareToken in secrets.yaml"
exit 1
fi
kubectl create secret generic cloudflare-api-token \
--namespace cert-manager \
--from-literal=api-token="${CLOUDFLARE_API_TOKEN}" \
--dry-run=client -o yaml | kubectl apply -f -
echo "Verifying cert-manager webhook is fully operational..."
until kubectl get validatingwebhookconfigurations cert-manager-webhook &>/dev/null; do
echo "Waiting for cert-manager webhook to register..."
sleep 5
done
echo "Configuring cert-manager to use external DNS servers..."
kubectl patch deployment cert-manager -n cert-manager --patch '
spec:
template:
spec:
dnsPolicy: None
dnsConfig:
nameservers:
- "1.1.1.1"
- "8.8.8.8"
searches:
- cert-manager.svc.cluster.local
- svc.cluster.local
- cluster.local
options:
- name: ndots
value: "5"'
echo "Waiting for cert-manager to restart with new DNS configuration..."
kubectl rollout status deployment/cert-manager -n cert-manager --timeout=120s
########################
# Create issuers and certificates
########################
echo "Creating Let's Encrypt issuers and certificates..."
kubectl apply -k ${CERT_MANAGER_DIR}/
echo "Waiting for Let's Encrypt issuers to be ready..."
kubectl wait --for=condition=Ready clusterissuer/letsencrypt-prod --timeout=60s || echo "WARNING: Production issuer not ready, proceeding anyway..."
kubectl wait --for=condition=Ready clusterissuer/letsencrypt-staging --timeout=60s || echo "WARNING: Staging issuer not ready, proceeding anyway..."
sleep 5
######################################
# Fix stuck certificates and cleanup
######################################
needs_restart=false
echo "Checking for certificates with failed issuance attempts..."
stuck_certs=$(kubectl get certificates --all-namespaces -o json 2>/dev/null | \
jq -r '.items[] | select(.status.conditions[]? | select(.type=="Issuing" and .status=="False" and (.message | contains("404")))) | "\(.metadata.namespace) \(.metadata.name)"')
if [ -n "$stuck_certs" ]; then
echo "WARNING: Found certificates stuck with non-existent orders, recreating them..."
echo "$stuck_certs" | while read ns name; do
echo "Recreating certificate $ns/$name..."
cert_spec=$(kubectl get certificate "$name" -n "$ns" -o json | jq '.spec')
kubectl delete certificate "$name" -n "$ns"
echo "{\"apiVersion\":\"cert-manager.io/v1\",\"kind\":\"Certificate\",\"metadata\":{\"name\":\"$name\",\"namespace\":\"$ns\"},\"spec\":$cert_spec}" | kubectl apply -f -
done
needs_restart=true
sleep 5
else
echo "No certificates stuck with failed orders"
fi
echo "Checking for orphaned ACME orders..."
orphaned_orders=$(kubectl logs -n cert-manager deployment/cert-manager --tail=200 2>/dev/null | \
grep -E "failed to retrieve the ACME order.*404" 2>/dev/null | \
sed -n 's/.*resource_name="\([^"]*\)".*/\1/p' | \
sort -u || true)
if [ -n "$orphaned_orders" ]; then
echo "WARNING: Found orphaned ACME orders from logs"
for order in $orphaned_orders; do
echo "Deleting orphaned order: $order"
orders_found=$(kubectl get orders --all-namespaces 2>/dev/null | grep "$order" 2>/dev/null || true)
if [ -n "$orders_found" ]; then
echo "$orders_found" | while read ns name rest; do
kubectl delete order "$name" -n "$ns" 2>/dev/null || true
done
fi
done
needs_restart=true
else
echo "No orphaned orders found in logs"
fi
echo "Checking for Cloudflare DNS cleanup errors..."
cloudflare_errors=$(kubectl logs -n cert-manager deployment/cert-manager --tail=200 2>/dev/null | \
grep -c "Error: 7003.*Could not route" 2>/dev/null || echo "0")
if [ "$cloudflare_errors" -gt "0" ]; then
echo "WARNING: Found $cloudflare_errors Cloudflare DNS cleanup errors (stale DNS record references)"
echo "Deleting stuck challenges and orders to allow fresh start"
kubectl delete challenges --all -n cert-manager 2>/dev/null || true
kubectl delete orders --all -n cert-manager 2>/dev/null || true
needs_restart=true
else
echo "No Cloudflare DNS cleanup errors"
fi
if [ "$needs_restart" = true ]; then
echo "Restarting cert-manager to clear internal state..."
kubectl rollout restart deployment cert-manager -n cert-manager
kubectl rollout status deployment/cert-manager -n cert-manager --timeout=120s
echo "Waiting for cert-manager to recreate fresh challenges..."
sleep 15
else
echo "No restart needed - cert-manager state is clean"
fi
#########################
# Final checks
#########################
echo "Waiting for wildcard certificates to be ready (this may take several minutes)..."
wait_for_cert() {
local cert_name="$1"
local timeout=300
local elapsed=0
echo " Checking $cert_name..."
while [ $elapsed -lt $timeout ]; do
if kubectl get certificate "$cert_name" -n cert-manager -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null | grep -q "True"; then
echo " $cert_name is ready"
return 0
fi
if [ $((elapsed % 30)) -eq 0 ] && [ $elapsed -gt 0 ]; then
local status=$(kubectl get certificate "$cert_name" -n cert-manager -o jsonpath='{.status.conditions[?(@.type=="Ready")].message}' 2>/dev/null || echo "Waiting...")
echo " Still waiting for $cert_name... ($elapsed/${timeout}s) - $status"
fi
sleep 5
elapsed=$((elapsed + 5))
done
echo " WARNING: Timeout waiting for $cert_name (will continue anyway)"
return 1
}
wait_for_cert "wildcard-internal-wild-cloud"
wait_for_cert "wildcard-wild-cloud"
echo "Performing final cert-manager health check..."
failed_certs=$(kubectl get certificates --all-namespaces -o json 2>/dev/null | jq -r '.items[] | select(.status.conditions[]? | select(.type=="Ready" and .status!="True")) | "\(.metadata.namespace)/\(.metadata.name)"' | wc -l)
if [ "$failed_certs" -gt 0 ]; then
echo "WARNING: Found $failed_certs certificates not in Ready state"
echo "Check certificate status with: kubectl get certificates --all-namespaces"
echo "Check cert-manager logs with: kubectl logs -n cert-manager deployment/cert-manager"
else
echo "All certificates are in Ready state"
fi
echo ""
echo "cert-manager setup complete!"
echo ""
echo "To verify the installation:"
echo " kubectl get certificates --all-namespaces"
echo " kubectl get clusterissuers"

@@ -11,5 +11,20 @@ defaultConfig:
 internalDomain: "{{ .cloud.internalDomain }}"
 email: "{{ .operator.email }}"
 cloudflareDomain: "{{ .cloud.baseDomain }}"
+scripts:
+- name: repair-certificates
+path: scripts/repair-certificates.sh
+description: Fix stuck certificates, orphaned ACME orders, and Cloudflare DNS cleanup errors
 defaultSecrets:
 - key: cloudflareToken
+deploy:
+phases:
+- path: upstream
+waitFor:
+name: cert-manager-webhook
+timeout: "120s"
+- path: .
+createSecrets:
+- name: cloudflare-api-token
+entries:
+api-token: cloudflareToken

@@ -0,0 +1,89 @@
#!/bin/bash
# Repair stuck certificates, orphaned ACME orders, and Cloudflare DNS errors.
# This is an operational maintenance script, not part of deployment.
# Run manually when cert-manager has issues with certificate issuance.
#
# Usage: KUBECONFIG=/path/to/kubeconfig ./repair-certificates.sh
set -e
set -o pipefail
if [ -z "${KUBECONFIG}" ]; then
  echo "ERROR: KUBECONFIG is not set"
  exit 1
fi

needs_restart=false

echo "=== cert-manager Certificate Repair ==="
echo ""

echo "Checking for certificates with failed issuance attempts..."
stuck_certs=$(kubectl get certificates --all-namespaces -o json 2>/dev/null | \
  jq -r '.items[] | select(.status.conditions[]? | select(.type=="Issuing" and .status=="False" and (.message | contains("404")))) | "\(.metadata.namespace) \(.metadata.name)"')
if [ -n "$stuck_certs" ]; then
  echo "WARNING: Found certificates stuck with non-existent orders, recreating them..."
  echo "$stuck_certs" | while read ns name; do
    echo "Recreating certificate $ns/$name..."
    cert_spec=$(kubectl get certificate "$name" -n "$ns" -o json | jq '.spec')
    kubectl delete certificate "$name" -n "$ns"
    echo "{\"apiVersion\":\"cert-manager.io/v1\",\"kind\":\"Certificate\",\"metadata\":{\"name\":\"$name\",\"namespace\":\"$ns\"},\"spec\":$cert_spec}" | kubectl apply -f -
  done
  needs_restart=true
  sleep 5
else
  echo "No certificates stuck with failed orders"
fi

echo "Checking for orphaned ACME orders..."
orphaned_orders=$(kubectl logs -n cert-manager deployment/cert-manager --tail=200 2>/dev/null | \
  grep -E "failed to retrieve the ACME order.*404" 2>/dev/null | \
  sed -n 's/.*resource_name="\([^"]*\)".*/\1/p' | \
  sort -u || true)
if [ -n "$orphaned_orders" ]; then
  echo "WARNING: Found orphaned ACME orders from logs"
  for order in $orphaned_orders; do
    echo "Deleting orphaned order: $order"
    orders_found=$(kubectl get orders --all-namespaces 2>/dev/null | grep "$order" 2>/dev/null || true)
    if [ -n "$orders_found" ]; then
      echo "$orders_found" | while read ns name rest; do
        kubectl delete order "$name" -n "$ns" 2>/dev/null || true
      done
    fi
  done
  needs_restart=true
else
  echo "No orphaned orders found in logs"
fi

echo "Checking for Cloudflare DNS cleanup errors..."
# grep -c prints 0 itself when nothing matches (and exits 1), so only swallow
# the exit status here; echoing a second "0" would break the integer test below.
cloudflare_errors=$(kubectl logs -n cert-manager deployment/cert-manager --tail=200 2>/dev/null | \
  grep -c "Error: 7003.*Could not route" 2>/dev/null || true)
if [ "${cloudflare_errors:-0}" -gt 0 ]; then
  echo "WARNING: Found $cloudflare_errors Cloudflare DNS cleanup errors (stale DNS record references)"
  echo "Deleting stuck challenges and orders to allow fresh start"
  kubectl delete challenges --all -n cert-manager 2>/dev/null || true
  kubectl delete orders --all -n cert-manager 2>/dev/null || true
  needs_restart=true
else
  echo "No Cloudflare DNS cleanup errors"
fi

if [ "$needs_restart" = true ]; then
  echo "Restarting cert-manager to clear internal state..."
  kubectl rollout restart deployment cert-manager -n cert-manager
  kubectl rollout status deployment/cert-manager -n cert-manager --timeout=120s
  echo "Waiting for cert-manager to recreate fresh challenges..."
  sleep 15
else
  echo "No restart needed - cert-manager state is clean"
fi

echo ""
echo "Repair complete. Check certificate status with:"
echo " kubectl get certificates --all-namespaces"
echo " kubectl get clusterissuers"
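
One detail of the log-count check in this script is worth illustrating: `grep -c` prints `0` and exits with status 1 when nothing matches, so a fallback such as `|| echo "0"` adds a second line to the captured value instead of replacing it. A minimal sketch, assuming a hypothetical helper `count_matches` that reads log text on stdin:

```shell
#!/bin/bash
# Hypothetical helper: count stdin lines matching an extended regex.
# grep -c already prints 0 on no match (exit status 1), so the fallback
# only swallows the status rather than echoing another 0.
count_matches() {
  grep -cE -- "$1" || true
}

printf 'Error: 7003: Could not route to /zones\nall good\n' | count_matches 'Error: 7003.*Could not route'
printf 'all good\n' | count_matches 'Error: 7003.*Could not route'
```

The captured value is then always a single integer, which is what a `[ ... -gt 0 ]` test expects.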

File diff suppressed because it is too large

@@ -0,0 +1,30 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - cert-manager.yaml
patches:
  - target:
      kind: Deployment
      name: cert-manager
      namespace: cert-manager
    patch: |-
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: cert-manager
        namespace: cert-manager
      spec:
        template:
          spec:
            dnsPolicy: None
            dnsConfig:
              nameservers:
                - "1.1.1.1"
                - "8.8.8.8"
              searches:
                - cert-manager.svc.cluster.local
                - svc.cluster.local
                - cluster.local
              options:
                - name: ndots
                  value: "5"
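
With `dnsPolicy: None`, the pod's resolver configuration comes only from `dnsConfig`. As a rough illustration of what the patched pod would see, the sketch below renders those values in resolv.conf syntax. `render_resolv_conf` is a hypothetical helper, and the line order the container runtime actually writes may differ.

```shell
#!/bin/bash
# Hypothetical helper: print resolv.conf-style lines for a dnsConfig.
# Args: comma-separated nameservers, space-separated search list, ndots value.
render_resolv_conf() {
  local nameservers="$1" searches="$2" ndots="$3" ns
  local IFS=','
  for ns in $nameservers; do
    printf 'nameserver %s\n' "$ns"
  done
  printf 'search %s\n' "$searches"
  printf 'options ndots:%s\n' "$ndots"
}

render_resolv_conf "1.1.1.1,8.8.8.8" \
  "cert-manager.svc.cluster.local svc.cluster.local cluster.local" 5
```

On a running cluster the real file can be compared with `kubectl exec` into the cert-manager pod, if the image provides a shell or `cat`.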

@@ -66,6 +66,12 @@ spec:
 secretKeyRef:
 name: crowdsec-agent-secret
 key: password
+- name: BOUNCER_KEY_traefik
+valueFrom:
+secretKeyRef:
+name: crowdsec-secrets
+key: bouncerApiKey
+optional: true
 ports:
 - name: lapi
 containerPort: 8080

@@ -1,118 +0,0 @@
#!/bin/bash
set -e
set -o pipefail
if [ -z "${WILD_INSTANCE}" ]; then
echo "ERROR: WILD_INSTANCE is not set"
exit 1
fi
if [ -z "${WILD_API_DATA_DIR}" ]; then
echo "ERROR: WILD_API_DATA_DIR is not set"
exit 1
fi
if [ -z "${KUBECONFIG}" ]; then
echo "ERROR: KUBECONFIG is not set"
exit 1
fi
INSTANCE_DIR="${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}"
CROWDSEC_DIR="${INSTANCE_DIR}/apps/crowdsec"
SECRETS_FILE="${INSTANCE_DIR}/secrets.yaml"
echo "=== Setting up CrowdSec Security Engine ==="
echo ""
echo "Verifying Traefik is ready (required for CrowdSec bouncer)..."
kubectl wait --for=condition=Available deployment/traefik -n traefik --timeout=60s 2>/dev/null || {
echo "WARNING: Traefik not ready, but continuing with CrowdSec installation"
echo "Note: CrowdSec bouncer will not work until Traefik is available"
}
echo "Using pre-compiled CrowdSec templates..."
if [ ! -f "${CROWDSEC_DIR}/kustomization.yaml" ]; then
echo "ERROR: Compiled templates not found at ${CROWDSEC_DIR}"
echo "Templates should be compiled before deployment."
exit 1
fi
echo "Deploying CrowdSec..."
kubectl apply -k ${CROWDSEC_DIR}/
echo "Creating CrowdSec agent secret..."
AGENT_PASSWORD=$(yq '.apps.crowdsec.agentPassword' "$SECRETS_FILE" 2>/dev/null | tr -d '"')
if [ -z "$AGENT_PASSWORD" ] || [ "$AGENT_PASSWORD" = "null" ]; then
echo "Generating new agent password..."
AGENT_PASSWORD=$(openssl rand -base64 32)
echo "WARNING: Agent password not found in secrets.yaml"
echo "Using generated password - you may want to persist this"
fi
kubectl create secret generic crowdsec-agent-secret \
--namespace crowdsec \
--from-literal=password="${AGENT_PASSWORD}" \
--dry-run=client -o yaml | kubectl apply -f -
echo "Waiting for CrowdSec agent to be ready..."
kubectl rollout status deployment/crowdsec -n crowdsec --timeout=120s
echo "Registering bouncer with CrowdSec agent..."
BOUNCER_API_KEY=$(yq '.apps.crowdsec.bouncerApiKey' "$SECRETS_FILE" 2>/dev/null | tr -d '"')
if [ -z "$BOUNCER_API_KEY" ] || [ "$BOUNCER_API_KEY" = "null" ]; then
echo "Generating new bouncer API key from CrowdSec agent..."
kubectl exec -n crowdsec deploy/crowdsec -- cscli bouncers delete traefik-bouncer 2>/dev/null || true
BOUNCER_API_KEY=$(kubectl exec -n crowdsec deploy/crowdsec -- cscli bouncers add traefik-bouncer -o raw)
echo "Generated bouncer API key - you may want to persist this in secrets.yaml"
fi
kubectl create secret generic crowdsec-bouncer-secret \
--namespace crowdsec \
--from-literal=api-key="${BOUNCER_API_KEY}" \
--dry-run=client -o yaml | kubectl apply -f -
echo "Copying bouncer secret to traefik namespace..."
kubectl create secret generic crowdsec-bouncer-secret \
--namespace traefik \
--from-literal=api-key="${BOUNCER_API_KEY}" \
--dry-run=client -o yaml | kubectl apply -f -
echo "Cleaning up old bouncer deployment..."
kubectl delete deployment traefik-crowdsec-bouncer -n crowdsec --ignore-not-found
kubectl delete service traefik-crowdsec-bouncer -n crowdsec --ignore-not-found
echo "Restarting Traefik to load CrowdSec plugin..."
kubectl rollout restart deployment/traefik -n traefik
kubectl rollout status deployment/traefik -n traefik --timeout=120s
echo "Configuring Traefik to use CrowdSec security chain by default..."
kubectl patch deployment traefik -n traefik --type='json' -p='[
{
"op": "add",
"path": "/spec/template/spec/containers/0/args/-",
"value": "--entryPoints.websecure.http.middlewares=crowdsec-security-chain@kubernetescrd"
}
]' 2>/dev/null || {
echo "Note: Traefik may already have middleware configured or patch failed"
echo "You can manually configure default middleware if needed"
}
echo ""
echo "CrowdSec installed successfully (using Traefik plugin)"
echo ""
echo "All ingresses are now protected by default with:"
echo " - Threat detection (CrowdSec Traefik plugin, stream mode)"
echo " - Rate limiting (100 req/min)"
echo " - Security headers (HSTS, XSS protection, etc.)"
echo ""
echo "To verify the installation:"
echo " kubectl get pods -n crowdsec"
echo " kubectl get pods -n traefik"
echo " kubectl exec -n crowdsec deploy/crowdsec -- cscli bouncers list"
echo " kubectl exec -n crowdsec deploy/crowdsec -- cscli decisions list"
echo ""
echo "To opt-out a specific ingress from CrowdSec protection:"
echo " Add annotation: traefik.ingress.kubernetes.io/router.middlewares: \"\""
echo ""

@@ -13,3 +13,18 @@ defaultConfig:
 defaultSecrets:
 - key: agentPassword
 - key: bouncerApiKey
+deploy:
+createSecrets:
+- name: crowdsec-agent-secret
+entries:
+password: agentPassword
+- name: crowdsec-bouncer-secret
+entries:
+api-key: bouncerApiKey
+- name: crowdsec-bouncer-secret
+namespace: traefik
+entries:
+api-key: bouncerApiKey
+waitForRollout:
+name: crowdsec
+timeout: "120s"

@@ -0,0 +1,72 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: e2e-test-app-db-init
  labels:
    component: db-init
spec:
  template:
    metadata:
      labels:
        component: db-init
    spec:
      restartPolicy: OnFailure
      securityContext:
        runAsNonRoot: true
        runAsUser: 999
        runAsGroup: 999
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: postgres-init
          image: postgres:15
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            readOnlyRootFilesystem: false
          env:
            - name: PGHOST
              value: {{ .dbHost }}
            - name: PGUSER
              value: postgres
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: e2e-test-app-secrets
                  key: postgres.password
            - name: DB_NAME
              value: {{ .dbName }}
            - name: DB_USER
              value: {{ .dbUser }}
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: e2e-test-app-secrets
                  key: dbPassword
          command:
            - /bin/bash
            - -c
            - |
              set -e
              echo "Waiting for PostgreSQL to be ready..."
              until pg_isready; do
                echo "PostgreSQL is not ready - sleeping"
                sleep 2
              done
              echo "PostgreSQL is ready"
              echo "Creating database and user..."
              psql -c "CREATE DATABASE ${DB_NAME};" || echo "Database ${DB_NAME} already exists"
              psql -c "CREATE USER ${DB_USER} WITH PASSWORD '${DB_PASSWORD}';" || echo "User ${DB_USER} already exists"
              psql -c "ALTER USER ${DB_USER} WITH PASSWORD '${DB_PASSWORD}';"
              psql -c "GRANT ALL PRIVILEGES ON DATABASE ${DB_NAME} TO ${DB_USER};"
              psql -d ${DB_NAME} -c "GRANT ALL ON SCHEMA public TO ${DB_USER};"
              echo "Creating test data table..."
              psql -d ${DB_NAME} -c "CREATE TABLE IF NOT EXISTS e2e_test_data (id SERIAL PRIMARY KEY, key TEXT UNIQUE NOT NULL, value TEXT NOT NULL, created_at TIMESTAMP DEFAULT NOW());"
              psql -d ${DB_NAME} -c "GRANT ALL ON TABLE e2e_test_data TO ${DB_USER};"
              psql -d ${DB_NAME} -c "GRANT USAGE, SELECT ON SEQUENCE e2e_test_data_id_seq TO ${DB_USER};"
              echo "Database initialization complete"

@@ -0,0 +1,55 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: e2e-test-app
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      component: web
  template:
    metadata:
      labels:
        component: web
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 101
        runAsGroup: 101
        fsGroup: 101
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: nginx
          image: nginxinc/nginx-unprivileged:alpine
          ports:
            - containerPort: 8080
              name: http
          volumeMounts:
            - name: app-data
              mountPath: /data
          resources:
            limits:
              cpu: 100m
              memory: 64Mi
            requests:
              cpu: 50m
              memory: 32Mi
          readinessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 3
            periodSeconds: 5
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            readOnlyRootFilesystem: false
      volumes:
        - name: app-data
          persistentVolumeClaim:
            claimName: e2e-test-app-data

@@ -0,0 +1,15 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: e2e-test-app
labels:
  - includeSelectors: true
    pairs:
      app: e2e-test-app
      managedBy: kustomize
      partOf: wild-cloud
resources:
  - namespace.yaml
  - deployment.yaml
  - service.yaml
  - pvc.yaml
  - db-init-job.yaml

@@ -0,0 +1,23 @@
name: e2e-test-app
is: e2e-test-app
description: End-to-end test application for automated integration testing. Includes PVC and PostgreSQL dependency to exercise all backup strategies.
version: 1.0.0
requires:
  - name: postgres
defaultConfig:
  namespace: e2e-test-app
  domain: e2e-test-app.{{ .cloud.domain }}
  externalDnsDomain: "{{ .cloud.domain }}"
  tlsSecretName: wildcard-wild-cloud-tls
  storage: 1Gi
  dbHost: "{{ .apps.postgres.host }}"
  dbPort: "{{ .apps.postgres.port }}"
  dbName: e2e_test_app
  dbUser: e2e_test_app
  timezone: UTC
defaultSecrets:
  - key: dbPassword
  - key: dbUrl
    default: "postgres://{{ .app.dbUser }}:{{ .secrets.dbPassword }}@{{ .app.dbHost }}:{{ .app.dbPort }}/{{ .app.dbName }}?sslmode=disable"
requiredSecrets:
  - postgres.password
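
The `dbUrl` default composes a connection string from the app's config and secrets. A minimal sketch of the same composition in shell, assuming a hypothetical `build_db_url` helper and made-up example values for the password, host, and port (the real values come from the secrets file and `apps.postgres.*`). A password containing URL-reserved characters would need percent-encoding, which this sketch does not do.

```shell
#!/bin/bash
# Hypothetical helper: assemble a PostgreSQL URL in the same shape as the
# manifest's dbUrl default. Args: user, password, host, port, database name.
build_db_url() {
  local user="$1" password="$2" host="$3" port="$4" name="$5"
  printf 'postgres://%s:%s@%s:%s/%s?sslmode=disable\n' \
    "$user" "$password" "$host" "$port" "$name"
}

# Example values only; none of these come from the repository.
build_db_url e2e_test_app example-password postgres.example.internal 5432 e2e_test_app
```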

@@ -0,0 +1,4 @@
apiVersion: v1
kind: Namespace
metadata:
  name: {{ .namespace }}

e2e-test-app/pvc.yaml Normal file

@@ -0,0 +1,11 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: e2e-test-app-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: {{ .storage }}

e2e-test-app/service.yaml Normal file

@@ -0,0 +1,11 @@
apiVersion: v1
kind: Service
metadata:
  name: e2e-test-app
spec:
  selector:
    component: web
  ports:
    - port: 80
      targetPort: 8080
      name: http

@@ -2,7 +2,7 @@ name: immich
 is: immich
 description: Immich is a self-hosted photo and video backup solution that allows you
 to store, manage, and share your media files securely.
-version: release
+version: 1.135.3
 icon: https://cdn.jsdelivr.net/gh/homarr-labs/dashboard-icons/svg/immich.svg
 requires:
 - name: redis
@@ -10,8 +10,8 @@ requires:
 defaultConfig:
 namespace: immich
 externalDnsDomain: '{{ .cloud.domain }}'
-serverImage: ghcr.io/immich-app/immich-server:release
-mlImage: ghcr.io/immich-app/immich-machine-learning:release
+serverImage: ghcr.io/immich-app/immich-server:v1.135.3
+mlImage: ghcr.io/immich-app/immich-machine-learning:v1.135.3
 timezone: UTC
 serverPort: 2283
 mlPort: 3003
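
Pinning replaces the floating `release` tag with a fixed version, so a redeploy pulls the same image each time. A small check along these lines could flag floating tags in other manifests. `is_pinned_image` is a hypothetical helper, and the set of tags treated as floating is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: succeed when an image reference carries a fixed tag.
# Tags treated as floating (an assumption): release, latest, main, or no tag.
is_pinned_image() {
  local image="$1" tag
  # Look for a colon only after the last slash, so a registry port
  # such as localhost:5000/app is not mistaken for a tag.
  case "${image##*/}" in
    *:*) tag="${image##*:}" ;;
    *)   return 1 ;;
  esac
  case "$tag" in
    release|latest|main|"") return 1 ;;
    *) return 0 ;;
  esac
}

for image in \
  "ghcr.io/immich-app/immich-server:release" \
  "ghcr.io/immich-app/immich-server:v1.135.3"; do
  if is_pinned_image "$image"; then
    echo "pinned: $image"
  else
    echo "floating: $image"
  fi
done
```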

@@ -0,0 +1,9 @@
apiVersion: longhorn.io/v1beta2
kind: BackupTarget
metadata:
  name: default
  namespace: longhorn-system
spec:
  backupTargetURL: "{{ .backupTarget }}"
  credentialSecret: ""
  pollInterval: 5m0s

@@ -3,5 +3,6 @@ kind: Kustomization
 resources:
 - longhorn.yaml
+- backup-target.yaml
 - ingress.yaml
 - volumesnapshotclass-longhorn.yaml

@@ -83,8 +83,6 @@ data:
 default-setting.yaml: |-
 priority-class: longhorn-critical
 disable-revision-counter: true
-backup-target: {{ .backupTarget }}
-backup-target-credential-secret: ""
 ---
 # Source: longhorn/templates/storageclass.yaml
 apiVersion: v1

@@ -118,6 +118,7 @@ spec:
 - "--accesslog=true"
 - "--accesslog.format=json"
 - "--log.level=INFO"
+- "--entryPoints.websecure.http.middlewares=crowdsec-security-chain@kubernetescrd"
 env:
 - name: POD_NAME

@@ -1,63 +0,0 @@
#!/bin/bash
set -e
set -o pipefail
if [ -z "${WILD_INSTANCE}" ]; then
echo "ERROR: WILD_INSTANCE is not set"
exit 1
fi
if [ -z "${WILD_API_DATA_DIR}" ]; then
echo "ERROR: WILD_API_DATA_DIR is not set"
exit 1
fi
if [ -z "${KUBECONFIG}" ]; then
echo "ERROR: KUBECONFIG is not set"
exit 1
fi
INSTANCE_DIR="${WILD_API_DATA_DIR}/instances/${WILD_INSTANCE}"
TRAEFIK_DIR="${INSTANCE_DIR}/apps/traefik"
echo "=== Setting up Traefik Ingress Controller ==="
echo ""
echo "Verifying MetalLB is ready (required for Traefik LoadBalancer service)..."
kubectl wait --for=condition=Ready pod -l component=controller -n metallb-system --timeout=60s 2>/dev/null || {
echo "MetalLB controller not ready, but continuing with Traefik installation"
echo "Note: Traefik LoadBalancer service may not get external IP without MetalLB"
}
echo "Installing Gateway API CRDs..."
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.0.0/standard-install.yaml
echo "Installing Traefik CRDs..."
kubectl apply -f https://raw.githubusercontent.com/traefik/traefik/v3.4/docs/content/reference/dynamic-configuration/kubernetes-crd-definition-v1.yml
echo "Waiting for CRDs to be established..."
kubectl wait --for condition=established crd/gateways.gateway.networking.k8s.io --timeout=60s
kubectl wait --for condition=established crd/gatewayclasses.gateway.networking.k8s.io --timeout=60s
kubectl wait --for condition=established crd/ingressroutes.traefik.io --timeout=60s
kubectl wait --for condition=established crd/middlewares.traefik.io --timeout=60s
echo "Using pre-compiled Traefik templates..."
if [ ! -f "${TRAEFIK_DIR}/kustomization.yaml" ]; then
echo "ERROR: Compiled templates not found at ${TRAEFIK_DIR}"
echo "Templates should be compiled before deployment."
exit 1
fi
echo "Deploying Traefik..."
kubectl apply -k ${TRAEFIK_DIR}/
echo "Waiting for Traefik to be ready..."
kubectl wait --for=condition=Available deployment/traefik -n traefik --timeout=120s
echo ""
echo "Traefik installed successfully"
echo ""
echo "To verify the installation:"
echo " kubectl get pods -n traefik"
echo " kubectl get svc -n traefik"
echo ""

@@ -8,3 +8,16 @@ requires:
 defaultConfig:
 namespace: traefik
 loadBalancerIp: "{{ .apps.metallb.loadBalancerIp }}"
+deploy:
+crds:
+- url: https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.0.0/standard-install.yaml
+waitFor:
+- gateways.gateway.networking.k8s.io
+- gatewayclasses.gateway.networking.k8s.io
+- url: https://raw.githubusercontent.com/traefik/traefik/v3.4/docs/content/reference/dynamic-configuration/kubernetes-crd-definition-v1.yml
+waitFor:
+- ingressroutes.traefik.io
+- middlewares.traefik.io
+waitForRollout:
+name: traefik
+timeout: "120s"
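
The `crds` entries mirror what the removed install script did by hand: apply each URL, then wait for the listed CRDs to be established. The sketch below prints the equivalent kubectl commands. `crd_commands` is a hypothetical helper, the 60s timeout is carried over from the removed script, and how the deploy engine actually interprets `waitFor` is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: print the kubectl commands for one crds entry.
# Args: manifest URL, then the CRD names listed under waitFor.
crd_commands() {
  local url="$1" crd
  shift
  printf 'kubectl apply -f %s\n' "$url"
  for crd in "$@"; do
    printf 'kubectl wait --for condition=established crd/%s --timeout=60s\n' "$crd"
  done
}

crd_commands \
  "https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.0.0/standard-install.yaml" \
  gateways.gateway.networking.k8s.io \
  gatewayclasses.gateway.networking.k8s.io
```

Printing rather than executing keeps the sketch safe to run without a cluster.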