New OPS-centric setup. Integrated with wild-init and wild-setup.

2025-06-21 14:22:22 -07:00
parent e55b9b2b8c
commit f90baac653
70 changed files with 128 additions and 197 deletions

setup/cluster/README.md Normal file

@@ -0,0 +1,101 @@
# Infrastructure setup scripts
These scripts create a fully functional personal cloud infrastructure on a bare-metal Kubernetes (k3s) cluster that provides:
1. **External access** to services via configured domain names (using ${DOMAIN})
2. **Internal-only access** to admin interfaces (via internal.${DOMAIN} subdomains)
3. **Secure traffic routing** with automatic TLS
4. **Reliable networking** with proper load balancing
## Architecture
```
External: Internet → External DNS → MetalLB LoadBalancer → Traefik → Kubernetes Services
Internal: Internal Network → Internal DNS (CoreDNS) → MetalLB LoadBalancer → Traefik → Kubernetes Services
```
## Key Components
- **MetalLB** - Provides load balancing for bare metal clusters
- **Traefik** - Handles ingress traffic, TLS termination, and routing
- **cert-manager** - Manages TLS certificates
- **CoreDNS** - Provides DNS resolution for services
- **Longhorn** - Distributed storage system for persistent volumes
- **NFS** - Network file system for shared media storage (optional)
- **Kubernetes Dashboard** - Web UI for cluster management (accessible via https://dashboard.internal.${DOMAIN})
- **Docker Registry** - Private container registry for custom images (see the example push below)
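Once the registry is running, a locally built image can be pushed to it. This is a sketch: `my-app` is a hypothetical image name, and the hostname assumes the default `docker-registry.internal.${DOMAIN}` ingress defined in this repository.
```bash
# Tag a local image for the private in-cluster registry and push it through the internal ingress.
docker tag my-app:latest "docker-registry.internal.${DOMAIN}/my-app:latest"
docker push "docker-registry.internal.${DOMAIN}/my-app:latest"
```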
## Configuration Approach
All infrastructure components use a consistent configuration approach:
1. **Environment Variables** - All configuration settings are managed using environment variables loaded by running `source load-env.sh`
2. **Template Files** - Configuration files use templates with `${VARIABLE}` syntax (see the sketch below)
3. **Setup Scripts** - Each component has a dedicated script in `infrastructure_setup/` for installation and configuration
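A minimal sketch of that pattern, as used by the setup scripts (the wildcard Certificate template under `cert-manager/` is one such template):
```bash
# Load DOMAIN, EMAIL, CLOUDFLARE_API_TOKEN, etc. into the current shell.
source load-env.sh
# Render a ${VARIABLE} template with envsubst and apply it to the cluster.
cat cert-manager/wildcard-certificate.yaml | envsubst | kubectl apply -f -
```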
## Idempotent Design
All setup scripts are designed to be idempotent:
- Scripts can be run multiple times without causing harm
- Each script checks for existing resources before creating new ones
- Configuration updates are applied cleanly without duplication
- Failed or interrupted setups can be safely retried
- Changes to configuration will be properly applied on subsequent runs
This idempotent approach ensures consistent, reliable infrastructure setup and allows for incremental changes without requiring a complete teardown and rebuild.
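The setup scripts lean heavily on `kubectl create --dry-run=client -o yaml | kubectl apply -f -`, which renders a manifest and then applies it, so re-runs update resources instead of failing. A minimal sketch of the pattern:
```bash
# Namespaces are created idempotently: "create" only renders the manifest,
# and "apply" makes the operation safe to repeat.
kubectl create namespace cert-manager --dry-run=client -o yaml | kubectl apply -f -
# Secrets are (re)created the same way, so changed values are applied cleanly.
kubectl create secret generic cloudflare-api-token \
  --namespace cert-manager \
  --from-literal=api-token="${CLOUDFLARE_API_TOKEN}" \
  --dry-run=client -o yaml | kubectl apply -f -
```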
## NFS Setup (Optional)
The infrastructure supports optional NFS (Network File System) for shared media storage across the cluster:
### Host Setup
First, set up the NFS server on your chosen host:
```bash
# Set required environment variables
export NFS_HOST=box-01 # Hostname or IP of NFS server
export NFS_MEDIA_PATH=/data/media # Path to media directory
export NFS_STORAGE_CAPACITY=1Ti # Optional: PV size (default: 250Gi)
# Run host setup script on the NFS server
./infrastructure_setup/setup-nfs-host.sh
```
### Cluster Integration
Then integrate NFS with your Kubernetes cluster:
```bash
# Run cluster setup (part of setup-all.sh or standalone)
./infrastructure_setup/setup-nfs.sh
```
### Features
- **Automatic IP detection** - Uses network IP even when hostname resolves to localhost
- **Cluster-wide access** - Any pod can mount the NFS share regardless of node placement
- **Configurable capacity** - Set PersistentVolume size via `NFS_STORAGE_CAPACITY`
- **ReadWriteMany** - Multiple pods can simultaneously access the same storage
### Usage
Applications can use NFS storage by setting `storageClassName: nfs` in their PVCs:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: media-pvc
spec:
accessModes:
- ReadWriteMany
storageClassName: nfs
resources:
requests:
storage: 100Gi
```
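A pod can then mount the claim like any other volume. This is a sketch: the pod name and image are hypothetical, and `media-pvc` refers to the claim above.
```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: media-browser
spec:
  containers:
    - name: shell
      image: busybox:1.36
      command: ["sleep", "3600"]
      volumeMounts:
        - name: media
          mountPath: /media   # NFS share appears here inside the container
  volumes:
    - name: media
      persistentVolumeClaim:
        claimName: media-pvc  # ReadWriteMany, so other pods can mount it too
EOF
```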

File diff suppressed because it is too large


@@ -0,0 +1,19 @@
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: wildcard-internal-wild-cloud
namespace: cert-manager
spec:
secretName: wildcard-internal-wild-cloud-tls
dnsNames:
- "*.internal.${DOMAIN}"
- "internal.${DOMAIN}"
issuerRef:
name: letsencrypt-prod
kind: ClusterIssuer
duration: 2160h # 90 days
renewBefore: 360h # 15 days
privateKey:
algorithm: RSA
size: 2048


@@ -0,0 +1,26 @@
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
email: ${EMAIL}
privateKeySecretRef:
name: letsencrypt-prod
server: https://acme-v02.api.letsencrypt.org/directory
solvers:
# DNS-01 solver for wildcard certificates
- dns01:
cloudflare:
email: ${EMAIL}
apiTokenSecretRef:
name: cloudflare-api-token
key: api-token
selector:
dnsZones:
- "${CLOUDFLARE_DOMAIN}"
# Keep the HTTP-01 solver for non-wildcard certificates
- http01:
ingress:
class: traefik


@@ -0,0 +1,26 @@
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-staging
spec:
acme:
email: ${EMAIL}
privateKeySecretRef:
name: letsencrypt-staging
server: https://acme-staging-v02.api.letsencrypt.org/directory
solvers:
# DNS-01 solver for wildcard certificates
- dns01:
cloudflare:
email: ${EMAIL}
apiTokenSecretRef:
name: cloudflare-api-token
key: api-token
selector:
dnsZones:
- "${CLOUDFLARE_DOMAIN}"
# Keep the HTTP-01 solver for non-wildcard certificates
- http01:
ingress:
class: traefik


@@ -0,0 +1,19 @@
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: wildcard-wild-cloud
namespace: cert-manager
spec:
secretName: wildcard-wild-cloud-tls
dnsNames:
- "*.${DOMAIN}"
- "${DOMAIN}"
issuerRef:
name: letsencrypt-prod
kind: ClusterIssuer
duration: 2160h # 90 days
renewBefore: 360h # 15 days
privateKey:
algorithm: RSA
size: 2048


@@ -0,0 +1,49 @@
# CoreDNS
- https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
- https://github.com/kubernetes/dns/blob/master/docs/specification.md
- https://coredns.io/
CoreDNS runs the `kubernetes` plugin, so it returns all Kubernetes service endpoints in the well-known format.
All services and pods are registered in CoreDNS.
- <service-name>.<namespace>.svc.cluster.local
- <service-name>.<namespace>
- <service-name> (if in the same namespace)
- <pod-ipv4-address>.<namespace>.pod.cluster.local
- <pod-ipv4-address>.<service-name>.<namespace>.svc.cluster.local
Any query for a name in the `internal.$DOMAIN` domain is answered with the IP of the Traefik proxy. We expose the CoreDNS server on the LAN via MetalLB specifically for this capability.
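For example, from a machine on the LAN (a sketch using the addresses configured in this repository: the CoreDNS LoadBalancer at 192.168.8.241, the Traefik proxy at 192.168.8.240, and the `cloud.payne.io` domain from the coredns-custom ConfigMap):
```bash
# Any name under internal.<domain> is answered with the Traefik proxy IP (192.168.8.240).
dig +short dashboard.internal.cloud.payne.io @192.168.8.241
# Standard Kubernetes service names resolve through the same server.
dig +short kubernetes-dashboard.kubernetes-dashboard.svc.cluster.local @192.168.8.241
```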
## Default CoreDNS Configuration
Found at: https://github.com/k3s-io/k3s/blob/master/manifests/coredns.yaml
This is k3s default CoreDNS configuration, for reference:
```txt
.:53 {
errors
health
ready
kubernetes %{CLUSTER_DOMAIN}% in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
}
hosts /etc/coredns/NodeHosts {
ttl 60
reload 15s
fallthrough
}
prometheus :9153
forward . /etc/resolv.conf
cache 30
loop
reload
loadbalance
import /etc/coredns/custom/*.override
}
import /etc/coredns/custom/*.server
```


@@ -0,0 +1,28 @@
---
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns-custom
namespace: kube-system
data:
# Custom server block for internal domains. All internal domains should
# resolve to the cluster proxy.
internal.server: |
internal.cloud.payne.io {
errors
cache 30
reload
template IN A {
match (.*)\.internal\.cloud\.payne\.io\.
answer "{{ .Name }} 60 IN A 192.168.8.240"
}
template IN AAAA {
match (.*)\.internal\.cloud\.payne\.io\.
rcode NXDOMAIN
}
}
# Custom override to set external resolvers.
external.override: |
forward . 1.1.1.1 8.8.8.8 {
max_concurrent 1000
}


@@ -0,0 +1,25 @@
---
apiVersion: v1
kind: Service
metadata:
name: coredns-lb
namespace: kube-system
annotations:
metallb.universe.tf/loadBalancerIPs: "192.168.8.241"
spec:
type: LoadBalancer
ports:
- name: dns
port: 53
protocol: UDP
targetPort: 53
- name: dns-tcp
port: 53
protocol: TCP
targetPort: 53
- name: metrics
port: 9153
protocol: TCP
targetPort: 9153
selector:
k8s-app: kube-dns


@@ -0,0 +1,2 @@
DOCKER_REGISTRY_STORAGE=10Gi
DOCKER_REGISTRY_HOST=docker-registry.$INTERNAL_DOMAIN


@@ -0,0 +1,36 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: docker-registry
labels:
app: docker-registry
spec:
replicas: 1
selector:
matchLabels:
app: docker-registry
strategy:
rollingUpdate:
maxSurge: 0
maxUnavailable: 1
type: RollingUpdate
template:
metadata:
labels:
app: docker-registry
spec:
containers:
- image: registry:3.0.0
name: docker-registry
ports:
- containerPort: 5000
protocol: TCP
volumeMounts:
- mountPath: /var/lib/registry
name: docker-registry-storage
readOnly: false
volumes:
- name: docker-registry-storage
persistentVolumeClaim:
claimName: docker-registry-pvc


@@ -0,0 +1,20 @@
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: docker-registry
spec:
rules:
- host: docker-registry.internal.${DOMAIN}
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: docker-registry
port:
number: 5000
tls:
- hosts:
- docker-registry.internal.${DOMAIN}
secretName: wildcard-internal-wild-cloud-tls


@@ -0,0 +1,40 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: docker-registry
labels:
- includeSelectors: true
pairs:
app: docker-registry
managedBy: wild-cloud
resources:
- deployment.yaml
- ingress.yaml
- service.yaml
- namespace.yaml
- pvc.yaml
configMapGenerator:
- name: docker-registry-config
envs:
- config/config.env
replacements:
- source:
kind: ConfigMap
name: docker-registry-config
fieldPath: data.DOCKER_REGISTRY_STORAGE
targets:
- select:
kind: PersistentVolumeClaim
name: docker-registry-pvc
fieldPaths:
- spec.resources.requests.storage
- source:
kind: ConfigMap
name: docker-registry-config
fieldPath: data.DOCKER_REGISTRY_HOST
targets:
- select:
kind: Ingress
name: docker-registry
fieldPaths:
- spec.rules.0.host
- spec.tls.0.hosts.0


@@ -0,0 +1,4 @@
apiVersion: v1
kind: Namespace
metadata:
name: docker-registry


@@ -0,0 +1,12 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: docker-registry-pvc
spec:
storageClassName: longhorn
accessModes:
- ReadWriteOnce
volumeMode: Filesystem
resources:
requests:
storage: 10Gi


@@ -0,0 +1,13 @@
---
apiVersion: v1
kind: Service
metadata:
name: docker-registry
labels:
app: docker-registry
spec:
ports:
- port: 5000
targetPort: 5000
selector:
app: docker-registry


@@ -0,0 +1,14 @@
# External DNS
See: https://github.com/kubernetes-sigs/external-dns
ExternalDNS keeps selected DNS zones (chosen via `--domain-filter`) synchronized with Ingresses, Services of type=LoadBalancer, and nodes, across various DNS providers.
Currently, we are only configured to use CloudFlare.
Docs: https://github.com/kubernetes-sigs/external-dns/blob/master/docs/tutorials/cloudflare.md
Any Ingress that has a `metadata.annotations` entry of the form
external-dns.alpha.kubernetes.io/hostname: `<something>.${DOMAIN}`
will have Cloudflare records created by External DNS.
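A sketch of such an Ingress (`myapp` is a hypothetical service; the shape mirrors the docker-registry ingress in this repository):
```bash
# Quoted heredoc so envsubst, not the shell, fills in ${DOMAIN}.
cat <<'EOF' | envsubst | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  annotations:
    external-dns.alpha.kubernetes.io/hostname: myapp.${DOMAIN}
spec:
  rules:
    - host: myapp.${DOMAIN}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port:
                  number: 80
EOF
```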


@@ -0,0 +1,39 @@
---
# CloudFlare provider for ExternalDNS
apiVersion: apps/v1
kind: Deployment
metadata:
name: external-dns
namespace: externaldns
spec:
selector:
matchLabels:
app: external-dns
strategy:
type: Recreate
template:
metadata:
labels:
app: external-dns
spec:
serviceAccountName: external-dns
containers:
- name: external-dns
image: registry.k8s.io/external-dns/external-dns:v0.13.4
args:
- --source=service
- --source=ingress
- --txt-owner-id=${OWNER_ID}
- --provider=cloudflare
- --domain-filter=payne.io
#- --exclude-domains=internal.${DOMAIN}
- --cloudflare-dns-records-per-page=5000
- --publish-internal-services
- --no-cloudflare-proxied
- --log-level=debug
env:
- name: CF_API_TOKEN
valueFrom:
secretKeyRef:
name: cloudflare-api-token
key: api-token


@@ -0,0 +1,35 @@
---
# Common RBAC resources for all ExternalDNS deployments
apiVersion: v1
kind: ServiceAccount
metadata:
name: external-dns
namespace: externaldns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: external-dns
rules:
- apiGroups: [""]
resources: ["services", "endpoints", "pods"]
verbs: ["get", "watch", "list"]
- apiGroups: ["extensions", "networking.k8s.io"]
resources: ["ingresses"]
verbs: ["get", "watch", "list"]
- apiGroups: [""]
resources: ["nodes"]
verbs: ["list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: external-dns-viewer
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: external-dns
subjects:
- kind: ServiceAccount
name: external-dns
namespace: externaldns

setup/cluster/get_helm.sh Executable file

@@ -0,0 +1,347 @@
#!/usr/bin/env bash
# Copyright The Helm Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# The install script is based off of the MIT-licensed script from glide,
# the package manager for Go: https://github.com/Masterminds/glide.sh/blob/master/get
: ${BINARY_NAME:="helm"}
: ${USE_SUDO:="true"}
: ${DEBUG:="false"}
: ${VERIFY_CHECKSUM:="true"}
: ${VERIFY_SIGNATURES:="false"}
: ${HELM_INSTALL_DIR:="/usr/local/bin"}
: ${GPG_PUBRING:="pubring.kbx"}
HAS_CURL="$(type "curl" &> /dev/null && echo true || echo false)"
HAS_WGET="$(type "wget" &> /dev/null && echo true || echo false)"
HAS_OPENSSL="$(type "openssl" &> /dev/null && echo true || echo false)"
HAS_GPG="$(type "gpg" &> /dev/null && echo true || echo false)"
HAS_GIT="$(type "git" &> /dev/null && echo true || echo false)"
HAS_TAR="$(type "tar" &> /dev/null && echo true || echo false)"
# initArch discovers the architecture for this system.
initArch() {
ARCH=$(uname -m)
case $ARCH in
armv5*) ARCH="armv5";;
armv6*) ARCH="armv6";;
armv7*) ARCH="arm";;
aarch64) ARCH="arm64";;
x86) ARCH="386";;
x86_64) ARCH="amd64";;
i686) ARCH="386";;
i386) ARCH="386";;
esac
}
# initOS discovers the operating system for this system.
initOS() {
OS=$(echo `uname`|tr '[:upper:]' '[:lower:]')
case "$OS" in
# Minimalist GNU for Windows
mingw*|cygwin*) OS='windows';;
esac
}
# runs the given command as root (detects if we are root already)
runAsRoot() {
if [ $EUID -ne 0 -a "$USE_SUDO" = "true" ]; then
sudo "${@}"
else
"${@}"
fi
}
# verifySupported checks that the os/arch combination is supported for
# binary builds, as well whether or not necessary tools are present.
verifySupported() {
local supported="darwin-amd64\ndarwin-arm64\nlinux-386\nlinux-amd64\nlinux-arm\nlinux-arm64\nlinux-ppc64le\nlinux-s390x\nlinux-riscv64\nwindows-amd64\nwindows-arm64"
if ! echo "${supported}" | grep -q "${OS}-${ARCH}"; then
echo "No prebuilt binary for ${OS}-${ARCH}."
echo "To build from source, go to https://github.com/helm/helm"
exit 1
fi
if [ "${HAS_CURL}" != "true" ] && [ "${HAS_WGET}" != "true" ]; then
echo "Either curl or wget is required"
exit 1
fi
if [ "${VERIFY_CHECKSUM}" == "true" ] && [ "${HAS_OPENSSL}" != "true" ]; then
echo "In order to verify checksum, openssl must first be installed."
echo "Please install openssl or set VERIFY_CHECKSUM=false in your environment."
exit 1
fi
if [ "${VERIFY_SIGNATURES}" == "true" ]; then
if [ "${HAS_GPG}" != "true" ]; then
echo "In order to verify signatures, gpg must first be installed."
echo "Please install gpg or set VERIFY_SIGNATURES=false in your environment."
exit 1
fi
if [ "${OS}" != "linux" ]; then
echo "Signature verification is currently only supported on Linux."
echo "Please set VERIFY_SIGNATURES=false or verify the signatures manually."
exit 1
fi
fi
if [ "${HAS_GIT}" != "true" ]; then
echo "[WARNING] Could not find git. It is required for plugin installation."
fi
if [ "${HAS_TAR}" != "true" ]; then
echo "[ERROR] Could not find tar. It is required to extract the helm binary archive."
exit 1
fi
}
# checkDesiredVersion checks if the desired version is available.
checkDesiredVersion() {
if [ "x$DESIRED_VERSION" == "x" ]; then
# Get tag from release URL
local latest_release_url="https://get.helm.sh/helm-latest-version"
local latest_release_response=""
if [ "${HAS_CURL}" == "true" ]; then
latest_release_response=$( curl -L --silent --show-error --fail "$latest_release_url" 2>&1 || true )
elif [ "${HAS_WGET}" == "true" ]; then
latest_release_response=$( wget "$latest_release_url" -q -O - 2>&1 || true )
fi
TAG=$( echo "$latest_release_response" | grep '^v[0-9]' )
if [ "x$TAG" == "x" ]; then
printf "Could not retrieve the latest release tag information from %s: %s\n" "${latest_release_url}" "${latest_release_response}"
exit 1
fi
else
TAG=$DESIRED_VERSION
fi
}
# checkHelmInstalledVersion checks which version of helm is installed and
# if it needs to be changed.
checkHelmInstalledVersion() {
if [[ -f "${HELM_INSTALL_DIR}/${BINARY_NAME}" ]]; then
local version=$("${HELM_INSTALL_DIR}/${BINARY_NAME}" version --template="{{ .Version }}")
if [[ "$version" == "$TAG" ]]; then
echo "Helm ${version} is already ${DESIRED_VERSION:-latest}"
return 0
else
echo "Helm ${TAG} is available. Changing from version ${version}."
return 1
fi
else
return 1
fi
}
# downloadFile downloads the latest binary package and also the checksum
# for that binary.
downloadFile() {
HELM_DIST="helm-$TAG-$OS-$ARCH.tar.gz"
DOWNLOAD_URL="https://get.helm.sh/$HELM_DIST"
CHECKSUM_URL="$DOWNLOAD_URL.sha256"
HELM_TMP_ROOT="$(mktemp -dt helm-installer-XXXXXX)"
HELM_TMP_FILE="$HELM_TMP_ROOT/$HELM_DIST"
HELM_SUM_FILE="$HELM_TMP_ROOT/$HELM_DIST.sha256"
echo "Downloading $DOWNLOAD_URL"
if [ "${HAS_CURL}" == "true" ]; then
curl -SsL "$CHECKSUM_URL" -o "$HELM_SUM_FILE"
curl -SsL "$DOWNLOAD_URL" -o "$HELM_TMP_FILE"
elif [ "${HAS_WGET}" == "true" ]; then
wget -q -O "$HELM_SUM_FILE" "$CHECKSUM_URL"
wget -q -O "$HELM_TMP_FILE" "$DOWNLOAD_URL"
fi
}
# verifyFile verifies the SHA256 checksum of the binary package
# and the GPG signatures for both the package and checksum file
# (depending on settings in environment).
verifyFile() {
if [ "${VERIFY_CHECKSUM}" == "true" ]; then
verifyChecksum
fi
if [ "${VERIFY_SIGNATURES}" == "true" ]; then
verifySignatures
fi
}
# installFile installs the Helm binary.
installFile() {
HELM_TMP="$HELM_TMP_ROOT/$BINARY_NAME"
mkdir -p "$HELM_TMP"
tar xf "$HELM_TMP_FILE" -C "$HELM_TMP"
HELM_TMP_BIN="$HELM_TMP/$OS-$ARCH/helm"
echo "Preparing to install $BINARY_NAME into ${HELM_INSTALL_DIR}"
runAsRoot cp "$HELM_TMP_BIN" "$HELM_INSTALL_DIR/$BINARY_NAME"
echo "$BINARY_NAME installed into $HELM_INSTALL_DIR/$BINARY_NAME"
}
# verifyChecksum verifies the SHA256 checksum of the binary package.
verifyChecksum() {
printf "Verifying checksum... "
local sum=$(openssl sha1 -sha256 ${HELM_TMP_FILE} | awk '{print $2}')
local expected_sum=$(cat ${HELM_SUM_FILE})
if [ "$sum" != "$expected_sum" ]; then
echo "SHA sum of ${HELM_TMP_FILE} does not match. Aborting."
exit 1
fi
echo "Done."
}
# verifySignatures obtains the latest KEYS file from GitHub main branch
# as well as the signature .asc files from the specific GitHub release,
# then verifies that the release artifacts were signed by a maintainer's key.
verifySignatures() {
printf "Verifying signatures... "
local keys_filename="KEYS"
local github_keys_url="https://raw.githubusercontent.com/helm/helm/main/${keys_filename}"
if [ "${HAS_CURL}" == "true" ]; then
curl -SsL "${github_keys_url}" -o "${HELM_TMP_ROOT}/${keys_filename}"
elif [ "${HAS_WGET}" == "true" ]; then
wget -q -O "${HELM_TMP_ROOT}/${keys_filename}" "${github_keys_url}"
fi
local gpg_keyring="${HELM_TMP_ROOT}/keyring.gpg"
local gpg_homedir="${HELM_TMP_ROOT}/gnupg"
mkdir -p -m 0700 "${gpg_homedir}"
local gpg_stderr_device="/dev/null"
if [ "${DEBUG}" == "true" ]; then
gpg_stderr_device="/dev/stderr"
fi
gpg --batch --quiet --homedir="${gpg_homedir}" --import "${HELM_TMP_ROOT}/${keys_filename}" 2> "${gpg_stderr_device}"
gpg --batch --no-default-keyring --keyring "${gpg_homedir}/${GPG_PUBRING}" --export > "${gpg_keyring}"
local github_release_url="https://github.com/helm/helm/releases/download/${TAG}"
if [ "${HAS_CURL}" == "true" ]; then
curl -SsL "${github_release_url}/helm-${TAG}-${OS}-${ARCH}.tar.gz.sha256.asc" -o "${HELM_TMP_ROOT}/helm-${TAG}-${OS}-${ARCH}.tar.gz.sha256.asc"
curl -SsL "${github_release_url}/helm-${TAG}-${OS}-${ARCH}.tar.gz.asc" -o "${HELM_TMP_ROOT}/helm-${TAG}-${OS}-${ARCH}.tar.gz.asc"
elif [ "${HAS_WGET}" == "true" ]; then
wget -q -O "${HELM_TMP_ROOT}/helm-${TAG}-${OS}-${ARCH}.tar.gz.sha256.asc" "${github_release_url}/helm-${TAG}-${OS}-${ARCH}.tar.gz.sha256.asc"
wget -q -O "${HELM_TMP_ROOT}/helm-${TAG}-${OS}-${ARCH}.tar.gz.asc" "${github_release_url}/helm-${TAG}-${OS}-${ARCH}.tar.gz.asc"
fi
local error_text="If you think this might be a potential security issue,"
error_text="${error_text}\nplease see here: https://github.com/helm/community/blob/master/SECURITY.md"
local num_goodlines_sha=$(gpg --verify --keyring="${gpg_keyring}" --status-fd=1 "${HELM_TMP_ROOT}/helm-${TAG}-${OS}-${ARCH}.tar.gz.sha256.asc" 2> "${gpg_stderr_device}" | grep -c -E '^\[GNUPG:\] (GOODSIG|VALIDSIG)')
if [[ ${num_goodlines_sha} -lt 2 ]]; then
echo "Unable to verify the signature of helm-${TAG}-${OS}-${ARCH}.tar.gz.sha256!"
echo -e "${error_text}"
exit 1
fi
local num_goodlines_tar=$(gpg --verify --keyring="${gpg_keyring}" --status-fd=1 "${HELM_TMP_ROOT}/helm-${TAG}-${OS}-${ARCH}.tar.gz.asc" 2> "${gpg_stderr_device}" | grep -c -E '^\[GNUPG:\] (GOODSIG|VALIDSIG)')
if [[ ${num_goodlines_tar} -lt 2 ]]; then
echo "Unable to verify the signature of helm-${TAG}-${OS}-${ARCH}.tar.gz!"
echo -e "${error_text}"
exit 1
fi
echo "Done."
}
# fail_trap is executed if an error occurs.
fail_trap() {
result=$?
if [ "$result" != "0" ]; then
if [[ -n "$INPUT_ARGUMENTS" ]]; then
echo "Failed to install $BINARY_NAME with the arguments provided: $INPUT_ARGUMENTS"
help
else
echo "Failed to install $BINARY_NAME"
fi
echo -e "\tFor support, go to https://github.com/helm/helm."
fi
cleanup
exit $result
}
# testVersion tests the installed client to make sure it is working.
testVersion() {
set +e
HELM="$(command -v $BINARY_NAME)"
if [ "$?" = "1" ]; then
echo "$BINARY_NAME not found. Is $HELM_INSTALL_DIR on your "'$PATH?'
exit 1
fi
set -e
}
# help provides possible cli installation arguments
help () {
echo "Accepted cli arguments are:"
echo -e "\t[--help|-h ] ->> prints this help"
echo -e "\t[--version|-v <desired_version>] . When not defined it fetches the latest release tag from the Helm CDN"
echo -e "\te.g. --version v3.0.0 or -v canary"
echo -e "\t[--no-sudo] ->> install without sudo"
}
# cleanup temporary files to avoid https://github.com/helm/helm/issues/2977
cleanup() {
if [[ -d "${HELM_TMP_ROOT:-}" ]]; then
rm -rf "$HELM_TMP_ROOT"
fi
}
# Execution
#Stop execution on any error
trap "fail_trap" EXIT
set -e
# Set debug if desired
if [ "${DEBUG}" == "true" ]; then
set -x
fi
# Parsing input arguments (if any)
export INPUT_ARGUMENTS="${@}"
set -u
while [[ $# -gt 0 ]]; do
case $1 in
'--version'|-v)
shift
if [[ $# -ne 0 ]]; then
export DESIRED_VERSION="${1}"
if [[ "$1" != "v"* ]]; then
echo "Expected version arg ('${DESIRED_VERSION}') to begin with 'v', fixing..."
export DESIRED_VERSION="v${1}"
fi
else
echo -e "Please provide the desired version. e.g. --version v3.0.0 or -v canary"
exit 0
fi
;;
'--no-sudo')
USE_SUDO="false"
;;
'--help'|-h)
help
exit 0
;;
*) exit 1
;;
esac
shift
done
set +u
initArch
initOS
verifySupported
checkDesiredVersion
if ! checkHelmInstalledVersion; then
downloadFile
verifyFile
installFile
fi
testVersion
cleanup


@@ -0,0 +1,32 @@
---
# Service Account and RBAC for Dashboard admin access
apiVersion: v1
kind: ServiceAccount
metadata:
name: dashboard-admin
namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: dashboard-admin
subjects:
- kind: ServiceAccount
name: dashboard-admin
namespace: kubernetes-dashboard
roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
---
# Token for dashboard-admin
apiVersion: v1
kind: Secret
metadata:
name: dashboard-admin-token
namespace: kubernetes-dashboard
annotations:
kubernetes.io/service-account.name: dashboard-admin
type: kubernetes.io/service-account-token


@@ -0,0 +1,84 @@
---
# Internal-only middleware
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
name: internal-only
namespace: kubernetes-dashboard
spec:
ipWhiteList:
# Restrict to local private network ranges
sourceRange:
- 127.0.0.1/32 # localhost
- 10.0.0.0/8 # Private network
- 172.16.0.0/12 # Private network
- 192.168.0.0/16 # Private network
---
# HTTPS redirect middleware
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
name: dashboard-redirect-scheme
namespace: kubernetes-dashboard
spec:
redirectScheme:
scheme: https
permanent: true
---
# IngressRoute for Dashboard
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
name: kubernetes-dashboard-https
namespace: kubernetes-dashboard
spec:
entryPoints:
- websecure
routes:
- match: Host(`dashboard.internal.${DOMAIN}`)
kind: Rule
middlewares:
- name: internal-only
namespace: kubernetes-dashboard
services:
- name: kubernetes-dashboard
port: 443
serversTransport: dashboard-transport
tls:
secretName: wildcard-internal-wild-cloud-tls
---
# HTTP to HTTPS redirect.
# FIXME: Is this needed?
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
name: kubernetes-dashboard-http
namespace: kubernetes-dashboard
spec:
entryPoints:
- web
routes:
- match: Host(`dashboard.internal.${DOMAIN}`)
kind: Rule
middlewares:
- name: dashboard-redirect-scheme
namespace: kubernetes-dashboard
services:
- name: kubernetes-dashboard
port: 443
serversTransport: dashboard-transport
---
# ServersTransport for HTTPS backend with skip verify.
# FIXME: Is this needed?
apiVersion: traefik.containo.us/v1alpha1
kind: ServersTransport
metadata:
name: dashboard-transport
namespace: kubernetes-dashboard
spec:
insecureSkipVerify: true
serverName: dashboard.internal.${DOMAIN}


@@ -0,0 +1,20 @@
# Longhorn Storage
See: [Longhorn Docs v 1.8.1](https://longhorn.io/docs/1.8.1/deploy/install/install-with-kubectl/)
## Installation Notes
- Manifest copied from https://raw.githubusercontent.com/longhorn/longhorn/v1.8.1/deploy/longhorn.yaml
- Using kustomize to apply custom configuration (see `kustomization.yaml`)
## Important Settings
- **Number of Replicas**: Set to 1 (default is 3) to accommodate smaller clusters
- This avoids "degraded" volumes when fewer than 3 nodes are available
- For production with 3+ nodes, consider changing back to 3 for better availability
## Common Operations
- View volumes: `kubectl get volumes.longhorn.io -n longhorn-system`
- Check volume status: `kubectl describe volumes.longhorn.io <volume-name> -n longhorn-system`
- Access Longhorn UI: Set up port-forwarding with `kubectl -n longhorn-system port-forward service/longhorn-frontend 8080:80`
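Workloads claim Longhorn-backed volumes through the `longhorn` StorageClass. A sketch (the claim name is hypothetical; the docker-registry PVC in this repository follows the same shape):
```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 5Gi
EOF
```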


@@ -0,0 +1,5 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- longhorn.yaml

File diff suppressed because it is too large


@@ -0,0 +1,18 @@
namespace: metallb-system
resources:
- pool.yaml
configMapGenerator:
- name: metallb-config
envs:
- config/config.env
replacements:
- source:
kind: ConfigMap
name: metallb-config
fieldPath: data.CLUSTER_LOAD_BALANCER_RANGE
targets:
- select:
kind: IPAddressPool
name: first-pool
fieldPaths:
- spec.addresses.0


@@ -0,0 +1,19 @@
---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: first-pool
namespace: metallb-system
spec:
addresses:
- PLACEHOLDER_CLUSTER_LOAD_BALANCER_RANGE
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: l2-advertisement
namespace: metallb-system
spec:
ipAddressPools:
- first-pool


@@ -0,0 +1,3 @@
namespace: metallb-system
resources:
- github.com/metallb/metallb/config/native?ref=v0.15.0


@@ -0,0 +1,53 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- persistent-volume.yaml
- storage-class.yaml
replacements:
- source:
kind: ConfigMap
name: nfs-config
fieldPath: data.NFS_HOST_IP
targets:
- select:
kind: PersistentVolume
name: nfs-media-pv
fieldPaths:
- spec.nfs.server
- select:
kind: StorageClass
name: nfs
fieldPaths:
- parameters.server
- source:
kind: ConfigMap
name: nfs-config
fieldPath: data.NFS_MEDIA_PATH
targets:
- select:
kind: PersistentVolume
name: nfs-media-pv
fieldPaths:
- spec.nfs.path
- select:
kind: StorageClass
name: nfs
fieldPaths:
- parameters.path
- source:
kind: ConfigMap
name: nfs-config
fieldPath: data.NFS_STORAGE_CAPACITY
targets:
- select:
kind: PersistentVolume
name: nfs-media-pv
fieldPaths:
- spec.capacity.storage
configMapGenerator:
- name: nfs-config
envs:
- config/config.env


@@ -0,0 +1,23 @@
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-media-pv
labels:
storage: nfs-media
spec:
capacity:
storage: REPLACE_ME
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs
nfs:
server: REPLACE_ME
path: REPLACE_ME
mountOptions:
- nfsvers=4.1
- rsize=1048576
- wsize=1048576
- hard
- intr
- timeo=600


@@ -0,0 +1,10 @@
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: nfs
provisioner: nfs
parameters:
server: REPLACE_ME
path: REPLACE_ME
reclaimPolicy: Retain
allowVolumeExpansion: true

setup/cluster/setup-all.sh Executable file

@@ -0,0 +1,55 @@
#!/bin/bash
set -e
# Navigate to script directory
SCRIPT_PATH="$(realpath "${BASH_SOURCE[0]}")"
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")"
cd "$SCRIPT_DIR"
echo "Setting up infrastructure components for k3s..."
# Make all script files executable
chmod +x *.sh
# Utils
./setup-utils.sh
# Setup MetalLB (must be first for IP allocation)
./setup-metallb.sh
# Setup Longhorn
./setup-longhorn.sh
# Setup Traefik
./setup-traefik.sh
# Setup CoreDNS
./setup-coredns.sh
# Setup cert-manager
./setup-cert-manager.sh
# Setup ExternalDNS
./setup-externaldns.sh
# Setup Kubernetes Dashboard
./setup-dashboard.sh
# Setup NFS Kubernetes integration (optional)
./setup-nfs.sh
# Setup Docker Registry
./setup-registry.sh
echo "Infrastructure setup complete!"
echo
echo "Next steps:"
echo "1. Install Helm charts for non-infrastructure components"
echo "2. Access the dashboard at: https://dashboard.internal.${DOMAIN}"
echo "3. Get the dashboard token with: ./bin/dashboard-token"
echo
echo "To verify components, run:"
echo "- kubectl get pods -n cert-manager"
echo "- kubectl get pods -n externaldns"
echo "- kubectl get pods -n kubernetes-dashboard"
echo "- kubectl get clusterissuers"


@@ -0,0 +1,93 @@
#!/bin/bash
set -e
# Navigate to script directory
SCRIPT_PATH="$(realpath "${BASH_SOURCE[0]}")"
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")"
cd "$SCRIPT_DIR"
# Source environment variables
if [[ -f "../load-env.sh" ]]; then
source ../load-env.sh
fi
echo "Setting up cert-manager..."
# Create cert-manager namespace
kubectl create namespace cert-manager --dry-run=client -o yaml | kubectl apply -f -
# Install cert-manager using the official installation method
# This installs CRDs, controllers, and webhook components
echo "Installing cert-manager components..."
# Using stable URL for cert-manager installation
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.1/cert-manager.yaml || \
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.13.1/cert-manager.yaml
# Wait for cert-manager to be ready
echo "Waiting for cert-manager to be ready..."
kubectl wait --for=condition=Available deployment/cert-manager -n cert-manager --timeout=120s
kubectl wait --for=condition=Available deployment/cert-manager-cainjector -n cert-manager --timeout=120s
kubectl wait --for=condition=Available deployment/cert-manager-webhook -n cert-manager --timeout=120s
# Add delay to allow webhook to be fully ready
echo "Waiting additional time for cert-manager webhook to be fully operational..."
sleep 30
# Setup Cloudflare API token for DNS01 challenges
if [[ -n "${CLOUDFLARE_API_TOKEN}" ]]; then
echo "Creating Cloudflare API token secret in cert-manager namespace..."
kubectl create secret generic cloudflare-api-token \
--namespace cert-manager \
--from-literal=api-token="${CLOUDFLARE_API_TOKEN}" \
--dry-run=client -o yaml | kubectl apply -f -
else
echo "Warning: CLOUDFLARE_API_TOKEN not set. DNS01 challenges will not work."
fi
echo "Creating Let's Encrypt issuers..."
cat ${SCRIPT_DIR}/cert-manager/letsencrypt-staging-dns01.yaml | envsubst | kubectl apply -f -
cat ${SCRIPT_DIR}/cert-manager/letsencrypt-prod-dns01.yaml | envsubst | kubectl apply -f -
# Wait for issuers to be ready
echo "Waiting for Let's Encrypt issuers to be ready..."
sleep 10
# Configure cert-manager to use external DNS for challenge verification
echo "Configuring cert-manager to use external DNS servers..."
kubectl patch deployment cert-manager -n cert-manager --patch '
spec:
template:
spec:
dnsPolicy: None
dnsConfig:
nameservers:
- "1.1.1.1"
- "8.8.8.8"
searches:
- cert-manager.svc.cluster.local
- svc.cluster.local
- cluster.local
options:
- name: ndots
value: "5"'
# Wait for cert-manager to restart with new DNS config
echo "Waiting for cert-manager to restart with new DNS configuration..."
kubectl rollout status deployment/cert-manager -n cert-manager --timeout=120s
# Apply wildcard certificates
echo "Creating wildcard certificates..."
cat ${SCRIPT_DIR}/cert-manager/internal-wildcard-certificate.yaml | envsubst | kubectl apply -f -
cat ${SCRIPT_DIR}/cert-manager/wildcard-certificate.yaml | envsubst | kubectl apply -f -
echo "Wildcard certificate creation initiated. This may take some time to complete depending on DNS propagation."
# Wait for the certificates to be issued (with a timeout)
echo "Waiting for wildcard certificates to be ready (this may take several minutes)..."
kubectl wait --for=condition=Ready certificate wildcard-internal-wild-cloud -n cert-manager --timeout=300s || true
kubectl wait --for=condition=Ready certificate wildcard-wild-cloud -n cert-manager --timeout=300s || true
echo "cert-manager setup complete!"
echo ""
echo "To verify the installation:"
echo " kubectl get pods -n cert-manager"
echo " kubectl get clusterissuers"

setup/cluster/setup-coredns.sh Executable file

@@ -0,0 +1,30 @@
#!/bin/bash
set -e
SCRIPT_PATH="$(realpath "${BASH_SOURCE[0]}")"
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")"
cd "$SCRIPT_DIR"
# Source environment variables
if [[ -f "../load-env.sh" ]]; then
source ../load-env.sh
fi
echo "Setting up CoreDNS for k3s..."
echo "Script directory: ${SCRIPT_DIR}"
echo "Current directory: $(pwd)"
# Apply the k3s-compatible custom DNS override (k3s will preserve this)
echo "Applying CoreDNS custom override configuration..."
cat "${SCRIPT_DIR}/coredns/coredns-custom-config.yaml" | envsubst | kubectl apply -f -
# Apply the LoadBalancer service for external access to CoreDNS
echo "Applying CoreDNS service configuration..."
cat "${SCRIPT_DIR}/coredns/coredns-lb-service.yaml" | envsubst | kubectl apply -f -
# Restart CoreDNS pods to apply the changes
echo "Restarting CoreDNS pods to apply changes..."
kubectl rollout restart deployment/coredns -n kube-system
kubectl rollout status deployment/coredns -n kube-system
echo "CoreDNS setup complete!"


@@ -0,0 +1,46 @@
#!/bin/bash
set -e
# Store the script directory path for later use
SCRIPT_PATH="$(realpath "${BASH_SOURCE[0]}")"
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")"
cd "$SCRIPT_DIR"
# Source environment variables
if [[ -f "../load-env.sh" ]]; then
source ../load-env.sh
fi
echo "Setting up Kubernetes Dashboard..."
NAMESPACE="kubernetes-dashboard"
# Apply the official dashboard installation
echo "Installing Kubernetes Dashboard core components..."
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
# Copying cert-manager secrets to the dashboard namespace
copy-secret cert-manager:wildcard-internal-wild-cloud-tls $NAMESPACE
copy-secret cert-manager:wildcard-wild-cloud-tls $NAMESPACE
# Create admin service account and token
echo "Creating dashboard admin service account and token..."
cat "${SCRIPT_DIR}/kubernetes-dashboard/dashboard-admin-rbac.yaml" | kubectl apply -f -
# Apply the dashboard configuration
echo "Applying dashboard configuration..."
cat "${SCRIPT_DIR}/kubernetes-dashboard/dashboard-kube-system.yaml" | envsubst | kubectl apply -f -
# Restart CoreDNS to pick up the changes
kubectl delete pods -n kube-system -l k8s-app=kube-dns
echo "Restarted CoreDNS to pick up DNS changes"
# Wait for dashboard to be ready
echo "Waiting for Kubernetes Dashboard to be ready..."
kubectl rollout status deployment/kubernetes-dashboard -n $NAMESPACE --timeout=60s
echo "Kubernetes Dashboard setup complete!"
echo "Access the dashboard at: https://dashboard.internal.${DOMAIN}"
echo ""
echo "To get the authentication token, run:"
echo "./bin/dashboard-token"


@@ -0,0 +1,51 @@
#!/bin/bash
set -e
# Navigate to script directory
SCRIPT_PATH="$(realpath "${BASH_SOURCE[0]}")"
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")"
cd "$SCRIPT_DIR"
# Source environment variables
if [[ -f "../load-env.sh" ]]; then
source ../load-env.sh
fi
echo "Setting up ExternalDNS..."
# Create externaldns namespace
kubectl create namespace externaldns --dry-run=client -o yaml | kubectl apply -f -
# Setup Cloudflare API token secret for ExternalDNS
if [[ -n "${CLOUDFLARE_API_TOKEN}" ]]; then
echo "Creating Cloudflare API token secret..."
kubectl create secret generic cloudflare-api-token \
--namespace externaldns \
--from-literal=api-token="${CLOUDFLARE_API_TOKEN}" \
--dry-run=client -o yaml | kubectl apply -f -
else
echo "Error: CLOUDFLARE_API_TOKEN not set. ExternalDNS will not work correctly."
exit 1
fi
# Apply common RBAC resources
echo "Deploying ExternalDNS RBAC resources..."
cat ${SCRIPT_DIR}/externaldns/externaldns-rbac.yaml | envsubst | kubectl apply -f -
# Apply ExternalDNS manifests with environment variables
echo "Deploying ExternalDNS for external DNS (Cloudflare)..."
cat ${SCRIPT_DIR}/externaldns/externaldns-cloudflare.yaml | envsubst | kubectl apply -f -
# Wait for ExternalDNS to be ready
echo "Waiting for Cloudflare ExternalDNS to be ready..."
kubectl rollout status deployment/external-dns -n externaldns --timeout=60s
# echo "Waiting for CoreDNS ExternalDNS to be ready..."
# kubectl rollout status deployment/external-dns-coredns -n externaldns --timeout=60s
echo "ExternalDNS setup complete!"
echo ""
echo "To verify the installation:"
echo " kubectl get pods -n externaldns"
echo " kubectl logs -n externaldns -l app=external-dns -f"
echo " kubectl logs -n externaldns -l app=external-dns-coredns -f"

setup/cluster/setup-longhorn.sh Executable file

@@ -0,0 +1,16 @@
#!/bin/bash
set -e
SCRIPT_PATH="$(realpath "${BASH_SOURCE[0]}")"
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")"
cd "$SCRIPT_DIR"
if [[ -f "../load-env.sh" ]]; then
source ../load-env.sh
fi
echo "Setting up Longhorn..."
# Apply Longhorn with kustomize to apply our customizations
kubectl apply -k ${SCRIPT_DIR}/longhorn/
echo "Longhorn setup complete!"

setup/cluster/setup-metallb.sh Executable file

@@ -0,0 +1,30 @@
#!/bin/bash
set -e
SCRIPT_PATH="$(realpath "${BASH_SOURCE[0]}")"
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")"
cd "$SCRIPT_DIR"
# Source environment variables
if [[ -f "../load-env.sh" ]]; then
source ../load-env.sh
fi
echo "Setting up MetalLB..."
echo "Deploying MetalLB..."
# cat ${SCRIPT_DIR}/metallb/metallb-helm-config.yaml | envsubst | kubectl apply -f -
kubectl apply -k metallb/installation
echo "Waiting for MetalLB to be deployed..."
kubectl wait --for=condition=Available deployment/controller -n metallb-system --timeout=60s
sleep 10 # Extra buffer for webhook initialization
echo "Customizing MetalLB..."
kubectl apply -k metallb/configuration
echo "✅ MetalLB installed and configured"
echo ""
echo "To verify the installation:"
echo " kubectl get pods -n metallb-system"
echo " kubectl get ipaddresspools.metallb.io -n metallb-system"

setup/cluster/setup-nfs-host.sh Executable file

@@ -0,0 +1,257 @@
#!/bin/bash
set -e
set -o pipefail
# Navigate to script directory
SCRIPT_PATH="$(realpath "${BASH_SOURCE[0]}")"
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")"
PROJECT_DIR="$(dirname "$SCRIPT_DIR")"
# Source environment variables
source "${PROJECT_DIR}/load-env.sh"
echo "Setting up NFS server on this host..."
# Check if required NFS variables are configured
if [[ -z "${NFS_HOST}" ]]; then
echo "NFS_HOST not set. Please set NFS_HOST=<hostname> in your environment"
echo "Example: export NFS_HOST=box-01"
exit 1
fi
# Ensure NFS_MEDIA_PATH is explicitly set
if [[ -z "${NFS_MEDIA_PATH}" ]]; then
echo "Error: NFS_MEDIA_PATH not set. Please set it in your environment"
echo "Example: export NFS_MEDIA_PATH=/data/media"
exit 1
fi
# Set default for NFS_EXPORT_OPTIONS if not already set
if [[ -z "${NFS_EXPORT_OPTIONS}" ]]; then
export NFS_EXPORT_OPTIONS="*(rw,sync,no_subtree_check,no_root_squash)"
echo "Using default NFS_EXPORT_OPTIONS: ${NFS_EXPORT_OPTIONS}"
fi
echo "Target NFS host: ${NFS_HOST}"
echo "Media path: ${NFS_MEDIA_PATH}"
echo "Export options: ${NFS_EXPORT_OPTIONS}"
# Function to check if we're running on the correct host
check_host() {
local current_hostname=$(hostname)
if [[ "${current_hostname}" != "${NFS_HOST}" ]]; then
echo "Warning: Current host (${current_hostname}) differs from NFS_HOST (${NFS_HOST})"
echo "This script should be run on ${NFS_HOST}"
read -p "Continue anyway? (y/N): " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
exit 1
fi
fi
}
# Function to install NFS server and SMB/CIFS
install_nfs_server() {
echo "Installing NFS server and SMB/CIFS packages..."
# Detect package manager and install NFS server + Samba
if command -v apt-get >/dev/null 2>&1; then
# Debian/Ubuntu
sudo apt-get update
sudo apt-get install -y nfs-kernel-server nfs-common samba samba-common-bin
elif command -v yum >/dev/null 2>&1; then
# RHEL/CentOS
sudo yum install -y nfs-utils samba samba-client
elif command -v dnf >/dev/null 2>&1; then
# Fedora
sudo dnf install -y nfs-utils samba samba-client
else
echo "Error: Unable to detect package manager. Please install NFS server and Samba manually."
exit 1
fi
}
# Function to create media directory
create_media_directory() {
echo "Creating media directory: ${NFS_MEDIA_PATH}"
# Create directory if it doesn't exist
sudo mkdir -p "${NFS_MEDIA_PATH}"
# Set appropriate permissions
# Using 755 for directory, allowing read/execute for all, write for owner
sudo chmod 755 "${NFS_MEDIA_PATH}"
echo "Media directory created with appropriate permissions"
echo "Directory info:"
ls -la "${NFS_MEDIA_PATH}/"
}
# Function to configure NFS exports
configure_nfs_exports() {
echo "Configuring NFS exports..."
local export_line="${NFS_MEDIA_PATH} ${NFS_EXPORT_OPTIONS}"
local exports_file="/etc/exports"
# Backup existing exports file
sudo cp "${exports_file}" "${exports_file}.backup.$(date +%Y%m%d-%H%M%S)" 2>/dev/null || true
# Check if export already exists
if sudo grep -q "^${NFS_MEDIA_PATH}" "${exports_file}" 2>/dev/null; then
echo "Export for ${NFS_MEDIA_PATH} already exists, updating..."
sudo sed -i "s|^${NFS_MEDIA_PATH}.*|${export_line}|" "${exports_file}"
else
echo "Adding new export for ${NFS_MEDIA_PATH}..."
echo "${export_line}" | sudo tee -a "${exports_file}"
fi
# Export the filesystems
sudo exportfs -rav
echo "NFS exports configured:"
sudo exportfs -v
}
# Function to start and enable NFS services
start_nfs_services() {
echo "Starting NFS services..."
# Start and enable NFS server
sudo systemctl enable nfs-server
sudo systemctl start nfs-server
# Also enable related services
sudo systemctl enable rpcbind
sudo systemctl start rpcbind
echo "NFS services started and enabled"
# Show service status
sudo systemctl status nfs-server --no-pager --lines=5
}
# Function to configure SMB/CIFS sharing
configure_smb_sharing() {
echo "Configuring SMB/CIFS sharing..."
local smb_config="/etc/samba/smb.conf"
local share_name="media"
# Backup existing config
sudo cp "${smb_config}" "${smb_config}.backup.$(date +%Y%m%d-%H%M%S)" 2>/dev/null || true
# Check if share already exists
if sudo grep -q "^\[${share_name}\]" "${smb_config}" 2>/dev/null; then
echo "SMB share '${share_name}' already exists, updating..."
# Remove existing share section
sudo sed -i "/^\[${share_name}\]/,/^\[/{ /^\[${share_name}\]/d; /^\[/!d; }" "${smb_config}"
fi
# Add media share configuration
cat << EOF | sudo tee -a "${smb_config}"
[${share_name}]
comment = Media files for Jellyfin
path = ${NFS_MEDIA_PATH}
browseable = yes
read only = no
guest ok = yes
create mask = 0664
directory mask = 0775
force user = $(whoami)
force group = $(whoami)
EOF
echo "SMB share configuration added"
# Test configuration
if sudo testparm -s >/dev/null 2>&1; then
echo "✓ SMB configuration is valid"
else
echo "✗ SMB configuration has errors"
sudo testparm
exit 1
fi
}
# Function to start SMB services
start_smb_services() {
echo "Starting SMB services..."
# Enable and start Samba services
sudo systemctl enable smbd
sudo systemctl start smbd
sudo systemctl enable nmbd
sudo systemctl start nmbd
echo "SMB services started and enabled"
# Show service status
sudo systemctl status smbd --no-pager --lines=3
}
# Function to test NFS setup
test_nfs_setup() {
echo "Testing NFS setup..."
# Test if NFS is responding
if command -v showmount >/dev/null 2>&1; then
echo "Available NFS exports:"
showmount -e localhost || echo "Warning: showmount failed, but NFS may still be working"
fi
# Check if the export directory is accessible
if [[ -d "${NFS_MEDIA_PATH}" ]]; then
echo "✓ Media directory exists and is accessible"
else
echo "✗ Media directory not accessible"
exit 1
fi
}
# Function to show usage instructions
show_usage_instructions() {
echo
echo "=== NFS/SMB Host Setup Complete ==="
echo
echo "NFS and SMB servers are now running on this host with media directory: ${NFS_MEDIA_PATH}"
echo
echo "Access methods:"
echo "1. NFS (for Kubernetes): Use setup-nfs-k8s.sh to register with cluster"
echo "2. SMB/CIFS (for Windows): \\\\${NFS_HOST}\\media"
echo
echo "To add media files:"
echo "- Copy directly to: ${NFS_MEDIA_PATH}"
echo "- Or mount SMB share from Windows and copy there"
echo
echo "Windows SMB mount:"
echo "- Open File Explorer"
echo "- Map network drive to: \\\\${NFS_HOST}\\media"
echo "- Or use: \\\\$(hostname -I | awk '{print $1}')\\media"
echo
echo "To verify services:"
echo "- NFS: showmount -e ${NFS_HOST}"
echo "- SMB: smbclient -L ${NFS_HOST} -N"
echo "- Status: systemctl status nfs-server smbd"
echo
echo "Current NFS exports:"
sudo exportfs -v
echo
}
# Main execution
main() {
check_host
install_nfs_server
create_media_directory
configure_nfs_exports
start_nfs_services
configure_smb_sharing
start_smb_services
test_nfs_setup
show_usage_instructions
}
# Run main function
main "$@"

setup/cluster/setup-nfs.sh Executable file

@@ -0,0 +1,230 @@
#!/bin/bash
set -e
set -o pipefail
# Navigate to script directory
SCRIPT_PATH="$(realpath "${BASH_SOURCE[0]}")"
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")"
PROJECT_DIR="$(dirname "$SCRIPT_DIR")"
# Source environment variables
source "${PROJECT_DIR}/load-env.sh"
echo "Registering NFS server with Kubernetes cluster..."
# Check if NFS_HOST is configured
if [[ -z "${NFS_HOST}" ]]; then
echo "NFS_HOST not set. Skipping NFS Kubernetes setup."
echo "To enable NFS media sharing:"
echo "1. Set NFS_HOST=<hostname> in your environment"
echo "2. Run setup-nfs-host.sh on the NFS host"
echo "3. Re-run this script"
exit 0
fi
# Set default for NFS_STORAGE_CAPACITY if not already set
if [[ -z "${NFS_STORAGE_CAPACITY}" ]]; then
export NFS_STORAGE_CAPACITY="250Gi"
echo "Using default NFS_STORAGE_CAPACITY: ${NFS_STORAGE_CAPACITY}"
fi
echo "NFS host: ${NFS_HOST}"
echo "Media path: ${NFS_MEDIA_PATH}"
echo "Storage capacity: ${NFS_STORAGE_CAPACITY}"
# Function to resolve NFS host to IP
resolve_nfs_host() {
if [[ "${NFS_HOST}" =~ ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
# NFS_HOST is already an IP address
NFS_HOST_IP="${NFS_HOST}"
else
# Resolve hostname to IP
NFS_HOST_IP=$(getent hosts "${NFS_HOST}" | awk '{print $1}' | head -n1)
if [[ -z "${NFS_HOST_IP}" ]]; then
echo "Error: Unable to resolve hostname ${NFS_HOST} to IP address"
echo "Make sure ${NFS_HOST} is resolvable from this cluster"
exit 1
fi
# Check if resolved IP is localhost - auto-detect network IP instead
if [[ "${NFS_HOST_IP}" =~ ^127\. ]]; then
echo "Warning: ${NFS_HOST} resolves to localhost (${NFS_HOST_IP})"
echo "Auto-detecting network IP for cluster access..."
# Try to find the primary network interface IP (exclude docker/k8s networks)
local network_ip=$(ip route get 8.8.8.8 | grep -oP 'src \K\S+' 2>/dev/null)
if [[ -n "${network_ip}" && ! "${network_ip}" =~ ^127\. ]]; then
echo "Using detected network IP: ${network_ip}"
NFS_HOST_IP="${network_ip}"
else
echo "Could not auto-detect network IP. Available IPs:"
ip addr show | grep "inet " | grep -v "127.0.0.1" | grep -v "10.42" | grep -v "172." | awk '{print " " $2}' | cut -d/ -f1
echo "Please set NFS_HOST to the correct IP address manually."
exit 1
fi
fi
fi
echo "NFS server IP: ${NFS_HOST_IP}"
export NFS_HOST_IP
}
# Function to test NFS accessibility
test_nfs_accessibility() {
echo "Testing NFS accessibility from cluster..."
# Check if showmount is available
if ! command -v showmount >/dev/null 2>&1; then
echo "Installing NFS client tools..."
if command -v apt-get >/dev/null 2>&1; then
sudo apt-get update && sudo apt-get install -y nfs-common
elif command -v yum >/dev/null 2>&1; then
sudo yum install -y nfs-utils
elif command -v dnf >/dev/null 2>&1; then
sudo dnf install -y nfs-utils
else
echo "Warning: Unable to install NFS client tools. Skipping accessibility test."
return 0
fi
fi
# Test if we can reach the NFS server
echo "Testing connection to NFS server..."
if timeout 10 showmount -e "${NFS_HOST_IP}" >/dev/null 2>&1; then
echo "✓ NFS server is accessible"
echo "Available exports:"
showmount -e "${NFS_HOST_IP}"
else
echo "✗ Cannot connect to NFS server at ${NFS_HOST_IP}"
echo "Make sure:"
echo "1. NFS server is running on ${NFS_HOST}"
echo "2. Network connectivity exists between cluster and NFS host"
echo "3. Firewall allows NFS traffic (port 2049)"
exit 1
fi
# Test specific export
if showmount -e "${NFS_HOST_IP}" | grep -q "${NFS_MEDIA_PATH}"; then
echo "✓ Media path ${NFS_MEDIA_PATH} is exported"
else
echo "✗ Media path ${NFS_MEDIA_PATH} is not found in exports"
echo "Available exports:"
showmount -e "${NFS_HOST_IP}"
echo
echo "Run setup-nfs-host.sh on ${NFS_HOST} to configure the export"
exit 1
fi
}
# Function to create test mount
test_nfs_mount() {
echo "Testing NFS mount functionality..."
local test_mount="/tmp/nfs-test-$$"
mkdir -p "${test_mount}"
# Try to mount the NFS export
if timeout 30 sudo mount -t nfs4 "${NFS_HOST_IP}:${NFS_MEDIA_PATH}" "${test_mount}"; then
echo "✓ NFS mount successful"
# Test read access
if ls "${test_mount}" >/dev/null 2>&1; then
echo "✓ NFS read access working"
else
echo "✗ NFS read access failed"
fi
# Unmount
sudo umount "${test_mount}" || echo "Warning: Failed to unmount test directory"
else
echo "✗ NFS mount failed"
echo "Check NFS server configuration and network connectivity"
exit 1
fi
# Clean up
rmdir "${test_mount}" 2>/dev/null || true
}
# Function to create Kubernetes resources
create_k8s_resources() {
echo "Creating Kubernetes NFS resources..."
# Generate config file with resolved variables
local nfs_dir="${SCRIPT_DIR}/nfs"
local env_file="${nfs_dir}/config/.env"
local config_file="${nfs_dir}/config/config.env"
echo "Generating NFS configuration..."
export NFS_HOST_IP
export NFS_MEDIA_PATH
export NFS_STORAGE_CAPACITY
envsubst < "${env_file}" > "${config_file}"
# Apply the NFS Kubernetes manifests using kustomize
echo "Applying NFS manifests from: ${nfs_dir}"
kubectl apply -k "${nfs_dir}"
echo "✓ NFS PersistentVolume and StorageClass created"
# Verify resources were created
echo "Verifying Kubernetes resources..."
if kubectl get storageclass nfs >/dev/null 2>&1; then
echo "✓ StorageClass 'nfs' created"
else
echo "✗ StorageClass 'nfs' not found"
exit 1
fi
if kubectl get pv nfs-media-pv >/dev/null 2>&1; then
echo "✓ PersistentVolume 'nfs-media-pv' created"
kubectl get pv nfs-media-pv
else
echo "✗ PersistentVolume 'nfs-media-pv' not found"
exit 1
fi
}
# Function to show usage instructions
show_usage_instructions() {
echo
echo "=== NFS Kubernetes Setup Complete ==="
echo
echo "NFS server ${NFS_HOST} (${NFS_HOST_IP}) has been registered with the cluster"
echo
echo "Kubernetes resources created:"
echo "- StorageClass: nfs"
echo "- PersistentVolume: nfs-media-pv (${NFS_STORAGE_CAPACITY}, ReadWriteMany)"
echo
echo "To use NFS storage in your applications:"
echo "1. Set storageClassName: nfs in your PVC"
echo "2. Use accessMode: ReadWriteMany for shared access"
echo
echo "Example PVC:"
echo "---"
echo "apiVersion: v1"
echo "kind: PersistentVolumeClaim"
echo "metadata:"
echo " name: my-nfs-pvc"
echo "spec:"
echo " accessModes:"
echo " - ReadWriteMany"
echo " storageClassName: nfs"
echo " resources:"
echo " requests:"
echo " storage: 10Gi"
echo
}
# Main execution
main() {
resolve_nfs_host
test_nfs_accessibility
test_nfs_mount
create_k8s_resources
show_usage_instructions
}
# Run main function
main "$@"

setup/cluster/setup-registry.sh Executable file

@@ -0,0 +1,20 @@
#!/bin/bash
set -e
# Navigate to script directory
SCRIPT_PATH="$(realpath "${BASH_SOURCE[0]}")"
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")"
echo "Setting up Docker Registry..."
# Apply the docker registry manifests using kustomize
kubectl apply -k "${SCRIPT_DIR}/docker-registry"
echo "Waiting for Docker Registry to be ready..."
kubectl wait --for=condition=available --timeout=300s deployment/docker-registry -n docker-registry
echo "Docker Registry setup complete!"
# Show deployment status
kubectl get pods -n docker-registry
kubectl get services -n docker-registry

setup/cluster/setup-traefik.sh Executable file

@@ -0,0 +1,18 @@
#!/bin/bash
set -e
SCRIPT_PATH="$(realpath "${BASH_SOURCE[0]}")"
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")"
cd "$SCRIPT_DIR"
# Source environment variables
if [[ -f "../load-env.sh" ]]; then
source ../load-env.sh
fi
echo "Setting up Traefik service and middleware for k3s..."
cat ${SCRIPT_DIR}/traefik/traefik-service.yaml | envsubst | kubectl apply -f -
cat ${SCRIPT_DIR}/traefik/internal-middleware.yaml | envsubst | kubectl apply -f -
echo "Traefik setup complete!"

setup/cluster/setup-utils.sh Executable file

@@ -0,0 +1,37 @@
#!/bin/bash
set -e
SCRIPT_PATH="$(realpath "${BASH_SOURCE[0]}")"
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")"
cd "$SCRIPT_DIR"
# Install gomplate
if command -v gomplate &> /dev/null; then
echo "gomplate is already installed."
else
curl -sSL https://github.com/hairyhenderson/gomplate/releases/latest/download/gomplate_linux-amd64 -o $HOME/.local/bin/gomplate
chmod +x $HOME/.local/bin/gomplate
echo "gomplate installed successfully."
fi
# Install kustomize
if command -v kustomize &> /dev/null; then
echo "kustomize is already installed."
else
curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
mv kustomize $HOME/.local/bin/
echo "kustomize installed successfully."
fi
## Install yq
if command -v yq &> /dev/null; then
echo "yq is already installed."
else
VERSION=v4.45.4
BINARY=yq_linux_amd64
wget https://github.com/mikefarah/yq/releases/download/${VERSION}/${BINARY}.tar.gz -O - | tar xz
mv ${BINARY} $HOME/.local/bin/yq
chmod +x $HOME/.local/bin/yq
rm yq.1
echo "yq installed successfully."
fi


@@ -0,0 +1,7 @@
# Traefik
- https://doc.traefik.io/traefik/providers/kubernetes-ingress/
Ingress resources can be created for any service. The routes specified in an Ingress are added automatically to the Traefik proxy.
Traefik routes all incoming traffic on ports 80 and 443 to the appropriate services based on those routes.
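To see what Traefik is currently serving (a quick check, not part of the setup scripts):
```bash
# The LoadBalancer service (192.168.8.240 in this configuration) answers on ports 80 and 443.
kubectl -n kube-system get svc traefik
# Every Ingress listed here becomes a route on that proxy.
kubectl get ingress --all-namespaces
```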


@@ -0,0 +1,13 @@
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
name: internal-only
namespace: kube-system
spec:
ipWhiteList:
# Restrict to local private network ranges - adjust these to match your network
sourceRange:
- 127.0.0.1/32 # localhost
- 10.0.0.0/8 # Private network
- 172.16.0.0/12 # Private network
- 192.168.0.0/16 # Private network


@@ -0,0 +1,29 @@
---
# Traefik service configuration with static LoadBalancer IP
apiVersion: v1
kind: Service
metadata:
name: traefik
namespace: kube-system
annotations:
# Get a stable IP from MetalLB
metallb.universe.tf/address-pool: production
metallb.universe.tf/allow-shared-ip: traefik-lb
labels:
app.kubernetes.io/instance: traefik-kube-system
app.kubernetes.io/name: traefik
spec:
type: LoadBalancer
loadBalancerIP: 192.168.8.240
selector:
app.kubernetes.io/instance: traefik-kube-system
app.kubernetes.io/name: traefik
ports:
- name: web
port: 80
targetPort: web
- name: websecure
port: 443
targetPort: websecure
externalTrafficPolicy: Local


@@ -0,0 +1,71 @@
---
apiVersion: v1
kind: Namespace
metadata:
name: debug
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: netdebug
namespace: debug
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: netdebug
subjects:
- kind: ServiceAccount
name: netdebug
namespace: debug
roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: netdebug
namespace: debug
labels:
app: netdebug
spec:
replicas: 1
selector:
matchLabels:
app: netdebug
template:
metadata:
labels:
app: netdebug
spec:
serviceAccountName: netdebug
containers:
- name: netdebug
image: nicolaka/netshoot:latest
command: ["/bin/bash"]
args: ["-c", "while true; do sleep 3600; done"]
resources:
limits:
cpu: 200m
memory: 256Mi
requests:
cpu: 100m
memory: 128Mi
securityContext:
privileged: true
---
apiVersion: v1
kind: Service
metadata:
name: netdebug
namespace: debug
spec:
selector:
app: netdebug
ports:
- port: 22
targetPort: 22
name: ssh
type: ClusterIP

setup/cluster/validate-setup.sh Executable file

File diff suppressed because it is too large