Raw Crackle Lab
Homelab + public-facing infrastructure for Raw Crackle, a hip-hop recording studio in Bucharest (recording, mixing, mastering, rehearsals).
Hosts the public website at rawcrackle.ro plus the internal stack the studio runs on: directory, SSO, password vault, mail server, file sync, reverse proxy, DNS, observability, and a self-developed WireGuard overlay (WireWarp) that tunnels public ingress through a rented VPS so the residential ISP doesn't pollute deliverability or expose the studio's IP.
Stack at a glance
            internet
                │
                ▼
    ┌───────────────────────┐
    │ VPS (IPAX, AT)        │  public IP 37.252.189.57
    │ 37.252.189.57         │  PTR → mail.rawcrackle.ro
    │ WireWarp tunnel-svr   │  CrowdSec + iptables-bouncer
    └───────────┬───────────┘
                │ WireGuard tunnel
                │ (real source IPs preserved end-to-end)
                ▼
    ┌───────────────────────┐
    │ WireWarp gateway VM   │  .40.10
    │ (DNAT + NAT + FW)     │
    └───────────┬───────────┘
                │
   ┌────────────┼────────────────────────────┐
   ▼            ▼                            ▼
Traefik    mailcow VM                 other services
.40.111    .40.201                    (LXCs on .40.0/24)
(80/443)   (25/465/587/993/4190)
IP allocation (192.168.40.0/24)
| Tier | IP range | Examples |
|---|---|---|
| DNS | .5 | PiHole |
| Edge / hypervisor | .10–.19 | WireWarp gateway (.10), px40 (.11) |
| Infra LXCs (depended on by others) | .100–.119 | Traefik (.111), Authelia (.112), lldap (.113), WireWarp control (.114), Psono (.115) |
| App LXCs / VMs | .120–.199 | (reserved: peer workloads) |
| Public-facing / observability | .200–.229 | monitoring (.200), mailcow (.201), site (.202), nextcloud (.203), umami (.204), n8n (.205), homepage (.206) |
| Reserved / router | .230–.254 | .254 studio gateway |
Tier rule of thumb: anything in .100–.119 going down takes half the lab
with it; .120–.229 workloads churn freely without touching infra. VMID
follows the last octet (.40.111 → CT 111; .40.5 → CT 105 with the
+100 alias for IPs < 100).
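The tier lookup and the VMID rule can be sketched in a few lines of Python. This is an illustration only: the function names (`vmid_for`, `tier_for`) are made up here and are not part of the repo, and the tier boundaries are copied from the table above.

```python
import ipaddress

LAB_NET = ipaddress.ip_network("192.168.40.0/24")

# (lowest last octet, highest last octet, tier), copied from the allocation table.
TIERS = [
    (5, 5, "DNS"),
    (10, 19, "Edge / hypervisor"),
    (100, 119, "Infra LXCs"),
    (120, 199, "App LXCs / VMs"),
    (200, 229, "Public-facing / observability"),
    (230, 254, "Reserved / router"),
]


def last_octet(ip: str) -> int:
    """Return the last octet of a lab address, rejecting anything off-subnet."""
    addr = ipaddress.ip_address(ip)
    if addr not in LAB_NET:
        raise ValueError(f"{ip} is outside {LAB_NET}")
    return int(addr) & 0xFF


def vmid_for(ip: str) -> int:
    """VMID follows the last octet; octets below 100 get the +100 alias."""
    octet = last_octet(ip)
    return octet + 100 if octet < 100 else octet


def tier_for(ip: str) -> str:
    """Map a lab address to its tier name, or 'unallocated' if it falls in a gap."""
    octet = last_octet(ip)
    for low, high, name in TIERS:
        if low <= octet <= high:
            return name
    return "unallocated"
```

For example, `vmid_for("192.168.40.5")` gives the aliased CT ID for the PiHole container, and `tier_for` answers "is this infra or churnable?" for any address on the subnet.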
Services
| Service | URL | Auth | Surface |
|---|---|---|---|
| Public website | https://rawcrackle.ro | none | public |
| Webmail | https://mail.rawcrackle.ro | mailcow login → SOGo via SSO | public |
| Authelia portal | https://auth.rawcrackle.ro | self | public |
| Autodiscover / Autoconfig | https://{autodiscover,autoconfig}.rawcrackle.ro | mailcow | public (mail clients) |
| Nextcloud | https://cloud.rawcrackle.ro | lldap direct | public |
| Collabora Office (WOPI) | https://office.rawcrackle.ro | Nextcloud session | public (editor JS) |
| Umami tracker endpoint | https://track.rawcrackle.ro | none | public (website JS only) |
| Homepage (app launcher) | https://dash.int.rawcrackle.ro | none (LAN gate) | LAN-only |
| Mailcow admin | https://mailcow.infra.rawcrackle.ro/admin | mailcow | LAN-only |
| PiHole admin | https://pihole.infra.rawcrackle.ro/admin | Authelia (sso_pihole_admins) | LAN-only |
| Traefik dashboard | https://traefik.infra.rawcrackle.ro | Authelia (sso_traefik_admins) | LAN-only |
| lldap | https://lldap.infra.rawcrackle.ro | lldap | LAN-only |
| Proxmox | https://px.infra.rawcrackle.ro | PVE PAM | LAN-only |
| WireWarp | https://wirewarp.infra.rawcrackle.ro | WireWarp | LAN-only |
| Psono | https://stash.int.rawcrackle.ro | LDAP-direct | LAN-only |
| Grafana | https://grafana.infra.rawcrackle.ro | Grafana local | LAN-only |
| Prometheus | https://prometheus.infra.rawcrackle.ro | none (LAN gate) | LAN-only |
| Loki | https://loki.infra.rawcrackle.ro | none (LAN gate) | LAN-only |
| Uptime-Kuma | https://uptime.infra.rawcrackle.ro | Uptime-Kuma local | LAN-only |
| Umami admin | https://analytics.infra.rawcrackle.ro | Umami local | LAN-only |
| n8n | https://n8n.infra.rawcrackle.ro | n8n owner account | LAN-only |
LAN-only routes return HTTP 403 to public clients via Traefik's internal-only
middleware (RFC1918 + WG-range allowlist). Access from outside the studio LAN
goes through a WireWarp client config. This is defense in depth: even if
public DNS were ever pointed at a private hostname, Traefik returns 403 before
any backend sees the request.
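The allowlist decision itself is simple enough to sketch. The snippet below is not Traefik's implementation, only the same check expressed in Python; the WireGuard range shown (`100.64.0.0/24`) is a placeholder assumption, since the real range is not stated in this README.

```python
import ipaddress

# The three RFC1918 private ranges.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Hypothetical WireGuard client range, for illustration only.
WG_RANGE = ipaddress.ip_network("100.64.0.0/24")


def is_allowed(client_ip: str) -> bool:
    """True if the source address is inside the internal-only allowlist."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in RFC1918 + [WG_RANGE])


def status_for(client_ip: str) -> int:
    """403 for anything off the allowlist; 200 stands in for 'forward to backend'."""
    return 200 if is_allowed(client_ip) else 403
```

The point of the pattern is that the decision depends only on the source address, never on which hostname the client resolved.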
Mail
Self-hosted on mailcow. Mail leaves the studio via the WireWarp tunnel, so
outbound is from the clean Austrian VPS IP rather than the studio's residential
line. 10/10 on mail-tester.com on the first
send — full DKIM + SPF + DMARC + PTR + FCrDNS chain. lldap drives mailbox
lifecycle via mailcow's built-in LDAP IdP (members of mailcow_account get a
mailbox auto-provisioned on first sync). Daily timer pulls the LE wildcard
from Traefik's acme.json over SSH and reloads postfix/dovecot/nginx only
on cert change.
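A rough sketch of the "reload only on cert change" idea, in Python. It assumes the Traefik v2-style `acme.json` layout (resolver name at the top level, a `Certificates` list with base64-encoded `certificate` fields); the resolver name and domain passed in are placeholders, and the actual timer script in the role may work differently.

```python
import base64
import hashlib
import json
from typing import Optional


def cert_pem_from_acme(acme_json: str, resolver: str, main_domain: str) -> bytes:
    """Pull the PEM chain for one certificate out of an acme.json document."""
    data = json.loads(acme_json)
    for entry in data[resolver]["Certificates"]:
        if entry["domain"]["main"] == main_domain:
            return base64.b64decode(entry["certificate"])
    raise LookupError(f"no certificate for {main_domain} under resolver {resolver}")


def fingerprint(pem: bytes) -> str:
    """SHA-256 of the PEM bytes, used as a cheap change marker."""
    return hashlib.sha256(pem).hexdigest()


def needs_reload(new_pem: bytes, last_fingerprint: Optional[str]) -> bool:
    """Reload only if nothing was installed yet or the certificate bytes changed."""
    return last_fingerprint is None or fingerprint(new_pem) != last_fingerprint
```

Storing only the fingerprint of the last installed certificate keeps the daily run a no-op on most days, which is what makes it safe to schedule unconditionally.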
Upgrades are operator-driven, not via Ansible:
ssh root@192.168.40.201
cd /opt/mailcow-dockerized && ./update.sh
WireWarp
Tunnel orchestration is split across three pieces:
- Control LXC (.40.114): FastAPI + Postgres dashboard + WS hub, pinned to a specific upstream SHA in roles/wirewarp_control/defaults/main.yml. Bump the ref and re-run --tags wirewarp-control to take an upstream update; the role discards local edits and force-rebuilds the api image.
- Gateway VM (.40.10): tunnel client to the VPS, plus DNAT/SNAT for the LAN. Runs the WireWarp agent (wirewarp-agent, systemd-managed) and a per-attachment routing healer.
- Tunnel server: the rented VPS at 37.252.189.57. Same agent, mode server. CrowdSec + iptables-firewall-bouncer optional (and one-click installable from the WireWarp dashboard: the agent does the apt install + cscli registration + auto-applies a whitelist covering every known IP and subnet in the environment).
The agent self-heals routing state every 60s: ip rule fwmark / per-table
routes / mangle CONNMARK rules / MSS clamp / MASQUERADE all get verified
and re-installed on drift. A wirewarp-routing.service systemd unit is
installed on first attach so iptables state survives reboots even when the
agent is down.
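The self-heal loop is a classic desired-state reconcile. The sketch below shows the shape of one pass in Python; it is not the WireWarp agent's code, and the rule strings, `read_live`, and `install` are stand-ins for whatever the agent actually inspects and applies.

```python
from typing import Callable, Iterable, List


def find_drift(desired: Iterable[str], live: Iterable[str]) -> List[str]:
    """Return the desired rules that are missing from the live state, in order."""
    live_set = set(live)
    return [rule for rule in desired if rule not in live_set]


def heal_once(
    desired: Iterable[str],
    read_live: Callable[[], Iterable[str]],
    install: Callable[[str], None],
) -> List[str]:
    """One reconcile pass: read live state, re-install whatever drifted."""
    missing = find_drift(desired, read_live())
    for rule in missing:
        install(rule)
    return missing
```

Running `heal_once` on a 60-second timer gives the behaviour described above: a pass with no drift installs nothing, so the loop is cheap when the state is already correct.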
Observability
Single-node monitoring stack on .40.200:
- Prometheus — 90d retention, scrapes node_exporter + cAdvisor + Alloy + pihole-exporter + speedtest-exporter + restic-exporter + pve-exporter
- Loki — 30d retention, fed by Alloy from every host's journald + Docker
- Grafana — bundles dashboards for Traefik, Loki stack, host drilldown, fleet overview, PiHole, Restic, Nextcloud, Authelia
- Uptime-Kuma — HTTP + push monitors (restic backups send heartbeats)
Every managed host runs the observability trio (node_exporter, cadvisor,
alloy_agent) as sidecars — added by the Observability sidecars play that
runs after common and before the rest of the stack.
Backups
restic to Backblaze B2, one timer per stateful host (see [backup_hosts]
in the inventory). Each unit pings a heartbeat URL on success that
Uptime-Kuma watches; a missed beat surfaces as an Uptime-Kuma incident.
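The heartbeat-on-success pattern looks roughly like this in Python. It is a sketch rather than the restic role's actual unit: the command and URL are supplied by the caller, and `run` and `ping` are injectable so the logic can be exercised without touching restic or the network.

```python
import subprocess
import urllib.request
from typing import Callable, Sequence


def run_command(cmd: Sequence[str]) -> int:
    """Run the backup command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def send_heartbeat(url: str) -> None:
    """GET the push-monitor URL; any HTTP error raises and fails the unit."""
    with urllib.request.urlopen(url, timeout=10) as response:
        response.read()


def backup_with_heartbeat(
    cmd: Sequence[str],
    heartbeat_url: str,
    run: Callable[[Sequence[str]], int] = run_command,
    ping: Callable[[str], None] = send_heartbeat,
) -> bool:
    """Ping the heartbeat only when the backup exits 0; stay silent otherwise."""
    if run(cmd) != 0:
        return False
    ping(heartbeat_url)
    return True
```

Staying silent on failure is deliberate: the monitor alerts on a missing beat, so a crashed or hung backup is caught the same way as one that exits non-zero.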
Repo layout
raw_crackle_lab/
├── CLAUDE.md                  # canonical playbook for AI assistants
├── README.md                  # you are here
├── TODO-postdeploy.md         # hands-on items left after the IaC rollout
├── TODO-roadmap.md            # phased "finish the lab" plan
├── _archive/                  # retired patches kept for context
│   └── mailcow-hacks/         # direct-SOGo-login override (retired 2026-05)
├── raw_crackle_website/       # vanilla static site (HTML/CSS/JS, no build)
└── iac/
    ├── ansible/
    │   ├── ansible.cfg, requirements.yml
    │   ├── inventory.ini
    │   ├── group_vars/all/{vars.yml, vault.yml{.example,}}
    │   ├── host_vars/         # per-host overrides (firewall_internal_ports etc.)
    │   ├── site.yml           # full deploy, dependency-ordered
    │   └── roles/
    │       ├── common                  # apt/docker/hostname/timezone baseline
    │       ├── node_exporter           ┐
    │       ├── cadvisor                ├─ observability sidecars (every host)
    │       ├── alloy_agent             ┘
    │       ├── pihole                  # LAN DNS + split-horizon
    │       ├── lldap                   # directory (replaces FreeIPA for ≤10 users)
    │       ├── authelia                # OIDC IdP + ForwardAuth
    │       ├── wirewarp_control        # tunnel control plane
    │       ├── wirewarp_client         # gateway agent
    │       ├── traefik                 # reverse proxy + ACME (Cloudflare DNS-01)
    │       ├── mailcow                 # full mail stack on a VM
    │       ├── nextcloud               # file sync + Collabora Office
    │       ├── psono                   # password vault (LDAP-direct)
    │       ├── rawcrackle_site         # public site (static nginx)
    │       ├── monitoring              # Prometheus + Grafana + Loki + Uptime-Kuma
    │       ├── restic                  # B2 backups + heartbeats
    │       ├── umami                   # privacy-friendly analytics
    │       ├── n8n                     # workflow automation
    │       ├── homepage                # central app launcher
    │       └── firewall_internal_port  # DOCKER-USER LAN-bypass lockdown
    └── terraform/
        ├── containers.tf, vms.tf         # bpg/proxmox containers + VMs
        ├── locals.tf, variables.tf
        ├── files/                        # cloud-init templates
        ├── templates.tf, providers.tf
        └── terraform.tfvars{.example,}   # gitignored real values + schema
Quick start (ops)
# First time on a new workstation
cd iac/ansible
ansible-galaxy collection install -r requirements.yml
# Terraform (Proxmox LXCs/VMs + Cloudflare DNS)
cd ../terraform
terraform init
terraform plan
terraform apply -target=proxmox_virtual_environment_container.<name> # narrow first
terraform apply
# Ansible deploy — one service, narrow first
cd ../ansible
ansible-playbook site.yml --tags mailcow --limit mailcow_hosts
# Full deploy (dependency-ordered: common → observability sidecars → DNS →
# lldap → WireWarp → Traefik → Authelia → mail/cloud/etc. → monitoring →
# restic → firewall lockdown)
ansible-playbook site.yml
The firewall_internal_port role intentionally runs last so every other
service has populated its DOCKER-USER chain by then. It locks down any LAN
host that has firewall_internal_ports defined in its host_vars, ensuring a
LAN-only service can't be reached by bypassing Traefik (e.g. curl http://192.168.40.114:8200 is dropped at the host firewall).
Secrets
Both iac/ansible/group_vars/all/vault.yml and
iac/terraform/terraform.tfvars are plaintext + gitignored. Copy the
.example siblings, fill in real values, never commit the result. Tradeoff
acknowledged in CLAUDE.md: anyone with checkout access has the keys; the
mitigation is that the repo only lives on a trusted workstation. Add a new
secret in both the live file and the .example so the schema stays in
sync.
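Keeping the live file and its .example in step can be checked mechanically. The sketch below compares top-level key names only, never values, and uses a regex instead of a YAML parser so it has no dependencies. The helper names are made up for illustration and no such script is claimed to exist in the repo.

```python
import re
from typing import Dict, List, Set

# Matches an unindented "key:" at the start of a line; comments and nested keys do not match.
KEY_RE = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*:")


def top_level_keys(yaml_text: str) -> Set[str]:
    """Collect the top-level key names from a flat YAML vars file."""
    keys = set()
    for line in yaml_text.splitlines():
        match = KEY_RE.match(line)
        if match:
            keys.add(match.group(1))
    return keys


def schema_drift(live_text: str, example_text: str) -> Dict[str, List[str]]:
    """Report keys present in one file but not the other."""
    live = top_level_keys(live_text)
    example = top_level_keys(example_text)
    return {
        "missing_from_example": sorted(live - example),
        "missing_from_live": sorted(example - live),
    }
```

Because only key names are read, the check can run on the workstation without any secret value ever being printed.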
Do not echo plaintext from vault.yml into shell commands — Ansible
substitutes {{ vault_<name> }} at template-render time, and operator-side
secret access should go via the user's password manager (Psono), not the
checkout.
Conventions
Carried over from the user's other homelab repo:
- Unprivileged LXCs by default. Privileged only with documented justification.
- Docker data under /opt/appdata/<service>, compose at /opt/<service>/.
- Image tags pinned to Major.Minor, never :latest. Exceptions per service (Psono EE compound triple, Nextcloud AIO datestamp imaginary tag) documented in role defaults.
- All deploys idempotent: re-run = 0 changed. Verified on every role.
- PVE LXC .conf edits use lineinfile write-then-cp (never direct writes through pmxcfs FUSE, which silently truncates).
- OpenWrt/GL.iNet routers: network changes via LuCI only, never UCI scripts.
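The tag-pinning convention is easy to lint. The following Python sketch treats a tag as compliant only if it is exactly Major.Minor; the documented per-service exceptions would need an explicit allowlist, which is left out here. The image names in the usage note are examples, not values taken from the role defaults.

```python
import re

# Exactly two dot-separated numeric components, e.g. "3.1".
MAJOR_MINOR = re.compile(r"\d+\.\d+")


def tag_of(image: str) -> str:
    """Extract the tag from an image reference, defaulting to 'latest'."""
    name = image.split("@", 1)[0]  # drop any @sha256 digest suffix
    slash = name.rfind("/")
    colon = name.rfind(":")
    # A colon before the last slash belongs to a registry port, not a tag.
    return name[colon + 1:] if colon > slash else "latest"


def is_pinned(image: str) -> bool:
    """True only when the tag is exactly Major.Minor."""
    return MAJOR_MINOR.fullmatch(tag_of(image)) is not None
```

With this, `is_pinned("traefik:3.1")` passes while an untagged image, `:latest`, or a registry-with-port reference without a tag all fail. A full `Major.Minor.Patch` tag also fails, which is stricter than necessary but matches the convention as written.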
See CLAUDE.md for the full list, including session-learned
gotchas (Cloudflare CNAME FQDN bug, Pi-hole v6 cache flush, browser DoH
override, mailcow SOGo redirect loop, WireWarp MTU blackhole, …).
Reference
- WireWarp source: https://github.com/stepunu/wirewarp
- Mailcow docs: https://docs.mailcow.email
- Authelia docs: https://www.authelia.com
- lldap docs: https://github.com/lldap/lldap
- Psono docs: https://doc.psono.com
- Grafana / Loki / Prometheus: https://grafana.com/docs