# Troubleshooting Guide

Common issues and solutions for this NixOS configuration.

## Build Failures

### `nixos-rebuild switch` fails

1. **Syntax error**: the error message includes the file and line number. Common causes: missing `;`, unmatched `{`, wrong type passed to an option.
2. **Evaluation error**: read the full error trace. Often caused by a module option receiving the wrong type, or a missing `cfg.enable` guard.
3. **Fetch failure**: a flake input or package source can't be downloaded. Check network connectivity, or try updating the affected input:

   ```bash
   # Nix 2.19 and later
   nix flake update <input-name>
   # Older Nix releases
   nix flake lock --update-input <input-name>
   ```

4. **Disk space**: the build sandbox fills up. Free space, then check what is left:

   ```bash
   sudo nix-collect-garbage -d
   df -h /nix
   ```

### Assertion failures

If you see `assertion failed`, read the `message` field. For example:

```
error: assertion failed at …/nebula/sops.nix
mjallen.services.nebula.secretsPrefix must be set
```

Set the required option in the system configuration.

## Boot Issues

### System won't boot after a config change

1. At the boot menu, select a previous generation.
2. Once booted, revert the change:

   ```bash
   cd /etc/nixos
   git revert HEAD
   sudo nixos-rebuild switch --flake ".#$(hostname)"
   ```

### Booting from installation media to recover

```bash
# Mount the system (adjust device paths as needed)
sudo mount /dev/disk/by-label/nixos /mnt
sudo mount /dev/disk/by-label/boot /mnt/boot

# Chroot in (the remaining commands run inside the chroot)
sudo nixos-enter --root /mnt
cd /etc/nixos

# Revert and rebuild
git revert HEAD
nixos-rebuild switch --flake .#hostname --install-bootloader
```

### Lanzaboote / Secure Boot issues

If Secure Boot enrolment fails or the system won't verify:

```bash
# Check enrolled keys
sbctl status

# Re-enrol if needed (run as root)
sbctl enroll-keys --microsoft

# Sign bootloader files manually
sbctl sign -s /boot/EFI/systemd/systemd-bootx64.efi
```

## SOPS / Secrets Issues

### `secret not found` or permission denied at boot

1. Verify the secret key path matches what's declared in the module's `sops.nix`.
2. Check the secret exists in the SOPS file:

   ```bash
   sops --decrypt secrets/nas-secrets.yaml | grep "the-key"
   ```

3. Check that the `owner`/`group` set on the secret matches the service user.

### Can't decrypt: wrong age key

The machine's age key is derived from `/etc/ssh/ssh_host_ed25519_key`. If the host key was regenerated, the age key changed and existing secrets can no longer be decrypted.

To fix this, re-encrypt the secrets file with the new public key:

```bash
# Get the new public key
nix-shell -p ssh-to-age --run 'ssh-to-age < /etc/ssh/ssh_host_ed25519_key.pub'

# Update .sops.yaml with the new key, then:
sops updatekeys secrets/nas-secrets.yaml
```

### Adding a new secret to an existing file

```bash
sops secrets/nas-secrets.yaml
# The editor opens the decrypted YAML; add your key and save, and sops re-encrypts the file
```

## Nebula VPN Issues

### Peers can't connect

1. Verify the lighthouse is reachable on its public address:

   ```bash
   nc -zvu mjallen.dev 4242
   ```

2. Check the nebula service on both hosts:

   ```bash
   systemctl status nebula@jallen-nebula
   journalctl -u nebula@jallen-nebula -n 50
   ```

3. Confirm the CA cert, host cert, and host key are all present and owned by the `nebula-jallen-nebula` user:

   ```bash
   ls -la /run/secrets/pi5/nebula/
   ```

4. Verify the host cert was signed by the same CA as the other nodes:

   ```bash
   nebula-cert verify -ca ca.crt -crt host.crt
   ```

### Certificate expired

Re-sign the host certificate:

```bash
nebula-cert sign -name "hostname" -ip "10.1.1.x/24" \
  -ca-crt ca.crt -ca-key ca.key \
  -out-crt host.crt -out-key host.key
# Update SOPS, rebuild
```

## Impermanence Issues

### Service fails because its data directory is missing after reboot

If a service stores state in a path that isn't in the persistence list, that path is wiped on reboot. Add it to `impermanence.extraDirectories`:

```nix
mjallen.impermanence.extraDirectories = [
  {
    directory = "/var/lib/my-service";
    user = "my-service";
    group = "my-service";
    mode = "0750";
  }
];
```

Then copy the existing data across if needed:

```bash
cp -a /var/lib/my-service /persist/var/lib/my-service
```

## Flake Input Issues

### Input update breaks a build

Restore `flake.lock` from the previous commit, which rolls back every input updated in the last commit:

```bash
git checkout HEAD^ -- flake.lock
```

Or pin the input to a specific revision in `flake.nix`:

```nix
nixpkgs-unstable.url = "github:NixOS/nixpkgs/abc123def";
```

## Service Issues

### Service won't start

```bash
systemctl status <service>
journalctl -u <service> -n 100 --no-pager
```

### Caddy reverse proxy not routing

1. Check that `reverseProxy.enable = true` is set on the service.
2. Verify the subdomain matches: `reverseProxy.subdomain = "myapp"` → `myapp.mjallen.dev`.
3. Check Caddy logs:

   ```bash
   journalctl -u caddy -n 50
   ```

### PostgreSQL database missing for a service

If `configureDb = true` is set, the database is created automatically. If it's missing, create it by hand. Names containing a hyphen must be double-quoted inside SQL:

```bash
sudo -u postgres createdb my-service
sudo -u postgres psql -c 'GRANT ALL ON DATABASE "my-service" TO "my-service";'
```

## Network Issues

### Firewall blocking a service

Check which ports are open:

```bash
sudo nft list ruleset | grep accept
```

Add ports in the system config:

```nix
mjallen.network.firewall.allowedTCPPorts = [ 8080 ];
```

Or, if the service uses `mkModule`, make sure `openFirewall = true` (the default) has not been overridden.

## Getting Help

- NixOS manual: `nixos-help` or https://nixos.org/manual/nixos/stable/
- NixOS Wiki: https://nixos.wiki/
- NixOS Discourse: https://discourse.nixos.org/
- Nix package search: https://search.nixos.org/packages
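
## Disk Space Check Sketch

The disk-space failure described under Build Failures can be caught before a rebuild starts. The following is a minimal sketch, not part of this configuration: `free_kb` and `has_free_space` are hypothetical helper names, and the 10 GiB threshold in the usage comment is an arbitrary assumption to adjust per host.

```bash
#!/usr/bin/env bash
# Sketch only: free_kb and has_free_space are hypothetical helpers,
# not functions provided by this repository or by NixOS.

# Print the available space, in KiB, on the filesystem that holds path $1.
free_kb() {
  # -P selects the one-line-per-filesystem POSIX format; column 4 is "Available".
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if the filesystem holding path $1 has at least $2 KiB available.
has_free_space() {
  local path="$1" min_kb="$2" avail
  avail="$(free_kb "$path")"
  # An empty value means df failed (for example, the path does not exist).
  [ -n "$avail" ] && [ "$avail" -ge "$min_kb" ]
}

# Usage (the 10 GiB threshold is an assumption):
# has_free_space /nix $((10 * 1024 * 1024)) \
#   || echo "Low space on /nix: consider sudo nix-collect-garbage -d"
```

The check only reads `df` output, so it is safe to run without root. Garbage collection stays a manual step because `nix-collect-garbage -d` also deletes old generations that could otherwise be used for a rollback.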