# Troubleshooting Guide

Common issues and solutions for this NixOS configuration.

## Build Failures

### `nixos-rebuild switch` fails

1. **Syntax error**: the error message includes the file and line number. Common causes: a missing `;` or an unmatched `{`.

2. **Evaluation error**: read the full error trace. Often caused by a module option receiving the wrong type, or a missing `cfg.enable` guard.

3. **Fetch failure**: a flake input or package source can't be downloaded. Check network connectivity, or try re-locking the affected input:
   ```bash
   nix flake update <input-name>   # Nix older than 2.19: nix flake lock --update-input <input-name>
   ```

4. **Disk space**: the build sandbox fills up. Free space:
   ```bash
   sudo nix-collect-garbage -d   # also deletes old generations, so they can no longer be rolled back to
   df -h /nix
   ```
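As a minimal sketch of the disk-space check above, the helpers below read the usage percentage of `/nix` from `df -P` output and decide whether a garbage collection is worth running. The function names and the 90% default threshold are assumptions for illustration, not part of this configuration.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the names and the threshold are illustrative assumptions.

# Read `df -P <path>` output on stdin and print the use percentage as an integer.
usage_pct_from_df() {
  awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Succeed when the given percentage is at or above the threshold (default 90).
needs_gc() {
  local pct="$1" threshold="${2:-90}"
  [ "$pct" -ge "$threshold" ]
}

# Usage:
#   pct=$(df -P /nix | usage_pct_from_df)
#   needs_gc "$pct" && sudo nix-collect-garbage --delete-older-than 14d
```

Using `--delete-older-than 14d` instead of `-d` keeps recent generations available as rollback targets.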

### Assertion failures

If you see `assertion failed`, read the `message` field. For example:
```
error: assertion failed at …/nebula/sops.nix
mjallen.services.nebula.secretsPrefix must be set
```
Set the required option in the system configuration.

## Boot Issues

### System won't boot after a config change

1. At the boot menu, select a previous generation.
2. Once booted, revert the change:
   ```bash
   cd /etc/nixos
   git revert HEAD
   sudo nixos-rebuild switch --flake ".#$(hostname)"
   ```

### Booting from installation media to recover

```bash
# Mount the system (adjust device paths as needed)
sudo mount /dev/disk/by-label/nixos /mnt
sudo mount /dev/disk/by-label/boot /mnt/boot

# Chroot in
sudo nixos-enter --root /mnt
cd /etc/nixos

# Revert and rebuild (replace <hostname> with this machine's flake output name).
# `boot` is used here instead of `switch` so nothing is activated inside the chroot.
git revert HEAD
nixos-rebuild boot --flake ".#<hostname>" --install-bootloader
```

### Lanzaboote / Secure Boot issues

If Secure Boot enrolment fails or the system won't verify:

```bash
# Check Secure Boot state and enrolled keys
sbctl status

# Re-enrol if needed (run as root; the subcommand is spelled `enroll-keys`)
sbctl enroll-keys --microsoft

# Sign bootloader files manually
sbctl sign -s /boot/EFI/systemd/systemd-bootx64.efi
```

## SOPS / Secrets Issues

### `secret not found` or permission denied at boot

1. Verify the secret key path matches what's declared in the module's `sops.nix`.
2. Check that the secret exists in the SOPS file:
   ```bash
   sops --decrypt secrets/nas-secrets.yaml | grep "the-key"
   ```
3. Check that the `owner`/`group` set on the secret match the service user.

### Can't decrypt: wrong age key

The machine's age key is derived from `/etc/ssh/ssh_host_ed25519_key`. If the host key was regenerated, the age key changed and this machine can no longer decrypt the existing secrets.

To fix, re-encrypt the secrets file for the new public key. Run `sops updatekeys` from a machine whose key can still decrypt the file:
```bash
# Get the new public key (run on the affected host)
nix-shell -p ssh-to-age --run 'ssh-to-age < /etc/ssh/ssh_host_ed25519_key.pub'

# Update .sops.yaml with the new key, then:
sops updatekeys secrets/nas-secrets.yaml
```

### Adding a new secret to an existing file

```bash
sops secrets/nas-secrets.yaml
# The editor opens the decrypted YAML. Add your key and save; sops re-encrypts the file.
```

## Nebula VPN Issues

### Peers can't connect

1. Verify the lighthouse is reachable on its public address:
   ```bash
   nc -zvu mjallen.dev 4242   # UDP: "succeeded" only means no ICMP rejection came back
   ```
2. Check the nebula service on both hosts:
   ```bash
   systemctl status nebula@jallen-nebula
   journalctl -u nebula@jallen-nebula -n 50
   ```
3. Confirm the CA cert, host cert, and host key are all present and owned by the `nebula-jallen-nebula` user:
   ```bash
   ls -la /run/secrets/pi5/nebula/
   ```
4. Verify the host cert was signed by the same CA as the other nodes:
   ```bash
   nebula-cert verify -ca ca.crt -crt host.crt
   ```
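Step 3 can be scripted. The sketch below lists which of the expected files are missing from a secrets directory. The function name is hypothetical, and the file names `ca.crt`, `host.crt`, and `host.key` are assumptions borrowed from the commands in this section; adjust them to the names your secrets actually use.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each expected file that is absent from a directory.
# Returns non-zero if anything is missing.
missing_files() {
  local dir="$1" status=0 name
  shift
  for name in "$@"; do
    if [ ! -e "$dir/$name" ]; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}

# Usage:
#   missing_files /run/secrets/pi5/nebula ca.crt host.crt host.key
```

This only checks presence. Ownership still needs the `ls -la` output from step 3.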

### Certificate expired

Re-sign the host certificate:
```bash
nebula-cert sign -name "hostname" -ip "10.1.1.x/24" \
  -ca-crt ca.crt -ca-key ca.key \
  -out-crt host.crt -out-key host.key
# Update SOPS, rebuild
```
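To catch an expiring certificate before peers drop off, the expiry time can be compared with the current time. `nebula-cert print -path host.crt` shows the certificate's validity window. The helper below is a hypothetical sketch that takes the expiry timestamp as an argument and assumes GNU `date`, whose `-d` option parses it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the number of whole days until a timestamp.
# The result is negative once the timestamp is more than a day in the past.
# Assumes GNU date (part of coreutils) for the -d option.
days_until() {
  local expiry_epoch now_epoch
  expiry_epoch=$(date -u -d "$1" +%s) || return 1
  now_epoch=$(date -u +%s)
  echo $(( (expiry_epoch - now_epoch) / 86400 ))
}

# Usage (the timestamp is a placeholder; copy the real one from `nebula-cert print`):
#   days_until "2030-01-01 00:00:00"
```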

## Impermanence Issues

### Service fails because its data directory is missing after reboot

If a service stores state in a path that isn't in the persistence list, that path is wiped on every reboot. Add it to `impermanence.extraDirectories`:

```nix
mjallen.impermanence.extraDirectories = [
  { directory = "/var/lib/my-service"; user = "my-service"; group = "my-service"; mode = "0750"; }
];
```

Then copy any existing data across before the next reboot, if needed:
```bash
sudo mkdir -p /persist/var/lib
sudo cp -a /var/lib/my-service /persist/var/lib/
```

## Flake Input Issues

### Input update breaks a build

Roll back the lock file to the previous commit. This reverts every input that changed in the last commit, not only the broken one:
```bash
git checkout HEAD^ -- flake.lock
```

Or pin the input to a specific revision in `flake.nix` (replace `abc123def` with the full commit hash):
```nix
nixpkgs-unstable.url = "github:NixOS/nixpkgs/abc123def";
```
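Restoring an older `flake.lock` also reverts inputs that were fine. A narrower option, sketched here, is to read the previously locked revision of one input from the old lock file and re-pin only that input. This assumes `jq` is installed and that the input is a plain GitHub input named `nixpkgs-unstable`, as in the example above. The helper name `locked_rev` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read a flake.lock on stdin and print the locked
# revision of the top-level input named "$1". Requires jq.
locked_rev() {
  jq -r --arg name "$1" '
    .nodes as $nodes
    | $nodes[$nodes[.root].inputs[$name]].locked.rev
  '
}

# Usage (run inside the flake repository):
#   rev=$(git show HEAD^:flake.lock | locked_rev nixpkgs-unstable)
#   nix flake lock --override-input nixpkgs-unstable "github:NixOS/nixpkgs/$rev"
```

The lookup goes through the root node's `inputs` map because the node name in `flake.lock` is not always identical to the input name.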

## Service Issues

### Service won't start

```bash
systemctl status <service>
journalctl -u <service> -n 100 --no-pager
```

### Caddy reverse proxy not routing

1. Check that `reverseProxy.enable = true` is set on the service.
2. Verify the subdomain matches: `reverseProxy.subdomain = "myapp"` → `myapp.mjallen.dev`.
3. Check Caddy logs:
   ```bash
   journalctl -u caddy -n 50
   ```

### PostgreSQL database missing for a service

If `configureDb = true` is set, the database is created automatically. If it's missing, create it by hand:
```bash
sudo -u postgres createdb my-service
# Hyphenated identifiers must be double-quoted in SQL
sudo -u postgres psql -c 'GRANT ALL ON DATABASE "my-service" TO "my-service";'
```
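Before creating the database by hand, it can help to confirm that it is really absent. The sketch below parses the output of `psql -lqt`, which prints one database per line with columns separated by `|`. The function name `db_listed` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `psql -lqt` output on stdin and succeed only if a
# database with exactly the given name appears in the first column.
db_listed() {
  awk -F'|' -v db="$1" '
    { name = $1; gsub(/^[ \t]+|[ \t]+$/, "", name); if (name == db) found = 1 }
    END { exit(found ? 0 : 1) }
  '
}

# Usage:
#   sudo -u postgres psql -lqt | db_listed my-service || sudo -u postgres createdb my-service
```

The exact comparison on the first column avoids false positives from names that merely contain `my-service` as a substring.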

## Network Issues

### Firewall blocking a service

Check which ports are open:
```bash
sudo nft list ruleset | grep accept
```
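The `grep accept` output can be long. As a rough sketch, the helper below filters a ruleset dump for accept rules that mention one specific destination port. The function name is hypothetical. It only matches ports written out literally, so a port covered by a range such as `8000-8100` is not found, and any other number on the same line that equals the port would also match.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `nft list ruleset` output on stdin and succeed if
# some line contains "dport", the port as a whole word, and "accept".
port_accepted() {
  grep -E 'dport' | grep -w -- "$1" | grep -q 'accept'
}

# Usage:
#   sudo nft list ruleset | port_accepted 8080 && echo "8080 is accepted"
```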

Add ports in the system config:
```nix
mjallen.network.firewall.allowedTCPPorts = [ 8080 ];
```

Or, if using `mkModule`, make sure `openFirewall = true` (the default) has not been overridden.

## Getting Help

- NixOS manual: `nixos-help` or https://nixos.org/manual/nixos/stable/
- NixOS Wiki: https://nixos.wiki/
- NixOS Discourse: https://discourse.nixos.org/
- Nix package search: https://search.nixos.org/packages