Proxmox High Availability

To experiment with Proxmox High Availability (HA), I used nested hosts. To validate that nested virtualization is supported, the following command should return ‘Y’.

cat /sys/module/kvm_intel/parameters/nested

I created guest VMs pve1, pve2 and pve3 on my main Proxmox server. Note that when creating the VMs, the CPU type needs to be ‘host’ in order to enable virtualization on the nested servers. I cloned the first VM twice, changing the host names and the machine ids. My main clusters run on 4th generation Intel systems that are inexpensive and very low power. For the first time, creation of these guest VMs pushed the limits of my system and took a long time to complete.

On the nested hosts, enter the following commands. Note that the pve1 Summary page will show “Guest Agent not running” in the IPs section. After these commands, this section will show the IP address.

apt install qemu-guest-agent
systemctl start qemu-guest-agent
systemctl enable qemu-guest-agent

Before enabling HA, you must enable shared storage using Ceph, whose name is derived from cephalopods, the class of marine animals. Ceph uses storage from each of the nodes, three in this case, such that you can remove any one node and the remaining two still hold a complete copy of the data. The VM hard drive must be moved to the Ceph storage pool. Brandon Lee has an excellent video for enabling Ceph.

Cloning Proxmox VMs

I’ve implemented a three-node Proxmox cluster. It makes it very easy to deploy backup servers and copies of servers for testing, and servers can easily be migrated from one node to another. It doesn’t take long to right-click a VM and clone it, but it’s even faster to convert a VM to a template and then create new VMs from the template.

After cloning, the server has the same hostname (/etc/hostname) and hosts file (/etc/hosts) as the original. Both files must be edited to reflect the new host name. The machine id (/etc/machine-id) must also be changed: I use the Pi-hole DHCP server, which uses the machine id to assign IP addresses, so if the id is not changed, the original server and the cloned server will get the same IP address.

These four steps empty the machine-id file, remove the symbolic link to it, generate a new random machine id, and restore the symbolic link:

sudo truncate -s 0 /etc/machine-id
sudo rm -f /var/lib/dbus/machine-id
sudo systemd-machine-id-setup
sudo ln -s /etc/machine-id /var/lib/dbus/machine-id

For clones running an end-of-life Ubuntu release, apt must also be pointed at the old-releases archive:

sudo sed -i 's/us.archive/old-releases/' /etc/apt/sources.list

These commands are executed in ~/bin/clone.
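A minimal sketch of what such a clone script might look like. The function form and the target_root parameter are my additions for illustration (target_root lets the logic be exercised on a scratch directory); the author’s actual ~/bin/clone may differ, and on a real clone you would run as root against /, then finish with systemd-machine-id-setup and the dbus symlink as above.

```shell
#!/bin/bash
# Sketch of the clone fix-up steps as a function (illustrative only).
fix_clone() {
  local new_name="$1" target_root="$2"
  local old_name
  old_name="$(cat "$target_root/etc/hostname")"

  # 1. Write the new hostname
  echo "$new_name" > "$target_root/etc/hostname"
  # 2. Replace every occurrence of the old name in the hosts file
  sed -i "s/${old_name}/${new_name}/g" "$target_root/etc/hosts"
  # 3. Empty the machine-id file so it can be regenerated
  #    (on a real system, follow with systemd-machine-id-setup)
  truncate -s 0 "$target_root/etc/machine-id"
}
```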


Back up Pi-hole on a Schedule

To back up Pi-hole on a schedule, you can use its built-in configuration export tools combined with a cron job. This allows you to save the configuration and restore it if needed. Here’s how to set it up:


1. Pi-hole Configuration Backup

Pi-hole provides a pihole -a command to export and import settings.

Backup Command:

To back up Pi-hole settings, run:

pihole -a -t

This creates a tarball of Pi-hole settings in the current directory, containing:

  • DNS configuration
  • Whitelist/Blacklist
  • Static DHCP leases
  • Query logs (optional)

Restore Command:

To restore from the backup:

pihole -a -r


2. Automate Backup with a Cron Job

Step 1: Create a Backup Directory

Choose where to store the backup files (e.g., /home/pi/pihole-backups):

mkdir -p /home/pi/pihole-backups

Step 2: Write a Backup Script

Create a script to handle the backup process.

Example script: /home/pi/backup_pihole.sh

#!/bin/bash

# Define backup directory
BACKUP_DIR="/home/pi/pihole-backups"
mkdir -p "$BACKUP_DIR"

# Create a timestamp
TIMESTAMP=$(date +"%Y-%m-%d_%H-%M-%S")

# Backup file path
BACKUP_FILE="$BACKUP_DIR/pihole-backup-$TIMESTAMP.tar.gz"

# pihole -a -t writes its tarball to the current directory,
# so run it from a scratch directory and move the result
WORK_DIR=$(mktemp -d)
cd "$WORK_DIR" || exit 1
pihole -a -t > /dev/null 2>&1

# Move the tarball to the backup directory with a timestamped name
mv ./*.tar.gz "$BACKUP_FILE"
cd / && rmdir "$WORK_DIR"

# Delete old backups (optional, e.g., keep 7 days)
find "$BACKUP_DIR" -type f -mtime +7 -exec rm {} \;

Make the script executable:

chmod +x /home/pi/backup_pihole.sh

Step 3: Add a Cron Job

Edit the crontab to schedule the script:

crontab -e

Add a line to schedule the script (e.g., daily at 2:00 AM):

0 2 * * * /home/pi/backup_pihole.sh


3. (Optional) Sync Backups to Another Location

For additional safety, copy backups to an external location (e.g., NAS, cloud storage, or another server).

Example: Use rsync to Copy Backups

Add the following line to the script:

rsync -av --delete /home/pi/pihole-backups/ user@remote-server:/backup-location/


4. Verify Backup and Restore

  1. Run the script manually to test: /home/pi/backup_pihole.sh
  2. Confirm the backup file exists in /home/pi/pihole-backups.
  3. Test restoring using: pihole -a -r

With this setup, Pi-hole backups will occur automatically, and you’ll have a reliable way to restore your configuration when needed. Let me know if you’d like help customizing the process!

Proxmox High Availability Setup (ChatGPT)

I think there’s a pretty good video here to set up high availability (HA) on my Proxmox cluster.

This is my query to ChatGPT:

How to set up ha on my three-node proxmox cluster (pve1, pve2, and pve3) for automatic failover of my pihole dhcp server on pve1.

ChatGPT response:

Setting up high availability (HA) for your Pi-hole DHCP server on a three-node Proxmox cluster (pve1, pve2, pve3) ensures automatic failover if one node goes down. Here’s how you can achieve this:

1. Prerequisites

  1. Shared Storage:
    • Ensure all Proxmox nodes have access to shared storage (e.g., NFS, Ceph, or iSCSI) for the Pi-hole VM or container.
    • Alternatively, use Proxmox replication for local storage to synchronize disks between nodes.
  2. Cluster and Quorum:
    • Your Proxmox cluster (pve1, pve2, pve3) must be fully configured with quorum enabled to manage HA resources. Check with: pvecm status
  3. Pi-hole Setup:
    • Install Pi-hole as a VM or LXC container on pve1.
    • Configure Pi-hole’s DHCP and DNS settings.
  4. HA Configuration Package:
    • Ensure the HA management tools are installed (the pve-ha-manager package ships with Proxmox): apt install pve-ha-manager

2. Configure Shared Storage

If you don’t have shared storage:

  1. Set Up NFS Storage:
    • Install NFS on a NAS or a separate server.
    • Add the shared storage in Proxmox:
      • Go to Datacenter > Storage > Add > NFS.
      • Provide the NFS server details and share path.
  2. Use Local Storage Replication:
    • Enable storage replication between nodes for local storage (not real-time but effective for HA).

3. Move Pi-hole VM to Shared Storage

  1. Migrate the Pi-hole VM’s disk to shared storage: qm move_disk <vmid> <disk> <storage_name> Replace <vmid> with your Pi-hole VM ID, <disk> with the disk to move (e.g., scsi0), and <storage_name> with the shared storage ID.
  2. Confirm the disk location: qm config <vmid>

4. Enable HA for Pi-hole VM

  1. Add the VM to HA Resources:
    • In the Proxmox web interface:
      • Go to Datacenter > HA > Add > Virtual Machine.
      • Select the Pi-hole VM and configure Failover priority.
    • Or use the CLI: ha-manager add vm:<vmid>
  2. Check HA Status: ha-manager status
  3. Set VM Start Priority:
    • Go to VM > Options and set the start/shutdown order.

5. Test HA Setup

  1. Shut down pve1: shutdown -h now
  2. Verify that the Pi-hole VM automatically starts on another node (e.g., pve2 or pve3).

6. Additional Considerations

DNS Redundancy:

  • Add a secondary Pi-hole instance for DNS failover.
    • Configure it on another node (e.g., pve2).
    • Sync configurations between the two using tools like rsync or Gravity Sync.

Monitor DHCP Leases:

  • Ensure DHCP leases persist after failover.
    • Store the dnsmasq lease file on shared storage or sync it between nodes using a script.

Backup Configuration:

  • Regularly back up the Pi-hole configuration and VM:
    • Use Proxmox backup jobs to save VM snapshots.

Summary of Key Commands

  • Add HA Resource: ha-manager add vm:<vmid>
  • Check HA Status: ha-manager status
  • Migrate Disk to Shared Storage: qm move_disk <vmid> <disk> <storage_name>

Let me know if you’d like assistance with any specific part of this process!

Samba Setup (ChatGPT)

This is a copy and paste from ChatGPT (https://chatgpt.com/share/677bd14a-ef84-8002-93fc-b3456d8a9df8). The formatting and loss of copy buttons is a problem.

Setting up a file server on Ubuntu using an external USB drive is a straightforward process. Here’s a step-by-step guide:


1. Prepare Your USB Drive

  1. Plug in the USB Drive: Connect the external USB drive to your Ubuntu machine.
  2. Identify the Drive: Open a terminal and run lsblk. Find the USB drive in the list. It will likely be named something like /dev/sdX (e.g., /dev/sdb1).
  3. Format the Drive (if needed): If the USB drive isn’t already formatted, format it with a suitable file system (e.g., ext4 for Linux systems): sudo mkfs.ext4 /dev/sdX1 Replace /dev/sdX1 with your USB drive’s identifier.
  4. Create a Mount Point: sudo mkdir /mnt/usb
  5. Mount the Drive: sudo mount /dev/sdX1 /mnt/usb Then add the drive to /etc/fstab for automatic mounting: echo "/dev/sdX1 /mnt/usb ext4 defaults 0 0" | sudo tee -a /etc/fstab
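Since /dev/sdX names can change between boots, a more robust fstab entry mounts by UUID. This is a sketch; <uuid> is a placeholder for the value blkid actually prints for your drive.

```shell
# Look up the drive's UUID, then reference it in fstab
sudo blkid /dev/sdX1    # note the UUID="..." value in the output
echo "UUID=<uuid> /mnt/usb ext4 defaults 0 0" | sudo tee -a /etc/fstab
```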

2. Install and Configure Samba (File Sharing for Windows/Linux)

  1. Install Samba: sudo apt update && sudo apt install samba -y
  2. Configure Samba: Open the Samba configuration file with sudo nano /etc/samba/smb.conf and add a shared folder configuration at the end:

[USB_Share]
path = /mnt/usb
read only = no
browsable = yes
writable = yes
guest ok = no
valid users = your_username

  3. Set Samba User: Add your user to Samba: sudo smbpasswd -a your_username
  4. Restart Samba Service: sudo systemctl restart smbd
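Before restarting the service, the configuration can be sanity-checked with Samba’s own parser:

```shell
# testparm parses smb.conf, reports syntax errors, and prints the
# effective share definitions
testparm -s /etc/samba/smb.conf
```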

3. Configure Permissions

Ensure your user has ownership of the USB mount point:

sudo chown -R your_username:your_username /mnt/usb

4. Access the File Server

  • From Linux: Use the smb:// protocol in your file manager.
  • From Windows:
    • Open the File Explorer.
    • Type \\<Ubuntu_IP_Address>\USB_Share in the address bar.
    • Enter your Samba username and password.

Optional: Install and Configure NFS (for Linux Sharing)

If you prefer NFS for Linux clients:

  1. Install the NFS server: sudo apt install nfs-kernel-server -y
  2. Edit the export file with sudo nano /etc/exports and add: /mnt/usb *(rw,sync,no_subtree_check)
  3. Restart the NFS service: sudo systemctl restart nfs-kernel-server

Pi-hole Installation

The official Pi-hole website has a great tutorial. It suggests a single line install:

curl -sSL https://install.pi-hole.net | bash

As the DHCP server, Pi-hole needs a static IP address. The network configuration has moved again and can now be edited with this command:

sudo nmtui

Add a secondary DNS server by creating /etc/dnsmasq.d/99-second-DNS.conf and adding this line:

dhcp-option=option:dns-server,192.168.2.2,192.168.2.1

Proxmox Recovery and Clustering

The day after I moved my DHCP server to Pi-hole on my Proxmox server, Proxmox failed to boot with a grub error. I’ve done regular backups of all but my most recent VMs, so I should be able to execute my restore plan by installing a new instance of Proxmox and restoring the VMs.

My server is a Lenovo M73 ThinkCentre tiny computer. There’s no room for a second drive for a mirror, so I live dangerously. The M73 has an Intel Core i5 CPU that is 10 generations back (2013), but it really gets the job done for me. I eventually acquired three spares. I’m more interested in getting the job done efficiently at low cost than in having expensive failsafe systems. I now have the opportunity to cluster two or three instances of Proxmox, so I’m off on another learning experience.

After updating my main server, it failed to boot with a grub error. I was relieved to find a Proxmox wiki page that described the recovery. A known Debian LVM bug caused the grub “disk not found” failure. The temporary fix is to trigger an LVM metadata update by creating a small logical volume. The only permanent fix for PVE 7.x is to upgrade to PVE 8; the permanent fix for PVE 8 is to install the correct metapackage for UEFI.
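A minimal sketch of the temporary fix, assuming the default volume group name pve; the LV name and size here are illustrative, so check the Proxmox wiki page for the exact procedure.

```shell
# Creating a small logical volume forces an LVM metadata update,
# which works around grub's "disk not found" failure at next boot.
# "pve" is the default volume group name; verify yours with vgs.
lvcreate -L 4M -n grub-workaround pve
```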

The Proxmox upgrade is a simple “apt update && apt upgrade”. However, the enterprise repository is enabled by default, and unless you are an enterprise customer with a subscription, you should use the “no subscription” repositories. To disable the enterprise repository, see this wiki or, better yet, this one.
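For reference, the repository switch looks roughly like this on PVE 8 (Debian bookworm is an assumption; adjust the release name for other versions):

```shell
# Comment out the enterprise repository...
sed -i 's/^deb/# deb/' /etc/apt/sources.list.d/pve-enterprise.list
# ...and add the no-subscription repository
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" \
    > /etc/apt/sources.list.d/pve-no-subscription.list
apt update && apt upgrade
```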

Replace the Proxmox Drive

I’ve backed up the nodes on my main Proxmox server to an external hard drive. I pulled the SSD and replaced it with a new one, then installed the latest release of Proxmox. Now it’s time to restore the VMs. From the Proxmox shell, connect the backup drive from the previous installation and mount it. From Datacenter, add the mount point directory to Datacenter storage, and add the mount to fstab for persistence. ChatGPT helped with the details, and I restored my VMs from the backup drive successfully.
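The mount-and-register sequence looks roughly like the following sketch; the device name /dev/sdb1, the filesystem type, and the storage ID usb-backup are assumptions for illustration.

```shell
# Mount the old backup drive (assuming it shows up as /dev/sdb1, ext4)
mkdir -p /mnt/backup
mount /dev/sdb1 /mnt/backup

# Persist the mount across reboots
echo "/dev/sdb1 /mnt/backup ext4 defaults 0 2" >> /etc/fstab

# Register the directory as backup storage (CLI equivalent of
# Datacenter > Storage > Add > Directory)
pvesm add dir usb-backup --path /mnt/backup --content backup
```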

Proxmox Cluster

I now have three Proxmox nodes, all up to date with the same version. I ran backups on each node, and I’m confident that I can recover from a hard system failure using the external backup drive on my first node. Proxmox clusters make decisions based on majority vote, so it’s important to have an odd number of nodes. Two nodes can work with the addition of a qdevice as the third vote, giving a quorum; a Raspberry Pi is sufficient for a qdevice. However, my hardware is cheap and low power, so three nodes should make for a fully functional cluster. I installed ntp on each node for time synchronization, which is required.
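For a two-node cluster, the qdevice setup is roughly as follows (a sketch; <qdevice-ip> is the address of the external vote holder, e.g., the Raspberry Pi):

```shell
# On the qdevice host (e.g., a Raspberry Pi running Debian):
apt install corosync-qnetd

# On the cluster nodes:
apt install corosync-qdevice
pvecm qdevice setup <qdevice-ip>
```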

I used a video by Tech Tutorials – David McKone to set up the cluster. The VMs on the first node could remain, but I had to remove all the VMs on the other nodes.

I followed the video very carefully up to the point where I clicked “join”, when my second node stopped responding because of an IP conflict. My first Proxmox node had Pi-hole with my DHCP server, and it was down for 24 hours while I replaced the hard disk, so I had started the DHCP server on my router, knowing the potential for conflicts. The join stalled and nothing I found on the internet helped. The partial join caused the quorum requirement to increase to two nodes, but only one node worked. No quorum – nothing worked. Then I described the problem to ChatGPT and got all the commands I needed to repair the damage. ChatGPT even included “copy” buttons for the commands, with the name of my corrupted server filled in! I added the third node simply by hitting the “join” button. Now I can lose a node and still have a quorum.

My third node has problems I can’t fix. I change the quorum