# havirt Installation Guide

## Overview

This guide covers the complete installation and initial setup of havirt for a KVM/libvirt cluster with shared storage. The process takes approximately 30-60 minutes for a typical 3-4 node cluster.

## Prerequisites

### Hardware Requirements

- **Multiple hypervisor nodes** (minimum 2, tested with up to 4)
- **Shared block storage** (iSCSI, Fibre Channel, or similar)
- **NFS server** for shared configuration and state files
- **Network connectivity** between all nodes

### Software Requirements

- **Operating System**: Debian-based Linux (tested on Devuan Chimaera, Debian, Ubuntu)
- **Virtualization**: KVM/QEMU with libvirt installed
- **Perl**: Version 5.x or later
- **Perl Modules**:
  - `Data::Dumper` (usually included with Perl)
  - `YAML::Tiny`
  - `FindBin` (usually included with Perl)
- **Utilities**: `ssh`, `svn` (for installation)

### Network Requirements

- All nodes must be able to SSH to each other (including themselves)
- DNS or `/etc/hosts` entries for all node names
- Consistent hostname resolution across all nodes

### Storage Requirements

- **NFS share** mounted at the **same path** on all nodes
- **File locking must be enabled** on the NFS server
- Recommended: Dedicated NFS share for havirt (minimum 100MB)

## Tested Environment

Havirt has been tested and deployed on:

- **Nodes**: 4 x Devuan Linux (Chimaera) hypervisors
- **Virtualization**: libvirt/KVM/QEMU
- **Storage**: NAS providing iSCSI targets
- **NFS**: Mounted at `/media/shared` on all nodes
- **Lock requirement**: NFS file locking enabled

## Installation Steps

### Step 1: Prepare NFS Share

On your NFS server, create and export a share:

```bash
# On NFS server
mkdir -p /exports/havirt
chown root:root /exports/havirt
chmod 755 /exports/havirt
```

Add to `/etc/exports`, listing every hypervisor node that will mount the share:
```
/exports/havirt  node1(rw,sync,no_subtree_check) node2(rw,sync,no_subtree_check) node3(rw,sync,no_subtree_check)
```

**Critical**: Ensure file locking is enabled (default in most NFS configurations).

Reload exports:
```bash
exportfs -ra
```
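
Because every node must appear in the export entry, it can help to generate the line from a node list instead of typing it by hand. The following is a minimal sketch; the function name `make_exports_line` is illustrative and not part of havirt, and the options are the ones used in the example above.

```bash
# Hypothetical helper (not part of havirt): build one /etc/exports line
# that grants every listed node the options used in this guide.
make_exports_line() {
    local path=$1 opts="rw,sync,no_subtree_check" line node
    shift
    line=$path
    for node in "$@"; do
        # Each client is written as host(options), separated by spaces
        line="$line $node($opts)"
    done
    printf '%s\n' "$line"
}
```

For example, `make_exports_line /exports/havirt node1 node2 node3 node4` prints a single line that can be reviewed and then pasted into `/etc/exports`.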

### Step 2: Mount NFS on All Nodes

On each hypervisor node:

```bash
# Create mount point
mkdir -p /media/shared

# Add to /etc/fstab for persistent mounting (skipped if an entry already exists)
grep -q '[[:space:]]/media/shared[[:space:]]' /etc/fstab || \
    echo "nfs-server:/exports/havirt  /media/shared  nfs  defaults  0  0" >> /etc/fstab

# Mount the share
mount /media/shared

# Verify
df -h | grep shared
touch /media/shared/test && rm /media/shared/test  # Test write access
```
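
Write access alone does not show that locking works. The sketch below checks both, assuming the `flock` utility from util-linux is installed. It exercises `flock`-style advisory locking, which may differ from the mechanism havirt itself uses, so treat a pass as a sanity check and not as proof. The name `check_shared_dir` is illustrative.

```bash
# Hypothetical helper (not part of havirt): confirm a directory is
# writable and that an advisory lock can be taken on a file inside it.
check_shared_dir() {
    local dir=$1 probe
    probe="$dir/.havirt-probe.$$"
    if ! touch "$probe" 2>/dev/null; then
        echo "FAIL: cannot write to $dir"
        return 1
    fi
    # flock -n fails immediately instead of waiting if the lock is unavailable
    if ! flock -n "$probe" true; then
        echo "FAIL: cannot lock files in $dir"
        rm -f "$probe"
        return 1
    fi
    rm -f "$probe"
    echo "OK: $dir is writable and lockable"
}
```

Running `check_shared_dir /media/shared` on each node after mounting gives a quick pass/fail result per node.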

### Step 3: Install Dependencies

On each node, install required packages:

**Debian/Ubuntu/Devuan:**
```bash
apt update
# Data::Dumper ships with the core perl package, so no extra package is needed for it
apt install -y libvirt-daemon-system qemu-kvm libyaml-tiny-perl subversion
```

**RHEL/CentOS/Rocky:**
```bash
yum install -y libvirt qemu-kvm perl-Data-Dumper perl-YAML-Tiny subversion
```

Verify Perl modules:
```bash
perl -MData::Dumper -e 'print "Data::Dumper OK\n"'
perl -MYAML::Tiny -e 'print "YAML::Tiny OK\n"'
```

### Step 4: Install havirt

Install havirt to the NFS share (run on any node):

```bash
# Check out stable version
svn co http://svn.dailydata.net/svn/havirt/stable /media/shared/havirt

# Verify installation
ls -la /media/shared/havirt/
```

You should see:
```
havirt              # Main executable
*.pm                # Module files
config.sample.yaml  # Sample configuration
conf/               # Directory for VM configs
var/                # Directory for state files
```

### Step 5: Create Symlinks on All Nodes

On each node:

```bash
ln -s /media/shared/havirt/havirt /usr/local/bin/havirt

# Verify
which havirt
havirt --version
```

### Step 6: Generate Initial Configuration

On any node:

```bash
cd /media/shared/havirt
havirt
```

This creates `config.yaml` with safe defaults. By default, havirt runs in **dry-run mode** (commands are displayed but not executed).

Review and customize:
```bash
vi config.yaml
```

Key settings to review:
```yaml
flags:
  dryrun: 1                    # Keep at 1 until setup complete
  verbose: 1                   # Show detailed output
  debug: 0                     # Set to 1-3 for troubleshooting
  
node reserved memory: 8388608  # 8 GiB reserved per node, in KiB (adjust as needed)
min scan time: 300             # Minimum 5 minutes between scans
```
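
Scripts that wrap havirt may want to confirm the dry-run setting before doing anything. The sketch below reads it with `sed`, assuming the file is laid out as in the excerpt above with a single `dryrun:` line; it is not a general YAML parser. The second helper converts a KiB value to GiB, on the inference that the memory figures are in KiB (8388608 corresponds to 8 GB). Both function names are illustrative.

```bash
# Hypothetical helper (not part of havirt): print the value of the
# dryrun flag. Assumes one "dryrun: N" line, laid out as in this guide.
get_dryrun() {
    sed -n 's/^[[:space:]]*dryrun:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Convert a KiB count to whole GiB (assumes memory settings are in KiB).
kib_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}
```

A wrapper could then refuse to continue unless `get_dryrun /media/shared/havirt/config.yaml` prints the expected value.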

### Step 7: Configure SSH Keys

All nodes must authenticate to each other without passwords.

**On each node, as root:**

```bash
# Generate SSH key (if not already present)
if [ ! -f /root/.ssh/id_rsa ]; then
    ssh-keygen -t rsa -b 4096 -N "" -f /root/.ssh/id_rsa
fi

# Append public key to shared collection
cat /root/.ssh/id_rsa.pub >> /media/shared/havirt/sshkeys
```

**After all nodes have added their keys:**

```bash
# On each node, deploy the collected keys
cat /media/shared/havirt/sshkeys >> /root/.ssh/authorized_keys
chown root:root /root/.ssh/authorized_keys
chmod 600 /root/.ssh/authorized_keys
```
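
Appending the collected keys with `>>` adds duplicate lines each time the step is repeated, for example after a node is added later. A re-runnable variant could look like the sketch below; `merge_keys` is an illustrative name, and the approach relies on `sshd` not caring about the order of lines in `authorized_keys`.

```bash
# Hypothetical helper (not part of havirt): merge collected public keys
# into an authorized_keys file without duplicating lines.
merge_keys() {
    local src=$1 dest=$2 tmp
    tmp=$(mktemp) || return 1
    touch "$dest"
    # sort -u keeps one copy of each distinct line from both files
    sort -u "$src" "$dest" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Overwrite in place so the ownership of the destination is preserved
    cat "$tmp" > "$dest"
    rm -f "$tmp"
    chmod 600 "$dest"
}
```

On each node this would be called as `merge_keys /media/shared/havirt/sshkeys /root/.ssh/authorized_keys`.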

**Test SSH connectivity from each node:**

```bash
# On node1, test connectivity to all nodes
for node in node1 node2 node3 node4; do
    echo "Testing $node..."
    ssh -o StrictHostKeyChecking=accept-new $node hostname
done
```
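
Once host keys have been accepted, a stricter check is to confirm that logins succeed without any prompt, since havirt cannot answer one. The sketch below uses the standard OpenSSH options `BatchMode` and `ConnectTimeout`; the function names `ssh_ok` and `check_all_nodes` are illustrative.

```bash
# Hypothetical helpers (not part of havirt): report which nodes accept a
# non-interactive SSH login from this machine.
ssh_ok() {
    # BatchMode=yes makes ssh fail instead of prompting for a password;
    # -n stops ssh from reading stdin
    ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null
}

check_all_nodes() {
    local rc=0 node
    for node in "$@"; do
        if ssh_ok "$node"; then
            echo "OK   $node"
        else
            echo "FAIL $node"
            rc=1
        fi
    done
    return "$rc"
}
```

Run `check_all_nodes node1 node2 node3 node4` on every node, because key-based access has to work in both directions between each pair.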

**Optional**: Create SSH config for convenience:

```bash
cat > /root/.ssh/config << 'EOF'
Host node1
    HostName node1.example.com
    User root

Host node2
    HostName node2.example.com
    User root
    
# Add entries for all nodes
EOF

chmod 600 /root/.ssh/config
```

### Step 8: Register Nodes

Register each hypervisor node in the cluster database.

**On any node:**

```bash
# Add each node
havirt node add node1
havirt node add node2
havirt node add node3
havirt node add node4

# Verify all nodes registered
havirt node list
```

Example output (abbreviated; expect one row per registered node):
```
     name     memory  cpu_count maintenance
    node1  67108864         16           0
    node2  67108864         16           0
    node3  67108864         16           0
```

For TSV output (useful for scripts):
```bash
havirt --format tsv node list
```
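
As an illustration of scripting against the TSV output, the sketch below totals the memory column. It assumes the TSV form carries a header row with the same column names as the table above and uses tab separators; that layout is an assumption and should be checked against real output first. `total_node_memory` is an illustrative name.

```bash
# Hypothetical helper (not part of havirt): sum the "memory" column of
# tab-separated node data read from stdin. Assumes a header row that
# names the columns as in the table shown in this guide.
total_node_memory() {
    awk -F'\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "memory") col = i; next }
        col     { sum += $col }
        END     { printf "%.0f\n", sum }
    '
}
```

Usage would be `havirt --format tsv node list | total_node_memory`, which prints the cluster-wide total in the same unit as the column.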

### Step 9: Configure iSCSI (Optional)

If using iSCSI shared storage, configure targets:

```bash
# Add iSCSI target to configuration
havirt cluster iscsi add 192.168.1.10

# Verify target added
havirt cluster iscsi

# Test on one node first
havirt cluster iscsi update node1

# If successful, update all nodes
havirt cluster iscsi update

# Verify iSCSI sessions
iscsiadm -m session
```

### Step 10: Import Existing VMs

Discover and register VMs already running on the cluster.

```bash
# Scan all nodes for running VMs
havirt node scan --force

# Import VM configurations
havirt domain update

# List all discovered VMs
havirt domain list
```

Expected output:
```
    name   memory vcpu  node maintenance
  webvm1  4194304    4 node1           0
    db01  8388608    8 node2           0
  testvm  2097152    2 node3           0
```

### Step 11: Enable Automated Scanning

Set up a cron job to keep the cluster database current:

```bash
# Copy sample cron file
cp /media/shared/havirt/havirt.sample.cron /etc/cron.d/havirt

# Edit if needed
vi /etc/cron.d/havirt
```

Default cron content:
```bash
# Scan cluster every 5 minutes
*/5 * * * * root /usr/local/bin/havirt node scan --quiet 2>&1 | logger -t havirt
```

Verify cron job:
```bash
service cron restart  # 'service' works under both sysvinit (Devuan) and systemd
tail -f /var/log/syslog | grep havirt
```

### Step 12: Enable Production Mode

Once testing is complete, enable command execution:

```bash
# Edit configuration
vi /media/shared/havirt/config.yaml
```

Change:
```yaml
flags:
  dryrun: 0  # Set to 0 to execute commands (was 1)
```

**Test with a safe operation:**
```bash
# This should now actually execute
havirt node scan
havirt domain list
```

## Post-Installation

### Verify Installation

Run through this checklist:

```bash
# 1. Check NFS mount on all nodes (adjust the node lists below to match your cluster)
for node in node1 node2 node3; do
    ssh $node "df -h | grep shared"
done

# 2. Verify SSH connectivity
for node in node1 node2 node3; do
    ssh $node hostname
done

# 3. Check node registration
havirt node list

# 4. Verify VM discovery
havirt domain list

# 5. Check cluster statistics
havirt cluster stats

# 6. Test dry-run vs real execution
havirt --dryrun 1 domain list  # Shows command
havirt --dryrun 0 domain list  # Executes command
```

### Test Migration

Test VM migration between nodes:

```bash
# List VMs and their locations
havirt domain list

# In dry-run mode, test migration command
havirt --dryrun 1 domain migrate testvmname node2

# If output looks correct, execute
havirt --dryrun 0 domain migrate testvmname node2

# Verify migration
havirt node scan --force
havirt domain list
```
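
The final verification can be scripted by extracting the VM's node from the list output. This is a sketch that assumes the whitespace-separated column order shown in this guide (name, memory, vcpu, node, maintenance); `domain_node` is an illustrative name, not a havirt subcommand.

```bash
# Hypothetical helper (not part of havirt): print the node a VM is
# recorded on, reading the "havirt domain list" table from stdin.
# Assumes the column order: name memory vcpu node maintenance.
domain_node() {
    # NR > 1 skips the header row; column 4 is the node name
    awk -v vm="$1" 'NR > 1 && $1 == vm { print $4 }'
}
```

After a migration, `havirt domain list | domain_node testvmname` should print the target node; empty output means the VM was not found in the list.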

### Configure Logging (Recommended)

Set up dedicated logging:

```bash
# Create log directory
mkdir -p /var/log/havirt

# Add to rsyslog
cat > /etc/rsyslog.d/havirt.conf << 'EOF'
if $programname == 'havirt' then /var/log/havirt/havirt.log
& stop
EOF

# Restart rsyslog
service rsyslog restart  # 'service' works under both sysvinit (Devuan) and systemd

# Add logrotate
cat > /etc/logrotate.d/havirt << 'EOF'
/var/log/havirt/*.log {
    daily
    rotate 30
    compress
    delaycompress
    notifempty
    create 640 root root
}
EOF
```

## Troubleshooting

### NFS Issues

**Problem**: "Permission denied" accessing `/media/shared/havirt`

```bash
# Check NFS mount
mount | grep shared

# Check permissions
ls -la /media/shared

# Test write access
touch /media/shared/test && rm /media/shared/test

# Check NFS server exports
showmount -e nfs-server
```

**Problem**: Lock file issues

```bash
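# Remove a stale lock (only after confirming no havirt process is running on any node)
rm -f /media/shared/havirt/var/status.yaml.lock

# Verify NFS locking is enabled on the server (NFSv3: nlockmgr must be registered)
rpcinfo -p nfs-server | grep nlockmgr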
```
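
Before deleting a lock it is worth checking how old it is, since a recent lock may belong to a scan that is still running. The sketch below assumes the lock file's modification time reflects when the lock was taken, which has not been confirmed against havirt's implementation, and it relies on GNU `stat`. `file_age_seconds` is an illustrative name.

```bash
# Hypothetical helper (not part of havirt): print how many seconds ago a
# file was last modified. Uses GNU stat, where -c %Y prints the
# modification time as a Unix timestamp.
file_age_seconds() {
    local now mtime
    mtime=$(stat -c %Y "$1") || return 1
    now=$(date +%s)
    echo $(( now - mtime ))
}
```

A lock whose age from `file_age_seconds /media/shared/havirt/var/status.yaml.lock` is far larger than the scan interval is a reasonable candidate for removal; a lock that is only a few seconds old is probably still in use.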

### SSH Connectivity Issues

**Problem**: "Permission denied (publickey)"

```bash
# Verify key in authorized_keys
cat /root/.ssh/authorized_keys

# Check SSH key permissions
chmod 700 /root/.ssh
chmod 600 /root/.ssh/id_rsa
chmod 644 /root/.ssh/id_rsa.pub
chmod 600 /root/.ssh/authorized_keys

# Test with verbose output
ssh -v node2
```

### Module Installation Issues

**Problem**: "Can't locate YAML/Tiny.pm"

```bash
# Debian/Ubuntu
apt install libyaml-tiny-perl

# RHEL/CentOS
yum install perl-YAML-Tiny

# Or use CPAN
cpan YAML::Tiny
```

### VM Not Found

**Problem**: havirt can't find running VM

```bash
# Force rescan
havirt node scan --force

# Check if VM actually running
virsh list --all

# Update VM database
havirt domain update vmname

# Check status database
grep vmname /media/shared/havirt/var/status.yaml
```

### Debug Mode

Enable verbose debugging:

```bash
# Debug level 1: Basic info
havirt --debug 1 node scan

# Debug level 2: Detailed info
havirt --debug 2 domain list

# Debug level 3: Full command trace
havirt --debug 3 cluster stats
```

## Upgrading

To upgrade to a newer version:

```bash
cd /media/shared/havirt

# Back up the current version (dated, so repeated upgrades do not nest backups)
cp -a /media/shared/havirt "/media/shared/havirt.backup.$(date +%Y%m%d)"

# Update from repository
svn update

# Check version
havirt --version

# Review changes
cat CHANGES.md

# Test with dry-run
havirt --dryrun 1 node scan
```

## Uninstallation

To remove havirt:

```bash
# Remove cron job (on each node where it was installed)
rm /etc/cron.d/havirt

# Remove symlink (run on every node)
rm /usr/local/bin/havirt

# Remove installation (from NFS)
rm -rf /media/shared/havirt

# Unmount NFS if no longer needed
umount /media/shared
```

## Next Steps

After installation:

1. Review [USAGE.md](USAGE.md) for detailed command reference
2. Review [README.md](README.md) for operational guidelines
3. Set up monitoring/alerting for cluster health
4. Document your VM migration procedures
5. Test maintenance mode workflow
6. Test cluster balancing in dry-run mode

## Getting Help

- Check [USAGE.md](USAGE.md) for command syntax
- Review [README.md](README.md) for common tasks
- Enable `--debug 3` for detailed troubleshooting
- Check `/var/log/syslog` for cron job output
- Review `/media/shared/havirt/var/status.yaml` for cluster state