# havirt Usage Guide

## Command Syntax

```bash
havirt [flags] [module] [action] [arguments]
```

- **flags**: Optional parameters (can be placed anywhere in the command)
- **module**: One of `domain`, `node`, or `cluster`
- **action**: The operation to perform
- **arguments**: Action-specific parameters

### Global Flags

| Flag         | Short | Description                          | Default      |
| ------------ | ----- | ------------------------------------ | ------------ |
| `--dryrun`   | `-n`  | Show commands without executing      | 1 (enabled)  |
| `--nodryrun` |       | Alias for `--dryrun 0`               | 0 (disabled) |
| `--verbose`  | `-v`  | Show detailed output                 | 1 (enabled)  |
| `--quiet`    | `-q`  | Suppress output                      | 0 (disabled) |
| `--debug`    | `-d`  | Debug level (0-3)                    | 0            |
| `--force`    | `-y`  | Force operation (bypass time checks) | 0            |
| `--format`   | `-f`  | Output format: `screen` or `tsv`     | screen       |
| `--target`   | `-t`  | Target specific node or domain       | none         |
| `--testing`  |       | Use test data (for development)      | 0            |
| `--help`     | `-h`  | Show help                            |              |
| `--version`  |       | Show version information             |              |

### Flag Placement

Flags can be placed anywhere in the command:

```bash
# All equivalent
havirt --format tsv domain list
havirt domain --format tsv list
havirt domain list --format tsv
```

### Getting Help

Help is available at multiple levels:

```bash
havirt --help              # General help
havirt domain              # Domain module help
havirt domain help         # Domain module help
havirt node                # Node module help
havirt cluster             # Cluster module help
```

## Module Overview

Havirt is organized into three main modules:

### domain - Virtual Machine Management

Manage individual VMs (domains). Operations include starting, stopping, migrating, and listing VMs.

```bash
havirt domain [action] [arguments]
```

### node - Hypervisor Management

Manage hypervisor nodes. Operations include adding nodes, scanning for VMs, and viewing node resources.

```bash
havirt node [action] [arguments]
```

### cluster - Cluster Operations

Cluster-wide operations including statistics, load balancing, and iSCSI management.

```bash
havirt cluster [action] [arguments]
```

---

# Domain Module

Manage individual virtual machines across the cluster.

## domain list

Display all known VMs with their status and resource allocation.

**Syntax:**

```bash
havirt domain list [--format screen|tsv]
```

**Examples:**

```bash
# Display VMs in table format
havirt domain list

# Output as tab-separated values
havirt domain list --format tsv

# Redirect to file for processing
havirt domain list -f tsv > vms.txt
```

**Output:**

```
         name   memory vcpu  node maintenance
       webvm1  4194304    4 node1           0
         db01  8388608    8 node2           0
       testvm  2097152    2 node3           1
```

**Fields:**

- **name**: VM name
- **memory**: Allocated memory in KB
- **vcpu**: Number of virtual CPUs
- **node**: Currently running on this node (empty if stopped)
- **maintenance**: 1 = in maintenance mode, 0 = normal
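
Since memory is reported in KB, TSV output is easy to post-process into friendlier units. A minimal sketch, using a stand-in function in place of `havirt domain list -f tsv` (sample values only):

```bash
#!/bin/sh
# Convert the memory column (KB) to GiB for readability.
# havirt_tsv stands in for `havirt domain list -f tsv` (hypothetical sample data).
havirt_tsv() {
  printf 'name\tmemory\tvcpu\tnode\tmaintenance\n'
  printf 'webvm1\t4194304\t4\tnode1\t0\n'
  printf 'db01\t8388608\t8\tnode2\t0\n'
}

# Skip the header row, divide KB by 1048576 (1024 * 1024) to get GiB
out=$(havirt_tsv | awk -F'\t' 'NR > 1 { printf "%s %.1f GiB\n", $1, $2 / 1048576 }')
echo "$out"
# webvm1 4.0 GiB
# db01 8.0 GiB
```

With a real cluster, replace the stand-in function with the actual `havirt` invocation.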

## domain start

Start a VM on a specified node (or on the current node if none is specified).

**Syntax:**

```bash
havirt domain start <domainname> [nodename]
```

**Safety Checks:**

- Verifies the VM is not running elsewhere in the cluster
- Validates that the VM configuration exists in `conf/`
- Checks that the target node has sufficient resources
- Rescans all nodes before starting

**Examples:**

```bash
# Start VM on specific node
havirt domain start webvm1 node2

# Start VM on current node
havirt domain start webvm1

# Dry-run to see what would happen
havirt --dryrun 1 domain start webvm1 node2
```

**Common Errors:**

```
# VM already running
Error: webvm1 is already running on node1

# Configuration not found
Error: Configuration file conf/webvm1.xml not found

# Insufficient resources
Error: node2 does not have sufficient memory
```

## domain shutdown

Gracefully shut down a running VM.

**Syntax:**

```bash
havirt domain shutdown <domainname>
```

**Behavior:**

- Sends ACPI shutdown signal to VM
- Waits for VM to shut down gracefully

**Examples:**

```bash
# Shutdown VM
havirt domain shutdown webvm1

# View shutdown command without executing
havirt --dryrun 1 domain shutdown webvm1
```

## domain migrate

Live migrate a VM from its current node to another node.

**Syntax:**

```bash
havirt domain migrate <domainname> [target-node]
```

**Behavior:**

- Performs live migration (the VM stays running)
- If no target is specified, selects the least-loaded node automatically
- Updates the cluster database after migration
- Validates that the target has sufficient resources
- **Note:** libvirt can fail on domain migration. If this occurs, you will either need to migrate manually, or shut the domain down and start it on the target node.

**Examples:**

```bash
# Migrate to specific node
havirt domain migrate webvm1 node3

# Migrate to automatically selected node
havirt domain migrate webvm1

# Check what would happen
havirt --dryrun 1 domain migrate webvm1 node3

# Failure reported due to timeout, but domain actually migrated
havirt domain migrate webvm1 node2
# some error message here
havirt node scan --force
havirt domain list | grep webvm1
# in most cases, if the timeout is set too low, the final command will show
# the domain actually moved
```

**Migration Time:**
Typically 10-60 seconds, depending on:

- VM memory size
- Network speed
- VM workload (memory changes during migration)
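
A rough lower bound on migration time is the VM's memory size divided by link throughput (real migrations take longer, since pages dirtied during the copy are re-sent). A back-of-the-envelope sketch with illustrative numbers:

```bash
#!/bin/sh
# Estimate the minimum time to transfer a VM's memory over the migration link.
mem_kb=4194304        # 4 GiB VM, as reported by `havirt domain list`
link_mbit=1000        # 1 Gbit/s network

# KB -> bits (x 1024 x 8), divided by link speed in bits/s; integer arithmetic
seconds=$(( mem_kb * 1024 * 8 / ( link_mbit * 1000000 ) ))
echo "~${seconds}s minimum for ${mem_kb} KB over ${link_mbit} Mbit/s"
# ~34s minimum for 4194304 KB over 1000 Mbit/s
```

This lines up with the 10-60 second range quoted above for typical VM sizes.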

**Safety:**

- Source and target must both access shared storage
- Validates resources before starting migration
- Will not migrate to nodes in maintenance mode
- Will not migrate VMs in maintenance mode

## domain update

Update VM metadata by reading XML configuration files.

**Syntax:**

```bash
havirt domain update [domainname]
havirt domain update -t domainname  # Target specific domain
```

**Behavior:**

- Reads XML files from `conf/` directory
- Extracts memory, vcpu, and other settings
- Updates `var/status.yaml` database
- Without argument: updates ALL domains

**Examples:**

```bash
# Update all domains
havirt domain update

# Update specific domain
havirt domain update webvm1
havirt -t webvm1 domain update
```

**When to Use:**

- After adding new VMs to the cluster
- After modifying VM XML configuration
- After restoring from backup
- If database seems out of sync

## domain new

Generate a `virt-install` command template for creating a new VM.

**Syntax:**

```bash
havirt domain new [domainname]
```

**Generated Information:**

- Unused VNC port
- Unique UUID
- Random MAC address (prefix: 00:16:3e - XEN OUI)
- Template from `virt-install.template`

**Examples:**

```bash
# Generate template
havirt domain new mynewvm

# Copy output and customize
havirt domain new mynewvm > create-vm.sh
vi create-vm.sh
bash create-vm.sh
```

**Warning**: MAC addresses are randomly generated and not guaranteed unique. Check for conflicts in large clusters.
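
The conflict check can be scripted. A sketch, assuming domain XML lives under `conf/` as described above; the `od`/`awk` pipeline is just one portable way to pull three random bytes:

```bash
#!/bin/sh
# Generate a random MAC in the 00:16:3e (Xen) range and check for collisions.
mac=$(od -An -N3 -tx1 /dev/urandom |
      awk '{ printf "00:16:3e:%s:%s:%s", $1, $2, $3 }')
echo "candidate MAC: $mac"

# Search existing domain XML before using the address
if grep -ri "$mac" conf/ 2>/dev/null; then
  echo "collision: $mac already in use" >&2
else
  echo "no collision found"
fi
```

Re-run until a free address comes back; in small clusters a collision is unlikely but cheap to rule out.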

## domain maintenance

Set or clear the maintenance flag for a VM.

**Syntax:**

```bash
havirt domain maintenance <domainname> <on|off>
```

**Examples:**

```bash
# Enable maintenance mode
havirt domain maintenance webvm1 on

# Disable maintenance mode
havirt domain maintenance webvm1 off

# Check current status
havirt domain list | grep webvm1
```

**Effect:**

- Prevents automatic migration during cluster balancing
- Prevents automatic restart by keepalive scripts
- VM can still be manually started/stopped/migrated
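
The maintenance flag is the fifth column of TSV output, so flagged VMs are easy to list. A sketch using stand-in sample data shaped like `havirt domain list -f tsv`:

```bash
#!/bin/sh
# Print only VMs whose maintenance flag (column 5) is set.
# domains_tsv stands in for `havirt domain list -f tsv` (sample data).
domains_tsv() {
  printf 'name\tmemory\tvcpu\tnode\tmaintenance\n'
  printf 'webvm1\t4194304\t4\tnode1\t0\n'
  printf 'testvm\t2097152\t2\tnode3\t1\n'
}

out=$(domains_tsv | awk -F'\t' 'NR > 1 && $5 == 1 { print $1 }')
echo "$out"
# testvm
```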

---

# Node Module

Manage hypervisor nodes in the cluster.

## node list

Display all nodes with their resources and status.

**Syntax:**

```bash
havirt node list [--format screen|tsv]
```

**Examples:**

```bash
# Display nodes
havirt node list

# TSV format for scripts
havirt -f tsv node list
```

**Output:**

```
     name     memory  cpu_count maintenance
    node1  67108864         16           0
    node2  67108864         16           0
    node3  33554432          8           1
```

**Fields:**

- **name**: Node hostname
- **memory**: Total memory in KB
- **cpu_count**: Number of CPU threads
- **maintenance**: 1 = in maintenance, 0 = active

## node scan

Scan nodes to discover running VMs.

**Syntax:**

```bash
havirt node scan [nodename]
havirt node scan [-t nodename] [--force]
```

**Behavior:**

- Executes `virsh list` on each node
- Updates cluster database with running VMs
- By default: scans all nodes
- Respects 5-minute minimum between scans (unless `--force`)

**Examples:**

```bash
# Scan all nodes (respects time limit)
havirt node scan

# Force immediate scan
havirt node scan --force

# Scan specific node
havirt node scan node1
havirt -t node1 node scan
```

**Cron Job:**

```bash
# Add to /etc/cron.d/havirt
*/5 * * * * root /usr/local/bin/havirt node scan --quiet 2>&1 | logger -t havirt
```

**Time Limit:**
By default, scans are limited to once per 5 minutes (configurable in `config.yaml` as `min scan time`). Use `--force` to override.
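
The same rate limit can be reproduced in a wrapper script, e.g. for ad-hoc tooling that should not trigger redundant scans. A sketch of the timestamp-file approach (the file handling and hard-coded 300 s are assumptions for illustration, not havirt's internal mechanism):

```bash
#!/bin/sh
# Only scan if the last scan is at least 5 minutes (300 s) old.
stamp=$(mktemp)    # stands in for a persistent file, e.g. /var/run/havirt.lastscan
echo 0 > "$stamp"  # pretend no scan has ever run

scan_if_due() {
  now=$(date +%s)
  last=$(cat "$stamp" 2>/dev/null || echo 0)
  if [ $(( now - last )) -ge 300 ]; then
    echo "$now" > "$stamp"
    echo "scanning"    # here you would run: havirt node scan
  else
    echo "skipped: last scan $(( now - last ))s ago"
  fi
}

first=$(scan_if_due)   # stamp is stale -> scans
second=$(scan_if_due)  # stamp just refreshed -> skipped
echo "$first / $second"
```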

## node add / node update

Add a new node or update existing node information.

**Syntax:**

```bash
havirt node add <nodename>
havirt node update <nodename>
havirt node update              # Update all nodes
```

**Behavior:**

- Connects via SSH to node
- Executes `virsh nodeinfo`
- Extracts CPU and memory information
- Adds to or updates cluster database

**Examples:**

```bash
# Add new node
havirt node add node4

# Update specific node
havirt node update node1

# Update all nodes
havirt node update
```

**Requirements:**

- SSH connectivity must be configured
- libvirt must be running on target node
- Node must be accessible by hostname/alias

## node maintenance

Place a node in, or remove it from, maintenance mode.

**Syntax:**

```bash
havirt node maintenance <nodename> <on|off>
```

**Examples:**

```bash
# Enable maintenance mode
havirt node maintenance node2 on

# Run cluster balance to evacuate VMs
havirt cluster balance

# Disable maintenance mode when done
havirt node maintenance node2 off
```

**Effect:**

- Prevents new VMs from starting on node
- Prevents node being selected as migration target
- During cluster balance, all VMs evacuated from node
- Existing VMs continue running until migrated

**Typical Workflow:**

```bash
# 1. Enable maintenance
havirt node maintenance node2 on

# 2. Evacuate VMs
havirt cluster balance

# 3. Verify node empty
havirt domain list | grep node2

# 4. Perform maintenance...

# 5. Disable maintenance
havirt node maintenance node2 off
```

---

# Cluster Module

Cluster-wide operations and statistics.

## cluster stats

Display resource usage statistics for the entire cluster.

**Syntax:**

```bash
havirt cluster stats [--format screen|tsv]
```

**Examples:**

```bash
# Display cluster statistics
havirt cluster stats

# TSV format
havirt -f tsv cluster stats
```

**Output:**

```
     node     memory  cpu_count memory_used memory_free cpu_used cpu_free
    node1  67108864         16    25165824    41943040        8        8
    node2  67108864         16    33554432    33554432       12        4
    node3  67108864         16     8388608    58720256        4       12
```

**Fields:**

- **memory**: Total memory (KB)
- **memory_used**: Allocated to running VMs
- **memory_free**: Available for new VMs
- **cpu_count**: Total CPU threads
- **cpu_used**: Virtual CPUs allocated
- **cpu_free**: Available vCPU slots
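
For monitoring, the TSV form of `cluster stats` reduces easily to percentages. A sketch using a stand-in function loaded with the sample values above:

```bash
#!/bin/sh
# Report the percentage of memory free per node from cluster-stats-style TSV.
# stats_tsv stands in for `havirt -f tsv cluster stats` (sample data).
stats_tsv() {
  printf 'node\tmemory\tcpu_count\tmemory_used\tmemory_free\tcpu_used\tcpu_free\n'
  printf 'node1\t67108864\t16\t25165824\t41943040\t8\t8\n'
  printf 'node2\t67108864\t16\t33554432\t33554432\t12\t4\n'
}

# memory_free is column 5, total memory column 2
out=$(stats_tsv | awk -F'\t' 'NR > 1 { printf "%s %.1f%% free\n", $1, 100 * $5 / $2 }')
echo "$out"
# node1 62.5% free
# node2 50.0% free
```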

## cluster balance

Automatically redistribute VMs across nodes for optimal resource utilization.

**Syntax:**

```bash
havirt cluster balance [--dryrun 0|1]
```

**Behavior:**

1. Evacuates all VMs from nodes in maintenance mode
2. Calculates memory imbalance across active nodes
3. Selects VMs to migrate from overloaded nodes
4. Migrates VMs to underutilized nodes
5. Repeats until cluster is balanced
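
Steps 2-4 boil down to comparing used memory across nodes. A simplified illustration of picking a source and target node from stats-style data (not havirt's actual algorithm):

```bash
#!/bin/sh
# Pick the node with the most used memory as migration source, the least as target.
# stats_tsv stands in for `havirt -f tsv cluster stats` (sample data).
stats_tsv() {
  printf 'node\tmemory\tcpu_count\tmemory_used\tmemory_free\tcpu_used\tcpu_free\n'
  printf 'node1\t67108864\t16\t25165824\t41943040\t8\t8\n'
  printf 'node2\t67108864\t16\t33554432\t33554432\t12\t4\n'
  printf 'node3\t67108864\t16\t8388608\t58720256\t4\t12\n'
}

# Track the highest and lowest memory_used (column 4) and report the pair
out=$(stats_tsv | awk -F'\t' 'NR == 1 { next }
  max == "" || $4 > max { max = $4; src = $1 }
  min == "" || $4 < min { min = $4; dst = $1 }
  END { print "migrate from " src " to " dst }')
echo "$out"
# migrate from node2 to node3
```

havirt additionally checks maintenance flags and target capacity before each move, as the Safety list below describes.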

**Examples:**

```bash
# See what would be balanced (safe)
havirt --dryrun 1 cluster balance

# Actually perform balancing
havirt --dryrun 0 cluster balance

# With verbose output
havirt --verbose 1 --dryrun 0 cluster balance
```

**When to Use:**

- After placing node in maintenance mode
- After adding new nodes to cluster
- When cluster becomes unbalanced
- As part of regular maintenance

**Safety:**

- Respects node maintenance flags
- Respects VM maintenance flags
- Validates resources before each migration
- Will not overload target nodes

## cluster iscsi

Manage iSCSI targets across the cluster.

**Syntax:**

```bash
havirt cluster iscsi                    # List targets
havirt cluster iscsi add <target-ip>    # Add target
havirt cluster iscsi update [node]      # Update sessions
```

**Examples:**

```bash
# Add iSCSI target
havirt cluster iscsi add 192.168.1.10

# List configured targets
havirt cluster iscsi

# Update one node
havirt cluster iscsi update node1

# Update all nodes
havirt cluster iscsi update
```

**Behavior:**

- Stores target in cluster configuration
- Executes `iscsiadm` commands on nodes
- Discovers and logs in to targets
- Updates persistent iSCSI configuration

---

# Common Workflows

## Adding a New VM

```bash
# 1. Generate virt-install command
havirt domain new mynewvm > create-vm.sh

# 2. Customize and run
vi create-vm.sh
bash create-vm.sh

# 3. Discover new VM
havirt node scan --force

# 4. Import configuration
havirt domain update mynewvm

# 5. Verify
havirt domain list | grep mynewvm
```

## Performing Node Maintenance

```bash
# 1. Enable maintenance mode
havirt node maintenance node2 on

# 2. Evacuate VMs
havirt --dryrun 0 cluster balance

# 3. Verify evacuation
havirt domain list | grep node2
# (should show no VMs)

# 4. Perform maintenance work...
# shutdown, update, reboot, etc.

# 5. Bring node back online
havirt node maintenance node2 off

# 6. Re-balance cluster
havirt --dryrun 0 cluster balance
```

## Migrating a VM

```bash
# 1. Check current location
havirt domain list | grep webvm1

# 2. Verify target has resources
havirt cluster stats

# 3. Migrate
havirt --dryrun 0 domain migrate webvm1 node3

# 4. Verify migration
havirt node scan --force
havirt domain list | grep webvm1
```

## Emergency VM Shutdown

```bash
# 1. Graceful shutdown
havirt --dryrun 0 domain shutdown vmname

# If that doesn't work after 60 seconds:
# 2. Find node VM is on
havirt domain list | grep vmname

# 3. Force destroy directly
ssh node1 "virsh destroy vmname"

# 4. Rescan cluster
havirt node scan --force
```

## Recovering from Split-Brain

If a VM appears running on multiple nodes:

```bash
# 1. Identify the problem
havirt --debug 2 domain list

# 2. Determine correct node (check VM console/logs)

# 3. Destroy on incorrect node(s)
ssh wrong-node "virsh destroy vmname"

# 4. Rescan
havirt node scan --force

# 5. Verify
havirt domain list | grep vmname
```

---

# Output Formats

## Screen Format (Default)

Fixed-width columns suitable for terminal viewing:

```bash
havirt domain list
```

```
         name   memory vcpu  node maintenance
       webvm1  4194304    4 node1           0
```

## TSV Format

Tab-separated values for scripting:

```bash
havirt --format tsv domain list
```

```
name    memory  vcpu    node    maintenance
webvm1  4194304 4       node1   0
```

**Use Cases:**

```bash
# Import into spreadsheet
havirt -f tsv domain list > vms.tsv

# Process with awk
havirt -f tsv domain list | awk '$4 == "node1" {print $1}'

# Count VMs per node
havirt -f tsv domain list | tail -n +2 | cut -f4 | sort | uniq -c
```

---

# Dry-Run Mode

By default, havirt operates in dry-run mode for safety.

## Checking Mode

```bash
# Check current setting
grep dryrun /media/shared/havirt/config.yaml
```

## Temporary Override

```bash
# Force dry-run (safe)
havirt --dryrun 1 domain start vm1 node1

# Force execution (one command only)
havirt --dryrun 0 domain start vm1 node1
```

## Permanent Change

```bash
# Edit config
vi /media/shared/havirt/config.yaml

# Change:
flags:
  dryrun: 0  # Now executes commands by default
```

---

# Debugging

## Debug Levels

```bash
# Level 1: Basic operations
havirt --debug 1 node scan

# Level 2: Detailed information
havirt --debug 2 domain migrate vm1 node2

# Level 3: Full command trace
havirt --debug 3 cluster balance
```

## Verbose Output

```bash
# Enable verbose mode
havirt --verbose 1 node scan

# Disable verbose mode
havirt --verbose 0 node scan
```

## Checking Database

```bash
# View entire status database
cat /media/shared/havirt/var/status.yaml

# Check specific VM
grep -A 5 "vmname" /media/shared/havirt/var/status.yaml

# Check node population
grep -A 10 "nodePopulation" /media/shared/havirt/var/status.yaml
```

---

# Best Practices

## Regular Operations

1. **Run scans regularly**: Set up cron job for `node scan`
2. **Use dry-run first**: Always test with `--dryrun 1` before executing
3. **Monitor cluster stats**: Regularly check `cluster stats`
4. **Balance periodically**: Run `cluster balance` weekly or after major changes
5. **Update after changes**: Run `domain update` after VM config changes

## Safety

1. **Enable maintenance mode**: Before node maintenance, use `node maintenance`
2. **Verify evacuations**: Check `domain list` after enabling maintenance mode
3. **Test migrations**: Use `--dryrun 1` before migrating critical VMs
4. **Keep backups**: Backup `var/status.yaml` before major operations
5. **Document changes**: Log all manual migrations and changes

## Performance

1. **Limit scan frequency**: Default 5-minute minimum is reasonable
2. **Use TSV for scripts**: Faster parsing than screen format
3. **Target specific nodes**: Use `-t node` instead of scanning all
4. **Batch operations**: Group multiple updates together

## Troubleshooting

1. **Start with verbose**: Use `--verbose 1` for basic troubleshooting
2. **Use debug levels**: Progress from `--debug 1` to `--debug 3`
3. **Force rescans**: Use `--force` if data seems stale
4. **Check SSH**: Verify `ssh node` works manually
5. **Review logs**: Check `/var/log/syslog` for cron job output