libvirt, by Red Hat, is a management layer for several virtualization
packages under Linux. While very good for single server workstations
and servers, we've run into limitations when it is put into a high
availability environment where multiple hypervisors are used and iSCSI
servers provide the back end block devices.
 
Since libvirt uses the terms 'node' (for hypervisor) and 'domain' (for
virtual), we use that here. We also use the term 'cluster' to mean a
group of nodes (hypervisors) responsible for managing multiple domains
(virtuals).
 
Limitations of libvirt include:
* inability to have a central repository of domain definitions to
  provide consistency across multiple nodes. If a domain definition is
  modified on one node, it is not synchronized to the other nodes in
  the cluster.
* while using "foreign" block device providers is possible, it is
  intricate, requiring additional record keeping and commands to be
  executed on each node of the cluster to add/remove block devices
* no safeguards to keep a domain from running on multiple nodes at the
  same time (which can result in block device corruption)
 
havirt is a set of scripts written to overcome these limitations. The
first step is record keeping: knowing which domain is running on which
node without having to manually go to each node to check (which is
exactly what havirt currently does).
 
Our setup:
 
* an NFS share is mounted on each node, preferably at a consistent
  location. In the following examples, it is mounted at /media/shared.
  This contains the scripts and files used by havirt. In our case,
  havirt is under a separate directory in the NFS share, with other
  subdirectories used for things like ISO's and images.
* nodes in the cluster can make a passwordless ssh connection to any
  other node in the cluster using public key authentication. INSECURE:
  if any node is compromised, all other nodes can be trivially
  connected to. Be sure to limit access to all nodes with firewalls
  and strong authentication.
* each node has a /root/.ssh/config file allowing us to access the
  other nodes by a short alias (see the sample entry after this list).
 
 
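A minimal /root/.ssh/config entry might look like the following (the
alias, address, and key file are hypothetical; adjust to your cluster):

Host node2
    HostName 10.0.0.12
    User root
    IdentityFile ~/.ssh/id_ed25519

With one entry like this per node, 'ssh node2 virsh list' works from
any other node.
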
Installation is simple, assuming you have a shared storage area:

svn co http://svn.dailydata.net/svn/sysadmin_scripts/trunk/virtuals /media/shared/virtuals
ln -s /media/shared/virtuals/havirt /usr/local/bin/havirt
 
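Note that the symlink must be created on every node. A quick way to do
that from one node, assuming the ssh aliases above (the node names are
hypothetical):

for n in node2 node3; do
  ssh $n ln -s /media/shared/virtuals/havirt /usr/local/bin/havirt
done
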
The directory chosen is self contained; scripts, configuration files
and database files are all stored in that tree. The file
/media/shared/virtuals/havirt.conf can be used to override some of
these locations if desired, but the files must be accessible and
writable by all nodes in the cluster.
 
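As an illustration only (the variable name below is hypothetical;
check the comments in havirt.conf for the actual settings), an
override might look like:

# keep the YAML database files elsewhere on the share
DBDIR=/media/shared/havirt-data/var
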
=== Currently (2024-03-17), record keeping is implemented. The
following commands exist.
 
havirt node update [node] [node]... # update a given node (or ALL)
havirt node list # display tab delimited list of node specs
havirt node scan # find domains on all nodes
havirt domain update ALL|RUNNING|[domain] [domain]... # update domains
havirt domain list [domain] [domain]... # display domain definitions
 
havirt node update
Gets the resources available on the node(s) passed in. Issues the
command 'virsh nodeinfo' on each node, parses the result, and
populates the definition in var/node.yaml. Adds a new entry if one
does not exist.
 
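The 'virsh nodeinfo' output being parsed looks roughly like this (the
values shown are made up):

CPU model:           x86_64
CPU(s):              16
CPU frequency:       2600 MHz
CPU socket(s):       1
Core(s) per socket:  8
Thread(s) per core:  2
NUMA cell(s):        1
Memory size:         65837632 KiB
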
havirt node list
Generates a tab delimited list of information about all nodes in the
cluster.
 
havirt node scan
Scans each node in the cluster to determine which domains are
currently running on it. Stores the information in
var/node_population.yaml. This should be run regularly to ensure the
database is always up to date; we have it set up as a cron job that
runs every 5 minutes.
 
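A minimal root crontab entry for that, assuming the symlink created
during installation:

*/5 * * * * /usr/local/bin/havirt node scan
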
havirt domain update
* Parses the config file for the domain (conf/domainname.xml) for some
  useful information such as the VNC port, number of vcpu's and amount
  of memory, updating these values in var/domain.yaml.
* If the config file for a domain does not exist, gets a copy by
  running 'virsh dumpxml' on the appropriate node.
* If domain is set to ALL, will do this for all domains already in
  var/domain.yaml. If domain is set to RUNNING, will scan all nodes for
  running domains and act on them.
NOTE: this does not refresh the config file. I intend to add a 'force'
      flag later, but for now, you should remove conf/domainname.xml
      if you want this file refreshed.
 
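For example, to force a refresh of one domain's config file (the
domain name is hypothetical, and the path assumes the layout above):

rm /media/shared/virtuals/conf/www1.xml
havirt domain update www1
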
havirt domain list
Dumps the definition of one or more domains to STDOUT as a tab
delimited list of values.
 
=== Additional functionality is planned in the near future. NOTE: By
default, havirt will simply display a list of commands to be run from
the shell, though this can be overridden by a config file change or a
command line flag.
 
havirt node maintenanceon nodename
Will flag nodename as having maintenance run on it and remove it from
the pool, then migrate all domains off of the node to other nodes in
the cluster.
 
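Since havirt displays commands by default, a run might print migration
commands of roughly this form (node and domain names are
hypothetical):

virsh migrate --live www1 qemu+ssh://node1/system
virsh migrate --live mail1 qemu+ssh://node2/system
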
havirt node maintenanceoff nodename
Toggles the maintenance flag off for nodename, allowing it to accept
migration/running of domains. Generally followed by 'havirt cluster
balance'.
 
havirt cluster balance
Checks the amount of resources used on each node and determines a way
to even the resources (memory, vcpu's) out by migrating domains to
different nodes.
 
havirt cluster validate
Checks all nodes in the cluster to ensure
A) the same vnets exist
B) the same iSCSI targets are mounted
C) /root/.ssh/authorized_keys contains the keys of all other nodes
D) /root/.ssh/config is the same
 
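Done by hand, these are the kinds of checks involved (a sketch,
assuming the ssh aliases above):

ssh node2 virsh net-list --all   # compare vnets
ssh node2 iscsiadm -m session    # compare logged in iSCSI targets
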
havirt node iscsiscan
Scans iSCSI on node[s], adding/removing targets. Generally used after
changes are made to an iSCSI target.
 
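With open-iscsi, the underlying commands typically look something like
this (the portal address is hypothetical):

iscsiadm -m discovery -t sendtargets -p 10.0.0.50  # find targets
iscsiadm -m node --login                           # log into new targets
iscsiadm -m session --rescan                       # pick up resized LUNs
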
havirt domain start domainname [nodename]
Will start domain domainname on nodename (or the local node) using the
config file from conf/domainname.xml. Validates the domain is not
running on any node before executing a 'virsh create
conf/domainname.xml'.
 
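For example, to start a hypothetical domain www1 on node2:

havirt domain start www1 node2

By default this would display, rather than run, something like:

ssh node2 virsh create /media/shared/virtuals/conf/www1.xml
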
havirt domain stop domainname
Locates the node domainname is running on and issues a shutdown
command. Upon success, sets the domain to 'manual' (to override
'keepalive').
 
havirt domain migrate domainname nodename
Migrates domainname to nodename after verifying enough resources exist
on nodename.