A while back I wanted to build a cluster with rgmanager, clvm, gfs2, a SAN, and Dell's CMC fencing to support live
migration and automatic failover of Xen/KVM virtual machines.
I ended up trying to prototype this using VMware Fusion on a Mac but was
unable to, because there were no fencing options for that VM
platform at the time.
Today I was glad to get a three-node RHEL 7 HA web cluster running
entirely within my laptop, using just four KVM VMs: three nodes running
pacemaker and corosync, plus one VM acting as an iSCSI server. I used the
following documents for reference.
- How can I set up virtual fencing on my KVM Virtual machine cluster?
- Creating a Red Hat High-Availability Cluster with Pacemaker
- Configuring an Apache Web Server in a Red Hat High Availability Cluster with the pcs Command
- iSCSI server
- iSCSI clients
Below are some of my notes, mainly for my own purposes.
Get the software
subscription-manager register
subscription-manager attach --pool=$POOL
subscription-manager repos --disable=*
subscription-manager repos --enable=rhel-7-server-rpms
subscription-manager repos --enable=rhel-ha-for-rhel-7-server-rpms
yum -y install pacemaker corosync fence-agents-all fence-agents-virsh fence-virt pcs dlm lvm2-cluster gfs2-utils
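One step the list above glosses over: the pcsd daemon has to be running on every node before `pcs cluster auth` (used below) will work. Roughly:

systemctl start pcsd.service
systemctl enable pcsd.service
# only needed if firewalld is active on the nodes
firewall-cmd --permanent --add-service=high-availability
firewall-cmd --reload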
Use consistent /etc/hosts
127.0.0.1       localhost localhost.localdomain localhost4 localhost4.localdomain4
::1             localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.122.87  pcs-a pcs-a.example.com
192.168.122.26  pcs-b pcs-b.example.com
192.168.122.129 pcs-c pcs-c.example.com
192.168.122.140 iscsi iscsi.example.com
192.168.122.141 web web.example.com

Note that I cloned my first VM (pcs-a) into pcs-{b,c} at this point.
Start the cluster
Note that the hacluster password was set on the other nodes as well.

passwd hacluster
pcs cluster auth pcs-a.example.com pcs-b.example.com pcs-c.example.com

Notice that I didn't need to set up fencing just to get them talking to each other.
[root@pcs-a ~]# pcs cluster setup --name rh7nodes pcs-a.example.com pcs-b.example.com pcs-c.example.com
Shutting down pacemaker/corosync services...
Redirecting to /bin/systemctl stop pacemaker.service
Redirecting to /bin/systemctl stop corosync.service
Killing any remaining services...
Removing all cluster configuration files...
pcs-a.example.com: Succeeded
pcs-b.example.com: Succeeded
pcs-c.example.com: Succeeded
[root@pcs-a ~]# pcs cluster start --all
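Not captured above, but after starting the cluster it's worth confirming that all three nodes have joined, and optionally enabling the cluster services at boot; something like:

pcs status corosync
pcs status
pcs cluster enable --all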
Get Fencing Working
First you need to configure your hypervisor (my laptop) to fence the VMs using fence_virtd, as described in How can I set up virtual fencing on my KVM Virtual machine cluster?
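The KB article has the authoritative steps; from memory, the host side boils down to roughly the following (treat the key size and paths as a sketch rather than the exact recipe; the key location on the guests matches the key_file used in the stonith resource below):

# on the hypervisor (my laptop)
mkdir -p /etc/cluster
dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=4k count=1
fence_virtd -c            # interactive: choose the multicast listener and libvirt backend
systemctl enable fence_virtd
systemctl start fence_virtd
# copy the key to each cluster node
for h in pcs-a pcs-b pcs-c; do scp /etc/cluster/fence_xvm.key $h:/etc/corosync/fence_xvm.key; done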
pcs stonith describe fence_xvm
pcs stonith create virtfence_xvm fence_xvm
pcs stonith update virtfence_xvm pcmk_host_map="pcs-a.example.com:pcs-a;pcs-b.example.com:pcs-b;pcs-c.example.com:pcs-c" key_file=/etc/corosync/fence_xvm.key

After following the above I was unable at first to fence a node.
[root@pcs-a ~]# stonith_admin --reboot pcs-c.example.com
Command failed: No route to host
[root@pcs-a ~]#

I found that my host hadn't joined the multicast group, which I
addressed by restarting fence_virtd and my VMs.
ss -l | grep 225.0.0.12

tcpdump was very helpful in pointing out that I needed to open the
firewall:
tcpdump -i virbr0 -n port 1229
iptables -I INPUT -p udp --dport 1229 -j ACCEPT

This lets the fence_virtd service be reached.
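That iptables rule does not survive a reboot; on a hypervisor running firewalld, the roughly equivalent permanent rule would be (assuming firewalld is the active firewall there):

firewall-cmd --permanent --add-port=1229/udp
firewall-cmd --reload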
# lsof -i UDP:1229
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
fence_vir 11781 root    8u  IPv4 506550      0t0  UDP *:zented
# ps axu | grep 11781
root     10197  0.0  0.0 112640   960 pts/5    S+   16:02   0:00 grep --color=auto 11781
root     11781  0.0  0.0 158832  4868 ?        Ss   11:23   0:00 /usr/sbin/fence_virtd -w
#

The other issue I ran into is that you must pass fence_xvm not the
hostname but the name that virsh uses; i.e. this will not
work:
[root@pcs-a ~]# fence_xvm -H pcs-c.example.com -dddd | tail -5
Issuing TCP challenge
Responding to TCP challenge
TCP Exchange + Authentication done...
Waiting for return value from XVM host
Operation failed
[root@pcs-a ~]#

This is because `virsh list | grep pcs-c.example.com` will not return
anything. Instead, the following worked, because I had named my VM
rhel7-pcs-c in virsh.
fence_xvm -H rhel7-pcs-c -dddddd

Finally, when debugging on the host, stop the fence_virtd daemon via
the service controller and watch the error messages directly:

service fence_virtd stop
/usr/sbin/fence_virtd -d999 -F
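Two sanity checks worth running once fence_virtd is answering (not shown in my captures): ask the host which domains it is willing to fence, and then drive a fence operation through pacemaker itself rather than calling the agent by hand.

fence_xvm -o list                     # should list the virsh domain names, e.g. rhel7-pcs-c
pcs stonith fence pcs-c.example.com   # fence through the cluster, using the pcmk_host_map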
Configure the iSCSI server
I made a new VM called iscsi.example.com and configured it as an iSCSI server by following the RHEL 7 Storage Administration Guide. I then created a file-based block device to serve as a LUN.
[root@iscsi ~]# systemctl enable target
ln -s '/usr/lib/systemd/system/target.service' '/etc/systemd/system/multi-user.target.wants/target.service'
[root@iscsi ~]# targetcli
Warning: Could not load preferences file /root/.targetcli/prefs.bin.
targetcli shell version 2.1.fb34
Copyright 2011-2013 by Datera, Inc and others.
For help on commands, type 'help'.

/> ls
o- / ................................................................ [...]
  o- backstores ..................................................... [...]
  | o- block ......................................... [Storage Objects: 0]
  | o- fileio ........................................ [Storage Objects: 0]
  | o- pscsi ......................................... [Storage Objects: 0]
  | o- ramdisk ....................................... [Storage Objects: 0]
  o- iscsi ................................................... [Targets: 0]
  o- loopback ................................................ [Targets: 0]
/> iscsi/
/iscsi> create
Created target iqn.2003-01.org.linux-iscsi.iscsi.x8664:sn.9e8fba51d7f4.
Created TPG 1.
/iscsi> ls
o- iscsi ..................................................... [Targets: 1]
  o- iqn.2003-01.org.linux-iscsi.iscsi.x8664:sn.9e8fba51d7f4 ..... [TPGs: 1]
    o- tpg1 ........................................ [no-gen-acls, no-auth]
      o- acls .................................................... [ACLs: 0]
      o- luns .................................................... [LUNs: 0]
      o- portals .............................................. [Portals: 0]
/iscsi>
/backstores> /backstores/fileio create www /var/fileio/www 5G write_back=false
Created fileio www with size 5368709120
/backstores> ls
o- backstores ..................................................
  o- block .....................................................
  o- fileio ....................................................
  | o- www .....................................................
  o- pscsi .....................................................
  o- ramdisk ...................................................
/backstores>
/backstores> /iscsi
/iscsi> ls
o- iscsi .......................................................
  o- iqn.2003-01.org.linux-iscsi.iscsi.x8664:sn.9e8fba51d7f4 ...
    o- tpg1 ....................................................
      o- acls ..................................................
      o- luns ..................................................
      o- portals ...............................................
/iscsi>
/iscsi> iqn.2003-01.org.linux-iscsi.iscsi.x8664:sn.9e8fba51d7f4/tpg1/
/iscsi/iqn.20...ba51d7f4/tpg1> portals/ create 192.168.122.140
Using default IP port 3260
Created network portal 192.168.122.140:3260.
/iscsi/iqn.20...ba51d7f4/tpg1> ls
o- tpg1 ........................................ [no-gen-acls, no-auth]
  o- acls .................................................... [ACLs: 0]
  o- luns .................................................... [LUNs: 0]
  o- portals .............................................. [Portals: 1]
    o- 192.168.122.140:3260 ....................................... [OK]
/iscsi/iqn.20...ba51d7f4/tpg1> luns/ create /backstores/fileio/www
Created LUN 0.
/iscsi/iqn.20...ba51d7f4/tpg1> ls
o- tpg1 ........................................ [no-gen-acls, no-auth]
  o- acls .................................................... [ACLs: 0]
  o- luns .................................................... [LUNs: 1]
  | o- lun0 ............................ [fileio/www (/var/fileio/www)]
  o- portals .............................................. [Portals: 1]
    o- 192.168.122.140:3260 ....................................... [OK]
/iscsi/iqn.20...ba51d7f4/tpg1>
On pcs-{a,b,c} I ran `yum install iscsi-initiator-utils` as per a Red Hat knowledge base article and identified the generated InitiatorName:
[root@pcs-c ~]# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1994-05.com.redhat:44e8828a7868
[root@pcs-c ~]#

From there I assigned the ACLs for my three hosts:
/iscsi/iqn.20...7f4/tpg1/acls> create iqn.1994-05.com.redhat:553420881b94
Created Node ACL for iqn.1994-05.com.redhat:553420881b94
Created mapped LUN 0.
/iscsi/iqn.20...7f4/tpg1/acls> create iqn.1994-05.com.redhat:44e8828a7868
Created Node ACL for iqn.1994-05.com.redhat:44e8828a7868
Created mapped LUN 0.
/iscsi/iqn.20...7f4/tpg1/acls> create iqn.1994-05.com.redhat:d7163d296480
Created Node ACL for iqn.1994-05.com.redhat:d7163d296480
Created mapped LUN 0.
/iscsi/iqn.20...7f4/tpg1/acls> ls
o- acls ............................................................ [ACLs: 3]
  o- iqn.1994-05.com.redhat:44e8828a7868 .................... [Mapped LUNs: 1]
  | o- mapped_lun0 .................................... [lun0 fileio/www (rw)]
  o- iqn.1994-05.com.redhat:553420881b94 .................... [Mapped LUNs: 1]
  | o- mapped_lun0 .................................... [lun0 fileio/www (rw)]
  o- iqn.1994-05.com.redhat:d7163d296480 .................... [Mapped LUNs: 1]
    o- mapped_lun0 .................................... [lun0 fileio/www (rw)]
/iscsi/iqn.20...7f4/tpg1/acls>

I then had them all connect to the same block device:
[root@pcs-c ~]# fdisk -l /dev/sda
fdisk: cannot open /dev/sda: No such file or directory
[root@pcs-c ~]# iscsiadm -m node -T iqn.2003-01.org.linux-iscsi.iscsi.x8664:sn.9e8fba51d7f4 -p iscsi.example.com -l
Logging in to [iface: default, target: iqn.2003-01.org.linux-iscsi.iscsi.x8664:sn.9e8fba51d7f4, portal: 192.168.122.140,3260] (multiple)
Login to [iface: default, target: iqn.2003-01.org.linux-iscsi.iscsi.x8664:sn.9e8fba51d7f4, portal: 192.168.122.140,3260] successful.
[root@pcs-c ~]# fdisk -l /dev/sda

Disk /dev/sda: 5368 MB, 5368709120 bytes, 10485760 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 4194304 bytes
[root@pcs-c ~]#
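My notes skip the discovery step that creates the node record the login command relies on; on a freshly installed initiator it would be something like:

iscsiadm -m discovery -t sendtargets -p iscsi.example.com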
Create an Exclusive LV
I then followed the second half of the documentation to create a file system and LV for the web tree.
pvcreate /dev/sda1
vgcreate my_vg /dev/sda1
lvcreate -L1000 -n my_lv my_vg
mkfs.ext4 /dev/my_vg/my_lv
mount /dev/my_vg/my_lv /var/www/
mkdir /var/www/html
mkdir /var/www/cgi-bin
mkdir /var/www/error
restorecon -R /var/www
echo "hello" > /var/www/html/index.html
umount /var/www
One thing I like about the example is that I didn't need to use CLVM,
since I don't have a true shared file system. I only needed to tell
the cluster that access to a certain LV was to be managed by it and
that the LV should be accessed by only one node at a time. One safeguard
against filesystem corruption (in addition to fencing) was to tell LVM
not to treat my iSCSI LV as an auto-activation volume, by editing
volume_list in /etc/lvm/lvm.conf to include only the local LVs.
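For reference, the lvm.conf change is tiny; a sketch of what it looks like, assuming the only local VG is rhel as in the lvs output below (the Red Hat document also has you rebuild the initramfs afterwards so an old volume_list baked into it doesn't re-activate the VG early in boot):

# /etc/lvm/lvm.conf, activation section: auto-activate only the local VG
volume_list = [ "rhel" ]

# then rebuild the initramfs and reboot
dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)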
[root@pcs-a ~]# lvs
  LV     VG     Attr       LSize    Pool Origin Data%  Move Log Cpy%Sync Convert
  my_lv  my_vg  -wi-a----- 1000.00m
  root   rhel   -wi-ao----   14.81g
  swap   rhel   -wi-ao----    1.70g
[root@pcs-a ~]#

[root@pcs-b ~]# lvs
  LV   VG   Attr       LSize  Pool Origin Data%  Move Log Cpy%Sync Convert
  root rhel -wi-ao---- 14.81g
  swap rhel -wi-ao----  1.70g
[root@pcs-b ~]#

[root@pcs-c ~]# lvs
  LV   VG   Attr       LSize  Pool Origin Data%  Move Log Cpy%Sync Convert
  root rhel -wi-ao---- 14.81g
  swap rhel -wi-ao----  1.70g
[root@pcs-c ~]#
Configure the Cluster Resources
I continued with the documentation and created my cluster resources, starting with the storage resources.

[root@pcs-a ~]# pcs resource create my_lvm LVM volgrpname=my_vg exclusive=true --group apachegroup
[root@pcs-a ~]# pcs resource show
 Resource Group: apachegroup
     my_lvm	(ocf::heartbeat:LVM):	Started
[root@pcs-a ~]# pcs resource create my_fs Filesystem device="/dev/my_vg/my_lv" directory="/var/www" fstype="ext4" --group apachegroup
[root@pcs-a ~]#

Earlier I had decided that 192.168.122.141 would be a shared IP
address for the host web.example.com. I also added a new virtual
NIC to each node (which KVM lets you do without a reboot) and
configured that NIC to not be started on boot, since the cluster will
manage it; e.g. a file similar to the one below exists on the other
nodes but varies by MAC.
[root@pcs-c ~]# cat /etc/sysconfig/network-scripts/ifcfg-ens8
HWADDR="52:54:00:c8:71:77"
TYPE="Ethernet"
BOOTPROTO="none"
DEFROUTE="yes"
NAME="ens8"
ONBOOT="no"
[root@pcs-c ~]#

I then configured this NIC to hold my floating IP address and also
configured the apache resource group.
[root@pcs-a ~]# pcs resource create VirtualIP IPaddr2 ip=192.168.122.141 nic=ens8 cidr_netmask=24 --group apachegroup
[root@pcs-a ~]# pcs resource create Website apache configfile="/etc/httpd/conf/httpd.conf" statusurl="http://127.0.0.1/server-status" --group apachegroup
[root@pcs-a ~]#
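For that statusurl to work, httpd on each node has to serve /server-status to local requests; the Apache-in-HA document has you append a stanza along these lines to /etc/httpd/conf/httpd.conf (reproduced here from memory):

<Location /server-status>
    SetHandler server-status
    Require local
</Location>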
I ran into some issues getting the VirtualIP and Website
resources to start (as reported by `pcs status`) and spent time
chasing an ARP issue using `ip n` on the nodes and `tcpdump -i virbr0`
on the hypervisor before I realized I had accidentally assigned an
address from a different subnet. This was a good experience, however,
because I became acquainted with the following.
[root@pcs-a ~]# pcs resource show
 Resource Group: apachegroup
     my_lvm	(ocf::heartbeat:LVM):	Started pcs-b.example.com
     my_fs	(ocf::heartbeat:Filesystem):	Started pcs-b.example.com
     VirtualIP	(ocf::heartbeat:IPaddr2):	Stopped
     Website	(ocf::heartbeat:apache):	Stopped
[root@pcs-a ~]#

[root@pcs-a ~]# pcs resource debug-start VirtualIP
Operation start for VirtualIP (ocf:heartbeat:IPaddr2) returned 1
 >  stderr: ERROR: Unable to find nic or netmask.
 >  stderr: ERROR: [findif] failed
[root@pcs-a ~]#

[root@pcs-a pcsd]# pcs resource show VirtualIP
 Resource: VirtualIP (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=198.168.122.141 cidr_netmask=24
  Operations: start interval=0s timeout=20s (VirtualIP-start-timeout-20s)
              stop interval=0s timeout=20s (VirtualIP-stop-timeout-20s)
              monitor interval=10s timeout=20s (VirtualIP-monitor-interval-10s)
[root@pcs-a pcsd]#

[root@pcs-a pcsd]# /usr/libexec/heartbeat/send_arp -A -i 200 -r 5 -p /var/run/resource-agents/send_arp-198.168.122.141 ens8 198.168.122.141 auto not_used not_used
ARPING 198.168.122.141 from 198.168.122.141 ens8
Sent 5 probes (5 broadcast(s))
Received 0 response(s)
[root@pcs-a pcsd]#

Can you imagine copying and pasting the above without realizing that I was
using 198.168... instead of 192.168...? It's often something
simple. Eventually I noticed it while watching the ARPs on my virtual
bridge:
# tcpdump -i virbr0 | grep -i arp
14:04:17.183244 ARP, Request who-has web tell pcs-a, length 28
14:04:17.363589 ARP, Request who-has pcs-b tell pcs-c, length 28
14:04:17.363684 ARP, Reply pcs-b is-at 52:54:00:1c:c4:1c (oui Unknown), length 28
14:04:17.779637 ARP, Request who-has laptop.example.com tell pcs-b, length 28
14:04:17.779658 ARP, Reply laptop.example.com is-at fe:54:00:1c:c4:1c (oui Unknown), length 28
14:04:18.184881 ARP, Request who-has web tell pcs-a, length 28
14:04:19.186844 ARP, Request who-has web tell pcs-a, length 28
14:04:22.611597 ARP, Request who-has pcs-c tell pcs-b, length 28
14:04:22.611680 ARP, Reply pcs-c is-at 52:54:00:b2:79:04 (oui Unknown), length 28
...
Test the cluster
Once I had updated the IP address,

pcs resource update VirtualIP IPaddr2 nic=ens8 ip=192.168.122.141 cidr_netmask=24 --group apachegroup

I ran the following while also reloading web.example.com in my browser,

watch -n 1 "ip a s ens8; df -h"

and ran commands like the following,

pcs cluster standby pcs-b.example.com
pcs cluster unstandby pcs-b.example.com

so I could watch the cluster resources move from hosts that suddenly
did not have them:
[root@pcs-b ~]# ip a s ens8; df -h
2: ens8: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:8f:19:c6 brd ff:ff:ff:ff:ff:ff
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/rhel-root     15G  1.2G   14G   8% /
devtmpfs                 489M     0  489M   0% /dev
tmpfs                    498M   39M  459M   8% /dev/shm
tmpfs                    498M  6.6M  491M   2% /run
tmpfs                    498M     0  498M   0% /sys/fs/cgroup
/dev/vda1                497M  136M  362M  28% /boot
[root@pcs-b ~]#

To hosts that did have them:

[root@pcs-a ~]# ip a s ens8; df -h
2: ens8: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:7d:40:6b brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.141/24 scope global ens8
       valid_lft forever preferred_lft forever
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/rhel-root     15G  1.2G   14G   8% /
devtmpfs                 489M     0  489M   0% /dev
tmpfs                    498M   54M  444M  11% /dev/shm
tmpfs                    498M  6.6M  491M   2% /run
tmpfs                    498M     0  498M   0% /sys/fs/cgroup
/dev/vda1                497M  136M  362M  28% /boot
/dev/mapper/my_vg-my_lv  969M  2.5M  900M   1% /var/www
[root@pcs-a ~]#
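A simple way to watch the failover from a client's point of view (not something I captured above) is to poll the floating address in a loop while putting nodes in and out of standby:

while true; do curl -s http://web.example.com/; sleep 1; done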