AN!Wiki :: How To :: 2-Node Red Hat KVM Cluster Tutorial — Troubleshooting
Warning: This document is old, abandoned and very out of date. DON’T USE ANYTHING HERE! Consider it only as historical note taking. |
This is the troubleshooting section from the 2-Node Red Hat KVM Cluster Tutorial.
Contents
- 1 Troubleshooting
- 1.1 [vm] error: internal error Attempt to migrate guest to the same host {uuid}
- 1.1.1 Setting host_uuid Didn’t Work, What Now?
- 1.2 [vm] error: Cannot recv data: Host key verification failed.#015: Connection reset by peer
- 1.3 error: unknown OS type hvm
- 1.4 My VM Just Vanished!
- 1.5 Disabling rsyslog Rate Limiting
- 1.6 FATAL: Module drbd not found
- 1.7 Starting Cluster; Mounting configfs… mount: none already mounted or /sys/kernel/config busy
Troubleshooting
Here we will cover, in no particular order, some common clustering problems and their fixes.
[vm] error: internal error Attempt to migrate guest to the same host {uuid}
Note: This message will appear in the source node's syslog when trying to migrate a VM. Here is an example set of error messages.
Dec 27 22:00:46 an-node01 rgmanager[2492]: Migrating vm:vm0001-dev to an-node02.alteeve.ca
Dec 27 22:00:46 an-node01 rgmanager[22331]: [vm] Migrate vm0001-dev to an-node02.alteeve.ca failed:
Dec 27 22:00:46 an-node01 rgmanager[22353]: [vm] error: internal error Attempt to migrate guest to the same host 00020003-0004-0005-0006-000700080009
Dec 27 22:00:46 an-node01 rgmanager[2492]: migrate on vm "vm0001-dev" returned 150 (unspecified)
Dec 27 22:00:46 an-node01 rgmanager[2492]: Migration of vm:vm0001-dev to an-node02.alteeve.ca failed; return code 150
For reasons as yet unknown, both nodes have the same UUID. You can verify this by running virsh sysinfo | grep uuid on both nodes.
First node;
virsh sysinfo | grep uuid
<entry name='uuid'>03000200-0400-0500-0006-000700080009</entry>
Second node;
virsh sysinfo | grep uuid
<entry name='uuid'>03000200-0400-0500-0006-000700080009</entry>
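The comparison above can also be done in one step from a single node. What follows is only a rough sketch: the helper names `extract_uuid` and `same_uuid` are made up for this example, and it assumes `virsh sysinfo` prints the UUID in the `<entry name='uuid'>` form shown above and that root SSH between the nodes works.

```shell
#!/bin/sh
# Pull the UUID out of 'virsh sysinfo'-style XML read from stdin.
extract_uuid() {
    sed -n "s/.*<entry name='uuid'>\([^<]*\)<\/entry>.*/\1/p" | head -n 1
}

# Succeed (exit 0) when two UUIDs are identical, ignoring letter case.
same_uuid() {
    [ "$(printf '%s' "$1" | tr 'A-F' 'a-f')" = "$(printf '%s' "$2" | tr 'A-F' 'a-f')" ]
}

# Example use (run on the first node; the peer hostname is a placeholder):
#   local_uuid=$(virsh sysinfo | extract_uuid)
#   peer_uuid=$(ssh root@an-node02 virsh sysinfo | extract_uuid)
#   same_uuid "$local_uuid" "$peer_uuid" && echo "Both nodes report the same UUID"
```

If the final command prints its message, both nodes are reporting the same host UUID and migration will fail as described.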
This UUID comes from the mainboard, and you can confirm this with the following command (be sure to change the string passed to grep to a portion of your own UUID);
03000200-0400-0500-0006-000700080009
Alternatively;
dmidecode |grep 000700080009 -B 7 -A 4
Handle 0x0001, DMI type 1, 27 bytes
System Information
    Manufacturer: empty
    Product Name: empty
    Version: empty
    Serial Number: empty
    UUID: 03000200-0400-0500-0006-000700080009
    Wake-up Type: Power Switch
    SKU Number: To be filled by O.E.M.
    Family: To be filled by O.E.M.
This is the result of a lazy vendor re-using UUIDs across mainboards.
The fix is to specify a unique UUID in /etc/libvirt/libvirtd.conf using its host_uuid variable. We'll generate a new, unique UUID for each node using the uuidgen command. Be sure to use a different new UUID on each node!
On the first node;
cp /etc/libvirt/libvirtd.conf /etc/libvirt/libvirtd.conf.orig
uuidgen
31873b9e-1069-42ce-b950-137ae5eaa3d1
Change the UUID;
vim /etc/libvirt/libvirtd.conf
host_uuid = "31873b9e-1069-42ce-b950-137ae5eaa3d1"
Here’s the diff;
diff -u /etc/libvirt/libvirtd.conf.orig /etc/libvirt/libvirtd.conf
--- /etc/libvirt/libvirtd.conf.orig 2011-12-27 22:29:01.243394880 -0500
+++ /etc/libvirt/libvirtd.conf 2011-12-27 22:33:44.309799253 -0500
@@ -365,4 +365,4 @@
 # NB This default all-zeros UUID will not work. Replace
 # it with the output of the 'uuidgen' command and then
 # uncomment this entry
-#host_uuid = "00000000-0000-0000-0000-000000000000"
+host_uuid = "31873b9e-1069-42ce-b950-137ae5eaa3d1"
Make the same change, with a new and unique UUID, on the second node.
cp /etc/libvirt/libvirtd.conf /etc/libvirt/libvirtd.conf.orig
uuidgen
90b8d280-c9ff-4e0e-867e-6d4f7d915995
Change the UUID;
vim /etc/libvirt/libvirtd.conf
host_uuid = "90b8d280-c9ff-4e0e-867e-6d4f7d915995"
Here’s the diff;
diff -u /etc/libvirt/libvirtd.conf.orig /etc/libvirt/libvirtd.conf
--- /etc/libvirt/libvirtd.conf.orig 2011-12-27 22:35:45.975389858 -0500
+++ /etc/libvirt/libvirtd.conf 2011-12-27 22:36:28.325518880 -0500
@@ -365,4 +365,4 @@
 # NB This default all-zeros UUID will not work. Replace
 # it with the output of the 'uuidgen' command and then
 # uncomment this entry
-#host_uuid = "00000000-0000-0000-0000-000000000000"
+host_uuid = "90b8d280-c9ff-4e0e-867e-6d4f7d915995"
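If you would rather not edit the file by hand, the same change could be scripted. This is a minimal sketch, not part of the original procedure: `set_host_uuid` is a hypothetical helper, and it assumes GNU sed (for `-i -E`) and a host_uuid line that is either commented out or already set, as in the stock file.

```shell
#!/bin/sh
# Back up a libvirtd.conf-style file, then set host_uuid to the given value.
# Usage: set_host_uuid /etc/libvirt/libvirtd.conf "$(uuidgen)"
set_host_uuid() {
    conf="$1"
    uuid="$2"
    cp -p "$conf" "$conf.orig"
    if grep -Eq '^[[:space:]]*#?[[:space:]]*host_uuid[[:space:]]*=' "$conf"; then
        # Replace the existing (possibly commented-out) entry in place.
        sed -i -E "s|^[[:space:]]*#?[[:space:]]*host_uuid[[:space:]]*=.*|host_uuid = \"$uuid\"|" "$conf"
    else
        # No entry at all: append one.
        printf 'host_uuid = "%s"\n' "$uuid" >> "$conf"
    fi
}
```

Run it once per node so that each node ends up with its own freshly generated UUID, then compare against the .orig backup with diff -u as above.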
Now to reload the configuration, we need to restart libvirtd (a reload is not enough).
Warning: Be sure to stop all VMs on the node before proceeding! |
/etc/init.d/libvirtd restart
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]
virsh sysinfo | grep uuid
This should show the new UUID. If it doesn’t though, please apply the work-around below.
Setting host_uuid Didn’t Work, What Now?
Warning: This work-around is not supported in any way by Red Hat or any other vendor. This work-around is provided as-is until libvirt is fixed. (Dec. 28, 2011) |
The problem is that libvirt doesn't use the host_uuid from libvirtd.conf if it sees the system UUID as being valid (not all 0 or all f).
The work-around is to create a wrapper script for dmidecode that intercepts dmidecode -q -t 0,1,4,17, reads libvirtd.conf and, if host_uuid is set, substitutes the UUID returned by dmidecode with the one set by host_uuid.
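To illustrate the idea, here is a minimal sketch of how such a wrapper could work. It is not the script distributed from the URL used below, whose contents are not reproduced here. The function names are invented for this example, and it assumes the real binary has been renamed to /usr/sbin/dmidecode.orig.

```shell
#!/bin/sh
# Print the host_uuid value from a libvirtd.conf-style file, if one is set
# (commented-out entries are ignored).
get_host_uuid() {
    sed -n -E 's/^[[:space:]]*host_uuid[[:space:]]*=[[:space:]]*"([^"]+)".*/\1/p' "$1" | head -n 1
}

# Filter stdin: replace the value on any "UUID:" line with the given UUID.
# With an empty argument the input is passed through unchanged.
substitute_uuid() {
    uuid="$1"
    if [ -n "$uuid" ]; then
        sed -E "s/^([[:space:]]*UUID:[[:space:]]*).*/\1$uuid/"
    else
        cat
    fi
}

# A wrapper installed as /usr/sbin/dmidecode could then end with (assumption):
#   /usr/sbin/dmidecode.orig "$@" | substitute_uuid "$(get_host_uuid /etc/libvirt/libvirtd.conf)"
```

This sketch rewrites the UUID for every invocation, whereas the description above only intercepts the specific call libvirt makes. A real wrapper may want to check its arguments first.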
To apply the work-around;
Check that the current dmidecode returns the bad UUID;
dmidecode -q -t 0,1,4,17 | grep UUID
UUID: 03000200-0400-0500-0006-000700080009
Now we’re going to rename dmidecode as dmidecode.orig, then download the wrapper script.
mv /usr/sbin/dmidecode /usr/sbin/dmidecode.orig
wget -c https://alteeve.ca/files/dmidecode -O /usr/sbin/dmidecode
--2011-12-28 13:44:27-- https://alteeve.ca/files/dmidecode
Resolving alteeve.ca... 192.139.81.121
Connecting to alteeve.ca|192.139.81.121|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1159 (1.1K) [text/plain]
Saving to: “/usr/sbin/dmidecode”
100%[======================================>] 1,159 --.-K/s in 0s
2011-12-28 13:44:28 (15.3 MB/s) - “/usr/sbin/dmidecode” saved [1159/1159]
chmod 755 /usr/sbin/dmidecode
ls -lah /usr/sbin/dmidecode
-rwxr-xr-x 1 root root 1.2K Dec 28 13:26 /usr/sbin/dmidecode
Now re-run the dmidecode call and see that the new UUID is used.
dmidecode -q -t 0,1,4,17 | grep UUID
UUID: 31873b9e-1069-42ce-b950-137ae5eaa3d1
This matches what was set in /etc/libvirt/libvirtd.conf;
grep host_uuid /etc/libvirt/libvirtd.conf
host_uuid = "31873b9e-1069-42ce-b950-137ae5eaa3d1"
Now restart libvirtd and check virsh sysinfo to confirm that libvirtd now returns the proper UUID.
/etc/init.d/libvirtd restart
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]
virsh sysinfo | grep uuid
<entry name='uuid'>31873b9e-1069-42ce-b950-137ae5eaa3d1</entry>
Done!
As soon as libvirtd is fixed, this section will be re-written.
[vm] error: Cannot recv data: Host key verification failed.#015: Connection reset by peer
This can show up when you try to live migrate a VM but your /root/.ssh/known_hosts file has not been populated. Effectively, the cluster was prompted to accept the finger-print of the target node, was unable to answer and so then closed the connection.
The syslog entry will look something like this;
Dec 27 21:58:00 an-node02 rgmanager[2439]: Migrating vm:vm0003-db to an-node01.alteeve.ca
Dec 27 21:58:01 an-node02 rgmanager[18951]: [vm] Migrate vm0003-db to an-node01.alteeve.ca failed:
Dec 27 21:58:01 an-node02 rgmanager[18973]: [vm] error: Cannot recv data: Host key verification failed.#015: Connection reset by peer
Dec 27 21:58:01 an-node02 rgmanager[2439]: migrate on vm "vm0003-db" returned 150 (unspecified)
Dec 27 21:58:01 an-node02 rgmanager[2439]: Migration of vm:vm0003-db to an-node01.alteeve.ca failed; return code 150
To fix the problem, please return to Populating And Pushing ~/.ssh/known_hosts.
error: unknown OS type hvm
This can be caused by hardware virtualization support being disabled in your BIOS.
To check whether you have hardware virtualization support enabled, run;
egrep '(vmx|svm)' --color=always /proc/cpuinfo
On Intel machines, you should see this;
On AMD machines, you should see this;
The above will have the vmx or svm flag highlighted, and the flags line will be quite long. You will also see an entry for every CPU core (or hyperthreaded pseudo-core).
If you don't see a match for either vmx or svm, please consult your motherboard's manual for information on enabling hardware virtualization.
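If you want a yes/no answer rather than a long highlighted flags line, a small helper along these lines could be used. `virt_support` is a name made up for this sketch. It reads /proc/cpuinfo by default and accepts another path as its first argument.

```shell
#!/bin/sh
# Report which hardware virtualization flag, if any, appears in a
# cpuinfo-style file: "vmx" (Intel), "svm" (AMD) or "none".
virt_support() {
    cpuinfo="${1:-/proc/cpuinfo}"
    if grep -qw 'vmx' "$cpuinfo"; then
        echo "vmx"
    elif grep -qw 'svm' "$cpuinfo"; then
        echo "svm"
    else
        echo "none"
    fi
}

# Example use:
#   [ "$(virt_support)" = "none" ] && echo "Enable virtualization in the BIOS"
```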
My VM Just Vanished!
Warning: If virsh tries to start a virtual machine but a referenced device or media is missing, it will react by completely undefining the virtual machine! |
If you ever suddenly find that a virtual machine has vanished, it is probably because something the VM wanted to use couldn’t be found. This can be as trivial as deleting an ISO that a VM had been defined to mount on boot.
Let’s look at the example where an ISO was deleted, as this is a common issue.
Copy your last backup of the XML definition file for the affected VM and then edit it to remove the <source file='…'/> lines for the removed media. For example, change:
<disk type='file' device='floppy'>
  <driver name='qemu' type='raw' cache='none' io='threads'/>
  <source file='/shared/files/virtio-win-1.1.16.vfd'/>
  <target dev='fda' bus='fdc'/>
  <alias name='fdc0-0-0'/>
  <address type='drive' controller='0' bus='0' unit='0'/>
</disk>
<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw' io='threads'/>
  <source file='/shared/files/Windows_Server_2008_R2_64Bit_SP1.iso'/>
  <target dev='hdc' bus='ide'/>
  <readonly/>
  <alias name='ide0-1-0'/>
  <address type='drive' controller='0' bus='1' unit='0'/>
</disk>
To:
<disk type='file' device='floppy'>
  <driver name='qemu' type='raw' cache='none' io='threads'/>
  <target dev='fda' bus='fdc'/>
  <alias name='fdc0-0-0'/>
  <address type='drive' controller='0' bus='0' unit='0'/>
</disk>
<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw' io='threads'/>
  <target dev='hdc' bus='ide'/>
  <readonly/>
  <alias name='ide0-1-0'/>
  <address type='drive' controller='0' bus='1' unit='0'/>
</disk>
Then redefine the VM and you can safely restart it again.
virsh define /shared/definitions/vm0002-ms.xml
You should be back in business at this point.
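To catch this before a VM disappears, you could scan a definition file for file-backed media that no longer exists. The sketch below is only illustrative: `missing_media` is a hypothetical helper, and it assumes the single-quoted `<source file='…'/>` form shown in the example above.

```shell
#!/bin/sh
# Print every path referenced by a <source file='...'/> element in the given
# XML definition that does not exist on disk.
missing_media() {
    grep -o "<source file='[^']*'" "$1" |
        sed "s/^<source file='//; s/'$//" |
        while IFS= read -r path; do
            [ -e "$path" ] || printf '%s\n' "$path"
        done
}

# Example use:
#   missing_media /shared/definitions/vm0002-ms.xml
```

Any path it prints is a candidate for removal from the definition before the VM is next started.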
Disabling rsyslog Rate Limiting
If you are getting messages like rsyslogd-2177: imuxsock lost 575 messages from pid 29288 due to rate-limiting in EL6.3+, it is because of the tighter message flood restrictions. You can disable these messages by following the steps below.
Make a backup of the original rsyslog.conf, then edit /etc/rsyslog.conf and locate the line with $ModLoad imuxsock. Directly below it, add the following two entries, one per line: $SystemLogRateLimitInterval 0 and $SystemLogRateLimitBurst 0.
cp /etc/rsyslog.conf /etc/rsyslog.conf.orig
vim /etc/rsyslog.conf
$ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
$SystemLogRateLimitInterval 0
$SystemLogRateLimitBurst 0
Save it and verify that the changes look sane by comparing against the original file:
diff -u /etc/rsyslog.conf.orig /etc/rsyslog.conf
--- /etc/rsyslog.conf.orig 2012-08-05 18:42:31.016783419 -0400
+++ /etc/rsyslog.conf 2012-08-05 18:42:17.609783118 -0400
@@ -6,6 +6,9 @@
 #### MODULES ####
 $ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
+$SystemLogRateLimitInterval 0
+$SystemLogRateLimitBurst 0
+
 $ModLoad imklog # provides kernel logging support (previously done by rklogd)
 #$ModLoad immark # provides --MARK-- message capability
Restart the rsyslog daemon to make the changes take effect.
/etc/init.d/rsyslog restart
Shutting down system logger: [ OK ]
Starting system logger: [ OK ]
Done!
You should no longer see rate limit messages.
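The edit above could also be applied by a script, which helps when several nodes need the same change. This is a sketch under assumptions: `disable_rate_limit` is a made-up helper, and it expects the `$ModLoad imuxsock` line to start at column one, as in the stock file.

```shell
#!/bin/sh
# Insert the two rate-limit directives directly below "$ModLoad imuxsock".
# Does nothing if the file already contains $SystemLogRateLimitInterval.
disable_rate_limit() {
    conf="$1"
    grep -q '^\$SystemLogRateLimitInterval' "$conf" && return 0
    cp -p "$conf" "$conf.orig"
    awk '{ print }
         /^\$ModLoad imuxsock/ {
             print "$SystemLogRateLimitInterval 0"
             print "$SystemLogRateLimitBurst 0"
         }' "$conf.orig" > "$conf"
}

# Example use:
#   disable_rate_limit /etc/rsyslog.conf && /etc/init.d/rsyslog restart
```

Because of the early return, running it twice does not duplicate the entries.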
FATAL: Module drbd not found
If you update your operating system’s kernel, but it was added to /boot/grub/grub.conf in the wrong order, DRBD’s kernel module will not load. In this case, you will see an error like:
FATAL: Module drbd not found.
Alternatively, if you are trying to install DRBD from source, you might see an error like this:
make -C drbd drbd_buildtag.c
make[1]: Entering directory `/root/drbd-8.3.15/drbd'
make[1]: Leaving directory `/root/drbd-8.3.15/drbd'
make[1]: Entering directory `/root/drbd-8.3.15/user'
flex -s -odrbdadm_scanner.c drbdadm_scanner.fl
gcc -g -O2 -Wall -I../drbd -I../drbd/compat -c -o drbdadm_scanner.o drbdadm_scanner.c
gcc -g -O2 -Wall -I../drbd -I../drbd/compat -c -o drbdadm_parser.o drbdadm_parser.c
gcc -g -O2 -Wall -I../drbd -I../drbd/compat -c -o drbdadm_main.o drbdadm_main.c
gcc -g -O2 -Wall -I../drbd -I../drbd/compat -c -o drbdadm_adjust.o drbdadm_adjust.c
gcc -g -O2 -Wall -I../drbd -I../drbd/compat -c -o drbdtool_common.o drbdtool_common.c
gcc -g -O2 -Wall -I../drbd -I../drbd/compat -c -o drbdadm_usage_cnt.o drbdadm_usage_cnt.c
cp ../drbd/drbd_buildtag.c drbd_buildtag.c
gcc -g -O2 -Wall -I../drbd -I../drbd/compat -c -o drbd_buildtag.o drbd_buildtag.c
gcc -g -O2 -Wall -I../drbd -I../drbd/compat -c -o drbdadm_minor_table.o drbdadm_minor_table.c
gcc -o drbdadm drbdadm_scanner.o drbdadm_parser.o drbdadm_main.o drbdadm_adjust.o drbdtool_common.o drbdadm_usage_cnt.o drbd_buildtag.o drbdadm_minor_table.o
gcc -g -O2 -Wall -I../drbd -I../drbd/compat -c -o drbdmeta.o drbdmeta.c
flex -s -odrbdmeta_scanner.c drbdmeta_scanner.fl
gcc -g -O2 -Wall -I../drbd -I../drbd/compat -c -o drbdmeta_scanner.o drbdmeta_scanner.c
gcc -o drbdmeta drbdmeta.o drbdmeta_scanner.o drbdtool_common.o drbd_buildtag.o
gcc -g -O2 -Wall -I../drbd -I../drbd/compat -c -o drbdsetup.o drbdsetup.c
cp ../drbd/drbd_strings.c drbd_strings.c
gcc -g -O2 -Wall -I../drbd -I../drbd/compat -c -o drbd_strings.o drbd_strings.c
gcc -o drbdsetup drbdsetup.o drbdtool_common.o drbd_buildtag.o drbd_strings.o
make[1]: Leaving directory `/root/drbd-8.3.15/user'
make[1]: Entering directory `/root/drbd-8.3.15/scripts'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/root/drbd-8.3.15/scripts'
make[1]: Entering directory `/root/drbd-8.3.15/documentation'
To (re)make the documentation: make doc
make[1]: Leaving directory `/root/drbd-8.3.15/documentation'
Userland tools build was successful.
SORRY, kernel makefile not found.
You need to tell me a correct KDIR,
Or install the neccessary kernel source packages.
make: *** [check-kdir] Error 1
In either case, check your currently running kernel with uname -a:
Linux an-node01.alteeve.ca 2.6.32-279.el6.x86_64 #1 SMP Fri Jun 22 12:19:21 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Note that there is no suffix after 2.6.32-279. This is the original kernel, not the updated one. You can confirm the mismatch by checking the version of the kernel-headers package:
yum list | grep kernel-headers
kernel-headers.x86_64 2.6.32-279.19.1.el6 @updates
Note the version number of the header RPM is 2.6.32-279.19.1. The .19.1 suffix shows that it is a newer kernel than the one that is running. This is our problem.
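The same comparison can be expressed as a quick check. This is a sketch only: `kernel_matches_headers` is a name invented here, and it assumes the running kernel string is the package's version-release followed by an architecture suffix, as in the output above.

```shell
#!/bin/sh
# Succeed when the running kernel string (e.g. from 'uname -r') begins with
# the given package version-release followed by a dot and the architecture.
kernel_matches_headers() {
    running="$1"
    headers="$2"
    case "$running" in
        "$headers".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use on an RPM-based system:
#   hdr=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' kernel-headers | tail -n 1)
#   kernel_matches_headers "$(uname -r)" "$hdr" || echo "Running kernel is not the updated one"
```

With the values shown above, 2.6.32-279.el6.x86_64 does not begin with 2.6.32-279.19.1.el6, so the check fails and the mismatch is reported.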
This generally happens because of a bad order of kernels in /boot/grub/grub.conf. We can check this by looking at the file:
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/sda2
# initrd /initrd-[generic-]version.img
#boot=/dev/sda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.32-279.el6.x86_64)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-279.el6.x86_64 ro root=UUID=861b2d43-6c16-4bfa-ae59-a0a95eebf607 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
    initrd /initramfs-2.6.32-279.el6.x86_64.img
title CentOS (2.6.32-279.19.1.el6.x86_64)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-279.19.1.el6.x86_64 ro root=UUID=861b2d43-6c16-4bfa-ae59-a0a95eebf607 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
    initrd /initramfs-2.6.32-279.19.1.el6.x86_64.img
Note that default=0 which tells us that the default kernel is the first one in the list. If you look at the kernel version described by the first title entry, it is the 2.6.32-279.el6.x86_64 version.
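Reading the default entry by eye works for two kernels but gets tedious with more. The following sketch prints the title that the default= index points at. `default_grub_title` is a hypothetical helper, and it assumes a GRUB legacy grub.conf with a numeric default= line placed before the title entries, as in the file above.

```shell
#!/bin/sh
# Print the title of the entry selected by "default=N" in a GRUB legacy
# grub.conf. Entries are counted from zero; a missing default= means 0.
default_grub_title() {
    awk '
        /^default=/ { split($0, kv, "="); want = kv[2] + 0 }
        /^[ \t]*title[ \t]/ {
            if (seen == want) {
                sub(/^[ \t]*title[ \t]+/, "")
                print
                exit
            }
            seen++
        }' "$1"
}

# Example use:
#   default_grub_title /boot/grub/grub.conf
```

A non-numeric value such as default=saved is treated as 0 by this sketch, so check the result by hand in that case.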
You can fix this by changing the default value to 1, but you will likely miss future kernel updates. It is better instead to put the newer kernel version at the top of the list.
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/sda2
# initrd /initrd-[generic-]version.img
#boot=/dev/sda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.32-279.19.1.el6.x86_64)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-279.19.1.el6.x86_64 ro root=UUID=861b2d43-6c16-4bfa-ae59-a0a95eebf607 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
    initrd /initramfs-2.6.32-279.19.1.el6.x86_64.img
title CentOS (2.6.32-279.el6.x86_64)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-279.el6.x86_64 ro root=UUID=861b2d43-6c16-4bfa-ae59-a0a95eebf607 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
    initrd /initramfs-2.6.32-279.el6.x86_64.img
Note now that the first entry is the 2.6.32-279.19.1.el6.x86_64 version, so that is the kernel that will be loaded on the next boot.
Once you reboot, you should see that you are now running the latest kernel;
Linux an-node01.alteeve.ca 2.6.32-279.19.1.el6.x86_64 #1 SMP Wed Dec 19 07:05:20 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
You should now be able to load the drbd module. If you can't, please reinstall DRBD to ensure that its kernel module is built against the latest kernel.
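Before reinstalling, it can help to confirm whether a drbd module file exists for the running kernel at all. A minimal sketch follows, assuming modules live under /lib/modules/<kernel version>/. `module_built_for` is a name made up for this example.

```shell
#!/bin/sh
# Succeed when a module file named <name>.ko (optionally compressed) exists
# somewhere under <root>/<kernel>. The root defaults to /lib/modules.
module_built_for() {
    name="$1"
    kernel="$2"
    root="${3:-/lib/modules}"
    find "$root/$kernel" -name "$name.ko*" -print 2>/dev/null | grep -q .
}

# Example use:
#   module_built_for drbd "$(uname -r)" || echo "No drbd module for the running kernel"
```

If the file exists but modprobe still cannot find the module, the module index may simply be stale. Running depmod -a rebuilds it.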
Starting Cluster; Mounting configfs… mount: none already mounted or /sys/kernel/config busy
This error occurs when the / (root) file system is full. When there is no space left on the disk, it causes cman to fail to start with the following error;
Starting cluster:
   Checking if cluster has been disabled at boot... [ OK ]
   Checking Network Manager... [ OK ]
   Global setup... [ OK ]
   Loading kernel modules... [ OK ]
   Mounting configfs... mount: none already mounted or /sys/kernel/config busy [FAILED]
Stopping cluster:
   Leaving fence domain... [ OK ]
   Stopping gfs_controld... [ OK ]
   Stopping dlm_controld... [ OK ]
   Stopping fenced... [ OK ]
   Stopping cman... [ OK ]
   Unloading kernel modules... [ OK ]
   Unmounting configfs... [ OK ]
Sure enough, if you run df, you will see no space left.
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 40G 40G 0 100% /
Note that in this case, even the temp file systems failed to mount and /boot isn’t visible. In this example, a bad RAM module caused a flood of errors in /var/log/, so deleting those log files freed up space. The system was so messed up that the node had to be fenced to reboot it as the reboot command failed.
Once free space was recovered and the node was rebooted, the cluster was able to start.
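A simple check run before starting the cluster can catch this condition early. The sketch below is illustrative only: the function names and the 95 percent default threshold are assumptions made for this example, not values from the tutorial.

```shell
#!/bin/sh
# Read 'df -P' output on stdin and print the usage percentage (without the
# percent sign) from the first file system line.
usage_from_df() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when the file system holding the given path (default /) is at or
# above the given usage percentage (default 95, an arbitrary choice).
is_nearly_full() {
    [ "$(df -P "${1:-/}" | usage_from_df)" -ge "${2:-95}" ]
}

# Example use:
#   is_nearly_full / && echo "Root file system is nearly full; check /var/log"
```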
Us: Alteeve’s Niche! | Support: Mailing List | IRC: #clusterlabs on Freenode | © Alteeve’s Niche! Inc. 1997-2019 | |
legal stuff: All info is provided «As-Is». Do not use anything here unless you are willing and able to take responsibility for your own actions. |
I'm trying to learn DRBD with CentOS 6.3 on VirtualBox. I have two VMs configured: node 1 is the original, and node 2 is cloned from node 1. But I can't run 'service drbd start'; there is an error message 'starting DRBD resources: Can not load the drbd module', while node 2 can run the command. Here is the config:
[root@localhost db]# cat /etc/drbd.conf
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
#include "drbd.d/global_common.conf";
#include "drbd.d/*.res";
global {
# do not participate in online usage survey
usage-count no;
}
resource data {
# write IO is reported as completed if it has reached both local
# and remote disk
protocol C;
net {
# set up peer authentication
cram-hmac-alg sha1;
shared-secret "s3cr3tp@ss";
# default value 32 - increase as required
max-buffers 512;
# highest number of data blocks between two write barriers
max-epoch-size 512;
# size of the TCP socket send buffer - can tweak or set to 0 to
# allow kernel to autotune
sndbuf-size 0;
}
startup {
# wait for connection timeout - boot process blocked
# until DRBD resources are connected
wfc-timeout 30;
# WFC timeout if peer was outdated
outdated-wfc-timeout 20;
# WFC timeout if this node was in a degraded cluster (i.e. only had one
# node left)
degr-wfc-timeout 30;
}
disk {
# the next two are for safety - detach on I/O error
# and set up fencing - resource-only will attempt to
# reach the other node and fence via the fence-peer
# handler
on-io-error detach;
fencing resource-only;
# no-disk-flushes; # if we had battery-backed RAID
# no-md-flushes; # if we had battery-backed RAID
# ramp up the resync rate
# resync-rate 10M;
}
handlers {
# specify the two fencing handlers
# see: http://www.drbd.org/users-guide-8.4/s-pacemaker-fencing.html
fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
}
# first node
on node1.mycluster.org {
# DRBD device
device /dev/drbd0;
# backing store device
disk /dev/sdb;
# IP address of node, and port to listen on
address 192.168.1.101:7789;
# use internal meta data (don't create a filesystem before
# you create metadata!)
meta-disk internal;
}
# second node
on node2.mycluster.org {
# DRBD device
device /dev/drbd0;
# backing store device
disk /dev/sdb;
# IP address of node, and port to listen on
address 192.168.1.102:7789;
# use internal meta data (don't create a filesystem before
# you create metadata!)
meta-disk internal;
}
}
any one know what is the problem?
asked Nov 3, 2014 at 2:31
This doesn’t sound like a config problem — rather it sounds like the kernel module for DRBD has not been installed. You will need to install the appropriate version of kmod-drbd. (What happens if you type modprobe drbd ?)
From the command line try doing yum search drbd
Then choose the correct package — probably something like kmod-drbd83
If that doesn’t work, maybe upgrade to a newer version of CentOS and kernel.
answered Nov 3, 2014 at 6:01
davidgo
need to manually run depmod after installation of linux-image-extra-VERSION-virtual package
Bug #890447 reported by
Scott Moser
on 2011-11-14
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Confirmed | Undecided | Unassigned |
Bug Description
$ sudo apt-get install linux-image-extra-virtual
…
Setting up linux-image-extra-3.0.0-12-virtual (3.0.0-12.20) …
Setting up linux-image-extra-virtual (3.0.0.12.14) …
$ modprobe drbd
FATAL: Module drbd not found.
$ sudo modprobe drbd
FATAL: Module drbd not found.
$ find /lib/modules/ -name "*drbd*"
/lib/modules/3.0.0-12-virtual/kernel/drivers/block/drbd
/lib/modules/3.0.0-12-virtual/kernel/drivers/block/drbd/drbd.ko
$ sudo depmod -a
$ sudo modprobe drbd
$ lsmod | grep drbd
drbd 273002 0
lru_cache 14896 1 drbd
Basically, installation of linux-image-extra-virtual above did not result in usable modules. I had to run ‘depmod -a’ after that.
The package should do that itself.
ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: linux-image-extra-3.0.0-12-virtual 3.0.0-12.20
ProcVersionSignature: User Name 3.0.0-12.20-virtual 3.0.4
Uname: Linux 3.0.0-12-virtual x86_64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 2011-11-14 21:58 seq
crw-rw---- 1 root audio 116, 33 2011-11-14 21:58 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 1.23-0ubuntu3
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
CurrentDmesg:
[ 23.304116] eth0: no IPv6 routers present
[ 173.117715] drbd: initialized. Version: 8.3.11 (api:88/proto:86-96)
[ 173.117719] drbd: srcversion: DA5A13F16DE6553FC7CE9B2
[ 173.117722] drbd: registered as block device major 147
[ 173.117724] drbd: minor_table @ 0xffff88002382b600
Date: Mon Nov 14 22:02:01 2011
Ec2AMI: ami-0dfd3464
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1c
Ec2InstanceType: t1.micro
Ec2Kernel: aki-825ea7eb
Ec2Ramdisk: unavailable
Lspci:
Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
ProcEnviron:
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcKernelCmdLine: root=LABEL=cloudimg-rootfs ro console=hvc0
ProcModules:
drbd 273002 0 - Live 0x0000000000000000
lru_cache 14896 1 drbd, Live 0x0000000000000000
acpiphp 24080 0 - Live 0x0000000000000000
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)