Red Hat Bugzilla – Bug 476810
Long real server names cause segfault in lvsd
Last modified: 2009-07-22 15:23:57 EDT
Description of problem:
When using long real server names (roughly 28 or more characters) in the piranha configuration and then starting pulse, lvsd will crash with a segfault:
kernel: lvsd: segfault at ffffffffffffffd0 rip 000000314ec785a0 rsp 00007fff63d99558 error 4
Version-Release number of selected component (if applicable):
Program Version: lvs 1.38
A component of: piranha-0.8.4-7
Output of uname -a:
Linux lb01.domainname.local 2.6.18-92.1.18.el5.centos.plus #1 SMP Wed Nov 26 07:28:20 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
Use a long real server name in lvs.cf, e.g. "www.www0001.domainname.local" and then start pulse using "/etc/init.d/pulse start" in an x86_64 environment.
Steps to Reproduce:
1. Create a basic piranha configuration with a real server which has a long name like "www.www0001.domainname.local".
2. Start Pulse using "/etc/init.d/pulse start"
3. Watch the messages log in "/var/log/messages"
Make sure lvsd is being run in daemon mode, as this triggers the syslog function (where the problem probably lies), instead of printing log messages to the display (which works correctly).
Dec 17 10:21:22 lb01 pulse: STARTING PULSE AS MASTER
Dec 17 10:21:40 lb01 pulse: partner dead: activating lvs
Dec 17 10:21:40 lb01 lvs: starting virtual service www.domainname.net: 80
Dec 17 10:21:40 lb01 avahi-daemon: Registering new address record for 10.36.125.202 on eth0.
Dec 17 10:21:40 lb01 avahi-daemon: Withdrawing address record for 10.36.125.202 on eth0.
Dec 17 10:21:40 lb01 kernel: lvsd: segfault at ffffffffffffffd0 rip 000000314ec785a0 rsp 00007fff63d99558 error 4
Dec 17 10:21:40 lb01 nanny: starting LVS client monitor for 10.36.125.202:80
Dec 17 10:21:45 lb01 pulse: gratuitous lvs arps finished
Dec 17 10:22:08 lb01 pulse: Terminating due to signal 15
Dec 17 10:24:36 lb01 pulse: STARTING PULSE AS MASTER
Dec 17 10:24:54 lb01 pulse: partner dead: activating lvs
Dec 17 10:24:54 lb01 lvs: starting virtual service www.domainname.net active: 80
Dec 17 10:24:54 lb01 avahi-daemon: Registering new address record for 10.0.8.1 on eth1.
Dec 17 10:24:54 lb01 avahi-daemon: Withdrawing address record for 10.0.8.1 on eth1.
Dec 17 10:24:54 lb01 lvs: create_monitor for www.domainname.net/www.www0001.domainname.local running as pid 2838
Dec 17 10:24:54 lb01 nanny: starting LVS client monitor for 10.36.125.202:80
Dec 17 10:24:59 lb01 pulse: gratuitous lvs arps finished
The problem seems to be in "piranha-0.8.4/util.c", specifically in the "doSyslog" function. As soon as a log message is longer than 80 characters, this function reallocates memory, and that somehow causes a segfault. As a quick fix, allocating more bytes initially solved the problem for us, but a more structural solution would of course be to make the reallocation succeed properly.
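For illustration, here is a minimal sketch of a "format into an 80-byte buffer, grow on overflow" routine. It is not the piranha source: the names format_message and format_message_v are hypothetical, and the assumption that doSyslog follows this pattern comes only from the description above. One well-known way for such a retry to crash on x86_64 while working on 32-bit x86 is passing the same va_list to vsnprintf a second time after the reallocation; whether that is the actual cause here is unconfirmed. The sketch avoids it by taking a fresh va_copy for every attempt.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from piranha: format a message into a
 * heap buffer that starts at 80 bytes and grows if the message is longer.
 * The caller owns the returned buffer and must free() it. */
static char *format_message(const char *fmt, va_list args)
{
    assert(fmt != NULL);

    size_t size = 80;               /* initial size, as in the report */
    char *buf = malloc(size);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* vsnprintf may consume its va_list, so each attempt gets its
         * own copy; reusing 'args' directly on a retry is undefined. */
        va_list copy;
        va_copy(copy, args);
        int needed = vsnprintf(buf, size, fmt, copy);
        va_end(copy);

        if (needed < 0) {           /* encoding or output error */
            free(buf);
            return NULL;
        }
        if ((size_t)needed < size)  /* message and terminator fit */
            return buf;

        /* Grow to exactly what vsnprintf reported, plus the NUL. */
        size = (size_t)needed + 1;
        char *bigger = realloc(buf, size);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
    }
}

/* Variadic convenience wrapper around format_message(). */
static char *format_message_v(const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    char *msg = format_message(fmt, args);
    va_end(args);
    return msg;
}
```

With the names from the log above, format_message_v("create_monitor for %s/%s running as pid %d", "www.domainname.net", "www.www0001.domainname.local", 2838) produces a message longer than 80 characters, so it exercises the reallocation path.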
Unable to reproduce, can you send me your lvs configuration file?
That's a little hard since the servers we have tested this on are already in production. The most important thing is that the real server name is long, e.g. "www.www0001.domainname.local".
Have you tested this on a 64 bit platform? We know the problem does not occur on 32 bit platforms (in our test setup), so if you have tested this on a 32 bit platform, that is expected.
Please let me know, otherwise we will need to make a new test setup for this.
A test setup would be very welcome. I tried it on my 64-bit machines with 5.3.
It is very likely that this bug is a duplicate of #446802 (segfault if syslog message is longer than 80 characters), which was resolved in 5.3. IMHO that is why I was not able to reproduce it: it was already fixed. Closing as duplicate; if you have the same problem with 5.3, please open a new bug.
*** This bug has been marked as a duplicate of bug 446802 ***