Bug 16316
Summary: Problems with multiple services with lvs.cf
Product: [Retired] Red Hat High Availability Server
Component: piranha
Version: 1.0
Hardware: i386
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Need Real Name <ciofina>
Assignee: Phil Copeland <copeland>
QA Contact: Phil Copeland <copeland>
CC: ciofina, keith.moore
Doc Type: Bug Fix
Last Closed: 2000-08-18 03:35:01 UTC
Description
Red Hat Bugzilla
2000-08-16 07:49:20 UTC
It is actually a requirement for piranha LVS that each virtual service have a unique virtual address and virtual device designation. This should be documented (let me know if it's not). Part of the reason for this is that services are monitored, removed, and added individually.

Additional note: virtual IP addresses are not the same as the real IP address of the computer -- they are independent. Multiple IP services CAN run on the same computer, but each must have a unique virtual IP address. It was not clear from your posting whether this was understood.

Problem reopened -- it was closed accidentally. Problem being investigated in another entry.

*** This bug has been marked as a duplicate of 16399 ***

OK, so it's not a duplicate. As you can see below, there is no problem with having multiple services on the same IP but different ports (this has to be allowed, since it's very common to have both port 80 and 443 on the same site). I have been using this type of configuration since Red Hat 6.0. The only difference between the code I am running and the latest code is that a bugfix for lvs.c regarding persistence has been applied:

    IP Virtual Server version 0.9.14 (size=4096)
    Prot LocalAddress:Port Scheduler Flags
      -> RemoteAddress:Port                Forward Weight ActiveConn InActConn
    TCP  198.245.191.240:https wlc persistent 20
      -> renpwwwbkprd5.renp-dmz.com:https  Masq    1      0          0
    TCP  198.245.191.240:www wlc persistent 20
      -> renpwwwbkprd5.renp-dmz.com:www    Masq    1      0          0

-- Keith Moore

Are you using Apache on the real servers? Some servers do not respond to standard GET requests on port 443 (IIS for one). If this is the case, the SSL server will never be brought up, since it's not getting a proper response. Try blanking out the SEND/EXPECT. Also, make sure you are using the latest software, 0.4.16-3; prior versions had problems with 443. What is the output of ipvsadm -ln? This will show if the virtual server is being enabled, but none of the real servers.
-- Keith Moore

In addition to the comment made by Keith Moore, also install this patch he created and see if it has any impact on your problem (you never mention whether you are using persistence or not):

    --- lvs.c.old	Wed Aug 16 19:05:14 2000
    +++ lvs.c	Wed Aug 16 19:08:48 2000
    @@ -465,6 +465,7 @@ int * numClientsPtr) {
         int i;
         char * argv[40];
    +    char wrkBuf[10];
         char ** arg = argv;
         char virtAddress[50];
         int oldNumClients;
    @@ -521,7 +522,8 @@
         if (vserver->persistent > 0 ) {
             *arg++ = (char *) "-p";
    -        (void) sprintf(*arg++, "%d", vserver->persistent);
    +        (void) sprintf(wrkBuf, "%d", vserver->persistent);
    +        *arg++ = wrkBuf;
         if (vserver->pmask.s_addr) {
             pmask = inet_ntoa(vserver->pmask);

I cut and pasted your config into my test machine; it works fine. However, I had to remove the send/expect for port 443, since I'm using NT/IIS, which doesn't respond to GET requests on port 443 (this is probably the case with Netscape also). With the send/expect it would not bring up the https real servers.

-- Keith Moore

1. ipvsadm -l doesn't return the 443 service.
2. I tried also without persistence, but the problem persists.
3. I'm using Apache for www and Stronghold (Apache-based) for https. I tried also without persistence and without the send/expect string (although the secure server doesn't expect a password) and the problem persists.
4. I didn't try to apply the patch for persistence, but I believe it doesn't affect my case.
5. I didn't try to change the IP addresses (I will try this today). But, in any case, I tried with other services (like ftp) with a different IP and all goes well.

Below you will find a piece of my log file from when pulse starts the service on the primary server. As you can see, lvs tries to start the service strong on port 443, but after this there is nothing else about the second service (the IP addresses are real!).
Thanks again :)

    Aug 17 07:37:58 haa pulse[2227]: STARTING PULSE AS MASTER
    Aug 17 07:37:58 haa pulse: pulse startup succeeded
    Aug 17 07:38:16 haa pulse[2227]: partner dead: activating lvs
    Aug 17 07:38:16 haa lvs: running command "/usr/sbin/ipvsadm" "-C"
    Aug 17 07:38:16 haa lvs[2233]: starting virtual service www active: 80
    Aug 17 07:38:16 haa lvs[2233]: running command "/usr/sbin/ipvsadm" "-A" "-t" "193.70.29.2:80" "-s" "wlc"
    Aug 17 07:38:16 haa lvs[2233]: running command "/usr/sbin/nanny" "-c" "-h" "193.70.29.18" "-p" "80" "-s" "GET / HTTP/1.0\r\n\r\n" "-x" "HTTP" "-a" "15" "-I" "/usr/sbin/ipvsadm" "-t" "6" "-w" "1" "-V" "193.70.29.2" "-M" "g" "-U" "rup"
    Aug 17 07:38:16 haa lvs[2233]: create_monitor for www/hidx.teta.it running as pid 2240
    Aug 17 07:38:16 haa lvs[2233]: running command "/usr/sbin/nanny" "-c" "-h" "193.70.29.20" "-p" "80" "-s" "GET / HTTP/1.0\r\n\r\n" "-x" "HTTP" "-a" "15" "-I" "/usr/sbin/ipvsadm" "-t" "6" "-w" "1" "-V" "193.70.29.2" "-M" "g" "-U" "rup"
    Aug 17 07:38:16 haa pulse[2242]: running command "/sbin/ifconfig" "eth0:1" "193.70.29.2" "up"
    Aug 17 07:38:16 haa pulse[2239]: running command "/usr/sbin/send_arp" "-i" "eth0" "193.70.29.2" "0804000000EB" "193.70.29.63" "ffffffffffff"
    Aug 17 07:38:16 haa lvs[2233]: create_monitor for www/hidb.teta.it running as pid 2241
    Aug 17 07:38:16 haa lvs[2233]: running command "/usr/sbin/nanny" "-c" "-h" "193.70.29.19" "-p" "80" "-s" "GET / HTTP/1.0\r\n\r\n" "-x" "HTTP" "-a" "15" "-I" "/usr/sbin/ipvsadm" "-t" "6" "-w" "1" "-V" "193.70.29.2" "-M" "g" "-U" "rup"
    Aug 17 07:38:16 haa lvs[2233]: create_monitor for www/hida.teta.it running as pid 2244
    Aug 17 07:38:16 haa lvs[2233]: starting virtual service strong active: 443
    Aug 17 07:38:16 haa nanny[2240]: starting LVS client monitor for 193.70.29.2:80
    Aug 17 07:38:16 haa nanny[2240]: making 193.70.29.18:80 available
    Aug 17 07:38:16 haa nanny[2240]: running command "/usr/sbin/ipvsadm" "-a" "-t" "193.70.29.2:80" "-r" "193.70.29.18" "-g" "-w" "1"
    Aug 17 07:38:16 haa nanny[2241]: starting LVS client monitor for 193.70.29.2:80
    Aug 17 07:38:16 haa nanny[2241]: making 193.70.29.20:80 available
    Aug 17 07:38:16 haa nanny[2241]: running command "/usr/sbin/ipvsadm" "-a" "-t" "193.70.29.2:80" "-r" "193.70.29.20" "-g" "-w" "1"
    Aug 17 07:38:16 haa nanny[2244]: starting LVS client monitor for 193.70.29.2:80
    Aug 17 07:38:16 haa nanny[2244]: making 193.70.29.19:80 available
    Aug 17 07:38:16 haa nanny[2244]: running command "/usr/sbin/ipvsadm" "-a" "-t" "193.70.29.2:80" "-r" "193.70.29.19" "-g" "-w" "1"
    Aug 17 07:38:16 haa nanny[2241]: running command "rup" "193.70.29.20"
    Aug 17 07:38:16 haa nanny[2240]: running command "rup" "193.70.29.18"
    Aug 17 07:38:16 haa nanny[2244]: running command "rup" "193.70.29.19"
    Aug 17 07:38:21 haa pulse[2235]: gratuitous lvs arps finished

This is the same log I got when I was having the persistence problem. Is lvs defunct after startup?

-- Keith Moore

If lvs is defunct, there will be a core file dumped by lvs. Please attach the core file to this bug report (make sure you mark it as binary). In case you don't know, the easiest way to find the proper core file:

    find / -name core -exec file {} \;

Look for the one created by lvs. You may need to kill the nannies to allow lvs to finish dumping its core. I've tried several things to duplicate your problem, without success, so I need to rely on your system to get the information.

-- Keith Moore

Yes! lvs (with the lvs.cf containing two services) is defunct. (In fact I must kill the nannies only when I start pulse with the lvs.cf mentioned.) I cannot find any core file in the system, even after killing the nanny processes. Do you know why lvs doesn't produce the core file? Do you know a way to verify this? Is it possible I have a broken lvs executable (probably this is a stupid question)? To prevent Murphy :) here is the sum result of lvs: 15938 31. Let me know how to produce the core files, and thank you very much for your cooperation.
Ok, not getting a core file is moderately annoying; I've never had that problem. If lvs is going defunct, there is a bug (the patch above fixes one of them). It's probably not a corrupt binary; that would act differently. Try running the following (as root):

    /usr/sbin/lvs --nodeamon --nofork -c /etc/lvs.cf

The normal log will come to the screen, and it should dump a core file in your current directory if it dies. Once again, you may have to kill lvs and the nanny processes. The defunct is because the parent died, but there are still active children, so the kernel keeps the PID existing, but... defunct. If I can get a core, I can probably find the problem in about 2 minutes.

-- Keith Moore

Probably the problem is elsewhere :( :( :( Now, if I start lvs from the console as you described, all goes well and the https service starts too!!! On the other hand, if I start the services via pulse (in this case the IP address comes up), I get the same bad results. ?????

Ok, I know what kind of bug that sounds like; I'm looking through that section of the code now.

-- Keith

Please attach your lvs binary. (Ensure you mark it as binary.)

-- Keith Moore

Created attachment 2597 [details]
The lvs executables you requested:)
Ah, I just realized that the config you posted and your real one aren't the same (IPs changed to protect the innocent, I'm sure). With the posted one and your executable, everything still works fine for me. Please attach (not cut-and-paste) your config. Piranha is a bit touchy about the config, and any minor difference could totally break my test. This is a configuration-related issue, probably having to do with using a specific set of options. I'm not sure how much testing goes into the direct mode of LVS.

-- Keith Moore

Created attachment 2598 [details]
my production lvs.cf
That did it. It IS the same bug: in your production file you have persistent = 900 for the https service, and this triggers the persistence bug. I was able to duplicate it immediately. Apply the above patch and it solves the problem. If you don't have the setup to apply the patch and rebuild, I'm sure we can get you an updated RPM.

-- Keith Moore

I believe I also tried removing the persistence instruction in some of the many attempts I made. In any case, if you can send me an updated RPM I will try it. Thank you very much :) :) :) :)

Can you tell us if it is now solved? Can we close this Bugzilla entry?

Last known status: the problem was reproduced by Keith Moore and corrected with the persistence patch. Keith sent ciofina an updated RPM. Also, Red Hat posted a new source RPM containing the latest patches. We are waiting to hear whether this problem has been solved and the report can be closed.

The problem was solved. Thank you very much :) :) :) Sorry for the delay, but when you sent the new binary it was night here :) (I am in Europe).