501786 – bitlbee segv's when trying to connect to jabber server

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Fedora EPEL
Classification:	Fedora
Component:	bitlbee
Sub Component:
Version:	el5
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Robert Scheck
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:	ActualBug
Depends On:
Blocks:

Reported:	2009-05-20 17:35 UTC by Ben Woodard
Modified:	2018-04-11 13:08 UTC
CC List:	5 users
Fixed In Version:	1.2.3-4.fc10
Clone Of:
Environment:
Last Closed:	2009-08-16 22:50:48 UTC
Type:	---
Embargoed:

Attachments
Fix null pointer dereference. (789 bytes, patch) 2009-07-15 02:48 UTC, Ricky Zhou

Description Ben Woodard 2009-05-20 17:35:24 UTC

Description of problem:
I was trying to use bitlbee to connect to our jabber server, but it segv's every time I try to enable the account.

Version-Release number of selected component (if applicable):
bitlbee-1.2.3-1.el5

How reproducible:
* neb sets mode +s neb
* Now talking on &bitlbee
* localhost.localdomain sets mode +t &bitlbee
* Topic for &bitlbee is: Welcome to the control channel. Type help for help information.
<root> Welcome to the BitlBee gateway!
<root> 
<root> If you've never used BitlBee before, please do read the help information using the help command. Lots of FAQs are answered there.
<root> If you already have an account on this server, just use the identify command to identify yourself.
<neb> account add jabber woodard9.gov <password>
<root> Account successfully added
<neb> account on
* Disconnected (Remote host closed socket).

I doubt that there is anything fancy about our jabber server.

Steps to Reproduce:
1. account add jabber woodard9.gov <password>
2. account on

  
Actual results:
bitlbee segv's

Expected results:
It works.

Additional info:
[ben@apbptr xinetd.d]$ sudo strace -fp 4990
Password: 
Process 4990 attached - interrupt to quit
gettimeofday({1242840364, 74067}, NULL) = 0
gettimeofday({1242840364, 74301}, NULL) = 0
poll([{fd=0, events=POLLIN}], 1, 115614) = 1 ([{fd=0, revents=POLLIN}])
gettimeofday({1242840374, 313708}, NULL) = 0
read(0, "WHO &bitlbee\r\n", 512)        = 14
fstat(0, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
fcntl(0, F_GETFL)                       = 0x802 (flags O_RDWR|O_NONBLOCK)
gettimeofday({1242840374, 314748}, NULL) = 0
poll([{fd=0, events=POLLOUT}, {fd=0, events=POLLIN}], 2, 105374) = 1 ([{fd=0, revents=POLLOUT}])
gettimeofday({1242840374, 315180}, NULL) = 0
write(0, ":localhost.localdomain 352 neb &"..., 282) = 282
gettimeofday({1242840374, 315685}, NULL) = 0
poll([{fd=0, events=POLLIN}], 1, 105373) = 1 ([{fd=0, revents=POLLIN}])
gettimeofday({1242840384, 389573}, NULL) = 0
read(0, "PING LAG1242790384389083\r\n", 512) = 26
fstat(0, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
fcntl(0, F_GETFL)                       = 0x802 (flags O_RDWR|O_NONBLOCK)
gettimeofday({1242840384, 390270}, NULL) = 0
poll([{fd=0, events=POLLOUT}, {fd=0, events=POLLIN}], 2, 95298) = 1 ([{fd=0, revents=POLLOUT}])
gettimeofday({1242840384, 390575}, NULL) = 0
write(0, ":localhost.localdomain PONG loca"..., 72) = 72
gettimeofday({1242840384, 391017}, NULL) = 0
poll([{fd=0, events=POLLIN}], 1, 95297) = 1 ([{fd=0, revents=POLLIN}])
gettimeofday({1242840386, 605207}, NULL) = 0
read(0, "PRIVMSG &bitlbee :account add ja"..., 512) = 80
fstat(0, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
fcntl(0, F_GETFL)                       = 0x802 (flags O_RDWR|O_NONBLOCK)
gettimeofday({1242840386, 605940}, NULL) = 0
poll([{fd=0, events=POLLOUT}, {fd=0, events=POLLIN}], 2, 93082) = 1 ([{fd=0, revents=POLLOUT}])
gettimeofday({1242840386, 606245}, NULL) = 0
write(0, ":root!root"..., 79) = 79
gettimeofday({1242840386, 606769}, NULL) = 0
poll([{fd=0, events=POLLIN}], 1, 93081) = 1 ([{fd=0, revents=POLLIN}])
gettimeofday({1242840391, 132935}, NULL) = 0
read(0, "PRIVMSG &bitlbee :account on\r\n", 512) = 30
fstat(0, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
fcntl(0, F_GETFL)                       = 0x802 (flags O_RDWR|O_NONBLOCK)
open("/etc/resolv.conf", O_RDONLY)      = 4
fstat(4, {st_mode=S_IFREG|0664, st_size=226, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b122ef5f000
read(4, "#@VPNC_GENERATED@ -- this file i"..., 4096) = 226
read(4, "", 4096)                       = 0
close(4)                                = 0
munmap(0x2b122ef5f000, 4096)            = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("172.16.52.28")}, 28) = 0
fcntl(4, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
gettimeofday({1242840391, 137481}, NULL) = 0
poll([{fd=4, events=POLLOUT}], 1, 0)    = 1 ([{fd=4, revents=POLLOUT}])
sendto(4, "\252x\1\0\0\1\0\0\0\0\0\0\f_xmpp-client\4_tcp\nc"..., 55, MSG_NOSIGNAL, NULL, 0) = 55
poll([{fd=4, events=POLLIN}], 1, 5000)  = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, "\252x\201\203\0\1\0\0\0\1\0\0\f_xmpp-client\4_tcp\nc"..., 512, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("172.16.52.28")}, [16]) = 106
close(4)                                = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
write(0, "\r\nERROR :Error: Fatal signal rec"..., 67) = 67
tgkill(4990, 4990, SIGSEGV)             = 0
rt_sigreturn(0x137e)                    = 383834016
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
Process 4990 detached

So it seems to be right after it gets the packet from the jabber server back.
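
For reference, the start of the reply seen in the recvfrom() line above can be decoded by hand. The sketch below is only an illustration and is not taken from bitlbee: the names dns_header_info, dns_parse_header and strace_reply_header are made up here, and the byte array is simply the first 12 bytes of the strace string ("\252x\201\203\0\1\0\0\0\1\0\0") written out in hex. It reads the fixed DNS header as laid out in RFC 1035.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of the fixed 12-byte DNS message header (RFC 1035, 4.1.1).
 * Hypothetical helper type, not part of bitlbee. */
struct dns_header_info {
    uint16_t id;       /* query id echoed back by the server */
    unsigned rcode;    /* response code: 0 = NOERROR, 3 = NXDOMAIN */
    uint16_t qdcount;  /* number of questions */
    uint16_t ancount;  /* number of answer records */
    uint16_t nscount;  /* number of authority records */
    uint16_t arcount;  /* number of additional records */
};

/* Read a big-endian 16-bit value. */
static uint16_t be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 on success, -1 if the buffer cannot hold a full header. */
int dns_parse_header(const unsigned char *buf, size_t len,
                     struct dns_header_info *out)
{
    if (buf == NULL || out == NULL || len < 12)
        return -1;

    out->id      = be16(buf);
    out->rcode   = buf[3] & 0x0F;   /* low 4 bits of the second flags byte */
    out->qdcount = be16(buf + 4);
    out->ancount = be16(buf + 6);
    out->nscount = be16(buf + 8);
    out->arcount = be16(buf + 10);
    return 0;
}

/* First 12 bytes of the reply shown in the strace output above. */
static const unsigned char strace_reply_header[12] = {
    0xAA, 0x78, 0x81, 0x83, 0x00, 0x01,
    0x00, 0x00, 0x00, 0x01, 0x00, 0x00
};
```

Run over strace_reply_header, this gives rcode 3 (NXDOMAIN) and ancount 0, i.e. the DNS server on port 53 answered the _xmpp-client._tcp query with no SRV records at all.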

Comment 1 Robert Scheck 2009-05-24 14:55:52 UTC

Looks like this is an issue with the SRV thing :-(

[16:12:11] < wilmer> rsc: That's on SRV lookups I think?
[16:12:32] < wilmer> Someone reported a problem like that here a few days ago at least.
[16:12:59] < rsc> wilmer: is it on SRV lookups?
[16:13:24] < wilmer> Yeah.
[16:13:40] < wilmer> It's fixed by manually setting the server variable
[16:13:45] < wilmer> Which suppresses SRV lookups.
[16:13:55] < wilmer> Also note that the crash happens after receiving response from the DNS server.
[16:15:25] < rsc> wilmer: okay, that means for me now what?
[16:35:59] < wilmer> rsc: Didn't you modify that code?
[16:37:29] < rsc> wilmer: I added a contributed patch, yes. But are you sure, the SRV records are causing this?
[16:40:08] < wilmer> rsc: Quite sure; the crash doesn't happen if the user sets the server variable manually.
[16:40:39] < rsc> wilmer: details? I don't get your point.
[16:42:49] < wilmer> rsc: What kind of details are you looking for?
[16:43:05] < rsc> wilmer: What do you mean with setting server variable manually?
[16:43:36] < wilmer> Oh, that; specifying the name of the server to connect to. account set x/server chat-green.llnl.gov   in this case.
[16:43:58] < wilmer> If that var is set, BitlBee won't try to guess/look up the servername and just use the variable instead.
[16:44:49] < wilmer> rsc: Someone else reported a problem very similar to this (also on Fedora, possibly a new release?) and setting the server var solved the problem.
[16:48:29] < rsc> wilmer: mmh. Ideas how to fix that?
[16:52:33] < wilmer> rsc: We should probably first get something better than a strace dump, like a full stacktrace. :-)

Ben, can you test whether setting the server variable manually solves your 
problem? And can you create a core dump with a full backtrace/stacktrace?

Matěj, any ideas? The patch (I still like it) was initially contributed by you.

Comment 2 Matěj Cepl 2009-05-25 15:49:41 UTC

I think the easiest thing to do is to set the system to generate coredumps (this is switched off by default in Fedora/RHEL) and then analyze the coredump post-mortem with gdb. And yes, that would be one more reason why this patch should be accepted upstream, so that it doesn't get out of sync with the rest of the code (which is the only reason that comes to mind for why this happens, because it is actually quite a simple patch otherwise).

Comment 3 Matěj Cepl 2009-05-25 22:48:31 UTC

Oh sorry, I missed that the reporter of the bug isn't Robert. OK, then in order to switch on the generation of coredumps, do the following:

a) install bitlbee-debuginfo from http://download.fedora.redhat.com/pub/epel/5Server/i386/debug/bitlbee-debuginfo-1.2.3-1.el5.i386.rpm (or whichever package is appropriate for your system)
b) edit /etc/security/limits.conf and add these two lines:

*               soft    core            unlimited
*               hard    core            unlimited

Also, you may need to comment out ulimit -c 0 in /etc/profile or somewhere else (grep -rs 'ulimit -c' /etc is your friend).
c) quit your IRC client and restart xinetd (or inetd) daemon

When you try the connection and bitlbee crashes, you should get a file named core.<number> either in / or in /usr/sbin/ (or wherever the bitlbee binary is located).

Run this on that core file:

gdb -nx --batch -ex 'backtrace' /usr/sbin/bitlbee core.<number> \
    >/tmp/bitlbee-log.txt 2>/dev/null

Please attach /tmp/bitlbee-log.txt to this bug report as a separate uncompressed attachment.

We will review this issue again once you've had a chance to attach this information.

Thanks in advance.

Comment 4 Ben Woodard 2009-05-27 19:43:05 UTC

BTW it is vastly easier to just have gdb attach to the running bitlbee process after xinetd runs it.



Program received signal SIGSEGV, Segmentation fault.
0x0000003ed247b59b in memcpy () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003ed247b59b in memcpy () from /lib64/libc.so.6
#1  0x000000000041e0a8 in srv_lookup ()
#2  0x000000000042a118 in ?? ()
#3  0x0000000000415cb6 in ?? ()
#4  0x0000000000416268 in root_command_string ()
#5  0x00000000004101f9 in irc_send ()
#6  0x00000000004113a4 in irc_process ()
#7  0x000000000040d82d in bitlbee_io_current_client_read ()
#8  0x0000003eae62cdb4 in g_main_context_dispatch ()
   from /lib64/libglib-2.0.so.0
#9  0x0000003eae62fc0d in ?? () from /lib64/libglib-2.0.so.0
#10 0x0000003eae62ff1a in g_main_loop_run () from /lib64/libglib-2.0.so.0
#11 0x0000000000418af9 in main ()

Comment 5 Ben Woodard 2009-05-27 20:00:29 UTC

With debuginfo:

(gdb) bt
#0  0x0000003ed247b59b in memcpy () from /lib64/libc.so.6
#1  0x000000000041e0a8 in srv_lookup (service=<value optimized out>, protocol=<value optimized out>, 
    domain=<value optimized out>) at srv.c:207
#2  0x000000000042a118 in jabber_login (acc=0x6d62090) at jabber.c:190
#3  0x0000000000415cb6 in cmd_account (irc=0x6d4d2d0, cmd=<value optimized out>) at root_commands.c:377
#4  0x0000000000416268 in root_command_string (irc=0x410, u=<value optimized out>, 
    command=0x414 <Address 0x414 out of bounds>, flags=<value optimized out>) at root_commands.c:77
#5  0x00000000004101f9 in irc_send (irc=0x6d4d2d0, nick=0x6d4d280 "root", s=0x6d4d280 "root", flags=0) at irc.c:1108
#6  0x00000000004113a4 in irc_process (irc=0x6d4d2d0) at irc.c:413
#7  0x000000000040d82d in bitlbee_io_current_client_read (data=0x6d4d2d0, fd=0, cond=<value optimized out>)
    at bitlbee.c:184
#8  0x0000003eae62cdb4 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#9  0x0000003eae62fc0d in ?? () from /lib64/libglib-2.0.so.0
#10 0x0000003eae62ff1a in g_main_loop_run () from /lib64/libglib-2.0.so.0
#11 0x0000000000418af9 in main (argc=<value optimized out>, argv=0x7fff290209e8) at unix.c:135

Comment 6 Ricky Zhou 2009-07-15 02:48:40 UTC

Created attachment 351711
Fix null pointer dereference.

Hi, here's a patch to srv.c which solves the issue for me.  The issue is triggered for me when the conditions in 

    if ((((HEADER *)answer)->rcode)==NOERROR && (count=ntohs(((HEADER *)answer)->ancount))) {

are false, which causes:

    *reply = *list;

with list = NULL.

This happens when no SRV records are returned.
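
The attached patch itself is not reproduced in this report, so the following is only a rough sketch of the kind of guard that avoids the dereference, not the actual change to srv.c. The struct srv_record layout and the function name copy_first_record are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one SRV result; not BitlBee's actual struct. */
struct srv_record {
    int prio;
    int weight;
    int port;
    char name[256];
};

/*
 * Copy the first record of a result list into a freshly allocated record.
 * `list` may be NULL when the lookup produced no SRV records (for example
 * when rcode != NOERROR or ancount == 0); in that case NULL is returned
 * instead of dereferencing the pointer. The caller frees the result.
 */
struct srv_record *copy_first_record(const struct srv_record *list)
{
    struct srv_record *reply;

    if (list == NULL)
        return NULL;            /* no SRV records: nothing to copy */

    reply = malloc(sizeof *reply);
    if (reply == NULL)
        return NULL;            /* out of memory */

    *reply = *list;             /* safe: list was checked above */
    return reply;
}
```

With a guard like this, a caller that gets NULL back can fall back to connecting to the plain domain name, which is presumably what happens anyway when no SRV records exist.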

Comment 7 Robert Scheck 2009-07-15 06:53:09 UTC

Wow, thank you. I'll fire a new build in the next few days together with
another nearly solved bitlbee packaging issue (bitlbee-devel subpackage).

Comment 8 Fedora Update System 2009-08-16 22:52:51 UTC

bitlbee-1.2.3-4.el4 has been submitted as an update for Fedora EPEL 4.
http://admin.fedoraproject.org/updates/bitlbee-1.2.3-4.el4

Comment 9 Fedora Update System 2009-08-16 22:52:51 UTC

bitlbee-1.2.3-4.el5 has been submitted as an update for Fedora EPEL 5.
http://admin.fedoraproject.org/updates/bitlbee-1.2.3-4.el5

Comment 10 Fedora Update System 2009-08-16 22:53:02 UTC

bitlbee-1.2.3-4.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/bitlbee-1.2.3-4.fc11

Comment 11 Fedora Update System 2009-08-17 20:44:01 UTC

bitlbee-1.2.3-4.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/bitlbee-1.2.3-4.fc10

Comment 12 Fedora Update System 2009-09-02 20:55:07 UTC

bitlbee-1.2.3-4.el5 has been pushed to the Fedora EPEL 5 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 13 Fedora Update System 2009-09-02 20:56:21 UTC

bitlbee-1.2.3-4.el4 has been pushed to the Fedora EPEL 4 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 14 Fedora Update System 2009-09-03 00:31:19 UTC

bitlbee-1.2.3-4.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 15 Fedora Update System 2009-09-03 00:34:04 UTC

bitlbee-1.2.3-4.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.
