524503 – xterm sigsegv in DamagedCells()


Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	xterm
Sub Component:
Version:	rawhide
Hardware:	x86_64
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Assignee:	Miroslav Lichvar
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2009-09-20 21:32 UTC by Kalev Lember
Modified:	2009-09-30 12:14 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2009-09-29 09:43:06 UTC
Type:	---
Embargoed:
Dependent Products:


Description Kalev Lember 2009-09-20 21:32:36 UTC

In my F-12 Rawhide installation, xterm sometimes crashes without me doing anything special. I haven't found a reliable way to trigger the crash, but it tends to happen every few hours.

I recompiled xterm with the -O0 -g -ggdb switches to get a more useful backtrace. Apart from the rebuild, it is xterm-248-1.fc12.x86_64 without any other changes.

Program received signal SIGSEGV, Segmentation fault.
0x0000000000440eb8 in DamagedCells (screen=0x6ad858, n=186, klp=0x7fffffffdc08, krp=0x7fffffffdc04, row=108, col=0) at ./util.c:110
110         if (ld->charData[kl] == HIDDEN_CHAR) {
(gdb) bt
#0  0x0000000000440eb8 in DamagedCells (screen=0x6ad858, n=186, klp=0x7fffffffdc08, krp=0x7fffffffdc04, row=108, col=0) at ./util.c:110
#1  0x000000000043b9a4 in ScrnClearCells (xw=0x6ad6e0, row=53, col=0, len=186) at ./screen.c:704
#2  0x0000000000443fb6 in ClearRight (xw=0x6ad6e0, n=186) at ./util.c:1273
#3  0x00000000004442df in do_erase_line (xw=0x6ad6e0, param=-1, mode=0) at ./util.c:1362
#4  0x0000000000411c5e in doparsing (xw=0x6ad6e0, c=75, sp=0x6773a0) at ./charproc.c:1871
#5  0x0000000000414128 in VTparse (xw=0x6ad6e0) at ./charproc.c:3024
#6  0x0000000000417ff4 in VTRun (xw=0x6ad6e0) at ./charproc.c:4963
#7  0x000000000042c4e3 in main (argc=0, argv=0x7fffffffe0b0) at ./main.c:2414
(gdb) print ld
$1 = (LineData *) 0x7700c0
(gdb) print ld->charData
$5 = (CharData *) 0x1f4500000000
(gdb) print kl
$3 = 0
(gdb) print ld->charData[0]
Cannot access memory at address 0x1f4500000000

Comment 1 Thomas E. Dickey 2009-09-21 08:10:35 UTC

I don't see an obvious cause in the walkback, since
DamagedCells is in a context where it "should" work.
However, if there's some coding error from a different
point, then ld could be null.  In that case, adding
a null-pointer check just before line 110, and returning
false would help.

A "different point", for example, would be in the resizing
code, which is where some of the recent bug-fixes were.
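
The guard suggested above could look roughly like the following. This is only a sketch: LineData and CharData here are cut-down stand-ins, the HIDDEN_CHAR value is a placeholder, and cell_is_hidden is a hypothetical helper, not a function from xterm's util.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; xterm's real LineData has more members. */
typedef unsigned int CharData;
#define HIDDEN_CHAR 0xffffu /* placeholder value, not xterm's definition */

typedef struct {
    unsigned lineSize;  /* number of cells in the line */
    CharData *charData; /* one entry per cell */
} LineData;

/*
 * Returns true if the cell at `col` holds the hidden-character marker.
 * Returns false instead of dereferencing when the line or its cell
 * array is missing, or when `col` is past the end of the line.
 */
bool cell_is_hidden(const LineData *ld, unsigned col)
{
    if (ld == NULL || ld->charData == NULL || col >= ld->lineSize)
        return false;
    return ld->charData[col] == HIDDEN_CHAR;
}
```

A check like this only catches a null pointer or an out-of-range column. It cannot detect a non-null pointer that refers to the wrong or stale memory.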

Comment 2 Kalev Lember 2009-09-21 09:28:39 UTC

(In reply to comment #1)
> However, if there's some coding error from a different
> point, then ld could be null.  In that case, adding
> a null-pointer check just before line 110, and returning
> false would help.

I posted some more gdb output right after the backtrace where I examined a few variables. ld doesn't appear to be null, but ld->charData points to garbage.

Comment 3 Thomas E. Dickey 2009-09-21 09:37:15 UTC

I see (didn't notice).  Doing a "print *ld" would
show the other struct members, which I assume are
all incorrect.

Also, going up the stack to screen.c line 1274, there
should be a comparable ld value.

Does this crash in the same place, each time?

Comment 4 Kalev Lember 2009-09-21 15:25:22 UTC

Note that I'm examining another trace this time, so some values might be a bit different than in the first backtrace posted. It crashed in the exact same place though.


> I see (didn't notice).  Doing a "print *ld" would
> show the other struct members, which I assume are
> all incorrect.

(gdb) frame
#0  0x0000000000440eb8 in DamagedCells (screen=0x8af858, n=141, klp=0x7fffffffdc08, krp=0x7fffffffdc04, row=66, col=0) at ./util.c:110
110         if (ld->charData[kl] == HIDDEN_CHAR) {
(gdb) info local
ld = 0x9a63f0
kl = 0
kr = 141
(gdb) print *ld
$3 = {lineSize = 141, bufHead = 0 '\000', combSize = 2 '\002', attribs = 0x2000000020 <Address 0x2000000020 out of bounds>,
  color = 0x2000000020, charData = 0x2000000020, combData = {0x2000000020}}


> Also, going up the stack to screen.c line 1274, there
> should be a comparable ld value.

I can't see any frame at screen.c:1274 in this backtrace, but I'll show you *ld from util.c:1273 instead.

(gdb) up
#1  0x000000000043b9a4 in ScrnClearCells (xw=0x8af6e0, row=51, col=0, len=141) at ./screen.c:704
704         if_OPT_WIDE_CHARS(screen, {
(gdb) info local
kl = 9107168
kr = 51
screen = 0x8af858
flags = 0
(gdb) up
#2  0x0000000000443fb6 in ClearRight (xw=0x8af6e0, n=141) at ./util.c:1273
1273            ScrnClearCells(xw, screen->cur_row, screen->cur_col, len);
(gdb) info local
screen = 0x8af858
ld = 0x9a6120
len = 141
(gdb) print *ld
$6 = {lineSize = 141, bufHead = 0 '\000', combSize = 2 '\002',
  attribs = 0xb2d8f8 "\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200", color = 0xb2d988,
  charData = 0xb2daa4, combData = {0xb2dcd8}}


> Does this crash in the same place, each time?  

Yes, it always crashes in the same place.

Comment 5 Thomas E. Dickey 2009-09-21 20:53:08 UTC

thanks - I'm assuming that one of the limit checks for
index/row translation is hitting a boundary.  I'll pick
through this and see if I can spot the problem.

Comment 6 Thomas E. Dickey 2009-09-22 09:38:41 UTC

I think the problem is an error in the way I mapped the
row indices (mostly changes from 2007), which could break
if you happen to have xterm scrolled back, e.g., for a
selection, while an application is updating the screen.

For instance, the INX2ROW() on line 1223 of screen.c looks
like the immediate problem - I'll investigate and see.
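
To illustrate the kind of row-mapping error described above, here is a small self-contained sketch. Everything in it is hypothetical: the Screen layout, the SAVED_LINES and VISIBLE_ROWS sizes, and the helpers window_to_buffer, buffer_to_window and get_line are invented for illustration and are not xterm's INX2ROW/ROW2INX macros, which may be defined differently. It assumes a "buffer row" counts from the top of the live screen (negative values are scrollback) and a "window index" counts from the top of what is currently displayed. In the first backtrace ScrnClearCells was called with row=53 while DamagedCells received row=108, which would be consistent with a row being shifted by a scroll offset, although nothing in this report confirms that.

```c
#include <assert.h>
#include <stddef.h>

#define SAVED_LINES 4  /* hypothetical scrollback depth */
#define VISIBLE_ROWS 3 /* hypothetical screen height */

typedef struct {
    int topline; /* 0 when not scrolled; -k when scrolled back k lines */
    int lines[SAVED_LINES + VISIBLE_ROWS]; /* scrollback first, then live rows */
} Screen;

/* Window index (as displayed) -> buffer row. */
int window_to_buffer(const Screen *s, int inx) { return inx + s->topline; }

/* Buffer row -> window index (as displayed). */
int buffer_to_window(const Screen *s, int row) { return row - s->topline; }

/* Returns the line stored for a buffer row, or NULL if it is out of range. */
const int *get_line(const Screen *s, int row)
{
    int slot = row + SAVED_LINES;
    if (slot < 0 || slot >= SAVED_LINES + VISIBLE_ROWS)
        return NULL;
    return &s->lines[slot];
}
```

With the view scrolled back two lines (topline = -2), a cursor on buffer row 2 is valid, but treating the result of buffer_to_window as a buffer row yields 4, which is past the last live row. Without the range check in get_line, that lookup would read whatever memory follows the array.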

Comment 7 Thomas E. Dickey 2009-09-22 22:24:25 UTC

...I meant INX2ROW on line 707 actually ;-)

Comment 8 Thomas E. Dickey 2009-09-22 22:40:23 UTC

But I am able to reproduce the problem (that INX2ROW is incorrect).
I'll review more from that slice and fix what I can find.

thanks

Comment 9 Kalev Lember 2009-09-23 06:16:29 UTC

Thanks for the very fast investigation, Thomas. Let me know if you need me to test anything else.

Comment 10 Thomas E. Dickey 2009-09-23 08:54:40 UTC

no - I have enough data (thanks).  I'm about 2/3 through reviewing
the places where I used that macro.

Comment 11 Miroslav Lichvar 2009-09-29 09:43:06 UTC

We have the beta freeze today, so I've included upstream patch 248c plus one hunk from 248e.

Should be fixed in xterm-248-2.fc12. Please reopen if it doesn't fix the issue. Thanks.

Comment 12 Thomas E. Dickey 2009-09-29 21:09:55 UTC

sounds good...

Comment 13 Kalev Lember 2009-09-30 12:14:06 UTC

I can confirm that the issue I was seeing is now fixed. Thanks!
