In my F-12 Rawhide installation xterm sometimes crashes without me doing anything special. I haven't found a reliable way to trigger the crash, but it tends to happen every few hours. I recompiled xterm with the -O0 -g -ggdb switches to get a more useful backtrace; apart from the recompile, it is xterm-248-1.fc12.x86_64 without any other changes.

Program received signal SIGSEGV, Segmentation fault.
0x0000000000440eb8 in DamagedCells (screen=0x6ad858, n=186, klp=0x7fffffffdc08, krp=0x7fffffffdc04, row=108, col=0) at ./util.c:110
110 if (ld->charData[kl] == HIDDEN_CHAR) {
(gdb) bt
#0 0x0000000000440eb8 in DamagedCells (screen=0x6ad858, n=186, klp=0x7fffffffdc08, krp=0x7fffffffdc04, row=108, col=0) at ./util.c:110
#1 0x000000000043b9a4 in ScrnClearCells (xw=0x6ad6e0, row=53, col=0, len=186) at ./screen.c:704
#2 0x0000000000443fb6 in ClearRight (xw=0x6ad6e0, n=186) at ./util.c:1273
#3 0x00000000004442df in do_erase_line (xw=0x6ad6e0, param=-1, mode=0) at ./util.c:1362
#4 0x0000000000411c5e in doparsing (xw=0x6ad6e0, c=75, sp=0x6773a0) at ./charproc.c:1871
#5 0x0000000000414128 in VTparse (xw=0x6ad6e0) at ./charproc.c:3024
#6 0x0000000000417ff4 in VTRun (xw=0x6ad6e0) at ./charproc.c:4963
#7 0x000000000042c4e3 in main (argc=0, argv=0x7fffffffe0b0) at ./main.c:2414
(gdb) print ld
$1 = (LineData *) 0x7700c0
(gdb) print ld->charData
$5 = (CharData *) 0x1f4500000000
(gdb) print kl
$3 = 0
(gdb) print ld->charData[0]
Cannot access memory at address 0x1f4500000000
I don't see an obvious cause in the walkback, since DamagedCells is in a context where it "should" work. However, if there's some coding error from a different point, then ld could be null. In that case, adding a null-pointer check just before line 110, and returning false would help. A "different point", for example, would be in the resizing code, which is where some of the recent bug-fixes were.
(In reply to comment #1)
> However, if there's some coding error from a different
> point, then ld could be null. In that case, adding
> a null-pointer check just before line 110, and returning
> false would help.

I posted some more gdb output right after the backtrace where I examined a few variables. ld doesn't appear to be null, but ld->charData points to garbage.
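To illustrate why a null-pointer check alone would not have helped here, below is a minimal sketch. The struct is a hypothetical, cut-down stand-in for xterm's LineData (only a few of the fields shown in the gdb output, with assumed types), and demo_line_has_pointers() is an invented helper, not an xterm function.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for xterm's LineData.
 * Field names follow the gdb output; the types are assumptions. */
typedef struct {
    unsigned short lineSize;
    unsigned char *attribs;
    unsigned int *charData;
} DemoLineData;

/* Returns 1 when ld and its data pointers are non-null, 0 otherwise.
 * This only rejects null pointers; it cannot tell a valid address
 * from a non-null garbage one. */
int demo_line_has_pointers(const DemoLineData *ld)
{
    return ld != NULL && ld->attribs != NULL && ld->charData != NULL;
}
```

A struct whose charData holds a non-null garbage value such as the 0x1f4500000000 above passes this check, so the dereference on line 110 would still fault.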
I see (didn't notice). Doing a "print *ld" would show the other struct members, which I assume are all incorrect. Also, going up the stack to screen.c line 1274, there should be a comparable ld value. Does this crash in the same place, each time?
Note that I'm examining another trace this time, so some values might be a bit different than in the first backtrace posted. It crashed in the exact same place though.

> I see (didn't notice). Doing a "print *ld" would
> show the other struct members, which I assume are
> all incorrect.

(gdb) frame
#0 0x0000000000440eb8 in DamagedCells (screen=0x8af858, n=141, klp=0x7fffffffdc08, krp=0x7fffffffdc04, row=66, col=0) at ./util.c:110
110 if (ld->charData[kl] == HIDDEN_CHAR) {
(gdb) info local
ld = 0x9a63f0
kl = 0
kr = 141
(gdb) print *ld
$3 = {lineSize = 141, bufHead = 0 '\000', combSize = 2 '\002', attribs = 0x2000000020 <Address 0x2000000020 out of bounds>, color = 0x2000000020, charData = 0x2000000020, combData = {0x2000000020}}

> Also, going up the stack to screen.c line 1274, there
> should be a comparable ld value.

I can't see any frame at screen.c:1274 in this backtrace, but I'll show you *ld from util.c:1273 instead.

(gdb) up
#1 0x000000000043b9a4 in ScrnClearCells (xw=0x8af6e0, row=51, col=0, len=141) at ./screen.c:704
704 if_OPT_WIDE_CHARS(screen, {
(gdb) info local
kl = 9107168
kr = 51
screen = 0x8af858
flags = 0
(gdb) up
#2 0x0000000000443fb6 in ClearRight (xw=0x8af6e0, n=141) at ./util.c:1273
1273 ScrnClearCells(xw, screen->cur_row, screen->cur_col, len);
(gdb) info local
screen = 0x8af858
ld = 0x9a6120
len = 141
(gdb) print *ld
$6 = {lineSize = 141, bufHead = 0 '\000', combSize = 2 '\002', attribs = 0xb2d8f8 "\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200", color = 0xb2d988, charData = 0xb2daa4, combData = {0xb2dcd8}}

> Does this crash in the same place, each time?

Yes, it always crashes in the same place.
thanks - I'm assuming that one of the limit checks for index/row translation is hitting a boundary. I'll pick through this and see if I can spot the problem.
I think the problem is an error in the way I mapped the row indices (mostly changes from 2007), which could break if you happen to have xterm scrolled back, e.g., for a selection, while an application is updating the screen. For instance, the INX2ROW() on line 1223 of screen.c looks like the immediate problem - I'll investigate and see.
...I meant INX2ROW on line 707 actually ;-)
But I am able to reproduce the problem (that INX2ROW is incorrect). I'll review more from that slice and fix what I can find. thanks
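For readers following along, here is a rough sketch of how a misapplied index-to-row translation could produce the row numbers in the traces above (row=53 becoming row=108 in the first trace, row=51 becoming row=66 in the second). It assumes the translation is a plain offset by the scrollback position. DemoScreen, demo_inx2row() and demo_row_in_range() are hypothetical names for illustration and are not xterm's actual definitions; the 60-row screen in the usage note is likewise an assumption.

```c
/* Hypothetical stand-in for the screen state; not xterm's TScreen. */
typedef struct {
    int topline;  /* assumed: 0 for the live view, negative when scrolled back */
    int max_row;  /* assumed: index of the last row of the visible screen */
} DemoScreen;

/* Assumed translation: shift an index by the scrollback offset.
 * Applying it to a value that is already a row shifts it a second time. */
int demo_inx2row(const DemoScreen *s, int inx)
{
    return inx - s->topline;
}

/* Returns 1 when row lies on the visible screen, 0 otherwise. */
int demo_row_in_range(const DemoScreen *s, int row)
{
    return row >= 0 && row <= s->max_row;
}
```

With topline at 0 the translation is a no-op, which would explain why the crash only shows up some of the time. With a 60-row screen scrolled back 55 lines (topline = -55), demo_inx2row() turns row 53 into 108, which is past the last row, so any line data looked up for it would be unrelated memory.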
Thanks for the very fast investigation, Thomas. Let me know if you need me to test anything else.
no - I have enough data (thanks). I'm about 2/3 through reviewing the places where I used that macro.
We have beta freeze today, so I've included upstream patch 248c + one hunk from 248e. Should be fixed in xterm-248-2.fc12. Please reopen if it doesn't fix the issue. Thanks.
sounds good...
I can confirm that the issue I was seeing is now fixed. Thanks!