Bug 185741

Summary: prboom immediately segfaults when attempting to play freedoom on FC4
Product: [Fedora] Fedora Reporter: Kevin Kofler <kevin>
Component: prboomAssignee: Wart <wart>
Status: CLOSED NEXTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: medium    
Version: 4CC: extras-qa
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-03-19 07:00:22 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Fix segfault on i386
none
Patch dropping p(v)snprintf and using the system versions
none
Patch fixing this for both i386 and x86_64 none

Description Kevin Kofler 2006-03-17 12:59:20 UTC
Description of problem:  
Prboom doesn't seem to work at all: when attempting to play freedoom on FC4 on 
a Pentium III with a Radeon 7000 graphics card, I get this: 
... 
P_Init: Init Playloop state. 
I_Init: Setting up machine state. 
I_InitSound:  configured audio device with 512 samples/slice 
I_InitSound: sound module ready 
S_Init: Setting up sound. 
HU_Init: Setting up heads up display. 
I_InitGraphics: 320x320 
I_SignalHandler: Exiting on signal: signal 11 
I_ShutdownMusic: removing /tmp/prboom-music-ud83vJ 
I_ShutdownSound: 
 
Version-Release number of selected component (if applicable): 
prboom-2.3.1-4.fc4 
 
How reproducible: Always 
 
Steps to Reproduce:  
1. prboom -iwad /usr/share/doom/freedoom.wad 
2. Watch the output on the console. 
 
Actual results: 
Prboom exits due to signal 11 (SIGSEGV). 
 
Expected results: 
The game starts up normally.

Comment 1 Kevin Kofler 2006-03-17 13:01:52 UTC
I forgot to mention that this is with the X.Org driver (with DRI enabled), not 
with fglrx. (I never installed or used fglrx or any other proprietary graphics 
driver.) 

Comment 2 Hans de Goede 2006-03-17 14:30:45 UTC
I saw this too yesterday when trying Alien TC on my i386-FC5 machine and I
thought this was an Alien TC problem. But I can reproduce it on x86_64 too by
installing the i386 version, investigating.


Comment 3 Hans de Goede 2006-03-17 14:42:17 UTC
Hmm,

el weirdo bug should be easy to fix if only I knew why it happens. Anyway I've
gotta leave my keybaord now. I'll probably take another look tomorrow feel free
to take a look untill then.


Comment 4 Kevin Kofler 2006-03-17 14:53:03 UTC
So the x86_64 version works? Then maybe the 64-bit patch is breaking 32-bit? 

Comment 5 Wart 2006-03-17 15:21:10 UTC
I can duplicate this on my FC5-i386 box as well, whereas FC-4 x86_64 has been
running great for the past week with no problems.  kevin might be right, this
might be something introduced by the 64-bit patch.  I'm setting up a build
environment on i386 now to test.

Comment 6 Wart 2006-03-17 15:45:29 UTC
I thought I recognized that pvsnprintf nonsense...

I'm still a little groggy this morning, so I don't have a full explanation, but
it seems that there is an issue with the way varargs are handled on i386 vs.
FC-4.  I've attached an updated -gcc4 patch that fixes the problem on i386 (but
probably won't compile on x86_64, or FC-4, or some other combination that I
don't recall).

Comment 7 Wart 2006-03-17 15:46:38 UTC
Created attachment 126282 [details]
Fix segfault on i386

Comment 8 Hans de Goede 2006-03-17 16:14:37 UTC
Created attachment 126283 [details]
Patch dropping p(v)snprintf and using the system versions

Well since I had todo some stuff away from my keyboard I had some time to think
about this, the result the attached patch which gets rid of p(v)snprintf and
just uses the system versions. I can check if your stripped gcc4 patch works on
FC5 -i386 and x86_64 if you want, but I believe your stripped gcc4 patch
together with this patch is both the easiest and the best solution.

Comment 9 Wart 2006-03-17 16:44:23 UTC
Thanks for the patch!

It looks like there's a conflict between the two patches.  I'll resolve the
conflict and update CVS accordingly.  Can you test on FC-5 once the patches are
in CVS?

Comment 10 Wart 2006-03-17 17:25:51 UTC
The psnprintf patch fixes the segfault, but prboom no longer responds to menu
selections.  I tested this on FC-4 x86_64 and FC-5 i386.  I tried it with the
full -gcc4 patch and the attached stripped version, but got the same results.

1) Start prboom
2) Move the cursor keys to select "Quit" from the main menu
3) Hit <enter> to quit the game

Expected result:  The game quits.
Observed result:  The menu selection sound effect is played, but the game
doesn't quit.  In fact, none of the main menu options can be selected.

I wonder if there's some subtle difference between psnprintf() and the system
snprintf()?

Comment 11 Kevin Kofler 2006-03-17 19:07:37 UTC
I looked at the -gcc4 patch and indeed the psnprintf patches don't seem correct  
to me (nor does just using the system snprintf). It looks to me like this 
function really does want a pointer to a va_list and not a plain va_list, which 
is why there is that "p" in "psnprintf". I think the idea is that psnprintf 
removes the arguments it consumed from the va_list, which is why it needs a 
pointer to the va_list (it's an in/out parameter). With the -gcc4 patch, the 
va_list * which is being passed to psnprintf is implicitly converted to a 
va_list (there may well be no warning about that due to internals of GCC's 
va_list implementation) and this is what causes the segfault. 

Comment 12 Kevin Kofler 2006-03-17 19:26:14 UTC
Why are the psnprintf changes in the -gcc4 patch needed at all? The unmodified 
psnprintf.c (at least the version at 
https://www.crowproductions.de/repos/prboom/tags/prboom_2_3_1/prboom2/src/ ) 
compiles without errors for me with FC4's GCC 4.0.2. (I haven't checked that 
the ASM output is valid though.) Is this a GCC 4.1 issue? 

Comment 13 Wart 2006-03-17 19:32:15 UTC
(In reply to comment #12)
> Why are the psnprintf changes in the -gcc4 patch needed at all? The unmodified 
> psnprintf.c (at least the version at 
> https://www.crowproductions.de/repos/prboom/tags/prboom_2_3_1/prboom2/src/ ) 
> compiles without errors for me with FC4's GCC 4.0.2. (I haven't checked that 
> the ASM output is valid though.) Is this a GCC 4.1 issue? 

It's a x86_64 issue.  It will compile on x86_64 with many warnings, but the
application will segfault immediately.  The naming of the patch is a little
misleading...

Comment 14 Kevin Kofler 2006-03-17 20:20:08 UTC
The va_* stuff is supposed to be a GCC builtin, its behavior shouldn't depend 
on the target. I suppose the x86_64 glibc is doing something weird to its 
va_lists. Maybe this could help: 
#undef va_list 
#undef va_start 
#undef va_arg 
#undef va_end 
#define va_list __builtin_va_list 
#define va_start __builtin_va_start 
#define va_arg __builtin_va_arg 
#define va_end __builtin_va_end 

Comment 15 Hans de Goede 2006-03-17 23:05:57 UTC
Created attachment 126291 [details]
Patch fixing this for both i386 and x86_64

Hmm,

According to the stdarg docs va_list can be almost anything and the functions
to manipulate it are not functions but macros and thus they could once again do
anything. There even is a special va_copy macro because even such a simple
thing as the the "=" operator is not guaranteed to work. Thus the current
psnprint code which passed va_list by reference is a bad idea. As it looks
right now the old code will work on i386 but not on x86_64 and on x86_64
va_list appearantly always gets passed by reference, and doing this double
leads to crashes.

So an option would be to conditonally apply the relevant part of the gcc4 part
on x86_64 only (ugly as hell).

Instead however I've opted to modify the code so that the va_list doesn't need
to get passed around anymore at all, see the above patch. Sorry this patch is
against the non stripped gcc4 patch it would have been better todo it against
the stripped version. This has been tested on FC-5 both i386 and x86_64 .

I also have found out why the menu doesn't work. The prboom code expects in
many places that it is able to strcat a formatted string to a buf by using:
sprintf(buf "%s%more%stuff", buf, .....

Which is not guaranteed to work at all and has actually been deliberatly broken
in glibc for a while to force people to not use this. We could fix this
assumption in prboom and then drop psnprintf completly.

The p BTW is not for pointer but for portable, the prototypes of pvsnprintf and
vsnprintf are 100% identical, both expect a va_list passed by value.

Besides the strcat problem, psnprintf also deviates in that it takes normal int
values on the stack for "%hd" or "%hx" etc, instead of shirt ints. I believe
this is a bug, but other code may depend on this erm feature.

Comment 16 Kevin Kofler 2006-03-17 23:15:57 UTC
> Besides the strcat problem, psnprintf also deviates in that it takes normal  
int  
> values on the stack for "%hd" or "%hx" etc, instead of shirt ints. I believe  
> this is a bug, but other code may depend on this erm feature.  
  
No, this is just the regular C default promotion, the standard snprintf does  
that too. If you pass a short to a vararg function, it will always be passed as  
an int. (Actually, the C standard says this happens for non-vararg functions  
too, but for these the as-if rule allows implementations to pass unpromoted  
shorts instead. But for vararg functions, the difference matters.) 
 
va_arg(short) is invalid (well, the compiler is supposed to compile it, but  
behavior at execution time is completely undefined), so if you do that with  
GCC's __builtin_va_arg, you'll get a mandatory warning and the code will be  
compiled into an abort.  

Comment 17 Wart 2006-03-18 05:02:11 UTC
I haven't taken a close look at the patch yet, but I was able to verify that it
works on FC-4 x86_64 and FC-5 i386, so I'm confident that it fixes the problem.
 I'll adjust the patch to work against the stripped -gcc patch and update CVS
tomorrow.

The only issue I saw on FC-5 i386 was a hypersensitive mouse, but this is also a
problem when running quake2.  I'm highly suspicious that it's some issue with my
virtual machine and not the application.  If someone could verify that the
package works before I request a new build I would appreciate it.

Comment 18 Hans de Goede 2006-03-18 08:45:09 UTC
Same problem (erratic mouse, FC5-i386) over here, clearing .prboom/prboom.config
fixed it, I guess the snprintf bugs also fscked up the config-file. (I got a
hunch when I noticed my default difficulty level for starting a new game had
changed).

With the config file removed it works 100% (for a short test run)


Comment 19 Hans de Goede 2006-03-18 08:47:43 UTC
Forget about the config file comment, the default mouse sensity is just a bit
high for my liking, lowering it works fine.



Comment 20 Wart 2006-03-18 17:54:29 UTC
(In reply to comment #18)
> Same problem (erratic mouse, FC5-i386) over here, clearing .prboom/prboom.config
> fixed it, I guess the snprintf bugs also fscked up the config-file. (I got a
> hunch when I noticed my default difficulty level for starting a new game had
> changed).

My keybindings got deleted from the existing prboom.config for some reason. 
Fortunately, the save games were unaffected.

-devel has been updated with the modified patches.  I tested it on FC-4 x86_64
and FC-5 i386.  If you could test again with the CVS sources, and don't run into
any problems, then I'll request a build and mark this bug as fixed.

Comment 21 Kevin Kofler 2006-03-19 05:04:02 UTC
Why wait for test results at all? You can hardly regress beyond a segfault 
immediately on startup which is what happens with the packages currently in 
Extras. 

Comment 22 Wart 2006-03-19 07:00:22 UTC
(In reply to comment #21)
> Why wait for test results at all? You can hardly regress beyond a segfault 
> immediately on startup which is what happens with the packages currently in 
> Extras. 

Point taken.

New builds pushed.

Comment 23 Kevin Kofler 2006-03-19 16:18:34 UTC
The build is now up in the repository and works fine.