Bug 197440 - ppp occasionally dies
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: ppp
Version: 5
Hardware: i686 Linux
Priority: medium
Severity: medium
Assigned To: Thomas Woerner
Keywords: Reopened
Reported: 2006-07-01 11:20 EDT by Bruno Wolff III
Modified: 2007-11-30 17:11 EST
Doc Type: Bug Fix
Last Closed: 2007-08-30 04:43:51 EDT
Attachments
strace of pppd (65.03 KB, text/plain)
2006-07-19 06:25 EDT, Bruno Wolff III
Description Bruno Wolff III 2006-07-01 11:20:50 EDT
Description of problem:
I use ppp in dial on demand mode and it occasionally locks up and then dies.
Typically I will see the modem still indicating outbound packets, but no inbound
packets and then a little while later the ppp interface will disappear. I can
use ifdown and ifup to get things working again. This happens sporadically.
Sometimes I will see it happen a couple of times in an hour and other times I
can be connected for over 8 hours without having the problem.

I first noticed this behavior after upgrading to the 2.6.16 kernel, however some
other things changed around that time as well. My dialup provider changed who
they buy dialup service from. So it might have been a preexisting bug that
wasn't triggered by the previous service provider. I also started running the
zaptel kernel module (for asterisk) around that time and there could potentially
be a problem with it.

I probably should have tried going back to a 2.6.15 kernel for a while, but
these seem to have been removed from updates now so I can't try this easily.
(I think the last 2.6.15 kernel update for FC5 might have some security issues,
but they would probably be low risk for me and I could try this if it would help
and I can get a copy.)

I don't know if this is related or not, but I also have been getting the
following log message, though I don't know that it correlates with when I have
the described problem:
PPP: VJ uncompressed error

Version-Release number of selected component (if applicable):
ppp-2.4.3-6.2.1
2.6.16 and 2.6.17 smp kernels


How reproducible:
Sporadic

Steps to Reproduce:
1. Make a dial up connection using ppp
2. wait
  
Actual results:
ppp occasionally will lock up and then usually die shortly thereafter.


Expected results:
No problems.


Additional info:
Comment 1 Thomas Woerner 2006-07-05 13:01:47 EDT
Here are some questions:

Is pppd dying at this moment?
Is it reproducible without that additional kernel module?

Please use a debugger or strace on pppd if it is happening again.
Comment 2 Bruno Wolff III 2006-07-05 13:31:15 EDT
I saw at least one instance over last weekend.
I am still just playing with Asterisk and can uninstall the zaptel modules for a
while to see if that stops the crashes.
The next time I see a hang I will see if I can attach a debugger before the ppp
interface disappears. That will be a learning process for me, so I might botch an
attempt or a few, before I get it right.
It isn't clear to me that the hangs are a Fedora problem, but once they occur I
should be able to do a sighup to get ppp to hang up the connection without
losing the ppp interface.
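
The SIGHUP approach mentioned above can be sketched as follows. pppd documents that SIGHUP terminates the link, and that with the demand or persist option the daemon keeps running afterwards. This is only a minimal sketch: the pid file path /var/run/ppp0.pid and the interface name ppp0 are assumptions about this setup, the helper names read_pid and hang_up are hypothetical, and sending the signal normally requires root.

```python
import os
import signal


def read_pid(pid_file):
    """Return the PID stored on the first line of a pid file."""
    with open(pid_file) as handle:
        return int(handle.readline().strip())


def hang_up(pid_file="/var/run/ppp0.pid"):
    """Send SIGHUP to the pppd named in pid_file and return its PID.

    The default path is an assumption; adjust it for the interface in use.
    """
    pid = read_pid(pid_file)
    os.kill(pid, signal.SIGHUP)  # asks pppd to drop the current link
    return pid
```

Calling hang_up() during a hang and then checking whether the ppp interface is still present would show whether the daemon survived the hangup.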
Comment 3 Bruno Wolff III 2006-07-07 12:45:49 EDT
Since it has been a few days I thought I should give an update.
I upgraded to the 2.6.17-2145 kernel and uninstalled the zaptel kernel module.
I haven't seen a recurrence of the problem yet, but only have spent about 4
hours connected using ppp, so it is a bit early to say that zaptel was the problem.
I should rack up some more hours over the weekend. If it doesn't occur for over
something like 16 hours, then I will try putting zaptel back and see if the
failures start up again.
If it does turn out to be related to zaptel, it may not be the kernel module
itself. I believe that the device (TDM400) it runs generates 8000 interrupts per
second and the problem could also be related to that. I'll probably be over my
head trying to debug zaptel if things point that way, but I might ask you guys
for some general advice on how to approach doing that if needed.
Comment 4 Bruno Wolff III 2006-07-08 00:53:40 EDT
It died again. zaptel wasn't installed and I was running the 2.6.17 2145
smp kernel. I wasn't around when it happened, so I didn't get to try to do
anything between the network hang and the interface disappearing. I'll now
try to work on capturing some information about how and/or where it is dying.
Comment 5 Thomas Woerner 2006-07-08 10:31:26 EDT
Please check if pppd is dying or if there is a problem with the ppp kernel driver.

Is pppd still running? What state is it in? Please have a look at the ps output
and use a debugger and/or strace on pppd if it is still there.
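
The check requested above (is pppd still running, and in which state) can be sketched as a small script that reads the same information ps shows from /proc. This is a rough sketch, assuming a Linux /proc filesystem and that the daemon's command name is exactly "pppd"; the helper names parse_stat and find_processes are hypothetical.

```python
import os

# Meaning of the single-letter state codes shown by ps and /proc/<pid>/stat.
STATE_NAMES = {
    "R": "running",
    "S": "sleeping (interruptible)",
    "D": "uninterruptible sleep (usually waiting on I/O)",
    "T": "stopped",
    "Z": "zombie",
}


def parse_stat(stat_text):
    """Return (pid, command, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split around the last ')'.
    left, _, right = stat_text.rpartition(")")
    pid_text, _, comm = left.partition("(")
    state = right.split()[0]
    return int(pid_text), comm, state


def find_processes(name, proc_root="/proc"):
    """Return a list of (pid, state) for processes whose command is `name`."""
    matches = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return matches  # no /proc available on this system
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except OSError:
            continue  # process exited while we were scanning
        if comm == name:
            matches.append((pid, state))
    return matches


if __name__ == "__main__":
    found = find_processes("pppd")
    if not found:
        print("pppd is not running")
    for pid, state in found:
        print(pid, state, STATE_NAMES.get(state, "other"))
```

If a pppd process is listed, its PID is what a debugger or strace would be attached to; an empty result means the daemon itself has exited rather than just the interface going away.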
Comment 6 Bruno Wolff III 2006-07-19 06:25:48 EDT
Created attachment 132671
strace of pppd

I finally got an strace of pppd. I X'd out a prefix for my dialup password in
one of the writes, but otherwise the file is intact.
Comment 7 Bruno Wolff III 2006-08-12 12:06:25 EDT
I am still seeing the ppp daemon crash. Is there anything else that would be
helpful for isolating what is happening?
P.S. I am not sure I checked the "I am providing the requested information for
this bug." last time, but am now as the status stayed 'NEEDINFO' after I
uploaded the strace data.
Comment 8 Bruno Wolff III 2006-08-12 12:08:45 EDT
Apparently I had something else checked or that check box didn't do what I
thought. The ticket should be still open as the problem continues to happen.
Comment 9 Bruno Wolff III 2007-06-22 10:44:43 EDT
So far I haven't been seeing this in Fedora 7. Occasionally the connection hangs,
but the pppd process seems to stay usable if I hang up and reconnect. I haven't
been running F7 very long, so it is possible the problem may just be happening
less often, but based on what I am seeing, I think you can close this when FC5
is EOL'd. (I never ended up testing dial up in FC6, so I don't know if there is
a problem there or not.)
Comment 10 Bruno Wolff III 2007-06-26 11:00:15 EDT
I think I have seen this once now in that I found the pppd process no longer
running. I wasn't actively using the machine when it happened. I don't think
there is any reason it should have stopped running.
Things are still a lot better than with FC5.
Comment 11 Thomas Woerner 2007-08-30 04:43:51 EDT
Please verify this with a newer version of Red Hat Enterprise Linux or
Fedora Core and reopen it against the new version if there is still a 
problem.

Closing as "WONTFIX" for now.
Comment 12 Bruno Wolff III 2007-08-30 11:25:51 EDT
I am pretty sure the problem I was having here is fixed in F7. I have only seen
the one possible recurrence and I am not even sure about that one. The common
triggering problem that occurred previously is still occurring. My ISP will stop
sending packets to my modem sometimes. But now when that happens ppp stays up,
whereas before it almost always died. The communication problem is not going to
be isolatable as to whose fault it is so there is no point in opening a ticket
versus Fedora for it. (My suspicion is that the connection isn't resynching
sometimes after retrains.)
