Bug 1008902 - ptp: phc2sys sys offset suddenly increasing very large
Summary: ptp: phc2sys sys offset suddenly increasing very large
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: linuxptp
Version: 6.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Jiri Benc
QA Contact: Dong Zhu
URL:
Whiteboard:
Duplicates: 1011356 1011363 1011367 1011368
Depends On:
Blocks:
Reported: 2013-09-17 10:12 UTC by Dong Zhu
Modified: 2014-03-17 01:48 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-24 09:47:46 UTC
Target Upstream Version:
Embargoed:


Attachments
ptp4l and phc2sys logs from slave (58.54 KB, application/x-gzip), 2013-09-18 05:17 UTC, Dong Zhu
Comment (113.77 KB, text/plain), 2013-09-17 10:12 UTC, Dong Zhu

Description Dong Zhu 2013-09-17 10:12:03 UTC
Created attachment 915768
Comment

(This comment was longer than 65,535 characters and has been moved to an attachment by Red Hat Bugzilla).

Comment 1 Jiri Benc 2013-09-17 10:21:40 UTC
Could you attach the full ptp4l log from the slave?

Comment 3 Dong Zhu 2013-09-18 05:17:15 UTC
Created attachment 799074
ptp4l and phc2sys logs from slave

Comment 4 Miroslav Lichvar 2013-09-18 07:52:53 UTC
In the phc2sys log the ~35 second offset appears around second 1926, but in the ptp4l log there doesn't seem to be anything interesting around that second. This looks like some other process may be setting the system clock.

Any chance there is ntpd running or is ntpdate/hwclock/rdate called periodically?
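
One way to locate such a jump in a phc2sys log is to scan for consecutive offset samples that differ by more than a threshold. The following is a minimal sketch, not the method used in this bug. It assumes log lines of the form `phc2sys[<seconds>]: sys offset <ns> ...`; the function name `find_offset_jumps`, the 1 s threshold, and the sample lines are all hypothetical and are not taken from the attached logs.

```python
import re

# Assumed phc2sys line layout; adjust the pattern if the real log differs.
LINE_RE = re.compile(
    r"phc2sys\[(?P<ts>\d+\.\d+)\]: (?:sys|phc) offset\s+(?P<offset>-?\d+)"
)


def find_offset_jumps(lines, threshold_ns=1_000_000_000):
    """Return (timestamp, previous_offset, offset) for every sample whose
    offset differs from the previous sample by more than threshold_ns."""
    jumps = []
    prev = None
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip lines that are not offset reports
        ts = float(m.group("ts"))
        offset = int(m.group("offset"))
        if prev is not None and abs(offset - prev) > threshold_ns:
            jumps.append((ts, prev, offset))
        prev = offset
    return jumps


# Hypothetical sample lines, invented for illustration only.
SAMPLE = [
    "phc2sys[1924.101]: sys offset 12 s2 freq -1500 delay 2700",
    "phc2sys[1925.101]: sys offset -8 s2 freq -1506 delay 2712",
    "phc2sys[1926.102]: sys offset 35000000123 s2 freq +100000 delay 2698",
    "phc2sys[1927.102]: sys offset 35000000050 s2 freq +100000 delay 2705",
]

if __name__ == "__main__":
    for ts, before, after in find_offset_jumps(SAMPLE):
        print(f"jump at {ts}: {before} ns -> {after} ns")
```

A jump that shows up here with no matching event at the same timestamp in the ptp4l log points to something other than PTP adjusting the system clock.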

Comment 6 Miroslav Lichvar 2013-09-18 08:46:46 UTC
Ok, ntpd probably just stepped the clock after its stepout interval (900 seconds).

But there is a strange offset in the slave ptp4l log at 1438. Is the master running ntpd? Is the PHC synchronized by phc2sys from the system clock?

Also, any explanation why the master is dropping out? Was it restarted or blocked by the firewall?

Comment 7 Jiri Benc 2013-09-18 09:11:46 UTC
(In reply to Miroslav Lichvar from comment #6)
> But there is a strange offset in the slave ptp4l log at 1438.

Looking at the three logs, it seems likely that at that time the Sync message was delayed by a switch by 1.7 ms, the subsequent Delay_Req and Delay_Resp messages were delayed by an unknown amount of time, and no further traffic got through until ~21 seconds later.

Are you doing anything with the communication path (switch, network cables, etc.) during the testing?
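
To illustrate why a delayed Sync message appears as an offset on the slave, here is a small worked sketch of the standard PTP delay request-response arithmetic. The function name and all timestamps are made up for illustration (only the 1.7 ms figure comes from the comment above), and the second calculation assumes the slave keeps using a path delay estimated before the disturbance, which may not match what ptp4l does in this version.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Delay request-response calculation, all values in nanoseconds.

    t1: Sync sent by master        t2: Sync received by slave
    t3: Delay_Req sent by slave    t4: Delay_Req received by master
    """
    delay = ((t2 - t1) + (t4 - t3)) // 2  # mean path delay
    offset = (t2 - t1) - delay            # slave clock minus master clock
    return offset, delay


# Hypothetical baseline: clocks in sync, 5 us path delay in each direction.
print(offset_and_delay(0, 5_000, 10_000, 15_000))

# Same clocks, but the Sync is held back an extra 1.7 ms by the switch.
EXTRA_NS = 1_700_000
print(offset_and_delay(0, 5_000 + EXTRA_NS, 10_000 + EXTRA_NS, 15_000 + EXTRA_NS))

# If the slave instead reuses the earlier 5 us path delay estimate
# (an assumption), the whole extra delay shows up as offset.
print((5_000 + EXTRA_NS) - 5_000)
```

In both disturbed cases the clocks themselves have not moved; the reported offset is purely an artifact of the asymmetric delay on the network path.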

Comment 9 Jiri Benc 2013-09-18 13:07:56 UTC
Stopping ntpd helped with one of the problems (the one described in comment 4).

The problem described in comment 7 still remains, but it's a separate one. Miroslav captured packets on both machines and the captures confirm my theory (see comment 7). This can be caused by the hardware at the master not sending the frames, the hardware at the slave not receiving the frames properly, or the switch discarding the frames.

Comment 10 Jiri Benc 2013-09-18 13:44:28 UTC
I tried to find out which case (see previous comment) it is, but it seems the problem does not reproduce with a ping running in parallel.

There doesn't seem to be anything wrong with ptp4l/phc2sys. If this is a Linux issue, then the only part that could be at fault is the NIC driver. I rather suspect a hardware problem, though.

Could you try with a different switch? Or with a master running on a different NIC?

Comment 15 Jiri Benc 2013-09-24 08:20:05 UTC
Thanks for doing the testing. I'm very much inclined to say this was a problem with the switch and its handling of multicast packets. The only thing preventing me from saying for sure that this is not a RHEL bug is Jimmy Pan reproducing the problem with igb cards and a Cisco Catalyst 3750 switch.

I'll try a few things with a modified linuxptp.

Comment 16 Jiri Benc 2013-09-24 09:02:32 UTC
For the record, I cannot reproduce it anymore on the machines that showed the problem originally.

Comment 17 Jiri Benc 2013-09-24 09:38:04 UTC
For the record, Jimmy Pan experienced the problem on the same machines.

Comment 19 Jiri Benc 2013-10-01 09:14:14 UTC
*** Bug 1011356 has been marked as a duplicate of this bug. ***

Comment 20 Jiri Benc 2013-10-01 09:14:17 UTC
*** Bug 1011367 has been marked as a duplicate of this bug. ***

Comment 21 Jiri Benc 2013-10-01 09:14:43 UTC
*** Bug 1011363 has been marked as a duplicate of this bug. ***

Comment 22 Jiri Benc 2013-10-01 09:14:46 UTC
*** Bug 1011368 has been marked as a duplicate of this bug. ***

