Bug 433403

Summary: ntp can not set time from gpsd
Product: [Fedora] Fedora
Reporter: David <webmaster>
Component: gpsd
Assignee: Douglas E. Warner <silfreed>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Priority: low
Version: 8
CC: mlichvar
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2008-02-28 12:39:36 UTC
Attachments:
Description                                            Flags
print the ntpd shared memory                           none
/usr/sbin/ntpd -n -D 4 &> log                          none
New /usr/sbin/ntpd -n -D 4 &> log with new gpsd rpm    none

Description David 2008-02-19 02:11:54 UTC
Description of problem:
It seems gpsd is once again unable to pass the time to ntpd.  There are no
messages in /var/log/audit, but it's no longer working...

I suspect something in either the kernel or ntp has changed to stop shared
memory access.

I have confirmed gpsd is running on port 2947; every query returns time and
position, and it even reports a PPS lock.

I have been running this on FC6, F7 and F8 and gpsd is untouched, so it's the
kernel or ntp that has changed.


Comment 1 Miroslav Lichvar 2008-02-19 12:22:57 UTC
How exactly is it not working? Is it only PPS that doesn't work? What does
ntpq -pn print? Any unusual messages in syslog?
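
When reading ntpq -pn output, the reach column is an 8-bit shift register
printed in octal: 377 means the last eight polls all succeeded, 0 means none
did. A minimal sketch of decoding it, assuming the usual ten-column peer
layout; the helper names decode_reach and parse_peer_line are hypothetical and
not part of ntp:

```python
def decode_reach(reach_octal):
    """Decode ntpq's octal reach field into eight booleans, newest poll first."""
    value = int(reach_octal, 8)
    return [bool((value >> bit) & 1) for bit in range(8)]


def parse_peer_line(line):
    """Split one ntpq peer line into tally code, remote, refid and reach.

    Assumes the columns: remote refid st t when poll reach delay offset jitter.
    """
    tally = line[0]  # ' ', '*', '+', '-', 'x', ...
    fields = line[1:].split()
    return {
        "tally": tally,
        "remote": fields[0],
        "refid": fields[1],
        "reach": fields[6],
    }


if __name__ == "__main__":
    sample = " 127.127.28.0    .GPS.            0 l    -   16    0    0.000    0.000   0.001"
    peer = parse_peer_line(sample)
    good_polls = sum(decode_reach(peer["reach"]))
    print(peer["remote"], "successful polls in the last 8:", good_polls)
```

A refclock line whose reach stays at 0 means ntpd has not accepted a single
sample from that source, regardless of what the source itself reports.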

Comment 2 David 2008-02-19 21:57:22 UTC
No, it's completely stopped, as if I had stopped gpsd.  I had PPS and everything
running perfectly.  I am using a custom gpsd that uses Motorola binary mode, and
it's been untouched since the start of F7.

[root@primary ~]# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.28.0    .GPS.            0 l    -   16    0    0.000    0.000   0.001
 127.127.28.1    .PPS.            0 l    -   16    0    0.000    0.000   0.001
+60.240.81.28    121.237.90.225   2 u   22  512  377   50.256   26.834  14.359
+202.81.208.37   216.176.180.82   3 u   64  512  377   73.170    6.010   0.718
*203.19.252.1    203.35.83.242    2 u  196  512  377   35.464    5.624   0.317
 192.168.0.255   .BCST.          16 u    -   64    0    0.000    0.000   0.001
 10.255.255.255  .BCST.          16 u    -   64    0    0.000    0.000   0.001
 224.0.1.1       .MCST.          16 u    -   64    0    0.000    0.000   0.001
[root@primary ~]#

It's not gpsd; it's running perfectly:
(Telnet to 2947)
a
GPSD,A=178.300
GPSD
d
GPSD,D=2008-02-20T21:47:47.00Z
GPSD
m
GPSD,M=3
GPSD
o
GPSD,O=GGA 1203544096.000 0.005 -33.741387 151.000045 178.40 ? ? 333.4000 0.000
0.000 ? 30.40 ? 3
GPSD
p
GPSD,P=-33.741387 151.000043
GPSD
q
GPSD,Q=8 1.58 1.90 0.68 0.70 1.73
s
GPSD,S=1

Basically it's got a fix, outputs the time, and shows an 8-satellite fix.


I also can't see anything wrong (at all).  gpsd is running and has a fix.  It's
as if the shared memory is being blocked, or ntpd just refuses to look at the
GPS.

I tried rebooting and disabling SELinux. I even fired up gpsd in debug mode, and
/var/log/messages shows all the data perfectly.

The problem is I don't know exactly when this went south.  It is certainly
possible it was more than 2 kernels ago, as F8 had two kernels relatively close
together.  However, nothing has been touched at all, apart from package updates.
Since I am using a custom gpsd, gpsd itself can easily be eliminated as the
cause.

Has anything changed in the kernel to stop shared memory access?  SELinux was
updated to fix this way back (I reported the bug).

The only other things I have added to the server are mod_sec and rkhunter, but
these are more for Apache and logins; nothing there prevents shared memory...



Comment 3 Miroslav Lichvar 2008-02-20 14:38:17 UTC
Well, it works for me. I've tried kernel-2.6.23.15-137.fc8 and
kernel-2.6.23.8-63.fc8; both work OK.

Can you please try using an older kernel to be sure it's not a kernel problem?

Comment 4 Miroslav Lichvar 2008-02-20 15:01:31 UTC
You can verify that both ntpd and gpsd are using the same shared memory segments by:

ipcs | grep 0x4e54503[0-1]

where the 6th column (nattch, the number of attached processes) should be 2.
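
The keys in that grep pattern are 0x4e545030 plus the SHM unit number; the
bytes 4e 54 50 30 spell "NTP0" in ASCII. A minimal sketch of the same check in
script form, assuming the shared memory section of the ipcs listing (as printed
by ipcs -m) with the columns key, shmid, owner, perms, bytes, nattch. The
function name is hypothetical, and only units 0 and 1 are considered, to match
the grep above:

```python
NTP_SHM_BASE = 0x4E545030  # ASCII "NTP0"; unit N uses NTP_SHM_BASE + N


def ntp_shm_attach_counts(ipcs_output, units=(0, 1)):
    """Map SHM unit number -> nattch for the NTP segments found in ipcs output."""
    counts = {}
    for line in ipcs_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("0x"):
            continue  # header, blank line, or a row with too few columns
        try:
            key = int(fields[0], 16)
            nattch = int(fields[5])
        except ValueError:
            continue
        unit = key - NTP_SHM_BASE
        if unit in units:
            counts[unit] = nattch
    return counts


if __name__ == "__main__":
    listing = (
        "0x4e545030 32768      root      700        80         2\n"
        "0x4e545031 65537      root      700        80         2\n"
    )
    for unit, nattch in sorted(ntp_shm_attach_counts(listing).items()):
        state = "ok" if nattch == 2 else "check that both ntpd and gpsd are attached"
        print("SHM(%d): nattch=%d (%s)" % (unit, nattch, state))
```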

Comment 5 David 2008-02-20 21:58:18 UTC
I have kernels -137 and -115 on the server (but I can't be sure it ever worked
with either of these; the last working kernel may be the one before).

[root@primary ~]# yum list kernel
Excluding Packages from Atomicorp - 8 - Atomic Secured Linux 2.0
Finished
Installed Packages
kernel.i686                              2.6.23.15-137.fc8      installed
kernel.i686                              2.6.23.14-115.fc8      installed
Available Packages
kernel.i586                              2.6.23.15-137.fc8      updates
[root@primary ~]#


The output does show they are attached to the same shared memory segments:

[root@primary ~]# ipcs | grep 0x4e54503[0-1]
0x4e545030 32768      root      700        80         2
0x4e545031 65537      root      700        80         2
[root@primary ~]#

It's getting very strange :(  as it's a gpsd built from custom source code and
has not been touched since the day F8 was installed and I compiled it.

Again, I confirm it is running (Feb 19th was my last reboot trying to fix this):

[root@primary ~]# ps aux | grep gpsd
nobody    2807  0.1  0.0  13944  1292 ?        S<sl Feb19   2:28
/usr/local/sbin/gpsd -n /dev/ttyS0
root     18562  0.0  0.0   4048   688 pts/2    S+   08:55   0:00 grep gpsd
[root@primary ~]#

Also, gpsd does confirm it's got a valid fix (the GPS itself is physically
working) and all the data is good.

For some reason ntpd just won't look at it...

Comment 6 Miroslav Lichvar 2008-02-21 15:21:40 UTC
Created attachment 295519
print the ntpd shared memory

Ok, please compile the attached code and start it. If the count doesn't change,
it means gpsd doesn't write to the segment. If valid never changes to 0, ntpd
doesn't read the values.
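
The two checks above can also be applied to a captured run of the test program.
A minimal sketch, assuming output lines of the form
"count: N, valid: N, ts: N"; the function name and result keys are
hypothetical:

```python
import re

SAMPLE_RE = re.compile(r"count:\s*(\d+),\s*valid:\s*(\d+),\s*ts:\s*(\d+)")


def diagnose(output):
    """Summarise a capture: is the segment being written, and is it being read?"""
    samples = [tuple(int(g) for g in m.groups()) for m in SAMPLE_RE.finditer(output)]
    distinct_counts = {count for count, _valid, _ts in samples}
    return {
        "samples": len(samples),
        # count changing between samples -> the writer (gpsd) is updating the segment
        "writer_active": len(distinct_counts) > 1,
        # valid seen at 0 -> the reader (ntpd) consumed a sample and cleared the flag
        "reader_active": any(valid == 0 for _count, valid, _ts in samples),
    }


if __name__ == "__main__":
    capture = (
        "count: 460186, valid: 1, ts: 1203647354\n"
        "count: 460188, valid: 1, ts: 1203647355\n"
        "count: 460188, valid: 0, ts: 1203647355\n"
    )
    print(diagnose(capture))
```

Both flags being true only shows that the segment is written and read; it says
nothing about whether ntpd accepts the values it reads.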

Comment 7 David 2008-02-22 02:31:56 UTC
Okay, I compiled it and ran it. I do see valid change from 1 to 0 about every
15 to 18 lines.


[root@primary ~]# ./ntpd_test
count: 460174, valid: 1, ts: 1203647348
count: 460176, valid: 1, ts: 1203647349
count: 460178, valid: 1, ts: 1203647350
count: 460180, valid: 1, ts: 1203647351
count: 460182, valid: 1, ts: 1203647352
count: 460184, valid: 1, ts: 1203647353
count: 460186, valid: 1, ts: 1203647354
count: 460188, valid: 1, ts: 1203647355
count: 460188, valid: 0, ts: 1203647355
count: 460190, valid: 1, ts: 1203647356
count: 460192, valid: 1, ts: 1203647357
count: 460194, valid: 1, ts: 1203647358
count: 460196, valid: 1, ts: 1203647359
count: 460198, valid: 1, ts: 1203647360
count: 460200, valid: 1, ts: 1203647361
count: 460202, valid: 1, ts: 1203647362
count: 460204, valid: 1, ts: 1203647363
count: 460206, valid: 1, ts: 1203647364
count: 460208, valid: 1, ts: 1203647365
count: 460210, valid: 1, ts: 1203647366
count: 460212, valid: 1, ts: 1203647367
count: 460214, valid: 1, ts: 1203647368
count: 460216, valid: 1, ts: 1203647369
count: 460218, valid: 1, ts: 1203647370
count: 460220, valid: 1, ts: 1203647371
count: 460222, valid: 1, ts: 1203647372
count: 460222, valid: 0, ts: 1203647372
count: 460224, valid: 1, ts: 1203647373
count: 460226, valid: 1, ts: 1203647374
count: 460228, valid: 1, ts: 1203647375
count: 460230, valid: 1, ts: 1203647376
count: 460232, valid: 1, ts: 1203647377
count: 460234, valid: 1, ts: 1203647378
count: 460236, valid: 1, ts: 1203647379
count: 460238, valid: 1, ts: 1203647380
count: 460240, valid: 1, ts: 1203647381
count: 460242, valid: 1, ts: 1203647382
count: 460244, valid: 1, ts: 1203647383
count: 460246, valid: 1, ts: 1203647384
count: 460248, valid: 1, ts: 1203647385
count: 460250, valid: 1, ts: 1203647386
count: 460252, valid: 1, ts: 1203647387
count: 460252, valid: 0, ts: 1203647387
count: 460254, valid: 1, ts: 1203647388
count: 460256, valid: 1, ts: 1203647389
count: 460258, valid: 1, ts: 1203647390
count: 460260, valid: 1, ts: 1203647391
count: 460262, valid: 1, ts: 1203647392
count: 460264, valid: 1, ts: 1203647393
count: 460266, valid: 1, ts: 1203647394
count: 460268, valid: 1, ts: 1203647395
count: 460270, valid: 1, ts: 1203647396
count: 460272, valid: 1, ts: 1203647397
count: 460274, valid: 1, ts: 1203647398
count: 460276, valid: 1, ts: 1203647399
count: 460278, valid: 1, ts: 1203647400
count: 460280, valid: 1, ts: 1203647401
count: 460282, valid: 1, ts: 1203647402
count: 460282, valid: 0, ts: 1203647402
count: 460284, valid: 1, ts: 1203647403
count: 460286, valid: 1, ts: 1203647404
count: 460288, valid: 1, ts: 1203647405
count: 460290, valid: 1, ts: 1203647406
count: 460292, valid: 1, ts: 1203647407
count: 460294, valid: 1, ts: 1203647408
count: 460296, valid: 1, ts: 1203647409
count: 460298, valid: 1, ts: 1203647410
count: 460300, valid: 1, ts: 1203647411
count: 460302, valid: 1, ts: 1203647412
count: 460304, valid: 1, ts: 1203647413
count: 460306, valid: 1, ts: 1203647414
count: 460308, valid: 1, ts: 1203647415
count: 460310, valid: 1, ts: 1203647416
count: 460312, valid: 1, ts: 1203647417
count: 460314, valid: 1, ts: 1203647418
count: 460316, valid: 1, ts: 1203647419
count: 460316, valid: 0, ts: 1203647419
count: 460318, valid: 1, ts: 1203647420
count: 460320, valid: 1, ts: 1203647421
count: 460322, valid: 1, ts: 1203647422

[root@primary ~]#


Comment 8 Miroslav Lichvar 2008-02-22 08:38:42 UTC
It doesn't look like a problem with shared memory then. Can you please try the
latest gpsd release?

Comment 9 David 2008-02-23 08:29:22 UTC
I did it and it's still the same...

I did not uninstall my compiled copy, as it lives in /usr/local/sbin/gpsd.

So I ran yum install gpsd.

It's placed into /usr/sbin/gpsd.

Now I run it in debug mode, on ttyS0:

[root@primary ~]# /usr/sbin/gpsd -D5 /dev/ttyS0 -n
[root@primary ~]# /etc/init.d/ntpd restart
Shutting down ntpd:                                        [  OK  ]
Starting ntpd:                                             [  OK  ]
[root@primary ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 SHM(0)          .GPS.            0 l    -   16    0    0.000    0.000   0.001
 SHM(1)          .PPS.            0 l    -   16    0    0.000    0.000   0.001
 cachens1.onqnet 204.152.184.72   2 u    2   64    1   32.310  -42.583   0.001
 wireless.org.au 128.250.36.2     2 u    2   64    1   30.827  -17.544   0.001
 60-242-56-98.st 128.250.33.242   2 u    1   64    1   41.339   -8.064   0.001
 192.168.0.255   .BCST.          16 u    -   64    0    0.000    0.000   0.001
 10.255.255.255  .BCST.          16 u    -   64    0    0.000    0.000   0.001
 NTP.MCAST.NET   .MCST.          16 u    -   64    0    0.000    0.000   0.001
[root@primary ~]# ./ntpd_test
count: 76, valid: 1, ts: 1203754848
count: 78, valid: 1, ts: 1203754849
count: 80, valid: 1, ts: 1203754850
count: 82, valid: 1, ts: 1203754851
count: 84, valid: 1, ts: 1203754852
count: 86, valid: 1, ts: 1203754853
count: 88, valid: 1, ts: 1203754854
count: 88, valid: 0, ts: 1203754854
count: 90, valid: 1, ts: 1203754855
count: 92, valid: 1, ts: 1203754856
count: 94, valid: 1, ts: 1203754857
count: 96, valid: 1, ts: 1203754858
count: 98, valid: 1, ts: 1203754859
count: 100, valid: 1, ts: 1203754860

[root@primary ~]#
[root@primary ~]# yum list gpsd
Excluding Packages from Atomicorp - 8 - Atomic Secured Linux 2.0
Finished
Installed Packages
gpsd.i386                                2.34-8.fc8             installed
[root@primary ~]#

And just to prove it further, here is the /var/log/messages data showing it is
running:

$GPGGA,082533.00,3344.4866,S,15100.0004,E,1,08,1.9,178.2,M,,M,,*61#015
Feb 23 19:25:33 primary gpsd[16712]: gpsd: GPGGA sets status 1
Feb 23 19:25:33 primary gpsd[16712]: gpsd: <= GPS:
$GPVTG,317.4,T,,M,0.1,N,0.2,K*62#015
Feb 23 19:25:34 primary gpsd[16712]: gpsd: <= GPS:
$GPRMC,082534.00,A,3344.4865,S,15100.0004,E,0.1,315.0,230208,,*25#015
Feb 23 19:25:34 primary gpsd[16712]: gpsd: <= GPS:
$GPGGA,082534.00,3344.4865,S,15100.0004,E,1,08,1.9,178.4,M,,M,,*63#015
Feb 23 19:25:34 primary gpsd[16712]: gpsd: GPGGA sets status 1
Feb 23 19:25:34 primary gpsd[16712]: gpsd: <= GPS:
$GPGSA,A,3,,,17,09,05,02,04,12,,,,,3.3,1.9,2.7*30#015
Feb 23 19:25:34 primary gpsd[16712]: gpsd: GPGSA sets mode 3
Feb 23 19:25:34 primary gpsd[16712]: gpsd: <= GPS:
$GPVTG,315.0,T,,M,0.1,N,0.2,K*64#015
Feb 23 19:25:34 primary gpsd[16712]: gpsd: <= GPS:
$GPGSV,3,1,10,02,74,022,43,04,57,120,47,05,26,227,44,09,52,285,45*7C#015
Feb 23 19:25:34 primary gpsd[16712]: gpsd: Partial satellite data (1 of 3).
Feb 23 19:25:34 primary gpsd[16712]: gpsd: <= GPS:
$GPGSV,3,2,10,10,04,016,37,12,42,221,45,17,16,117,41,28,03,060,*76#015
Feb 23 19:25:34 primary gpsd[16712]: gpsd: Partial satellite data (2 of 3).
Feb 23 19:25:34 primary gpsd[16712]: gpsd: <= GPS:
$GPGSV,3,3,10,29,01,280,,30,10,230,39*71#015
Feb 23 19:25:34 primary gpsd[16712]: gpsd: Satellite data OK.
Feb 23 19:25:35 primary gpsd[16712]: gpsd: <= GPS:
$GPRMC,082535.00,A,3344.4865,S,15100.0004,E,0.1,306.0,230208,,*26#015
Feb 23 19:25:35 primary gpsd[16712]: gpsd: <= GPS:
$GPGGA,082535.00,3344.4865,S,15100.0004,E,1,08,1.9,178.4,M,,M,,*62#015
Feb 23 19:25:35 primary gpsd[16712]: gpsd: GPGGA sets status 1
Feb 23 19:25:35 primary gpsd[16712]: gpsd: <= GPS:
$GPVTG,306.0,T,,M,0.1,N,0.1,K*65#015
Feb 23 19:25:36 primary gpsd[16712]: gpsd: <= GPS:
$GPRMC,082536.00,A,3344.4864,S,15100.0004,E,0.0,225.0,230208,,*25#015
Feb 23 19:25:36 primary gpsd[16712]: gpsd: <= GPS:
$GPGGA,082536.00,3344.4864,S,15100.0004,E,1,08,1.9,178.5,M,,M,,*61#015
Feb 23 19:25:36 primary gpsd[16712]: gpsd: GPGGA sets status 1
Feb 23 19:25:36 primary gpsd[16712]: gpsd: <= GPS:
$GPVTG,225.0,T,,M,0.0,N,0.0,K*65#015


So why does ntp just ignore the gpsd?  Is there a way of forcing ntp to try to
use it, or get some debug out of it?

Thanks..

Comment 10 Miroslav Lichvar 2008-02-25 09:49:16 UTC
Please also try the latest upstream release (2.37). Debugging for ntpd can be
enabled by adding -D 4 to /etc/sysconfig/ntpd; a higher number means more
debugging output.

Comment 11 Miroslav Lichvar 2008-02-25 09:55:57 UTC
An even better way to debug ntpd is to start it directly in a terminal as
/usr/sbin/ntpd -n -D 4 &> log.

Comment 12 David 2008-02-25 11:13:03 UTC
I tried debugging ntpd first, since gpsd was working unless something has been
changed to stop it.

If the log points to a gpsd issue, I will compile 2.37.

Attached is the dump log of /usr/sbin/ntpd -n -D 4 &> log

It ran for about 2 minutes.



Comment 13 David 2008-02-25 11:14:26 UTC
Created attachment 295786
/usr/sbin/ntpd -n -D 4 &> log

Comment 14 Miroslav Lichvar 2008-02-25 11:55:03 UTC
From the log it looks like gpsd is not putting valid values in the segment.

I can prepare a patch which would allow us to see what exactly is wrong, but try
the 2.37 release first, as the SVN changelog mentions a few bugs that could be
related to this.

Comment 15 David 2008-02-25 21:51:45 UTC
I can't seem to build the rpm; I am getting an error I can't track down:

[root@primary gpsd-2.37]# rpmbuild -bp gpsd.spec
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.73842
+ umask 022
+ cd /usr/src/redhat/BUILD
+ LANG=C
+ export LANG
+ unset DISPLAY
+ $'\r'
: command not found842: line 25:
error: Bad exit status from /var/tmp/rpm-tmp.73842 (%prep)


RPM build errors:
    Bad exit status from /var/tmp/rpm-tmp.73842 (%prep)
[root@primary gpsd-2.37]#



Could you possibly build the rpm and attach it here, so I can install it and
test gpsd 2.37?

Thanks again!


Comment 16 David 2008-02-25 21:59:30 UTC
I seem to be getting random errors: the error keeps changing, but not the line
number.

If you are able to build it, perhaps also put it into fedora-updates for F7
and F8, as they still contain 2.34-8 from years ago...


[root@primary gpsd-2.37]# rpmbuild -ba gpsd.spec
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.70625
+ umask 022
+ cd /usr/src/redhat/BUILD
+ LANG=C
+ export LANG
+ unset DISPLAY
+ $'\r'
: command not found625: line 25:
error: Bad exit status from /var/tmp/rpm-tmp.70625 (%prep)


RPM build errors:
    Bad exit status from /var/tmp/rpm-tmp.70625 (%prep)
[root@primary gpsd-2.37]# rpmbuild -ba gpsd.spec
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.67395
+ umask 022
+ cd /usr/src/redhat/BUILD
+ LANG=C
+ export LANG
+ unset DISPLAY
+ $'\r'
: command not found395: line 25:
error: Bad exit status from /var/tmp/rpm-tmp.67395 (%prep)


RPM build errors:
    Bad exit status from /var/tmp/rpm-tmp.67395 (%prep)
[root@primary gpsd-2.37]#


Comment 17 Miroslav Lichvar 2008-02-26 17:54:01 UTC
Ok, a build of 2.37 is here: http://people.redhat.com/mlichvar/tmp/gpsd/


Comment 18 David 2008-02-27 08:36:46 UTC
Created attachment 296034
New /usr/sbin/ntpd -n -D 4 &> log with new gpsd rpm

Okay, success!  This is excellent.  I think I have some stability issues with
the GPS as far as the NMEA data goes, but the PPS looks good...

Is there any way of getting a final build of the gpsd 2.37 rpm?

xSHM(0)          .GPS.            0 l    4   16  377    0.000  -31.842  39.526
*SHM(1)          .PPS.            0 l   16   16  377    0.000    1.488   1.622
-ns2.jdcomputers 192.36.143.234   2 u   25   64  377   39.849  209.148  91.430
+60-240-81-28.st 17.248.59.230    2 u   25   64  377   49.718  214.878   9.711
+ntp.tourism.wa. 130.95.179.80    2 u   19   64  377   72.594  212.630  13.485
 192.168.0.255   .BCST.          16 u    -   64    0    0.000    0.000   0.001
 10.255.255.255  .BCST.          16 u    -   64    0    0.000    0.000   0.001
 NTP.MCAST.NET   .MCST.          16 u    -   64    0    0.000    0.000   0.001

Comment 19 David 2008-02-27 08:51:23 UTC
The only problem now is that the PPS keeps dropping in and out...

==============================================================================
xSHM(0)          .GPS.            0 l   14   16  377    0.000  -36.180  36.970
*SHM(1)          .PPS.            0 l   17   16  377    0.000   -2.334   0.431
-ns2.jdcomputers 192.36.143.234   2 u   34   64  377   37.657  187.019  10.626
+60-240-81-28.st 17.248.59.230    2 u   26   64  377   49.128  192.234   6.081
+ntp.tourism.wa. 130.95.179.80    2 u   25   64  377   72.709  190.056   1.608
 192.168.0.255   .BCST.          16 u    -   64    0    0.000    0.000   0.001
 10.255.255.255  .BCST.          16 u    -   64    0    0.000    0.000   0.001
 NTP.MCAST.NET   .MCST.          16 u    -   64    0    0.000    0.000   0.001
[root@primary ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
xSHM(0)          .GPS.            0 l    1   16  377    0.000  -101.29  58.642
xSHM(1)          .PPS.            0 l    4   16  377    0.000   -1.649   0.482
+ns2.jdcomputers 192.36.143.234   2 u   38   64  377   37.657  187.019  10.626
+60-240-81-28.st 17.248.59.230    2 u   30   64  377   49.128  192.234   6.081
*ntp.tourism.wa. 130.95.179.80    2 u   29   64  377   72.709  190.056   1.608
 192.168.0.255   .BCST.          16 u    -   64    0    0.000    0.000   0.001
 10.255.255.255  .BCST.          16 u    -   64    0    0.000    0.000   0.001
 NTP.MCAST.NET   .MCST.          16 u    -   64    0    0.000    0.000   0.001
[root@primary ~]#


Comment 20 Miroslav Lichvar 2008-02-27 09:07:17 UTC
Reassigning to gpsd component.

Comment 21 David 2008-02-27 11:36:59 UTC
Once again, thanks so much for your fantastic help and assistance! Any idea why
I could not build the rpm?  It would be handy to be able to package it in case I
want to tweak the source files.

Comment 22 Miroslav Lichvar 2008-02-27 11:58:54 UTC
Well, it looks like your spec file is in DOS format; you need to run dos2unix on it.
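
The $'\r' in the rpmbuild trace is the symptom: each line of the spec ends in
CR LF, and the shell treats the stray carriage return as a command name. A
minimal sketch of detecting and stripping those line endings, roughly what
dos2unix does for this case; the function names are hypothetical:

```python
def has_crlf(data):
    """Return True if the byte string contains DOS (CR LF) line endings."""
    return b"\r\n" in data


def to_unix(data):
    """Convert CR LF line endings to LF, leaving everything else untouched."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path):
    """Rewrite the file in place if it has CR LF endings; return True if changed."""
    with open(path, "rb") as handle:
        data = handle.read()
    if not has_crlf(data):
        return False
    with open(path, "wb") as handle:
        handle.write(to_unix(data))
    return True


if __name__ == "__main__":
    spec = b"%prep\r\n%setup -q\r\n"
    print(has_crlf(spec), to_unix(spec))
```

Reading and writing in binary mode matters here, since text mode may translate
line endings on its own and hide the problem.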

Comment 23 David 2008-02-28 06:29:15 UTC
Thanks again.  I extracted the tarball again and successfully built the package
with rpmbuild -ba!

Hopefully someone will pick this up and we can work on finishing up gpsd for F8.

In the meantime, FYI: I installed the package and the timing seems to have
settled down, with the PPS locked steadily (*), the NMEA GPS marked +, and three
internet servers, two marked - and one marked +.

My server has now gone back to stratum 1, as seen on my LAN.


Comment 24 Douglas E. Warner 2008-02-28 12:39:36 UTC

*** This bug has been marked as a duplicate of 243026 ***