Description of problem: ptp slaves lose synchronization Version-Release number of selected component (if applicable): Nic information ethtool -i ens7f0 driver: i40e version: 2.8.20-k firmware-version: 7.10 0x800075e6 19.5.12 expansion-rom-version: bus-info: 0000:88:00.0 supports-statistics: yes supports-test: yes supports-eeprom-access: yes supports-register-dump: yes supports-priv-flags: yes sofware version: [root@cnfde4 /]# ptp4l -v 2.0 [root@cnfde4 /]# phc2sys -v 2.0 Kernel verison: uname -a Linux cnfde4.ptp.lab.eng.bos.redhat.com 4.18.0-193.28.1.rt13.77.el8_2.x86_64 #1 SMP PREEMPT RT Fri Oct 16 14:11:07 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux How reproducible: 100% We are using ptp on top of openshift cluster for time synchronization and I am able to see drops in the logs. The topology is 2 nodes one grandmaster and one slave connected via a switch. Logs from Grandmaster - phc2sys[242399.800]: [ens7f0] CLOCK_REALTIME rms 16 max 16 freq -84548 +/- 0 delay 1015 +/- 0 ptp4l[242400.376]: [ens7f0] master offset 28 s2 freq +2196 path delay 1495 phc2sys[242400.800]: [ens7f0] CLOCK_REALTIME rms 6 max 6 freq -84554 +/- 0 delay 1009 +/- 0 ptp4l[242401.376]: [ens7f0] master offset -21 s2 freq +2156 path delay 1496 phc2sys[242401.800]: [ens7f0] CLOCK_REALTIME rms 1 max 1 freq -84557 +/- 0 delay 1017 +/- 0 phc2sys[242402.801]: [ens7f0] CLOCK_REALTIME rms 0 max -0 freq -84557 +/- 0 delay 1016 +/- 0 phc2sys[242403.801]: [ens7f0] CLOCK_REALTIME rms 18 max 18 freq -84575 +/- 0 delay 1009 +/- 0 phc2sys[242404.801]: [ens7f0] CLOCK_REALTIME rms 20 max 20 freq -84583 +/- 0 delay 1013 +/- 0 phc2sys[242405.801]: [ens7f0] CLOCK_REALTIME rms 0 max -0 freq -84569 +/- 0 delay 1017 +/- 0 phc2sys[242406.801]: [ens7f0] CLOCK_REALTIME rms 0 max -0 freq -84569 +/- 0 delay 1012 +/- 0 phc2sys[242407.801]: [ens7f0] CLOCK_REALTIME rms 5 max 5 freq -84574 +/- 0 delay 1014 +/- 0 phc2sys[242408.801]: [ens7f0] CLOCK_REALTIME rms 9 max 9 freq -84579 +/- 0 delay 1019 +/- 0 ptp4l[242408.920]: [ens7f0] port 1: SLAVE to LISTENING on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES ptp4l[242408.920]: [ens7f0] selected local clock 40a6b7.fffe.0e2b30 as best master phc2sys[242409.801]: [ens7f0] port 40a6b7.fffe.0e2b30-1 changed state phc2sys[242409.802]: [ens7f0] reconfiguring after port state change phc2sys[242409.802]: [ens7f0] selecting ens7f0 for synchronization phc2sys[242409.802]: [ens7f0] nothing to synchronize ptp4l[242415.343]: [ens7f0] selected local clock 40a6b7.fffe.0e2b30 as best master I1115 12:03:14.216277 2863254 main.go:98] ticker pull ptp4l[242423.145]: [ens7f0] selected local clock 40a6b7.fffe.0e2b30 as best master ptp4l[242429.167]: [ens7f0] selected local clock 40a6b7.fffe.0e2b30 as best master ptp4l[242432.964]: [ens7f0] selected best master clock 40a6b7.fffe.0e43d0 ptp4l[242432.964]: [ens7f0] port 1: LISTENING to UNCALIBRATED on RS_SLAVE phc2sys[242433.803]: [ens7f0] port 40a6b7.fffe.0e2b30-1 changed state phc2sys[242433.803]: [ens7f0] reconfiguring after port state change phc2sys[242433.803]: [ens7f0] master clock not ready, waiting... ptp4l[242434.963]: [ens7f0] master offset 577 s2 freq +2747 path delay 1492 ptp4l[242434.963]: [ens7f0] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED phc2sys[242435.804]: [ens7f0] port 40a6b7.fffe.0e2b30-1 changed state phc2sys[242435.804]: [ens7f0] reconfiguring after port state change phc2sys[242435.804]: [ens7f0] selecting CLOCK_REALTIME for synchronization phc2sys[242435.804]: [ens7f0] selecting ens7f0 as the master clock phc2sys[242435.804]: [ens7f0] CLOCK_REALTIME rms 1938345 max 1938345 freq -2022918 +/- 0 delay 1023 +/- 0 ptp4l[242435.963]: [ens7f0] master offset -10 s2 freq +2333 path delay 1492 phc2sys[242436.804]: [ens7f0] CLOCK_REALTIME rms 3472 max 3472 freq -669549 +/- 0 delay 1026 +/- 0 ptp4l[242436.965]: [ens7f0] master offset -167 s2 freq +2173 path delay 1492 Logs from slave: phc2sys[242976.503]: [ens7f0] ens7f0 rms 16 max 16 freq +1215 +/- 0 delay 1349 +/- 0 phc2sys[242977.503]: [ens7f0] ens7f0 rms 17 max 17 freq +1186 +/- 0 delay 1354 +/- 0 phc2sys[242978.503]: [ens7f0] ens7f0 rms 9 max 9 freq +1207 +/- 0 delay 1356 +/- 0 ptp4l[242979.485]: [ens7f0] port 1: MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED) phc2sys[242979.503]: [ens7f0] port 40a6b7.fffe.0e43d0-1 changed state phc2sys[242979.613]: [ens7f0] reconfiguring after port state change phc2sys[242979.613]: [ens7f0] selecting ens7f0 for synchronization phc2sys[242979.613]: [ens7f0] selecting CLOCK_REALTIME as the master clock phc2sys[242979.613]: [ens7f0] ens7f0 rms 11 max 11 freq +1190 +/- 0 delay 1340 +/- 0 phc2sys[242980.613]: [ens7f0] ens7f0 rms 15 max 15 freq +1213 +/- 0 delay 1362 +/- 0 phc2sys[242982.613]: [ens7f0] ens7f0 rms 1 max 1 freq +1201 +/- 0 delay 1360 +/- 0 phc2sys[242983.614]: [ens7f0] ens7f0 rms 18 max 18 freq +1184 +/- 0 delay 1344 +/- 0 phc2sys[242984.614]: [ens7f0] ens7f0 rms 5 max 5 freq +1202 +/- 0 delay 1357 +/- 0 phc2sys[242985.614]: [ens7f0] ens7f0 rms 14 max 14 freq +1212 +/- 0 delay 1365 +/- 0 phc2sys[242986.614]: [ens7f0] ens7f0 rms 1 max 1 freq +1201 +/- 0 delay 1349 +/- 0 phc2sys[242987.614]: [ens7f0] ens7f0 rms 12 max 12 freq +1214 +/- 0 delay 1448 +/- 0 phc2sys[242988.614]: [ens7f0] ens7f0 rms 7 max 7 freq +1199 +/- 0 delay 1352 +/- 0 phc2sys[242989.615]: [ens7f0] ens7f0 rms 18 max 18 freq +1185 +/- 0 delay 1348 +/- 0 phc2sys[242990.615]: [ens7f0] ens7f0 rms 12 max 12 freq +1186 +/- 0 delay 1354 +/- 0 phc2sys[242991.615]: [ens7f0] ens7f0 rms 2 max 2 freq +1196 +/- 0 delay 1366 +/- 0 phc2sys[242992.615]: [ens7f0] ens7f0 rms 22 max 22 freq +1217 +/- 0 delay 1347 +/- 0 phc2sys[242993.615]: [ens7f0] ens7f0 rms 10 max 10 freq +1192 +/- 0 delay 1354 +/- 0 phc2sys[242994.615]: [ens7f0] ens7f0 rms 14 max 14 freq +1213 +/- 0 delay 1343 +/- 0 phc2sys[242995.616]: [ens7f0] ens7f0 rms 3 max 3 freq +1200 +/- 0 delay 1347 +/- 0 phc2sys[242996.616]: [ens7f0] ens7f0 rms 3 max 3 freq +1205 +/- 0 delay 1342 +/- 0 phc2sys[242997.616]: [ens7f0] ens7f0 rms 17 max 17 freq +1186 +/- 0 delay 1351 +/- 0 phc2sys[242998.616]: [ens7f0] ens7f0 rms 5 max 5 freq +1193 +/- 0 delay 1363 +/- 0 phc2sys[242999.616]: [ens7f0] ens7f0 rms 5 max 5 freq +1201 +/- 0 delay 1344 +/- 0 phc2sys[243000.616]: [ens7f0] ens7f0 rms 1 max 1 freq +1197 +/- 0 delay 1351 +/- 0 ptp4l[243000.635]: [ens7f0] port 1: FAULTY to LISTENING on INIT_COMPLETE phc2sys[243001.617]: [ens7f0] ens7f0 rms 32 max 32 freq +1229 +/- 0 delay 1344 +/- 0 phc2sys[243002.617]: [ens7f0] ens7f0 rms 20 max 20 freq +1187 +/- 0 delay 1348 +/- 0 phc2sys[243003.617]: [ens7f0] ens7f0 rms 2 max 2 freq +1199 +/- 0 delay 1341 +/- 0 phc2sys[243004.617]: [ens7f0] ens7f0 rms 5 max 5 freq +1195 +/- 0 delay 1358 +/- 0 phc2sys[243005.617]: [ens7f0] ens7f0 rms 9 max 9 freq +1190 +/- 0 delay 1375 +/- 0 phc2sys[243006.617]: [ens7f0] ens7f0 rms 33 max 33 freq +1229 +/- 0 delay 1344 +/- 0 phc2sys[243007.618]: [ens7f0] ens7f0 rms 2 max 2 freq +1208 +/- 0 delay 1344 +/- 0 ptp4l[243008.072]: [ens7f0] port 1: LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES ptp4l[243008.073]: [ens7f0] selected local clock 40a6b7.fffe.0e43d0 as best master ptp4l[243008.073]: [ens7f0] assuming the grand master role phc2sys[243008.618]: [ens7f0] port 40a6b7.fffe.0e43d0-1 changed state phc2sys[243008.618]: [ens7f0] reconfiguring after port state change phc2sys[243008.618]: [ens7f0] selecting ens7f0 for synchronization phc2sys[243008.618]: [ens7f0] selecting CLOCK_REALTIME as the master clock phc2sys[243008.618]: [ens7f0] ens7f0 rms 17 max 17 freq +1208 +/- 0 delay 1357 +/- 0 phc2sys[243009.618]: [ens7f0] ens7f0 rms 29 max 29 freq +1195 +/- 0 delay 1362 +/- 0 phc2sys[243010.618]: [ens7f0] ens7f0 rms 17 max 17 freq +1178 +/- 0 delay 1353 +/- 0
I guess the grandmaster and slave logs are swapped? Can you post their configuration and command-line options? The master's fault could be a missing transmit timestamp, but there should be an error message. Try setting logging_level to 7 and if that doesn't give an explanation for the fault, please try it again in strace.
Hi Miroslav, This is the config GrandMaster: /usr/sbin/phc2sys -a -r -r -n 24 -m -u 1 -z /var/run/ptp4l.0.socket -t [ens5f0] /usr/sbin/ptp4l -f /var/run/ptp4l.0.config -i ens5f0 -2 --summary_interval 0 -m [global] # # Default Data Set # twoStepFlag 1 slaveOnly 0 priority1 128 priority2 128 domainNumber 24 #utc_offset 37 clockClass 248 clockAccuracy 0xFE offsetScaledLogVariance 0xFFFF free_running 0 freq_est_interval 1 dscp_event 0 dscp_general 0 dataset_comparison ieee1588 G.8275.defaultDS.localPriority 128 # # Port Data Set # logAnnounceInterval -3 logSyncInterval -4 logMinDelayReqInterval -4 logMinPdelayReqInterval -4 announceReceiptTimeout 3 syncReceiptTimeout 0 delayAsymmetry 0 fault_reset_interval 4 neighborPropDelayThresh 20000000 masterOnly 0 G.8275.portDS.localPriority 128 # # Run time options # assume_two_step 0 logging_level 7 path_trace_enabled 0 follow_up_info 0 hybrid_e2e 0 inhibit_multicast_service 0 net_sync_monitor 0 tc_spanning_tree 0 tx_timestamp_timeout 1 unicast_listen 0 unicast_master_table 0 unicast_req_duration 3600 use_syslog 1 verbose 0 summary_interval 0 kernel_leap 1 check_fup_sync 0 # # Servo Options # pi_proportional_const 0.0 pi_integral_const 0.0 pi_proportional_scale 0.0 pi_proportional_exponent -0.3 pi_proportional_norm_max 0.7 pi_integral_scale 0.0 pi_integral_exponent 0.4 pi_integral_norm_max 0.3 step_threshold 0.0 first_step_threshold 0.00002 max_frequency 900000000 clock_servo pi sanity_freq_limit 200000000 ntpshm_segment 0 # # Transport options # transportSpecific 0x0 ptp_dst_mac 01:1B:19:00:00:00 p2p_dst_mac 01:80:C2:00:00:0E udp_ttl 1 udp6_scope 0x0E uds_address /var/run/ptp4l # # Default interface options # clock_type OC network_transport UDPv4 delay_mechanism E2E time_stamping hardware tsproc_mode filter delay_filter moving_median delay_filter_length 10 egressLatency 0 ingressLatency 0 boundary_clock_jbod 0 # # Clock description # productDescription ;; revisionData ;; manufacturerIdentity 00:00:00 userDescription ; timeSource 0xA0 uds_address /var/run/ptp4l.0.socket message_tag [ens5f0] Slave: /usr/sbin/phc2sys -a -r -n 24 -m -u 1 -z /var/run/ptp4l.0.socket -t [ens5f0] /usr/sbin/ptp4l -f /var/run/ptp4l.0.config -i ens5f0 -2 -s --summary_interval -4 -m [global] twoStepFlag 1 slaveOnly 0 priority1 128 priority2 90 domainNumber 24 #utc_offset 37 clockClass 248 clockAccuracy 0xFE offsetScaledLogVariance 0xFFFF free_running 0 freq_est_interval 1 dscp_event 0 dscp_general 0 dataset_comparison ieee1588 G.8275.defaultDS.localPriority 200 # # Port Data Set # logAnnounceInterval -3 logSyncInterval -4 logMinDelayReqInterval -4 logMinPdelayReqInterval -4 announceReceiptTimeout 3 syncReceiptTimeout 0 delayAsymmetry 0 fault_reset_interval 4 neighborPropDelayThresh 20000000 masterOnly 0 G.8275.portDS.localPriority 200 # # Run time options # assume_two_step 0 logging_level 7 path_trace_enabled 0 follow_up_info 0 hybrid_e2e 0 inhibit_multicast_service 0 net_sync_monitor 0 tc_spanning_tree 0 tx_timestamp_timeout 1 unicast_listen 0 unicast_master_table 0 unicast_req_duration 3600 use_syslog 1 verbose 0 summary_interval -4 kernel_leap 1 check_fup_sync 0 # # Servo Options # pi_proportional_const 0.0 pi_integral_const 0.0 pi_proportional_scale 0.0 pi_proportional_exponent -0.3 pi_proportional_norm_max 0.7 pi_integral_scale 0.0 pi_integral_exponent 0.4 pi_integral_norm_max 0.3 step_threshold 0.0 first_step_threshold 0.00002 max_frequency 900000000 clock_servo pi sanity_freq_limit 200000000 ntpshm_segment 0 # # Transport options # transportSpecific 0x0 ptp_dst_mac 01:1B:19:00:00:00 p2p_dst_mac 01:80:C2:00:00:0E udp_ttl 1 udp6_scope 0x0E uds_address /var/run/ptp4l # # Default interface options # clock_type OC #network_transport UDPv4 network_transport L2 delay_mechanism E2E time_stamping hardware tsproc_mode filter delay_filter moving_median delay_filter_length 10 egressLatency 0 ingressLatency 0 boundary_clock_jbod 0 # # Clock description # productDescription ;; revisionData ;; manufacturerIdentity 00:00:00 userDescription ; timeSource 0xA0 uds_address /var/run/ptp4l.0.socket message_tag [ens5f0] I see this error from time to time on the slave ptp4l[54659.120]: [ens5f0] delay filtered 173 raw 173 ptp4l[54659.167]: [ens5f0] master offset -9 s2 freq -6845 path delay 173 phc2sys[54659.185]: [ens5f0] CLOCK_REALTIME rms 51 max 51 freq -86118 +/- 0 delay 1107 +/- 0 ptp4l[54659.230]: [ens5f0] master offset -2 s2 freq -6834 path delay 173 ptp4l[54659.241]: [ens5f0] port 1: delay timeout ptp4l[54659.241]: [ens5f0] delay filtered 173 raw 177 ptp4l[54659.280]: [ens5f0] port 1: delay timeout ptp4l[54659.280]: [ens5f0] delay filtered 174 raw 176 ptp4l[54659.292]: [ens5f0] master offset -6 s2 freq -6841 path delay 174 ptp4l[54659.355]: [ens5f0] master offset -10 s2 freq -6849 path delay 174 ptp4l[54659.370]: [ens5f0] port 1: delay timeout ptp4l[54659.371]: [ens5f0] delay filtered 173 raw 172 ptp4l[54659.404]: [ens5f0] port 1: delay timeout ptp4l[54659.404]: [ens5f0] delay filtered 173 raw 175 ptp4l[54659.417]: [ens5f0] port 1: delay timeout ptp4l[54659.417]: [ens5f0] delay filtered 174 raw 175 ptp4l[54659.417]: [ens5f0] master offset -7 s2 freq -6845 path delay 174 ptp4l[54659.480]: [ens5f0] master offset -15 s2 freq -6859 path delay 174 ptp4l[54659.504]: [ens5f0] port 1: delay timeout ptp4l[54659.504]: [ens5f0] delay filtered 174 raw 165 ptp4l[54659.542]: [ens5f0] master offset -13 s2 freq -6857 path delay 174 ptp4l[54659.605]: [ens5f0] master offset -7 s2 freq -6848 path delay 174 ptp4l[54659.613]: [ens5f0] port 1: delay timeout ptp4l[54659.615]: [ens5f0] port 1: SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED) phc2sys[54660.185]: [ens5f0] port 0c42a1.fffe.1df97e-1 changed state ptp4l[54663.637]: [ens5f0] waiting 2^{4} seconds to clear fault on port 1 phc2sys[54664.188]: [ens5f0] reconfiguring after port state change phc2sys[54664.188]: [ens5f0] selecting ens5f0 for synchronization phc2sys[54664.188]: [ens5f0] nothing to synchronize ptp4l[54679.637]: [ens5f0] clearing fault on port 1 ptp4l[54679.637]: [ens5f0] config item ens5f0.logMinDelayReqInterval is -4 ptp4l[54679.638]: [ens5f0] config item ens5f0.logAnnounceInterval is -3 ptp4l[54679.638]: [ens5f0] config item ens5f0.announceReceiptTimeout is 3 ptp4l[54679.638]: [ens5f0] config item ens5f0.syncReceiptTimeout is 0 ptp4l[54679.638]: [ens5f0] config item ens5f0.transportSpecific is 0 ptp4l[54679.638]: [ens5f0] config item ens5f0.ignore_transport_specific is 0 ptp4l[54679.638]: [ens5f0] config item ens5f0.masterOnly is 0 ptp4l[54679.638]: [ens5f0] config item ens5f0.G.8275.portDS.localPriority is 200 ptp4l[54679.638]: [ens5f0] config item ens5f0.logSyncInterval is -4 ptp4l[54679.638]: [ens5f0] config item ens5f0.logMinPdelayReqInterval is -4 ptp4l[54679.638]: [ens5f0] config item ens5f0.neighborPropDelayThresh is 20000000 ptp4l[54679.638]: [ens5f0] config item ens5f0.min_neighbor_prop_delay is -20000000 ptp4l[54679.638]: [ens5f0] config item ens5f0.ptp_dst_mac is '01:1B:19:00:00:00' ptp4l[54679.638]: [ens5f0] config item ens5f0.p2p_dst_mac is '01:80:C2:00:00:0E' ptp4l[54679.652]: [ens5f0] tx_type 1 not 1 ptp4l[54679.652]: [ens5f0] rx_filter 1 not 12 ptp4l[54679.652]: [ens5f0] port 1: FAULTY to LISTENING on INIT_COMPLETE ptp4l[54679.652]: [ens5f0] port 1: ignoring message ptp4l[54679.652]: [ens5f0] port 1: received link status notification ptp4l[54679.652]: [ens5f0] interface index 4 is up ptp4l[54679.669]: [ens5f0] port 1: setting asCapable ptp4l[54679.669]: [ens5f0] port 1: new foreign master 0c42a1.fffe.1df99a-1 ptp4l[54679.919]: [ens5f0] selected best master clock 0c42a1.fffe.1df99a ptp4l[54679.919]: [ens5f0] port 1: LISTENING to UNCALIBRATED on RS_SLAVE ptp4l[54679.926]: [ens5f0] master offset 265 s2 freq -6384 path delay 174 ptp4l[54679.926]: [ens5f0] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED ptp4l[54679.988]: [ens5f0] master offset 219 s2 freq -6437 path delay 174 ptp4l[54680.049]: [ens5f0] port 1: delay timeout ptp4l[54680.049]: [ens5f0] delay filtered 174 raw 174 ptp4l[54680.051]: [ens5f0] master offset 218 s2 freq -6417 path delay 174 ptp4l[54680.080]: [ens5f0] port 1: delay timeout ptp4l[54680.080]: [ens5f0] delay filtered 175 raw 189 ptp4l[54680.113]: [ens5f0] master offset 181 s2 freq -6458 path delay 175 ptp4l[54680.176]: [ens5f0] master offset 147 s2 freq -6498 path delay 175 On the master the "ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES" is only when I start the process for the first time only so I think is fine. here are the logs ptp4l[53298.892]: [ens5f0] config item (null).assume_two_step is 0 ptp4l[53298.892]: [ens5f0] config item (null).check_fup_sync is 0 ptp4l[53298.892]: [ens5f0] config item (null).tx_timestamp_timeout is 1 ptp4l[53298.892]: [ens5f0] config item (null).clock_servo is 0 ptp4l[53298.892]: [ens5f0] config item (null).clock_type is 32768 ptp4l[53298.892]: [ens5f0] config item (null).clock_servo is 0 ptp4l[53298.892]: [ens5f0] config item (null).clockClass is 248 ptp4l[53298.892]: [ens5f0] config item (null).clockAccuracy is 254 ptp4l[53298.892]: [ens5f0] config item (null).offsetScaledLogVariance is 65535 ptp4l[53298.892]: [ens5f0] config item (null).productDescription is ';;' ptp4l[53298.892]: [ens5f0] config item (null).revisionData is ';;' ptp4l[53298.892]: [ens5f0] config item (null).userDescription is ';' ptp4l[53298.892]: [ens5f0] config item (null).manufacturerIdentity is '00:00:00' ptp4l[53298.892]: [ens5f0] config item (null).domainNumber is 24 ptp4l[53298.892]: [ens5f0] config item (null).slaveOnly is 0 ptp4l[53298.892]: [ens5f0] config item (null).gmCapable is 1 ptp4l[53298.892]: [ens5f0] config item (null).gmCapable is 1 ptp4l[53298.892]: [ens5f0] config item (null).G.8275.defaultDS.localPriority is 128 ptp4l[53298.892]: [ens5f0] config item (null).time_stamping is 1 ptp4l[53298.892]: [ens5f0] config item (null).twoStepFlag is 1 ptp4l[53298.892]: [ens5f0] config item (null).twoStepFlag is 1 ptp4l[53298.893]: [ens5f0] config item (null).time_stamping is 1 ptp4l[53298.893]: [ens5f0] config item (null).priority1 is 128 ptp4l[53298.893]: [ens5f0] config item (null).priority2 is 128 phc2sys[53298.893]: [ens5f0] Waiting for ptp4l... ptp4l[53298.893]: [ens5f0] interface index 4 is up ptp4l[53298.893]: [ens5f0] config item (null).free_running is 0 ptp4l[53298.893]: [ens5f0] selected /dev/ptp2 as PTP clock ptp4l[53298.893]: [ens5f0] config item (null).uds_address is '/var/run/ptp4l.0.socket' ptp4l[53298.893]: [ens5f0] section item /var/run/ptp4l.0.socket.announceReceiptTimeout now 0 ptp4l[53298.893]: [ens5f0] section item /var/run/ptp4l.0.socket.delay_mechanism now 0 ptp4l[53298.893]: [ens5f0] section item /var/run/ptp4l.0.socket.network_transport now 0 ptp4l[53298.893]: [ens5f0] section item /var/run/ptp4l.0.socket.delay_filter_length now 1 ptp4l[53298.893]: [ens5f0] config item (null).free_running is 0 ptp4l[53298.893]: [ens5f0] config item (null).freq_est_interval is 1 ptp4l[53298.893]: [ens5f0] config item (null).gmCapable is 1 ptp4l[53298.893]: [ens5f0] config item (null).kernel_leap is 1 ptp4l[53298.893]: [ens5f0] config item (null).utc_offset is 37 ptp4l[53298.893]: [ens5f0] config item (null).timeSource is 160 ptp4l[53298.893]: [ens5f0] config item (null).pi_proportional_const is 0.000000 ptp4l[53298.893]: [ens5f0] config item (null).pi_integral_const is 0.000000 ptp4l[53298.893]: [ens5f0] config item (null).pi_proportional_scale is 0.000000 ptp4l[53298.893]: [ens5f0] config item (null).pi_proportional_exponent is -0.300000 ptp4l[53298.893]: [ens5f0] config item (null).pi_proportional_norm_max is 0.700000 ptp4l[53298.893]: [ens5f0] config item (null).pi_integral_scale is 0.000000 ptp4l[53298.893]: [ens5f0] config item (null).pi_integral_exponent is 0.400000 ptp4l[53298.893]: [ens5f0] config item (null).pi_integral_norm_max is 0.300000 ptp4l[53298.893]: [ens5f0] config item (null).step_threshold is 0.000000 ptp4l[53298.893]: [ens5f0] config item (null).first_step_threshold is 0.000020 ptp4l[53298.893]: [ens5f0] config item (null).max_frequency is 900000000 ptp4l[53298.893]: [ens5f0] config item (null).dataset_comparison is 0 ptp4l[53298.893]: [ens5f0] config item (null).delay_filter_length is 10 ptp4l[53298.893]: [ens5f0] config item (null).delay_filter is 1 ptp4l[53298.893]: [ens5f0] config item (null).tsproc_mode is 0 ptp4l[53298.893]: [ens5f0] config item (null).initial_delay is 0 ptp4l[53298.893]: [ens5f0] config item (null).summary_interval is 0 ptp4l[53298.893]: [ens5f0] config item (null).sanity_freq_limit is 200000000 ptp4l[53298.893]: [ens5f0] PI servo: sync interval 1.000 kp 0.700 ki 0.300000 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.boundary_clock_jbod is 0 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.network_transport is 0 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.delayAsymmetry is 0 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.follow_up_info is 0 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.freq_est_interval is 1 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.net_sync_monitor is 0 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.path_trace_enabled is 0 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.tc_spanning_tree is 0 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.ingressLatency is 0 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.egressLatency is 0 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.delay_mechanism is 0 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.hybrid_e2e is 0 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.fault_badpeernet_interval is 16 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.fault_reset_interval is 4 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.delay_filter_length is 1 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.delay_filter is 1 ptp4l[53298.893]: [ens5f0] config item /var/run/ptp4l.0.socket.tsproc_mode is 0 ptp4l[53298.893]: [ens5f0] config item ens5f0.boundary_clock_jbod is 0 ptp4l[53298.893]: [ens5f0] config item ens5f0.network_transport is 3 ptp4l[53298.893]: [ens5f0] config item ens5f0.delayAsymmetry is 0 ptp4l[53298.893]: [ens5f0] config item ens5f0.follow_up_info is 0 ptp4l[53298.893]: [ens5f0] config item ens5f0.freq_est_interval is 1 ptp4l[53298.893]: [ens5f0] config item ens5f0.net_sync_monitor is 0 ptp4l[53298.893]: [ens5f0] config item ens5f0.path_trace_enabled is 0 ptp4l[53298.893]: [ens5f0] config item ens5f0.tc_spanning_tree is 0 ptp4l[53298.893]: [ens5f0] config item ens5f0.ingressLatency is 0 ptp4l[53298.893]: [ens5f0] config item ens5f0.egressLatency is 0 ptp4l[53298.893]: [ens5f0] config item ens5f0.delay_mechanism is 1 ptp4l[53298.893]: [ens5f0] config item ens5f0.unicast_master_table is 0 ptp4l[53298.893]: [ens5f0] config item ens5f0.unicast_listen is 0 ptp4l[53298.894]: [ens5f0] config item ens5f0.hybrid_e2e is 0 ptp4l[53298.894]: [ens5f0] config item ens5f0.fault_badpeernet_interval is 16 ptp4l[53298.894]: [ens5f0] config item ens5f0.fault_reset_interval is 4 ptp4l[53298.894]: [ens5f0] config item ens5f0.delay_filter_length is 10 ptp4l[53298.894]: [ens5f0] config item ens5f0.delay_filter is 1 ptp4l[53298.894]: [ens5f0] config item ens5f0.tsproc_mode is 0 ptp4l[53298.894]: [ens5f0] config item ens5f0.logMinDelayReqInterval is -4 ptp4l[53298.894]: [ens5f0] config item ens5f0.logAnnounceInterval is -3 ptp4l[53298.894]: [ens5f0] config item ens5f0.announceReceiptTimeout is 3 ptp4l[53298.894]: [ens5f0] config item ens5f0.syncReceiptTimeout is 0 ptp4l[53298.894]: [ens5f0] config item ens5f0.transportSpecific is 0 ptp4l[53298.894]: [ens5f0] config item ens5f0.ignore_transport_specific is 0 ptp4l[53298.894]: [ens5f0] config item ens5f0.masterOnly is 0 ptp4l[53298.894]: [ens5f0] config item ens5f0.G.8275.portDS.localPriority is 128 ptp4l[53298.894]: [ens5f0] config item ens5f0.logSyncInterval is -4 ptp4l[53298.894]: [ens5f0] config item ens5f0.logMinPdelayReqInterval is -4 ptp4l[53298.894]: [ens5f0] config item ens5f0.neighborPropDelayThresh is 20000000 ptp4l[53298.894]: [ens5f0] config item ens5f0.min_neighbor_prop_delay is -20000000 ptp4l[53298.894]: [ens5f0] config item ens5f0.ptp_dst_mac is '01:1B:19:00:00:00' ptp4l[53298.894]: [ens5f0] config item ens5f0.p2p_dst_mac is '01:80:C2:00:00:0E' ptp4l[53298.902]: [ens5f0] tx_type 1 not 1 ptp4l[53298.902]: [ens5f0] rx_filter 1 not 12 ptp4l[53298.902]: [ens5f0] port 1: INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[53298.902]: [ens5f0] config item /var/run/ptp4l.0.socket.logMinDelayReqInterval is -4 ptp4l[53298.902]: [ens5f0] config item /var/run/ptp4l.0.socket.logAnnounceInterval is -3 ptp4l[53298.902]: [ens5f0] config item /var/run/ptp4l.0.socket.announceReceiptTimeout is 0 ptp4l[53298.902]: [ens5f0] config item /var/run/ptp4l.0.socket.syncReceiptTimeout is 0 ptp4l[53298.902]: [ens5f0] config item /var/run/ptp4l.0.socket.transportSpecific is 0 ptp4l[53298.902]: [ens5f0] config item /var/run/ptp4l.0.socket.ignore_transport_specific is 0 ptp4l[53298.902]: [ens5f0] config item /var/run/ptp4l.0.socket.masterOnly is 0 ptp4l[53298.902]: [ens5f0] config item /var/run/ptp4l.0.socket.G.8275.portDS.localPriority is 128 ptp4l[53298.902]: [ens5f0] config item /var/run/ptp4l.0.socket.logSyncInterval is -4 ptp4l[53298.902]: [ens5f0] config item /var/run/ptp4l.0.socket.logMinPdelayReqInterval is -4 ptp4l[53298.902]: [ens5f0] config item /var/run/ptp4l.0.socket.neighborPropDelayThresh is 20000000 ptp4l[53298.902]: [ens5f0] config item /var/run/ptp4l.0.socket.min_neighbor_prop_delay is -20000000 ptp4l[53298.902]: [ens5f0] config item (null).uds_address is '/var/run/ptp4l.0.socket' ptp4l[53298.902]: [ens5f0] port 0: INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[53298.902]: [ens5f0] port 1: received link status notification ptp4l[53298.902]: [ens5f0] interface index 4 is up ptp4l[53299.400]: [ens5f0] port 1: announce timeout ptp4l[53299.400]: [ens5f0] port 1: LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES ptp4l[53299.400]: [ens5f0] selected local clock 0c42a1.fffe.1df99a as best master ptp4l[53299.400]: [ens5f0] assuming the grand master role ptp4l[53299.401]: [ens5f0] port 1: master tx announce timeout ptp4l[53299.401]: [ens5f0] port 1: setting asCapable ptp4l[53299.462]: [ens5f0] port 1: master sync timeout ptp4l[53299.525]: [ens5f0] port 1: master sync timeout ptp4l[53299.526]: [ens5f0] port 1: master tx announce timeout ptp4l[53299.588]: [ens5f0] port 1: master sync timeout ptp4l[53299.650]: [ens5f0] port 1: master sync timeout ptp4l[53299.651]: [ens5f0] port 1: master tx announce timeout ptp4l[53299.713]: [ens5f0] port 1: master sync timeout ptp4l[53299.775]: [ens5f0] port 1: master sync timeout ptp4l[53299.776]: [ens5f0] port 1: master tx announce timeout ptp4l[53299.838]: [ens5f0] port 1: master sync timeout phc2sys[53299.894]: [ens5f0] Waiting for ptp4l... ptp4l[53299.894]: [ens5f0] port 0: setting asCapable ptp4l[53299.900]: [ens5f0] port 1: master sync timeout ptp4l[53299.901]: [ens5f0] port 1: master tx announce timeout ptp4l[53299.963]: [ens5f0] port 1: master sync timeout ptp4l[53300.025]: [ens5f0] port 1: master sync timeout ptp4l[53300.026]: [ens5f0] port 1: master tx announce timeout ptp4l[53300.088]: [ens5f0] port 1: master sync timeout ptp4l[53300.150]: [ens5f0] port 1: master sync timeout ptp4l[53300.151]: [ens5f0] port 1: master tx announce timeout ptp4l[53300.213]: [ens5f0] port 1: master sync timeout ptp4l[53300.275]: [ens5f0] port 1: master sync timeout ptp4l[53300.276]: [ens5f0] port 1: master tx announce timeout ptp4l[53300.338]: [ens5f0] port 1: master sync timeout ptp4l[53300.400]: [ens5f0] port 1: master sync timeout ptp4l[53300.401]: [ens5f0] port 1: master tx announce timeout ptp4l[53300.463]: [ens5f0] port 1: master sync timeout ptp4l[53300.526]: [ens5f0] port 1: master sync timeout ptp4l[53300.526]: [ens5f0] port 1: master tx announce timeout ptp4l[53300.588]: [ens5f0] port 1: master sync timeout ptp4l[53300.651]: [ens5f0] port 1: master sync timeout ptp4l[53300.651]: [ens5f0] port 1: master tx announce timeout ptp4l[53300.713]: [ens5f0] port 1: master sync timeout ptp4l[53300.776]: [ens5f0] port 1: master sync timeout ptp4l[53300.776]: [ens5f0] port 1: master tx announce timeout ptp4l[53300.838]: [ens5f0] port 1: master sync timeout phc2sys[53300.894]: [ens5f0] reconfiguring after port state change phc2sys[53300.894]: [ens5f0] selecting ens5f0 for synchronization phc2sys[53300.894]: [ens5f0] selecting CLOCK_REALTIME as the master clock phc2sys[53300.894]: [ens5f0] ens5f0 rms 275 max 275 freq -2008 +/- 0 delay 1402 +/- 0 ptp4l[53300.901]: [ens5f0] port 1: master sync timeout ptp4l[53300.901]: [ens5f0] port 1: master tx announce timeout ptp4l[53300.963]: [ens5f0] port 1: master sync timeout ptp4l[53301.026]: [ens5f0] port 1: master sync timeout ptp4l[53301.026]: [ens5f0] port 1: master tx announce timeout ptp4l[53301.088]: [ens5f0] port 1: master sync timeout ptp4l[53301.151]: [ens5f0] port 1: master sync timeout ptp4l[53301.151]: [ens5f0] port 1: master tx announce timeout ptp4l[53301.213]: [ens5f0] port 1: master sync timeout ptp4l[53301.276]: [ens5f0] port 1: master sync timeout ptp4l[53301.276]: [ens5f0] port 1: master tx announce timeout ptp4l[53301.338]: [ens5f0] port 1: master sync timeout ptp4l[53301.401]: [ens5f0] port 1: master sync timeout ptp4l[53301.401]: [ens5f0] port 1: master tx announce timeout ptp4l[53301.463]: [ens5f0] port 1: master sync timeout ptp4l[53301.526]: [ens5f0] port 1: master sync timeout ptp4l[53301.526]: [ens5f0] port 1: master tx announce timeout ptp4l[53301.588]: [ens5f0] port 1: master sync timeout ptp4l[53301.651]: [ens5f0] port 1: master sync timeout ptp4l[53301.651]: [ens5f0] port 1: master tx announce timeout ptp4l[53301.713]: [ens5f0] port 1: master sync timeout ptp4l[53301.776]: [ens5f0] port 1: master sync timeout ptp4l[53301.776]: [ens5f0] port 1: master tx announce timeout ptp4l[53301.839]: [ens5f0] port 1: master sync timeout phc2sys[53301.894]: [ens5f0] ens5f0 rms 274 max 274 freq -2007 +/- 0 delay 1395 +/- 0
Sebastian, what are the (presumably Fortville) NICs involved and at what link rate are you running at ? What is the switch model in between, and has it been properly configured as G.8275.1 T-BC ?
I have troubles finding a code path that would give this strange output with no specific error message. I assume there are no issues with memory allocation. Can you please run ptp4l in strace and post the output around that "SLAVE/MASTER to FAULTY" message? Can you please also capture all received and transmitted PTP packets around that point?
Miroslav, We are getting delay timeout messages: ptp4l[73182.878]: [ens5f0] port 1: delay timeout ptp4l[73182.888]: [ens5f0] port 1: delay timeout ptp4l[73182.966]: [ens5f0] port 1: delay timeout ptp4l[73183.073]: [ens5f0] port 1: delay timeout ptp4l[73183.148]: [ens5f0] port 1: delay timeout ptp4l[73183.161]: [ens5f0] port 1: delay timeout ptp4l[73183.249]: [ens5f0] port 1: delay timeout ptp4l[73183.343]: [ens5f0] port 1: delay timeout ptp4l[73183.345]: [ens5f0] port 1: delay timeout ptp4l[73183.380]: [ens5f0] port 1: delay timeout ptp4l[73183.496]: [ens5f0] port 1: delay timeout ptp4l[73183.529]: [ens5f0] port 1: delay timeout ptp4l[73183.585]: [ens5f0] port 1: delay timeout ptp4l[73183.605]: [ens5f0] port 1: delay timeout ptp4l[73183.705]: [ens5f0] port 1: delay timeout ptp4l[73183.819]: [ens5f0] port 1: delay timeout What could cause this?
Those are expected debug messages, indicating timeouts for ptp4l sending delay requests.
Created attachment 1734255 [details] Behavior with RT Kernel In this file, I think interesting occurrence might give us a clue on the issues we are having with the Kernel: 4.18.0-193.24.1.rt13.74.el8_2.dt1.x86_64. You see regular delay timeouts; approximately every 63ms. Then, ptp4l spends 4.5 seconds consuming delay responses. We consume 72 packets without handling a delay timeout. It says to me that we are struggling to schedule the timeout and handle packets. It also says to me that we are not having "network" issues. All the packets are there; the daemon is not processing them.
Hi Pasi, This is the nic driver: i40e version: 2.8.20-k firmware-version: 7.10 0x800075e6 19.5.12 expansion-rom-version: bus-info: 0000:86:00.0 supports-statistics: yes supports-test: yes supports-eeprom-access: yes supports-register-dump: yes supports-priv-flags: yes [root@cnfde4 /]# exit [root@cnfde4 core]# lspci -vvv 0000:86:00.0 [root@cnfde4 core]# lspci -vvv -s 0000:86:00.0 86:00.0 Ethernet controller: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02) Subsystem: Intel Corporation Ethernet 25G 2P XXV710 Adapter Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0 Interrupt: pin A routed to IRQ 66 NUMA node: 1 Region 0: Memory at d8000000 (64-bit, prefetchable) [size=16M] Region 3: Memory at d9008000 (64-bit, prefetchable) [size=32K] Expansion ROM at d3800000 [disabled] [size=512K] Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME- Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+ Address: 0000000000000000 Data: 0000 Masking: 00000000 Pending: 00000000 Capabilities: [70] MSI-X: Enable+ Count=129 Masked- Vector table: BAR=3 offset=00000000 PBA: BAR=3 offset=00001000 Capabilities: [a0] Express (v2) Endpoint, MSI 00 DevCap: MaxPayload 2048 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+ RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset- MaxPayload 256 bytes, MaxReadReq 512 bytes DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend- LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L1 <16us ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+ LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported AtomicOpsCap: 32bit- 64bit- 128bitCAS- DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled AtomicOpsCtl: ReqEn- LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+ EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest- Capabilities: [e0] Vital Product Data Product Name: XXV710 25GbE Controller Read-only fields: [V0] Vendor specific: FFV19.5.12 [PN] Part number: 00M95 [MN] Manufacture ID: 1028 [V1] Vendor specific: DSV1028VPDR.VER2.1 [V3] Vendor specific: DTINIC [V4] Vendor specific: DCM1001FFFFFF2101FFFFFF1202FFFFFF2302FFFFFF1403FFFFFF2503FFFFFF1604FFFFFF2704FFFFFF1805FFFFFF2905FFFFFF1A06FFFFFF2B06FFFFFF1C07FFFFFF2D07FFFFFF1E08FFFFFF2F08FFFFFF [V5] Vendor specific: NPY2 [V6] Vendor specific: PMTD [V7] Vendor specific: NMVIntel Corp [V8] Vendor specific: L1D0 [RV] Reserved: checksum good, 1 byte(s) reserved Read/write fields: [Y1] System specific: CCF1 End Capabilities: [100 v2] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt+ UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ CEMsk: RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+ AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn+ ECRCChkCap+ ECRCChkEn+ MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap- HeaderLog: 00000000 00000000 00000000 00000000 Capabilities: [140 v1] Device Serial Number 30-25-0e-ff-ff-b7-a6-40 Capabilities: [1a0 v1] Transaction Processing Hints Device specific mode supported No steering table available Capabilities: [1b0 v1] Access Control Services ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans- ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans- Capabilities: [1d0 v1] #19 Kernel driver in use: i40e Kernel modules: i40e [root@cnfde4 core]# toolbox Container 'toolbox-root' already exists. Trying to start... (To remove the container and start with a fresh toolbox, run: sudo podman rm 'toolbox-root') toolbox-root Container started successfully. To exit, type 'exit'. [root@cnfde4 /]# ethtool ens5f0 Settings for ens5f0: Supported ports: [ FIBRE ] Supported link modes: 25000baseCR/Full Supported pause frame use: Symmetric Supports auto-negotiation: Yes Supported FEC modes: None BaseR RS Advertised link modes: 25000baseCR/Full Advertised pause frame use: No Advertised auto-negotiation: Yes Advertised FEC modes: None BaseR RS Speed: 25000Mb/s Duplex: Full Port: Direct Attach Copper PHYAD: 0 Transceiver: internal Auto-negotiation: on Supports Wake-on: d Wake-on: d Current message level: 0x00000007 (7) drv probe link Link detected: yes We enable ptp support on the switch Dell EMC Networking OS10 Enterprise Copyright (c) 1999-2020 by Dell Inc. All Rights Reserved. OS Version: 10.5.1.4 Build Version: 10.5.1.4.249 Build Time: 2020-07-22T21:29:21+0000 System Type: S5248F-ON Architecture: x86_64 Up Time: 4 weeks 3 days 21:50:48
Adding a request resource to the linuxptp-daemon helped make the synchronization more stable, there are other kernel patches for the RT kernel that also improve the synchronization.
@sscheink I can run for several days without a single FAULT or UNCALIBRATED. However sometimes there are sync losses like: 2020-12-21T04:11:19.079286820+00:00 stdout F ptp4l[44415.855]: port 1: SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED) There are not many of these. It looks something different, probably related to the network / switch. I think we can move this to "verified".
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633