Bug 1245181
Summary: | Sanlock fail to set scheduler to SCHED_RR | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Nir Soffer <nsoffer>
Component: | sanlock | Assignee: | David Teigland <teigland>
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | unspecified | Priority: | unspecified
Version: | rawhide | CC: | cfeist, fsimonce, teigland
Hardware: | Unspecified | OS: | Unspecified
Doc Type: | Bug Fix | Type: | Bug
Last Closed: | 2015-07-26 15:28:08 UTC | Bug Blocks: | 1243935
Description
Nir Soffer
2015-07-21 12:08:50 UTC
Created attachment 1054317: sanlock.log
This seems a likely cause for the timeouts.

The wdmd daemon also does the same scheduler steps, so I'd expect the same errors from wdmd to be in /var/log/messages. If not, could you check whether wdmd was able to set its scheduling successfully? Running:

    ps ax -o pid,stat,cmd,class,rtprio | grep wdmd

should show this:

    14282 SLs wdmd RR 99

(In reply to David Teigland from comment #2)
> This seems a likely cause for the timeouts.
>
> The wdmd daemon also does the same scheduler steps, so I'd expect the same
> errors from wdmd to be in /var/log/messages. If not, could you check if
> wdmd was able to set its scheduling successfully?

I see:

    # ps axf -o pid,stat,cmd,class,rtprio
    721 SLs wdmd -G sanlock RR 99
    723 SLsl sanlock daemon -U sanlock - RR 99
    724 S \_ sanlock daemon -U sanlo TS -

I also do not see any errors after yesterday at 01:30. Maybe the issue disappeared after reboot?

I rebooted the host, and I see:

    (reboot)
    # sanlock.log
    2015-07-21 22:10:31+0300 11 [747]: sanlock daemon started 3.2.2 host 708de246-f98b-4f9a-b9b2-de8d8a10a291.bamba.tlv.
    2015-07-21 22:10:53+0300 33 [752]: cmd_add_lockspace 3,9 f4f54f47-9ccf-4978-a9a7-12a6d89bf94e:2:/rhev/data-center/mnt/multipass.eng.lab.tlv.redhat.com:_export_images_rnd_ahadas

    # ps axf -o pid,stat,cmd,class,rtprio
    742 SLs wdmd -G sanlock RR 99
    747 SLsl sanlock daemon -U sanlock - RR 99
    748 S \_ sanlock daemon -U sanlo TS -

And when running the test program, it works now.

This seems to be a temporary failure that I cannot reproduce now. How do you suggest we proceed with this?

That's good and bad, I suppose. I don't have any clue what could have happened.

Since the sched_setscheduler(2) issue disappeared, we cannot do much about this. Closing until we have more data.
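
For anyone who hits this again, the check discussed in this bug can be sketched in Python. This is only a minimal sketch, not sanlock's or wdmd's actual code: it assumes Linux and Python 3.3 or later, and the helper names `scheduler_class` and `try_set_rr` are made up for illustration. It reads a process's policy with sched_getscheduler(2), maps it to the class name that ps prints (RR, TS, ...), and attempts a switch to SCHED_RR, reporting the errno if the call fails.

```python
import errno
import os

# Map Linux scheduling policies to the names ps prints in its "class" column.
PS_CLASS = {
    os.SCHED_OTHER: "TS",
    os.SCHED_FIFO: "FF",
    os.SCHED_RR: "RR",
    os.SCHED_BATCH: "B",
    os.SCHED_IDLE: "IDL",
}


def scheduler_class(pid=0):
    """Return the ps-style class name for pid (0 means the calling process)."""
    policy = os.sched_getscheduler(pid)
    # The kernel may OR in SCHED_RESET_ON_FORK; mask it off before the lookup.
    policy &= ~getattr(os, "SCHED_RESET_ON_FORK", 0)
    return PS_CLASS.get(policy, str(policy))


def try_set_rr(priority=None):
    """Try to move the calling process to SCHED_RR; return (ok, message)."""
    if priority is None:
        # Assumption: use the maximum priority, matching the "RR 99" ps output.
        priority = os.sched_get_priority_max(os.SCHED_RR)
    try:
        os.sched_setscheduler(0, os.SCHED_RR, os.sched_param(priority))
    except OSError as e:
        name = errno.errorcode.get(e.errno, str(e.errno))
        return False, "sched_setscheduler failed: %s (%s)" % (name, e.strerror)
    rtprio = os.sched_getparam(0).sched_priority
    return True, "now %s with rtprio %d" % (scheduler_class(), rtprio)


if __name__ == "__main__":
    print("before:", scheduler_class())
    ok, msg = try_set_rr()
    print(msg)
```

Run as an unprivileged user, the call is expected to fail with EPERM unless the process has CAP_SYS_NICE or a suitable RLIMIT_RTPRIO. A failure when running as root would be closer to what this bug describes, and the errno in the message would be the data to attach here.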