Bug 1120928
| Summary: | Random loss of storage LUNs when using volumes for CIFS | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Do Hakyong <crazyraven> | ||||
| Component: | kernel | Assignee: | cifs-maint | ||||
| kernel sub component: | CIFS | QA Contact: | Filesystem QE <fs-qe> | ||||
| Status: | CLOSED NOTABUG | Docs Contact: | |||||
| Severity: | medium | ||||||
| Priority: | unspecified | CC: | gdeschner, sbose, sprabhu | ||||
| Version: | 5.8 | ||||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2016-07-01 10:15:31 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Hello,

This issue seems to have slipped through the cracks. Please report such problems to Red Hat support to ensure adequate attention is given to issues.

I am closing this issue since it was reported a couple of years ago and hasn't had any updates yet. RHEL 5 is currently in maintenance phase. Please re-open the case with GSS if you would like to continue debugging the problem.

Sachin Prabhu
Created attachment 918931 [details]
message.log

Description of problem:
We set up a CIFS server for use at a hospital. The environment is an HP ProLiant DL380p Gen8 server (certified) + RHEL 5.8 (x86_64) + Veritas file system + Hitachi storage.

The problem is that while the Samba server is running, we randomly lose LUNs, with the messages below:

----------------------------------------------------------------------------
Jul 10 11:44:54 IV03 kernel: qla2xxx 0000:07:00.1: Mailbox command timeout occurred, cmd=0x54 mb[0]=0x54. Issuing ISP abort.
Jul 10 11:44:54 IV03 kernel: qla2xxx 0000:07:00.1: Performing ISP error recovery - ha= ffff81083ae584f8.
Jul 10 11:44:55 IV03 kernel: qla2xxx 0000:07:00.1: LIP reset occured (f700).
Jul 10 11:44:55 IV03 kernel: qla2xxx 0000:07:00.1: LOOP UP detected (4 Gbps).
Jul 10 11:44:56 IV03 kernel: qla2xxx 0000:07:00.1: scsi(4:0:13): Abort command issued -- 0 410191 2002.
Jul 10 11:45:36 IV03 kernel: qla2xxx 0000:07:00.1: Mailbox command timeout occurred, cmd=0x54 mb[0]=0x54. Issuing ISP abort.
Jul 10 11:45:36 IV03 kernel: qla2xxx 0000:07:00.1: Performing ISP error recovery - ha= ffff81083ae584f8.
Jul 10 11:45:37 IV03 kernel: qla2xxx 0000:07:00.1: LIP reset occured (f700).
Jul 10 11:45:37 IV03 kernel: qla2xxx 0000:07:00.1: LOOP UP detected (4 Gbps).
Jul 10 11:45:37 IV03 kernel: qla2xxx 0000:07:00.1: scsi(4:0:13): Abort command issued -- 0 410191 2002.
Jul 10 11:46:17 IV03 kernel: qla2xxx 0000:07:00.1: Mailbox command timeout occurred, cmd=0x54 mb[0]=0x54. Issuing ISP abort.
Jul 10 11:46:17 IV03 kernel: qla2xxx 0000:07:00.1: Performing ISP error recovery - ha= ffff81083ae584f8.
Jul 10 11:46:19 IV03 kernel: qla2xxx 0000:07:00.1: LIP reset occured (f700).
Jul 10 11:46:19 IV03 kernel: qla2xxx 0000:07:00.1: LOOP UP detected (4 Gbps).
Jul 10 11:46:19 IV03 kernel: qla2xxx 0000:07:00.1: scsi(4:0:22): Abort command issued -- 0 410192 2002.
Jul 10 11:46:48 IV03 kernel: INFO: task vx_worklist_thr:9957 blocked for more than 120 seconds.
Jul 10 11:46:48 IV03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 10 11:46:48 IV03 kernel: vx_worklist_t D ffffffff801568f1 0 9957 1 9958 9956 (L-TLB)
Jul 10 11:46:48 IV03 kernel: ffff8108282f5ae0 0000000000000046 ffffffffffffff9c ffff810840002c00
Jul 10 11:46:48 IV03 kernel: 0000000000011200 000000000000000a ffff810834013820 ffff81103fe4e0c0
Jul 10 11:46:48 IV03 kernel: 0003e5334062270b 00000000002ac803 ffff810834013a08 0000000a00031200
Jul 10 11:46:48 IV03 kernel: Call Trace:
Jul 10 11:46:48 IV03 kernel: [<ffffffff8006ece7>] do_gettimeofday+0x40/0x90
Jul 10 11:46:48 IV03 kernel: [<ffffffff80028c7d>] sync_page+0x0/0x43
Jul 10 11:46:48 IV03 kernel: [<ffffffff800637de>] io_schedule+0x3f/0x67
Jul 10 11:46:48 IV03 kernel: [<ffffffff80028cbb>] sync_page+0x3e/0x43
Jul 10 11:46:48 IV03 kernel: [<ffffffff80063a0a>] __wait_on_bit+0x40/0x6e
Jul 10 11:46:48 IV03 kernel: [<ffffffff800350fb>] wait_on_page_bit+0x6c/0x72
Jul 10 11:46:48 IV03 kernel: [<ffffffff800a34d9>] wake_bit_function+0x0/0x23
Jul 10 11:46:48 IV03 kernel: [<ffffffff886c7941>] :vxfs:vx_pvn_wait_writeback+0x64/0xca
Jul 10 11:46:48 IV03 kernel: [<ffffffff886cc91e>] :vxfs:vx_pvn_range_dirty+0xad5/0xb63
Jul 10 11:46:48 IV03 kernel: [<ffffffff886c78cc>] :vxfs:vx_pvn_lookup_dirty_tag+0x0/0x7
Jul 10 11:46:48 IV03 kernel: [<ffffffff886ccba8>] :vxfs:vx_putpage_dirty_wbc+0xdf/0xec
Jul 10 11:46:48 IV03 kernel: [<ffffffff80044e51>] mempool_free_slab+0x0/0xe
Jul 10 11:46:48 IV03 kernel: [<ffffffff886cccd2>] :vxfs:vx_putpage_dirty+0x29/0x2e
Jul 10 11:46:48 IV03 kernel: [<ffffffff886ac9f2>] :vxfs:vx_do_putpage+0xc2/0x147
Jul 10 11:46:48 IV03 kernel: [<ffffffff886371af>] :vxfs:vx_idelxwri_flush+0x11e/0x211
Jul 10 11:46:48 IV03 kernel: [<ffffffff8864baea>] :vxfs:vx_idalloc_off+0x303/0x413
Jul 10 11:46:48 IV03 kernel: [<ffffffff88639a11>] :vxfs:vx_dalloc_flush+0x186/0x21d
Jul 10 11:46:48 IV03 kernel: [<ffffffff88635d20>] :vxfs:vx_workitem_process+0x2d/0x3d
Jul 10 11:46:48 IV03 kernel: [<ffffffff88635ef7>] :vxfs:vx_worklist_process+0x1c7/0x2a8
Jul 10 11:46:48 IV03 kernel: [<ffffffff88639207>] :vxfs:vx_worklist_thread+0x0/0x98
Jul 10 11:46:48 IV03 kernel: [<ffffffff88639261>] :vxfs:vx_worklist_thread+0x5a/0x98
Jul 10 11:46:48 IV03 kernel: [<ffffffff8868f830>] :vxfs:vx_kthread_init+0x57/0x5e
Jul 10 11:46:48 IV03 kernel: [<ffffffff88639207>] :vxfs:vx_worklist_thread+0x0/0x98
Jul 10 11:46:48 IV03 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Jul 10 11:46:48 IV03 kernel: [<ffffffff88639207>] :vxfs:vx_worklist_thread+0x0/0x98
Jul 10 11:46:48 IV03 kernel: [<ffffffff8868f7d9>] :vxfs:vx_kthread_init+0x0/0x5e
Jul 10 11:46:48 IV03 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Jul 10 11:46:48 IV03 kernel:
Jul 10 11:46:48 IV03 kernel: INFO: task vx_worklist_thr:9960 blocked for more than 120 seconds.
Jul 10 11:46:48 IV03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 10 11:46:48 IV03 kernel: vx_worklist_t D ffffffff801568f1 0 9960 1 9961 9959 (L-TLB)
Jul 10 11:46:48 IV03 kernel: ffff8108282fba30 0000000000000046 ffff810839102860 ffff810840002c00
Jul 10 11:46:48 IV03 kernel: 0000000000011200 000000000000000a ffff810839102860 ffff8108400dc080
Jul 10 11:46:48 IV03 kernel: 0003e542d2905269 00000000000022b7 ffff810839102a48 000000053a5079b8
-------------------------------------------------------------------------------

I've also attached message.log; please refer to the attachment.

The LUNs are provided with the Veritas file system (vxfs), with MPIO configured and handled by Veritas. In my opinion, even if we lose some of the paths, CIFS should keep working, because we have 4 paths from the SAN switch (2 HBA cards are installed). Even if one or two ports die, the service should not be affected.

We tried changing the HBA card (QLogic to Emulex), but the same issue still occurs, so it is not an HBA card problem.
The Veritas engineer said that it is not a Veritas problem and just pushed the problem to the OS, the storage, or the HBA card.
I have been suffering from this problem for over 10 days. Please let me go home... (cry)
Your valuable advice would be a great help to me. THANKS!

Version-Release number of selected component (if applicable):
RHEL 5.8 (x86_64)

How reproducible:

Steps to Reproduce:
1. Configure a CIFS server with a storage volume (the volume is formatted with the Veritas file system).
2. After starting CIFS, a few hours or a few days later the problem occurs with the above log messages.
3.

Actual results:

Expected results:

Additional info:
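A quick way to check whether the qla2xxx errors are spread across all four paths or concentrated on one HBA port is to tally the driver messages per PCI address. The following is only a minimal sketch, not a diagnostic tool: the function name `summarize`, the event labels, and the `SAMPLE` list are hypothetical, and the matched substrings are copied from the log excerpt in this report. In that excerpt every qla2xxx message comes from 0000:07:00.1.

```python
import re
from collections import Counter

# Matches syslog lines such as:
#   Jul 10 11:44:54 IV03 kernel: qla2xxx 0000:07:00.1: Mailbox command timeout occurred, ...
QLA_LINE = re.compile(
    r"kernel: qla2xxx (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): (?P<msg>.*)"
)

# Substrings taken from the log excerpt in this report, mapped to
# short labels (the labels themselves are made up for this sketch).
EVENTS = {
    "Mailbox command timeout": "mailbox_timeout",
    "Performing ISP error recovery": "isp_recovery",
    "LIP reset": "lip_reset",
    "LOOP UP": "loop_up",
    "Abort command issued": "abort",
}


def summarize(lines):
    """Count qla2xxx events per (PCI address, event label)."""
    counts = Counter()
    for line in lines:
        match = QLA_LINE.search(line)
        if match is None:
            continue  # not a qla2xxx driver message
        for needle, label in EVENTS.items():
            if needle in match.group("msg"):
                counts[(match.group("pci"), label)] += 1
                break  # count each line once
    return counts


# Three lines copied from the excerpt; the last one is not a qla2xxx message.
SAMPLE = [
    "Jul 10 11:44:54 IV03 kernel: qla2xxx 0000:07:00.1: Mailbox command timeout occurred, cmd=0x54 mb[0]=0x54. Issuing ISP abort.",
    "Jul 10 11:44:55 IV03 kernel: qla2xxx 0000:07:00.1: LIP reset occured (f700).",
    "Jul 10 11:46:48 IV03 kernel: INFO: task vx_worklist_thr:9957 blocked for more than 120 seconds.",
]

for (pci, label), n in sorted(summarize(SAMPLE).items()):
    print(pci, label, n)
```

To run it against a full log, pass an open file object instead of `SAMPLE`, for example `summarize(open("message.log", errors="replace"))`. If the counts show only one PCI address, the failing path is limited to that port, and the open question becomes why the multipath layer did not fail over to the remaining paths.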