Description of problem: after upgrading from 3.8.4 to 3.8.5 on a CentOS 7 host we get:

block I/O error in device 'drive-ide0-0-0': Input/output error (5)

on all VMs. All the VMs we run have virtual IDE drives; FUSE mounts work fine, though. I don't see any errors in the volume's brick log:

[2016-11-01 09:29:56.519571] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-pool-server: accepted client from son-27691-2016/11/01-09:29:51:431330-pool-client-6-0-0 (version: 3.8.5)
[2016-11-01 09:29:56.522235] I [login.c:76:gf_auth] 0-auth/login: allowed user names: a02c192c-679c-4929-97d3-b0ae0595df52
[2016-11-01 09:29:56.522305] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-pool-server: accepted client from spirit-5113-2016/11/01-09:29:51:784139-pool-client-6-0-0 (version: 3.8.5)
[2016-11-01 09:29:58.563046] I [login.c:76:gf_auth] 0-auth/login: allowed user names: a02c192c-679c-4929-97d3-b0ae0595df52

What is interesting here is that we have 3 nodes and the volume is 3-way replicated, and we got the errors right after we upgraded the last node from 3.8.4 to 3.8.5. And we did not forget to upgrade:

[root@father qemu]# rpm -qf /lib64/libglusterfs.so.0
glusterfs-libs-3.8.5-1.el7.x86_64

and restart libvirtd. Downgrading back to 3.8.4 solved the problem. Thank you!
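For context, a VM disk of the kind described here (an IDE drive backed by a Gluster volume over gfapi rather than a FUSE mount) would typically be defined in the libvirt domain XML roughly as below. This is a hypothetical sketch: the volume name "pool" is taken from the brick log above, but the image path, host name, and port are placeholders, not taken from this report.

```xml
<!-- Hypothetical sketch: IDE disk served via the QEMU gluster (libgfapi)
     block driver. Volume name "pool" matches the brick log above;
     image path, host, and port are illustrative placeholders. -->
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='gluster' name='pool/vm1.img'>
    <host name='gluster-node1' port='24007'/>
  </source>
  <!-- bus='ide' gives the guest the 'drive-ide0-0-0' device seen in
       the error message -->
  <target dev='hda' bus='ide'/>
</disk>
```

A FUSE-mounted image would instead use `<disk type='file'>` with a path under the mount point, which explains why those guests were unaffected: only the gfapi path exercises the client code that regressed.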
I am not sure where the qemu/gfapi driver logs to, but do you see any warnings/errors there?
Created attachment 1216849 [details] qemu log

Hello! This is the log, which contains the IDE I/O error and the messages preceding it. It also contains the log from after the downgrade to 3.8.4. Thank you!
Okay. I do not see any obvious errors in the log file. CCing Prasanna. Do you see any issues with a newly created volume after the upgrade to 3.8.5? A patch addressing a few ref leaks in gfapi went into 3.8.5, I think. There are some issues reported on gluster-devel wrt that patch - http://www.gluster.org/pipermail/gluster-devel/2016-October/051234.html Not sure if it's related to this issue, but patch http://review.gluster.org/#/c/15768/ has been submitted to address that.
No, I did not try to create new volumes with 3.8.5. Unfortunately, this is more or less a production environment and we have no test environment right now, so I can't try to reproduce this or run any tests...
From the log I cannot see anything suspicious:

[2016-11-01 08:58:31.174381] I [MSGID: 104041] [glfs-resolve.c:885:__glfs_active_subvol] 0-pool: switched to graph 66617468-6572-2d32-3437-39342d323031 (0)
block I/O error in device 'drive-ide0-0-0': Input/output error (5)

http://review.gluster.org/#/c/15768/ would probably only fix this if there were async calls made before?
Hello!

Just tried the upgrade from 3.8.4 to 3.8.5 on CentOS 7 again. The problem reproduced immediately. Any progress on fixing this for 3.8.x?

Thank you!
(In reply to Need Real Name from comment #6) > Hello! > > Just tried to reproduce upgrade from 3.8.4 to 3.8.5 on Centos 7. > Immediately reproduced. > Any progress in fixing this problem for 3.8.x? > > Thank you! Hello there, the issue is already fixed by patch [1] in the gluster 3.8 branch, which I have tested. Please try the latest glusterfs 3.8.z [2] and let us know if you have any more problems. [1] - http://review.gluster.org/#/c/15779/1 [2] - http://buildlogs.centos.org/centos/7/storage/x86_64/gluster-3.8/ Thanks again for tracking the issue and looking for the solution.
Rajesh, since this bug is already fixed, can we move it to the appropriate state, so that users will not be confused by the bug's state?
Installed 3.8.7; everything is fine. Thank you! Btw, it is very strange that 3.8.5 is the latest version in the stable SIG repository :-(
(In reply to Need Real Name from comment #9) > Installed 3.8.7 , everything is fine. > Thank you! > > btw, very strange that 3.8.5 is last version in stable SIG repository :-( Happy to hear that it worked for you!! @kshlm, by any chance, do you know why the stable SIG repository still has glusterfs-3.8.5? It should carry the latest glusterfs release, shouldn't it?
The following patch fixes this issue, hence moving the bug state to MODIFIED. http://review.gluster.org/15779
This bug is getting closed because the 3.8 version is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days