Bug 1402037
| Summary: | GlusterFS - Server halts updateprocess ... AGAIN | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | customercare |
| Component: | glusterd | Assignee: | Atin Mukherjee <amukherj> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.8 | CC: | anoopcs, barumuga, bugs, customercare, humble.devassy, joe, jonathansteffan, kkeithle, ndevos, ramkrsna |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-04-11 12:50:51 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | glusterfsd informations (attachment 1229035) | | |
The command that got killed seems to be this one:

```
glusterd --xlator-option *.upgrade=on -N
```

This is not starting the glusterd service; it takes care of updating the (generated) configuration files. The volfiles may need updating between releases, so this command should not be skipped. We will need to find out why this command is causing a hang.

Is this something you can reproduce at will? If so, please:

- pass a list of all running gluster services with their command-line options,
- check with strace/ltrace whether the problematic glusterd process is still doing something or is blocked,
- attach the generated configurations (from /var/lib/glusterd), and
- if possible, attach a coredump gathered with gcore (from the gdb RPM).

Could be a problem: I can upgrade a server OS only once, and this bug report is the result of it :) I can't reproduce this until the next update in 6 months.

Confirmed:

```
\_ /usr/bin/python3 /usr/bin/dnf --allowerasing --releasever=24 --setopt=deltarpm=false distro-sync
root 21609 0.0 0.1  5608  2888 pts/0 S+  13:03 0:00 \_ /bin/sh /var/tmp/rpm-tmp.5dloRw 2
root 21621 0.0 0.8 93240 18148 pts/0 Sl+ 13:03 0:00     \_ glusterd --xlator-option *.upgrade=on -N
```

And this time it took a while, but completed whatever it did on its own.

Created attachment 1229035 [details]
glusterfsd informations
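The diagnostics requested above could be collected in one go with a small helper script; this is only a sketch, not an official procedure — the `collect_diag` name, the output paths, and the 5-second trace window are assumptions (gcore ships with the gdb package, as noted above):

```shell
#!/bin/sh
# Sketch: gather diagnostics for a possibly-hung process by PID.
# collect_diag, the /tmp paths and the 5s window are assumptions.
collect_diag() {
    pid="$1"
    # 1. full command line and state of the process
    ps -o pid,stat,args -p "$pid"
    # 2. is it still making syscalls, or blocked? sample for 5 seconds
    timeout 5 strace -f -p "$pid" -o "/tmp/glusterd.$pid.strace" 2>/dev/null
    # 3. core dump without killing the process (gcore is in the gdb RPM)
    gcore -o "/tmp/glusterd.$pid.core" "$pid" >/dev/null 2>&1
    # 4. the generated volfiles
    tar czf /tmp/glusterd-config.tar.gz /var/lib/glusterd 2>/dev/null
    return 0
}
```

Run as e.g. `collect_diag "$(pgrep -f 'upgrade=on')"` while the process is stuck, then attach the files from /tmp to the bug.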
We're not going to rework the script. This is how the update process is designed to work. But if glusterd is starting volumes when *.upgrade=on is set, that might very well be a bug in glusterd.

(In reply to Kaleb KEITHLEY from comment #5)
> We're not going to rework the script. This is how the update process is
> designed to work.
>
> But if glusterd is starting volumes when *.upgrade=on is set that might very
> well be a bug in glusterd.

GlusterD doesn't modify the state of the volume/peer; it just recreates the volfiles. As Niels pointed out, we need a reproducer or at least a process trace to figure out what is causing this hang. I see that comment 3 mentions the process took a little longer but didn't hang; if that's the case, it could be because of too many volumes, where glusterd_recreate_volfiles iterates over all the volumes and generates volfiles for each of them. Is that a valid bug, then?

Hmm, too many volfiles? I had just one volume created.

Bumping again: do you have a reproducer, or in the worst case can you give us the process trace?

I can't reproduce it, as it happened while upgrading a system from F23 to F24. You can only do that once ;) and I only had one system with gluster set up. And there was only one test volume with only a few files in it at best. I believe I'm not helpful anymore. In the meantime, the system was changed from 32 to 64 bits, so the environment changed. Maybe when it gets its upgrade from 24 to 25 we may get more infos. I won't be able to strace it from the beginning, but when it hangs again, I will turn it on for you.

I'm closing this bug with insufficient data as the reason. Please feel free to reopen once you hit it again and share all the required details.
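For what it's worth, a trace covering the command from the very start does not require catching it mid-hang: the upgrade step can be re-run manually under strace. A sketch, under assumptions — the `trace_from_start` wrapper and the output path are mine, not project tooling, and glusterd should be stopped before trying this on a real server:

```shell
#!/bin/sh
# Sketch: run a command under strace so the trace covers it from startup.
# trace_from_start and the output path are assumptions.
trace_from_start() {
    out="$1"; shift
    # -f follows forks, -tt timestamps each syscall
    strace -f -tt -o "$out" "$@"
}

# On the affected server this would be:
#   trace_from_start /tmp/glusterd-upgrade.strace \
#       glusterd --xlator-option '*.upgrade=on' -N
```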
Description of problem:

I reported this same problem against FC22 -> FC23, and now with the FC23 -> FC24 upgrade I have to report it again. While the update to FC24 was being installed, the rpm scriptlet started glusterd without forking it into the background:

```
Updating : glusterfs-server-3.8.5-1.fc24.i686    977/2250
```

Result:

```
├─sshd───sshd───bash───dnf───sh───glusterd───6*[{glusterd}]
```

I was forced to kill it in a parallel ssh session:

```
warning: /var/lib/glusterd/vols/gv0/gv0.s145.resellerdesktop.de.data-brick1-gv0.vol saved as /var/lib/glusterd/vols/gv0/gv0.s145.resellerdesktop.de.data-brick1-gv0.vol.rpmsave
warning: /var/lib/glusterd/vols/gv0/trusted-gv0.tcp-fuse.vol saved as /var/lib/glusterd/vols/gv0/trusted-gv0.tcp-fuse.vol.rpmsave
warning: /var/lib/glusterd/vols/gv0/gv0.s113.resellerdesktop.de.data-brick1-gv0.vol saved as /var/lib/glusterd/vols/gv0/gv0.s113.resellerdesktop.de.data-brick1-gv0.vol.rpmsave
warning: /var/lib/glusterd/vols/gv0/gv0.tcp-fuse.vol saved as /var/lib/glusterd/vols/gv0/gv0.tcp-fuse.vol.rpmsave
warning: /var/lib/glusterd/vols/gv0/gv0-rebalance.vol saved as /var/lib/glusterd/vols/gv0/gv0-rebalance.vol.rpmsave
/var/tmp/rpm-tmp.7jn4r5: line 69: 15058 Killed      glusterd --xlator-option *.upgrade=on -N
```

Please rework this rpm scriptlet to not start the daemon at all. The next thing after updating a server is to reboot it; there is no need to start the daemon while installing the package during an OS version upgrade.
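Without dropping the volfile-regeneration step (which the maintainers say must run), one way a scriptlet could defend against a hang like this is to bound the step's runtime. This is purely an illustration under assumptions — the `run_bounded` helper and the 120-second limit are mine, not the shipped %post scriptlet:

```shell
#!/bin/sh
# Illustration only (not the shipped scriptlet): run the volfile
# regeneration with an upper time bound so a hung glusterd cannot
# stall the whole distro-sync. run_bounded and 120s are assumptions.
run_bounded() {
    limit="$1"; shift
    timeout "$limit" "$@"
    rc=$?
    # GNU timeout exits with 124 when the command was killed on timeout
    if [ "$rc" -eq 124 ]; then
        echo "warning: '$*' timed out after ${limit}s, continuing" >&2
    fi
    return "$rc"
}

# In a scriptlet this would look like:
#   run_bounded 120 glusterd --xlator-option '*.upgrade=on' -N || :
```

The trailing `|| :` would keep the package transaction going even if the regeneration fails, which is a policy choice packagers would have to weigh.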