Bug 1362540
| Field | Value |
|---|---|
| Summary | glfs_fini() crashes with SIGSEGV |
| Product | [Community] GlusterFS |
| Component | libgfapi |
| Version | 3.8.1 |
| Hardware | x86_64 |
| OS | All |
| Status | CLOSED CURRENTRELEASE |
| Severity | medium |
| Priority | unspecified |
| Keywords | Triaged |
| Reporter | Prashanth Pai <ppai> |
| Assignee | Soumya Koduri <skoduri> |
| QA Contact | Sudhir D <sdharane> |
| CC | bugs, ndevos, oleksandr, pgurusid, rgowdapp, skoduri, thiago |
| Fixed In Version | glusterfs-3.8.3 |
| Doc Type | If docs needed, set a value |
| Type | Bug |
| Last Closed | 2016-08-24 10:20:46 UTC |
| Bug Depends On | 1364026 |
Description (Prashanth Pai, 2016-08-02 13:13:38 UTC)

Created attachment 1186826 [details]: Compressed core dump
Also, I can't seem to reproduce this when the test filesystem tree is small enough.

Comment 3 (Soumya Koduri)

I suspect the following could have caused the issue:

In inode_table_destroy(), we first purge all the lru entries, but the lru count is not adjusted accordingly. So when inode_table_prune() is later called, if the lru count is larger than the lru limit (as can be seen in the core), we end up accessing invalid memory.
```
(gdb) f 3
#3  0x00007fcad764100e in inode_table_prune (table=table@entry=0x7fcac0004040) at inode.c:1521
1521                __inode_retire (entry);
(gdb) p table->lru_size
$4 = 132396
(gdb) p table->lru_limit
$5 = 131072
(gdb) p table->lru
$6 = {next = 0x90, prev = 0xcafecafe}
(gdb) p &&table->lru
A syntax error in expression, near `&&table->lru'.
(gdb) p &table->lru
$7 = (struct list_head *) 0x7fcac00040b8
(gdb)
```
I will send a fix for it.
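To make the failure mode concrete, here is a minimal, self-contained C sketch of the pattern described in the comment above. The structures and function names (list_head, inode_table, inode_table_destroy_buggy, and so on) are simplified stand-ins chosen for illustration, not the actual libglusterfs code.

```c
#include <stdio.h>
#include <stdlib.h>

struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *e, struct list_head *h)
{
    e->prev = h->prev;
    e->next = h;
    h->prev->next = e;
    h->prev = e;
}

static void list_del(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = e->prev = e;
}

static int list_empty(const struct list_head *h) { return h->next == h; }

struct inode {
    struct list_head list;      /* membership in table->lru */
};

struct inode_table {
    struct list_head lru;
    unsigned int lru_size;      /* entries currently on the lru list */
    unsigned int lru_limit;
};

/* Buggy shape: empties the lru list but never touches lru_size. */
static void inode_table_destroy_buggy(struct inode_table *table)
{
    while (!list_empty(&table->lru)) {
        struct inode *inode = (struct inode *)table->lru.next;
        list_del(&inode->list);
        free(inode);
        /* BUG: table->lru_size is NOT decremented here */
    }
}

int main(void)
{
    struct inode_table table = { .lru_size = 0, .lru_limit = 2 };
    list_init(&table.lru);

    /* populate more lru entries than the limit, as a large tree would */
    for (int i = 0; i < 5; i++) {
        struct inode *inode = calloc(1, sizeof(*inode));
        list_add_tail(&inode->list, &table.lru);
        table.lru_size++;
    }

    inode_table_destroy_buggy(&table);

    /* lru_size still claims 5 entries although the list is empty */
    printf("lru_size=%u, list_empty=%d\n",
           table.lru_size, list_empty(&table.lru));

    /*
     * A later prune pass in the style of inode_table_prune() loops while
     * lru_size > lru_limit and dereferences lru.next on each iteration;
     * with the stale counter it walks freed/garbage memory (compare the
     * 0x90 / 0xcafecafe pointers in the gdb session above) and crashes.
     */
    return 0;
}
```

The general design point is that whenever an entry moves on or off the lru list, the lru_size counter has to move with it; otherwise any code that loops on the counter rather than on the list itself can run past the end.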
(In reply to Soumya Koduri from comment #3)
> I suspect below could have caused the issue -
>
> In inode_table_destroy(), we first purge all the lru entries but the lru
> count is not adjusted accordingly.

This is not true. Can you point out the code where an inode is moved into/out of the lru list but lru_size is not modified atomically?

I think we need to explore further to find out why lru_size and the lru list were out of sync before considering this bug as closed with patch http://review.gluster.org/#/c/15087/.

(In reply to Raghavendra G from comment #4)
> This is not true. Can you point out the code where an inode is moved
> into/out of lru list but lru_size is not modified atomically?

Sorry, I missed that inode_table_destroy() is the culprit. We need to fix inode_table_destroy() to update lru_size while moving inodes from the lru list to the purge list (just like inode_table_prune() does).

(In reply to Raghavendra G from comment #5)
> Sorry I missed out that inode_table_destroy is the culprit. We need to fix
> inode_table_destroy to update lru_size while moving inodes from lru to purge
> list (just like inode_table_prune).

Yes, and that is exactly what http://review.gluster.org/#/c/15087/ does. Am I missing anything?

REVIEW: http://review.gluster.org/15129 (inode: Adjust lru_size while retiring entries in lru list) posted (#2) for review on release-3.8 by Oleksandr Natalenko (oleksandr)

*** Bug 1365748 has been marked as a duplicate of this bug. ***

COMMIT: http://review.gluster.org/15129 committed in release-3.8 by Niels de Vos (ndevos)

```
commit dae860ab7e1c5a205646393f2cb80a0a06986c30
Author: Soumya Koduri <skoduri>
Date:   Thu Aug 4 16:00:31 2016 +0530

    inode: Adjust lru_size while retiring entries in lru list

    As part of inode_table_destroy(), we first retire entries in the lru
    list, but the lru_size is not adjusted accordingly. This may result in
    an invalid memory reference in inode_table_prune() if the lru_size >
    lru_limit.

    > Reviewed-on: http://review.gluster.org/15087
    > Smoke: Gluster Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > Reviewed-by: Raghavendra G <rgowdapp>
    > Reviewed-by: Prashanth Pai <ppai>

    BUG: 1362540
    Change-Id: I29ee3c03b0eaa8a118d06dc0cefba85877daf963
    Signed-off-by: Soumya Koduri <skoduri>
    Signed-off-by: Oleksandr Natalenko <oleksandr>
    Reviewed-on: http://review.gluster.org/15129
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Prashanth Pai <ppai>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Niels de Vos <ndevos>
```

This bug is being closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.3, please open a new bug report.

glusterfs-3.8.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/announce/2016-August/000059.html
[2] https://www.gluster.org/pipermail/gluster-users/
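As a closing illustration of the committed change described in the commit message above, here is a small, hedged sketch of the invariant it restores: once the destroy path decrements lru_size for every entry it retires, a prune loop whose condition is lru_size > lru_limit can no longer fire against an already emptied list. The types and names below are hypothetical simplifications, not the actual patch; the numeric values are taken from the core dump shown earlier.

```c
#include <assert.h>
#include <stdio.h>

/* hypothetical, greatly simplified stand-in for the real inode table */
struct table {
    unsigned int lru_size;   /* counter the prune loop trusts */
    unsigned int lru_limit;
    unsigned int on_list;    /* stand-in for the actual lru list length */
};

/* destroy path with the fix: retire entries AND keep lru_size in step */
static void destroy_retire_all(struct table *t)
{
    while (t->on_list) {
        t->on_list--;        /* __inode_retire(): move lru entry to purge list */
        t->lru_size--;       /* the fix: adjust lru_size while retiring */
    }
}

/* mirrors the loop condition from the inode_table_prune() backtrace */
static void prune(struct table *t)
{
    while (t->lru_size > t->lru_limit) {
        assert(t->on_list > 0);  /* without the fix this trips (in the real
                                    code, a dereference of freed memory) */
        t->on_list--;
        t->lru_size--;
    }
}

int main(void)
{
    /* lru_size/lru_limit values taken from the core dump shown earlier */
    struct table t = { .lru_size = 132396, .lru_limit = 131072,
                       .on_list = 132396 };

    destroy_retire_all(&t);
    prune(&t);               /* no-op now: lru_size is 0, not > lru_limit */

    printf("lru_size=%u, on_list=%u\n", t.lru_size, t.on_list);
    return 0;
}
```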