Bug 1196026
| Summary: | quota: Used field shows data in "PB" after deleting data from volume(happens again) | ||
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Vijaikumar Mallikarjuna <vmallika> |
| Component: | quota | Assignee: | Raghavendra G <rgowdapp> |
| Status: | CLOSED WONTFIX | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | mainline | CC: | bugs, danodob, hgowtham, nlevinki, rhinduja, rhs-bugs, smohan, storage-qa-internal, vbellur |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1061068 | Environment: | |
| Last Closed: | 2018-11-21 03:05:56 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1061068 | ||
| Bug Blocks: | |||
Comment 1
Vijaikumar Mallikarjuna
2016-04-07 01:37:29 UTC
*** Bug 1336304 has been marked as a duplicate of this bug. ***

Good day.
We are still seeing this same issue on 3.8.1 on multiple volumes. Is this a confirmed fix? If so, what versions of gluster has it been added to?
When the quota's Used value changes, there are no errors in the logs, nor any other indication of a problem.
Environment:
2 primary gluster nodes on RHEL7.2
2 secondary gluster nodes on RHEL7.2 configured for geo-replication
One volume configuration:
Volume Name: appdata
Type: Replicate
Volume ID: 3bffc422-1bf3-4593-bf2e-399e0b3e2a7f
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.66.102:/data/gluster/appdata/brick1
Brick2: 192.168.66.101:/data/gluster/appdata/brick1
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
cluster.enable-shared-storage: enable
# gluster volume status appdata
Status of volume: appdata
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 192.168.66.102:/data/gluster/
appdata/brick1 49192 0 Y 8502
Brick 192.168.66.101:/data/gluster/
appdata/brick1 49194 0 Y 6600
Self-heal Daemon on localhost N/A N/A Y 22714
Quota Daemon on localhost N/A N/A Y 18396
Self-heal Daemon on 10.0.0.102 N/A N/A Y 7343
Quota Daemon on 10.0.0.102 N/A N/A Y 26234
Quota status post issue:
Path Hard-limit Soft-limit Used Available Soft-limit exceeded? Hard-limit exceeded?
-------------------------------------------------------------------------------------------------------------------------------
/ 6.0GB 80%(4.8GB) 16384.0PB 6.0GB No No
Could the 16384.0PB be the result of an evaluation expecting an unsigned integer but receiving a -1?
We have just updated to 3.8.2 but have yet to work on replicating the issue.
Thank you in advance for your assistance.
Regards,
Dan
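Dan's hypothesis is consistent with the arithmetic: 2^64 bytes is exactly 16384 binary petabytes, so a -1 byte count stored in a 64-bit unsigned field formats as 16384.0PB. A minimal sketch of the reinterpretation (illustrative only, not gluster's actual accounting code):

```python
# Sketch (not gluster's code): a -1 byte count reinterpreted as an
# unsigned 64-bit value, then formatted in binary petabytes (1 PB = 2**50 bytes).
used = -1                          # negative accounting result
as_u64 = used & (2**64 - 1)        # reinterpret as uint64 -> 18446744073709551615
print(f"{as_u64 / 2**50:.1f}PB")   # prints 16384.0PB
```

The displayed value 16384.0PB is therefore the signature of a small negative number, not of actual usage.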
Hi Dan, good day too. Was there an 'rm -rf *' performed on the mount point? Are you testing on a fresh install of 3.8.1? Could you also let us know the steps to reproduce?
--
Regards,
Manikandan Selvaganesh.

We have a few gluster volumes with this issue. One of them has a series of 'find' commands that execute an 'rm' on the output. The other gluster volume has data manipulated by an application, and we are waiting to verify the code. The 3.8.1 install had been in place for a couple of weeks, and the affected volumes had been in use for a week or so before the issue occurred. It is sporadic when the issue appears. We have disabled and re-enabled quotas between occurrences to get the volumes functional again. Currently, a non-production volume has had its quota enabled since we patched to 3.8.2. The issue has yet to reappear. Thank you for looking into this.
Regards,
Dan

We just had the non-production volume hit the bug again some time in the last 24 hours. Quota said that we were using 16.3 XB of data, whilst df showed all 100GB used and du showed 18GB. On the geo-replicated volume, df showed 93GB used and du showed 18GB. The quota log on the primary volume stopped logging at log rotation on 8/21, so we have no information on the quota behavior. The quota had been enabled for over a week without issue until now. We did verify that the other application is executing an 'rm' within sftp. So far all applications are executing deletes/removes. I saw that 3.8.3 was just released; I have not read the release notes yet. Hopefully there is a fix for this. Please let me know if you have identified anything or if you need more data from me. Thank you.
Regards,
Dan

Hi, can you paste the output of 'gluster v info', 'gluster v quota <VOLNAME> list', and the output of df -h? It would also be great if you could attach the logs from the system where you are hitting the issue.
--
Regards,
Manikandan Selvaganesh.

We had the issue again. Here is the requested information.
# gluster volume info exports_nonprod
Volume Name: exports_nonprod
Type: Replicate
Volume ID: 6a5ac071-2f33-47b7-9630-ec644e723906
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.66.102:/data/gluster/exports_nonprod/brick1
Brick2: 192.168.66.101:/data/gluster/exports_nonprod/brick1
Options Reconfigured:
features.quota-deem-statfs: on
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
features.inode-quota: on
features.quota: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
cluster.enable-shared-storage: enable
# gluster volume quota exports_nonprod list
Path Hard-limit Soft-limit Used Available Soft-limit exceeded? Hard-limit exceeded?
-------------------------------------------------------------------------------------------------------------------------------
/ 50.0GB 80%(40.0GB) 16384.0PB 73.8GB No No
# df -h /run/gluster/exports_nonprod
Filesystem Size Used Avail Use% Mounted on
localhost:exports_nonprod 50G 50G 0 100% /run/gluster/exports_nonprod
# du -sh /run/gluster/exports_nonprod
18G /run/gluster/exports_nonprod
# du -sh /data/gluster/exports_nonprod
19G /data/gluster/exports_nonprod
# du -csh /data/gluster/exports_nonprod/brick1/.[!.]* /data/gluster/exports_nonprod/brick1/* | sort -h
8.0K /data/gluster/exports_nonprod/brick1/P
16K /data/gluster/exports_nonprod/brick1/C
16K /data/gluster/exports_nonprod/brick1/.trashcan
48K /data/gluster/exports_nonprod/brick1/u
56K /data/gluster/exports_nonprod/brick1/d
56K /data/gluster/exports_nonprod/brick1/t
19G /data/gluster/exports_nonprod/brick1/.glusterfs
19G total
The other item I have found interesting: each time we have encountered this issue, the current log has had only one entry.
# cat /var/log/glusterfs/quota-mount-exports_nonprod.log
[2016-08-28 08:42:12.082238] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
We are looking at gluster 3.8.3 currently as well.
Thank you again. I hope you can assist!
Regards,
Dan
Looking at the main gluster servers today, 'df' showed use of the 50GB quota at 8.5GB whilst 'du' showed 21GB. The 21GB lines up with what our geo-replication servers report. Strange that it under-reports.
Regards,
Dan

Any further news?
Regards,
Dan

Any updates?
Thank you,
Dan

Hi, the above issue is because of negative accounting. It can happen because of a few combinations of operations. We have fixed the operations that we are aware of causing this. Being an accounting issue, you can disable and re-enable quota to get the accounting done right, or use the quota fsck script to fix the issues.
fsck script: https://review.gluster.org/#/c/glusterfs/19179/
Closing this bug as quota is not being actively developed. If seen again, the above workaround can be used to fix it.
-Hari.
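Hari's "negative accounting" explanation can be illustrated abstractly: if a delete's size contribution is subtracted from the running usage total more than once (for example, a duplicated decrement during some combination of operations), the total dips below zero, and reading that value back as unsigned produces the enormous figure seen above. This is a hypothetical toy model of the failure mode, not gluster's marker-xlator code:

```python
# Toy model of negative accounting (hypothetical; not gluster's marker xlator).
usage = 4096                  # bytes currently accounted to a directory
delta = 4096                  # size of a file being unlinked
usage -= delta                # decrement accounted once: usage == 0
usage -= delta                # duplicated decrement: usage == -4096
print(usage)                  # -4096
print(usage & (2**64 - 1))    # read back as uint64: 18446744073709547520
```

Disabling and re-enabling quota works around this because it discards the corrupted running totals and recomputes usage from scratch.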