Bug 852983

Summary: SIGHUP failed on the brick log when the disk was full.
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Vijaykumar Koppad <vkoppad>
Component: glusterd Assignee: Avra Sengupta <asengupt>
Status: CLOSED WORKSFORME QA Contact: Sudhir D <sdharane>
Severity: medium Docs Contact:
Priority: medium    
Version: 2.0 CC: amarts, bbandari, rhs-bugs, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: The issue is seen neither upstream nor downstream. Consequence: We filled up the root partition of the brick. The log file at that moment was 1 GB in size. We then renamed the log file and sent a SIGHUP to the brick process. A new, empty log file was created, and the process fd pointed to this newly created log file. However, nothing was being written to it, because there was no space left on the disk. The moment we deleted the old (renamed) log file, space was freed on the disk and the process started writing to the new log file. Fix: No fix, as the issue is not seen. Result: Logs: the console transcript is given in full in Comment 2.
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-03-12 05:15:38 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vijaykumar Koppad 2012-08-30 07:43:53 UTC
Description of problem: When the disk was full, a SIGHUP was sent to the brick process. The SIGHUP failed: the process still had an open fd on the old log file and kept writing to the old log file, even though it had created a new log file. When that old log file was deleted, the process kept writing to disk.


Version-Release number of selected component (if applicable): RHS-2.0.z


How reproducible: Could reproduce only once


Steps to Reproduce:
1. Create a distribute-replicate volume.
2. Keep running tests on the volume.
3. Fill up the root partition.
4. Send a SIGHUP to the brick process on the node where the root partition is full.
5. Check in /proc/ which fd the process is writing to.
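
Step 5 can be scripted. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper name `fds_for_path` is hypothetical and is not part of glusterfs. It compares inodes rather than names, so it still finds the descriptor after the log file has been renamed.

```python
import os

def fds_for_path(pid, path):
    """Return the fd numbers of `pid` that refer to the same inode as `path`."""
    target = os.stat(path)
    fd_dir = f"/proc/{pid}/fd"
    matches = []
    for name in os.listdir(fd_dir):
        try:
            # stat() follows the /proc link to the file that is actually open
            st = os.stat(os.path.join(fd_dir, name))
        except OSError:
            # the descriptor was closed between listdir() and stat()
            continue
        if (st.st_dev, st.st_ino) == (target.st_dev, target.st_ino):
            matches.append(int(name))
    return sorted(matches)
```

Calling it once with the renamed old log and once with the newly created log shows which of the two files the brick process still holds open. Reading another user's /proc entries usually requires root.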
  
Actual results: SIGHUP failed and the process was writing to the old fd.


Expected results: SIGHUP should succeed and the process should write to the new fd.


Additional info:

Comment 2 Avra Sengupta 2013-03-12 05:15:38 UTC
Cause: The issue is seen neither upstream nor downstream.

Consequence: We filled up the root partition of the brick. The log file at that moment was 1 GB in size. We then renamed the log file and sent a SIGHUP to the brick process. A new, empty log file was created, and the process fd pointed to this newly created log file. However, nothing was being written to it, because there was no space left on the disk. The moment we deleted the old (renamed) log file, space was freed on the disk and the process started writing to the new log file.

Fix: No fix, as the issue is not seen.

Result: Logs:
*********************************************************
[root@localhost bricks]# dd if=/dev/zero of=file4 bs=500M count=1
dd: writing `file4': No space left on device
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.11712 s, 0.0 kB/s
[root@localhost bricks]# 
[root@localhost bricks]# 
[root@localhost bricks]# 
[root@localhost bricks]# ls -lrt
total 1350272
-rw-------. 1 root root       5273 Feb 28 17:17 home-node1.log-20130127.gz
-rw-------. 1 root root       1324 Mar  3 03:33 home-node1.log-20130228.gz
-rw-------. 1 root root       4079 Mar  3 03:33 home-node1.log-20130303
-rw-------. 1 root root 1073741824 Mar 11 15:29 home-node1.log
-rw-r--r--. 1 root root  104857600 Mar 11 15:31 file1
-rw-r--r--. 1 root root  203501568 Mar 11 15:31 file2
-rw-r--r--. 1 root root     524288 Mar 11 15:31 file3
-rw-r--r--. 1 root root          0 Mar 11 15:31 file4
[root@localhost bricks]# mv home-node1.log aaaa
[root@localhost bricks]# kill -HUP 10264
[root@localhost bricks]# ls -lrt
total 1350276
-rw-------. 1 root root       5273 Feb 28 17:17 home-node1.log-20130127.gz
-rw-------. 1 root root       1324 Mar  3 03:33 home-node1.log-20130228.gz
-rw-------. 1 root root       4079 Mar  3 03:33 home-node1.log-20130303
-rw-r--r--. 1 root root  104857600 Mar 11 15:31 file1
-rw-r--r--. 1 root root  203501568 Mar 11 15:31 file2
-rw-r--r--. 1 root root     524288 Mar 11 15:31 file3
-rw-r--r--. 1 root root          0 Mar 11 15:31 file4
-rw-------. 1 root root 1073742384 Mar 11 15:33 aaaa
-rw-------. 1 root root          0 Mar 11 15:33 home-node1.log
[root@localhost bricks]# \rm -rf aaaa 
[root@localhost bricks]# ls -lrt
total 301700
-rw-------. 1 root root      5273 Feb 28 17:17 home-node1.log-20130127.gz
-rw-------. 1 root root      1324 Mar  3 03:33 home-node1.log-20130228.gz
-rw-------. 1 root root      4079 Mar  3 03:33 home-node1.log-20130303
-rw-r--r--. 1 root root 104857600 Mar 11 15:31 file1
-rw-r--r--. 1 root root 203501568 Mar 11 15:31 file2
-rw-r--r--. 1 root root    524288 Mar 11 15:31 file3
-rw-r--r--. 1 root root         0 Mar 11 15:31 file4
-rw-------. 1 root root      4750 Mar 11 15:33 home-node1.log
[root@localhost bricks]# 
*********************************************************
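
The rename-then-SIGHUP rotation exercised in the transcript above can be sketched as follows. This is an illustration of the general pattern only, not the glusterfs implementation; `ReopenableLog` and `install_sighup_handler` are hypothetical names. It assumes, consistent with the transcript, that creating an empty file can still succeed on a full disk while writes fail with ENOSPC until space is freed.

```python
import errno
import os
import signal

class ReopenableLog:
    """Append-only log whose descriptor can be reopened by path."""

    FLAGS = os.O_WRONLY | os.O_CREAT | os.O_APPEND

    def __init__(self, path):
        self.path = path
        self.fd = os.open(path, self.FLAGS, 0o600)

    def reopen(self, signum=None, frame=None):
        # Open the path again first; after a rename this creates a new file.
        new_fd = os.open(self.path, self.FLAGS, 0o600)
        old_fd, self.fd = self.fd, new_fd
        os.close(old_fd)

    def write(self, msg):
        """Return True if the line was written, False if the disk is full."""
        try:
            os.write(self.fd, (msg + "\n").encode())
            return True
        except OSError as exc:
            if exc.errno == errno.ENOSPC:
                # The descriptor stays valid; writes succeed once space is freed.
                return False
            raise

def install_sighup_handler(log):
    """Reopen `log` whenever the process receives SIGHUP."""
    signal.signal(signal.SIGHUP, log.reopen)
```

With this pattern the descriptor already points at the new file while the disk is still full, so a 0-byte new log after `kill -HUP` does not by itself mean the reopen failed.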