| Summary: | Rename of file seems to have a problem | | |
|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Jacob Shucart <jshucart> |
| Component: | glusterfs | Assignee: | Anand Avati <aavati> |
| Status: | CLOSED ERRATA | QA Contact: | SATHEESARAN <sasundar> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 1.0 | CC: | chrisw, gluster-bugs, kbarfiel, sdharane, shaines |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-09-23 22:32:34 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | | | |

Created attachment 555628: Logs at the time of the issue

Created attachment 555629: Sample data to run test on

Created attachment 555630: Test script to reproduce issue

This bug is not seen in the current master branch (which will be branched as RHS 2.1.0 soon). Before considering it for a fix, I want to make sure this bug still exists on RHS servers. If it cannot be reproduced, I would like to close it.

I am not seeing this issue while running the test script. Moving it to the VERIFIED state.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html

Description of problem:

There is a problem when renaming files. The flow of the application is below:

    =================
    GET LOCK
    Open file ‘log.last’
    If file size > MAX then
        close file
        rename log.last to log.<n>
        create and open ‘log.last’
    Write record to file.
    Close file
    UNLOCK
    =================

What I see is that sometimes, after the rename, a process will open the ‘log.last’ file, but the file it has actually opened is the renamed file.

Version-Release number of selected component (if applicable): 3.2.5

To reproduce the issue, I have attached 3 files: the source of the test application, the test environment with the sample data I run the test on, and the Gluster logs at the time of the issue.

glusterfs.tar.gz: Contains the gluster log created during the test.

Test_src.tar.gz: Contains the source code for the test application.

Td_test.tar.gz: Contains the test environment and test log files from the last test.

This is the gluster volume info:

    ===================
    [root@localhost ~]# gluster volume info all
    Volume Name: glustervol
    Type: Distributed-Replicate
    Status: Started
    Number of Bricks: 2 x 2 = 4
    Transport-type: tcp
    Bricks:
    Brick1: 192.168.3.13:/xfs1
    Brick2: 192.168.3.14:/xfs1
    Brick3: 192.168.3.15:/xfs1
    Brick4: 192.168.3.16:/xfs1
    Options Reconfigured:
    performance.flush-behind: On
    geo-replication.indexing: Off
    performance.cache-refresh-timeout: 1
    performance.read-ahead: on
    performance.quick-read: off
    [root@localhost ~]#
    ===================

If you look in the logs folder in the test environment, you will see that log.1 and log.2 are identical, which indicates that ‘log.last’ was renamed twice.

If you do a dump of the logs (./TDLogTest T logs 1) you will also see the last records:

    194 ( 1018677) {DEADBEAF, 1018677, 4, 6, 4092}
    195 ( 1022805) {DEADBEAF, 1022805, 5, 6, 1240}
    196 ( 1024081) {0, 1024081, 3, 0, 0}
    197 ( 1024117) {0, 1024081, 6, 0, 0}

The last 2 records are log-terminating records, of which there should be only one.

The test code is fairly straightforward. You will see that I use flock() on the file logs/logInfo to control access to updating the logs. (There are a few things in the logs folder that are left over from older versions of the test, and you will also find code in the test which is no longer used.)

    =======================

To perform the test:

Copy the test environment into the gluster volume.

Before starting the test, one machine needs to reset the test environment:

    ./resetTDLogTest

To run the test, each machine involved in the test executes the script:

    ./startTDLogTest

This will start 3 test writers and 1 test reader in the background. A test client will stop if an error is detected.

To manually stop the test, execute the following script on one of the test machines:

    ./stopTDLogTest

You can dump the records in a log with the command:

    ./TDLogTest T logs <log number>
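
For readers without the attachments, the application flow from the description can be sketched roughly as below. This is a minimal Python sketch, not the attached test program: the function names `append_record` and `next_log_number`, the `MAX_SIZE` value, and the scheme for choosing `<n>` are all assumptions made for illustration. Only the lock file name `logInfo`, the file name `log.last`, and the order of the steps come from the report. The inode comparison at the end is an added diagnostic, again an assumption, that would flag the symptom described: the handle opened as `log.last` pointing at the file that was just renamed.

```python
import fcntl
import os

MAX_SIZE = 1024 * 1024  # hypothetical rotation threshold in bytes


def next_log_number(log_dir):
    """Return the smallest n >= 1 for which log.<n> does not exist (assumed scheme)."""
    n = 1
    while os.path.exists(os.path.join(log_dir, "log.%d" % n)):
        n += 1
    return n


def append_record(log_dir, record, max_size=MAX_SIZE):
    """Append one record to log.last under an exclusive flock, rotating if needed.

    Returns True if the open handle and the path log.last refer to the same
    inode at write time, False if they differ (the symptom in this report).
    """
    lock_path = os.path.join(log_dir, "logInfo")
    last_path = os.path.join(log_dir, "log.last")

    with open(lock_path, "a+") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)            # GET LOCK
        try:
            log = open(last_path, "ab")                  # open 'log.last'
            if os.fstat(log.fileno()).st_size > max_size:
                log.close()                              # close file
                rotated = os.path.join(
                    log_dir, "log.%d" % next_log_number(log_dir))
                os.rename(last_path, rotated)            # rename log.last to log.<n>
                log = open(last_path, "ab")              # create and open 'log.last'

            # Diagnostic (assumption): does the handle match the path?
            same_file = (os.fstat(log.fileno()).st_ino
                         == os.stat(last_path).st_ino)

            log.write(record)                            # write record to file
            log.close()                                  # close file
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)        # UNLOCK
    return same_file
```

On a local filesystem the function should always return True. Running several writers against the same directory on the gluster mount and logging any False result would be one way to catch the stale open at the moment it happens, assuming inode numbers are reported consistently across clients.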