Bug 1324429 - Seeing lots of error message "JournalPlayer: unexpected tag in journal entry" while renaming snapshot
Summary: Seeing lots of error message "JournalPlayer: unexpected tag in journal entry"...
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: RBD
Version: 2.0
Hardware: x86_64
OS: Linux
unspecified
medium
Target Milestone: rc
: 2.0
Assignee: Jason Dillaman
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-04-06 10:16 UTC by Tanay Ganguly
Modified: 2017-07-30 15:28 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-06 19:01:47 UTC
Target Upstream Version:


Attachments (Terms of Use)
rbd bench crash (835.28 KB, text/plain)
2016-05-06 18:41 UTC, Tanay Ganguly
no flags Details
renaming snap crash (6.38 KB, text/plain)
2016-05-06 18:42 UTC, Tanay Ganguly
no flags Details

Description Tanay Ganguly 2016-04-06 10:16:29 UTC
Description of problem:
Seeing lots of Error message while trying to rename a Clone, when primary Image is writing some IO

Version-Release number of selected component (if applicable):
ceph version 10.1.0-1

How reproducible:


Steps to Reproduce:
1.Create an Image with all features enabled.
rbd image 'testing2':
        size 10240 MB in 2560 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.5fe8238e1f29
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, journaling
        flags: 
        journal: 5fe8238e1f29
        mirroring state: disabled
2.Create 100 Snapshot on top the image.
3.Start IO on the parent volume, rbd bench-write -p Tanay-RBD --image testing2 --io-size 10240 --io-pattern rand
4.While IO is going on, try to rename all the 100 snapshot.


Actual results:
When initially i started the renaming of the snapshot, i saw lock_owner message popping.
Check the lock, it was held by rbd bench-write

rbd lock list Tanay-RBD/testing2
There is 1 exclusive lock on this image.
Locker       ID                   Address                  
client.25493 auto 139675557707120 10.8.128.46:0/2255449993 

Then i stopped the Write process, i.e. killed the IO.
After that i am seeing the lock owner changed but the renaming failed throwing these error message.

Expected results:
There should not be any error message.

Additional info:
----------------------------------------------------------------------------
----------------------------------------------------------------------------

2016-04-06 09:30:45.176374 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:45.499403 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:45.817712 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:46.134774 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:46.451229 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:46.767731 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:47.186626 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:47.503329 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:47.818182 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:48.169025 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:48.484491 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:48.801480 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:49.118825 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:49.435677 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:49.751402 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:50.067856 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:50.486481 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:50.801566 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:51.118479 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:51.434843 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:51.750799 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:52.069363 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]
2016-04-06 09:30:52.385749 7f42b8cf6700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=102, entry_tid=27966, data size=10270]

Comment 2 Jason Dillaman 2016-04-13 03:45:11 UTC
I haven't been able to recreate this with 10.1.0 (nor current upstream master).  Can you please recreate with "debug journaler = 20" / "debug rbd = 20" in your ceph.conf and attach the logs?

Comment 3 Tanay Ganguly 2016-04-13 11:17:09 UTC
Unable to reproduce in the latest 10.1.1.1 Build.
Will Close it after few more regression test.

Comment 4 Jason Dillaman 2016-04-13 11:37:34 UTC
Sounds good -- thanks.

Comment 5 Jason Dillaman 2016-05-02 19:50:46 UTC
@Tanay: Have you had a chance to re-run the test and are you comfortable closing this ticket?

Comment 6 Tanay Ganguly 2016-05-04 15:03:32 UTC
Jason,
Will update the BUG soon.

Comment 7 Tanay Ganguly 2016-05-06 18:41:25 UTC
Jason,

While reproducing this Bug i am hitting a crash.


rbd image 'testing2':
        size 102400 MB in 25600 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.10af238e1f29
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, journaling
        flags: 
        journal: 10af238e1f29
        mirroring state: disabled


Steps:
1. Firstly i am running IO on an RBD Image.
rbd bench-write -p None --image testing2 --io-size 10240 --io-pattern rand

2, While step 1 is happening i am renaming the already created snapshot.
for i in {1..100}; do rbd snap rename None/testing2@snappy$i None/testing2@snap$i; done


Both renaming and bench-write is crashing ( This happened on 2nd iteration )

PFA, both logs (bench-write.log and renaming_snap.log)


Note: I am unable to reproduce the original issue.

Comment 8 Tanay Ganguly 2016-05-06 18:41:59 UTC
Created attachment 1154724 [details]
rbd bench crash

Comment 9 Tanay Ganguly 2016-05-06 18:42:30 UTC
Created attachment 1154725 [details]
renaming snap crash

Comment 10 Tanay Ganguly 2016-05-06 19:01:47 UTC
Marking this BUG as CLOSED as i am unable to reproduce it with latest build.

For Comment7 i have opened a new ticket after discussion with Jason.

https://bugzilla.redhat.com/show_bug.cgi?id=1333950


Note You need to log in before you can comment on or make changes to this bug.