Bug 1229353
Summary: Checking status with snapper receiving Failure (org.freedesktop.DBus.Error.NoReply)
Product: Red Hat Enterprise Linux 7
Reporter: John Pittman <jpittman>
Component: snapper
Assignee: Ondrej Kozina <okozina>
snapper sub component: general
QA Contact: Bruno Goncalves <bgoncalv>
Status: CLOSED ERRATA
Docs Contact:
Severity: medium
Priority: unspecified
CC: agk, bgoncalv, jbrassow, jpittman, lmiksik, okozina, prajnoha
Version: 7.1
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: snapper-0.1.7-10.el7
Doc Type: Bug Fix
Doc Text:
Cause: Snapper provides an API that delivers the list of files modified between two snapshots to its D-Bus clients. Regardless of its size, the list was transferred over the bus in a single D-Bus message. This was a design mistake in snapper, because the D-Bus protocol has a hard-coded message size limit of 128 MiB.
Consequence: If a client requested a file list that exceeded 128 MiB after marshalling on the server side, the client always received org.freedesktop.DBus.Error.NoReply and the snapperd daemon exited with error code 1.
Fix: The file list is now transferred between the snapper daemon and its clients over a pipe.
Result: The new D-Bus method GetFilesByPipe() returns an open pipe file descriptor from which the client reads the file list. Clients can receive larger file lists, and the snapper daemon no longer exits with return code 1 when the file list is larger than approximately 128 MiB.
Story Points: ---
Clone Of:
: 1231684
Environment:
Last Closed: 2015-11-19 14:39:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
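The fix described in the Doc Text above replaces one large reply message with a stream that the client reads from a pipe. A minimal Python sketch of that idea follows. It is not snapper's implementation: the names send_file_list and receive_file_list are hypothetical, both ends run in one process, and the newline-separated framing is an assumption (it would break on paths that contain newlines). How GetFilesByPipe() hands the descriptor to the client is not modelled here.

```python
import os
import threading


def send_file_list(paths):
    """Start streaming paths into a pipe; return the read end's descriptor.

    Hypothetical stand-in for the daemon side. The writer runs in a
    background thread so the reader can drain the pipe while it fills.
    """
    read_fd, write_fd = os.pipe()

    def writer():
        # Closing the write end (on leaving the with block) signals EOF.
        with os.fdopen(write_fd, "w", encoding="utf-8") as out:
            for path in paths:
                out.write(path + "\n")

    threading.Thread(target=writer, daemon=True).start()
    return read_fd


def receive_file_list(read_fd):
    """Yield paths one at a time from the read end (client side)."""
    with os.fdopen(read_fd, "r", encoding="utf-8") as inp:
        for line in inp:
            yield line.rstrip("\n")


# Usage: stream a list of the size used in the reproducer without ever
# holding it in a single message.
paths = (f"/dir_{i}/file_{i}" for i in range(1, 800001))
count = sum(1 for _ in receive_file_list(send_file_list(paths)))
print(count)
```

Because the reader consumes entries as they arrive, neither side needs a buffer proportional to the list size, which is the property a single bus message cannot offer.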
Description
John Pittman
2015-06-08 14:10:26 UTC
I have a feeling this is already fixed upstream, or at least I recall an upstream snapper bug report with a similar description. Let me check that... Anyway, I would like to split this bugzilla in two: one bug dealing with the segfault and a second one dealing with the NoReply error.

About the timeouts: snapper should _not_ ever time out, because the client calls dbus_connection_send_with_reply_and_block() with the timeout parameter set to infinite. So it has to be something else. Could you paste here the snapper daemon debug log capturing the NoReply error returned by the command-line utility?

Hey John, I've got the debug data and I think I know what went wrong. The bug also exists in current upstream snapper. Give me some time to investigate how difficult it would be to fix...

Created attachment 1050607
File list causing snapperd to exit with error
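Whether a given file list is large enough to hit the message size limit can be estimated from the D-Bus marshalling rules. The Python sketch below is an approximation under stated assumptions: it assumes the reply body is a single array of (path string, status uint32) structs, which is a guess at the reply layout rather than something this report confirms, and it ignores the message header. The function names are hypothetical. The 2^27 byte maximum is the value given in the D-Bus specification.

```python
# Maximum D-Bus message length per the D-Bus specification (128 MiB).
DBUS_MAX_MESSAGE_LENGTH = 2 ** 27


def align(offset, boundary):
    """Round offset up to the next multiple of boundary."""
    return (offset + boundary - 1) // boundary * boundary


def marshalled_body_size(paths):
    """Estimate the marshalled size in bytes of an array of (string, uint32).

    Assumption: one struct per path; the status value is a uint32.
    """
    offset = 4                  # uint32 holding the array length
    offset = align(offset, 8)   # padding up to the struct alignment
    for path in paths:
        offset = align(offset, 8)                    # each struct is 8-aligned
        offset += 4 + len(path.encode("utf-8")) + 1  # length, bytes, NUL
        offset = align(offset, 4)                    # uint32 is 4-aligned
        offset += 4                                  # the status value
    return offset


def exceeds_limit(paths, limit=DBUS_MAX_MESSAGE_LENGTH):
    """True if the estimated body alone is already larger than limit."""
    return marshalled_body_size(paths) > limit


# Usage: estimate the size for paths shaped like those in the reproducer.
sample = (f"/dir_{i}/file_{i}" for i in range(1, 800001))
print(marshalled_body_size(sample))
```

The limit argument is a parameter because a bus may be configured with a lower per-message limit than the protocol maximum; passing the configured value gives the estimate that matters for that bus.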
Hi, the reproducer is a bit long and time-consuming if you decide to take path B. Use path A; for the sake of completeness you can play with path B if you like. Also, I demonstrate the issue on the lvm backend, but btrfs is affected as well, so it is not related only to the lvm(xfs) or lvm(ext*) backends.

1) create a volume group named 'vg' (vgcreate vg /dev/sdx)
2) create a thin volume named 'thin_lv' (lvcreate -L5G -T vg/pool -V 30G --name thin_lv) [5 GiB should be enough for the pool size or test device size]
3) mkfs.xfs /dev/vg/thin_lv
4) mkdir /mnt/test
5) mount /dev/vg/thin_lv /mnt/test
6) create a dummy user named e.g. 'dummy_a' (useradd -m dummy_a)
7) chown dummy_a:dummy_a /mnt/test
8) snapper -c bugtest create-config -f "lvm(xfs)" /mnt/test

Now take either path A or path B. Both lead to the same error, but path B takes a long time to finish.

Path A:
a9) snapper -c bugtest create -t pre (required to result in snapshot No: 1)
a10) snapper -c bugtest create -t post --pre-num 1
a11) pkill snapperd
a12) copy the unpacked attachment into directory /mnt/test/.snapshots/2/ (filelist-1.txt)
a13) snapper -c bugtest status 1..2

Path B (this is the correct reproducer from the user's perspective, but it took a long time to finish on my VM):
b9) log in as user 'dummy_a'
b10) cd /mnt/test
b11) run script: "for i in $(seq 1 800000); do mkdir ./dir_$i; touch ./dir_$i/file_$i; done"
b12) as root run: snapper -c bugtest create -t pre (required to result in snapshot No: 1)
b13) as dummy_a run: chmod -R g-w /mnt/test/
b14) as root run: snapper -c bugtest create -t post --pre-num 1
b15) as root run: snapper -c bugtest status 1..2

Both ways should result in the response "Failure (org.freedesktop.DBus.Error.NoReply)" after the last step. The snapperd daemon will exit with retval == 1 and, unfortunately, with no meaningful log message even in --debug mode.

Thanks to upstream, we have a workaround.
Unfortunately, the workaround is only temporary (note that you need to restart the dbus service after increasing the limit): https://github.com/openSUSE/snapper/issues/176#issuecomment-120321717

If the message transferred over dbus gets larger than 128 MiB, we hit the hard dbus limit for a message passed over the bus. IOW, the workaround has its own limits as well...

Regarding step b12) and later: you can use the -p option to print the snapshot number created by the snapper create command:

snapper -c bugtest create -t pre -p

Use the number printed by the command above in step b14. So if the number was, e.g., 4, step b14 would look like:

snapper -c bugtest create -t post --pre-num 4 -p

Finally, if the number printed by the command above was 5, the last command in path B would be:

snapper -c bugtest status 4..5

You can always run "snapper -c bugtest list" to see the list of all snapshots.

Tested with snapper-0.1.7-10.el7 and kernel 3.10.0-306.0.1.el7

# vgs
VG #PV #LV #SN Attr VSize VFree
rhel_dell-pe2900-02 1 3 0 wz--n- 202.76g 50.06g
# lvcreate -L5G -T rhel_dell-pe2900-02/pool -V 30G --name thin_lv
WARNING: Sum of all thin volume sizes (30.00 GiB) exceeds the size of thin pool rhel_dell-pe2900-02/pool (5.00 GiB)!
For thin pool auto extension activation/thin_pool_autoextend_threshold should be below 100.
Logical volume "thin_lv" created.
# mkfs.xfs /dev/rhel_dell-pe2900-02/thin_lv
meta-data=/dev/rhel_dell-pe2900-02/thin_lv isize=256 agcount=16, agsize=491504 blks
         = sectsz=512 attr=2, projid32bit=1
         = crc=0 finobt=0
data     = bsize=4096 blocks=7864064, imaxpct=25
         = sunit=16 swidth=16 blks
naming   =version 2 bsize=4096 ascii-ci=0 ftype=0
log      =internal log bsize=4096 blocks=3840, version=2
         = sectsz=512 sunit=16 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
# mkdir /mnt/test
# mount /dev/rhel_dell-pe2900-02/thin_lv /mnt/test
# useradd -m dummy_a
# chown dummy_a:dummy_a /mnt/test
# snapper -c bugtest create-config -f "lvm(xfs)" /mnt/test

#### Path A ####
# snapper -c bugtest create -t pre
# snapper -c bugtest create -t post --pre-num 1
# pkill snapperd
# xz -d filelist-1.txt.xz
# cp filelist-1.txt /mnt/test/.snapshots/2/
cp: overwrite ‘/mnt/test/.snapshots/2/filelist-1.txt’? y
# snapper -c bugtest status 1..2
<snip>
.p.... /mnt/test/dir_99999/file_99999
# echo $?
0

#### Path B ####
$ cd /mnt/test
$ for i in $(seq 1 800000); do mkdir ./dir_$i; touch ./dir_$i/file_$i; done
$ exit
# snapper -c bugtest create -t pre -p
5
# su - dummy_a
$ chmod -R g-w /mnt/test
chmod: changing permissions of ‘/mnt/test/.snapshots’: Operation not permitted
chmod: cannot read directory ‘/mnt/test/.snapshots’: Permission denied
$ exit
# snapper -c bugtest create -t post --pre-num 5 -p
6
# snapper -c bugtest status 5..6
<snip>
.p.... /mnt/test/dir_99999/file_99999
# snapper -c bugtest list
Type   | # | Pre # | Date                         | User | Cleanup  | Description | Userdata
-------+---+-------+------------------------------+------+----------+-------------+---------
single | 0 |       |                              | root |          | current     |
single | 1 |       | Fri 28 Aug 2015 05:01:01 EDT | root | timeline | timeline    |
single | 2 |       | Fri 28 Aug 2015 06:01:01 EDT | root | timeline | timeline    |
single | 3 |       | Fri 28 Aug 2015 07:01:01 EDT | root | timeline | timeline    |
single | 4 |       | Fri 28 Aug 2015 08:01:02 EDT | root | timeline | timeline    |
pre    | 5 |       | Fri 28 Aug 2015 08:08:56 EDT | root |          |             |
post   | 6 | 5     | Fri 28 Aug 2015 08:13:55 EDT | root |          |             |

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2426.html