Bug 1474726

Summary: fsfreeze -f / will cause system to stuck on readonly operation
Product: Red Hat Enterprise Linux 7 Reporter: Marcel Gazdík <mgazdik>
Component: kernelAssignee: fs-maint
kernel sub component: VFS QA Contact: Filesystem QE <fs-qe>
Status: CLOSED NOTABUG Docs Contact:
Severity: high    
Priority: medium CC: aviro, bfoster, esandeen, mszeredi, socketpair, vfeenstr
Version: 7.3   
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-07-25 15:13:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Marcel Gazdík 2017-07-25 09:18:58 UTC
Description of problem:
  We are using rsync to make a backup of a system. To make sure, the backup will not be corrupted, the / is being frozen. 
  strace -Ttfv cat /var/log/messages will get stuck on close operation, please have a look at additional info bellow.

Version-Release number of selected component (if applicable):
kernel 3.10.0-514.26.2.el7.x86_64, tested even with elder (EL/Centos 7.1, 7.2)

How reproducible:
  This happens during LeApp tool integration tests (https://github.com/GAZDOWN/leapp/tree/enable-fsfreeze). The test causing this is: httpd-stateless.feature - "Migrate between remote hosts using rsync - systemd". Unfortunately we were unable to narrow this down to make small reproducer (yet).


Steps to Reproduce:
1. git clone https://github.com/GAZDOWN/leapp/tree/enable-fsfreeze
2. run integration tests. 
3. after a minute or so, when the "Migrate between remote hosts using rsync - systemd" scenario is running, the source system will stop responding. 

Actual results:
  System get stuck completely, sometimes even fsfreeze -u / doesn't respond and the machine must be reset

Expected results:
  Read only operation should not lock the system.

Additional info:
08:20:49 read(3, "4 10:23:44 localhost systemd: Re"..., 65536) = 65536 <0.000994>
08:20:49 write(1, "4 10:23:44 localhost systemd: Re"..., 65536) = 65536 <0.000125>
08:20:49 read(3, "tworkManager[579]: <warn>  [1499"..., 65536) = 65536 <0.000203>
08:20:49 write(1, "tworkManager[579]: <warn>  [1499"..., 65536) = 65536 <0.000216>
08:20:49 read(3, "ktop.DBus proxy (Error calling S"..., 65536) = 65536 <0.000098>
08:20:49 write(1, "ktop.DBus proxy (Error calling S"..., 65536) = 65536 <0.000096>
08:20:49 read(3, "CRITICAL **: Error while sending"..., 65536) = 65536 <0.000871>
08:20:49 write(1, "CRITICAL **: Error while sending"..., 65536) = 65536 <0.000157>
08:20:49 read(3, "DBus: The connection is closed);"..., 65536) = 65536 <0.000260>
08:20:49 write(1, "DBus: The connection is closed);"..., 65536) = 65536 <0.000121>
08:20:49 read(3, " closed\nJul  4 11:28:10 localhos"..., 65536) = 65536 <0.000095>
08:20:49 write(1, " closed\nJul  4 11:28:10 localhos"..., 65536) = 65536 <0.000120>
08:20:49 read(3, "info>  [1499168153.4404] dhcp4 ("..., 65536) = 65536 <0.000014>
08:20:49 write(1, "info>  [1499168153.4404] dhcp4 ("..., 65536) = 65536 <0.000115>
08:20:49 read(3, "8626.3760] bus-manager: could no"..., 65536) = 65536 <0.000039>
08:20:49 write(1, "8626.3760] bus-manager: could no"..., 65536) = 65536 <0.000024>
08:20:49 read(3, "host NetworkManager: (NetworkMan"..., 65536) = 65536 <0.000012>
08:20:49 write(1, "host NetworkManager: (NetworkMan"..., 65536) = 65536 <0.000026>
08:20:49 read(3, "t create org.freedesktop.DBus pr"..., 65536) = 65536 <0.000013>
08:20:49 write(1, "t create org.freedesktop.DBus pr"..., 65536) = 65536 <0.000023>
08:20:49 read(3, "nly the private D-Bus socket wil"..., 65536) = 65536 <0.000012>
08:20:49 write(1, "nly the private D-Bus socket wil"..., 65536) = 65536 <0.000023>
08:20:49 read(3, "ogin1.service'\nJul  4 12:14:58 l"..., 65536) = 65536 <0.000023>
08:20:49 write(1, "ogin1.service'\nJul  4 12:14:58 l"..., 65536) = 65536 <0.000026>
08:20:49 read(3, "ktop.DBus proxy (Error calling S"..., 65536) = 65536 <0.000012>
08:20:49 write(1, "ktop.DBus proxy (Error calling S"..., 65536) = 65536 <0.000115>
08:20:49 read(3, "tch() message: The connection is"..., 65536) = 65536 <0.000014>
08:20:49 write(1, "tch() message: The connection is"..., 65536) = 65536 <0.000026>
08:20:49 read(3, "DBus: The connection is closed);"..., 65536) = 65536 <0.000037>
08:20:49 write(1, "DBus: The connection is closed);"..., 65536) = 65536 <0.000041>
08:20:49 read(3, "n1' unit='dbus-org.freedesktop.l"..., 65536) = 65536 <0.000014>
08:20:49 write(1, "n1' unit='dbus-org.freedesktop.l"..., 65536) = 65536 <0.000034>
08:20:49 read(3, ":53:25 localhost NetworkManager["..., 65536) = 65536 <0.000116>
08:20:49 write(1, ":53:25 localhost NetworkManager["..., 65536) = 65536 <0.000031>
08:20:49 read(3, "ace parallel port driver\nJul  4 "..., 65536) = 65536 <0.000015>
08:20:49 write(1, "ace parallel port driver\nJul  4 "..., 65536) = 65536 <0.000030>
08:20:49 read(3, "rnel: No NUMA configuration foun"..., 65536) = 65536 <0.000013>
08:20:49 write(1, "rnel: No NUMA configuration foun"..., 65536) = 65536 <0.000029>
08:20:49 read(3, " 20 2]\nJul  5 07:58:34 leapp-tes"..., 65536) = 65536 <0.000014>
08:20:49 write(1, " 20 2]\nJul  5 07:58:34 leapp-tes"..., 65536) = 65536 <0.000052>
08:20:49 read(3, "krefs=False insertafter=None pat"..., 65536) = 65536 <0.000270>
08:20:49 write(1, "krefs=False insertafter=None pat"..., 65536) = 65536 <0.000031>
08:20:49 read(3, "e journal is using 4.0M (max all"..., 65536) = 65536 <0.000020>
08:20:49 write(1, "e journal is using 4.0M (max all"..., 65536) = 65536 <0.000042>
08:20:49 read(3, "sts-centos7-guest-httpd systemd:"..., 65536) = 65536 <0.000014>
08:20:49 write(1, "sts-centos7-guest-httpd systemd:"..., 65536) = 65536 <0.000029>
08:20:49 read(3, "ssfully loaded SELinux policy in"..., 65536) = 65536 <0.000014>
08:20:49 write(1, "ssfully loaded SELinux policy in"..., 65536) = 65536 <0.000028>
08:20:49 read(3, " version 17, address 0xfec00000,"..., 65536) = 65536 <0.000013>
08:20:49 write(1, " version 17, address 0xfec00000,"..., 65536) = 65536 <0.000029>
08:20:49 read(3, "h0): dhclient started with pid 6"..., 65536) = 65536 <0.000014>
08:20:49 write(1, "h0): dhclient started with pid 6"..., 65536) = 65536 <0.000040>
08:20:49 read(3, "tos7-guest-httpd systemd-logind:"..., 65536) = 45589 <0.000016>
08:20:49 write(1, "tos7-guest-httpd systemd-logind:"..., 45589) = 45589 <0.000024>
08:20:49 read(3, "", 65536)             = 0 <0.000006>
08:20:49 close(3


[root@leapp-tests-centos7-guest-httpd ~]# ps
  PID TTY          TIME CMD
 1029 pts/1    00:00:00 bash
 1997 pts/1    00:00:00 strace
 1998 pts/1    00:00:00 tail
 2003 pts/1    00:00:00 cat
 2004 pts/1    00:00:00 ps


2003:
[root@leapp-tests-centos7-guest-httpd 2003]# cat stack
[<ffffffff8120153e>] __sb_start_write+0xde/0x110
[<ffffffffa020b114>] xfs_trans_alloc+0x24/0x40 [xfs]
[<ffffffffa01e9fe1>] xfs_free_eofblocks+0x111/0x260 [xfs]
[<ffffffffa0200aee>] xfs_release+0x9e/0x170 [xfs]
[<ffffffffa01f0ce5>] xfs_file_release+0x15/0x20 [xfs]
[<ffffffff812007d9>] __fput+0xe9/0x260
[<ffffffff81200a8e>] ____fput+0xe/0x10
[<ffffffff810ad1e7>] task_work_run+0xa7/0xe0
[<ffffffff8109d7a5>] ptrace_notify+0xb5/0xc0
[<ffffffff81039075>] tracehook_report_syscall_exit+0x45/0xe0
[<ffffffff81039447>] syscall_trace_leave+0x77/0x110
[<ffffffff81697aa2>] int_check_syscall_exit_work+0x34/0x3d
[<ffffffffffffffff>] 0xffffffffffffffff


[root@leapp-tests-centos7-guest-httpd 2003]# cat stat
2003 (cat) D 1997 1997 1029 34817 2006 4202496 209 0 1 0 0 0 0 0 20 0 1 0 64625 110501888 153 18446744073709551615 4194304 4238828 140735554496720 140735554496088 139761421007824 0 0 0 0 18446744071580947774 0 0 17 0 0 0 0 0 0 6339656 6341408 35041280 140735554500320 140735554500342 140735554500342 140735554502635 0

[root@leapp-tests-centos7-guest-httpd fd]# echo *
0 1 2


1197:
[root@leapp-tests-centos7-guest-httpd 1997]# cat stack
[<ffffffff8109c364>] do_signal_stop+0x134/0x2b0
[<ffffffff8109da2c>] get_signal_to_deliver+0x27c/0x6d0
[<ffffffff8102a427>] do_signal+0x57/0x6c0
[<ffffffff8102aaef>] do_notify_resume+0x5f/0xb0
[<ffffffff81697abd>] int_signal+0x12/0x17
[<ffffffffffffffff>] 0xffffffffffffffff

[root@leapp-tests-centos7-guest-httpd 1997]# cat stat
1997 (strace) T 1029 1997 1029 34817 1029 1077944576 383 78 1 0 0 11 0 0 20 0 1 0 64614 4816896 170 18446744073709551615 4194304 4623828 140726426387312 140726426386504 140619722584874 0 0 3145728 20487 18446744071579485028 0 0 17 0 0 0 0 0 0 6721040 6722544 29405184 140726426392272 140726426392307 140726426392307 140726426394600 0

[root@leapp-tests-centos7-guest-httpd fd]# echo *
0 1 2


rsync:
root      1985  0.0  0.5 193380  2804 ?        Ss   08:34   0:00 sudo rsync --server --sender -logDtpAXre.iLsfxC . /
root      1994  1.8  8.4 225900 42280 ?        R    08:34   0:08 rsync --server --sender -logDtpAXre.iLsfxC . /

[root@leapp-tests-centos7-guest-httpd 1985]# cat stack
[<ffffffff81213465>] poll_schedule_timeout+0x55/0xb0
[<ffffffff81213de1>] do_select+0x6d1/0x7c0
[<ffffffff812140ab>] core_sys_select+0x1db/0x300
[<ffffffff8121428a>] SyS_select+0xba/0x110
[<ffffffff81697809>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

[root@leapp-tests-centos7-guest-httpd fd]# echo *
0 1 2 3 5 6 7



[root@leapp-tests-centos7-guest-httpd 1994]# cat stack
[<ffffffffa01fc645>] xfs_iget+0x175/0x750 [xfs]
[<ffffffffa0205557>] xfs_lookup+0xf7/0x140 [xfs]
[<ffffffffa02021fb>] xfs_vn_lookup+0x7b/0xd0 [xfs]
[<ffffffff81208cad>] lookup_real+0x1d/0x50
[<ffffffff81209622>] __lookup_hash+0x42/0x60
[<ffffffff81684412>] lookup_slow+0x42/0xa7
[<ffffffff8120cb43>] path_lookupat+0x773/0x7a0
[<ffffffff8120cb9b>] filename_lookup+0x2b/0xc0
[<ffffffff812105b7>] user_path_at_empty+0x67/0xc0
[<ffffffff81210621>] user_path_at+0x11/0x20
[<ffffffff81203a93>] vfs_fstatat+0x63/0xc0
[<ffffffff81204061>] SYSC_newlstat+0x31/0x60
[<ffffffff812042ee>] SyS_newlstat+0xe/0x10
[<ffffffff81697809>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff


[root@leapp-tests-centos7-guest-httpd fd]# echo *
0 1 2 3


When I run ls fd:

root@leapp-tests-centos7-guest-httpd fd]# strace -Ttfv ls

^^^^
Command is not even executed unless the system has been manually unfrozen

Comment 2 Brian Foster 2017-07-25 12:25:56 UTC
(In reply to Marcel Gazdík from comment #0)
> Description of problem:
>   We are using rsync to make a backup of a system. To make sure, the backup
> will not be corrupted, the / is being frozen. 

This seems like a slightly odd use case for freeze. Why not use a snapshot or remount read-only? The expected behavior for any incidental write operation on a frozen fs is going to be to block indefinitely, and not all writes are explicit from userspace.

>   strace -Ttfv cat /var/log/messages will get stuck on close operation,
> please have a look at additional info bellow.
> 

I'm assuming /var is part of the root fs..? Also, what is the purpose of the 'cat /var/log/messages' here? Wouldn't writes to this file block? Does the blocking behavior occur without this?

...
> [root@leapp-tests-centos7-guest-httpd ~]# ps
>   PID TTY          TIME CMD
>  1029 pts/1    00:00:00 bash
>  1997 pts/1    00:00:00 strace
>  1998 pts/1    00:00:00 tail
>  2003 pts/1    00:00:00 cat
>  2004 pts/1    00:00:00 ps
> 
> 
> 2003:
> [root@leapp-tests-centos7-guest-httpd 2003]# cat stack
> [<ffffffff8120153e>] __sb_start_write+0xde/0x110
> [<ffffffffa020b114>] xfs_trans_alloc+0x24/0x40 [xfs]
> [<ffffffffa01e9fe1>] xfs_free_eofblocks+0x111/0x260 [xfs]
> [<ffffffffa0200aee>] xfs_release+0x9e/0x170 [xfs]
> [<ffffffffa01f0ce5>] xfs_file_release+0x15/0x20 [xfs]
> [<ffffffff812007d9>] __fput+0xe9/0x260
> [<ffffffff81200a8e>] ____fput+0xe/0x10
> [<ffffffff810ad1e7>] task_work_run+0xa7/0xe0
> [<ffffffff8109d7a5>] ptrace_notify+0xb5/0xc0
> [<ffffffff81039075>] tracehook_report_syscall_exit+0x45/0xe0
> [<ffffffff81039447>] syscall_trace_leave+0x77/0x110
> [<ffffffff81697aa2>] int_check_syscall_exit_work+0x34/0x3d
> [<ffffffffffffffff>] 0xffffffffffffffff
> 

As the stack above shows, release is not necessarily a read-only operation. XFS can allocate post-eof blocks at write time to help reduce fragmentation. These blocks may be truncated on file release, which is an fs modification and thus requires a transaction (which is where a frozen fs blocks write operations).

...
> [root@leapp-tests-centos7-guest-httpd 1994]# cat stack
> [<ffffffffa01fc645>] xfs_iget+0x175/0x750 [xfs]
> [<ffffffffa0205557>] xfs_lookup+0xf7/0x140 [xfs]
> [<ffffffffa02021fb>] xfs_vn_lookup+0x7b/0xd0 [xfs]
> [<ffffffff81208cad>] lookup_real+0x1d/0x50
> [<ffffffff81209622>] __lookup_hash+0x42/0x60
> [<ffffffff81684412>] lookup_slow+0x42/0xa7
> [<ffffffff8120cb43>] path_lookupat+0x773/0x7a0
> [<ffffffff8120cb9b>] filename_lookup+0x2b/0xc0
> [<ffffffff812105b7>] user_path_at_empty+0x67/0xc0
> [<ffffffff81210621>] user_path_at+0x11/0x20
> [<ffffffff81203a93>] vfs_fstatat+0x63/0xc0
> [<ffffffff81204061>] SYSC_newlstat+0x31/0x60
> [<ffffffff812042ee>] SyS_newlstat+0xe/0x10
> [<ffffffff81697809>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> 

It's not clear to me where this is blocked. What task is this?

Comment 3 Eric Sandeen 2017-07-25 13:39:57 UTC
Manually freezing the root filesystem is fraught with danger; you really should not expect this to be robust in any way.  As Brian says, a volume snapshot would be a much saner way to get a point in time view of the filesystem, if that's what you need.

From atime updates to eof block releases, etc, there are all kinds of things at any given time which may need write access, and will block, possibly blocking more things behind them.

This doesn't look at all like an xfs bug, this looks like a misapplication of the fsfreeze tool.

Comment 4 Marcel Gazdík 2017-07-25 14:11:13 UTC
> This seems like a slightly odd use case for freeze. Why not use a snapshot
> or remount read-only? The expected behavior for any incidental write
> operation on a frozen fs is going to be to block indefinitely, and not all
> writes are explicit from userspace.

The purpose is to keep the machine up as long as possible and without any physical interventions from the user (like i.e. booting to a rescue mode). We cannot mount / as read only while it is being used. LVM and other snapshots is one of our future plan, but so far we are trying to use fsfreeze only.

 
> I'm assuming /var is part of the root fs..? Also, what is the purpose of the
> 'cat /var/log/messages' here? Wouldn't writes to this file block? Does the
> blocking behavior occur without this?

Yeah, you are right, /var is part of /. And the purpose of cat command is simply just a show, that we are blocked on close call. We tried multiple files, all of them had same issue, no matter whether we tried to write into them or not.


> As the stack above shows, release is not necessarily a read-only operation.
> XFS can allocate post-eof blocks at write time to help reduce fragmentation.
> These blocks may be truncated on file release, which is an fs modification
> and thus requires a transaction (which is where a frozen fs blocks write
> operations).

OK, so does this means for us, that it is intended behavior even when the fs should be frozen?


> It's not clear to me where this is blocked. What task is this?
task 1994 is: rsync --server --sender -logDtpAXre.iLsfxC . /

I included this stack in case it will be somehow helpful, so if it doesn't contain any information relevant to you, you can simply ignore that.

Comment 5 Eric Sandeen 2017-07-25 15:13:56 UTC
I'm going to close this NOTABUG, as there is no evidence of any flaw in freeze handling or logic.

Operations such as "cat" or "close" may seem read-only from the user perspective, but they may in fact require metadata writes which will block.

A manual freeze of the root filesystem is dangerous and is almost never the right tool for the job.

Thanks,
-Eric

Comment 6 Коренберг Марк 2023-10-09 15:25:25 UTC
Kernel 6.3.8. Exactly the same bug: rsync + XFS + close() hangs.

Comment 7 Коренберг Марк 2023-10-09 15:37:27 UTC
https://bugzilla.kernel.org/show_bug.cgi?id=205833 the same bug