Bug 917336
| Field | Value | Field | Value |
|---|---|---|---|
| Summary: | vdsm: VDSM is stuck during prepareForShutdown | | |
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Elad <ebenahar> |
| Component: | vdsm | Assignee: | Yaniv Bronhaim <ybronhei> |
| Status: | CLOSED NOTABUG | QA Contact: | Elad <ebenahar> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.2.0 | CC: | bazulay, danken, ewarszaw, fsimonce, hateya, iheim, lpeer, ykaul |
| Target Milestone: | --- | Target Release: | 3.2.0 |
| Hardware: | x86_64 | OS: | Unspecified |
| Whiteboard: | infra | Fixed In Version: | |
| Doc Type: | Bug Fix | Doc Text: | |
| Story Points: | --- | Clone Of: | |
| Environment: | | Last Closed: | 2013-03-27 11:02:55 UTC |
| Type: | Bug | Regression: | --- |
| Mount Type: | --- | Documentation: | --- |
| CRM: | | Verified Versions: | |
| Category: | --- | oVirt Team: | Infra |
| RHEL 7.3 requirements from Atomic Host: | | Cloudforms Team: | --- |
| Target Upstream Version: | | Embargoed: | |
| Attachments: | logs (attachment 704528) | | |
This happens because we call prepareForShutdown each time we get a SIGTERM signal, or when we manually shut down vdsm (fencing); we receive this signal a few times until we finish executing prepareForShutdown. I prefer to keep this logic and remove the print:

```
(prepareForShutdown) cannot run prepareForShutdown concurrently
```

This print doesn't help the user anyway. Agreed?

(In reply to comment #1)
> This happens because we call prepareForShutdown each time we get a SIGTERM
> signal, or when we manually shut down vdsm (fencing); we receive this signal
> a few times until we finish executing prepareForShutdown. I prefer to keep
> this logic and remove the print. This print doesn't help the user anyway.

That's killing the messenger. The problem is NOT the multiple events, but the fact that it took vdsm a long (infinite?) time to shut down. Dropping the print would make the logs nicer, but may make it a bit harder to debug such cases.

(In reply to comment #2)
> That's killing the messenger. The problem is NOT the multiple events, but
> the fact that it took vdsm a long (infinite?) time to shut down.
I'm not sure that is such a long time; it's less than 40 seconds until the restart:

```
MainThread::INFO::2013-02-28 17:56:13,852::logUtils::37::dispatcher::(wrapper) Run and protect: prepareForShutdown(options=None)
MainThread::INFO::2013-02-28 17:56:53,445::vdsm::89::vds::(run) I am the actual vdsm 4.10-9.0 nott-vds2.qa.lab.tlv.redhat.com (2.6.32-353.el6.x86_64)
```

There are concurrent calls to prepareForShutdown() which are logged, probably because the function is called for each task without waiting for the previous one to finish. Dumping the task statuses when starting the shutdown is desirable. Maybe these calls can be consolidated, solving the log issue(?).

Full log:

```
MainThread::INFO::2013-02-28 17:56:13,852::logUtils::37::dispatcher::(wrapper) Run and protect: prepareForShutdown(options=None)
Thread-11::DEBUG::2013-02-28 17:56:13,853::storageServer::674::ConnectionMonitor::(_monitorConnections) Monitoring stopped
MainThread::DEBUG::2013-02-28 17:56:13,856::sp::359::Storage.StoragePool::(cleanupMasterMount) master `/rhev/data-center/mnt/blockSD/a3a19511-ec12-4e02-8d69-8266cde77c21/master` is not mounted, skipping
MainThread::DEBUG::2013-02-28 17:56:13,858::sp::359::Storage.StoragePool::(cleanupMasterMount) master `/rhev/data-center/mnt/blockSD/cc18ed67-0de7-47a7-8b38-9564b0e0691c/master` is not mounted, skipping
MainThread::DEBUG::2013-02-28 17:56:13,860::sp::359::Storage.StoragePool::(cleanupMasterMount) master `/rhev/data-center/mnt/blockSD/7a6dd857-c1bd-4d70-88eb-7bdd1de17412/master` is not mounted, skipping
MainThread::DEBUG::2013-02-28 17:56:13,862::sp::359::Storage.StoragePool::(cleanupMasterMount) master `/rhev/data-center/mnt/blockSD/3ee0b491-b4c8-4adb-aa39-9dc7c23d2829/master` is not mounted, skipping
MainThread::DEBUG::2013-02-28 17:56:13,865::sp::359::Storage.StoragePool::(cleanupMasterMount) master `/rhev/data-center/mnt/blockSD/4946a9f8-9880-4eb0-9d4a-792758466d51/master` is not mounted, skipping
MainThread::DEBUG::2013-02-28 17:56:13,867::sp::353::Storage.StoragePool::(cleanupMasterMount) unmounting /rhev/data-center/mnt/blockSD/c1a71c2a-5f08-4fb5-a2e3-cbb440c3b8c5/master
MainThread::DEBUG::2013-02-28 17:56:13,976::misc::83::Storage.Misc.excCmd::(<lambda>) '/usr/bin/sudo -n /bin/umount /rhev/data-center/mnt/blockSD/c1a71c2a-5f08-4fb5-a2e3-cbb440c3b8c5/master' (cwd None)
MainThread::DEBUG::2013-02-28 17:56:14,350::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:15,351::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:16,351::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:17,352::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:18,353::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:19,353::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:20,354::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:21,355::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:22,356::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:23,356::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:24,357::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:25,358::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:26,359::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:27,360::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:28,360::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:29,361::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:30,362::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:31,363::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:32,363::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:33,364::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:34,365::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:35,366::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:36,366::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:37,367::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:38,368::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:39,369::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:40,369::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:41,370::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:42,371::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:43,372::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:44,372::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:45,373::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:46,374::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:47,375::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:48,375::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:49,376::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:50,377::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:51,378::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::DEBUG::2013-02-28 17:56:52,378::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
MainThread::INFO::2013-02-28 17:56:53,445::vdsm::89::vds::(run) I am the actual vdsm 4.10-9.0 nott-vds2.qa.lab.tlv.redhat.com (2.6.32-353.el6.x86_64)
```

I agree that the logging is not the issue (although it would be nice if we'd log *who* called prepareForShutdown) and that not logging this would only hide the problem.

1. Why is it taking vdsm 40s to shut down?
2. Why is prepareForShutdown called so many times?
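On question 2, one way to find out *who* keeps sending SIGTERM is to read the signal synchronously and inspect the sender's PID. This is only a debugging sketch for Linux, not part of vdsm (which installs an asynchronous handler instead); it relies on `signal.sigwaitinfo`, whose siginfo result carries `si_pid`:

```python
import os
import signal

# Block SIGTERM, then consume it synchronously: sigwaitinfo() returns a
# siginfo object whose si_pid identifies the sending process (Linux only).
signal.pthread_sigmask(signal.SIG_BLOCK, [signal.SIGTERM])

# Stand-in for the unknown sender; on a real host this would be external.
os.kill(os.getpid(), signal.SIGTERM)

info = signal.sigwaitinfo([signal.SIGTERM])
print("SIGTERM sent by pid %d" % info.si_pid)
```

Here `si_pid` is our own PID because the sketch signals itself; on the stuck host it would point at the actual sender.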
The spmstop script does not run in this scenario, and I'm not familiar with the reasons why we no longer use this script to avoid a duplicate mount of the master mount and to guarantee the kill after 10 SIGTERM signals.

Eduardo says that it's not a bug and that storage operations before exit can take time; maybe he can supply more information here. Most of the operations in prepareForShutdown that can take time are storage related. I explained why it is called in my first comment. Please supply more information about which storage operations should be handled in this part, what you want to do if it takes too long, and how long "too long" is for storage operations before shutdown.

About the "too long" times: what should be prepared before restarting vdsm? The SPM is the only host that has the master directory mounted, and therefore the only one that can write there. After the restart, the host will no longer be the SPM. If you are afraid that stale master mounts of a former SPM and the new one will corrupt a journaled file system, it is better to use ext2 for the master; in that case, even if the master remains mounted on the former SPM, that host will not write to it since it is not the SPM. This would reduce the required lapse before the restart.

If the vdsm interface becomes atomic, the task concept loses its validity. In addition, the SPM operations should be (and we are moving towards this) state defined; the storage state then defines the next steps. In that case, again, there is no need to save the task state, and the master dir is redundant. (Where OVF files should be saved is out of the scope of this bug.) Examples:

https://bugzilla.redhat.com/show_bug.cgi?id=890922
https://bugzilla.redhat.com/show_bug.cgi?id=917363

This makes the tasks dir, and with it the master mount, redundant.

The log is a product of prepareForShutdown being called in a loop; look at the precise 1-second deltas between calls in the log. (This is not an RT kernel.)
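The loop described above hits vdsm's concurrency guard once per second, producing one log line per extra SIGTERM. The guard's behaviour can be sketched like this; a minimal illustration of the idea, with hypothetical names and structure, not vdsm's actual clientIF code:

```python
import threading

_shutdown_guard = threading.Semaphore(1)  # hypothetical single-entry guard

def prepare_for_shutdown():
    # Every SIGTERM lands here; only the first caller gets to do the work.
    if not _shutdown_guard.acquire(blocking=False):
        # This is the debated debug print: one line per extra SIGTERM.
        print("cannot run prepareForShutdown concurrently")
        return False
    # ... stop monitors, clean master mounts, release the SPM lease ...
    # The guard is deliberately never released: shutdown must run only once.
    return True
```

Because the guard is never released, every signal after the first produces only the log line, which is exactly the repeating pattern in the log above.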
(In reply to comment #5)
> The spmstop script does not run in this scenario, and I'm not familiar with
> the reasons why we no longer use this script to avoid a duplicate mount of
> the master mount and to guarantee the kill after 10 SIGTERM signals.
>
> Eduardo says that it's not a bug and that storage operations before exit can
> take time; maybe he can supply more information here.

The fact that storage operations *can* take even several minutes doesn't mean that 40s here is OK. If the problem here is that umount is taking a long time, then fine, but I see no evidence of this in any of the above comments.

> Most of the operations in prepareForShutdown that can take time are storage
> related. I explained why it is called in my first comment.

No, you didn't; you explained *when* it is called, but not *why*. *Who* is sending SIGTERM so many times, and *why*? Also, if this is the SPM and it lost the lease, shouldn't it have been fenced?

> Please supply more information about which storage operations should be
> handled in this part, what you want to do if it takes too long, and how long
> "too long" is for storage operations before shutdown.

The shutdown process is a mix of several unclear operations, and it seems to include a lot of code that we don't use.

The spmstop script does not run at all, and I'm not sure whether spmprotect still runs, because I can't catch it running and its log is always empty (it might be that we stopped using those scripts).

The shutdown procedure starts by closing all tasks and cleaning all mounts; during that, vdsm somehow receives SIGTERM signals. I tried to find where in the code we send those signals, but I'm not sure; they might come from outside vdsm, as the SPM scripts used to do.

How should the shutdown process work? Why doesn't spmstop run at all? Why do we need to close the tasks before fencing (who cares about the tasks... they can persist in the future anyway)? What is the engine's part in this process, if any?
And if Eduardo wants to omit the master mount, why do we still kill vdsm instead of just starting the failover procedure?

Please help me clarify the shutdown process implementation and I'll be able to investigate it more.

(In reply to comment #8)
> The spmstop script does not run at all, and I'm not sure whether spmprotect
> still runs, because I can't catch it running and its log is always empty
> (it might be that we stopped using those scripts).

SANLock is protecting the SPM; check sanlock.log.

> How should the shutdown process work? Why doesn't spmstop run at all? Why do
> we need to close the tasks before fencing? What is the engine's part in this
> process, if any? And if Eduardo wants to omit the master mount, why do we
> still kill vdsm instead of just starting the failover procedure?

I'm not sure all these questions are relevant at all, because no one here has shown what is taking 40s... Once that question is answered we can pinpoint the problem and start discussing the process and solutions. Theorizing about what is going on is a waste of time.

I ran it with NFS and iSCSI storage domains, and it happens only with iSCSI: during the cleanupMasterMount call in sp.py we get stuck, and vdsm doesn't continue at all until it gets SIGKILL.

During that time vdsm receives SIGTERM every second for 35 seconds until the SIGKILL, and I still don't know where the signals come from (that's why I asked the previous questions...
and I hope someone can fill in the answers if spmstop.sh does not run).

Please correct me if I'm wrong, but the issue here is the way we do the umount, not the way we shut down vdsm. The umount that gets stuck is:

```
MainThread::DEBUG::2013-03-19 10:17:49,873::misc::85::Storage.Misc.excCmd::(<lambda>) '/usr/bin/sudo -n /bin/umount /rhev/data-center/mnt/blockSD/978a9c71-c5fb-4604-8643-d9f5658c9029/master' (cwd None)
```

Any storage-side suggestion how to fix that?

(In reply to comment #10)
> During that time vdsm receives SIGTERM every second for 35 seconds until the
> SIGKILL, and I still don't know where the signals come from.

The SIGTERM/SIGKILL signals are coming from the sanlock daemon. spmstop.sh is not called because it is part of the SafeLease implementation (clusterlock.py), and here the cluster lock is managed by sanlock. The timeout for SIGTERM is set to 40 seconds in sanlock:

```
src/sanlock_internal.h:#define DEFAULT_GRACE_SEC 40
```

The prints appear for each SIGTERM signal received from sanlock, and they mean that vdsm is trying to shut down. Because the mount was stuck, sanlock sent SIGKILL after 40 seconds and vdsm was restarted. The prepareForShutdown mechanism works properly.
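On the "any storage-side suggestion" question: since sanlock only grants a 40-second grace period, one conceivable mitigation is to bound the blocking umount call so the rest of the shutdown can proceed. This is purely a sketch under the assumption that giving up on the umount is acceptable here; `run_bounded` is a hypothetical helper, not vdsm code:

```python
import subprocess

def run_bounded(cmd, timeout):
    """Run an external command, but give up after `timeout` seconds instead
    of hanging forever (e.g. an umount against unreachable iSCSI storage).
    Returns True if the command finished (any exit status), False on timeout.
    """
    try:
        # subprocess.run() kills the child and re-raises TimeoutExpired
        # if the deadline passes.
        subprocess.run(cmd, timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        return False

# e.g., bound the unmount well below sanlock's DEFAULT_GRACE_SEC of 40:
# run_bounded(["umount", "/rhev/data-center/mnt/blockSD/978a9c71-c5fb-4604-8643-d9f5658c9029/master"], 30)
```

Whether a timed-out umount should then be retried lazily or simply abandoned before the restart is exactly the storage question raised above.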
Created attachment 704528 [details]
logs

Description of problem:
When blocking the connection between the SPM and the storage server, VDSM gets stuck in prepareForShutdown, visible as the following log entry repeating 41 times:

```
MainThread::DEBUG::2013-02-28 17:56:42,371::clientIF::228::vds::(prepareForShutdown) cannot run prepareForShutdown concurrently
```

Although VDSM does recover (fixed in a previous bug), we can still see that it gets stuck.

Version-Release number of selected component (if applicable):
vdsm-4.10.2-9.0.el6ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Have 1 host and 1 iSCSI storage domain.
2. Block the connection between the SPM and the master storage domain with iptables.
3. Check for 'cannot run prepareForShutdown concurrently' in vdsm.log --> it should appear 41 times.

Actual results:
The message appears 41 times, which shows that vdsm is stuck.

Expected results:
vdsm should not get stuck.

Additional info:
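Step 3 of the reproduction can be checked mechanically. A small sketch that counts the guard message in a vdsm log; the path is the conventional vdsm log location and should be adjusted as needed:

```python
NEEDLE = "cannot run prepareForShutdown concurrently"

def count_guard_messages(log_path="/var/log/vdsm/vdsm.log"):
    # One match per SIGTERM that hit the shutdown guard; the reporter saw 41.
    with open(log_path) as log:
        return sum(1 for line in log if NEEDLE in line)
```

A count near 40 matches sanlock's one-SIGTERM-per-second behaviour over its 40-second grace period.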