982621 – Disk monitoring not shutting down with nsslapd-disk-monitoring-logging-critical set to off

Keywords:
Status:	CLOSED DUPLICATE of bug 972930
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	389-ds-base
Sub Component:
Version:	6.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	mreynolds
QA Contact:	Sankar Ramalingam
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2013-07-09 12:24 UTC by Ján Rusnačko
Modified:	2013-07-12 21:25 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2013-07-12 21:25:55 UTC
Target Upstream Version:
Embargoed:

Description Ján Rusnačko 2013-07-09 12:24:02 UTC

Description of problem:
The disk monitoring plugin is not triggered and does not work if nsslapd-disk-monitoring-logging-critical is set to off. With nsslapd-disk-monitoring-logging-critical set to on, it works properly.

Version-Release number of selected component (if applicable):
389-ds-base-1.2.11.15-16.el6_4.x86_64

How reproducible:
always

Steps to Reproduce:
1. Enable the disk monitoring plugin and set nsslapd-disk-monitoring-logging-critical to off:

[jrusnack@dstet dstet]$ ldapsearch -D "cn=directory manager" -w Secret123 -b "cn=config" -s base | grep "nsslapd-disk-monitoring"
nsslapd-disk-monitoring: on
nsslapd-disk-monitoring-threshold: 30000000
nsslapd-disk-monitoring-grace-period: 1
nsslapd-disk-monitoring-logging-critical: off

2. Restart DS
3. Fill up space to go below the threshold:
[jrusnack@dstet dstet]$ dd if=/dev/zero of=/var/log/dirsrv/slapd-dstet/foo bs=1M count=20
20+0 records in
20+0 records out
20971520 bytes (21 MB) copied, 0.462266 s, 45.4 MB/s
[jrusnack@dstet dstet]$ df -h /var/log/dirsrv/slapd-dstet
Filesystem            Size  Used Avail Use% Mounted on
/home/jrusnack/tmpfs   39M   25M   13M  67% /var/log/dirsrv/slapd-dstet

4. Check whether DS is in shutdown mode:

[jrusnack@dstet dstet]$ tail /var/log/dirsrv/slapd-dstet/errors
[09/Jul/2013:05:26:59 +0200] - slapd stopped.
[09/Jul/2013:05:27:48 +0200] - 389-Directory/1.2.11.15 B2013.182.2043 starting up
[09/Jul/2013:05:27:48 +0200] - slapd started.  Listening on All Interfaces port 389 for LDAP requests
[09/Jul/2013:05:28:10 +0200] - slapd shutting down - signaling operation threads
[09/Jul/2013:05:28:10 +0200] - slapd shutting down - closing down internal subsystems and plugins
[09/Jul/2013:05:28:11 +0200] - Waiting for 4 database threads to stop
[09/Jul/2013:05:28:11 +0200] - All database threads now stopped
[09/Jul/2013:05:28:12 +0200] - slapd stopped.
[09/Jul/2013:05:28:14 +0200] - 389-Directory/1.2.11.15 B2013.182.2043 starting up
[09/Jul/2013:05:28:14 +0200] - slapd started.  Listening on All Interfaces port 389 for LDAP requests

Actual results:
If the filesystem is filled so that free space drops below the threshold, the disk monitoring plugin is not triggered: it does not disable verbose logging, remove logs, or disable logging.

If the filesystem is filled so that free space drops below 1/2 of the threshold, DS does not enter the shutdown period and no error message appears in the error log.

The default behavior, in which the disk monitoring plugin is invoked every 10 seconds, was taken into account.
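
The two cases above can be told apart with a small check. This is a minimal sketch only: classify_space and avail_bytes are hypothetical helper names, not part of 389-ds-base, and treating the boundaries as strict "less than" comparisons is an assumption rather than something the report or the design doc confirms.

```shell
#!/bin/sh
# Hypothetical helpers for reasoning about the reproduction; not server code.

# Classify available space (bytes) against a disk monitoring threshold (bytes).
# Assumption: strict "less than" at both boundaries.
classify_space() {
    avail=$1
    threshold=$2
    half=$((threshold / 2))
    if [ "$avail" -lt "$half" ]; then
        # Case where the report expects the shutdown period to begin.
        echo "below-half-threshold"
    elif [ "$avail" -lt "$threshold" ]; then
        # Case where the report expects logging to be reduced.
        echo "below-threshold"
    else
        echo "ok"
    fi
}

# Print the available bytes on the filesystem that holds the given path.
# df -Pk reports 1024-byte blocks in the fourth column of the second line.
avail_bytes() {
    kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    echo $((kb * 1024))
}
```

With the values from this report (threshold 30000000 and about 13M available), classify_space "$(avail_bytes /var/log/dirsrv/slapd-dstet)" 30000000 would be expected to report the below-half-threshold case, which is the one where the shutdown period should have started.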

Expected results:
The disk monitoring plugin should be triggered.

Additional info:
Already automated in disk_monitoring testsuite.

Comment 4 mreynolds 2013-07-09 19:59:54 UTC

The server is not going into shut-down mode because you are not continuing to lose disk space.  See the design doc:

http://directory.fedoraproject.org/wiki/Disk_Monitoring

The feature only wants to shut-down the server as a last resort.  If things remain stable, it doesn't do anything.  I do think there is room for improvement in this area though.

It looks like you just consumed a chunk of disk space at one shot.  If you continue to consume disk space does it go into shutdown mode?

Comment 5 Ján Rusnačko 2013-07-09 20:07:09 UTC

(In reply to mreynolds from comment #4)
> The server is not going into shut-down mode because you are not continuing
> to lose disk space.  See the design doc:
> 
> http://directory.fedoraproject.org/wiki/Disk_Monitoring
> 
> The feature only wants to shut-down the server as a last resort.  If things
> remain stable, it doesn't do anything.  I do think there is room for
> improvement in this area though.
> 
> It looks like you just consumed a chunk of disk space at one shot.  If you
> continue to consume disk space does it go into shutdown mode?

According to the design doc: 

"Once the available disk space on any of the disks gets below the threshold we start taking action."

The space was consumed in one shot; however, since the threshold was passed, verbose logging should have been disabled as the first action, right? It was not, so I assumed the disk monitoring plugin was not even triggered when the space dropped below the threshold.

I will investigate what will happen if available space continues to drop and report back.

Comment 6 mreynolds 2013-07-09 20:15:41 UTC

(In reply to Ján Rusnačko from comment #5)
> According to the design doc: 
> 
> "Once the available disk space on any of the disks gets below the threshold
> we start taking action."
> 
> The space was consumed in one shot, however, since the threshold was passed,
> as the first thing verbose logging should get disabled, right ? It was not,
> therefore I assumed disk monitoring plugin was not even triggered when the
> space dropped below the threshold.

This is the area that can use some improvement.  It expects that the threshold will be hit first, but not past the halfway mark.  So it makes a complete pass, then on the next pass if space continues to drop it will go into shutdown mode.  But in your case, even though you are past the threshold halfway mark, it will not enter the shutdown code because it happened in one shot and did not continue to lose disk space.  

So yes, there is a bug to fix, but I just wanted you to verify this behavior.

Thanks,
Mark
> 
> I will investigate what will happen if available space continues to drop and
> report back.

Comment 7 Ján Rusnačko 2013-07-09 20:19:09 UTC

(In reply to mreynolds from comment #6)
> This is the area that can use some improvement.  It expects that the
> threshold will be hit first, but not past the halfway mark.  So it makes a
> complete pass, then on the next pass if space continues to drop it will go
> into shutdown mode.  But in your case, even though you are past the
> threshold halfway mark, it will not enter the shutdown code because it
> happened in one shot and did not continue to lose disk space.  
> 
> So yes, there is a bug to fix, but I just wanted you to verify this behavior.
OK, thanks!
> 
> Thanks,
> Mark
> > 
> > I will investigate what will happen if available space continues to drop and
> > report back.
I gradually decreased the amount of available space by 2 MB every 2 seconds; the disk monitoring plugin was not triggered. I was able to hit 0 free space without DS immediately shutting down.
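
A gradual fill like the one described here could be scripted roughly as follows. This is a sketch under assumptions: fill_gradually and the fill.N file names are hypothetical, and bs=1M relies on GNU dd as used elsewhere in this report.

```shell
#!/bin/sh
# Hypothetical reproduction helper; not part of any test suite named in this bug.
# Writes `steps` files of `chunk_mb` MiB each into `dir`, sleeping `interval`
# seconds after each write so that free space drops step by step.
fill_gradually() {
    dir=$1
    chunk_mb=$2
    interval=$3
    steps=$4
    i=1
    while [ "$i" -le "$steps" ]; do
        # Stop at the first failed write, for example when the disk is full.
        dd if=/dev/zero of="$dir/fill.$i" bs=1M count="$chunk_mb" 2>/dev/null || return 1
        sleep "$interval"
        i=$((i + 1))
    done
}
```

For the scenario in this comment, a call such as fill_gradually /var/log/dirsrv/slapd-dstet 2 2 10 would consume 2 MB every 2 seconds; the step count of 10 is an arbitrary illustration.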

Comment 8 mreynolds 2013-07-09 20:24:13 UTC

(In reply to Ján Rusnačko from comment #7)
> I have gradually decreased amount of available space by 2MB every 2 seconds
> - disk monitoring plugin was not triggered. I was able to hit 0 free space
> without DS immediately shutting down.

Yeah that's bad, and surprising.  I'll start working on this right away.

Comment 9 Ján Rusnačko 2013-07-09 20:28:45 UTC

(In reply to mreynolds from comment #8)
> Yeah that's bad, and surprising.  I'll start working on this right away.
To be precise and avoid misunderstanding: I first depleted the space below half of the threshold in one shot, and only THEN decreased it by 2 MB at a time until there was 0 free space.

Comment 10 mreynolds 2013-07-10 20:04:37 UTC

Jan, I have fixed all the reported issues.  Should I do a respin of 1.2.11, or are you still running other tests?

Thanks,
Mark

Comment 11 Ján Rusnačko 2013-07-11 07:41:18 UTC

(In reply to mreynolds from comment #10)
> Jan, I have fixed all the reported issues.  Should I do a respin of 1.2.11,
> or are you still running other tests?
> 
> Thanks,
> Mark
Hi Mark,

All the tests are automated and I have reported all issues I have found. Please continue with respin.

Thank you,
Jan

Comment 12 Nathan Kinder 2013-07-12 21:25:55 UTC

Closing this as a duplicate of bug 972930.  The fix for that bug was not correct, and caused the issue described in this report.  We will continue to use 972930 for this issue, as it is already acked for the planned 6.4.z update.

*** This bug has been marked as a duplicate of bug 972930 ***