Bug 2075474

Summary: After an initial failure, subsequent online backups will not work.
Product: Red Hat Directory Server
Reporter: Têko Mihinto <tmihinto>
Component: 389-ds-base
Assignee: LDAP Maintainers <idm-ds-dev-bugs>
Status: CLOSED MIGRATED
QA Contact: LDAP QA Team <idm-ds-qe-bugs>
Severity: high
Docs Contact: Evgenia Martynyuk <emartyny>
Priority: high
Version: 12.2
CC: idm-ds-dev-bugs, mreynolds, pasik, tbordaz
Target Milestone: DS12.5
Keywords: Triaged
Target Release: dirsrv-12.5
Hardware: x86_64
OS: All
Whiteboard: sync-to-jira
Last Closed: 2024-06-26 13:47:40 UTC
Type: Bug

Description Têko Mihinto 2022-04-14 10:30:17 UTC
Description of problem:
Once an online backup fails, every subsequent backup attempt fails as well:

...
[14/Apr/2022:11:50:22.007888873 +0200] - INFO - task_backup_thread - Beginning backup of 'ldbm database'
[14/Apr/2022:11:50:22.010284307 +0200] - WARN - ldbm_back_ldbm2archive - Backend 'userRoot' is already in the middle of another task and cannot be disturbed.
[14/Apr/2022:11:50:22.012907630 +0200] - ERR - ldbm_back_ldbm2archive - Failed removing /local/backup_ds/backup-2022_04_14_11_50_21
[14/Apr/2022:11:50:22.016358296 +0200] - ERR - task_backup_thread - Backup failed (error -1)
[14/Apr/2022:11:50:24.696442611 +0200] - INFO - task_backup_thread - Beginning backup of 'ldbm database'
[14/Apr/2022:11:50:24.699312491 +0200] - WARN - ldbm_back_ldbm2archive - Backend 'userRoot' is already in the middle of another task and cannot be disturbed.
[14/Apr/2022:11:50:24.701486711 +0200] - ERR - ldbm_back_ldbm2archive - Failed removing /local/backup_ds/backup-2022_04_14_11_50_24
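
To confirm that an instance is stuck in this state, the errors log can be scanned for the warning shown above. The following is a minimal sketch, not part of 389-ds-base or lib389: the function name busy_backends and the regular expression are illustrative assumptions derived only from the log lines in this report.

```python
import re
from collections import Counter

# Pattern derived from the WARN line in this report's errors log excerpt.
BUSY_RE = re.compile(
    r"WARN - ldbm_back_ldbm2archive - Backend '(?P<backend>[^']+)' "
    r"is already in the middle of another task and cannot be disturbed\."
)


def busy_backends(log_lines):
    """Count, per backend, how often a backup was refused because the backend was busy."""
    counts = Counter()
    for line in log_lines:
        match = BUSY_RE.search(line)
        if match:
            counts[match.group("backend")] += 1
    return counts


# Sample lines copied from the excerpt above.
sample = [
    "[14/Apr/2022:11:50:22.010284307 +0200] - WARN - ldbm_back_ldbm2archive - "
    "Backend 'userRoot' is already in the middle of another task and cannot be disturbed.",
    "[14/Apr/2022:11:50:22.016358296 +0200] - ERR - task_backup_thread - "
    "Backup failed (error -1)",
    "[14/Apr/2022:11:50:24.699312491 +0200] - WARN - ldbm_back_ldbm2archive - "
    "Backend 'userRoot' is already in the middle of another task and cannot be disturbed.",
]
print(dict(busy_backends(sample)))
```

A backend that keeps appearing in this count across consecutive backup attempts, with no other task running, matches the behaviour described in this report.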

Version-Release number of selected component (if applicable):
$ cat /etc/redhat-release
Red Hat Enterprise Linux release 8.4 (Ootpa)
$
$ rpm -qa | grep 389-ds-base-1
389-ds-base-1.4.3.22-1.module+el8dsrv+10501+8ce33e95.x86_64
$

How reproducible:
I can reproduce the issue quite reliably.

Steps to Reproduce:
1. Start an online backup.
2. While the backup is still running, delete the backup files (under the location specified by the "nsslapd-bakdir" parameter).
3. The current backup fails.
4. Try to run online backups again. They fail with the following message in the errors log:
Backend 'XXX' is already in the middle of another task and cannot be disturbed.
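
The steps above could be automated roughly as follows. This is a hedged sketch for a disposable test instance only: the helper name reproduce, the polling interval and the timeout are assumptions, and the script must run as a user allowed to delete files under the backup directory. It also assumes that the backup command can authenticate without an interactive prompt.

```python
import os
import shutil
import subprocess
import time


def reproduce(backup_cmd, bakdir, poll_interval=0.1, timeout=60.0):
    """Run a backup, delete its files while it is in progress, then run a second backup.

    Returns the exit codes of the first and second backup commands.
    """
    before = set(os.listdir(bakdir))

    # Step 1: start an online backup in the background.
    first = subprocess.Popen(backup_cmd)

    # Step 2: while the backup is running, remove anything new under bakdir.
    deadline = time.monotonic() + timeout
    while first.poll() is None and time.monotonic() < deadline:
        for name in set(os.listdir(bakdir)) - before:
            shutil.rmtree(os.path.join(bakdir, name), ignore_errors=True)
        time.sleep(poll_interval)

    # Step 3: wait for the first backup to end.
    first_rc = first.wait()

    # Step 4: try another online backup.
    second_rc = subprocess.run(backup_cmd).returncode
    return first_rc, second_rc


# Example call with the command and backup directory that appear in this report
# (how the bind password is supplied is an assumption and is left out here):
# cmd = ["dsconf", "-D", "cn=Directory Manager", "ldap://localhost:389", "backup", "create"]
# print(reproduce(cmd, "/local/backup_ds"))
```

The returned exit codes are only a first hint; the errors log remains the place to confirm whether the second backup was refused with the message above.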

Restarting the RHDS instance fixes the issue.
This looks quite similar to bug https://bugzilla.redhat.com/show_bug.cgi?id=1642838

Actual results:
Subsequent online backups fail until the instance is restarted.

Expected results:
Successful online backups.

Additional info:

$ dsconf -v -D "cn=Directory Manager" ldap://localhost:389 backup create
...
DEBUG: complete status: -1 -> Backup failed (error -1)
DEBUG: cn=backup_2022-04-14T11:51:33.785070,cn=backup,cn=tasks,cn=config getVal('nsTaskExitCode')
DEBUG: cn=backup_2022-04-14T11:51:33.785070,cn=backup,cn=tasks,cn=config getVal('nsTaskLog')
DEBUG: cn=backup_2022-04-14T11:51:33.785070,cn=backup,cn=tasks,cn=config getVal('nsTaskWarning')
DEBUG: cn=backup_2022-04-14T11:51:33.785070,cn=backup,cn=tasks,cn=config getVal('nsTaskStatus')
DEBUG: complete status: -1 -> Backup failed (error -1)
DEBUG: cn=backup_2022-04-14T11:51:33.785070,cn=backup,cn=tasks,cn=config getVal('nsTaskExitCode')
DEBUG: cn=backup_2022-04-14T11:51:33.785070,cn=backup,cn=tasks,cn=config getVal('nsTaskLog')
DEBUG: cn=backup_2022-04-14T11:51:33.785070,cn=backup,cn=tasks,cn=config getVal('nsTaskWarning')
DEBUG: cn=backup_2022-04-14T11:51:33.785070,cn=backup,cn=tasks,cn=config getVal('nsTaskStatus')
DEBUG: complete status: -1 -> Backup failed (error -1)
DEBUG: The backup create task has failed with the error code: (-1)
Traceback (most recent call last):
  File "/usr/sbin/dsconf", line 134, in <module>
    result = args.func(inst, None, log, args)
  File "/usr/lib/python3.6/site-packages/lib389/cli_conf/backup.py", line 20, in backup_create
    raise ValueError("The backup create task has failed with the error code: ({})".format(result))
ValueError: The backup create task has failed with the error code: (-1)
ERROR: Error: The backup create task has failed with the error code: (-1)
$

Comment 2 Têko Mihinto 2022-04-14 11:41:19 UTC
The issue is also reproducible on RHEL 8.5 (389-ds-base-1.4.3.27-2).

Comment 4 Viktor Ashirov 2024-06-26 13:47:40 UTC
This BZ has been automatically migrated to Red Hat Issue Tracker https://issues.redhat.com/browse/DIRSRV-33. All future work related to this report will be managed there.
