Bug 2059110

Summary: [RHEL8] system role is not stable about luks testing
Product: Red Hat Enterprise Linux 8
Component: rhel-system-roles
Version: 8.6
Hardware: Unspecified
OS: Unspecified
Status: CLOSED WORKSFORME
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Target Release: ---
Reporter: guazhang <guazhang>
Assignee: Vojtech Trefny <vtrefny>
QA Contact: CS System Management SST QE <rhel-cs-system-management-subsystem-qe>
CC: rmeggins, spetrosi, vtrefny
Flags: rmeggins: needinfo? (vtrefny)
Whiteboard: role:storage
Doc Type: If docs needed, set a value
Type: Bug
Last Closed: 2023-07-19 19:46:37 UTC

Description guazhang@redhat.com 2022-02-28 08:45:35 UTC
Description of problem:
rhel-system-roles regression testing failed: the LUKS test cases are not stable and fail randomly with the messages below.


"msg": "Failed to commit changes to disk: Process reported exit code 5:   Logical volume foo/test1 in use.\n"

"msg": "Found unexpected mount state for volume 'foo'  device"

"msg": "Unexpected behavior w/ encrypted pool w/ no key"



Version-Release number of selected component (if applicable):
rhel-system-roles-1.15.0-1.el8.noarch 
4.18.0-369.el8.x86_64 
ansible-core-2.12.2-2.el8.x86_64
cryptsetup-2.3.7-2.el8.x86_64

How reproducible:
30%

Steps to Reproduce:
1. echo "localhost  ansible_connection=local" > host
2. ansible-playbook -vv -i host tests_luks_nvme_generated.yml
3. ansible-playbook -vv -i host tests_luks_pool_nvme_generated.yml

The errors can also be reproduced with the upstream storage role.
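
Since the failure is intermittent (~30%), a simple retry loop can be used to catch it (a sketch only, using the same inventory file as in the steps above):

# rerun the failing playbook until it fails (sketch)
for i in $(seq 1 20); do
    echo "=== run $i ==="
    ansible-playbook -vv -i host tests_luks_nvme_generated.yml || { echo "failed on run $i"; break; }
done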

Actual results:
The tests fail.

Expected results:
The tests pass.

Additional info:

Broadcast message from root.eng.pek2.redhat.com (Mon 2022-02-28 03:06:59 EST):

Password entry required for 'Please enter passphrase for disk foo-test1 (luks-f89b3e11-9601-417e-9600-ddcdc76ec9af)!' (PID 474445).
Please enter password with the systemd-tty-ask-password-agent tool!

fatal: [localhost]: FAILED! => {"actions": [], "changed": false, "crypts": [], "leaves": [], "mounts": [], "msg": "Failed to commit changes to disk: Process reported exit code 5:   Logical volume foo/test1 in use.\n", "packages": ["xfsprogs", "lvm2", "cryptsetup", "dosfstools"], "pools": [], "volumes": []}



PLAY RECAP *****************************************************************************************************************************
localhost                  : ok=883  changed=76   unreachable=0    failed=9    skipped=445  rescued=9    ignored=0   

[root@storageqe-70 tests]# $?
-bash: 2: command not found
[root@storageqe-70 tests]# 
[root@storageqe-70 tests]# lsblk
NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                           8:0    0   223G  0 disk 
├─sda1                        8:1    0   600M  0 part /boot/efi
├─sda2                        8:2    0     1G  0 part /boot
└─sda3                        8:3    0 221.4G  0 part 
  ├─rhel_storageqe--70-root 253:0    0    70G  0 lvm  /
  ├─rhel_storageqe--70-swap 253:1    0   7.7G  0 lvm  [SWAP]
  └─rhel_storageqe--70-home 253:2    0 143.8G  0 lvm  /home
nvme1n1                     259:0    0 894.3G  0 disk 
nvme2n1                     259:1    0 894.3G  0 disk 
nvme4n1                     259:2    0 894.3G  0 disk 
└─foo-test1                 253:3    0     4G  0 lvm  
nvme0n1                     259:3    0 894.3G  0 disk 
nvme3n1                     259:4    0 894.3G  0 disk 



"msg": "Found unexpected mount state for volume 'foo'  device"

TASK [Get expected mount device based on device type] **********************************************************************************
task path: /usr/share/ansible/roles/rhel-system-roles.storage/tests/test-verify-volume-mount.yml:6
ok: [localhost] => {"ansible_facts": {"storage_test_device_path": "/dev/mapper/luks-d5681548-2523-43ca-b3cc-4e50e4d95375"}, "changed": false}

TASK [Set some facts] ******************************************************************************************************************
task path: /usr/share/ansible/roles/rhel-system-roles.storage/tests/test-verify-volume-mount.yml:14
ok: [localhost] => {"ansible_facts": {"storage_test_mount_device_matches": [], "storage_test_mount_expected_match_count": "1", "storage_test_mount_point_matches": [], "storage_test_swap_expected_matches": "0"}, "changed": false}

TASK [lsblk] ***************************************************************************************************************************
task path: /usr/share/ansible/roles/rhel-system-roles.storage/tests/test-verify-volume-mount.yml:25
changed: [localhost] => {"changed": true, "cmd": "lsblk > lsblk_out;sleep 10", "delta": "0:00:10.015405", "end": "2022-02-28 03:16:35.651181", "msg": "", "rc": 0, "start": "2022-02-28 03:16:25.635776", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

TASK [Verify the current mount state by device] ****************************************************************************************
task path: /usr/share/ansible/roles/rhel-system-roles.storage/tests/test-verify-volume-mount.yml:32
fatal: [localhost]: FAILED! => {
    "assertion": false,
    "changed": false,
    "evaluated_to": false,
    "msg": "Found unexpected mount state for volume 'foo' '/opt/test1' 'foo' '[]' '1' device"
}

PLAY RECAP *****************************************************************************************************************************
localhost                  : ok=86   changed=10   unreachable=0    failed=2    skipped=23   rescued=1    ignored=0   


TASK [Verify the current mount state by device] ****************************************************************************************
task path: /usr/share/ansible/roles/rhel-system-roles.storage/tests/test-verify-volume-mount.yml:32
fatal: [localhost]: FAILED! => {
    "assertion": false,
    "changed": false,
    "evaluated_to": false,
    "msg": "Found unexpected mount state for volume 'foo' '/opt/test1' 'foo' '[]' '1' device"
}

/usr/share/ansible/roles/rhel-system-roles.storage/tests/test-verify-volume-mount.yml:32 

- name: Verify the current mount state by device
  assert:
    that: "{{ storage_test_mount_device_matches|length == storage_test_mount_expected_match_count|int }}"
    msg: "Found unexpected mount state for volume '{{ storage_test_volume.name }}' '{{ storage_test_volume.mount_point }}' '{{ storage_test_volume.name }}' '{{ storage_test_mount_device_matches }}' '{{ storage_test_mount_expected_match_count }}' device"
  when: _storage_test_volume_present and storage_test_volume.mount_point


In the failing run the assertion's inputs were:

{{ storage_test_mount_device_matches }} == []

{{ storage_test_mount_expected_match_count }} == 1

so the lengths do not match and the assert fails.
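
For reference, a manual equivalent of this check (a sketch; the mount point and device path are taken from the task output above):

# the assert boils down to: is the expected LUKS device mounted on /opt/test1?
findmnt -n -o SOURCE /opt/test1
# expected: /dev/mapper/luks-d5681548-2523-43ca-b3cc-4e50e4d95375
# an empty result is what storage_test_mount_device_matches == [] reflects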




"msg": "Unexpected behavior w/ encrypted pool w/ no key"

TASK [Verify the output of the keyless luks test] ******************************
task path: /usr/share/ansible/roles/rhel-system-roles.storage/tests/tests_luks.yml:203
fatal: [localhost]: FAILED! => {
    "assertion": "blivet_output.failed and blivet_output.msg|regex_search('encrypted volume.*missing key') and not blivet_output.changed",
    "changed": false,
    "evaluated_to": false,
    "msg": "Unexpected behavior w/ encrypted pool w/ no key"
}


http://lab-04.rhts.eng.pek2.redhat.com/beaker/logs/tasks/140481+/140481113/taskout.log




The upstream test run also failed:

http://lab-04.rhts.eng.pek2.redhat.com/beaker/logs/tasks/140481+/140481115/taskout.log


ansible-playbook -vv -i host tests_luks_pool_nvme_generated.yml

TASK [Verify the output of the keyless luks test] ******************************
task path: /mnt/tests/kernel/storage/storage_role/auto_run_role_local/storage/tests/tests_luks_pool.yml:59
fatal: [localhost]: FAILED! => {
    "assertion": "blivet_output.failed and blivet_output.msg|regex_search('encrypted pool.*missing key') and not blivet_output.changed",
    "changed": false,
    "evaluated_to": false,
    "msg": "Unexpected behavior w/ encrypted pool w/ no key"
}

PLAY RECAP *********************************************************************
localhost                  : ok=41   changed=4    unreachable=0    failed=2    skipped=17   rescued=1    ignored=0   
STDERR:[DEPRECATION WARNING]: Ansible will require Python 3.8 or newer on the 
controller starting with Ansible 2.12. Current version: 3.6.8 (default, Jan 14 
2022, 11:04:20) [GCC 8.5.0 20210514 (Red Hat 8.5.0-7)]. This feature will be 
removed from ansible-core in version 2.12. Deprecation warnings can be disabled
 by setting deprecation_warnings=False in ansible.cfg.
RETURN:2



ansible-playbook -vv -i host tests_luks_nvme_generated.yml

TASK [linux-system-roles.storage : manage the pools and volumes to match the specified state] ***
task path: /mnt/tests/kernel/storage/storage_role/auto_run_role_local/storage/tests/roles/linux-system-roles.storage/tasks/main-blivet.yml:75
fatal: [localhost]: FAILED! => {"actions": [], "changed": false, "crypts": [], "leaves": [], "mounts": [], "msg": "Failed to commit changes to disk: Process reported exit code 5:   Logical volume foo/test1 in use.\n", "packages": ["xfsprogs", "lvm2", "dosfstools"], "pools": [], "volumes": []}

PLAY RECAP *********************************************************************
localhost                  : ok=1102 changed=83   unreachable=0    failed=10   skipped=561  rescued=10   ignored=0   
RETURN:2

[INFO][23:25:41]############ finish run test tests_luks_nvme_generated.yml ansible-playbook -vv -i host tests_luks_nvme_generated.yml



https://beaker.engineering.redhat.com/recipes/11536640#task140481108,task140481111,task140481113

Comment 1 guazhang@redhat.com 2022-02-28 09:29:17 UTC
[root@storageqe-70 tests]# lvremove -y /dev/mapper/foo-test1
  Logical volume foo/test1 in use.

[root@storageqe-70 tests]# lsof -l  /dev/mapper/foo-test1 
COMMAND      PID     USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
systemd-c 590197        0    7r   BLK  253,3   0t1024 5444646 /dev/mapper/../dm-3
systemd-c 590198        0    7r   BLK  253,3   0t1024 5444646 /dev/mapper/../dm-3
systemd-c 590199        0    7r   BLK  253,3   0t1024 5444646 /dev/mapper/../dm-3
systemd-c 590200        0    7r   BLK  253,3   0t1024 5444646 /dev/mapper/../dm-3
systemd-c 590201        0    7r   BLK  253,3   0t1024 5444646 /dev/mapper/../dm-3
systemd-c 590202        0    7r   BLK  253,3   0t1024 5444646 /dev/mapper/../dm-3

[root@storageqe-70 tests]# ps -aux |grep 590197
root      590197  0.0  0.0  79044  6948 ?        Ss   04:07   0:00 /usr/lib/systemd/systemd-cryptsetup attach luks-3cfc027c-ec05-42f0-bb8d-2adebb0cc9f5 /dev/mapper/foo-test1 -
root      591503  0.0  0.0 221936  1088 pts/0    S+   04:22   0:00 grep --color=auto 590197
[root@storageqe-70 tests]# 
[root@storageqe-70 tests]# ps -aux |grep 590198
root      590198  0.0  0.0  79044  7064 ?        Ss   04:07   0:00 /usr/lib/systemd/systemd-cryptsetup attach luks-c7334019-6d58-4d55-b721-0e7c442dd91d /dev/mapper/foo-test1 -
root      591505  0.0  0.0 221936  1100 pts/0    S+   04:22   0:00 grep --color=auto 590198
[root@storageqe-70 tests]# ps -aux |grep 590199
root      590199  0.0  0.0  79044  6956 ?        Ss   04:07   0:00 /usr/lib/systemd/systemd-cryptsetup attach luks-f89b3e11-9601-417e-9600-ddcdc76ec9af /dev/mapper/foo-test1 -
root      591507  0.0  0.0 221936  1064 pts/0    S+   04:22   0:00 grep --color=auto 590199
[root@storageqe-70 tests]# ps -aux |grep 590200
root      590200  0.0  0.0  79044  6928 ?        Ss   04:07   0:00 /usr/lib/systemd/systemd-cryptsetup attach luks-794b2d7e-654b-4b13-b803-fdda7d6cf94a /dev/mapper/foo-test1 -
root      591509  0.0  0.0 221936  1096 pts/0    S+   04:23   0:00 grep --color=auto 590200
[root@storageqe-70 tests]# ps -aux |grep 590201
root      590201  0.0  0.0  79044  6968 ?        Ss   04:07   0:00 /usr/lib/systemd/systemd-cryptsetup attach luks-c5160c6c-6b18-4a6a-a4fb-7fe67dcad794 /dev/mapper/foo-test1 -
root      591511  0.0  0.0 221936  1120 pts/0    S+   04:23   0:00 grep --color=auto 590201
[root@storageqe-70 tests]# ps -aux |grep 590202
root      590202  0.0  0.0  79044  6980 ?        Ss   04:07   0:00 /usr/lib/systemd/systemd-cryptsetup attach luks-a2f01a46-0b1f-4082-9813-f7f5d01b9549 /dev/mapper/foo-test1 -
root      591513  0.0  0.0 221936  1100 pts/0    S+   04:23   0:00 grep --color=auto 590202
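
The lsof/ps output above suggests the logical volume is held open by stale systemd-cryptsetup processes still waiting for a passphrase. A diagnostic/cleanup sketch, assuming those processes are backed by systemd-cryptsetup@*.service units for the leftover LUKS names (if they were spawned directly, killing the PIDs from lsof is the crude alternative):

# list pending passphrase prompts and the cryptsetup units behind them
systemd-tty-ask-password-agent --list
systemctl list-units --all 'systemd-cryptsetup@*'
# stop the unit for one of the leftover LUKS names (name taken from the ps output above;
# systemd-escape handles the \x2d escaping of dashes in the instance name)
systemctl stop "systemd-cryptsetup@$(systemd-escape luks-3cfc027c-ec05-42f0-bb8d-2adebb0cc9f5).service"
# or, cruder: kill the holders reported by lsof
kill 590197 590198 590199 590200 590201 590202
# after this, "lvremove -y /dev/mapper/foo-test1" should no longer report "in use"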





Password entry required for 'Please enter passphrase for disk foo-test1 (luks-794b2d7e-654b-4b13-b803-fdda7d6cf94a)!' (PID 590200).
Please enter password with the systemd-tty-ask-password-agent tool!


Broadcast message from root.eng.pek2.redhat.com (Mon 2022-02-28 04:07:18 EST):

Password entry required for 'Please enter passphrase for disk foo-test1 (luks-f89b3e11-9601-417e-9600-ddcdc76ec9af)!' (PID 590199).
Please enter password with the systemd-tty-ask-password-agent tool!


Broadcast message from root.eng.pek2.redhat.com (Mon 2022-02-28 04:07:18 EST):

Password entry required for 'Please enter passphrase for disk foo-test1 (luks-c7334019-6d58-4d55-b721-0e7c442dd91d)!' (PID 590198).
Please enter password with the systemd-tty-ask-password-agent tool!


Broadcast message from root.eng.pek2.redhat.com (Mon 2022-02-28 04:07:18 EST):

Password entry required for 'Please enter passphrase for disk foo-test1 (luks-3cfc027c-ec05-42f0-bb8d-2adebb0cc9f5)!' (PID 590197).
Please enter password with the systemd-tty-ask-password-agent tool!


Broadcast message from root.eng.pek2.redhat.com (Mon 2022-02-28 04:07:18 EST):

Password entry required for 'Please enter passphrase for disk foo-test1 (luks-a2f01a46-0b1f-4082-9813-f7f5d01b9549)!' (PID 590202).
Please enter password with the systemd-tty-ask-password-agent tool!


Broadcast message from root.eng.pek2.redhat.com (Mon 2022-02-28 04:07:18 EST):

Password entry required for 'Please enter passphrase for disk foo-test1 (luks-c5160c6c-6b18-4a6a-a4fb-7fe67dcad794)!' (PID 590201).
Please enter password with the systemd-tty-ask-password-agent tool!


Broadcast message from root.eng.pek2.redhat.com (Mon 2022-02-28 04:07:18 EST):

Password entry required for 'Please enter passphrase for disk foo-test1 (luks-794b2d7e-654b-4b13-b803-fdda7d6cf94a)!' (PID 590200).
Please enter password with the systemd-tty-ask-password-agent tool!


Broadcast message from root.eng.pek2.redhat.com (Mon 2022-02-28 04:07:18 EST):

Password entry required for 'Please enter passphrase for disk foo-test1 (luks-f89b3e11-9601-417e-9600-ddcdc76ec9af)!' (PID 590199).
Please enter password with the systemd-tty-ask-password-agent tool!


Broadcast message from root.eng.pek2.redhat.com (Mon 2022-02-28 04:07:18 EST):

Password entry required for 'Please enter passphrase for disk foo-test1 (luks-c7334019-6d58-4d55-b721-0e7c442dd91d)!' (PID 590198).
Please enter password with the systemd-tty-ask-password-agent tool!


Broadcast message from root.eng.pek2.redhat.com (Mon 2022-02-28 04:07:18 EST):

Password entry required for 'Please enter passphrase for disk foo-test1 (luks-3cfc027c-ec05-42f0-bb8d-2adebb0cc9f5)!' (PID 590197).
Please enter password with the systemd-tty-ask-password-agent tool!



Is something going wrong here with how the password is being entered?

Comment 2 Rich Megginson 2022-03-01 15:09:26 UTC
> Is there something wrong here when input password?

Not sure.  Do you only see this error with the tests_luks_*_nvme_generated tests?  For example, there are 6 LUKS-related tests:

tests/tests_luks_nvme_generated.yml	  tests/tests_luks_pool.yml
tests/tests_luks_pool_nvme_generated.yml  tests/tests_luks_scsi_generated.yml
tests/tests_luks_pool_scsi_generated.yml  tests/tests_luks.yml

Do you see this error with all of them, or only the nvme tests?
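
For example, something like this would run all six from the role's tests directory (a sketch, same inventory style as in the reproduction steps):

for t in tests_luks.yml tests_luks_pool.yml \
         tests_luks_scsi_generated.yml tests_luks_pool_scsi_generated.yml \
         tests_luks_nvme_generated.yml tests_luks_pool_nvme_generated.yml; do
    ansible-playbook -vv -i host "$t" || echo "FAILED: $t"
done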

Comment 3 guazhang@redhat.com 2022-03-02 01:41:25 UTC
(In reply to Rich Megginson from comment #2)
> > Is there something wrong here when input password?
> 
> Not sure.  Do you only see this error with the
> tests_luks_*_nvme_generated.tests?  For example, there are 6 luks related
> tests:
> 
> tests/tests_luks_nvme_generated.yml	  tests/tests_luks_pool.yml
> tests/tests_luks_pool_nvme_generated.yml  tests/tests_luks_scsi_generated.yml
> tests/tests_luks_pool_scsi_generated.yml  tests/tests_luks.yml
> 
> Do you see this error with all of them, or only the nvme tests?

I got a test server with NVMe disks, so I selected the tests that need NVMe.
In my test job the other tests passed,
but these errors do not reproduce 100% of the time on the same NVMe server.

A passing run on the same NVMe server:
https://beaker.engineering.redhat.com/jobs/6354041

The failed jobs:
https://beaker.engineering.redhat.com/jobs/6350222
https://beaker.engineering.redhat.com/jobs/6348715

Comment 4 Rich Megginson 2022-03-02 14:48:57 UTC
@vtrefny is it possible that this is related to https://bugzilla.redhat.com/show_bug.cgi?id=2059426 ?