Bug 1562035
Summary: DPDK: Deployment fails intermittently only when tuned profile "cpu-partitioning" is activated

Product: Red Hat OpenStack
Reporter: Saravanan KR <skramaja>
Component: openstack-tripleo-heat-templates
Assignee: Saravanan KR <skramaja>
Status: CLOSED CURRENTRELEASE
QA Contact: Yariv <yrachman>
Severity: high
Priority: high
Version: 13.0 (Queens)
CC: aschultz, atelang, atheurer, dbecker, ealcaniz, emacchi, fherrman, houyatao, jianzzha, jraju, jschluet, jskarvad, ksundara, mbabushk, mburns, morazi, noama, openstack, owalsh, pablo.iranzo, rhel-osp-director-maint, sbaker, skramaja, vchundur, wdec, yrachman, yroblamo
Target Milestone: z1
Keywords: Triaged, ZStream
Target Release: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Doc Text:
Cause: NFV deployments failed intermittently, with containers randomly failing to start with an nsenter "Invalid argument" error.
Consequence: A timing issue between the tuned cpu-partitioning profile and the unshare() kernel call made by the container runtime exposed an existing kernel bug in the handling of the mm_users count.
Fix: Previously, the unshare(CLONE_NEWPID) function could fail when racing with the get_task_mm() function, because the CLONE_NEWPID parameter incorrectly implied the CLONE_THREAD parameter and thus also the CLONE_VM parameter. This update fixes sys_unshare() to not imply CLONE_THREAD if CLONE_NEWPID is used, thus fixing the bug.
Result: Kernel 3.10.0-862.6.3.el7.x86_64 (or later) is required to resolve this issue.
Story Points: ---
Clones: 1577745 (view as bug list)
Last Closed: 2018-07-24 06:14:09 UTC
Type: Bug
Bug Depends On: 1577745
Bug Blocks: 1566561
From the logs, it looks similar to this known issue: https://github.com/moby/moby/issues/34971 - the reporter is on CentOS7 though. I wonder if the kernel parameters for namespaces in RHEL are different from CentOS.

Could you please give the version of your rpms, and the command used to deploy, so we can try to reproduce?

Thanks

(In reply to Emilien Macchi from comment #1)
> From the logs, it looks similar to this known issue:
> https://github.com/moby/moby/issues/34971 - the reporter is on CentOS7
> though. I wonder if the kernel parameters for namespaces in RHEL are
> different from CentOS.
>
> Could you please give the version of your rpms? and the command used to
> deploy so we can try to reproduce.

Able to deploy successfully by reducing the docker-puppet instances to 1, with the tuned profile enabled:

DockerPuppetProcessCount: 1

I am re-deploying a few times to ensure that it always works. I will update the BZ.

(In reply to Saravanan KR from comment #2)
> Able to deploy successfully by reducing the docker puppet instances to 1 and
> with Tuned profile enabled.
>
> DockerPuppetProcessCount: 1
>
> I am re-deploying a few times to ensure that it works always. I will update
> the BZ.

It failed with the same error on the 2nd run. Tested with OSP-13 puddle 2018-03-16.1.

DPDK-specific parameters:
--------------------------------
parameter_defaults:
  ComputeOvsDpdkParameters:
    KernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=64 intel_iommu=on iommu=pt isolcpus=10-87"
    TunedProfileName: "cpu-partitioning"
    IsolCpusList: "10-87"
    OvsDpdkSocketMemory: "2048,2048"
    OvsDpdkMemoryChannels: "4"
    NovaReservedHostMemory: 4096
    OvsPmdCoreList: "10,11,22,23"
    OvsDpdkCoreList: "0,1,2,3,4,5,6,7,8,9"
    NovaVcpuPinSet: ['12-21','24-87']
--------------------------------

Just to make a general comment: the purpose of docker-puppet.py is to use puppet to write configuration files, so it is optimised for concurrency. This means that if docker-puppet.py modifies any state on the host, there is a risk that a race will be triggered, resulting in unpredictable errors. Since you've tried with DockerPuppetProcessCount: 1, that may not be the cause of this particular issue, but it sounds like the host *is* being modified, and we should make the required change to do this in a separate "puppet apply" call rather than using docker-puppet.py.

Looking at https://github.com/moby/moby/issues/34971, I suppose we could be hitting some kind of namespace limit. I'll keep investigating.

According to this[1] it might be worth adding this to your KernelArgs:

user_namespace.enable=1

Could you please try this with the DockerPuppetProcessCount: 1 parameter and post any failed stack log?

[1] https://github.com/opencontainers/runc/issues/1343

(In reply to Steve Baker from comment #5)
> According to this[1] it might be worth adding this to your KernelArgs.
>
> user_namespace.enable=1
>
> Could you please try this with the DockerPuppetProcessCount:1 parameter and
> post any failed stack log?
>
> [1] https://github.com/opencontainers/runc/issues/1343

Yes, I have also tried with those kernel args and the reduced process count, but it doesn't help:
-------------------------------------
parameter_defaults:
  ComputeOvsDpdkParameters:
    KernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=64 intel_iommu=on iommu=pt isolcpus=10-87 user_namespace.enable=1"
  ExtraSysctlSettings:
    user.max_user_namespaces:
      value: 15000
-------------------------------------

It also started happening on RHEL 7.5, even with the latest docker packages in RHEL. And this issue never occurs in CentOS-based deployments. If I disable TunedProfileName alone, then the deployment is successful.

Assigning to DF:NFV dradaz and marking as untriaged. Once the issue is diagnosed it can be reassigned back to us if the fix is a change to the container config.

*** Bug 1570412 has been marked as a duplicate of this bug. ***
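The concurrency pattern discussed above (docker-puppet.py fanning short-lived container runs out over a worker pool sized by PROCESS_COUNT / DockerPuppetProcessCount) can be sketched as follows. The names and the placeholder `echo` command are illustrative, not the actual tripleo code; the point is only that several workers race through `docker run`, and each of those runs performs its own namespace unshare:

```python
import multiprocessing
import os
import subprocess


def run_one(config_volume):
    # Each pool worker launches its own short-lived container; with
    # several workers racing, every container start performs its own
    # unshare() of namespaces, which is where the intermittent
    # "Invalid argument" failure surfaces.
    cmd = ['echo', 'docker', 'run', '--name',
           'docker-puppet-%s' % config_volume]  # placeholder, not a real docker call
    return subprocess.call(cmd)


if __name__ == '__main__':
    # Mirrors the PROCESS_COUNT environment override seen later in
    # this thread; defaults to the CPU count, as docker-puppet.py does.
    process_count = int(os.environ.get('PROCESS_COUNT',
                                       multiprocessing.cpu_count()))
    config_volumes = ['iscsid', 'nova_libvirt', 'ceilometer']
    pool = multiprocessing.Pool(process_count)
    returncodes = pool.map(run_one, config_volumes)
    pool.close()
    pool.join()
    print(returncodes)
```

Setting DockerPuppetProcessCount: 1 collapses this pool to a single worker, which is why it was a plausible (though, as the thread shows, insufficient) mitigation.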
On trying with puddle 2018-04-19.2, I am unable to reproduce this issue. Asked for confirmation from QE. The root cause of the BZ is still unknown; all that is known is that it is related to user namespaces, but the solutions suggested above do not work. If it is reproduced again, we have to work with the container and tuned teams to fix this issue. Moving it to ON_QA to ensure that QE validates this scenario on the latest puddle.

Verified with SRIOV with puddle 2018-04-26.1.

The issue appears back in the latest 2018-05-04.1 puddle.

(In reply to Maxim Babushkin from comment #12)
> The issue appears back in the latest 2018-05-04.1 puddle.

Can you paste the error logs?

openstack stack failures list --long overcloud

Created attachment 1434177 [details]
stack_failures_log
Looks like this kernel issue: https://lkml.org/lkml/2015/7/28/833

Looking at the source for the kernel on that deployment (from kernel/fork.c):

static int check_unshare_flags(unsigned long unshare_flags)
{
	if (unshare_flags & ~(CLONE_THREAD|CLONE_FS|CLONE_NEWNS|CLONE_SIGHAND|
				CLONE_VM|CLONE_FILES|CLONE_SYSVSEM|
				CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWNET|
				CLONE_NEWUSER|CLONE_NEWPID))
		return -EINVAL;
	/*
	 * Not implemented, but pretend it works if there is nothing to
	 * unshare. Note that unsharing CLONE_THREAD or CLONE_SIGHAND
	 * needs to unshare vm.
	 */
	if (unshare_flags & (CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)) {
		/* FIXME: get_task_mm() increments ->mm_users */
		if (atomic_read(&current->mm->mm_users) > 1)
			return -EINVAL;
	}

	return 0;
}

While https://github.com/moby/moby/issues/34971 shows the flags are CLONE_NEWUTS|CLONE_NEWPID|CLONE_NEWNET|CLONE_NEWNS|CLONE_NEWIPC, we don't avoid the issue, as CLONE_NEWPID implicitly sets CLONE_THREAD:

SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
{
	struct fs_struct *fs, *new_fs = NULL;
	struct files_struct *fd, *new_fd = NULL;
	struct cred *new_cred = NULL;
	struct nsproxy *new_nsproxy = NULL;
	int do_sysvsem = 0;
	int err;

	/*
	 * If unsharing a user namespace must also unshare the thread.
	 */
	if (unshare_flags & CLONE_NEWUSER)
		unshare_flags |= CLONE_THREAD | CLONE_FS;
	/*
	 * If unsharing a pid namespace must also unshare the thread.
	 */
	if (unshare_flags & CLONE_NEWPID)
		unshare_flags |= CLONE_THREAD;
	/*
	 * If unsharing a thread from a thread group, must also unshare vm.
	 */
	if (unshare_flags & CLONE_THREAD)
		unshare_flags |= CLONE_VM;
	/*
	 * If unsharing vm, must also unshare signal handlers.
	 */
	if (unshare_flags & CLONE_VM)
		unshare_flags |= CLONE_SIGHAND;
	/*
	 * If unsharing namespace, must also unshare filesystem information.
	 */
	if (unshare_flags & CLONE_NEWNS)
		unshare_flags |= CLONE_FS;

	err = check_unshare_flags(unshare_flags);
	if (err)
		goto bad_unshare_out;

Maybe setting pid=host in the docker run command line would remove CLONE_NEWPID from the unshare flags used and avoid the issue...

I face similar failures in paunch container starts too, on overcloud.AllNodesDeploySteps.ComputeOvsDpdkDeployment_Step4.0:

"Error running ['docker', 'run', '--name', 'ceilometer_agent_compute', '--label', 'config_id=tripleo_step4', '--label', 'container_name=ceilometer_agent_compute', '--label', 'managed_by=paunch', '--label', 'config_data={\"environment\": [\"KOLLA_CONFIG_STRATEGY=COPY_ALWAYS\", \"TRIPLEO_CONFIG_HASH=6b3b5b2d9dd55807977f98e19c7c715d\"], \"volumes\": [\"/etc/hosts:/etc/hosts:ro\", \"/etc/localtime:/etc/localtime:ro\", \"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro\", \"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro\", \"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro\", \"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro\", \"/dev/log:/dev/log\", \"/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro\", \"/etc/puppet:/etc/puppet:ro\", \"/var/lib/kolla/config_files/ceilometer_agent_compute.json:/var/lib/kolla/config_files/config.json:ro\", \"/var/lib/config-data/puppet-generated/ceilometer/:/var/lib/kolla/config_files/src:ro\", \"/var/run/libvirt:/var/run/libvirt:ro\", \"/var/log/containers/ceilometer:/var/log/ceilometer\"], \"image\": \"192.0.10.1:8787/rhosp13/openstack-ceilometer-compute:2018-05-07.2\", \"net\": \"host\", \"restart\": \"always\", \"privileged\": false}', '--detach=true', '--env=KOLLA_CONFIG_STRATEGY=COPY_ALWAYS', '--env=TRIPLEO_CONFIG_HASH=6b3b5b2d9dd55807977f98e19c7c715d', '--net=host', '--privileged=false', '--restart=always', '--volume=/etc/hosts:/etc/hosts:ro', '--volume=/etc/localtime:/etc/localtime:ro', '--volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume=/dev/log:/dev/log', '--volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro', '--volume=/etc/puppet:/etc/puppet:ro', '--volume=/var/lib/kolla/config_files/ceilometer_agent_compute.json:/var/lib/kolla/config_files/config.json:ro', '--volume=/var/lib/config-data/puppet-generated/ceilometer/:/var/lib/kolla/config_files/src:ro', '--volume=/var/run/libvirt:/var/run/libvirt:ro', '--volume=/var/log/containers/ceilometer:/var/log/ceilometer', '192.0.10.1:8787/rhosp13/openstack-ceilometer-compute:2018-05-07.2']. [125]", "stdout: 811a9940b8afa258ccb467cd765a9b24edbce059e106ef125dc1a79686df3e32", "stderr: /usr/bin/docker-current: Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:245: running exec setns process for init caused \\\"exit status 29\\\"\".",

We're seeing this issue also in Pike.

Reproduced it standalone by running the docker-puppet script directly on the compute nodes. Out of 5 trials, the same error occurred once. Trying to figure out whether it can be reproduced directly with a docker command.

export PROCESS_COUNT=3
export NET_HOST='true'
python /var/lib/docker-puppet/docker-puppet.py

http://paste.openstack.org/show/720984/

*** Bug 1578355 has been marked as a duplicate of this bug. ***
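The implication chain quoted above (CLONE_NEWPID implies CLONE_THREAD, which implies CLONE_VM, which then hits the mm_users > 1 check) can be poked at from userspace. This is a minimal sketch using ctypes, assuming a Linux host; the exact outcome depends on kernel version and privileges (on the affected RHEL 7.5 kernel a multithreaded caller gets EINVAL, while fixed or unprivileged setups typically see success or EPERM), so no particular result is guaranteed:

```python
import ctypes
import errno
import threading

CLONE_NEWPID = 0x20000000  # flag value from <linux/sched.h>

libc = ctypes.CDLL(None, use_errno=True)


def try_unshare_newpid():
    # On the affected kernel, CLONE_NEWPID implied CLONE_THREAD and thus
    # CLONE_VM, so check_unshare_flags() returned -EINVAL whenever
    # mm_users > 1 -- either because of an extra thread, or a racing
    # get_task_mm() reader such as tuned sampling /proc.
    if libc.unshare(CLONE_NEWPID) == 0:
        return 'ok'
    e = ctypes.get_errno()
    return errno.errorcode.get(e, str(e))


# A second thread is enough to push mm_users above 1 deterministically,
# standing in for the transient get_task_mm() race seen in this bug.
t = threading.Thread(target=threading.Event().wait)
t.daemon = True
t.start()
print('unshare(CLONE_NEWPID) ->', try_unshare_newpid())
```

Running this on the affected kernel versus a fixed one illustrates why the failure was intermittent: without the extra thread, only a badly-timed get_task_mm() call (e.g. from tuned's cpu-partitioning profile) trips the same check.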
As a workaround, apply `--pid host` to the docker-puppet and paunch container invocations:

cat <<EOF >/home/stack/nsenter_error_tht.patch
diff --git a/docker/docker-puppet.py b/docker/docker-puppet.py
index 599ed14b..89e92ce4 100755
--- a/docker/docker-puppet.py
+++ b/docker/docker-puppet.py
@@ -275,6 +275,7 @@ def mp_puppet_config((config_volume, puppet_tags, manifest, config_image, volume
         pull_image(config_image)

     dcmd = ['/usr/bin/docker', 'run',
+            '--pid', 'host',
             '--user', 'root',
             '--name', 'docker-puppet-%s' % config_volume,
             '--env', 'PUPPET_TAGS=%s' % puppet_tags,
EOF

cat <<EOF >/home/stack/nsenter_error_paunch.patch
diff --git a/paunch/builder/compose1.py b/paunch/builder/compose1.py
index 5472d37..a041bc0 100644
--- a/paunch/builder/compose1.py
+++ b/paunch/builder/compose1.py
@@ -144,6 +144,7 @@ class ComposeV1Builder(object):
         for v in cconfig.get('environment', []):
             if v:
                 cmd.append('--env=%s' % v)
+        cmd.append('--pid=host')
         if cconfig.get('remove', False):
             cmd.append('--rm')
         if cconfig.get('interactive', False):
EOF

sudo patch -d /usr/share/openstack-tripleo-heat-templates/ -p1 < /home/stack/nsenter_error_tht.patch

sudo yum install libguestfs-tools -y
cd /home/stack/images
sudo su
mkdir -p mount_image
export LIBGUESTFS_BACKEND=direct
guestmount -a overcloud-full.qcow2 -m /dev/sda mount_image
patch -d mount_image/usr/lib/python2.7/site-packages/ -p1 < /home/stack/nsenter_error_paunch.patch
exit
openstack overcloud image upload --update-existing

*** Bug 1581711 has been marked as a duplicate of this bug. ***
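In isolation, the effect of the paunch hunk above is simply to append `--pid=host` while the docker command list is being built, so the container shares the host PID namespace and `docker run` no longer performs the unshare(CLONE_NEWPID) that races with get_task_mm(). This is a simplified sketch of that command assembly, not the real ComposeV1Builder (which handles volumes, restart policy, and many more options):

```python
def build_docker_cmd(name, image, env=None, pid_host=True):
    # Simplified sketch of how a builder like paunch's ComposeV1Builder
    # assembles a `docker run` argv list.
    cmd = ['docker', 'run', '--name', name]
    for v in (env or []):
        cmd.append('--env=%s' % v)
    if pid_host:
        # Sharing the host PID namespace sidesteps the kernel
        # unshare(CLONE_NEWPID) race described in this bug.
        cmd.append('--pid=host')
    cmd.append(image)
    return cmd


print(build_docker_cmd('ceilometer_agent_compute',
                       'rhosp13/openstack-ceilometer-compute',
                       env=['KOLLA_CONFIG_STRATEGY=COPY_ALWAYS']))
```

The trade-off of `--pid=host` is weaker isolation: processes in the container can see (and, if privileged, signal) host processes, which is why it was a workaround rather than the final fix.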
*** Bug 1566561 has been marked as a duplicate of this bug. ***

Verified with kernel 3.10.0-862.6.3.el7.x86_64, with tuned enabled. Please refer to the NFV CI (the THT change is already merged): https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/nfv/view/nfv/

As per comment #46, this issue has been verified by QE, and this kernel is available as part of the OSP13 19th July update. Moving to verified state.

Closing, as the required kernel version is already part of the 19th July 2018 release of OSP13.
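Since the resolution is tied to a minimum kernel build, a small helper comparing a node's `uname -r` output against the fixed build from the comments above can be handy. The version-splitting logic here is a simplified sketch (it ignores the dist/arch suffix), not an official RPM version comparison:

```python
def kernel_at_least(current, required):
    """Compare RHEL kernel NVRs like '3.10.0-862.6.3.el7.x86_64'."""
    def key(v):
        # Split '3.10.0-862.6.3.el7.x86_64' into its numeric fields,
        # stopping at the first non-numeric part ('el7', 'x86_64').
        nums = []
        for p in v.replace('-', '.').split('.'):
            if p.isdigit():
                nums.append(int(p))
            else:
                break
        return nums
    return key(current) >= key(required)


FIXED = '3.10.0-862.6.3.el7.x86_64'  # fixed build per this bug's comments
print(kernel_at_least('3.10.0-862.6.3.el7.x86_64', FIXED))  # True
print(kernel_at_least('3.10.0-862.el7.x86_64', FIXED))      # False
```

Note that list comparison makes a shorter numeric prefix sort below a longer one, so the base 862 build correctly compares as older than 862.6.3.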
Created attachment 1414748 [details]
stack failure log with tuned enabled deployment

Description of problem:

Deployment fails with docker run failing intermittently with the below error log:
-----------------------------
"2018-03-21 06:57:02,719 ERROR: 12318 -- Failed running docker-puppet.py for iscsid",
"2018-03-21 06:57:02,719 ERROR: 12318 -- nsenter: failed to unshare namespaces: Invalid argument",
"container_linux.go:247: starting container process caused \"process_linux.go:245: running exec setns process for init caused \\\"exit status 29\\\"\"",
"/usr/bin/docker-current: Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:245: running exec setns process for init caused \\\"exit status 29\\\"\".",
"time=\"2018-03-21T06:57:02Z\" level=error msg=\"error getting events from daemon: net/http: request canceled\" ",
-----------------------------

* Happens only with RHEL; verified on CentOS, where it doesn't happen.
* It is not always the same docker image; the failure occurs randomly on different containers and at different steps of the deployment.
* Sometimes the deployment is successful when deployed again (stack update), but not always.
* The deployment is successful when the tuned profile is removed from the parameters (2 out of 2 passed).
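To scan collected deployment logs for this failure signature, the error strings above can be matched directly. This is an illustrative helper, not part of tripleo:

```python
import re

# Error signatures taken from the deployment logs quoted above.
PATTERNS = [
    re.compile(r"nsenter: failed to unshare namespaces: Invalid argument"),
    re.compile(r"running exec setns process for init caused .*exit status 29"),
]


def matches_nsenter_bug(log_lines):
    """Return the lines that match the known failure signatures."""
    return [line for line in log_lines
            if any(p.search(line) for p in PATTERNS)]


sample = [
    '2018-03-21 06:57:02,719 ERROR: 12318 -- Failed running docker-puppet.py for iscsid',
    '2018-03-21 06:57:02,719 ERROR: 12318 -- nsenter: failed to unshare namespaces: Invalid argument',
]
print(matches_nsenter_bug(sample))
```

A hit on either pattern in `openstack stack failures list` output or journal logs is a strong hint that the node is running a pre-3.10.0-862.6.3 kernel with the cpu-partitioning profile active.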