Bug 1472310
Summary: | Compute node freezes after two server instance deploys in ROL-Staging | |
---|---|---|---
Product: | Red Hat OpenStack | Reporter: | Philip Sweany <psweany>
Component: | openstack-nova | Assignee: | Eoghan Glynn <eglynn>
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Joe H. Rahme <jhakimra>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | unspecified | CC: | berrange, dasmith, eglynn, kchamart, psweany, rlocke, sbauza, sferdjao, sgordon, srevivo, svanders, vromanso
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2017-08-15 02:06:17 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |

Description
Philip Sweany
2017-07-18 12:49:55 UTC
Created attachment 1300469
The steps taken starting from a clean CL210-OSP10 ROL-stage environment
Created attachment 1300470
An overview of working with ROL-stage, for those unfamiliar with Red Hat Training

Is the compute node deployed on a bare-metal machine or on a VM? In any case, this cannot be a nova issue; nova should never be in a position to freeze a machine. ;-)

All machines (undercloud and overcloud) are VMs. It appears that I, as a novice bugzilla submitter, have chosen the wrong category. Since my troubleshooting has been unsuccessful to point to a root cause, and there is no category for just "openstack", I chose openstack-nova. This might be caused by nova overcommit misconfiguration, errant CPU detection or classification, or many other things. Being that this is running on top of our ROL platform (Ravello emulation), I am unclear about how to proceed.

If you know that this cannot be a nova issue, then please help me determine who should be looking at this. This is critical to the Red Hat Training group's ability to deploy this course on our ROL platform for our customers, and we do not have the engineering knowledge depth we think is needed to track this one down. This has already stumped us for over two weeks.

(In reply to Philip Sweany from comment #4)
> All machines (undercloud and overcloud) are VMs. It appears that I, as a
> novice bugzilla submitter, have chosen the wrong category. Since my
> troubleshooting has been unsuccessful to point to a root cause, and there is
> no category for just "openstack", I chose openstack-nova. This might be
> caused by nova overcommit misconfiguration, errant CPU detection or
> classification, or many other things. Being that this is running on top of
> our ROL platform (Ravello emulation), I am unclear about how to proceed.

So, you say the node "freezes", but you just can't reach it anymore, right? Does it still respond to ping? I see only two possibilities: the network gets messed up, or the machine really freezes. I'm not 100% confident about the network case, but if the machine really freezes, that cannot be a nova issue, even if nova triggers it. The node shouldn't freeze, no matter what nova does.

> If you know that this cannot be a nova issue, then please help me determine
> who should be looking at this. This is critical to the Red Hat Training
> group's ability to deploy this course on our ROL platform for our customers,
> and we do not have the engineering knowledge depth we think is needed to
> track this one down. This has already stumped us for over two weeks.

Who is operating this platform? They would be the first people I would talk to. Other than that, it depends on whether we can find out if the VM actually freezes or just becomes inaccessible. If it actually freezes, I would ask the kernel people; if it only becomes inaccessible, maybe openstack neutron (networking).

No further information.

The environment is Ravello, our internal Red Hat course development staging. We were counting on assistance to determine *how* to determine this cause, as the undercloud gives no indication of what happened. Closing this, since we are still where we were at the beginning. (To answer your question: yes, the node just freezes. No, I hadn't ever experienced that before. Being frozen, it did not give us much to look at.)
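
The question raised in the thread (does the node still answer ping, or is it really frozen?) could be checked from the undercloud with something like the sketch below. This is only an illustration under stated assumptions, not a procedure anyone in this report used: the function names are hypothetical, the `ping` flags assume Linux iputils, and port 22 assumes the compute node normally runs SSH. From the outside, a node that answers nothing cannot be told apart from a broken network path, so in that case the hypervisor console would still be needed.

```python
import socket
import subprocess


def ping_ok(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo request; True if a reply arrives.

    Assumes the Linux iputils `ping` (-c count, -W reply timeout in seconds).
    """
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 1,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def tcp_ok(host: str, port: int = 22, timeout_s: float = 2.0) -> bool:
    """True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def classify(ping_reply: bool, tcp_reply: bool) -> str:
    """Map the two probe results to a rough, non-authoritative diagnosis."""
    if tcp_reply:
        # A service accepted a connection, so the node is not frozen.
        return "reachable"
    if ping_reply:
        # The kernel still answers ICMP, but the service does not respond.
        return "userspace-hung-or-port-closed"
    # Nothing answers: a real freeze and a network fault look identical here.
    return "frozen-or-network-down"


# Hypothetical usage; the address is a placeholder, not taken from the report:
# host = "192.0.2.10"
# print(classify(ping_ok(host), tcp_ok(host)))
```

Running the probes in a loop while the second server instance deploys would at least record when the node stops answering, and whether ping and SSH are lost at the same moment.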