Bug 1351795 - Isolated metadata will not work if ipv6 subnet is created first
Summary: Isolated metadata will not work if ipv6 subnet is created first
Keywords:
Status: CLOSED DUPLICATE of bug 1367947
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Linux
Priority: high
Severity: urgent
Target Milestone: async
Target Release: 8.0 (Liberty)
Assignee: Jakub Libosvar
QA Contact: Toni Freger
URL:
Whiteboard: hot
Depends On:
Blocks: 1194008
 
Reported: 2016-06-30 21:28 UTC by kahou
Modified: 2017-01-18 14:10 UTC
CC: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-18 14:10:49 UTC
Target Upstream Version:
Embargoed:


Attachments

Description kahou 2016-06-30 21:28:05 UTC
Description of problem:

We noticed that the VM cannot reach the metadata server while it is booting.

It turns out the metadata agent cannot talk to the neutron server to fetch the port information, because the metadata agent has locked itself up.

If I do either one of the following, the metadata server starts working again:
1. Turn off syslog
2. Set the metadata agent workers to 0
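For reference, workaround 2 corresponds to a setting like the following in /etc/neutron/metadata_agent.ini (illustrative fragment; `metadata_workers` is the standard option name):

```ini
[DEFAULT]
# Workaround 2: handle metadata requests in the parent process only,
# avoiding the forked-worker lockup described above.
metadata_workers = 0
```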

Version-Release number of selected component (if applicable):

Neutron 7.0

Steps to Reproduce:
1. Boot a cirros instance
2. Run nova console-log <vm>
3. Observe the vm cannot talk to the metadata server

Actual results:


Expected results:


Additional info:

Comment 1 kahou 2016-06-30 21:29:39 UTC
This may relate to https://bugzilla.redhat.com/show_bug.cgi?id=1330778

Comment 2 Charles Crouch 2016-07-18 21:55:13 UTC
FWIW, Kahou tested this issue with the fix from https://bugzilla.redhat.com/show_bug.cgi?id=1330778 and the problem remained, i.e. running

[admin@mcp1 ~]$ sudo ip netns exec qdhcp-c2c6cb56-d741-4650-9dc6-db0f7e59dea2 curl http://169.254.169.254/

is expected to return immediately, but it hangs.

So this issue and BZ 1330778 appear to have different causes.

Comment 3 Charles Crouch 2016-07-18 21:57:02 UTC
Also here is the corresponding support case: https://access.redhat.com/support/cases/#/case/01640942

Comment 4 Charles Crouch 2016-07-20 14:25:00 UTC
My bad, please ignore the above; that case is for a different issue we have with the neutron metadata agent (BZ 1339014). We don't have a support case for this BZ.

Comment 5 Assaf Muller 2016-07-26 21:04:41 UTC
Jakub can you please help triage?

Comment 6 Jakub Libosvar 2016-07-26 21:27:30 UTC
(In reply to Charles Crouch from comment #4)
> My bad please ignore the above, that case is for a different issue we have
> with the neutron metadata agent (BZ1339014). We dont have a support case for
> this BZ

The symptom of bug 1339014 was that RPC was initialized before forking the worker processes. According to the description of this bug, it looks like the same symptom. Is it possible to test the metadata agent with the same hotfix provided for bug 1339014?
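This is not Neutron code, but the failure mode described here can be sketched generically: any lock (for example, one guarding an RPC connection) that happens to be held at fork() time is copied into the child in the locked state, and since the thread that owns it does not exist in the child, nothing will ever release it:

```python
import os
import threading
import time

lock = threading.Lock()

def hold_lock():
    # Simulates a helper thread that owns a connection lock at fork time.
    with lock:
        time.sleep(0.5)

t = threading.Thread(target=hold_lock)
t.start()
time.sleep(0.1)  # make sure the helper thread holds the lock before forking

pid = os.fork()
if pid == 0:
    # Child: only the main thread survives fork(), but the lock was copied
    # in its locked state, so this acquire can never succeed.
    got_it = lock.acquire(timeout=1.0)
    os._exit(0 if got_it else 42)

t.join()
_, status = os.waitpid(pid, 0)
print("child deadlocked:", os.waitstatus_to_exitcode(status) == 42)
```

On Linux this prints `child deadlocked: True`: the parent's helper thread releases its own copy of the lock, but the child's copy stays locked forever, which is why initializing connections before forking workers is a classic source of hangs.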

Comment 7 kahou 2016-08-02 22:20:37 UTC
Hi,

I installed the suggested build but I still see the issue.

info: initramfs: up at 2.34
GROWROOT: CHANGED: partition=1 start=16065 old: size=64260 end=80325 new: size=2072385,end=2088450
info: initramfs loading root from /dev/vda1
info: /etc/init.d/rc.sysinit: up at 2.44
info: container: none
Starting logging: OK
modprobe: module virtio_blk not found in modules.dep
modprobe: module virtio_net not found in modules.dep
WARN: /etc/rc3.d/S10-load-modules failed
Initializing random number generator... done.
Starting acpid: OK
cirros-ds 'local' up at 2.56
no results found for mode=local. up 2.60. searched: nocloud configdrive ec2
Starting network...
udhcpc (v1.20.1) started
Sending discover...
Sending select for 10.209.0.3...
Lease of 10.209.0.3 obtained, lease time 14400
route: SIOCADDRT: File exists
WARN: failed: route add -net "0.0.0.0/0" gw "10.209.0.1"
cirros-ds 'net' up at 2.65
checking http://169.254.169.254/2009-04-04/instance-id
failed 1/20: up 2.66. request failed
failed 2/20: up 14.70. request failed
failed 3/20: up 26.71. request failed
failed 4/20: up 38.72. request failed
failed 5/20: up 50.73. request failed
failed 6/20: up 62.75. request failed

Comment 9 Jakub Libosvar 2016-08-05 14:32:53 UTC
(In reply to kahou from comment #8)
> Hi,
> 
> I installed the suggested build but I still see the issue.
> 
> [console log snipped]

Can you please provide debug logs from the metadata agent? Does it work when you turn off syslog? Do you use the "use_syslog" config option?
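For reference, the logging knobs in question live in metadata_agent.ini; something like the following (illustrative values; these are standard oslo.log options) forces debug output to a local file instead of syslog:

```ini
[DEFAULT]
debug = True
use_syslog = False
log_dir = /var/log/neutron
log_file = metadata-agent.log
```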

Comment 10 Charles Crouch 2016-08-09 16:33:02 UTC
[5:28 PM] Kahou Lei: Turn off syslog doesn't work either. Sorry I forgot to update the ticket
[5:29 PM] Charles Crouch: thanks, then I think they will definitely need some logs
[5:29 PM] Kahou Lei: Tell him that metadata log doesn't show anything even I turn on debug log level

Comment 11 Jakub Libosvar 2016-08-10 09:15:16 UTC
(In reply to Charles Crouch from comment #10)
> [5:28 PM] Kahou Lei: Turn off syslog doesn't work either. Sorry I forgot to
> update the ticket
> [5:29 PM] Charles Crouch: thanks, then I think they will definitely need
> some logs
> [5:29 PM] Kahou Lei: Tell him that metadata log doesn't show anything even I
> turn on debug log level

That sounds like a configuration issue with the loggers. If there are no logs, can we get the sos report, please?

Comment 12 kahou 2016-09-19 15:04:46 UTC
Hi Jakub,

Sorry for the late reply. I will generate the sosreport by tomorrow.

Thanks,
Kahou

Comment 13 kahou 2016-09-20 17:22:36 UTC
Hi Jakub,

Due to a technical issue that delayed the debugging, I am still trying to gather the sosreport.

Thanks,
Kahou

Comment 14 Charles Crouch 2016-10-04 15:05:48 UTC
Quick update
There has been good progress on this issue in the background between Red Hat and Metacloud engineering. There should be an update posted this week with further details.

Comment 15 Jakub Libosvar 2016-11-21 09:51:53 UTC
Any updates on this?

Comment 16 kahou 2016-11-21 19:15:26 UTC
Hi Jakub,

This is the same issue as the upstream bug: https://bugs.launchpad.net/neutron/+bug/1556991

Thanks,
Kahou

