Bug 1508694 - How to get per instance bandwidth
Summary: How to get per instance bandwidth
Status: CLOSED CURRENTRELEASE
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: zstream
Target Release: 10.0 (Newton)
Assignee: Pradeep Kilambi
QA Contact: Gurenko Alex
Blocks: 1473146
 
Reported: 2017-11-02 00:49 UTC by Robin Cernin
Modified: 2021-02-01 02:39 UTC
CC List: 12 users

Fixed In Version: openstack-tripleo-heat-templates-5.3.3-3.el7ost
Last Closed: 2018-03-08 07:44:22 UTC



Description Robin Cernin 2017-11-02 00:49:10 UTC
Description of problem:

How to set up per-instance bandwidth monitoring.

We have configured the L3 meter using the following resource:
https://wiki.openstack.org/wiki/Neutron/Metering/Bandwidth

We need to document this feature for our customers.

Version-Release number of selected component (if applicable):

# rpm -qa | grep neutron-metering
openstack-neutron-metering-agent-7.2.0-20.el7ost.noarch

# rpm -qa | grep ceilometer
openstack-ceilometer-polling-5.0.5-6.el7ost.noarch
openstack-ceilometer-alarm-5.0.5-6.el7ost.noarch
openstack-ceilometer-api-5.0.5-6.el7ost.noarch
openstack-ceilometer-notification-5.0.5-6.el7ost.noarch
openstack-ceilometer-collector-5.0.5-6.el7ost.noarch
openstack-ceilometer-central-5.0.5-6.el7ost.noarch
python-ceilometer-5.0.5-6.el7ost.noarch
openstack-ceilometer-common-5.0.5-6.el7ost.noarch
python-ceilometerclient-1.5.2-1.el7ost.noarch
openstack-ceilometer-compute-5.0.5-6.el7ost.noarch


How reproducible:

See below:

Steps to Reproduce:

1. Check that metering is enabled in `/etc/neutron/neutron.conf`:
---
[DEFAULT]
service_plugins=router,metering
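
If the plugin is loaded, the metering extension should appear in the extension list; a quick way to verify (a suggested check, assuming the `neutron` CLI is available):
---
neutron ext-list | grep -i metering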

2. Check that the metering agent is configured in `/etc/neutron/metering_agent.ini`:
---
[DEFAULT]
debug = True
driver = neutron.services.metering.drivers.iptables.iptables_driver.IptablesMeteringDriver
measure_interval = 30
report_interval = 300
interface_driver = neutron.agent.linux.interface.OVSInterfaceDriver
use_namespaces = True

3. Restart the `neutron-server`:
---
pcs resource restart neutron-server

3a. Start and enable the `neutron-metering-agent` service:
---
systemctl enable neutron-metering-agent.service
systemctl start neutron-metering-agent.service
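
Once running, the agent should register itself with neutron; to verify (a suggested check):
---
neutron agent-list | grep -i metering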

4. Create neutron metering label `project1`:
---
neutron meter-label-create project1 --shared

5. Create a neutron meter label rule for each direction on the instance IP whose bandwidth you wish to monitor:
---
neutron meter-label-rule-create project1 10.0.0.114/32 --direction egress
neutron meter-label-rule-create project1 10.0.0.114/32 --direction ingress
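
To confirm the label and both rules exist (standard neutron client listing commands):
---
neutron meter-label-list
neutron meter-label-rule-list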

The label rules create a bandwidth meter:

$ ceilometer meter-list  | grep bandwidth
| bandwidth                  | delta | B         | 40b6d8b9-a687-423f-8add-fb904db6e7a9 | None                             | 2c86836ca10e47b4b2a9be60316772c2 |

With this in place we should be able to query bandwidth statistics for the instance, but the query returns no data:

$  ceilometer statistics -m bandwidth -q "resource=970e25fb-2f8e-47ef-a433-b5f0ea22073e"
+--------+--------------+------------+-----+-----+-----+-----+-------+----------+----------------+--------------+
| Period | Period Start | Period End | Max | Min | Avg | Sum | Count | Duration | Duration Start | Duration End |
+--------+--------------+------------+-----+-----+-----+-----+-------+----------+----------------+--------------+
+--------+--------------+------------+-----+-----+-----+-----+-------+----------+----------------+--------------+

Actual results:

When we pull statistics from the bandwidth meter in general, we see that some data is collected, but nothing attributed to the instance itself.

$  ceilometer statistics -m bandwidth
+--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+
| Period | Period Start               | Period End                 | Max | Min | Avg | Sum | Count | Duration | Duration Start             | Duration End               |
+--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+
| 0      | 2017-10-11T17:06:30.004000 | 2017-11-02T00:34:48.012000 | 0.0 | 0.0 | 0.0 | 0.0 | 34    | 2639.992 | 2017-11-01T23:50:48.020000 | 2017-11-02T00:34:48.012000 |
+--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+
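
Note that in the meter-list output the bandwidth resource ID (40b6d8b9-...) matches the meter label, not the instance UUID, which may explain why the per-instance query above returned nothing. To inspect which resource the collected samples are actually attributed to (a suggested check; -l limits the number of rows):
---
ceilometer sample-list -m bandwidth -l 10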

Expected results:

We want to be able to get bandwidth data per instance.

Additional info:

We noticed that no neutron-meter rules are created in iptables inside the qrouter namespace:

# ip netns exec qrouter-9c5cf064-01e0-4dec-a210-e874559526bc iptables -vnL
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
  166 17380 neutron-l3-agent-INPUT  all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain FORWARD (policy ACCEPT 10 packets, 840 bytes)
 pkts bytes target     prot opt in     out     source               destination         
  150 18430 neutron-filter-top  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
  150 18430 neutron-l3-agent-FORWARD  all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 196 packets, 7840 bytes)
 pkts bytes target     prot opt in     out     source               destination         
87262 3501K neutron-filter-top  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
87262 3501K neutron-l3-agent-OUTPUT  all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain neutron-filter-top (2 references)
 pkts bytes target     prot opt in     out     source               destination         
87412 3520K neutron-l3-agent-local  all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain neutron-l3-agent-FORWARD (1 references)
 pkts bytes target     prot opt in     out     source               destination         

Chain neutron-l3-agent-INPUT (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            mark match 0x1/0xffff
    0     0 DROP       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:9697

Chain neutron-l3-agent-OUTPUT (1 references)
 pkts bytes target     prot opt in     out     source               destination         

Chain neutron-l3-agent-local (1 references)
 pkts bytes target     prot opt in     out     source               destination         
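
For comparison, when the metering agent has successfully programmed a router, the filter table should also contain neutron-meter-* label chains (an assumption based on the iptables metering driver's chain naming); a quick presence check:
---
ip netns exec qrouter-9c5cf064-01e0-4dec-a210-e874559526bc iptables -vnL | grep neutron-meter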


---

# ip netns exec qrouter-9c5cf064-01e0-4dec-a210-e874559526bc iptables -vnL -t nat
Chain PREROUTING (policy ACCEPT 2 packets, 80 bytes)
 pkts bytes target     prot opt in     out     source               destination         
  129  6130 neutron-l3-agent-PREROUTING  all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
   14   688 neutron-l3-agent-OUTPUT  all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain POSTROUTING (policy ACCEPT 1 packets, 84 bytes)
 pkts bytes target     prot opt in     out     source               destination         
   23  1396 neutron-l3-agent-POSTROUTING  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
   11   832 neutron-postrouting-bottom  all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain neutron-l3-agent-OUTPUT (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DNAT       all  --  *      *       0.0.0.0/0            10.0.0.114           to:192.168.3.5

Chain neutron-l3-agent-POSTROUTING (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    6   240 ACCEPT     all  --  !qg-585145aa-93 !qg-585145aa-93  0.0.0.0/0            0.0.0.0/0            ! ctstate DNAT

Chain neutron-l3-agent-PREROUTING (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    8   624 DNAT       all  --  *      *       0.0.0.0/0            10.0.0.114           to:192.168.3.5
    0     0 REDIRECT   tcp  --  qr-+   *       0.0.0.0/0            169.254.169.254      tcp dpt:80 redir ports 9697

Chain neutron-l3-agent-float-snat (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 SNAT       all  --  *      *       192.168.3.5          0.0.0.0/0            to:10.0.0.114

Chain neutron-l3-agent-snat (1 references)
 pkts bytes target     prot opt in     out     source               destination         
   11   832 neutron-l3-agent-float-snat  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
    0     0 SNAT       all  --  *      qg-585145aa-93  0.0.0.0/0            0.0.0.0/0            to:10.0.0.113
    0     0 SNAT       all  --  *      *       0.0.0.0/0            0.0.0.0/0            mark match ! 0x2/0xffff ctstate DNAT to:10.0.0.113

Chain neutron-postrouting-bottom (1 references)
 pkts bytes target     prot opt in     out     source               destination         
   11   832 neutron-l3-agent-snat  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* Perform source NAT on outgoing traffic. */

---

# ip netns exec qrouter-9c5cf064-01e0-4dec-a210-e874559526bc ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
18: ha-896295ae-af: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:29:93:39 brd ff:ff:ff:ff:ff:ff
    inet 169.254.192.2/18 brd 169.254.255.255 scope global ha-896295ae-af
       valid_lft forever preferred_lft forever
    inet 169.254.0.1/24 scope global ha-896295ae-af
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe29:9339/64 scope link 
       valid_lft forever preferred_lft forever
21: qr-ce86a6be-ba: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:25:68:b4 brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.1/24 scope global qr-ce86a6be-ba
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe25:68b4/64 scope link 
       valid_lft forever preferred_lft forever
23: qg-585145aa-93: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN qlen 1000
    link/ether fa:16:3e:e1:9d:f0 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.113/24 scope global qg-585145aa-93
       valid_lft forever preferred_lft forever
    inet 10.0.0.114/32 scope global qg-585145aa-93
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fee1:9df0/64 scope link 
       valid_lft forever preferred_lft forever

Comment 8 Mehdi ABAAKOUK 2017-11-09 08:10:55 UTC
Instance network bandwidth metering works only if libvirt is able to measure it.
To check whether the libvirt side works:

# virsh domiflist 20932a75-3d04-44b1-b5c6-fcbeb7bfbf69
Interface  Type       Source     Model       MAC
-------------------------------------------------------
tapda0261ee-9f bridge     brq07487477-8c virtio      fa:16:3e:f0:c3:89


# virsh domifstat 20932a75-3d04-44b1-b5c6-fcbeb7bfbf69 tapda0261ee-9f
tapda0261ee-9f rx_bytes 1658017639
tapda0261ee-9f rx_packets 19072434
tapda0261ee-9f rx_errs 0
tapda0261ee-9f rx_drop 0
tapda0261ee-9f tx_bytes 38658724198
tapda0261ee-9f tx_packets 20335025
tapda0261ee-9f tx_errs 0
tapda0261ee-9f tx_drop 0

If "Interface" is empty in "virsh domiflist" or if  "virsh domifstat" doesn't output anything, that means the network interface type does not support statistics, so Ceilometer can't retrieved bandwidth.

Comment 9 Mehdi ABAAKOUK 2017-11-09 08:13:36 UTC
I'm sure it works with libvirt >= 2.0.0, and possibly with recent 1.3.x releases, but it was not implemented before 1.3.x.

Comment 10 Robin Cernin 2017-11-10 21:36:46 UTC
In the lab:

libvirt-3.2.0-14.el7_4.2.x86_64

[root@compute-0 ~]# virsh domiflist instance-0000000c
Interface  Type       Source     Model       MAC
-------------------------------------------------------
tapaef4ac7c-50 bridge     qbraef4ac7c-50 virtio      fa:16:3e:34:71:95


[root@compute-0 ~]# virsh domifstat instance-0000000c tapaef4ac7c-50
tapaef4ac7c-50 rx_bytes 47819
tapaef4ac7c-50 rx_packets 387
tapaef4ac7c-50 rx_errs 0
tapaef4ac7c-50 rx_drop 0
tapaef4ac7c-50 tx_bytes 40082
tapaef4ac7c-50 tx_packets 347
tapaef4ac7c-50 tx_errs 0
tapaef4ac7c-50 tx_drop 0

No network.* meters appear in `ceilometer meter-list`:
---

+----------------------------+-------+-----------+--------------------------------------+----------------------------------+----------------------------------+
| Name                       | Type  | Unit      | Resource ID                          | User ID                          | Project ID                       |
+----------------------------+-------+-----------+--------------------------------------+----------------------------------+----------------------------------+
| bandwidth                  | delta | B         | 40b6d8b9-a687-423f-8add-fb904db6e7a9 | None                             | 2c86836ca10e47b4b2a9be60316772c2 |
| disk.ephemeral.size        | gauge | GB        | 794677ff-50fb-4c0f-b3cf-da38752ecbe7 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| disk.ephemeral.size        | gauge | GB        | 970e25fb-2f8e-47ef-a433-b5f0ea22073e | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| disk.ephemeral.size        | gauge | GB        | e01f6a01-d3d8-4eb8-8880-e670f6882164 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| disk.ephemeral.size        | gauge | GB        | f6e421a0-3d12-40c1-9ee1-9f6c15893c63 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| disk.root.size             | gauge | GB        | 794677ff-50fb-4c0f-b3cf-da38752ecbe7 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| disk.root.size             | gauge | GB        | 970e25fb-2f8e-47ef-a433-b5f0ea22073e | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| disk.root.size             | gauge | GB        | e01f6a01-d3d8-4eb8-8880-e670f6882164 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| disk.root.size             | gauge | GB        | f6e421a0-3d12-40c1-9ee1-9f6c15893c63 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| image                      | gauge | image     | cacbb8be-921b-4051-9d82-db43dd3b2ae6 | None                             | 2c86836ca10e47b4b2a9be60316772c2 |
| image.size                 | gauge | B         | cacbb8be-921b-4051-9d82-db43dd3b2ae6 | None                             | 2c86836ca10e47b4b2a9be60316772c2 |
| memory                     | gauge | MB        | 794677ff-50fb-4c0f-b3cf-da38752ecbe7 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| memory                     | gauge | MB        | 970e25fb-2f8e-47ef-a433-b5f0ea22073e | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| memory                     | gauge | MB        | e01f6a01-d3d8-4eb8-8880-e670f6882164 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| memory                     | gauge | MB        | f6e421a0-3d12-40c1-9ee1-9f6c15893c63 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| storage.objects            | gauge | object    | 2c86836ca10e47b4b2a9be60316772c2     | None                             | 2c86836ca10e47b4b2a9be60316772c2 |
| storage.objects            | gauge | object    | 6583fa75c05d42ee849adba853483a1f     | None                             | 6583fa75c05d42ee849adba853483a1f |
| storage.objects.containers | gauge | container | 2c86836ca10e47b4b2a9be60316772c2     | None                             | 2c86836ca10e47b4b2a9be60316772c2 |
| storage.objects.containers | gauge | container | 6583fa75c05d42ee849adba853483a1f     | None                             | 6583fa75c05d42ee849adba853483a1f |
| storage.objects.size       | gauge | B         | 2c86836ca10e47b4b2a9be60316772c2     | None                             | 2c86836ca10e47b4b2a9be60316772c2 |
| storage.objects.size       | gauge | B         | 6583fa75c05d42ee849adba853483a1f     | None                             | 6583fa75c05d42ee849adba853483a1f |
| vcpus                      | gauge | vcpu      | 794677ff-50fb-4c0f-b3cf-da38752ecbe7 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| vcpus                      | gauge | vcpu      | 970e25fb-2f8e-47ef-a433-b5f0ea22073e | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| vcpus                      | gauge | vcpu      | e01f6a01-d3d8-4eb8-8880-e670f6882164 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| vcpus                      | gauge | vcpu      | f6e421a0-3d12-40c1-9ee1-9f6c15893c63 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
+----------------------------+-------+-----------+--------------------------------------+----------------------------------+----------------------------------+

Comment 11 Mehdi ABAAKOUK 2017-11-13 10:22:30 UTC
I have checked what is wrong in the lab.

I found that "host" is unset in ceilometer.conf, which means Ceilometer falls back to the short hostname (without the domain) to determine the name of the compute node registered in Nova.

But Nova has the FQDN configured as "host". So the Ceilometer and Nova compute host names are out of sync, which breaks all libvirt-based metric collection.

I set "host = compute-0.localdomain" in ceilometer.conf and restarted openstack-ceilometer-compute.service.
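
For anyone hitting the same mismatch, the fix is a one-line setting on the compute node (a sketch; the value must match the hypervisor name Nova reports, e.g. via `nova hypervisor-list`):
---
# /etc/ceilometer/ceilometer.conf
[DEFAULT]
host = compute-0.localdomain

To spot the mismatch in the first place, compare `nova hypervisor-list` with the short hostname (`hostname -s`) that Ceilometer falls back to when "host" is unset.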

Metrics are there now:

$ ceilometer meter-list | grep network
| network.incoming.bytes          | cumulative | B         | instance-0000000c-970e25fb-2f8e-47ef-a433-b5f0ea22073e-tapaef4ac7c-50 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| network.incoming.bytes.rate     | gauge      | B/s       | instance-0000000c-970e25fb-2f8e-47ef-a433-b5f0ea22073e-tapaef4ac7c-50 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| network.incoming.packets        | cumulative | packet    | instance-0000000c-970e25fb-2f8e-47ef-a433-b5f0ea22073e-tapaef4ac7c-50 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| network.incoming.packets.rate   | gauge      | packet/s  | instance-0000000c-970e25fb-2f8e-47ef-a433-b5f0ea22073e-tapaef4ac7c-50 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| network.outgoing.bytes          | cumulative | B         | instance-0000000c-970e25fb-2f8e-47ef-a433-b5f0ea22073e-tapaef4ac7c-50 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| network.outgoing.bytes.rate     | gauge      | B/s       | instance-0000000c-970e25fb-2f8e-47ef-a433-b5f0ea22073e-tapaef4ac7c-50 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| network.outgoing.packets        | cumulative | packet    | instance-0000000c-970e25fb-2f8e-47ef-a433-b5f0ea22073e-tapaef4ac7c-50 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
| network.outgoing.packets.rate   | gauge      | packet/s  | instance-0000000c-970e25fb-2f8e-47ef-a433-b5f0ea22073e-tapaef4ac7c-50 | 6d361abcd4a1471ea69519f578961f9e | 2c86836ca10e47b4b2a9be60316772c2 |
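
With the meters present, per-instance bandwidth can now be queried against the interface resource ID from the listing above, e.g.:
---
ceilometer statistics -m network.incoming.bytes.rate -q "resource=instance-0000000c-970e25fb-2f8e-47ef-a433-b5f0ea22073e-tapaef4ac7c-50"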


We fixed the "out of sync host" issue in recent OSP versions; we may need to backport this to OSP8.

I will check with the team what we can do to fix it.

Comment 16 Mehdi ABAAKOUK 2018-03-08 07:44:22 UTC
The fix has been released for a while now and is already part of the latest z release, so I'm closing this.

