Bug 1818033 - s390x - ccw network devices are not configured after hotplug
Summary: s390x - ccw network devices are not configured after hotplug
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Multi-Arch
Version: 4.3.z
Hardware: s390x
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: jschinta
QA Contact: Douglas Slavens
URL:
Whiteboard: MULTI-ARCH
Depends On:
Blocks: ocp-42-45-z-tracker
Reported: 2020-03-27 13:54 UTC by wvoesch
Modified: 2022-04-12 06:01 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-12 06:01:45 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | coreos fedora-coreos-config pull 848 | 0 | None | open | [WIP] overlay: generate and propagate udev rules for s390x znet devices | 2021-02-23 15:48:41 UTC

Description wvoesch 2020-03-27 13:54:45 UTC
Description of the problem:
When the chpid is varied off and back on to simulate an outage, the network does not come back up automatically. The guest has to be restarted.
 
The HiperSockets device comes up with its persistent configuration, which is missing a setting for layer2=x (usually 1), so it ends up in layer 3 mode. That mode does not work if the rest of the network is layer 2, and it can only be set when enabling the device, not while the device is already online.
 
If the command chzdev <device> -p layer2=1 is issued before varying the chpid, the network comes up on its own.
 
Configuration:
Z13
HiperSockets
Server Version: 4.3.0-0.nightly-s390x-2020-03-09-183623
Kubernetes Version: v1.16.2
Kernel Version: 4.18.0-147.5.1.el8_1.s390x
 OS Image: Red Hat Enterprise Linux CoreOS 43.81.202003091812.0 (Ootpa)
 
How reproducible:
Every time
 
Steps to Reproduce:
On a node:

    chchp -v 0 <chpid>
    wait 3 seconds
    chchp -v 1 <chpid>
    chzdev -e <device>
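
The steps above can be wrapped in a small script. This is only a sketch: the function name bounce_chpid and the DRY_RUN switch are hypothetical, and the commands are the ones listed in the steps, unchanged. With DRY_RUN=1 the commands are printed instead of executed, so the sequence can be reviewed on a machine without s390-tools.

```shell
# Hypothetical helper: vary a CHPID off and on, then try to enable the device.
# $1: CHPID, $2: device ID. Set DRY_RUN=1 to only print the commands.
bounce_chpid() {
    chpid=$1
    device=$2
    for cmd in "chchp -v 0 $chpid" "sleep 3" "chchp -v 1 $chpid" "chzdev -e $device"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd"
        else
            # Word splitting is intended here: each entry is a simple command line.
            $cmd || return 1
        fi
    done
}
```

Example (dry run): DRY_RUN=1 followed by bounce_chpid <chpid> <device>.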

 
 
Actual results:
Network not available
 
 
Expected results:
Network available.

Comment 1 Carvel Baus 2020-05-19 01:44:51 UTC
Did you collect any logs from the node when this occurred?

Comment 2 wvoesch 2020-05-20 07:59:01 UTC
No particular logs; however, I currently have a cluster on HiperSockets (version 4.4.0-0.nightly-s390x-2020-05-18-143518) and could reproduce the behavior. Is there any log in particular that you want?

I have attached the output of the script that varies the HiperSockets off and on.

$ lszdev <DEVICE> -i -i
DEVICE qeth 0.0.<DEVICE>:0.0.<DEVICE>:0.0.<DEVICE>
  Names              : <interface-name>
  Network interfaces : <interface-name>
  Resources provided : IPv4 address <worker-ip>
                       IPv6 address <worker-ip>
  Modules            : qeth
  Online             : yes
  Exists             : yes
  Persistent         : yes
  Device path        : /sys/bus/ccwgroup/drivers/qeth/0.0.<DEVICE>/

  ATTRIBUTE                ACTIVE             PERSISTENT
  bridge_hostnotify        "0"                -
  bridge_reflect_promisc   "none"             -
  bridge_role              "none"             -
  bridge_state             "inactive"         -
  buffer_count             "128"              -
  card_type                "HiperSockets"     -
  chpid                    "FD"               -
  hw_trap                  "disarm"           -
  if_name                  "<interface-name>" -
  inbuf_size               "40k"              -
  isolation                "none"             -
  layer2                   "1"                -
  online                   "1"                "1"
  performance_stats        "0"                -
  portname                 ""                 -
  portno                   "0"                -
  priority_queueing        "always queue 2"   -
  state                    "UP (LAN ONLINE)"  -
  vnicc/bridge_invisible   "n/a"              -
  vnicc/flooding           "n/a"              -
  vnicc/learning           "n/a"              -
  vnicc/learning_timeout   "n/a"              -
  vnicc/mcast_flooding     "n/a"              -
  vnicc/rx_bcast           "n/a"              -
  vnicc/takeover_learning  "n/a"              -
  vnicc/takeover_setvmac   "n/a"              -

  READONLY   ACTIVE
  cdev0      "0.0.<DEVICE>"
  cdev1      "0.0.<DEVICE>"
  cdev2      "0.0.<DEVICE>"
  driver     "qeth"
  subsystem  "ccwgroup"

$ chchp -v 0 <CHPID>
Vary offline 0.<CHPID>... done.

waiting 3 seconds

$ chchp -v 1 <CHPID>
Vary online 0.<CHPID>... done.

waiting 5 seconds

$ chzdev -e <DEVICE>-<DEVICE>
QETH device 0.0.<DEVICE>:0.0.<DEVICE>:0.0.<DEVICE> already configured

$ lszdev <DEVICE> -i -i
DEVICE qeth 0.0.<DEVICE>:0.0.<DEVICE>:0.0.<DEVICE>
  Names              : <interface-name>
  Network interfaces : <interface-name>
  Resources provided : IPv4 address <worker-ip>
                       IPv6 address <worker-ip>
  Modules            : qeth
  Online             : yes
  Exists             : yes
  Persistent         : yes
  Device path        : /sys/bus/ccwgroup/drivers/qeth/0.0.<DEVICE>/

  ATTRIBUTE             ACTIVE             PERSISTENT
  buffer_count          "128"              -
  card_type             "HiperSockets"     -
  chpid                 "FD"               -
  fake_broadcast        "0"                -
  hsuid                 ""                 -
  hw_trap               "disarm"           -
  if_name               "<interface-name>" -
  inbuf_size            "40k"              -
  ipa_takeover/add4     ""                 -
  ipa_takeover/add6     ""                 -
  ipa_takeover/enable   "0"                -
  ipa_takeover/invert4  "0"                -
  ipa_takeover/invert6  "0"                -
  isolation             "none"             -
  layer2                "0"                -
  online                "1"                "1"
  performance_stats     "0"                -
  portname              ""                 -
  portno                "0"                -
  priority_queueing     "always queue 2"   -
  route4                "no_router"        -
  route6                "no_router"        -
  rxip/add4             ""                 -
  rxip/add6             ""                 -
  sniffer               "0"                -
  state                 "UP (LAN ONLINE)"  -
  vipa/add4             ""                 -
  vipa/add6             ""                 -

  READONLY   ACTIVE
  cdev0      "0.0.<DEVICE>"
  cdev1      "0.0.<DEVICE>"
  cdev2      "0.0.<DEVICE>"
  driver     "qeth"
  subsystem  "ccwgroup"

$ ping -c1 <bastion-ip>
PING <bastion-ip> (<bastion-ip>) 56(84) bytes of data.

--- <bastion-ip> ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
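
The relevant difference between the two lszdev listings above is the layer2 row: ACTIVE is "1" before the vary and "0" afterwards, with no PERSISTENT value in either case. A minimal sketch for pulling that row out of the listing follows. The function name zdev_attr is hypothetical, and the parsing assumes the column layout shown above; it only handles values without embedded spaces (so layer2 and online work, priority_queueing does not).

```shell
# Hypothetical helper: read `lszdev <DEVICE> -i -i` output on stdin and print
# the ACTIVE and PERSISTENT columns of one attribute, with quotes stripped.
zdev_attr() {
    awk -v attr="$1" '
        $1 == attr {
            active = $2
            persistent = $NF
            gsub(/"/, "", active)
            gsub(/"/, "", persistent)
            print active, persistent
            exit
        }'
}
```

Example: lszdev <DEVICE> -i -i | zdev_attr layer2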

Comment 5 Carvel Baus 2020-09-08 13:31:21 UTC
Adding UpcomingSprint as this will not be resolved during current sprint.

Comment 6 Dan Li 2020-09-30 20:08:49 UTC
Adding UpcomingSprint as team is fixing other bugs and will not have the bandwidth to resolve this bug.

Comment 7 Carvel Baus 2020-10-12 19:53:11 UTC
Do we know if this is still an issue in the current release, 4.5?

Comment 8 Carvel Baus 2020-10-20 12:55:42 UTC
Adding Upcomingsprint. Other issues/effort took priority.

Comment 9 wvoesch 2020-10-20 14:06:23 UTC
This issue is still current in version: 4.6.0-0.nightly-s390x-2020-10-06-145952

Comment 10 Carvel Baus 2020-11-10 14:07:22 UTC
Adding Upcomingsprint. Other issues/effort took priority.

Comment 11 Dan Li 2020-12-16 15:12:28 UTC
Adding UpcomingSprint and re-setting the target version to "---"

Comment 12 Carvel Baus 2021-01-11 21:40:33 UTC
Currently getting worked, but won't make it by end of sprint. Adding upcoming sprint.

Comment 13 Carvel Baus 2021-01-12 18:18:52 UTC
Are you able to provide the dmesg log from the moment just before configuring the channel path to offline through when you attempt to bring it back up? I am interested in any errors that show up in there.

Comment 14 wvoesch 2021-01-21 09:37:41 UTC
Here is the dmesg output, starting with the channel path going offline and ending when it was varied back online.

[  105.300296] cio: 0.0.8213: The device stopped operating while being set offline
[  105.352377] cio: 0.0.8214: The device stopped operating while being set offline
[  105.352406] cio: 0.0.8212: The device stopped operating while being set offline
[  113.432843] ctcm: CTCM driver initialized
[  113.459884] lcs: Loading LCS driver
[  113.488148] qeth 0.0.8212: Completion Queueing supported
[  113.532917] qeth: register layer 3 discipline
[  113.533380] qeth 0.0.8212: Completion Queueing supported
[  113.535167] qdio: 0.0.8214 HS on SC 10 using AI:1 QEBSM:1 PRI:0 TDD:1 SIGA:RW A
[  113.535193] qeth 0.0.8212: Completion Queue support disabled
[  113.572947] qeth 0.0.8212: Device is a HiperSockets card (level: HSSP)
               with link type HiperSockets.
[  113.573058] qeth 0.0.8212: Inbound source MAC-address not supported on hsi%d
[  113.573125] qeth 0.0.8212: VLAN enabled
[  113.573186] qeth 0.0.8212: Multicast enabled
[  113.573189] qeth 0.0.8212: IPV6 enabled
[  113.573296] qeth 0.0.8212: Broadcast enabled
[  113.575409] qeth 0.0.8212 enc8212: renamed from hsi0
[  113.650108] IPv6: ADDRCONF(NETDEV_UP): enc8212: link is not ready

Comment 15 wvoesch 2021-02-05 15:27:36 UTC
After a conversation with development, we believe that the scenario described here is a valid test case. The firmware can potentially decide to turn HiperSockets off and on. SSHing to the node is just an easy way to emulate this behavior.
Development will investigate further and will add the findings here. We propose to keep the status of this bug as is.

Comment 16 Julian Wiedmann 2021-02-08 12:43:46 UTC
Hi Wolfgang, there are two aspects here, I believe.

1. "The Hipersocket will come up with it's persistent configuration which is missing a configuration for layer2=x (usually 1) and end up in layer 3 mode ..."

This is to be expected. If the persistent configuration doesn't specify a layer2 attribute, the qeth driver will select layer2=0 for HiperSockets devices. So we should understand why the persistent configuration doesn't specify a layer2 attribute.


2. Looking at your test scenario
   > chchp -v 0 <chpid>
   > wait 3 seconds
   > chchp -v 1 <chpid>
   > wait 5 seconds
   > chzdev -e <device>

the 'chzdev -e' step shouldn't be necessary. We just need to wait a bit until the CHP is back up & has raised uevents for the re-discovered ccw devices, so that the persistent udev rules can be applied. If this doesn't work reliably, it's a different issue that we should chase down separately.
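
Instead of a fixed sleep, a test script could poll the device's sysfs attributes after the CHP is varied back on. The following is a sketch under assumptions: the function name wait_attr is hypothetical, and it assumes the attributes listed by lszdev (such as online and layer2) are readable as files under the device path shown earlier, /sys/bus/ccwgroup/drivers/qeth/0.0.<DEVICE>/.

```shell
# Hypothetical helper: poll an attribute file until it contains the expected
# value or the timeout expires.
# $1: attribute file, $2: expected value, $3: timeout in seconds (default 30)
wait_attr() {
    file=$1
    expected=$2
    remaining=${3:-30}
    while :; do
        [ "$(cat "$file" 2>/dev/null)" = "$expected" ] && return 0
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}
```

Example: wait_attr /sys/bus/ccwgroup/drivers/qeth/0.0.<DEVICE>/layer2 1 30 returns non-zero if the device has not come back in layer 2 mode within 30 seconds.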

Comment 17 Julian Wiedmann 2021-02-10 10:59:24 UTC
Ok, some progress on both aspects. The persistent config isn't just missing the layer2 attribute - we're missing the udev rules for the configured device(s) as a whole :). This also explains why the device isn't re-configured automatically when the CHP comes back up.

Nikita suspects that we're missing a step when switching over from dracut, and likely just need to copy over the s390-specific udev rules at that stage. He offered to follow-up on this in a free moment.

Comment 18 Nikita Dubrovskii (IBM) 2021-02-11 13:42:25 UTC
Hi all.
I did some debugging and here are my observations:
- although s390-tools provides a dracut module, it doesn't generate any udev rules
- RHEL 8.2 doesn't contain 41-***.rules either

I've made a patch (for RHCOS) which creates and propagates those rules, but I'm not sure whether we should fix this.
Maybe it's not an issue but by design.

Comment 19 Hendrik Brueckner 2021-02-11 14:48:46 UTC
Hi,

(In reply to Julian Wiedmann from comment #17)
> Ok, some progress on both aspects. The persistent config isn't just missing
> the layer2 attribute - we're missing the udev rules for the configured
> device(s) as a whole :). This also explains why the device isn't
> re-configured automatically when the CHP comes back up.

AFAIK, I can confirm that :) The point here is that RHCOS uses the rd.znet= dracut parameter to configure the network interface. In RHEL, this parameter is only specified for the installation because anaconda creates the respective configuration files for NetworkManager. After that, NetworkManager manages enabling the device and the layer2 configuration. Note that anaconda does not yet use zdev tooling. You can use zdev, but you would then probably mix it with other means of configuring dasd, zfcp, and network devices.

With that said, the HiperSockets device will not come up after a chpid vary because there is no infrastructure in place to handle and re-configure it.

> 
> Nikita suspects that we're missing a step when switching over from dracut,
> and likely just need to copy over the s390-specific udev rules at that
> stage. He offered to follow-up on this in a free moment.

So looks like the dracut module should ideally create either a zdev udev rule out of the rd.znet specification (and remove the kernel parameter then) or somehow integrate entirely into NetworkManager to let NM manage the device on its own.

Hope this helps.
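
To illustrate what "creating a zdev rule out of the rd.znet specification" would have to extract, here is a sketch that splits an rd.znet= value into its parts. The function name parse_znet is hypothetical, and the sample value in the example is illustrative (the actual kernel command line is not shown in this bug). It assumes the usual form rd.znet=<nettype>,<subchannels>,<options>, and joins the subchannels with ":" as in the lszdev output above.

```shell
# Hypothetical helper: split the value of an rd.znet= kernel parameter into
# device type, colon-joined device ID, and space-separated settings.
parse_znet() {
    echo "$1" | awk -F, '{
        ids = ""
        opts = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /=/)
                opts = opts (opts != "" ? " " : "") $i   # key=value setting
            else
                ids = ids (ids != "" ? ":" : "") $i      # subchannel bus ID
        }
        print "type=" $1
        print "id=" ids
        print "settings=" opts
    }'
}
```

Example: parse_znet "qeth,0.0.8212,0.0.8213,0.0.8214,layer2=1" prints the type (qeth), the grouped device ID, and the settings that a persistent configuration would need to carry.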

Comment 20 Nikita Dubrovskii (IBM) 2021-02-11 14:57:03 UTC
Here is a PR with a workaround/fix: https://github.com/coreos/fedora-coreos-config/pull/848

Comment 21 Julian Wiedmann 2021-02-12 10:21:33 UTC
(In reply to Hendrik Brueckner from comment #19)
> Hi,
> 
> (In reply to Julian Wiedmann from comment #17)
> > Ok, some progress on both aspects. The persistent config isn't just missing
> > the layer2 attribute - we're missing the udev rules for the configured
> > device(s) as a whole :). This also explains why the device isn't
> > re-configured automatically when the CHP comes back up.
> 
> AFAIK, I would confirm with that :) The point here is that RHCOS uses the
> rd.znet= dracut parameter to configure the network interface. In RHEL, this
> parameter is only specified for the installation because anaconda creates
> the respective configuration files for the NetworkManager. After that, the
> networkmanager manages device enable and layer2 configuration.  Note that
> anaconda does not yet use zdev tooling. You can use zdev but probably mix up
> with other means, like, dasd, zfcp, and network configuration.
> 

Thanks. We had been wondering how RHEL handles this part...

> With that said, the hipersockets will not come up after a chpid because
> there is no infrastructure in place to handle and re-configure it.
> 

Ack. For context, the same of course also applies to any CP DETACH / ATTACH scenario (Wolfgang had asked about this).

> > 
> > Nikita suspects that we're missing a step when switching over from dracut,
> > and likely just need to copy over the s390-specific udev rules at that
> > stage. He offered to follow-up on this in a free moment.
> 
> So looks like the dracut module should ideally create either a zdev udev
> rule out of the rd.znet specification (and remove the kernel parameter then)
> or somehow integrate entirely into NetworkManager to let NM manage the
> device on its own.
> 

As you already mentioned above, dasd and zfcp devices might have similar needs. So not sure if NM integration is a good fit here.

> Hope this helps.

Comment 22 Carvel Baus 2021-02-23 20:43:11 UTC
There is a PR WIP but it does not appear that it will complete during this sprint.

Comment 23 Carvel Baus 2021-03-17 13:37:11 UTC
The referenced PR is still WIP so this will not complete this sprint.

Comment 24 Carvel Baus 2021-04-06 20:03:57 UTC
The referenced PR is still WIP so this will not complete this sprint.

Comment 25 Carvel Baus 2021-04-29 12:29:59 UTC
The referenced PR is still WIP so this will not complete this sprint.

Comment 26 Carvel Baus 2021-05-17 18:03:43 UTC
The referenced PR is still WIP so this will not complete this sprint.

Comment 27 Dan Li 2021-06-07 19:16:06 UTC
Hi Nikita, do you know if this bug will be resolved before the end of the current sprint? If not, I hope to set the "reviewed-in-sprint" flag.

Comment 28 Dan Li 2021-06-08 13:43:16 UTC
Re-assigning to Julian per Z team's feedback.

Comment 29 Julian Wiedmann 2021-06-17 11:21:11 UTC
(In reply to Nikita Dubrovskii (IBM) from comment #20)
> Here is a PR fix workaround/fix:
> https://github.com/coreos/fedora-coreos-config/pull/848

Alrighto, I had a good enough dig through the early-stage network parts to get a first idea. Wow!

Will update in the pull request for now, as that has all the usual suspects on CC...

Comment 30 Dan Li 2021-06-28 19:45:32 UTC
Hi Julian, do you think you will continue to work on this bug during the next sprint (after July 3rd)? If so, I'd like to set the "reviewed-in-sprint" flag.

Comment 31 Julian Wiedmann 2021-06-29 07:35:44 UTC
Hi Dan, yes please set the flag.

Comment 32 Julian Wiedmann 2021-07-05 15:09:51 UTC
Looks like the problem is sufficiently understood:
1. Under some circumstances, nm-initrd-generator doesn't think it's necessary to create a NM keyfile (which would contain the needed s390-options for znet hotplug). But we can work around this so that nm-initrd-generator _does_ write the file, e.g. by explicitly spelling out the interface name in the ip=... statement on the cmdline.
2. ccw_init in s390utils currently doesn't pick up the s390-options from a NM keyfile. So we need to cherry-pick Dan Horak's fix (https://bugzilla.redhat.com/show_bug.cgi?id=1885913) for RHEL 8 and RHCOS.

Long-term, there are ongoing discussions on how the s390-options should best be handled in a pure-NM keyfile environment.
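
For the workaround in item 1, a quick check of whether an ip= value names the interface explicitly could look like the sketch below. It assumes the static dracut form ip=<client-IP>:[<peer>]:<gateway-IP>:<netmask>:<hostname>:<interface>:<method> with IPv4 addresses (bracketed IPv6 addresses contain colons and are not handled). The function name ip_arg_interface is hypothetical.

```shell
# Hypothetical helper: print the interface field (6th colon-separated field)
# of a static ip= value; prints an empty line if the field is left blank and
# nothing if the value has fewer than 7 fields.
ip_arg_interface() {
    echo "$1" | awk -F: 'NF >= 7 { print $6 }'
}
```

Example: ip_arg_interface "<ip>::<gateway>:<netmask>:<hostname>:enc8212:none" prints enc8212, while the same value with an empty interface field prints an empty line.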

Comment 33 Julian Wiedmann 2021-07-09 08:18:55 UTC
Dan, could you please pick up your fix [1] for RHEL 8, so that we can bring it into RHCOS ? I'm happy to take care of testing as needed ...

[1] https://fedorapeople.org/cgit/sharkcz/public_git/utils.git/commit/?id=6f264c2a4279fa9616f2450f09fcc93bd4c2b7c6

Comment 34 Dan Horák 2021-07-09 09:28:35 UTC
I can include the commit, but it will need a separate bug opened against s390utils.

Comment 35 Julian Wiedmann 2021-07-09 10:39:06 UTC
(In reply to Dan Horák from comment #34)
> I can include the commit, but it will need a separate bug opened against
> s390utils.

Sure, hopefully https://bugzilla.redhat.com/show_bug.cgi?id=1980708 is all that's needed.

Comment 36 Dan Li 2021-07-19 17:54:25 UTC
Hi Julian, do you think this bug will be continued during the next sprint (after July 24th)? If so, I'd like to set the "reviewed-in-sprint" flag.

Comment 37 Julian Wiedmann 2021-07-20 06:21:59 UTC
Hi Dan Li, yes please do. We're waiting for the identified fix to propagate into RHEL and RHCOS.

Comment 38 Dan Li 2021-08-10 17:18:52 UTC
Julian is out of the office this week, so I am adding "reviewed-in-sprint" flag as it is unlikely that he will get to this bug.

Comment 39 Dan Li 2021-08-30 16:28:49 UTC
Hi Julian, do you think this bug will continue to be worked on in the next sprint (after Sep 4th)? If so, I'd like to set the "reviewed-in-sprint" flag.

Comment 40 Julian Wiedmann 2021-08-31 06:51:28 UTC
Hi Dan - yes, please set the flag. The s390utils fix is in flight for RHEL 8.5, but imho it feels too late to still squeeze it into OCP 4.9

Comment 41 Timothée Ravier 2021-08-31 10:27:36 UTC
Note that RHCOS is RHEL and currently RHCOS 4.7 and later use RHEL 8.4. Thus anything that gets backported to RHEL 8.4 packages will be included in RHCOS 4.7 and later at some point.

Comment 43 Julian Wiedmann 2021-09-08 11:56:13 UTC
Thanks Timothée, I added an 8.4-z request to the RHEL bz.

Comment 44 Dan Li 2021-09-20 18:27:31 UTC
Hi Julian, do you think this bug will be continued to be worked on in the next sprint (after Sep 25th)? If so, I'd like to set the "reviewed-in-sprint" flag.

Comment 45 Julian Wiedmann 2021-09-21 13:40:17 UTC
Hi Dan - the bug is not finished, but I also don't expect further work until the 8.4-z request for RHEL makes progress.

Comment 46 Dan Li 2021-09-21 13:43:08 UTC
Thanks Julian - adding reviewed-in-sprint

Comment 47 Dan Li 2021-10-11 14:32:37 UTC
Hi Julian, is this bug still waiting for the 8.4-z request for RHEL? If so, I'd like to add "reviewed-in-sprint" flag to indicate that we have looked at this bug but are unable to progress further until things are ready from the RHEL side.

Comment 48 Dan Li 2021-10-13 14:10:45 UTC
Add reviewed-in-sprint as Julian is on PTO and it is unlikely that this bug will be fixed this sprint.

Comment 49 Dan Li 2021-11-01 13:15:33 UTC
Hi Julian, is this bug still waiting for the 8.4-z request for RHEL? If so, I'd like to add "reviewed-in-sprint" flag to indicate that the work will continue.

Comment 50 Julian Wiedmann 2021-11-01 19:38:30 UTC
Hi Dan Li, yes still waiting for the 8.4-z request. It progressed to ON_QA today.

Comment 51 Dan Li 2021-11-22 21:05:36 UTC
Hi Julian, is this bug still waiting for the 8.4-z request? If so, I'd like to set "reviewed-in-sprint" flag to indicate that the work will continue.

Comment 52 Julian Wiedmann 2021-11-23 08:21:06 UTC
Hi Dan Li, yes still waiting for the 8.4.z fix (which is VERIFIED now).

For the record, RHEL 8.5 recently went GA with the fix.

Comment 53 Dan Li 2022-01-04 17:33:25 UTC
Jan is OOTO until next week, so it is unlikely that this bug will be resolved before the end of the current sprint. Setting reviewed-in-sprint.

Comment 54 Dan Li 2022-01-24 16:19:13 UTC
Hi Jan, do you think this bug will be resolved (reach ON_QA state) before the end of the current sprint on January 29th? If not, I'd like to set the reviewed-in-sprint flag.

Comment 55 jschinta 2022-01-24 16:29:58 UTC
Hi Dan,
https://bugzilla.redhat.com/show_bug.cgi?id=2002391 is still in verified, so you can go ahead and set the reviewed-in-sprint flag.

Comment 57 Dan Li 2022-02-14 18:41:00 UTC
Hi Jan, checking in again: do you think this bug will be resolved (reach ON_QA state) before the end of the current sprint on February 19th? If not, I'd like to set the reviewed-in-sprint flag. Thank you!

Comment 58 jschinta 2022-02-15 07:24:03 UTC
Hi Dan,
no changes in status here. Please set the flag.

Comment 59 Dan Li 2022-03-07 14:41:31 UTC
Hi Jan, do you think this bug will be resolved before the end of this sprint (March 12th)? If not, I'd like to set "reviewed-in-sprint" flag.

Comment 60 jschinta 2022-03-07 16:29:41 UTC
Hi Dan,

no changes in status in the RHEL BZ. I guess I need to follow up on what is (not) happening there.
For now please set the flag.

Comment 61 Dan Li 2022-03-07 17:34:26 UTC
Thanks Jan. Setting the flag.

Comment 62 jschinta 2022-03-28 07:24:51 UTC
Hi Dan,
the BZ is still in verified, so you can set the "reviewed-in-sprint" flag.
But I'm wondering why that BZ hasn't landed in a Z stream yet (or if it has and was just not updated).
I'll look into it next sprint.

Comment 63 Dan Li 2022-03-28 10:41:05 UTC
Thanks Jan. Keeping the reviewed-in-sprint flag

Comment 64 jschinta 2022-04-11 14:52:36 UTC
Hi Dan,

after some investigation, I found that all the latest RHCOS nightlies, excluding RHCOS 4.6, had s390utils-core-2.15.1-5.el8.s390x installed, which according to https://bugzilla.redhat.com/show_bug.cgi?id=2002391 contains the fix.
I verified this with OCP 4.7.46 and RHCOS 47.84.202203141333-0
Steps:
    chchp -v 0 <chpid>
    wait 3 seconds
    chchp -v 1 <chpid>
Result:
    Network available

Seems to me like this is probably an accounting error in the BZs. From my side we can close this BZ, do you need any additional information for that?

Comment 65 Dan Li 2022-04-11 16:12:43 UTC
Hi Jan, just to be thorough it might be good to update the team on the fix during our Thursday meeting. But I don't have any additional information or concerns about bug closure.

