Description of the problem:
When the chpid is varied off and on to simulate an outage, the network does not come up automatically afterwards; the guest has to be restarted. The HiperSockets device comes up with its persistent configuration, which lacks a setting for layer2=x (usually 1), and ends up in layer 3 mode. That does not work if the rest of the network is layer 2, and the mode can only be set when enabling the device, not while it is already online. If the command chzdev <device> -p layer2=1 is issued before varying the chpid, the network comes up on its own.

Configuration: Z13 HiperSockets
Server Version: 4.3.0-0.nightly-s390x-2020-03-09-183623
Kubernetes Version: v1.16.2
Kernel Version: 4.18.0-147.5.1.el8_1.s390x
OS Image: Red Hat Enterprise Linux CoreOS 43.81.202003091812.0 (Ootpa)

How reproducible: Every time

Steps to Reproduce:
On a node:
  chchp -v 0 <chpid>
  wait 3 seconds
  chchp -v 1 <chpid>
  chzdev -e <device>

Actual results: Network not available
Expected results: Network available.
Did you collect any logs from the node when this occurred?
No particular logs, but I currently have a cluster on HiperSockets (version 4.4.0-0.nightly-s390x-2020-05-18-143518) and could reproduce the behavior there. Is there any log that you want in particular? I have attached the output of the script that varies the HiperSockets off and on.

$ lszdev <DEVICE> -i -i
DEVICE qeth 0.0.<DEVICE>:0.0.<DEVICE>:0.0.<DEVICE>
  Names              : <interface-name>
  Network interfaces : <interface-name>
  Resources provided : IPv4 address <worker-ip>
                       IPv6 address <worker-ip>
  Modules            : qeth
  Online             : yes
  Exists             : yes
  Persistent         : yes
  Device path        : /sys/bus/ccwgroup/drivers/qeth/0.0.<DEVICE>/

  ATTRIBUTE                  ACTIVE              PERSISTENT
  bridge_hostnotify          "0"                 -
  bridge_reflect_promisc     "none"              -
  bridge_role                "none"              -
  bridge_state               "inactive"          -
  buffer_count               "128"               -
  card_type                  "HiperSockets"      -
  chpid                      "FD"                -
  hw_trap                    "disarm"            -
  if_name                    "<interface-name>"  -
  inbuf_size                 "40k"               -
  isolation                  "none"              -
  layer2                     "1"                 -
  online                     "1"                 "1"
  performance_stats          "0"                 -
  portname                   ""                  -
  portno                     "0"                 -
  priority_queueing          "always queue 2"    -
  state                      "UP (LAN ONLINE)"   -
  vnicc/bridge_invisible     "n/a"               -
  vnicc/flooding             "n/a"               -
  vnicc/learning             "n/a"               -
  vnicc/learning_timeout     "n/a"               -
  vnicc/mcast_flooding       "n/a"               -
  vnicc/rx_bcast             "n/a"               -
  vnicc/takeover_learning    "n/a"               -
  vnicc/takeover_setvmac     "n/a"               -

  READONLY                   ACTIVE
  cdev0                      "0.0.<DEVICE>"
  cdev1                      "0.0.<DEVICE>"
  cdev2                      "0.0.<DEVICE>"
  driver                     "qeth"
  subsystem                  "ccwgroup"

$ chchp -v 0 <CHPID>
Vary offline 0.<CHPID>... done.
waiting 3 seconds
$ chchp -v 1 <CHPID>
Vary online 0.<CHPID>... done.
waiting 5 seconds
$ chzdev -e <DEVICE>-<DEVICE>
QETH device 0.0.<DEVICE>:0.0.<DEVICE>:0.0.<DEVICE> already configured

$ lszdev <DEVICE> -i -i
DEVICE qeth 0.0.<DEVICE>:0.0.<DEVICE>:0.0.<DEVICE>
  Names              : <interface-name>
  Network interfaces : <interface-name>
  Resources provided : IPv4 address <worker-ip>
                       IPv6 address <worker-ip>
  Modules            : qeth
  Online             : yes
  Exists             : yes
  Persistent         : yes
  Device path        : /sys/bus/ccwgroup/drivers/qeth/0.0.<DEVICE>/

  ATTRIBUTE                  ACTIVE              PERSISTENT
  buffer_count               "128"               -
  card_type                  "HiperSockets"      -
  chpid                      "FD"                -
  fake_broadcast             "0"                 -
  hsuid                      ""                  -
  hw_trap                    "disarm"            -
  if_name                    "<interface-name>"  -
  inbuf_size                 "40k"               -
  ipa_takeover/add4          ""                  -
  ipa_takeover/add6          ""                  -
  ipa_takeover/enable        "0"                 -
  ipa_takeover/invert4       "0"                 -
  ipa_takeover/invert6       "0"                 -
  isolation                  "none"              -
  layer2                     "0"                 -
  online                     "1"                 "1"
  performance_stats          "0"                 -
  portname                   ""                  -
  portno                     "0"                 -
  priority_queueing          "always queue 2"    -
  route4                     "no_router"         -
  route6                     "no_router"         -
  rxip/add4                  ""                  -
  rxip/add6                  ""                  -
  sniffer                    "0"                 -
  state                      "UP (LAN ONLINE)"   -
  vipa/add4                  ""                  -
  vipa/add6                  ""                  -

  READONLY                   ACTIVE
  cdev0                      "0.0.<DEVICE>"
  cdev1                      "0.0.<DEVICE>"
  cdev2                      "0.0.<DEVICE>"
  driver                     "qeth"
  subsystem                  "ccwgroup"

$ ping -c1 <bastion-ip>
PING <bastion-ip> (<bastion-ip>) 56(84) bytes of data.

--- <bastion-ip> ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
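The two lszdev listings above differ mainly in the ACTIVE value of layer2 ("1" before the vary, "0" after) and in the layer-specific attributes that come with it. A minimal Python sketch for spotting such differences is shown here; it assumes the real `lszdev -i -i` output has one attribute per line in the `name "active" persistent` shape seen above, and the function names are hypothetical helpers, not part of s390-tools.

```python
import re

# One attribute row: name, quoted ACTIVE value, then PERSISTENT, which is
# either "-" or a quoted value. READONLY rows have no third column and the
# header rows have no quoted value, so neither matches.
ATTR_ROW = re.compile(r'^\s*(\S+)\s+"([^"]*)"\s+(-|"[^"]*")\s*$')


def parse_attributes(lszdev_output):
    """Map attribute name -> (active value, persistent value or None)."""
    attrs = {}
    for line in lszdev_output.splitlines():
        match = ATTR_ROW.match(line)
        if match:
            name, active, persistent = match.groups()
            attrs[name] = (active, None if persistent == "-" else persistent.strip('"'))
    return attrs


def changed_active(before, after):
    """Return {name: (old, new)} for attributes whose ACTIVE value differs.

    An attribute that exists in only one listing is reported with None
    on the side where it is missing.
    """
    old_attrs, new_attrs = parse_attributes(before), parse_attributes(after)
    diff = {}
    for name in sorted(old_attrs.keys() | new_attrs.keys()):
        old = old_attrs.get(name, (None, None))[0]
        new = new_attrs.get(name, (None, None))[0]
        if old != new:
            diff[name] = (old, new)
    return diff
```

Feeding it the two listings from this comment would report layer2 going from "1" to "0", the bridge_* and vnicc/* attributes disappearing, and the layer 3 attributes (route4, ipa_takeover/*, and so on) appearing.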
Adding UpcomingSprint as this will not be resolved during current sprint.
Adding UpcomingSprint as team is fixing other bugs and will not have the bandwidth to resolve this bug.
Do we know if this is still an issue in the current release, 4.5?
Adding UpcomingSprint. Other issues/effort took priority.
This issue is still current in version: 4.6.0-0.nightly-s390x-2020-10-06-145952
Adding UpcomingSprint and re-setting the target version to "---"
Currently being worked on, but it won't make it by the end of the sprint. Adding UpcomingSprint.
Are you able to provide the dmesg log from the moment just before configuring the channel path to offline through when you attempt to bring it back up? I am interested in any errors that show up in there.
Here is the dmesg output, starting with the channel path going offline and ending after it was varied back online.

[ 105.300296] cio: 0.0.8213: The device stopped operating while being set offline
[ 105.352377] cio: 0.0.8214: The device stopped operating while being set offline
[ 105.352406] cio: 0.0.8212: The device stopped operating while being set offline
[ 113.432843] ctcm: CTCM driver initialized
[ 113.459884] lcs: Loading LCS driver
[ 113.488148] qeth 0.0.8212: Completion Queueing supported
[ 113.532917] qeth: register layer 3 discipline
[ 113.533380] qeth 0.0.8212: Completion Queueing supported
[ 113.535167] qdio: 0.0.8214 HS on SC 10 using AI:1 QEBSM:1 PRI:0 TDD:1 SIGA:RW A
[ 113.535193] qeth 0.0.8212: Completion Queue support disabled
[ 113.572947] qeth 0.0.8212: Device is a HiperSockets card (level: HSSP) with link type HiperSockets.
[ 113.573058] qeth 0.0.8212: Inbound source MAC-address not supported on hsi%d
[ 113.573125] qeth 0.0.8212: VLAN enabled
[ 113.573186] qeth 0.0.8212: Multicast enabled
[ 113.573189] qeth 0.0.8212: IPV6 enabled
[ 113.573296] qeth 0.0.8212: Broadcast enabled
[ 113.575409] qeth 0.0.8212 enc8212: renamed from hsi0
[ 113.650108] IPv6: ADDRCONF(NETDEV_UP): enc8212: link is not ready
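The telling line in this log is "qeth: register layer 3 discipline": the device was re-initialized in layer 3 mode. A small Python sketch for pulling that information out of a dmesg capture follows. It assumes the layer 2 case logs the same wording with a "2" in place of the "3", and `registered_disciplines` is a hypothetical helper name.

```python
import re

# Matches the discipline registration message seen in the log above.
# Assumption: the layer 2 discipline logs the same text with "2".
DISCIPLINE = re.compile(r"qeth: register layer (\d) discipline")


def registered_disciplines(dmesg_text):
    """Return the layer numbers of all qeth disciplines registered, in log order."""
    return [int(layer) for layer in DISCIPLINE.findall(dmesg_text)]
```

On the excerpt above this would yield a list containing only 3, which matches the layer2 "0" seen in the second lszdev listing.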
After a conversation with development, we believe that the scenario described here is a valid test case. The firmware can potentially decide to turn HiperSockets off and on. SSHing to the node is just an easy way to emulate this behavior. Development will investigate further and will add the findings here. We propose to keep the status of this bug as is.
Hi Wolfgang,

there are two aspects here, I believe.

1. "The Hipersocket will come up with it's persistent configuration which is missing a configuration for layer2=x (usually 1) and end up in layer 3 mode ..."

This is to be expected. If the persistent configuration doesn't specify a layer2 attribute, the qeth driver will select layer2=0 for HiperSockets devices. So we should understand why the persistent configuration doesn't specify a layer2 attribute.

2. Looking at your test scenario

> chchp -v 0 <chpid>
> wait 3 seconds
> chchp -v 1 <chpid>
> wait 5 seconds
> chzdev -e <device>

the 'chzdev -e' step shouldn't be necessary. We just need to wait a bit until the CHP is back up & has raised uevents for the re-discovered ccw devices, so that the persistent udev rules can be applied. If this doesn't work reliably, it's a different issue that we should chase down separately.
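Instead of a fixed sleep, a test script could poll the device's sysfs attributes until the udev rules have been applied. The following Python sketch shows one way to do that under stated assumptions: the attribute files (`online`, `layer2`) are expected under the device path reported by lszdev, and `wait_for_attr` is a hypothetical helper, not an existing tool.

```python
import time
from pathlib import Path


def wait_for_attr(device_dir, attr, expected, timeout_s=30.0, poll_s=0.5):
    """Poll a sysfs attribute until it reads `expected` or the timeout expires.

    Returns True if the expected value was seen, False on timeout.
    """
    path = Path(device_dir) / attr
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if path.read_text().strip() == expected:
                return True
        except OSError:
            # The attribute may be missing while the CHP is still coming back.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A script would call it after `chchp -v 1 <chpid>` with the device path from lszdev, first waiting for `online` to read 1 and then checking `layer2`, and would treat a timeout as a failed test rather than falling back to `chzdev -e`.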
Ok, some progress on both aspects. The persistent config isn't just missing the layer2 attribute - we're missing the udev rules for the configured device(s) as a whole :). This also explains why the device isn't re-configured automatically when the CHP comes back up. Nikita suspects that we're missing a step when switching over from dracut, and likely just need to copy over the s390-specific udev rules at that stage. He offered to follow up on this in a free moment.
Hi all. I did some debugging and here are my observations:
- although s390-tools provides a dracut module, it doesn't generate any udev rules
- RHEL 8.2 doesn't contain any 41-***.rules either

I've made a patch (for rhcos) which creates and propagates those rules, but I'm not sure whether we should fix this. Maybe it's not an issue but by design.
Hi,

(In reply to Julian Wiedmann from comment #17)
> Ok, some progress on both aspects. The persistent config isn't just missing
> the layer2 attribute - we're missing the udev rules for the configured
> device(s) as a whole :). This also explains why the device isn't
> re-configured automatically when the CHP comes back up.

AFAIK, I can confirm that :) The point here is that RHCOS uses the rd.znet= dracut parameter to configure the network interface. In RHEL, this parameter is only specified for the installation, because anaconda creates the respective configuration files for NetworkManager. After that, NetworkManager manages enabling the device and its layer2 configuration. Note that anaconda does not yet use the zdev tooling. You can use zdev, but it would probably get mixed up with the other means of configuring dasd, zfcp, and network devices.

With that said, the HiperSockets device will not come up after a chpid vary because there is no infrastructure in place to handle and re-configure it.

>
> Nikita suspects that we're missing a step when switching over from dracut,
> and likely just need to copy over the s390-specific udev rules at that
> stage. He offered to follow-up on this in a free moment.

So it looks like the dracut module should ideally either create a zdev udev rule out of the rd.znet specification (and then remove the kernel parameter) or somehow integrate entirely into NetworkManager to let NM manage the device on its own.

Hope this helps.
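To make the rd.znet discussion concrete: dracut documents the parameter as a comma-separated list of network type, subchannels, and key=value options. A minimal Python sketch that splits such a value is shown below. The example cmdline used in the note after it is an assumption built from the bus IDs in the dmesg log, not the actual cmdline of the affected nodes, and both function names are hypothetical.

```python
def find_rd_znet(cmdline):
    """Return the values of all rd.znet= parameters on a kernel command line."""
    prefix = "rd.znet="
    return [tok[len(prefix):] for tok in cmdline.split() if tok.startswith(prefix)]


def parse_rd_znet(value):
    """Split an rd.znet value into (nettype, subchannels, options).

    Fields containing "=" are treated as options, everything else after
    the network type as a subchannel bus ID.
    """
    nettype, *rest = value.split(",")
    subchannels = [field for field in rest if "=" not in field]
    options = dict(field.split("=", 1) for field in rest if "=" in field)
    return nettype, subchannels, options
```

For a hypothetical cmdline containing `rd.znet=qeth,0.0.8212,0.0.8213,0.0.8214,layer2=1`, the options dictionary carries the layer2 setting that a generated udev rule or NM keyfile would need to preserve.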
Here is a PR with a workaround/fix: https://github.com/coreos/fedora-coreos-config/pull/848
(In reply to Hendrik Brueckner from comment #19)
> Hi,
>
> (In reply to Julian Wiedmann from comment #17)
> > Ok, some progress on both aspects. The persistent config isn't just missing
> > the layer2 attribute - we're missing the udev rules for the configured
> > device(s) as a whole :). This also explains why the device isn't
> > re-configured automatically when the CHP comes back up.
>
> AFAIK, I would confirm with that :) The point here is that RHCOS uses the
> rd.znet= dracut parameter to configure the network interface. In RHEL, this
> parameter is only specified for the installation because anaconda creates
> the respective configuration files for the NetworkManager. After that, the
> networkmanager manages device enable and layer2 configuration. Note that
> anaconda does not yet use zdev tooling. You can use zdev but probably mix up
> with other means, like, dasd, zfcp, and network configuration.
>

Thanks. We had been wondering how RHEL handles this part...

> With that said, the hipersockets will not come up after a chpid because
> there is no infrastructure in place to handle and re-configure it.
>

Ack. For context, the same of course also applies to any CP DETACH / ATTACH scenario (Wolfgang had asked about this).

> >
> > Nikita suspects that we're missing a step when switching over from dracut,
> > and likely just need to copy over the s390-specific udev rules at that
> > stage. He offered to follow-up on this in a free moment.
>
> So looks like the dracut module should ideally create either a zdev udev
> rule out of the rd.znet specification (and remove the kernel parameter then)
> or somehow integrate entirely into NetworkManager to let NM manage the
> device on its own.
>

As you already mentioned above, dasd and zfcp devices might have similar needs. So not sure if NM integration is a good fit here.

> Hope this helps.
There is a PR WIP but it does not appear that it will complete during this sprint.
The referenced PR is still WIP so this will not complete this sprint.
Hi Nikita, do you know if this bug will be resolved before the end of the current sprint? If not, I hope to set the "reviewed-in-sprint" flag.
Re-assigning to Julian per Z team's feedback.
(In reply to Nikita Dubrovskii (IBM) from comment #20)
> Here is a PR fix workaround/fix:
> https://github.com/coreos/fedora-coreos-config/pull/848

Alrighto, I had a good enough dig through the early-stage network parts to get a first idea. Wow! Will update in the pull request for now, as that has all the usual suspects on CC...
Hi Julian, do you think you will continue to work on this bug during the next sprint (after July 3rd)? If so, I'd like to set the "reviewed-in-sprint" flag.
Hi Dan, yes please set the flag.
Looks like the problem is sufficiently understood:

1. Under some circumstances, nm-initrd-generator doesn't think it's necessary to create an NM keyfile (which would contain the needed s390-options for znet hotplug). But we can work around this so that nm-initrd-generator _does_ write the file, e.g. by explicitly spelling out the interface name in the ip=... statement on the cmdline.

2. ccw_init in s390utils currently doesn't pick up the s390-options from an NM keyfile. So we need to cherry-pick Dan Horak's fix (https://bugzilla.redhat.com/show_bug.cgi?id=1885913) for RHEL 8 and RHCOS.

Long-term, there are discussions ongoing about how the s390-options should best be handled in a pure-NM keyfile environment.
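For illustration, this is roughly what "picking up the s390-options from an NM keyfile" involves. The Python sketch below is not the ccw_init implementation; it assumes the keyfile stores the options in the [ethernet] section as a semicolon-separated list of key=value pairs (for example `s390-options=layer2=1;`), and `s390_options` is a hypothetical helper.

```python
import configparser


def s390_options(keyfile_text):
    """Return the s390-options from an NM keyfile's [ethernet] section as a dict.

    Assumes the value is a ";"-separated list of key=value pairs.
    Returns an empty dict if the section or the key is missing.
    """
    parser = configparser.ConfigParser(
        interpolation=None,   # keyfile values may contain "%"
        delimiters=("=",),    # only "=" separates key and value
        strict=False,
    )
    parser.read_string(keyfile_text)
    raw = parser.get("ethernet", "s390-options", fallback="")
    return dict(item.strip().split("=", 1) for item in raw.split(";") if "=" in item)
```

If nm-initrd-generator writes no keyfile at all (point 1), or the keyfile has no s390-options entry, the function returns an empty dictionary, which corresponds to the case where the device falls back to the driver default of layer2=0.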
Dan, could you please pick up your fix [1] for RHEL 8, so that we can bring it into RHCOS? I'm happy to take care of testing as needed ...

[1] https://fedorapeople.org/cgit/sharkcz/public_git/utils.git/commit/?id=6f264c2a4279fa9616f2450f09fcc93bd4c2b7c6
I can include the commit, but it will need a separate bug opened against s390utils.
(In reply to Dan Horák from comment #34)
> I can include the commit, but it will need a separate bug opened against
> s390utils.

Sure, hopefully https://bugzilla.redhat.com/show_bug.cgi?id=1980708 is all that's needed.
Hi Julian, do you think work on this bug will continue during the next sprint (after July 24th)? If so, I'd like to set the "reviewed-in-sprint" flag.
Hi Dan Li, yes please do. We're waiting for the identified fix to propagate into RHEL and RHCOS.
Julian is out of the office this week, so I am adding "reviewed-in-sprint" flag as it is unlikely that he will get to this bug.
Hi Julian, do you think this bug will continue to be worked on in the next sprint (after Sep 4th)? If so, I'd like to set the "reviewed-in-sprint" flag.
Hi Dan - yes, please set the flag. The s390utils fix is in flight for RHEL 8.5, but imho it feels too late to still squeeze it into OCP 4.9
Note that RHCOS is RHEL and currently RHCOS 4.7 and later use RHEL 8.4. Thus anything that gets backported to RHEL 8.4 packages will be included in RHCOS 4.7 and later at some point.
Thanks Timothée, I added an 8.4-z request to the RHEL bz.
Hi Julian, do you think this bug will continue to be worked on in the next sprint (after Sep 25th)? If so, I'd like to set the "reviewed-in-sprint" flag.
Hi Dan - the bug is not finished, but I also don't expect further work until the 8.4-z request for RHEL makes progress.
Thanks Julian - adding reviewed-in-sprint
Hi Julian, is this bug still waiting for the 8.4-z request for RHEL? If so, I'd like to add "reviewed-in-sprint" flag to indicate that we have looked at this bug but are unable to progress further until things are ready from the RHEL side.
Adding reviewed-in-sprint as Julian is on PTO and it is unlikely that this bug will be fixed this sprint.
Hi Julian, is this bug still waiting for the 8.4-z request for RHEL? If so, I'd like to add "reviewed-in-sprint" flag to indicate that the work will continue.
Hi Dan Li, yes still waiting for the 8.4-z request. It progressed to ON_QA today.
Hi Julian, is this bug still waiting for the 8.4-z request? If so, I'd like to set "reviewed-in-sprint" flag to indicate that the work will continue.
Hi Dan Li, yes still waiting for the 8.4.z fix (which is VERIFIED now). For the record, RHEL 8.5 recently went GA with the fix.
Jan is OOTO until next week, therefore it is unlikely that this bug will be resolved before the end of the current sprint. Setting reviewed-in-sprint.
Hi Jan, do you think this bug will be resolved (reach ON_QA state) before the end of the current sprint on January 29th? If not, I'd like to set the reviewed-in-sprint flag.
Hi Dan, https://bugzilla.redhat.com/show_bug.cgi?id=2002391 is still in verified, so you can go ahead and set the reviewed-in-sprint flag.
Hi Jan, checking in again: do you think this bug will be resolved (reach ON_QA state) before the end of the current sprint on February 19th? If not, I'd like to set the reviewed-in-sprint flag. Thank you!
Hi Dan, no changes in status here. Please set the flag.
Hi Jan, do you think this bug will be resolved before the end of this sprint (March 12th)? If not, I'd like to set "reviewed-in-sprint" flag.
Hi Dan, no changes in status in the RHEL BZ. I guess I need to follow up on what is (not) happening there. For now please set the flag.
Thanks Jan. Setting the flag.
Hi Dan, the BZ is still in verified, so you can set the "reviewed-in-sprint" flag. But I'm wondering why that BZ hasn't landed in a Z stream yet (or if it has and was just not updated). I'll look into it next sprint.
Thanks Jan. Keeping the reviewed-in-sprint flag
Hi Dan, after some investigating, I found that all the latest RHCOS nightlies, excluding RHCOS 4.6, had s390utils-core-2.15.1-5.el8.s390x installed, which according to https://bugzilla.redhat.com/show_bug.cgi?id=2002391 contains the fix. I verified this with OCP 4.7.46 and RHCOS 47.84.202203141333-0.

Steps:
  chchp -v 0 <chpid>
  wait 3 seconds
  chchp -v 1 <chpid>

Result: Network available

Seems to me like this is probably an accounting error in the BZs. From my side we can close this BZ. Do you need any additional information for that?
Hi Jan, just to be thorough it might be good to update the team on the fix during our Thursday meeting. But I don't have any additional information or concerns about bug closure.