Bug 1910082 - ipv6 lo (loopback) device is wrong state on Internal Copr?
Summary: ipv6 lo (loopback) device is wrong state on Internal Copr?
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Internal Copr
Classification: Internal
Component: builder
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Copr Team
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-12-22 15:21 UTC by Jun Aruga
Modified: 2021-02-08 17:42 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-06 21:23:43 UTC
Embargoed:


Attachments
ruby-3.0.0-a9a7f4d8b8.tar.xz (13.62 MB, application/x-xz)
2021-01-04 16:23 UTC, Jun Aruga

Description Jun Aruga 2020-12-22 15:21:07 UTC
Description of problem:

I am seeing the following test failures, which are related to connecting to "localhost" TCP/IP addresses, on the Internal Copr when building rh-ruby30-ruby.

I suppose the lo (loopback) device used to connect to localhost is not available in the mock environment of the machine used for the Internal Copr.

Here are the build and log.
https://copr.devel.redhat.com/coprs/jaruga/rh-ruby30/build/61354/
http://coprbe.devel.redhat.com/results/jaruga/rh-ruby30/rhel-7-x86_64/00061354-ruby/build.log.gz

```
  1) Error:
Net::TestSMTP#test_eof_error_backtrace:
Errno::EADDRNOTAVAIL: Cannot assign requested address - bind(2) for [::1]:36657
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:689:in `bind'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:689:in `block in ip_sockets_port0'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:674:in `each'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:674:in `ip_sockets_port0'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:707:in `tcp_server_sockets_port0'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:758:in `tcp_server_sockets'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/test/net/smtp/test_smtp.rb:174:in `test_eof_error_backtrace'
  2) Error:
Net::TestSMTP#test_start:
Errno::EADDRNOTAVAIL: Cannot assign requested address - bind(2) for [::1]:39365
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:689:in `bind'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:689:in `block in ip_sockets_port0'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:674:in `each'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:674:in `ip_sockets_port0'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:707:in `tcp_server_sockets_port0'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:758:in `tcp_server_sockets'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/test/net/smtp/test_smtp.rb:269:in `fake_server_start'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/test/net/smtp/test_smtp.rb:196:in `test_start'
...
Finished tests in 629.107777s, 33.2042 tests/s, 4230.7425 assertions/s.
20889 tests, 2661593 assertions, 0 failures, 13 errors, 65 skips
```

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Do "git clone" for this repository: https://src.osci.redhat.com/fork/jaruga/rpms/ruby/tree/private-jaruga-rhscl-3.7-rh-ruby30-rhel-7
2. $ cd ruby
3. $ git checkout private-jaruga-rhscl-3.7-rh-ruby30-rhel-7
4. $ rhpkg --release rhel-7 srpm
5. $ copr-cli --config ~/.config/copr-internal build --nowait rh-ruby30 *.rpm

Actual results:

The build fails.

Expected results:

The build succeeds.


Additional info:

Comment 1 Pavel Raiskup 2021-01-04 10:37:03 UTC
I cannot generate the source RPM:

$ rhpkg --release rhel-7 srpm
Downloading ruby-2.7.1.tar.xz
######################################################################## 100.0%
error: Bad source: /tmp/ruby/ruby-3.0.0-a9a7f4d8b8.tar.xz: No such file or directory
Could not execute srpm: Failed to execute command.

What port are you trying to bind to?  Non-privileged ports should just work.

Comment 2 Jun Aruga 2021-01-04 16:23:34 UTC
Created attachment 1744338 [details]
ruby-3.0.0-a9a7f4d8b8.tar.xz

Sorry, I will upload the source file here temporarily for now.
I did not want to upload it with `rhpkg import`, as it is a kind of work-in-progress source code.

Comment 3 Jun Aruga 2021-01-04 16:29:36 UTC
> What port are you trying to bind to?  Non-privileged ports should just work.

Checking the source code, the port number is "0".

Comment 4 Jun Aruga 2021-01-04 16:35:49 UTC
> > What port are you trying to bind to?  Non-privileged ports should just work.
> 
> Checking the source code, the port number is "0".

Sorry, "0" is wrong. "0" is used as a flag to choose the port number dynamically in the logic.
It seems that "36657" and "39365" in the error log are the actual port numbers.

* bind(2) for [::1]:36657
* bind(2) for [::1]:39365
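
To illustrate the port "0" convention described above, here is a minimal Ruby sketch. It is not taken from the Ruby test suite, and the variable names are only illustrative. It binds a listener to port 0 on the IPv4 loopback and reads back the port that the kernel actually assigned.

```ruby
require 'socket'

# Ask the kernel for any free port by binding to port 0 on the IPv4 loopback.
server = TCPServer.new('127.0.0.1', 0)

# The bound socket reports the port the kernel actually chose.
chosen_port = server.addr[1]
puts "requested port 0, kernel assigned #{chosen_port}"

server.close
```

The assigned port differs from run to run, which matches the two different port numbers in the error log.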

Comment 5 Pavel Raiskup 2021-01-05 08:01:32 UTC
I'll probably need a minimal reproducer.  You seem to open the ports on
ipv6, so I tried the equivalent command:

   nc -l 36657 -6

See the log http://coprbe.devel.redhat.com/results/praiskup/rh-ruby30/rhel-7-x86_64/00061477-dummy-pkg/builder-live.log.gz

And the 'nc' seems to bind to 36657 just fine.  So I can't reproduce
easily.  Could this be a problem in the ruby testsuite?

I could give you a testing copr machine if you wanted to experiment
manually with the copr builder, mock, etc.  Ping me on IRC so we can
hand over the machine (I could try to debug personally, but it is too
tough a task for me as I'm not familiar with the ruby testsuite).

Comment 6 Pavel Raiskup 2021-01-05 08:12:08 UTC
Well, ipv6 is not working correctly in the lab, so we have
disabled the ipv6 stack on copr builders by
sysctl net.ipv6.conf.all.disable_ipv6=1
That still doesn't block netcat from binding to a particular
port.

Are you intentionally testing ipv6 or by accident?  IPv4 should
work fine.

Comment 7 Jun Aruga 2021-01-07 10:37:41 UTC
> Are you intentionally testing ipv6 or by some accident?  IPv4 should
> work fine.

Thanks for the investigation! Let me check it.

Comment 8 Jun Aruga 2021-01-08 16:05:05 UTC
Looking at the source code, the logic is to get the list of available addrinfos for 'localhost' with the `Addrinfo.getaddrinfo` method [1], then to create a socket and listen for each addrinfo.

Here is a test in irb (the interactive Ruby shell) on my local machine.
There are 2 addrinfos: `#<Addrinfo: ::1 TCP (localhost)>` (IPv6) and `#<Addrinfo: 127.0.0.1 TCP (localhost)>` (IPv4).

The result might be different on the Internal Copr, but it is possible that the IPv6 addrinfo is returned there too.

```
$ irb
irb(main):002:0> require 'socket'
=> true
irb(main):003:0> Addrinfo.getaddrinfo('localhost', 0, nil, :STREAM, nil, Socket::AI_PASSIVE)
=> [#<Addrinfo: ::1 TCP (localhost)>, #<Addrinfo: 127.0.0.1 TCP (localhost)>]
```
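
As a possible minimal reproducer for the logic described above, the following sketch resolves 'localhost' the same way and tries to bind a fresh socket to each returned addrinfo. `probe_binds` is a hypothetical helper name, not something from the Ruby source. On a machine where `::1` is returned but IPv6 is disabled, I would expect the IPv6 entry to report an error such as `Errno::EADDRNOTAVAIL`, as in the build log.

```ruby
require 'socket'

# Try to bind a fresh socket to every address that `host` resolves to and
# report the outcome per address. `probe_binds` is a hypothetical helper.
def probe_binds(host = 'localhost')
  addrinfos = begin
    Addrinfo.getaddrinfo(host, 0, nil, :STREAM, nil, Socket::AI_PASSIVE)
  rescue SocketError
    [] # the name did not resolve at all
  end

  addrinfos.map do |ai|
    sock = nil
    begin
      sock = Socket.new(ai.pfamily, ai.socktype, ai.protocol)
      sock.bind(ai)
      [ai.ip_address, 'ok']
    rescue SystemCallError => e
      # e.g. Errno::EADDRNOTAVAIL when ::1 is listed but IPv6 is disabled
      [ai.ip_address, e.class.name]
    ensure
      sock&.close
    end
  end
end

results = probe_binds
results.each { |ip, status| puts "#{ip}: #{status}" }
```

Running this inside the `%prep` or `%check` section of a dummy package should show which address family fails without needing the whole Ruby test suite.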


> sysctl net.ipv6.conf.all.disable_ipv6=1

I think this command is not enough to disable IPv6. According to [2], IPv6 is disabled by setting 3 items.
I assume lo's IPv6 interface is currently enabled unintentionally and, as a result, `Addrinfo.getaddrinfo` returns the IPv6 addrinfo.

```
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
```

So, could you run the following commands too on the Internal Copr?

```
# sysctl net.ipv6.conf.default.disable_ipv6=1
# sysctl net.ipv6.conf.lo.disable_ipv6=1
```

[1] https://ruby-doc.org/stdlib-3.0.0/libdoc/socket/rdoc/Addrinfo.html#method-c-getaddrinfo
[2] https://www.systutorials.com/disabling-ipv6-on-fedora-17-linux/
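
The three settings listed in this comment can also be read back from Ruby, which may be handy inside a build. This is only a sketch: `ipv6_disable_flags` is a hypothetical helper name, and it assumes the usual Linux `/proc/sys` layout.

```ruby
# Read the three disable_ipv6 switches straight from /proc/sys.
# A value of nil means the file is missing or unreadable
# (for example, no IPv6 support in the kernel, or not Linux).
def ipv6_disable_flags(interfaces = %w[all default lo])
  interfaces.to_h do |name|
    path = "/proc/sys/net/ipv6/conf/#{name}/disable_ipv6"
    value = File.readable?(path) ? File.read(path).strip : nil
    ["net.ipv6.conf.#{name}.disable_ipv6", value]
  end
end

ipv6_disable_flags.each { |key, value| puts "#{key} = #{value.inspect}" }
```

A value of "1" means IPv6 is disabled for that scope, and "0" means it is enabled.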

Comment 9 Pavel Raiskup 2021-01-10 17:14:15 UTC
Thank you for the details!  I applied the config you suggested.  Please check
it is OK now and reopen if not.  Or ping me on IRC, I can give you access on
one of the copr builders so you could experiment over the ssh.

Comment 10 Jun Aruga 2021-01-12 12:28:22 UTC
I checked with the following debug code, which shows that IPv6 is disabled, but unfortunately I still see the issue.
Here is the result: https://copr.devel.redhat.com/coprs/jaruga/rh-ruby30/build/61743/ .


```
diff --git a/ruby.spec b/ruby.spec
index e700dfc..cb88358 100644
--- a/ruby.spec
+++ b/ruby.spec
@@ -183,6 +183,7 @@ BuildRequires: multilib-rpm-config
 BuildRequires: gcc
 BuildRequires: make
 BuildRequires: zlib-devel
+BuildRequires: /usr/sbin/ip
 
 # This package provides %%{_bindir}/ruby-mri therefore it is marked by this
 # virtual provide. It can be installed as dependency of rubypick.
@@ -609,6 +610,13 @@ analysis result in RBS format, a standard type description format for Ruby
 
 
 %prep
+# Debug
+ip a
+cat /proc/sys/net/ipv6/conf/all/disable_ipv6
+sysctl net.ipv6.conf.all.disable_ipv6
+sysctl net.ipv6.conf.default.disable_ipv6
+sysctl net.ipv6.conf.lo.disable_ipv6
+
 %setup -q -n %{ruby_archive}
 
 # Remove bundled libraries to be sure they are not used.
```

I remember that I fixed an IPv6-related issue on the Ruby upstream CI, where IPv6 is not available, in the following way.

https://bugs.ruby-lang.org/issues/16360#note-16

```
- sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
- sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1
- sudo sysctl -w net.ipv6.conf.lo.disable_ipv6=1
```

I will debug the tests by disabling IPv6 in my local environment.

Comment 11 Jun Aruga 2021-01-12 14:22:50 UTC
I found the reason. I see that /etc/hosts on the Internal Copr currently looks like this.

```
+ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain
::1       localhost localhost.localdomain localhost6 localhost6.localdomain6
```

Could you comment out the line of "::1 ..." like this? Then immediately the result of `Addrinfo.getaddrinfo` will gets only IPv6 info. I tested it on my local.

```
# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
# ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

$ irb
irb(main):001:0> require 'socket'
=> true
irb(main):002:0> Addrinfo.getaddrinfo('localhost', 0, nil, :STREAM, nil, Socket::AI_PASSIVE)
=> [#<Addrinfo: 127.0.0.1 TCP (localhost)>]
```
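
To make the effect of the `::1` line concrete, here is a small sketch that mimics only the /etc/hosts lookup on the two file contents shown in this comment. It is not how glibc implements the lookup, and `hosts_addresses_for` is a hypothetical helper name.

```ruby
# Return the addresses that an /etc/hosts-style text maps to `name`.
def hosts_addresses_for(hosts_text, name)
  hosts_text.each_line.filter_map do |line|
    entry = line.sub(/#.*/, '').split # drop comments, split on whitespace
    address, *names = entry
    address if names.include?(name)
  end
end

copr_hosts = <<~HOSTS
  127.0.0.1 localhost localhost.localdomain
  ::1       localhost localhost.localdomain localhost6 localhost6.localdomain6
HOSTS

edited_hosts = <<~HOSTS
  127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
  # ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
HOSTS

p hosts_addresses_for(copr_hosts, 'localhost')   # => ["127.0.0.1", "::1"]
p hosts_addresses_for(edited_hosts, 'localhost') # => ["127.0.0.1"]
```

With the `::1` line present, 'localhost' maps to both families regardless of the sysctl settings, which is consistent with the failing `bind(2)` on `[::1]`.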


In this case, the following commands were not related to this issue, while it disables the IPv6 result of `ip a`.

```
# sysctl net.ipv6.conf.all.disable_ipv6=1
# sysctl net.ipv6.conf.default.disable_ipv6=1
# sysctl net.ipv6.conf.lo.disable_ipv6=1
```

Possibly it is the common behavior of `getaddrinfo` (See `man getaddrinfo`).
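
One thing that might be worth comparing is the `AI_ADDRCONFIG` hint. According to getaddrinfo(3), it asks for addresses of a family only when the system has at least one address of that family configured, and the loopback address does not count for this check. The sketch below only prints both results side by side. Whether the flag changes anything on the Copr builders was not tested in this report, and `localhost_addresses` is a hypothetical helper name.

```ruby
require 'socket'

# Resolve 'localhost' with AI_PASSIVE plus any extra hint flags and return
# the addresses as strings. An empty list means the resolution failed.
def localhost_addresses(extra_flags = 0)
  Addrinfo.getaddrinfo('localhost', 0, nil, :STREAM, nil,
                       Socket::AI_PASSIVE | extra_flags).map(&:ip_address)
rescue SocketError
  [] # resolution failed under these flags
end

puts "default:       #{localhost_addresses.inspect}"
puts "AI_ADDRCONFIG: #{localhost_addresses(Socket::AI_ADDRCONFIG).inspect}"
```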

Comment 12 Jun Aruga 2021-01-13 10:42:54 UTC
> In this case, the following commands were not related to this issue, while it disables the IPv6 result of `ip a`.

I think applying both the sysctl ... commands and the /etc/hosts modification makes sense as a proper setting.

I remembered that I created the tool/travis_disable_ipv6.sh script in the Ruby project in the past; it applies both to disable IPv6.
See https://github.com/ruby/ruby/pull/2819/files .

Comment 13 Jun Aruga 2021-01-14 10:33:22 UTC
> Then immediately the result of `Addrinfo.getaddrinfo` will gets only IPv6 info.

Typo: not IPv6 but IPv4.
Then immediately the result of `Addrinfo.getaddrinfo` will contain only IPv4 info.

Comment 14 Pavel Raiskup 2021-01-14 15:31:55 UTC
I'm not sure we can change that, unfortunately.  /etc/hosts is owned by setup.rpm
inside the mock chroot, and on top of that mock itself tries to restore the contents
if that is changed.  That is to allow 'localhost' resolution to work.  The fact that
::1 is there shouldn't hurt in general I think.

I'm curious if it is correct that the `Addrinfo.getaddrinfo` returns ipv6 address
when it is actually unusable?

Comment 15 Jun Aruga 2021-01-14 17:39:38 UTC
> I'm curious if it is correct that the `Addrinfo.getaddrinfo` returns ipv6 address
> when it is actually unusable?

I compared the behavior of C's `getaddrinfo` with Ruby's `Addrinfo.getaddrinfo` using the test code here.
https://github.com/junaruga/getaddrinfo-test

Then I confirmed that both behaviors were the same.
You can check it yourself with the `main.c` in the repository.

> I'm not sure we can change that, unfortunately.  /etc/hosts is owned by setup.rpm
> inside the mock chroot, and on top of that mock itself tries to restore the contents
> if that is changed.  That is to allow 'localhost' resolution to work.  The fact that
> ::1 is there shouldn't hurt in general I think.

> I can give you access on
> one of the copr builders so you could experiment over the ssh.

I assume the /etc/hosts used in the mock chroot is just a copy from the host environment.
Could you give me access to one of the Copr builders to take a look at it?
Is it difficult to modify the builder machine's /etc/hosts if the machine is on a network where IPv6 does not work?

Comment 16 Jun Aruga 2021-01-14 17:55:49 UTC
I found the following article, which explains that getaddrinfo consults `/etc/nsswitch.conf` to decide whether to check `/etc/hosts`.

https://jameshfisher.com/2018/02/03/what-does-getaddrinfo-do/

> getaddrinfo doesn’t know anything about files, DNS, or any other way to find the address for a host. Instead, getaddrinfo gets a list of these “sources” at runtime from another file, /etc/nsswitch.conf, the “Name Service Switch”. Here’s some of mine:
> ...
> But host addresses information doesn’t only come from DNS! There’s another source of information on UNIX systems: the file /etc/hosts. Here’s mine:

On my local Fedora, the setting is like this.

```
$ cat /etc/nsswitch.conf
...
hosts:      files mdns4_minimal [NOTFOUND=return] resolve [!UNAVAIL=return] myhostname dns
...
networks:   files dns
...
```

If we removed "files" from the hosts: or networks: line in `/etc/nsswitch.conf`, `getaddrinfo` might not check `/etc/hosts`. But it is risky.
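
To see which sources the `hosts:` line actually names, a small parser like the one below can help. It is only a sketch run on the sample from my local Fedora shown above. `hosts_sources` is a hypothetical helper name, and it simply skips bracketed actions such as `[NOTFOUND=return]`.

```ruby
# List the lookup sources on the `hosts:` line of an nsswitch.conf-style text,
# ignoring bracketed actions such as [NOTFOUND=return].
def hosts_sources(nsswitch_text)
  line = nsswitch_text.each_line.find { |l| l.strip.start_with?('hosts:') }
  return [] unless line

  line.sub(/#.*/, '').split[1..].reject { |token| token.start_with?('[') }
end

sample = <<~CONF
  hosts:      files mdns4_minimal [NOTFOUND=return] resolve [!UNAVAIL=return] myhostname dns
  networks:   files dns
CONF

p hosts_sources(sample) # => ["files", "mdns4_minimal", "resolve", "myhostname", "dns"]
```

Since "files" comes first in this sample, `/etc/hosts` is consulted before any other source.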

Comment 17 Pavel Raiskup 2021-01-14 18:10:53 UTC
> I compared the behavior of C's `getaddrinfo` with Ruby's `Addrinfo.getaddrinfo` using the test code here.
> https://github.com/junaruga/getaddrinfo-test
> 
> Then I confirmed that both behaviors were the same.
> You can check it yourself with the `main.c` in the repository.

Ok, I admit I don't know what the correct behavior should be.  We should
probably consult networking people about this.  The fact that we have to actually
disable ipv6, and that the built software isn't able to handle that, is bad.

It is also weird that most of the software out there just works.  E.g.
PostgreSQL server creates local servers and clients are connecting to those
servers, without problems.

> I assume the /etc/hosts used in the mock chroot is just a copy from the host environment.

It is not.  Checked.

> Could you give me access to one of the Copr builders to take a look at it?

Sure, ping me on IRC when you are ready for machine handover.

> If we removed "files" from the hosts: or networks: line in `/etc/nsswitch.conf`, `getaddrinfo` might not check `/etc/hosts`. But it is risky.

It is not easy to remove files from the chroot because one package needs that.

I tend to say that we should revert all the sysctl hacks that are disabling
ipv6, and keep the default configuration.  That is what most of the existing
real-life boxes run with (default configuration and non-working ipv6 stack
because, hm, the network providers don't support ipv6).

Comment 18 Pavel Raiskup 2021-02-06 21:23:43 UTC
Ok, we don't disable ipv6 on our builders anymore.  Please give it one more try
and reopen if you still see ruby build failure.

Comment 19 Jun Aruga 2021-02-08 17:42:40 UTC
Interestingly, right after you stopped disabling ipv6, I still saw the failed tests when I tried to build.
But now when I try to build, it succeeds.

Here are the results.
https://copr.devel.redhat.com/coprs/jaruga/rh-ruby30/build/62818/
https://copr.devel.redhat.com/coprs/jaruga/rh-ruby30/build/62817/

Here are the steps that I tested.

```
$ rhpkg co ruby
$ cd ruby
$ git checkout rhscl-3.7-rh-ruby30-rhel-7
$ rhpkg --release rhscl-3.7-rh-ruby30-rhel-7 srpm
$ copr-cli --config ~/.config/copr-internal build --nowait rh-ruby30 *.rpm
```

So, I am okay with closing this ticket for now. Thank you for your help!

