Description of problem:

I am seeing the following test failures, which are related to binding to "localhost" TCP/IP addresses, on the Internal Copr when building rh-ruby30-ruby. I suppose the lo (loopback) device used to connect to localhost is not available in the mock environment of the machine used for the Internal Copr.

Here is the build and log.

https://copr.devel.redhat.com/coprs/jaruga/rh-ruby30/build/61354/
http://coprbe.devel.redhat.com/results/jaruga/rh-ruby30/rhel-7-x86_64/00061354-ruby/build.log.gz

```
  1) Error:
Net::TestSMTP#test_eof_error_backtrace:
Errno::EADDRNOTAVAIL: Cannot assign requested address - bind(2) for [::1]:36657
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:689:in `bind'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:689:in `block in ip_sockets_port0'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:674:in `each'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:674:in `ip_sockets_port0'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:707:in `tcp_server_sockets_port0'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:758:in `tcp_server_sockets'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/test/net/smtp/test_smtp.rb:174:in `test_eof_error_backtrace'

  2) Error:
Net::TestSMTP#test_start:
Errno::EADDRNOTAVAIL: Cannot assign requested address - bind(2) for [::1]:39365
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:689:in `bind'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:689:in `block in ip_sockets_port0'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:674:in `each'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:674:in `ip_sockets_port0'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:707:in `tcp_server_sockets_port0'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/.ext/common/socket.rb:758:in `tcp_server_sockets'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/test/net/smtp/test_smtp.rb:269:in `fake_server_start'
    /builddir/build/BUILD/ruby-3.0.0-a9a7f4d8b8/test/net/smtp/test_smtp.rb:196:in `test_start'
...
Finished tests in 629.107777s, 33.2042 tests/s, 4230.7425 assertions/s.
20889 tests, 2661593 assertions, 0 failures, 13 errors, 65 skips
```

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Do "git clone" for this repository: https://src.osci.redhat.com/fork/jaruga/rpms/ruby/tree/private-jaruga-rhscl-3.7-rh-ruby30-rhel-7
2. $ cd ruby
3. $ git checkout private-jaruga-rhscl-3.7-rh-ruby30-rhel-7
4. $ rhpkg --release rhel-7 srpm
5. $ copr-cli --config ~/.config/copr-internal build --nowait rh-ruby30 *.rpm

Actual results:
The build fails.

Expected results:
The build succeeds.

Additional info:
I cannot generate the source RPM:

$ rhpkg --release rhel-7 srpm
Downloading ruby-2.7.1.tar.xz
######################################################################## 100.0%
error: Bad source: /tmp/ruby/ruby-3.0.0-a9a7f4d8b8.tar.xz: No such file or directory
Could not execute srpm: Failed to execute command.

What port are you trying to bind to? Non-privileged ports should just work.
Created attachment 1744338
ruby-3.0.0-a9a7f4d8b8.tar.xz

Sorry, I will upload the source file here temporarily for now. I did not want to upload it with `rhpkg import`, as it is still a kind of working source code.
> What port are you trying to bind to? Non-privileged ports should just work.

Checking the source code, the port number is "0".
> > What port are you trying to bind to? Non-privileged ports should just work.
>
> Checking the source code, the port number is "0".

Sorry, "0" is wrong. "0" is used as a flag to choose the port number dynamically in the logic. It seems that "36657" and "39365" in the error log are the actual port numbers.

* bind(2) for [::1]:36657
* bind(2) for [::1]:39365
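As a minimal sketch of what "port 0" means here: when a socket is bound to port 0, the operating system picks a free port, which is how numbers such as 36657 end up in the log. The helper name `ephemeral_port` is hypothetical and only for illustration; it is not taken from the Ruby test suite, and it assumes the IPv4 loopback address is usable.

```ruby
require 'socket'

# Hypothetical helper (not from the Ruby test suite): bind a TCP server to
# port 0 and report which port the operating system actually assigned.
def ephemeral_port(host = '127.0.0.1')
  server = TCPServer.new(host, 0)   # 0 asks the OS to choose a free port
  server.local_address.ip_port      # the dynamically chosen port number
ensure
  server&.close
end

puts "OS-assigned port: #{ephemeral_port}"
```

The printed number differs from run to run, which matches the two different port numbers in the error log.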
I'll probably need a minimal reproducer. You seem to open the ports on IPv6, so I tried the equivalent command:

nc -l 36657 -6

See the log:
http://coprbe.devel.redhat.com/results/praiskup/rh-ruby30/rhel-7-x86_64/00061477-dummy-pkg/builder-live.log.gz

And 'nc' seems to bind to 36657 just fine, so I can't reproduce this easily. Could this be a problem in the Ruby testsuite?

I could give you a testing Copr machine if you want to experiment manually with the Copr builder, mock, etc. Ping me on IRC so we can hand over the machine (I could try to debug it personally, but it is too tough a task for me as I'm not familiar with the Ruby testsuite).
Well, IPv6 is not working correctly in the lab, so we have disabled the IPv6 stack on the Copr builders with

sysctl net.ipv6.conf.all.disable_ipv6=1

That still doesn't block netcat from binding to a particular port.

Are you intentionally testing ipv6 or by some accident? IPv4 should work fine.
> Are you intentionally testing ipv6 or by some accident? IPv4 should
> work fine.

Thanks for the investigation! Let me check it.
Looking at the source code, the logic gets the list of available addrinfos for 'localhost' with the Addrinfo.getaddrinfo method [1], then creates a socket and listens for each addrinfo.

Here is a test in irb (the interactive Ruby shell) on my local machine. There are 2 addrinfos: `#<Addrinfo: ::1 TCP (localhost)>` (IPv6) and `#<Addrinfo: 127.0.0.1 TCP (localhost)>`. The result might be different on the Internal Copr, but it is possible to get the IPv6 addrinfo there too.

```
$ irb
irb(main):002:0> require 'socket'
=> true
irb(main):003:0> Addrinfo.getaddrinfo('localhost', 0, nil, :STREAM, nil, Socket::AI_PASSIVE)
=> [#<Addrinfo: ::1 TCP (localhost)>, #<Addrinfo: 127.0.0.1 TCP (localhost)>]
```

> sysctl net.ipv6.conf.all.disable_ipv6=1

I think this command is not enough to disable IPv6. There is a way to disable IPv6 with 3 settings, described in [2]. I assume that lo's IPv6 interface is currently enabled unintentionally, and as a result `Addrinfo.getaddrinfo` returns the IPv6 addrinfo.

```
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
```

So, could you run the following commands too on the Internal Copr?

```
# sysctl net.ipv6.conf.default.disable_ipv6=1
# sysctl net.ipv6.conf.lo.disable_ipv6=1
```

[1] https://ruby-doc.org/stdlib-3.0.0/libdoc/socket/rdoc/Addrinfo.html#method-c-getaddrinfo
[2] https://www.systutorials.com/disabling-ipv6-on-fedora-17-linux/
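The resolve-then-bind logic described above can be sketched roughly as follows. This is only an illustration, not the code in socket.rb: the helper name `bindable_addrinfos` is hypothetical. It resolves a host the same way as the irb session, then tries to bind each result to port 0 and keeps only the addresses where the bind succeeds. On a machine where `::1` is resolvable but not configured, the IPv6 entry would be the one dropped, instead of raising Errno::EADDRNOTAVAIL as in the failing tests.

```ruby
require 'socket'

# Hypothetical helper (not part of Ruby's socket library): resolve `host`
# for listening and return only the addrinfos that can really be bound.
def bindable_addrinfos(host)
  candidates = Addrinfo.getaddrinfo(host, 0, nil, :STREAM, nil, Socket::AI_PASSIVE)
  candidates.select do |ai|
    begin
      # Addrinfo#bind creates a socket bound to the address; with a block,
      # the socket is closed again when the block returns.
      ai.bind { true }
    rescue SystemCallError
      # For example Errno::EADDRNOTAVAIL when ::1 is not configured on lo.
      false
    end
  end
rescue SocketError
  # The name could not be resolved at all.
  []
end

bindable_addrinfos('localhost').each { |ai| puts ai.inspect }
```

Comparing this output with the plain `Addrinfo.getaddrinfo` result on a builder would show which of the resolved addresses is actually unusable.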
Thank you for the details! I applied the config you proposed. Please check whether it is OK now, and reopen if not. Or ping me on IRC; I can give you access to one of the Copr builders so you can experiment over ssh.
I checked with the following debug code, but unfortunately I still see the issue, even though the debug output shows IPv6 as disabled. Here is the result: https://copr.devel.redhat.com/coprs/jaruga/rh-ruby30/build/61743/ .

```
diff --git a/ruby.spec b/ruby.spec
index e700dfc..cb88358 100644
--- a/ruby.spec
+++ b/ruby.spec
@@ -183,6 +183,7 @@ BuildRequires: multilib-rpm-config
 BuildRequires: gcc
 BuildRequires: make
 BuildRequires: zlib-devel
+BuildRequires: /usr/sbin/ip

 # This package provides %%{_bindir}/ruby-mri therefore it is marked by this
 # virtual provide. It can be installed as dependency of rubypick.
@@ -609,6 +610,13 @@ analysis result in RBS format, a standard type description format for Ruby


 %prep
+# Debug
+ip a
+cat /proc/sys/net/ipv6/conf/all/disable_ipv6
+sysctl net.ipv6.conf.all.disable_ipv6
+sysctl net.ipv6.conf.default.disable_ipv6
+sysctl net.ipv6.conf.lo.disable_ipv6
+
 %setup -q -n %{ruby_archive}

 # Remove bundled libraries to be sure they are not used.
```

I remember that I fixed an IPv6-related issue in the Ruby upstream CI, where IPv6 is not available, in the following way.
https://bugs.ruby-lang.org/issues/16360#note-16

```
- sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
- sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1
- sudo sysctl -w net.ipv6.conf.lo.disable_ipv6=1
```

I will debug the tests by disabling IPv6 in my local environment.
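For reference, the same three settings can also be read from Ruby, for example from inside a test helper. This is a rough sketch under the assumption that the kernel exposes them under /proc/sys/net/ipv6/conf, as the `cat` line in the debug code suggests. The constant and method names are hypothetical.

```ruby
# Hypothetical names, for illustration only.
IPV6_SYSCTL_SCOPES = %w[all default lo].freeze

# Read net.ipv6.conf.<scope>.disable_ipv6 for each scope from procfs.
# Returns a Hash such as {"all"=>"1", ...}; a value is nil when the file
# does not exist (for example when IPv6 is not available in the kernel).
def ipv6_disable_flags(base = '/proc/sys/net/ipv6/conf')
  IPV6_SYSCTL_SCOPES.map do |scope|
    path = File.join(base, scope, 'disable_ipv6')
    [scope, File.readable?(path) ? File.read(path).strip : nil]
  end.to_h
end

ipv6_disable_flags.each do |scope, value|
  puts "net.ipv6.conf.#{scope}.disable_ipv6 = #{value.inspect}"
end
```

A value of "1" for all three scopes only says that the interfaces have no IPv6 addresses; it says nothing about what name resolution returns for 'localhost'.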
I found the reason. I see that /etc/hosts on the Internal Copr currently looks like this.

```
+ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
```

Could you comment out the "::1 ..." line like this? Then immediately the result of `Addrinfo.getaddrinfo` will get only IPv6 info. I tested it on my local machine.

```
# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
# ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

$ irb
irb(main):001:0> require 'socket'
=> true
irb(main):002:0> Addrinfo.getaddrinfo('localhost', 0, nil, :STREAM, nil, Socket::AI_PASSIVE)
=> [#<Addrinfo: 127.0.0.1 TCP (localhost)>]
```

In this case, the following commands are not related to this issue, although they do remove the IPv6 entries from the output of `ip a`.

```
# sysctl net.ipv6.conf.all.disable_ipv6=1
# sysctl net.ipv6.conf.default.disable_ipv6=1
# sysctl net.ipv6.conf.lo.disable_ipv6=1
```

Possibly this is the common behavior of `getaddrinfo` (see `man getaddrinfo`).
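For comparison, here is a minimal sketch of a way to get only IPv4 results without touching /etc/hosts: the third argument of `Addrinfo.getaddrinfo` is the address family, so passing `:INET` restricts the lookup to IPv4. The helper name `ipv4_listen_addrinfos` is hypothetical, and this is not what the Ruby test suite does; it only illustrates that the `::1` entry comes from name resolution rather than from the interface configuration.

```ruby
require 'socket'

# Hypothetical helper: resolve `host` for listening, restricted to IPv4.
# The third argument (:INET) is the address family hint for getaddrinfo.
def ipv4_listen_addrinfos(host)
  Addrinfo.getaddrinfo(host, 0, :INET, :STREAM, nil, Socket::AI_PASSIVE)
rescue SocketError
  # The name has no IPv4 address or could not be resolved.
  []
end

ipv4_listen_addrinfos('localhost').each { |ai| puts ai.inspect }
```

With the `::1` line still present in /etc/hosts, this should list only the IPv4 entry, whereas passing `nil` as the family returns whatever /etc/hosts provides.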
> In this case, the following commands are not related to this issue, although they do remove the IPv6 entries from the output of `ip a`.

I think both running the sysctl ... commands and modifying /etc/hosts make sense as a proper setting. I remember that in the past I created the tool/travis_disable_ipv6.sh script in the Ruby project, which does both to disable IPv6. Here: https://github.com/ruby/ruby/pull/2819/files .
> Then immediately the result of `Addrinfo.getaddrinfo` will get only IPv6 info.

Typo. Not IPv6 but IPv4.

Then immediately the result of `Addrinfo.getaddrinfo` will get only IPv4 info.
I'm not sure we can change that, unfortunately. /etc/hosts is owned by setup.rpm inside the mock chroot, and on top of that mock itself tries to restore the contents if that is changed. That is to allow 'localhost' resolution to work. The fact that ::1 is there shouldn't hurt in general I think.

I'm curious if it is correct that `Addrinfo.getaddrinfo` returns an IPv6 address when it is actually unusable?
> I'm curious if it is correct that `Addrinfo.getaddrinfo` returns an IPv6 address
> when it is actually unusable?

I compared C's `getaddrinfo` behavior with Ruby's `Addrinfo.getaddrinfo` using the testing code here.
https://github.com/junaruga/getaddrinfo-test

Then I confirmed that both behaviors were the same.
You can check it yourself with the `main.c` in the repository.

> I'm not sure we can change that, unfortunately. /etc/hosts is owned by setup.rpm
> inside the mock chroot, and on top of that mock itself tries to restore the contents
> if that is changed. That is to allow 'localhost' resolution to work. The fact that
> ::1 is there shouldn't hurt in general I think.

> I can give you access to
> one of the Copr builders so you can experiment over ssh.

I assume the /etc/hosts used in the mock chroot is just a copy from the host environment. Could you give me access to one of the Copr builders to take a look at it? Is it difficult to modify the builder machine's /etc/hosts if the machine is on a network where IPv6 does not work?
I found the following article, which explains that getaddrinfo consults `/etc/nsswitch.conf` to decide whether to check `/etc/hosts`.

https://jameshfisher.com/2018/02/03/what-does-getaddrinfo-do/

> getaddrinfo doesn’t know anything about files, DNS, or any other way to find the address for a host. Instead, getaddrinfo gets a list of these “sources” at runtime from another file, /etc/nsswitch.conf, the “Name Service Switch”. Here’s some of mine:
> ...
> But host addresses information doesn’t only come from DNS! There’s another source of information on UNIX systems: the file /etc/hosts. Here’s mine:

On my local Fedora, the setting is like this.

```
$ cat /etc/nsswitch.conf
...
hosts: files mdns4_minimal [NOTFOUND=return] resolve [!UNAVAIL=return] myhostname dns
...
networks: files dns
...
```

If we can remove "files" from the hosts: or networks: entry in `/etc/nsswitch.conf`, `getaddrinfo` might not check `/etc/hosts`. But it is risky.
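To see which sources a given machine consults for host names, the hosts: line can be extracted with a few lines of Ruby. This is only a sketch: the helper name `nsswitch_sources` is hypothetical, and it handles just the simple one-line format shown above (source names plus bracketed `[STATUS=action]` items, which it ignores).

```ruby
# Hypothetical helper: list the lookup sources configured for a database
# (such as "hosts") in nsswitch.conf-style text, in lookup order.
def nsswitch_sources(text, database)
  line = text.each_line.find { |l| l.strip.start_with?("#{database}:") }
  return [] unless line

  line.sub(/#.*/, '')             # drop a trailing comment, if any
      .split(':', 2).last         # keep what follows "hosts:"
      .gsub(/\[[^\]]*\]/, ' ')    # ignore [STATUS=action] items
      .split
end

sample = <<~CONF
  hosts:      files mdns4_minimal [NOTFOUND=return] resolve [!UNAVAIL=return] myhostname dns
  networks:   files dns
CONF

p nsswitch_sources(sample, 'hosts')
```

If "files" appears in the result, `/etc/hosts` is consulted, and in the sample above it comes first in the lookup order.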
> I compared C's `getaddrinfo` behavior with Ruby's `Addrinfo.getaddrinfo` using the testing code here.
> https://github.com/junaruga/getaddrinfo-test
>
> Then I confirmed that both behaviors were the same.
> You can check it yourself with the `main.c` in the repository.

Ok, I admit I don't know what the correct behavior should be. We should probably consult networking people about this. The fact that we actually have to disable IPv6, and that the built software isn't able to handle that, is bad. It is also weird that most of the software out there just works. E.g. the PostgreSQL server creates local servers and clients connect to those servers without problems.

> I assume the /etc/hosts used in the mock chroot is just a copy from the host environment.

It is not. Checked.

> Could you give me access to one of the Copr builders to take a look at it?

Sure, ping me on IRC when you are ready for the machine handover.

> If we can remove "files" from the hosts: or networks: entry in `/etc/nsswitch.conf`, `getaddrinfo` might not check `/etc/hosts`. But it is risky.

It is not easy to remove files from the chroot because one package needs that.

I tend to say that we should revert all the sysctl hacks that are disabling ipv6, and keep the default configuration. That is what most of the existing real-life boxes run with (default configuration and non-working ipv6 stack because, hm, the network providers don't support ipv6).
Ok, we don't disable ipv6 on our builders anymore. Please give it one more try and reopen if you still see ruby build failure.
Interestingly, right after you stopped disabling IPv6, I still saw the failed tests when I tried to build. But now when I try to build, it succeeds. Here are the results.

https://copr.devel.redhat.com/coprs/jaruga/rh-ruby30/build/62818/
https://copr.devel.redhat.com/coprs/jaruga/rh-ruby30/build/62817/

Here are the steps that I tested.

```
$ rhpkg co ruby
$ cd ruby
$ git checkout rhscl-3.7-rh-ruby30-rhel-7
$ rhpkg --release rhscl-3.7-rh-ruby30-rhel-7 srpm
$ copr-cli --config ~/.config/copr-internal build --nowait rh-ruby30 *.rpm
```

So, I am okay with closing this ticket for now. Thank you for your help!