Bug 485214 - HPC 5.3: ocs-setup errors stating 5.3 not supported
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat HPC Solution
Classification: Red Hat
Component: ocs
Version: 5.3
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Assignee: OCS Support
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-02-12 12:55 UTC by Robert Allton
Modified: 2018-10-20 02:29 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-12-14 16:12:52 UTC
Target Upstream Version:
Embargoed:


Attachments
HPC Head Node sosreport (539.14 KB, application/x-bzip)
2009-02-12 13:01 UTC, Robert Allton


Links
System: Red Hat Product Errata
ID: RHEA-2009:1667
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat High Performance Computing Solution (HPC) 5.4
Last Updated: 2009-12-14 16:12:44 UTC

Description Robert Allton 2009-02-12 12:55:25 UTC
Description of problem:

When following the new 5.3 documentation for installing Red Hat HPC Solution, the installer fails, stating that RHEL 5.3 is not supported.

Version-Release number of selected component (if applicable):


How reproducible:

100% (performed twice)

Steps to Reproduce:
1. Install RHEL 5.3 Server x86_64, register, yum update, assign to HPC Solution channel.
2. yum install ocs mod_ssl
3. /opt/kusu/sbin/ocs-setup
4. Take defaults and point to ISO of RHEL 5.3 Server x86_64
  
Actual results:

Setting up repository.  Please wait
Kit: rhel, version 5.3, architecture x86_64, has been added to repo: rhel5_x86_64.  Remember to refresh with -u
Kit: base, version 5.1, architecture noarch, has been added to repo: rhel5_x86_64.  Remember to refresh with -u
Unable to refresh repo: rhel5_x86_64. Reason: rhel 5.3 not supported

Expected results:

No Errors

Additional info:

ocs 5.1-5
ocs-kit-base-5.0-2

sosreport on its way

Comment 1 Robert Allton 2009-02-12 12:59:59 UTC
Additional info:

From HPC install Doc:

Installation Prerequisites
Installing Red Hat HPC Solution (Red Hat HPC) requires one system to be designated as an installer
node. This installer node is responsible for installing the rest of the nodes in the cluster.
Prior to installing Red Hat HPC, confirm that the designated machine has Red Hat Enterprise Linux
5.3 installed and meets the following requirements:


Summary of error:

Unable to refresh repo: rhel5_x86_64. Reason: rhel 5.3 not supported



Full output:

[root@malachite ~]# source /etc/profile.d/kusuenv.sh
[root@malachite ~]# /opt/kusu/sbin/ocs-setup

Fully-qualified Hostname = malachite.usersys.redhat.com
Short Hostname           = malachite
Timezone                 = America/New_York
UTC time                 = 1
Gateway                  = 
DNS servers              = ['172.16.52.28', '10.11.255.27']
DNS Search order         = rdu.redhat.com corp.redhat.com redhat.com
Language                 = en
Keyboard                 = us

Detected network interfaces:

   eth1
   =============================================================
      IP      =                       Enabled = True
      Network =                       Subnet  = 
      MAC     = 00:14:5E:5D:02:0A     Gateway = 
      DHCP    = True                  Boot    = 1

   eth0
   =============================================================
      IP      = 192.168.0.10          Enabled = True
      Network = 192.168.0.0           Subnet  = 255.255.255.0
      MAC     = 00:40:05:43:af:d5     Gateway = 
      DHCP    = False                 Boot    = 1

Warning: This host has only one static network interface available for
provisioning.  OCS will run a DHCP server on this interface, which may be
disruptive to your network operation.

Do you wish to continue [N/y]? y

OCS creates a DNS domain for all nodes it installs.  This node will function as a primary DNS server for this domain.

Enter the private DNS domain to create [default: ocs5]: 

OCS requires a depot directory to store kits, images and other files
A minimum of 10Gbytes is needed.

Would you like to use the default /depot directory [N/y]? 
Where would you like to locate this directory? /depot
Initializing MySQL database:  Installing MySQL system tables...
OK
Filling help tables...
OK

To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system

PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:
/usr/bin/mysqladmin -u root password 'new-password'
/usr/bin/mysqladmin -u root -h malachite.usersys.redhat.com password 'new-password'
See the manual for more instructions.
You can start the MySQL daemon with:
cd /usr ; /usr/bin/mysqld_safe &

You can test the MySQL daemon with mysql-test-run.pl
cd mysql-test ; perl mysql-test-run.pl

Please report any problems with the /usr/bin/mysqlbug script!

The latest information about MySQL is available on the web at
http://www.mysql.com
Support MySQL by buying support/licenses at http://shop.mysql.com
                                                           [  OK  ]
Starting MySQL:                                            [  OK  ]


The OS media is needed at this time.  Do you have OS media on
Disks, ISO, or Filesystem? (Disk|Iso|File) [Disk] I

Enter the fully qualified path to the ISO file, or the directory containing the files:
/sdb1/rhel-server-5.3-x86_64-dvd.iso
Copying the media.  Please wait this will take some time!
Any more disks for this OS kit? [y/n] 
N
Added kit: rhel-5.3-x86_64
Installing OCS base kit downloader
Loaded plugins: rhnplugin, security
Setting up Install Process
Parsing package install arguments
Resolving Dependencies
--> Running transaction check
---> Package ocs-kit-base.noarch 0:5.0-2 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package           Arch        Version      Repository                     Size
================================================================================
Installing:
 ocs-kit-base      noarch      5.0-2        rhel-x86_64-server-hpc-5      3.6 k

Transaction Summary
================================================================================
Install      1 Package(s)         
Update       0 Package(s)         
Remove       0 Package(s)         

Total download size: 3.6 k
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing     : ocs-kit-base                                      [1/1] 

Installed: ocs-kit-base.noarch 0:5.0-2
Complete!
Executing OCS base kit downloader script
Loaded plugins: rhnplugin
Adding kit using kitops...
Added kit: base-5.1-noarch



Setting up repository.  Please wait
Kit: rhel, version 5.3, architecture x86_64, has been added to repo: rhel5_x86_64.  Remember to refresh with -u
Kit: base, version 5.1, architecture noarch, has been added to repo: rhel5_x86_64.  Remember to refresh with -u
Unable to refresh repo: rhel5_x86_64. Reason: rhel 5.3 not supported
Traceback (most recent call last):
  File "/opt/kusu/sbin/sqlrunner", line 91, in ?
    app.run()
  File "/opt/kusu/sbin/sqlrunner", line 75, in run
    self.db.execute(self._options.querystring)
  File "/opt/kusu/lib/python/kusu/core/db.py", line 113, in execute
    return self.__dbcursor.execute(query)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/cursors.py", line 163, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1048, "Column 'cid' cannot be null")
Traceback (most recent call last):
  File "/opt/kusu/sbin/sqlrunner", line 91, in ?
    app.run()
  File "/opt/kusu/sbin/sqlrunner", line 75, in run
    self.db.execute(self._options.querystring)
  File "/opt/kusu/lib/python/kusu/core/db.py", line 113, in execute
    return self.__dbcursor.execute(query)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/cursors.py", line 163, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1048, "Column 'cid' cannot be null")
Traceback (most recent call last):
  File "/opt/kusu/sbin/sqlrunner", line 91, in ?
    app.run()
  File "/opt/kusu/sbin/sqlrunner", line 75, in run
    self.db.execute(self._options.querystring)
  File "/opt/kusu/lib/python/kusu/core/db.py", line 113, in execute
    return self.__dbcursor.execute(query)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/cursors.py", line 163, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1048, "Column 'cid' cannot be null")
Traceback (most recent call last):
  File "/opt/kusu/sbin/sqlrunner", line 91, in ?
    app.run()
  File "/opt/kusu/sbin/sqlrunner", line 75, in run
    self.db.execute(self._options.querystring)
  File "/opt/kusu/lib/python/kusu/core/db.py", line 113, in execute
    return self.__dbcursor.execute(query)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/cursors.py", line 163, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1048, "Column 'cid' cannot be null")
   Setting up httpd:                                       [  OK  ]
   Setting up dhcpd:                                       [  OK  ]
   Generating hosts, hosts.equiv, and resolv.conf:         [  OK  ]
   Setting up motd:                                        [  OK  ]
   Setting up named:                                       [  OK  ]
   Setting up shared home nfs export:                      [  OK  ]
   Setting up ntpd:                                        [  OK  ]
   Setting up SSH public keys:                             [  OK  ]
   Setting up SSH host file:                               [  OK  ]
   Setting up syslog:                                      [  OK  ]
   Setting up user skel files:                             [  OK  ]
   Setting up xinetd:                                      [  OK  ]
   Setting yum repos:                                      [  OK  ]
   Initializing initrd-templates:                          [  OK  ]
   Creating images for imaged and diskless nodes
      This will take some time.  Please wait:              [  OK  ]
   Generating nodeinstaller patchfiles:                    [  OK  ]
   Setting up CFM:                                         [  OK  ]
   Setting up default Firefox homepage:                    [  OK  ]
   Setting up fstab for home directories:                  [  OK  ]
   Synchronizing System configuration files:               [  OK  ]

Congratulations!  The base kit is installed and configured to provision on:

   Network 192.168.0.0 on interface eth0

[root@malachite ~]#
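
Note that the run above still ends with "Congratulations!" even though the repository refresh failed, so the failure is easy to miss. A minimal Python sketch for spotting it in captured ocs-setup output follows. The helper name scan_setup_output and the idea of capturing the output as text are assumptions made for illustration; only the message strings are taken from the log above, and the helper is not part of OCS or Kusu.

```python
import re

# Pattern built from the failure line seen in the output above:
#   Unable to refresh repo: rhel5_x86_64. Reason: rhel 5.3 not supported
REFRESH_FAILURE = re.compile(r"^Unable to refresh repo: (\S+?)\. Reason: (.+)$")


def scan_setup_output(text):
    """Return (refresh_failures, traceback_count) for captured setup output.

    refresh_failures is a list of (repo_name, reason) tuples.
    Hypothetical helper: it is not part of OCS or Kusu.
    """
    failures = []
    tracebacks = 0
    for line in text.splitlines():
        match = REFRESH_FAILURE.match(line.strip())
        if match:
            failures.append((match.group(1), match.group(2)))
        elif line.startswith("Traceback (most recent call last):"):
            tracebacks += 1
    return failures, tracebacks


if __name__ == "__main__":
    # A few lines copied from the log above, used as sample input.
    sample = "\n".join([
        "Setting up repository.  Please wait",
        "Unable to refresh repo: rhel5_x86_64. Reason: rhel 5.3 not supported",
        "Traceback (most recent call last):",
        "Congratulations!  The base kit is installed and configured to provision on:",
    ])
    failures, tracebacks = scan_setup_output(sample)
    print(failures, tracebacks)
```

A non-empty failure list or a non-zero traceback count suggests the setup did not really complete cleanly, whatever the final message says.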

Comment 2 Robert Allton 2009-02-12 13:01:51 UTC
Created attachment 331684
HPC Head Node sosreport

Comment 3 Robert Allton 2009-02-12 13:07:10 UTC
More Verbose Setup steps:

1) Installed RHEL Server 5.2 x86_64 from Corporate pxeboot server. (Defaults)
2) Registered machine and performed yum update to 5.3.
3) Added HPC Solution child channel and performed additional yum update.
4) Performed yum install ocs mod_ssl (installed approx 53 packages)
5) Downloaded rhel-server-5.3-x86_64-dvd.iso from rhn.
6) Performed ocs-setup using all defaults and the above ISO.
7) Install completes but reports an error while trying to refresh the rhel5_x86_64 repo.

Comment 4 Bryan Aldridge 2009-02-21 16:06:42 UTC
I followed almost exactly the same steps as Robert and got exactly the same result. Interestingly, we both started with RHEL 5.2, ran yum update to reach 5.3, and used rhel-server-5.3-x86_64-dvd.iso for the OS kit. I have not yet tried a fresh OS install from the 5.3 ISOs. I hope to try that next week and will post back.

Comment 7 Florin MANAILA 2009-03-06 12:11:49 UTC
Hi all,

I have the same problem!

I followed the steps from the Installation Guide provided by Red Hat:
www.redhat.com/docs/en-US/hpc/1.0/pdf/Installation_Guide.pdf

1. Install RHEL Server 5.3 x86_64 on the MASTER NODE (pxeboot server)
2. Register the server
3. Update the system - almost 89 MB of data from the Red Hat repository
4. Perform: yum install ocs mod_ssl
5. Perform: source /etc/profile.d/kusuenv.sh
6. Perform: /opt/kusu/sbin/ocs-setup
- Only one network card, eth0 with IP 10.52.0.1/23, was seen by the setup; eth1 with IP 10.50.0.254/24 was not seen. eth1 is used for connecting users to the master node.
- Placed the depot in /home/depot, with a symlink to it in /
- Installed the repository image (requested by the setup) from the same RHEL Server 5.3 x86_64 DVD that I used for the installation in step 1.

I get exactly the same error as above:

Setting up repository.  Please wait
Kit: rhel, version 5.3, architecture x86_64, has been added to repo: rhel5_x86_64.  Remember to refresh with -u
Kit: base, version 5.1, architecture noarch, has been added to repo: rhel5_x86_64.  Remember to refresh with -u
Unable to refresh repo: rhel5_x86_64. Reason: rhel 5.3 not supported
Traceback (most recent call last):
  File "/opt/kusu/sbin/sqlrunner", line 91, in ?
    app.run()
  File "/opt/kusu/sbin/sqlrunner", line 75, in run
    self.db.execute(self._options.querystring)
  File "/opt/kusu/lib/python/kusu/core/db.py", line 113, in execute
    return self.__dbcursor.execute(query)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/cursors.py", line 163, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1048, "Column 'cid' cannot be null")
Traceback (most recent call last):
  File "/opt/kusu/sbin/sqlrunner", line 91, in ?
    app.run()
  File "/opt/kusu/sbin/sqlrunner", line 75, in run
    self.db.execute(self._options.querystring)
  File "/opt/kusu/lib/python/kusu/core/db.py", line 113, in execute
    return self.__dbcursor.execute(query)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/cursors.py", line 163, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1048, "Column 'cid' cannot be null")
Traceback (most recent call last):
  File "/opt/kusu/sbin/sqlrunner", line 91, in ?
    app.run()
  File "/opt/kusu/sbin/sqlrunner", line 75, in run
    self.db.execute(self._options.querystring)
  File "/opt/kusu/lib/python/kusu/core/db.py", line 113, in execute
    return self.__dbcursor.execute(query)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/cursors.py", line 163, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1048, "Column 'cid' cannot be null")
Traceback (most recent call last):
  File "/opt/kusu/sbin/sqlrunner", line 91, in ?
    app.run()
  File "/opt/kusu/sbin/sqlrunner", line 75, in run
    self.db.execute(self._options.querystring)
  File "/opt/kusu/lib/python/kusu/core/db.py", line 113, in execute
    return self.__dbcursor.execute(query)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/cursors.py", line 163, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1048, "Column 'cid' cannot be null")


7. Perform: /opt/kusu/bin/kusurc /opt/kusu/etc/S02KusuIptables.rc.py
8. Perform: /etc/rc.d/init.d/iptables stop [I don't like the firewall]
9. Perform: addhost -u 

10. Perform: addhost
- A blue screen is displayed
- It offers the following list of options (these DIFFER from the Installation Guide):
-- installer
-- compute-rhel
-- compute-imaged
-- compute-diskless
-- unmanaged
- I selected compute-rhel
- The blue screen then waits for compute nodes to be installed


11. Power on a new blade server (planned to be used as a compute node)
-- PXE boot does its job: the node gets an IP and goes to the TFTP server :)
-- SURPRISE: no image to boot in /tftpboot/kusu
-- The compute node was searching for:
---- kernel-rhel-5-x86_64
---- initrd-rhel-5-x86_64.img

-- The folder contained:
---- kernel-rhel-5.3-x86_64
---- initrd-rhel-5.3-x86_64.img

-------> WORKAROUND: create symlinks to the existing files
ln -s kernel-rhel-5.3-x86_64 kernel-rhel-5-x86_64
ln -s initrd-rhel-5.3-x86_64.img initrd-rhel-5-x86_64.img


12. Finally the compute node boots up ;) and loads a kernel

13. The setup gets stuck at a blue screen, requesting a file called ks.cfg.10.52.0.1 from the URL 10.52.0.1 (MASTER NODE)

14. Looking in /var/www/html/repos/1000/, I realized that there was no ks.cfg.10.52.0.1 file

-------> WORKAROUND: vi  /var/www/html/repos/1000/ks.cfg.10.52.0.1
---- file content -----
install
url --url http://10.52.0.1/kits/rhel/5.3/x86_64
network --bootproto=dhcp --noipv6
lang en_US
langsupport --default="us us"
keyboard us
---- file content -----

15. Booting the blade again, everything is OK and I get the RHEL 5.3 setup screen (requesting installation numbers etc., as on a brand-new system)

16. This time, in the addhost screen from step 10, I can see the MAC address of this system shown as being installed

17. The compute node finishes the Linux installation (requesting root password, packages, etc.)

18. After reboot, surprise: in the addhost screen from step 10 the system is still shown in the installation stage!
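
The two manual workarounds from steps 11 and 14 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names link_boot_files and write_kickstart_stub are hypothetical, the file names and kickstart lines are copied from the steps above, and the demo runs in a scratch directory instead of /tftpboot/kusu or /var/www/html/repos/1000. It reproduces the workaround, not whatever OCS itself is supposed to generate.

```python
import os
import tempfile

# File-name pairs from step 11: (name the node requested, name present on disk).
BOOT_FILE_ALIASES = [
    ("kernel-rhel-5-x86_64", "kernel-rhel-5.3-x86_64"),
    ("initrd-rhel-5-x86_64.img", "initrd-rhel-5.3-x86_64.img"),
]

# Kickstart lines copied from the hand-written file in step 14.
KICKSTART_TEMPLATE = """install
url --url http://{installer_ip}/kits/rhel/5.3/x86_64
network --bootproto=dhcp --noipv6
lang en_US
langsupport --default="us us"
keyboard us
"""


def link_boot_files(tftp_dir):
    """Create relative symlinks for the requested names; return names created."""
    created = []
    for wanted, existing in BOOT_FILE_ALIASES:
        link_path = os.path.join(tftp_dir, wanted)
        target_path = os.path.join(tftp_dir, existing)
        # Skip if the target is missing or the requested name is already taken.
        if os.path.exists(target_path) and not os.path.lexists(link_path):
            os.symlink(existing, link_path)  # same effect as: ln -s existing wanted
            created.append(wanted)
    return created


def write_kickstart_stub(repo_dir, installer_ip):
    """Write ks.cfg.<installer_ip> into repo_dir unless it exists; return its path."""
    path = os.path.join(repo_dir, "ks.cfg." + installer_ip)
    if not os.path.exists(path):
        with open(path, "w") as handle:
            handle.write(KICKSTART_TEMPLATE.format(installer_ip=installer_ip))
    return path


if __name__ == "__main__":
    # Dry run in a scratch directory, with empty stand-ins for the boot files.
    scratch = tempfile.mkdtemp()
    for _, existing in BOOT_FILE_ALIASES:
        open(os.path.join(scratch, existing), "w").close()
    print(link_boot_files(scratch))
    print(write_kickstart_stub(scratch, "10.52.0.1"))
```

Both helpers refuse to overwrite existing files, so running them again leaves earlier results untouched.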


So here I am, stuck. Any ideas?

My impression:
-- all this effort for nothing
-- to get this result I could have done it myself (BIND, DHCP and TFTP setup) to get a network installation of some blade servers

Questions:
-- Should we have a different Linux kernel, setup, etc. for compute nodes?
-- How can we see the compute nodes as being installed? In the OCS MySQL database?


So, Red Hat tech folks, what is this? Will there be any real response from Red Hat, or should I conclude that I should move away from the Red Hat HPC Solution?

Comment 8 Robert Allton 2009-04-03 14:22:18 UTC
It looks like this issue has now been cleared. The kusu packages have all migrated from 5.1.X versions to 5.2.X versions. I was able to successfully install 5.2, migrate to 5.3 and then install HPC Solution.
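
For anyone checking whether a head node has picked up the newer packages, a rough Python sketch of the comparison is below. The package names and versions in the demo are hypothetical, the helper packages_below is invented for illustration, and how the name/version pairs are collected from the system is deliberately left out.

```python
def parse_version(version):
    """Turn a purely numeric dotted version such as '5.2.1' into (5, 2, 1)."""
    return tuple(int(part) for part in version.split("."))


def packages_below(installed, minimum="5.2", prefix="kusu"):
    """Return sorted names that start with `prefix` and are older than `minimum`.

    `installed` maps package name to version string. Versions with
    non-numeric parts are not handled by this sketch.
    """
    floor = parse_version(minimum)
    return sorted(
        name
        for name, version in installed.items()
        if name.startswith(prefix) and parse_version(version) < floor
    )


if __name__ == "__main__":
    # Hypothetical package names and versions, for illustration only.
    installed = {"kusu-base": "5.1.3", "kusu-core": "5.2.0", "mod_ssl": "2.2.3"}
    print(packages_below(installed))
```

An empty result would be consistent with the 5.1.X to 5.2.X migration described in this comment.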

Comment 13 errata-xmlrpc 2009-12-14 16:12:52 UTC
An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1667.html

