Bug 1571290

Summary: [RFE] ContentView kickstart versioning
Product: Red Hat Satellite
Reporter: Nikola Kresic <nkresic>
Component: Repositories
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED WONTFIX
QA Contact: vijsingh
Severity: medium
Docs Contact:
Priority: unspecified
Version: 6.3.0
CC: bkearney, cdonnell, ddolguik, inecas, lzap, pdwyer, peter.vreman
Target Milestone: Unspecified
Keywords: FutureFeature, Reopened, Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-11-04 14:03:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1122832    

Description Nikola Kresic 2018-04-24 13:08:19 UTC
Description of problem:
.img files in /var/lib/tftpboot/boot are dupes or corrupted in certain cases


Version-Release number of selected component (if applicable):
Tested with Sat 6.3 and RHEL 7.4 and 7.5


How reproducible:
Needs 2 ContentViews; I used one for 7.4 and one for 7.5.

Steps to Reproduce:

- Remove /var/lib/tftpboot/boot/RedHat-7.*
- Create a Hostgroup with OS version set to RHEL7.4
- Assign a ContentView with only the RHEL7.4 Kickstart repo included
- Create the default PXE template
- Create a host
- ......
- Create a Hostgroup with OS version set to RHEL7.5
- Assign a ContentView with only the RHEL7.5 Kickstart repo included
- Create the default PXE template
- Create a host


Actual results:
[root@dhcp87 boot]# ls -la
total 108680
drwxr-xr-x. 2 foreman-proxy root               144 Apr 20 15:57 .
drwxr-xr-x. 8 root          root               177 Apr 13 09:54 ..
-rw-r--r--. 1 foreman-proxy foreman-proxy 49763300 Apr 13 12:44 RedHat-7.4-x86_64-initrd.img
-rw-r--r--. 1 foreman-proxy foreman-proxy  5875184 Apr 13 12:43 RedHat-7.4-x86_64-vmlinuz
-rw-r--r--. 1 foreman-proxy foreman-proxy 49763300 Apr 13 12:44 RedHat-7.5-x86_64-initrd.img
-rw-r--r--. 1 foreman-proxy foreman-proxy  5875184 Apr 13 12:43 RedHat-7.5-x86_64-vmlinuz

[root@dhcp87 boot]# md5sum *
f0bc285eb44f9e363d1327f99d28d976  RedHat-7.4-x86_64-initrd.img
f15a0fb05249b3d1daa46ec179e9928e  RedHat-7.4-x86_64-vmlinuz
f0bc285eb44f9e363d1327f99d28d976  RedHat-7.5-x86_64-initrd.img
f15a0fb05249b3d1daa46ec179e9928e  RedHat-7.5-x86_64-vmlinuz

7.4 duped as 7.5 
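
The md5sum check above can be automated. The following is a minimal sketch, not part of Satellite or Foreman: the helper names `file_md5` and `find_duplicates` are hypothetical, and it only assumes a directory of boot files such as /var/lib/tftpboot/boot.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directory: Path) -> list[list[str]]:
    """Group file names in `directory` that share identical content."""
    by_digest = defaultdict(list)
    for entry in sorted(directory.iterdir()):
        if entry.is_file():
            by_digest[file_md5(entry)].append(entry.name)
    # Only groups with more than one name are duplicates.
    return [names for names in by_digest.values() if len(names) > 1]
```

Calling `find_duplicates(Path("/var/lib/tftpboot/boot"))` on the listing above would be expected to group the 7.4 and 7.5 files pairwise, since their checksums are identical.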


Expected results:

Proper files for RHEL 7.5



Additional info:

Sometimes the files are not duped but corrupted and unusable.

Comment 1 Lukas Zapletal 2018-04-24 13:15:48 UTC
Hey, known bug, WIP, workaround possible - one liner.

*** This bug has been marked as a duplicate of bug 1447963 ***

Comment 2 Peter Vreman 2018-04-26 08:09:45 UTC
Lukas, this issue is different from what BZ1447963 describes.
Here it created TFTP boot files with 7.5 in the name but with the content from 7.4.

In the Sat 6.3 log it is also visible that Foreman really passes a mismatched source and destination, with different RHEL versions, to the smart proxy. This still goes wrong with the one-liner fix that removes '-c' from wget.

----------------------
2018-04-17 11:41:30 2f02471c [app] [D] RestClient.post "https://li-lc-1578.hag.hilti.com:9090/tftp/fetch_boot_file", "prefix=boot%2F RedHat-7.5-x86_64 &path=http%3A%2F%2Fli-lc-1578.hag.hilti.com%2Fpulp%2Frepos%2FHilti%2FHostgroup%2Fhgp-crash_sat-6_2-ci%2Fcontent%2Fdist%2Frhel%2Fserver%2F7%2F 7.4 %2Fx86_64%2Fkickstart%2F%2Fimages%2Fpxeboot%2Finitrd.img", "Accept"=>"application/json", "Accept-Encoding"=>"gzip, deflate", "Content-Length"=>"231", "Content-Type"=>"application/x-www-form-urlencoded" ------ 
----------------------

To fix this issue you need the full change from http://projects.theforeman.org/issues/19389, which creates the media once under a unique ID and then creates symlinks to them for each hostgroup and host.
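
The "store once under a unique ID, then symlink" idea can be sketched as follows. This is only an illustration of the approach described in this comment, not the actual Foreman change: the `store` directory, the hash-based ID, and the `fetch` callback are all assumptions made for the example.

```python
import hashlib
import os
from pathlib import Path
from typing import Callable


def link_boot_file(tftp_root: Path, source_url: str, link_name: str,
                   fetch: Callable[[str], bytes]) -> Path:
    """Store one copy per source URL and point a named symlink at it."""
    # Hypothetical layout: <tftp_root>/store/<id derived from the URL>
    store = tftp_root / "store"
    store.mkdir(parents=True, exist_ok=True)
    unique_id = hashlib.sha256(source_url.encode("utf-8")).hexdigest()[:16]
    target = store / unique_id

    # Download only if this exact source has not been stored yet.
    if not target.exists():
        target.write_bytes(fetch(source_url))

    link = tftp_root / link_name
    link.parent.mkdir(parents=True, exist_ok=True)
    # Replace a stale link or file so the name always follows the source.
    if link.is_symlink() or link.exists():
        link.unlink()
    os.symlink(os.path.relpath(target, link.parent), link)
    return link
```

Because the stored file is keyed by its source rather than by the OS name, two hostgroups that reference the same kickstart repo share one download, and a hostgroup whose repo changes gets a link to a different stored file instead of a partially overwritten one.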

Comment 3 Lukas Zapletal 2018-04-26 15:02:08 UTC
Peter,

thanks for the detailed analysis. The planned patch will help to download the correct content, but the symlink will still be named incorrectly, which is RedHat-7.5-x86_64-xxx at the moment.

The filename is determined from the Operating System family name, OS version and Architecture; after a successful sync they are concatenated and this forms the prefix. If you have an Operating System with family set to "Red Hat" and version set to 7.5, then it will create this file.
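
As a rough illustration of the naming rule just described, the prefix can be modelled like this. The function `pxe_prefix` is a hypothetical stand-in written for this example; the "boot/" directory and the removal of the space in the family name are inferred from the prefix seen in the logs in this report, not taken from the Foreman source.

```python
def pxe_prefix(family: str, version: str, architecture: str) -> str:
    """Build a TFTP file prefix from OS family, OS version and architecture."""
    # Assumption: the space in the family name is dropped ("Red Hat" -> "RedHat").
    parts = [family.replace(" ", ""), version, architecture]
    return "boot/" + "-".join(parts)


# The prefix depends only on the Operating System record, never on the
# kickstart repository that the files are actually downloaded from.
print(pxe_prefix("Red Hat", "7.5", "x86_64"))
```

Nothing in this computation looks at the download source, which is why a 7.4 kickstart repo can end up stored under a 7.5 file name.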

Can you tell me how you got into this state where you have this inconsistency? What was the upstream URL during the initial sync?

I am flipping this to the Katello team for further investigation; the TFTP problem is a consequence of the wrong version.

Comment 4 Lukas Zapletal 2018-04-26 15:21:35 UTC
Peter, some updates.

I re-reviewed the PR you refer to and I can confirm that at this stage, it will help to work around the TFTP corruption. But we would like to solve this incorrect versioning in Katello.

We found out that 7.4 and 7.5 RHEL Workstation kickstarts have incorrect kernel/initramdisk (they are from 7.3), we have an internal ticket and release engineering is investigating this issue already.

Can you tell me if you synced Workstation kickstart in this case to rule out this issue?

Also, can you confirm that the reproduction steps from the description are correct? We will try them here.

Comment 6 Peter Vreman 2018-04-26 16:16:01 UTC
Hoi Lukas,

The root cause is simple: there is no validation that the Kickstart repo inside the ContentView matches the Foreman OS:
- Foreman uses the OS version from the Hostgroup
- Katello provides the ContentView
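
A minimal sketch of the missing validation, assuming the kickstart repository path follows the layout visible in the log lines of this report (`.../content/dist/rhel/server/<major>/<major.minor>/<arch>/kickstart/...`). The function names are hypothetical and this is not an existing Foreman or Katello check.

```python
import re
from typing import Optional

# Assumed path layout, taken from the fetch_boot_file log lines in this report.
KICKSTART_PATH = re.compile(
    r"/content/dist/rhel/[^/]+/(\d+)/(\d+\.\d+)/([^/]+)/kickstart(?:/|$)"
)


def kickstart_version(repo_path: str) -> Optional[str]:
    """Extract the major.minor version from a kickstart repo path, if present."""
    match = KICKSTART_PATH.search(repo_path)
    return match.group(2) if match else None


def kickstart_matches_os(repo_path: str, os_version: str) -> bool:
    """Return True only when the repo path carries the same version as the OS."""
    return kickstart_version(repo_path) == os_version
```

With such a check, assigning a 6.9 kickstart repo to a hostgroup whose Operating System is 7.5 would be rejected before any boot file is requested.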


Reproduction:
- Remove the /var/lib/tftpboot/boot/RedHat-7.5*
- Create a Hostgroup with OS version set to RHEL7.5
- Assign a ContentView with only RHEL6.9 Kickstart repo included
- Use Hammer or API to set the Kickstart repo to the Kickstart repo in the Contentview
- Create default PXE template

------------
2018-04-26 16:09:35 cd90838b [app] [D] RestClient.post "https://li-lc-1578.hag.hilti.com:9090/tftp/fetch_boot_file", "prefix=boot%2FRedHat-7.5-x86_64&path=http%3A%2F%2Fli-lc-1578.hag.hilti.com%2Fpulp%2Frepos%2FHilti%2FHostgroup%2Fhgp-crash_rhel-6_9-ci%2Fcontent%2Fdist%2Frhel%2Fserver%2F6%2F6.9%2Fx86_64%2Fkickstart%2F%2Fimages%2Fpxeboot%2Finitrd.img", "Accept"=>"application/json", "Accept-Encoding"=>"gzip, deflate", "Content-Length"=>"232", "Content-Type"=>"application/x-www-form-urlencoded"
------------------

Comment 7 Peter Vreman 2018-04-26 16:17:09 UTC
Maybe a separate BZ: the kernel and initrd files are requested for download for every Hostgroup. E.g. in my case, with almost 200 Hostgroups, there are 380 REST calls to (re)download the 'same' rhel6.9, 7.3, 7.4 and 7.5 boot images.

[crash/LI] root@li-lc-1578:~# grep '2018-04-26 16:09.*:9090/tftp/fetch_boot_file' /var/log/foreman/production.log -c
380
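
One way to avoid the repeated calls is to collapse identical requests before sending them. The sketch below is an illustration only: `unique_fetches` is a hypothetical helper, and a request is modelled simply as a (prefix, path) pair like the two parameters visible in the fetch_boot_file log lines.

```python
from typing import Iterable


def unique_fetches(requests: Iterable[tuple[str, str]]) -> list[tuple[str, str]]:
    """Drop repeated (prefix, path) pairs while keeping first-seen order."""
    seen = set()
    ordered = []
    for prefix, path in requests:
        key = (prefix, path)
        if key not in seen:
            seen.add(key)
            ordered.append(key)
    return ordered
```

For example, 190 hypothetical hostgroups spread over four OS versions, each asking for a kernel and an initrd, produce 380 requests but only 8 distinct (prefix, path) pairs.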

Comment 8 Peter Vreman 2018-04-26 16:23:47 UTC
Additional info on the API calls i use:
- Set medium_id=null
- Set kickstart_repo_id to the repo in the CV+LC that has Kickstart in its name

Comment 9 Peter Vreman 2018-04-26 16:34:34 UTC
Real-world scenario where this hits me with every RHEL minor release, ever since Sat 6.2.x, which has 'Synced Content' as medium:

- I already have a RHEL7Server ContentView with kickstart repo RHEL7.y
- When the next RHEL7.x minor comes out, I sync the new repos
- I then have the latest OS 7.x available, which I use when I have to attach to '7Server' (RFE for better support in BZ1267885)
- Then I update my Hostgroups and Hosts with the latest info for 7Server, which is then RHEL7.x, but the ContentView stays with RHEL7.y
- Then the PXE menu is built
- Now you have a RHEL7.x-initrd with the content of the RHEL7.y-initrd.
   - The first time you are lucky that it is a 100% dupe

Now step two:
- I update my RHEL7Server ContentView with kickstart repo RHEL7.x
- Update the Hostgroups for the new RHEL7.x kickstart inside the CV
- Build the PXE menu
- Now you are out of luck: because of 'wget -c', it will 'continue the download' of the previous RHEL7.y-initrd up to the size of the RHEL7.x-initrd. And there you have the corrupted boot file (matching BZ1447963)
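
The corruption in step two can be reproduced on plain byte strings. This is a simplified model of a resumed download, not wget itself: `resume_download` is a hypothetical function, and treating a local file that is at least as large as the remote one as "already complete" is an assumption made for the example.

```python
def resume_download(existing: bytes, remote: bytes) -> bytes:
    """Mimic a resumed download: keep the local bytes, append the remote tail."""
    if len(existing) >= len(remote):
        # Simplification: nothing is fetched when the local file is not smaller.
        return existing
    # Only the bytes past the local size are fetched and appended.
    return existing + remote[len(existing):]


old_initrd = b"A" * 5   # stands in for the RHEL7.y initrd already on disk
new_initrd = b"B" * 8   # stands in for the larger RHEL7.x initrd on the server

result = resume_download(old_initrd, new_initrd)
# Right size, wrong content: a 7.y head with a 7.x tail.
print(len(result) == len(new_initrd), result == new_initrd)
```

The result has the size of the new file but starts with the old file's bytes, so it is neither image. When both files happen to have the same size, nothing is appended and the old content survives under the new name, which is the "100% dupe" case from step one.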

Comment 10 Lukas Zapletal 2018-04-27 12:57:23 UTC
Thanks Peter. This is indeed a weak part of the implementation we need to work on. A reasonable goal would be:

1) Get the referenced https://github.com/theforeman/foreman/pull/5244 PR merged and ship as part of Satellite 6.4 to minimize corruptions

2) Work on refactoring of Operating System vs ContentView Kickstart vs download handling.

Comment 13 Lukas Zapletal 2018-05-02 11:03:54 UTC
For the record, I started a discussion on this topic at:

https://community.theforeman.org/t/rfc-change-tftp-file-naming-pattern/8955

Comment 15 Bryan Kearney 2019-11-04 14:03:32 UTC
Thank you for your interest in Satellite 6. We have evaluated this request, and while we recognize that it is a valid request, we do not expect this to be implemented in the product in the foreseeable future. This is due to other priorities for the product, and not a reflection on the request itself. We are therefore closing this out as WONTFIX. If you have any concerns about this, please do not reopen. Instead, feel free to contact Red Hat Technical Support. Thank you.