Description of problem: I installed Windows 10 on my computer (in UEFI). During installation I left some space for Fedora installation. Then I installed Fedora to empty space (also in UEFI). After reboot there is Windows entry in grub, but when I try to boot from it it only shows message: error: relocation failed. Windows can be booted from boot menu with windows efi entry. In windows grub entry there is: chainload /EFI/Microsoft/Boot/bootmgfw.efi which seems to be right. Version-Release number of selected component (if applicable): grub2-2.02-0.34.fc24.x86_64 I used Workstation Live RC 1.2 How reproducible: We tested this twice and had the same problem both times Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info: I propose this as a blocker as it violates the criterion: "The installer must be able to install into free space alongside an existing clean Windows installation and install a bootloader which can boot into both Windows and Fedora."
I forgot to mention that this happens only on one of two computers where I tested this scenario. On the second machine there I got bug 1347273
I saw something similar on my bare metal test box, but it worked fine in a KVM, and we do have users reporting success in https://bugzilla.redhat.com/show_bug.cgi?id=1320273 (i.e. they didn't hit this after that was fixed), so it does seem like this works at least *some* of the time. On my test box the first attempt to boot Windows just immediately loops back to grub (the screen flashes briefly, basically), the *second* and subsequent attempts show the "relocation failed" error. If I try to boot Fedora after that, I see a "relocation failed" error, but then Fedora boots. The installs on that box were to a hardware RAID set, I guess I can re-test on a regular disk.
Sebastian Grill (CCed) reports something similar in the comments to 1320273: "With grub2-2.02-0.34.fc24 i get a 'relocation failed' error upon trying to boot Win 10."
I'm getting the exact same error here with RC 1.2 when trying to boot a Windows 10 that's set up for dual boot.
I know this was reported against W10, but I attempted F24RC1-2 with W2k12 dual boot in KVM and couldn't reproduce.
Discussed during the 2016-06-16 Go/No-Go #2 meeting: [1] The decision was made to not classify this bug as a blocker as there have been several reports of this working OK (50/50) and the firmware boot selector can still be used to boot the correct OS. [1] https://meetbot.fedoraproject.org/fedora-meeting-2/2016-06-16/f24-final-go_no_go-meeting.2016-06-16-17.00.txt
Thanks for the CC. If there's any info i can provide that is of interest to you, please let me know.
sebastian: could you let us know what system you're testing on - i.e. the model? If it's a self-build, what's the motherboard? Thanks!
Yes it's self built. Output of sudo lshw -short minus some HIDs below. H/W path Device Class Description ========================================================= system System Product Name (To be filled /0 bus P8Z68-V GEN3 /0/0 memory 64KiB BIOS /0/4 processor Intel(R) Core(TM) i5-2500K CPU @ 3 /0/4/5 memory 256KiB L1 cache /0/4/6 memory 1MiB L2 cache /0/4/7 memory 6MiB L3 cache /0/24 memory 16GiB System Memory /0/24/0 memory 4GiB DIMM DDR3 Synchronous 1600 MH /0/24/1 memory 4GiB DIMM DDR3 Synchronous 1600 MH /0/24/2 memory 4GiB DIMM DDR3 Synchronous 1600 MH /0/24/3 memory 4GiB DIMM DDR3 Synchronous 1600 MH /0/100 bridge 2nd Generation Core Processor Fami /0/100/1 bridge Xeon E3-1200/2nd Generation Core P /0/100/1/0 display Pitcairn XT [Radeon HD 7870 GHz Ed /0/100/1/0.1 multimedia Cape Verde/Pitcairn HDMI Audio [Ra /0/100/16 communication 6 Series/C200 Series Chipset Famil /0/100/19 eno1 network 82579V Gigabit Network Connection /0/100/1a bus 6 Series/C200 Series Chipset Famil /0/100/1a/1 usb1 bus EHCI Host Controller /0/100/1a/1/1 bus Integrated Rate Matching Hub /0/100/1b multimedia 6 Series/C200 Series Chipset Famil /0/100/1c bridge 6 Series/C200 Series Chipset Famil /0/100/1c.1 bridge 6 Series/C200 Series Chipset Famil /0/100/1c.1/0 bus ASM1042 SuperSpeed USB Host Contro /0/100/1c.1/0/0 usb4 bus xHCI Host Controller /0/100/1c.1/0/1 usb3 bus xHCI Host Controller /0/100/1c.2 bridge 6 Series/C200 Series Chipset Famil /0/100/1c.3 bridge 6 Series/C200 Series Chipset Famil /0/100/1c.3/0 storage JMB362 SATA Controller /0/100/1c.4 bridge 6 Series/C200 Series Chipset Famil /0/100/1c.4/0 bus ASM1042 SuperSpeed USB Host Contro /0/100/1c.4/0/0 usb6 bus xHCI Host Controller /0/100/1c.4/0/0/2 bus USB3.0 Hub /0/100/1c.4/0/1 usb5 bus xHCI Host Controller /0/100/1c.4/0/1/2 bus USB2.0 Hub /0/100/1c.6 bridge 82801 PCI Bridge /0/100/1c.6/0 bridge ASM1083/1085 PCIe to PCI Bridge /0/100/1c.7 bridge 6 Series/C200 Series Chipset Famil /0/100/1d bus 6 Series/C200 Series Chipset Famil /0/100/1d/1 usb2 bus EHCI Host Controller /0/100/1d/1/1 bus Integrated Rate Matching Hub /0/100/1f bridge Z68 Express Chipset Family LPC Con /0/100/1f.2 storage 6 Series/C200 Series Chipset Famil /0/100/1f.3 bus 6 Series/C200 Series Chipset Famil /0/1 scsi0 storage /0/1/0.0.0 /dev/sda disk 128GB M4-CT128M4SSD2 /0/1/0.0.0/2 /dev/sda2 volume 199MiB Windows FAT volume /0/1/0.0.0/3 /dev/sda3 volume 500MiB EXT4 volume /0/1/0.0.0/4 /dev/sda4 volume 118GiB LVM Physical Volume /0/2 scsi1 storage /0/2/0.0.0 /dev/sdb disk 500GB Samsung SSD 850 /0/2/0.0.0/1 /dev/sdb1 volume 127MiB reserved partition /0/2/0.0.0/2 /dev/sdb2 volume 464GiB Windows NTFS volume /0/2/0.0.0/3 /dev/sdb3 volume 259MiB Windows FAT volume /0/2/0.0.0/4 /dev/sdb4 volume 449MiB Windows NTFS volume /0/2/0.0.0/5 /dev/sdb5 volume 449MiB Windows NTFS volume /0/3 scsi2 storage /0/3/0.0.0 /dev/sdc disk 6001GB WDC WD60EZRX-00M /0/3/0.0.0/1 /dev/sdc1 volume 5589GiB Windows NTFS volume /0/5 scsi4 storage /0/5/0.0.0 /dev/cdrom disk BD-RE BH10LS38 /1 power To Be Filled By O.E.M. /2 power To Be Filled By O.E.M. /3 virbr0 network Ethernet interface /4 virbr0-nic network Ethernet interface
Sorry, i just noticed the MB make is a bit not-so-obvious in that report. It's an ASUS P8Z68-V/GEN3.
I also get this error with a motherboard Asus P8Z68-V LE, with latest firmware version 4102.
I tested this on hardware from comment 0. It seems to affect only UEFI, in BIOS mode I see no error. Booting directly from UEFI menu works, and downgrading grub2* to grub2-2.02-0.25.fc23 (F23 version) also fixes the problem.
I have the same problem on Asus P8Z68 Deluxe BIOS 3603 (the last one). Can boot with F8. Upgraded Fedora with dnf system upgrade.
(In reply to Dmitry Burstein from comment #13) > I have the same problem on Asus P8Z68 Deluxe BIOS 3603 (the last one). Can > boot with F8. Upgraded Fedora with dnf system upgrade. Forgot to mention I have Win 8.1, not 10.
I'm seeing the same problem with windows 10 on an Asus P8P67 PRO
Actually, can somebody add this to the top of their config file and then post a log of the failure: set debug=secureboot,chain,linuxefi Thanks.
Created attachment 1172975 [details] relocation failed error with debug activated
Issue confirmed with Asus G74Sx.
Created attachment 1190596 [details] screenshots of debug with Asus P8P67-Pro motherboard Attached screenshots from the debug on my Asus P8P67 Pro motherboard.
I noticed that if in the boot menu I scroll down to the Windows line, press "c" to enter the command line mode and upon entering it immediately type "exit", the computer boots into Windows with no problem. I'm using it as an alternative workaround to engaging EFI's own boot sequence, but may be it will give any idea about the nature of the bug.
Created attachment 1195332 [details] Relocation Failed error with debug information Intel Board Mine too getting same error in Grub2 fedora24 when booting windows entries. For me, EFI Boot Manager itself got replace with Grub2 entries. So I can't able to boot windows using EFI Boot section too. efibootmgr Timeout: 1 seconds BootOrder: 0005,0006,0002,0000,0001,0003,0004,0007 Boot0000* CD/DVD Drive Boot0001* Hard Drive Boot0002* Fedora Boot0003* Network Card Boot0004* UEFI: ST500DM002-****** Boot0005* Fedora Boot0006* Linux Boot0007* UEFI: ST500DM002-******
Happening on an ASRock Z75 Pro3 motherboard. Reverted to f23 grub-efi as a workaround.
Sorry for the noise but might have some additional information: I've isolated to -26 of grub2 which is: * Fri Mar 04 2016 Peter Jones <pjones> - 2.02-0.26 - Rebased to newer upstream (grub-2.02-beta3) for fedora-24 The error I get with -26 from my UEFI is "error: out of memory". At some later version, between -26 and -34, the error isn't displayed but I just get a blank screen after selecting Windows from the grub menu.
I'm not sure if this is a universal solution (I don't understand the UEFI boot failover logic) but I converged the workaround in comment 20 to a more "permanent" solution. In /boot/efi/EFI/fedora I created a custom.cfg: menuentry "Windows 10 workaround" { exit } That entry causes the grub boot loader to quit, and at least with my motherboard, the UEFI firmware continues to the next EFI boot loader, which I've configured to be the Windows 10 boot loader, and it boots windows.
I grepped grub, shim, and upstream edk2 for "relocation failed". No hits. I also grepped their git histories for the message (-S). No hits again. Based on comment 2 (--> the message is printed both when booting Windows and when booting Fedora, under the "right" circumstances), I must think the message comes from a subset of proprietary UEFI firmwares. Presumably, this family of firmware requires, correctly or incorrectly, such things from the image to be launched that grub's chain-loader does not satisfy.
Fedora's grub build is not entirely a stock one. The 'relocation failed' error comes from this patch: http://pkgs.fedoraproject.org/cgit/rpms/grub2.git/tree/0070-Add-secureboot-support-on-efi-chainloader.patch
Thanks. Thus, "relocation failed" is printed after relocate_coff() fails (== returns a value different from GRUB_EFI_SUCCESS). Looking at the return statements in relocate_coff(), each one that returns a failure code is preceded by a specific grub_error() message. However, in the attachment from comment 21 ("relocation_failed_error_with_debug.jpg"), none of those messages can be seen. That's because grub_error() formats the message into the global variable "grub_errmsg", and whatever error was formatted by relocate_coff() itself is promptly overwritten by the "relocation failed" message in handle_image(), before "grub_errmsg" is printed anywhere. The return status of relocate_coff() is also not logged (grub_errno 18 is just GRUB_ERR_BAD_ARGUMENT).
This also happens in F25. grub2-2.02-0.34.fc24.x86_64 grub2-efi-2.02-0.34.fc24.x86_64 shim-0.8-10.x86_64 Motherboard: Asus P8Z68-V LE, firmware version 4102
I landed here through links from ask.fedoraproject.org, as the title does not match. I use Windows 8.1 along F24, and I get a black screen rather than an error message, however this seems to match comment #23. Hardware is a Dell Latitude E7240 laptop, which came with Windows 8.1 preinstalled. Originally I installed F22 alongside Windows; this was then upgraded to F23 and later to F24. The last upgrade broke booting into Windows. As suggested above, downgrading grub to the 2.02-0.25 version from F23 makes things work again. Apart from the 2 workarounds (downgrade grub, or make it finish and pass on the next EFI boot entry) I don't see much progress here. Is there any interest in making this work for F24 and later? Or is progress tracked somewhere else? What are the disadvantages of switching to the F23 grub version?
I also have the problem of a blank screen after chainloading. It doesn't matter if I use the grub menu entry or select the bootloader manually over the grub command line. I am trying to boot into Windows 7, and I am running on a Lenovo ThinkPad x230. Is there any information that could help? (In reply to Geoffrey Marr from comment #6) > Discussed during the 2016-06-16 Go/No-Go #2 meeting: [1] > > The decision was made to not classify this bug as a blocker as there have > been several reports of this working OK (50/50) and the firmware boot > selector can still be used to boot the correct OS. That's an... "interesting" decision. Forcing users to either a) meddle with their UEFI to get into the OS of their choice or b) downgrading packages to a non-broken version? Which benefits does the F24 and F25 version of Grub even have over the one that's not broken?
It's not 'meddling' with UEFI. The UEFI boot manager is specifically intended for allowing you to pick what to boot. Choosing a boot option isn't 'meddling'.
I think we'll have to ping-pong debug this. That is, I prepare a debug-patched grub package, then adventurous users that experience the symptoms and are willing to help install it, and come back with the debug output. Based on the debug output, we try to figure out what went wrong, refine the debug patches, goto 10. So, (1) Please install grub2 from <http://koji.fedoraproject.org/koji/taskinfo?taskID=16514783>. This is a scratch build with my debug patches. You are welcome to review the patches in the SRPM (they are numbered starting with 11000) and/or rebuild the package yourself. For installing this, you might have to run "dnf downgrade", as the "dbg01" counter that I put in the NVR happens to make "grub2-2.02-0.34.dbg01.fc24" sort before "grub2-2.02-0.34.fc24". For the further debug builds, I'll keep bumping the dbg counter, so dnf upgrade should work. This is in no way supported, signed, or anything like that. If it breaks your machine, you get to keep both pieces. You've been warned. I tested the debug patches in an OVMF VM where I installed Windows 10 + Fedora 24, for dual booting. While I cannot reproduce the subject symptoms, the debug patches seem to pass the smoke test (they produce a bunch of info messages and allow booting Windows 10). (2) I don't know how quickly Koji removes scratch build output files, so response time from affected users is essential. I could expose the build(s) more permanently in my people.redhat.com webspace, but I'd like to avoid that if possible. (3) Getting the output in a text file (by capturing serial port output) would be vastly preferable to screen photos. If you have the gear for this (i.e., your machine has an actual UART, you have a null-modem cable, and another box to read the data, or some serial-to-network kit), please set GRUB_TERMINAL_OUTPUT="console serial" in the "/etc/default/grub" file, then run # grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg (4) Reboot. You will have to disable Secure Boot, or authorize this grub binary in MokManager. (I set up Secure Boot in OVMF just out of principle, without a second thought, and was pleasantly surprised to see shim / MokManager catch the unsigned binary.) (5) No need to change grub debug settings; chainloader messages will be spewed to the console (screen, and if enabled, serial port) automatically. Please capture the log (pictures of text file) leading up to the failure, and attach it to this BZ. Thanks.
I just upgraded from a fully updated F24 to F25. After the upgrade I’m getting the “relocation failed” error when choosing Windows 10 from the grub2 boot menu.
This is a workaround until this is fixed: # dnf downgrade grub2* --releasever=23
Thanks Laszlo! Here my pong :) I attach the screen capture from my Dell E7240 with your dgb01, after choosing to bootstrap the Windows (8.1 in my case) chainloader. The output goes on and on, after pressing a key, the address after "entry" is then incremented by 2. Once in a while, a "reloc_base" appears with the address also incremented by 2 w.r.t. the previous address and the "n" incremented by 1 w.r.t. the previous "reloc_base" line. The address immediately after "reloc_base" is no longer incremented by 2, but by a multiple of 2.
Created attachment 1222252 [details] dbg01-e7240
Bogdan, thanks for the screenshot. Can you please upload the full output? The file you uploaded doesn't contain the debug messages that lead directly to the error. What I'd need is the last one or two pages right before the failure. --*-- Anyway, analyzing this debug log forced me to review the code line by line, and I think I found the bug. Well, let me put it like this: I think to have found *a* bug that looks consistent with the symptoms (that is, it *could* cause this issue), but I can't guarantee in advance that it's *the* bug. Okay, so here goes. The handle_image() function is called with the EFI binary loaded into memory. The input parameter "handle_image:data" points to the image allocated in memory that gets passed in. The handle_image() function calls gBS->AllocatePool(), and creates a *section-wise* copy of the original image. The copy is aligned appropriately, it has the right size, and it is pointed-to by the variable "buffer_aligned". The handle_image() function iterates over sections in the original memory area (that is, "handle_image:data"), investigates each section, and optionally copies the section to the right place in the target area; that is, into "handle_image:buffer_aligned". Notably, the reloc section (which contains the list of lists of relocations) is not necessarily copied (it can be marked as discardable in the image file), and Bogdan's log file fragment actually says "Discarding section" for the reloc section. After this is done, the relocate_coff() function is called, with the following parameters: argument --> parameter role -------- --------- -------- handle_image:data --> relocate_coff:orig ORIGINAL handle_image:buffer_aligned --> relocate_coff:data COPY Note the confusing variable names. In the caller function, "data" stands for the *original* image. In the callee, "data" stands for the section-wise copied, to-be-relocated image. This logic comes from the following downstream-only commit: Add secureboot support on efi chainloader Expand the chainloader to be able to verify the image by means of shim lock protocol. The PE/COFF image is loaded and relocated by the chainloader instead of calling LoadImage and StartImage UEFI boot Service as they require positive verification result from keys enrolled in KEK or DB. The shim will use MOK in addition to firmware enrolled keys to verify the image. The chainloader module could be used to load other UEFI bootloaders, such as xen.efi, and could be signed by any of MOK, KEK or DB. Based on https://build.opensuse.org/package/view_file/openSUSE:Factory/grub2/grub2-secureboot-chainloader.patch If you open the URL at the last line, you notice a difference: the SUSE version of handle_image() never drops sections. It copies all sections. This results in a difference for relocate_coff(): in the SUSE patch, relocate_coff() can work entirely within the copied (target) area, because the "reloc" section will *always* be in that area. So, in the SUSE patch, only the copy is passed to relocate_coff(): argument --> parameter role --------------------------- ------------------ ---- handle_image:buffer_aligned --> relocate_coff:data COPY In the RHEL version however, the reloc section may or may not be present in the copy, hence we can only parse the reloc entries in the *original*. Which is why in our version of the patch, relocate_coff() has parameters "orig" and "data". Let's talk a bit about the relocation section. The relocation section apparently consists of sequences like this: ------------------- uint32_t rva uint32_t size uint16_t entry[0] uint16_t entry[1] uint16_t entry[2] uint16_t entry[3] ... uint16_t entry[N] ------------------- uint32_t rva uint32_t size uint16_t entry[0] uint16_t entry[1] uint16_t entry[2] uint16_t entry[3] ... uint16_t entry[M] ------------------- uint32_t rva uint32_t size uint16_t entry[0] uint16_t entry[1] uint16_t entry[2] uint16_t entry[3] ... uint16_t entry[K] and so on. The "size" field describes the size of that sequence, including the "rva" and "size" fields themselves, and the number of entries (marked above as N, M, K) in each sequence can be derived from the "size" field in that sequence. The parsing code in relocate_coff() has sanity checks. Each time it starts working on a new sequence, it determines the end of the sequence -- in order to see where the entries will end, for that sequence. The purported end is subjected to a wraparound check. Let me quote the SUSE original: > static grub_efi_status_t > relocate_coff (pe_coff_loader_image_context_t *context, void *data) > [...] > grub_efi_uint64_t size = context->image_size; > void *image_end = (char *)data + size; > [...] > reloc_base = image_address (data, size, context->reloc_dir->rva); > [...] > while (reloc_base < reloc_base_end) > { > reloc = (grub_uint16_t *)((char*)reloc_base + sizeof (struct grub_pe32_data_directory)); > reloc_end = (grub_uint16_t *)((char*)reloc_base + reloc_base->size); > > if ((void *)reloc_end < data || (void *)reloc_end > image_end) > { > grub_error (GRUB_ERR_BAD_ARGUMENT, "Reloc table overflows binary"); > return GRUB_EFI_UNSUPPORTED; > } "image_end" stands for the end of the *copied* image, and "data" stands for the beginning of the same. So the check intends to ensure that the end of this reloc sequence will be within the copied image file. This check makes sense because the copied image fill is guaranteed, in the SUSE version, to contain the reloc section. (BTW, there's an off-by one in the code above: reloc_end is an exclusive boundary, while "data" is an inclusive one, so even their equality should be refused. If reloc_end == data, then the actual relocation entries are strictly before "data", which should be rejected. Anyway, I digress.) Okay, let's see how the same looks in the RHEL code: > static grub_efi_status_t > relocate_coff (pe_coff_loader_image_context_t *context, > struct grub_pe32_section_table *section, > void *orig, void *data) > [...] > grub_efi_uint64_t size = context->image_size; > void *image_end = (char *)orig + size; > [...] > reloc_base = image_address (orig, size, section->raw_data_offset); > [...] > while (reloc_base < reloc_base_end) > { > grub_uint16_t *entry; > reloc = (struct grub_pe32_fixup_block *)((char*)reloc_base); > [...] > reloc_end = (struct grub_pe32_fixup_block *) > ((char *)reloc_base + reloc_base->size); > > if ((void *)reloc_end < data || (void *)reloc_end > image_end) > { > grub_error (GRUB_ERR_BAD_ARGUMENT, "Reloc entry %d overflows binary", > n); > grub_dprintf ("chain", "Reloc entry %d overflows binary\n", n); > return GRUB_EFI_UNSUPPORTED; > } Remember, in the RHEL version, we must iterate over the reloc sequences in the *original* image. Therefore we correctly calculate "image_end" as "orig + size" (translated from SUSE's "data + size"). But... in the actual containment check, the first condition is still (void *)reloc_end < data which is bogus: here "data" stands for the *copied* image, which is entirely irrelevant for parsing the reloc entries. In other words, this is an omission in the port from SUSE's patch. This instance of "data" should have been replaced with "orig". It didn't cause a compilation error because the "data" parameter was also preserved. (In such cases it's best to rename the preexistent parameter as well; it helps with auditing all uses.) So why does it cause an invalid error on some firmware, and no error on other firmware? The reason is that we're effectively comparing the addresses of separate, unrelated memory allocations (i.e., the base address of the copied image, and an address that falls into the original image). On most firmware / with most actual memory maps, the AllocatePool() boot service allocates the copied image at a lower address than the original image (the allocator works top-down), and then "reloc_end", which points into the original image, will be higher than the base address of the copied image. (*) In other cases however, the copy happens to be placed at a higher address (the pool allocator might find a gap higher up that it can fill well with the copy), and then the incorrect check fires, even for valid EFI images. (We don't get to see the error message "... overflows binary" because of the separate grub bug I identified in comment 27.) (*) Now, looking at Bogdan's log file fragment, we can see that the subject bug does *not* fire in his case. From comment 37, the tentative location for the copied reloc section (if it wasn't about to be discarded) is 0xD5469000..0xD546995F, while the original reloc section starts higher, at 0xD55F4800. However, that's understandable: in comment 29 Bogdan wrote, "I get a black screen rather than an error message" -- his symptoms are different! So, my take is that Bogdan is affected by an independent bug. Alright, let's go back to the other attachments, and try to verify the idea: - Comment 17 does not contain enough information for saying either way. - Ditto for comment 19. - However, comment 21 contains enough information, and it confirms the analysis. The tentative location for the copied reloc section is 0xC967000..0xC96781F. The original reloc section starts *lower*, at 0xC80D400. This triggers the bug. The fix should be simple (again, this is *not* for the independent bug that Bogdan is witnessing): > diff --git a/grub-core/loader/efi/chainloader.c b/grub-core/loader/efi/chainloader.c > index da6554c1813e..24a1eb63b5e6 100644 > --- a/grub-core/loader/efi/chainloader.c > +++ b/grub-core/loader/efi/chainloader.c > @@ -402,7 +402,7 @@ relocate_coff (pe_coff_loader_image_context_t *context, > reloc_end = (struct grub_pe32_fixup_block *) > ((char *)reloc_base + reloc_base->size); > > - if ((void *)reloc_end < data || (void *)reloc_end > image_end) > + if ((void *)reloc_end < orig || (void *)reloc_end > image_end) > { > grub_error (GRUB_ERR_BAD_ARGUMENT, "Reloc entry %d overflows binary", > n); I'll try to provide a scratch build for testing tomorrow.
I do not get anything else before the attached output, or at least I cannot see any other output and can't attach a serial console to try to capture more. Would it be possible to produce another build which pauses at the point the interesting data is printed? Alternatively, I have experience with building SRPMs, so if you can at least provide a patch doing this, I can rebuild the two binary packages myself.
(In reply to Bogdan Costescu from comment #40) > I do not get anything else before the attached output, What I would be interested in is *after* the output you uploaded. The screenshot in comment 38 is from the start of the relocation. > or at least I cannot see any other output That's interesting. Normally when the debug messages fill the screen and --MORE-- appears at the bottom, you just need to hit Enter or Space to proceed. Grub will continue until another screenful of debug messages has been produced. Do you mean that the relocation hangs for you after the first screenful of messages? > and can't attach a serial console to try to capture > more. Would it be possible to produce another build which pauses at the > point the interesting data is printed? It shouldn't be necessary; the debug message printing has built-in paging functionality.
(In reply to Laszlo Ersek from comment #39) > The fix should be simple (again, this is *not* for the independent bug that > Bogdan is witnessing): > > > diff --git a/grub-core/loader/efi/chainloader.c b/grub-core/loader/efi/chainloader.c > > index da6554c1813e..24a1eb63b5e6 100644 > > --- a/grub-core/loader/efi/chainloader.c > > +++ b/grub-core/loader/efi/chainloader.c > > @@ -402,7 +402,7 @@ relocate_coff (pe_coff_loader_image_context_t *context, > > reloc_end = (struct grub_pe32_fixup_block *) > > ((char *)reloc_base + reloc_base->size); > > > > - if ((void *)reloc_end < data || (void *)reloc_end > image_end) > > + if ((void *)reloc_end < orig || (void *)reloc_end > image_end) > > { > > grub_error (GRUB_ERR_BAD_ARGUMENT, "Reloc entry %d overflows binary", > > n); > > I'll try to provide a scratch build for testing tomorrow. http://koji.fedoraproject.org/koji/taskinfo?taskID=16553935 Juan Orti, Andy Wang, veldur, can you guys please test this build? Thanks.
By "The output goes on and on," in comment #37 I meant pagination with "MORE". So indeed I get the expected output. What I didn't know was that it will eventually come to something else and I stopped (ctrl-alt-del) after allowing it to print 4-5 further screens... So after installing the dbg02 from comment #42, I allowed it to go further. After more than 20 screens, it reaches the "reloc_base_end" address, then comes to "booting via entry point", after which Windows does start. Yey!!! There seems to be indeed a single bug, both for those with some error message and for those with a black screen (like me). Many thanks for tracking it down! So I guess the next step would be a dbg03 with no debugging output or some release candidate for the official package? :)
(In reply to Laszlo Ersek from comment #42) > (In reply to Laszlo Ersek from comment #39) > > > The fix should be simple (again, this is *not* for the independent bug that > > Bogdan is witnessing): > > > > > diff --git a/grub-core/loader/efi/chainloader.c b/grub-core/loader/efi/chainloader.c > > > index da6554c1813e..24a1eb63b5e6 100644 > > > --- a/grub-core/loader/efi/chainloader.c > > > +++ b/grub-core/loader/efi/chainloader.c > > > @@ -402,7 +402,7 @@ relocate_coff (pe_coff_loader_image_context_t *context, > > > reloc_end = (struct grub_pe32_fixup_block *) > > > ((char *)reloc_base + reloc_base->size); > > > > > > - if ((void *)reloc_end < data || (void *)reloc_end > image_end) > > > + if ((void *)reloc_end < orig || (void *)reloc_end > image_end) > > > { > > > grub_error (GRUB_ERR_BAD_ARGUMENT, "Reloc entry %d overflows binary", > > > n); > > > > I'll try to provide a scratch build for testing tomorrow. > > http://koji.fedoraproject.org/koji/taskinfo?taskID=16553935 > > Juan Orti, Andy Wang, veldur, can you guys please test this build? Thanks. With this build I can boot Windows 10 from grub. Thank you! Are you interested in the gazillon photos I've taken?
Kudos to you guys for testing the dbg02 build! No more photos / logs are necessary then. Regarding an official build: since I don't maintain grub2, I couldn't push the patch to dist-git. And without pushing the patch to dist-git, I must not build a non-scratch package in Koji. So, we'll have to follow the traditional route in this part: I'll attach the patch to the BZ, Peter will review it, and hopefully apply it. Then he can submit an official Koji build too. Thanks guys again for your help!
Created attachment 1222471 [details] [PATCH] efi/chainloader: fix wrong sanity check in relocate_coff() In relocate_coff(), the relocation entries are parsed from the original image (not the section-wise copied image). The original image is pointed-to by the "orig" pointer. The current check (void *)reloc_end < data compares the addresses of independent memory allocations. "data" is a typo here, it should be "orig". Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1347291 Signed-off-by: Laszlo Ersek <lersek> Tested-by: Bogdan Costescu <bcostescu> Tested-by: Juan Orti <j.orti.alcaine> --- grub-core/loader/efi/chainloader.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
I pinged Peter on IRC. It appears that he's not going to be available this week. From <https://admin.fedoraproject.org/pkgdb/package/rpms/grub2/>: > Package Administrator(s) > > These users can help you if you need commit privileges for this package. > > bcl (Fedora devel, Fedora 25, Fedora 24, Fedora 23) > > pjones (Fedora devel, Fedora 25, Fedora 24, Fedora 23, Fedora EPEL 6, > Fedora EPEL 5) > > ausil (Fedora devel, Fedora 25, Fedora 24, Fedora 23) Brian, Dennis, can you guys please review - the analysis in comment 39, - the patch in comment 46, - the testing feedback in comment 43 / 44, and apply the patch and submit an official Koji build? Thank you.
I'm a provenpackager, if no-one else gets to it first I will try to take care of this.
well, crap, I can do a build, but I'm sure not qualified to review a patch like this (I do python, man, not this kinda stuff). Tell you one more who might be willing, though - mjg59. Any chance you could take a look at this, Matthew? Sorry to take the liberty, but we're trying to make things right for people dual-boot installing F25. If you're willing and you say it's good I can fire a koji build and stuff.
Analysis and patch sound fine to me.
Thanks Matthew! :)
Thanks a lot Matthew! I'll do a build.
Sigh, so this turns out to be a bit complicated and I don't want to do it without Peter, really. The f24 grub2 branch has several changes that neither master (Rawhide) nor f25 have, first of all. I'd really want to check with pjones if those need to come forward to f25 and rawhide, and whether any reconciliation is needed with this patch. And also, build on ppc64 on master is failing and I'm really not sure what the correct resolution for that is. The immediate issue is that the ppc64 build BuildRequires /usr/lib/crt1.o and nothing provides that (any more?), but I'm really not sure what the story or the correct resolution is there. So I'm kinda unwilling to touch this unless I screw something up, unfortunately. We could perhaps do a cleaned-up scratch build for affected folks to use in the mean time.
(In reply to Adam Williamson from comment #53) > Sigh, so this turns out to be a bit complicated and I don't want to do it > without Peter, really. > > The f24 grub2 branch has several changes that neither master (Rawhide) nor > f25 have, first of all. I'd really want to check with pjones if those need > to come forward to f25 and rawhide, and whether any reconciliation is needed > with this patch. And also, build on ppc64 on master is failing and I'm > really not sure what the correct resolution for that is. The immediate issue > is that the ppc64 build BuildRequires /usr/lib/crt1.o and nothing provides > that (any more?), but I'm really not sure what the story or the correct > resolution is there. So I'm kinda unwilling to touch this unless I screw > something up, unfortunately. > > We could perhaps do a cleaned-up scratch build for affected folks to use in > the mean time. /usr/lib/crt1.o is provided by the koji internal glibc32 package, that workarounds multilib in koji and allows to build gcc with both 32 and 64 backends on 64-bit arches => an update of glibc32 in primary koji instance is needed after the merger with ppc koji.
(In reply to Adam Williamson from comment #53) > Sigh, so this turns out to be a bit complicated and I don't want to do it > without Peter, really. > > The f24 grub2 branch has several changes that neither master (Rawhide) nor > f25 have, first of all. Can you please elaborate? In my (anonymous) clone of Fedora's grub2 dist-git repo, I see the f24 branch standing currently at commit 9a4054fb1c5af194030d44f4e0a2c008afb0323e ("Update ppc64 configure invocation"). Both the authorship date and the commit date of this commit are 2016-06-10. Furthermore, this commit updates the Release field in the "grub2.spec" file from "0.33%{?dist}" to "0.34%{?dist}". Looking in Koji, the latest official build for Fedora 24 is "grub2-2.02-0.34.fc24": <http://koji.fedoraproject.org/koji/buildinfo?buildID=771914> which completed on "Fri, 10 Jun 2016 18:33:04 UTC". Therefore it seems to me that the current f24 HEAD in dist-git matches the latest build from Koji, which also matches the NVR reported in this issue (see comment 0). The patch in comment 46 simply goes on top. IOW, it shouldn't matter if there are differences between the f24 and f25 branches in dist-git, as long as the basis for the new patch (the HEAD of each branch) matches the latest Brew build for the corresponding Fedora release. Let's mark this requirement with (*). The patch is really small (a one-liner), and the file being patched ("grub-core/loader/efi/chainloader.c") is identical between the above-mentioned f24 HEAD, and the current f25 HEAD. (The current f25 HEAD is 9d15b4d492f89ba926dd9de9418c6d0df0a794c9, "Update to be newer than f24's branch".) This means that the patch is equally correct for f24 and f25. However... I do see your point about Fedora 25. Namely, the latest official build in Koji is "grub2-2.02-0.30.fc25", at <http://koji.fedoraproject.org/koji/buildinfo?buildID=752004>, which does *not* correspond to the current f25 HEAD. (It corresponds to the parent of the current f25 HEAD, 336bf36497b1bdaac168d87a7e79802e8da4d333, "Revert 27e66193, which was replaced by upstream's 49426e9fd".) Meaning that the requirement I marked with (*) above is not satisfied for Fedora 25 -- the patch would apply and work just fine on top of the f25 dist-git HEAD, but that basis *itself* has never been built in Koji & released to users. In other words, the f25 branch for grub2 is in an un-updateable state. Regarding Rawhide (F26): - the file being patched is identical (with dist-git master at 0ea125de5aa1a96e5a35d5d45255e691574fdfd0, which you just committed), so that would be fine, - however the official build status is even worse than that of grub2/F25: there is *no* successful F26 build available in Koji. > I'd really want to check with pjones if those need > to come forward to f25 and rawhide, and whether any reconciliation is needed > with this patch. According to the above, source code reconciliation is not needed at this point for the patch, against F25 and F26. Instead, the problem is that it is currently impossible to do *targeted* F25 and F26 builds for this patch. > And also, build on ppc64 on master is failing and I'm > really not sure what the correct resolution for that is. The immediate issue > is that the ppc64 build BuildRequires /usr/lib/crt1.o and nothing provides > that (any more?), but I'm really not sure what the story or the correct > resolution is there. So I'm kinda unwilling to touch this unless I screw > something up, unfortunately. > > We could perhaps do a cleaned-up scratch build for affected folks to use in > the mean time. For Fedora 24 (where everything is in sync), I think it is safe to do an official build. For Fedora 25 and Fedora 26, the patch is safe to add to dist-git. However, we shouldn't submit official Koji builds, as those builds would present a lot more changes to users than just this fix. This is something that Peter will have to sort out indeed. I don't like the idea of a cleaned-up scratch build for several reasons: - for Fedora 24, it is unjustified (an official build will do just fine) - for Fedora 25 and Rawhide, it would expose as-yet unreleased bits - worst of all, Koji would remove the files after a while, and we'd have to find external storage for them. I suggest to go ahead with the F24 build (for which this BZ was originally reported, until it got bumped to F25 in comment 28). Dist-git and Koji are already out of sync for F25 and F26, so I agree not to build packages for those. Anyhow, I defer to your judgement. Thanks for your help!
I have prepared a new glibc32 build, so in a while new grub2 builds should succeed also on ppc64.
(In reply to Laszlo Ersek from comment #42) > http://koji.fedoraproject.org/koji/taskinfo?taskID=16553935 I can confirm this build fixes Window boot on system from comment 0. Great job, Laszlo, thanks a lot!
(In reply to Kamil Páral from comment #57) > (In reply to Laszlo Ersek from comment #42) > > http://koji.fedoraproject.org/koji/taskinfo?taskID=16553935 > > I can confirm this build fixes Window boot on system from comment 0. Great > job, Laszlo, thanks a lot! Not for me :( I now get several pages of output from loader/efi/chainloader.c, printing all kinds of sections. When I jumpt through those pages, the last message is > booting via entry point After which, nothing happens. If required, I can upload (camera) screenshots of the output, but is stuff that is printed somewhere as a text file?
Laszlo: the '-34' on master and f25 branches is not in fact at all the same as the '-34' on f24. the two diverge starting at 2ae5c1e . From that point on, f24 has: 9a4054f Update ppc64 configure invocation 35f33ea Two bug fixes... 2572c16 Revert 27e66193, which was replaced by upstream's 49426e9fd 8768d23 Fix ppc64 build failure on fedora-24 af00b6d Pull TPM updates from mjg59. 0fa4f49 Fix aarch64 build problem. 9bc6276 Rebased to newer upstream for fedora-24 e229c7b Update debuginfo too 133f7ea Minor fixup to our bzrignore fix. 83affdb make the sb thing work while master has (ignoring my commit and revert): 9d15b4d Update to be newer than f24's branch. 336bf36 Revert 27e66193, which was replaced by upstream's 49426e9fd 1713515 Rebased to newer f24 code 0ac23e2 Pull TPM updates from mjg59. dcaf8f6 switch to pkgconfig() style BR bec00f1 fix summary-ended-with-dot warning from rpmlint Possibly 1713515 brought them back in sync again, but even since then, f24 has "Two bug fixes..." and "Update ppc64 configure invocation", which master and f25 do not. If it were my package I'd do a big merge commit to bring f24, f25 and master all back into line and then put your patch on top of that. But I don't want to do that without pjones' signoff.
(In reply to Alexander Korsunsky from comment #58) > (In reply to Kamil Páral from comment #57) > > (In reply to Laszlo Ersek from comment #42) > > > http://koji.fedoraproject.org/koji/taskinfo?taskID=16553935 > > > > I can confirm this build fixes Window boot on system from comment 0. Great > > job, Laszlo, thanks a lot! > > Not for me :( > > > I now get several pages of output from loader/efi/chainloader.c, printing > all kinds of sections. When I jumpt through those pages, the last message is > > > booting via entry point > > After which, nothing happens. If you reach the entry point of the relocated binary, then it sounds like a different bug. Can you check the open bugs for grub2? It may have been reported earlier. If not, reporting it as a separate BZ might make sense. > If required, I can upload (camera) screenshots of the output, but is stuff > that is printed somewhere as a text file? No, grub2 does not save the log messages to a file (grub2 generally does not write to the disk, very justifiedly, apart from the grub environment block). In order to capture messages like this, you need a (virtual) serial connection and another (virtual) machine.
(In reply to Alexander Korsunsky from comment #58) > (In reply to Kamil Páral from comment #57) > > (In reply to Laszlo Ersek from comment #42) > > > http://koji.fedoraproject.org/koji/taskinfo?taskID=16553935 > > > > I can confirm this build fixes Window boot on system from comment 0. Great > > job, Laszlo, thanks a lot! > > Not for me :( > > > I now get several pages of output from loader/efi/chainloader.c, printing > all kinds of sections. When I jumpt through those pages, the last message is > > > booting via entry point > > After which, nothing happens. Are you sure to use the dbg02 package, as mentioned in comment #42?
(In reply to Laszlo Ersek from comment #60) > > I now get several pages of output from loader/efi/chainloader.c, printing > > all kinds of sections. When I jumpt through those pages, the last message is > > > > > booting via entry point > > > > After which, nothing happens. > > If you reach the entry point of the relocated binary, then it sounds like a > different bug. Can you check the open bugs for grub2? It may have been > reported earlier. If not, reporting it as a separate BZ might make sense. As I mentioned in comment #43, I also see "booting via entry point" as the last printed message. The difference with respect to Alexander's report is that for me it goes further and immediately starts booting Windows.
Okay. Maybe the entry point itself is translated incorrectly. Alexander, can you please attach your log? Perhaps I can find something in it. For example I can see a path through the code that would discard the section that contains the entry point, if the section in question were marked as discardable. This would arguably be a bug in the image being relocated.
So I'm definitely not trying to reconcile the pile of crazy that's the grub2 package build without pjones around. Instead, I've just done some scratch builds for now, all on top of whatever else is currently on the relevant branch, just with Laszlo's patch added. Here's F25: http://koji.fedoraproject.org/koji/taskinfo?taskID=16567423 F24: http://koji.fedoraproject.org/koji/taskinfo?taskID=16567714 Note these won't be signed for Secure Boot purposes, you won't be able to use them with SB enabled, I believe. But if you're running into this issue and you don't have SB enabled, those builds may help.
(In reply to Alexander Korsunsky from comment #58) > (In reply to Kamil Páral from comment #57) > > (In reply to Laszlo Ersek from comment #42) > > > http://koji.fedoraproject.org/koji/taskinfo?taskID=16553935 > > > > I can confirm this build fixes Window boot on system from comment 0. Great > > job, Laszlo, thanks a lot! > > Not for me :( > > > I now get several pages of output from loader/efi/chainloader.c, printing > all kinds of sections. When I jumpt through those pages, the last message is > > > booting via entry point > > After which, nothing happens. Good news: I can reproduce this in an OVMF virtual machine. The serial log contains: > loader/efi/chainloader.c:775: loader/efi/chainloader.c:775: booting via entry point > booting via entry point > !!!! X64 Exception Type - 0E(#PF - Page-Fault) CPU Apic ID - 00000000 !!!! > RIP - 000000005C40401A, CS - 0000000000000038, RFLAGS - 0000000000210203 > ExceptionData - 0000000000000000 > RAX - 0000000000000304, RCX - 0000000000001010, RDX - 00000000000024D0 > RBX - 0000000000002404, RSP - 000000007F2DA2C8, RBP - 0000000000000000 > RSI - 000000001004B230, RDI - 00000018B44B4253 > R8 - 0000000000002510, R9 - 0000000000080FC8, R10 - 000000001000A0E4 > R11 - 000000007D1EB900, R12 - 0000000010024B34, R13 - 000000001004B230 > R14 - 000000007D1EBB7C, R15 - 0000000000000000 > DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030 > GS - 0000000000000030, SS - 0000000000000030 > CR0 - 0000000080000033, CR2 - 00000018B44B4253, CR3 - 000000007F279000 > CR4 - 0000000000000668, CR8 - 0000000000000000 > DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000 > DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400 > GDTR - 000000007F266A98 0000000000000047, LDTR - 0000000000000000 > IDTR - 000000007EB31018 0000000000000FFF, TR - 0000000000000000 > FXSAVE_STATE - 000000007F2D9F20 > !!!! Can't find image information. !!!! The latter part comes from the edk2 exception handler that is built into OVMF. The presence of this register dump in the serial log means that we crash still within the UEFI boot mgr application. Let me look some more at the log.
Forgot to say in comment 65 -- this is a Windows 7 guest, based on Alexander's comment 30.
Created attachment 1222983 [details] [PATCH] efi/chainloader: truncate overlong relocation section The UEFI Windows 7 boot loader ("EFI/Microsoft/Boot/bootmgfw.efi", SHA1 31b410e029bba87d2068c65a80b88882f9f8ea25) has inconsistent headers. Compare: > The Data Directory > ... > Entry 5 00000000000d9000 00000574 Base Relocation Directory [.reloc] Versus: > Sections: > Idx Name Size VMA LMA File off ... > ... > 10 .reloc 00000e22 00000000100d9000 00000000100d9000 000a1800 ... That is, the size reported by the RelocDir entry (0x574) is smaller than the virtual size of the .reloc section (0xe22). Quoting the grub2 debug log for the same: > chainloader.c:595: reloc_dir: 0xd9000 reloc_size: 0x00000574 > chainloader.c:603: reloc_base: 0x7d208000 reloc_base_end: 0x7d208573 > ... > chainloader.c:620: Section 10 ".reloc" at 0x7d208000..0x7d208e21 > chainloader.c:661: section is not reloc section? > chainloader.c:663: rds: 0x00001000, vs: 00000e22 > chainloader.c:664: base: 0x7d208000 end: 0x7d208e21 > chainloader.c:666: reloc_base: 0x7d208000 reloc_base_end: 0x7d208573 > chainloader.c:671: Section characteristics are 42000040 > chainloader.c:673: Section virtual size: 00000e22 > chainloader.c:675: Section raw_data size: 00001000 > chainloader.c:678: Discarding section After hexdumping "bootmgfw.efi" and manually walking its relocation blocks (yes, really), I determined that the (smaller) RelocDir value is correct. The remaining area that extends up to the .reloc section size (== 0xe22 - 0x574 == 0x8ae bytes) exists as zero padding in the file. This zero padding shouldn't be passed to relocate_coff() for parsing. In order to cope with it, split the handling of .reloc sections into the following branches: - original case (equal size): original behavior (--> relocation attempted), - overlong .reloc section (longer than reported by RelocDir): truncate the section to the RelocDir size for the purposes of relocate_coff(), and attempt relocation, - .reloc section is too short, or other checks fail: original behavior (--> relocation not attempted). Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1347291 Signed-off-by: Laszlo Ersek <lersek> --- grub-core/loader/efi/chainloader.c | 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-) ------------ This is the patch for the issue reported by Alexander. It is really a different issue -- the bug is actually in the Windows 7 UEFI boot loader binary; it is malformed. The patch makes grub2 cope with it, which can be considered a "feature", rather than a bugfix. The build below includes both patches (plus the debug log patches). http://koji.fedoraproject.org/koji/taskinfo?taskID=16573505
I can confirm that dbg03 from comment #67 still works for me (Windows 8.1).
(In reply to Adam Williamson from comment #64) > Here's F25: > http://koji.fedoraproject.org/koji/taskinfo?taskID=16567423 Works for the machine from comment 0. (In reply to Laszlo Ersek from comment #67) > The build below includes both patches (plus the debug log patches). > http://koji.fedoraproject.org/koji/taskinfo?taskID=16573505 Works for the machine from comment 0.
Thank you guys for the regression testing. Alexander, does the build in comment 67 make your problem go away? Thanks.
(In reply to Laszlo Ersek from comment #63) > Okay. Maybe the entry point itself is translated incorrectly. Alexander, can > you please attach your log? Perhaps I can find something in it. Here are the pages of output: http://i.imgur.com/fsZGaQ1.jpg http://i.imgur.com/dwEP6Mp.jpg http://i.imgur.com/CdCcv5O.jpg http://i.imgur.com/cSaa2yY.jpg (In reply to Bogdan Costescu from comment #61) > Are you sure to use the dbg02 package, as mentioned in comment #42? Yes, I'm sure: $ rpm -qa | grep grub2 grub2-efi-modules-2.02-0.34.dbg02.fc24.x86_64 grub2-tools-2.02-0.34.dbg02.fc24.x86_64 grub2-2.02-0.34.dbg02.fc24.x86_64 grub2-efi-2.02-0.34.dbg02.fc24.x86_64 grub2-starfield-theme-2.02-0.34.dbg02.fc24.x86_64 (In reply to Adam Williamson from comment #64) > So I'm definitely not trying to reconcile the pile of crazy that's the grub2 > package build without pjones around. Instead, I've just done some scratch > builds for now, all on top of whatever else is currently on the relevant > branch, just with Laszlo's patch added. > F24: > > http://koji.fedoraproject.org/koji/taskinfo?taskID=16567714 This one doesn NOT work for me. Still hangs with a single underscore on the screen after selecting the Win7 Boot entry. (In reply to Laszlo Ersek from comment #67) > This is the patch for the issue reported by Alexander. It is really a > different issue -- the bug is actually in the Windows 7 UEFI boot loader > binary; it is malformed. The patch makes grub2 cope with it, which can be > considered a "feature", rather than a bugfix. > > The build below includes both patches (plus the debug log patches). > > http://koji.fedoraproject.org/koji/taskinfo?taskID=16573505 This one works for me! (After skipping through dozens of debug output pages ;) )
(In reply to Laszlo Ersek from comment #70) > Thank you guys for the regression testing. Alexander, does the build in > comment 67 make your problem go away? Thanks. Yes! Thanks Laslzo for the great work. I will keep testing whatever contains this patch until it lands in the fedora-updates repo.
Side point: please take care to spell my name correctly. I'm kinda touchy on that subject. Use the clipboard when in doubt. (For example, I actually used the clipboard to copy your name into my comments :)) Regarding the second patch (comment 67): - Thank you Alexander for testing it :) - Since we've decided to wait for Peter's return before we commit / officially build anything, I think we shouldn burden Matt with another review request. I think Peter can review the second patch when he returns. - Adam, re: comment 64, do you think you could provide scratch builds for F24 and F25, for all arches, with the patches from comment 46 and comment 67 only (that is, none of my debug patches)? Thank you.
Yep, I can do that. Thanks for all the work. Sorry we can't get real builds done till Peter is back, but the whole grub2 package setup is a bit...special.
Can confirm the package from comment 64 solved the problem from comment 14 on F25.
Updated scratch builds (sorry for the delay) with both of Laszlo's patches but no debug stuff: Fedora 25 - http://koji.fedoraproject.org/koji/taskinfo?taskID=16685670 Fedora 24 - http://koji.fedoraproject.org/koji/taskinfo?taskID=16685693 Again, these won't work for Secure Boot. I'm hoping pjones will be around to do some official builds soon.
The latest builds work fine for me on my p8p67 pro. Thanks for the great work on this.
I can confirm that the latest build from comment #76 works for me on the hardware from comment #29. The laptop is now running F25, so here I have tested the F25 package.
grub2-2.02-0.36.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-ee4d91769e
Thanks Peter! The update (and the F26 build) seem to be missing Laszlo's second patch ("truncate overlong relocation section"), are you still checking that one?
grub2-2.02-0.36.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-ee4d91769e
I just tried grub2-2.02-0.36.fc25 and booting Windows 10 works on the hardware from comment 9. Thank you for your commitment.
grub2-2.02-0.36.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.
grub2-2.02-0.37.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-543df7ea8c
Will this patch make it to fc24 repo?
Setting back to ON_QA, as 0.36 had only Laszlo's *first* patch, 0.37 has his second. Can folks who needed Laszlo's second patch please test and provide karma for 0.37? I'll ask pjones again if he can do an f24 update too.
(In reply to Adam Williamson from comment #86) > Setting back to ON_QA, as 0.36 had only Laszlo's *first* patch, 0.37 has his > second. Can folks who needed Laszlo's second patch please test and provide > karma for 0.37? I just tested, and while 0.36 did not fix my Win7 boot issue, 0.37 did. Thanks for the fixes!
alex: thanks a lot! if you have a Fedora (FAS) account it's best to log in to Bodhi before giving feedback, as it only 'counts' feedback from logged-in people for voting purposes.
grub2-2.02-0.37.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-543df7ea8c
(In reply to Fedora Update System from comment #89) > grub2-2.02-0.37.fc25 has been pushed to the Fedora 25 testing repository. If This fixes Win10 booting on the machine from comment 0.
grub2-2.02-0.37.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.
grub2-2.02-0.35.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-a098b75b13
grub2-2.02-0.35.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-a098b75b13
grub2-2.02-0.38.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-a098b75b13
grub2-2.02-0.38.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-a098b75b13
grub2-2.02-0.38.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
Hello Laszlo (Ersek), I inspected your thoughts in your comment #39: https://bugzilla.redhat.com/show_bug.cgi?id=1347291#c39 And you are Hell of the coder and observer! :-) I have question to you, though... Here: Why this should be done as you did it, and fixed RHEL the way you did it (from your comment #39)??? RHEL implementation: argument --> parameter role -------- --------- -------- handle_image:data --> relocate_coff:orig ORIGINAL handle_image:buffer_aligned --> relocate_coff:data COPY So, in the SUSE patch, only the copy is passed to relocate_coff(): argument --> parameter role --------------------------- ------------------ ---- handle_image:buffer_aligned --> relocate_coff:data COPY What prevents you to change code in RHEL, to actually match SUSE code and to pass ONLY relocatable copy of boot-loader (seems much simpler and much less error prone)??? What/why are real differences in these two implementations??? Why RHEL needs original and copy sections, and SUSE ONLY the copy to work from? Thank you, _nobody_
(In reply to niemand from comment #97) > What prevents you to change code in RHEL, to actually match SUSE code and to > pass ONLY relocatable copy of boot-loader (seems much simpler and much less > error prone)??? > > What/why are real differences in these two implementations??? Why RHEL needs > original and copy sections, and SUSE ONLY the copy to work from? My understanding -- which has become quite vague at this point -- is that the original SUSE patch was simpler, but it also copied stuff that was unnecessary for execution. In other words, it wasted memory & execution time. The RHEL variant of the patch intended to optimize this logic (as far as I understand anyway, again), and in that diversion from the SUSE original, the first issue was introduced.
I'm pretty sure this can be closed at this point.