Bug 475673

Summary: WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
Product: [Fedora] Fedora
Reporter: Dimitri Maziuk <dmaziuk>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: low
Version: 9
CC: kernel-maint, quintela
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Fixed In Version:
Doc Type: Bug Fix
Last Closed: 2009-02-27 08:56:48 UTC
Type: ---
Attachments:
Description: dmesg, Flags: none

Description Dimitri Maziuk 2008-12-09 23:34:20 UTC
Description of problem:

The above error appears when trying to write the partition table ("w" in fdisk). This is in single-user mode, adding a SATA disk with no partitions on it.

I don't know whose problem this is (kernel, udev, or fdisk), but the system is unusable as a file server.

Command (m for help): p

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000787a8

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1      121601   976760001   fd  Linux raid autodetect

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.


Version-Release number of selected component (if applicable):
kernel-2.6.26.6-79.fc9.x86_64
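
Note on the error number: the "error 16" that fdisk prints is the errno returned by its request to re-read the partition table (the BLKRRPART ioctl on Linux). The following is a minimal sketch for translating such a number into its symbolic name and message; the helper name describe_errno is hypothetical and only for illustration.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a numeric errno such as the 16 printed by fdisk."""
    # errno.errorcode maps numeric codes to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's message text for the code.
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    print(describe_errno(16))
```

On Linux, code 16 is EBUSY, which matches the "Device or resource busy" text in the warning. The number alone only says that something still has a claim on the disk; it does not say what.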

Comment 1 Chuck Ebbert 2008-12-10 12:57:57 UTC
(In reply to comment #1)
> Description of problem:
> 
> Above error when trying to write partition table ("w" in fdisk). This is in
> single-user mode, adding a sata disk with no partitions on it.
> 
> I don't know whose problem this is: kernel, udev, fdisk, but the system is
> unusable as a file server.
> 

If you reboot, the partition will be recognized. How does having to reboot make it unusable as a file server?

Comment 2 Dimitri Maziuk 2008-12-10 15:03:15 UTC
If I reboot and run mdadm -C, it fails with device busy, not enough devices to create the array. If I try to do fdisk "w" on the partition, I get the same error 16.

This is with only one drive, to narrow it down. With 6 drives connected, I get this error on one of the drives, seemingly at random. The same thing happens if I run partprobe.

WTF could possibly be using a blank unpartitioned disk in single user mode right after bootup anyway?
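
One way to approach the question above is to look at the "holders" directories that Linux sysfs exposes for a disk and its partitions, where kernel-level users such as md arrays or device-mapper targets appear. The sketch below assumes the usual /sys/block/<disk>/holders and /sys/block/<disk>/<partition>/holders layout; the function name find_holders and the sysfs_root parameter are hypothetical and exist only to keep the example testable. It does not detect processes that merely have the device open, and it is not a diagnosis of this particular machine.

```python
from pathlib import Path
from typing import Dict, List


def find_holders(disk: str, sysfs_root: str = "/sys/block") -> Dict[str, List[str]]:
    """Map a disk and each of its partitions to the kernel devices holding it."""
    disk_dir = Path(sysfs_root) / disk
    if not disk_dir.is_dir():
        return {}  # unknown disk, or sysfs is not available here

    # Partition directories live under the disk directory and share its name prefix.
    partitions = sorted(
        p for p in disk_dir.iterdir() if p.is_dir() and p.name.startswith(disk)
    )

    result: Dict[str, List[str]] = {}
    for dev_dir in [disk_dir] + partitions:
        holders_dir = dev_dir / "holders"
        if holders_dir.is_dir():
            # Each entry is named after the holding device, e.g. md0 or dm-1.
            result[dev_dir.name] = sorted(h.name for h in holders_dir.iterdir())
    return result


if __name__ == "__main__":
    for dev, holders in find_holders("sdb").items():
        print(dev, "held by:", ", ".join(holders) if holders else "(nothing)")
```

An empty list for every entry would suggest the claim comes from somewhere other than md or device-mapper, for example a process holding the device node open.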

Comment 3 Dimitri Maziuk 2008-12-12 22:22:53 UTC
Created attachment 326790 [details]
dmesg

Comment 4 Dimitri Maziuk 2008-12-12 22:27:02 UTC
I re-partitioned the drives and rebuilt the RAID by booting off the install CD and running the "rescue" shell. The following started happening at ~5am and eventually locked up all 8 cores. The lockups are all from mdadm, nmbd, snmpd (which does free space and S.M.A.R.T. checks), and events/7:

Dec 12 15:55:21 octopus kernel: BUG: soft lockup - CPU#7 stuck for 61s! [events/7:34]
Dec 12 15:55:21 octopus kernel: Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ipv6 ipt_REJECT xt_tcpudp nf_conntrack_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables dm_mirror dm_log dm_multipath scsi_dh dm_mod raid10 cfi_cmdset_0002 cfi_util forcedeth jedec_probe serio_raw cfi_probe gen_probe shpchp ck804xrom pcspkr sata_nv mtd chipreg map_funcs i2c_nforce2 i2c_core sg ata_generic pata_acpi pata_amd libata sd_mod scsi_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]
Dec 12 15:55:21 octopus kernel: CPU 7:
Dec 12 15:55:21 octopus kernel: Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ipv6 ipt_REJECT xt_tcpudp nf_conntrack_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables dm_mirror dm_log dm_multipath scsi_dh dm_mod raid10 cfi_cmdset_0002 cfi_util forcedeth jedec_probe serio_raw cfi_probe gen_probe shpchp ck804xrom pcspkr sata_nv mtd chipreg map_funcs i2c_nforce2 i2c_core sg ata_generic pata_acpi pata_amd libata sd_mod scsi_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]
Dec 12 15:55:21 octopus kernel: Pid: 34, comm: events/7 Not tainted 2.6.27.7-53.fc9.x86_64 #1
Dec 12 15:55:21 octopus kernel: RIP: 0010:[<ffffffff81060730>]  [<ffffffff81060730>] smp_call_function_mask+0x174/0x1dd
Dec 12 15:55:21 octopus kernel: RSP: 0018:ffff88022720fd40  EFLAGS: 00000202
Dec 12 15:55:21 octopus kernel: RAX: ffff88022720fdf0 RBX: ffff88022720fe20 RCX: 00000000000000fc
Dec 12 15:55:21 octopus kernel: RDX: ffffffff81625480 RSI: 00000000000008fc RDI: 0000000000000286
Dec 12 15:55:21 octopus kernel: RBP: 0000000000000007 R08: ffff88022720e000 R09: ffff88041893aa20
Dec 12 15:55:21 octopus kernel: R10: 0000000000000001 R11: 000000602720fd50 R12: 00000000000030c7
Dec 12 15:55:21 octopus kernel: R13: ffff8802adc30000 R14: ffff88022720e000 R15: ffffffff81622990
Dec 12 15:55:21 octopus kernel: FS:  00007f4cc1f657a0(0000) GS:ffff88042702af00(0000) knlGS:0000000000000000
Dec 12 15:55:21 octopus kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Dec 12 15:55:21 octopus kernel: CR2: 000000000065b400 CR3: 00000004188ca000 CR4: 00000000000006e0
Dec 12 15:55:21 octopus kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 12 15:55:21 octopus kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 12 15:55:21 octopus kernel:
Dec 12 15:55:21 octopus kernel: Call Trace:
Dec 12 15:55:21 octopus kernel: [<ffffffff8101b701>] ? mcheck_check_cpu+0x0/0x2b
Dec 12 15:55:21 octopus kernel: [<ffffffff8100e717>] ? __switch_to+0xb9/0x3e0
Dec 12 15:55:21 octopus kernel: [<ffffffff8103c76b>] ? finish_task_switch+0x31/0xc9
Dec 12 15:55:21 octopus kernel: [<ffffffff8101b701>] ? mcheck_check_cpu+0x0/0x2b
Dec 12 15:55:21 octopus kernel: [<ffffffff8101b09d>] ? mcheck_timer+0x0/0x7f
Dec 12 15:55:21 octopus kernel: [<ffffffff810607b4>] ? smp_call_function+0x1b/0x1d
Dec 12 15:55:21 octopus kernel: [<ffffffff81044a67>] ? on_each_cpu+0x18/0x46
Dec 12 15:55:21 octopus kernel: [<ffffffffa020326b>] ? do_cache_clean+0x0/0x36 [sunrpc]
Dec 12 15:55:21 octopus kernel: [<ffffffff8101b0b9>] ? mcheck_timer+0x1c/0x7f
Dec 12 15:55:21 octopus kernel: [<ffffffff8104fdf9>] ? run_workqueue+0xa3/0x146
Dec 12 15:55:21 octopus kernel: [<ffffffff8104ff91>] ? worker_thread+0xf5/0x109
Dec 12 15:55:21 octopus kernel: [<ffffffff810536f5>] ? autoremove_wake_function+0x0/0x38
Dec 12 15:55:21 octopus kernel: [<ffffffff8104fe9c>] ? worker_thread+0x0/0x109
Dec 12 15:55:21 octopus kernel: [<ffffffff8105338b>] ? kthread+0x49/0x76
Dec 12 15:55:21 octopus kernel: [<ffffffff810116e9>] ? child_rip+0xa/0x11
Dec 12 15:55:21 octopus kernel: [<ffffffff81010a07>] ? restore_args+0x0/0x30
Dec 12 15:55:21 octopus kernel: [<ffffffff81053342>] ? kthread+0x0/0x76
Dec 12 15:55:21 octopus kernel: [<ffffffff810116df>] ? child_rip+0x0/0x11
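
To see which CPUs and tasks are involved across a longer log, the "BUG: soft lockup" lines can be tallied. This is a rough sketch that assumes the message format shown in the trace above; the function name summarize_lockups and the regular expression are illustrative and not part of any existing tool.

```python
import re
from collections import Counter

# Matches e.g. "BUG: soft lockup - CPU#7 stuck for 61s! [events/7:34]"
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+?):(?P<pid>\d+)\]"
)


def summarize_lockups(lines):
    """Count soft lockup reports per CPU number and per task name."""
    per_cpu = Counter()
    per_task = Counter()
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            per_cpu[int(match.group("cpu"))] += 1
            per_task[match.group("comm")] += 1
    return per_cpu, per_task


if __name__ == "__main__":
    sample = [
        "Dec 12 15:55:21 octopus kernel: BUG: soft lockup - CPU#7 stuck for 61s! [events/7:34]",
        "Dec 12 15:55:21 octopus kernel: CPU 7:",
    ]
    cpus, tasks = summarize_lockups(sample)
    print("per CPU:", dict(cpus))
    print("per task:", dict(tasks))
```

Feeding it the full /var/log/messages would show whether the lockups cluster on particular tasks or spread evenly across all 8 cores.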

Comment 5 Dimitri Maziuk 2008-12-16 20:23:35 UTC
I should've mentioned that the machine is a SuperMicro; it seems I'm not the only one having problems with them.

The error seems to persist when there's a missing or failed SATA disk in there. (The system was originally set up with 6 SATA drives, then one failed, then another.) I just received replacement drives from the vendor, rebooted with all 6 SATA drives in place, and was able to partition the new disks as well as the one that was giving me the error.

We'll see if the stuck CPU problem goes away too.