RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 920681 - /dev/snapshot's design is accident prone
Summary: /dev/snapshot's design is accident prone
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.4
Hardware: All
OS: Linux
medium
medium
Target Milestone: rc
: ---
Assignee: Laszlo Ersek
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2013-03-12 14:31 UTC by J.H.M. Dassen (Ray)
Modified: 2018-12-01 16:46 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-04-04 11:18:09 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 920687 0 high CLOSED "sginfo -l" should not open /dev/snapshot 2021-02-22 00:41:40 UTC

Internal Links: 920687

Description J.H.M. Dassen (Ray) 2013-03-12 14:31:10 UTC
Description of problem:
The design of /dev/snapshot's interfaces is prone to accidents (unexpected side effects).

Version-Release number of selected component (if applicable):
2.6.32-358.0.1.el6

How reproducible:
100%

Steps to Reproduce:
1. Install RHEL6 as a virtualisation guest using a virtualisation technology that supports hot-add/hot-remove of vCPUs (e.g. KVM, vmware); configure at least 2 vCPUs.
2. Verify that CPU1 can be enabled or disabled dynamically:
   # echo -n 0 > /sys/devices/system/cpu/cpu1/online
   # echo -n 1 > /sys/devices/system/cpu/cpu1/online
3. Run
   # cat < /dev/snapshot
4. Try to disable CPU1:
   # echo -n 0 > /sys/devices/system/cpu/cpu1/online

Actual results:
[root@rhel6 ~]# echo -n 0 > /sys/devices/system/cpu/cpu1/online 
-bash: echo: write error: Device or resource busy

Expected results:
CPU1 can be still be disabled dynamically.

Additional info:
While "cat < /dev/snapshot" may look quite artificial, there are very normal commands that an administrator or an administrative tool can invoke that have the same side effect; in particular "sginfo -l".

Comment 1 J.H.M. Dassen (Ray) 2013-03-12 14:44:55 UTC
The "sginfo -l" aspect of this is now being tracked through bug #920687.

Comment 8 Laszlo Ersek 2013-04-04 11:18:09 UTC
(commenting publicly because the bug is private already)

I think this bug has been analyzed completely by Ray (in customer portal case 00796597, associated with bug 920687 too, comment 1 here), and by Sander Keemink from the customer side.

Basically what happens is: customer runs a script periodically that invokes "sginfo -l". "sginfo" in that mode happens to open /dev/snapshot (snapshot_open() in "kernel/power/user.c"), the O_RDONLY mode calls pm_notifier_call_chain(PM_HIBERNATION_PREPARE).

In "kernel/cpu.c" that notifier sets "cpu_hotplug_disabled":

cpu_hotplug_pm_callback()
  cpu_hotplug_disable_before_freeze()
    cpu_hotplug_disabled = 1

From the other side, massaging the CPU "online" files in sysfs enters store_online() in "drivers/base/cpu.c"

store_online()
  cpu_up() [kernel/cpu.c]
    if cpu_hotplug_disabled then return -EBUSY

If the qemu guest agent met this error when hot-(un-)plugging a VCPU, it would propagate the error to libvirtd. I think the entire logic is sane, except that opening a device *maybe* shouldn't immediately emit a significant notification.

The kernel logic is identical in upstream, so this bug should be taken there first... if we (or they) consider it a bug at all. I seem to recall that just opening eg. watchdog devices is not without consequences (*), and I guess the same could be said about opening tape drives or just plain FIFOs in the filesystem (if there's a writer blocked on them, waiting for the first reader to open).

(*) Yup, see "Documentation/watchdog/watchdog-api.txt":

  The simplest API:

  All drivers support the basic mode of operation, where the watchdog
  activates as soon as /dev/watchdog is opened and will reboot unless
  the watchdog is pinged within a certain time [...]

With that in mind I'm closing this BZ as NOTABUG: "sginfo" just shouldn't indiscriminately open device nodes. It could search for names more carefully, and/or call stat() before open() and check the st_rdev field.

Ray, if you disagree with the closure, please feel free to reopen the BZ, but in that case I'll de-assign myself (as the BZ is not virt-related). Thanks.

Comment 9 Laszlo Ersek 2013-04-04 11:23:05 UTC
FWIW the /dev/snapshot behavior is documented in "Documentation/power/userland-swsusp.txt":

  The interface consists of a character device providing the open(),
  release(), read(), and write() operations as well as several ioctl()
  commands defined in include/linux/suspend_ioctls.h .  The major and minor
  numbers of the device are, respectively, 10 and 231, and they can
  be read from /sys/class/misc/snapshot/dev.

  The device can be open either for reading or for writing.  If open for
  reading, it is considered to be in the suspend mode. [...]

My "/sys/class/misc/snapshot/dev" says "10:231", this even gives "sginfo" a hint what to look for in "st_rdev" with makedev()/major()/minor().

Comment 10 J.H.M. Dassen (Ray) 2013-04-15 10:00:17 UTC
Global Support Services has privatised comments pertaining to Red Hat internal workflow considerations and has not identified sensitive data in or attached to this bugzilla. Uncloaking this bugzilla as per Engineering's suggestion to facilitate discussions with upstream (kernel and/or sg3_utils).


Note You need to log in before you can comment on or make changes to this bug.